What type of audio processing is included in Windows? Does Windows support Echo Cancellation? Noise Suppression? Some other type of audio processing?
One common misconception about the Windows audio stack is that Windows natively includes audio effects that modify the audio signal, e.g. acoustic echo cancellation, noise reduction, gain control, etc. In reality, the audio stack itself just passes the raw signal from the application to the speakers, or from the microphone to the application, and no built-in Windows component modifies it. Instead, the audio stack loads external components (called “Audio Processing Objects” or APOs) that modify the audio signal. Multiple companies create APOs for Windows, such as Dolby, DTS, Waves, Conexant, ForteMedia, Realtek, etc. All commercial systems from the well-known OEMs include at least one APO. It is up to the OEM to decide what type of processing they want in each model and which company will create the APO for that model. In Windows 8.1 the APOs are bundled and installed together with the audio driver.
All APOs are required to declare the type of processing that they do, so that applications can be informed about the type of processing that is available in each system. Windows provides a list of 18 types of audio effects (as shown in this MSDN link):
- Acoustic Echo Cancellation
- Noise Suppression
- Automatic Gain Control
- Beam Forming
- Constant Tone Removal
- Loudness Equalizer
- Bass Boost
- Virtual Surround
- Virtual Headphones
- Speaker Fill
- Room Correction
- Bass Management
- Environmental Effects
- Speaker Protection
- Speaker Compensation
- Dynamic Range Compression
The processing configured on a given system can be inspected in the Sound control panel:
- For playback processing: Control Panel -> Sound -> Playback tab -> Right click on speakers -> Properties -> Enhancements
- For capture processing: Control Panel -> Sound -> Recording tab -> Right click on microphone -> Properties -> Enhancements
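The idea that every APO declares its effect types, so that an application can find out what processing exists on an endpoint, can be pictured with a small toy model. This is only a sketch: `EffectType`, `Apo`, `available_effects`, and the vendor names are hypothetical names made up for illustration, not part of any Windows API, and the enum holds only a subset of the effect types listed above.

```python
from dataclasses import dataclass
from enum import Enum, auto


class EffectType(Enum):
    """A subset of the effect types from the list above (illustrative names)."""
    ACOUSTIC_ECHO_CANCELLATION = auto()
    NOISE_SUPPRESSION = auto()
    AUTOMATIC_GAIN_CONTROL = auto()
    LOUDNESS_EQUALIZER = auto()
    BASS_BOOST = auto()


@dataclass(frozen=True)
class Apo:
    """Hypothetical stand-in for an APO: a vendor plus its declared effects."""
    vendor: str
    declared_effects: frozenset


def available_effects(apos):
    """Return the union of the effect types declared by the given APOs."""
    found = set()
    for apo in apos:
        found |= apo.declared_effects
    return found


# Hypothetical capture endpoint with two APOs chosen by the OEM.
microphone_apos = [
    Apo("VendorA", frozenset({EffectType.ACOUSTIC_ECHO_CANCELLATION,
                              EffectType.NOISE_SUPPRESSION})),
    Apo("VendorB", frozenset({EffectType.AUTOMATIC_GAIN_CONTROL})),
]

effects = available_effects(microphone_apos)
has_aec = EffectType.ACOUSTIC_ECHO_CANCELLATION in effects
print(sorted(e.name for e in effects), has_aec)
```

An application could use this kind of lookup to decide, for example, whether it needs to run its own echo canceller or can rely on the one the OEM already provides.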
Applications can choose whether to open a stream in default mode or in raw mode:
- Default mode is appropriate for most applications. The stream will go through all audio effects (SFX, MFX, EFX) that have been selected by the OEM, in order to optimize audio quality.
- Raw mode will avoid most processing (only EFX are applied to raw streams). This should be used by applications that want an unprocessed audio signal (e.g. Pro Audio applications).
Effects are grouped into three positions, according to which streams they process:
- An individual stream: Stream effects (SFX)
- All streams that use the same mode (explained above): Mode effects (MFX)
- All streams that use the same endpoint (e.g. speakers, microphone, etc.): Endpoint effects (EFX)
Each of the 18 audio effect types from the above list can be applied at any of the three positions (SFX, MFX, EFX). Also, each position (SFX, MFX, EFX) can have processing for multiple effect types.
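The relationship between the two stream modes and the three effect positions can be sketched as a small toy model. Everything here is an assumption for illustration: `Endpoint` and `effect_chain` are hypothetical names, the effects assigned to each position are invented, and the SFX -> MFX -> EFX ordering is assumed for a playback stream. It does not call any Windows API.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    """Hypothetical endpoint; each position may hold several effect types."""
    name: str
    sfx: list = field(default_factory=list)  # per-stream effects
    mfx: list = field(default_factory=list)  # effects shared by one mode
    efx: list = field(default_factory=list)  # effects shared by the endpoint


def effect_chain(endpoint, raw=False):
    """Return the effects a playback stream passes through, in order.

    Default mode: SFX, then MFX, then EFX (ordering assumed here).
    Raw mode: only the endpoint effects (EFX) are applied.
    """
    if raw:
        return list(endpoint.efx)
    return endpoint.sfx + endpoint.mfx + endpoint.efx


# Invented configuration, as an OEM might choose for one model.
speakers = Endpoint(
    name="Speakers",
    sfx=["Bass Boost"],
    mfx=["Loudness Equalizer", "Virtual Surround"],
    efx=["Speaker Protection"],
)

default_chain = effect_chain(speakers)
raw_chain = effect_chain(speakers, raw=True)
print("default:", default_chain)
print("raw:    ", raw_chain)
```

In this model a raw stream still passes through the endpoint effect, which matches the statement above that only EFX are applied to raw streams. In the real API, to my knowledge, an application requests raw mode by setting `AUDCLNT_STREAMOPTIONS_RAW` in the `AudioClientProperties` passed to `IAudioClient2::SetClientProperties`.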
Audio effects can be implemented in software, in hardware, or in a combination of both.
If you want more information about this topic, you can look at the Audio Processing Object Architecture page on MSDN. Just note that the MSDN page describes the Windows 10 architecture, which is slightly different from the Windows 8.1 architecture described in this post.