Volume control in Vista

Before Vista, all of the volume controls available to applications were system-wide - when you changed the volume using the wave volume APIs, you changed the hardware volume, thus affecting all the applications in the system.  The problem is that for the vast majority of applications, this was exactly the wrong behavior.  It was a legacy of the old Windows 3.1 audio architecture, where only one application could play audio at a time.  In that situation, there was only one hardware volume, so the behavior made sense.

When the WDM audio drivers were released for Win98, Microsoft added kernel mode audio mixing, but it left the volume control infrastructure alone.  The volume controls available to the Windows APIs remained the hardware volume controls.  The reason for this is pretty simple: Volume control really needs to be per-application, but in the Win98 architecture, there was no way of associating individual audio streams with a particular application.  Instead, each audio stream was treated independently.

The thing is, most applications REALLY wanted to just control the volume for their own audio streams.  They didn't want (or need) to mess with other apps' audio streams; that was just an unfortunate side effect of the audio architecture.

For some applications, there were solutions.  For instance, if you used DirectSound (or DirectShow, which is layered on DirectSound), you could render your audio streams into a secondary buffer.  DSound secondary buffers have their own volume controls, which effectively makes the volume control per-application.  But that doesn't do anything to help the applications that don't use DSound; they're stuck with manipulating the hardware volume.

For Vista, one of the things that was deployed as part of the new audio infrastructure was a component called "Audio Policy".  One of the tasks of the policy engine is tracking which audio streams belong to which application.

For Vista, each audio stream is associated with an "audio session", and the audio session is roughly associated with a process.  Each process can have more than one audio session, and audio sessions can span multiple processes, but by default each audio session is the collection of audio streams being rendered by the process.
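To make the grouping rule concrete, here's a minimal sketch in Python.  It is a toy model, not the policy engine: the `Stream` type, the `session` field and the `pid:` key format are all names I made up for illustration.  The only thing it shows is the default described above, where a stream that doesn't ask for a particular session lands in its process's session.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stream:
    """Hypothetical description of one audio stream."""
    name: str
    pid: int
    session: Optional[str] = None  # None means "use the process default"


def session_key(stream):
    # Assumption: an explicit session id wins, otherwise the process is the session.
    if stream.session is not None:
        return stream.session
    return f"pid:{stream.pid}"


def group_by_session(streams):
    """Return a dict mapping each session key to the names of its streams."""
    sessions = defaultdict(list)
    for stream in streams:
        sessions[session_key(stream)].append(stream.name)
    return dict(sessions)


streams = [
    Stream("music", pid=100),
    Stream("ui-beep", pid=100),
    Stream("voice-chat", pid=100, session="chat"),  # second session, same process
    Stream("voice-helper", pid=200, session="chat"),  # same session, other process
]
sessions = group_by_session(streams)
```

Here `sessions` ends up with two entries: the default session for process 100 (holding "music" and "ui-beep") and a "chat" session that spans processes 100 and 200.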

Each audio session has its own volume control, and WASAPI exposes interfaces that allow applications to control the volume of their audio session.  The volume control API also includes a notification mechanism, so an application that wants to know when its volume changes can register for notifications - this allows an application to track when someone else changes its volume.
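The shape of that "volume plus notification" idea can be sketched in a few lines of Python.  This is only a model of the pattern: `AudioSession`, `register_listener`, `set_volume` and the `changed_by` argument are hypothetical names for this example, not the WASAPI interfaces, and the 0.0 to 1.0 range is an assumption of the sketch.

```python
class AudioSession:
    """Toy stand-in for an audio session; not the real WASAPI interface."""

    def __init__(self, name, volume=1.0):
        self.name = name
        self.volume = volume
        self._listeners = []

    def register_listener(self, callback):
        # callback(level, changed_by) is called after every volume change.
        self._listeners.append(callback)

    def set_volume(self, level, changed_by):
        if not 0.0 <= level <= 1.0:
            raise ValueError("volume must be between 0.0 and 1.0")
        self.volume = level
        for callback in self._listeners:
            callback(level, changed_by)


# The application registers for changes to its own session...
session = AudioSession("media player")
seen = []
session.register_listener(lambda level, who: seen.append((level, who)))

# ...and finds out when somebody else moves its slider.
session.set_volume(0.25, changed_by="system volume mixer")
```

After the last call, `seen` holds one entry recording the new level and who changed it, which is what lets the app keep its own volume UI in sync.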

This is all well and good, but how does this solve the problem of existing applications that are using the hardware volume but probably don't want to?

Remember how I mentioned that all the existing APIs were plumbed to use WASAPI?  Well, we plumbed the volume controls for those APIs to WASAPI's volume control interfaces too. 
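Here's a rough Python model of what that re-plumbing means for a legacy caller.  Treat it as a sketch under stated assumptions: the `Mixer` class and its method names are made up, the "effective level is hardware volume times session volume" rule is a simplification for illustration, and how Vista actually translates the legacy values isn't described here.  The one detail borrowed from the existing API is the waveOutSetVolume convention of packing the left channel into the low-order word and the right channel into the high-order word, with 0xFFFF meaning full volume.

```python
def unpack_wave_volume(packed):
    """Split a waveOutSetVolume-style DWORD into (left, right) scalars in 0.0..1.0."""
    left = (packed & 0xFFFF) / 0xFFFF
    right = ((packed >> 16) & 0xFFFF) / 0xFFFF
    return left, right


class Mixer:
    """Hypothetical model: one hardware volume plus one volume per process session."""

    def __init__(self, pids):
        self.hardware = (1.0, 1.0)
        self.sessions = {pid: (1.0, 1.0) for pid in pids}

    def legacy_set_volume_pre_vista(self, pid, packed):
        # Old behavior: the call lands on the hardware volume, so every app is affected.
        self.hardware = unpack_wave_volume(packed)

    def legacy_set_volume_vista(self, pid, packed):
        # New behavior (as modeled here): only the caller's session changes.
        self.sessions[pid] = unpack_wave_volume(packed)

    def effective(self, pid):
        # Assumption of this sketch: the two volumes simply multiply per channel.
        hw_left, hw_right = self.hardware
        s_left, s_right = self.sessions[pid]
        return (hw_left * s_left, hw_right * s_right)


old = Mixer([1, 2])
old.legacy_set_volume_pre_vista(1, 0x00000000)  # app 1 mutes "its" volume

new = Mixer([1, 2])
new.legacy_set_volume_vista(1, 0x00000000)  # same call, routed to the session
```

In the `old` mixer, app 2 goes silent along with app 1.  In the `new` mixer, app 1 is muted and app 2 is untouched, without app 1 changing a line of code.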

We also plumbed the mixerLine APIs to use WASAPI.  This was slightly more complicated, because the mixerLine API also requires that we define a topology for audio devices, but we've defined a relatively simple topology that should match existing hardware topologies (so appcompat shouldn't be an issue).

The upshot of this is that by default, for Vista Beta2, we're going to provide per-application volume control for the first time, for all applications.

There is a very small set of applications that may be broken by this behavior change, but we have a mechanism to ensure that applications that need to manipulate the hardware volume using the existing APIs will be able to work in Vista without being rewritten.  If you've got one of those applications, you should contact me out-of-band and I'll get the right people involved in the discussion.