Windows Audio Quality Enhancements

In my last post, I mentioned the architectural thrust behind the Vista audio changes.

I left off explaining how we're dealing with problem #2 - the audio quality issue (because it deserves an entire post on its own).

There were a couple of significant problems with audio quality in the pre-Vista audio stack.  The first (and probably most significant) had to do with the audio format being rendered.

Before Vista, the kernel audio stack set the output audio format to match the format of the audio being played.  Normally, this isn't a problem, since it means that we do less DSP on the signals.  Unfortunately, it can lead to some rather unanticipated consequences.  For instance, if a system sound (usually stereo, 22kHz) is playing when you start playing your MP3 files, then the MP3 rendering happens at 22kHz, which is a noticeable degradation of audio quality.  Once the audio system goes quiet, the rendering format will reset to the format of the content being played, but that may be quite some time later.

Another problem with the pre-Vista audio stack was that the DSP wasn't particularly good.  Because the audio stack worked with integer math, many of the calculations involved in the audio processing suffered from significant rounding errors.
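To make the rounding problem concrete, here's a small Python sketch (an illustration only, not the actual pre-Vista DSP code; the function names are made up for this example).  It halves the volume of a few samples and then doubles it again, once with integer math and once with floating point.

```python
def apply_gain_int16(samples, gain_num, gain_den):
    """Scale integer samples by gain_num/gain_den using integer math only."""
    out = []
    for s in samples:
        scaled = (s * gain_num) // gain_den  # floor division rounds down, losing the fraction
        out.append(max(-32768, min(32767, scaled)))  # clamp to the 16-bit range
    return out


def apply_gain_float(samples, gain):
    """Scale floating point samples by a gain factor."""
    return [s * gain for s in samples]


original = [1, 3, 5, 7, 1001]

# Halve the volume, then double it again, entirely in integers.
int_result = apply_gain_int16(apply_gain_int16(original, 1, 2), 2, 1)

# The same two steps in floating point.
float_result = apply_gain_float(apply_gain_float([float(s) for s in original], 0.5), 2.0)

print(int_result)    # odd samples have lost their low bit
print(float_result)  # the original values come back unchanged
```

Each integer step throws away a fraction of a sample, and those losses pile up as more processing stages are chained together.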

For Vista, we worked to fix both of these problems.

First off, we removed the behavior that auto-selected the output format.  Instead, the system chooses an intelligent default output format (based on the formats that the device claims to support), and we've added UI to allow the user to override the default.  This selected format will be the output format for all content, regardless of the format of the content being rendered.  It's the responsibility of any system that uses the audio engine to supply audio in the output format, performing whatever format conversions are necessary to match it.

The good news is that application authors don't typically have to care about this: for all the higher level audio APIs (waveXxx, DSound, MF, etc.), we automatically insert the appropriate format converters between the source format and the output format.
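As a rough idea of what a format converter has to do, here's a minimal Python sketch that takes 16-bit, 22kHz samples and produces floating point samples at a 44.1kHz output rate.  Everything in it is an assumption made for illustration: the function names are hypothetical, it only handles whole-number upsampling ratios, and linear interpolation is a stand-in for a real rate converter, which does a far better job.

```python
def int16_to_float(samples):
    """Map 16-bit integer samples onto floats in [-1.0, 1.0)."""
    return [s / 32768.0 for s in samples]


def upsample_linear(samples, factor):
    """Naive upsampling by a whole-number factor using linear interpolation."""
    if not samples:
        return []
    out = []
    for current, nxt in zip(samples, samples[1:]):
        for step in range(factor):
            # step 0 reproduces the source sample; later steps fill in between
            out.append(current + (nxt - current) * step / factor)
    out.append(samples[-1])
    return out


def convert_to_output_format(samples, source_rate, output_rate):
    """Convert int16 samples at source_rate to float samples at output_rate."""
    if output_rate % source_rate != 0:
        raise ValueError("this sketch only handles whole-number upsampling ratios")
    return upsample_linear(int16_to_float(samples), output_rate // source_rate)


system_sound = [0, 16384, -16384, 32767]  # a few 22kHz samples
converted = convert_to_output_format(system_sound, 22050, 44100)
print(len(converted), converted[:5])
```

The point is the direction of the conversion: the low-rate content gets brought up to the fixed output format, instead of everything else being dragged down to 22kHz.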

The other significant change we made to ensure high fidelity audio rendering is that we converted the entire audio pipeline from dealing with 16-bit integers to 32-bit floating point values.

I have to say that originally I was quite skeptical about this change - I thought that floating point rounding errors would cause massive problems.  It turns out that a 32-bit floating point value carries a 24-bit significand, so it can hold 24 bits of sample data with no rounding errors at all.  This allows our DSP to have significantly fewer rounding errors when performing calculations on the audio.  We're also deploying a new higher quality rate converter that the Windows Codec team developed, which will also have a huge impact on the quality of audio when we DO have to perform sample rate conversions during the mix.
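The 24-bit claim is easy to check.  Here's a short Python sketch (the helper names are made up for the example) that pushes integers through a 32-bit float and back, using the standard struct module to do the rounding to single precision.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def survives_float32(n):
    """True if integer n comes back unchanged after a trip through a 32-bit float."""
    return to_float32(float(n)) == float(n)


# The extremes of a signed 24-bit sample, plus a few values in between.
max_24 = 2**23 - 1
min_24 = -(2**23)
edge_cases = [min_24, min_24 + 1, -1, 0, 1, 12345, max_24 - 1, max_24]

all_exact = all(survives_float32(n) for n in edge_cases)

# Scaling into [-1.0, 1.0) divides by a power of two, which is also exact.
normalized_exact = all(to_float32(n / 2**23) * 2**23 == n for n in edge_cases)

# One past 2**24 is the first integer a 32-bit float has to round.
too_big = 2**24 + 1

print(all_exact, normalized_exact, survives_float32(too_big))
```

So 16-bit and 24-bit source samples go into the float pipeline without losing anything; rounding only appears in the calculations done afterwards, and there it is much smaller than it was with 16-bit integer math.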

The end result of these changes should be a significant improvement in the quality of audio being rendered, especially on UAA compatible audio adapters.


Comments (16)

  1. Interesting post about the problems with the old method for selecting the audio format. Since you said that rounding errors were a concern, does that mean you were originally thinking of using a fixed-point format instead of floating point in Vista? Also, how well does the use of floating point in the audio stack deal with applications that leave the floating point control registers in funny states?

  2. Anonymous says:

    Larry, you really should take a look at what your competition is doing. Apple’s Core Audio system uses floating point pretty much exclusively (for non-packetized audio), and it’s eliminated the need for ASIO on the Mac.

  3. Anonymous says:

    Will we be able to install our own filters and stuff at the very last stage before sending the audio to the hardware, or maybe in other places such as the per-app mixing?

    It would be kickass to be able to install a high-quality system-wide EQ for example…

  4. Chris, the guy who was the architect for most of our audio infrastructure (Steve Swenson) worked on audio at Apple before coming to Microsoft.

    We know what Apple’s doing with audio 🙂

    The big thing about ASIO isn’t the floating point, it’s the latency. I’ll be talking about how we’re achieving low latency some point in the future.

    Nicholas, the floating point state shouldn’t be a problem, because the engine runs in a separate process, and NT virtualizes the FP hardware.

  5. Anonymous says:

    Is there, or will there be, a way to find out whether a given piece of audio hardware or driver is a UAA/user mode one? I hope it will be very clearly visible in marketing whether the chip on the motherboard or card/external box will have all the possible benefits that Vista can offer.

    There should be some new, hard to get, logo that ensures that both drivers and hardware are top notch in every respect. VistaHD for consumer crap and VistaPRO for mastering quality converters/audio path and drivers that pass the most rigorous tests in stability and audio quality when under heavy load.

  6. Aha! I thought this conversion was being done user mode and in-process. Thanks for clearing that up.

  7. Anonymous says:

    Will there be any support for audio hardware acceleration or DSP’s (such as Creative’s X-Fi)?

  8. Chris, the current plan does not include support for hardware acceleration directly in the audio engine.

    There is a 3rd party extensibility story, I’ll be getting to that later on in the series.

  9. Anonymous says:

    Larry, I love this series, I hope you post every day!!!

  10. Anonymous says:

    What I’ve seen mentioned UI-wise so far is some kind of a panel with a list of apps and volume sliders. I’m wondering if the idea came up to add a volume slider to individual window title bars, control boxes or some other location, so that if I want to control an app’s volume I can always find the fader in the same spot without having to find it in the one big panel. Just an idea; obviously I have no first-hand knowledge of how the UI works, so what you’ve come up with may actually work better…

  11. Anonymous says:

    Off topic, but someone has to fix Windows Media Player to be 100% noise-less between track transitions. I still hear a slight tick intermittently with WMA and WMA Lossless, and it shouldn’t be happening at all… especially on Vista.

    Also, support of gapless MP3 would be nice, too, using the LAME MP3 Xing header enc_delay & enc_padding to compensate.

  12. Anonymous says:


    This comment has got nothing to do with this particular blog. But I want to know if you know any other 20+ year people at Microsoft with a blog. The age and experience definitely add wisdom and I want to read more of you guys.



  13. Anonymous says:

    Sudarshan, here’s another one:

    I only recently found out about these blogs. I have gone back through a few of them, and read everything they’ve ever written. This is a good investment of time IMHO.

    Another approach is to open (the generic URL) daily, and see what comes up there.
