Why doesn’t Vista expose individual pairs of channels as separate endpoints?

One of the questions that was asked after EHO and my Channel 9 video aired was:

is there an ability in vista to set on which speaker you hear an application? for example i want the media player on the 2 front speakers and the game on the 2 back speakers? if it’s possible, is there also a possibility to set the volume for each application on a per speaker base?

With a followup comment (from someone else) of:

Last I checked most audio cards were either full of stereo outputs, for front, rear, side etc. You are saying that these stereo output pairs, one of which could be utilized as headphone output instead of “side speaker pair 7 & 8” (so many outputs are common in sound cards, but so many speakers are not common in the average home computer), should be utilized only as one end point instead of how the user actually thinks about them?

The answer to both of these is: It’s an interesting idea, but has some technical issues.

First off, many audio solutions don’t have multiple sets of stereo jacks in the back.  Multi-channel solutions come in lots of forms, such as S/PDIF, USB, 1394, etc, most of which support multiple audio channels without separate jacks.

Secondly, such a solution isn’t technically feasible given the way that the audio engine is architected – at a minimum, the fact that each endpoint has its own preferred audio format makes this quite difficult.

The other problem with this idea is discoverability.  Assuming that it was possible to work past the architectural issues, before you enable such a feature, you’ve got to tell the user about it.

Clearly, out-of-the-box, each multi-channel audio adapter needs to be its own endpoint.  Otherwise owners of 5.1 audio systems will be a smidge peeved to find out that their 5.1 audio solution can’t render 5.1 content.  So this “each channel is an endpoint” idea has to be an opt-in.

Now how do you describe such a feature to the user?  How do you tell them “Make the rear left and right speakers their own endpoint”?  How do you get a user to hook that up correctly?  And how do you deal with the multi-channel audio solutions I mentioned above?  For those audio solutions, the “make each channel a separate endpoint” option results in a HORRIBLE user experience, so we would want to disable this option when you’re running in one of those configurations.

Adding to this, there are a significant number of audio applications that don’t handle multiple audio devices – they simply assume that audio device 0 is the one that they want to use.
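To make the "device 0" problem concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the endpoint lists, their ordering, and the `pick_device_naive` helper are illustrations only, not a real Windows audio API. It simply models what could happen to an application that hard-codes index 0 if one multi-channel adapter were split into several endpoints.

```python
# Hypothetical model: endpoints are plain strings, not real Windows audio objects.

# Today: one multi-channel adapter is exposed as one endpoint.
endpoints_single = ["5.1 speakers (all six channels)"]

# The proposed opt-in: each pair of channels becomes its own endpoint.
# The ordering below is an assumption made purely for illustration.
endpoints_split = [
    "front left/right",
    "center/subwoofer",
    "rear left/right",
]


def pick_device_naive(endpoints):
    """Mimic an app that assumes audio device 0 is the one it wants."""
    return endpoints[0]


# With a single endpoint, the naive app still reaches every speaker.
print(pick_device_naive(endpoints_single))

# With split endpoints, the same app ends up confined to whichever pair is first.
print(pick_device_naive(endpoints_split))
```

In this toy model the application's code never changes, yet its output shrinks from the whole speaker set to one pair as soon as the split is enabled.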

And finally, you’ve got to consider the number of customers this would benefit.  It’s my understanding that the market penetration of multi-channel (more than 2) audio solutions is somewhere around 1%.  That means that of the 500,000,000 Windows customers, only somewhere around 5,000,000 of them could take advantage of this feature.  And of the 5,000,000 or so people with multi-channel sound solutions, how many of them have only 2 speakers plugged into them?  I suspect that number is significantly smaller.
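The arithmetic behind that estimate is short enough to write out. The sketch below uses only the two figures quoted above (both rough estimates) and integer math; the variable names are mine. It yields an upper bound, since the share of those users with only 2 speakers attached is unknown and can only make the number smaller.

```python
# Figures quoted in the post; both are rough estimates, not measurements.
windows_customers = 500_000_000
multichannel_percent = 1  # "somewhere around 1%"

# Upper bound on the users who could take advantage of the feature at all.
multichannel_users = windows_customers * multichannel_percent // 100

print(f"{multichannel_users:,}")
```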

So this feature is likely to benefit only a tiny fraction of the Windows customers.  Adding features, especially extremely complicated ones like this, has huge costs associated with it – you’ve got to add tests for the feature, documentation, UI, localization, etc.

There’s an overarching thread throughout the Vista audio stuff – we’re trying to make things easier for the vast majority of users and we’re actively trying to reduce the complexity of the system.  This feature doesn’t do any of that; instead, it seems to me that it introduces a great deal of complexity that will benefit a very small subset of the Windows user base.

IMCHO, this idea seems to be a complete non-starter to me – there are too few people who will take advantage of it and it has the potential of messing up a significant number of people’s machines.

Comments (18)

  1. CN says:

    This leads to an even simpler question (perhaps already covered in the video or elsewhere): is there individual L<->R balance for each application? What about individual back<->front balance?

    If individual L<->R is lacking, I think that is a more serious omission, as lots of people have two speakers and the UI issues should be significantly simpler.

  2. CN, actually we’re not aware of ANY situations in which it’s appropriate for an application to control its balance. Having said that, we do support individual channel volumes for applications, but it is STRONGLY recommended that apps don’t use it.

    Could you please explain a situation where PER-APP balance is important? I can see where PER-SYSTEM balance is important, but not PER-APP.

    Why would you want your Outlook beeps coming out of the left speaker but messenger coming out of the right?

    How many multimedia playback applications support balance controls at all? I can’t think of any off the top of my head. They all support a single simple volume control, but none of them support per-channel volume.

  3. Joku says:

    I do not see where system wide or per app BALANCE is of any use.

    What I would LOVE to see is system wide PANNING.

    Balance is mostly very useless feature since if you have stereo sound, it will sound wrong if adjusted by balance control instead of pan control.

    The balance controls in Windows at some point were actually a pain in the *ss, since you couldn’t easily put the balance in sndvol32 to center (at least visually it used to be always one way or another); thank goodness this seems to be fixed now. However, they are still useless when you need them, since they do not work like the average user would presume.

    And with regard to the actual subject: I somewhat agree as far as digital outputs go; however, if you look at many of the mainboards today, they do tend to come with multiple analog stereo outputs, and in that case for me it is easier to think about having multiple stereo outputs, not some left rear sub etc. If I had a media center PC I might expect to have such, but for a normal PC used for more purposes than just listening to surround sound it seems more natural to think that you could plug headphones into the front port in your PC case, which has a headphone icon next to the jack, and have your other speakers plugged in at the back. Of course there may be a few vendors which disable the rear jacks if you use the front jacks, but isn’t that just bad design? Headphone jack on front and some disabled jacks on the rear and you need to open the case to switch those. I’ve assembled a few of these myself and it was a hard choice what to disable.

  4. Joku says:

    > Why would you want your Outlook beeps coming out of the left speaker but messenger coming out of the right?

    Brings up an old question which has been asked before but maybe now we can get the answer?

    In Vista, can I set bleeps beeps and IE/Flash noises along with voice apps to use the headset and have music/video apps to use another endpoint? If you can control per app volume, certainly you can select which endpoint that is going per app, even if the app doesn’t really support that. Right? Please?

  5. Andrew W says:

    Agreed on all points!

    I think a bigger looming question is concurrent outputs. Last I heard, an app would have to explicitly render to each output device. Currently, systems render to all outputs concurrently – apps don’t need to do anything and the audio is routed to speakers, headphones and the digital output. This is a huge boon because it means the end user doesn’t need to do any system configuring to hear their audio – they just need to plug *something* in *somewhere*. If this functionality is going away, then I foresee a huge number of new support calls.

    (Don’t mean to sound like I’m bashing the new architecture – I’m actually *very* excited about moving all our DSP code up to user mode, making the system more glitch-free, and most everything else!)

  6. Joku, we’re exposing the volume controls that the audio solution implements. If it can do pan, we do pan (we actually expose separate sliders for the left and right channels).

    And yes, you can have the beeps and IE flash noises go to the headset and have your multimedia stuff go to the 5.1 speakers. Actually you can do that in XP, but it’s buried deep in the control panel. We can’t do per-app endpoint defaults (again, it’s too complicated to explain that to the user) but we CAN direct system sounds to different targets than multimedia playback.

  7. Joku says:

    I fully support simplicity. But at the same time I hope your architecture isn’t limited by it, and if some vendor sees value in having an audio solution that is seen by the OS as multiple endpoints, but defaulting to a single surround endpoint, that vendor can implement such a solution. But it would have been so much cooler if you had architected the system to allow current surround hardware to be utilized in more ways than it was designed for, and then allowed this to be taken advantage of by the flip of some switch that exposes a per-channel control and grouping of those channels to stereo, along with individual software DSP for the channel or channel groups.

    "If it can do pan, we do pan (we actually expose separate sliders for the left and right channels)."

    Doing panning is a much greater challenge than doing balance. I do not know of any audio solution doing this. That is why I had hoped that Vista would hide the hardware panning capability, since either it does not do panning, or if a couple of solutions do, it makes the user experience different among different solutions, and the user does not realize this except by listening carefully to the implementation or using test material. The most sensible solution would have been to forget about the hardware balance and have full panning and EQ support in the software mixer.

    A nice way of exposing all the control is to first give a very reduced and simple set of controls, perhaps just the master volume that simply abstracts every individual channel and so on to a single slider. Then have some checkbox to expand the mixer to show stereo channels, and those could be expanded to individual mono channels. Also a different checkbox for EQ.

    The professional audio software has done it like this for years, and as long as one would not have to deal with the various ways of routing effects and plugins, the system is pretty easy to learn. If Vista had had a by-default hidden full-blown software mixer and an API to add functionality to it, software would not have to keep reinventing the wheel every time it wanted a little more than just the most basic controls. Software sound sources would be easily presented in this system-wide full mixer as additional channels, though hiding the EQ and other advanced stuff by default.

    It seems to me you designed the UI by the lowest common denominator instead of allowing to grow up to more advanced features if one happens to need them. Also you succeeded in leaking the hardware implementation in the balance/panning equation through your abstraction.

    All in all, it looks good from the developer side of things, so there is hope for IHVs just dumping the Windows mixer altogether and implementing their own funky looking mixers (yet again).

  8. Joku says:

    Actually, maybe I’ll take some of the previous back. While possible, it would be too much to ask for the OS to implement an audio architecture that would be flexible enough to cater to the needs of a professional application.

    However, it is still an interesting idea. I am quite tired of the fact that every application that tries to do something with audio has its own mixer implementation, and to get audio from one application to another you have to beg and beg and beg the ISVs to license a solution for that or get a buggy driver that tries to patch this need in Windows. I’ve tried them all, and it seems that from a musician’s standpoint Vista will only bring lower latency, not ease of use.

  9. Gabe says:

    My car has a GPS nav system integrated with the radio. When the navigation directions are "read" by the system, the radio output from the front speakers is replaced by the nav system output. The rear speakers may also get softer, but still only play the radio.

    This seems like a perfectly good reason for an application to control its balance. How would an application like this be implemented in Windows Vista Embedded?

  10. Mike Dimmick says:

    I realise that this is I-am-not-the-universe type information, but both my home and work PCs support multichannel output, on multiple jacks, on the motherboard. However, my home one – C-Media chipset – only supports it by remapping what the three onboard jacks do – turning the Mic and Line In sockets into the separate line outs. Some of the work computers have that too – we had a devil of a time getting Skype to work until we realised that the default was 5.1 output!

    At work I have two speakers connected; at home I currently have zero, and use headphones. I must get around to buying some speakers since there’s a lot of noise on the headphones, which I suspect is impedance mismatch since the output is really a Line Out and is therefore designed to be connected to the high-impedance input of an amplifier, not the approx. 32 Ohms of the headphones.

  11. Gabe, you’re assuming that the same app is playing music on the radio as is doing the voice navigation.

  12. Surge says:

    what would be really useful is a way to route sound sources from windows to the inputs/outputs of the board, something like this:


    don’t presume all windows users are dumb 🙂 people don’t like making backups and organizing their files but they like fiddling with gadgets

    I explained (and made some easy-to-use profiles) to people that don’t have a clue about computers and sound hardware and they are so happy now playing a game on a pair of speakers while the girlfriend listens to her winamp playlist on another pair, or having some speakers (and a mic) into the baby’s room and a separate pair of speakers next to the computer for other uses, or recording audio from a source directly to disk, while using the speakers for something else…and I repeat, these are people that don’t really understand (or care) what is the difference between folders and files, but now they think tech is cool again 🙂

    and from what I see, most audio solutions nowadays, including onboard sound, support at least 5.1

  13. Universalis says:

    "Assuming that the same app is playing music…".

    Larry – in the real world no-one uses apps. They use a computer.

    And it is certainly the same computer playing the music and speaking the voice commands.

    Your arguments may well have weight from the programmer’s point of view but you can’t use the "not-the-same-app" argument to convince the users who pay your wages that they shouldn’t want what they say they want.

  14. Gabe says:

    Larry, I think it’s safe to assume that the whole in-dash system is really just one single app. It may be implemented as multiple processes, but I can’t think of any embedded system that is composed of what I would consider multiple apps. Regardless, my only assumption here is that if multiple processes are involved, they are cooperating so that the nav process would be able to tell the radio process to mute the front channels.

    My question still remains, though: only making the assumption that the components fully cooperate with each other, how would this application be implemented for Vista?

  15. Joku says:


    That’s a great question and something I’ve wanted to do on multiple occasions. So to rephrase that post on the MSDN forums:

    Can you use the Vista Mixer to select the outputs/streams of for example Microphone, Winamp (or WMP) and possibly something else like VOIP app output, and then have Windows mix these together and present them as "in endpoint" so that any, even legacy app, can take this and record it to disk or play out as Internet radio broadcast (ice/shoutcast)?

    That is a pretty common scenario; ask anyone who has done DJing or podcasting. They are only a fraction of the current users for sure, but I’ll just claim that it is *only because it is so difficult*. Currently you have to dig for the combined mix slider that goes around with various names or is not implemented by the drivers. And if you are foolish enough to use it, you might find your Windows system bleeps and other unwanted sources mixing into the final result as well, along with the possibility of feedback if some setting is not just right.

    So it is pretty much a no-go scenario, even though it would be so simple if done purely with proper global software mixing and possibly a wizard to set up the ins and outs.