Multichannel Audio in Windows CE


Most of the infrastructure is in place to support multichannel audio in Windows CE, although the number of components that we ship to actually implement it is limited. In this blog I’ll cover the varying types of multichannel audio and what features are in place in Windows CE to support it.

For the purposes of this blog I’ll define multichannel audio to mean any audio stream containing more than two channels (i.e. anything beyond plain stereo). We’ll further subdivide multichannel audio into three types, differentiated by how the audio data gets from your CE device (e.g. Set Top Box, Smartphone/PPC, whatever) to your receiver:

1. Analog Matrix Decoders: In this type of decoding, multichannel audio is sent as left/right stereo data to the receiver. In a simple stereo receiver the audio can still be played through left and right speakers and will sound more-or-less correct. However, a receiver supporting the appropriate decoder can use cues placed in the audio to decode to more than two speakers and synthesize additional channels. The most well known decoders are Dolby Pro Logic and Pro Logic II.

2. Compressed audio over S/PDIF: S/PDIF was originally designed to support a maximum of 4 uncompressed PCM channels. While one might use this to pass four discrete audio signals to a receiver, today’s multichannel content typically has at least six channels (e.g. 5.1), and there’s no way to squeeze 6 uncompressed audio channels across S/PDIF. The solution has been to pass the compressed audio over S/PDIF and let the receiver decompress it (as long as the compressed data bandwidth is less than that which would be required for four uncompressed channels). Apart from enabling use of a single cable from the device to the receiver, this has the added benefit of offloading the audio decompression processing into the receiver. The downside of this architecture is that it relies on the receiver being able to correctly decode the audio data. This is complicated because S/PDIF is a one-way transmission mechanism, so there’s no way to query the receiver at runtime to determine what it supports. There are also potential timing/latency issues with lip sync.

Transferring compressed audio over S/PDIF typically involves massaging the compressed audio into a format that matches the S/PDIF frame format (e.g. by padding the data with zeros as needed) and adding some header information that lets the receiver figure out that you’re sending a compressed audio stream rather than PCM. Both WMAPro and Dolby AC3 have a spec for this. Almost every receiver in the world supports AC3 decoding. A small but growing number also support WMAPro (Pioneer in particular has spread WMAPro support to even the low end of its product line).


Info on WMAPro is here:
http://download.microsoft.com/download/5/b/5/5b5bec17-ea71-4653-9539-204a672f11cf/wmadrv.doc


Info on AC3-over-S/PDIF is here in Appendix B:
http://www.dolby.com/assets/pdf/tech_library/46_DDEncodingGuidelines.pdf

3. Multiple discrete audio outputs: In this type of connection multiple PCM audio channels are sent to the receiver. Until recently this has meant a separate RCA cable for each channel: a six channel (e.g. 5.1) signal would require six cables between components. HDMI has the potential to overcome this limitation by supporting 6 or more decompressed audio channels (and video) via a single cable.


Outputting to 6 DAC channels presumes that you’ve already got decompressed multichannel content, or that you’ve got compressed multichannel content (e.g. AC3 or WMAPro) that is going to get decompressed before being sent to the wave driver. The latter case is most likely, which means you’ll need the appropriate DirectShow decompression filter for CE.


Now, on to what CE supports (and doesn’t): 


Device Drivers 

If you want to support either S/PDIF or multiple discrete audio channels, you’ll probably want to start with the Ensoniq wavedev2 sample driver that shipped in the Windows CE 5.0 Networked Media Device Feature Pack under public\fp_nmd\common\oak\drivers\wavedev\wavedev2\ensoniq. (Note: everything in this feature pack was rolled forward to CE6, so nothing in it is missing from CE6, although it might be in a different place.)

This version of the Ensoniq driver has S/PDIF support built into it (the Ensoniq 1371 chip has a sort-of-undocumented S/PDIF mode which we take advantage of), and supports passing WMAPro-over-S/PDIF compressed data. Support for AC3-over-S/PDIF would be a fairly trivial modification.
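
As a rough sketch of the data-packing side of compressed audio over S/PDIF: each compressed frame is usually wrapped in an IEC 61937-style burst, i.e. a short preamble, then the payload, then zero padding out to a fixed repetition period. The constants below (sync words 0xF872/0x4E1F, data type 1 for AC3, a burst the size of 1536 stereo 16-bit samples) reflect my understanding of that convention and should be checked against the Dolby appendix referenced earlier. The function name and layout are hypothetical and are not taken from the Ensoniq sample driver.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative constants (assumptions; verify against the relevant spec).
const uint16_t kSyncPa = 0xF872;         // first sync word of the burst preamble
const uint16_t kSyncPb = 0x4E1F;         // second sync word of the burst preamble
const uint16_t kDataTypeAc3 = 0x0001;    // burst-info value commonly used for AC3
const size_t kAc3BurstBytes = 1536 * 4;  // one AC3 frame spans 1536 stereo 16-bit samples

// Hypothetical helper: wraps one compressed AC3 frame in a burst that occupies
// the same space as 1536 stereo 16-bit PCM samples. The result is a sequence of
// 16-bit words that would be handed to the S/PDIF output as if it were PCM.
std::vector<uint16_t> PackAc3Burst(const uint8_t* frame, size_t frameBytes)
{
    const size_t preambleBytes = 8;
    assert(frame != nullptr);
    assert(frameBytes + preambleBytes <= kAc3BurstBytes);

    std::vector<uint16_t> burst(kAc3BurstBytes / 2, 0);  // zero padding by default
    burst[0] = kSyncPa;
    burst[1] = kSyncPb;
    burst[2] = kDataTypeAc3;
    burst[3] = static_cast<uint16_t>(frameBytes * 8);    // payload length in bits

    // Pack the AC3 byte stream into 16-bit words, high byte first.
    for (size_t i = 0; i < frameBytes; ++i) {
        uint16_t& word = burst[4 + i / 2];
        if (i % 2 == 0) {
            word = static_cast<uint16_t>(frame[i] << 8);
        } else {
            word = static_cast<uint16_t>(word | frame[i]);
        }
    }
    return burst;
}
```

The receiver recognizes the preamble and treats what follows as a compressed frame, while a receiver that doesn’t understand it would simply see (and possibly play) noise, which is why the driver needs to be told explicitly when to use this mode.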


One other issue with passing compressed data over S/PDIF is that since the data isn’t decompressed to PCM until it gets to your receiver, there’s no way for you to programmatically control the volume or mix it with other PCM audio data. The former isn’t really a big issue (the user can always control the volume on their receiver). The latter doesn’t have a really great solution.


In the sample Ensoniq driver, whenever we’re playing compressed WMAPro out the S/PDIF port we just throw away any PCM data that we’re asked to
play so it’s never heard (although we maintain the appropriate playback timing, so from the application standpoint everything appears to behave as expected).
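
One way to picture “throw away the data but keep the timing” is a sink that never touches the hardware yet still advances its reported playback position at the stream’s byte rate. The class below is a hypothetical illustration of that bookkeeping only, not the Ensoniq driver’s code; a real driver would also have to pace buffer completions against a clock.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sink that discards PCM buffers but still reports how much
// playback they represent, so callers see normal stream progress.
class DiscardingPcmSink
{
public:
    DiscardingPcmSink(uint32_t sampleRate, uint16_t channels, uint16_t bitsPerSample)
        : m_bytesPerSecond(sampleRate * channels * (bitsPerSample / 8u)),
          m_bytesConsumed(0)
    {
        assert(m_bytesPerSecond > 0);
    }

    // Accepts a buffer and drops its contents; only the byte count is kept.
    void Write(const void* /*data*/, uint32_t bytes) { m_bytesConsumed += bytes; }

    // Position in bytes, as a wave driver might report it to the application.
    uint64_t PositionBytes() const { return m_bytesConsumed; }

    // The same position expressed in milliseconds of audio.
    uint64_t PositionMs() const { return m_bytesConsumed * 1000 / m_bytesPerSecond; }

private:
    uint32_t m_bytesPerSecond;
    uint64_t m_bytesConsumed;
};
```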


To support multichannel discrete outputs in the wave driver one would need to modify the driver to accept a WAVEFORMATEX structure which looks like a normal PCM format but for which the nChannels field is 6 (or more). This is not a trivial exercise, but should be pretty straightforward. As part of this, for wavedev2 one would have to rewrite the output.cpp file to add a new output stream class that accepts 6 streams, and modify the render functions that handle sample-rate-conversion to support all 6 channels.
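
To make the format side of this concrete, here is a minimal sketch of what a 6-channel PCM format looks like and how a driver might validate it. The struct is a stand-in that mirrors the fields of WAVEFORMATEX so the example builds without Windows headers; the helper names and the 16-bit-only restriction are my own assumptions, not wavedev2 code.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in mirroring the fields of WAVEFORMATEX; a real driver would use the
// structure from the Windows headers instead.
struct WaveFormatSketch
{
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;
    uint16_t cbSize;
};

const uint16_t kFormatPcm = 1;  // numeric value of WAVE_FORMAT_PCM

// Fills in a PCM format with the derived fields computed consistently.
WaveFormatSketch MakePcmFormat(uint16_t channels, uint32_t sampleRate, uint16_t bitsPerSample)
{
    assert(channels > 0 && bitsPerSample % 8 == 0);
    WaveFormatSketch wf = {};
    wf.wFormatTag = kFormatPcm;
    wf.nChannels = channels;
    wf.nSamplesPerSec = sampleRate;
    wf.wBitsPerSample = bitsPerSample;
    wf.nBlockAlign = static_cast<uint16_t>(channels * (bitsPerSample / 8));
    wf.nAvgBytesPerSec = sampleRate * wf.nBlockAlign;
    wf.cbSize = 0;
    return wf;
}

// Hypothetical driver-side check: accept 16-bit PCM with up to maxChannels
// channels, and only if the derived fields are self-consistent.
bool IsSupportedPcmFormat(const WaveFormatSketch& wf, uint16_t maxChannels)
{
    return wf.wFormatTag == kFormatPcm
        && wf.nChannels >= 1 && wf.nChannels <= maxChannels
        && wf.wBitsPerSample == 16
        && wf.nBlockAlign == wf.nChannels * (wf.wBitsPerSample / 8)
        && wf.nAvgBytesPerSec == wf.nSamplesPerSec * wf.nBlockAlign;
}
```

For a 48 kHz, 16-bit, 6-channel stream this gives a block alignment of 12 bytes per frame, which is the number the render and sample-rate-conversion code would have to step by instead of the stereo value of 4.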


Note that the kernel software mixer only supports stereo streams, so it won’t do any multichannel mixing for you. This is one reason wavedev2 is probably a good starting place, as it already has code built into it to mix stereo streams which could be extended to more channels.
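
As an illustration of why extending a stereo mixer to more channels is mostly mechanical: if every stream uses the same interleaved layout, mixing is a per-sample saturating add regardless of channel count. The function below is a generic sketch under that assumption (same sample rate, same channel count, 16-bit samples) and is not the wavedev2 mixer itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Adds `src` into `dst`, clamping each result to the 16-bit range. Both buffers
// hold `frames` interleaved frames of `channels` samples each.
void MixInterleaved16(int16_t* dst, const int16_t* src, size_t frames, size_t channels)
{
    assert(dst != nullptr && src != nullptr);
    const size_t samples = frames * channels;
    for (size_t i = 0; i < samples; ++i) {
        int32_t sum = static_cast<int32_t>(dst[i]) + src[i];
        if (sum > INT16_MAX) sum = INT16_MAX;
        if (sum < INT16_MIN) sum = INT16_MIN;
        dst[i] = static_cast<int16_t>(sum);
    }
}
```

The harder part in practice is mixing streams with different channel counts or sample rates, which needs channel mapping and rate conversion before this step.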


DirectShow Filters


DirectShow is Windows CE’s media processing infrastructure. The architecture is media-type agnostic, meaning that there’s nothing in the overall design that makes it support one type of media any better than another. A number of outside customers are working on multichannel audio products using their own DirectShow filters (or filters they licensed from third-parties). The description below only discusses what Microsoft currently ships with CE5 and CE6. 


WMAPro-over-SPDIF filter: The abovementioned Feature Pack also includes a WMAPro-over-SPDIF DirectShow filter to massage WMAPro data into a format which can be sent over S/PDIF. Used in conjunction with a wave driver that supports WMAPro-over-SPDIF content and a receiver which supports decoding WMAPro, this allows the best decoding quality and performance. To be honest, this isn’t currently a terribly common scenario given the limited WMAPro receiver penetration in the market; we did this partly as a proof-of-concept, partly to support our own (Microsoft) technology, and partly because all the pieces were available to us within the company so it wasn’t a major development effort. In addition, the architecture and driver changes are applicable to other more common formats (e.g. AC3); although we don’t currently ship any explicit support for AC3 streams, OEMs have implemented AC3 support using a similar set of components based on some of this work.


WMAPro decoder: Windows CE includes a WMAPro decoder to decode 5.1, 6.1, and 7.1 compressed content. However, when CE5 was first shipped all our existing customers were still using stereo outputs, so there was no value in passing the discrete channels down to the wave driver. Therefore, while the version that shipped in CE5 decodes all the discrete channels internally, it downmixes them to stereo for output, and there is currently no way to get the discrete channels out of the WMAPro decoder. The NMD feature pack improved on this situation by introducing matrix encoding into the downmix algorithm: a receiver supporting Pro Logic or Pro Logic II should be able to make use of this information to partially regenerate the discrete channels which were lost during the downmix. We’ll look into improving this situation if there’s sufficient customer demand.
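
To give a feel for what a matrix-style downmix does, here is a heavily simplified sketch. It folds the center channel equally into both outputs and folds the surround channels in with opposite polarity, which is the cue a matrix decoder looks for. The channel order, the coefficients, and the omission of the surround phase shift and LFE handling are all my simplifications; this is not the algorithm used by the WMAPro decoder.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Rounds to the nearest integer and clamps to the 16-bit sample range.
static int16_t ClampToInt16(float v)
{
    if (v > 32767.0f) return 32767;
    if (v < -32768.0f) return -32768;
    return static_cast<int16_t>(std::lround(v));
}

// Downmixes one interleaved 5.1 frame to two channels.
// Assumed input order: L, R, C, LFE, Ls, Rs. The LFE sample is dropped.
void DownmixFrameToLtRt(const int16_t* in, int16_t* out)
{
    assert(in != nullptr && out != nullptr);
    const float kMinus3dB = 0.7071f;
    const float left = in[0];
    const float right = in[1];
    const float center = in[2];
    const float surround = 0.5f * (static_cast<float>(in[4]) + in[5]);

    // Center goes to both sides in phase; surround goes in with opposite signs.
    out[0] = ClampToInt16(left + kMinus3dB * center - kMinus3dB * surround);
    out[1] = ClampToInt16(right + kMinus3dB * center + kMinus3dB * surround);
}
```

On a plain stereo receiver the result still plays as ordinary left/right audio; a matrix decoder can use the in-phase and out-of-phase content to steer sound back toward the center and surround speakers.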


Dolby AC3: Dolby AC3 is probably the most common/popular multichannel format. Microsoft doesn’t currently ship a Dolby AC3 DirectShow decoder, although there are probably lots of third party companies that produce such a thing and there may be open source versions (google “ac3filter”).


That’s all I’ve got for now. Please let me know if you found this useful, if there were any errors, or if you have any questions.


Responses to comments (if I misunderstood anyone’s question, please let me know):


1. How can I play back audio content simultaneously to both analog audio jacks and S/PDIF? (Ianbing)


If I understand correctly, you’re trying to play the same audio content over two connections simultaneously (one RCA analog audio jack, and one S/PDIF jack). Assuming that’s correct:
– If your audio hardware can simultaneously send a single audio stream over both connections, have the audio driver handle it internally and just expose a single device at the waveapi level.
– If you have two separate pieces of audio hardware (one to handle analog, the other for S/PDIF), you’ll need to split the PCM output of the decoder (using a Tee filter; I think there’s one under public\directx\sdk\samples\dshow\filters\inftee) and hook each output of the tee to its own wave renderer.


The latter design causes an additional problem because you’ll need a way to tell each renderer which audio device to play back to. To do this, you’ll need to hand-construct the graph, get pointers to each of the two audio render filters, and tell each wave renderer which device ID to play to. I don’t believe I’ve ever tried this, but it should be possible by creating an IPropertyBag object (I think you’ll have to roll-your-own, but it’s not too difficult), setting the “WaveOutId” property to the ID you want to use, and passing that property bag to the IPersistPropertyBag interface on the wave renderer.


Your code would look something like this (sorry, I haven’t compiled/tested this):


    // CPropertyBag is your implementation of the IPropertyBag interface.
    // We might have a public sample of this (search for cpropertybag.cpp), but I’m not sure.
    CPropertyBag PropBag;

    // Set up your desired device ID
    VARIANT var;
    VariantInit(&var);
    var.vt = VT_I4;
    var.lVal = <desired device ID>;

    // Write the desired ID to your property bag
    HRESULT hr = PropBag.Write(L"WaveOutId", &var);

    // Find the waveout renderer in the graph that you want to talk to…
    …

    // QI for the IPersistPropertyBag interface… something like this…
    IPersistPropertyBag *pPersistPropertyBag = NULL;
    hr = pWaveOutFilter->QueryInterface(IID_IPersistPropertyBag, (void **)&pPersistPropertyBag);

    // Pass the property bag into the wave renderer, then release the interface
    if (SUCCEEDED(hr))
    {
        hr = pPersistPropertyBag->Load(&PropBag, NULL);
        pPersistPropertyBag->Release();
    }



Deep inside the Load call, the waveout renderer will do something like this with the PropBag pointer you passed in:


    VARIANT var;
    var.vt = VT_I4;
    HRESULT hr = pPropBag->Read(L"WaveOutId", &var, 0);
    if(SUCCEEDED(hr))
    {
        m_iWaveOutId = var.lVal;
    }


 

Comments (10)

  1. This is my first blog post, so please feel free to leave feedback with questions or comments, especially

  2. writetothiru@gmail.com says:

    Hi Andy,

    I guess "WMAPro-over-SPDIF dshow filter"  is a renderer/sink filter.

    Is the source code for this filter available in either NMD FP or Windows CE 6.0?

    This filter is used in conjunction with a wave driver that supports WMAPro-over-SPDIF content . In the middle of WMAPro playback, if a A2DP headset is plugged in then A2DP wavedev2 driver becomes the default wave device. How does the playback takes place in this case?

  3. Andy Raffman says:

    The WMAPro-over-SPDIF filter isn’t a renderer/sink. It’s a transform filter that converts from WMAPro to WMAPro-over-SPDIF.

    Sorry, I don’t believe we ship source code to the filter as part of the NMD FP (we do ship the binary).

    When DShow constructs the filter graph, it looks at all the possible topologies before it chooses one. For example, when playing WMAPro, it might see the following:

       (WMAPro) -> WMAPro Decoder -> (PCM) -> Wave renderer -> Wave driver

       (WMAPro) -> WMAPro-over-SPDIF Transform -> (WMAPRO-over-SPDIF) -> Wave renderer -> Wave driver.

    DShow will only _see_ the second option if the wave driver claims it supports WMAPRO-over-SPDIF format (by succeeding an attempt to open this format with the WAVE_FORMAT_QUERY flag set). In other words, the wave driver ultimately controls which mechanism DShow will use.

    (One bit I left out: when DShow sees more than one valid topology, it chooses the one with the best “merit”, which is controlled through registry settings. For the moment assume the SPDIF path has higher merit than decoding WMAPro internally).

    The case with A2DP is pretty rare, since that would typically make sense on a Smartphone or PPC, and it’s exceedingly unlikely we’ll see a Smartphone/PPC with a S/PDIF output playing WMAPro. However, anything is possible in the future, so I’ll take a shot at what _should_ happen:

    When the BT A2DP driver becomes associated, it becomes the “default wave device”. For the moment this is done with a hack that moves it to device 0, although this might change in the future. If one wants audio to auto-magically get rerouted to the new device while it’s playing, the application needs to monitor this notification. When the application receives the notification, it needs to take one of two actions:

    1. Stop and restart the filter graph. A side effect of this will be to force the wave renderer to close and reopen the default wave device. The pro of this method is that it’s fast. The con is that it probably won’t handle the case of switching from a wave driver that supports SPDIF to one that doesn’t.

    2. Tear down the filter graph and rebuild it. The pro of this is that it can do a full rebuild of the topology and can switch between drivers that do/don’t support WMAPro-over-SPDIF. The con is that it might take a while.

    Ideally an application might try method 1, and if that fails, switch to 2.

    I hope that sheds some light on things; let me know if it’s too confusing (it’s getting late here and I’m tired).

  4. writetothiru@gmail.com says:

    Hi Andy,

    Thanks a lot for your clean and clearcut response.

    In the approach1 (i.e. stop and restart the graph), stopping would mean resetting the stream position to zero. So the application has to note the current stream position before stopping. Afterwards it has to re-set the stream position to saved value, before restarting.

    What happens if there is a media stream which cannot be indexed i.e. seeking is not possible due to lack of index table?

  5. Andy Raffman says:

    I think that pausing the stream (rather than stopping it) may be sufficient to force the audio renderer to close/reopen the wave device, but I’ll need to go back and look at the code.

  6. Ianbing says:

    Does anybody know below case and how to do? thank you.

    (MPEG Auido)->decoder ->(PCM)->Wave renderer->driver-RCA port.

                                         ->(PCM)->PCM over SPDIF filter->wave renderer-driver->S/PDIF port

  7. riverleee says:

    Hi, Andy,

    I installed Windows CE 5.0 Networked Media Device Feature Pack, but I can’t find the folder public\fp_nmd\common\oak\drivers.

    Does anything else I missed?

    Thanks!

    BR,

    LiJiang

  8. P. Bossart says:

    A multichannel stream is really made of interleaved samples, with the channel set-up described in a WAVEFORMATEXTENSIBLE structure. As such there is no need to add a new output stream class to accept 6 streams as suggested.

    What would be needed is a means to map these channels onto the physical outputs corresponding to the relevant speakers. In addition there would be a need to up- or down-mix so that a stereo stream is also played on all available speakers. The desktop windows edition does all this, not sure why WinCE has not evolved and followed suit.

  9. mtpdev says:

    Hi Andy,

    I am writing a DirectShow Camera application.

    I am loading the camera driver using the following code.

    varCamName = _T ("CAM1:");

    hr = PropBag.Write( L"VCapName", &varCamName );

    hr = pPropertyBag->Load( &PropBag, NULL);

    It starts well for the first time, but after destroying the graph, if the capture graph is started for the second time it gives the following error..

    //hr fails with -2147024841 (ERROR_DEV_NOT_EXIST)

    What could be the reason ?

    Can u give some solution ?

    Thanks in advance,

    Sitharth.

  10. calvin.hung says:

    Hi, Andy,

    I met the same problem as riverlee had.

    I can’t find the folder public\fp_nmd\common.

    I downloaded and installed the followings:

    setup.exe

    Shared_Sources.msi

    WinCEPB50-Product-Update-Rollup-Armv4I.msi

    WinCEPB50_NMDFP-071231-Product-Update-Rollup-Armv4I.msi

    WinCEPB50_NMDFP-080131-2008M01-Armv4I.msi

    WinCE_NMD_FP.msi

    Anything wrong or missed?

    Thanks!
