Windows Media Player does not negotiate media types for DMO DSP plug-ins on Vista

Here is an interesting issue I ran across while writing custom DSP plug-ins for Windows Media Player (WMP). When I am prototyping a new filter or plug-in for DirectShow or WMP, I usually write my transform’s algorithm against a single format / media type. Once I have the algorithm working with that one media type, I’ll look into supporting others. In the past I have usually supported only a few “compatible” types.

What I found is that when I wrote a custom WMP DSP plug-in (using the plug-in wizard) and hard-coded support for only a single media type (WAVE_FORMAT_PCM for audio, MEDIASUBTYPE_YV12 for video), WMP would never load the plug-in. I debugged the issue and found that WMP was always trying to connect with one particular media type. This was the case for both audio and video. When playing back WMA or WMV content, WMP always tried to use IEEE float for audio (WAVE_FORMAT_IEEE_FLOAT) and NV12 for video (MEDIASUBTYPE_NV12).

This is certainly not the behavior that I expected. In fact, on XP, WMP has no problem negotiating different media types for a custom plug-in. It took a very long time, but I was finally able to get an answer from a guy on the WMP team. In short, this behavior is “by design”. From what I understand, the developers made this change after seeing numerous reports of performance problems with custom DMO plug-ins.

This is what I found: WMP builds the graph first, and then tries to insert the plug-in(s) into the filter chain. After negotiation, WMP will “lock” into using the negotiated media type. In other words WMP will only use the media type that was previously negotiated between the decoder and renderer. In most cases (but not all), this will be NV-12 (video) or IEEE (audio). If the plug-in wants to be inserted into the chain, it needs to support the format that’s already been negotiated. When WMP tries to add the plug-in to the graph, the WMP graph manager uses “connect direct”. This has the effect of bypassing intelligent connect. Because of this, additional filters will not be inserted into the graph. This is why we don’t see a color-space converter being added to the graph to try and facilitate the negotiation between the filters.
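To make the consequence concrete, here is a minimal sketch of the kind of type check a plug-in performs when the host proposes an input type (for a DMO, the logic behind IMediaObject::SetInputType). It is not the wizard's generated code. `AudioFormat` is a hypothetical, trimmed-down stand-in for WAVEFORMATEX so the example compiles without the Windows SDK, and both function names are made up for illustration. The format tag values are the ones defined in mmreg.h. The 16-bit and 32-bit sample sizes are assumptions chosen for the sketch.

```cpp
#include <cstdint>

// Format tag values as defined in the Windows SDK header mmreg.h.
constexpr std::uint16_t kWaveFormatPcm       = 0x0001;
constexpr std::uint16_t kWaveFormatIeeeFloat = 0x0003;

// Hypothetical stand-in for the two WAVEFORMATEX fields this sketch needs.
struct AudioFormat {
    std::uint16_t formatTag;
    std::uint16_t bitsPerSample;
};

// The prototype check described in the post: only 16-bit PCM is accepted.
// With direct connection and no converter inserted, a float proposal is
// rejected and the plug-in is never loaded.
bool AcceptsPcmOnly(const AudioFormat& f) {
    return f.formatTag == kWaveFormatPcm && f.bitsPerSample == 16;
}

// A check that also accepts the type the graph has usually locked into.
// WAVE_FORMAT_EXTENSIBLE is left out here because handling it means
// inspecting the SubFormat GUID, which this stand-in struct does not carry.
bool AcceptsNegotiatedTypes(const AudioFormat& f) {
    switch (f.formatTag) {
    case kWaveFormatIeeeFloat:
        return f.bitsPerSample == 32;  // assumed: 32-bit float samples
    case kWaveFormatPcm:
        return f.bitsPerSample == 16;  // assumed: 16-bit integer samples
    default:
        return false;
    }
}
```

With the first check, a proposal of IEEE float fails and there is nothing downstream to rescue it. With the second, the same proposal succeeds and the plug-in can be inserted.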

The key reason behind this design is performance. If the inserted plug-in requires a format that the upstream and downstream filters don’t support, and a color-space converter is added to make the connection work, additional processing power is required. It is very likely that a color-space converter will need to be added both before and after the plug-in. If each plug-in in the chain requires a color-space converter in order to function, there could be dozens of color-space converters in the graph (depending on the number of plug-ins). Since color-space conversion itself cannot be accelerated, this convoluted topology would cause extremely high CPU usage. Because of this, the decision was made to require plug-ins to support the negotiated format.

Arguably, the plug-in may now have to do the color-space conversion itself, which could still lead to high CPU usage. That is why it is recommended that the plug-in’s algorithm be optimized to work directly on the NV12 or IEEE float formats. These are very efficient, standardized formats closely related to the existing YUV and PCM formats, so only minor changes to the plug-in’s overall algorithm should be necessary to support them. However, to be on the safe side and guarantee that your plug-in will always get loaded, you should support all of the formats that WMP may try to use. Here is a list:

Audio:

WAVE_FORMAT_IEEE_FLOAT
WAVE_FORMAT_EXTENSIBLE
WAVE_FORMAT_PCM

Video:

MEDIASUBTYPE_NV12
MEDIASUBTYPE_YV12
MEDIASUBTYPE_YUY2
MEDIASUBTYPE_UYVY
MEDIASUBTYPE_RGB32
MEDIASUBTYPE_RGB24
MEDIASUBTYPE_RGB555
MEDIASUBTYPE_RGB565
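
As an illustration of why moving from YV12 to NV12 should only need minor changes, here is a small sketch that computes where the chroma samples for a pixel live in each layout. Both are 4:2:0 formats with a full-resolution Y plane. NV12 follows it with a single plane of interleaved U/V pairs, while YV12 follows it with a separate V plane and then a U plane. The `Yuv420Layout`, `ChromaRef`, and `ChromaOffsets` names are hypothetical, and the sketch assumes a tightly packed frame (stride equal to width) with even dimensions. Real buffers may use a larger stride.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for this sketch; not part of any SDK.
enum class Yuv420Layout { NV12, YV12 };

struct ChromaRef {
    std::size_t uOffset;  // byte offset of the U sample in the frame buffer
    std::size_t vOffset;  // byte offset of the V sample in the frame buffer
};

// Returns the offsets of the U and V samples that cover pixel (x, y).
// Assumes stride == width and even width/height.
ChromaRef ChromaOffsets(Yuv420Layout layout, int width, int height,
                        int x, int y) {
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    assert(x >= 0 && x < width && y >= 0 && y < height);

    const std::size_t w = static_cast<std::size_t>(width);
    const std::size_t h = static_cast<std::size_t>(height);
    const std::size_t ySize = w * h;   // size of the luma plane
    const std::size_t cx = static_cast<std::size_t>(x) / 2;
    const std::size_t cy = static_cast<std::size_t>(y) / 2;
    const std::size_t cw = w / 2;      // chroma plane width
    const std::size_t ch = h / 2;      // chroma plane height

    if (layout == Yuv420Layout::NV12) {
        // One interleaved plane: U0 V0 U1 V1 ..., each row is `w` bytes.
        const std::size_t base = ySize + cy * w + cx * 2;
        return {base, base + 1};
    }

    // YV12: planar, V plane first, then U plane.
    const std::size_t v = ySize + cy * cw + cx;
    const std::size_t u = ySize + cw * ch + cy * cw + cx;
    return {u, v};
}
```

An algorithm that already addresses chroma through a helper like this only has to switch the layout argument. The luma plane is handled identically in both cases.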