Windows Media Metadata in Managed Code (and, by Extension, Silverlight)

I have encountered several inquiries about accessing the metadata embedded in Windows Media file headers from Silverlight. Unfortunately, Silverlight does not expose this information out of the box. The only way to access it is through the Windows Media Format SDK.

The Format SDK exposes its functionality through COM and requires you to write your code in C++. The nice thing about Visual C++ is that the compiler lets you author mixed-mode assemblies, i.e. a single assembly containing both CLR types and completely unmanaged code, both authored in C++. Using this feature, you can author a set of CLR-targeted wrappers in C++ that interact with the Format SDK to expose select metadata to the managed world. Once that step is done, you can host this code in a Windows Service, a WCF service, or some other server-side component accessible over HTTP/TCP, and then have your Silverlight code call that service to get the metadata over the wire.

Attached is some code I wrote recently that should get you started in that direction. You will find the code here.

In the rest of the post I will give you some details on the code. To better understand (and modify if necessary) the source, you may need some familiarity with authoring managed code with C++, the basics of COM in C++, and the Windows Media Format SDK.

Windows Media Files contain a lot of information in the header. The requests I have most often encountered are for the following:

  • File-Level Metadata Attributes and Per-Stream Metadata Attributes: A Windows Media file can contain multiple streams of different types, including video streams, audio streams, text streams, etc. The header may contain metadata attributes defined for the entire file, as well as for each individual stream in the file.
  • Frame Rate for the video streams in the file
  • Embedded SMPTE Timecode ranges for the video streams

In the attached solution, you will find three projects – the WMShim and the WMShimManagedClasses projects produce assemblies that you will need to reference in your code to extract the metadata from the file. The third project named WMShimTestHarness contains a simple WPF application that shows you how to extract metadata using the classes in the previous two assemblies.

The class diagram below shows the managed types that I use to define the metadata that the C++ code extracts (these types are defined in C#, in the MetadataDataContracts.cs file in the WMShimManagedClasses project). Since the goal is to extract the metadata and make it accessible remotely, these types are defined as data contracts so that they are easily serializable.

  [Figure: class diagram of the managed metadata types]

The WMMetadataAttribute type defines a single metadata attribute and its value. The Name property contains the actual Windows Media metadata field name, and Label (where possible) contains a friendlier name explaining the field. PropertyType contains the string name of the CLR type best suited to represent the Value in managed code, and Unit (where applicable) contains a string literal defining the unit of measurement (like bps for bitrate).

The WMStream type defines a stream, and the Attributes property returns a collection of WMMetadataAttribute instances for a specific stream or at the file level.

  • StreamType contains one of the following values: “File”, “Video”, “Audio”, “Text”, “Script”, “Image”, “File Transfer” or “Unknown”. Each of these string literals is meant to represent a valid media type or subtype, as described by the Format SDK (see https://msdn.microsoft.com/en-us/library/dd757532(VS.85).aspx ), that can be used to define a stream type. The exceptions are “File”, which describes file-level metadata (it is not actually a stream; I just reuse the WMStream type because I was too lazy to build another type to represent a file), and “Unknown”, in case there is some weird stuff going on.
  • FrameRate and TimeCodeRanges are meaningful only for video streams.
  • For file-level metadata, BitRate, FrameRate and TimeCodeRanges are not meaningful, and StreamIndex and StreamNumber are set to 0.
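To make the shape of these types a little more concrete, here is a minimal sketch in plain standard C++. It is not the C# data contracts from MetadataDataContracts.cs and not C++/CLI. The member names follow the properties mentioned above, but the member types, the FindAttribute and BuildSampleFileEntry helpers, and the sample values are all assumptions made for illustration, and TimeCodeRanges is left out.

```cpp
#include <string>
#include <vector>

// Plain C++ mirror of the WMMetadataAttribute shape described above.
// Member types are assumptions; the real type is a C# data contract.
struct WMMetadataAttribute {
    std::string Name;          // Windows Media metadata field name
    std::string Label;         // friendlier name, where one exists
    std::string PropertyType;  // name of the CLR type suited to Value
    std::string Unit;          // unit of measurement, e.g. "bps"; may be empty
    std::string Value;         // kept as text in this sketch
};

// Plain C++ mirror of the WMStream shape described above.
struct WMStream {
    std::string StreamType;         // "File", "Video", "Audio", ...
    unsigned int StreamIndex = 0;   // 0 for the file-level entry
    unsigned int StreamNumber = 0;  // 0 for the file-level entry
    unsigned int BitRate = 0;       // not meaningful for "File"
    double FrameRate = 0.0;         // meaningful only for "Video"
    std::vector<WMMetadataAttribute> Attributes;
};

// Hypothetical helper (not part of WMShim): look up an attribute by its
// Windows Media field name; returns nullptr when it is absent.
const WMMetadataAttribute* FindAttribute(const WMStream& stream,
                                         const std::string& name) {
    for (const auto& attr : stream.Attributes) {
        if (attr.Name == name) {
            return &attr;
        }
    }
    return nullptr;
}

// Hypothetical sample: a file-level entry with one made-up attribute value.
WMStream BuildSampleFileEntry() {
    WMStream file;
    file.StreamType = "File";  // file-level metadata reuses the stream type
    file.Attributes.push_back(
        {"Title", "Title", "System.String", "", "Example title"});
    return file;
}
```

With this layout, FindAttribute(BuildSampleFileEntry(), "Title") returns a pointer to the sample attribute, while a lookup for a name that is not present returns nullptr.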

The primary C++ class that is central to the extraction functionality is named WMMetadataReader, defined in WMMetadataReader.h in the WMShim project.

  public ref class WMMetadataReader
  {
  public:
    WMMetadataReader(void) {}
    /**
    Get the number of streams in a media file
    FilePath: Absolute file path
    Return: Stream count
    **/
    unsigned int GetStreamCount(String^ FilePath);
    /**
    Get the metadata for a specific stream in a media file
    FilePath: Absolute file path
    RFCLangIdentifier: string literal for the language identifier - "en-us" etc.
    StreamIdx: 1 based Stream Index, Pass 0 for file level metadata
    Return: WMStream instance
    **/
    WMStream^ GetStreamMetadata(String^ FilePath, String^ RFCLangIdentifier, unsigned int StreamIdx);
    /**
    Get the framerate for a video stream
    FilePath: Absolute file path
    StreamIdx: 1 based Stream Index
    Return: Frame rate, 0 for non-video streams
    **/
    double GetFrameRate(String^ FilePath, unsigned int StreamIdx);
  };

The code above shows the public methods on WMMetadataReader; the code comments describe what each method does. The easiest way to extract metadata for a media file is to call GetStreamCount() to get the number of streams, and then call GetStreamMetadata() once for each stream. Remember to pass a 1-based StreamIdx value for the actual streams, and pass 0 to get the file-level metadata. The C# code snippet below will give you an idea: as we loop through the streams and get the metadata for each, we add them to a List named streams that we will eventually use elsewhere in our code.

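// ofd.FileName is the path of the media file selected in the test harness
string filePath = ofd.FileName;
WMShim.WMMetadataReader rdr = new WMShim.WMMetadataReader();
int streamCount = (int)rdr.GetStreamCount(filePath);

// One slot per stream, plus one for the file-level metadata at index 0
List<WMStream> streams = new List<WMStream>(streamCount + 1);

// idx 0 returns the file-level metadata; 1..streamCount are the actual streams
for (int idx = 0; idx <= streamCount; idx++)
{
  streams.Add(rdr.GetStreamMetadata(filePath, "en-us", (uint)idx));
}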

Below are screenshots from the WPF test harness that is also in the attached code. The C# code snippet above is actually from that app. The two screenshots show the video stream metadata and the file-level metadata from a WMV file named Dust_To_Glory.wmv.

[Screenshots: video stream metadata and file-level metadata for Dust_To_Glory.wmv]

If you look through the code you will notice that I do not supply any code to extract SMPTE Timecode ranges yet. I am still finalizing that code – I will have another post with an attachment here in a few days.

A few notes about the attached code:

- The code was built on 64-bit Windows 7 using Visual Studio 2008.

- The WMShim C++ project was built targeting the x64 platform, so make sure that you have the x64 compilers and tools for C++ installed as part of your VS 2008 installation. If you want to retarget WMShim to a 32-bit platform instead, just change the platform to Win32 in your Project|Properties dialog and recompile.

- If you want to run the code on Windows Server, you will need to enable the “Desktop Experience” feature. This puts the necessary Windows Media components on the server (they are not there by default, unlike on the Windows client desktop OS platforms).

Until the next post then!