Programmatic access to closed captioning data in Media Center


One of the folks I talked to at CES 2006 last week asked me how to use Media Center extensibility APIs to access the closed captioning (teletext) stream from a TV broadcast.  At a high level, he wanted to be able to monitor some television channels, store the teletext data in a database and then search that data later on.  I asked around a little bit when I got back to the office this week and found that we don't have any officially supported means of accessing teletext data with our extensibility APIs.

However, I also found that Stephen Toub has written an interesting white paper, blog post and some excellent sample code that parses and exposes all of the closed captioning data from an NTSC, non-high definition DVR-MS file.  Here are links to the things he has written about the DVR-MS file format and how to parse it:

  • Fun with DVR-MS white paper - discusses the DVR-MS file format, introduces DirectShow, and shows how to use DirectShow to work with DVR-MS files; you should read this first to get an overview of DVR-MS files and the data they contain in addition to the video stream
  • DVR-MS closed captioning parser - blog post (which is as in-depth and well written as a white paper) that introduces a managed library Stephen wrote to parse closed captioning data
  • Sample code for closed captioning parser

Using and extending the techniques that Stephen describes in these articles should allow you to parse a closed captioning (teletext) stream from a recorded TV show, and once you have parsed the data you can store it in a database or manipulate it in other ways as needed for your scenarios.  For inspiration, check out the list of cool ideas at the bottom of Stephen's blog post...
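
To make the "store it in a database and search it later" idea a little more concrete, here is a minimal sketch in Python using the standard sqlite3 module.  It does not parse DVR-MS files itself; it assumes you have already extracted the captions (for example with Stephen's managed library) into a list of (start time in seconds, caption text) pairs.  That input shape, the table layout, the function names and the channel name in the usage example are all assumptions made for illustration, not part of Stephen's code or any Media Center API.

```python
import sqlite3


def open_caption_store(path=":memory:"):
    """Open (or create) a SQLite database with a simple captions table.

    The schema is hypothetical: one row per caption line.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS captions ("
        " channel TEXT NOT NULL,"
        " start_seconds REAL NOT NULL,"
        " text TEXT NOT NULL)"
    )
    return conn


def store_captions(conn, channel, captions):
    """Insert already-parsed captions for one channel.

    `captions` is assumed to be an iterable of (start_seconds, text) pairs
    produced by whatever parser you use on the recorded show.
    """
    conn.executemany(
        "INSERT INTO captions (channel, start_seconds, text) VALUES (?, ?, ?)",
        [(channel, start, text) for start, text in captions],
    )
    conn.commit()


def search_captions(conn, phrase):
    """Return (channel, start_seconds, text) rows whose text contains `phrase`.

    LIKE wildcards in the phrase are escaped so they match literally.
    SQLite's LIKE is case-insensitive for ASCII letters by default.
    """
    escaped = (
        phrase.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    rows = conn.execute(
        "SELECT channel, start_seconds, text FROM captions"
        " WHERE text LIKE ? ESCAPE '\\'"
        " ORDER BY channel, start_seconds",
        (f"%{escaped}%",),
    )
    return rows.fetchall()
```

A short usage example with made-up caption data:

```python
conn = open_caption_store()
store_captions(conn, "Channel 5", [
    (12.5, "WELCOME BACK TO THE NEWS"),
    (15.0, "TONIGHT'S WEATHER"),
])
matches = search_captions(conn, "weather")
```

A plain substring search is the simplest thing that could work; for a large archive of captions you would probably want a proper full-text index instead.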

<update date="2/8/2006"> Fixed the link to the sample code for the closed captioning parser </update>

 

Comments (5)

  1. Anders Majland says:

    Interesting – but right now I’m trying to figure out how to get subtitles to work for DVB-T. Here in Denmark (as in Sweden and Finland) the non-national channels are broadcast with subtitles instead of dubbing the audio. We are switching over to digital broadcast soon (tests are transmitted and 1/4 DVB-T is available nationwide).

    anders AT majland DOT org

  2. lisa says:

    The link is broken for the closed captioning parser sample code.

  3. Hi Lisa – thanks for the heads up.  Looks like Stephen moved some of his sample code around.  I’ve fixed the link so it should work fine now.

  4. Lli says:

    At blogs.msdn.com/…/470491.aspx, some questions have been posted about Win7, but I don't know if anyone is still following up.

    Thanks,

  5. Hi Lli – Stephen and I haven't worked on Windows Media Center for several releases.  I'd suggest posting a question about your closed captioning scenarios on the Green Button forums instead – http://www.thegreenbutton.com.
