What’s in a WAV file?

So yesterday, I mentioned that a WAV file is just a wrapper around raw PCM data.  Well, that’s not entirely accurate (it’s not inaccurate either, but…)

A WAV file is in fact a RIFF file that contains waveform content (discrete PCM audio samples).  RIFF (Resource Interchange File Format) is the file format that Microsoft has historically used to encode much of its multimedia content (WAV, AVI, etc.).  The RIFF format describes “chunks” of data, which can be strung together to make a single multimedia title.

Windows provides APIs for manipulating RIFF files: the “mmio” family of APIs exported from winmm.dll.  There are APIs for opening, reading and writing RIFF files, and for parsing the various chunks of data in the file (each chunk can in turn contain multiple chunks).

A minimal WAV file is a RIFF file that contains two chunks.  The first chunk is named “fmt” (stored on disk as the four-character ID “fmt ”, with a trailing space), and the other is named “data”.

The “fmt” chunk contains a WAVEFORMAT which describes the format of the data contained in the “data” chunk.

The “data” chunk contains the PCM data for the audio sample.
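
As a rough illustration of that layout, here is a minimal Python sketch that walks the top-level chunks of a RIFF/WAVE byte string without the mmio APIs. The names `iter_chunks`, `parse_fmt` and `make_chunk`, and the tiny hand-built sample, are made up for this example. It assumes a little-endian RIFF file and only decodes the common 16-byte prefix of the format chunk.

```python
import struct

def iter_chunks(blob):
    """Yield (chunk_id, payload) for each top-level chunk in a RIFF/WAVE blob."""
    riff, riff_size, form = struct.unpack_from("<4sI4s", blob, 0)
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12                                  # skip 'RIFF', size, 'WAVE'
    end = min(len(blob), 8 + riff_size)       # RIFF size excludes its own 8-byte header
    while pos + 8 <= end:
        chunk_id, size = struct.unpack_from("<4sI", blob, pos)
        yield chunk_id, blob[pos + 8 : pos + 8 + size]
        pos += 8 + size + (size & 1)          # chunks are padded to an even length

def parse_fmt(payload):
    """Decode the common 16-byte prefix of the 'fmt ' chunk."""
    names = ("format_tag", "channels", "samples_per_sec",
             "avg_bytes_per_sec", "block_align", "bits_per_sample")
    return dict(zip(names, struct.unpack_from("<HHIIHH", payload, 0)))

def make_chunk(chunk_id, payload):
    """Build one chunk (hypothetical helper, used only to create the sample)."""
    pad = b"\x00" if len(payload) & 1 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad

# Hand-built sample: 8 kHz, mono, 16-bit PCM with four samples of silence.
fmt = struct.pack("<HHIIHH", 1, 1, 8000, 16000, 2, 16)
body = b"WAVE" + make_chunk(b"fmt ", fmt) + make_chunk(b"data", b"\x00" * 8)
sample = b"RIFF" + struct.pack("<I", len(body)) + body

for chunk_id, payload in iter_chunks(sample):
    print(chunk_id, len(payload))
    if chunk_id == b"fmt ":
        print(parse_fmt(payload))
```

Because the loop skips any chunk by its declared size, unknown chunks simply pass through as extra (chunk_id, payload) pairs instead of derailing the parse.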

There’s actually a partial example of parsing a RIFF file on MSDN, which shows how to navigate through a RIFF file.

If you’re not interested in using the mmio APIs to write (or parse) RIFF files, then I discovered this page at Stanford that describes the format of a WAVE file as a flat file.
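
For illustration only, the flat view boils down to reading one fixed 44-byte header. The sketch below is Python; the name `read_flat_header` and the sample bytes are invented for this example. It assumes the “fmt ” chunk is exactly 16 bytes and that “data” follows it immediately, which holds only for the simplest files.

```python
import struct

# Flat view of the simplest WAV file: one fixed 44-byte header, then the samples.
FLAT_HEADER = struct.Struct("<4sI4s4sIHHIIHH4sI")

def read_flat_header(blob):
    """Decode a 44-byte canonical header; reject anything laid out differently."""
    (riff, _riff_size, wave, fmt_id, fmt_size, format_tag, channels,
     samples_per_sec, _avg_bytes, _block_align, bits_per_sample,
     data_id, data_size) = FLAT_HEADER.unpack_from(blob, 0)
    ids_ok = (riff, wave, fmt_id, data_id) == (b"RIFF", b"WAVE", b"fmt ", b"data")
    if not ids_ok or fmt_size != 16:
        raise ValueError("not the canonical flat layout")
    return {"format_tag": format_tag, "channels": channels,
            "samples_per_sec": samples_per_sec,
            "bits_per_sample": bits_per_sample,
            "data_offset": FLAT_HEADER.size, "data_size": data_size}

# Hand-built sample: 8 kHz, mono, 16-bit PCM with four samples of silence.
sample = FLAT_HEADER.pack(b"RIFF", 36 + 8, b"WAVE", b"fmt ", 16,
                          1, 1, 8000, 16000, 2, 16, b"data", 8) + b"\x00" * 8
print(read_flat_header(sample))

# The same audio with an extra LIST chunk ahead of 'data' breaks the flat assumption.
extra_body = sample[8:36] + b"LIST" + struct.pack("<I", 4) + b"INFO" + sample[36:]
extra = b"RIFF" + struct.pack("<I", len(extra_body)) + extra_body
try:
    read_flat_header(extra)
except ValueError as err:
    print("flat parse failed:", err)
```

The second file is still a well-formed RIFF file; the flat reader rejects it only because the bytes at offset 36 are no longer the “data” ID.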

Comments (11)

  1. Anonymous says:

    Isn’t it just a bit dangerous treating it as a flat file, though? There might be some deviant application out there that writes perfectly legal RIFF files with an extended header or alternate data stream.

  2. Anonymous says:

    You’re absolutely right Chris. You shouldn’t treat it as a flat file.

    You should always use the documented APIs to navigate through the file.

    The article from Stanford doesn’t really include any information that can’t be easily discovered through careful reading of the RIFF documentation, however; they just laid the info out in an easy-to-read fashion.

  3. Anonymous says:

    It’s not deviant to embed some extra info in a .wav. What makes the flat-file approach work is probably the fact that the extra data is located at the end of the .wav file.

    For example, open "c:\program files\messenger\newalert.wav" in Notepad and scroll to the end of the data (with word wrap turned on). You should see some copyright information in the ‘LIST’ chunk.

  4. Anonymous says:

    szul-c – this is actually exactly why you DON’T want to treat these as a flat file.

    If you treated them as a flat file, you’d likely get confused when you hit the LIST ‘INFO’ chunk at the end (which, btw, contains several other chunks within it: an ICMT, ICOP, ICRD, IENG, ISFT, and ITCH).

    That WAV file also has a DISP tag associated with it named "TEXT" with the contents "Vivid Sound".

    If you treated it as a flat file, your app would likely get confused, but if you used the MMIO APIs you’d be ok.

  5. Anonymous says:

    Larry, I never wanted to treat .wavs as flatfiles 🙂

    Just wanted to point out that having more than fmt/data is more common than it seems.

  6. Anonymous says:

    You’re right. I’m actually really happy you found that example, I hadn’t seen it.

    But it does reinforce why you should use the APIs instead of rolling your own – you never know what you’ll find, and then someone in appcompat is going to have to play games to get your app to work.

  7. Anonymous says:

    You can save non-PCM format in WAV files too.

  8. Anonymous says:

    That’s true B.Y. For instance you can have ADPCM, and a bunch of other formats (the enumeration’s in MMREG.H).

    Realistically, only PCM, ADPCM, ALAW/MULAW, and a bunch of others work. Some of the "formats" are really aliases (IEEE_FLOAT is PCM, but instead of the samples being integers, they’re IEEE floating point values).

  9. Anonymous says:

    Of course those aren’t your only options. If you don’t want to (or can’t) use the API, you can still follow the structure properly.

  10. Anonymous says:

    winmm.dll isn’t exactly available on unix, classic mac, embedded systems, or other places where riff files could conceivably pop up. On the other hand, libraries written by people smarter than you* already exist and have been thoroughly debugged, working with both valid and (presumably) invalid files, with as much flexibility as you need. Don’t play C god and reinvent yet another wheel.

    Unless it’s fun, of course.

    I wish MS had included mpeg layer-2 or layer-1 audio in its original lineup of audio tools, particularly in the sdk. That could have cut down on the amount of games being distributed with more than half the size being audio. (Then again, they had the chance to use adpcm/mulaw.)

    *trans: less lazy than me.

  11. Anonymous says:

    To give another example of non-PCM WAVs, I think that MP3 stored in a WAV/RIFF file is quite common.