Intro to Audio Programming, Part 3: Synthesizing Simple Wave Audio using C#

If you’ve been following this series, you’re probably thinking, “Finally! He is going to show us some code!”

Well, I hate to disappoint you. So I’ll go ahead and show some code.

We’ve already discussed how audio is represented and what the WAV format looks like. The time has come to put these concepts into practice.

WaveFun! Wave Generator

The app we will be building is really not all that flashy. It just generates a simple waveform with 1 second of audio and plays it back for you. Nothing configurable or anything like that. Trust me, though, the flashy stuff is coming.

>> DOWNLOAD THE TASTY CODE HERE (18.8 KB) <<

If one giant button on a form isn’t the pinnacle of UI design, I have no idea what to do in this world.

Anyway, this is what the structure of the app looks like:

image

Chunk Wrappers (Chunks.cs)

(oooh, delicious!)

The first thing we care about is Chunks.cs, which contains wrappers for the header and two chunks that we learned about in the last article.

Let’s look at the code for the WaveHeader wrapper class. Note the data types we use instead of just “int.” The strings will be converted to character arrays later when we write the file. If you write them as strings, BinaryWriter puts a length prefix in front of each one, and that ruins your file. dwFileLength is initialized to zero, but is determined later, after we have written the stream and we know how long the file is.

 public class WaveHeader
 {
     public string sGroupID; // RIFF
     public uint dwFileLength; // total file length minus 8 (the RIFF ID and this length field)
     public string sRiffType; // always WAVE
  
     /// <summary>
     /// Initializes a WaveHeader object with the default values.
     /// </summary>
     public WaveHeader()
     {
         dwFileLength = 0;
         sGroupID = "RIFF";
         sRiffType = "WAVE";
     }
 }

Next up is the code for the Format chunk wrapper class. Again, note that the datatypes are consistent with the wave file format spec. Also note that we can explicitly set the chunk size in the constructor to 16 bytes, because the size of this chunk never changes (just add up the number of bytes taken up by each field after the chunk ID and chunk size: 2 + 2 + 4 + 4 + 2 + 2 = 16).

 public class WaveFormatChunk
 {
     public string sChunkID;         // Four bytes: "fmt "
     public uint dwChunkSize;        // Length of the rest of the chunk in bytes (16)
     public ushort wFormatTag;       // 1 (MS PCM)
     public ushort wChannels;        // Number of channels
     public uint dwSamplesPerSec;    // Frequency of the audio in Hz... 44100
     public uint dwAvgBytesPerSec;   // for estimating RAM allocation
     public ushort wBlockAlign;      // sample frame size, in bytes
     public ushort wBitsPerSample;    // bits per sample
  
     /// <summary>
     /// Initializes a format chunk with the following properties:
     /// Sample rate: 44100 Hz
     /// Channels: Stereo
     /// Bit depth: 16-bit
     /// </summary>
     public WaveFormatChunk()
     {
         sChunkID = "fmt ";
         dwChunkSize = 16;
         wFormatTag = 1;
         wChannels = 2;
         dwSamplesPerSec = 44100;
         wBitsPerSample = 16;
         wBlockAlign = (ushort)(wChannels * (wBitsPerSample / 8));
         dwAvgBytesPerSec = dwSamplesPerSec * wBlockAlign;            
     }
 }
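
If you want to double-check the arithmetic in that constructor, here is a minimal sketch. The FormatMath class and its method names are made up for illustration and are not part of the WaveFun download; the formulas are the same ones used in WaveFormatChunk.

```csharp
using System;

public static class FormatMath
{
    // Bytes in one sample frame: one sample for every channel.
    public static ushort BlockAlign(ushort channels, ushort bitsPerSample)
    {
        return (ushort)(channels * (bitsPerSample / 8));
    }

    // Bytes needed to hold one second of audio.
    public static uint AvgBytesPerSec(uint samplesPerSec, ushort blockAlign)
    {
        return samplesPerSec * blockAlign;
    }

    // Size of the format chunk body: every field after sChunkID and dwChunkSize.
    public static uint FormatChunkSize()
    {
        return (uint)(sizeof(ushort)    // wFormatTag
                    + sizeof(ushort)    // wChannels
                    + sizeof(uint)      // dwSamplesPerSec
                    + sizeof(uint)      // dwAvgBytesPerSec
                    + sizeof(ushort)    // wBlockAlign
                    + sizeof(ushort));  // wBitsPerSample
    }
}
```

For 16-bit stereo at 44100 Hz, BlockAlign(2, 16) gives 4 bytes per frame and AvgBytesPerSec(44100, 4) gives 176400 bytes per second.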

Finally, let’s have a look at the wrapper for the Data chunk. Here, we use an array of shorts because we have 16-bit samples as specified in the format chunk. If you want to change to 8-bit audio, use an array of bytes. If you want to use 32-bit floating-point audio, use an array of floats (the format tag then has to change from 1 to 3, the IEEE float format). dwChunkSize is initialized to zero and is determined after the wave data is generated, when we know how long the array is and what the bit depth is.

 public class WaveDataChunk
 {
     public string sChunkID;     // "data"
     public uint dwChunkSize;    // Length of the sample data in bytes
     public short[] shortArray;  // 16-bit audio
  
     /// <summary>
     /// Initializes a new data chunk with default values.
     /// </summary>
     public WaveDataChunk()
     {
         shortArray = new short[0];
         dwChunkSize = 0;
         sChunkID = "data";
     }   
 }
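
One thing to watch if you do switch to a byte array: 8-bit PCM in a WAV file is unsigned, with silence at 128, while 16-bit PCM is signed, with silence at 0. Here is a rough sketch of converting one sample. SampleConversion and ToUnsigned8Bit are hypothetical names, not part of the WaveFun code.

```csharp
public static class SampleConversion
{
    // Converts a signed 16-bit sample (-32768..32767, silence at 0)
    // to an unsigned 8-bit sample (0..255, silence at 128).
    public static byte ToUnsigned8Bit(short sample)
    {
        // Keep the top 8 bits, then shift the range from -128..127 up to 0..255.
        return (byte)((sample >> 8) + 128);
    }
}
```

You would also need to set wBitsPerSample to 8 in the format chunk so that the block align and byte rate come out right.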

Now we have all the tools we need to assemble a wave file!

The Wave Generator (WaveGenerator.cs)

This class does two things. It has a constructor, which instantiates all these chunks and then uses a very simple algorithm to generate sample data for a sine wave oscillating at 440Hz. This results in an audible pitch known as Concert A. It also has a Save method, which writes the chunks out to a file.

In this file, we have an enum called WaveExampleType, which is used to identify what kind of wave we want to create. Feel free to create your own and modify the “big switch statement” to add different sound wave options.

 public enum WaveExampleType
{
    ExampleSineWave = 0
}

The WaveGenerator class only has three members, and they are all chunks.

 public class WaveGenerator
{
    // Header, Format, Data chunks
    WaveHeader header;
    WaveFormatChunk format;
    WaveDataChunk data;
    
    /// <snip>
}

The constructor of the WaveGenerator class takes in an argument of type WaveExampleType, which we switch on to determine what kind of wave to generate. Lots of stuff happens in the constructor, so I’ll use line numbers here and refer to them after the code.

    1: public WaveGenerator(WaveExampleType type)
    2: {          
    3:     // Init chunks
    4:     header = new WaveHeader();
    5:     format = new WaveFormatChunk();
    6:     data = new WaveDataChunk();            
    7:  
    8:     // Fill the data array with sample data
    9:     switch (type)
   10:     {
   11:         case WaveExampleType.ExampleSineWave:
   12:  
   13:             // Number of samples = sample rate * channels (one second of audio)
   14:             uint numSamples = format.dwSamplesPerSec * format.wChannels;
   15:             
   16:             // Initialize the 16-bit array
   17:             data.shortArray = new short[numSamples];
   18:  
   19:             int amplitude = 32760;  // Peak amplitude, just under the 16-bit max of 32767
   20:             double freq = 440.0;    // Concert A: 440Hz
   21:  
   22:             // The "angle" step per array element. Dividing by the channel count means
   23:             // one frame (wChannels elements) advances the phase by 2*pi*freq/sample rate.
   24:             double t = (Math.PI * 2 * freq) / (format.dwSamplesPerSec * format.wChannels);
   25:  
   26:             for (uint i = 0; i < numSamples - 1; i += format.wChannels)
   27:             {
   28:                 // Fill with a simple sine wave at max amplitude
   29:                 for (int channel = 0; channel < format.wChannels; channel++)
   30:                 {
   31:                     data.shortArray[i + channel] = Convert.ToInt16(amplitude * Math.Sin(t * i));
   32:                 }                        
   33:             }
   34:  
   35:             // Calculate data chunk size in bytes
   36:             data.dwChunkSize = (uint)(data.shortArray.Length * (format.wBitsPerSample / 8));
   37:  
   38:             break;
   39:     }          
   40: }

Lines 4-6 instantiate the chunks.

On line 9, we switch on the wave type. This gives us an opportunity to try different things without breaking stuff that works, which I encourage you to do.

On line 14, we calculate the size of the data array. This is calculated by multiplying the sample rate and channel count together. In our case, we have 44100 samples per second and 2 channels of data, giving us an array of length 88,200 for one second of audio.

Line 19 specifies an important value: 32760 sits just under the largest value a signed 16-bit sample can hold (32767), so it is effectively the max amplitude for 16-bit audio. I discussed this in the second article. As an aside, the samples will range from -32760 to 32760; the negative values are provided by the fact that the sine function’s output ranges from -1.0 to 1.0. For functions whose output does not swing negative on its own, you may have to specify -32760 as your lower bound instead of zero – we’ll see this in action in a future article.

Line 20 specifies the frequency of the sound. 440Hz is concert A. You can use any other pitch you want – check out this awesome table for a handy reference.
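
If you would rather compute a pitch than look it up, here is a small sketch. It assumes equal temperament tuned to A4 = 440Hz and uses MIDI note numbers (A4 is note 69), neither of which the WaveFun code knows anything about; Pitch and FrequencyOfMidiNote are hypothetical names.

```csharp
using System;

public static class Pitch
{
    // Frequency in Hz of a MIDI note number, assuming equal temperament
    // tuned to A4 = 440 Hz (MIDI note 69). Each semitone is a factor of 2^(1/12).
    public static double FrequencyOfMidiNote(int midiNote)
    {
        return 440.0 * Math.Pow(2.0, (midiNote - 69) / 12.0);
    }
}
```

Under those assumptions, note 69 comes out at 440Hz and note 81 (one octave up) at 880Hz. Assign the result to freq on line 20 to hear it.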

On line 24, we are doing a little fun trig. See this article if you want to understand the math, otherwise just use this formula and love it.

Line 26 is where the magic happens! The structure of this nested for loop can change. It is written with 1 or 2 channels in mind. The inner loop writes to index i + channel, so if you add more channels, check the condition in the topmost loop (i < numSamples – 1) to make sure that index never runs past the end of the array, lest you get an IndexOutOfRangeException.

It’s important to note how multichannel data is written. For WAV files, data is written in an interleaved manner. The sample at each time point is written to all the channels first before advancing to the next time. So shortArray[0] would be the sample in channel 1, and shortArray[1] would be the exact same sample in channel 2. That’s why we have a nested loop.

On line 31, we use Math.Sin to generate the sample data based on the “angle” (t) and the current time (i). This value is written once for each channel before “i” is incremented.
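
If the interleaving is hard to picture, here is a standalone sketch of the same idea that counts frames explicitly instead of array elements. SineSketch.Generate and its parameters are names I made up for this illustration; it is one way to write the loop, not the code from the download.

```csharp
using System;

public static class SineSketch
{
    // Returns interleaved 16-bit samples: frame 0 for every channel, then frame 1, and so on.
    public static short[] Generate(double freq, uint sampleRate, ushort channels,
                                   uint frames, short amplitude)
    {
        short[] samples = new short[frames * channels];

        // Phase advance per frame, in radians.
        double step = (Math.PI * 2 * freq) / sampleRate;

        for (uint frame = 0; frame < frames; frame++)
        {
            short value = Convert.ToInt16(amplitude * Math.Sin(step * frame));

            // Write the same value to every channel of this frame.
            for (int channel = 0; channel < channels; channel++)
            {
                samples[frame * channels + channel] = value;
            }
        }

        return samples;
    }
}
```

Calling SineSketch.Generate(440.0, 44100, 2, 44100, 32760) returns 88,200 samples, where elements 0 and 1 are frame 0, elements 2 and 3 are frame 1, and so on.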

On line 36, we set the chunk size of the data chunk. Most other chunks know how to do this themselves, but because the chunks are independent, the data chunk does not know what the bit depth is (it’s stored in the format chunk). So we set that value directly. The reason we need the bit depth is that the chunk size is stored in bytes, and each 16-bit sample takes two bytes. Therefore we are setting the data chunk size to the array length times the number of bytes in a sample (2).

At this point, all of our chunks have the correct values and we are ready to write the chunks to a stream. This is where the Save method comes in.

Again, I’ll use line numbers to refer to the Save method below.

    1: public void Save(string filePath)
    2: {
    3:     // Create a file (it always overwrites)
    4:     FileStream fileStream = new FileStream(filePath, FileMode.Create);   
    5:  
    6:     // Use BinaryWriter to write the bytes to the file
    7:     BinaryWriter writer = new BinaryWriter(fileStream);
    8:  
    9:     // Write the header
   10:     writer.Write(header.sGroupID.ToCharArray());
   11:     writer.Write(header.dwFileLength);
   12:     writer.Write(header.sRiffType.ToCharArray());
   13:  
   14:     // Write the format chunk
   15:     writer.Write(format.sChunkID.ToCharArray());
   16:     writer.Write(format.dwChunkSize);
   17:     writer.Write(format.wFormatTag);
   18:     writer.Write(format.wChannels);
   19:     writer.Write(format.dwSamplesPerSec);
   20:     writer.Write(format.dwAvgBytesPerSec);
   21:     writer.Write(format.wBlockAlign);
   22:     writer.Write(format.wBitsPerSample);
   23:  
   24:     // Write the data chunk
   25:     writer.Write(data.sChunkID.ToCharArray());
   26:     writer.Write(data.dwChunkSize);
   27:     foreach (short dataPoint in data.shortArray)
   28:     {
   29:         writer.Write(dataPoint);
   30:     }
   31:  
   32:     writer.Seek(4, SeekOrigin.Begin);
   33:     uint filesize = (uint)writer.BaseStream.Length;
   34:     writer.Write(filesize - 8);
   35:     
   36:     // Clean up
   37:     writer.Close();
   38:     fileStream.Close();            
   39: }

Save takes one argument – a file path. Lines 4-7 set up our file stream and binary writer associated with that stream. The order in which values are written is EXTREMELY IMPORTANT!

Lines 10-12 write the header chunk to the stream. We use the .ToCharArray method on the strings so that only the raw characters are written. If you pass the string itself, BinaryWriter writes a length prefix in front of it and your header gets messed up.
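
To see what goes wrong when you pass a string, here is a small sketch that writes "RIFF" both ways into a MemoryStream and returns the bytes. WriterDemo and its methods are hypothetical helpers for this illustration only. BinaryWriter.Write(string) puts a length prefix in front of the text, while the char[] overload does not.

```csharp
using System.IO;

public static class WriterDemo
{
    // Bytes produced by BinaryWriter.Write(string): a length prefix, then the text.
    public static byte[] WriteAsString(string s)
    {
        using (MemoryStream stream = new MemoryStream())
        using (BinaryWriter writer = new BinaryWriter(stream))
        {
            writer.Write(s);
            writer.Flush();
            return stream.ToArray();
        }
    }

    // Bytes produced by BinaryWriter.Write(char[]): just the characters.
    public static byte[] WriteAsChars(string s)
    {
        using (MemoryStream stream = new MemoryStream())
        using (BinaryWriter writer = new BinaryWriter(stream))
        {
            writer.Write(s.ToCharArray());
            writer.Flush();
            return stream.ToArray();
        }
    }
}
```

WriteAsString("RIFF") returns five bytes (a 4, then R, I, F, F), while WriteAsChars("RIFF") returns exactly the four bytes a WAV reader expects.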

Lines 15-22 write the format chunk.

Lines 25 and 26 write the first two parts of the data chunk (its ID and size), and the foreach loop writes out every value of the data array.

Now we know exactly how long the file is, so we have to go back and specify the file length as the second value in the file. The first 4 bytes of the file are taken up with “RIFF” so we seek to byte 4 and write out the total length of the stream that we’ve written, minus 8 (as noted by the spec; we don’t count the 4 bytes of “RIFF” or the 4 bytes of the length field itself).
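
Here is that back-patching step on its own, as a sketch you can try against any stream. RiffLength.Patch is a hypothetical helper, not something in WaveGenerator.cs, and it assumes the stream already holds a complete RIFF file with a placeholder in bytes 4 through 7.

```csharp
using System.IO;

public static class RiffLength
{
    // Overwrites bytes 4..7 of a finished RIFF stream with (total length - 8).
    public static void Patch(BinaryWriter writer)
    {
        writer.Flush();
        uint fileSize = (uint)writer.BaseStream.Length;

        // Skip the 4-byte "RIFF" ID and overwrite the placeholder length.
        writer.Seek(4, SeekOrigin.Begin);
        writer.Write(fileSize - 8);

        // Return to the end in case the caller keeps writing.
        writer.Seek(0, SeekOrigin.End);
    }
}
```

For a 44-byte stream, the value written at byte 4 is 36.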

Lastly, we close the streams. Our file is written! And it looks like this:

image

Zoom in a bit to see the awesome sine waviness:

image

All that’s left are the 5 lines of code that initialize the WaveGenerator object, save the file and play it back to you.

Putting it All Together – Main.cs

Let’s look at Main.cs, the codebehind for our main winform.

    1: using System;
    2: using System.Windows.Forms;
    3: using System.Media;
    4:  
    5: namespace WaveFun
    6: {
    7:     public partial class frmMain : Form
    8:     {
    9:         public frmMain()
   10:         {
   11:             InitializeComponent();
   12:         }
   13:  
   14:         private void btnGenerateWave_Click(object sender, EventArgs e)
   15:         {
   16:             string filePath = @"C:\Users\Dan\Desktop\test2.wav";
   17:             WaveGenerator wave = new WaveGenerator(WaveExampleType.ExampleSineWave);
   18:             wave.Save(filePath);            
   19:  
   20:             SoundPlayer player = new SoundPlayer(filePath);               
   21:             player.Play();
   22:         }
   23:     }
   24: }

On line 3, we reference System.Media. We need this namespace to play back our wave file.

Line 14 is the event handler for the Click event of the only huge button on the form.

On line 16, we define the location of the file to be written. IT IS VERY IMPORTANT THAT YOU CHANGE THIS TO A LOCATION THAT WORKS ON YOUR BOX.
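
If you don’t feel like editing the path by hand, one option is to build it at run time. This is only a sketch: OutputPath.OnDesktop is a made-up helper, and it assumes you want the file on the current user’s desktop.

```csharp
using System;
using System.IO;

public static class OutputPath
{
    // Builds a path on the current user's desktop instead of hard-coding one.
    public static string OnDesktop(string fileName)
    {
        string desktop = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
        return Path.Combine(desktop, fileName);
    }
}
```

Line 16 would then read string filePath = OutputPath.OnDesktop("test2.wav");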

Line 17 initializes the wave generator with a sine wave, and line 18 saves it to the location you defined.

Lines 20 and 21 use System.Media.SoundPlayer to play back the wave that we saved.

All Done!

Press F5 to run your program and bask in the glory of a very loud 440Hz sine wave.

Next Steps: If you are a math Jedi, you can experiment with the following code from WaveGenerator.cs:

 double t = (Math.PI * 2 * freq) / (format.dwSamplesPerSec * format.wChannels);

for (uint i = 0; i < numSamples - 1; i += format.wChannels)
{
    // Fill with a simple sine wave at max amplitude
    for (int channel = 0; channel < format.wChannels; channel++)
    {
        data.shortArray[i + channel] = Convert.ToInt16(amplitude * Math.Sin(t * i));
    }                        
}

Just remember it’s two-channel audio, so you have to write each channel in the frame first before writing the next frame.

In the next article, we’ll look at some algorithms to generate other types of waves.

Currently Playing: Lamb of God – Wrath – Set to Fail