This is a part of the DIY Synthesizer series of posts where each post is roughly built upon the knowledge of the previous posts. If you are lost, check the earlier posts!
This is the first in a series of tutorials on how to make your own software synthesizer.
These tutorials are aimed at C++ programmers, and the example code is meant to be as easy to understand as possible and have as few dependencies as possible. The code ought to compile and run for you no matter what system or compiler you are using with minimal if any changes required.
You can download the full source for this chapter here: DIY Synthesizer Chapter 1 Source Code
Wave File Format
Since making sound come out of computer speakers varies a lot between different systems, we’ll start out just writing a .wave file.
If you want to jump into doing real time audio, i recommend portaudio (http://www.portaudio.com/) , and i also recomend libsndfile for reading and writing other audio file formats(http://www.mega-nerd.com/libsndfile/).
I found these 2 links really helpful in understanding the wave file format:
There’s a lot of optional parts of a wave file header, but we are only going to focus on the bare minimum required to get the job done. Here’s what our wave file header struct looks like:
//this struct is the minimal required header data for a wav file
//the main chunk
unsigned char m_szChunkID;
unsigned char m_szFormat;
//sub chunk 1 "fmt "
unsigned char m_szSubChunk1ID;
//sub chunk 2 "data"
unsigned char m_szSubChunk2ID;
//then comes the data!
And boringly, here’s the function that fills out the struct and writes it to disk:
bool WriteWaveFile(const char *szFileName, void *pData, int32 nDataSize, int16 nNumChannels, int32 nSampleRate, int32 nBitsPerSample)
//open the file if we can
FILE *File = fopen(szFileName,"w+b");
//fill out the main chunk
waveHeader.m_nChunkSize = nDataSize + 36;
//fill out sub chunk 1 "fmt "
waveHeader.m_nSubChunk1Size = 16;
waveHeader.m_nAudioFormat = 1;
waveHeader.m_nNumChannels = nNumChannels;
waveHeader.m_nSampleRate = nSampleRate;
waveHeader.m_nByteRate = nSampleRate * nNumChannels * nBitsPerSample / 8;
waveHeader.m_nBlockAlign = nNumChannels * nBitsPerSample / 8;
waveHeader.m_nBitsPerSample = nBitsPerSample;
//fill out sub chunk 2 "data"
waveHeader.m_nSubChunk2Size = nDataSize;
//write the header
//write the wave data itself
//close the file and return success
Nothing too crazy or all that interesting, but it gets the job done. Again, check out those links above if you are interested in the details of why things are written the way they are, or what other options there are.
Generating a Mono Wave File
Now, finally something interesting, we are going to generate some audio data and make a real wave file!
Since they are easy to generate, we’ll use a sawtooth wave for our sound. For more information about sawtooth waves, check out this wikipedia page: http://en.wikipedia.org/wiki/Sawtooth_wave.
int nSampleRate = 44100;
int nNumSeconds = 4;
int nNumChannels = 1;
The sample rate defines how many samples of audio data there are per second. A stream of audio data is nothing more than a stream of numbers, and each number is a single audio sample, so the sample rate is just how many numbers there are per second of audio data. The less numbers you use, the less “horizontal resolution” your sound file has, or, the less times the wave data can change in amplitude per second.
The sample rate also defines the maximum frequency you can store in the audio stream. The maximum frequency you can store is half of the sample rate. In other words, with a 44100 sample rate, the maximum frequency you can store is 22,050hz. The maximum audible frequency for the human ear is about 20,000hz so using a sample rate of 44100 ought to be pretty good for most needs (you might need to go higher, for complex technical reasons, but this is info enough for now!). Here’s some interesting info about audio frequencies: http://en.wikipedia.org/wiki/Audio_frequency
The number of seconds is how long (in seconds) the wave goes on for, and the number of channels is how many audio channels there are. Since this is a mono sound, there is only one audio channel.
int nNumSamples = nSampleRate * nNumChannels * nNumSeconds;
int32 *pData = new int32[nNumSamples];
Here we calculate how many actual audio samples there are and then allocate space to hold the audio data. We are using 32 bit integers, but you could also use 16 bit integers. The number of bits in your audio samples indicates the vertical resolution of your audio data, or how many unique values there are. in 16 bit ints, there are 65536 different values, and in 32 bits there are 4.2 billion different values. If you think about your data as plots on a graph (essentially, what it is, where X is time and Y is wave amplitude) the more bits per sample, and the higher the sample rate, the closer your graph can be to whatever real values you are trying to use (such as a sine wave). Less bits and a lower sample rate mean it’s farther away from the real data you are trying to model, which will cause the audio to sound less correct.
int32 nValue = 0;
for(int nIndex = 0; nIndex < nNumSamples; ++nIndex)
nValue += 8000000;
pData[nIndex] = nValue;
Here we are actually creating our wave data. We are using the fact that if you have an int near the maximum value you can store, and then add some more, it will wrap around to the minimum value the int can store. If you look at this on a graph, it looks like a saw tooth wave, ie we are creating a saw tooth wave! Normally you wouldn’t create them this way because the way we are doing it is harsh on the ear, and introduces something called aliasing (http://en.wikipedia.org/wiki/Aliasing). In a later tutorial we’ll see how to create a band limited saw tooth wave to make higher quality sound, but for now this will work file!
you can change how much is added to nValue to change the frequency of the resulting wave. Add a smaller number to make it a lower frequency, add a higher number to make it a higher frequency. We’ll get into the math of more finely controlling frequency in another chapter so you can actually match your waves to notes you watch to hit.
WriteWaveFile("outmono.wav",pData,nNumSamples * sizeof(pData),nNumChannels,nSampleRate,sizeof(pData)*8);
Lastly we write our wave file and free our memory.
Tada! All done, we have a sawtooth mono wave file written out, give it a listen!
Writing a Stereo File
The only thing that has really changed in the stereo file is that there are 2 channels instead of 1, and how we generate the audio data is slightly different. Since there are 2 channels, one for left, one for right, there is actually double the audio data for the same sample rate and time length wave file, since it needs a full set of data for each channel.
The audio data itself is interleaved, meaning that the first audio sample is for the left channel, the second sample is for the right channel, the third sample is for the left channel, and so on.
Here’s how the audio is generated:
int32 nValue1 = 0;
int32 nValue2 = 0;
for(int nIndex = 0; nIndex < nNumSamples; nIndex += 2)
nValue1 += 8000000;
nValue2 += 12000000;
pData[nIndex] = nValue1; //left channel
pData[nIndex+1] = nValue2; //right channel
Note that for the right channel we write a different frequency wave. I did this so that you can tell the difference between this and the mono file. Play around with the values and try muting one channel or the other to convince yourself that it really is a stereo file!
Until Next Time…
That’s all for chapter 1, thanks for reading.
Next up we’ll talk about the basic wave forms – sine, square, saw, square, and noise – and we’ll talk more about frequency and oscillators.