DIY Synth: Multitap Reverb

This is a part of the DIY Synthesizer series of posts where each post is roughly built upon the knowledge of the previous posts. If you are lost, check the earlier posts!

Reverb is similar to delay in that it adds echoes to a sound. It’s different, though, in that whereas delay adds one echo (even if that echo repeats itself, quieter each time), reverb adds several echos to the sound. Basically, reverb is what makes things sound like they are played in a church, or a deep cavern, or a small bathroom. Delay just makes things sound a bit echoey.

Multitap reverb is the most ghetto of reverb techniques. It’s kind of hacky, it takes a lot of manual tweaking, and in the end, usually doesn’t sound very good. It is less computationally expensive compared to other reverb techniques though, so if that’s a concern of yours, or you don’t want to be bothered with the more complex and sophisticated reverb techniques, multitap reverb may be a good solution for you!

Here are some audio samples for you to hear the effect in action:

	Legend Quote	Cymbals
Raw	legend.wav	cymbal.wav
Multitap Reverb	legend_mtr.wav	cymbal_mtr.wav

Technique

The technique is pretty straightforward. You have a sample buffer to hold the last N samples, and then when you process an incoming sample, you add in multiple samples from the delay buffer, each multiplied by a different volume (amplitude) and add that into the incoming sample to get the outgoing sample. You then also put that outgoing sample into the reverb buffer at the current index in the circular buffer.

Here is some pseudocode about how it might be implemented:

// The size of the delay buffer is controlled by the maximum time parameter of all taps
// make sure the buffer is initialized with silence
reverbBuffer[reverbSamples] = 0;
 
// start the circular buffer index at 0
reverbIndex= 0;
 
// process each input sample
for (i = 0; i < numSamples; ++i)
{
  // calculate the output sample, which is the input sample plus all the taps from the delay buffer
  // TODO: handle wrapping around zero when the tapTime is greater than the current reverbIndex
  outSample[i] = inSample[i];
  for (j = 0; j < numTaps; ++j)
    outSample[i] += reverbBuffer[reverbIndex - taps[j].tapTime] * taps[j].feedbackMultiplier; 
 
  // also store this output sample in the reverb buffer
  reverbBuffer[reverbIndex] = outSample[i];
 
  // advance the circular buffer index
  reverbIndex++;
  if (reverbIndex>= reverbSamples)
    reverbIndex= 0;
}

In the sample code below, and in the reverb processed samples above, here are the times and amplitudes of the taps that were used. The amplitude is given both as dB and amplitude so you can see whichever you are more comfortable with.

Time (ms)	dB	Amplitude
79	-25	0.0562
130	-23	0.0707
230	-15	0.1778
340	-23	0.0707
470	-17	0.1412
532	-21	0.0891
662	-13	0.2238

With some more effort, you could likely come up with some better tap values to make the reverb sound better.

Also, I was going for a cavernous feel, but you could come up with specific taps to make things feel smaller instead.

You have to be careful when setting up your taps so that the overall volume diminishes over time instead of grows. If you put too much acoustic energy back into the reverb buffer, the reverbed sound will get louder and louder over time instead of things dying out giving you that nice diminishing echo sound.

Sample Code

Here’s sample code that loads in.wav, processes it with the taps mentioned above, and writes it out as out.wav. As per usual, the wave loading code has some issues with certain wave formats. If you need better sound file loading, check out libsndfile!

#define _CRT_SECURE_NO_WARNINGS

#include <array>
#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <cmath>
#include <vector>

#define _USE_MATH_DEFINES
#include <math.h>

//=====================================================================================
// SNumeric - uses phantom types to enforce type safety
//=====================================================================================
template <typename T, typename PHANTOM_TYPE>
struct SNumeric
{
public:
    explicit SNumeric(const T &value) : m_value(value) { }
    SNumeric() : m_value() { }
    inline T& Value() { return m_value; }
    inline const T& Value() const { return m_value; }

    typedef SNumeric<T, PHANTOM_TYPE> TType;
    typedef T TInnerType;

    // Math Operations
    TType operator+ (const TType &b) const
    {
        return TType(this->Value() + b.Value());
    }

    TType operator- (const TType &b) const
    {
        return TType(this->Value() - b.Value());
    }

    TType operator* (const TType &b) const
    {
        return TType(this->Value() * b.Value());
    }

    TType operator/ (const TType &b) const
    {
        return TType(this->Value() / b.Value());
    }

    TType& operator+= (const TType &b)
    {
        Value() += b.Value();
        return *this;
    }

    TType& operator-= (const TType &b)
    {
        Value() -= b.Value();
        return *this;
    }

    TType& operator*= (const TType &b)
    {
        Value() *= b.Value();
        return *this;
    }

    TType& operator/= (const TType &b)
    {
        Value() /= b.Value();
        return *this;
    }

    TType& operator++ ()
    {
        Value()++;
        return *this;
    }

    TType& operator-- ()
    {
        Value()--;
        return *this;
    }

    // Extended Math Operations
    template <typename T>
    T Divide(const TType &b)
    {
        return ((T)this->Value()) / ((T)b.Value());
    }

    // Logic Operations
    bool operator< (const TType &b) const {
        return this->Value() < b.Value();
    }
    bool operator<= (const TType &b) const {
        return this->Value() <= b.Value();
    }
    bool operator> (const TType &b) const {
        return this->Value() > b.Value();
    }
    bool operator>= (const TType &b) const {
        return this->Value() >= b.Value();
    }
    bool operator== (const TType &b) const {
        return this->Value() == b.Value();
    }
    bool operator!= (const TType &b) const {
        return this->Value() != b.Value();
    }

private:
    T m_value;
};

//=====================================================================================
// Typedefs
//=====================================================================================

typedef uint8_t uint8;
typedef uint16_t uint16;
typedef uint32_t uint32;
typedef int16_t int16;
typedef int32_t int32;

// type safe types!
typedef SNumeric<float, struct S__Frequency>      TFrequency;
typedef SNumeric<uint32, struct S__TimeMs>        TTimeMs;
typedef SNumeric<uint32, struct S__Samples>       TSamples;
typedef SNumeric<float, struct S__FractSamples>   TFractionalSamples;
typedef SNumeric<float, struct S__Decibels>       TDecibels;
typedef SNumeric<float, struct S__Amplitude>      TAmplitude;
typedef SNumeric<float, struct S__Phase>          TPhase;

//=====================================================================================
// Constants
//=====================================================================================

static const float c_pi = (float)M_PI;
static const float c_twoPi = c_pi * 2.0f;

//=====================================================================================
// Structs
//=====================================================================================

struct SSoundSettings
{
    TSamples        m_sampleRate;
    TTimeMs         m_lengthMs;
    TSamples        m_currentSample;
};

//=====================================================================================
// CMultiTapReverb -> the multi tap reverb object
//=====================================================================================

struct SReverbTap
{
    TSamples    m_timeOffset;
    TAmplitude  m_feedback;
};

class CMultitapReverb
{
public:
    CMultitapReverb(const std::vector<SReverbTap>& taps)
        : m_sampleIndex(0)
    {
        // copy the taps table
        m_taps = taps;

        // find out the largest tap time offset so we know how big to make the buffer
        TSamples largestTimeOffset(0);
        std::for_each(m_taps.begin(), m_taps.end(),
            [&largestTimeOffset](const SReverbTap& tap)
            {
                if (tap.m_timeOffset > largestTimeOffset)
                    largestTimeOffset = tap.m_timeOffset;
            }
        );

        // if it's 0, bail out, we are done
        if (largestTimeOffset.Value() == 0)
            return;

        // else resize our internal buffer and fill it with silence
        m_samples.resize(largestTimeOffset.Value()+1);
        std::fill(m_samples.begin(), m_samples.end(), TAmplitude(0.0f));
    }

    TAmplitude ProcessSample(TAmplitude sample)
    {
        // if no taps, or none with any time value, bail out!
        if (m_samples.size() == 0)
            return sample;

        // take our taps from the delay buffer
        TAmplitude outSample = sample;
        std::for_each(m_taps.begin(), m_taps.end(),
            [&outSample, this](const SReverbTap& tap)
            {
                size_t tapSampleIndex;
                if (tap.m_timeOffset.Value() > m_sampleIndex)
                    tapSampleIndex = m_samples.size() - 1 - (tap.m_timeOffset.Value() - m_sampleIndex);
                else
                    tapSampleIndex = m_sampleIndex - tap.m_timeOffset.Value();

                outSample += m_samples[tapSampleIndex] * tap.m_feedback;
            }
        );

        // put the output sample into the buffer
        m_samples[m_sampleIndex] = outSample;

        // advance the circular buffer index
        m_sampleIndex++;
        if (m_sampleIndex >= m_samples.size())
            m_sampleIndex = 0;

        // return the reverbed sample
        return outSample;
    }

private:
    std::vector<SReverbTap> m_taps;
    std::vector<TAmplitude> m_samples;
    size_t                  m_sampleIndex;
};

//=====================================================================================
// Conversion Functions
//=====================================================================================
inline TDecibels AmplitudeToDB(TAmplitude volume)
{
    return TDecibels(log10(volume.Value()));
}

inline TAmplitude DBToAmplitude(TDecibels dB)
{
    return TAmplitude(pow(10.0f, dB.Value() / 20.0f));
}

TSamples SecondsToSamples(const SSoundSettings &s, float seconds)
{
    return TSamples((int)(seconds * (float)s.m_sampleRate.Value()));
}

TSamples MilliSecondsToSamples(const SSoundSettings &s, float milliseconds)
{
    return SecondsToSamples(s, milliseconds / 1000.0f);
}

TTimeMs SecondsToMilliseconds(float seconds)
{
    return TTimeMs((uint32)(seconds * 1000.0f));
}

TFrequency Frequency(float octave, float note)
{
    /* frequency = 440×(2^(n/12))
    Notes:
    0  = A
    1  = A#
    2  = B
    3  = C
    4  = C#
    5  = D
    6  = D#
    7  = E
    8  = F
    9  = F#
    10 = G
    11 = G# */
    return TFrequency((float)(440 * pow(2.0, ((double)((octave - 4) * 12 + note)) / 12.0)));
}

template <typename T>
T AmplitudeToAudioSample(const TAmplitude& in)
{
    const T c_min = std::numeric_limits<T>::min();
    const T c_max = std::numeric_limits<T>::max();
    const float c_minFloat = (float)c_min;
    const float c_maxFloat = (float)c_max;

    float ret = in.Value() * c_maxFloat;

    if (ret < c_minFloat)
        return c_min;

    if (ret > c_maxFloat)
        return c_max;

    return (T)ret;
}

TAmplitude GetLerpedAudioSample(const std::vector<TAmplitude>& samples, TFractionalSamples& index)
{
    // get the index of each sample and the fractional blend amount
    uint32 a = (uint32)floor(index.Value());
    uint32 b = a + 1;
    float fract = index.Value() - floor(index.Value());

    // get our two amplitudes
    float ampA = 0.0f;
    if (a >= 0 && a < samples.size())
        ampA = samples[a].Value();

    float ampB = 0.0f;
    if (b >= 0 && b < samples.size())
        ampB = samples[b].Value();

    // return the lerped result
    return TAmplitude(fract * ampB + (1.0f - fract) * ampA);
}

void NormalizeSamples(std::vector<TAmplitude>& samples, TAmplitude maxAmplitude)
{
    // nothing to do if no samples
    if (samples.size() == 0)
        return;

    // 1) find the largest absolute value in the samples.
    TAmplitude largestAbsVal = TAmplitude(abs(samples.front().Value()));
    std::for_each(samples.begin() + 1, samples.end(), [&largestAbsVal](const TAmplitude &a)
    {
        TAmplitude absVal = TAmplitude(abs(a.Value()));
        if (absVal > largestAbsVal)
            largestAbsVal = absVal;
    }
    );

    // 2) adjust largestAbsVal so that when we divide all samples, none will be bigger than maxAmplitude
    // if the value we are going to divide by is <= 0, bail out
    largestAbsVal /= maxAmplitude;
    if (largestAbsVal <= TAmplitude(0.0f))
        return;

    // 3) divide all numbers by the largest absolute value seen so all samples are [-maxAmplitude,+maxAmplitude]
    std::for_each(samples.begin(), samples.end(), [&largestAbsVal](TAmplitude &a)
    {
        a /= largestAbsVal;

        if (a >= TAmplitude(1.0f))
        {
            int ijkl = 0;
        }
    }
    );
}

void ResampleData(std::vector<TAmplitude>& samples, int srcSampleRate, int destSampleRate)
{
    //if the requested sample rate is the sample rate it already is, bail out and do nothing
    if (srcSampleRate == destSampleRate)
        return;

    //calculate the ratio of the old sample rate to the new
    float fResampleRatio = (float)destSampleRate / (float)srcSampleRate;

    //calculate how many samples the new data will have and allocate the new sample data
    int nNewDataNumSamples = (int)((float)samples.size() * fResampleRatio);

    std::vector<TAmplitude> newSamples;
    newSamples.resize(nNewDataNumSamples);

    //get each lerped output sample.  There are higher quality ways to resample
    for (int nIndex = 0; nIndex < nNewDataNumSamples; ++nIndex)
        newSamples[nIndex] = GetLerpedAudioSample(samples, TFractionalSamples((float)nIndex / fResampleRatio));

    //free the old data and set the new data
    std::swap(samples, newSamples);
}

void ChangeNumChannels(std::vector<TAmplitude>& samples, int nSrcChannels, int nDestChannels)
{
    //if the number of channels requested is the number of channels already there, or either number of channels is not mono or stereo, return
    if (nSrcChannels == nDestChannels ||
        nSrcChannels < 1 || nSrcChannels > 2 ||
        nDestChannels < 1 || nDestChannels > 2)
    {
        return;
    }

    //if converting from mono to stereo, duplicate the mono channel to make stereo
    if (nDestChannels == 2)
    {
        std::vector<TAmplitude> newSamples;
        newSamples.resize(samples.size() * 2);
        for (size_t index = 0; index < samples.size(); ++index)
        {
            newSamples[index * 2] = samples[index];
            newSamples[index * 2 + 1] = samples[index];
        }

        std::swap(samples, newSamples);
    }
    //else converting from stereo to mono, mix the stereo channels together to make mono
    else
    {
        std::vector<TAmplitude> newSamples;
        newSamples.resize(samples.size() / 2);
        for (size_t index = 0; index < samples.size() / 2; ++index)
            newSamples[index] = samples[index * 2] + samples[index * 2 + 1];

        std::swap(samples, newSamples);
    }
}

float PCMToFloat(unsigned char *pPCMData, int nNumBytes)
{
    switch (nNumBytes)
    {
    case 1:
    {
        uint8 data = pPCMData[0];
        return (float)data / 255.0f;
    }
    case 2:
    {
        int16 data = pPCMData[1] << 8 | pPCMData[0];
        return ((float)data) / ((float)0x00007fff);
    }
    case 3:
    {
        int32 data = pPCMData[2] << 16 | pPCMData[1] << 8 | pPCMData[0];
        return ((float)data) / ((float)0x007fffff);
    }
    case 4:
    {
        int32 data = pPCMData[3] << 24 | pPCMData[2] << 16 | pPCMData[1] << 8 | pPCMData[0];
        return ((float)data) / ((float)0x7fffffff);
    }
    default:
    {
        return 0.0f;
    }
    }
}

//=====================================================================================
// Wave File Writing Code
//=====================================================================================
struct SMinimalWaveFileHeader
{
    //the main chunk
    unsigned char m_szChunkID[4];      //0
    uint32        m_nChunkSize;        //4
    unsigned char m_szFormat[4];       //8

    //sub chunk 1 "fmt "
    unsigned char m_szSubChunk1ID[4];  //12
    uint32        m_nSubChunk1Size;    //16
    uint16        m_nAudioFormat;      //18
    uint16        m_nNumChannels;      //20
    uint32        m_nSampleRate;       //24
    uint32        m_nByteRate;         //28
    uint16        m_nBlockAlign;       //30
    uint16        m_nBitsPerSample;    //32

    //sub chunk 2 "data"
    unsigned char m_szSubChunk2ID[4];  //36
    uint32        m_nSubChunk2Size;    //40

    //then comes the data!
};

//this writes a wave file
template <typename T>
bool WriteWaveFile(const char *fileName, const std::vector<TAmplitude> &samples, const SSoundSettings &sound)
{
    //open the file if we can
    FILE *file = fopen(fileName, "w+b");
    if (!file)
        return false;

    //calculate bits per sample and the data size
    const int32 bitsPerSample = sizeof(T) * 8;
    const int dataSize = samples.size() * sizeof(T);

    SMinimalWaveFileHeader waveHeader;

    //fill out the main chunk
    memcpy(waveHeader.m_szChunkID, "RIFF", 4);
    waveHeader.m_nChunkSize = dataSize + 36;
    memcpy(waveHeader.m_szFormat, "WAVE", 4);

    //fill out sub chunk 1 "fmt "
    memcpy(waveHeader.m_szSubChunk1ID, "fmt ", 4);
    waveHeader.m_nSubChunk1Size = 16;
    waveHeader.m_nAudioFormat = 1;
    waveHeader.m_nNumChannels = 1;
    waveHeader.m_nSampleRate = sound.m_sampleRate.Value();
    waveHeader.m_nByteRate = sound.m_sampleRate.Value() * 1 * bitsPerSample / 8;
    waveHeader.m_nBlockAlign = 1 * bitsPerSample / 8;
    waveHeader.m_nBitsPerSample = bitsPerSample;

    //fill out sub chunk 2 "data"
    memcpy(waveHeader.m_szSubChunk2ID, "data", 4);
    waveHeader.m_nSubChunk2Size = dataSize;

    //write the header
    fwrite(&waveHeader, sizeof(SMinimalWaveFileHeader), 1, file);

    //write the wave data itself, converting it from float to the type specified
    std::vector<T> outSamples;
    outSamples.resize(samples.size());
    for (size_t index = 0; index < samples.size(); ++index)
        outSamples[index] = AmplitudeToAudioSample<T>(samples[index]);
    fwrite(&outSamples[0], dataSize, 1, file);

    //close the file and return success
    fclose(file);
    return true;
}

//loads a wave file in.  Converts from source format into the specified format
// TOTAL HONESTY: some wave files seem to have problems being loaded through this function and I don't have
// time to investigate why.  It seems to work best with 16 bit mono wave files.
// If you need more robust file loading, check out libsndfile at http://www.mega-nerd.com/libsndfile/
bool ReadWaveFile(const char *fileName, std::vector<TAmplitude>& samples, int32 sampleRate)
{
    //open the file if we can
    FILE *File = fopen(fileName, "rb");
    if (!File)
    {
        return false;
    }

    //read the main chunk ID and make sure it's "RIFF"
    char buffer[5];
    buffer[4] = 0;
    if (fread(buffer, 4, 1, File) != 1 || strcmp(buffer, "RIFF"))
    {
        fclose(File);
        return false;
    }

    //read the main chunk size
    uint32 nChunkSize;
    if (fread(&nChunkSize, 4, 1, File) != 1)
    {
        fclose(File);
        return false;
    }

    //read the format and make sure it's "WAVE"
    if (fread(buffer, 4, 1, File) != 1 || strcmp(buffer, "WAVE"))
    {
        fclose(File);
        return false;
    }

    long chunkPosFmt = -1;
    long chunkPosData = -1;

    while (chunkPosFmt == -1 || chunkPosData == -1)
    {
        //read a sub chunk id and a chunk size if we can
        if (fread(buffer, 4, 1, File) != 1 || fread(&nChunkSize, 4, 1, File) != 1)
        {
            fclose(File);
            return false;
        }

        //if we hit a fmt
        if (!strcmp(buffer, "fmt "))
        {
            chunkPosFmt = ftell(File) - 8;
        }
        //else if we hit a data
        else if (!strcmp(buffer, "data"))
        {
            chunkPosData = ftell(File) - 8;
        }

        //skip to the next chunk
        fseek(File, nChunkSize, SEEK_CUR);
    }

    //we'll use this handy struct to load in 
    SMinimalWaveFileHeader waveData;

    //load the fmt part if we can
    fseek(File, chunkPosFmt, SEEK_SET);
    if (fread(&waveData.m_szSubChunk1ID, 24, 1, File) != 1)
    {
        fclose(File);
        return false;
    }

    //load the data part if we can
    fseek(File, chunkPosData, SEEK_SET);
    if (fread(&waveData.m_szSubChunk2ID, 8, 1, File) != 1)
    {
        fclose(File);
        return false;
    }

    //verify a couple things about the file data
    if (waveData.m_nAudioFormat != 1 ||       //only pcm data
        waveData.m_nNumChannels < 1 ||        //must have a channel
        waveData.m_nNumChannels > 2 ||        //must not have more than 2
        waveData.m_nBitsPerSample > 32 ||     //32 bits per sample max
        waveData.m_nBitsPerSample % 8 != 0 || //must be a multiple of 8 bites
        waveData.m_nBlockAlign > 8)           //blocks must be 8 bytes or lower
    {
        fclose(File);
        return false;
    }

    //figure out how many samples and blocks there are total in the source data
    int nBytesPerBlock = waveData.m_nBlockAlign;
    int nNumBlocks = waveData.m_nSubChunk2Size / nBytesPerBlock;
    int nNumSourceSamples = nNumBlocks * waveData.m_nNumChannels;

    //allocate space for the source samples
    samples.resize(nNumSourceSamples);

    //maximum size of a block is 8 bytes.  4 bytes per samples, 2 channels
    unsigned char pBlockData[8];
    memset(pBlockData, 0, 8);

    //read in the source samples at whatever sample rate / number of channels it might be in
    int nBytesPerSample = nBytesPerBlock / waveData.m_nNumChannels;
    for (int nIndex = 0; nIndex < nNumSourceSamples; nIndex += waveData.m_nNumChannels)
    {
        //read in a block
        if (fread(pBlockData, waveData.m_nBlockAlign, 1, File) != 1)
        {
            fclose(File);
            return false;
        }

        //get the first sample
        samples[nIndex].Value() = PCMToFloat(pBlockData, nBytesPerSample);

        //get the second sample if there is one
        if (waveData.m_nNumChannels == 2)
        {
            samples[nIndex + 1].Value() = PCMToFloat(&pBlockData[nBytesPerSample], nBytesPerSample);
        }
    }

    //re-sample the sample rate up or down as needed
    ResampleData(samples, waveData.m_nSampleRate, sampleRate);

    //handle switching from mono to stereo or vice versa
    ChangeNumChannels(samples, waveData.m_nNumChannels, 1);

    return true;
}

//=====================================================================================
// Oscilators
//=====================================================================================

void AdvancePhase(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    phase += TPhase(frequency.Value() / (float)sampleRate.Value());
    while (phase >= TPhase(1.0f))
        phase -= TPhase(1.0f);
    while (phase < TPhase(0.0f))
        phase += TPhase(1.0f);
}

TAmplitude AdvanceOscilator_Sine(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);
    return TAmplitude(sin(phase.Value()*c_twoPi));
}

TAmplitude AdvanceOscilator_Saw(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);
    return TAmplitude(phase.Value() * 2.0f - 1.0f);
}

TAmplitude AdvanceOscilator_Square(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);
    return TAmplitude(phase.Value() > 0.5f ? 1.0f : -1.0f);
}

TAmplitude AdvanceOscilator_Triangle(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);
    if (phase > TPhase(0.5f))
        return TAmplitude((((1.0f - phase.Value()) * 2.0f) * 2.0f) - 1.0f);
    else
        return TAmplitude(((phase.Value() * 2.0f) * 2.0f) - 1.0f);
}

TAmplitude AdvanceOscilator_Saw_BandLimited(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);

    // sum the harmonics
    TAmplitude ret(0.0f);
    for (int harmonicIndex = 1; harmonicIndex <= 4; ++harmonicIndex)
    {
        TPhase harmonicPhase = phase * TPhase((float)harmonicIndex);
        ret += TAmplitude(sin(harmonicPhase.Value()*c_twoPi) / (float)harmonicIndex);
    }

    //adjust the volume
    ret *= TAmplitude(2.0f / c_pi);

    return ret;
}

TAmplitude AdvanceOscilator_Square_BandLimited(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);

    // sum the harmonics
    TAmplitude ret(0.0f);
    for (int harmonicIndex = 1; harmonicIndex <= 4; ++harmonicIndex)
    {
        float harmonicFactor = (float)harmonicIndex * 2.0f - 1.0f;
        TPhase harmonicPhase = phase * TPhase(harmonicFactor);
        ret += TAmplitude(sin(harmonicPhase.Value()*c_twoPi) / harmonicFactor);
    }

    //adjust the volume
    ret *= TAmplitude(4.0f / c_pi);

    return ret;
}

TAmplitude AdvanceOscilator_Triangle_BandLimited(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);

    // sum the harmonics
    TAmplitude ret(0.0f);
    TAmplitude signFlip(1.0f);
    for (int harmonicIndex = 1; harmonicIndex <= 4; ++harmonicIndex)
    {
        float harmonicFactor = (float)harmonicIndex * 2.0f - 1.0f;
        TPhase harmonicPhase = phase * TPhase(harmonicFactor);
        ret += TAmplitude(sin(harmonicPhase.Value()*c_twoPi) / (harmonicFactor*harmonicFactor)) * signFlip;
        signFlip *= TAmplitude(-1.0f);
    }

    //adjust the volume
    ret *= TAmplitude(8.0f / (c_pi*c_pi));

    return ret;
}

//=====================================================================================
// Main
//=====================================================================================
int main(int argc, char **argv)
{
    //our desired sound parameters
    SSoundSettings sound;
    sound.m_sampleRate = TSamples(44100);
    sound.m_lengthMs = SecondsToMilliseconds(4.0f);

    // create a reverb object with a list of taps
    CMultitapReverb reverb(
        {
            { MilliSecondsToSamples(sound,  79.0f), DBToAmplitude(TDecibels(-25.0f)) },
            { MilliSecondsToSamples(sound, 130.0f), DBToAmplitude(TDecibels(-23.0f)) },
            { MilliSecondsToSamples(sound, 230.0f), DBToAmplitude(TDecibels(-15.0f)) },
            { MilliSecondsToSamples(sound, 340.0f), DBToAmplitude(TDecibels(-23.0f)) },
            { MilliSecondsToSamples(sound, 470.0f), DBToAmplitude(TDecibels(-17.0f)) },
            { MilliSecondsToSamples(sound, 532.0f), DBToAmplitude(TDecibels(-21.0f)) },
            { MilliSecondsToSamples(sound, 662.0f), DBToAmplitude(TDecibels(-13.0f)) },
        }
    );

    // load the wave file if we can
    std::vector<TAmplitude> inputData;
    if (!ReadWaveFile("in.wav", inputData, sound.m_sampleRate.Value()))
    {
        printf("could not load wave file!");
        return 0;
    }

    // allocate space for the output file
    std::vector<TAmplitude> samples;
    samples.resize(inputData.size());

    //apply the delay effect to the file
    const TSamples c_envelopeSize = MilliSecondsToSamples(sound, 50.0f);
    for (TSamples index = TSamples(0), numSamples(samples.size()); index < numSamples; ++index)
    {
        // calculate envelope at front and end of sound.
        TAmplitude envelope(1.0f);
        if (index < c_envelopeSize)
            envelope = TAmplitude((float)index.Value() / (float)c_envelopeSize.Value());
        else if (index >(numSamples - c_envelopeSize))
            envelope = TAmplitude(1.0f) - TAmplitude((float)(index - (numSamples - c_envelopeSize)).Value() / (float)c_envelopeSize.Value());

        // put our input through the reverb process
        TAmplitude outSample = reverb.ProcessSample(inputData[index.Value()]);

        // mix the sample with the offset sample and apply the envelope for the front and back of the sound
        samples[index.Value()] = outSample * envelope;
    }

    // normalize the amplitude of the samples to make sure they are as loud as possible without clipping
    // give 3db of headroom
    NormalizeSamples(samples, DBToAmplitude(TDecibels(-3.0f)));

    // save as a wave file
    WriteWaveFile<int16_t>("out.wav", samples, sound);

    return 0;
}

Links

Even though this is a pretty ghetto way to do reverb, it can be passable, and is not as expensive as some other reverb methods computationally.

There will soon be a post on how to do convoultion reverb, which is how the pros do reverb. It also makes it a lot easier to get the exact reverb type sound you want, because it lets you use a “reference recording” of the particular reverb you want. It’s cool stuff!

Sound On Sound: Creating & Using Custom Delay Effects
Wikipedia: Reverberation

DIY Synth: Delay Effect (Echo)

This is a part of the DIY Synthesizer series of posts where each post is roughly built upon the knowledge of the previous posts. If you are lost, check the earlier posts!

The delay effect is a pretty simple effect that adds echo to a sound.

Two parameters at minimum are usually exposed on delay effects: delay time and feedback.

Delay time is how long the echo is, and feedback controls how much the sound feeds back into itself – or how loud the echo is.

Implementing delay is actually super simple. You figure out how many samples you need to hold “delay time” amount of sound (using the sample rate of the sound), and then just keep a circular buffer of previous sound samples. The feedback value controls how much the sound feeds back into the delay buffer.

Here is some pseudo code for how it might be implemented:

//The size of the delay buffer is controlled by the time parameter 
// make sure the delay buffer is initialized with silence
delayBuffer[delaySamples] = 0;

// start the circular delay buffer index at 0
delayIndex = 0;

// process each input sample
for (i = 0; i = delaySamples)
    delayIndex = 0;
}

Delay is an effect in it’s own right, but it’s also the basis for many other effect types as well. Flange, phaser, and chorus for instance rely on a delay buffer to be able to do their work.

Some common variations on the delay effect include having different delay parameters for the left and right channel in a stereo sound, or modifying the output sound before it goes into the delay buffer to make it so that the echo sounds a bit different than the original. For instance you could put a lowpass or highpass filter on the echo, or even flange it!

Play around with it and get creative. You might make some interesting and unique sounding sounds (:

Audio Samples

These files were processed with the sample code in this post.

	Legend Quote	Cymbals
Raw	legend.wav	cymbal.wav
250ms delay, -3db feedback	legend_250_3.wav	cymbal_250_3.wav
250ms delay, -12db feedback	legend_250_12.wav	cymbal_250_12.wav
333ms delay, -6db feedback	legend_333_6.wav	cymbal_333_6.wav

Sample Code

The sample code below reads in.wav, applies a delay of 333ms with -12db feedback and writes it as out.wav.

Usual caveat: the wave reading code isn’t bullet proof (sorry). Seems to work best with 16 bit mono wave files. You can use libsndfile if you want more reliable and more diverse sound loading options!

#define _CRT_SECURE_NO_WARNINGS
   
#include 
#include 
#include 
#include 
#include 
#include 
   
#define _USE_MATH_DEFINES
#include 
   
//=====================================================================================
// SNumeric - uses phantom types to enforce type safety
//=====================================================================================
template 
struct SNumeric
{
public:
    explicit SNumeric(const T &value) : m_value(value) { }
    SNumeric() : m_value() { }
    inline T& Value() { return m_value; }
    inline const T& Value() const { return m_value; }
   
    typedef SNumeric TType;
    typedef T TInnerType;
   
    // Math Operations
    TType operator+ (const TType &b) const
    {
        return TType(this->Value() + b.Value());
    }
   
    TType operator- (const TType &b) const
    {
        return TType(this->Value() - b.Value());
    }
   
    TType operator* (const TType &b) const
    {
        return TType(this->Value() * b.Value());
    }
   
    TType operator/ (const TType &b) const
    {
        return TType(this->Value() / b.Value());
    }
   
    TType& operator+= (const TType &b)
    {
        Value() += b.Value();
        return *this;
    }
   
    TType& operator-= (const TType &b)
    {
        Value() -= b.Value();
        return *this;
    }
   
    TType& operator*= (const TType &b)
    {
        Value() *= b.Value();
        return *this;
    }
   
    TType& operator/= (const TType &b)
    {
        Value() /= b.Value();
        return *this;
    }
   
    TType& operator++ ()
    {
        Value()++;
        return *this;
    }
   
    TType& operator-- ()
    {
        Value()--;
        return *this;
    }
   
    // Extended Math Operations
    template 
    T Divide(const TType &b)
    {
        return ((T)this->Value()) / ((T)b.Value());
    }
   
    // Logic Operations
    bool operatorValue() < b.Value();
    }
    bool operatorValue()  (const TType &b) const {
        return this->Value() > b.Value();
    }
    bool operator>= (const TType &b) const {
        return this->Value() >= b.Value();
    }
    bool operator== (const TType &b) const {
        return this->Value() == b.Value();
    }
    bool operator!= (const TType &b) const {
        return this->Value() != b.Value();
    }
   
private:
    T m_value;
};
   
//=====================================================================================
// Typedefs
//=====================================================================================
   
typedef uint8_t uint8;
typedef uint16_t uint16;
typedef uint32_t uint32;
typedef int16_t int16;
typedef int32_t int32;
   
// type safe types!
typedef SNumeric      TFrequency;
typedef SNumeric        TTimeMs;
typedef SNumeric       TSamples;
typedef SNumeric   TFractionalSamples;
typedef SNumeric       TDecibels;
typedef SNumeric      TAmplitude;
typedef SNumeric          TPhase;
   
//=====================================================================================
// Constants
//=====================================================================================
   
static const float c_pi = (float)M_PI;
static const float c_twoPi = c_pi * 2.0f;
   
//=====================================================================================
// Structs
//=====================================================================================
   
struct SSoundSettings
{
    TSamples        m_sampleRate;
    TTimeMs         m_lengthMs;
    TSamples        m_currentSample;
};

//=====================================================================================
// CDelay -> the delay buffer object
//=====================================================================================

class CDelay
{
public:
    CDelay(TSamples delayTime, TAmplitude feedback)
        : m_feedBack(feedback)
        , m_sampleIndex(0)
    {
        m_samples.resize(delayTime.Value());
        std::fill(m_samples.begin(), m_samples.end(), TAmplitude(0.0f));
    }

    TAmplitude ProcessSample(TAmplitude sample)
    {
        TAmplitude ret = sample + m_samples[m_sampleIndex];
        m_samples[m_sampleIndex] = ret * m_feedBack;
        m_sampleIndex++;
        if (m_sampleIndex >= m_samples.size())
            m_sampleIndex = 0;
        return ret;
    }

private:
    TAmplitude              m_feedBack;
    std::vector m_samples;
    size_t                  m_sampleIndex;
};

//=====================================================================================
// Conversion Functions
//=====================================================================================
inline TDecibels AmplitudeToDB(TAmplitude volume)
{
    return TDecibels(log10(volume.Value()));
}
   
inline TAmplitude DBToAmplitude(TDecibels dB)
{
    return TAmplitude(pow(10.0f, dB.Value() / 20.0f));
}
   
TSamples SecondsToSamples(const SSoundSettings &s, float seconds)
{
    return TSamples((int)(seconds * (float)s.m_sampleRate.Value()));
}
   
TSamples MilliSecondsToSamples(const SSoundSettings &s, float milliseconds)
{
    return SecondsToSamples(s, milliseconds / 1000.0f);
}
   
TTimeMs SecondsToMilliseconds(float seconds)
{
    return TTimeMs((uint32)(seconds * 1000.0f));
}
   
TFrequency Frequency(float octave, float note)
{
    /* frequency = 440×(2^(n/12))
    Notes:
    0  = A
    1  = A#
    2  = B
    3  = C
    4  = C#
    5  = D
    6  = D#
    7  = E
    8  = F
    9  = F#
    10 = G
    11 = G# */
    return TFrequency((float)(440 * pow(2.0, ((double)((octave - 4) * 12 + note)) / 12.0)));
}
   
template 
T AmplitudeToAudioSample(const TAmplitude& in)
{
    const T c_min = std::numeric_limits::min();
    const T c_max = std::numeric_limits::max();
    const float c_minFloat = (float)c_min;
    const float c_maxFloat = (float)c_max;
   
    float ret = in.Value() * c_maxFloat;
   
    if (ret  c_maxFloat)
        return c_max;
   
    return (T)ret;
}
 
TAmplitude GetLerpedAudioSample(const std::vector& samples, TFractionalSamples& index)
{
    // get the index of each sample and the fractional blend amount
    uint32 a = (uint32)floor(index.Value());
    uint32 b = a + 1;
    float fract = index.Value() - floor(index.Value());
 
    // get our two amplitudes
    float ampA = 0.0f;
    if (a >= 0 && a = 0 && b < samples.size())
        ampB = samples[b].Value();
 
    // return the lerped result
    return TAmplitude(fract * ampB + (1.0f - fract) * ampA);
}
 
void NormalizeSamples(std::vector& samples, TAmplitude maxAmplitude)
{
    // nothing to do if no samples
    if (samples.size() == 0)
        return;
 
    // 1) find the largest absolute value in the samples.
    TAmplitude largestAbsVal = TAmplitude(abs(samples.front().Value()));
    std::for_each(samples.begin() + 1, samples.end(), [&largestAbsVal](const TAmplitude &a)
        {
            TAmplitude absVal = TAmplitude(abs(a.Value()));
            if (absVal > largestAbsVal)
                largestAbsVal = absVal;
        }
    );
 
    // 2) adjust largestAbsVal so that when we divide all samples, none will be bigger than maxAmplitude
    // if the value we are going to divide by is <= 0, bail out
    largestAbsVal /= maxAmplitude;
    if (largestAbsVal = TAmplitude(1.0f))
            {
                int ijkl = 0;
            }
        }
    );
}
 
void ResampleData(std::vector& samples, int srcSampleRate, int destSampleRate)
{
    //if the requested sample rate is the sample rate it already is, bail out and do nothing
    if (srcSampleRate == destSampleRate)
        return;
 
    //calculate the ratio of the old sample rate to the new
    float fResampleRatio = (float)destSampleRate / (float)srcSampleRate;
     
    //calculate how many samples the new data will have and allocate the new sample data
    int nNewDataNumSamples = (int)((float)samples.size() * fResampleRatio);
 
    std::vector newSamples;
    newSamples.resize(nNewDataNumSamples);
 
    //get each lerped output sample.  There are higher quality ways to resample
    for(int nIndex = 0; nIndex < nNewDataNumSamples; ++nIndex)
        newSamples[nIndex] = GetLerpedAudioSample(samples, TFractionalSamples((float)nIndex / fResampleRatio));
     
    //free the old data and set the new data
    std::swap(samples, newSamples);
}
 
void ChangeNumChannels(std::vector& samples, int nSrcChannels, int nDestChannels)
{
    //if the number of channels requested is the number of channels already there, or either number of channels is not mono or stereo, return
    if(nSrcChannels == nDestChannels ||
       nSrcChannels  2 ||
       nDestChannels  2)
    {
        return;
    }
 
    //if converting from mono to stereo, duplicate the mono channel to make stereo
    if(nDestChannels == 2)
    {
        std::vector newSamples;
        newSamples.resize(samples.size() * 2);
        for (size_t index = 0; index < samples.size(); ++index)
        {
            newSamples[index * 2] = samples[index];
            newSamples[index * 2 + 1] = samples[index];
        }
 
        std::swap(samples, newSamples);
    }
    //else converting from stereo to mono, mix the stereo channels together to make mono
    else
    {
        std::vector newSamples;
        newSamples.resize(samples.size() / 2);
        for (size_t index = 0; index < samples.size() / 2; ++index)
            newSamples[index] = samples[index * 2] + samples[index * 2 + 1];
 
        std::swap(samples, newSamples);
    }
}
 
float PCMToFloat(unsigned char *pPCMData, int nNumBytes)
{
    switch(nNumBytes)
    {
        case 1:
        {
            uint8 data = pPCMData[0];
            return (float)data / 255.0f;
        }
        case 2:
        {
            int16 data = pPCMData[1] << 8 | pPCMData[0];
            return ((float)data) / ((float)0x00007fff);
        }
        case 3:
        {
            int32 data = pPCMData[2] << 16 | pPCMData[1] << 8 | pPCMData[0];
            return ((float)data) / ((float)0x007fffff);
        }
        case 4:
        {
            int32 data = pPCMData[3] << 24 | pPCMData[2] << 16 | pPCMData[1] << 8 | pPCMData[0];
            return ((float)data) / ((float)0x7fffffff);
        }
        default:
        {
            return 0.0f;
        }
    }
}
   
//=====================================================================================
// Wave File Writing Code
//=====================================================================================
struct SMinimalWaveFileHeader
{
    //the main chunk
    unsigned char m_szChunkID[4];      //0
    uint32        m_nChunkSize;        //4
    unsigned char m_szFormat[4];       //8
   
    //sub chunk 1 "fmt "
    unsigned char m_szSubChunk1ID[4];  //12
    uint32        m_nSubChunk1Size;    //16
    uint16        m_nAudioFormat;      //18
    uint16        m_nNumChannels;      //20
    uint32        m_nSampleRate;       //24
    uint32        m_nByteRate;         //28
    uint16        m_nBlockAlign;       //30
    uint16        m_nBitsPerSample;    //32
   
    //sub chunk 2 "data"
    unsigned char m_szSubChunk2ID[4];  //36
    uint32        m_nSubChunk2Size;    //40
   
    //then comes the data!
};
   
//this writes a wave file
template 
bool WriteWaveFile(const char *fileName, const std::vector &samples, const SSoundSettings &sound)
{
    //open the file if we can
    FILE *file = fopen(fileName, "w+b");
    if (!file)
        return false;
   
    //calculate bits per sample and the data size
    const int32 bitsPerSample = sizeof(T) * 8;
    const int dataSize = samples.size() * sizeof(T);
   
    SMinimalWaveFileHeader waveHeader;
   
    //fill out the main chunk
    memcpy(waveHeader.m_szChunkID, "RIFF", 4);
    waveHeader.m_nChunkSize = dataSize + 36;
    memcpy(waveHeader.m_szFormat, "WAVE", 4);
   
    //fill out sub chunk 1 "fmt "
    memcpy(waveHeader.m_szSubChunk1ID, "fmt ", 4);
    waveHeader.m_nSubChunk1Size = 16;
    waveHeader.m_nAudioFormat = 1;
    waveHeader.m_nNumChannels = 1;
    waveHeader.m_nSampleRate = sound.m_sampleRate.Value();
    waveHeader.m_nByteRate = sound.m_sampleRate.Value() * 1 * bitsPerSample / 8;
    waveHeader.m_nBlockAlign = 1 * bitsPerSample / 8;
    waveHeader.m_nBitsPerSample = bitsPerSample;
   
    //fill out sub chunk 2 "data"
    memcpy(waveHeader.m_szSubChunk2ID, "data", 4);
    waveHeader.m_nSubChunk2Size = dataSize;
   
    //write the header
    fwrite(&waveHeader, sizeof(SMinimalWaveFileHeader), 1, file);
   
    //write the wave data itself, converting it from float to the type specified
    std::vector outSamples;
    outSamples.resize(samples.size());
    for (size_t index = 0; index < samples.size(); ++index)
        outSamples[index] = AmplitudeToAudioSample(samples[index]);
    fwrite(&outSamples[0], dataSize, 1, file);
   
    //close the file and return success
    fclose(file);
    return true;
}
 
//loads a wave file in.  Converts from source format into the specified format
// TOTAL HONESTY: some wave files seem to have problems being loaded through this function and I don't have
// time to investigate why.  It seems to work best with 16 bit mono wave files.
// If you need more robust file loading, check out libsndfile at http://www.mega-nerd.com/libsndfile/
bool ReadWaveFile(const char *fileName, std::vector& samples, int32 sampleRate)
{
    //open the file if we can
    FILE *File = fopen(fileName,"rb");
    if(!File)
    {
        return false;
    }
 
    //read the main chunk ID and make sure it's "RIFF"
    char buffer[5];
    buffer[4] = 0;
    if(fread(buffer,4,1,File) != 1 || strcmp(buffer,"RIFF"))
    {
        fclose(File);
        return false;
    }
 
    //read the main chunk size
    uint32 nChunkSize;
    if(fread(&nChunkSize,4,1,File) != 1)
    {
        fclose(File);
        return false;
    }
 
    //read the format and make sure it's "WAVE"
    if(fread(buffer,4,1,File) != 1 || strcmp(buffer,"WAVE"))
    {
        fclose(File);
        return false;
    }
 
    long chunkPosFmt = -1;
    long chunkPosData = -1;
 
    while(chunkPosFmt == -1 || chunkPosData == -1)
    {
        //read a sub chunk id and a chunk size if we can
        if(fread(buffer,4,1,File) != 1 || fread(&nChunkSize,4,1,File) != 1)
        {
            fclose(File);
            return false;
        }
 
        //if we hit a fmt
        if(!strcmp(buffer,"fmt "))
        {
            chunkPosFmt = ftell(File) - 8;
        }
        //else if we hit a data
        else if(!strcmp(buffer,"data"))
        {
            chunkPosData = ftell(File) - 8;
        }
 
        //skip to the next chunk
        fseek(File,nChunkSize,SEEK_CUR);
    }
 
    //we'll use this handy struct to load in 
    SMinimalWaveFileHeader waveData;
 
    //load the fmt part if we can
    fseek(File,chunkPosFmt,SEEK_SET);
    if(fread(&waveData.m_szSubChunk1ID,24,1,File) != 1)
    {
        fclose(File);
        return false;
    }
 
    //load the data part if we can
    fseek(File,chunkPosData,SEEK_SET);
    if(fread(&waveData.m_szSubChunk2ID,8,1,File) != 1)
    {
        fclose(File);
        return false;
    }
 
    //verify a couple things about the file data
    if(waveData.m_nAudioFormat != 1 ||       //only pcm data
       waveData.m_nNumChannels  2 ||        //must not have more than 2
       waveData.m_nBitsPerSample > 32 ||     //32 bits per sample max
       waveData.m_nBitsPerSample % 8 != 0 || //must be a multiple of 8 bites
       waveData.m_nBlockAlign > 8)           //blocks must be 8 bytes or lower
    {
        fclose(File);
        return false;
    }
 
    //figure out how many samples and blocks there are total in the source data
    int nBytesPerBlock = waveData.m_nBlockAlign;
    int nNumBlocks = waveData.m_nSubChunk2Size / nBytesPerBlock;
    int nNumSourceSamples = nNumBlocks * waveData.m_nNumChannels;
 
    //allocate space for the source samples
    samples.resize(nNumSourceSamples);
 
    //maximum size of a block is 8 bytes.  4 bytes per samples, 2 channels
    unsigned char pBlockData[8];
    memset(pBlockData,0,8);
 
    //read in the source samples at whatever sample rate / number of channels it might be in
    int nBytesPerSample = nBytesPerBlock / waveData.m_nNumChannels;
    for(int nIndex = 0; nIndex = TPhase(1.0f))
        phase -= TPhase(1.0f);
    while (phase  0.5f ? 1.0f : -1.0f);
}
   
TAmplitude AdvanceOscilator_Triangle(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);
    if (phase > TPhase(0.5f))
        return TAmplitude((((1.0f - phase.Value()) * 2.0f) * 2.0f) - 1.0f);
    else
        return TAmplitude(((phase.Value() * 2.0f) * 2.0f) - 1.0f);
}
   
TAmplitude AdvanceOscilator_Saw_BandLimited(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);
   
    // sum the harmonics
    TAmplitude ret(0.0f);
    for (int harmonicIndex = 1; harmonicIndex <= 4; ++harmonicIndex)
    {
        TPhase harmonicPhase = phase * TPhase((float)harmonicIndex);
        ret += TAmplitude(sin(harmonicPhase.Value()*c_twoPi) / (float)harmonicIndex);
    }
   
    //adjust the volume
    ret *= TAmplitude(2.0f / c_pi);
       
    return ret;
}
   
TAmplitude AdvanceOscilator_Square_BandLimited(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);
   
    // sum the harmonics
    TAmplitude ret(0.0f);
    for (int harmonicIndex = 1; harmonicIndex <= 4; ++harmonicIndex)
    {
        float harmonicFactor = (float)harmonicIndex * 2.0f - 1.0f;
        TPhase harmonicPhase = phase * TPhase(harmonicFactor);
        ret += TAmplitude(sin(harmonicPhase.Value()*c_twoPi) / harmonicFactor);
    }
   
    //adjust the volume
    ret *= TAmplitude(4.0f / c_pi);
   
    return ret;
}
   
TAmplitude AdvanceOscilator_Triangle_BandLimited(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);
   
    // sum the harmonics
    TAmplitude ret(0.0f);
    TAmplitude signFlip(1.0f);
    for (int harmonicIndex = 1; harmonicIndex <= 4; ++harmonicIndex)
    {
        float harmonicFactor = (float)harmonicIndex * 2.0f - 1.0f;
        TPhase harmonicPhase = phase * TPhase(harmonicFactor);
        ret += TAmplitude(sin(harmonicPhase.Value()*c_twoPi) / (harmonicFactor*harmonicFactor)) * signFlip;
        signFlip *= TAmplitude(-1.0f);
    }
   
    //adjust the volume
    ret *= TAmplitude(8.0f / (c_pi*c_pi));
   
    return ret;
}
 
//=====================================================================================
// Main
//=====================================================================================
int main(int argc, char **argv)
{
    //our desired sound parameters
    SSoundSettings sound;
    sound.m_sampleRate = TSamples(44100);
    sound.m_lengthMs = SecondsToMilliseconds(4.0f);

    // create a delay effect object with the specified parameters
    const TSamples c_delayTime = MilliSecondsToSamples(sound, 333.0f);
    const TAmplitude c_delayFeedback = DBToAmplitude(TDecibels(-12.0f));
    CDelay delay(c_delayTime, c_delayFeedback);
  
    // load the wave file if we can
    std::vector inputData;
    if (!ReadWaveFile("in.wav", inputData, sound.m_sampleRate.Value()))
    {
        printf("could not load wave file!");
        return 0;
    }
 
    // allocate space for the output file
    std::vector samples;
    samples.resize(inputData.size());

    //apply the delay effect to the file
    const TSamples c_envelopeSize = MilliSecondsToSamples(sound, 50.0f);
    for (TSamples index = TSamples(0), numSamples(samples.size()); index < numSamples; ++index)
    {
        // calculate envelope at front and end of sound.
        TAmplitude envelope(1.0f);
        if (index  (numSamples - c_envelopeSize))
            envelope = TAmplitude(1.0f) - TAmplitude((float)(index - (numSamples - c_envelopeSize)).Value() / (float)c_envelopeSize.Value());

        // put our input through the delay buffer
        TAmplitude outSample = delay.ProcessSample(inputData[index.Value()]);
 
        // mix the sample with the offset sample and apply the envelope for the front and back of the sound
        samples[index.Value()] = outSample * envelope;
    }
   
    // normalize the amplitude of the samples to make sure they are as loud as possible without clipping
    // give 3db of headroom
    NormalizeSamples(samples, DBToAmplitude(TDecibels(-3.0f)));
 
    // save as a wave file
    WriteWaveFile("out.wav", samples, sound);
 
    return 0;
}

Links

Wikipedia: Delay Effect

Next post will be about multitap reverb, which is similar to delay, but a little bit different.

DIY Synth: Flange Effect

This is a part of the DIY Synthesizer series of posts where each post is roughly built upon the knowledge of the previous posts. If you are lost, check the earlier posts!

Flange is a pretty interesting audio effect. It can give character to the most monotonous of sounds and it can also make good sounds even better.

Before we dive into it, check out these raw and flanged sound files to get a glimpse of what we’re talking about. The flanged files were created with the simple c++ sample code at the end of this chapter. Only standard header files used, so you too will be able to flange sounds by the end of this post!

A clip from the movie “legend”
Raw: legend.wav
Flanged: legend_f.wav

A drum loop
Raw: cymbal.wav
Flanged: cymbal_f.wav

How Does it Work?

The idea behind flange is actually pretty simple. All you do is you mix a sound with itself, but have one of the copies speed up and slow down (like, on a sine wave) instead of letting it play at normal speed. This makes the sounds mix together at different (but similar) points in time. Since sound is made up of peaks (positive numbers) and valleys (negative numbers), mixing the sound with the time offset sound causes some of the peaks and valleys to grow larger, and causes others to get smaller as they cancel each other out. This results in the distinctive flange sound.

The simple way to flange a file would be to load all of the audio samples into memory and do something like this:

for (int i = 0; i = flangeSampleDepth)
       output[i] += input[i - offset];
}

It’s important to note though that for better quality flanging sounds, you should actually use a flange with sub-sample accuracy. That way if your sine wave says it wants sample 3.6, it means your resulting sample should sample[3] * 0.4 + sample[4] * 0.6. That is just doing a linear interpolation to get the “inbetween” data of the samples, which works well enough for my needs, but higher quality flangers will use higher quality interpolation techniques and curve fitting.

Who invented the flanger is apparently not agreed on, but it’s origin is back in the days of tape deck based audio recording studios. If you put your finger on one of the tape flanges and slow it down, if you then mix that result with an undelayed version of the same sound, you’d start to hear the flanging effect.

These days we rely on hardware and software to emulate that.

If you have ever accidentally played too many copies of the same sound too closely together, you’ve probably heard a flange-like effect. It sounds fairly similar, but you don’t get the sweeping effect that you do with flange.

Some flanges also feed their output back into their input to further the effect and add some resonance. We aren’t doing that in this post, but feel free to experiment with that on your own! (check the links section for more info)

It’s important to note that you can use the same process on LIVE music to do flanging in real time. If you have a “delay buffer” to hold the last N seconds of sound, you can use the sine wave to control what part of that delay buffer mixes with the current sound coming out.

Flange Parameters

Flangers often have two parameters (at least). One parameter controls the frequency of the LFO (low frequency oscilator) sine wave. The other parameter controls it’s “depth” which means how far backwards or forwards in time the non-real-time sound can go.

Good frequency values of the oscilator depends entirely on the sound you are flanging as well as the style you are going for, but usually small values like less than 5 hz works best. I usually will use a value less than 1, and for best results I like to make it a value that isn’t likely to line up with the tempo of the music – such as perhaps 0.374.

The reason for this is that flange adds some interesting flavor to your sound, and if you had a value like 0.25 for your flanger, every 4 notes would always sound the same and line up with the flange effect. if instead, you have it at something like 0.374, you can play a repeating melody SEVERAL times over and over, and due to the flange effect, each time through the notes will sound different and accoustically interesting.

The best values of the other parameter (the flange depth), also varies depending on your source sounds and the sound you are going after. People usually suggest doing no more than 20ms though. I personally really enjoy the sound of a much smaller value, such as 1ms. Play around with different values and see what you like!

Flanging Basic Wave Forms

Here are some more flange samples of the basic wave forms, to give you an idea of how flange behaves with the various wave forms:

Triangle:
Raw: triangle.wav
Flanged: triangle_f.wav

Bandlimited Triangle:
Raw: triangleBL.wav
Flanged: triangleBL_f.wav

Saw:
Raw: saw.wav
Flanged: saw_f.wav

Bandlimited Saw:
Raw: sawBL.wav
Flanged: sawBL_f.wav

Square:
Raw: square.wav
Flanged: square_f.wav

Bandlimited Square:
Raw: squareBL.wav
Flanged: squareBL_f.wav

Sine:
Raw: sine1.wav
Flanged: sine_f.wav

Sample Code

This sample code reads in “in.wav” flanges it at 4hz with a 1ms depth, and writes out “out.wav”. Note, the wave file reading code is not bullet proof, sorry! It seems to work well with mono 16 bit wave files, but if you need better sound file reading, i suggest looking at libsndfile (link in links section!)

#define _CRT_SECURE_NO_WARNINGS
  
#include 
#include 
#include 
#include 
#include 
#include 
  
#define _USE_MATH_DEFINES
#include 
  
//=====================================================================================
// SNumeric - uses phantom types to enforce type safety
//=====================================================================================
template 
struct SNumeric
{
public:
    explicit SNumeric(const T &value) : m_value(value) { }
    SNumeric() : m_value() { }
    inline T& Value() { return m_value; }
    inline const T& Value() const { return m_value; }
  
    typedef SNumeric TType;
    typedef T TInnerType;
  
    // Math Operations
    TType operator+ (const TType &b) const
    {
        return TType(this->Value() + b.Value());
    }
  
    TType operator- (const TType &b) const
    {
        return TType(this->Value() - b.Value());
    }
  
    TType operator* (const TType &b) const
    {
        return TType(this->Value() * b.Value());
    }
  
    TType operator/ (const TType &b) const
    {
        return TType(this->Value() / b.Value());
    }
  
    TType& operator+= (const TType &b)
    {
        Value() += b.Value();
        return *this;
    }
  
    TType& operator-= (const TType &b)
    {
        Value() -= b.Value();
        return *this;
    }
  
    TType& operator*= (const TType &b)
    {
        Value() *= b.Value();
        return *this;
    }
  
    TType& operator/= (const TType &b)
    {
        Value() /= b.Value();
        return *this;
    }
  
    TType& operator++ ()
    {
        Value()++;
        return *this;
    }
  
    TType& operator-- ()
    {
        Value()--;
        return *this;
    }
  
    // Extended Math Operations
    template 
    T Divide(const TType &b)
    {
        return ((T)this->Value()) / ((T)b.Value());
    }
  
    // Logic Operations
    bool operatorValue() < b.Value();
    }
    bool operatorValue()  (const TType &b) const {
        return this->Value() > b.Value();
    }
    bool operator>= (const TType &b) const {
        return this->Value() >= b.Value();
    }
    bool operator== (const TType &b) const {
        return this->Value() == b.Value();
    }
    bool operator!= (const TType &b) const {
        return this->Value() != b.Value();
    }
  
private:
    T m_value;
};
  
//=====================================================================================
// Typedefs
//=====================================================================================
  
typedef uint8_t uint8;
typedef uint16_t uint16;
typedef uint32_t uint32;
typedef int16_t int16;
typedef int32_t int32;
  
// type safe types!
typedef SNumeric      TFrequency;
typedef SNumeric        TTimeMs;
typedef SNumeric       TSamples;
typedef SNumeric   TFractionalSamples;
typedef SNumeric       TDecibels;
typedef SNumeric      TAmplitude;
typedef SNumeric      TChannelCount;
typedef SNumeric          TPhase;
  
//=====================================================================================
// Constants
//=====================================================================================
  
static const float c_pi = (float)M_PI;
static const float c_twoPi = c_pi * 2.0f;
  
//=====================================================================================
// Structs
//=====================================================================================
  
struct SSoundSettings
{
    TSamples        m_sampleRate;
    TTimeMs         m_lengthMs;
    TChannelCount   m_numChannels;
    TSamples        m_currentSample;
};
  
//=====================================================================================
// Conversion Functions
//=====================================================================================
inline TDecibels AmplitudeToDB(TAmplitude volume)
{
    return TDecibels(log10(volume.Value()));
}
  
inline TAmplitude DBToAmplitude(TDecibels dB)
{
    return TAmplitude(pow(10.0f, dB.Value() / 20.0f));
}
  
TSamples SecondsToSamples(const SSoundSettings &s, float seconds)
{
    return TSamples((int)(seconds * (float)s.m_sampleRate.Value()));
}
  
TSamples MilliSecondsToSamples(const SSoundSettings &s, float milliseconds)
{
    return SecondsToSamples(s, milliseconds / 1000.0f);
}
  
TTimeMs SecondsToMilliseconds(float seconds)
{
    return TTimeMs((uint32)(seconds * 1000.0f));
}
  
TFrequency Frequency(float octave, float note)
{
    /* frequency = 440×(2^(n/12))
    Notes:
    0  = A
    1  = A#
    2  = B
    3  = C
    4  = C#
    5  = D
    6  = D#
    7  = E
    8  = F
    9  = F#
    10 = G
    11 = G# */
    return TFrequency((float)(440 * pow(2.0, ((double)((octave - 4) * 12 + note)) / 12.0)));
}
  
template 
T AmplitudeToAudioSample(const TAmplitude& in)
{
    const T c_min = std::numeric_limits::min();
    const T c_max = std::numeric_limits::max();
    const float c_minFloat = (float)c_min;
    const float c_maxFloat = (float)c_max;
  
    float ret = in.Value() * c_maxFloat;
  
    if (ret  c_maxFloat)
        return c_max;
  
    return (T)ret;
}

TAmplitude GetLerpedAudioSample(const std::vector& samples, TFractionalSamples& index)
{
    // get the index of each sample and the fractional blend amount
    uint32 a = (uint32)floor(index.Value());
    uint32 b = a + 1;
    float fract = index.Value() - floor(index.Value());

    // get our two amplitudes
    float ampA = 0.0f;
    if (a >= 0 && a = 0 && b < samples.size())
        ampB = samples[b].Value();

    // return the lerped result
    return TAmplitude(fract * ampB + (1.0f - fract) * ampA);
}

void NormalizeSamples(std::vector& samples, TAmplitude maxAmplitude)
{
    // nothing to do if no samples
    if (samples.size() == 0)
        return;

    // 1) find the largest absolute value in the samples.
    TAmplitude largestAbsVal = TAmplitude(abs(samples.front().Value()));
    std::for_each(samples.begin() + 1, samples.end(), [&largestAbsVal](const TAmplitude &a)
        {
            TAmplitude absVal = TAmplitude(abs(a.Value()));
            if (absVal > largestAbsVal)
                largestAbsVal = absVal;
        }
    );

    // 2) adjust largestAbsVal so that when we divide all samples, none will be bigger than maxAmplitude
    // if the value we are going to divide by is <= 0, bail out
    largestAbsVal /= maxAmplitude;
    if (largestAbsVal = TAmplitude(1.0f))
            {
                int ijkl = 0;
            }
        }
    );
}

void ResampleData(std::vector& samples, int srcSampleRate, int destSampleRate)
{
    //if the requested sample rate is the sample rate it already is, bail out and do nothing
    if (srcSampleRate == destSampleRate)
        return;

    //calculate the ratio of the old sample rate to the new
    float fResampleRatio = (float)destSampleRate / (float)srcSampleRate;
    
    //calculate how many samples the new data will have and allocate the new sample data
    int nNewDataNumSamples = (int)((float)samples.size() * fResampleRatio);

    std::vector newSamples;
    newSamples.resize(nNewDataNumSamples);

    //get each lerped output sample.  There are higher quality ways to resample
    for(int nIndex = 0; nIndex < nNewDataNumSamples; ++nIndex)
        newSamples[nIndex] = GetLerpedAudioSample(samples, TFractionalSamples((float)nIndex / fResampleRatio));
    
    //free the old data and set the new data
    std::swap(samples, newSamples);
}

void ChangeNumChannels(std::vector& samples, int nSrcChannels, int nDestChannels)
{
    //if the number of channels requested is the number of channels already there, or either number of channels is not mono or stereo, return
    if(nSrcChannels == nDestChannels ||
       nSrcChannels  2 ||
       nDestChannels  2)
    {
        return;
    }

    //if converting from mono to stereo, duplicate the mono channel to make stereo
    if(nDestChannels == 2)
    {
        std::vector newSamples;
        newSamples.resize(samples.size() * 2);
        for (size_t index = 0; index < samples.size(); ++index)
        {
            newSamples[index * 2] = samples[index];
            newSamples[index * 2 + 1] = samples[index];
        }

        std::swap(samples, newSamples);
    }
    //else converting from stereo to mono, mix the stereo channels together to make mono
    else
    {
        std::vector newSamples;
        newSamples.resize(samples.size() / 2);
        for (size_t index = 0; index < samples.size() / 2; ++index)
            newSamples[index] = samples[index * 2] + samples[index * 2 + 1];

        std::swap(samples, newSamples);
    }
}

float PCMToFloat(unsigned char *pPCMData, int nNumBytes)
{
    switch(nNumBytes)
    {
        case 1:
        {
            uint8 data = pPCMData[0];
            return (float)data / 255.0f;
        }
        case 2:
        {
            int16 data = pPCMData[1] << 8 | pPCMData[0];
            return ((float)data) / ((float)0x00007fff);
        }
        case 3:
        {
            int32 data = pPCMData[2] << 16 | pPCMData[1] << 8 | pPCMData[0];
            return ((float)data) / ((float)0x007fffff);
        }
        case 4:
        {
            int32 data = pPCMData[3] << 24 | pPCMData[2] << 16 | pPCMData[1] << 8 | pPCMData[0];
            return ((float)data) / ((float)0x7fffffff);
        }
        default:
        {
            return 0.0f;
        }
    }
}
  
//=====================================================================================
// Wave File Writing Code
//=====================================================================================
struct SMinimalWaveFileHeader
{
    //the main chunk
    unsigned char m_szChunkID[4];      //0
    uint32        m_nChunkSize;        //4
    unsigned char m_szFormat[4];       //8
  
    //sub chunk 1 "fmt "
    unsigned char m_szSubChunk1ID[4];  //12
    uint32        m_nSubChunk1Size;    //16
    uint16        m_nAudioFormat;      //18
    uint16        m_nNumChannels;      //20
    uint32        m_nSampleRate;       //24
    uint32        m_nByteRate;         //28
    uint16        m_nBlockAlign;       //30
    uint16        m_nBitsPerSample;    //32
  
    //sub chunk 2 "data"
    unsigned char m_szSubChunk2ID[4];  //36
    uint32        m_nSubChunk2Size;    //40
  
    //then comes the data!
};
  
//this writes a wave file
template 
bool WriteWaveFile(const char *fileName, const std::vector &samples, const SSoundSettings &sound)
{
    //open the file if we can
    FILE *file = fopen(fileName, "w+b");
    if (!file)
        return false;
  
    //calculate bits per sample and the data size
    const int32 bitsPerSample = sizeof(T) * 8;
    const int dataSize = samples.size() * sizeof(T);
  
    SMinimalWaveFileHeader waveHeader;
  
    //fill out the main chunk
    memcpy(waveHeader.m_szChunkID, "RIFF", 4);
    waveHeader.m_nChunkSize = dataSize + 36;
    memcpy(waveHeader.m_szFormat, "WAVE", 4);
  
    //fill out sub chunk 1 "fmt "
    memcpy(waveHeader.m_szSubChunk1ID, "fmt ", 4);
    waveHeader.m_nSubChunk1Size = 16;
    waveHeader.m_nAudioFormat = 1;
    waveHeader.m_nNumChannels = sound.m_numChannels.Value();
    waveHeader.m_nSampleRate = sound.m_sampleRate.Value();
    waveHeader.m_nByteRate = sound.m_sampleRate.Value() * sound.m_numChannels.Value() * bitsPerSample / 8;
    waveHeader.m_nBlockAlign = sound.m_numChannels.Value() * bitsPerSample / 8;
    waveHeader.m_nBitsPerSample = bitsPerSample;
  
    //fill out sub chunk 2 "data"
    memcpy(waveHeader.m_szSubChunk2ID, "data", 4);
    waveHeader.m_nSubChunk2Size = dataSize;
  
    //write the header
    fwrite(&waveHeader, sizeof(SMinimalWaveFileHeader), 1, file);
  
    //write the wave data itself, converting it from float to the type specified
    std::vector outSamples;
    outSamples.resize(samples.size());
    for (size_t index = 0; index < samples.size(); ++index)
        outSamples[index] = AmplitudeToAudioSample(samples[index]);
    fwrite(&outSamples[0], dataSize, 1, file);
  
    //close the file and return success
    fclose(file);
    return true;
}

//loads a wave file in.  Converts from source format into the specified format
// TOTAL HONESTY: some wave files seem to have problems being loaded through this function and I don't have
// time to investigate why.  It seems to work best with 16 bit mono wave files.
// If you need more robust file loading, check out libsndfile at http://www.mega-nerd.com/libsndfile/
bool ReadWaveFile(const char *fileName, std::vector& samples, int16 numChannels, int32 sampleRate)
{
    //open the file if we can
    FILE *File = fopen(fileName,"rb");
    if(!File)
    {
        return false;
    }

    //read the main chunk ID and make sure it's "RIFF"
    char buffer[5];
    buffer[4] = 0;
    if(fread(buffer,4,1,File) != 1 || strcmp(buffer,"RIFF"))
    {
        fclose(File);
        return false;
    }

    //read the main chunk size
    uint32 nChunkSize;
    if(fread(&nChunkSize,4,1,File) != 1)
    {
        fclose(File);
        return false;
    }

    //read the format and make sure it's "WAVE"
    if(fread(buffer,4,1,File) != 1 || strcmp(buffer,"WAVE"))
    {
        fclose(File);
        return false;
    }

    long chunkPosFmt = -1;
    long chunkPosData = -1;

    while(chunkPosFmt == -1 || chunkPosData == -1)
    {
        //read a sub chunk id and a chunk size if we can
        if(fread(buffer,4,1,File) != 1 || fread(&nChunkSize,4,1,File) != 1)
        {
            fclose(File);
            return false;
        }

        //if we hit a fmt
        if(!strcmp(buffer,"fmt "))
        {
            chunkPosFmt = ftell(File) - 8;
        }
        //else if we hit a data
        else if(!strcmp(buffer,"data"))
        {
            chunkPosData = ftell(File) - 8;
        }

        //skip to the next chunk
        fseek(File,nChunkSize,SEEK_CUR);
    }

    //we'll use this handy struct to load in 
    SMinimalWaveFileHeader waveData;

    //load the fmt part if we can
    fseek(File,chunkPosFmt,SEEK_SET);
    if(fread(&waveData.m_szSubChunk1ID,24,1,File) != 1)
    {
        fclose(File);
        return false;
    }

    //load the data part if we can
    fseek(File,chunkPosData,SEEK_SET);
    if(fread(&waveData.m_szSubChunk2ID,8,1,File) != 1)
    {
        fclose(File);
        return false;
    }

    //verify a couple things about the file data
    if(waveData.m_nAudioFormat != 1 ||       //only pcm data
       waveData.m_nNumChannels  2 ||        //must not have more than 2
       waveData.m_nBitsPerSample > 32 ||     //32 bits per sample max
       waveData.m_nBitsPerSample % 8 != 0 || //must be a multiple of 8 bites
       waveData.m_nBlockAlign > 8)           //blocks must be 8 bytes or lower
    {
        fclose(File);
        return false;
    }

    //figure out how many samples and blocks there are total in the source data
    int nBytesPerBlock = waveData.m_nBlockAlign;
    int nNumBlocks = waveData.m_nSubChunk2Size / nBytesPerBlock;
    int nNumSourceSamples = nNumBlocks * waveData.m_nNumChannels;

    //allocate space for the source samples
    samples.resize(nNumSourceSamples);

    //maximum size of a block is 8 bytes.  4 bytes per samples, 2 channels
    unsigned char pBlockData[8];
    memset(pBlockData,0,8);

    //read in the source samples at whatever sample rate / number of channels it might be in
    int nBytesPerSample = nBytesPerBlock / waveData.m_nNumChannels;
    for(int nIndex = 0; nIndex = TPhase(1.0f))
        phase -= TPhase(1.0f);
    while (phase  0.5f ? 1.0f : -1.0f);
}
  
TAmplitude AdvanceOscilator_Triangle(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);
    if (phase > TPhase(0.5f))
        return TAmplitude((((1.0f - phase.Value()) * 2.0f) * 2.0f) - 1.0f);
    else
        return TAmplitude(((phase.Value() * 2.0f) * 2.0f) - 1.0f);
}
  
TAmplitude AdvanceOscilator_Saw_BandLimited(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);
  
    // sum the harmonics
    TAmplitude ret(0.0f);
    for (int harmonicIndex = 1; harmonicIndex <= 4; ++harmonicIndex)
    {
        TPhase harmonicPhase = phase * TPhase((float)harmonicIndex);
        ret += TAmplitude(sin(harmonicPhase.Value()*c_twoPi) / (float)harmonicIndex);
    }
  
    //adjust the volume
    ret *= TAmplitude(2.0f / c_pi);
      
    return ret;
}
  
TAmplitude AdvanceOscilator_Square_BandLimited(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);
  
    // sum the harmonics
    TAmplitude ret(0.0f);
    for (int harmonicIndex = 1; harmonicIndex <= 4; ++harmonicIndex)
    {
        float harmonicFactor = (float)harmonicIndex * 2.0f - 1.0f;
        TPhase harmonicPhase = phase * TPhase(harmonicFactor);
        ret += TAmplitude(sin(harmonicPhase.Value()*c_twoPi) / harmonicFactor);
    }
  
    //adjust the volume
    ret *= TAmplitude(4.0f / c_pi);
  
    return ret;
}
  
TAmplitude AdvanceOscilator_Triangle_BandLimited(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);
  
    // sum the harmonics
    TAmplitude ret(0.0f);
    TAmplitude signFlip(1.0f);
    for (int harmonicIndex = 1; harmonicIndex <= 4; ++harmonicIndex)
    {
        float harmonicFactor = (float)harmonicIndex * 2.0f - 1.0f;
        TPhase harmonicPhase = phase * TPhase(harmonicFactor);
        ret += TAmplitude(sin(harmonicPhase.Value()*c_twoPi) / (harmonicFactor*harmonicFactor)) * signFlip;
        signFlip *= TAmplitude(-1.0f);
    }
  
    //adjust the volume
    ret *= TAmplitude(8.0f / (c_pi*c_pi));
  
    return ret;
}

//=====================================================================================
// Main
//=====================================================================================
int main(int argc, char **argv)
{
    //our desired sound parameters
    SSoundSettings sound;
    sound.m_sampleRate = TSamples(44100);
    sound.m_lengthMs = SecondsToMilliseconds(4.0f);
    sound.m_numChannels = TChannelCount(1);

    // flange effect parameters
    const TFrequency c_flangeFrequency(0.4f);
    const TSamples c_flangeDepth(MilliSecondsToSamples(sound, 1.0f));
 
    // load the wave file if we can
    std::vector inputData;
    if (!ReadWaveFile("in.wav", inputData, sound.m_numChannels.Value(), sound.m_sampleRate.Value()))
    {
        printf("could not load wave file!");
        return 0;
    }

    // allocate space for the output file
    std::vector samples;
    samples.resize(inputData.size());

    TSamples envelopeSize = MilliSecondsToSamples(sound, 50.0f);

    //apply the phase effect to the file
    TPhase flangePhase(0.0f);
    for (TSamples index = TSamples(0), numSamples(samples.size()); index < numSamples; ++index)
    {
        // calculate envelope at front and end of sound.
        TAmplitude envelope(1.0f);
        if (index  (numSamples - envelopeSize))
            envelope = TAmplitude(1.0f) - TAmplitude((float)(index - (numSamples - envelopeSize)).Value() / (float)envelopeSize.Value());

        // make a sine wave that goes from -1 to 0 at the specified frequency
        TAmplitude flangeSine = AdvanceOscilator_Sine(flangePhase, c_flangeFrequency, sound.m_sampleRate) * TAmplitude(0.5f) - TAmplitude(0.5f);

        // use that sine wave to calculate an offset backwards in time to sample from
        TFractionalSamples flangeOffset = TFractionalSamples((float)index.Value()) + TFractionalSamples(flangeSine.Value() * (float)c_flangeDepth.Value());

        // mix the sample with the offset sample and apply the envelope for the front and back of the sound
        samples[index.Value()] = (inputData[index.Value()] + GetLerpedAudioSample(inputData, flangeOffset)) * envelope;
    }
  
    // normalize the amplitude of the samples to make sure they are as loud as possible without clipping
    // give 3db of headroom
    NormalizeSamples(samples, DBToAmplitude(TDecibels(-3.0f)));

    // save as a wave file
    WriteWaveFile("out.wav", samples, sound);

    return 0;
}

Links

More DIY synth stuff coming soon, I have like 5 more posts I want to make right now, with the last couple being about some pretty awesome stuff I learned about recently!

Wikipedia: Flanging
The difference between flange, phaser & chorus
What is a chorus effect?

Shadertoy: Flange (made by me!)

libsndfile – to get better sound loading!

DIY Synth: Basic Drum

This is a part of the DIY Synthesizer series of posts where each post is roughly built upon the knowledge of the previous posts. If you are lost, check the earlier posts!

Hello! It’s time for another installment of DIY synth. It’s been so long since the last one, that when I look back at the code in that post now, I’m mortified by some of the stylistic and implementation choices I made, EEK! The curse of learning new stuff… we all experience that from time to time hehe.

Following the previous DIY synth posts, you are now able to do quite a bit of synth stuff, but the only way you have to make percussion is to use recorded sound samples of drums. There’s nothing wrong with that, and in fact is an easy way to get good quality percussion, but if you want to be a purist and synthesize everything yourself, that might make you sad to have to rely on a recorded sample.

In today’s post, I’ll walk you through the process of how to create a simple drum in 3 easy steps.

It may not be the best synthesized drum, but it definitely passes as a drum sound, and I provide links that explain how to refine it further.

Step 1: Sine Wave

Starting out is pretty simple, we just need a tone. Here is a sine wave that lasts 1 second and is a F note in the 1st octave, or approximately 87hz. Drums are low sounds, so we need to start with a low frequency sound.

sine.wav

Step 2:Envelope

There is a concept in audio called an “envelope”. An envelope is just a fancy way of saying that you are changing the volume over time. The envelope is the “shape” of the volume changes over time.

If you notice in step 1 the volume (amplitude of the sine wave) isn’t constant throughout the whole thing, it actually has a part at the beginning that gradually goes from 0 volume to full volume, and at the end, it has a part at the end that goes from full volume to 0 volume (it’s 50 milliseconds on each side if you are curious). That fading is an envelope too and is actually there to prevent “popping” which can occur when you make audio samples that aren’t smooth from one to the next. It might seem like that would be a minor problem, but it’s actually VERY noticeable. Check out the previous DIY synth posts for more info and examples of that!

Anyhow, if you think about the sound that a drum makes when you hit it, it starts out loud right away, and then quickly fades out. You can play with the specific values of the envelope and get a different sound, but what I went with was 10 milliseconds of fading in (0.01 seconds), 10 milliseconds of being at full volume (0.01 seconds), and 175 milliseconds of fading out (0.175 seconds). You can see a picture of the envelope below:

The fade in time is called the “attack”, the time it remains at full volume is called the “hold” and the time that it fades out is called the “release”. There are other common stages to envelopes that you might hear about if looking up more info about them. Two other common parts of an enveloper are “sustain” and “decay” for instance.

Envelopes are a big part of what make notes sound like specific instruments, so have fun playing with those values and listening to the results.

Here is the envelope applied to our low frequency sine wave (which you apply by just multiplying them together!)

sineenvelope.wav

Step 3: Frequency Decay

We have something that sounds a little more interesting than a plain vanilla sine tone, but it doesn’t sound much like a drum yet…

What we are missing is that in a real drum, the frequency of the note decays over time. If that isn’t intuitive, don’t worry it wasn’t for me either. It took me a good amount of reading and investigation to find that out a few years back.

To add frequency decay, let’s have the frequency decay 80% of the way (towards frequency 0) over the “release” (fade out) portion of the envelope. So, the frequency will still be F1 through the entire drum note of attack and hold, but then starting with release, it will decay linearly over time for that 175 ms, until at the end, the frequency should only be 20% of 87hz, or about 17hz.

Here’s what we end up with:
drum.wav

Good Enough!

That’s pretty passable as a drum, even if it isn’t the best. A neat thing too is that by changing the starting frequency, you can get different frequencies of your drum and get some different drum sounds.

Here’s a little drum melody showing what i mean:

melody.wav

Sample Code

Here’s the code with everything above implemented, which created the drum melody. It uses only standard include files, and writes a wave file called “out.wav” when you run it. Play around with the code, adjusting envelope times, frequencies, frequency decay, or even change it from using a sine wave to a different wave form (I included some standard wave forms for you).

Often times synthesis / music making is all about just playing around with the knobs that are exposed to you til you find something really interesting.

#define _CRT_SECURE_NO_WARNINGS
 
#include 
#include 
#include 
#include 
#include 
#include 
 
#define _USE_MATH_DEFINES
#include 
 
//=====================================================================================
// SNumeric - uses phantom types to enforce type safety
//=====================================================================================
template 
struct SNumeric
{
public:
    explicit SNumeric(const T &value) : m_value(value) { }
    SNumeric() : m_value() { }
    inline T& Value() { return m_value; }
    inline const T& Value() const { return m_value; }
 
    typedef SNumeric TType;
    typedef T TInnerType;
 
    // Math Operations
    TType operator+ (const TType &b) const
    {
        return TType(this->Value() + b.Value());
    }
 
    TType operator- (const TType &b) const
    {
        return TType(this->Value() - b.Value());
    }
 
    TType operator* (const TType &b) const
    {
        return TType(this->Value() * b.Value());
    }
 
    TType operator/ (const TType &b) const
    {
        return TType(this->Value() / b.Value());
    }
 
    TType& operator+= (const TType &b)
    {
        Value() += b.Value();
        return *this;
    }
 
    TType& operator-= (const TType &b)
    {
        Value() -= b.Value();
        return *this;
    }
 
    TType& operator*= (const TType &b)
    {
        Value() *= b.Value();
        return *this;
    }
 
    TType& operator/= (const TType &b)
    {
        Value() /= b.Value();
        return *this;
    }
 
    TType& operator++ ()
    {
        Value()++;
        return *this;
    }
 
    TType& operator-- ()
    {
        Value()--;
        return *this;
    }
 
    // Extended Math Operations
    template 
    T Divide(const TType &b)
    {
        return ((T)this->Value()) / ((T)b.Value());
    }
 
    // Logic Operations
    bool operatorValue() < b.Value();
    }
    bool operatorValue()  (const TType &b) const {
        return this->Value() > b.Value();
    }
    bool operator>= (const TType &b) const {
        return this->Value() >= b.Value();
    }
    bool operator== (const TType &b) const {
        return this->Value() == b.Value();
    }
    bool operator!= (const TType &b) const {
        return this->Value() != b.Value();
    }
 
private:
    T m_value;
};
 
//=====================================================================================
// Typedefs
//=====================================================================================
 
typedef uint8_t uint8;
typedef uint16_t uint16;
typedef uint32_t uint32;
typedef int16_t int16;
typedef int32_t int32;
 
// type safe types!
typedef SNumeric      TFrequency;
typedef SNumeric            TTimeMs;
typedef SNumeric           TSamples;
typedef SNumeric           TDecibels;
typedef SNumeric      TAmplitude;
typedef SNumeric       TChannelCount;
typedef SNumeric          TPhase;
 
//=====================================================================================
// Constants
//=====================================================================================
 
static const float c_pi = (float)M_PI;
static const float c_twoPi = c_pi * 2.0f;
 
//=====================================================================================
// Structs
//=====================================================================================
 
struct SSoundSettings
{
    TSamples        m_sampleRate;
    TTimeMs         m_lengthMs;
    TChannelCount   m_numChannels;
    TSamples        m_currentSample;
};
 
struct SDrumSettings
{
    TFrequency  m_frequency;
    TSamples    m_attack;
    TSamples    m_sustain;
    TSamples    m_release;
    TAmplitude  m_volume;
};
 
struct SDrumInstance
{
    SDrumInstance(TSamples startTime, const SDrumSettings &settings)
        : m_startTime(startTime)
        , m_settings(settings)
        , m_phase(0.0f)
    {
 
    }
 
    const SDrumSettings     &m_settings;
    TSamples                m_startTime;
    TPhase                  m_phase;
};
 
//=====================================================================================
// Globals
//=====================================================================================
 
std::vector    g_drumInstances;
 
//=====================================================================================
// Conversion Functions
//=====================================================================================
inline TDecibels AmplitudeToDB(TAmplitude volume)
{
    return TDecibels(log10(volume.Value()));
}
 
inline TAmplitude DBToAmplitude(TDecibels dB)
{
    return TAmplitude(pow(10.0f, dB.Value() / 20.0f));
}
 
TSamples SecondsToSamples(const SSoundSettings &s, float seconds)
{
    return TSamples((int)(seconds * (float)s.m_sampleRate.Value()));
}
 
TSamples MilliSecondsToSamples(const SSoundSettings &s, float milliseconds)
{
    return SecondsToSamples(s, milliseconds / 1000.0f);
}
 
TTimeMs SecondsToMilliseconds(float seconds)
{
    return TTimeMs((uint32)(seconds * 1000.0f));
}
 
TFrequency Frequency(float octave, float note)
{
    /* frequency = 440×(2^(n/12))
    Notes:
    0  = A
    1  = A#
    2  = B
    3  = C
    4  = C#
    5  = D
    6  = D#
    7  = E
    8  = F
    9  = F#
    10 = G
    11 = G# */
    return TFrequency((float)(440 * pow(2.0, ((double)((octave - 4) * 12 + note)) / 12.0)));
}
 
template 
T AmplitudeToAudioSample(const TAmplitude& in)
{
    const T c_min = std::numeric_limits::min();
    const T c_max = std::numeric_limits::max();
    const float c_minFloat = (float)c_min;
    const float c_maxFloat = (float)c_max;
 
    float ret = in.Value() * c_maxFloat;
 
    if (ret  c_maxFloat)
        return c_max;
 
    return (T)ret;
}
 
//=====================================================================================
// Wave File Writing Code
//=====================================================================================
struct SMinimalWaveFileHeader
{
    //the main chunk
    unsigned char m_szChunkID[4];      //0
    uint32        m_nChunkSize;        //4
    unsigned char m_szFormat[4];       //8
 
    //sub chunk 1 "fmt "
    unsigned char m_szSubChunk1ID[4];  //12
    uint32        m_nSubChunk1Size;    //16
    uint16        m_nAudioFormat;      //18
    uint16        m_nNumChannels;      //20
    uint32        m_nSampleRate;       //24
    uint32        m_nByteRate;         //28
    uint16        m_nBlockAlign;       //30
    uint16        m_nBitsPerSample;    //32
 
    //sub chunk 2 "data"
    unsigned char m_szSubChunk2ID[4];  //36
    uint32        m_nSubChunk2Size;    //40
 
    //then comes the data!
};
 
//this writes a wave file
template 
bool WriteWaveFile(const char *fileName, const std::vector &samples, const SSoundSettings &sound)
{
    //open the file if we can
    FILE *file = fopen(fileName, "w+b");
    if (!file)
        return false;
 
    //calculate bits per sample and the data size
    const int32 bitsPerSample = sizeof(T) * 8;
    const int dataSize = samples.size() * sizeof(T);
 
    SMinimalWaveFileHeader waveHeader;
 
    //fill out the main chunk
    memcpy(waveHeader.m_szChunkID, "RIFF", 4);
    waveHeader.m_nChunkSize = dataSize + 36;
    memcpy(waveHeader.m_szFormat, "WAVE", 4);
 
    //fill out sub chunk 1 "fmt "
    memcpy(waveHeader.m_szSubChunk1ID, "fmt ", 4);
    waveHeader.m_nSubChunk1Size = 16;
    waveHeader.m_nAudioFormat = 1;
    waveHeader.m_nNumChannels = sound.m_numChannels.Value();
    waveHeader.m_nSampleRate = sound.m_sampleRate.Value();
    waveHeader.m_nByteRate = sound.m_sampleRate.Value() * sound.m_numChannels.Value() * bitsPerSample / 8;
    waveHeader.m_nBlockAlign = sound.m_numChannels.Value() * bitsPerSample / 8;
    waveHeader.m_nBitsPerSample = bitsPerSample;
 
    //fill out sub chunk 2 "data"
    memcpy(waveHeader.m_szSubChunk2ID, "data", 4);
    waveHeader.m_nSubChunk2Size = dataSize;
 
    //write the header
    fwrite(&waveHeader, sizeof(SMinimalWaveFileHeader), 1, file);
 
    //write the wave data itself, converting it from float to the type specified
    std::vector outSamples;
    outSamples.resize(samples.size());
    for (size_t index = 0; index < samples.size(); ++index)
        outSamples[index] = AmplitudeToAudioSample(samples[index]);
    fwrite(&outSamples[0], dataSize, 1, file);
 
    //close the file and return success
    fclose(file);
    return true;
}
 
//=====================================================================================
// Oscilators
//=====================================================================================
 
void AdvancePhase(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    phase += TPhase(frequency.Value() / (float)sampleRate.Value());
    while (phase >= TPhase(1.0f))
        phase -= TPhase(1.0f);
    while (phase  0.5f ? 1.0f : -1.0f);
}
 
TAmplitude AdvanceOscilator_Triangle(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);
    if (phase > TPhase(0.5f))
        return TAmplitude((((1.0f - phase.Value()) * 2.0f) * 2.0f) - 1.0f);
    else
        return TAmplitude(((phase.Value() * 2.0f) * 2.0f) - 1.0f);
}
 
TAmplitude AdvanceOscilator_Saw_BandLimited(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);
 
    // sum the harmonics
    TAmplitude ret(0.0f);
    for (int harmonicIndex = 1; harmonicIndex <= 4; ++harmonicIndex)
    {
        TPhase harmonicPhase = phase * TPhase((float)harmonicIndex);
        ret += TAmplitude(sin(harmonicPhase.Value()*c_twoPi) / (float)harmonicIndex);
    }
 
    //adjust the volume
    ret *= TAmplitude(2.0f / c_pi);
     
    return ret;
}
 
TAmplitude AdvanceOscilator_Square_BandLimited(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);
 
    // sum the harmonics
    TAmplitude ret(0.0f);
    for (int harmonicIndex = 1; harmonicIndex <= 4; ++harmonicIndex)
    {
        float harmonicFactor = (float)harmonicIndex * 2.0f - 1.0f;
        TPhase harmonicPhase = phase * TPhase(harmonicFactor);
        ret += TAmplitude(sin(harmonicPhase.Value()*c_twoPi) / harmonicFactor);
    }
 
    //adjust the volume
    ret *= TAmplitude(4.0f / c_pi);
 
    return ret;
}
 
TAmplitude AdvanceOscilator_Triangle_BandLimited(TPhase &phase, TFrequency frequency, TSamples sampleRate)
{
    AdvancePhase(phase, frequency, sampleRate);
 
    // sum the harmonics
    TAmplitude ret(0.0f);
    bool subtract = true;
    for (int harmonicIndex = 1; harmonicIndex <= 10; ++harmonicIndex)
    {
        float harmonicFactor = (float)harmonicIndex * 2.0f - 1.0f;
        TPhase harmonicPhase = phase * TPhase(harmonicFactor);
        ret += TAmplitude(sin(harmonicPhase.Value()*c_twoPi) / (harmonicFactor*harmonicFactor)) * TAmplitude(subtract ? -1.0f : 1.0f);
    }
 
    //adjust the volume
    ret *= TAmplitude(8.0f / (c_pi*c_pi));
 
    return ret;
}
 
//=====================================================================================
// Drum Synthesis
//=====================================================================================
TAmplitude Drum(const SSoundSettings &sound, SDrumInstance &drum)
{
    // if the drum hasn't started yet, nothing to do!
    if (sound.m_currentSample < drum.m_startTime)
        return TAmplitude(0.0f);
 
    TFrequency frequencyMultiplier(1.0f);
    TAmplitude envelopeVolume(0.0f);
    TSamples sampleRelative = sound.m_currentSample - drum.m_startTime;
 
    if (sampleRelative < drum.m_settings.m_attack)
    {
        envelopeVolume = TAmplitude(sampleRelative.Divide(drum.m_settings.m_attack));
    }
    else if (sampleRelative < drum.m_settings.m_attack + drum.m_settings.m_sustain)
    {
        envelopeVolume = TAmplitude(1.0f);
    }
    else if (sampleRelative < drum.m_settings.m_attack + drum.m_settings.m_sustain + drum.m_settings.m_release)
    {
        sampleRelative -= (drum.m_settings.m_attack + drum.m_settings.m_sustain);
        envelopeVolume = TAmplitude(1.0f - sampleRelative.Divide(drum.m_settings.m_release));
        frequencyMultiplier = TFrequency(envelopeVolume.Value());
    }
    else
    {
        return TAmplitude(0.0f);
    }
 
    const TFrequency freqDecay(0.8f);
    envelopeVolume *= drum.m_settings.m_volume;
    TFrequency frequency = drum.m_settings.m_frequency * ((TFrequency(1.0f) - freqDecay) + (frequencyMultiplier*freqDecay));
    return AdvanceOscilator_Sine(drum.m_phase, frequency, sound.m_sampleRate) * envelopeVolume;
}
 
//=====================================================================================
// Main
//=====================================================================================
int main(int argc, char **argv)
{
    //our sound parameters
    SSoundSettings sound;
    sound.m_sampleRate = TSamples(44100);
    sound.m_lengthMs = SecondsToMilliseconds(9.0f);
    sound.m_numChannels = TChannelCount(1);
 
    // set up the data for our drums.
    SDrumSettings drum1;
    drum1.m_frequency = Frequency(1, 8);
    drum1.m_attack = MilliSecondsToSamples(sound, 10.0f);
    drum1.m_sustain = MilliSecondsToSamples(sound, 10.0f);
    drum1.m_release = MilliSecondsToSamples(sound, 175.0f);
    drum1.m_volume = DBToAmplitude(TDecibels(-3.0f));
 
    SDrumSettings drum2 = drum1;
    drum2.m_frequency = Frequency(2, 5);
 
    SDrumSettings drum3 = drum1;
    drum3.m_frequency = Frequency(2, 5);
 
    SDrumSettings drum4 = drum1;
    drum4.m_frequency = Frequency(1, 10);
 
    SDrumSettings drumBG = drum1;
    drumBG.m_frequency = Frequency(1, 2);
    drumBG.m_volume = DBToAmplitude(TDecibels(-12.0f));
 
    // setup drums: make a 4 beat pattern that occurs every other second
    for (uint32 i = 1; i < sound.m_lengthMs.Value() / 1000; i += 2)
	{
        g_drumInstances.push_back(SDrumInstance(SecondsToSamples(sound, (float)i + 0.00f), drum1));
        g_drumInstances.push_back(SDrumInstance(SecondsToSamples(sound, (float)i + 0.25f), drum2));
        g_drumInstances.push_back(SDrumInstance(SecondsToSamples(sound, (float)i + 0.50f), drum3));
        g_drumInstances.push_back(SDrumInstance(SecondsToSamples(sound, (float)i + 1.00f), drum4));
    }
 
    // setup drums: make a background beat
    for (uint32 i = 0, c = sound.m_lengthMs.Value() / 1000 * 4; i < c; ++i)
        g_drumInstances.push_back(SDrumInstance(SecondsToSamples(sound, (float)i / 4.0f), drumBG));
 
    //make our buffer to hold the samples
    TSamples numSamples = TSamples(sound.m_sampleRate.Value() * sound.m_numChannels.Value() * sound.m_lengthMs.Value() / 1000);
    std::vector samples;
    samples.resize(numSamples.Value());
 
    // render our audio samples from our drum list
    for (TSamples index = TSamples(0); index < numSamples; ++index)
    {
        sound.m_currentSample = index;
        TAmplitude &sample = samples[index.Value()];
        sample = TAmplitude(0.0f);
 
        std::for_each(
            g_drumInstances.begin(),
            g_drumInstances.end(),
            [&sample, &sound](SDrumInstance& drum)
            {
                sample += Drum(sound, drum);
            }
        );
    }
 
    // save as a wave file
    WriteWaveFile("out.wav", samples, sound);
}

Links

If you want a better sounding drum, check out these links. Frankly, read everything on that site… there is such great stuff there. If you like synth, that is the place to read about cool stuff, but unfortunately it’s bent towards electrical engineers and musicians, not programmers.

Sound On Suond – Synthesizing Drums: The Bass Drum
Sound On Suond – Practical Bass Drum Synthesis

Here is some more info about envelopes too. Envelopes really are a core ingrediant to synthesizers. Applying different envelopes to the same sine wave frequency, you can make a variety of different sounding instruments. They are pretty darn powerful.
Wikipedia: ASDR Envelope

I got some REALLY cool synth stuff planned in the near future, so keep an eye out!

Four Ways to Calculate Sine Without Trig

Is it possible to sin without trig? That is a question that has plagued theologians for centuries. As evil as trigonometry is, modern science shows us that yes, it is possible to sin without trig. Here are some ways that I’ve come across.

1 – Slope Iteration Method

The above image uses 1024 samples from 0 to 2pi to aproximate sine iteratively using it’s slope. Red is true sin, green is the aproximation, and yellow is where they overlap.

This method comes from Game Programming Gems 8, which you can snag a copy of from amazon below if you are interested. It’s mentioned in chapter 6.1 A Practical DSP Radio Effect (which is a really cool read btw!).
Amazon: Game Programming Gems 8

This method uses calculus but is actually pretty straightforward and intuitive – and very surprising to me that it works so well!

The derivative of sin(x) is cos(x). That means, for any x you plug into sin, the slope of the function at that point is the cosine value of that same x.

In other words, sin(0) is 0, but it has a slope of cos(0) which is 1. Since slope is rise over run (change in y / change in x) that means that at sin(0), if you take an infinitely small step forward on x, you need to take the same sized step on y. That will get you to the next value of sine.

Let’s test that out!
sin(0) = 0 so we start at (0,0) on the graph.

If we try a step size of 0.1, our approximation is:
sin(0.1) = 0.1

The actual value according to google is 0.09983341664. so our error was about 0.0002. That is actually pretty close!

How about 0.25?
sin(0.25) = 0.25

The real value is 0.24740395925, so our error is about 0.003. We have about 10 times the error that we did at 0.1.

what if we try it with 0.5?
sin(0.5) = 0.5

The actual value is 0.4794255386, which leaves us with an error of about 0.02. Our error is 100 times as much as it was at 0.1. As you can see, the error increases as our step size gets larger.

If we wanted to, we could get the slope (cosine value) at the new x value and take another step. we could continue doing this to get our sine approximation, knowing that the smaller the step that we use, the more accurate our sine approximation would be.

We haven’t actually won anything at this point though, because we are just using cosine to take approximated steps through sine values. We are paying the cost of calculating cosine, so we might as well just calculate sine directly instead.

Well luckily, cosine has a similar relationship with sine; the derivative of cos(x) is -sin(x).

Now, we can use cosine values to step through sine values, and use those same sine values to step through cosine values.

Since we know that cos(0) = 1.0 and sin(0) = 0.0, we can start at an angle of 0 with those values and we can iteratively step through the values.

Here is a sample function in C++

// theta is the angle we want the sine value of.
// higher resolution means smaller step size AKA more accuracy but higher computational cost.
// I used a resolution value of 1024 in the image at the top of this section.
float SineApproximation (float theta, float resolution)
{
    // calculate the stepDelta for our angle.
    // resolution is the number of samples we calculate from 0 to 2pi radians
    const float TwoPi = 6.28318530718f;
    const float stepDelta = (TwoPi / resolution);

    // initialize our starting values
    float angle = 0.0;
    float vcos = 1.0;
    float vsin = 0.0;

    // while we are less than our desired angle
    while(angle < theta) {

        // calculate our step size on the y axis for our step size on the x axis.
        float vcosscaled = vcos * stepDelta;
        float vsinscaled = vsin * stepDelta;

        // take a step on the x axis
        angle += stepDelta;

        // take a step on the y axis
        vsin += vcosscaled;
        vcos -= vsinscaled;
    }

    // return the value we calculated
    return vsin;
}

Note that the higher the angle you use, the more the error rate accumulates. One way to help this would be to make sure that theta was between 0 and 2pi, or you could even just calculate between 0 and pi/2 and mirror / reflect the values for the other quadrants.

This function is quite a bit of work to calculate a single value of sine but it’s real power comes in the fact that it’s iterative. If you ever find yourself in a situation where you need progressive values of sine, and have some fixed angle step size through the sine values, this algorithm just needs to do a couple multiplies and adds to get to the next value of sine.

One great use of this could be in DSP / audio synthesis, for sine wave generation. Another good use could be in efficiently evaluating trigonometry based splines (a future topic I plan to make a post about!).

You can see this in action in this shadertoy or look below at the screenshots:
Shadertoy: Sin without Trig

64 Samples – Red is true sine, green is our approximation, and yellow is where they are the same

128 Samples

256 Samples

1024 Samples

2 – Taylor Series Method

Another way to calculate sine is by using an infinite Taylor series. Thanks to my friend Yuval for showing me this method.

You can get the Taylor series for sine by typing “series sin(x)” into wolfram alpha. You can see that here: Wolfram Alpha: series sin(x).

Wolfram alpha says the series is: x-x^3/6+x^5/120-x^7/5040+x^9/362880-x^11/39916800 ….

what this means is that if you plug a value in for x, you will get an approximation of sine for that x value. It’s an infinite series, but you can do as few or as many terms as you want to be able to trade off speed for accuracy.

For instance check out these graphs.

Google: graph y = x, y = sin(x)

Google: graph y = x-x^3/6, y = sin(x)

Google: graph y = x-x^3/6+x^5/120, y = sin(x)

Google: graph y = x-x^3/6+x^5/120-x^7/5040, y = sin(x)

Google: graph y = x-x^3/6+x^5/120-x^7/5040+x^9/362880, y = sin(x)

Google: graph y = x-x^3/6+x^5/120-x^7/5040+x^9/362880-x^11/39916800, y = sin(x)
(Note that I had to zoom out a bit to show where it became inaccurate)

When looking at these graphs, you’ll probably notice that very early on, the approximation is pretty good between -Pi/2 and + Pi/2. I leveraged that by only using those values (with modulus) and mirroring them to be able to get a sine value of any angle with more accuracy.

When using just x-x^3/6, there was an obvious problem at the top and bottom of the sine waves:

When i boosted the equation with another term, bringing it to x-x^3/6+x^5/120, my sine approximation was much better:

You can see this method in action in this shadertoy:
Shadertoy: Sin without Trig II

3 – Smoothstep Method

The third method may be my favorite, due to it’s sheer elegance and simplicity. Thanks to P_Malin on shadertoy.com for sharing this one with me.

There’s a function in graphics called “smoothstep” that is used to take the hard linear edge off of things, and give it a smoother, more organic feel. You can read more about it here: Wikipedia: Smoothstep.

BTW if you haven’t read the last post, I talk about how smooth step is really just a 1d bezier curve with specific control points (One Dimensional Bezier Curves). Also, smoothstep is just this function: y = (3x^2 – 2x^3).

Anyhow, if you have a triangle wave that has values from 0 to 1 on the y axis, and put it through a smoothstep, then scale it to -1 to +1 on the y axis, you get a pretty darn good sine approximation out.

Here is a step by step recipe:

Step 1 – Make a triangle wave that has values on the y axis from 0 to 1

Step 2 – Put that triangle wave through smoothstep (3x^2 – 2x^3)

Step 3 – Scale the result to having values from -1 to +1 on the axis.

That is pretty good isn’t it?

You can see this in action in this shadertoy (thanks to shadertoy’s Dave_Hoskins for some help with improving the code):
Shadertoy: Sin without Trig III

After I made that shadertoy, IQ, the creator of shadertoy who is an amazing programmer and an impressive “demoscene” guy, said that he experimented with removing the error from that technique to try to get a better sin/cos aproximation.

You can see that here: Shadertoy: Sincos approximation

Also, I recommend checking out IQ’s website. He has a lot of interesting topics on there: IQ’s Website

4 – CORDIC Mathematics

This fourth way is a method that cheaper calculators use to calculate trig functions, and other things as well.

I haven’t taken a whole lot of time to understand the details, but it looks like it’s using imaginary numbers to rotate vectors iteratively, doing a binary search across the search space to find the desired values.

The benefit of this technique is that it can be implemented with VERY little hardware support.

The number of logic gates for the implementation of a CORDIC is roughly comparable to the number required for a multiplier as both require combinations of shifts and additions.
Wikipedia: Coordinate Rotation Digital Computer

Did I miss any?

If you know of something that I’ve left out, post a comment, I’d love to hear about it!

Alloca and Realloc – Useful Tools, Not Ancient Relics

If you are a C/C++ programmer, you are likely familiar with malloc() and free(), the predecessors to C++’s new and delete operators, as well as the existence of the variations of malloc such as calloc, realloc and alloca.

If you are like me, you probably thought for a long while that malloc and it’s variations were relics of days gone by, only actually useful in a few very limited situations. Some of these guys still have use though, and don’t really have equivalents in C++ to replace them.

First the boring ones…
malloc – Allocates memory. Precursor to new operator.
calloc – Allocates memory and sets the contents to zero. C’s answer to the problem of uninitialized memory that constructors solve in C++.

Now the more interesting ones!

Alloca

Believe it or not, alloca actually allocates memory on the stack. When your function goes out of scope, the stack memory is automatically returned to the stack due to the nature of how the stack and stack pointer work. No need to free the memory allocated with alloca, and in fact if you tried, you’d probably get a crash 😛

If you are a programmer who writes high performance applications, you are probably familiar with the benefits of using the stack instead of allocating memory on the heap with new or malloc.

The benefits of using the stack include…

Quicker allocations – Allocating memory can be a relatively costly operation in terms of time, especially if you have multiple threads running using the same (thread safe) allocator. Allocating memory on the stack is essentially the same cost as defining a local variable. Under the hood, it’s just moving the stack pointer a little farther and gives you that empty space to use.
No memory leaks – when the function you’ve allocated the stack memory in exits, that memory is automatically freed. This is because the stack pointer just “moves back” to what it used to be. There is not really any memory to free.
Less memory fragmentation – When mixing large and small memory allocations and frees, sometimes you end up with your memory in a state where there is a lot of memory free, but just not all together in one place. For instance, your program might need to allocate 50MB, and there may be 300MB free on the heap total, but if there are small 16 byte allocations littered in the memory every 10MB, your program won’t be able to find a single 50MB region to allocate and the allocation will fail. One common cause of this problem is small allocations used for things like relatively small arrays or small string buffer allocations that exist temporarily to copy or transform some data, but are not meant to stick around very long. If you can put these on the stack instead of the heap, those small allocations don’t hit the heap, and your memory will be less fragmented in the end.
Increased performance (fewer cache misses) – the contents of the stack are likely already in the CPU cache, so putting your data there means less information for the CPU to have to gather from RAM which is a slow operation.

However, there are some dangers when allocating memory on the stack as well

If you keep a pointer to the memory, that memory could be “freed” and re-used, getting filled with other random data (local variables). That can cause crashes, memory corruption or other strange program behavior.
If you allocate too much on the stack you could run out of stack space. The stack isn’t really meant to hold large amounts of allocated data. You can adjust your programs stack size though if this is a route you want to pursue.

Alternatives

There are some common techniques I’ve seen people use in places that could have also used alloca instead. These include…

Small Pool Allocators – To get around the memory fragmentation problem, sometimes people will have different memory allocators based on the size of memory being allocated. This way, small temporary allocations for things like temporary string buffers will all be allocated from one place, while larger allocations for things like textures will be allocated elsewhere. This dramatically improves the memory fragmentation issue.
Object Pools – Object pools are similar to small pool allocators but they work by allocating some amount of memory for specific types of objects, and have a way to remember which objects are used and which ones are free. For instance, you may dynamically allocate an array of 100 SMyStruct objects and have a flag for each to know which ones are in use and which ones aren’t. This way, the program can ask for a new object, and it can find one currently not in use and return it to the caller without needing to hit the ACTUAL memory allocator to get the data (unless all objects are spoken for, at which point it can choose to fail, or allocate a new “page” of objects to be able to hand out). This also has an interesting side effect that cache misses can drop quite a bit since the same kinds of objects will be nearer to eachother in memory.
DIY Stack Allocator – When I was working at Midway, a friend (Hi Shawn!) profiled the animation code and found that a lot of time was spent in allocating temporary buffers to blend bone data together. To fix this, he rolled his own stack allocator, where there was one contiguous piece of memory on the heap that could be allocated from. There was an internal index keeping track of where the “top of the stack” was, and when memory was allocated, that stack index would just move up by however many bytes were asked for. At the end of the frame, the stack index was reset to zero, thus “freeing” the memory. This dramatically improved the animation system performance by making the temporary bone blend buffer allocations essentially free.
Thread Specific Memory – If you are having problems where multiple threads are trying to allocate memory at the same time, causing contention and slowdowns due to thread synchronization, another option is to give each thread it’s own chunk of memory and let it allocate from that. That way there is no contention and you won’t have the slowdown of thread synchronization due to memory allocation anymore. A problem here though can be figuring out how much memory each thread needs. One thread may need a lot of memory, and another thread may need none, and you may not have any way of knowing which in advance. In this case, you’d have to allocate “a lot” of memory for each thread in advance, and pay an extra cost in memory that you technically don’t have to. But hey, at least it’s fast, maybe the trade off is worth it in your situation!

Lastly, there’s another common trick to avoid dynamic allocations involving templates, check it out!

// define the CStaticArray class
template 
class CStaticArray
{
public:
  T m_values[N];

  // you could put functions in here to do operations on the array data to make it look more like a standard
  // data type, instead of a plain vanilla array
  unsigned int Count () { return N; }

  void SomeOtherFunction () { }
};

void MyFunc ()
{
  // make an array of 32 floats
  CStaticArray m_floatArray;

  // make an array of 128 SSomeStructs
  CStaticArray m_objectArray;

  for (unsigned int index = 0; index < m_objectArray.Count(); ++index)
  {
    m_objectArray.m_values[index].DoSomething();
  }
}

The above really shines if you have a standard API for strings or dynamic arrays in your code base. You can make a version like the above which works without dynamic allocations, but gives the same interface so it's easier for fellow programmers to use and swap in and out as needed.

Another nice benefit to the above technique is that it works for stack allocations, but you can also make them member variables of other objects. In this way, you can minimize dynamic allocations. Instead of having to dynamically allocate an object, and then dynamically allocate the array inside of it, you do a single allocation to get all the memory needed.

That is the closest thing in C++ that I've seen to alloca, but even so, alloca has the advantage that you can decide how much memory to allocate at run time. With the template method, you have to know at compile time which is fine for a lot of cases, but othertimes is a deal breaker, forcing you to have to go back to dynamic allocations (or perhaps now, alloca instead?)

Realloc

Realloc is the other interesting memory allocation function.

Like I was mentioning above, the fewer allocations you can do, the better off you are in terms of performance, and also memory fragmentation.

By writing smart containers (dynamic arrays, dynamic strings, etc) you can make it so when someone tries to make a container smaller, that instead of allocating new memory that’s smaller, copying the data over, and freeing the old memory, that instead it just remembers the new size but keeps the old, larger memory around.

Then later on, if the container was told to grow, if it was smaller than the larger size from the past, it could just use some of that old memory again.

However, if that container grows larger than it used to be, you are going to have to allocate, copy, and free (costly etc) to grow the container.

Down in the guts of your computer however, there may be memory right after the current memory that’s not being used by anything else. Wouldn’t it be great if you could just say “hey… use that memory too, i don’t want to reallocate!”.

Well, realloc does ALL of the above for you without you having to write special code.

When you realloc memory, you give the old pointer and the new size, and if it’s able to, it won’t do any allocations whatsoever, and will just return you your old pointer back to you. It may allocate the next memory block for you if the new size is larger, but would still return the old pointer value in this case. Or, if the new amount of memory is smaller, it may return you back the same memory without doing anything internally (it depends on your compiler’s specific implementation of realloc what it does when)

If realloc does have to allocate new memory though, it will copy over all the old data to the new memory that it returns to you and free the old memory. So, you don’t have to CARE whether the pointer returned is old or new, just store the return value and continue on with your life.

It’s pretty cool and can help reduce actual memory allocations, lowering memory fragmentation and increasing performance.

DIY Synth 3: Sampling, Mixing, and Band Limited Wave Forms

This is a part of the DIY Synthesizer series of posts where each post is roughly built upon the knowledge of the previous posts. If you are lost, check the earlier posts!

This it the third installment of a series of tutorials on how to program your own synthesizer.

In this chapter we’ll continue on from the last chapter, and talk about a way to generate simple wave forms that don’t have aliasing problems. We’ll also talk about sampling, mixing and end with a somewhat realistic song made with samples and our very own platform independent synthesizer code.

You can download the full source code and source wave files from the link below. The code got a bit more complex so there’s a zip file instead of a stand alone main.cpp. Also, it’s not the cleanest, best organized code in the world – sorry about that! – but hopefully it’ll be ok for the purposes of this tutorial (:

DIY Synthesizer: Chapter 3 Source Code

If you don’t want to wait til the end of the chapter to hear the sample song, check it out here:

The Lament Of Tim Curry

Aliasing

As mentioned in the previous tutorial, the wave forms we were generating have aliasing problems. Aliasing is an audio artifact where unintended audio frequencies appear in audio data due to trying to encode frequencies that are too high for the sample rate. Wikipedia describes Aliasing pretty well, check it out for more info: Aliasing.

Sound is pressure waves conducted in the air, and at the core, audio engineers and mathematicians like to think of all sound as being made up of sine waves at different frequencies and amplitudes (volumes).

If you have a smooth / bumpy wave form, you could picture building it up with sine waves pretty easily.

If on the other hand, you have something with sharp corners, like a saw wave, a triangle wave or a square wave, it gets more difficult.

In fact, to make a “perfect corner” out of sine waves, it would take an infinite amount of sine waves of ever diminishing frequency and amplitude to get the perfectly sharp corner.

In chapter one I briefly mentioned that the maximum frequency you can store in audio data is half the sample rate. This frequency is called the Nyquist frequency and you can read more about it here: Nyquist Frequency and here: Nyquist-Shannon sampling theorem.

Aliasing occurs whenever you try to store a frequency higher than the nyquist frequency. When you do that, your audio data is not what it ought to be (a higher frequency actually appears to be a lower frequency), causing audio artifacts. If you’ve ever seen a car’s wheels spinning too slowly or backwards in a tv commercial, that is the exact same problem.

So, when making a “perfect corner” on a saw, triangle, or square wave, and having to use infinitely high frequencies to make that corner, you can bet that an infinite frequency is above Nyquist, and that it will cause some aliasing.

So, to make band limited wave forms for saw, square, and triangle, we just add together the sine waves UP TO nyquist, and then stop, instead of continuing on to infinity (which would also take far too long to calculate hehe). That makes a much cleaner, smoother sound, that is a lot easier on the ears.

A friend of mine who wishes to remain nameless has been a good sport in listening to my audio tracks over the years and for a long, long time she would complain that my songs hurt her ears. I tried putting reverb and flange on my songs to try to mellow them out, and that helped a little, but even then, it still hurt her ears. After I started using band limited wave forms, my songs stopped hurting her ears and my tones started sounding a lot smoother and richer, and more “professional”.

So, if you don’t want people’s ears to bleed when they hear your tunes, I recommend band limited wave forms!

Band Limited Sine Wave

The sine wave does not have a band limited form, since since itself IS bandlimited by definition. So, a band limited sine wave is just the sine wave itself.

Onto the next!

Band Limited Saw Wave

Wikipedia has a great article Sawtooth wave which says:

A sawtooth wave’s sound is harsh and clear and its spectrum contains both even and odd harmonics of the fundamental frequency. Because it contains all the integer harmonics, it is one of the best waveforms to use for synthesizing musical sounds, particularly bowed string instruments like violins and cellos, using subtractive synthesis.

What they mean by that (and what the heavy math formulas on that page say) is that if you have a saw wave of frequency 100, that means it contains a sine wave of frequency 100 (1 * fundamental frequency), another of frequency 200 (2 * fundamental frequency), another of 300 (3 * fundamental frequency) and so on into infinity.

The amplitude (volume) of each sine wave (harmonic) is 1 over the harmonic number. So in our example, the sine wave at frequency 100 has an amplitude of 1 (1/1). The sine wave at frequency 200 has an amplitude of 0.5 (1/2), the sine wave at frequency 300 has an amplitude of 0.333 (1/3) and so on into infinity.

After that you’ll need to multiply your sample by 2 / PI to get back to a normalized amplitude.

There’s a function in the sample code called AdvanceOscilator_Saw_BandLimited() that you can use to generate a band limited saw wave sample. It has an optional parameter where you can tell it how many harmonics to use, but if you omit that parameter, it’ll use as many as it can without going over Nyquist.

Here’s how a band limited saw wave looks and sounds compared to a non band limited saw wave, like the ones we created in the last chapter.

Chapter 3 Saw

Chapter 3 Saw Band Limited

Band Limited Square Wave

Wikipedia has a good article on square wave’s too here: Square Wave which says:

Note that the square wave contains only odd-integer harmonic frequencies (of the form 2π(2k-1)f), in contrast to the sawtooth wave and real-world signals, which contain all integer harmonics.

What this means is that if you were trying to make a square wave at frequency 100, unlike a saw wave which has sine waves at frequencies 100, 200, 300, 400 and so on, a square wave is made up of sine waves of frequencies 100, 300, 500 and 700.

Like the saw wave, however, the amplitude of each frequency is the reciprocal of the multiple of the frequency. So, the sine wave at frequency 100 has an amplitude of 1/1, the sine wave at frequency 300 has an amplitude of 1/3, the sine wave at frequency 500 has an amplitude of 1/5.

After that, you need to multiply by 4/PI to get back to a normalized amplitude.

The function to generate this wave form in the sample code is called AdvanceOscilator_Square_BandLimited().

Here’s how a band limited square wave looks and sounds compared to a non band limited square wave, like the ones we created in the last chapter.

Chapter 3 Square

Chapter 3 Square Band Limited

Band Limited Triangle Wave

The triangle wave is often used as a cheap approximation of a sine wave so it’s kind of funny making a more expensive (computationally) version of a triangle wave out of sine waves.

The wikipedia article for the triangle wave is here: Triangle Wave and it says:

It is possible to approximate a triangle wave with additive synthesis by adding odd harmonics of the fundamental, multiplying every (4n−1)th harmonic by −1 (or changing its phase by π), and rolling off the harmonics by the inverse square of their relative frequency to the fundamental.

Ok so in English what that means is that a triangle wave is a lot like a square wave, but every other harmonic, we subtract, instead of adding it. Also, instead of the amplitude (volume) of a sine wave being the reciprocal of the multiple of the frequency, the amplitude is the reciprocal of the SQUARE of the multiple of the frequency.

So that means for a 100hz frequency triangle wave, we would…

make a sine wave of 100hz at 1/1 amplitude
Subtract a sine wave of 300hz at 1/9 amplitude
Add a sine wave of 500hz at 1/25 amplitude
Subtract a sine wave of 700hz at 1/49 amplitude

and so on til infinity (or Nyquist frequency)

After that you multiply by 8 / PI*PI to get back to a normalized amplitude.

The function to generate this wave form in the sample code is called AdvanceOscilator_Triangle_BandLimited().

Here’s how a band limited triangle wave looks and sounds compared to a non band limited triangle wave, like the ones we created in the last chapter.

Chapter 3 Triangle

Chapter 3 Triangle Band Limited

Band Limited Noise

In the last chapter we also talked about the “noise” wave form and I briefly mentioned that it had it’s uses – such as in percussion sounds.

Is it possible to make a band limited version? It is, but I’m not sure if it’s really useful for anything, other than a strange sound (but then again, strange sounds is what synth is all about right?)

A quick aside – In this chapter so far, we’ve actually been talking about “Additive Synthesis” which is the process of adding multiple noises together to get an interesting result. Specifically, we’ve been adding sine waves together to get band limited forms of a saw wave, a square wave and a triangle wave. There is something else called “Subtractive Synthesis” where you carve away sounds with filters (such as a low pass filter, a high pass filter, a band pass filter, etc) to get your sound. Another way to generate band limited wave forms is to make a pure, non band limited wave form, and then use a high pass filter to cut out the high frequencies of the sound (the ones generating the aliasing sounds).

In practice, it sounds the same either way you generate it. Subtractive synthesis is just another way to approach the problem of aliasing and synth in general. In fact, when you down sample a sound file (take it from a higher sample rate to a lower sample rate), you should apply a low pass filter first to get rid of any frequencies that would cause aliasing in the lower sample rate.

Anyways, to generate band limited noise, I figured I’d just make a sine wave that changes it’s frequency once every 4000 samples (at a sample rate of 44,100, that means it changes it’s frequency 10 times a second).

Here’s what that looks and sounds like:

Chapter 3 Random Beeps

Interesting audio, and band limited, but not quite noise, so here is the same thing, switching frequency every 40 samples instead of every 4000 samples. That’s about 1000 times a second .

Chapter 3 Noise Wave

It is technically noise, and it is band limited, but it sounds weird. Like a tape player on fast forward or water flowing quickly or something.

I didn’t make a function to generate that wave form, but the sample code does it “manually” if you want to make your own function.

Chapter 3 Song

So this chapter has a somewhat passable song as a culmination of the info from the tutorials so far. You can check it out at the bottom of this article, but I wanted to give a quick overview of some other things that went into making it.

The song loads some sound files to use as samples. It loads 3 percussion sounds for the drum parts, and two sound clips from a favorite movie of mine called “Legend” – starring Tim Curry as the devil, Mia Sara as a princess and Tom Cruise as a naturalist wildman who is friends with fairies and elves. It’s a really great movie i really recommend checking it out!

Anyways, it MIXES these sound effects with our generated synth tones by just adding the various sound sources together. Mixing sounds is literally just adding them together.

When it loads up the wave files, it RESAMPLES them if necessary, meaning that if the sound file has a lower sample rate than the sound we want to render, it interpolates samples to make a higher sample rate. If the sound file loaded has a higher sample rate than the sound we want to render, it drops samples to make a lower sample rate. Check out the code for the details of how it does this, but it’s really simple and pretty much works how you’d expect it to. Note that if you down sample audio, you normally want to put it through a low pass filter to cut out any frequencies which would be above Nyquist, but my resampling code doesn’t handle that. It just aliases if there’s a frequency that is too high for the sake of simplicity.

Another thing that happens when it loads each wave file is that it converts it to mono or stereo if needed, to match the format of the sound we want to render. To convert from mono to stereo, it just duplicates the mono channel for the left and right channels, and to convert from stereo to mono, it just mixes (adds!) the left and right channel data together to get the mono channel data. Intuition might tell you that adding the left and right channels together would make it louder, even maybe twice as loud, but in practice that doesn’t happen. Sounds mix together pretty darn well without getting way loud, especially if they are “real life” sounds (not synthesized wave forms) and not played at the exact same time. Basically, the peaks (positive numbers) and valleys (negative numbers) in sound sources tend to cancel each other out and keep things in normal range.

Lastly, when loading a wave file, it normalizes the audio data so that our synth and the audio samples are all working in the same amplitude ranges of -1 to 1. When normalizing, it also “re-centers” the audio data. That is to say, if audio data was really quiet, but was always above the zero axis, it would move the data down to be centered on the zero axis before normalizing to make sure and maximize loudness.

In reality, we’d want to re-center the left and right channels individually, but I just do them together. Also, you might want to normalize individual sections of the audio data at a time instead of normalizing the entire thing as one big chunk at the end. There are a lot of good techniques and algorithms out there to do this, but this functionality is often called a compressor (to give you a place to start your research).

Note, you could easily play sound files backwards to see if they sync up with the wizard of oz, or give you instructions for some tasty brownies, but I didn’t do that in this example code, I leave that up to you!

If you want to be able to read and write other sound formats besides wav files, you might check out libsndfile. I use it in my own projects and it works pretty nicely! You can find it at: libsndfile

The Lament Of Tim Curry

Without further ado, here’s this chapter’s sample song. The full source code and source wave files is in this chapter’s source code zip file. Check it out with headphones for a neat effect, the bass line floats between the left and right channels. Enjoy! And go watch Legend if you haven’t seen it before!

The Lament Of Tim Curry

DIY Synth 2: Common Wave Forms

This is a part of the DIY Synthesizer series of posts where each post is roughly built upon the knowledge of the previous posts. If you are lost, check the earlier posts!

This is the second chapter in a series of tutorials about programming your own synthesizer

In this chapter we’ll talk about oscillators, and some common basic wave forms: Sine, Square, Saw, Triangle and Noise.

By the end, you should have enough knowledge to make some basic electronic melodies.

You can download the full source for this chapter here: DIY Synthesizer: Chapter 2 Source Code

The Sine Wave

The sine wave is the basis of lots of things in audio synthesis. It can be used on it’s own to make sound, multiple sine waves can be combined to make other more complex wave forms (as we’ll see in the next chapter) and it’s also the basis of a lot of DSP theory and audio analysis. For instance, there is something called Fourier Analysis where you can analyze some audio data and it will tell you what audio frequencies are in that sound data, and how strong each is (useful for advanced synthesis and digital signal processing aka DSP). The math of how to get that information is based on some simple properties of sine waves. More info can be found here: http://en.wikipedia.org/wiki/Fourier_analysis.

If we want to use a sine wave in our audio data, the first problem we hit is that sine has a value from -1 to 1, but our audio data from the last chapter is stored in a 32 bit int, which has a range of -2,147,483,648 to 2,147,483,647, and is unable to store fractional numbers.

The solution is to just map -1 to -2,147,483,648, and 1 to 2,147,483,647 and all the numbers in between represent fractional numbers between -1 and 1. 0.25 for instance would become 536,870,911.

If instead of 32 bits, we wanted to store the data in 16 bits, or 8 bits, we could do that as well. After generating our floating point audio data, we just convert it differently to get to those 16 bits and 8 bits. 16 bits have a range of -32,768 to 32,767 so 0.25 would convert to 8191. In 8 bits, wave files want UNSIGNED 8 bit numbers, so the range is 0 to 255. In that case, 0.25 would become 158.

Note, in the code for this chapter, i modified WriteWaveFile to do this conversion for us so going forward we can work with floating point numbers only and not worry about bits per sample until we want to write the wave file. When you call the function, you have to give it a template parameter specifying what TYPE you want to use for your samples. The three supported types are uint8, int16 and int32. For simple wave forms like those we are working with today, there is no audible difference between the 3, so all the samples just make 16 bit wave files.

So, we bust out some math and figure out here’s how to generate a sine wave, respecting the sample rate and frequency we want to use:
//make a naive sine wave for(int nIndex = 0; nIndex < nNumSamples; ++nIndex) { pData[nIndex] = sin((float)nIndex * 2 * (float)M_PI * fFrequency / (float)nSampleRate); } WriteWaveFile("sinenaive.wav",pData,nNumSamples,nNumChannels,nSampleRate);

That does work, and if you listen to the wave file, it does sound correct:
Naive Sine Wave Generation

It even looks correct:

There is a subtle problem when generating the sine wave that way though which we will talk about next.

Popping aka Discontinuity

The problem with how we generated the wave file only becomes apparent when we try to play two tones right next to each other, like in the following code segment:
//make a discontinuitous (popping) sine wave for(int nIndex = 0; nIndex < nNumSamples; ++nIndex) { if(nIndex < nNumSamples / 2) { float fCurrentFrequency = CalcFrequency(3,3); pData[nIndex] = sin((float)nIndex * 2 * (float)M_PI * fCurrentFrequency / (float)nSampleRate); } else { float fCurrentFrequency = CalcFrequency(3,4); pData[nIndex] = sin((float)nIndex * 2 * (float)M_PI * fCurrentFrequency / (float)nSampleRate); } } WriteWaveFile("sinediscon.wav",pData,nNumSamples,nNumChannels,nSampleRate);

Quick note about a new function shown here, called CalcFrequency. I made that function so that you pass the note you want, and the octave you want, and it will return the frequency for that note. For instance, to get middle C aka C4 (the tone all these samples use), you use CalcFrequency(3,3), which returns approximately 261.626.

Listen to the wave file generated and you can hear a popping noise where the tone changes from one frequency to the next: Discontinuous Sine Wave

So why is this? The reason is because how we are generating our sine waves makes a discontinuity where the 2 wave files change.

Here you can see the point that the frequencies change and how a pretty small discontinuity can make a pretty big impact on your sound! The sound you are hearing has an official name, called a “pop” (DSP / synth / other audio people will talk about popping in their audio, and discontinuity is the reason for it)

So how do we fix it? Instead of making the sine wave be rigidly based on time, where for each point, we calculate the sine value with no regard to previous values, we use a “Free Spinning Oscillator”.

That is a fancy way of saying we just have a variable keep track of the current PHASE (angle) that we are at in the sine wave for the current sample, and to get the next sample, we advance our phase based on the frequency at the time. Basically our oscillator is a wheel that spins freely, and our current frequency just says how fast to turn the wheel (from wherever it is now) to get the value for the next sample.

Here’s what the looks like in code:

//make a continuous sine wave that changes frequencies for(int nIndex = 0; nIndex < nNumSamples; ++nIndex) { if(nIndex < nNumSamples / 2) { float fCurrentFrequency = CalcFrequency(3,3); fPhase += 2 * (float)M_PI * fCurrentFrequency/(float)nSampleRate;

while(fPhase >= 2 * (float)M_PI) fPhase -= 2 * (float)M_PI;

while(fPhase < 0) fPhase += 2 * (float)M_PI;

pData[nIndex] = sin(fPhase); } else { float fCurrentFrequency = CalcFrequency(3,4); fPhase += 2 * (float)M_PI * fCurrentFrequency/(float)nSampleRate;

while(fPhase >= 2 * (float)M_PI) fPhase -= 2 * (float)M_PI;

while(fPhase < 0) fPhase += 2 * (float)M_PI;

pData[nIndex] = sin(fPhase); } } WriteWaveFile("sinecon.wav",pData,nNumSamples,nNumChannels,nSampleRate);

Note that we keep the phase between 0 and 2 * PI. There’s no mathematical reason for needing to do this, but in floating point math, if you let a value get too large, it starts to lose precision. That means, that if you made a wave file that lasted a long time, the audio would start to degrade the longer it played. I also use a while loop instead of a regular if statement, because if someone uses very large frequencies, you can pass 2 * PI a couple of times in a single sample. Also, i check that it’s above zero, because it is valid to use negative frequency values! All stuff to be mindful of when making your own synth programs (:

Here’s what the generated wave file sounds like, notice the smooth transition between the two notes:
Continuous Sine Wave

And here’s what it looks like visually where the wave changes frequency, which you can see is nice and smooth (the bottom wave). The top wave is the popping sine wave image again at the same point in time for reference. On the smooth wave it isn’t even visually noticeable that the frequency has changed.

One last word on this… popping is actually sometimes desired and can help make up a part of a good sound. For instance, some percussion sounds can make use of popping to sound more appropriate!

Sine Wave Oscillator

For our final incarnation of a sine wave oscillator, here’s a nice simple helper function:
float AdvanceOscilator_Sine(float &fPhase, float fFrequency, float fSampleRate) { fPhase += 2 * (float)M_PI * fFrequency/fSampleRate;

while(fPhase >= 2 * (float)M_PI) fPhase -= 2 * (float)M_PI;

while(fPhase < 0) fPhase += 2 * (float)M_PI;

return sin(fPhase); }

You pass that function your current phase, the frequency you want, and the sample rate, and it will advance your phase, and return the value for your next audio sample.

Here’s an example of how to use it:
//make a sine wave for(int nIndex = 0; nIndex < nNumSamples; ++nIndex) { pData[nIndex] = AdvanceOscilator_Sine(fPhase,fFrequency,(float)nSampleRate); } WriteWaveFile("sine.wav",pData,nNumSamples,nNumChannels,nSampleRate);

Here’s what it sounds like (nothing new at this point!):
Vanilla Sine Wave

Wave Amplitude, Volume and Clipping

You can adjust the AMPLITUDE of any wave form by multiplying each sample by a value. Values greater than one increase the amplitude, making it louder, values less than one decrease the amplitude, making it quieter, and negative values flip the wave over, but also have the ability to make it quieter or louder.

One place people use negative amplitudes (volumes) is for noise cancellation. If you have a complex sound that has some noise in it, but you know the source of the noise, you can take that noice, multiply it by -1 to get a volume of -1, and ADD IT (or MIX IT) into the more complex sound, effectively removing the noise from the sound. There are other uses too but this is one concrete, real world example.

This code sample generates a quieter wave file:
//make a quieter sine wave for(int nIndex = 0; nIndex < nNumSamples; ++nIndex) { pData[nIndex] = AdvanceOscilator_Sine(fPhase,fFrequency,(float)nSampleRate) * 0.4f; } WriteWaveFile("sinequiet.wav",pData,nNumSamples,nNumChannels,nSampleRate);

And here’s what that sounds like:
Vanilla Sine Wave – Quiet

And here’s what that looks like:

If you recall though, when we write a wave file, we map -1 to the smallest int number we can store, and 1 to the highest int number we can store. What happens if we make something too loud, so that it goes above 1.0 or below -1.0?

One way to fix this would be to “Normalize” the sound data. To normalize it, you would loop through each sample in the stream and find the highest absolute value sample. For instance if you had 3 samples: 1.0, -1.2, 0.8, the highest absolute sample value would be 1.2.

Once you have this value, you loop through the samples in the stream and divide by this number. After you do this, every sample in the stream will be within the range -1 to 1. Note that if you had any data that would be clipping, this process has the side effect of making your entire stream quieter since it reduces the amplitude of every sample. If you didn’t have any clipping data, this process has the side effect of making your entire stream louder because it increases the amplitude of every sample.

Another way to deal with it is to just clamp the values to the -1, 1 range. In the case of a sine wave, that means we chop off the top and/or the bottom of the wave and there’s just a flat plateau where the numbers went out of range.

This is called clipping, and along with popping are 2 of the main problems people have with audio quality degradation. Aliasing is a third, and is something we address in the next chapter by the way! (http://en.wikipedia.org/wiki/Aliasing)

Here’s some code for generating a clipping sine wave:
//make a clipping sine wave for(int nIndex = 0; nIndex < nNumSamples; ++nIndex) { pData[nIndex] = AdvanceOscilator_Sine(fPhase,fFrequency,(float)nSampleRate) * 1.4f; } WriteWaveFile("sineclip.wav",pData,nNumSamples,nNumChannels,nSampleRate);

And here’s what it sounds like:
Vanilla Sine Wave – Clipping

Also, here’s what it looks like:

Note that in this case, it doesn’t necessarily sound BAD compared to a regular, non clipping sine wave, but it does sound different. That might be a good thing, or a bad thing, depending on your intentions. With more complex sounds, like voice, or acoustic music, this will usually make it sound terrible. Audio engineers have to carefully control the levels (volumes) of the channels being mixed (added) together to make sure the resulting output doesn’t go outside of the valid range and cause clipping. Also, in analog hardware, going out of range can cause damage to the devices if they aren’t built to protect themselves from it!

In the case of real time synthesis, as you might imagine, normalizing wave data is impossible to do because it requires that you know all the sound data up front to be able to normalize the data. In real time applications, besides just making sure the levels keep everything in range, you also have the option of using a compressor which sort of dynamically normalizes on the fly. Check this out for more information: http://en.wikipedia.org/wiki/Dynamic_range_compression

Square Wave Oscillator

Here’s the code for the square wave oscillator:
float AdvanceOscilator_Square(float &fPhase, float fFrequency, float fSampleRate) { fPhase += fFrequency/fSampleRate;

while(fPhase > 1.0f) fPhase -= 1.0f;

while(fPhase < 0.0f) fPhase += 1.0f;

if(fPhase <= 0.5f) return -1.0f; else return 1.0f; }

Note that we are using the phase as if it’s a percentage, instead of an angle. Since we are using it differently, that means if you switch from sine wave to square wave, there will be a discontinuity (a pop). However, in practice this happens anyways almost all the time because unless you change from sine to square at the very top or bottom of the sine wave, there will be discontinuity anyways. In reality, this really doesn’t matter, but you could “fix” it to switch only on those boundaries, or you could use “cross fading” or “blending” to fade one wave out (decrease amplitude from 1 to 0), while bringing the new wave in (increase amplitude from 0 to 1), adding them together to get the output. Doing so will make a smooth transition but adds some complexity, and square waves by nature constantly pop anyways – it’s what gives them their sound!

Here’s what a square wave sounds like and looks like:
Square Wave

Saw Wave Oscillator

We used the saw wave in chapter one. Here’s the code for a saw wave oscillator:
float AdvanceOscilator_Saw(float &fPhase, float fFrequency, float fSampleRate) { fPhase += fFrequency/fSampleRate;

while(fPhase > 1.0f) fPhase -= 1.0f;

while(fPhase < 0.0f) fPhase += 1.0f;

return (fPhase * 2.0f) - 1.0f; }

Here’s what a saw wave looks and sounds like:
Saw Wave

Note that sometimes saw waves point the other direction and the “drop off” is on the left instead of on the right, and the rest of the way descends instead of rises but as far as I have seen, there is no audible or practical difference.

Triangle Wave Oscillator

A lot of synths don’t even bother with a triangle wave, and those that do, are just for approximations of a sine wave. A triangle wave sounds a lot like a sine wave and looks a bit like it too.

Here’s the code for a triangle wave oscillator:
float AdvanceOscilator_Triangle(float &fPhase, float fFrequency, float fSampleRate) { fPhase += fFrequency/fSampleRate;

while(fPhase > 1.0f) fPhase -= 1.0f;

while(fPhase < 0.0f) fPhase += 1.0f;

float fRet; if(fPhase <= 0.5f) fRet=fPhase*2; else fRet=(1.0f - fPhase)*2;

return (fRet * 2.0f) - 1.0f; }

Here’s what it looks and sounds like:
Triangle Wave

Noise Oscillator

Believe it or not, even static has it’s place too. It’s used sometimes for percussion (put an envelope around some static to make a “clap” sound), it can be used as a low frequency oscillator aka LFO (the old “hold and sample” type stuff) and other things as well. Static is just random audio samples.

The code for a noise oscillator is slightly different than the others. You have to pass it the last sample generated (you can pass 0 if it’s the first sample) and it will continue returning that last value until it’s time to generate a new random number. It determines when it’s time based on the frequency you pass in. A higher frequency mean more random numbers will be chosen in the same amount of audio data while a lower frequency means that fewer random numbers will be chosen.

At lower frequencies (like in the sample), it kind of sounds like an explosion or rocket ship sound effect from the 80s which is fun 😛

Here’s the code:
float AdvanceOscilator_Noise(float &fPhase, float fFrequency, float fSampleRate, float fLastValue) { unsigned int nLastSeed = (unsigned int)fPhase; fPhase += fFrequency/fSampleRate; unsigned int nSeed = (unsigned int)fPhase;

while(fPhase > 2.0f) fPhase -= 1.0f;

if(nSeed != nLastSeed) { float fValue = ((float)rand()) / ((float)RAND_MAX); fValue = (fValue * 2.0f) - 1.0f;

//uncomment the below to make it slightly more intense /* if(fValue < 0) fValue = -1.0f; else fValue = 1.0f; */

return fValue; } else { return fLastValue; } }

Here’s what it looks and sounds like:
Noise

I think it kind of looks like the Arizona desert 😛

As a quick aside, i have the random numbers as random floating point numbers (they can be anything between -1.0 and 1.0). Another way to generate noise is to make it so it will choose only EITHER -1 or 1 and nothing in between. It gives a slightly harsher sound. The code to do that is in the oscillator if you want to try it out, it’s just commented out. There are other ways to generate noise too (check out “pink noise” http://en.wikipedia.org/wiki/Pink_noise) but this ought to be good enough for our immediate needs!

More Exotic Wave Forms

Two other oscillators I’ve used on occasion is the squared sine wave and the rectangle wave.

To create a “squared sine wave” all you need to do is multiply each sample by itself (square the audio sample). This makes a wave form that is similar to sine waves, but a little bit different, and sounds a bit different too.

A rectangle wave is created by making it so the wave spends either more or less time in the “up” or “down” part of the wave. Instead of it being 50% of the time in “up”, and 50% of the time in “down” you can make it so it spends 80% of the time in up, and 20% of the time in down. It makes it sound quite a bit different, and the more different the percentages are, the “brighter” it sounds.

Also, you can add multiple wave form samples together to get more interesting wave forms (like adding a triangle and a square wave of the same frequency together, and reducing the amplitude to avoid clipping). That’s called additive synthesis and we’ll talk more about that next chapter, including how to make more correct wave forms using sine waves to avoid aliasing.

You can also multiply wave forms together to create other, more interesting waves. Strictly speaking this is called AM synthesis (amplitude modulation synthesis) which is also sometimes known as ring modulation when done a certain way.

As you can see, there are a lot of different ways to create oscillators, and the wave forms are just limited by your imagination. Play around and try to make your own oscillators and experiment!

Final Samples

Now we have the simple basics down for being able to create music. here’s a small “song” that is generated in the sample code:
Simple Song

And just to re-inforce how important keeping your wave data continuous is, here’s the same wave file, but about 0.75 seconds in a put a SINGLE -1.0 sample where it doesn’t belong. a single sample wrong when there’s 44100 samples per second and look how much it affects the audio.
Simple Song With Pop

Until Next Time…

Next up we will talk about “aliasing” and how to avoid it, making much better sounding saw, square and triangle waves that are less harsh on the ears.

Code

Here’s the code that goes with this post:

/*===================================================

Written by Alan Wolfe 5/2012

http://demofox.org

some useful links about the wave file format:
http://www.piclist.com/techref/io/serial/midi/wave.html
https://ccrma.stanford.edu/courses/422/projects/WaveFormat/

Note: This source code assumes that you are on a little endian machine.

===================================================*/

#include 
#include 
#include 

#define _USE_MATH_DEFINES
#include 

//define our types.  If your environment varies, you can change these types to be what they should be
typedef int int32;
typedef unsigned int uint32;
typedef short int16;
typedef unsigned short uint16;
typedef signed char int8;
typedef unsigned char uint8;

//some macros
#define CLAMP(value,min,max) {if(value  max) { value = max; }}

//this struct is the minimal required header data for a wav file
struct SMinimalWaveFileHeader
{
	//the main chunk
	unsigned char m_szChunkID[4];
	uint32		  m_nChunkSize;
	unsigned char m_szFormat[4];

	//sub chunk 1 "fmt "
	unsigned char m_szSubChunk1ID[4];
	uint32		  m_nSubChunk1Size;
	uint16		  m_nAudioFormat;
	uint16		  m_nNumChannels;
	uint32		  m_nSampleRate;
	uint32		  m_nByteRate;
	uint16		  m_nBlockAlign;
	uint16		  m_nBitsPerSample;

	//sub chunk 2 "data"
	unsigned char m_szSubChunk2ID[4];
	uint32		  m_nSubChunk2Size;

	//then comes the data!
};

//0 to 255
void ConvertFloatToAudioSample(float fFloat, uint8 &nOut)
{
	fFloat = (fFloat + 1.0f) * 127.5f;
	CLAMP(fFloat,0.0f,255.0f);
	nOut = (uint8)fFloat;
}

//–32,768 to 32,767
void ConvertFloatToAudioSample(float fFloat, int16 &nOut)
{
	fFloat *= 32767.0f;
	CLAMP(fFloat,-32768.0f,32767.0f);
	nOut = (int16)fFloat;
}

//–2,147,483,648 to 2,147,483,647
void ConvertFloatToAudioSample(float fFloat, int32 &nOut)
{
	double dDouble = (double)fFloat;
	dDouble *= 2147483647.0;
	CLAMP(dDouble,-2147483648.0,2147483647.0);
	nOut = (int32)dDouble;
}

//calculate the frequency of the specified note.
//fractional notes allowed!
float CalcFrequency(float fOctave,float fNote)
/*
	Calculate the frequency of any note!
	frequency = 440×(2^(n/12))

	N=0 is A4
	N=1 is A#4
	etc...

	notes go like so...
	0  = A
	1  = A#
	2  = B
	3  = C
	4  = C#
	5  = D
	6  = D#
	7  = E
	8  = F
	9  = F#
	10 = G
	11 = G#
*/
{
	return (float)(440*pow(2.0,((double)((fOctave-4)*12+fNote))/12.0));
}


//this writes a wave file
//specify sample bit count as the template parameter
//can be uint8, int16 or int32
template 
bool WriteWaveFile(const char *szFileName, float *pRawData, int32 nNumSamples, int16 nNumChannels, int32 nSampleRate)
{
	//open the file if we can
	FILE *File = fopen(szFileName,"w+b");
	if(!File)
	{
		return false;
	}

	//calculate bits per sample and the data size
	int32 nBitsPerSample = sizeof(T) * 8;
	int nDataSize = nNumSamples * sizeof(T);

	SMinimalWaveFileHeader waveHeader;

	//fill out the main chunk
	memcpy(waveHeader.m_szChunkID,"RIFF",4);
	waveHeader.m_nChunkSize = nDataSize + 36;
	memcpy(waveHeader.m_szFormat,"WAVE",4);

	//fill out sub chunk 1 "fmt "
	memcpy(waveHeader.m_szSubChunk1ID,"fmt ",4);
	waveHeader.m_nSubChunk1Size = 16;
	waveHeader.m_nAudioFormat = 1;
	waveHeader.m_nNumChannels = nNumChannels;
	waveHeader.m_nSampleRate = nSampleRate;
	waveHeader.m_nByteRate = nSampleRate * nNumChannels * nBitsPerSample / 8;
	waveHeader.m_nBlockAlign = nNumChannels * nBitsPerSample / 8;
	waveHeader.m_nBitsPerSample = nBitsPerSample;

	//fill out sub chunk 2 "data"
	memcpy(waveHeader.m_szSubChunk2ID,"data",4);
	waveHeader.m_nSubChunk2Size = nDataSize;

	//write the header
	fwrite(&waveHeader,sizeof(SMinimalWaveFileHeader),1,File);

	//write the wave data itself, converting it from float to the type specified
	T *pData = new T[nNumSamples];
	for(int nIndex = 0; nIndex = 2 * (float)M_PI)
		fPhase -= 2 * (float)M_PI;

	while(fPhase  1.0f)
		fPhase -= 1.0f;

	while(fPhase  1.0f)
		fPhase -= 1.0f;

	while(fPhase < 0.0f)
		fPhase += 1.0f;

	if(fPhase  1.0f)
		fPhase -= 1.0f;

	while(fPhase < 0.0f)
		fPhase += 1.0f;

	float fRet;
	if(fPhase  2.0f)
		fPhase -= 1.0f;

	if(nSeed != nLastSeed)
	{
		float fValue = ((float)rand()) / ((float)RAND_MAX);
		fValue = (fValue * 2.0f) - 1.0f;

		//uncomment the below to make it slightly more intense
		/*
		if(fValue < 0)
			fValue = -1.0f;
		else
			fValue = 1.0f;
		*/

		return fValue;
	}
	else
	{
		return fLastValue;
	}
}

//the entry point of our application
int main(int argc, char **argv)
{
	//our parameters that all the wave forms use
	int nSampleRate = 44100;
	int nNumSeconds = 4;
	int nNumChannels = 1;
	float fFrequency = CalcFrequency(3,3); // middle C

	//make our buffer to hold the samples
	int nNumSamples = nSampleRate * nNumChannels * nNumSeconds;
	float *pData = new float[nNumSamples];

	//the phase of our oscilator, we don't really need to reset it between wave files
	//it just needs to stay continuous within a wave file
	float fPhase = 0;

	//make a naive sine wave
	for(int nIndex = 0; nIndex < nNumSamples; ++nIndex)
	{
		pData[nIndex] = sin((float)nIndex * 2 * (float)M_PI * fFrequency / (float)nSampleRate);
	}

	WriteWaveFile("sinenaive.wav",pData,nNumSamples,nNumChannels,nSampleRate);

	//make a discontinuitous (popping) sine wave
	for(int nIndex = 0; nIndex < nNumSamples; ++nIndex)
	{
		if(nIndex < nNumSamples / 2)
		{
			float fCurrentFrequency = CalcFrequency(3,3);
			pData[nIndex] = sin((float)nIndex * 2 * (float)M_PI * fCurrentFrequency / (float)nSampleRate);
		}
		else
		{
			float fCurrentFrequency = CalcFrequency(3,4);
			pData[nIndex] = sin((float)nIndex * 2 * (float)M_PI * fCurrentFrequency / (float)nSampleRate);
		}
	}

	WriteWaveFile("sinediscon.wav",pData,nNumSamples,nNumChannels,nSampleRate);

	//make a continuous sine wave that changes frequencies
	for(int nIndex = 0; nIndex < nNumSamples; ++nIndex)
	{
		if(nIndex = 2 * (float)M_PI)
				fPhase -= 2 * (float)M_PI;

			while(fPhase = 2 * (float)M_PI)
				fPhase -= 2 * (float)M_PI;

			while(fPhase < 0)
				fPhase += 2 * (float)M_PI;

			pData[nIndex] = sin(fPhase);
		}
	}

	WriteWaveFile("sinecon.wav",pData,nNumSamples,nNumChannels,nSampleRate);

	//make a sine wave
	for(int nIndex = 0; nIndex < nNumSamples; ++nIndex)
	{
		pData[nIndex] = AdvanceOscilator_Sine(fPhase,fFrequency,(float)nSampleRate);
	}

	WriteWaveFile("sine.wav",pData,nNumSamples,nNumChannels,nSampleRate);

	//make a quieter sine wave
	for(int nIndex = 0; nIndex < nNumSamples; ++nIndex)
	{
		pData[nIndex] = AdvanceOscilator_Sine(fPhase,fFrequency,(float)nSampleRate) * 0.4f;
	}

	WriteWaveFile("sinequiet.wav",pData,nNumSamples,nNumChannels,nSampleRate);

	//make a clipping sine wave
	for(int nIndex = 0; nIndex < nNumSamples; ++nIndex)
	{
		pData[nIndex] = AdvanceOscilator_Sine(fPhase,fFrequency,(float)nSampleRate) * 1.4f;
	}

	WriteWaveFile("sineclip.wav",pData,nNumSamples,nNumChannels,nSampleRate);

	//make a saw wave
	for(int nIndex = 0; nIndex < nNumSamples; ++nIndex)
	{
		pData[nIndex] = AdvanceOscilator_Saw(fPhase,fFrequency,(float)nSampleRate);
	}

	WriteWaveFile("saw.wav",pData,nNumSamples,nNumChannels,nSampleRate);

	//make a square wave
	for(int nIndex = 0; nIndex < nNumSamples; ++nIndex)
	{
		pData[nIndex] = AdvanceOscilator_Square(fPhase,fFrequency,(float)nSampleRate);
	}

	WriteWaveFile("square.wav",pData,nNumSamples,nNumChannels,nSampleRate);

	//make a triangle wave
	for(int nIndex = 0; nIndex < nNumSamples; ++nIndex)
	{
		pData[nIndex] = AdvanceOscilator_Triangle(fPhase,fFrequency,(float)nSampleRate);
	}

	WriteWaveFile("triangle.wav",pData,nNumSamples,nNumChannels,nSampleRate);

	//make some noise or... make... some... NOISE!!!
	for(int nIndex = 0; nIndex  0 ? pData[nIndex-1] : 0.0f);
	}

	WriteWaveFile("noise.wav",pData,nNumSamples,nNumChannels,nSampleRate);

	//make a dumb little song
	for(int nIndex = 0; nIndex < nNumSamples; ++nIndex)
	{
		//calculate which quarter note we are on
		int nQuarterNote = nIndex * 4 / nSampleRate;
		float fQuarterNotePercent = (float)((nIndex * 4) % nSampleRate) / (float)nSampleRate;

		//intentionally add a "pop" noise mid way through the 3rd quarter note
		if(nIndex == nSampleRate * 3 / 4 + nSampleRate / 8)
		{
			pData[nIndex] = -1.0f;
			continue;
		}

		//do different logic based on which quarter note we are on
		switch(nQuarterNote)
		{
			case 0:
			{
				pData[nIndex] = AdvanceOscilator_Sine(fPhase,CalcFrequency(4,0),(float)nSampleRate);
				break;
			}
			case 1:
			{
				pData[nIndex] = AdvanceOscilator_Sine(fPhase,CalcFrequency(4,2),(float)nSampleRate);
				break;
			}
			case 2:
			case 3:
			{
				pData[nIndex] = AdvanceOscilator_Sine(fPhase,CalcFrequency(4,5),(float)nSampleRate);
				break;
			}
			case 4:
			{
				pData[nIndex] = AdvanceOscilator_Sine(fPhase,CalcFrequency(4,5 - fQuarterNotePercent * 2.0f),(float)nSampleRate);
				break;
			}
			case 5:
			{
				pData[nIndex] = AdvanceOscilator_Sine(fPhase,CalcFrequency(4,3 + fQuarterNotePercent * 2.0f),(float)nSampleRate);
				break;
			}
			case 6:
			{
				pData[nIndex] = AdvanceOscilator_Sine(fPhase,CalcFrequency(4,5 - fQuarterNotePercent * 2.0f),(float)nSampleRate) * (1.0f - fQuarterNotePercent);
				break;
			}

			case 8:
			{
				pData[nIndex] = AdvanceOscilator_Saw(fPhase,CalcFrequency(4,0),(float)nSampleRate);
				break;
			}
			case 9:
			{
				pData[nIndex] = AdvanceOscilator_Saw(fPhase,CalcFrequency(4,2),(float)nSampleRate);
				break;
			}
			case 10:
			case 11:
			{
				pData[nIndex] = AdvanceOscilator_Saw(fPhase,CalcFrequency(4,5),(float)nSampleRate);
				break;
			}
			case 12:
			{
				pData[nIndex] = AdvanceOscilator_Saw(fPhase,CalcFrequency(4,5 - fQuarterNotePercent * 2.0f),(float)nSampleRate);
				break;
			}
			case 13:
			{
				pData[nIndex] = AdvanceOscilator_Saw(fPhase,CalcFrequency(4,3 + fQuarterNotePercent * 2.0f),(float)nSampleRate);
				break;
			}
			case 14:
			{
				pData[nIndex] = AdvanceOscilator_Saw(fPhase,CalcFrequency(4,5 - fQuarterNotePercent * 2.0f),(float)nSampleRate) * (1.0f - fQuarterNotePercent);
				break;
			}

			default:
			{
				pData[nIndex] = 0;
				break;
			}
		}
	}

	WriteWaveFile("song.wav",pData,nNumSamples,nNumChannels,nSampleRate);

	//free our data buffer
	delete[] pData;
}

DIY Synth 1: Sound Output

This is a part of the DIY Synthesizer series of posts where each post is roughly built upon the knowledge of the previous posts. If you are lost, check the earlier posts!

This is the first in a series of tutorials on how to make your own software synthesizer.

These tutorials are aimed at C++ programmers, and the example code is meant to be as easy to understand as possible and have as few dependencies as possible. The code ought to compile and run for you no matter what system or compiler you are using with minimal if any changes required.

You can download the full source for this chapter here: DIY Synthesizer Chapter 1 Source Code

Wave File Format

Since making sound come out of computer speakers varies a lot between different systems, we’ll start out just writing a .wave file.

If you want to jump into doing real time audio, i recommend portaudio (http://www.portaudio.com/) , and i also recomend libsndfile for reading and writing other audio file formats(http://www.mega-nerd.com/libsndfile/).

I found these 2 links really helpful in understanding the wave file format:

There’s a lot of optional parts of a wave file header, but we are only going to focus on the bare minimum required to get the job done. Here’s what our wave file header struct looks like:

//this struct is the minimal required header data for a wav file struct SMinimalWaveFileHeader { //the main chunk unsigned char m_szChunkID[4]; uint32 m_nChunkSize; unsigned char m_szFormat[4];

//sub chunk 1 "fmt " unsigned char m_szSubChunk1ID[4]; uint32 m_nSubChunk1Size; uint16 m_nAudioFormat; uint16 m_nNumChannels; uint32 m_nSampleRate; uint32 m_nByteRate; uint16 m_nBlockAlign; uint16 m_nBitsPerSample;

//sub chunk 2 "data" unsigned char m_szSubChunk2ID[4]; uint32 m_nSubChunk2Size;

//then comes the data! };

And boringly, here’s the function that fills out the struct and writes it to disk:
bool WriteWaveFile(const char *szFileName, void *pData, int32 nDataSize, int16 nNumChannels, int32 nSampleRate, int32 nBitsPerSample) { //open the file if we can FILE *File = fopen(szFileName,"w+b"); if(!File) { return false; }

SMinimalWaveFileHeader waveHeader;

//fill out the main chunk memcpy(waveHeader.m_szChunkID,"RIFF",4); waveHeader.m_nChunkSize = nDataSize + 36; memcpy(waveHeader.m_szFormat,"WAVE",4);

//fill out sub chunk 1 "fmt " memcpy(waveHeader.m_szSubChunk1ID,"fmt ",4); waveHeader.m_nSubChunk1Size = 16; waveHeader.m_nAudioFormat = 1; waveHeader.m_nNumChannels = nNumChannels; waveHeader.m_nSampleRate = nSampleRate; waveHeader.m_nByteRate = nSampleRate * nNumChannels * nBitsPerSample / 8; waveHeader.m_nBlockAlign = nNumChannels * nBitsPerSample / 8; waveHeader.m_nBitsPerSample = nBitsPerSample;

//fill out sub chunk 2 "data" memcpy(waveHeader.m_szSubChunk2ID,"data",4); waveHeader.m_nSubChunk2Size = nDataSize;

//write the header fwrite(&waveHeader,sizeof(SMinimalWaveFileHeader),1,File);

//write the wave data itself fwrite(pData,nDataSize,1,File);

//close the file and return success fclose(File); return true; }

Nothing too crazy or all that interesting, but it gets the job done. Again, check out those links above if you are interested in the details of why things are written the way they are, or what other options there are.

Generating a Mono Wave File

Now, finally something interesting, we are going to generate some audio data and make a real wave file!

Since they are easy to generate, we’ll use a sawtooth wave for our sound. For more information about sawtooth waves, check out this wikipedia page: http://en.wikipedia.org/wiki/Sawtooth_wave.

int nSampleRate = 44100; int nNumSeconds = 4; int nNumChannels = 1;

The sample rate defines how many samples of audio data there are per second. A stream of audio data is nothing more than a stream of numbers, and each number is a single audio sample, so the sample rate is just how many numbers there are per second of audio data. The less numbers you use, the less “horizontal resolution” your sound file has, or, the less times the wave data can change in amplitude per second.

The sample rate also defines the maximum frequency you can store in the audio stream. The maximum frequency you can store is half of the sample rate. In other words, with a 44100 sample rate, the maximum frequency you can store is 22,050hz. The maximum audible frequency for the human ear is about 20,000hz so using a sample rate of 44100 ought to be pretty good for most needs (you might need to go higher, for complex technical reasons, but this is info enough for now!). Here’s some interesting info about audio frequencies: http://en.wikipedia.org/wiki/Audio_frequency

The number of seconds is how long (in seconds) the wave goes on for, and the number of channels is how many audio channels there are. Since this is a mono sound, there is only one audio channel.
int nNumSamples = nSampleRate * nNumChannels * nNumSeconds; int32 *pData = new int32[nNumSamples];

Here we calculate how many actual audio samples there are and then allocate space to hold the audio data. We are using 32 bit integers, but you could also use 16 bit integers. The number of bits in your audio samples indicates the vertical resolution of your audio data, or how many unique values there are. in 16 bit ints, there are 65536 different values, and in 32 bits there are 4.2 billion different values. If you think about your data as plots on a graph (essentially, what it is, where X is time and Y is wave amplitude) the more bits per sample, and the higher the sample rate, the closer your graph can be to whatever real values you are trying to use (such as a sine wave). Less bits and a lower sample rate mean it’s farther away from the real data you are trying to model, which will cause the audio to sound less correct.

int32 nValue = 0; for(int nIndex = 0; nIndex < nNumSamples; ++nIndex) { nValue += 8000000; pData[nIndex] = nValue; }

Here we are actually creating our wave data. We are using the fact that if you have an int near the maximum value you can store, and then add some more, it will wrap around to the minimum value the int can store. If you look at this on a graph, it looks like a saw tooth wave, ie we are creating a saw tooth wave! Normally you wouldn’t create them this way because the way we are doing it is harsh on the ear, and introduces something called aliasing (http://en.wikipedia.org/wiki/Aliasing). In a later tutorial we’ll see how to create a band limited saw tooth wave to make higher quality sound, but for now this will work file!

you can change how much is added to nValue to change the frequency of the resulting wave. Add a smaller number to make it a lower frequency, add a higher number to make it a higher frequency. We’ll get into the math of more finely controlling frequency in another chapter so you can actually match your waves to notes you watch to hit.

WriteWaveFile("outmono.wav",pData,nNumSamples * sizeof(pData[0]),nNumChannels,nSampleRate,sizeof(pData[0])*8);

delete[] pData;
Lastly we write our wave file and free our memory.

Tada! All done, we have a sawtooth mono wave file written out, give it a listen!

DIY Synthesizer Chapter 1: outmono.wav

Writing a Stereo File

The only thing that has really changed in the stereo file is that there are 2 channels instead of 1, and how we generate the audio data is slightly different. Since there are 2 channels, one for left, one for right, there is actually double the audio data for the same sample rate and time length wave file, since it needs a full set of data for each channel.

The audio data itself is interleaved, meaning that the first audio sample is for the left channel, the second sample is for the right channel, the third sample is for the left channel, and so on.

Here’s how the audio is generated:

int32 nValue1 = 0; int32 nValue2 = 0; for(int nIndex = 0; nIndex < nNumSamples; nIndex += 2) { nValue1 += 8000000; nValue2 += 12000000; pData[nIndex] = nValue1; //left channel pData[nIndex+1] = nValue2; //right channel }

Note that for the right channel we write a different frequency wave. I did this so that you can tell the difference between this and the mono file. Play around with the values and try muting one channel or the other to convince yourself that it really is a stereo file!

DIY Synthesizer Chapter 1: outstereo.wav

Until Next Time…

That’s all for chapter 1, thanks for reading.

Next up we’ll talk about the basic wave forms – sine, square, saw, square, and noise – and we’ll talk more about frequency and oscillators.