Fast & Lightweight Random “Shuffle” Functionality – FIXED!

In this post I’m going to show a way to make an iterator that will visit items in a list in a random order, only visit each item once, and tell you when it’s visited all items and is finished. It does this without storing a shuffled list, and it also doesn’t have to keep track of which items it has already visited.

This means you could have a list that is a million items long and no matter how many you have visited, it will only take two uint32s to store the current state of the iterator, and it will be very fast to get the next item in the list (somewhere around 1 to 4 times the cost of calling the rand() function!).

This is a follow up post to an older post called Fast & Lightweight Random “Shuffle” Functionality.

In that older post, things didn’t work quite like I expected them to so it was back to the drawing board for a little while to do some research and experimentation to find a better solution. I’m not going to go back over the basics I talked about in that article so go back and have a read there first if anything is confusing.

High Level

In the last post on this topic we talked about how the high level goal was to map the index to a random number, and because we were randomizing the bits in a deterministic (and non destructive) way, we needed to iterate over the whole “next power of 2” items and reject any that were too large. Only doing this could we be sure that we visited every index. The problem I hit last time though was that I could not get the numbers to look random enough.

To solve this, i decided what i needed to do was ENCRYPT the index with a block cipher. When you encrypt data, it should come out looking like random data, even though the data you put in may be sequential or have other easily seen patterns. What else is great, is that when using a block cipher, each unique input should equate to a unique output which means that if we encrypt the full power of 2 range as input, we will get the full power of 2 range out as output, but just in a different order.

Once I realized this was a good solution, my next problem to tackle was that I knew of no block algorithms that would work for a variable number of bits. There are block cipher algorithms that will work for LARGE number of bits, but there is no algorithm I knew of where you can tell it “hey, i only want to use 4 bits” or “i only want to use 5 bits”.

In the end the answer I found was to roll my own, but use existing, well established technology to do so. In my specific case, I’m also aiming for high speed functions since I want this functionality used in real time gaming applications.

What I came up with in the end is not cryptographically secure, but using the same techniques I have laid out, you should be able to drop in a different block cipher algorithm if that is a requirement.

Feistel Network

As it turns out, there is a basic primitive of cryptography called a Feistel Network. It’s used by quite a number of modern ciphers and hash functions, and it is surprisingly simple.

For a balanced Feistel Network, you split the data into a left and a right side, and do the following, which consists of a single round (you can do as many rounds as you like):

Left[i+1]  = Right[i];
Right[i+1] = Left[i] ^ RoundFunction(Right[i], key);

After performing however many rounds you wish, you combine the left and the right sides again to get your encrypted data.

To unencrypt, the feistel network works much the same but only in reverse, looking like the below:

Right[i] = Left[i+1];
Left[i] = Right[i+1] ^ RoundFunction(Left[i+1], key);

Check out the wikipedia page if you are interested in more info.

The neat thing about Feistel Networks is that the round function can be any deterministic function that performs whatever operations it wants – even destructive and irreversible operations such as division or bit shifts. Even so, the feistel network as a whole is reversible, no matter what you do in the round function, and you can unencrypt to get your origional data back.

This threw me for quite a loop and I couldn’t get my head around why this worked for a long while until I found a webpage that explained it pretty well. unfortunately I lost the link and cannot find it again but the basic idea is this… For each round of encryption, the right side is encrypted using the key and the left side. This means that at any point, no matter how many rounds you’ve done on your data, the right side should be able to be decrypted using the key and the left side. If you have the key, and you know how many rounds were used in encryption, you have all the data you need to decrypt it again. Hopefully that makes sense… I had to work it out on paper a little bit to see it fully.

The other great thing about Feistel Networks is that you can make them be however many bits you want. So, if i want each side of the Feistel Network to be 1 bit, I can do that. Or, if i want each side to be 128 bits, I can do that too!

You can also tune the quality / performance a bit by doing less or more rounds.

BTW the Tiny Encryption Algorithm uses a Feistel Network if you want to see a simple example in action.

With the “variable bit size support” problem solved, next I needed to come up with a round function that did a good job of taking sequential numbers as input and spitting out seemingly random numbers as output. Thanks to what I was saying before, the round function doesn’t need to be reversible, so there are a lot of options available.

I ended up deciding to go with a hash function, specifically Murmur Hash 2 (which I actually also used in my last post if you’d like to see some more information on it! The Incredible Time Traveling Random Number Generator).

Since the hash spits out numbers that might be anything in the range of an unsigned int, but I only want N bits, I just AND the hash against a mask to get the number of bits I want. There’s probably a higher quality method of smashing down the bits using XOR or something, but my quality needs aren’t very high so I just opted to AND it.

A downside of going with the balanced Feistel Network approach is that before this, I only had to round up to the next power of 2, but now, since each half of the data needs to be a power of 2, I actually have to make sure I have an even number of bits and have to round up to the next power of 4. This means that when it’s looking for valid indices to return in the shuffle, it may have to calculate up to 4 different indices on average before it finds a valid one. Not the greatest thing in the world, but also not the worst and definitely not a deal breaker in my eyes.

The Code

At long last, here is the code! Use it in good health (:

There are some example runs of the program below it as well.

#include 
#include 
#include 

// MurmurHash code was taken from https://sites.google.com/site/murmurhash/
//-----------------------------------------------------------------------------
// MurmurHash2, by Austin Appleby

// Note - This code makes a few assumptions about how your machine behaves -

// 1. We can read a 4-byte value from any address without crashing
// 2. sizeof(int) == 4

// And it has a few limitations -

// 1. It will not work incrementally.
// 2. It will not produce the same results on little-endian and big-endian
//    machines.

unsigned int MurmurHash2 ( const void * key, int len, unsigned int seed )
{
	// 'm' and 'r' are mixing constants generated offline.
	// They're not really 'magic', they just happen to work well.

	const unsigned int m = 0x5bd1e995;
	const int r = 24;

	// Initialize the hash to a 'random' value

	unsigned int h = seed ^ len;

	// Mix 4 bytes at a time into the hash

	const unsigned char * data = (const unsigned char *)key;

	while(len >= 4)
	{
		unsigned int k = *(unsigned int *)data;

		k *= m; 
		k ^= k >> r; 
		k *= m; 
		
		h *= m; 
		h ^= k;

		data += 4;
		len -= 4;
	}
	
	// Handle the last few bytes of the input array

	switch(len)
	{
	case 3: h ^= data[2] << 16;
	case 2: h ^= data[1] <> 13;
	h *= m;
	h ^= h >> 15;

	return h;
}

struct SShuffler
{
public:
	SShuffler(unsigned int numItems, unsigned int seed)
	{
		// initialize our state
		m_numItems = numItems;
		m_index = 0;
		m_seed = seed;

		// calculate next power of 4.  Needed sice the balanced feistel network needs
		// an even number of bits to work with
		m_nextPow4 = 4;
		while (m_numItems > m_nextPow4)
			m_nextPow4 *= 4;

		// find out how many bits we need to store this power of 4
		unsigned int numBits = 0;
		unsigned int mask = m_nextPow4 - 1;
		while(mask)
		{
			mask = mask >> 1;
			numBits++;
		}

		// calculate our left and right masks to split our indices for the feistel 
		// network
		m_halfNumBits = numBits / 2;
		m_rightMask = (1 << m_halfNumBits) - 1;
		m_leftMask = m_rightMask << m_halfNumBits;
	}

	void Restart()
	{
		Restart(m_seed);
	}

	void Restart(unsigned int seed)
	{
		// store the seed we were given
		m_seed = seed;

		// reset our index
		m_index = 0;
	}

	// Get the next index in the shuffle.  Returning false means the shuffle
	// is finished and you should call Restart() if you want to start a new one.
	bool Shuffle(unsigned int &shuffleIndex)
	{
		// m_index is the index to start searching for the next number at
		while (m_index < m_nextPow4)
		{
			// get the next number
			shuffleIndex = NextNumber();

			// if we found a valid index, return success!
			if (shuffleIndex  1)
		{
			// get the last number
			shuffleIndex = LastNumber();

			// if we found a valid index, return success!
			if (shuffleIndex > m_halfNumBits;
		unsigned int right = (index & m_rightMask);

		// do 4 feistel rounds 
		for (int index = 0; index < 4; ++index)
		{
			unsigned int newLeft = right;
			unsigned int newRight = left ^ (MurmurHash2(&right, sizeof(right), m_seed) & m_rightMask);
			left = newLeft;
			right = newRight;
		}

		// put the left and right back together to form the encrypted index
		return (left << m_halfNumBits) | right;
	}

private:

	// precalculated values
	unsigned int m_nextPow4;
	unsigned int m_halfNumBits;
	unsigned int m_leftMask;
	unsigned int m_rightMask;

	// member vars
	unsigned int m_index;
	unsigned int m_seed;
	unsigned int m_numItems;

	// m_index assumptions:
	//   1) m_index is where to start looking for next valid number
	//   2) m_index - 2 is where to start looking for last valid number
};

// our songs that we are going to shuffle through
const unsigned int g_numSongs = 10;
const char *g_SongList[g_numSongs] =
{
	" 1. Head Like a Hole",
	" 2. Terrible Lie",
	" 3. Down in It",
	" 4. Sanctified",
	" 5. Something I Can Never Have",
	" 6. Kinda I Want to",
	" 7. Sin",
	" 8. That's What I Get",
	" 9. The Only Time",
	"10. Ringfinger"
};

int main(void)
{
	// create and seed our shuffler.  If two similar numbers are hashed they should give
	// very different results usually, so for a seed, we can hash the time in seconds,
	// even though that number should be really similar from run to run
    unsigned int currentTime = time(NULL);
    unsigned int seed = MurmurHash2(&currentTime, sizeof(currentTime), 0x1337beef);
	SShuffler shuffler(g_numSongs, seed);

	// shuffle play the songs
	printf("Listen to Pretty Hate Machine (seed = %u)rn", seed);
	unsigned int shuffleIndex = 0;
	while(shuffler.Shuffle(shuffleIndex))
		printf("%srn",g_SongList[shuffleIndex]);

	system("pause");
	return 0;
}

shuf1

shuf2

shuf3

shuf4

The Incredible Time Traveling Random Number Generator

It isn’t very often that you need a pseudo random number generator (PRNG) that can go forwards or backwards in time, or skip to specific points in the future or the past. However, if you are ever writing a game like Braid and do end up needing one, here’s one way to do it.

At the core of how this is going to work, we are going to keep track of an index, and have some way to convert that index into a random number. if we want to move forward in time, we will just increment the index. If we want to move backwards in time, we will just decrement the index. It’s super simple from a high level, but how are we going to convert an index into a random number?

There are lots of pseudo random number generators out there that we could leverage for this purpose, the most famous being C++’s built in “rand()” function, and another one famous in the game dev world is the Mersenne Twister.

I’m going to do something a little differently though as it leads well into the next post I want to write, and may be a little bit different than some people are used to seeing; I want to use a hash function.

Murmur Hash 2

Good hash functions have the property that small changes in input give large changes in output. This means that if we hash the number 1 and then hash the number 2, that they ought not to be similar output, they ought to be wildly different numbers in the usual case. Sometimes, just like real random numbers, we might get 2 of the same numbers in a row, but that is the desired behavior to have the output act like real random sequences.

There are varying levels of quality of hash functions, ranging from a simple string “hash” function of using the first character of a string (super low quality hash function, but super fast) all the way up to cryptographic quality hash functions like MD5 and SHA-1 which are a lot higher quality but also take a lot longer to generate.

In our usage case, I’m going to assume this random number generator is going to be used for game use, where if the player can discover the pattern in the random numbers, they won’t be able to gain anything meaningful or useful from that, other than at most be able to cheat at their own single player game. However, I really do want the numbers to be fairly random to the casual eye. I don’t want visible patterns to be noticeable since that would decrease the quality of the gameplay. I would also like my hash to run as quickly as possible to keep game performance up.

Because of that level of quality I’m aiming for, I opted to go with a fast, non cryptographic hash function called Murmur Hash 2. It runs pretty quick and it gives pretty decent quality results too – in fact the official Murmur Hash Website claims that it passes the Chi Squared Test for “practically all keysets & bucket sizes”.

If you need a higher quality set of random numbers, you can easily drop in a higher quality hash in place of Murmur Hash. Or, if you need to go the other way and have faster code at the expensive of random number quality, you can do that too.

Speed Comparison

How fast is it? Here’s some sample code to compare it vs C++’s built in rand() function, as well as an implementation of the Mersenne Twister I found online that seems to preform pretty well.

#include 
#include 
#include 
#include 
#include "tinymt32.h" // from http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/TINYMT/index.html

// how many numbers to generate
#define NUMBERCOUNT 10000000  // Generate 10 million random numbers

// profiling macros
#define PROFILE_BEGIN 
{ 
	LARGE_INTEGER freq; 
	LARGE_INTEGER start; 
    QueryPerformanceFrequency(&freq); 
    QueryPerformanceCounter(&start); 
		
#define PROFILE_END(label) 
	LARGE_INTEGER end; 
	QueryPerformanceCounter(&end); 
	printf(label " - %f msrn", ((double)(end.QuadPart - start.QuadPart)) * 1000.0 / freq.QuadPart); 
}

// MurmurHash code was taken from https://sites.google.com/site/murmurhash/
//-----------------------------------------------------------------------------
// MurmurHash2, by Austin Appleby

// Note - This code makes a few assumptions about how your machine behaves -

// 1. We can read a 4-byte value from any address without crashing
// 2. sizeof(int) == 4

// And it has a few limitations -

// 1. It will not work incrementally.
// 2. It will not produce the same results on little-endian and big-endian
//    machines.

unsigned int MurmurHash2 ( const void * key, int len, unsigned int seed )
{
	// 'm' and 'r' are mixing constants generated offline.
	// They're not really 'magic', they just happen to work well.

	const unsigned int m = 0x5bd1e995;
	const int r = 24;

	// Initialize the hash to a 'random' value

	unsigned int h = seed ^ len;

	// Mix 4 bytes at a time into the hash

	const unsigned char * data = (const unsigned char *)key;

	while(len >= 4)
	{
		unsigned int k = *(unsigned int *)data;

		k *= m; 
		k ^= k >> r; 
		k *= m; 
		
		h *= m; 
		h ^= k;

		data += 4;
		len -= 4;
	}
	
	// Handle the last few bytes of the input array

	switch(len)
	{
	case 3: h ^= data[2] << 16;
	case 2: h ^= data[1] <> 13;
	h *= m;
	h ^= h >> 15;

	return h;
}

void RandTest()
{
	for(int index = 0; index < NUMBERCOUNT; ++index)
		int i = rand();
}

unsigned int MurmurTest()
{
	unsigned int key = 0;
	for(int index = 0; index < NUMBERCOUNT; ++index)
		key = MurmurHash2(&key,sizeof(key),0);
	return key;
}

// g_twister is global and inited in main so it doesnt count towards timing
tinymt32_t g_twister; 
unsigned int TwisterTest()
{
	unsigned int ret = 0;
	for(int index = 0; index < NUMBERCOUNT; ++index)
		ret = tinymt32_generate_uint32(&g_twister);
	return ret;
}

int main(int argc, char**argv)
{
	// rand() test
	PROFILE_BEGIN;
	RandTest();
	PROFILE_END("rand()");

	// hash test
	unsigned int murmurhash;
	PROFILE_BEGIN;
	murmurhash = MurmurTest();
	PROFILE_END("Murmur Hash 2");

	// twister test
	g_twister.mat1 = 0;
	g_twister.mat2 = 0;
	tinymt32_init(&g_twister, 0);
	unsigned int twister;
	PROFILE_BEGIN;
	twister = TwisterTest();
	PROFILE_END("Mersenne Twister");

	// show the results
	system("pause");

	// this is here so that the murmur and twister code doesn't get optimized away
	printf("%u %urn", murmurhash, twister);

	return 0;
}

Here's the output of that code run in release on my machine, generating 10 million random numbers of each type. You can see that murmurhash takes about 1/3 as long as rand() but is not quite as fast as the Mersenne Twister. I ran this several times and got similar results, so all in all, Murmur Hash 2 is pretty fast!

mrmrrandperf

Final Code & Sample Output

Performance looks good but how about the time traveling part, and how about seeing some example output?

Here’s the finalized code:

#include 
#include 
#include 

// MurmurHash code was taken from https://sites.google.com/site/murmurhash/
//-----------------------------------------------------------------------------
// MurmurHash2, by Austin Appleby

// Note - This code makes a few assumptions about how your machine behaves -

// 1. We can read a 4-byte value from any address without crashing
// 2. sizeof(int) == 4

// And it has a few limitations -

// 1. It will not work incrementally.
// 2. It will not produce the same results on little-endian and big-endian
//    machines.

unsigned int MurmurHash2 ( const void * key, int len, unsigned int seed )
{
	// 'm' and 'r' are mixing constants generated offline.
	// They're not really 'magic', they just happen to work well.

	const unsigned int m = 0x5bd1e995;
	const int r = 24;

	// Initialize the hash to a 'random' value

	unsigned int h = seed ^ len;

	// Mix 4 bytes at a time into the hash

	const unsigned char * data = (const unsigned char *)key;

	while(len >= 4)
	{
		unsigned int k = *(unsigned int *)data;

		k *= m; 
		k ^= k >> r; 
		k *= m; 
		
		h *= m; 
		h ^= k;

		data += 4;
		len -= 4;
	}
	
	// Handle the last few bytes of the input array

	switch(len)
	{
	case 3: h ^= data[2] << 16;
	case 2: h ^= data[1] <> 13;
	h *= m;
	h ^= h >> 15;

	return h;
}

class CReversablePRNG
{
public:
	CReversablePRNG()
	{
		m_index = 0;
		m_seed = 0;
	}

	unsigned int NextNumber()
	{
		unsigned int ret = MurmurHash2(&m_index, sizeof(m_index), m_seed);
		m_index++;
		return ret;
	}

	unsigned int LastNumber()
	{
		unsigned int lastIndex = m_index - 2;
		unsigned int ret = MurmurHash2(&lastIndex, sizeof(lastIndex), m_seed);
		m_index--;
		return ret;
	}

	// to be able to save / restore state for a save game or whatever else
	void GetState(unsigned int &index, unsigned int &seed)
	{
		index = m_index;
		seed = m_seed;
	}

	void SetState(unsigned int index, unsigned int seed)
	{
		m_index = index;
		m_seed = seed;
	}

private:
	unsigned int m_index;
	unsigned int m_seed;
};

int main(int argc, char**argv)
{
	// create and seed our random number generator.  If two similar numbers are hashed
	// they should give very different results usually, so for a seed, we can hash the
	// time in seconds, even though the number from run to run should be really similar
	CReversablePRNG prng;
	unsigned int currentTime = time(NULL);
	unsigned int seed = MurmurHash2(&currentTime, sizeof(currentTime), 0x1337beef);
	prng.SetState(0, seed);

	// display our seed and our table header
	printf("seed = %urn", seed);
	printf("index | raw number | mod 10rn");
	printf("---------------------------rn");

	// generate 10 numbers forward
	for (int index = 0; index < 10; ++index)
	{
		unsigned int nextNumber = prng.NextNumber();
		printf("%2i    | %10u | %urn", index, nextNumber, nextNumber % 10);
	}

	// generate 3 numbers back
	printf("rn");
	for (int index = 0; index < 3; ++index)
	{
		unsigned int lastNumber = prng.LastNumber();
		printf("%2i    | %10u | %urn", 8 - index, lastNumber, lastNumber % 10);
	}

	// generate 5 numbers forward
	printf("rn");
	for (int index = 0; index < 5; ++index)
	{
		unsigned int nextNumber = prng.NextNumber();
		printf("%2i    | %10u | %urn", 7 + index, nextNumber, nextNumber % 10);
	}

	system("pause");

	return 0;
}

mrmrout4

mrmrout3

mrmrout2

mrmrout1

Next Up

Hopefully you enjoyed this post!

Next up I’m going to be applying this code to the problem of shuffling to continue on from the post where I tried to do that before: Fast & Lightweight Random “Shuffle” Functionality.

Why do you hate me rand()?!

TL;DR – I’ve always heard rand() sucked for generating (cryptographically strong) random numbers, but it turns out it’s just kind of bad in general too LOL.

OK so this is bizarre, I made a default settings console project in MSVC 2012 with the code below:

#include 
#include 
#include 

int main(int argc, char** argv)
{
	time_t thetime = 0;
	time(&thetime);
	srand(thetime);
	int a = rand();
	int b = rand();
	int c = rand();
	int d = rand();

	printf("time = %llu (%llu)rna = %irnb = %irnc =t %irnd = %irn", thetime, thetime % RAND_MAX, a, b, c, d);
	return 0;
}

Here are some sample outputs, can you see what’s wrong?!

time = 1371620230 (26377)
a = 11108
b = 28489
c = 18911
d = 15679
time = 1371620268 (26415)
a = 11232
b = 10944
c = 9621
d = 12581
time = 1371620289 (26436)
a = 11301
b = 7285
c = 24321
d = 26390
time = 1371620310 (26457)
a = 11369
b = 3625
c = 6252
d = 7432
time = 1371620332 (26479)
a = 11441
b = 10714
c = 6048
d = 12537

5 times in a row you can see that the first number randomly generated is in the 11,000’s. You can also see that it’s steadily increasing.

I included the time modulo RAND_MAX in case that was the first number returned but it isn’t. I also looked at the numbers in hex and there isn’t a clear pattern there either. I can’t really discern the correlation between the time and the first random number, but there is definitely a pattern of some kind.

You always hear you shouldn’t use rand() if you need really high quality random numbers (like used for encryption), but i always figured if you use srand() with time, your number will be good enough for games at least. Turns out, you might want to throw out the first random number rand gives you before using the stuff for your games too. Maybe throw out a couple just in case! 😛

You might wonder why b,c,d are seemingly more random then a, but that’s likely due to the Avalanche Effect aka “sensitivity to initial conditions” which as it turns out is a nice property of cryptographic algorithms as well as pseudo random number generators. That is also a fundamental idea from Chaos Theory.

Essentially, as you ask for more random numbers, they ought to be more unpredictable, and more “random”. You just get some trash in the beginning.

Anyways… I’m super surprised by just how bad rand() is… I guess I never looked at it like this before (or maybe this is some new bad behavior in MSVC 2012?). Also, RAND_MAX is defined for me as 0x7fff. Ouchies, where are the rest of our numbers? 😛

Fast & Lightweight Random “Shuffle” Functionality

shufflewhatif

NOTE: THIS ARTICLE ENDS IN FAILURE. IF YOU WANT TO SKIP AHEAD TO THE SUCCESSFUL METHOD CHECK OUT THIS POST: Fast & Lightweight Random “Shuffle” Functionality – FIXED

Sometimes in game development we have a list of things that we want to choose from randomly, but we want to make sure and only choose each thing one time.

For instance, let’s say you are making a game where the player gets quests randomly from an NPC, but when they finish the current group of quests, they can then move onto the next group of quests (because the next group is harder than the first group).

Here’s some obvious ways you might implement this:

  • Make a list of the items you want to choose randomly from. When you choose an item from the list, remove it from the list so that it won’t be chosen from next time. You have to allocate and maintain (and maybe serialize) a list which is a little bit of a bummer, but it gets the job done.
  • Add a flag or bool to each item to remember whether it’s been chosen from. When you randomly choose an item, if it has already been chosen, roll a random number again. If it hasn’t been chosen, mark it as having been chosen and use it. The bummer here is that it isn’t constant time. If you have a lot of items to choose from, when you get near the end and only have a few items left unused, you’ll have to roll a bunch of random numbers before you find a valid one to use.
  • Taking a cue from the last article Efficiently Generate Random Numbers Without Repeats, you could go with the 2nd option, but count how many unused items there are, roll a random number for the number of unused items, and then count through the list again to find which unused item you rolled. This is nicer, but it might be costly to have to traverse the list, especially if the list has a lot of items in it.

Computer Science Magic

Computer science has some neat things to it that could help you out here. Most notably, there are various algorithms for traversing numbers or multidimensional spaces in ways other than sequentially, for various reasons. Here are 2 such things for example (but there are many more out there waiting for you to find them!):

You could likely leverage something like these guys to traverse the items in a list in a way that looked random to a player, but would actually be well defined and have a pattern to them. The unfortunate thing about these is that they may be the same “random” every time though. With some creativity and programming alchemy you might be able to get around that problem though.

If something like this works well enough for you, it might be your solution!

XOR Magic

XOR is a magical thing. I started game development back in 16 bit CPU days before hardware accelerated graphics, and when GL (and later directX) first appeared, my first question was “this is all very neat, but how do i XOR pixels?” Sadly, the answer was that I couldn’t, and as far as I know, we still don’t have that ability with shaders and it makes me sad LOL.

Anyways, besides making really snazzy “programmer art” style graphics (and selection boxes), XOR is one of the corner stones of encryption and other cryptography. (Like Cryptography? I have a bunch of posts on it, go check em out! Cryptography 101)

For instance, the “one time pad” is the ONLY mathematically proven uncrackable encryption scheme, and it uses XOR. In fact, it ONLY uses XOR.

One property of XOR that makes it useful for cryptography also makes it useful to us here in our shuffle code. That property is this: If you have a random number, and a non random number, when you XOR them together, the result will be a random number too. (BTW the other property that makes it useful for cryptography is that it’s reversible, but we don’t care about that right now).

Think about that for a minute… that means that if you have a random number, and you count from 1 to 10, XORing each number by the same random number, the resulting numbers ought to be random too. What else is great is that thanks to the fact that Boolean math is deterministic (1 xor 0 is always 1, every time you do it), the numbers you get out will all be unique and not repeat. TA-DA! There are some problems left to solve, but we now have the basis for our shuffle algorithm!

Are you down with the SHUF?

Ok so the basic idea for shuffling is this: We are going to loop through the list normally, but we are going to xor each index against a pre-rolled random number so that it randomizes the index for us in a way that will have no repeats. Let’s pretend that we have 5 things to loop through and that our random number is 3. Let’s try it out:

index 0: 0 ^ 3 == 3
index 1: 1 ^ 3 == 2
index 2: 2 ^ 3 == 1
index 3: 3 ^ 3 == 0
index 4: 4 ^ 3 == 7

Our last number ended up being 7, what the heck happened? Well, the issue here is that it’s randomizing the bits in our indices, not really shuffling our 5 items. With 5 items to loop through, that means there are 3 bits that it is randomizing, which means that we might encounter any 3 bit value at any time (including 7, the highest one!), and that we would need to iterate through all 3 bit values to encounter all the indices that we are looking for (0 through 5). We’ll just have to loop through all 3 bit indices and ignore anything too large. Here’s all of the values:

index 0: 0 ^ 3 == 3
index 1: 1 ^ 3 == 2
index 2: 2 ^ 3 == 1
index 3: 3 ^ 3 == 0
index 4: 4 ^ 3 == 7 (ignore)
index 5: 5 ^ 3 == 6 (ignore)
index 6: 6 ^ 3 == 5 (ignore)
index 7: 7 ^ 3 == 4

Looks like we solved that issue.

The other issue that comes up is that the random number can be any number that can fit in an unsigned int. When we xor a huge number by our small indices, we’ll get giant numbers out as a result.

For instance if our random number was 15367, xoring that against index 3 would give us 15364.

To fix that, we can just use the lowest 3 bits of the random number (& against 7). That way, the random number can only have bits set in the lowest 3 bits, and our index already can only have bits set in the lowest 3 bits, so the end result can also only have bits set in the lowest 3 bits.

I think we are ready to write some code!

The Source Code

#include 
#include 
#include 

template 
struct SShuffler
{
public:
	SShuffler()
	{
		Start();
	}

	// start a new shuffle
	void Start()
	{
		m_index = (unsigned int)-1;
		m_randomNumber = ((unsigned int)rand()) & c_numItemsNextPow2Mask;
	}

	// whether or not the shuffle is finished
	bool IsDone()
	{
		return m_index == c_numItemsNextPow2;
	}

	// Get the next index in the shuffle
	bool Shuffle(unsigned int &shuffleIndex)
	{
		// increment our index until we reach our max index,
		// or we find a valid index
		do
		{
			m_index++;
			shuffleIndex = m_index ^ m_randomNumber;
		}
		while (m_index = c_numItems);

		// if we haven't reached the max index, our shuffle was successful
		return m_index > 1;
	static const unsigned int c_B = c_A | c_A >> 2;
	static const unsigned int c_C = c_B | c_B >> 4;
	static const unsigned int c_D = c_C | c_C >> 8;
	static const unsigned int c_numItemsNextPow2Mask = c_D | c_D >> 16;
	static const unsigned int c_numItemsNextPow2 = c_numItemsNextPow2Mask + 1;

	// member vars
	unsigned int m_index;
	unsigned int m_randomNumber;
};

// our songs that we are going to shuffle through
const unsigned int g_numSongs = 10;
const char *g_SongList[g_numSongs] =
{
	"1. Head Like a Hole",
	"2. Terrible Lie",
	"3. Down in It",
	"4. Sanctified",
	"5. Something I Can Never Have",
	"6. Kinda I Want to",
	"7. Sin",
	"8. That's What I Get",
	"9. The Only Time",
	"10. Ringfinger"
};

int main(void)
{
	// use the current time as a seed for our random number generator
	srand((unsigned)time(0));

	// declare a shuffler object
	SShuffler shuffler;

	// shuffle play once
	printf("I wanna listen to some NIN...(seed = %i)rnrn", shuffler.DebugGetSeed());
	unsigned int shuffleIndex = 0;
	while(!shuffler.IsDone())
	{
		if (shuffler.Shuffle(shuffleIndex))
			printf("%srn",g_SongList[shuffleIndex]);
	}

	// shuffle play again
	shuffler.Start();
	printf("rnThat was great, let's listen again! (seed = %i)rnrn", shuffler.DebugGetSeed());
	while(!shuffler.IsDone())
	{
		if (shuffler.Shuffle(shuffleIndex))
			printf("%srn",g_SongList[shuffleIndex]);
	}

	printf("rn");
	system("pause");
	return 0;
}

Example Run

Here’s the output of an example run of this program. Note that if ever you encounter seed 0, it will not shuffle at all. Also, if you encounter seed 15, it will play the list exactly backwards!

shuffle

Something Weird Going On

After playing with this stuff a bit, it looks like even though this technique works “ok”, that it actually doesn’t randomize the list as much as I thought it was. It looks like no matter what my seed is, adjacent numbers seem to “pair up”. Like 1 and 2 will always be next to each other but will change which comes first. Same with 3 and 4, 5 and 6, etc.

I think the problem is that if you have a set of numbers in order, that for each possible order those numbers can be in, there doesn’t exist a number you can XOR the set of numbers to get to be in that order. I think that even though a simple XOR can re-arrange the numbers, it can’t give you all possible combinations (which makes sense… 16 seeds is a lot less than 10 factorial, which is how many combinations there ought to be!)

I have to play around with it some more and think about it a little more though. There might be a way at least to make it better, maybe using some more bits from the random number to do more math operations on the index or something.

Efficiently Generate Random Numbers Without Repeats

Sometimes in game development, you want to get a random number, but you want it to be different than the last random number you got.

For instance, let’s say you were making a game where the player was an elemental golem and could change between ice, fire, electricity and earth randomly but it cost 50 mana to change to a new random element.

When the player presses the button to change forms, if your game rolled a random number to figure out the new element, sometimes it would choose the exact same element that the player already had, but the player would still get charged the 50 mana.

As a player, wouldn’t you feel kind of ripped off if that happened? Wouldn’t it be better if it just chose randomly from all elements except the current one?

I’m going to show you how so if you want to think about it a bit first and see if you can work out a solution, do so now! SPOILERS AHEAD!

Not so Great Implementations

To implement that, there are a couple of “not so great” ways to do it such as…

  • Make a list of all elements except the current one, and randomly choose from that list. This isn’t so great because you would have to allocate space for a list, copy in the elements, and then do the random roll. Memory allocations and data copying isn’t cheap. Imagine if there were 100 or 1000 different things you were choosing between. Then imagine that this operation happened for enemies every few game loops and that there were 1000 enemies on the screen at a time. That would a be a LOT of overhead just to roll fresh random numbers!
  • Roll a random number and if it’s the same as the current value, roll it again. Repeat until you get a new number. This isn’t so great because this code will randomly take longer than others. As an extreme case for instance, what if the random number generator chose the same number 100 times in a row? It would take 100 times longer to run the code in that case.
  • Another option might be to roll a random number, and if it was the same number as before, just add one (and use modulus to make sure it’s within valid range). This isn’t so great because you are basically giving the next number higher than your current number twice as much chance of coming up versus any other number. The solution shouldn’t bias the random chance in any way.

Solution 1

I faced this problem a few days ago and below is how i solved it. Note that Dice() is zero based, so if you call Dice(6), you will get a random number between 0 and 5. Dice() might be implemented using rand(), or a fast pseudo random number generator like Mersenne Twister or whatever else you would like to use.

unsigned int DiceNoRepeat(unsigned int numSides, unsigned int currentValue)
{
  if (numSides = currentValue)
    newValue++;

  return newValue;
}

Why that works is that if you throw out the current value, there are only numSides – 1 numbers left to choose from, so you first roll a number for which remaining number is the new number.

The numbers you chose from didn’t include the current value, but the values you are working with DO include the current value, so if the value is >= the current value, add one.

This solution is nice because it works in constant time (it only calls Dice() once), and also, it doesn’t mess up the probabilities (the chance of getting each number is the same).

Solution 2

Jay, a friend of mine who I used to work with, came up with a different solution that I think is essentially the same in terms of performance, and also has the nice properties of working in constant time and it doesn’t mess up the probabilities.

unsigned int DiceNoRepeat(unsigned int numSides, unsigned int currentValue)
{
  if (numSides < 1)
    return 0;

  unsigned int offset = Dice(numSides - 1) + 1;

  return (currentValue + offset) % numSides;
}

The reason this works is that instead of picking a random number, you are picking a random offset to add to your current number (wrapping around with modulus). You know that you want to move at least one space, and at most, numSides – 1. Since Dice() is zero based, Dice(numSides – 1) + 1 will give you a random number between 1 and numSides – 1.

When you add the offset to the current value and wrap it around by using modulus against numSides, you get a different random number.

Have a Different Solution?

Have a different way to do this? Post a comment and share! (:

Is pre-increment really faster than post increment? Part 1

If you are a C++ programmer, I’ll bet at some point in time, maybe during a code review, someone told you “hey you should use a pre-increment there because post-increments are slower”.

I had heard this some years ago in the context of for loops with the caveat of “they might have fixed it by now [the compiler], but why take the chance”. Fair enough I thought, and I started using pre-increment in for loops, but kept using post increment in other places because i felt it felt more natural. Also i don’t write code that makes it actually matter. I try to be explicit instead of clever to prevent bugs and such. For instance, lots of C++ programmers with years of experience probably wouldn’t be 100% sure about what order things happen in for this small code sample: *++p = 3;

To be more explicit, you could increment p on one line and then set *p to 3 on the next. That’s easier to get your head around because it’s more explicit. The optimizer can handle the details of combining the code, and as we will see a little bit later, code generation without an optimizer seems to do just fine too!

Ayways, I heard this again today during a code review. Someone saw i had a piece of code where i was incrementing an unsigned int like this (not as part of a for loop, just inside of an if statement): numEnables++;

They said “you should change that to a pre-increment because it’s faster”. Well, I decided I should have a look and see if that was true and that’s where today’s journey begins (:

Disclaimer: All my findings are based on what I’ve seen in MSVC 2010 and 2012 using default settings for debug and release building a win32 console application. When in doubt, check your own disassembly and make sure your code is doing what you think it is. I wish I had years ago so I would have known the truth back then.

Will It Blend?

Lets check out the assembly generated in release of a simple program to see what assembly it makes

int main(void)
{
int someNumber = 3;
someNumber++;
return 0;
}

Here is the resulting code from the disassembly window. The assembly code of our program is in the blue box:

PS you can find the diasassembly window by setting a breakpoint on someNumber++, running the program and when it hits the breapoint, going under the “Debug” menu, selecting “Windows” and then selecting “Disassembly”.

prepost1

Ok so what the heck happened to our program? All it’s doing is xoring the eax register against itself (to set it to zero) and then it’s calling “ret”. That is our “return 0” statement. Everything else got optimized away! The optimizer realized that nothing meaningful changes if it doesn’t calculate someNumber, so it decides not to.

Let’s try a printf to print out our number. That way the optimizer CAN’T optimize our code away.

#include
#include

int main(void)
{
int someNumber = 3;
someNumber++;
printf("someNumber = %irn", someNumber);
return 0;
}

Now here’s the disassembly:

prepost2

Ok we are doing better i guess. we see a push, a call, an add and then our return 0 code (xor eax, eax and ret).

I included the watch window so that you could see the values that the code is working with. The push pushes our printf format string onto the stack, and then it calls printf. Then it has the line “add sp, 8”. What that does is move the stack pointer up 8 bytes. Parameters are passed on the stack, so that line is there to undo the 4 byte push of the printf format string, and also the 4 byte push of someNumber. It’s just cleaning up the stack.

But where the heck was someNumber pushed onto the stack? I actually have no idea… do you know? If you know, please post a comment, I’m really curious how that works LOL.

EDIT: Thanks to Christophe for unraveling this, he points out that the assembly window was just not showing all the instructions:

For some reason, your MSVC debug window screencap shows the actual main() asm debug starting at 0x10b1002, not 0x10b1000. If you look at the line above at 0x10b0fff, it shows some “add byte ptr [edx+4], ch”. Which is because MSVC incorrectly deduced that there were some code starting at 0x10b0fff and so it messes up the debug print out of the actual code at 0x10b1000 (which would be some “push 4” on the stack, which is the incremented “someNumber” var).

We are not quite there. The optimizer is doing some funky stuff, and I think it’s because the optimizer knows that someNumber is the number “4”. It doesnt have to do any run time calculations so it isn’t

To make it so the optimizer can’t optimizer our number away like that, lets have the user input someNumber so the compiler has no idea what the right value is until run time.

#include
#include

int main(void)
{
int someNumber = 0;
printf("Please enter a number.rn");
scanf("%i",&someNumber);
someNumber++;
printf("someNumber = %irn", someNumber);
return 0;
}

And here’s the code generated. The important part (our increment) is in the blue box:

prepost3

Ok there’s our post increment code. Let’s change the post increment to a preincrement:

#include
#include

int main(void)
{
int someNumber = 0;
printf("Please enter a number.rn");
scanf("%i",&someNumber);
++someNumber;
printf("someNumber = %irn", someNumber);
return 0;
}

And here’s the code it generates:

prepost4

If you compare this generated code to the previously generated code, you’ll see it’s the exact same (some memory addresses have changed, but the instructions are the same). This is pretty good evidence that for this type of usage case, it doesn’t matter if you use post or pre increment – in RELEASE anyways.

Well, you might think to yourself “the optimizer does a good job, but if using a pre-increment, maybe you can make your DEBUG code faster so it isn’t so painful to debug the program”. Good point! It turns out that in debug though, preincrement and post increment give the same generated code. Here it is:

prepost5a
prepost5b

So, it looks like in this usage case that the compiler does not care whether you use preincrement or post increment.

Let’s check out for loops real quick before calling it a day.

For Loops

Here’s our post increment for loop testing code. Note we are doing the same tricks as before to prevent the key parts from getting optimized away.

#include
#include

int main(void)
{
int someNumber = 0;
printf("Please enter a number.rn");
scanf("%i",&someNumber);
for(int index = 0; index < someNumber; index++)
printf("index = %irn", index);
return 0;
}

Post and pre-increment generate the same code in release! Here it is:

prepost6

It turns out they generate the same code in debug as well:

prepost7a
prepost7b

Summary

Well, it looks like for these usage cases (admittedly, they are not very complex code), it really just does not matter. You still ought to check your own usage cases just to make sure you see the results I’m seeing. See something different? Post a comment!

In part 2 I’ll show you a case where using pre or post increment does matter. Here it is!

Is pre-increment really faster than post increment? Part 2

The Black Art of sizeof() aka Compile Time Type Deduction

Over the past couple decades of C++ programming, I’ve found sizeof useful, if relatively boring, and have found it handy for making things like memcpy and fread usable in a safe way. Little did I know that there were dark secrets lurking just below the surface, and that to truly understand sizeof is to dangle precariously above the black pit of infinity and see the secrets of creation laid bare.

I might be dramatizing things a little bit, but not by much (;

Sizeof Basics

Let’s start with the basics before working to the really interesting stuff:

#include 
void main(void)
{
	printf("size = %i",sizeof(int));
}

Running the above gives the output “size = 4”.

Luckily, instead of always having to give a type as a parameter to sizeof, we can also give a variable name. This saves us typing (and reduces potential bugs) when we change the type of a variable. For instance the below gives the same output.

#include 
void main(void)
{
	int index = 3;
	printf("size = %i",sizeof(index));
}

What’s great about the above is you can change what type index is defined as, and you don’t have to update the sizeof call. It knows what type index is and will always print out the right size for you. The less things you have to do manually or remember, the easier it is to maintain code and the less bugs you are likely to have.

Ok next up… sizeof() happens at compile time. It isn’t a function call, it does it’s work at compile time and has no runtime cost. This code below proves that and gives the same output:

#include 

int index = 3;

enum
{
	c_sizeOfIndex = sizeof(index)
};

void main(void)
{
	printf("size = %i",c_sizeOfIndex);
}

A Little More Interesting

Did you know that you can “call functions” inside of sizeof? Check out the below which gives the same output again:

#include 

int myFunc()
{
	return 3;
}

enum
{
	c_sizeOf = sizeof(myFunc())
};

void main(void)
{
	printf("size = %i",c_sizeOf);
}

But wait a second… if sizeof happens at compile time, how is it able to call a function? Does it evaluate the function at compile time somehow? Does it just do a sizeof on the function pointer?

What happens is it doesn’t actually evaluate anything that you pass to sizeof. it just looks at what’s inside, figures out what type it evaluates to, and gives you the size of that resulting type.

In the example above, it’s just looking at the return type of myFunc and giving you the sizeof that. That’s all 😛

To prove that it doesn’t actually execute the function, check this program out, which gives the same output and does not crash!

#include 

int myFunc()
{
	// this is safe, right?
	((int *)0)[0] = 0;

	// this is too right?
	int a = 0;
	int b = 3;
	int c = b / a;

	// this text never gets printed out
	printf("sup dawg, i heard you liked...");

	// lastly, infinite recursion..  i feel like i'm trying to kill rasputin...
	return 3 + myFunc();
}

enum
{
	c_sizeOf = sizeof(myFunc())
};

void main(void)
{
	printf("size = %i",c_sizeOf);
}

here’s one more interesting example, to show that sizeof has all the power that the compiler has to figure out data types at compile time, without actually evaluating any of the code. Running the below, you can see by observation that the result of what’s inside sizeof is a bool, and when you run the program, it reports sizeof(bool) which is 1 for me on my machine.

#include 

int myFunc(int someNumber)
{
	// this is safe, right?
	((int *)0)[0] = 0;

	// this is too right?
	int a = 0;
	int b = 3;
	int c = b / a;

	// this text never gets printed out
	printf("sup dawg, i heard you liked...");

	// lastly, infinite recursion..  i feel like i'm trying to kill rasputin...
	return someNumber + myFunc(someNumber * 2);
}

int someGlobalVariable = 3;

enum
{
	c_sizeOf = sizeof((myFunc(someGlobalVariable) * someGlobalVariable << 1) == 0)
};

void main(void)
{
	printf("size = %i",c_sizeOf);
}

So basically, we now know that we can pass any arbitrarily complex expression to sizeof – even ones that call functions and use variables – and the compiler will figure out the resulting type and give us the result of that.

True Power Revealed

The real power with sizeof comes in when we start using function overloads. If we have 2 versions of the same function, we can pass some parameters to that function, and the compiler will figure out which function is the best match. If each function has differently sized return types, we can use sizeof to know which one the compiler chose for any given parameters. Even better… we can know this at compile time with absolutely no run time cost.

Check out what I mean:

#include 

int someFunc(int someNumber);
char someFunc(const char *someString);

int myNumber = 3;
const char *myString = "hello!";

enum
{
	c_size1 = sizeof(someFunc(myNumber)),
	c_size2 = sizeof(someFunc(myString))
};

void main(void)
{
	printf("sizes = %i,  %i", c_size1, c_size2);
}

If you run that program, it prints out “sizes = 4, 1”. We now have a program that can tell at compile time whether a value (and even a variable) is a string or an int. Note that we don’t actually need to define bodies for our functions, because they aren’t ever actually called.

What if we wanted to just make it be able to tell us if it was a string or not, and we didn’t want to have to make a separate function (and return type) for each possible type to be able to tell it whether it was a string or not? Luckily we can very easily.

When the compiler tries to figure out the best matching function for a given function call, if there is a variable argument function that is a possibility (a function that has … as it’s parameter list), that function will always be least wanted. It’s what you get if nothing else matches. In our situation we’re going to use that kind of a function effectively as an “else” statement. Check it out:

#include 

char stringTester(const char *someString);
int  stringTester(...);

int myNumber = 3;
const char *myString = "hello!";

void main(void)
{
	if (sizeof(stringTester(myNumber)) == sizeof(char))
		printf("myNumber is a stringrn");
	else
		printf("myNumber is not a stringrn");

	if (sizeof(stringTester(myString)) == sizeof(char))
		printf("myString is a stringrn");
	else
		printf("myString is not a stringrn");
}

Pretty neat right? Running that program will tell you that myNumber is not a string, but myString is a string, just like you already know.

Even though we are using if statements above, it’s still a compile time check and will have no runtime costs, and the if statements should be completely eaten away by the optimizer (check the assembly to see for yourself!)

Lets clean up that last bit of code a bit to be a bit more generalized:

#include 

char stringTester(const char *someString);
int  stringTester(...);

#define REPORT_IF_STRING(x) 
	if (sizeof(stringTester(x)) == sizeof(char)) 
		printf(#x " is a stringrn"); 
	else 
		printf(#x " is not a stringrn"); 

int myNumber = 3;
const char *myString = "hello!";

void main(void)
{
	REPORT_IF_STRING(myNumber);
	REPORT_IF_STRING(myString);
}

That gives the exact same output, but as you can see, is a bit more generalized… something you could re-use in code easily.

If you look at the final compiled code (in release), you’ll basically just see 2 printf statements, because the rest of it all happened at compile time.

Working with only testing if a variable is a string or not a string, and printing that result is not exactly something you are likely to need on a daily basis, but there are a lot more useful examples. For instance, what if you wanted to be able to tell whether one object’s type could safely be cast to another object type?

You might be asking “Isn’t that what RTTI and dynamic_cast are for?”. Yep. But what if you could determine this at compile time, so that there was no cost to the dynamic cast, and the code was optimized to the point of not even having an if statement to check the type… it just did the right thing at runtime because it already knew. You’d end up with some pretty fast code!

Here’s how you might do that for a specific class hierarchy:

#include 

class CFruit {};
class CApple : public CFruit {};
class CBanana: public CFruit {};
class CPeach : public CFruit {};

class CVegetable {};
class CCarrot: public CVegetable {};
class CCelery: public CVegetable {};

char FruitTester(const CFruit &);
int  FruitTester(...);

#define REPORT_IS_FRUIT(x) 
	if (sizeof(FruitTester(x)) == sizeof(char)) 
		printf(#x " is a fruitrn"); 
	else 
		printf(#x " is not a fruitrn"); 

CFruit  fruit;
CApple	apple;
CBanana banana;
CPeach  peach;

CVegetable vegetable;
CCarrot    carrot;
CCelery    celery;

void main(void)
{
	REPORT_IS_FRUIT(fruit);
	REPORT_IS_FRUIT(apple);
	REPORT_IS_FRUIT(banana);
	REPORT_IS_FRUIT(peach);
	REPORT_IS_FRUIT(vegetable);
	REPORT_IS_FRUIT(carrot);
	REPORT_IS_FRUIT(celery);
}

Running that program will give you output like the below:

fruittest

Again, if you look at the compiled code in release, you’ll see what is effectively just a series of printf statements, because all the checking of types and inheritance happened at compile time and there was no runtime cost. Check out the disassembly below. It’s just pushing the address of the strings to show onto the stack, and then calling printf to show the string. No if statements, no jumps, nothing at all at runtime other than printing the right string:

disassembly

There is a way to generalize that so that you pass the type in that you want to test against, or other random variations for how it might function in a general case, but I leave that to you to figure out!

Not Quite a Dynamic Cast

Ok so the above is not quite a dynamic cast. This is an upcast and gives you the ability to tell if an object derives from a specific type, but it can’t work in the opposite direction which is what a dynamic cast can do.

What the above is good for is mainly for situations where you don’t know the object type because it came in as a macro parameter, or as a template parameter, but you still want to do specific logic on it if it’s a certain type, or derives from a certain type. That’s where this stuff comes in useful.

There may be a clever way to do a downcast (dynamic cast) using this functionality, but at this point in time, I’m not sure of how you might do that 😛

Other Uses

Looking online I’ve found some really interesting uses of this stuff beyond what I’ve shown you. Some people have mixed in a little other forms of black magic including templates, macros and template argument deduction and have done things as wild as being able to tell if an object has a specifically named function or member variable.

I personally feel like this trick of using sizeof to do compile time type deduction must have a million uses, but that my brain isn’t used to thinking in this way so it’ll probably be a little time before using it becomes more natural (whenever it’s appropriate that is hehe).

Hopefully you guys enjoyed learning about this as much as I did… soon I’ll be writing about another neat trick I came across called SFINAE. Give it a google if you want… it’s another pretty odd thing and I look forward to digging into it to be able to write up another post with some everyday useful examples.

Until next time!

Lists of Macro Lists

Example Code: Code_051113

In the two previous posts I showed some ways to use macro lists that I’ve found useful in the past. A downside to those methods though was that you had to manually expand / use each macro list individually. Whenever you have to do extra steps that are always the same, you have a possible point of failure in your code / development process that doesn’t need to be there (when you forget to do the step that’s always the same).

Wouldn’t it be great if you could make a list of macro lists, such as a list of enums that each had their own enum values, and say “expand all of these into all the enums described, with to and from string conversion functions” and it would do it?

How about describing a bunch of events that can happen in your game, specifying what data fields the events should have, what their types and default values are, as well as whether they are optional and required – and it would build full on classes or structs based on your settings to facilitate strongly typed, compile time protected message and event passing that also supported inheritance and custom dynamic cast functionality?

The example code for this post shows you how to do both of those things.

Boost Preprocessor Library (BOOST_PP)

I had to pull out the “big guns” for making this example work, specifically the boost preprocessor library. Yes it’s true that anything the boost preprocessor library can do, you can also do yourself without the boost preprocessor library. However, the neat thing about using the boost preprocessor library is that there are variations in the behaviors of the preprocessors for different compilers, and boost abstracts that away from you so that you know if you use their functions, they ought to work and behave the same way on other compilers. That’s a bunch of compatibility headaches that you don’t have to worry about.

The boost preprocessor library has a very developer friendly license too. It’s very similar to the zlib / mit license which in effect says “don’t call us and we won’t call you”. You can use it commercially for free, without having to give them any credit for anything, and also you aren’t required under any circumstances to link dynamically to avoid having to make your code open sourced (a problem you commonly hit with some other software licenses). That being said, there are a couple small requirements about keeping the license file in tact when redistributing the source code of their library etc so have a look at the license just to make sure you dot all your i’s and cross all your t’s, but if you work somewhere that is hesitant to use open sourced software for legal reasons (I have worked at my share of those places), the license is literally the same as zlib and i’ll bet your project is already using zlib, or one of the libraries you are using is internally using zlib. So seriously… there ought not be any problems with using boost_pp in your code, unless your legal department is retarded (again, something i’ve experienced in times past LOL).

Best of all, the boost preprocessory library is SUPER EASY to download, integrate into your project and start using. I’ve included a version in the sample code. It’s just a folder with some headers in it. The sample code #include’s some of those header files and it works. No other setup needed. You didn’t even need to build boost or link to it or anything crazy like that. SUPER EASY, don’t be afraid of boost_pp! Boost.org

FWIW I had my doubts about boost_pp too, but a fellow game dev friend of mine turned me onto it and I’m really glad he did. As they say, T-Law is the law.

Main.cpp – What our program is able to do in the end

Here’s the main.cpp file to show the capabilities of our sample code. Note specifically the custom dynamic cast functions, enum to/from string functions, and the CGameEvent objects. The constructors to the CGameEvent objects have required parameters for all required fields, and defaulted, optional parameters for all of the optional fields of the defined message.

#include "GameEvents.h"

int main(int argc, char **argv)
{
	// make some test events
	CGameEventPlayerDamaged damaged(ePlayer1, 5.0f, eDamageTypeElectricity);
	CGameEventPlayerDeath death(ePlayer1, eDeathTypeElectrocution);

	// get base class pointers to them
	CGameEvent *gameEvent1 = &damaged;
	CGameEvent *gameEvent2 = &death;

	// do a dynamic cast on the 1st game event (the damaged event)
	// test1a will be non null, and test1b will be null
	CGameEventPlayerDamaged *test1a = gameEvent1->GetAs();
	CGameEventPlayerDeath *test1b = gameEvent1->GetAs();

	// do a dynamic cast on the 2nd game event (the death event)
	// test2a will be null, test2b will be non null
	CGameEventPlayerDamaged *test2a = gameEvent2->GetAs();
	CGameEventPlayerDeath *test2b = gameEvent2->GetAs();

	// set and get some values using the accessors
	damaged.SetIsFatal(!damaged.GetIsFatal());
	damaged.SetDamageTaken(damaged.DefaultDamageTaken());
	damaged.SetAttackerPlayerID(EPlayerFromString(EPlayerToString(ePlayer2)));
	death.SetDeathType(EDeathTypeFromString("drowned"));

	return 0;
}

GameEvents.h

This is the file where the game events are defined. For each event, you specify the data type, the field name, the default value, and whether the field is required or optional. Note that you must specify all required field before all optional fields. It’s possible that there’s a way to make it so you don’t have to do that, but I couldn’t figure out a way. Note that if you mess that up, you’ll get a compile error.

Optional / required settings for fields are enforced by the constructor in each game event class.

//=====================================================================================
//
//	GameEvents.h
//
//	Define game events here.  They are expanded into full classes for each event.
//
//=====================================================================================

// include our enums so we can use them in our event definitions.
#include "GameEnums.h"

// Note that we could also use complex structures, even other events in our event
// definitions.  Also, this code could be expanded to allow you to define inheritance
// for each event.  So for instance, the PlayerDeath event could inherit the
// PlayerDamaged event if you wanted it to, and it would contain all of the same fields
// that the PlayerDamaged event had.  It might get a bit tricky making the constructor
// though.

#define EventList 
/*===================================================================================*/ 
/*                                   Settings                                        */ 
/*===================================================================================*/ 
(GameEvent) /*Event Prefix*/ 
( 
/*===================================================================================*/ 
/*                                  PlayerDamaged                                    */ 
/*===================================================================================*/ 
( 
	(PlayerDamaged) 
	( 
		((EPlayer)     (VictimPlayerID)   (ePlayerUnknown)    (ELB_REQUIRED)) 
		((float)       (DamageTaken)      (0.0f)              (ELB_REQUIRED)) 
		((EDamageType) (DamageType)       (eDamageTypeNormal) (ELB_OPTIONAL)) 
		((EPlayer)     (AttackerPlayerID) (ePlayerUnknown)    (ELB_OPTIONAL)) 
		((bool)        (IsFatal)          (false)             (ELB_OPTIONAL)) 
	) 
) 
/*===================================================================================*/ 
/*                                  PlayerDeath                                      */ 
/*===================================================================================*/ 
( 
	(PlayerDeath) 
	( 
		((EPlayer)    (VictimPlayerID) (ePlayerUnknown)   (ELB_REQUIRED)) 
		((EDeathType) (DeathType)      (eDeathTypeNormal) (ELB_OPTIONAL)) 
	) 
) 
/*===================================================================================*/ 
) 
/*===================================================================================*/ 

// build our event classes
#include "EventListBuilder.h"

GameEnums.h

Here’s where game enums are defined. Note that it makes an enum per entry here, but it also makes ToString and FromString functions for each enum type.

//=====================================================================================
//
//	GameEnums.h
//
//	Game enums are defined here
//=====================================================================================

#define EnumList 
/*===================================================================================*/ 
/*                                  EDeathType                                       */ 
/*===================================================================================*/ 
( 
	(DeathType) 
	( 
		(Normal) 
		(Incineration) 
		(Electrocution) 
		(Drowned) 
		(Crushed) 
		(Telefrag) 
	) 
) 
/*===================================================================================*/ 
/*                                  EDamageType                                      */ 
/*===================================================================================*/ 
( 
	(DamageType) 
	( 
		(Normal) 
		(Fire) 
		(Electricity) 
		(Ice) 
		(Arcane) 
	) 
) 
/*===================================================================================*/ 
/*                                    EPlayer                                        */ 
/*===================================================================================*/ 
( 
	(Player) 
	( 
		(1) 
		(2) 
		(3) 
		(4) 
	) 
) 
/*===================================================================================*/ 

#include "EnumBuilder.h"

EnumBuilder.h

The simpler of the two files used to generate code from our data, here’s EnumBuilder.h.

Note that I use both BOOST_PP_SEQ_FOR_EACH as well as BOOST_PP_SEQ_FOR_EACH_I. The reason for that is because with BOOST_PP you are unable to do nested for loops. However, if you use two different looping methods, it works just fine. So… i use FOR_EACH for one level of looping, and FOR_EACH_I for another level of looping. I’m not sure if you could do 3 or more levels of looping by switching between which loop method you use or not.

//=====================================================================================
//
//	EnumBuilder.h
//
//  Define "EnumList" and then include this header file to expand it into an enum and
//  also provide ToString and FromString functions.
//
//  Here's an example of what EnumList might look like
//
//	#define EnumList 
//  ( 
//		(FruitType) 
//		( 
//			(Apple) 
//			(Pear) 
//			(Banana) 
//			(Orange) 
//		) 
//	) 
//  ( 
//		(VegieType) 
//		( 
//			(Potato) 
//			(Broccoli) 
//		) 
//  ) 
//
//  The first entry is the name of the enum, and then after that are the enum values.
//  You can define as many enums as you would like I believe.  Boost might have an
//  internal limit, but I'm not sure...
//
//  The above would make two enums like below:
//
//  enum EFruitType
//  {
//		eFruitTypeUnknown = -1,
//
//		eFruitTypeApple,
//		eFruitTypePear,
//		eFruitTypeBanana,
//		eFruitTypeOrange,
//
//		eFruitTypeCount,
//		eFruitTypeFirst = 0
//	};
//
//  enum EVegieType
//  {
//		eVegieTypeUnknown = -1,
//
//		eVegieTypePotato,
//		eVegieTypeBroccoli,
//
//		eVegieTypeCount,
//		eVegieTypeFirst = 0
//  };
//
//  It will also make these functions:
//
//  const char *EFruitTypeToString( EFruitType value );
//  EFruitType EFruitTypeFromString( const char *value );
//
//  const char *EVegieTypeToString( EVegieType value );
//  EVegieType EVegieTypeFromString( const char *value );
//
//=====================================================================================

#include 
#include 
#include 
#include 

//=====================================================================================

// define an enum value
#define ENB_ENUMVALUE(depth, enumName, enumItem) 
	BOOST_PP_CAT(e, BOOST_PP_CAT(enumName, enumItem)), 

// convert something to a string
#define ENB_TOSTRING(x) #x

// ToString() enum case
#define ENB_TOSTRINGCASE(depth, enumName, enumItem) 
	case BOOST_PP_CAT(e, BOOST_PP_CAT(enumName, enumItem)): return ENB_TOSTRING(enumItem);

// FromString() string test
#define ENB_FROMSTRINGTEST(depth, enumName, enumItem) 
	if (!stricmp(ENB_TOSTRING(enumItem), value)) 
		return BOOST_PP_CAT(e, BOOST_PP_CAT(enumName, enumItem));

// define the enum, including an "unknown" value at -1, and also "first" and "count"
// values which are useful for iterating through enum values
#define ENB_MAKEENUM(depth, data, index, enumList) 
	enum BOOST_PP_CAT(E,BOOST_PP_SEQ_ELEM(0, enumList)) 
	{ 
		BOOST_PP_CAT(BOOST_PP_CAT(e,BOOST_PP_SEQ_ELEM(0, enumList)),Unknown) = -1, 
		BOOST_PP_SEQ_FOR_EACH(ENB_ENUMVALUE, BOOST_PP_SEQ_ELEM(0, enumList), BOOST_PP_SEQ_ELEM(1, enumList)) 
		BOOST_PP_CAT(BOOST_PP_CAT(e,BOOST_PP_SEQ_ELEM(0, enumList)),Count), 
		BOOST_PP_CAT(BOOST_PP_CAT(e,BOOST_PP_SEQ_ELEM(0, enumList)),First) = 0, 
	}; 

// make the ToString function
#define ENB_MAKETOSTRING(depth, data, index, enumList) 
	const char *BOOST_PP_CAT(BOOST_PP_CAT(E,BOOST_PP_SEQ_ELEM(0, enumList)),ToString) (BOOST_PP_CAT(E,BOOST_PP_SEQ_ELEM(0, enumList)) value) 
	{ 
		switch(value) 
		{ 
			BOOST_PP_SEQ_FOR_EACH(ENB_TOSTRINGCASE, BOOST_PP_SEQ_ELEM(0, enumList), BOOST_PP_SEQ_ELEM(1, enumList)) 
		} 
		/* error case */ 
		return "Unknown"; 
	}

// make the FromString function
#define ENB_MAKEFROMSTRING(depth, data, index, enumList) 
	BOOST_PP_CAT(E,BOOST_PP_SEQ_ELEM(0, enumList)) BOOST_PP_CAT(BOOST_PP_CAT(E,BOOST_PP_SEQ_ELEM(0, enumList)),FromString) (const char *value) 
	{ 
		BOOST_PP_SEQ_FOR_EACH(ENB_FROMSTRINGTEST, BOOST_PP_SEQ_ELEM(0, enumList), BOOST_PP_SEQ_ELEM(1, enumList)) 
		/* error case */ 
		return BOOST_PP_CAT(BOOST_PP_CAT(e,BOOST_PP_SEQ_ELEM(0, enumList)),Unknown); 
	}

//=====================================================================================

// make all of the enums
BOOST_PP_SEQ_FOR_EACH_I(ENB_MAKEENUM, _, EnumList)

// make the ToString functions
BOOST_PP_SEQ_FOR_EACH_I(ENB_MAKETOSTRING, _, EnumList)

// make the FromString functions
BOOST_PP_SEQ_FOR_EACH_I(ENB_MAKEFROMSTRING, _, EnumList)

//=====================================================================================

// don't pollute other code with our macros
#undef ENB_ENUMVALUE
#undef ENB_TOSTRING
#undef ENB_TOSTRINGCASE
#undef ENB_FROMSTRINGTEST
#undef ENB_MAKEENUM
#undef ENB_MAKETOSTRING
#undef ENB_MAKEFROMSTRING

// this was defined by the caller but clean it up here for convinience
#undef EnumList

EventListBuilder.h

Here’s the code that expands an event list into classes with constructors, accessors and dynamic casting functionality:

//=====================================================================================
//
//	EventListBuilder.h
//
//  Define "EventList" and then include this header file to expand it into classes
//
//  The EventList schema is:
//
//  (ExpansionPrefix)
//  (Events)
//
//  ExpansionPrefix - a prefix added to every event name in the group
//  Events          - the list of events, see the schema below
//
//	The Events schema is:
//
//	(EventName)
//	(
//		((type)(Name)(Default)(Required))
//	)
//
//	EventName - The name of the event. A "C" and the expansion prefix are added
//				to the front when making it into a class.
//
//	Type	  - The data type of the field.  Make sure the data type is "viewable"
//				before including EventListBuilder.h
//
//	Name	  - The name of the data field, used to name member vars and accessors
//
//	Default   - The default value of the field.
//
//	Required  - Specifies whether a value must be provided when creating a new event
//				of this type or not.  If optional fiels are not provided, the default
//				value will be used.  ELB_OPTIONAL / ELB_REQUIRED
//
//  Note: All required parameters must come before all optional parameters
//
//  Example EventList:
//
//  #define EventList 
//		(InputEvent) 
//      ( 
//        ( 
//          (ButtonPress) 
//          ( 
//				((EButtonId) (ButtonId) (eButtonInvalid) (ELB_REQUIRED)) 
//				((bool)      (Pressed)  (true)           (ELB_REQUIRED)) 
//          ) 
//        ) 
//        ( 
//          (AnalogStick) 
//          ( 
//				((float)          (PosX)          (0.0f)              (ELB_REQUIRED)) 
//				((float)          (PosY)          (0.0f)              (ELB_REQUIRED)) 
//				((EAnalogStickId) (AnalogStickId) (eAnalogStickMouse) (ELB_OPTIONAL)) 
//          ) 
//        ) 
//      )
//
//	The above example will make two event classes with getters/setters for members
//  and a constructor on each class which has optional and required parameters as
//  specified in the EventList data.
//
//    CInputEventButtonPress
//      EButtonId ButtonId - the button that generated the event
//      bool Pressed       - whether the button was pressed or released
//
//    CInputEventAnalogStick
//      float PosX                   - X position of the stick
//      float PosY                   - Y position of the stick
//      EAnalogStickId AnalogStickId - optional parameter to specify which stick.
//									   defaults to "mouse"
//
//=====================================================================================

#include 
#include 
#include 
#include 

//=====================================================================================
// Define the base class
// C
// 
// It has a templated GetAs function that works like a dynamic cast and lets you ask
// the event type too.
//=====================================================================================
#define ELB_DEFINEBASECLASS(expansionSettings) 
	class BOOST_PP_CAT(C, expansionSettings) 
	{ 
	public: 
		/* Constructor */ 
		BOOST_PP_CAT(C, expansionSettings) (BOOST_PP_CAT(E,expansionSettings) eventType) 
		{ 
			m_eventType = eventType; 
		} 
		
		/* EventType() */ 
		BOOST_PP_CAT(E,expansionSettings) EventType() const {return m_eventType;} 
		
		/* GetAs() */ 
		template  
		EventClass *GetAs() 
		{ 
			if (m_eventType == EventClass::StaticEventType()) 
				return (EventClass *)this; 
			else 
				return 0; 
		} 
	private: 
		BOOST_PP_CAT(E,expansionSettings) m_eventType; 
	};

//=====================================================================================
// Define the class
// C
//
// It has getters and setters for each data member defined and the construct handles
// initialization of members to defaults and enforces required data fields.
//=====================================================================================
#define ELB_DEFINECLASS(depth, expansionSettings, classData) 
	class BOOST_PP_CAT(C, BOOST_PP_CAT(expansionSettings, BOOST_PP_SEQ_ELEM(0, classData))) 
	: public BOOST_PP_CAT(C, expansionSettings) 
	{ 
	public: 
		/* Constructor */ 
		BOOST_PP_CAT(C, BOOST_PP_CAT(expansionSettings, BOOST_PP_SEQ_ELEM(0, classData))) 
		( 
			BOOST_PP_SEQ_FOR_EACH_I(ELB_CONSTRUCTPARAMS, _, BOOST_PP_SEQ_ELEM(1, classData)) 
			void *dummyParamIgnore = 0 
		) 
		: BOOST_PP_CAT(C, expansionSettings) (BOOST_PP_CAT(e,BOOST_PP_CAT(expansionSettings, BOOST_PP_SEQ_ELEM(0, classData)))) 
		{ 
			BOOST_PP_SEQ_FOR_EACH_I(ELB_INITIALIZEMEMBERVAR, _, BOOST_PP_SEQ_ELEM(1, classData)) 
		} 
		
		/* Accessors - Get, Set, static Default*/ 
		BOOST_PP_SEQ_FOR_EACH_I(ELB_DEFINEMEMBERVARACCESSORS, _, BOOST_PP_SEQ_ELEM(1, classData)) 
		
		/* StaticEventType() */ 
		static BOOST_PP_CAT(E,expansionSettings) StaticEventType() 
		{ 
			return BOOST_PP_CAT(e,BOOST_PP_CAT(expansionSettings, BOOST_PP_SEQ_ELEM(0, classData))); 
		} 
	private: 
		/* Member Variables */ 
		BOOST_PP_SEQ_FOR_EACH_I(ELB_DEFINEMEMBERVAR, _, BOOST_PP_SEQ_ELEM(1, classData)) 
	};

//=====================================================================================

// to make required / optional more apparent in the EventList.
// NOTE: BOOST_PP_IF needs to take a 1 or 0 unfortunately!
#define ELB_REQUIRED 1
#define ELB_OPTIONAL 0

// add member variables to the constructor, optional or required as specified in data
#define ELB_CONSTRUCTPARAMS(depth, data, index, classData) 
	BOOST_PP_SEQ_ELEM(0, classData) BOOST_PP_SEQ_ELEM(1, classData) 
	BOOST_PP_IF(BOOST_PP_SEQ_ELEM(3, classData), , = BOOST_PP_SEQ_ELEM(2, classData)),

// initialize a member variable to the parameter passed into the constructor
#define ELB_INITIALIZEMEMBERVAR(depth, data, index, classData) 
	BOOST_PP_CAT(m_,BOOST_PP_SEQ_ELEM(1, classData)) = BOOST_PP_SEQ_ELEM(1, classData);

// define a single member variable
#define ELB_DEFINEMEMBERVAR(depth, data, idex, classData) 
	BOOST_PP_SEQ_ELEM(0, classData) BOOST_PP_CAT(m_,BOOST_PP_SEQ_ELEM(1, classData));

// define a single member variable accessors
#define ELB_DEFINEMEMBERVARACCESSORS(depth, data, index, classData) 
	BOOST_PP_SEQ_ELEM(0, classData) BOOST_PP_CAT(Get,BOOST_PP_SEQ_ELEM(1, classData))() const { return BOOST_PP_CAT(m_,BOOST_PP_SEQ_ELEM(1, classData)); } 
	void BOOST_PP_CAT(Set,BOOST_PP_SEQ_ELEM(1, classData))(const BOOST_PP_SEQ_ELEM(0, classData) &value) { BOOST_PP_CAT(m_,BOOST_PP_SEQ_ELEM(1, classData)) = value; } 
	static BOOST_PP_SEQ_ELEM(0, classData) BOOST_PP_CAT(Default,BOOST_PP_SEQ_ELEM(1, classData))() { return BOOST_PP_SEQ_ELEM(2, classData); } 

// define an enum value
#define ELB_ENUMVALUE(depth, expansionSettings, classData) 
	BOOST_PP_CAT(e,BOOST_PP_CAT(expansionSettings, BOOST_PP_SEQ_ELEM(0, classData))), 

//=====================================================================================

// make an enum of the event types
enum BOOST_PP_CAT(E,BOOST_PP_SEQ_ELEM(0, EventList))
{
	BOOST_PP_SEQ_FOR_EACH(ELB_ENUMVALUE, BOOST_PP_SEQ_ELEM(0, EventList), BOOST_PP_SEQ_ELEM(1, EventList))
};

// define the base class
ELB_DEFINEBASECLASS(BOOST_PP_SEQ_ELEM(0, EventList))

// build EventList into classes
BOOST_PP_SEQ_FOR_EACH(ELB_DEFINECLASS, BOOST_PP_SEQ_ELEM(0, EventList), BOOST_PP_SEQ_ELEM(1, EventList))

//=====================================================================================

// don't pollute other code with our macros
#undef ELB_DEFINEBASECLASS
#undef ELB_DEFINECLASS
#undef ELB_REQUIRED
#undef ELB_OPTIONAL
#undef ELB_CONSTRUCTPARAMS
#undef ELB_INITIALIZEMEMBERVAR
#undef ELB_DEFINEMEMBERVAR
#undef ELB_DEFINEMEMBERVARACCESSORS
#undef ELB_ENUMVALUE

// this was defined by the caller but clean it up for convinience
#undef EventList

Possible Improvements

  • Imagine if the event classes had binary and/or text serialization for being able to save and load events from disk, or over a network connection. Something like this sample code could be the back bone of your editor, your save game systems, or your network communication protocol. It would be type safe and easily modified. When you changed structures, you’d likely get compile errors or warnings in most of the places you needed to update code to support the changes.
  • The definitions of enums and events are kind of weird in that they are lists of text in parentheses because that’s what a “boost sequence” is (BOOST_PP_SEQ family of functions). Boost also has something called tuples which are comma seperated lists. I think tuples would be more natural for usage in some of the definition cases but i was unable to get that working for whatever reason. If you end up getting that working, post back and let us all know how you did it!
  • There are a lot of “magic numbers” in the macros, such as saying “use sequence field 0 for this” or “use sequence field 2 for this”. It probably would be better (more readable at least) if I defined symbolic constants for various indices. That way you’d see ELB_ENUMNAME instead of “0”, which would make a lot more sense.

That’s All

This post was code heavy and explanation light but hopefully you can understand what’s going on here. If you have any questions about the code, or why I did something one way instead of doing it a different way, post a comment and I’ll get back to you!

Example Code: Code_051113

Macro Lists For The Win – Side B

Example Code: Code_050713.zip

In the previous post I talked about how you can use macro lists to solve the problem of wanting to generate a bunch of code based on the same set of data. This was useful for doing things like defining a list of resources a player could accumulate, and then being able to generate code to store and manipulate each resource type. You only had to update the resource list to add a new resource and the rest of the code would almost magically generate itself.

What if you wanted the reverse though? What if you had a fixed set of code that you want to apply to a bunch of different sets of data? This post is going to show you a way to do that.

In the example code, we are going to make a way to define several lists of items, and expand each list into an enum that also has a ToString and FromString function associated with it.

Another usage case for this technique might be to define lists of data fields, and expand each list into a data structure that contains serialization and deserialization functions. This would allow you to make data structures that could be saved and loaded to disk, or to sent and received over a network connection, just by defining what data fields they contained.

I haven’t yet seen this technique in the wild, and it kind of makes me wonder why since they are just two sides of the same coin.

GameEnums.h

In the last post, our data was always the same and we just applied it to different code. To do this, we had the code in one .h and the data in another .h that would get included multiple times. This allowed us to define different pieces of code in one .h, then include the other .h file to apply the fixed data to each piece of code.

In this post, it’s going to be the exact opposite. Our code will always stay the same and we will apply it to different data so our data will be in one .h and the code will be in another .h that gets included multiple times.

Here’s GameEnums.h:

//////////////////////
//     EDamageType
//////////////////////
#define ENUMNAME DamageType
#define ENUMLIST 
	ENUMENTRY(Normal) 
	ENUMENTRY(Electricity) 
	ENUMENTRY(Fire) 
	ENUMENTRY(BluntForce)

#include "EnumBuilder.h"
//////////////////////
//     EDeathType
//////////////////////
#define ENUMNAME DeathType
#define ENUMLIST 
	ENUMENTRY(Normal) 
	ENUMENTRY(Electrocuted) 
	ENUMENTRY(Incinerated) 
	ENUMENTRY(Smashed)

#include "EnumBuilder.h"
//////////////////////
//     EFruit
//////////////////////
#define ENUMNAME Fruit
#define ENUMLIST 
	ENUMENTRY(Apple) 
	ENUMENTRY(Banana) 
	ENUMENTRY(Orange) 
	ENUMENTRY(Kiwi)

#include "EnumBuilder.h"
//////////////////////
//     EPlayers
//////////////////////
#define ENUMNAME Player
#define ENUMLIST 
	ENUMENTRY(1) 
	ENUMENTRY(2) 
	ENUMENTRY(3) 
	ENUMENTRY(4)

#include "EnumBuilder.h"

EnumBuilder.h

This header file is where the real magic is; it’s responsible for taking the previously defined ENUMNAME and ENUMLIST macros as input, and turning them into an enum and the string functions. Here it is:

#include  // for _stricmp, for the enum Fromstring function

// this EB_COMBINETEXT macro works in visual studio 2010.  No promises anywhere else.
// Check out the boost preprocessor library if this doesn't work for you.
// BOOST_PP_CAT provides the same functionality, but ought to work on all compilers!
#define EB_COMBINETEXT(a, b) EB_COMBINETEXT_INTERNAL(a, b)
#define EB_COMBINETEXT_INTERNAL(a, b) a ## b

// make the enum E
#define ENUMENTRY(EnumValue) EB_COMBINETEXT(e, EB_COMBINETEXT(ENUMNAME, EnumValue)),
enum EB_COMBINETEXT(E,ENUMNAME) {
	EB_COMBINETEXT(EB_COMBINETEXT(e, ENUMNAME), Unknown) = -1,
	ENUMLIST
	EB_COMBINETEXT(EB_COMBINETEXT(e, ENUMNAME), Count),
	EB_COMBINETEXT(EB_COMBINETEXT(e, ENUMNAME), First) = 0,
	EB_COMBINETEXT(EB_COMBINETEXT(e, ENUMNAME), Last) = EB_COMBINETEXT(EB_COMBINETEXT(e, ENUMNAME), Count) - 1
};
#undef ENUMENTRY

// make the EToString function
const char *EB_COMBINETEXT(EB_COMBINETEXT(E,ENUMNAME), ToString)(EB_COMBINETEXT(E,ENUMNAME) value)
{
	switch(value)
	{
		#define ENUMENTRY(EnumValue) 
			case EB_COMBINETEXT(e, EB_COMBINETEXT(ENUMNAME, EnumValue)): 
			return #EnumValue;
		ENUMLIST
		#undef ENUMENTRY
	}
	return "Unknown";
}

// make the EFromString function
EB_COMBINETEXT(E,ENUMNAME) EB_COMBINETEXT(EB_COMBINETEXT(E,ENUMNAME), FromString)(const char *value)
{
	#define ENUMENTRY(EnumValue) 
	if(!_stricmp(value,#EnumValue)) 
		return EB_COMBINETEXT(e, EB_COMBINETEXT(ENUMNAME, EnumValue));
	ENUMLIST
	#undef ENUMENTRY
	return EB_COMBINETEXT(EB_COMBINETEXT(e, ENUMNAME), Unknown);
}

// clean up
#undef EB_COMBINETEXT
#undef EB_COMBINETEXT_INTERNAL

// these were defined by the caller but clean them up for convinience
#undef ENUMNAME
#undef ENUMLIST

Main.cpp

Now, here’s how you can actually use this stuff!

#include "GameEnums.h"

int main(int argc, char* argv[])
{
	EDamageType damageType = eDamageTypeBluntForce;
	EDeathType deathType = EDeathTypeFromString("smashed");
	EFruit fruit = eFruitLast;
	EPlayer player = EPlayerFromString(EPlayerToString(ePlayer1));
	return 0;
}

Combining the files

As a quick aside, in both this and the last post, I separated the code and data files. This is probably how you would normally want to do things because it’ll usually be cleaner, but it isn’t required. Here’s a cool technique I came across today…

Here’s Macro.cpp:

#ifdef MACROHEADER
	// Put your header stuff here
#else
	// Put cpp type stuff here

	// Include the "header"
	#define MACROHEADER
	#include "Macro.cpp"
	#undef MACROHEADER
#endif

If you ever really just want to combine all your code and data into a single file and not muddy up a directory or project with more files, this technique can help you do that. IMO you really ought to just use separate files, but I wanted to share this for when there are exceptions to that rule (as there always seem to be for every rule!)

Being Data Driven

After my last post, a fellow game developer friend of mine pointed out..

I hope that one day we hurl C++ into a raging sea of fire.

I do like this technique, in theory at least, but whenever I feel like it’s the right solution, a voice in the back of my head yells that I’m digging too greedily and too deeply and should step back for a second and consider what design choices have lead me to this point.

And I think that usually that introspection ends up at the intersection of “we’re not data-driven enough but want to be” and “we decided to use C++ for our engine.”

— jsola

He does have a point. For instance, in the case of our resource list from last post, it would be better if you had some “source data”, such as an xml file, listing all the resources a player could have. The game should load that data in on startup and make a dynamic array, etc to handle those resources. When you had game actions that added or subtracted specific resources from a player, the details of which resources got modified, and by how much, should also be specified in data.

When you save or pack your data, or at runtime (as your situation calls for), it can verify that your data is well formed and makes sure that if a data field is meant to specify a resource type, that it actually corresponds to an actual resource type listed in the list of resources.

That is closer to the ideal situation when making a game – especially when making larger games with a lot of people.

But there are still some good usage cases for this kind of macro magic (and template metaprogramming as well). For instance, maybe you use macros to define your data schemas so that your application can be data driven in the first place – I’ve done that on several projects myself and have seen other well respected people do it as well. So, add these things to your toolbox I say, because you never know when you might need them!

Next Post…

The next post will be what i promised at the end of the last one. I’m going to talk about a way to define a list of lists and then be able to expand that list of lists in a single go, instead of having to do a file include for each list individually.

Example Code: Code_050713.zip

Macro Lists For The Win

Around 6 years ago I was introduced to a programming technique that really blew my mind. John, My boss at the time and the tech director at inXile, had written it as part of the code base for the commercial version of a game called Line Rider and I believe he said he first heard about the technique from a friend at Sony.

Since seeing it at inXile, I’ve seen the same technique or variations on it at several places including Monolith and Blizzard, and have had the opportunity to use it on quite a few occasions myself.

What is this technique? I’m not sure if it has an actual name but I refer to it as “Macro Lists” and it’s a good tool towards achieving the DRY principle (don’t repeat yourself – http://en.wikipedia.org/wiki/Don’t_repeat_yourself)

Macro lists are often useful when you find yourself copy / pasting code just to change a couple things, and have multiple places you need to update whenever you need to add a new entry into the mix.

For example, let’s say that you have a class to store how much of each resource that a player has and lets say you start out with two resources – gold and wood.

To implement this, you might write some code like this:

enum EResource
{
	eResourceGold,
	eResourceWood,
};

class CResourceHolder
{
public:
	CResourceHolder()
	{
		m_resources[eResourceGold] = 100.0f;
		m_resources[eResourceWood] = 0.0f;
	}

	float GetGold() const
		{ return m_resources[eResourceGold ]; }
	void SetGold(float amount)
		{ m_resources[eResourceGold ] = amount; }

	float GetWood() const
		{ return m_resources[eResourceWood]; }
	void SetWood(float amount)
		{ m_resources[eResourceWood] = amount; }
private:
	float m_resources[2];
};

That seems pretty reasonable right?

Now let’s say that you (or someone else who doesn’t know the code) wants to add another resource type. What do you need to do to add a new resource?

  1. Add a new enum value to the enum
  2. Initialize the new value in the array to zero
  3. Make a Get and Set function for the resource
  4. Increase the array size of m_resources to hold the new value

If #1 or #3 are forgotten, it will probably be really obvious and it’ll be fixed right away. If #2 or #4 are missed though, you are going to have some bugs that might potentially be very hard to track down because they won’t happen all the time, and they may only happen in release, or only when doing some very specific steps that don’t seem to have anything to do with the resource code.

Kind of a pain right? As the code gets more mature and more features are added, there will likely be other places that need to be updated too that will easily be forgotten. Also, when this sort of stuff comes up, people tend to copy/paste existing patterns and then change what needs to be changed – which can be really dangerous if people forget to change some of the values which need to be changed.

Luckily macro lists can help out here to ensure that it’s IMPOSSIBLE for you, or anyone else, to forget the steps of what to change. Macro lists make it impossible to forget because they do the work for you!

Check out this code to see what I mean. It took me a little bit to wrap my head around how this technique worked when I first saw it, so don’t get discouraged if you have trouble wrapping your head around it as well.

#define RESOURCE_LIST 
	RESOURCE_ENTRY(Gold, 100.0) 
	RESOURCE_ENTRY(Wood, 0)

// make the enum
#define RESOURCE_ENTRY(resource, startingValue) 
	eResource#resource,
enum EResource
{
	eResourceUnknown = -1,
	RESOURCE_LIST
	eResourceCount,
	eResourcefirst = 0
};
#undef RESOURCE_ENTRY

class CResourceHolder
{
public:
	CResourceHolder()
	{
		// initialize to starting values
		#define RESOURCE_ENTRY(resource, startingValue) 
			m_resources[eResource#resource] = startingValue;
		RESOURCE_LIST
		#undef RESOURCE_ENTRY
	}

// make a Get and Set for each resource
#define RESOURCE_ENTRY(resource, startingValue) 
	float Get#resource() const 
	{return m_resources[eResource#resource];} 
	void Set#resource(float amount) 
	{m_resources[eResource#resource] = amount;} 
RESOURCE_LIST
#undef RESOURCE_ENTRY

private:
	// ensure that our array is always the right size
	float m_resources[eResourceCount];
};

In the above code, the steps mentioned before happen automatically. When you want to add a resource, all you have to do is add an entry to the RESOURCE_LIST and it does the rest for you. You can’t forget any of the steps, and as people add new features, they can work with the macro list to make sure people in the future can add resources without having to worry about the details.

Include File Variation

If you used the above technique a lot in your code base, you could imagine that someone might name their macros the same things you named yours which could lead to a naming conflict.

Keeping the “global macro namespace” as clean as possible is a good practice to follow and there’s a variation of the macro list technique that doesn’t pollute the global macro namespace like the above.

Basically, you put your macro list in a header file, and then include that header file every place you would normally put a RESOURCE_LIST.

Here’s the same example broken up that way. First is ResourceList.h:

///////////////////////////////////
//	RESOURCE_ENTRY(ResourceName, StartingValue)
//
//	ResourceName - the name of the resource
//	StartingValue - what to start the resource at
//
RESOURCE_ENTRY(Gold, 100.0)
RESOURCE_ENTRY(Wood, 0)
///////////////////////////////////

And now here is CResourceHolder.h:

///////////////////////////////////
// make the enum
#define RESOURCE_ENTRY(resource, startingValue) 
	eResource#resource,
enum EResource
{
	eResourceUnknown = -1,
	#include "ResourceList.h"
	eResourceCount,
	eResourcefirst = 0
};
#undef RESOURCE_ENTRY

class CResourceHolder
{
public:
	CResourceHolder()
	{
		// initialize to starting values
		#define RESOURCE_ENTRY(resource, startingValue) 
			m_resources[eResource#resource] = startingValue;
		#include "ResourceList.h"
		#undef RESOURCE_ENTRY
	}

// make a Get and Set for each resource
#define RESOURCE_ENTRY(resource, startingValue) 
	float Get#resource() const 
	{return m_resources[eResource#resource];} 
	void Set#resource(float amount) 
	{m_resources[eResource#resource] = amount;}
#include "ResourceList.h"
#undef RESOURCE_ENTRY

private:
	// ensure that our array is always the right size
	float m_resources[eResourceCount];
};

The Downside of Macro Lists

So, while doing the above makes code a lot easier to maintain and less error prone, it comes at a cost.

Most notably is it can be really difficult to figure out what code the macros will expand to, and it can be difficult to alter the functionality of the macros. A way to lessen this problem is that you can tell most compilers to make a file that shows what your code looks like after the preprocessor is done with it. It can still be difficult even with this feature, but it does help a lot.

When you have compiler errors due to macros, because perhaps you forgot a parameter, or it’s the wrong type, the compiler errors can be pretty difficult to understand sometimes.

Another problem with macros is that I don’t know of any debuggers that will let you step through macro code, so in a lot of ways it’s a black box while you are debugging, which sucks if that code malfunctions. If you keep your functionality simple, straightfoward and format it cleanly, you ought not to hit many of these problems though.

Instead of using macro lists, some people prefer to put their data into something like an xml or json data file, and then as a custom build step, use XSLT or the like to convert that data into some code, just like the C++ preprocessor would. The benefit here is that you can see the resulting code and step through it while debugging, but of course the downside is it can be more difficult for someone else to get set up to be able to compile your code.

To Be Continued…

Macro lists are great, but what if you want your lists to have sublists? For instance, what if you wanted to define network messages for your game in a format like this, and have it automatically expand into full fledged classes to be able to ensure that message parsing and data serialization was always done in a consistent way to minimize bugs and maximize efficiency (less code to write and less testing to do)?

As you might have noticed, macro lists can take parameters to help them be flexible (like, the starting value of the resources… you could add more parameters if you wanted to), but, a macro list can’t contain another macro list. At least not how the above implementations work.

I’m going to show you how to tackle this problem in the next post, so stay tuned! (: