Simplifying Boolean Expressions With Karnaugh Maps

Karnaugh maps are a formalized way of turning a truth table into a fairly minimal logical expression.

It’s fairly minimal in that it’s the minimal “sum of products” representation, but that might not be the minimal representation of the logic circuit.

The sum of products representation just means that you OR (sum) multiple terms that are ANDed (product) together. For instance the below is a sum of products expression:

Out = \overline{A}B + AB

With multiplication being AND, addition being OR, and a line over a letter being a NOT, the above could be written in C++ code like this:

bool out = (!A && B) || (A && B);

It would be real easy to write code like that especially if you were adding onto an existing condition, and you might not even notice that it isn’t the minimal Boolean expression to make the desired output.

Using Boolean algebra, you can do the following simplifications:
Out = \overline{A}B + AB\\ = B*(\overline{A}+A)\\ = B*1\\ = B

Which simplifies the C++ code to just this:

bool out = B;

Using Boolean algebra to simplify, you’d have to remember (or derive) the identity that A+\overline{A}=1 , and all the other identities to help you simplify equations.

Karnaugh maps make this easier because you will be able to see visually what can be combined (simplified) and what can’t.

Again though, while they give you the smallest possible sum of products representation of the logic circuit, that may not be the smallest possible representation of the circuit.

Let’s get to it!

Two Variable Karnaugh Map: Basics

Going with the example above, it takes two Boolean variables as input (A and B), and gives one Boolean variable as output. Having two input variables means we need a two variable Karnaugh map.

The first step to building the Karnaugh map is having a truth table for the input to output mappings. For our example we’ll use this truth table. This is one of many truth tables that satisfies our equation, so we are working backwards a bit, but hopefully it still makes sense. Usually you would start with the truth table and get a Boolean equation, not the other way around.
\begin{array}{c|c|c} A & B & Out\\ \hline 0 & 0 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 1 \\ \end{array}

The next thing we do is make our karnaugh map by making a square 2×2 grid where one side of the square is the possible A values, the other side is the possible B values, and the contents of the grid cells are the values we want the formula to come out to be:

\begin{array}{ccc} & \overline{B} & B \\ \overline{A} & 0 & 1 \\ A & 0 & 1 \\ \end{array}

Next, what you do is circle all 1’s that are in groups of a power of two (one item or two items). Doing that, we get this:

\begin{array}{ccc} & \overline{B} & B \\ \cline{3-3} \overline{A} & 0 & \multicolumn{1}{|c|}{1} \\ A & 0 & \multicolumn{1}{|c|}{1} \\ \cline{3-3} \end{array}

Looking at what we circled, we can see that both values of A are involved in our group, so A doesn’t matter to the output. We can also see that \overline{B} is not involved in any places where there is a 1, so we can ignore that too. All that leaves is B, which is our final, and most minimal answer.

That agrees with the Boolean algebra solution, but came without having to remember any identities. Hooray!

Two Variable Karnaugh Map: Overlaps

If there were multiple groups, you would combine each group with OR to get the final answer. Groups are also allowed to overlap! For instance, let’s look at this truth table to start out:

\begin{array}{c|c|c} A & B & Out\\ \hline 0 & 0 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 1 \\ \end{array}

Turning that into a Karnaugh map, we get this:

\begin{array}{ccc} & \overline{B} & B \\ \overline{A} & 0 & 1 \\ A & 1 & 1 \\ \end{array}

Next when it’s time to circle our groups, we have two groups, and they overlap! Here is the first group, which is the same as before:

\begin{array}{ccc} & \overline{B} & B \\ \cline{3-3} \overline{A} & 0 & \multicolumn{1}{|c|}{1} \\ A & 1 & \multicolumn{1}{|c|}{1} \\ \cline{3-3} \end{array}

That group can be expressed as just B.

The other group is this:

\begin{array}{ccc} & \overline{B} & B \\ \overline{A} & 0 & 1 \\ \cline{2-3} A & \multicolumn{1}{|c}{1} & \multicolumn{1}{c|}{1} \\ \cline{2-3} \end{array}

That group can be expressed as just A.

Lastly, we OR our groups together, aka we sum our products, and we get A+B as an answer, which in other words is just A OR B. Check out the truth table and you can see that it is indeed a truth table for OR!

Two Variable Karnaugh Map: Single Sized Groups

What if we don’t have groups of two though, what if we only have groups of one? Let’s explore that real quick with the following truth table:

\begin{array}{c|c|c} A & B & Out\\ \hline 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \\ 1 & 1 & 1 \\ \end{array}

That becomes this Karnaugh map:

\begin{array}{ccc} & \overline{B} & B \\ \overline{A} & 1 & 0 \\ A & 0 & 1 \\ \end{array}

We once again have two groups, but they are each of size one, which is totally ok!

The upper left group is expressed as \overline{AB}, while the lower right group is expressed as AB. We OR those two groups together to get the answer: \overline{AB} + AB. That’s all there is to it.

Four Variable Karnaugh Map: Don’t Care Values

Let’s get a little more sophisticated and see how we would handle four input variables (We could go to three variables next but after learning two and four it’ll be easy to see how to do three). We will start with a truth table, but our truth table will only contain the input values we care about. We’ll omit the ones we don’t care about.

\begin{array}{c|c} ABCD & Out\\ \hline 0001 & 0 \\ 0011 & 1 \\ 0111 & 0 \\ 1111 & 1 \\ \end{array}

We’ll put 1’s and 0’s into the Karnaugh map to match our truth table, but put x’s where the output wasn’t listed. These are “don’t care” values, where they could either be 0’s or 1’s, depending on whichever is more convinient for us when simplifying. We are also going to change how we label the map a bit.

\begin{array}{cccccc} & & \multicolumn{4}{c}{CD} \\ & & 00 & 01 & 11 & 10 \\ & 00 & x & 0 & 1 & x \\ AB & 01 & x & x & 0 & x \\ & 11 & x & x & 1 & x \\ & 10 & x & x & x & x \\ \end{array}

In this case, even with the wild card don’t care values, we still just have two 1 item groups, that we OR together to get the answer:

\overline{AB}CD+ABCD

Note that we could factor out the CD and make it into the below, but then it would no longer be in a sum of products form.

CD(\overline{AB}+AB)

You might also have noticed the strange ordering of the values in the table: 00, 01, 11, 10. Normally doesn’t 2 (10) come before 3 (11)? It does, except in this case, we only want to change one variable at a time between neighboring cells. Going from 01 to 11 means that the first bit changed, while going from 01 to 10 means that two bits changed, so isn’t useful for us finding groups. The order that the numbers are in is actually called Gray Code, named after Frank Grey (Wikipedia: Gray Code).

Four Variable Karnaugh Map: Larger Groups

When dealing with karnaugh maps, like I said before, the groups have to be a size of a power of two, but interestingly it can be a power of two on each axis. So valid groups include 4×1, 2×1, 2×2, 4×2 and others. Let’s take a look at one where we encounter a 2×2 group.

Let’s start with the truth table:

\begin{array}{c|c} ABCD & Out\\ \hline 0000 & 0 \\ 0001 & 0 \\ 0010 & 0 \\ 0011 & 0 \\ 0100 & 0 \\ 0101 & 1 \\ 0110 & 0 \\ 0111 & 1 \\ 1000 & 0 \\ 1001 & 0 \\ 1010 & 0 \\ 1011 & 1 \\ 1100 & 0 \\ 1101 & 1 \\ 1110 & 0 \\ 1111 & 1 \\ \end{array}

That gives us the Karnaugh map:

\begin{array}{cccccc} & & \multicolumn{4}{c}{CD} \\ & & 00 & 01 & 11 & 10 \\ & 00 & 0 & 0 & 0 & 0 \\ AB & 01 & 0 & 1 & 1 & 0 \\ & 11 & 0 & 1 & 1 & 0 \\ & 10 & 0 & 0 & 1 & 0 \\ \end{array}

There are two groups there. The first is the 2×2 group below and is the intersection of where B is 1, and D is 1, so can be represented as BD.

\begin{array}{cccccc} & & \multicolumn{4}{c}{CD} \\ & & 00 & 01 & 11 & 10 \\ & 00 & 0 & 0 & 0 & 0 \\ \cline{4-5} AB & 01 & 0 & \multicolumn{1}{|c}{1} & \multicolumn{1}{c|}{1} & 0 \\ & 11 & 0 & \multicolumn{1}{|c}{1} & \multicolumn{1}{c|}{1} & 0 \\ \cline{4-5} & 10 & 0 & 0 & 1 & 0 \\ \end{array}

The second group is a 1×2 group that overlaps the first, and is where A,C and D are 1, but B can be either 0 or 1. That makes it able to be represented as ACD.

\begin{array}{cccccc} & & \multicolumn{4}{c}{CD} \\ & & 00 & 01 & 11 & 10 \\ & 00 & 0 & 0 & 0 & 0 \\ AB & 01 & 0 & 1 & 1 & 0 \\ \cline{5-5} & 11 & 0 & 1 & \multicolumn{1}{|c|}{1} & 0 \\ & 10 & 0 & 0 & \multicolumn{1}{|c|}{1} & 0 \\ \cline{5-5} \end{array}

We combine those groups with OR to get the answer:

BD+ACD

Four Variable Karnaugh Map: Wrap Around

Interestingly, you can make groups by wrapping around the edges of the Karnaugh map, either horizontally or vertically. Let’s start with a truth table:

\begin{array}{c|c} ABCD & Out\\ \hline 0000 & 0 \\ 0001 & 0 \\ 0010 & 0 \\ 0011 & 1 \\ 0100 & 0 \\ 0101 & 0 \\ 0110 & 0 \\ 0111 & 0 \\ 1000 & 0 \\ 1001 & 0 \\ 1010 & 0 \\ 1011 & 1 \\ 1100 & 0 \\ 1101 & 0 \\ 1110 & 0 \\ 1111 & 0 \\ \end{array}

That gives us the Karnaugh map:

\begin{array}{cccccc} & & \multicolumn{4}{c}{CD} \\ & & 00 & 01 & 11 & 10 \\ & 00 & 0 & 0 & 1 & 0 \\ AB & 01 & 0 & 0 & 0 & 0 \\ & 11 & 0 & 0 & 0 & 0 \\ & 10 & 0 & 0 & 1 & 0 \\ \end{array}

Here is the group highlighted below, which is represented as \overline{B}CD, which is also the answer:

\begin{array}{cccccc} & & \multicolumn{4}{c}{CD} \\ & & 00 & 01 & 11 & 10 \\ & 00 & 0 & 0 & \multicolumn{1}{|c|}{1} & 0 \\ \cline{5-5} AB & 01 & 0 & 0 & 0 & 0 \\ & 11 & 0 & 0 & 0 & 0 \\ \cline{5-5} & 10 & 0 & 0 & \multicolumn{1}{|c|}{1} & 0 \\ \end{array}

Two Variable Karnaugh Map: Handling Redundant Info

If you are like me, you might be wondering – If Karnaugh maps can give you the minimal sum of products expression for a truth table, how does it deal with redundant information or solutions that are of equal size, so it’s ambiguous which to choose?

For instance, Let’s go with the truth table table below. All other inputs not listed are “don’t care” values.

\begin{array}{c|c} AB & Out\\ \hline 00 & 0 \\ 11 & 1 \\ \end{array}

It’s obvious that the output bit corresponds exactly to both A and B separately. Which one does it choose, or does it make some more complex expression that involves both?

Here is the Karnaugh map:

\begin{array}{ccc} & \overline{B} & B \\ \overline{A} & 0 & x \\ A & x & 1 \\ \end{array}

Well, the disambiguation comes up now that you – the pattern finder in the Karnaugh map – chooses between the two possible groups.

One answer, which is perfectly valid is the below, which is just A.

\begin{array}{ccc} & \overline{B} & B \\ \overline{A} & 0 & x \\ \cline{2-3} A & \multicolumn{1}{|c}{x} & \multicolumn{1}{c|}{1} \\ \cline{2-3} \end{array}

The other answer, which is also perfectly valid, is the below, which is just B

\begin{array}{ccc} & \overline{B} & B \\ \cline{3-3} \overline{A} & 0 & \multicolumn{1}{|c|}{x} \\ A & x & \multicolumn{1}{|c|}{1} \\ \cline{3-3} \end{array}

So, the disambiguation / simplification is left up to the user to choose, but yes, it still comes up with a minimal sum of products answer, and doesn’t try to incorporate both bits into a more complex logic operation.

Other Notes

The act of turning a truth table into a logical expression is called logical synthesis, if you want to read more along those lines.

You might be wondering if because there is a sum of products form, if there is also a product of sums form? There is in fact, and you can get that form from Karnaugh maps as well. It may result in a more optimal logical expression. More info on that here: Is Karnaugh Map possible for Maxterms?.

You might be tempted to bring a higher number of variables into the mix. Be warned… adding a 5th variable makes the Karnaugh map into a 4x4x2 3d shape. Adding a 6th variable makes it into a 4x4x4 cube. Adding a 7th variable makes it into a 4x4x4x2 hypercube, and the pattern continues.

For higher numbers of inputs, people will often use a different algorithm instead, that I hope to write a post on before too long. You can read about it here: Wikipedia: Quine–McCluskey algorithm

Lastly, you might be wondering, what do i do if i have M input bits, and N output bits? How can I make a circuit or a set of instructions to generate a minimal logical expression to encompass that?

Well, one simple way is to handle each bit separately and have N Karnaugh maps each having M variables. A problem there though is that computers do operations on multiple bits at the same time with most operations, so having each bit do it’s calculations without considering sharing instructions with another bit leaves some efficiency on the table.

I’m not sure of any better algorithms currently, but I’ve asked on stack exchange so there may be some more info there by the time you read this:
Algorithms for logical synthesis of multiple output bits?

What Happens When you Mix Hash Tables and Binary Searching?

While not the most cache friendly operation, binary searching a sorted data set is a pretty good way to search a data set for a specific value because for N items, you have an average case and worst case of O(log N) (Wikipedia: Binary Search Algorithm).

Hashing is also a decent way to do data searching, because it has a best case for search of O(1) with the average case being able to be near that when tuned well, but in practice due to collisions it can get get up to O(n) (Wikipedia: Hash Table).

Note that if you know the keys in advance, you can ALWAYS get O(1) lookup by using the information from the last post: Minimal Perfect Hashing.

One way to deal with hash collisions is to store all the collisions for a specific hash value in a list or array.

For instance, if you had a 4 bit hash, you could have 16 arrays, where the [0]th array stored all the items that hashed to 0, the [1]th array stored all items that hashed to 1 and so on, going up to the [15]th array which stored all items that hashed to 15.

What would happen if we stored the arrays in sorted order and then did a binary search within the buckets? What would that mean to search times?

Interestingly, assuming a good hash function, the answer is that every bit of hash you use subtracts one from the number of tests you’d have to do with binary searching (See footnote 1).

Examples

For instance, let’s say you had 1024 items sorted in an array, you would have to do up to 10 tests to search for a value in that array using a binary search since log2(1024)=10 (See footnote 2).

If we use a 1 bit hash, assuming a good hash function, that would split the array into two arrays each having 512 items in it. Those each can take up to 9 tests to search using a binary search since log2(512)=9. Doing the hash to choose which of the two lists to search cuts our search from up to 10 tests, down to 9 tests.

If instead we used a 2 bit hash, we would divide the 1024 item list into four lists each having 256 items in them. Each of these lists can be searched with up to 8 tests since log2(256) = 8. So using a 2 bit hash, we can cut our search down to 8 tests max.

Let’s say that we used an 8 bit hash. That would cut our list up into 256 lists, each of which only has 4 items in it. Each list can be searched with up to 2 tests since log2(4) = 2. Using an 8 bit hash cuts our max number of searches from 10 down to 2!

Let’s say we used a 16 bit hash, what would happen then?

Well, that would mean we had 65536 hash buckets, but only 1024 items to store in the buckets. If we had a best case collision free hash, that means only 1024 buckets had an item in them, and the rest were empty.

You’d have to hash the value you were searching for to get the hash bucket index, and if there was an item there, you’d have to test that item. If there was no item there, you could return that the value wasn’t found.

The hash isn’t free though, so this is basically O(1) or doing 1 test.

So, while each bit of hash subtracts one test from the binary search, it of course can’t go negative, or become free, so it basically has a minimum of 1.

Footnotes

  1. Technically it only approximately subtracts one from the number of tests you’d have to do, even if the hash function divides the list as evenly as possible, due to footnote 2 and not being able to split an odd number of items exactly in half.
  2. technically 1024 items in an array could take up to 11 tests! You can see why with 4 items with indices 0,1,2,3. First you’d test index 2. If the value we were looking for was greater, we’d test 3 and then be done with either a found or not found result. That’s just 2 searches total. But, if the value we were looking for was less than index 2, we’d have to test index 1, and then optionally test index 0 depending on how the search value compared to index 1. With only 3 items, indicies 0,1,2, we test index 1, then either index 0 or 2 and be done, and so only have to test twice. A binary search takes log2(N+1) tests, where N is the number of items you are looking through.

Quick Blurb

The last post i wrote on minimal perfect hashing was insanely popular (by my standards) on both reddit and hacker news. The previous highest number of visitors I had ever had to my blog in one day was about 230, but on the 16th I got 4187 visitors, and on the 17th I got 7277 visitors. That means there were over 10,000 visitors over those two days.

That is nuts!!

As a data point for those who might find it interesting, the bulk of the traffic from the first day was reddit, and the bulk of the traffic from the second day was hacker news.

I also found it interesting to see what all those people looked at after the minimal perfect hashing algorithm.

I’ve learned as the years have gone on so some of my older stuff probably contains more errors than my newer stuff (which is also not error free I’m sure).

Anyways, thanks for reading, and I hope you guys find my writings interesting and useful. Please feel free to comment and correct any misinformation, or let us know about better alternatives (:

O(1) Data Lookups With Minimal Perfect Hashing

Hash tables are great in that you can hash a key and then use that hash as an index into an array to get the information associated with that key. That is very fast, so long as you use a fast hash function.

The story doesn’t end there though because hash functions can have collisions – multiple things can hash to the same value even though they are different. This means that in practice you have to do things like store a list of objects per hash value, and when doing a lookup, you need to do a slower full comparison against each item in the bucket to look for a match, instead of being able to only rely on the hash value comparisons.

What if our hash function didn’t have collisions though? What if we could take in N items, hash them, and get 0 to N-1 as output, with no collisions? This post will talk about how to make that very thing happen, with simple sample C++ code as well, believe it or not!

Minimal Perfect Hashing

Perfect hashing is a hash function which has no collisions. You can hash N items, and you get out N different hash values with no collisions. The number of items being hashed has to be smaller than or equal to the possible values your hash can give as output though. For instance if you have a 4 bit hash which gives 16 possible different hash values, and you hash 17 different items, you are going to have at least one collision, it’s unavoidable. But, if you hash 16 items or fewer, it’s possible in that situation that you could have a hash which had no collisions for the items you hashed.

Minimal perfect hashing takes this a step further. Not only are there no collisions, but when you hash N items, you get 0 to N-1 as output.

You might imagine that this is possible – because you could craft limitless numbers of hashing algorithms, and could pass any different salt value to it to change the output values it gives for inputs – but finding the specific hashing algorithm, and the specific salt value(s) to use sounds like a super hard brute force operation.

The method we are going to talk about today is indeed brute force, but it cleverly breaks the problem apart into smaller, easier to solve problems, which is pretty awesome. It’s actually a pretty simple algorithm too.

Here is how you create a minimal perfect hash table:

  1. Hash the items into buckets – there will be collisions at this point.
  2. Sort the buckets from most items to fewest items.
  3. Find a salt value for each bucket such that when all items in that bucket are hashed, they claim only unclaimed slots. Store this array of salt values for later. it will be needed when doing a data lookup.
  4. If a bucket only has one item, instead of searching for a salt value, you can just find an unclaimed index and store -(index+1) into the salt array.

Once you have your minimal perfect hash calculated, here is how you do a lookup:

  1. Hash the key, and use that hash to find what salt to use.
  2. If the salt is positive (or zero), hash the key again using the salt. The result of that hash will be an index into the data table.
  3. If the salt is negative, take the absolute value and subtract one to get the index in the data table.
  4. Since it’s possible the key being looked for isn’t in the table, compare the key with the key stored at that index in the table. If they match, return the data at that index. If they don’t match, the key was not in the table.

Pretty simple isn’t it?

Algorithm Characteristics

This perfect minimal hash algorithm is set up to be slow to generate, but fast to query. This makes it great for situations where the keys are static and unchanging, but you want fast lookup times – like for instance loading a data file for use in a game.

Interestingly though, while you can’t add new keys, you could make it so you could delete keys. You would just remove the key / value pair from the data set, and then when doing a lookup you’d find an empty slot.

Also, there is nothing about this algorithm that says you can’t modify the data associated with the keys, at runtime. Modifying the data associated with a key doesn’t affect where the key/value pair is stored in the table, so you can modify the data all you want.

If you wanted to be able to visit the items in a sorted order, when searching for the perfect minimal hash, you could also make the constraint that when looking for the salt values, that not only did the items in the bucket map to an unclaimed slot, you could make sure they mapped to the correct slot that they should be in to be in sorted order. That would increase the time it took to generate the table, and increase the chances that there was no valid solution for any salt values used, but it is possible if you desire being able to know the items in some sorted order.

Interestingly, the generation time of the minimal perfect hash apparently grows linearly with the number of items it acts on. That makes it scale well. On my own computer for instance, I am able to generate the table for 100,000 items in about 4.5 seconds.

Also, in my implementation, if you have N items, it has the next lower power of two number of salt values. You could decrease the number of salt values used if you wanted to use less memory, but that would again come at the cost of increased time to generate the table, as well as increase the chances that there was no valid solution for any salt values used.

Example Code

Below is a simple implementation of the algorithm described.

The main point of the code (besides functionality) is readability so it isn’t optimized as well as it could be, but still runs very fast (100,000 items processed in about 4.5 seconds on my machine). Debug is quite a bit slower than release for me though – I gave up on those same 100,000 items after a few minutes running in debug.

The code below uses MurmurHash2, but you could drop in whatever hashing algorithm you wanted.

The data file for this code is words.txt and comes to us courtesy of English Wordlists.

#include <vector>
#include <algorithm>
#include <assert.h>
#include <fstream>
#include <string>

unsigned int MurmurHash2(const void * key, int len, unsigned int seed);

//=================================================================================
template <typename VALUE>
class CPerfectHashTable {
public:

    typedef std::pair<std::string, VALUE> TKeyValuePair;
    typedef std::vector<TKeyValuePair> TKeyValueVector;
    struct SKeyValueVectorBucket {
        TKeyValueVector m_vector;
        size_t          m_bucketIndex;
    };
    typedef std::vector<struct SKeyValueVectorBucket> TKeyValueVectorBuckets;

    // Create the perfect hash table for the specified data items
    void Calculate (const TKeyValueVector& data) {

        // ----- Step 1: hash each data item and collect them into buckets.
        m_numItems = data.size();
        m_numBuckets = NumBucketsForSize(m_numItems);
        m_bucketMask = m_numBuckets - 1;
        m_salts.resize(m_numBuckets);
        m_data.resize(m_numItems);
        TKeyValueVectorBuckets buckets;
        buckets.resize(m_numBuckets);

        for (size_t i = 0; i < m_numBuckets; ++i)
            buckets[i].m_bucketIndex = i;

        for (const TKeyValuePair& pair : data) {
            size_t bucket = FirstHash(pair.first.c_str());
            buckets[bucket].m_vector.push_back(pair);
        }

        // ----- Step 2: sort buckets from most items to fewest items
        std::sort(
            buckets.begin(),
            buckets.end(),
            [](const SKeyValueVectorBucket& a, const SKeyValueVectorBucket& b) {
                return a.m_vector.size() > b.m_vector.size();
            }
        );

        // ----- Step 3: find salt values for each bucket such that when all items
        // are hashed with their bucket's salt value, that there are no collisions.
        // Note that we can stop when we hit a zero sized bucket since they are sorted
        // by length descending.
        std::vector<bool> slotsClaimed;
        slotsClaimed.resize(m_numItems);
        for (size_t bucketIndex = 0, bucketCount = buckets.size(); bucketIndex < bucketCount; ++bucketIndex) {
            if (buckets[bucketIndex].m_vector.size() == 0)
                break;
            FindSaltForBucket(buckets[bucketIndex], slotsClaimed);
        }
    }

    // Look up a value by key.  Get a pointer back.  null means not found.
    // You can modify data if you want, but you can't add/remove/modify keys without recalculating.
    VALUE* GetValue (const char* key) {

        // do the first hash lookup and get the salt to use
        size_t bucket = FirstHash(key);
        int salt = m_salts[bucket];

        // if the salt is negative, it's absolute value minus 1 is the index to use.
        size_t dataIndex;
        if (salt < 0)
            dataIndex = (size_t)((salt * -1) - 1);
        // else do the second hash lookup to get where the key value pair should be
        else
            dataIndex = MurmurHash2(key, strlen(key), (unsigned int)salt) % m_data.size();

        // if the keys match, we found it, else it doesn't exist in the table
        if (m_data[dataIndex].first.compare(key) == 0)
            return &m_data[dataIndex].second;
        return nullptr;
    }

private:

    unsigned int FirstHash (const char* key) {
        return MurmurHash2(key, strlen(key), 435) & m_bucketMask;
    }

    void FindSaltForBucket (const SKeyValueVectorBucket& bucket, std::vector<bool>& slotsClaimed) {

        // if the bucket size is 1, instead of looking for a salt, just encode the index to use in the salt.
        // store it as (index+1)*-1 so that we can use index 0 too.
        if (bucket.m_vector.size() == 1) {
            for (size_t i = 0, c = slotsClaimed.size(); i < c; ++i)
            {
                if (!slotsClaimed[i])
                {
                    slotsClaimed[i] = true;
                    m_salts[bucket.m_bucketIndex] = (i + 1)*-1;
                    m_data[i] = bucket.m_vector[0];
                    return;
                }
            }
            // we shouldn't ever get here
            assert(false);
        }

        // find the salt value for the items in this bucket that cause these items to take
        // only unclaimed slots
        for (int salt = 0; ; ++salt) {
            // if salt ever overflows to a negative number, that's a problem.
            assert(salt >= 0);
            std::vector<size_t> slotsThisBucket;
            bool success = std::all_of(
                bucket.m_vector.begin(),
                bucket.m_vector.end(),
                [this, &slotsThisBucket, salt, &slotsClaimed](const TKeyValuePair& keyValuePair) -> bool {
                    const char* key = keyValuePair.first.c_str();
                    unsigned int slotWanted = MurmurHash2(key, strlen(key), (unsigned int)salt) % m_numItems;
                    if (slotsClaimed[slotWanted])
                        return false;
                    if (std::find(slotsThisBucket.begin(), slotsThisBucket.end(), slotWanted) != slotsThisBucket.end())
                        return false;
                    slotsThisBucket.push_back(slotWanted);
                    return true;
                }
            );

            // When we find a salt value that fits our constraints, remember the salt
            // value and also claim all the buckets.
            if (success)
            {
                m_salts[bucket.m_bucketIndex] = salt;
                for (size_t i = 0, c = bucket.m_vector.size(); i < c; ++i)
                {
                    m_data[slotsThisBucket[i]] = bucket.m_vector[i];
                    slotsClaimed[slotsThisBucket[i]] = true;
                }
                return;
            }
        }
    }

    static size_t NumBucketsForSize (size_t size) {
        // returns how many buckets should be used for a specific number of elements.
        // Just uses the power of 2 lower than the size, or 1, whichever is bigger.
        if (!size)
            return 1;

        size_t ret = 1;
        size = size >> 1;
        while (size) {
            ret = ret << 1;
            size = size >> 1;
        }
        return ret;
    }

    // When doing a lookup, a first hash is done to find what salt to use
    // for the second hash.  This table stores the salt values for that second
    // hash.
    std::vector<int> m_salts;

    // NOTE: this stores both the key and the value.  This is to handle the
    // situation where a key is searched for which doesn't exist in the table.
    // That key will still hash to some valid index, so we need to detect that
    // it isn't the right key for that index.  If you are never going to look for
    // nonexistant keys, then you can "throw away the keys" and only store the
    // values.  That can be a nice memory savings.
    std::vector<TKeyValuePair> m_data;

    // useful values
    size_t m_numItems;
    size_t m_numBuckets;
    size_t m_bucketMask;
};

// MurmurHash code was taken from https://sites.google.com/site/murmurhash/
//=================================================================================
// MurmurHash2, by Austin Appleby
 
// Note - This code makes a few assumptions about how your machine behaves -
 
// 1. We can read a 4-byte value from any address without crashing
// 2. sizeof(int) == 4
 
// And it has a few limitations -
 
// 1. It will not work incrementally.
// 2. It will not produce the same results on little-endian and big-endian
//    machines.
 
unsigned int MurmurHash2 ( const void * key, int len, unsigned int seed )
{
    // 'm' and 'r' are mixing constants generated offline.
    // They're not really 'magic', they just happen to work well.
 
    const unsigned int m = 0x5bd1e995;
    const int r = 24;
 
    // Initialize the hash to a 'random' value
 
    unsigned int h = seed ^ len;
 
    // Mix 4 bytes at a time into the hash
 
    const unsigned char * data = (const unsigned char *)key;
 
    while(len >= 4)
    {
        unsigned int k = *(unsigned int *)data;
 
        k *= m; 
        k ^= k >> r; 
        k *= m; 
         
        h *= m; 
        h ^= k;
 
        data += 4;
        len -= 4;
    }
     
    // Handle the last few bytes of the input array
 
    switch(len)
    {
    case 3: h ^= data[2] << 16;
    case 2: h ^= data[1] << 8;
    case 1: h ^= data[0];
            h *= m;
    };
 
    // Do a few final mixes of the hash to ensure the last few
    // bytes are well-incorporated.
 
    h ^= h >> 13;
    h *= m;
    h ^= h >> 15;
 
    return h;
}


//=================================================================================
void WaitForEnter ()
{
    printf("Press Enter to quit");
    fflush(stdin);
    getchar();
}

//=================================================================================
int main (int argc, char **argv)
{
    // Read the data entries from a file.  Use the line as the key, and the line
    // number as the data.  Limit it to 100,000 entries.
    printf("Loading Data...");
    CPerfectHashTable<int> table;
    decltype(table)::TKeyValueVector data;
    std::ifstream file("words.txt");
    std::string str;
    int i = 0;
    while (std::getline(file, str) && i < 100000)
    {
        data.push_back(std::make_pair(str, i));
        ++i;
    }
    printf("Donenn");

    // make the perfect hash table
    printf("Generating Minimal Perfect Hash Table...");
    table.Calculate(data);
    printf("Donenn");

    // Verify results
    printf("Verifying results...");
    for (decltype(table)::TKeyValuePair& keyValuePair : data) {
        int* value = table.GetValue(keyValuePair.first.c_str());
        assert(value != nullptr);
        if (value == nullptr)
        {
            printf("Error, could not find data for key "%s"n", keyValuePair.first.c_str());
            WaitForEnter();
            return 0;
        }
        assert(*value == keyValuePair.second);
        if (*value != keyValuePair.second)
        {
            printf("  table["%s"] = %in", keyValuePair.first.c_str(), *value);
            printf("Incorrect value detected, should have gotten %i!n", keyValuePair.second);
            WaitForEnter();
            return 0;
        }
    }
    printf("Donenn");

    WaitForEnter();
    return 0;
}

Links & More

I learned the details of this algorithm from this page: Steve Hanov’s Blog: Throw away the keys: Easy, Minimal Perfect Hashing, after hearing about the technique mentioned occasionally by colleagues.

There are other ways to do minimal perfect hashing however. For instance give this a read: Minimal Perfect Hashing

One place that method is better than this one, is that in this one, when doing a lookup you have to hash the key twice. In the technique described by the last technique, you only have to hash the key once, and use that hash to combine the results of two lookups. The two lookups are not dependant so can be re-ordered or happen concurrently, which makes it faster on modern CPUs.

Apparently there are also some techniques for generating the minimal perfect hash on a large number of items by breaking them apart into smaller sets, which can then be paralelized across threads.

I also wanted to mention that a large part of the in memory representation of this data structure can come from storing the keys along with the data, to verify that you have indeed found the item you are looking for after doing the O(1) lookup. If you are in a situation where you are only ever going to be searching for keys that you know are in the table, you can actually forgo storing the keys, to bring the memory requirements down a significant amount.

Also, The example code implements this as a hash table, but you could also use this as a set, if you wanted fast membership tests. You could either store the keys, to be able to resolve when things were not part of the set, or, you could make a hash table of string to bool, where the bool specified whether something was in the set. Which one is better depends on your specific needs and whether you plan to search for unknown keys or not.

Lastly, as a byproduct of using a data structure like this, you can get a unique ID per key, which is the index that it appears in the data array. This can be super helpful if you have something like a list of filenames, where you would rather work with integers instead of specific file names, since integers are quicker and easier to compare, copy around, etc.

You could even make this data structure support lookups by either key or by unique id. This way, you could do a by key lookup the first time you needed to find something, and then could store off the ID to get it by ID from then on for a faster lookup. Heck, you could even do all your lookups “offline” (at tool time, not when the app is running) and for instance convert all file references in data files to be the each file’s unique ID instead. Then, you could toss out the keys of your data structure, only storing an array of data, and using the unique file ID to look into that table whenever you wanted to read or write meta data associated with that file. That would make lookups faster, and also decrease the memory requirements of memory resident data files.

It’s pretty cool stuff, and a pretty useful algorithm. Hopefully you find it useful as well (:

Update – 12/22/15

Interestingly, this post has had something like 15,000 readers since it was posted. That is by far the most read post on this blog 😛

Anyways, I wanted to add some more info I’ve found recently.

Here are three tools for doing minimal perfect hashing that are very likely to give you better results than the algorithm I describe above:

Here’s a conversation talking about gperf and the alternative applications, and pros and cons for each:
Stack Overflow: Making a perfect hash (all consecutive buckets full), gperf or alternatives?

Here’s a research paper on gperf by Doug Schmidt: GPERF – A Perfect Hash Function Generator

I had a thought that maybe there was some play here by using “logical synthesis” to come up with some algorithm to map the inputs (the keys of the hash table) to the outputs (collision free output indices).

I started looking into Karnaugh maps and then the Quine–McCluskey algorithm, and then espresso and espresso-exact (mincov). Where the first two things are decent at solving multi bit input to single bit output, the second two things are decent at solving multi bit input to multi bit output, allowing operations to be shared among bits.

While I haven’t found anyone using those specific algorithms to solve the problem, people have, and definitely are still, trying to also look into the ability to generate code without lookups. From what I’ve read so far, it sounds like finding such a function takes a lot longer to find and also that it runs more slowly in practice than a less perfect solution which has lookups.

Either way, this is still an active area of research, and plenty of folks are working on it so I’m going to leave it to them.

I also sort of have the feeling that if you are in need of minimal perfect hashing, you may be “doing it wrong”. For instance, if you are at all able to, you probably are likely to be better off having a pre-agreed on set of unique IDs per “thing” you want to look up. These unique IDs can be used directly as array indices for the magical always O(1) lookup that minimal perfect hashing is going for, and is actually a quicker lookup in practice since you don’t need to jump through special hoops to calculate the array index.

The only exceptions I can think of are:

  1. Real world requirements and not being able to reach the ideal situation – this is the boat I’m in right now. Ideally, systems would be asking about things by an integer ID, but I can’t get there in a single step, so the perfect hashing is a temporary bridge til I can get there.
  2. Even with IDs, sparse arrays are still problematic. You may have an ID per file that could be asked about, but say that you have 1,000,000 files, but you want to make an array of data for only 10,000 of them. How do you take the unique ID of a file and do a lookup into that smaller table? Minimal perfect hashing seems to be useful in this situation, but there may be other decent or comparable alternatives if you are already using integer IDs.