Quantum Computing For Programmers Part Ib: Bloch Sphere

If you read anything on quantum computing you are extremely likely to see the Bloch sphere, so it’s probably important to explain what it is and how it works.

Image from wikipedia: Wikipedia: Qubit

The Bloch sphere is a way of visually representing a qubit.

The north pole is the |0\rangle state, and the south pole is the |1\rangle state. The Z axis is vertical and runs from the |0\rangle to the |1\rangle state. A qubit’s state is represented by a point on the sphere.

If you read the last post, you might remember that there was a Pauli-Z gate which changed the phase of a qubit without changing it’s probability, as well as a more generalized version of the phase changing gate which also just rotated around the Z axis. Looking at the bloch sphere, you can see why that is true! Rotating a point around the Z axis moves it around the sphere horizontally, but it doesn’t make it any closer or farther to either the north or south pole. Since the distance to the poles is preserved, the probability of being one or the other stays the same, even though the phase changes.

The not gate, aka the Pauli-X gate, rotated a point around the X axis 180 degrees. Looking at the sphere, you could see how that would change a |1\rangle to a |0\rangle, or otherwise flip the probabilities & amplitudes of the states.

The Pauli-Y gate is a little more complex though (pun intended!) as it mapped |0\rangle to i|1\rangle and |1\rangle to -i|0\rangle by rotating 180 degrees around the Y axis. You may wonder how |1\rangle and i|1\rangle can both refer to the south pole, but the situation there is that every point EXCEPT the poles have a unique representation. So, they really do both refer to the south pole.

To find where a qubit sits on the sphere, the first step is to figure out the spherical coordinates theta (\theta) and phi (\phi) values, which you can then convert to a 3d point.

Start with whatever qubit you have in this form, where you know what the values for alpha (\alpha) and beta (\beta) are:
|\psi\rangle = \alpha|0\rangle + \beta|1\rangle

You can also write a qubit as:
|\psi\rangle = \cos\left(\tfrac{\theta}{2}\right) |0 \rangle \, + \, e^{i \phi}  \sin\left(\tfrac{\theta}{2}\right) |1 \rangle

Or like this, with less euler, and more trig:
|\psi\rangle = \cos\left(\tfrac{\theta}{2}\right) |0 \rangle \, +  \, ( \cos \phi + i \sin \phi) \, \sin\left(\tfrac{\theta}{2}\right) |1 \rangle

You can then set your known alpha and beta values to the trig (or euler) based equations:
\alpha = \cos\left(\tfrac{\theta}{2}\right)
\beta = \, ( \cos \phi + i \sin \phi) \, \sin\left(\tfrac{\theta}{2}\right)

and then solve for theta and phi (I wave my hands a bit here).

Once you have the theta and phi values figured out, you can convert that to a unit distance 3d point representing a point on the sphere by using the below, which describes the X,Y,Z components of the point:
(\sin \theta \cos \phi, \; \sin \theta  \sin \phi, \; \cos \theta)

Hopefully this explanation makes enough sense that if you encounter the bloch sphere, or things talking about rotations on the bloch sphere, you won’t be too lost 😛

Ok… time to move on to multiple qubit quantum circuits so we can do more interesting things and perhaps analyze the most well known quantum algorithms!

Quantum Computing For Programmers Part I: One Qubit

Are you a programmer? Do you have interest in learning how quantum computing works? Does hard core math and crazy abstract “philosophical” type physics questions about the nature of reality make you feel like you could never understand quantum computing?

If so, me too, welcome to the club! We are programmers, not mathematicians or physicists, but we are masters of our realm – computation and logic. With that, I say that quantum computing is ultimately meant for us, since it is computing and programming, so lets figure it out and do some interesting stuff 😛

Before we start, I want to mention that these posts should be able to be read stand alone, but if you want to get into the deeper details, you probably want to give this list of references I’m using a read at some point. Perhaps before reading my posts, or after, or both!
Quantum Computing References

Lastly, to be clear, quantum computers seem like they are a long way off from being common place objects. These posts will show how quantum computers work, and allow you to simulate quantum computers on your own computer with some simple C++. The code won’t run as fast as it would on a real quantum computer (obviously!), but it should let you better understand how Quantum Computing (QC) works and let you explore and experiment on your own machine.

Probabilities (aka SuperPositons)

At the core of QC is the qubit. A qubit is like a regular bit, in that it can have the value of 0 or 1, but unlike a regular bit, it can also be somewhere in the middle where it has a probability of being in each state.

That might sound strange, but think of a qubit like a coin. It can be heads up, or it can be heads down, or it can be in the air, waiting to land heads up or heads down.

When the coin is in the air, if it is perfectly balanced, it has a 50/50 chance of becoming heads or tails.

If the coin is not perfectly balanced (no real coin can be perfectly balanced!), it will have some bias towards heads or towards tails so will not be a pure 50/50 chance of heads or tails.

While the coin is on the ground, it is either a heads (0) or a tails (1), but while it’s in the air, you can imagine that it is a superposition of 0 and 1. When it lands, that superposition will collapse into a 0 or 1 value, which is also what happens when you “observe” or measure a qubit.

One interesting thing about quantum computing, and where part of it’s power comes from is that you can perform logic on superpositional values in a way that when you have your final superpositional result, you can measure it and get the result from your logic circuit as if the values had been fixed all the way through the system, instead of being in superpositions.

The bummer about this though is that since it’s all based on probabilities, your answer has a chance of being not the value you wanted to get. Quantum circuits / quantum algorithms will try to combat this by doing operations iteratively to make the probability of the desired answer go up, and the probability of the undesired answers go down.

They can also run the quantum algorithm multiple times, to take multiple samples and be more sure about their results.

You still with me? Pretty simple so far right?

Before moving on I also wanted to mention that regular computers – like the one you are using right now – also have a greater than zero chance that they will give you the wrong answer to a calculation. The error rate can increase when the components are too hot, but even otherwise, a stray particle of background radiation could hit your cpu and flip a bit, giving you the wrong answer. You could fight this by doing every calculation 3 times and going with majority rules, but even then, if two bits get flipped, the majority will be wrong. Nothing you can do can completely eliminate the random chance that your calculations will come up with the wrong answer, but the chances are low enough that we rely on computers to do the right things, even in the most critical of situations (like, not starting a thermonuclear war).

With quantum computers you could get that same level of assurance. You can never completely eliminate the chance that a calculation is wrong, but you can definitely be sure with as much accuracy as you can with a standard computer, so that shouldn’t really be an issue in practice, for the cases where you need “certainty”.

Probability Vectors (aka Amplitude Vectors)

Using the coin flip scenario above, we could describe the probability of a coin landing heads or tails as a vector where the first element is the chance that it will land heads, and the second element is the chance that it will land tails.

A perfectly balanced coin would look like this for instance: [0.5,0.5]

If the coin was 75% likely to land heads, and only 25% likely to land tails the vector would look like this: [0.75,0.25]

You might notice that the elements of the vector have to add up to 1. This makes sense because there are only two options and one of the two MUST happen (unless it lands perfectly on edge), so they better add up to 1! The technical term for this is that the vector must have an L1 Norm of 1.

Qubits describe their probabilities a little bit differently though. Instead of storing the probability of them being 0 or 1, they store what’s called the amplitude, which when squared gives the probability. Doing this lets them do something really important that I’ll talk about lower down, but for now, you can hopefully just accept that this is how it works.

Consider a qubit having a 50% chance that it’s a 0, and a 50% chance that it’s a 1. We need some number that when squared gives 0.5 aka 1/2. That value is 1/\sqrt{2}.

So, for a qubit that has an even chance of being a 0 or 1, a vector to represent it is [1/\sqrt{2}, 1/\sqrt{2}].

Then, a vector describing a qubit that has a 75% chance being a 0 and a 25% chance of being a 1 would be [\sqrt{3}/2, 1/2]. If you square the first number you can see that you get 3/4 (75% chance) and squaring the second number gives you 1/4 (25% chance).

I got those vectors by solving the simple equation:
x^2 = P

Where P is the desired probability from 0 to 1.

You might now notice that the length of a qubit amplitude vector has to be 1. In other words, it’s a normalized vector. The technical term for this is that the vector must have an L2 norm of 1.

Still nothing too crazy going on…

Ket Notation

There is some notation you’ll come across a lot when reading about quantum computing / mechanics / etc that looks like this:
1/\sqrt{2}(|0\rangle + |1\rangle)

That is “Ket Notation” and comes from “Bra-Ket Notation”. It’s bark is worse than it’s bite.

If you multiply it out, you get this:
1/\sqrt{2} * |0\rangle + 1/\sqrt{2} * |1\rangle

All that means is that the “0” state of the vector has an amplitude of 1/\sqrt{2} and so does the “1” state. Writing that as a vector, we get what we already saw above for the qubit which had 50/50 odds of being a 0 or a 1:
[1/\sqrt{2}, 1/\sqrt{2}]

Here is the ket notation for the qubit which had a 75% chance of being a 0 and a 25% chance of being a 1:
1/2(\sqrt{3}|0\rangle + |1\rangle)

You can multiply that out to get this:
\sqrt{3}/2*|0\rangle + 1/2*|1\rangle

Which translates to the vector we saw before:
[\sqrt{3}/2, 1/2]

Lastly, you might see it shown like this:
|0\rangle

All that means is that the “0” state is 1. Since the “1” state isn’t listed, it has a value of 0. That is the same as this vector:
[1, 0]

Ket notation is used because you only have to list the states that actually have values, and not the ones that have zeros, making a more compact representation. That will be more useful when we move into a larger number of qubits.

Note that the mapping of which state belongs to which spot in the vector is completely arbitrary. I chose to make the left value be the “0” state and the right value be the “1” state, but it really doesn’t matter how you define it, so long as you are consistent.

Phase

Phase is the reason that qubits store amplitude, which square to probabilities, instead of storing probabilities themselves.

I said that the amplitude vector for a qubit which has a 50/50 chance of being a 0 or a 1 was this:
[1/\sqrt{2}, 1/\sqrt{2}]

I lied a bit though, and that is only one possible answer. Here’s another that has a 50/50 chance:
[-1/\sqrt{2}, 1/\sqrt{2}]

This gets interesting when you add two vectors together. First lets add two amplitude vectors which have the same phases for each state:
[1/\sqrt{2}, 1/\sqrt{2}] + [1/\sqrt{2}, 1/\sqrt{2}] = [\sqrt{2}, \sqrt{2}]

Each component added together.

Now, let’s add two amplitude vectors where the |1\rangle state has opposite phase in one of the vectors:
[1/\sqrt{2}, 1/\sqrt{2}] + [1/\sqrt{2}, -1/\sqrt{2}] = [\sqrt{2}, 0]

You can see that the |0\rangle states added together while the |1\rangle states subtracted and resulted in zero.

This is called destructive interference, and is one of the things that makes quantum computing powerful. We can change a qubit’s phase without affecting it’s value (or probability of values).

I left out another important amplitude vector for describing a qubit that has a 50/50 chance of being a 0 or a 1:
[1/\sqrt{2}, 1/\sqrt{-2}]

Which can also be written as:
[1/\sqrt{2}, 1/(\sqrt{2}*i)]

Yep, that is an imaginary number in the 1 state!

Since amplitude squared is probability, it means we can also use imaginary numbers for amplitude vector values. If you know that the term phase relates to angles and rotations, you might want to have a look at these two things to follow the rabbit hole a bit deeper. Totally optional (;
Using Imaginary Numbers To Rotate 2D Vectors
Wikipedia: Bloch Sphere

However, since imaginary numbers can be involved, squaring amplitudes to get probabilities isn’t enough. Like for example with the vector [0,i], if you square the |1\rangle state to get the probability, you get -1, or -100%. That isn’t right!

The real operation to getting probability from amplitude is you multiply the amplitude by it’s complex conjugate. In other words, if you have a complex number, you multiply the imaginary part by -1 to get the conjugate, and multiply by that.

As an example, if you have 3+5i, the complex conjugate is 3-5i.

In the example of the vector [0,i], you get the probability of the |1\rangle state by multiplying i by -i to get 1, or 100%.

Complex conjugate sounds scary, but hopefully you can see it isn’t as scary as it sounds. You just flip the sign of the imaginary component of the complex number.

Common One Qubit Logic Gates

It’s finally time to dive into some logic gates so we can actually do some quantum programming!

Real quickly I want to mention that we are going to be simulating quantum computing by multiplying the qubit amplitude vectors by matrices. The matrices themselves are unitary matrices, which preserves the length of the vectors (keeping them normalized).

One requirement of quantum gates is that they must be reversible, and unitary matrices are also reversible.

The matrices are square matrices and their dimensions will always be 2^N by 2^N where N is the number of qubits involved in the quantum circuit. With a single qubit, that means we will be working with 2×2 matrices which is fairly small.

When working with more qubits the size of the matrix grows very quickly though. Quantum computing is very fast at doing these “matrix multiplies” and is part of where their power comes from. Large matrix multiplies will be slow for us on regular (classical) computers, but the quantum computers would be able to do them very quickly. That’s where the difference is in our simulations versus the real deal.

Anyways, let’s get onto some specific logic gates! (These are from Wikipedia: Quantum Gate)

Not Gate (Pauli-X Gate)

This is a not gate, but is also described as a 180 degree rotation around the x axis of the bloch sphere.

\begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix}

You might notice that it looks like a backwards identity matrix. That is exactly how it works too… if you multiply a 2 element vector by that matrix, it will flip the elements in that vector!

In the case of our qubit amplitude vectors, this has the result of swapping the probability that the qubit will be 0, with the probability that the qubit will be 1.

If the qubit is not in a superposition, and instead is actually just a 0 or a 1, it will act exactly like a traditional NOT gate and flip it from one value to the other.

If the qubit is in a superposition, it will just flip the probabilities of each state.

This maps the |0\rangle state to |1\rangle and vice versa.

Pauli-Y Gate

This is described as a 180 degree rotation around the y axis of the bloch sphere, and maps |0\rangle to i|1\rangle and |1\rangle to -i|0\rangle.

\begin{bmatrix} 0 & -i \\ i & 0 \\ \end{bmatrix}

It’s like a NOT but also adjusts phase.

Pauli-Z Gate

This is described as a 180 degree rotation around the z axis of the bloch sphere. All it does in practice is negate the phase of the |1\rangle state.

\begin{bmatrix} 1 & 0 \\ 0 & -1 \\ \end{bmatrix}

General Phase Adjustment Gates

You can adjust the phase of individual gates by using a matrix like this one, which is a more general version of the Pauli-Z gate, and adjusts the phase of of the |1\rangle state to whatever value is desired, without affecting the probabilities of the qubit values.

\begin{bmatrix} 1 & 0 \\ 0 & e^{i\theta} \\ \end{bmatrix}

The value e^{i\theta} ends up being a complex number, that can also be written like this:
cos(\theta)+i sin(\theta)

where \theta is any angle in radians.

You’ll find that circuits commonly only adjust the |1\rangle state. This is because the absolute phase of the individual states don’t matter (since they don’t affect probabilities), but what does matter is the difference in phase between the |0\rangle state and the |1\rangle state, since that can cause constructive or destructive interference. For this reason, you’ll often see the |0\rangle state without a phase value, and only the |1\rangle state will get it’s phase modified during calculations.

Hadamard Gate (Interesting Gate!)

Ok so the NOT gate is somewhat useful. You are probably thinking the others might come in handy if you need to adjust the phase of a qubit, but so far, nothing has been that spectacular for single qubit gates.

Well, here’s the more interesting gate.

What this gate does is if you pass it a pure 0 or 1, it will output a value that has a 50% chance of being a 0 or a 1.

The interesting part happens when you put that output through a Hadamard gate again – you get the original value of 0 or 1 out!

It can do this because it stores the info about the original value as phase information. When the input value is |0\rangle, the Hadamard gate outputs [1/\sqrt{2}, 1/\sqrt{2}] which has matching phases for each state. When the input value is |1\rangle, the gate outputs [1/\sqrt{2}, -1/\sqrt{2}] which has opposite phases for each state.

1/\sqrt{2} \begin{bmatrix} 1 & 1 \\ 1 & -1 \\ \end{bmatrix}

This gate is a crucial part in both Grover’s algorithm – which can search an unsorted list in O(\sqrt{N}), as well as in Shor’s algorithm, which can factor large numbers much faster than classical computers, putting certain types of cryptography at risk for being cracked by quantum computers.

It has some other uses too, which you can check out here:
Does a Hadamard Gate have uses outside of pure and evenly mixed states?

Code

Here is some example code to show some single qubit circuits in action.

#include <stdio.h>
#include <array>
#include <complex>

typedef std::array<std::complex<float>, 2> TQubit;
typedef std::array<std::complex<float>, 4> TComplexMatrix;
const float c_pi = 3.14159265359f;

//=================================================================================
static const TQubit c_qubit0 = { 1.0f, 0.0f };  // false aka |0>
static const TQubit c_qubit1 = { 0.0f, 1.0f };  // true aka |1>
static const TQubit c_qubit01_0deg = { 1.0f / std::sqrt(2.0f), 1.0f / std::sqrt(2.0f) }; // 50% true. 0 degree phase
static const TQubit c_qubit01_180deg = { 1.0f / std::sqrt(2.0f), -1.0f / std::sqrt(2.0f) }; // 50% true. 180 degree phase

// A not gate AKA Pauli-X gate
// Flips false and true probabilities (amplitudes)
// Maps |0> to |1>, and |1> to |0>
// Rotates PI radians around the x axis of the Bloch Sphere
static const TComplexMatrix c_notGate =
{
    {
        0.0f, 1.0f,
        1.0f, 0.0f
    }
};
static const TComplexMatrix c_pauliXGate = c_notGate;

// Pauli-Y gate
// Maps |0> to i|1>, and |1> to -i|0>
// Rotates PI radians around the y axis of the Bloch Sphere
static const TComplexMatrix c_pauliYGate =
{
    {
        { 0.0f, 0.0f }, { 0.0f, -1.0f },
        { 0.0f, 1.0f }, { 0.0f, 0.0f }
    }
};

// Pauli-Z gate
// Negates the phase of the |1> state
// Rotates PI radians around the z axis of the Bloch Sphere
static const TComplexMatrix c_pauliZGate =
{
    {
        1.0f, 0.0f,
        0.0f, -1.0f
    }
};

// Hadamard gate
// Takes a pure |0> or |1> state and makes a 50/50 superposition between |0> and |1>.
// Put a 50/50 superposition through and get the pure |0> or |1> back.
// Encodes the origional value in the phase information as either matching or
// mismatching phase.
static const TComplexMatrix c_hadamardGate =
{
    {
        1.0f / std::sqrt(2.0f), 1.0f / std::sqrt(2.0f),
        1.0f / std::sqrt(2.0f), 1.0f / -std::sqrt(2.0f)
    }
};

//=================================================================================
void WaitForEnter ()
{
    printf("nPress Enter to quit");
    fflush(stdin);
    getchar();
}

//=================================================================================
TQubit ApplyGate (const TQubit& qubit, const TComplexMatrix& gate)
{
    // multiply qubit amplitude vector by unitary gate matrix
    return 
    {
        qubit[0] * gate[0] + qubit[1] * gate[1],
        qubit[0] * gate[2] + qubit[1] * gate[3]
    };
}

//=================================================================================
int ProbabilityOfBeingTrue (const TQubit& qubit)
{
    float prob = std::round((qubit[1] * std::conj(qubit[1])).real() * 100.0f);
    return int(prob);
}

//=================================================================================
TComplexMatrix MakePhaseAdjustmentGate (float radians)
{
    // This makes a gate like this:
    //
    // [ 1  0             ]
    // [ 0  e^(i*radians) ]
    //
    // The gate will adjust the phase of the |1> state by the specified amount.
    // A more general version of the pauli-z gate

    return
    {
        {
            1.0f, 0.0f,
            0.0f, std::exp(std::complex<float>(0.0f,1.0f) * radians)
        }
    };
}

//=================================================================================
void Print (const TQubit& qubit)
{
    printf("[(%0.2f, %0.2fi), (%0.2f, %0.2fi)] %i%% true",
        qubit[0].real(), qubit[0].imag(),
        qubit[1].real(), qubit[1].imag(),
        ProbabilityOfBeingTrue(qubit));
}

//=================================================================================
int main (int argc, char **argv)
{
    // Not Gate
    {
        printf("Not gate:n  ");

        // Qubit: false
        TQubit v = c_qubit0;
        Print(v);
        printf("n  ! = ");
        v = ApplyGate(v, c_notGate);
        Print(v);
        printf("nn  ");

        // Qubit: true
        v = c_qubit1;
        Print(v);
        printf("n  ! = ");
        v = ApplyGate(v, c_notGate);
        Print(v);
        printf("nn  ");

        // Qubit: 50% chance, reverse phase
        v = c_qubit01_180deg;
        Print(v);
        printf("n  ! = ");
        v = ApplyGate(v, c_notGate);
        Print(v);
        printf("nn");
    }

    // Pauli-y gate
    {
        printf("Pauli-y gate:n  ");

        // Qubit: false
        TQubit v = c_qubit0;
        Print(v);
        printf("n  Y = ");
        v = ApplyGate(v, c_pauliYGate);
        Print(v);
        printf("nn  ");

        // Qubit: true
        v = c_qubit1;
        Print(v);
        printf("n  Y = ");
        v = ApplyGate(v, c_pauliYGate);
        Print(v);
        printf("nn  ");

        // Qubit: 50% chance, reverse phase
        v = c_qubit01_180deg;
        Print(v);
        printf("n  Y = ");
        v = ApplyGate(v, c_pauliYGate);
        Print(v);
        printf("nn");
    }

    // Pauli-z gate
    {
        printf("Pauli-z gate:n  ");

        // Qubit: false
        TQubit v = c_qubit0;
        Print(v);
        printf("n  Z = ");
        v = ApplyGate(v, c_pauliZGate);
        Print(v);
        printf("nn  ");

        // Qubit: true
        v = c_qubit1;
        Print(v);
        printf("n  Z = ");
        v = ApplyGate(v, c_pauliZGate);
        Print(v);
        printf("nn  ");

        // Qubit: 50% chance, reverse phase
        v = c_qubit01_180deg;
        Print(v);
        printf("n  Z = ");
        v = ApplyGate(v, c_pauliZGate);
        Print(v);
        printf("nn");
    }

    // 45 degree phase adjustment gate
    {
        printf("45 degree phase gate:n  ");
        TComplexMatrix gate = MakePhaseAdjustmentGate(c_pi / 4.0f);

        // Qubit: false
        TQubit v = c_qubit0;
        Print(v);
        printf("n  M = ");
        v = ApplyGate(v, gate);
        Print(v);
        printf("nn  ");

        // Qubit: true
        v = c_qubit1;
        Print(v);
        printf("n  M = ");
        v = ApplyGate(v, gate);
        Print(v);
        printf("nn  ");

        // Qubit: 50% chance, reverse phase
        v = c_qubit01_180deg;
        Print(v);
        printf("n  M = ");
        v = ApplyGate(v, gate);
        Print(v);
        printf("nn");
    }

    // Hadamard gate
    {
        printf("Hadamard gate round trip:n  ");

        // Qubit: false
        TQubit v = c_qubit0;
        Print(v);
        printf("n  H = ");
        v = ApplyGate(v, c_hadamardGate);
        Print(v);
        printf("n  H = ");
        v = ApplyGate(v, c_hadamardGate);
        Print(v);
        printf("nn  ");

        // Qubit: true
        v = c_qubit1;
        Print(v);
        printf("n  H = ");
        v = ApplyGate(v, c_hadamardGate);
        Print(v);
        printf("n  H = ");
        v = ApplyGate(v, c_hadamardGate);
        Print(v);
        printf("nn  ");

        // Qubit: 50% chance, reverse phase
        v = c_qubit01_180deg;
        Print(v);
        printf("n  H = ");
        v = ApplyGate(v, c_hadamardGate);
        Print(v);
        printf("n  H = ");
        v = ApplyGate(v, c_hadamardGate);
        Print(v);
        printf("nn");
    }

    // 1 bit circuit
    // Hadamard -> Pauli-Z -> Hadamard
    {
        printf("Circuit Hadamard->Pauli-Z->Hadamard:n  ");

        // Qubit: false
        TQubit v = c_qubit0;
        Print(v);
        printf("n  H = ");
        v = ApplyGate(v, c_hadamardGate);
        Print(v);
        printf("n  Z = ");
        v = ApplyGate(v, c_pauliZGate);
        Print(v);
        printf("n  H = ");
        v = ApplyGate(v, c_hadamardGate);
        Print(v);
        printf("nn  ");

        // Qubit: true
        v = c_qubit1;
        Print(v);
        printf("n  H = ");
        v = ApplyGate(v, c_hadamardGate);
        Print(v);
        printf("n  Z = ");
        v = ApplyGate(v, c_pauliZGate);
        Print(v);
        printf("n  H = ");
        v = ApplyGate(v, c_hadamardGate);
        Print(v);
        printf("nn  ");

        // Qubit: 50% chance, reverse phase
        v = c_qubit01_180deg;
        Print(v);
        printf("n  H = ");
        v = ApplyGate(v, c_hadamardGate);
        Print(v);
        printf("n  Z = ");
        v = ApplyGate(v, c_pauliZGate);
        Print(v);
        printf("n  H = ");
        v = ApplyGate(v, c_hadamardGate);
        Print(v);
        printf("n");
    }

    WaitForEnter();

    return 0;
}

Here’s the output of the program, showing the qubit circuits in action:

Next Up

In the next part, we’ll look at how multiple qubits work together in quantum circuits so we can do more interesting things. We might need a quick post explaining the Bloch sphere before that though. It’s a fairly important thing, but this post is already enough to digest, so I didn’t include it.

Quantum Computing References

I’m in the middle of some research to better understand quantum computing so that I can write a short series of blog posts entitled “Quantum Computing for Programmers”.

These posts will be light on – and possibly completely absent of – hardcore math and physics, and instead speak more to programmers. It will aim to show the rules of quantum computing, explain which operations are fast and which aren’t, show the basic building blocks of quantum logic, combine those simple gates into more complicated quantum circuits, and most importantly it will have simple sample C++ code which shows this stuff in action, so you can do your own (simulated) quantum computing experimentation and exploration on your own computer.

This page is a list of the references I’ve found so far, and I’ll keep updating it as I find new, useful references.

Gentle Introduction

Ars Technica – A tale of two qubits: how quantum computers work

Technical Details, Very Well Explained

Twisted Oak Studios – What Quantum Computers Do Faster, with Caveats

Twisted Oak Studios – Grover’s Quantum Search Algorithm

A Bit More Advanced, Still Very Readable

Scott Aaronson – Lecture 9: Quantum

Scott Aaronson – Lecture 10: Quantum Computing

Good Info, More Mathy

Math ∩ Programming – A Motivation for Quantum Computing

Math ∩ Programming – The Quantum Bit

Math ∩ Programming – Multiple Qubits and the Quantum Circuit

Useful Wikipedia Pages

Wikipedia – Commonly Used Quantum Gates

Wikipedia – Quantum Circuit

Wikipedia – Deutsch–Jozsa algorithm

Quantum Algorithm Stuff

A Quantum Network Flow Puzzle

Building your own Quantum Fourier Transform

An interactive page that shows you how quantum pseudo telepathy is not communication

Twisted Oak Studios – Implementing Quantum Pseudo-Telepathy

Other

An article on some people who are working on emulating (not simulating) quantum computers using sound waves

A Photonic C-NOT Gate Breakthrough for Quantum Computing

Wolfram Alpha Demonstrations: Generating Entangled Qubits

Wikipedia: Simon’s problem

Quantum Circuit Simulator in a Web Page

Solving Nested Modulus Equations

x \% 2 = 1

The above equation is pretty easy to solve, it just means that x is any value that when you divide by 2, gives a remainder of 1. x is all odd numbers. The more formal answer is:

x \equiv 1 + 2\mathbb{Z}

That reads as “x is equivalent to 1 plus 2 times any integer”. x has infinitely many solutions, so long as they fit that constraint.

That’s easy enough to figure out, but how would you solve this equation?

(x \% 5) \%2 = 1

or this one?

((x \% 7) \% 5) \%2 = 1

One Level (Simple Case)

There is something called the quotient remainder theorem that lets us change an equation like this:

A \% B = C

Into one like this:

A \equiv C + B \mathbb{Z}

The Z means “all integers” which implies that there are infinitely many solutions to A. This reads as “A is equivelant to C plus B times any integer”.

Let’s work out a couple examples.

x \% 5 = 3

That transforms to:

x \equiv 3 + 5 \mathbb{Z}

If you try plugging any positive or negative integer in place of Z, you’ll see that you get a number that satisfies the first equation.

Here’s another one:

x \% 538 = 211

That transforms to:

x \equiv 211 + 538 \mathbb{Z}

Once again you can plug any positive or negative integer in for Z and get a value for x that satisfies the original equation.

Pretty easy and pretty handy right? Let’s get a little more complex.

Two Levels

Let’s say you want to solve this equation from the intro.

(x \% 5) \%2 = 1

The first thing you might do is transform it to the below.

x \% 5 \equiv 1 + 2 N_1 where N_1 \in \mathbb{Z}.

What now? Can we transform it again? If we do, we get this:

x \equiv 1 + 2 N_1+ 5 N_2 where N_i \in \mathbb{Z}.

plugging 1 in for the N1 and N2, we get 8, which does satisfy the original equation.

If we plug 3 in for N1 and 1 in for N2 though, we get 12 which DOESN’T satisfy the original equation. What gives??

Well it turns out there are subtle restrictions on our equations that got lost in the shuffle. Let’s start over…

(x \% 5) \%2 = 1

In effect, what that equation says is “Something divided by two gives a remainder of 1” where the something happens to be x % 5, but we don’t really care about what the something actually is right now.

The equation also implies that the right side of the equation is a value modulo two, and it’s only valid values are {0,1}. Another way to write this is to say that the right side \in \mathbb Z_2 which reads as “in all integers mod 2”.

1 is obviously in the set of valid values {0,1}, so we don’t need any special notation just yet, but let’s keep it in mind as we move onto the next step.

x \% 5 \equiv 1 + 2 \mathbb Z

The above says that x divided by 5 gives a remainder that is 1 plus two times all integers. However there is a catch. There right side of the equation is a mod 5 value, which means it’s only valid values are {0,1,2,3,4}. In other words, the right side of the equation is \in \mathbb Z_5.

This has an effect of implicitly limiting the valid values that we can plug in to Z. It limits it to values of Z where 1+2*Z is in {0,1,2,3,4}. Let’s write this out a little more formally.

x \% 5 \equiv a where a = 1 + 2 \mathbb Z and a \in \mathbb Z_5

Let’s transform the equation again to solve for x:

x \equiv a + 5\mathbb Z where a = 1 + 2 \mathbb Z and a \in \mathbb Z_5

Note that there is no modulus on the left side of the equation now, so the right side of the equation is unlimited into the valid values it can have.

How do we use this resulting formula though to find valid values of x?

First we figure out the valid values of a. a = 1 + 2 \mathbb Z where a \in \mathbb Z_5, so it’s only valid values have to be a subset of {0,1,2,3,4}.

Plugging a -1 in for Z, we get a value for a of -1, which isn’t a valid answer so we throw it away.

Plugging a 0 in for Z, we get a value for a of 1. That is a valid answer so we keep it!

Plugging a 1 in gives us 3 which is valid, so we keep that too.

Plugging a 2 in gives us a value of 5, which isn’t valid, so we throw it away and know that we are done.

Our values for a are {1,3}.

The solution we got was this:
x \equiv a + 5\mathbb Z

Plugging in each value of a means that we get two equations as a solution for x. Valid values of x are the union of these two lists – both equations give valid answers.

x \equiv 1 + 5\mathbb Z
x \equiv 3 + 5\mathbb Z

The first equation gives us values of {…, -4, 1, 6, 11, …}
The second equation gives us values of {…, -2, 3, 8, 13, …}

Taking the union of those, our solutions for x are:
{…, -4, -2, 1, 3, 6, 8, 11, 13, …}

If you plug those into the original equation (x \% 5) \%2 = 1, you’ll see that they are valid solutions! Those equations also represent ALL solutions, so they aren’t only valid, they are also complete.

Let’s step it up just one more notch.

Three Levels (Boss Mode)

Let’s solve the hardest equation from the intro:

((x \% 7) \% 5) \%2 = 1

This starts off just like the two leveled equation. Something on the left mod 2 is equation to 1. The 1 on the right side is \in \mathbb Z_2, but since that’s obviously true for this constant, we don’t need to do anything special. So, we transform it to the below:

(x \% 7) \% 5 \equiv a
a = 1 + 2 \mathbb Z
a \in \mathbb Z_5

Then we do another transformation:
x \% 7 \equiv b
b = a + 5 \mathbb Z
a = 1 + 2 \mathbb Z
a \in \mathbb Z_5
b \in \mathbb Z_7

Then for the last transformation, we get this:
x \equiv b + 7 \mathbb Z
b = a + 5 \mathbb Z
a = 1 + 2 \mathbb Z
a \in \mathbb Z_5
b \in \mathbb Z_7

Now that we have our equations worked out, we need to start finding out what the values actually are. We start with a because it’s what b and x are based on, and is made up of constants. We get {1,3} again as the valid values of a. That gives us:
x \equiv b + 7 \mathbb Z
b = a + 5 \mathbb Z
a = [1,3]
b \in \mathbb Z_7

Now we want to plug each value of a into b and find all the valid values for b. When we plug 1 into b for a, it becomes b = 1 + 5 \mathbb Z, b \in \mathbb Z_7. The valid values for that are {1,6}.

When we plug 3 into b for a, it becomes b = 3 + 5 \mathbb Z, b \in \mathbb Z_7. The only valid value there is {3}.

That means that our valid values for b are {1,3,6}. It’s the union of the valid values we found for each value of a.

That makes our equations into this:
x \equiv b + 7 \mathbb Z
b = [1,3,6]

If we plug those values of b into the equation for x, we get these three equations:

x \equiv 1 + 7 \mathbb Z
x \equiv 3 + 7 \mathbb Z
x \equiv 6 + 7 \mathbb Z

The first equation gives us some x values of {…, -6, 1, 8, 15, …}. The second equation gives us x values of {…, -4, 3, 10, 17, …}. The third equation gives us x values of {…, -1, 6, 13, 20, …}.

Taking the union of all of those, we get…
{…, -6, -4, -1, 1, 3, 6, 8, 10, 13, 15, 17, 20, …}

If you plug those numbers into the original equation ((x \% 7) \% 5) \%2 = 1, you can see that they are valid solutions!

When you are performing this algorithm, if you ever hit a case where there are no valid answers in one of the steps – like say, there were no valid answers for equation b in the above – that means that there is no solution to your equation.

Sample Code

Since solving these equations can be pretty manual and tedious, here is some code that can solve these equations for you. If you were confused at all by the explanation above, the code may also be able to show you how it works better, especially if you step through it and see what it’s doing.

#include <stdio.h>
#include <array>
#include <vector>

//=================================================================================
void WaitForEnter()
{
    printf("nPress Enter to quit");
    fflush(stdin);
    getchar();
}

//=================================================================================
void AddSolutions(std::vector<int>& solutions, int add, int multiply, int mod)
{
    int Z = 0;
    while (1)
    {
        int value = multiply * Z + add;
        if (value < mod)
            solutions.push_back(value);
        else
            return;
        ++Z;
    }
}

//=================================================================================
void AddSolutionsFromSolutions (std::vector<int>& solutions, const std::vector<int>& adds, int multiply, int mod)
{
    std::for_each(
        adds.begin(),
        adds.end(),
        [&solutions, multiply, mod] (int add)
        {
            AddSolutions(solutions, add, multiply, mod);
        }
    );
}

//=================================================================================
int main(int argc, char **argv)
{
    // a,b,c,d...
    // ((x % a) % b) % c = d
    // etc.
    //const int c_values[] = {7, 5, 2, 1};
    const int c_values[] = { 12, 9, 7, 5, 3, 2 };
    const size_t c_numValues = sizeof(c_values) / sizeof(c_values[0]);
    
    // print the equation
    printf("Solving for x:n");
    for (size_t i = 0; i < c_numValues - 1; ++i)
        printf("(");
    printf("x");
    for (size_t i = 0; i < c_numValues - 1; ++i)
        printf(" %% %i)", c_values[i]);
    printf(" = %inn", c_values[c_numValues-1]);

    // print the solution equations
    for (size_t i = 0; i < c_numValues - 2; ++i)
    {
        char eqn = i > 0 ? 'A' + i - 1 : 'x';
        char eq = i > 0 ? '=' : 0xF0;

        printf("%c %c %c + %i*Z", eqn, eq, 'B' + i - 1, c_values[i]);
        if (i > 0)
            printf(" (in Z_%i)n", c_values[i-1]);
        else
            printf(" (in Z)n");
    }
    printf("%c = %i + %i*Z (in Z_%i)nn", 'A' + c_numValues - 2 - 1, c_values[c_numValues - 1], c_values[c_numValues - 2], c_values[c_numValues - 3]);

    // gather up the permutation of solutions for each equation, starting with the lowest equation which has only constants
    std::array<std::vector<int>, c_numValues - 2> solutions;
    AddSolutions(solutions[c_numValues - 3], c_values[c_numValues - 1], c_values[c_numValues - 2], c_values[c_numValues - 3]);
    for (size_t i = c_numValues - 3; i > 0; --i)
        AddSolutionsFromSolutions(solutions[i-1], solutions[i], c_values[i], c_values[i-1]);

    // Detect empty set
    if (solutions[0].size() == 0)
    {
        printf("No solutions!n");
        WaitForEnter();
        return 0;
    }

    // Print the more specific solution equations
    printf("x = ");
    for (size_t i = 0, c = solutions[0].size(); i < c; ++i)
    {
        if (i < c - 1)
            printf("%c U ", 'a' + i);
        else
            printf("%cn", 'a' + i);
    }
    std::sort(solutions[0].begin(), solutions[0].end());
    for (size_t i = 0, c = solutions[0].size(); i < c; ++i)
        printf("%c = %i + %iZn", 'a' + i, solutions[0][i], c_values[0]);
  
    // Print specific examples of solutions (first few numbers in each)
    std::vector<int> xValues;
    printf("n");
    for (size_t i = 0, c = solutions[0].size(); i < c; ++i)
    {
        printf("%c = {..., ", 'a' + i);
        for (int z = 0; z < 3; ++z)
        {
            printf("%i, ", solutions[0][i] + z * c_values[0]);
            xValues.push_back(solutions[0][i] + z * c_values[0]);
        }
        printf("...}n");
    }

    // Show the list of specific values of X
    std::sort(xValues.begin(), xValues.end());
    printf("nx = {..., ");
    std::for_each(xValues.begin(), xValues.end(), [](int v) {printf("%i, ", v); });
    printf("...}n");

    // Test the solutions to verify that they are valid!
    bool valuesOK = true;
    for (size_t i = 0, c = xValues.size(); i < c; ++i)
    {
        int value = xValues[i];
        for (size_t j = 0; j < c_numValues - 1; ++j)
            value = value % c_values[j];

        if (value != c_values[c_numValues - 1])
        {
            printf("Solution %i is invalid!!n", xValues[i]);
            valuesOK = false;
        }
    }
    if (valuesOK)
        printf("nAll solutions shown tested valid!n");

    WaitForEnter();
    return 0;
}

Here’s an example run, where it solves a 5 level equation.

Links

If you know of a better way to solve this type of equation let me know. This is what I found when trying to figure this stuff out, but it doesn’t mean it’s the only way to do it.

The one thing I don’t like about this technique is that when you find solutions, it’s very manual, and very specific to the values used. Imagine if one of the modulus divisors was a variable instead of them all being constants. How would you solve it then? I’m not really sure…

Have some links!

Math Stack Exchange: How to solve nested congruences?

Khan Academy: The Quotient Remainder Theorem

Solving Simultaneous Congruences (Chinese Remainder Theorem)

x \equiv 2 \pmod{3}

The equation above is a congruence. What it says is that x % 3 is 2.

The equals sign with three bars means “is equivalent to”, so more literally what the equation says is “x is equivalent to 2, when we are looking at only the integers mod 3”.

5 is a solution, so is 8, so is 11 and so is -1!

The general solution for x is x = 2 +3*N where N is any integer, aka N \in \mathbb{Z}.

What if you were trying to find an integer value x that satisfied the multiple congruences below? How would you solve it?

x \equiv 2 \pmod{3}
x \equiv 2 \pmod{4}
x \equiv 1 \pmod{5}

You could use brute force, but you’d have to check every value from 0 to 60 (not including 60) since 60 is the least common multiple (In this case, you calculate LCM 3*4*5 = 60, since the numbers have a greatest common divisor of 1, aka they are co-prime).

As the number of equations grew, or the mod values were larger, brute force could grow to be a large problem very quickly.

Well luckily there is a better way called the Chinese Remainder Theorem.

BTW – the answer is 26. Or 26 mod 60 more correctly, or 26 + 60*N where N is any integer.

The Chinese Remainder Theorem

The CRT was first published sometime in the 3rd-5th centuries by Sun Tzu – but not the Sun Tzu that wrote “The Art of War”, that was a different person. It was later proven to be true by Gauss in 1801.

When trying to learn the CRT, I found this amazing video that explains it extremely well. You should give it a look: The Chinese Remainder Theorem made easy.

In that video, it talks about modular inversion. You can read about the details of that in the last post I made: Modular Multiplicative Inverse.

An Example

Let’s work through an example that you can follow along with just to make sure you understand it, since I pointed you elsewhere for explanations 😛

Here is our set of congruences:
x \equiv 5 \pmod{7}
x \equiv 8 \pmod{11}
x \equiv 2 \pmod{3}

First up, we want to break it into 3 sections, combined with addition.
x = \_\_\_\_\_ + \_\_\_\_\_ + \_\_\_\_\_

In the above, the first empty space will be the solution to the first equation, the second empty space will be the solution to the second equation, and the third empty space will be the solution to the third equation.

Each section is going to get a co-efficient which is the mods for every equation multiplied together, except the one we are trying to solve. We still have an unknown though per term that we are going to solve in a moment.

x = 11*3*N_1 + 7*3*N_2 + 7*11*N_3
or
x = 33*N_1 + 21*N_2 + 77*N_3

The reason we make those coefficients is because it isolates our terms so that we can solve each equation individually then combine them to get the combined solution.

This works because you can see that in the first term, no matter what we end up choosing for N_1, it will always be a perfect multiple of 3 and 11, which means that the value will be zero in the other terms / other equations, so won’t affect whatever value we come up for them.

Next up, we need to solve for the N’s.

The first term needs to have an N_1 such that 33*N_1 \pmod 7 = 5.

How do we solve that? We could use brute force, testing every number 0 through 6 (otherwise known as 7-1), but in the last post we showed a better way to do it using the extended euclidean algorithm. So, that link once again is: Modular Multiplicative Inverse.

The answer when solving using inversion comes out to be 15. Note that our answer is in “mod 7” space, so you could use any value 1+7*I where I \in \mathbb{Z}, but we can stick to using 15 to make it easier to follow.

That gives us this:
x = 33*15 + 21*N_2 + 77*N_3
or
x = 495 + 21*N_2 + 77*N_3

The value of N2 and N3 end up being -8 and -2 respectively. That gives us:
x = 33*15 + 21*-8 + 77*-2
or
x = 495 - 168 - 154
or
x = 173

Since this is just one of many solutions, the real answer is:
x = 173 \pmod{231}
or
x = 173 + 231*N where N \in \mathbb{Z}

Let’s check our result and see if we got it right…

173 % 7 is indeed 5.
173 % 11 is 8.
173 % 3 is 2.

Woot, it worked!

Example Code

#include <stdio.h>
#include <algorithm>
#include <array>

// x = a mod m
struct SEquation
{
    int a;
    int m;
};
 
//=================================================================================
unsigned int ExtendedEuclidianAlgorithm (int smaller, int larger, int &s, int &t)
{
    // make sure A <= B before starting
    bool swapped = false;
    if (larger < smaller)
    {
        swapped = true;
        std::swap(smaller, larger);
    }
 
    // set up our storage for the loop.  We only need the last two values so will
    // just use a 2 entry circular buffer for each data item
    std::array<int, 2> remainders = { larger, smaller };
    std::array<int, 2> ss = { 1, 0 };
    std::array<int, 2> ts = { 0, 1 };
    int indexNeg2 = 0;
    int indexNeg1 = 1;
 
    // loop
    while (1)
    {
        // calculate our new quotient and remainder
        int newQuotient = remainders[indexNeg2] / remainders[indexNeg1];
        int newRemainder = remainders[indexNeg2] - newQuotient * remainders[indexNeg1];
 
        // if our remainder is zero we are done.
        if (newRemainder == 0)
        {
            // return our s and t values as well as the quotient as the GCD
            s = ss[indexNeg1];
            t = ts[indexNeg1];
            if (swapped)
                std::swap(s, t);
            return remainders[indexNeg1];
        }
 
        // calculate this round's s and t
        int newS = ss[indexNeg2] - newQuotient * ss[indexNeg1];
        int newT = ts[indexNeg2] - newQuotient * ts[indexNeg1];
 
        // store our values for the next iteration
        remainders[indexNeg2] = newRemainder;
        ss[indexNeg2] = newS;
        ts[indexNeg2] = newT;
 
        // move to the next iteration
        std::swap(indexNeg1, indexNeg2);
    }
}
 
//=================================================================================
void WaitForEnter ()
{
    printf("nPress Enter to quit");
    fflush(stdin);
    getchar();
}
 
//=================================================================================
int main(int argc, char **argv)
{
    const SEquation equations[] =
    {
        { 2, 3 },
        { 2, 4 },
        { 1, 5 }
    };

    const int c_numEquations = sizeof(equations) / sizeof(equations[0]);

    // print out the equations
    printf("Solving for x:n");
    for (int i = 0; i < c_numEquations; ++i)
        printf("eq %i:  x = %i mod %in", i, equations[i].a, equations[i].m);
    printf("n");

    // make sure the m's are pairwise co-prime
    for (int i = 0; i < c_numEquations; ++i)
    {
        for (int j = i + 1; j < c_numEquations; ++j)
        {
            int s, t;
            int gcd = ExtendedEuclidianAlgorithm(equations[i].m, equations[j].m, s, t);
            if (gcd != 1)
            {
                printf("%i and %i are not co-prime (index %i and %i)n", equations[i].m, equations[j].m, i, j);
                WaitForEnter();
                return 0;
            }
        }
    }

    // calculate the coefficients for each term
    std::array < int, c_numEquations > coefficients;
    coefficients.fill(1);
    for (int i = 0; i < c_numEquations; ++i)
    {
        for (int j = 0; j < c_numEquations; ++j)
        {
            if (i != j)
                coefficients[i] *= equations[j].m;
        }
    }

    // now figure out how much to multiply each coefficient by to make it have the specified modulus residue (remainder)
    int result = 0;
    for (int i = 0; i < c_numEquations; ++i)
    {
        int s, t;
        ExtendedEuclidianAlgorithm(coefficients[i], equations[i].m, s, t);
        coefficients[i] *= t * equations[i].a;
    }

    // calculate result and simplify it to the smallest positive integer mod lcm
    // lcm is the product when they are pairwise coprime, as the gcd of any two is 1.
    int lcm = 1;
    for (int i = 0; i < c_numEquations; ++i)
    {
        lcm *= equations[i].m;
        result += coefficients[i];
    }
    result = result % lcm;
    if (result < 0)
        result += lcm;

    // print out the answer
    printf("x = %i mod %inor...n", result, lcm);
    printf("x = %i + %i*NnWhere N is any positive or negative integer.nn", result, lcm);

    // verify that our result is correct
    printf("Verifying Equations:n");
    for (int i = 0; i < c_numEquations; ++i)
        printf("eq %i:  %i mod %i = %i (%s)n", i, result, equations[i].m, result % equations[i].m, (result % equations[i].m == equations[i].a) ? "PASS" : "!!FAIL!!");

    WaitForEnter();
    return 0;
}

Here’s the output of the program:

Links

One final note. If the program can’t find an answer, it doesn’t necessarily mean that there is no answer. For instance, you could imagine multiplying everything by 2, which would make the modulus values not be co-prime (they would have 2 as a common denominator). That would make this algorithm fail, even though there was a valid answer.

You can try the method of successive substitution as an alternate method when that happens.

Wikipedia: The Chinese Remainder Theorem

Wikipedia: Method Of Successive Substitution

Also, it looks like Khan Academy has a good bit on modular math!

Khan Academy: What is Modular Arithmetic?

Modular Multiplicative Inverse

This post is a pre-requisite for the next thing I want to talk about so may not make a whole lot of sense or be all that interesting until shown in that context.

Say you have a function like this:

(a*x) \mod m = n

If you know the values of a, m and n, how do you solve for x? Note in this post we are only dealing with integers, so we are looking for the integer solution for x.

It might be hard to visualize with so many symbols, so here it is with some constants:

(5*x) \mod 7 = 3

How would you solve that for x? In other words, what do you need to multiply 5 by, so that when you divide the result by 7, that you get 3 as the remainder?

One way to solve for x would be brute force. We could try plugging every value from 0 to 6 into x (every value from 0 to n-1), and see if any gives us the result we are looking for.

Brute force can be a challenge if the numbers are really large, like in some cryptographic situations.

Interestingly, there might not even be a valid answer for x that satisfies the equation! The below has no answer for instance:

(2*x) \mod 8 = 5

Better Than Brute Force

There’s something called the “Modular Multiplicative Inverse” which looks eerily familiar:

(a*x) \mod m = 1

Where a and m are known, and the inverse itself is the value of x.

Using the same constants we did above, that gives us this:

(5*x) \mod 7 = 1

In this case, the inverse (x) is 3. You can verify that by seeing that (5*3) % 7 is 1.

Once you have the inverse, if you wanted to solve the original equation where the modulus end up being 3, you just multiply the inverse by the desired modulus amount. Since the inverse is 3 and the desired modulus value is 3, you multiply them together and get 9. Plugging the numbers in, we can see that (5*9) % 7 = 3.

Pretty cool, but how to calculate the inverse? You can calculate it by using something called the “Extended Euclidean Algorithm”.

The regular Euclidean algorithm is in a post here: Programmatically Calculating GCD and LCM.

The extended euclidean algorithm is explained really well on wikipedia: Wikipedia: Extended Euclidean Algorithm.

Sample Code

Here’s some sample code that asks the user for input and solves these style of equations for x. Below the code I’ll show some example runs and talk about a few more things.

#include <stdio.h>
#include <algorithm>
#include <array>

//=================================================================================
unsigned int ExtendedEuclidianAlgorithm (int smaller, int larger, int &s, int &t)
{
    // make sure A <= B before starting
    bool swapped = false;
    if (larger < smaller)
    {
        swapped = true;
        std::swap(smaller, larger);
    }

    // set up our storage for the loop.  We only need the last two values so will
    // just use a 2 entry circular buffer for each data item
    std::array<int, 2> remainders = { larger, smaller };
    std::array<int, 2> ss = { 1, 0 };
    std::array<int, 2> ts = { 0, 1 };
    int indexNeg2 = 0;
    int indexNeg1 = 1;

    // loop
    while (1)
    {
        // calculate our new quotient and remainder
        int newQuotient = remainders[indexNeg2] / remainders[indexNeg1];
        int newRemainder = remainders[indexNeg2] - newQuotient * remainders[indexNeg1];

        // if our remainder is zero we are done.
        if (newRemainder == 0)
        {
            // return our s and t values as well as the quotient as the GCD
            s = ss[indexNeg1];
            t = ts[indexNeg1];
            if (swapped)
                std::swap(s, t);
            return remainders[indexNeg1];
        }

        // calculate this round's s and t
        int newS = ss[indexNeg2] - newQuotient * ss[indexNeg1];
        int newT = ts[indexNeg2] - newQuotient * ts[indexNeg1];

        // store our values for the next iteration
        remainders[indexNeg2] = newRemainder;
        ss[indexNeg2] = newS;
        ts[indexNeg2] = newT;

        // move to the next iteration
        std::swap(indexNeg1, indexNeg2);
    }
}

//=================================================================================
void WaitForEnter ()
{
    printf("nPress Enter to quit");
    fflush(stdin);
    getchar();
}

//=================================================================================
int main(int argc, char **argv)
{
    // get user input
    int a, m, n;
    printf("Given a, m and n, solves for X.n(a * X) %% m = nnn");
    printf("a = ");
    scanf("%i", &a);
    printf("m = ");
    scanf("%i", &m);
    printf("n = ");
    scanf("%i", &n);

    // show details of what they entered
    printf("n(%i * X) mod %i = %in", a, m, n);

    // Attempt brute force
    printf("nBrute Force Testing X from 0 to %i:n", (m-1));
    for (int i = 0; i < m; ++i) {
        if ((a*i) % m == n)
        {
            printf("  X = %in", i);
            printf("  %i mod %i = %in", a*i, m, (a*i) % m);
            break;
        }
        else if (i == (m - 1))
        {
            printf("  No solution!n");
        }
    }

    // Attempt inverse via Extended Euclidean Algorithm
    printf("nExtended Euclidean Algorithm:n");
    int s, t;
    int GCD = ExtendedEuclidianAlgorithm(a, m, s, t);

    // report failure if we couldn't do inverse
    if (GCD != 1)
    {
        printf("  Values are not co-prime, cannot invert! GCD = %in", GCD);
    }
    // Else report details of inverse and show that it worked
    else
    {
        printf("  Inverse = %in", t);
        printf("  X = Inverse * n = %in", t*n);
        printf("  %i mod %i = %in", a*t*n, m, (a*t*n) % m);
    }

    WaitForEnter();
    return 0;
}

Example Runs

Here is a normal run that solves (7*x) \mod 9 = 2, to come up with a value of 8 for x.

Here is a run that solves (5*x) \mod 7 = 3. Brute force gives us a value of 2 for x, while the inverse gives us a value of 9. Both are valid, and in fact are equivalent since 9 % 7 = 2. This shows that getting the inverse and then multiplying it to get the desired answer doesn’t always give you the smallest possible value of x.

Here is a large number run that solves (7*x) \mod 1000001 = 538. Brute force gives a value of 571,506 for x, while using the inversion method gives us a value of 230,571,736.

Lastly, here is a run that solves (8*x) \mod 6 = 4. Brute force gives us a value of 2 for x, but interestingly, it isn’t invertible, so the inversion based solution can’t even find us an answer!

This happens when a and m are not co-prime. In other words, if they have a GCD that isn’t 1, they aren’t coprime, and the modulus can’t be inverted.

Links

You can read more about the modular multiplicative inverse here: Wikipedia: Modular Multiplicative Inverse.

Improving the Security of the Super Simple Symmetric Leveled Homomorphic Encryption Implementation

The last post showed a super simple encryption algorithm that let an untrusted person perform calculations with encrypted data such that when they gave the results back to you, you could decrypt them and get the results of the calculation as if they were done on the unencrypted values. The best part was that this untrusted party had no knowledge of the data it was doing the calculations on.

While it was (hopefully) easy to understand, there were a lot of problems with it’s security. The biggest of which probably was the fact that the encryption of a false bit was the secret key itself!

This post is going to slightly increase the complexity of the encryption operation to match what the paper that I’m getting this stuff from says to do (Fully Homomorphic Encryption over the Integers).

All of the other operations – like XOR, AND, Decryption, use of constants – remain the same. It’s just the process of turning a plain text bit into a cipher text bit that is going to change.

Disclaimer: I am a game programmer, not a cryptographer, and you should REALLY do your own investigation and consult experts before actually rolling anything out to production or trusting that what I say is correct!

Improvement #1 – Multiply Key By Random Number

The scheme to encrypt a plaintext bit from the last blog post was this:

encryptedBit = key + value ? 1 : 0;

A major problem with that scheme is that if you encrypt a false bit, you get the key itself.

Another problem that we touched on was that the parity of the plain text bit (whether it was odd or even, aka a 1 or a 0) was always opposite of the parity of the cipher text.

Yet another problem was that for the same key, 0 always encrypted to the same value, and 1 always encrypted to the same value, which was the “0” encrypted value, plus 1.

We are going to address all of these problems by adding a simple operation to encryption. We are just going to multiply the key by a random number before adding the plain text bit in. That gives us the following:

encryptedBit = key*randomNumber + value ? 1 : 0;

Where randomNumber is at least 1.

This above helps in the following ways:

  • Encrypting false doesn’t always just give you the key anymore!
  • Since the cipherBit divided by the key is now a random number (and since the key is an odd number), it will be random whether the parity of the cipher text matches or mismatches the plain text. You can no longer use that information to figure out the value of the encrypted bit!
  • If you encrypt a false value, you will get different values each time. Same when encrypting a true value. When looking at two ciphertexts that have the same underlying plaintext value, you will no longer be able to tell that they are equal just by looking at them, or be able to tell that one is larger than the other so must be the true bit!

That is a pretty good improvement, but we can do a little better.

Improvement #2 – Add Random Noise

The second improvement we are going to do is add random noise to our encrypted value. This will make it so encrypting a false bit will not result in a multiple of the key, but will instead result in NEARLY a multiple of the key, which is a harder problem to figure out as an attacker.

You might ask how we are going to preserve our encrypted value if we are adding random noise into the result. Well, in actuality, all we really need to preserve is the lowest bit, so we are going to add an EVEN NUMBERED amount of noise.

That makes our encryption scheme become this:

encryptedBit = key*randomNumber1 + 2*randomNumber2 + value ? 1 : 0;

While this increases our security, it also increases the noise (error) in our encrypted data, which makes it so we can do fewer operations before the error gets too large and we start getting wrong answers.

There we are, our encryption scheme now matches the one described in the paper. All the rest of our operations remain unchanged.

Security

The above looks good, but what range of values should be we use for randomNumber1 and randomNumber2 and the actual size of the key?

The paper refers to a security parameter lambda (\lambda) that everything else is based on.

It says that the size of the key (N) should be (\lambda^2) bits, randomNumber1 should be around 2^{N^3} (N^3 bits) and that randomNumber2 should be around 2^{\sqrt{N}} (\sqrt{N} bits).

It also says that the best known attack against this algorithm takes about 2^{N^2} operations and that you can assume an attacker is going to be able to do a billion operations per second (per this info here). Note that 1 billion operations per second is a typical computer, not a super computer or a distributed attack using a bot network!

Let’s look at some example values for lambda!

Lambda Key Size RN1 Size RN2 Size Attack Time
80 800B 30.5GB 10B 38 million years
60 450B 5.4GB 8B 36 years
40 200B 488MB 5B 17 minutes
20 50B 7.6MB 3B < 1 second

Ouch! RandomNumber1 sure is huge isn’t it? I’ve double and tripple checked and that really does seem to be correct. Encrypting a single bit is essentially going to be as large as RandomNumber1. That is so unwieldy it’s ridiculous. I’m going to quadruple check that I think because that is just insane…

BTW quick tangent. An interesting thing to note is that if your key is an even number, instead of an odd number, noise/error can grow as much as it wants, and will never give you the wrong result! That means that by using an even numbered key, this scheme is fully homomorphic. However, using an even key, encryptedBit % 2 == decryptedBit, so it’s super insecure.

I’ve been thinking about it quite a bit, but I can’t think of any avenues to use an even numbered key but still have any semblance of real security. If you can think of a way, go publish a paper about it and become famous! 😛

Example Code

Here is some sample code from last post with the new encryption routine. I had to decrease the number of bits in the addition tests, and I had to tone down the security quite a bit to make it fit within 64 bits.

#include <stdio.h>
#include <stdint.h>
#include <random>
#include <array>
#include <inttypes.h>

typedef uint64_t uint64;

// Increase this value to increase the size of the key, and also the maximum
// size of the error allowed.
// If you set it too high, operations will fail when they run out of storage space
// in the 64 bit ints.  If you set it too low, you will not be able to do very many
// operations in a row.
// The recomended values for good security for these numbers are way too large to
// fit in a uint64, so adjusting them down to show their effects while using uint64s
const size_t c_numKeyBits = 15;
const size_t c_numNoiseBits = 3; //size_t(sqrt(c_numKeyBits));
const size_t c_numMultiplierBits = 4; //c_numKeyBits * c_numKeyBits * c_numKeyBits;

#define Assert(x) if (!(x)) ((int*)nullptr)[0] = 0;

//=================================================================================
// TODO: Replace with something crypto secure if desired!
uint64 RandomUint64 (uint64 min, uint64 max)
{
    static std::random_device rd;
    static std::mt19937 gen(rd());
    std::uniform_int_distribution<uint64> dis(min, max);
    return dis(gen);
}

//=================================================================================
void WaitForEnter ()
{
    printf("Press Enter to quit");
    fflush(stdin);
    getchar();
}

//=================================================================================
uint64 GenerateKey ()
{
    // Generate an odd random number in [2^(N-1), 2^N)
    // N is the number of bits in our key
    // The key also defines the maximum amount of error allowed, and thus the number
    // of operations allowed in a row.
    uint64 key = RandomUint64(0, (uint64(1) << uint64(c_numKeyBits)) - 1);
    key = key | (uint64(1) << uint64(c_numKeyBits - 1));
    key = key | 1;
    return key;
}

//=================================================================================
bool Decrypt (uint64 key, uint64 value)
{
    return ((value % key) % 2) == 1;
}

//=================================================================================
uint64 Encrypt (uint64 key, bool value)
{
    uint64 keyMultiplier = RandomUint64(0, (1 << c_numMultiplierBits) - 2) + 1;
    uint64 noise = RandomUint64(0, (1 << c_numNoiseBits) - 1);
    uint64 ret = key * keyMultiplier + 2 * noise + (value ? 1 : 0);
    Assert(Decrypt(key, ret) == value);
    return ret;
}

//=================================================================================
uint64 XOR (uint64 A, uint64 B)
{
    return A + B;
}

//=================================================================================
uint64 AND (uint64 A, uint64 B)
{
    return A * B;
}

//=================================================================================
float GetErrorPercent (uint64 key, uint64 value)
{
    // Returns what % of maximum error this value has in it.  When error >= 100%
    // then we have hit our limit and start getting wrong answers.
    return 100.0f * float(value % key) / float(key);
}

//=================================================================================
uint64 FullAdder (uint64 A, uint64 B, uint64 &carryBit)
{
    // homomorphically add the encrypted bits A and B
    // return the single bit sum, and put the carry bit into carryBit
    // From http://en.wikipedia.org/w/index.php?title=Adder_(electronics)&oldid=381607326#Full_adder
    uint64 sumBit = XOR(XOR(A, B), carryBit);
    carryBit = XOR(AND(A, B), AND(carryBit, XOR(A, B)));
    return sumBit;
}

//=================================================================================
int main (int argc, char **argv)
{
    // run this test a bunch to show that it works.  If you get a divide by zero
    // in an Assert, that means that it failed, and hopefully it's because you
    // increased c_numKeyBits to be too large!
    printf("Verifying 10000 truth tables.  Details of first one:n");
    for (int index = 0; index < 10000; ++index)
    {
        // make our key and a true and false bit
        uint64 key = GenerateKey();
        uint64 falseBit1 = Encrypt(key, false);
        uint64 falseBit2 = Encrypt(key, false);
        uint64 trueBit1  = Encrypt(key, true);
        uint64 trueBit2  = Encrypt(key, true);

        // report the results for the first iteration of the loop
        if (index == 0)
        {
            printf("Key 0x%" PRIx64 ", false = 0x%" PRIx64 ", 0x%" PRIx64 " true = 0x%" PRIx64 " 0x%" PRIx64 "n", key, falseBit1, falseBit2, trueBit1, trueBit2);
            printf("  [0 xor 0] = 0   0x%" PRIx64 " xor(+) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%0.2f%%)n", falseBit1, falseBit2, XOR(falseBit1, falseBit2), Decrypt(key, XOR(falseBit1, falseBit2)) ? 1 : 0, GetErrorPercent(key, XOR(falseBit1, falseBit2)));
            printf("  [0 xor 1] = 1   0x%" PRIx64 " xor(+) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%0.2f%%)n", falseBit1, trueBit2 , XOR(falseBit1, trueBit2 ), Decrypt(key, XOR(falseBit1, trueBit2 )) ? 1 : 0, GetErrorPercent(key, XOR(falseBit1, trueBit2 )));
            printf("  [1 xor 0] = 1   0x%" PRIx64 " xor(+) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%0.2f%%)n", trueBit1 , falseBit2, XOR(trueBit1 , falseBit2), Decrypt(key, XOR(trueBit1 , falseBit2)) ? 1 : 0, GetErrorPercent(key, XOR(trueBit1 , falseBit2)));
            printf("  [1 xor 1] = 0   0x%" PRIx64 " xor(+) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%0.2f%%)n", trueBit1 , trueBit2 , XOR(trueBit1 , trueBit2 ), Decrypt(key, XOR(trueBit1 , trueBit2 )) ? 1 : 0, GetErrorPercent(key, XOR(trueBit1 , trueBit2 )));
            printf("  [0 and 0] = 0   0x%" PRIx64 " and(*) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%0.2f%%)n", falseBit1, falseBit2, AND(falseBit1, falseBit2), Decrypt(key, AND(falseBit1, falseBit2)) ? 1 : 0, GetErrorPercent(key, XOR(falseBit1, falseBit2)));
            printf("  [0 and 1] = 0   0x%" PRIx64 " and(*) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%0.2f%%)n", falseBit1, trueBit2 , AND(falseBit1, trueBit2 ), Decrypt(key, AND(falseBit1, trueBit2 )) ? 1 : 0, GetErrorPercent(key, XOR(falseBit1, trueBit2 )));
            printf("  [1 and 0] = 0   0x%" PRIx64 " and(*) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%0.2f%%)n", trueBit1 , falseBit2, AND(trueBit1 , falseBit2), Decrypt(key, AND(trueBit1 , falseBit2)) ? 1 : 0, GetErrorPercent(key, XOR(trueBit1 , falseBit2)));
            printf("  [1 and 1] = 1   0x%" PRIx64 " and(*) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%0.2f%%)n", trueBit1 , trueBit2 , AND(trueBit1 , trueBit2 ), Decrypt(key, AND(trueBit1 , trueBit2 )) ? 1 : 0, GetErrorPercent(key, XOR(trueBit1 , trueBit2 )));
        }

        // Verify truth tables for XOR and AND
        Assert(Decrypt(key, XOR(falseBit1, falseBit2)) == false);
        Assert(Decrypt(key, XOR(falseBit1, trueBit2 )) == true );
        Assert(Decrypt(key, XOR(trueBit1 , falseBit2)) == true );
        Assert(Decrypt(key, XOR(trueBit1 , trueBit2 )) == false);

        Assert(Decrypt(key, AND(falseBit1, falseBit2)) == false);
        Assert(Decrypt(key, AND(falseBit1, trueBit2 )) == false);
        Assert(Decrypt(key, AND(trueBit1 , falseBit2)) == false);
        Assert(Decrypt(key, AND(trueBit1 , trueBit2 )) == true );
    }

    // Do multi bit addition as an example of using compound circuits to
    // do meaningful work.
    const size_t c_numBitsAdded = 3;
    printf("nDoing 10000 Multibit Additions.  Details of first one:n");
    std::array<uint64, c_numBitsAdded> numberAEncrypted;
    std::array<uint64, c_numBitsAdded> numberBEncrypted;
    std::array<uint64, c_numBitsAdded> resultEncrypted;
    std::array<uint64, c_numBitsAdded> carryEncrypted;
    for (int index = 0; index < 10000; ++index)
    {
        // generate the numbers we want to add
        uint64 numberA = RandomUint64(0, (1 << c_numBitsAdded) - 1);
        uint64 numberB = RandomUint64(0, (1 << c_numBitsAdded) - 1);

        // generate our key
        uint64 key = GenerateKey();

        // encrypt our bits
        for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
        {
            numberAEncrypted[bitIndex] = Encrypt(key, (numberA & (uint64(1) << uint64(bitIndex))) != 0);
            numberBEncrypted[bitIndex] = Encrypt(key, (numberB & (uint64(1) << uint64(bitIndex))) != 0);
        }

        // do our multi bit addition!
        // we could initialize the carry bit to 0 or the encrypted value of 0. either one works since 0 and 1
        // are also poor encryptions of 0 and 1 in this scheme!
        uint64 carryBit = Encrypt(key, false);
        for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
        {
            carryEncrypted[bitIndex] = carryBit;
            resultEncrypted[bitIndex] = FullAdder(numberAEncrypted[bitIndex], numberBEncrypted[bitIndex], carryBit);
        }

        // decrypt our result
        uint64 resultDecrypted = 0;
        for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
        {
            if (Decrypt(key, resultEncrypted[bitIndex]))
                resultDecrypted |= uint64(1) << uint64(bitIndex);
        }

        // report the results for the first iteration of the loop
        if (index == 0)
        {
            printf("Key 0x%" PRIx64 ", %" PRId64 " + %" PRId64 " in %i bits = %" PRId64 "n", key, numberA, numberB, c_numBitsAdded, (numberA + numberB) % (1 << c_numBitsAdded));
            for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
                printf("  A[%i] = 0x%" PRIx64 " (%i err=%0.2f%%)n", bitIndex, numberAEncrypted[bitIndex], Decrypt(key, numberAEncrypted[bitIndex]), GetErrorPercent(key, numberAEncrypted[bitIndex]));
            printf("+n");
            for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
                printf("  B[%i] = 0x%" PRIx64 " (%i err=%0.2f%%)n", bitIndex, numberBEncrypted[bitIndex], Decrypt(key, numberBEncrypted[bitIndex]), GetErrorPercent(key, numberBEncrypted[bitIndex]));
            printf("=n");
            for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
                printf("  Result[%i] = 0x%" PRIx64 " (%i err=%0.2f%%)n", bitIndex, resultEncrypted[bitIndex], Decrypt(key, resultEncrypted[bitIndex]), GetErrorPercent(key, resultEncrypted[bitIndex]));
            printf("Carry Bits =n");
            for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
                printf("  Result[%i] = 0x%" PRIx64 " (%i err=%0.2f%%)n", bitIndex, carryEncrypted[bitIndex], Decrypt(key, carryEncrypted[bitIndex]), GetErrorPercent(key, carryEncrypted[bitIndex]));
            printf("result decrypted = %" PRId64 "n", resultDecrypted);
        }

        // make sure that the results match, keeping in mind that the 4 bit encryption may have rolled over
        Assert(resultDecrypted == ((numberA + numberB) % (1 << c_numBitsAdded)));
    }

    WaitForEnter();
    return 0;
}

Here’s the output of an example run.

Links

Another paper about symmetric HE over the integers: Symmetric Somewhat Homomorphic Encryption over the Integers

An implementation of FHE: Implementation of the DGHV fully homomorphic encryption scheme

Next Up

Here are some interesting avenues to explore with this stuff. More blog posts coming in the future about these topics, but for now, here they are in case you want to check them out yourself:

  • Making the public / private key implementation
  • Make this stuff work with multi precision math to be able to make realistically sized and see what sort of perf it gives
  • Show how to achieve fully homomorphic encryption using boot strapping and/or modulus switching
  • Further explore tuning down security parameters to get game quality HE
  • Explore the other known methods for implementing HE. HE over the integers is apparently very easy to understand compared to other methods, but other methods may have different characteristics – like maybe not being crazy gigantic

Super Simple Symmetric Leveled Homomorphic Encryption Implementation

Homomorphic encryption is a pretty interesting thing. It allows you to do calculations on encrypted data such that when you decrypt the results, it’s as if you did the calculations on the unencrypted data. This allows computation to happen without the person doing the computation knowing what the data actually is!

Brief History

For a long time, cryptographers wondered if fully homomorphic encryption was even possible. There were various encryption algorithms that could perform SOME operations homomorphically (RSA can do multiplication for instance!), but there weren’t any that could do ALL operations. In other words, you couldn’t execute arbitrary computations.

Those types of algorithms are called “Partially Homomorphic Encryption” or PHE.

Another problem standing in the way of fully homomorphic encryption was that many algorithms would only have a limited count of operations they could perform before error would accumulate and they would start giving incorrect answers. In essence they were limited to evaluating low degree polynomials.

Those types of algorithms are called “Somewhat Homomorphic Encryption” or SWHE.

In contrast, Fully Homomorphic Encryption (FHE) can perform an unlimited number of homomorphic operations, and it can perform any operation homomorphically. It is unbounded in both ways.

Amazingly, In 2009 Craig Gentry figured out the first fully homomorphic encryption scheme! With his setup, you can calculate both XOR and AND on encrypted bits, which makes it Turing complete. It is also able to keep errors from becoming too large by using an ingenious bootstrapping technique to decrease accumulated error. Here’s a link to his PHd thesis: A Fully Homomorphic Encryption Scheme.

Unfortunately, the current implementations of secure FHE take too much computational power to be practical in most situations – like 30 minutes to calculate an AND between 2 bits!

In this post I’m going to show you a super simple HE implementation that will be very easy to understand. It won’t be fully homomorphic, but it will be “leveled” (or, somewhat homomorphic), meaning it is Turing complete, but the count of calculations you can perform are limited due to error creeping in. It also won’t be secure – due to making it easy to understand – but it will be lightning fast.

This will be a symmetric key algorithm, but as we’ll explore in future posts, it can also be used for public key algorithms.

Why Is HE Useful?

One thing you could do with HE is store your financial transactions encrypted on a server. The server could run queries and calculations on your financial data and send back the results. You could then unencrypt the result and see what the values are, even though the server itself – which generated the values – has no idea what the numbers actually are.

Another use could be in games. Whether you are playing a first person shooter, or a real time strategy game, many different types of games send information about each player to every other player in the game. Hashes of game state can be used to make sure that everyone is in agreement about calculations to prevent a player from cheating by WRITING to a value they shouldn’t be writing to (or, at least you can detect when they do, and use majority rule to boot them out of the game), but how do you stop a player from READING a value they shouldn’t be reading?

Using HE, you could encrypt the data you need to send to players that they shouldn’t be able to read. With this, they could still do game play logic calculations on the data, and calculate hashes of the encrypted results to ensure that all players were in agreement, but with HE, they wouldn’t gain knowledge of the data they were working with.

In other words, player A could verify that player B’s state is correct and they haven’t cheated, without player A getting details about player B’s state.

In theory this could eliminate or at least help combat things like wall hacks and other “data read” based cheats. In practice there would be some complications to work out, even if it wasn’t crazy slow to calculate, but the fact that there is a path to addressing these issues is pretty exciting! People are working on improving speed, and games don’t need the same level of security that other usage cases do.

How To Do It

Here are the details of this super simple leveled homomorphic symmetric key algorithm.

By the way, all the percent signs below mean “modulus” which is just the remainder of a division. 25 % 4 = 1 for instance, because 25/4 = 6 with a remainder of 1. That remainder of 1 is what we get when we take the modulus. A term you’ll see more often if reading through this stuff on your own will be “residue”. Don’t let that word scare you, it is just another name for the remainder.

Making A Secret Key

To make a key, generate an odd random number between 2^(N-1) and 2^N. In other words, it will be N random bits, except the highest and lowest bit will be set to 1. N is the size of your secret key. Larger keys are more secure, and allow more computations to be done in a row, but they also take more storage space. If you are using a fixed size int – like say a uint32 – a larger key will make you run out of those 32 bits sooner.

key = RandomNumber(0, (1 << N) - 1) | 1 | (1 << (N - 1));

Encrypt

To encrypt a bit, the encrypted value is just the key plus the value of the unencrypted bit (0 or 1).

encryptedBit = key + value ? 1 : 0;

Decrypt

To decrypt a bit, you take the encrypted bit modulo the key, and then modulo 2.

decryptedBit = (encryptedBit % key) % 2;

XOR

To do an XOR of two encrypted bits, you just add the two values together.

xorResult = encryptedBit1 + encryptedBit2;

AND

To do an AND of two encrypted bits, you just multiply the two values together.

andResult = encryptedBit1 * encryptedBit2;

Example

Let’s run through an example to see this in action.

We’ll use a 4 bit key, and say that the key is 13 (1101 in binary).

Let’s encrypt some bits:

trueBitEncrypted = key + 1 = 13 + 1 = 14 \newline falseBitEncrypted = key + 0 = 13 + 0 = 13

Let’s do some logical operations:

Xor00 = falseBitEncrypted + falseBitEncrypted = 13 + 13 = 26 \newline Xor01 = falseBitEncrypted + trueBitEncrypted  = 13 + 14 = 27 \newline Xor10 = trueBitEncrypted  + falseBitEncrypted = 14 + 13 = 27 \newline Xor11 = trueBitEncrypted  + trueBitEncrypted  = 14 + 14 = 28 \newline \newline And00 = falseBitEncrypted * falseBitEncrypted = 13 * 13 = 169 \newline And01 = falseBitEncrypted * trueBitEncrypted  = 13 * 14 = 182 \newline And10 = trueBitEncrypted  * falseBitEncrypted = 14 * 13 = 182 \newline And11 = trueBitEncrypted  * trueBitEncrypted  = 14 * 14 = 196 \newline \newline FalseXorFalseAndTrue = falseBitEncrypted + falseBitEncrypted * trueBitEncrypted = 13 + 13 * 14 = 195

Notice how AND is a multiplication where XOR is an addition, and that the result of an AND operation is a larger number than an XOR operation. This means that if you are working with a specific sized number (again, such as a uint32), that you can do fewer ANDs than XORs before you run out of bits. When you run out of bits and your number has integer overflow, you have hit the cieling of this leveled HE scheme. That means that ANDs are more expensive than XORs when considering the number of computations you can do.

Ok, time to decrypt our XOR values!

Xor00Decrypted = ((Xor00 \% key) \% 2) = (26 \% 13) \% 2 = 0 \newline Xor01Decrypted = ((Xor01 \% key) \% 2) = (27 \% 13) \% 2 = 1 \newline Xor10Decrypted = ((Xor10 \% key) \% 2) = (27 \% 13) \% 2 = 1 \newline Xor11Decrypted = ((Xor11 \% key) \% 2) = (28 \% 13) \% 2 = 0 \newline

XOR is looking correct, how about AND?

And00Decrypted = ((And00 \% key) \% 2) = (169 \% 13) \% 2 = 0 \newline And01Decrypted = ((And01 \% key) \% 2) = (182 \% 13) \% 2 = 0 \newline And10Decrypted = ((And10 \% key) \% 2) = (182 \% 13) \% 2 = 0 \newline And11Decrypted = ((And11 \% key) \% 2) = (196 \% 13) \% 2 = 1 \newline

AND is looking good as well. Lastly let’s decrypt the compound operation:

FalseXorFalseAndTrueDecrypted = ((FalseXorFalseAndTrue \% key) \% 2) = (195 \% 13) \% 2 = 0

Lookin good!

Intuition

Let’s get some intuition for why this works…

Key Generation

First up, why is it that the key needs to have it’s high bit set? Well, on one hand, larger keys are more secure, and allow more room for error accumulation so allow more operations to be done. On the other hand, this is kind of misleading to say. If you generate ANY random odd integer, there will be a highest bit set to 1 SOMEWHERE. You technically don’t need to store the zeros above that. So i guess you could look at it like you are just generating ANY random odd integer, and you could figure out N FROM that value (the position of the highest bit). Thinking about it the way we do though, it lets us specify how many bits we actually want to commit to for the key which gives us more consistent behavior, upper bound storage space, etc.

Secondly, why does the key need to be odd?

Let’s say that you have two numbers A and B where A represents an encrypted bit and B represents the encryption key. If B is even, then A % B will always have the same parity (whether it’s even or odd) as A. Since we are trying to hide whether our encrypted bit is 0 or 1 (even or odd), that makes it very bad encryption since you can recover the plain text bit by doing encryptedValue % 2. If on the other hand, B is odd, A % B will have the same parity as A only if A / B is even.

This doesn’t really make much of a difference in the scheme in this post, because A / B will always be 1 (since the encrypted bit is the key plus the plain text bit), but in the next scheme it is more important because A / B will be a random number, which means that it will be random with a 50/50 chance whether or not the parity of the encrypted bit matches the parity of the plain text bit. Since it’s an even chance whether it matches or not, that means that an attacker can’t use that information to their advantage.

While it’s true that when generating a random key, there is a 50/50 chance of whether you will get an even or odd key, you can see how we’d be in a situation where 75% of the time the parity of the ciphertext would match the parity of the plaintext if we allowed both even and off keys.

That would mean that while an attacker couldn’t know for CERTAIN whether an encrypted bit is 1 or 0 based on the cipher text, they can guess with 75% confidence that the unencrypted bit will just be the cipher text % 2, which is no good! So, we are better off sticking with an odd numbered key in this scheme. But again, that won’t really matter until the next post!

XOR as Addition

I know that I’m going to butcher this explanation a bit in the eyes of someone who knows this math stuff better than me. If you are reading this and see that I have indeed done that, please drop me a line or leave a comment and let me know what I’ve missed or could explain better. I suspect there’s something about rings going on here (;

Believe it or not, when you add two numbers together and then take the modulus, you get the same answer as if you did the modulus on the two numbers, added them together, and then took the modulus again.

In other words, adding two numbers can be seen as adding their residue (remainder).

Let me show you an example.

15 + 28 = 43 \newline \newline ((15 \% 13) + (28 \% 13)) \% 13 = 43 \% 13 \newline (2 + 2) \% 13 = 4 \newline 4 = 4

Let’s try another one. I’m picking these numbers “randomly” out of my head 😛

28 + 47 = 75 \newline \newline ((28 \% 8) + (47 \% 8)) \% 8 = 75 \% 8 \newline (4 + 7) \% 8 = 3 \newline 3 = 3

OK makes sense, but who cares about that?

Well, believe it or not, 1 bit addition is the same as XOR! This means that you can add numbers together, which adds the modulus of their key together, which then in turn adds that number mod 2 together, to preserve the encrypted parity (odd or even-ness).

Check out this 2 bit binary math. Keep in mind that with 1 bit results, you would only keep the right most binary digit. I’m showing two digits to show you that it is in fact binary addition, and that the right most bit is in fact the same as XOR.

0 + 0 = 00 \newline 0 + 1 = 01 \newline 1 + 0 = 01 \newline 1 + 1 = 10

One thing to note before we move on is that since we are doing a modulus against the key, when the remainder gets to be too large it rolls over. When it rolls over, we start getting the wrong answers and have hit our ceiling of how many operations we can do. So, our encrypted value modulo the key divided by the key can be seen as where we are at by percentage towards our error ceiling.

To avoid hitting the problem of error getting too high too quickly and limiting your calculation count too much you can increase the key size. When you do that you’ll then run out of bits in your fixed size integer storage faster. To avoid THAT problem you can use “multi precision math libraries” to allow your integers to use an arbitrary number of bytes. This is what many real crypto algorithms use when they need to deal with very large numbers.

AND as Multiplication

Similar to the above, when you multiply two numbers and take a modulus of the result, it’s the same as if you took the modulus of the two numbers, multiplied that, and then took the modulus of the result.

In other words, when you multiply two numbers, you can think of it as also multiplying their residue (remainder).

Using the first example numbers from above:

15 * 28 = 420 \newline \newline ((15 \% 13) * (28 \% 13)) \% 13 = 420 \% 13 \newline (2 * 2) \% 13 = 4 \newline 4 = 4

And the second:

28 * 47 = 1316 \newline \newline ((28 \% 8) * (47 \% 8)) \% 8 = 1316 \% 8 \newline (4 * 7) \% 8 = 4 \newline 4 = 4

A bit of a coincidence that they both worked out to 4 this time 😛

Similar to XOR being the same as 1 bit addition, 1 bit multiplication is actually the same as AND, check it out:

0 * 0 = 0 \newline 0 * 1 = 0 \newline 1 * 0 = 0 \newline 1 * 1 = 1

Since AND multiplies residue, and XOR adds residue, and residue is what limits our homomorphic instruction count, you can see that AND is a more expensive operation compared to XOR, since it eats into our instruction budget a lot faster.

Error In Action

To see why rolling over is a problem, let’s say that our key is 9 and we want to XOR two encrypted bits 8 and 1, which represent 0 and 1 respectively.

To do an XOR, we add them together: 8 + 1 = 9.

Now, when we decrypt it we do this: (9 % 9) % 2 = 0

That result tells us that 0 XOR 1 is 0, which is incorrect! Our residue got too large and we hit the ceiling of our homomorphic instruction budget.

If the first bit was 6 instead of 8, the result of the XOR would have been 7, and (7 % 9) % 2 comes out to 1. That re-affirms to us that if we are under the error budget, we are good to go, but if our residue gets too large, we will have problems!

Sample Code

// Note that this encryption scheme is insecure so please don't actually use it
// in production!  A false bit with a given key is the same value every time, and
// so is a true bit.  Also, the encrypted true bit value will always be the
// encrypted false bit plus 1.  Even worse, an encrypted false bit is the key itself!
// This is just for demonstration purposes to see how the basics of homomorphic
// encryption work.  The next blog post will increase security.

#include <stdio.h>
#include <stdint.h>
#include <random>
#include <array>
#include <inttypes.h>

typedef uint64_t uint64;

// Increase this value to increase the size of the key, and also the maximum
// size of the error allowed.
// If you set it too high, operations will fail when they run out of storage space
// in the 64 bit ints.  If you set it too low, you will not be able to do very many
// operations in a row.
const size_t c_numKeyBits = 6;

#define Assert(x) if (!(x)) ((int*)nullptr)[0] = 0;

//=================================================================================
// TODO: Replace with something crypto secure if desired!
uint64 RandomUint64 (uint64 min, uint64 max)
{
    static std::random_device rd;
    static std::mt19937 gen(rd());
    std::uniform_int_distribution<uint64> dis(min, max);
    return dis(gen);
}

//=================================================================================
void WaitForEnter ()
{
    printf("Press Enter to quit");
    fflush(stdin);
    getchar();
}

//=================================================================================
uint64 GenerateKey ()
{
    // Generate an odd random number in [2^(N-1), 2^N)
    // N is the number of bits in our key
    // The key also defines the maximum amount of error allowed, and thus the number
    // of operations allowed in a row.
    return RandomUint64(0, (1 << c_numKeyBits) - 1) | 1 | (1 << (c_numKeyBits - 1));
}

//=================================================================================
bool Decrypt (uint64 key, uint64 value)
{
    return ((value % key) % 2) == 1;
}

//=================================================================================
uint64 Encrypt (uint64 key, bool value)
{
    uint64 ret = key + (value ? 1 : 0);
    Assert(Decrypt(key, ret) == value);
    return ret;
}

//=================================================================================
uint64 XOR (uint64 A, uint64 B)
{
    return A + B;
}

//=================================================================================
uint64 AND (uint64 A, uint64 B)
{
    return A * B;
}

//=================================================================================
int GetErrorPercent (uint64 key, uint64 value)
{
    // Returns what % of maximum error this value has in it.  When error >= 100%
    // then we have hit our limit and start getting wrong answers.
    return int(100.0f * float(value % key) / float(key));
}

//=================================================================================
uint64 FullAdder (uint64 A, uint64 B, uint64 &carryBit)
{
    // homomorphically add the encrypted bits A and B
    // return the single bit sum, and put the carry bit into carryBit
    // From http://en.wikipedia.org/w/index.php?title=Adder_(electronics)&oldid=381607326#Full_adder
    uint64 sumBit = XOR(XOR(A, B), carryBit);
    carryBit = XOR(AND(A, B), AND(carryBit, XOR(A, B)));
    return sumBit;
}

//=================================================================================
int main (int argc, char **argv)
{
    // run this test a bunch to show that it works.  If you get a divide by zero
    // in an Assert, that means that it failed, and hopefully it's because you
    // increased c_numKeyBits to be too large!
    printf("Verifying 10000 truth tables.  Details of first one:n");
    for (int index = 0; index < 10000; ++index)
    {
        // make our key and a true and false bit
        uint64 key = GenerateKey();
        uint64 falseBit = Encrypt(key, false);
        uint64 trueBit = Encrypt(key, true);

        // Verify truth tables for XOR and AND
        Assert(Decrypt(key, XOR(falseBit, falseBit)) == false);
        Assert(Decrypt(key, XOR(falseBit, trueBit )) == true );
        Assert(Decrypt(key, XOR(trueBit , falseBit)) == true );
        Assert(Decrypt(key, XOR(trueBit , trueBit )) == false);

        Assert(Decrypt(key, AND(falseBit, falseBit)) == false);
        Assert(Decrypt(key, AND(falseBit, trueBit )) == false);
        Assert(Decrypt(key, AND(trueBit , falseBit)) == false);
        Assert(Decrypt(key, AND(trueBit , trueBit )) == true );

        // report the results for the first iteration of the loop
        if (index == 0)
        {
            printf("Key 0x%" PRIx64 ", false 0x%" PRIx64 ", true 0x%" PRIx64 "n", key, falseBit, trueBit);
            printf("  [0 xor 0] = 0   0x%" PRIx64 " xor(+) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%i%%)n", falseBit, falseBit, XOR(falseBit, falseBit), Decrypt(key, XOR(falseBit, falseBit)) ? 1 : 0, GetErrorPercent(key, XOR(falseBit, falseBit)));
            printf("  [0 xor 1] = 1   0x%" PRIx64 " xor(+) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%i%%)n", falseBit, trueBit , XOR(falseBit, trueBit ), Decrypt(key, XOR(falseBit, trueBit )) ? 1 : 0, GetErrorPercent(key, XOR(falseBit, trueBit )));
            printf("  [1 xor 0] = 1   0x%" PRIx64 " xor(+) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%i%%)n", trueBit , falseBit, XOR(trueBit , falseBit), Decrypt(key, XOR(trueBit , falseBit)) ? 1 : 0, GetErrorPercent(key, XOR(trueBit , falseBit)));
            printf("  [1 xor 1] = 0   0x%" PRIx64 " xor(+) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%i%%)n", trueBit , trueBit , XOR(trueBit , trueBit ), Decrypt(key, XOR(trueBit , trueBit )) ? 1 : 0, GetErrorPercent(key, XOR(trueBit , trueBit )));
            printf("  [0 and 0] = 0   0x%" PRIx64 " and(*) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%i%%)n", falseBit, falseBit, AND(falseBit, falseBit), Decrypt(key, AND(falseBit, falseBit)) ? 1 : 0, GetErrorPercent(key, XOR(falseBit, falseBit)));
            printf("  [0 and 1] = 0   0x%" PRIx64 " and(*) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%i%%)n", falseBit, trueBit , AND(falseBit, trueBit ), Decrypt(key, AND(falseBit, trueBit )) ? 1 : 0, GetErrorPercent(key, XOR(falseBit, trueBit )));
            printf("  [1 and 0] = 0   0x%" PRIx64 " and(*) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%i%%)n", trueBit , falseBit, AND(trueBit , falseBit), Decrypt(key, AND(trueBit , falseBit)) ? 1 : 0, GetErrorPercent(key, XOR(trueBit , falseBit)));
            printf("  [1 and 1] = 1   0x%" PRIx64 " and(*) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%i%%)n", trueBit , trueBit , AND(trueBit , trueBit ), Decrypt(key, AND(trueBit , trueBit )) ? 1 : 0, GetErrorPercent(key, XOR(trueBit , trueBit )));
        }
    }

    // Do multi bit addition as an example of using compound circuits to
    // do meaningful work.
    const size_t c_numBitsAdded = 5;
    printf("nDoing 10000 Multibit Additions.  Details of first one:n");
    std::array<uint64, c_numBitsAdded> numberAEncrypted;
    std::array<uint64, c_numBitsAdded> numberBEncrypted;
    std::array<uint64, c_numBitsAdded> resultEncrypted;
    for (int index = 0; index < 10000; ++index)
    {
        // generate the numbers we want to add
        uint64 numberA = RandomUint64(0, (1 << c_numBitsAdded) - 1);
        uint64 numberB = RandomUint64(0, (1 << c_numBitsAdded) - 1);

        // generate our key
        uint64 key = GenerateKey();

        // encrypt our bits
        for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
        {
            numberAEncrypted[bitIndex] = Encrypt(key, (numberA & (uint64(1) << uint64(bitIndex))) != 0);
            numberBEncrypted[bitIndex] = Encrypt(key, (numberB & (uint64(1) << uint64(bitIndex))) != 0);
        }

        // do our multi bit addition!
        // we could initialize the carry bit to 0 or the encrypted value of 0. either one works since 0 and 1
        // are also poor encryptions of 0 and 1 in this scheme!
        uint64 carryBit = Encrypt(key, false);
        for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
            resultEncrypted[bitIndex] = FullAdder(numberAEncrypted[bitIndex], numberBEncrypted[bitIndex], carryBit);

        // decrypt our result
        uint64 resultDecrypted = 0;
        for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
        {
            if (Decrypt(key, resultEncrypted[bitIndex]))
                resultDecrypted |= uint64(1) << uint64(bitIndex);
        }

        // make sure that the results match, keeping in mind that the 4 bit encryption may have rolled over
        Assert(resultDecrypted == ((numberA + numberB) % (1 << c_numBitsAdded)));

        // report the results for the first iteration of the loop
        if (index == 0)
        {
            printf("Key 0x%" PRIx64 ", %" PRId64 " + %" PRId64 " in %i bits = %" PRId64 "n", key, numberA, numberB, c_numBitsAdded, (numberA + numberB) % (1 << c_numBitsAdded));
            for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
                printf("  A[%i] = 0x%" PRIx64 " (%i err=%i%%)n", bitIndex, numberAEncrypted[bitIndex], Decrypt(key, numberAEncrypted[bitIndex]), GetErrorPercent(key, numberAEncrypted[bitIndex]));
            printf("+n");
            for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
                printf("  B[%i] = 0x%" PRIx64 " (%i err=%i%%)n", bitIndex, numberBEncrypted[bitIndex], Decrypt(key, numberBEncrypted[bitIndex]), GetErrorPercent(key, numberBEncrypted[bitIndex]));
            printf("=n");
            for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
                printf("  Result[%i] = 0x%" PRIx64 " (%i err=%i%%)n", bitIndex, resultEncrypted[bitIndex], Decrypt(key, resultEncrypted[bitIndex]), GetErrorPercent(key, resultEncrypted[bitIndex]));
            printf("result decrypted = %" PRId64 "n", resultDecrypted);
        }
    }

    WaitForEnter();
    return 0;
}

Here is the output of a run of the program:

What if I Need Constants?!

If you are thinking how you might actually use this code in a real setting, you might be thinking to yourself “it’s great to be able to multiply two encrypted numbers together, but what if I just need to multiply them by a constant like 43?”

Well, interestingly, you can literally just use 0 and 1 in this scheme as constants to perform operations against the encrypted bits.

The reason that this works is that you can see 0 and 1 as just very poor encryptions 😛

(0 % KEY) % 2 = 0
(1 % KEY) % 2 = 1

As long as KEY is >= 2, the above is always true, no matter what the key actually is!

So there you go, add your own constants into the calculations all you want. They also happen to have very low residue/error (actually, they have the least amount possible!), so are much more friendly to use, versus having someone provide you with an encrypted table of constants to use in your calculations. It’s also more secure for the person doing the encrypting for them to provide you less encrypted data that you know the plain text for. It limits your (and anyone else’s) ability to do a known plain text attack.

The Other Shoe Drops

You might notice that in our scheme, given the same key, every true bit will be the same value, and every false bit will be the same value. Unfortunately, the true bit is also always the false bit + 1. As an attacker, this means that once you have seen both a true bit and a false bit, you will then have broken the encryption.

Even worse, when you encrypt a false bit, it gives you back the key itself!

We’ll improve that in the next post by adding a few more simple operations to the encryption process.

This leveled HE encryption scheme comes directly from the paper below. If you want to give it a look, what we covered is only part of the first two pages!
Fully Homomorphic Encryption over the Integers

The links below are where I started reading up on HE. They go a different route with FHE that you might find interesting, and also have a lot more commentary about the usage cases of HE:
The Swiss Army Knife of Cryptography
Building the Swiss Army Knife

In the scheme in those links, I haven’t figured out how multiplication is supposed to work yet (or bootstrapping, but one thing at a time). If you figure it out, let me know!

Gaussian Blur

In this post we are going to take the concepts we went over in the last post (Box Blur) and apply them to Gaussian blurring.

At a high level, Gaussian blurring works just like box blurring in that there is a weight per pixel and that for each pixel, you apply the weights to that pixel and it’s neighbors to come up with the final value for the blurred pixel.

With true Gaussian blurring however, the function that defines the weights for each pixel technically never reaches zero, but gets smaller and smaller over distance. In theory this makes a Gaussian kernel infinitely large. In practice though, you can choose a cut off point and call it good enough.

The parameters to a Gaussian blur are:

  • Sigma (\sigma) – This defines how much blur there is. A larger number is a higher amount of blur.
  • Radius – The size of the kernel in pixels. The appropriate pixel size can be calculated for a specific sigma, but more information on that lower down.

Just like a box blur, a Gaussian blur is separable which means that you can either apply a 2d convolution kernel, or you can apply a 1d convolution kernel on each axis. Doing a single 2d convolution means more calculations, but you only need one buffer to put the results into. Doing two 1d convolutions (one on each axis), ends up being fewer calculations, but requires two buffers to put the results into (one intermediate buffer to hold the first axis results).

Here is a 3 pixel 1d Gaussian Kernel for a sigma of 1.0:

Below is a 3×3 pixel 2d Gaussian Kernel also with a sigma of 1.0. Note that this can be calculated as an outer product (tensor product) of 1d kernels!

An interesting property of Gaussian blurs is that you can apply multiple smaller blurs and it will come up with the result as if you did a larger Blur. Unfortunately it’s more calculations doing multiple smaller blurs so is not usually worth while.

If you apply multiple blurs, the equivalent blur is the square root of the sum of the squares of the blur. Taking wikipedia’s example, if you applied a blur with radius 6 and a blur with a radius of 8, you’d end up with the equivelant of a radius 10 blur. This is because \sqrt{6^2 + 8^2} = 10.

Calculating The Kernel

There are a couple ways to calculate a Gaussian kernel.

Believe it or not, Pascal’s triangle approaches the Gaussian bell curve as the row number reaches infinity. If you remember, Pascal’s triangle also represents the numbers that each term is calculated by after expanding binomials (x+y)^N. So technically, you could use a row from Pascal’s triangle as a 1d kernel and normalize the result, but it isn’t the most accurate.

A better way is to use the Gaussian function which is this:

e^{-x^2/(2*\sigma^2)}

Where the sigma is your blur amount and x ranges across your values from the negative to the positive. For instance if your kernel was 5 values, it would range from -2 to +2.

An even better way would be to integrate the Gaussian function instead of just taking point samples. You can read about it in the link at the bottom “Gaussian Kernel Calculator”, but it’s also what we do in the example code.

Whatever way you do it, make sure and normalize the result so that the weights add up to 1. This makes sure that your blurring doesn’t make the image get brighter (greater than 1) or dimmer (less than 1).

Calculating The Kernel Size

Given a sigma value, you can calculate the size of the kernel you need by using this formula:

1+2 \sqrt{-2 \sigma^2 \ln 0.005}

That formula makes a Kernel large enough such that it cuts off when the value in the kernel is less than 0.5%. You can adjust the number in there to higher or lower depending on your desires for speed versus quality.

Examples

Once again, here is the unaltered image we are working with:

Here is the image blurred with a sigma of 3,3 (3 on the x axis and 3 on the y axis):

Here is the image blurred with a sigma of 20,3:

Here is the image blurred with a sigma of 50,50:

Shadertoy

Here’s a shadertoy implementing Gaussian Blur: Shadertoy:DF Gaussian Blur

Code

Here’s the source code I used to blur the examples above:

#define _CRT_SECURE_NO_WARNINGS
 
#include <stdio.h>
#include <stdint.h>
#include <array>
#include <vector>
#include <functional>
#include <windows.h>  // for bitmap headers.  Sorry non windows people!
 
typedef uint8_t uint8;

const float c_pi = 3.14159265359f;

struct SImageData
{
    SImageData()
        : m_width(0)
        , m_height(0)
    { }
 
    long m_width;
    long m_height;
    long m_pitch;
    std::vector<uint8> m_pixels;
};
 
void WaitForEnter ()
{
    printf("Press Enter to quit");
    fflush(stdin);
    getchar();
}
 
bool LoadImage (const char *fileName, SImageData& imageData)
{
    // open the file if we can
    FILE *file;
    file = fopen(fileName, "rb");
    if (!file)
        return false;
 
    // read the headers if we can
    BITMAPFILEHEADER header;
    BITMAPINFOHEADER infoHeader;
    if (fread(&header, sizeof(header), 1, file) != 1 ||
        fread(&infoHeader, sizeof(infoHeader), 1, file) != 1 ||
        header.bfType != 0x4D42 || infoHeader.biBitCount != 24)
    {
        fclose(file);
        return false;
    }
 
    // read in our pixel data if we can. Note that it's in BGR order, and width is padded to the next power of 4
    imageData.m_pixels.resize(infoHeader.biSizeImage);
    fseek(file, header.bfOffBits, SEEK_SET);
    if (fread(&imageData.m_pixels[0], imageData.m_pixels.size(), 1, file) != 1)
    {
        fclose(file);
        return false;
    }
 
    imageData.m_width = infoHeader.biWidth;
    imageData.m_height = infoHeader.biHeight;
 
    imageData.m_pitch = imageData.m_width*3;
    if (imageData.m_pitch & 3)
    {
        imageData.m_pitch &= ~3;
        imageData.m_pitch += 4;
    }
 
    fclose(file);
    return true;
}
 
bool SaveImage (const char *fileName, const SImageData &image)
{
    // open the file if we can
    FILE *file;
    file = fopen(fileName, "wb");
    if (!file)
        return false;
 
    // make the header info
    BITMAPFILEHEADER header;
    BITMAPINFOHEADER infoHeader;
 
    header.bfType = 0x4D42;
    header.bfReserved1 = 0;
    header.bfReserved2 = 0;
    header.bfOffBits = 54;
 
    infoHeader.biSize = 40;
    infoHeader.biWidth = image.m_width;
    infoHeader.biHeight = image.m_height;
    infoHeader.biPlanes = 1;
    infoHeader.biBitCount = 24;
    infoHeader.biCompression = 0;
    infoHeader.biSizeImage = image.m_pixels.size();
    infoHeader.biXPelsPerMeter = 0;
    infoHeader.biYPelsPerMeter = 0;
    infoHeader.biClrUsed = 0;
    infoHeader.biClrImportant = 0;
 
    header.bfSize = infoHeader.biSizeImage + header.bfOffBits;
 
    // write the data and close the file
    fwrite(&header, sizeof(header), 1, file);
    fwrite(&infoHeader, sizeof(infoHeader), 1, file);
    fwrite(&image.m_pixels[0], infoHeader.biSizeImage, 1, file);
    fclose(file);
    return true;
}

int PixelsNeededForSigma (float sigma)
{
    // returns the number of pixels needed to represent a gaussian kernal that has values
    // down to the threshold amount.  A gaussian function technically has values everywhere
    // on the image, but the threshold lets us cut it off where the pixels contribute to
    // only small amounts that aren't as noticeable.
    const float c_threshold = 0.005f; // 0.5%
    return int(floor(1.0f + 2.0f * sqrtf(-2.0f * sigma * sigma * log(c_threshold)))) + 1;
}

float Gaussian (float sigma, float x)
{
    return expf(-(x*x) / (2.0f * sigma*sigma));
}

float GaussianSimpsonIntegration (float sigma, float a, float b)
{
    return
        ((b - a) / 6.0f) *
        (Gaussian(sigma, a) + 4.0f * Gaussian(sigma, (a + b) / 2.0f) + Gaussian(sigma, b));
}

std::vector<float> GaussianKernelIntegrals (float sigma, int taps)
{
    std::vector<float> ret;
    float total = 0.0f;
    for (int i = 0; i < taps; ++i)
    {
        float x = float(i) - float(taps / 2);
        float value = GaussianSimpsonIntegration(sigma, x - 0.5f, x + 0.5f);
        ret.push_back(value);
        total += value;
    }
    // normalize it
    for (unsigned int i = 0; i < ret.size(); ++i)
    {
        ret[i] /= total;
    }
    return ret;
}

const uint8* GetPixelOrBlack (const SImageData& image, int x, int y)
{
    static const uint8 black[3] = { 0, 0, 0 };
    if (x < 0 || x >= image.m_width ||
        y < 0 || y >= image.m_height)
    {
        return black;
    }

    return &image.m_pixels[(y * image.m_pitch) + x * 3];
}

void BlurImage (const SImageData& srcImage, SImageData &destImage, float xblursigma, float yblursigma, unsigned int xblursize, unsigned int yblursize)
{
    // allocate space for copying the image for destImage and tmpImage
    destImage.m_width = srcImage.m_width;
    destImage.m_height = srcImage.m_height;
    destImage.m_pitch = srcImage.m_pitch;
    destImage.m_pixels.resize(destImage.m_height * destImage.m_pitch);

    SImageData tmpImage;
    tmpImage.m_width = srcImage.m_width;
    tmpImage.m_height = srcImage.m_height;
    tmpImage.m_pitch = srcImage.m_pitch;
    tmpImage.m_pixels.resize(tmpImage.m_height * tmpImage.m_pitch);

    // horizontal blur from srcImage into tmpImage
    {
        auto row = GaussianKernelIntegrals(xblursigma, xblursize);

        int startOffset = -1 * int(row.size() / 2);

        for (int y = 0; y < tmpImage.m_height; ++y)
        {
            for (int x = 0; x < tmpImage.m_width; ++x)
            {
                std::array<float, 3> blurredPixel = { 0.0f, 0.0f, 0.0f };
                for (unsigned int i = 0; i < row.size(); ++i)
                {
                    const uint8 *pixel = GetPixelOrBlack(srcImage, x + startOffset + i, y);
                    blurredPixel[0] += float(pixel[0]) * row[i];
                    blurredPixel[1] += float(pixel[1]) * row[i];
                    blurredPixel[2] += float(pixel[2]) * row[i];
                }
                
                uint8 *destPixel = &tmpImage.m_pixels[y * tmpImage.m_pitch + x * 3];

                destPixel[0] = uint8(blurredPixel[0]);
                destPixel[1] = uint8(blurredPixel[1]);
                destPixel[2] = uint8(blurredPixel[2]);
            }
        }
    }

    // vertical blur from tmpImage into destImage
    {
        auto row = GaussianKernelIntegrals(yblursigma, yblursize);

        int startOffset = -1 * int(row.size() / 2);

        for (int y = 0; y < destImage.m_height; ++y)
        {
            for (int x = 0; x < destImage.m_width; ++x)
            {
                std::array<float, 3> blurredPixel = { 0.0f, 0.0f, 0.0f };
                for (unsigned int i = 0; i < row.size(); ++i)
                {
                    const uint8 *pixel = GetPixelOrBlack(tmpImage, x, y + startOffset + i);
                    blurredPixel[0] += float(pixel[0]) * row[i];
                    blurredPixel[1] += float(pixel[1]) * row[i];
                    blurredPixel[2] += float(pixel[2]) * row[i];
                }

                uint8 *destPixel = &destImage.m_pixels[y * destImage.m_pitch + x * 3];

                destPixel[0] = uint8(blurredPixel[0]);
                destPixel[1] = uint8(blurredPixel[1]);
                destPixel[2] = uint8(blurredPixel[2]);
            }
        }
    }
}

int main (int argc, char **argv)
{
    float xblursigma, yblursigma;
 
    bool showUsage = argc < 5 ||
        (sscanf(argv[3], "%f", &xblursigma) != 1) ||
        (sscanf(argv[4], "%f", &yblursigma) != 1);
 
    char *srcFileName = argv[1];
    char *destFileName = argv[2];
 
    if (showUsage)
    {
        printf("Usage: <source> <dest> <xblur> <yblur>nBlur values are sigmann");
        WaitForEnter();
        return 1;
    }
    
    // calculate pixel sizes, and make sure they are odd 
    int xblursize = PixelsNeededForSigma(xblursigma) | 1;
    int yblursize = PixelsNeededForSigma(yblursigma) | 1;

    printf("Attempting to blur a 24 bit image.n");
    printf("  Source=%sn  Dest=%sn  blur=[%0.1f, %0.1f] px=[%d,%d]nn", srcFileName, destFileName, xblursigma, yblursigma, xblursize, yblursize);
 
    SImageData srcImage;
    if (LoadImage(srcFileName, srcImage))
    {
        printf("%s loadedn", srcFileName);
        SImageData destImage;
        BlurImage(srcImage, destImage, xblursigma, yblursigma, xblursize, yblursize);
        if (SaveImage(destFileName, destImage))
            printf("Blurred image saved as %sn", destFileName);
        else
        {
            printf("Could not save blurred image as %sn", destFileName);
            WaitForEnter();
            return 1;
        }
    }
    else
    {
        printf("could not read 24 bit bmp file %snn", srcFileName);
        WaitForEnter();
        return 1;
    }
    return 0;
}

Links

Here is a really great explanation of the Gaussian blur.
Gaussian Blur – Image processing for scientists and engineers, Part 4
I highly recommend reading the 6 part series about image processing (DSP) from the beginning because it’s really informative and very easy to read!
Images are data – Image processing for scientists and engineers, Part 1

The Gaussian Kernel
Gaussian Kernel Calculator
DSP Stack Exchange: Gaussian Blur – standard deviation, radius and kernel size
Wikipedia: Gaussian blur

If you want to take this from theory / hobby level up to pro level, give this link a read from intel:
Intel: An investigation of fast real-time GPU-based image blur algorithms

Box Blur

If you ever have heard the terms “Box Blur”, “Boxcar Function”, “Box Filter”, “Boxcar Integrator” or other various combinations of those words, you may have thought it was some advanced concept that is hard to understand and hard to implement. If that’s what you thought, prepare to be surprised!

A box filter is nothing more than taking N samples of data (or NxN samples of data, or NxNxN etc) and averaging them! Yes, that is all there is to it 😛

In this post, we are going to implement a box blur by averaging pixels.

1D Case

For the case of a 1d box filter, let’s say we wanted every data point to be the result of averaging it with it’s two neighbors. It’d be easy enough to program that by just doing it, but let’s look at it a different way. What weight would we need to multiply each of the three values by (the value and it’s two neighbors) to make it come up with the average?

Yep, you guessed it! For every data value, you multiply it and it’s neighbors by 1/3 to come up with the average value. We could easily increase the size of the filter to 5 pixels, and multiply each pixel by 1/5 instead. We could continue the pattern as high as we wanted.

One thing you might notice is that if we want a buffer with all the results, we can’t just alter the source data as we go, because we want the unaltered source values of the data to use those weights with, to get the correct results. Because of that, we need to make a second buffer to put the results of the filtering into.

Believe it or not, that diagram above is a convolution kernel, and how we talked about applying it is how you do convolution in 1d! It just so happens that this convolution kernel averages three pixels into one, which also happens to provide a low pass filter type effect.

Low pass filtering is what is done before down sampling audio data to prevent aliasing (frequencies higher than the sample rate can handle, which makes audio sound bad).

Surprise… blurring can also be seen as low pass filtering, which is something you can do before scaling an image down in size, to prevent aliasing.

2D Case

The 2d case isn’t much more difficult to understand than the 1d case. Instead of only averaging on one axis, we average on two instead:

Something interesting to note is that you can either use this 3×3 2d convolution kernel, or, you could apply the 1d convolution kernel described above on the X axis and then the Y axis. The methods are mathematically equivalent.

Using the 2d convolution kernel would result in 9 multiplications per pixel, but if going with the separated axis X and then Y 1d kernel, you’d only end up doing 6 multiplications per pixel (3 multiplications per axis). In general, if you have a seperable 2d convolution kernel (meaning that you can break it into a per axis 1d convolution), you will end up doing N^2 multiplications when using the 2d kernel, versus N*2 multiplications when using the 1d kernels. You can see that this would add up quickly in favor of using 1d kernels, but unfortunately not all kernels are separable.

Doing two passes does come at a cost though. Since you have to use a temporary buffer for each pass, you end up having to create two temporary buffers instead of one.

You can build 2d kernels from 1d kernels by multiplying them as a row vector, by a column vector. For instance, you can see how multiplying the (1/3,1/3,1/3) kernel by itself as a column vector would create the 2nd kernel, that is 3×3 and has 1/9 in every spot.

The resulting 3×3 matrix is called an outer product, or a tensor product. Something interesting to note is that you don’t have to do the same operation on each axis!

Examples

Here are some examples of box blurring with different values, using the sample code provided below.

The source image:

Now blurred by a 10×10 box car convolution kernel:

Now blurred by a 100×10 box car convolution kernel:

Shadertoy

You can find a shadertoy implementation of box blurring here: Shadertoy:DF Box Blur

Code

Here’s the code I used to blur the example images above:

#define _CRT_SECURE_NO_WARNINGS
  
#include <stdio.h>
#include <stdint.h>
#include <array>
#include <vector>
#include <functional>
#include <windows.h>  // for bitmap headers.  Sorry non windows people!
  
typedef uint8_t uint8;
 
const float c_pi = 3.14159265359f;
 
struct SImageData
{
    SImageData()
        : m_width(0)
        , m_height(0)
    { }
  
    long m_width;
    long m_height;
    long m_pitch;
    std::vector<uint8> m_pixels;
};
  
void WaitForEnter ()
{
    printf("Press Enter to quit");
    fflush(stdin);
    getchar();
}
  
bool LoadImage (const char *fileName, SImageData& imageData)
{
    // open the file if we can
    FILE *file;
    file = fopen(fileName, "rb");
    if (!file)
        return false;
  
    // read the headers if we can
    BITMAPFILEHEADER header;
    BITMAPINFOHEADER infoHeader;
    if (fread(&header, sizeof(header), 1, file) != 1 ||
        fread(&infoHeader, sizeof(infoHeader), 1, file) != 1 ||
        header.bfType != 0x4D42 || infoHeader.biBitCount != 24)
    {
        fclose(file);
        return false;
    }
  
    // read in our pixel data if we can. Note that it's in BGR order, and width is padded to the next power of 4
    imageData.m_pixels.resize(infoHeader.biSizeImage);
    fseek(file, header.bfOffBits, SEEK_SET);
    if (fread(&imageData.m_pixels[0], imageData.m_pixels.size(), 1, file) != 1)
    {
        fclose(file);
        return false;
    }
  
    imageData.m_width = infoHeader.biWidth;
    imageData.m_height = infoHeader.biHeight;
  
    imageData.m_pitch = imageData.m_width*3;
    if (imageData.m_pitch & 3)
    {
        imageData.m_pitch &= ~3;
        imageData.m_pitch += 4;
    }
  
    fclose(file);
    return true;
}
  
bool SaveImage (const char *fileName, const SImageData &image)
{
    // open the file if we can
    FILE *file;
    file = fopen(fileName, "wb");
    if (!file)
        return false;
  
    // make the header info
    BITMAPFILEHEADER header;
    BITMAPINFOHEADER infoHeader;
  
    header.bfType = 0x4D42;
    header.bfReserved1 = 0;
    header.bfReserved2 = 0;
    header.bfOffBits = 54;
  
    infoHeader.biSize = 40;
    infoHeader.biWidth = image.m_width;
    infoHeader.biHeight = image.m_height;
    infoHeader.biPlanes = 1;
    infoHeader.biBitCount = 24;
    infoHeader.biCompression = 0;
    infoHeader.biSizeImage = image.m_pixels.size();
    infoHeader.biXPelsPerMeter = 0;
    infoHeader.biYPelsPerMeter = 0;
    infoHeader.biClrUsed = 0;
    infoHeader.biClrImportant = 0;
  
    header.bfSize = infoHeader.biSizeImage + header.bfOffBits;
  
    // write the data and close the file
    fwrite(&header, sizeof(header), 1, file);
    fwrite(&infoHeader, sizeof(infoHeader), 1, file);
    fwrite(&image.m_pixels[0], infoHeader.biSizeImage, 1, file);
    fclose(file);
    return true;
}

const uint8* GetPixelOrBlack (const SImageData& image, int x, int y)
{
    static const uint8 black[3] = { 0, 0, 0 };
    if (x < 0 || x >= image.m_width ||
        y < 0 || y >= image.m_height)
    {
        return black;
    }
 
    return &image.m_pixels[(y * image.m_pitch) + x * 3];
}
 
void BlurImage (const SImageData& srcImage, SImageData &destImage, unsigned int xblur, unsigned int yblur)
{
    // allocate space for copying the image for destImage and tmpImage
    destImage.m_width = srcImage.m_width;
    destImage.m_height = srcImage.m_height;
    destImage.m_pitch = srcImage.m_pitch;
    destImage.m_pixels.resize(destImage.m_height * destImage.m_pitch);
 
    SImageData tmpImage;
    tmpImage.m_width = srcImage.m_width;
    tmpImage.m_height = srcImage.m_height;
    tmpImage.m_pitch = srcImage.m_pitch;
    tmpImage.m_pixels.resize(tmpImage.m_height * tmpImage.m_pitch);
 
    // horizontal blur from srcImage into tmpImage
    {
        float weight = 1.0f / float(xblur);
        int half = xblur / 2;
        for (int y = 0; y < tmpImage.m_height; ++y)
        {
            for (int x = 0; x < tmpImage.m_width; ++x)
            {
                std::array<float, 3> blurredPixel = { 0.0f, 0.0f, 0.0f };
                for (int i = -half; i <= half; ++i)
                {
                    const uint8 *pixel = GetPixelOrBlack(srcImage, x + i, y);
                    blurredPixel[0] += float(pixel[0]) * weight;
                    blurredPixel[1] += float(pixel[1]) * weight;
                    blurredPixel[2] += float(pixel[2]) * weight;
                }
                 
                uint8 *destPixel = &tmpImage.m_pixels[y * tmpImage.m_pitch + x * 3];
 
                destPixel[0] = uint8(blurredPixel[0]);
                destPixel[1] = uint8(blurredPixel[1]);
                destPixel[2] = uint8(blurredPixel[2]);
            }
        }
    }
 
    // vertical blur from tmpImage into destImage
    {
        float weight = 1.0f / float(yblur);
        int half = yblur / 2;
 
        for (int y = 0; y < destImage.m_height; ++y)
        {
            for (int x = 0; x < destImage.m_width; ++x)
            {
                std::array<float, 3> blurredPixel = { 0.0f, 0.0f, 0.0f };
                for (int i = -half; i <= half; ++i)
                {
                    const uint8 *pixel = GetPixelOrBlack(tmpImage, x, y + i);
                    blurredPixel[0] += float(pixel[0]) * weight;
                    blurredPixel[1] += float(pixel[1]) * weight;
                    blurredPixel[2] += float(pixel[2]) * weight;
                }
 
                uint8 *destPixel = &destImage.m_pixels[y * destImage.m_pitch + x * 3];
 
                destPixel[0] = uint8(blurredPixel[0]);
                destPixel[1] = uint8(blurredPixel[1]);
                destPixel[2] = uint8(blurredPixel[2]);
            }
        }
    }
}
 
int main (int argc, char **argv)
{
    int xblur, yblur;
  
    bool showUsage = argc < 5 ||
        (sscanf(argv[3], "%i", &xblur) != 1) ||
        (sscanf(argv[4], "%i", &yblur) != 1);
  
    char *srcFileName = argv[1];
    char *destFileName = argv[2];
  
    if (showUsage)
    {
        printf("Usage: <source> <dest> <xblur> <yblur>nn");
        WaitForEnter();
        return 1;
    }
     
    // make sure blur size is odd
    xblur = xblur | 1;
    yblur = yblur | 1;
 
    printf("Attempting to blur a 24 bit image.n");
    printf("  Source=%sn  Dest=%sn  blur=[%d,%d]nn", srcFileName, destFileName, xblur, yblur);
  
    SImageData srcImage;
    if (LoadImage(srcFileName, srcImage))
    {
        printf("%s loadedn", srcFileName);
        SImageData destImage;
        BlurImage(srcImage, destImage, xblur, yblur);
        if (SaveImage(destFileName, destImage))
            printf("Blurred image saved as %sn", destFileName);
        else
        {
            printf("Could not save blurred image as %sn", destFileName);
            WaitForEnter();
            return 1;
        }
    }
    else
    {
        printf("could not read 24 bit bmp file %snn", srcFileName);
        WaitForEnter();
        return 1;
    }
    return 0;
}

Next Up

Next up will be a Gaussian blur, and I’m nearly done w/ that post but wanted to make this one first as an introductory step!

Before we get there, I wanted to mention that if you do multiple box blurs in a row, it will start to approach Gaussian blurring. I’ve heard that three blurs in a row will make it basically indistinguishable from a Gaussian blur.