Low Tech Homomorphic Encryption

Homomorphic encryption is a special type of encryption that lets you do calculations on encrypted values as if they weren’t encrypted. One reason it’s desirable is that it would let secure computing be done in the cloud: an untrusted machine could operate on your data without ever seeing it, if practical homomorphic encryption were available.

Homomorphic encryption has been a hot research topic since 2009, when Craig Gentry figured out a way to do it while working on his PhD. Since then, people have been working on making it better, faster and more efficient.

You can read more about a basic incarnation of his ideas in my blog posts:
Super Simple Symmetric Leveled Homomorphic Encryption Implementation
Improving the Security of the Super Simple Symmetric Leveled Homomorphic Encryption Implementation

This post is about a low tech type of homomorphic encryption that anyone can easily do and understand. There is also some very simple C++ that implements it.

This idea may very well already be publicly known, but I’m not aware of any places that talk about it. I may just be ignorant of them though so ::shrug::

Quick Update

I’ve gotten some feedback on this article, the most common feedback being that this is obfuscation, not encryption. I think that’s a fair assessment, as the secret value you are trying to protect is in no way transformed, but is just hidden. This post could easily be titled Homomorphic Obfuscation, and perhaps should be.

To see other feedback and responses to this post, check out the reddit links at the bottom!

The Idea

The idea is actually super simple:

  1. Take the value you want to encrypt.
  2. Hide it in a list of a bunch of other random values, and remember where it is in the list. The position in the list is your key.
  3. Send this list to an untrusted party.
  4. They do the same calculation on every item in the list and send it back.
  5. Since you know which value was your secret value, you know which answer is the one you care about.

At the end of that process, you have the resulting value, and they have no idea what value was your secret value. You have done, by definition, homomorphic encryption!

There is a caveat of course… they know that your secret value was ONE of the values on the list.
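
Here is a minimal in-memory sketch of those steps, just to make them concrete (the full file-based sample code is further down; the specific numbers and list size are purely for illustration):

#include <cstdio>
#include <random>
#include <vector>

int main ()
{
    // 1. The value we want to "encrypt", and a list of random values to hide it in.
    const float secret = 3.0f;
    const size_t listSize = 1000;

    std::mt19937 rng{ std::random_device{}() };
    std::uniform_real_distribution<float> valueDist(0.0f, 100.0f);
    std::vector<float> list(listSize);
    for (float& v : list)
        v = valueDist(rng);

    // 2. Hide the secret at a random index. That index is the key.
    std::uniform_int_distribution<size_t> indexDist(0, listSize - 1);
    const size_t key = indexDist(rng);
    list[key] = secret;

    // 3 & 4. The untrusted party does the same calculation on every item in the list.
    for (float& v : list)
        v = v * 2.0f + 1.0f;

    // 5. We know which index held our secret, so that entry is our answer.
    printf("f(secret) = %f\n", list[key]); // prints 7.000000
    return 0;
}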

Security Details

The thing here is that security is a sliding scale between resource usage (computation time, RAM, network bandwidth, etc) and security.

The list size is your security parameter in this case.

A larger list of random values means the data takes longer to transfer, needs more memory to store, and takes longer to do the homomorphic computations on, but the untrusted party is less sure about which value is your secret value.

On the other hand, a shorter list is faster to transmit, easier to store, and quicker to compute with, but the untrusted party has a better idea of which value is your secret value.

For maximal security you can just take this to the extreme – if your secret value is a 32 bit floating point number, you could make a list with all possible 2^32 floating point numbers in it, have them do the calculation and send it back. You can even do an optimization here and not even generate or send the list, but rather just have the person doing the calculations generate the full 2^32 float list, do the calculations, and send you the results.

That gets pretty big pretty fast though. That list would actually be 16 gigabytes (2^32 values at 4 bytes each), but the untrusted party would know almost nothing about your value, other than that it can be represented by a 32 bit floating point number.

Depending on your security needs, you might be ok with shortening your list a bit to bring that number down. If you make your list only one million numbers long (999,999 random numbers plus the one value you actually care about), it’s only 3.8 megabytes (one million floats at 4 bytes each).

Not quite as bad.

Some Interesting Abilities

Like other homomorphic encryption schemes, this one lets you do computations involving multiple encrypted values. For example, you could multiply two encrypted values together. To do this, you need to encrypt all of the values involved with the same key. In other words, they have to sit at the same index in each of their respective lists of random numbers.

Something else that is interesting is that you can also encode MULTIPLE secret values in your encrypted value list. You could have 1 secret value at index 50 and another at index 100 for instance. Doing this, you get a sort of homomorphic SIMD setup.

Homomorphic SIMD is actually a real thing in other homomorphic encryption methods as well. Check out this paper for instance:
Fully Homomorphic SIMD Operations

The only problem with homomorphic SIMD is that adding more secret values to the same encrypted list decreases the security, since there are more values in the list that you don’t want other people to know about.

You can of course also modify encrypted values by unencrypted values. You could multiply an encrypted value by 3, by multiplying every value in the list by 3.
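
As a quick sketch of both of those ideas (the function and variable names are just illustrative), combining two lists that were encrypted with the same key, and mixing in an unencrypted constant, could look like this:

#include <cstddef>
#include <vector>

// listA and listB must have been encrypted with the same key, meaning their
// secret values sit at the same index.  The untrusted party multiplies the
// lists element by element and also scales by the plaintext constant 3.
// The key holder then reads result[key] to get (secretA * secretB) * 3.
std::vector<float> CombineAndScale (const std::vector<float>& listA, const std::vector<float>& listB)
{
    std::vector<float> result(listA.size());
    for (size_t i = 0; i < listA.size(); ++i)
        result[i] = (listA[i] * listB[i]) * 3.0f;
    return result;
}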

Extending to Public Key Cryptography

If you wanted to use asymmetric key cryptography (public/private keys) instead of symmetric key cryptography, that is doable here too.

What you would do is have the public key public as per usual, and that key would be used in a public key algorithm to encrypt the index of the secret value in the random list.

Doing this, the person who has the private key would be able to receive the list and encrypted index, decrypt the index, and then get the secret value out using that index.
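
Here is a sketch of that flow. The PublicKeyEncrypt / PrivateKeyDecrypt functions below are hypothetical placeholders for whatever real asymmetric scheme and crypto library you’d actually use; they are not real API calls.

#include <cstdint>
#include <vector>

// Hypothetical stand-ins for a real asymmetric encryption scheme.
struct PublicKey {};
struct PrivateKey {};
std::vector<uint8_t> PublicKeyEncrypt (const PublicKey& pub, uint64_t value);
uint64_t PrivateKeyDecrypt (const PrivateKey& priv, const std::vector<uint8_t>& ciphertext);

// Sender side: hide the secret in the list as before, then encrypt the index
// (the key) with the receiver's public key and send both along.
std::vector<uint8_t> EncryptIndex (const PublicKey& receiverPublicKey, uint64_t secretIndex)
{
    return PublicKeyEncrypt(receiverPublicKey, secretIndex);
}

// Receiver side: decrypt the index with the private key and read the value
// out of the (possibly transformed) list at that position.
template <typename T>
T RecoverSecret (const PrivateKey& privateKey, const std::vector<uint8_t>& encryptedIndex, const std::vector<T>& list)
{
    return list[PrivateKeyDecrypt(privateKey, encryptedIndex)];
}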

Sample Code Tests

The sample code only does symmetric key encryption, and runs these three tests:

  1. Encrypts two floating point numbers into a single list, SIMD style, does an operation on the encrypted values, then unencrypts and verifies the results.
  2. Does the same with two sets of floats (three floats in each set), to show how you can make encrypted values interact with each other. Does the operation, then unencrypts and verifies the results.
  3. Encrypts three values of a 3 byte structure, does an operation on the encrypted values, then unencrypts and verifies the results.

All secret data was hidden in lists of 10,000,000 random values. That made the first two tests (the ones done with 4 byte floats) have encrypted files of 38.1MB (40,000,000 bytes), and the last test (the one done with a 3 byte struct) had a file size of 28.6 MB (30,000,000 bytes).

Here are the timings of the above tests:

Sample Code

Here is the header, LTHE.h:

/*

Written by Alan Wolfe
http://blog.demofox.org


*/

#pragma once
#include <vector>
#include <random>
#include <algorithm> // for std::find
#include <cstdint>   // for uint32_t, uint8_t
#include <cstdio>    // for FILE, fopen, fprintf
#include <limits>    // for std::numeric_limits

// A static class with template functions in it.
// A namespace would be nice, except I want to hide some things as private.
class LTHE
{
public:

    //=================================================================================
    template <typename T>
    static bool Encrypt (std::vector<T> values, size_t listSize, const char* fileName, std::vector<size_t>& keys, bool generateKeys = true)
    {
        // Make sure we have a list that is at least as long as the values we want to encrypt
        if (values.size() > listSize)
        {
            fprintf(stderr, "ERROR in " __FUNCTION__ "(): values.size() > listSize.n");
            return false;
        }

        // Generate a list of keys if we are told to
        // Ideally you want to take the first M items of a cryptographically secure shuffle
        // of N items.
        // This could be done with format preserving encryption or some other method
        // to make it not roll and check, and also more secure random.
        if (generateKeys)
        {
            keys.clear();
            for (size_t i = 0, c = values.size(); i < c; ++i)
            {
                size_t newKey;
                do
                {
                    newKey = RandomInt<size_t>(0, listSize - 1);
                }
                while (std::find(keys.begin(), keys.end(), newKey) != keys.end());
                keys.push_back(newKey);
            }
        }

        // make a file of random values, size of T, count of <listSize> 
        FILE *file = fopen(fileName, "w+b");
        if (!file)
        {
            fprintf(stderr, "ERROR in " __FUNCTION__ "(): Could not open %s for writing.n", fileName);
            return false;
        }

        // Note: this may not be the most efficient way to generate this much random data or 
        // write it all to the file.
        // In a real crypto usage case, you'd want a crypto secure random number generator.
        // You'd also want to make sure the random numbers had the same properties as your
        // input values to help anonymize them better.
        // Like if your numbers are not whole numbers, you don't want to generate only whole numbers.
        // Or if your numbers are salaries, you may not want purely random values, but more "salaryish"
        // looking numbers.
        // You could alternately just do all 2^N possible values which would definitely anonymize
        // the values you wanted to encrypt.  This is maximum security, but also takes most
        // memory and most processing time.
        size_t numUint32s = (listSize * sizeof(T)) / sizeof(uint32_t);
        size_t numExtraBytes = (listSize * sizeof(T)) % sizeof(uint32_t);
        for (size_t i = 0; i < numUint32s; ++i)
        {
            uint32_t value = RandomInt<uint32_t>();
            if (fwrite(&value, sizeof(value), 1, file) != 1)
            {
                fprintf(stderr, "ERROR in " __FUNCTION__ "(): Could not write random numbers (uint32s).n");
                fclose(file);
                return false;
            }
        }
        for (size_t i = 0; i < numExtraBytes; ++i)
        {
            // Note: uint8_t isn't a valid type for a standard uniform int distribution, so generate a uint32_t and truncate it.
            uint8_t value = (uint8_t)RandomInt<uint32_t>();
            if (fwrite(&value, sizeof(value), 1, file) != 1)
            {
                fprintf(stderr, "ERROR in " __FUNCTION__ "(): Could not write random numbers (extra bytes).n");
                fclose(file);
                return false;
            }
        }

        // Now put the values in the file where they go, based on their key
        for (size_t i = 0, c = values.size(); i < c; ++i)
        {
            long pos = (long)(keys[i] * sizeof(T));
            if (fseek(file, pos, SEEK_SET) != 0)
            {
                fprintf(stderr, "ERROR in " __FUNCTION__ "(): Could not fseek.n");
                fclose(file);
                return false;
            }
            if (fwrite(&values[i], sizeof(values[i]), 1, file) != 1)
            {
                fprintf(stderr, "ERROR in " __FUNCTION__ "(): Could not write secret value.n");
                fclose(file);
                return false;
            }
        }

        // close file and return success
        fclose(file);
        return true;
    }

    //=================================================================================
    template <typename T, typename LAMBDA>
    static bool TransformHomomorphically (const char* srcFileName, const char* destFileName, const LAMBDA& function)
    {
        // open the source and dest file if we can
        FILE *srcFile = fopen(srcFileName, "rb");
        if (!srcFile)
        {
            fprintf(stderr, "ERROR in " __FUNCTION__ "(): Could not open %s for reading.n", srcFileName);
            return false;
        }
        FILE *destFile = fopen(destFileName, "w+b");
        if (!destFile)
        {
            fprintf(stderr, "ERROR in " __FUNCTION__ "(): Could not open %s for writing.n", destFileName);
            fclose(srcFile);
            return false;
        }

        // Process the data in the file and write it back out.
        // This could be done much better.
        // We could read more from the file at once.
        // We could use SIMD.
        // We could go multithreaded.
        // We could do this on the GPU for large data sets and longer transformations! Assuming data transfer time isn't too prohibitive.
        // We could decouple the disk access from processing, so it was reading and writing while it was processing.
        const size_t c_bufferSize = 1024;
        std::vector<T> dataBuffer;
        dataBuffer.resize(c_bufferSize);
        size_t elementsRead;
        do
        {
            // read data from the source file
            elementsRead = fread(&dataBuffer[0], sizeof(T), c_bufferSize, srcFile);

            // transform the data
            for (size_t i = 0; i < elementsRead; ++i)
                dataBuffer[i] = function(dataBuffer[i]);

            // write the transformed data to the dest file
            if (fwrite(&dataBuffer[0], sizeof(T), elementsRead, destFile) != elementsRead)
            {
                fprintf(stderr, "ERROR in " __FUNCTION__ "(): Could not write transformed elements.n");
                fclose(srcFile);
                fclose(destFile);
                return false;
            }
        }
        while (!feof(srcFile));

        // close files and return success
        fclose(srcFile);
        fclose(destFile);
        return true;
    }

    //=================================================================================
    template <typename T, typename LAMBDA>
    static bool TransformHomomorphically (const char* src1FileName, const char* src2FileName, const char* destFileName, const LAMBDA& function)
    {
        // open the source and dest file if we can
        FILE *srcFile1 = fopen(src1FileName, "rb");
        if (!srcFile1)
        {
            fprintf(stderr, "ERROR in " __FUNCTION__ "(): Could not open %s for reading.n", src1FileName);
            return false;
        }
        FILE *srcFile2 = fopen(src2FileName, "rb");
        if (!srcFile2)
        {
            fprintf(stderr, "ERROR in " __FUNCTION__ "(): Could not open %s for reading.n", src2FileName);
            fclose(srcFile1);
            return false;
        }
        FILE *destFile = fopen(destFileName, "w+b");
        if (!destFile)
        {
            fprintf(stderr, "ERROR in " __FUNCTION__ "(): Could not open %s for writing.n", destFileName);
            fclose(srcFile1);
            fclose(srcFile2);
            return false;
        }

        // Process the data in the file and write it back out.
        // This could be done much better.
        // We could read more from the file at once.
        // We could use SIMD.
        // We could go multithreaded.
        // We could do this on the GPU for large data sets and longer transformations! Assuming data transfer time isn't too prohibitive.
        // We could decouple the disk access from processing, so it was reading and writing while it was processing.
        const size_t c_bufferSize = 1024;
        std::vector<T> dataBuffer1, dataBuffer2;
        dataBuffer1.resize(c_bufferSize);
        dataBuffer2.resize(c_bufferSize);
        size_t elementsRead1;
        size_t elementsRead2;
        do
        {
            // read data from the source files
            elementsRead1 = fread(&dataBuffer1[0], sizeof(T), c_bufferSize, srcFile1);
            elementsRead2 = fread(&dataBuffer2[0], sizeof(T), c_bufferSize, srcFile2);

            if (elementsRead1 != elementsRead2)
            {
                fprintf(stderr, "ERROR in " __FUNCTION__ "(): Different numbers of elements in each file!n");
                fclose(srcFile1);
                fclose(srcFile2);
                fclose(destFile);
                return false;
            }

            // transform the data
            for (size_t i = 0; i < elementsRead1; ++i)
                dataBuffer1[i] = function(dataBuffer1[i], dataBuffer2[i]);

            // write the transformed data to the dest file
            if (fwrite(&dataBuffer1[0], sizeof(T), elementsRead1, destFile) != elementsRead1)
            {
                fprintf(stderr, "ERROR in " __FUNCTION__ "(): Could not write transformed elements.n");
                fclose(srcFile1);
                fclose(srcFile2);
                fclose(destFile);
                return false;
            }
        }
        while (!feof(srcFile1));

        // close files and return success
        fclose(srcFile1);
        fclose(srcFile2);
        fclose(destFile);
        return true;
    }

    //=================================================================================
    template <typename T>
    static bool Decrypt (const char* fileName, std::vector<T>& values, std::vector<size_t>& keys)
    {
        // Open the file if we can
        FILE *file = fopen(fileName, "rb");
        if (!file)
        {
            fprintf(stderr, "ERROR in " __FUNCTION__ "(): Could not open %s for reading.n", fileName);
            return false;
        }

        // Read the values from the file.  The key is their location in the file.
        values.clear();
        for (size_t i = 0, c = keys.size(); i < c; ++i)
        {
            long pos = (long)(keys[i] * sizeof(T));
            if (fseek(file, pos, SEEK_SET) != 0)
            {
                fprintf(stderr, "ERROR in " __FUNCTION__ "(): Could not fseek.n");
                fclose(file);
                return false;
            }
            T value;
            if (!fread(&value, sizeof(T), 1, file))
            {
                fprintf(stderr, "ERROR in " __FUNCTION__ "(): Could not decrypt value for key.n");
                fclose(file);
                return false;
            }
            values.push_back(value);
        }

        // Close file and return success
        fclose(file);
        return true;
    }

private:
    template <typename T>
    static T RandomInt (T min = std::numeric_limits<T>::min(), T max = std::numeric_limits<T>::max())
    {
        static std::random_device rd;
        static std::mt19937 mt(rd());
        static std::uniform_int_distribution<T> dist(min, max);
        return dist(mt);
    }
};

And here is the test program, main.cpp:

#include <stdio.h>
#include "LTHE.h"
#include <chrono>
#include <math.h>   // for sqrt
#include <stdlib.h> // for system

//=================================================================================
// times a block of code
struct SBlockTimer
{
    SBlockTimer()
    {
        m_start = std::chrono::high_resolution_clock::now();
    }

    ~SBlockTimer()
    {
        std::chrono::duration<float> seconds = std::chrono::high_resolution_clock::now() - m_start;
        printf("    %0.2f secondsn", seconds.count());
    }

    std::chrono::high_resolution_clock::time_point m_start;
};

//=================================================================================
float TransformDataUnitary (float& value)
{
    return (float)sqrt(value * 2.17f + 0.132);
}

//=================================================================================
float TransformDataBinary (float& value1, float value2)
{
    return (float)sqrt(value1 * value1 + value2 * value2);
}

//=================================================================================
struct SStruct
{
    uint8_t x, y, z;

    static SStruct Transform (const SStruct& b)
    {
        SStruct ret;
        ret.x = b.x * 2;
        ret.y = b.y * 3;
        ret.z = b.z * 4;
        return ret;
    }

    bool operator != (const SStruct& b) const
    {
        return b.x != x || b.y != y || b.z != z;
    }
};

//=================================================================================
int Test_FloatUnitaryOperation ()
{
    printf("n----- " __FUNCTION__ " -----n");

    // Encrypt the data
    printf("Encrypting data:  ");
    std::vector<float> secretValues = { 3.14159265359f, 435.0f };
    std::vector<size_t> keys;
    {
        SBlockTimer timer;
        if (!LTHE::Encrypt(secretValues, 10000000, "Encrypted.dat", keys))
        {
            fprintf(stderr, "Could not encrypt data.n");
            return -1;
        }
    }

    // Transform the data
    printf("Transforming data:");
    {
        SBlockTimer timer;
        if (!LTHE::TransformHomomorphically<float>("Encrypted.dat", "Transformed.dat", TransformDataUnitary))
        {
            fprintf(stderr, "Could not transform encrypt data.n");
            return -2;
        }
    }

    // Decrypt the data
    printf("Decrypting data:  ");
    std::vector<float> decryptedValues;
    {
        SBlockTimer timer;
        if (!LTHE::Decrypt("Transformed.dat", decryptedValues, keys))
        {
            fprintf(stderr, "Could not decrypt data.n");
            return -3;
        }
    }

    // Verify the data
    printf("Verifying data:   ");
    {
        SBlockTimer timer;
        for (size_t i = 0, c = secretValues.size(); i < c; ++i)
        {
            if (TransformDataUnitary(secretValues[i]) != decryptedValues[i])
            {
                fprintf(stderr, "decrypted value mismatch!n");
                return -4;
            }
        }
    }

    return 0;
}

//=================================================================================
int Test_FloatBinaryOperation ()
{
    printf("n----- " __FUNCTION__ " -----n");

    // Encrypt the data
    printf("Encrypting data:  ");
    std::vector<float> secretValues1 = { 3.14159265359f, 435.0f, 1.0f };
    std::vector<float> secretValues2 = { 1.0f, 5.0f, 9.0f };
    std::vector<size_t> keys;
    {
        SBlockTimer timer;
        if (!LTHE::Encrypt(secretValues1, 10000000, "Encrypted1.dat", keys))
        {
            fprintf(stderr, "Could not encrypt data.n");
            return -1;
        }
        if (!LTHE::Encrypt(secretValues2, 10000000, "Encrypted2.dat", keys, false)) // reuse the keys made for secretValues1
        {
            fprintf(stderr, "Could not encrypt data.n");
            return -1;
        }
    }

    // Transform the data
    printf("Transforming data:");
    {
        SBlockTimer timer;
        if (!LTHE::TransformHomomorphically<float>("Encrypted1.dat", "Encrypted2.dat", "Transformed.dat", TransformDataBinary))
        {
            fprintf(stderr, "Could not transform encrypt data.n");
            return -2;
        }
    }

    // Decrypt the data
    printf("Decrypting data:  ");
    std::vector<float> decryptedValues;
    {
        SBlockTimer timer;
        if (!LTHE::Decrypt("Transformed.dat", decryptedValues, keys))
        {
            fprintf(stderr, "Could not decrypt data.n");
            return -3;
        }
    }

    // Verify the data
    printf("Verifying data:   ");
    {
        SBlockTimer timer;
        for (size_t i = 0, c = secretValues1.size(); i < c; ++i)
        {
            if (TransformDataBinary(secretValues1[i], secretValues2[i]) != decryptedValues[i])
            {
                fprintf(stderr, "decrypted value mismatch!n");
                return -4;
            }
        }
    }

    return 0;
}

//=================================================================================
int Test_StructUnitaryOperation ()
{
    printf("n----- " __FUNCTION__ " -----n");

    // Encrypt the data
    printf("Encrypting data:  ");
    std::vector<SStruct> secretValues = { {0,1,2},{ 3,4,5 },{ 6,7,8 } };
    std::vector<size_t> keys;
    {
        SBlockTimer timer;
        if (!LTHE::Encrypt(secretValues, 10000000, "Encrypted.dat", keys))
        {
            fprintf(stderr, "Could not encrypt data.n");
            return -1;
        }
    }

    // Transform the data
    printf("Transforming data:");
    {
        SBlockTimer timer;
        if (!LTHE::TransformHomomorphically<SStruct>("Encrypted.dat", "Transformed.dat", SStruct::Transform))
        {
            fprintf(stderr, "Could not transform encrypt data.n");
            return -2;
        }
    }

    // Decrypt the data
    printf("Decrypting data:  ");
    std::vector<SStruct> decryptedValues;
    {
        SBlockTimer timer;
        if (!LTHE::Decrypt("Transformed.dat", decryptedValues, keys))
        {
            fprintf(stderr, "Could not decrypt data.n");
            return -3;
        }
    }

    // Verify the data
    printf("Verifying data:   ");
    {
        SBlockTimer timer;
        for (size_t i = 0, c = secretValues.size(); i < c; ++i)
        {
            if (SStruct::Transform(secretValues[i]) != decryptedValues[i])
            {
                fprintf(stderr, "decrypted value mismatch!n");
                return -4;
            }
        }
    }

    return 0;
}

//=================================================================================
int main (int argc, char **argv)
{
    // test doing an operation on a single encrypted float
    int ret = Test_FloatUnitaryOperation();
    if (ret != 0)
    {
        system("pause");
        return ret;
    }

    // test doing an operation on two encrypted floats
    ret = Test_FloatBinaryOperation();
    if (ret != 0)
    {
        system("pause");
        return ret;
    }

    // test doing an operation on a single 3 byte struct
    ret = Test_StructUnitaryOperation();
    if (ret != 0)
    {
        system("pause");
        return ret;
    }
    
    printf("nAll Tests Passed!nn");
    system("pause");
    return 0;
}

If you found this post interesting or useful, or you have anything to add or talk about, let me know!

Reddit discussion:
r/programming
r/cryptography

Shamir’s Quest: Collect Any 3 Keys To Unlock The Secret!

This post is on something called Shamir’s Secret Sharing. It’s a technique where you can break a secret number up into M different pieces, where if you have any N of those M pieces, you are able to figure out the secret.

Thinking of it in video game terms, imagine there are 10 keys hidden in a level, but you can escape the level whenever you find any 7 of them. This is what Shamir’s Secret Sharing enables you to set up cryptographically.

Interestingly in this case, the term sharing in “secret sharing” doesn’t mean sharing the secret with others. It means breaking the secret up into pieces, or SHARES. Secret sharing means that you make shares out of a secret, such that if you have enough of the shares, you can recover the secret.

How Do You Share (Split) The Secret?

The basic idea of how it works is actually really simple. This is good for us trying to learn the technique, and it also makes the security easier to see, since there are so few moving parts.

It relies on something called the Unisolvence Theorem, which is a fancy label meaning these things:

  • If you have a linear equation, it takes two (x,y) points to uniquely identify that line. No matter how you write a linear equation, if it passes through those same two points, it’s mathematically equivalent.
  • If you have a quadratic equation, it takes three (x,y) points to uniquely identify that quadratic curve. Again, no matter how you write a quadratic equation, if it passes through those same three points, it’s mathematically equivalent.
  • The pattern continues for equations of any degree. Cubic equations require four points to be uniquely identified, quartic equations require five points, and so on.

At a high level, how this technique works is that the number of shares (keys) you want someone to collect (N ) defines the degree of an equation.

You use random numbers as the coefficients of the powers of x in that equation, but use your secret number as the constant term.

You then create M data points of the form (x,y) aka (x,f(x)) . Those are your shares. You then give individual shares to people, or go hide them in your dungeon or do whatever you are going to do with them.

As soon as any one person has N of those M shares (data points), they will be able to figure out the equation of the curve and thus get the secret.

The secret number is the constant term of the polynomial, which is also just f(0) .

This image below from wikipedia is great for seeing how you may have two points of a quadratic curve, but without a third point you can’t be sure which quadratic equation it is. In fact, there are an infinite number of quadratic curves that pass through any two points! Because of that, it takes the full number of required shares for you to be able to unlock the secret.

Example: Sharing (Splitting) The Secret

First you decide how many shares you want it to take to unlock the secret. This determines the degree of your equation.

Let’s say you want someone to need four shares to unlock the secret. This means our equation will be a cubic equation, since it takes four points to uniquely define a cubic equation.

Our equation is:

f(x) = R_1x^3 + R_2x^2 + R_3x + S

Where the R_i values are random numbers, and S is the secret value.

Let’s say that our secret value is 435, and that we picked some random numbers for the equation, making the below:

f(x) = 28x^3 + 64x^2 + 9x + 435

We now have a function that is uniquely identifiable by any 4 points of data on its curve.

Next we decide how many pieces we are going to create total. We need at least 4 so that it is in fact solvable. Let’s make 6 shares.

To do this, you just plug in 6 different values of x and pair each x value with its y value. Let’s do that:

\begin{array}{c|c} x & f(x) \\ \hline 1 & 536 \\ 2 & 933 \\ 3 & 1794 \\ 4 & 3287 \\ 5 & 5580 \\ 6 & 8841 \\ \end{array}

When doing this part, remember that the secret number is f(0) , so make sure not to share the value of the function when x is 0!

You could then distribute the shares (data pairs) as you saw fit. Maybe some people are more important, so you give them more than one share, requiring a smaller amount of cooperation with them to unlock the secret.

Share distribution details are totally up to you, but we now have our shares, whereby if you have any 4 of the 6 total shares, you can unlock the secret.

How Do You Join The Secret?

Once you have the right number of shares and you know the degree of the polynomial (pre-shared “public” information), unlocking the secret is a pretty straightforward process too. To unlock the secret, you just need to use ANY method available for creating an equation of the correct degree from a set of data points.

This can be one of several different interpolation techniques, but the most common one to use seems to be Lagrange interpolation, which is something I previously wrote up that you can read about here: Lagrange Interpolation.

Once you have the equation, you can either evaluate f(0) , or you can write the equation in polynomial form and the constant term will be the secret value.
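
Concretely, since all we need is f(0) , the Lagrange form collapses down to a single sum over the N shares (x_j, y_j) that were collected:

f(0) = \sum_{j} y_j \prod_{m \neq j} \frac{x_m}{x_m - x_j}

This is what the JoinShares function in the sample code below computes, with every multiplication and division done modulo a prime (division becomes multiplication by a modular inverse).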

Example: Joining the Secret

Let’s say that we have these four shares and are ready to get the cubic function and then unlock the secret number:

\begin{array}{c|c} x & y \\ \hline 1 & 536 \\ 2 & 933 \\ 4 & 3287 \\ 6 & 8841 \\ \end{array}

We could bust out some Lagrange interpolation and figure this out, but let’s be lazy… err efficient I mean. Wolfram alpha can do this for us!

Wolfram Alpha: cubic fit (1, 536), (2, 933), (4, 3287), (6, 8841)

That gives us this equation, saying that it is a perfect fit (which it is!)
28x^3 + 64x^2 + 9x + 435

You can see that our constant term (and f(0) ) is the correct secret value of 435.

Daaaayummm Bru… that is lit AF! We just got hacked by wolfram alpha 😛

A Small Complication

Unfortunately, the above has a weakness. The weakness is that each share you get gives you a little bit more information about the secret value. You can read more about this in the links section at the end if you want to know more details.

Ideally, you wouldn’t have any information about the secret value until you had the full number of shares required to unlock the secret.

To address this problem, we are going to choose some prime number k and instead of shares being (x,y) data points on the curve, they are going to be (x,y \bmod k) . In technical terms we are going to be using points on a finite field, or a Galois field.

The value we choose for k needs to be larger than any of the coefficients of our terms (the random numbers), larger than our secret value, and larger than the number of shares we want to create. Beyond that, the larger the better, because a larger k value means a larger “brute force” space to search.
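
For example, with the cubic from earlier and k = 439 (the prime used in the sample code below), the share at x = 1 would become (1, 536 \bmod 439) = (1, 97) instead of (1, 536) .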

If you want to use this technique in a situation which has real needs for security, please make sure and read more on this technique from more authoritative sources. I’m glossing over the details of security quite a bit, and just trying to give an intuitive understanding of this technique (:

Source Code

Below is some sample source code that implements Shamir’s Secret Sharing in C++.

I use 64 bit integers, but if you were going to be using this in a realistic situation you could very well overflow 64 bit ints and get the wrong answers. I hit this problem for instance when trying to require more than about 10 shares, using a prime of 257, and generating 50 shares. If you hit the limit of 64 bit ints you can use a multi precision math library instead to have virtually unlimited sized ints. The boost multiprecision header library is a decent choice for multi precision integers, specifically cpp_int.
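
For instance, assuming you have boost available, one way to get arbitrary precision is to swap the TINT typedef in the code below for cpp_int. Note that you’d also have to adjust the printf formatting, the size_t casts, and the random number generation, which all assume a plain 64 bit integer:

#include <boost/multiprecision/cpp_int.hpp>

// arbitrary precision signed integer, instead of int64_t
typedef boost::multiprecision::cpp_int TINT;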

#include <stdio.h>
#include <array>
#include <vector>
#include <math.h>
#include <random>
#include <algorithm>  // for std::generate_n, std::shuffle
#include <functional> // for std::ref
#include <assert.h>
#include <stdint.h>
#include <inttypes.h>

typedef int64_t TINT;
typedef std::array<TINT, 2> TShare;
typedef std::vector<TShare> TShares;

class CShamirSecretSharing
{
public:
    CShamirSecretSharing (size_t sharesNeeded, TINT prime)
        : c_sharesNeeded(sharesNeeded), c_prime(prime)
    {
        // There needs to be at least 1 share needed
        assert(sharesNeeded > 0);
    }

    // Generate N shares for a secretNumber
    TShares GenerateShares (TINT secretNumber, TINT numShares) const
    {
        // calculate our curve coefficients
        std::vector<TINT> coefficients;
        {
            // store the secret number as the first coefficient;
            coefficients.resize((size_t)c_sharesNeeded);
            coefficients[0] = secretNumber;

            // randomize the rest of the coefficients
            std::array<int, std::mt19937::state_size> seed_data;
            std::random_device r;
            std::generate_n(seed_data.data(), seed_data.size(), std::ref(r));
            std::seed_seq seq(std::begin(seed_data), std::end(seed_data));
            std::mt19937 gen(seq);
            std::uniform_int_distribution<TINT> dis(1, c_prime - 1);
            for (TINT i = 1; i < c_sharesNeeded; ++i)
                coefficients[(size_t)i] = dis(gen);
        }

        // generate the shares
        TShares shares;
        shares.resize((size_t)numShares);
        for (size_t i = 0; i < numShares; ++i)
            shares[i] = GenerateShare(i + 1, coefficients);
        return shares;
    }

    // use lagrange polynomials to find f(0) of the curve, which is the secret number
    TINT JoinShares (const TShares& shares) const
    {
        // make sure there are at least the minimum number of shares
        assert(shares.size() >= size_t(c_sharesNeeded));

        // Sigma summation loop
        TINT sum = 0;
        for (TINT j = 0; j < c_sharesNeeded; ++j)
        {
            TINT y_j = shares[(size_t)j][1];

            TINT numerator = 1;
            TINT denominator = 1;

            // Pi product loop
            for (TINT m = 0; m < c_sharesNeeded; ++m)
            {
                if (m == j)
                    continue;

                numerator = (numerator * shares[(size_t)m][0]) % c_prime;
                denominator = (denominator * (shares[(size_t)m][0] - shares[(size_t)j][0])) % c_prime;
            }

            sum = (c_prime + sum + y_j * numerator * modInverse(denominator, c_prime)) % c_prime;
        }
        return sum;
    }

    const TINT GetPrime () const { return c_prime; }
    const TINT GetSharesNeeded () const { return c_sharesNeeded; }

private:

    // Generate a single share in the form of (x, f(x))
    TShare GenerateShare (TINT x, const std::vector<TINT>& coefficients) const
    {
        TINT xpow = x;
        TINT y = coefficients[0];
        for (TINT i = 1; i < c_sharesNeeded; ++i) {
            y += coefficients[(size_t)i] * xpow;
            xpow *= x;
        }
        return{ x, y % c_prime };
    }

    // Gives the decomposition of the gcd of a and b.  Returns [x,y,z] such that x = gcd(a,b) and y*a + z*b = x
    static const std::array<TINT, 3> gcdD (TINT a, TINT b) {
        if (b == 0)
            return{ a, 1, 0 };

        const TINT n = a / b;
        const TINT c = a % b;
        const std::array<TINT, 3> r = gcdD(b, c);

        return{ r[0], r[2], r[1] - r[2] * n };
    }

    // Gives the multiplicative inverse of k mod prime.  In other words (k * modInverse(k)) % prime = 1 for all prime > k >= 1 
    static TINT modInverse (TINT k, TINT prime) {
        k = k % prime;
        TINT r = (k < 0) ? -gcdD(prime, -k)[2] : gcdD(prime, k)[2];
        return (prime + r) % prime;
    }

private:
    
    // Publicly known information
    const TINT          c_prime;
    const TINT          c_sharesNeeded;
};

void WaitForEnter ()
{
    printf("Press Enter to quit");
    fflush(stdin);
    getchar();
}

int main (int argc, char **argv)
{
    // Parameters
    const TINT c_secretNumber = 435;
    const TINT c_sharesNeeded = 7;
    const TINT c_sharesMade = 50;
    const TINT c_prime = 439;   // must be a prime number larger than the other three numbers above

    // set up a secret sharing object with the public information
    CShamirSecretSharing secretSharer(c_sharesNeeded, c_prime);

    // split a secret value into multiple shares
    TShares shares = secretSharer.GenerateShares(c_secretNumber, c_sharesMade);

    // shuffle the shares, so it's random which ones are used to join
    std::array<int, std::mt19937::state_size> seed_data;
    std::random_device r;
    std::generate_n(seed_data.data(), seed_data.size(), std::ref(r));
    std::seed_seq seq(std::begin(seed_data), std::end(seed_data));
    std::mt19937 gen(seq);
    std::shuffle(shares.begin(), shares.end(), gen);

    // join the shares
    TINT joinedSecret = secretSharer.JoinShares(shares);

    // show the public information and the secrets being joined
    printf("%" PRId64 " shares needed, %i shares maden", secretSharer.GetSharesNeeded(), shares.size());
    printf("Prime = %" PRId64 "nn", secretSharer.GetPrime());
    for (TINT i = 0, c = secretSharer.GetSharesNeeded(); i < c; ++i)
        printf("Share %" PRId64 " = (%" PRId64 ", %" PRId64 ")n", i+1, shares[i][0], shares[i][1]);

    // show the result
    printf("nJoined Secret = %" PRId64 "nActual Secret = %" PRId64 "nn", joinedSecret, c_secretNumber);
    assert(joinedSecret == c_secretNumber);
    WaitForEnter();
    return 0;
}

Example Output

Here is some example output of the program:

Links

Wikipedia: Shamir’s Secret Sharing (Note: for some reason the example javascript implementation here only worked for odd numbered keys required)
Wikipedia: Finite Field
Cryptography.wikia.com: Shamir’s Secret Sharing
Java Implementation of Shamir’s Secret Sharing (Note: I don’t think this implementation is correct, and neither is the one that someone posted to correct them!)

When writing this post I wondered if maybe you could use the coefficients of the other terms as secrets as well. These two links talk about the details of that:
Cryptography Stack Exchange: Why only one secret value with Shamir’s secret sharing?
Cryptography Stack Exchange: Coefficients in Shamir’s Secret Sharing Scheme

Now that you understand this, you are probably ready to start reading up on elliptic curve cryptography. Give this link below a read if you are interested in a gentle introduction on that!
A (Relatively Easy To Understand) Primer on Elliptic Curve Cryptography

Turning a Truth Table Into a Digital Circuit (ANF)

In this post I’m going to show how you turn a truth table into a digital logic circuit that uses XOR and AND gates.

My Usage Case

My specific usage case for this is in my investigations into homomorphic encryption, which as you may recall is able to perform computation on encrypted data. This lets encrypted data be operated on by an untrusted source, given back to you, and then you can decrypt your data to get a result.

Lots of use cases if this can ever get fast enough to become practical, such as doing cloud computing with private data. However, when doing homomorphic encryption (at least currently, for the techniques I’m using), you only have XOR and AND logic operations.

So, I’m using the information in this post to be able to turn a lookup table, or a specific boolean function, into a logic circuit that I can evaluate with homomorphic encryption.

Essentially I want to figure out how to do a homomorphic table lookup, and to make the resulting circuits as simple as possible so that they are in turn as fast and lean as possible.

If you want to know more about homomorphic encryption, here’s a post I wrote which explains a very simple algorithm: Super Simple Symmetric Leveled Homomorphic Encryption Implementation

Algebraic Normal Form

Algebraic normal form (ANF) is a way of writing a boolean function using only XOR and AND.

Since it’s a normal form, two functions that do the same thing will have the same representation in ANF.

There are other forms for writing boolean logic, but ANF suits me best for my homomorphic encryption circuit needs!

An example of boolean logic in ANF is the below:

f(x_1, x_2, x_3, x_4) = x_1 x_2 \oplus x_1 x_3 \oplus x_1 x_4

It is essentially a boolean polynomial, where AND is like multiplication, and XOR is like addition. It even factors the same way. In fact, ANF is not always the smallest circuit possible; you’d have to factor out common ANDs to find the smallest way you could represent the circuit, like the below:

f(x_1, x_2, x_3, x_4) = x_1 (x_2 \oplus x_3 \oplus x_4)

That smaller form does 1 AND and 2 XORs, versus the ANF which does 3 ANDs and 2 XORs. In homomorphic encryption, since AND is so much more costly than XOR, minimizing the ANDs is a very nice win, and worth the effort.

Wikipedia has some more info about ANF here: Wikipedia: Algebraic normal form

Truth Tables and Lookup Tables

A truth table is just where you specify the inputs into a boolean function and the output of that boolean function for the given input:

\begin{array}{c|c|c|c} x_1 & x_2 & x_3 & f(x_1, x_2, x_3) \\ \hline 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 1 \\ \end{array}

A lookup table is similar in functionality, except that it has multi bit output. When dealing with digital circuits, you can make a lookup table by making a truth table per output bit. For instance, the above truth table might just be the low bit of the lookup table below, which is just a truth table for addition of the input bits.

\begin{array}{c|c|c|c} x_1 & x_2 & x_3 & f(x_1, x_2, x_3) \\ \hline 0 & 0 & 0 & 00 \\ 0 & 0 & 1 & 01 \\ 0 & 1 & 0 & 01 \\ 0 & 1 & 1 & 10 \\ 1 & 0 & 0 & 01 \\ 1 & 0 & 1 & 10 \\ 1 & 1 & 0 & 10 \\ 1 & 1 & 1 & 11 \\ \end{array}

Converting Truth Table to ANF

When I first saw the explanation for converting a truth table to ANF, it looked pretty complicated, but luckily it turns out to be pretty easy.

The basic idea is that you make a term for each possible combination of the x inputs, AND each term with an unknown constant, and then solve for those constants.

Let’s use the truth table from the last section:

\begin{array}{c|c|c|c} x_1 & x_2 & x_3 & f(x_1, x_2, x_3) \\ \hline 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 1 \\ \end{array}

For three inputs, the starting equation looks like this:

f(x_1, x_2, x_3) = \\ a_0 \\ \oplus a_1 x_1 \oplus a_2 x_2 \oplus a_3 x_3 \\ \oplus a_{12} x_1 x_2 \oplus a_{13} x_1 x_3 \oplus a_{23} x_2 x_3 \\ \oplus a_{123} x_1 x_2 x_3

Now we have to solve for the a values.

To solve for a_{123} , we just look in the truth table for function f(x_1, x_2, x_3) to see if we have an odd or even number of ones in the output of the function. If there is an even number, it is 0, else it is a 1.

Since we have an even number of ones, the value is 0, so our equation becomes this:

f(x_1, x_2, x_3) = \\ a_0 \\ \oplus a_1 x_1 \oplus a_2 x_2 \oplus a_3 x_3 \\ \oplus a_{12} x_1 x_2 \oplus a_{13} x_1 x_3 \oplus a_{23} x_2 x_3 \\ \oplus 0 \land x_1 x_2 x_3

Note that \land is the symbol for AND. I’m showing it explicitly because otherwise the equation looks weird, and a multiplication symbol isn’t correct.

Since 0 ANDed with anything else is 0, and also since n XOR 0 = n, that whole last term disappears, leaving us with this equation:

f(x_1, x_2, x_3) = \\ a_0 \\ \oplus a_1 x_1 \oplus a_2 x_2 \oplus a_3 x_3 \\ \oplus a_{12} x_1 x_2 \oplus a_{13} x_1 x_3 \oplus a_{23} x_2 x_3

Next up, to solve for a_{12} , we need to limit our truth table to f(x_1, x_2, 0) . That truth table is below, made from the original truth table, but throwing out any row where x_{3} is 1.

\begin{array}{c|c|c|c} x_1 & x_2 & x_3 & f(x_1, x_2, 0) \\ \hline 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 \\ \end{array}

We again just look at whether there are an odd or even number of ones in the function output, and use that to set a_{12} appropriately. In this case, there are an even number, so we set it to 0, which makes that term disappear again. Our function is now down to this:

f(x_1, x_2, x_3) = \\ a_0 \\ \oplus a_1 x_1 \oplus a_2 x_2 \oplus a_3 x_3 \\ \oplus a_{13} x_1 x_3 \oplus a_{23} x_2 x_3

If we look at f(x_1,0,x_3) , we find that it also has an even number of ones, making a_{13} become 0 and making that term disappear.

Looking at f(0,x_2,x_3) , it also has an even number of ones, making a_{23} become 0 and making that term disappear as well.

That leaves us with this equation:

f(x_1, x_2, x_3) = \\ a_0 \\ \oplus a_1 x_1 \oplus a_2 x_2 \oplus a_3 x_3

To solve for a_1 , we look at the truth table for f(x_1,0,0) , which is below:

\begin{array}{c|c|c|c} x_1 & x_2 & x_3 & f(x_1, 0, 0) \\ \hline 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \\ \end{array}

There are an odd number of ones in the output, so a_1 becomes 1. Finally, we get to keep a term! The equation is below:

f(x_1, x_2, x_3) = \\ a_0 \\ \oplus 1 \land x_1 \oplus a_2 x_2 \oplus a_3 x_3

Since 1 AND n = n, we can drop the explicit 1 to become this:

f(x_1, x_2, x_3) = \\ a_0 \\ \oplus x_1 \oplus a_2 x_2 \oplus a_3 x_3

If you do the same process for a_2 and a_3 , you’ll find that they also have odd numbers of ones in the output so also become ones. That puts our equation at:

f(x_1, x_2, x_3) = \\ a_0 \\ \oplus x_1 \oplus x_2 \oplus x_3

Solving for a_0 is just looking at whether there is an odd or even number of ones in f(0,0,0) , which you can look up directly in the truth table. It’s even (the single output is 0), so a_0 becomes 0, which makes our full final equation into this:

f(x_1, x_2, x_3) = x_1 \oplus x_2 \oplus x_3

We are done! This truth table can be implemented with 3 XORs and 0 ANDs. A pretty efficient operation!

You can see this is true if you work it out with the truth table. Try it out and see!

\begin{array}{c|c|c|c} x_1 & x_2 & x_3 & f(x_1, x_2, x_3) \\ \hline 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 1 \\ \end{array}
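
In general, what the process above is doing for each coefficient is computing a parity. If S is the set of variables in a term, then:

a_S = \bigoplus_{T \subseteq S} f(T)

where f(T) means the function evaluated with the variables in T set to 1 and every other variable set to 0. In other words, a term’s coefficient is 1 exactly when the function outputs an odd number of ones over all inputs whose 1 bits are a subset of the term’s variables. That is also what the ANFHasTerm function does in the sample code below, using termMask to describe S.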

Sample Code

Here is some sample code that lets you define a lookup table by implementing an integer function, and it generates the ANF for each output bit for the truth table. It also verifies that the ANF gives the correct answer. It shows you how to use this to make various circuits: bit count, addition, multiplication, division and modulus.

#include <stdio.h>
#include <array>
#include <vector>
#include <algorithm> // for std::fill

#define PRINT_TRUTHTABLES() 0
#define PRINT_NUMOPS() 1
#define PRINT_ANF() 1

void WaitForEnter ()
{
    printf("Press Enter to quit");
    fflush(stdin);
    getchar();
}

template <size_t NUM_INPUT_BITS>
bool LookupEntryPassesMask (size_t entry, size_t mask)
{
    for (size_t i = 0; i < NUM_INPUT_BITS; ++i)
    {
        const size_t bitMask = 1 << i;
        const bool allowOnes = (mask & bitMask) != 0;
        const bool bitPassesMask = allowOnes || (entry & bitMask) == 0;
        if (!bitPassesMask)
            return false;
    }
    return true;
}

template <size_t NUM_INPUT_BITS>
bool ANFHasTerm (const std::array<size_t, 1 << NUM_INPUT_BITS> &lookupTable, size_t outputBitIndex, size_t termMask)
{
    const size_t c_inputValueCount = 1 << NUM_INPUT_BITS;

    int onesCount = 0;
    for (size_t i = 0; i < c_inputValueCount; ++i)
    {
        if (LookupEntryPassesMask<NUM_INPUT_BITS>(i, termMask) && ((lookupTable[i] >> outputBitIndex) & 1) != 0)
            onesCount++;
    }

    return (onesCount & 1) != 0;
}

template <size_t NUM_INPUT_BITS>
void MakeANFTruthTable (const std::array<size_t, 1 << NUM_INPUT_BITS> &lookupTable, std::array<size_t, 1 << NUM_INPUT_BITS> &reconstructedLookupTable, size_t outputBitIndex)
{
    const size_t c_inputValueCount = 1 << NUM_INPUT_BITS;
    printf("-----Output Bit %u-----rn", outputBitIndex);

    // print truth table if we should
    #if PRINT_TRUTHTABLES()
        for (size_t inputValue = 0; inputValue < c_inputValueCount; ++inputValue)
            printf("  [%u] = %urn", inputValue, ((lookupTable[inputValue] >> outputBitIndex) & 1) ? 1 : 0);
        printf("rn");
    #endif

    // find each ANF term
    std::vector<size_t> terms;
    for (size_t termMask = 0; termMask < c_inputValueCount; ++termMask)
    {
        if (ANFHasTerm<NUM_INPUT_BITS>(lookupTable, outputBitIndex, termMask))
            terms.push_back(termMask);
    }

    // print function params
    #if PRINT_ANF()
        printf("f(");
        for (size_t i = 0; i < NUM_INPUT_BITS; ++i)
        {
            if (i > 0)
                printf(",");
            printf("x%i",i+1);
        }
        printf(") = rn");
    #endif

    // print ANF and count XORs and ANDs
    size_t numXor = 0;
    size_t numAnd = 0;
    if (terms.size() == 0)
    {
        #if PRINT_ANF()
        printf("0rn");
        #endif
    }
    else
    {
        for (size_t termIndex = 0, termCount = terms.size(); termIndex < termCount; ++termIndex)
        {
            if (termIndex > 0) {
                #if PRINT_ANF()
                printf("XOR ");
                #endif
                ++numXor;
            }

            size_t term = terms[termIndex];
            if (term == 0)
            {
                #if PRINT_ANF()
                printf("1");
                #endif
            }
            else
            {
                bool firstProduct = true;
                for (size_t bitIndex = 0; bitIndex < NUM_INPUT_BITS; ++bitIndex)
                {
                    const size_t bitMask = 1 << bitIndex;
                    if ((term & bitMask) != 0)
                    {
                        #if PRINT_ANF()
                        printf("x%i ", bitIndex + 1);
                        #endif
                        if (firstProduct)
                            firstProduct = false;
                        else
                            ++numAnd;
                    }
                }
            }
            #if PRINT_ANF()
            printf("rn");
            #endif
        }
    }
    #if PRINT_ANF()
    printf("rn");
    #endif

    #if PRINT_NUMOPS()
    printf("%u XORs, %u ANDsrnrn", numXor, numAnd);
    #endif

    // reconstruct a bit of the reconstructedLookupTable for each entry to be able to verify correctness
    const size_t c_outputBitMask = 1 << outputBitIndex;
    for (size_t valueIndex = 0; valueIndex < c_inputValueCount; ++valueIndex)
    {
        bool xorSum = false;
        for (size_t termIndex = 0, termCount = terms.size(); termIndex < termCount; ++termIndex)
        {
            size_t term = terms[termIndex];
            if (term == 0)
            {
                xorSum = 1 ^ xorSum;
            }
            else
            {
                bool andProduct = true;
                for (size_t bitIndex = 0; bitIndex < NUM_INPUT_BITS; ++bitIndex)
                {
                    const size_t bitMask = 1 << bitIndex;
                    if ((term & bitMask) != 0)
                    {
                        if ((valueIndex & bitMask) == 0)
                            andProduct = false;
                    }
                }
                xorSum = andProduct ^ xorSum;
            }
        }
        if (xorSum)
            reconstructedLookupTable[valueIndex] |= c_outputBitMask;
    }
}

template <size_t NUM_INPUT_BITS, size_t NUM_OUTPUT_BITS, typename LAMBDA>
void MakeANFLookupTable (const LAMBDA& lambda)
{
    // make lookup table
    const size_t c_outputValueMask = (1 << NUM_OUTPUT_BITS) - 1;
    const size_t c_inputValueCount = 1 << NUM_INPUT_BITS;
    std::array<size_t, c_inputValueCount> lookupTable;
    for (size_t inputValue = 0; inputValue < c_inputValueCount; ++inputValue)
        lookupTable[inputValue] = lambda(inputValue, NUM_INPUT_BITS, NUM_OUTPUT_BITS) & c_outputValueMask;

    // make the anf for each truth table (each output bit of the lookup table)
    std::array<size_t, c_inputValueCount> reconstructedLookupTable;
    std::fill(reconstructedLookupTable.begin(), reconstructedLookupTable.end(), 0);
    for (size_t outputBitIndex = 0; outputBitIndex < NUM_OUTPUT_BITS; ++outputBitIndex)
        MakeANFTruthTable<NUM_INPUT_BITS>(lookupTable, reconstructedLookupTable, outputBitIndex);

    // verify that our anf expressions perfectly re-create the lookup table
    for (size_t inputValue = 0; inputValue < c_inputValueCount; ++inputValue)
    {
        if (lookupTable[inputValue] != reconstructedLookupTable[inputValue])
            printf("ERROR: expression / lookup mismatch for index %urn", inputValue);
    }
    printf("expression / lookup verification complete.rnrn");
}

size_t CountBits (size_t inputValue, size_t numInputBits, size_t numOutputBits)
{
    // Count how many bits there are
    int result = 0;
    while (inputValue)
    {
        if (inputValue & 1)
            result++;
        inputValue = inputValue >> 1;
    }
    return result;
}

size_t AddBits (size_t inputValue, size_t numInputBits, size_t numOutputBits)
{
    // break the input bits in half and add them
    const size_t bitsA = numInputBits / 2;
    const size_t mask = (1 << bitsA) - 1;

    size_t a = inputValue & mask;
    size_t b = inputValue >> bitsA;
    
    return a+b;
}

size_t MultiplyBits (size_t inputValue, size_t numInputBits, size_t numOutputBits)
{
    // break the input bits in half and multiply them
    const size_t bitsA = numInputBits / 2;
    const size_t mask = (1 << bitsA) - 1;

    size_t a = inputValue & mask;
    size_t b = inputValue >> bitsA;

    return a * b;
}

size_t DivideBits (size_t inputValue, size_t numInputBits, size_t numOutputBits)
{
    // break the input bits in half and divide them
    const size_t bitsA = numInputBits / 2;
    const size_t mask = (1 << bitsA) - 1;

    size_t a = inputValue & mask;
    size_t b = inputValue >> bitsA;

    // workaround for divide by zero
    if (b == 0)
        return 0;

    return a / b;
}

size_t ModulusBits (size_t inputValue, size_t numInputBits, size_t numOutputBits)
{
    // break the input bits in half and take the modulus
    const size_t bitsA = numInputBits / 2;
    const size_t mask = (1 << bitsA) - 1;

    size_t a = inputValue & mask;
    size_t b = inputValue >> bitsA;

    // workaround for divide by zero
    if (b == 0)
        return 0;

    return a % b;
}

int main (int argc, char **argv)
{
    //MakeANFLookupTable<3, 2>(CountBits);    // Output bits needs to be enough to store the number "input bits"
    //MakeANFLookupTable<4, 3>(AddBits);      // Output bits needs to be (InputBits / 2)+1
    //MakeANFLookupTable<4, 4>(MultiplyBits); // Output bits needs to be same as input bits
    //MakeANFLookupTable<4, 2>(DivideBits);   // Output bits needs to be half of input bits (rounded down)
    //MakeANFLookupTable<4, 2>(ModulusBits);  // Output bits needs to be half of input bits (rounded down)
    //MakeANFLookupTable<10, 5>(DivideBits);  // 5 bit vs 5 bit division is amazingly complex!
    MakeANFLookupTable<4, 2>(ModulusBits);  // Output bits needs to be half of input bits (rounded down)
    WaitForEnter();
    return 0;
}

Sample Code Runs

Here is the program output for a “bit count” circuit. It counts the number of bits that are 1, in the 3 bit input, and outputs the answer as 2 bit output. Note that the bit 0 output is the same functionality as the example we worked through by hand, and you can see that it comes up with the same answer.

Here is the program output for an adder circuit. It adds two 2 bit numbers, and outputs a 3 bit output.

Here is the program output for a multiplication circuit. It multiplies two 2 bit numbers, and outputs a 4 bit number.

Here is the program output for a division circuit. It divides a 2 bit number by another 2 bit number and outputs a 2 bit number. When higher bit counts are involved, the division circuit gets super complicated, it’s really crazy! 5 bit divided by 5 bit is several pages of output for instance. Note that it returns 0 whenever it would divide by 0.

Lastly, here is the program output for a modulus circuit. It divides a 2 bit number by another 2 bit number and outputs the remainder as a 2 bit number.

Closing and Links

While the above shows you how to turn a single bit truth table into ANF, extending this to a multi bit lookup table is super simple; you just do the same process for each output bit in the lookup table.

Here are a few links in case anything above is unclear, or you want more information.

Finding Boolean/Logical Expressions for truth tables in algebraic normal form(ANF)

Finding Boolean/Logical Expressions for truth tables

Modular Multiplicative Inverse

This post is a pre-requisite for the next thing I want to talk about, so it may not make a whole lot of sense or be all that interesting until shown in that context.

Say you have a function like this:

(a*x) \mod m = n

If you know the values of a, m and n, how do you solve for x? Note in this post we are only dealing with integers, so we are looking for the integer solution for x.

It might be hard to visualize with so many symbols, so here it is with some constants:

(5*x) \mod 7 = 3

How would you solve that for x? In other words, what do you need to multiply 5 by, so that when you divide the result by 7, that you get 3 as the remainder?

One way to solve for x would be brute force. We could try plugging every value from 0 to 6 into x (every value from 0 to n-1), and see if any gives us the result we are looking for.
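
For example, brute forcing the equation above finds the answer at x = 2:

(5*0) \mod 7 = 0 \newline (5*1) \mod 7 = 5 \newline (5*2) \mod 7 = 3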

Brute force can be a challenge if the numbers are really large, like in some cryptographic situations.

Interestingly, there might not even be a valid answer for x that satisfies the equation! The below has no answer for instance:

(2*x) \mod 8 = 5

Better Than Brute Force

There’s something called the “Modular Multiplicative Inverse” which looks eerily familiar:

(a*x) \mod m = 1

Where a and m are known, and the inverse itself is the value of x.

Using the same constants we did above, that gives us this:

(5*x) \mod 7 = 1

In this case, the inverse (x) is 3. You can verify that by seeing that (5*3) % 7 is 1.

Once you have the inverse, if you want to solve the original equation where the result of the modulus ends up being 3, you just multiply the inverse by that desired value. Since the inverse is 3 and the desired value is 3, you multiply them together and get 9. Plugging the numbers in, we can see that (5*9) % 7 = 3.
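
Here is a tiny sketch of that step in code (my own illustration, not part of the sample program further down), assuming the inverse has already been reduced into the range [0, m):

// Given the modular multiplicative inverse of a mod m, solve (a*x) % m = n.
// Reducing the product mod m gives the smallest non negative solution.
// Watch out for integer overflow if the numbers involved are large!
int SolveForX (int inverse, int n, int m)
{
    return (inverse * n) % m;
}

For the example above, SolveForX(3, 3, 7) returns 2, and (5*2) % 7 is indeed 3.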

Pretty cool, but how to calculate the inverse? You can calculate it by using something called the “Extended Euclidean Algorithm”.

The regular Euclidean algorithm is in a post here: Programmatically Calculating GCD and LCM.

The extended Euclidean algorithm is explained really well on Wikipedia: Wikipedia: Extended Euclidean Algorithm.

Sample Code

Here’s some sample code that asks the user for input and solves these style of equations for x. Below the code I’ll show some example runs and talk about a few more things.

#include <stdio.h>
#include <algorithm>
#include <array>

//=================================================================================
unsigned int ExtendedEuclidianAlgorithm (int smaller, int larger, int &s, int &t)
{
    // make sure A <= B before starting
    bool swapped = false;
    if (larger < smaller)
    {
        swapped = true;
        std::swap(smaller, larger);
    }

    // set up our storage for the loop.  We only need the last two values so will
    // just use a 2 entry circular buffer for each data item
    std::array<int, 2> remainders = { larger, smaller };
    std::array<int, 2> ss = { 1, 0 };
    std::array<int, 2> ts = { 0, 1 };
    int indexNeg2 = 0;
    int indexNeg1 = 1;

    // loop
    while (1)
    {
        // calculate our new quotient and remainder
        int newQuotient = remainders[indexNeg2] / remainders[indexNeg1];
        int newRemainder = remainders[indexNeg2] - newQuotient * remainders[indexNeg1];

        // if our remainder is zero we are done.
        if (newRemainder == 0)
        {
            // return our s and t values as well as the quotient as the GCD
            s = ss[indexNeg1];
            t = ts[indexNeg1];
            if (swapped)
                std::swap(s, t);
            return remainders[indexNeg1];
        }

        // calculate this round's s and t
        int newS = ss[indexNeg2] - newQuotient * ss[indexNeg1];
        int newT = ts[indexNeg2] - newQuotient * ts[indexNeg1];

        // store our values for the next iteration
        remainders[indexNeg2] = newRemainder;
        ss[indexNeg2] = newS;
        ts[indexNeg2] = newT;

        // move to the next iteration
        std::swap(indexNeg1, indexNeg2);
    }
}

//=================================================================================
void WaitForEnter ()
{
    printf("nPress Enter to quit");
    fflush(stdin);
    getchar();
}

//=================================================================================
int main(int argc, char **argv)
{
    // get user input
    int a, m, n;
    printf("Given a, m and n, solves for X.n(a * X) %% m = nnn");
    printf("a = ");
    scanf("%i", &a);
    printf("m = ");
    scanf("%i", &m);
    printf("n = ");
    scanf("%i", &n);

    // show details of what they entered
    printf("n(%i * X) mod %i = %in", a, m, n);

    // Attempt brute force
    printf("nBrute Force Testing X from 0 to %i:n", (m-1));
    for (int i = 0; i < m; ++i) {
        if ((a*i) % m == n)
        {
            printf("  X = %in", i);
            printf("  %i mod %i = %in", a*i, m, (a*i) % m);
            break;
        }
        else if (i == (m - 1))
        {
            printf("  No solution!n");
        }
    }

    // Attempt inverse via Extended Euclidean Algorithm
    printf("nExtended Euclidean Algorithm:n");
    int s, t;
    int GCD = ExtendedEuclidianAlgorithm(a, m, s, t);

    // report failure if we couldn't do inverse
    if (GCD != 1)
    {
        printf("  Values are not co-prime, cannot invert! GCD = %in", GCD);
    }
    // Else report details of inverse and show that it worked
    else
    {
        printf("  Inverse = %in", t);
        printf("  X = Inverse * n = %in", t*n);
        printf("  %i mod %i = %in", a*t*n, m, (a*t*n) % m);
    }

    WaitForEnter();
    return 0;
}

Example Runs

Here is a normal run that solves (7*x) \mod 9 = 2, to come up with a value of 8 for x.

Here is a run that solves (5*x) \mod 7 = 3. Brute force gives us a value of 2 for x, while the inverse gives us a value of 9. Both are valid, and in fact are equivalent since 9 % 7 = 2. This shows that getting the inverse and then multiplying it to get the desired answer doesn’t always give you the smallest possible value of x.

Here is a large number run that solves (7*x) \mod 1000001 = 538. Brute force gives a value of 571,506 for x, while using the inversion method gives us a value of 230,571,736. Again both are valid, and 230,571,736 mod 1,000,001 = 571,506, so they are equivalent.

Lastly, here is a run that solves (8*x) \mod 6 = 4. Brute force gives us a value of 2 for x, but interestingly, it isn’t invertible, so the inversion based solution can’t even find us an answer!

This happens when a and m are not co-prime. In other words, if their GCD isn’t 1, they aren’t coprime, and a can’t be inverted modulo m.

Links

You can read more about the modular multiplicative inverse here: Wikipedia: Modular Multiplicative Inverse.

Improving the Security of the Super Simple Symmetric Leveled Homomorphic Encryption Implementation

The last post showed a super simple encryption algorithm that let an untrusted person perform calculations with encrypted data such that when they gave the results back to you, you could decrypt them and get the results of the calculation as if they were done on the unencrypted values. The best part was that this untrusted party had no knowledge of the data it was doing the calculations on.

While it was (hopefully) easy to understand, there were a lot of problems with its security. The biggest of these was probably the fact that the encryption of a false bit was the secret key itself!

This post is going to slightly increase the complexity of the encryption operation to match what the paper that I’m getting this stuff from says to do (Fully Homomorphic Encryption over the Integers).

All of the other operations – like XOR, AND, Decryption, use of constants – remain the same. It’s just the process of turning a plain text bit into a cipher text bit that is going to change.

Disclaimer: I am a game programmer, not a cryptographer, and you should REALLY do your own investigation and consult experts before actually rolling anything out to production or trusting that what I say is correct!

Improvement #1 – Multiply Key By Random Number

The scheme to encrypt a plaintext bit from the last blog post was this:

encryptedBit = key + (value ? 1 : 0);

A major problem with that scheme is that if you encrypt a false bit, you get the key itself.

Another problem that we touched on was that the parity of the plain text bit (whether it was odd or even, aka a 1 or a 0) was always opposite of the parity of the cipher text.

Yet another problem was that for the same key, 0 always encrypted to the same value, and 1 always encrypted to the same value, which was the “0” encrypted value, plus 1.

We are going to address all of these problems by adding a simple operation to encryption. We are just going to multiply the key by a random number before adding the plain text bit in. That gives us the following:

encryptedBit = key*randomNumber + (value ? 1 : 0);

Where randomNumber is at least 1.

This above helps in the following ways:

  • Encrypting false doesn’t always just give you the key anymore!
  • Since the cipherBit divided by the key is now a random number (and since the key is an odd number), it will be random whether the parity of the cipher text matches or mismatches the plain text. You can no longer use that information to figure out the value of the encrypted bit!
  • If you encrypt a false value, you will get different values each time. Same when encrypting a true value. When looking at two ciphertexts that have the same underlying plaintext value, you will no longer be able to tell that they are equal just by looking at them, or be able to tell that one is larger than the other so must be the true bit!

That is a pretty good improvement, but we can do a little better.

Improvement #2 – Add Random Noise

The second improvement we are going to do is add random noise to our encrypted value. This will make it so encrypting a false bit will not result in a multiple of the key, but will instead result in NEARLY a multiple of the key, which is a harder problem to figure out as an attacker.

You might ask how we are going to preserve our encrypted value if we are adding random noise into the result. Well, in actuality, all we really need to preserve is the lowest bit, so we are going to add an EVEN NUMBERED amount of noise.

That makes our encryption scheme become this:

encryptedBit = key*randomNumber1 + 2*randomNumber2 + (value ? 1 : 0);

While this increases our security, it also increases the noise (error) in our encrypted data, which makes it so we can do fewer operations before the error gets too large and we start getting wrong answers.

There we are, our encryption scheme now matches the one described in the paper. All the rest of our operations remain unchanged.
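
To convince yourself that decryption still works, here’s a quick worked example with a small key of 13, using 3 for randomNumber1 and 2 for randomNumber2:

encryptedTrue = 13*3 + 2*2 + 1 = 44 \newline decryptedTrue = ((44 \% 13) \% 2) = 5 \% 2 = 1 \newline \newline encryptedFalse = 13*3 + 2*2 + 0 = 43 \newline decryptedFalse = ((43 \% 13) \% 2) = 4 \% 2 = 0

As long as the total noise (the 2*randomNumber2 part, plus whatever error later operations pile on) stays smaller than the key, the lowest bit survives both modulus operations.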

Security

The above looks good, but what range of values should we use for randomNumber1 and randomNumber2, and how big should the key actually be?

The paper refers to a security parameter lambda (\lambda) that everything else is based on.

It says that the size of the key (N) should be (\lambda^2) bits, randomNumber1 should be around 2^{N^3} (N^3 bits) and that randomNumber2 should be around 2^{\sqrt{N}} (\sqrt{N} bits).

It also says that the best known attack against this algorithm takes about 2^{\lambda} operations and that you can assume an attacker is going to be able to do a billion operations per second (per this info here). Note that 1 billion operations per second is a typical computer, not a super computer or a distributed attack using a bot network!

Let’s look at some example values for lambda!

Lambda    Key Size    RN1 Size    RN2 Size    Attack Time
80        800B        30.5GB      10B         38 million years
60        450B        5.4GB       8B          36 years
40        200B        488MB       5B          17 minutes
20        50B         7.6MB       3B          < 1 second

Ouch! RandomNumber1 sure is huge isn’t it? I’ve double and triple checked and that really does seem to be correct. Encrypting a single bit is essentially going to be as large as RandomNumber1. That is so unwieldy it’s ridiculous. I’m going to quadruple check that I think, because that is just insane…
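
If you want to sanity check those numbers yourself, here is a little throwaway sketch of how the rows of the table were derived. It assumes the attack cost is roughly 2^\lambda operations at a billion operations per second, which is the assumption the table’s numbers line up with.

#include <stdio.h>
#include <math.h>

// Rough storage sizes and attack time for a given security parameter lambda.
// The key is lambda^2 bits, randomNumber1 is (lambda^2)^3 bits, randomNumber2
// is sqrt(lambda^2) = lambda bits, and the attack is assumed to cost about
// 2^lambda operations at 10^9 operations per second.
void PrintSecurityEstimates (double lambda)
{
    double keyBits       = lambda * lambda;
    double keyBytes      = keyBits / 8.0;
    double rn1Bytes      = pow(keyBits, 3.0) / 8.0;
    double rn2Bytes      = lambda / 8.0;
    double attackSeconds = pow(2.0, lambda) / 1.0e9;
    printf("lambda %g: key = %g bytes, RN1 = %g bytes, RN2 = %g bytes, attack = %g seconds\n",
        lambda, keyBytes, rn1Bytes, rn2Bytes, attackSeconds);
}

Calling that for 80, 60, 40 and 20 gives back the rows above, after converting bytes to MB / GB and seconds to minutes / years.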

BTW quick tangent. An interesting thing to note is that if your key is an even number, instead of an odd number, noise/error can grow as much as it wants, and will never give you the wrong result! That means that by using an even numbered key, this scheme is fully homomorphic. However, using an even key, encryptedBit % 2 == decryptedBit, so it’s super insecure.

I’ve been thinking about it quite a bit, but I can’t think of any avenues to use an even numbered key but still have any semblance of real security. If you can think of a way, go publish a paper about it and become famous! 😛

Example Code

Here is some sample code from the last post, updated with the new encryption routine. I had to decrease the number of bits in the addition tests, and I had to tone down the security quite a bit to make it fit within 64 bits.

#include <stdio.h>
#include <stdint.h>
#include <random>
#include <array>
#include <inttypes.h>

typedef uint64_t uint64;

// Increase this value to increase the size of the key, and also the maximum
// size of the error allowed.
// If you set it too high, operations will fail when they run out of storage space
// in the 64 bit ints.  If you set it too low, you will not be able to do very many
// operations in a row.
// The recommended values for good security for these numbers are way too large to
// fit in a uint64, so they are adjusted down here to show their effects while using uint64s
const size_t c_numKeyBits = 15;
const size_t c_numNoiseBits = 3; //size_t(sqrt(c_numKeyBits));
const size_t c_numMultiplierBits = 4; //c_numKeyBits * c_numKeyBits * c_numKeyBits;

#define Assert(x) if (!(x)) ((int*)nullptr)[0] = 0;

//=================================================================================
// TODO: Replace with something crypto secure if desired!
uint64 RandomUint64 (uint64 min, uint64 max)
{
    static std::random_device rd;
    static std::mt19937 gen(rd());
    std::uniform_int_distribution<uint64> dis(min, max);
    return dis(gen);
}

//=================================================================================
void WaitForEnter ()
{
    printf("Press Enter to quit");
    fflush(stdin);
    getchar();
}

//=================================================================================
uint64 GenerateKey ()
{
    // Generate an odd random number in [2^(N-1), 2^N)
    // N is the number of bits in our key
    // The key also defines the maximum amount of error allowed, and thus the number
    // of operations allowed in a row.
    uint64 key = RandomUint64(0, (uint64(1) << uint64(c_numKeyBits)) - 1);
    key = key | (uint64(1) << uint64(c_numKeyBits - 1));
    key = key | 1;
    return key;
}

//=================================================================================
bool Decrypt (uint64 key, uint64 value)
{
    return ((value % key) % 2) == 1;
}

//=================================================================================
uint64 Encrypt (uint64 key, bool value)
{
    uint64 keyMultiplier = RandomUint64(0, (1 << c_numMultiplierBits) - 2) + 1;
    uint64 noise = RandomUint64(0, (1 << c_numNoiseBits) - 1);
    uint64 ret = key * keyMultiplier + 2 * noise + (value ? 1 : 0);
    Assert(Decrypt(key, ret) == value);
    return ret;
}

//=================================================================================
uint64 XOR (uint64 A, uint64 B)
{
    return A + B;
}

//=================================================================================
uint64 AND (uint64 A, uint64 B)
{
    return A * B;
}

//=================================================================================
float GetErrorPercent (uint64 key, uint64 value)
{
    // Returns what % of maximum error this value has in it.  When error >= 100%
    // then we have hit our limit and start getting wrong answers.
    return 100.0f * float(value % key) / float(key);
}

//=================================================================================
uint64 FullAdder (uint64 A, uint64 B, uint64 &carryBit)
{
    // homomorphically add the encrypted bits A and B
    // return the single bit sum, and put the carry bit into carryBit
    // From http://en.wikipedia.org/w/index.php?title=Adder_(electronics)&oldid=381607326#Full_adder
    uint64 sumBit = XOR(XOR(A, B), carryBit);
    carryBit = XOR(AND(A, B), AND(carryBit, XOR(A, B)));
    return sumBit;
}

//=================================================================================
int main (int argc, char **argv)
{
    // run this test a bunch to show that it works.  If you get a divide by zero
    // in an Assert, that means that it failed, and hopefully it's because you
    // increased c_numKeyBits to be too large!
    printf("Verifying 10000 truth tables.  Details of first one:n");
    for (int index = 0; index < 10000; ++index)
    {
        // make our key and a true and false bit
        uint64 key = GenerateKey();
        uint64 falseBit1 = Encrypt(key, false);
        uint64 falseBit2 = Encrypt(key, false);
        uint64 trueBit1  = Encrypt(key, true);
        uint64 trueBit2  = Encrypt(key, true);

        // report the results for the first iteration of the loop
        if (index == 0)
        {
            printf("Key 0x%" PRIx64 ", false = 0x%" PRIx64 ", 0x%" PRIx64 " true = 0x%" PRIx64 " 0x%" PRIx64 "n", key, falseBit1, falseBit2, trueBit1, trueBit2);
            printf("  [0 xor 0] = 0   0x%" PRIx64 " xor(+) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%0.2f%%)n", falseBit1, falseBit2, XOR(falseBit1, falseBit2), Decrypt(key, XOR(falseBit1, falseBit2)) ? 1 : 0, GetErrorPercent(key, XOR(falseBit1, falseBit2)));
            printf("  [0 xor 1] = 1   0x%" PRIx64 " xor(+) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%0.2f%%)n", falseBit1, trueBit2 , XOR(falseBit1, trueBit2 ), Decrypt(key, XOR(falseBit1, trueBit2 )) ? 1 : 0, GetErrorPercent(key, XOR(falseBit1, trueBit2 )));
            printf("  [1 xor 0] = 1   0x%" PRIx64 " xor(+) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%0.2f%%)n", trueBit1 , falseBit2, XOR(trueBit1 , falseBit2), Decrypt(key, XOR(trueBit1 , falseBit2)) ? 1 : 0, GetErrorPercent(key, XOR(trueBit1 , falseBit2)));
            printf("  [1 xor 1] = 0   0x%" PRIx64 " xor(+) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%0.2f%%)n", trueBit1 , trueBit2 , XOR(trueBit1 , trueBit2 ), Decrypt(key, XOR(trueBit1 , trueBit2 )) ? 1 : 0, GetErrorPercent(key, XOR(trueBit1 , trueBit2 )));
            printf("  [0 and 0] = 0   0x%" PRIx64 " and(*) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%0.2f%%)n", falseBit1, falseBit2, AND(falseBit1, falseBit2), Decrypt(key, AND(falseBit1, falseBit2)) ? 1 : 0, GetErrorPercent(key, XOR(falseBit1, falseBit2)));
            printf("  [0 and 1] = 0   0x%" PRIx64 " and(*) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%0.2f%%)n", falseBit1, trueBit2 , AND(falseBit1, trueBit2 ), Decrypt(key, AND(falseBit1, trueBit2 )) ? 1 : 0, GetErrorPercent(key, XOR(falseBit1, trueBit2 )));
            printf("  [1 and 0] = 0   0x%" PRIx64 " and(*) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%0.2f%%)n", trueBit1 , falseBit2, AND(trueBit1 , falseBit2), Decrypt(key, AND(trueBit1 , falseBit2)) ? 1 : 0, GetErrorPercent(key, XOR(trueBit1 , falseBit2)));
            printf("  [1 and 1] = 1   0x%" PRIx64 " and(*) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%0.2f%%)n", trueBit1 , trueBit2 , AND(trueBit1 , trueBit2 ), Decrypt(key, AND(trueBit1 , trueBit2 )) ? 1 : 0, GetErrorPercent(key, XOR(trueBit1 , trueBit2 )));
        }

        // Verify truth tables for XOR and AND
        Assert(Decrypt(key, XOR(falseBit1, falseBit2)) == false);
        Assert(Decrypt(key, XOR(falseBit1, trueBit2 )) == true );
        Assert(Decrypt(key, XOR(trueBit1 , falseBit2)) == true );
        Assert(Decrypt(key, XOR(trueBit1 , trueBit2 )) == false);

        Assert(Decrypt(key, AND(falseBit1, falseBit2)) == false);
        Assert(Decrypt(key, AND(falseBit1, trueBit2 )) == false);
        Assert(Decrypt(key, AND(trueBit1 , falseBit2)) == false);
        Assert(Decrypt(key, AND(trueBit1 , trueBit2 )) == true );
    }

    // Do multi bit addition as an example of using compound circuits to
    // do meaningful work.
    const size_t c_numBitsAdded = 3;
    printf("nDoing 10000 Multibit Additions.  Details of first one:n");
    std::array<uint64, c_numBitsAdded> numberAEncrypted;
    std::array<uint64, c_numBitsAdded> numberBEncrypted;
    std::array<uint64, c_numBitsAdded> resultEncrypted;
    std::array<uint64, c_numBitsAdded> carryEncrypted;
    for (int index = 0; index < 10000; ++index)
    {
        // generate the numbers we want to add
        uint64 numberA = RandomUint64(0, (1 << c_numBitsAdded) - 1);
        uint64 numberB = RandomUint64(0, (1 << c_numBitsAdded) - 1);

        // generate our key
        uint64 key = GenerateKey();

        // encrypt our bits
        for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
        {
            numberAEncrypted[bitIndex] = Encrypt(key, (numberA & (uint64(1) << uint64(bitIndex))) != 0);
            numberBEncrypted[bitIndex] = Encrypt(key, (numberB & (uint64(1) << uint64(bitIndex))) != 0);
        }

        // do our multi bit addition!
        // we could initialize the carry bit to 0 or the encrypted value of 0. either one works since 0 and 1
        // are also poor encryptions of 0 and 1 in this scheme!
        uint64 carryBit = Encrypt(key, false);
        for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
        {
            carryEncrypted[bitIndex] = carryBit;
            resultEncrypted[bitIndex] = FullAdder(numberAEncrypted[bitIndex], numberBEncrypted[bitIndex], carryBit);
        }

        // decrypt our result
        uint64 resultDecrypted = 0;
        for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
        {
            if (Decrypt(key, resultEncrypted[bitIndex]))
                resultDecrypted |= uint64(1) << uint64(bitIndex);
        }

        // report the results for the first iteration of the loop
        if (index == 0)
        {
            printf("Key 0x%" PRIx64 ", %" PRId64 " + %" PRId64 " in %i bits = %" PRId64 "n", key, numberA, numberB, c_numBitsAdded, (numberA + numberB) % (1 << c_numBitsAdded));
            for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
                printf("  A[%i] = 0x%" PRIx64 " (%i err=%0.2f%%)n", bitIndex, numberAEncrypted[bitIndex], Decrypt(key, numberAEncrypted[bitIndex]), GetErrorPercent(key, numberAEncrypted[bitIndex]));
            printf("+n");
            for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
                printf("  B[%i] = 0x%" PRIx64 " (%i err=%0.2f%%)n", bitIndex, numberBEncrypted[bitIndex], Decrypt(key, numberBEncrypted[bitIndex]), GetErrorPercent(key, numberBEncrypted[bitIndex]));
            printf("=n");
            for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
                printf("  Result[%i] = 0x%" PRIx64 " (%i err=%0.2f%%)n", bitIndex, resultEncrypted[bitIndex], Decrypt(key, resultEncrypted[bitIndex]), GetErrorPercent(key, resultEncrypted[bitIndex]));
            printf("Carry Bits =n");
            for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
                printf("  Result[%i] = 0x%" PRIx64 " (%i err=%0.2f%%)n", bitIndex, carryEncrypted[bitIndex], Decrypt(key, carryEncrypted[bitIndex]), GetErrorPercent(key, carryEncrypted[bitIndex]));
            printf("result decrypted = %" PRId64 "n", resultDecrypted);
        }

        // make sure that the results match, keeping in mind that the addition is modulo 2^c_numBitsAdded and may have rolled over
        Assert(resultDecrypted == ((numberA + numberB) % (1 << c_numBitsAdded)));
    }

    WaitForEnter();
    return 0;
}

Here’s the output of an example run.

Links

Another paper about symmetric HE over the integers: Symmetric Somewhat Homomorphic Encryption over the Integers

An implementation of FHE: Implementation of the DGHV fully homomorphic encryption scheme

Next Up

Here are some interesting avenues to explore with this stuff. More blog posts coming in the future about these topics, but for now, here they are in case you want to check them out yourself:

  • Making the public / private key implementation
  • Make this stuff work with multi precision math to be able to use realistically sized keys and values, and see what sort of perf it gives
  • Show how to achieve fully homomorphic encryption using boot strapping and/or modulus switching
  • Further explore tuning down security parameters to get game quality HE
  • Explore the other known methods for implementing HE. HE over the integers is apparently very easy to understand compared to other methods, but other methods may have different characteristics – like maybe not being crazy gigantic

Super Simple Symmetric Leveled Homomorphic Encryption Implementation

Homomorphic encryption is a pretty interesting thing. It allows you to do calculations on encrypted data such that when you decrypt the results, it’s as if you did the calculations on the unencrypted data. This allows computation to happen without the person doing the computation knowing what the data actually is!

Brief History

For a long time, cryptographers wondered if fully homomorphic encryption was even possible. There were various encryption algorithms that could perform SOME operations homomorphically (RSA can do multiplication for instance!), but there weren’t any that could do ALL operations. In other words, you couldn’t execute arbitrary computations.

Those types of algorithms are called “Partially Homomorphic Encryption” or PHE.

Another problem standing in the way of fully homomorphic encryption was that many algorithms would only have a limited count of operations they could perform before error would accumulate and they would start giving incorrect answers. In essence they were limited to evaluating low degree polynomials.

Those types of algorithms are called “Somewhat Homomorphic Encryption” or SWHE.

In contrast, Fully Homomorphic Encryption (FHE) can perform an unlimited number of homomorphic operations, and it can perform any operation homomorphically. It is unbounded in both ways.

Amazingly, in 2009 Craig Gentry figured out the first fully homomorphic encryption scheme! With his setup, you can calculate both XOR and AND on encrypted bits, which makes it Turing complete. It is also able to keep errors from becoming too large by using an ingenious bootstrapping technique to decrease accumulated error. Here’s a link to his PhD thesis: A Fully Homomorphic Encryption Scheme.

Unfortunately, the current implementations of secure FHE take too much computational power to be practical in most situations – like 30 minutes to calculate an AND between 2 bits!

In this post I’m going to show you a super simple HE implementation that will be very easy to understand. It won’t be fully homomorphic, but it will be “leveled” (or, somewhat homomorphic), meaning it is Turing complete, but the count of calculations you can perform is limited due to error creeping in. It also won’t be secure – due to making it easy to understand – but it will be lightning fast.

This will be a symmetric key algorithm, but as we’ll explore in future posts, it can also be used for public key algorithms.

Why Is HE Useful?

One thing you could do with HE is store your financial transactions encrypted on a server. The server could run queries and calculations on your financial data and send back the results. You could then unencrypt the result and see what the values are, even though the server itself – which generated the values – has no idea what the numbers actually are.

Another use could be in games. Whether you are playing a first person shooter, or a real time strategy game, many different types of games send information about each player to every other player in the game. Hashes of game state can be used to make sure that everyone is in agreement about calculations to prevent a player from cheating by WRITING to a value they shouldn’t be writing to (or, at least you can detect when they do, and use majority rule to boot them out of the game), but how do you stop a player from READING a value they shouldn’t be reading?

Using HE, you could encrypt the data you need to send to players that they shouldn’t be able to read. With this, they could still do game play logic calculations on the data, and calculate hashes of the encrypted results to ensure that all players were in agreement, but with HE, they wouldn’t gain knowledge of the data they were working with.

In other words, player A could verify that player B’s state is correct and they haven’t cheated, without player A getting details about player B’s state.

In theory this could eliminate or at least help combat things like wall hacks and other “data read” based cheats. In practice there would be some complications to work out, even if it wasn’t crazy slow to calculate, but the fact that there is a path to addressing these issues is pretty exciting! People are working on improving speed, and games don’t need the same level of security that other usage cases do.

How To Do It

Here are the details of this super simple leveled homomorphic symmetric key algorithm.

By the way, all the percent signs below mean “modulus” which is just the remainder of a division. 25 % 4 = 1 for instance, because 25/4 = 6 with a remainder of 1. That remainder of 1 is what we get when we take the modulus. A term you’ll see more often if reading through this stuff on your own will be “residue”. Don’t let that word scare you, it is just another name for the remainder.

Making A Secret Key

To make a key, generate an odd random number between 2^(N-1) and 2^N. In other words, it will be N random bits, except the highest and lowest bit will be set to 1. N is the size of your secret key. Larger keys are more secure, and allow more computations to be done in a row, but they also take more storage space. If you are using a fixed size int – like say a uint32 – a larger key will make you run out of those 32 bits sooner.

key = RandomNumber(0, (1 << N) - 1) | 1 | (1 << (N - 1));

Encrypt

To encrypt a bit, the encrypted value is just the key plus the value of the unencrypted bit (0 or 1).

encryptedBit = key + (value ? 1 : 0);

Decrypt

To decrypt a bit, you take the encrypted bit modulo the key, and then modulo 2.

decryptedBit = (encryptedBit % key) % 2;

XOR

To do an XOR of two encrypted bits, you just add the two values together.

xorResult = encryptedBit1 + encryptedBit2;

AND

To do an AND of two encrypted bits, you just multiply the two values together.

andResult = encryptedBit1 * encryptedBit2;

Example

Let’s run through an example to see this in action.

We’ll use a 4 bit key, and say that the key is 13 (1101 in binary).

Let’s encrypt some bits:

trueBitEncrypted = key + 1 = 13 + 1 = 14 \newline falseBitEncrypted = key + 0 = 13 + 0 = 13

Let’s do some logical operations:

Xor00 = falseBitEncrypted + falseBitEncrypted = 13 + 13 = 26 \newline Xor01 = falseBitEncrypted + trueBitEncrypted  = 13 + 14 = 27 \newline Xor10 = trueBitEncrypted  + falseBitEncrypted = 14 + 13 = 27 \newline Xor11 = trueBitEncrypted  + trueBitEncrypted  = 14 + 14 = 28 \newline \newline And00 = falseBitEncrypted * falseBitEncrypted = 13 * 13 = 169 \newline And01 = falseBitEncrypted * trueBitEncrypted  = 13 * 14 = 182 \newline And10 = trueBitEncrypted  * falseBitEncrypted = 14 * 13 = 182 \newline And11 = trueBitEncrypted  * trueBitEncrypted  = 14 * 14 = 196 \newline \newline FalseXorFalseAndTrue = falseBitEncrypted + falseBitEncrypted * trueBitEncrypted = 13 + 13 * 14 = 195

Notice how AND is a multiplication where XOR is an addition, and that the result of an AND operation is a larger number than an XOR operation. This means that if you are working with a specific sized number (again, such as a uint32), you can do fewer ANDs than XORs before you run out of bits. When you run out of bits and your number has integer overflow, you have hit the ceiling of this leveled HE scheme. That means that ANDs are more expensive than XORs when considering the number of computations you can do.

Ok, time to decrypt our XOR values!

Xor00Decrypted = ((Xor00 \% key) \% 2) = (26 \% 13) \% 2 = 0 \newline Xor01Decrypted = ((Xor01 \% key) \% 2) = (27 \% 13) \% 2 = 1 \newline Xor10Decrypted = ((Xor10 \% key) \% 2) = (27 \% 13) \% 2 = 1 \newline Xor11Decrypted = ((Xor11 \% key) \% 2) = (28 \% 13) \% 2 = 0 \newline

XOR is looking correct, how about AND?

And00Decrypted = ((And00 \% key) \% 2) = (169 \% 13) \% 2 = 0 \newline And01Decrypted = ((And01 \% key) \% 2) = (182 \% 13) \% 2 = 0 \newline And10Decrypted = ((And10 \% key) \% 2) = (182 \% 13) \% 2 = 0 \newline And11Decrypted = ((And11 \% key) \% 2) = (196 \% 13) \% 2 = 1 \newline

AND is looking good as well. Lastly let’s decrypt the compound operation:

FalseXorFalseAndTrueDecrypted = ((FalseXorFalseAndTrue \% key) \% 2) = (195 \% 13) \% 2 = 0

Lookin good!

Intuition

Let’s get some intuition for why this works…

Key Generation

First up, why is it that the key needs to have its high bit set? Well, on one hand, larger keys are more secure, and allow more room for error accumulation so allow more operations to be done. On the other hand, this is kind of misleading to say. If you generate ANY random odd integer, there will be a highest bit set to 1 SOMEWHERE. You technically don’t need to store the zeros above that. So I guess you could look at it like you are just generating ANY random odd integer, and you could figure out N FROM that value (the position of the highest bit). Thinking about it the way we do, though, lets us specify how many bits we actually want to commit to for the key, which gives us more consistent behavior, an upper bound on storage space, etc.

Secondly, why does the key need to be odd?

Let’s say that you have two numbers A and B where A represents an encrypted bit and B represents the encryption key. If B is even, then A % B will always have the same parity (whether it’s even or odd) as A. Since we are trying to hide whether our encrypted bit is 0 or 1 (even or odd), that makes it very bad encryption since you can recover the plain text bit by doing encryptedValue % 2. If on the other hand, B is odd, A % B will have the same parity as A only if A / B is even.
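
For a concrete example with an odd B of 13:

14 % 13 = 1, and 14 / 13 = 1 which is odd, so the parity flipped (14 is even, 1 is odd).
27 % 13 = 1, and 27 / 13 = 2 which is even, so the parity was preserved (27 and 1 are both odd).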

This doesn’t really make much of a difference in the scheme in this post, because A / B will always be 1 (since the encrypted bit is the key plus the plain text bit), but in the next scheme it is more important because A / B will be a random number, which means that it will be random with a 50/50 chance whether or not the parity of the encrypted bit matches the parity of the plain text bit. Since it’s an even chance whether it matches or not, that means that an attacker can’t use that information to their advantage.

While it’s true that when generating a random key, there is a 50/50 chance of whether you will get an even or odd key, you can see how we’d be in a situation where 75% of the time the parity of the ciphertext would match the parity of the plaintext if we allowed both even and odd keys.

That would mean that while an attacker couldn’t know for CERTAIN whether an encrypted bit is 1 or 0 based on the cipher text, they can guess with 75% confidence that the unencrypted bit will just be the cipher text % 2, which is no good! So, we are better off sticking with an odd numbered key in this scheme. But again, that won’t really matter until the next post!

XOR as Addition

I know that I’m going to butcher this explanation a bit in the eyes of someone who knows this math stuff better than me. If you are reading this and see that I have indeed done that, please drop me a line or leave a comment and let me know what I’ve missed or could explain better. I suspect there’s something about rings going on here (;

Believe it or not, when you add two numbers together and then take the modulus, you get the same answer as if you did the modulus on the two numbers, added them together, and then took the modulus again.

In other words, adding two numbers can be seen as adding their residue (remainder).

Let me show you an example.

15 + 28 = 43 \newline \newline ((15 \% 13) + (28 \% 13)) \% 13 = 43 \% 13 \newline (2 + 2) \% 13 = 4 \newline 4 = 4

Let’s try another one. I’m picking these numbers “randomly” out of my head 😛

28 + 47 = 75 \newline \newline ((28 \% 8) + (47 \% 8)) \% 8 = 75 \% 8 \newline (4 + 7) \% 8 = 3 \newline 3 = 3

OK makes sense, but who cares about that?

Well, believe it or not, 1 bit addition is the same as XOR! This means that you can add numbers together, which adds the modulus of their key together, which then in turn adds that number mod 2 together, to preserve the encrypted parity (odd or even-ness).

Check out this 2 bit binary math. Keep in mind that with 1 bit results, you would only keep the right most binary digit. I’m showing two digits to show you that it is in fact binary addition, and that the right most bit is in fact the same as XOR.

0 + 0 = 00 \newline 0 + 1 = 01 \newline 1 + 0 = 01 \newline 1 + 1 = 10

One thing to note before we move on is that since we are doing a modulus against the key, when the remainder gets to be too large it rolls over. When it rolls over, we start getting the wrong answers and have hit our ceiling of how many operations we can do. So, our encrypted value modulo the key, divided by the key, can be seen as how far along we are (as a percentage) towards our error ceiling.

To avoid the problem of error getting too high too quickly and limiting your calculation count too much, you can increase the key size. When you do that, you’ll then run out of bits in your fixed size integer storage faster. To avoid THAT problem you can use “multi precision math libraries” which allow your integers to use an arbitrary number of bytes. This is what many real crypto algorithms use when they need to deal with very large numbers.

AND as Multiplication

Similar to the above, when you multiply two numbers and take a modulus of the result, it’s the same as if you took the modulus of the two numbers, multiplied that, and then took the modulus of the result.

In other words, when you multiply two numbers, you can think of it as also multiplying their residue (remainder).

Using the first example numbers from above:

15 * 28 = 420 \newline \newline ((15 \% 13) * (28 \% 13)) \% 13 = 420 \% 13 \newline (2 * 2) \% 13 = 4 \newline 4 = 4

And the second:

28 * 47 = 1316 \newline \newline ((28 \% 8) * (47 \% 8)) \% 8 = 1316 \% 8 \newline (4 * 7) \% 8 = 4 \newline 4 = 4

A bit of a coincidence that they both worked out to 4 this time 😛

Similar to XOR being the same as 1 bit addition, 1 bit multiplication is actually the same as AND, check it out:

0 * 0 = 0 \newline 0 * 1 = 0 \newline 1 * 0 = 0 \newline 1 * 1 = 1

Since AND multiplies residue, and XOR adds residue, and residue is what limits our homomorphic instruction count, you can see that AND is a more expensive operation compared to XOR, since it eats into our instruction budget a lot faster.

Error In Action

To see why rolling over is a problem, let’s say that our key is 9 and we want to XOR two encrypted bits 8 and 1, which represent 0 and 1 respectively.

To do an XOR, we add them together: 8 + 1 = 9.

Now, when we decrypt it we do this: (9 % 9) % 2 = 0

That result tells us that 0 XOR 1 is 0, which is incorrect! Our residue got too large and we hit the ceiling of our homomorphic instruction budget.

If the first bit was 6 instead of 8, the result of the XOR would have been 7, and (7 % 9) % 2 comes out to 1. That re-affirms to us that if we are under the error budget, we are good to go, but if our residue gets too large, we will have problems!

Sample Code

// Note that this encryption scheme is insecure so please don't actually use it
// in production!  A false bit with a given key is the same value every time, and
// so is a true bit.  Also, the encrypted true bit value will always be the
// encrypted false bit plus 1.  Even worse, an encrypted false bit is the key itself!
// This is just for demonstration purposes to see how the basics of homomorphic
// encryption work.  The next blog post will increase security.

#include <stdio.h>
#include <stdint.h>
#include <random>
#include <array>
#include <inttypes.h>

typedef uint64_t uint64;

// Increase this value to increase the size of the key, and also the maximum
// size of the error allowed.
// If you set it too high, operations will fail when they run out of storage space
// in the 64 bit ints.  If you set it too low, you will not be able to do very many
// operations in a row.
const size_t c_numKeyBits = 6;

#define Assert(x) if (!(x)) ((int*)nullptr)[0] = 0;

//=================================================================================
// TODO: Replace with something crypto secure if desired!
uint64 RandomUint64 (uint64 min, uint64 max)
{
    static std::random_device rd;
    static std::mt19937 gen(rd());
    std::uniform_int_distribution<uint64> dis(min, max);
    return dis(gen);
}

//=================================================================================
void WaitForEnter ()
{
    printf("Press Enter to quit");
    fflush(stdin);
    getchar();
}

//=================================================================================
uint64 GenerateKey ()
{
    // Generate an odd random number in [2^(N-1), 2^N)
    // N is the number of bits in our key
    // The key also defines the maximum amount of error allowed, and thus the number
    // of operations allowed in a row.
    return RandomUint64(0, (1 << c_numKeyBits) - 1) | 1 | (1 << (c_numKeyBits - 1));
}

//=================================================================================
bool Decrypt (uint64 key, uint64 value)
{
    return ((value % key) % 2) == 1;
}

//=================================================================================
uint64 Encrypt (uint64 key, bool value)
{
    uint64 ret = key + (value ? 1 : 0);
    Assert(Decrypt(key, ret) == value);
    return ret;
}

//=================================================================================
uint64 XOR (uint64 A, uint64 B)
{
    return A + B;
}

//=================================================================================
uint64 AND (uint64 A, uint64 B)
{
    return A * B;
}

//=================================================================================
int GetErrorPercent (uint64 key, uint64 value)
{
    // Returns what % of maximum error this value has in it.  When error >= 100%
    // then we have hit our limit and start getting wrong answers.
    return int(100.0f * float(value % key) / float(key));
}

//=================================================================================
uint64 FullAdder (uint64 A, uint64 B, uint64 &carryBit)
{
    // homomorphically add the encrypted bits A and B
    // return the single bit sum, and put the carry bit into carryBit
    // From http://en.wikipedia.org/w/index.php?title=Adder_(electronics)&oldid=381607326#Full_adder
    uint64 sumBit = XOR(XOR(A, B), carryBit);
    carryBit = XOR(AND(A, B), AND(carryBit, XOR(A, B)));
    return sumBit;
}

//=================================================================================
int main (int argc, char **argv)
{
    // run this test a bunch to show that it works.  If you get a divide by zero
    // in an Assert, that means that it failed, and hopefully it's because you
    // increased c_numKeyBits to be too large!
    printf("Verifying 10000 truth tables.  Details of first one:n");
    for (int index = 0; index < 10000; ++index)
    {
        // make our key and a true and false bit
        uint64 key = GenerateKey();
        uint64 falseBit = Encrypt(key, false);
        uint64 trueBit = Encrypt(key, true);

        // Verify truth tables for XOR and AND
        Assert(Decrypt(key, XOR(falseBit, falseBit)) == false);
        Assert(Decrypt(key, XOR(falseBit, trueBit )) == true );
        Assert(Decrypt(key, XOR(trueBit , falseBit)) == true );
        Assert(Decrypt(key, XOR(trueBit , trueBit )) == false);

        Assert(Decrypt(key, AND(falseBit, falseBit)) == false);
        Assert(Decrypt(key, AND(falseBit, trueBit )) == false);
        Assert(Decrypt(key, AND(trueBit , falseBit)) == false);
        Assert(Decrypt(key, AND(trueBit , trueBit )) == true );

        // report the results for the first iteration of the loop
        if (index == 0)
        {
            printf("Key 0x%" PRIx64 ", false 0x%" PRIx64 ", true 0x%" PRIx64 "n", key, falseBit, trueBit);
            printf("  [0 xor 0] = 0   0x%" PRIx64 " xor(+) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%i%%)n", falseBit, falseBit, XOR(falseBit, falseBit), Decrypt(key, XOR(falseBit, falseBit)) ? 1 : 0, GetErrorPercent(key, XOR(falseBit, falseBit)));
            printf("  [0 xor 1] = 1   0x%" PRIx64 " xor(+) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%i%%)n", falseBit, trueBit , XOR(falseBit, trueBit ), Decrypt(key, XOR(falseBit, trueBit )) ? 1 : 0, GetErrorPercent(key, XOR(falseBit, trueBit )));
            printf("  [1 xor 0] = 1   0x%" PRIx64 " xor(+) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%i%%)n", trueBit , falseBit, XOR(trueBit , falseBit), Decrypt(key, XOR(trueBit , falseBit)) ? 1 : 0, GetErrorPercent(key, XOR(trueBit , falseBit)));
            printf("  [1 xor 1] = 0   0x%" PRIx64 " xor(+) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%i%%)n", trueBit , trueBit , XOR(trueBit , trueBit ), Decrypt(key, XOR(trueBit , trueBit )) ? 1 : 0, GetErrorPercent(key, XOR(trueBit , trueBit )));
            printf("  [0 and 0] = 0   0x%" PRIx64 " and(*) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%i%%)n", falseBit, falseBit, AND(falseBit, falseBit), Decrypt(key, AND(falseBit, falseBit)) ? 1 : 0, GetErrorPercent(key, XOR(falseBit, falseBit)));
            printf("  [0 and 1] = 0   0x%" PRIx64 " and(*) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%i%%)n", falseBit, trueBit , AND(falseBit, trueBit ), Decrypt(key, AND(falseBit, trueBit )) ? 1 : 0, GetErrorPercent(key, XOR(falseBit, trueBit )));
            printf("  [1 and 0] = 0   0x%" PRIx64 " and(*) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%i%%)n", trueBit , falseBit, AND(trueBit , falseBit), Decrypt(key, AND(trueBit , falseBit)) ? 1 : 0, GetErrorPercent(key, XOR(trueBit , falseBit)));
            printf("  [1 and 1] = 1   0x%" PRIx64 " and(*) 0x%" PRIx64 " = 0x%" PRIx64 " (%i err=%i%%)n", trueBit , trueBit , AND(trueBit , trueBit ), Decrypt(key, AND(trueBit , trueBit )) ? 1 : 0, GetErrorPercent(key, XOR(trueBit , trueBit )));
        }
    }

    // Do multi bit addition as an example of using compound circuits to
    // do meaningful work.
    const size_t c_numBitsAdded = 5;
    printf("nDoing 10000 Multibit Additions.  Details of first one:n");
    std::array<uint64, c_numBitsAdded> numberAEncrypted;
    std::array<uint64, c_numBitsAdded> numberBEncrypted;
    std::array<uint64, c_numBitsAdded> resultEncrypted;
    for (int index = 0; index < 10000; ++index)
    {
        // generate the numbers we want to add
        uint64 numberA = RandomUint64(0, (1 << c_numBitsAdded) - 1);
        uint64 numberB = RandomUint64(0, (1 << c_numBitsAdded) - 1);

        // generate our key
        uint64 key = GenerateKey();

        // encrypt our bits
        for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
        {
            numberAEncrypted[bitIndex] = Encrypt(key, (numberA & (uint64(1) << uint64(bitIndex))) != 0);
            numberBEncrypted[bitIndex] = Encrypt(key, (numberB & (uint64(1) << uint64(bitIndex))) != 0);
        }

        // do our multi bit addition!
        // we could initialize the carry bit to 0 or the encrypted value of 0. either one works since 0 and 1
        // are also poor encryptions of 0 and 1 in this scheme!
        uint64 carryBit = Encrypt(key, false);
        for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
            resultEncrypted[bitIndex] = FullAdder(numberAEncrypted[bitIndex], numberBEncrypted[bitIndex], carryBit);

        // decrypt our result
        uint64 resultDecrypted = 0;
        for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
        {
            if (Decrypt(key, resultEncrypted[bitIndex]))
                resultDecrypted |= uint64(1) << uint64(bitIndex);
        }

        // make sure that the results match, keeping in mind that the addition is modulo 2^c_numBitsAdded and may have rolled over
        Assert(resultDecrypted == ((numberA + numberB) % (1 << c_numBitsAdded)));

        // report the results for the first iteration of the loop
        if (index == 0)
        {
            printf("Key 0x%" PRIx64 ", %" PRId64 " + %" PRId64 " in %i bits = %" PRId64 "n", key, numberA, numberB, c_numBitsAdded, (numberA + numberB) % (1 << c_numBitsAdded));
            for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
                printf("  A[%i] = 0x%" PRIx64 " (%i err=%i%%)n", bitIndex, numberAEncrypted[bitIndex], Decrypt(key, numberAEncrypted[bitIndex]), GetErrorPercent(key, numberAEncrypted[bitIndex]));
            printf("+n");
            for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
                printf("  B[%i] = 0x%" PRIx64 " (%i err=%i%%)n", bitIndex, numberBEncrypted[bitIndex], Decrypt(key, numberBEncrypted[bitIndex]), GetErrorPercent(key, numberBEncrypted[bitIndex]));
            printf("=n");
            for (int bitIndex = 0; bitIndex < c_numBitsAdded; ++bitIndex)
                printf("  Result[%i] = 0x%" PRIx64 " (%i err=%i%%)n", bitIndex, resultEncrypted[bitIndex], Decrypt(key, resultEncrypted[bitIndex]), GetErrorPercent(key, resultEncrypted[bitIndex]));
            printf("result decrypted = %" PRId64 "n", resultDecrypted);
        }
    }

    WaitForEnter();
    return 0;
}

Here is the output of a run of the program:

What if I Need Constants?!

If you are thinking how you might actually use this code in a real setting, you might be thinking to yourself “it’s great to be able to multiply two encrypted numbers together, but what if I just need to multiply them by a constant like 43?”

Well, interestingly, you can literally just use 0 and 1 in this scheme as constants to perform operations against the encrypted bits.

The reason that this works is that you can see 0 and 1 as just very poor encryptions 😛

(0 % KEY) % 2 = 0
(1 % KEY) % 2 = 1

As long as KEY is >= 2, the above is always true, no matter what the key actually is!

So there you go, add your own constants into the calculations all you want. They also happen to have very low residue/error (actually, the least amount possible!), so they are much friendlier to use than having someone provide you with an encrypted table of constants to use in your calculations. It’s also more secure for the person doing the encrypting to provide you with less encrypted data that you know the plain text for, since it limits your (and anyone else’s) ability to do a known plain text attack.
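
Here’s a quick sketch of what that looks like, using the functions from the sample code above (my own addition; you could drop it into main() to try it out):

// Using plain 0 and 1 as (very poorly) "encrypted" constants
uint64 key = GenerateKey();
uint64 encryptedTrue = Encrypt(key, true);

uint64 andWithOne  = AND(encryptedTrue, 1);  // multiply by 1: still decrypts to true
uint64 andWithZero = AND(encryptedTrue, 0);  // multiply by 0: decrypts to false
uint64 xorWithOne  = XOR(encryptedTrue, 1);  // add 1: flips the bit, so it decrypts to false

Assert(Decrypt(key, andWithOne)  == true);
Assert(Decrypt(key, andWithZero) == false);
Assert(Decrypt(key, xorWithOne)  == false);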

The Other Shoe Drops

You might notice that in our scheme, given the same key, every true bit will be the same value, and every false bit will be the same value. Unfortunately, the true bit is also always the false bit + 1. As an attacker, this means that once you have seen both a true bit and a false bit, you will then have broken the encryption.

Even worse, when you encrypt a false bit, it gives you back the key itself!

We’ll improve that in the next post by adding a few more simple operations to the encryption process.

This leveled HE encryption scheme comes directly from the paper below. If you want to give it a look, what we covered is only part of the first two pages!
Fully Homomorphic Encryption over the Integers

The links below are where I started reading up on HE. They go a different route with FHE that you might find interesting, and also have a lot more commentary about the usage cases of HE:
The Swiss Army Knife of Cryptography
Building the Swiss Army Knife

In the scheme in those links, I haven’t figured out how multiplication is supposed to work yet (or bootstrapping, but one thing at a time). If you figure it out, let me know!

Feistel Networks – Do They Have to use XOR?

If you have no idea what a Feistel network is, but like cryptography and/or random number generation algorithms, read this link first:
Fast & Lightweight Random “Shuffle” Functionality – FIXED!

As a quick refresher, to encrypt data with a Feistel network, you break the plain text data into a left and a right side and do N rounds of this operation:

Left[i+1]  = Right[i];
Right[i+1] = Left[i] ^ RoundFunction(Right[i], key);

Where RoundFunction is ideally some chaotic function that returns some pseudo-random-esque number based on the inputs. For instance, RoundFunction could be MD5 so that it returned the MD5 hash of the data and the key, where the key could be considered the salt of the hash. The better the round function, the better your encryption algorithm will be.
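
For instance, here is a sketch of a round function built on a simple xorshift-and-multiply integer mixer. This is just my own stand-in for a real hash; anything suitably chaotic that depends on both inputs works:

// Mix the value and key into one 32 bit number, stir it with xorshifts and
// multiplies, and keep the low byte as the round function output.
unsigned char MixRoundFunction (unsigned char value, unsigned char key)
{
	unsigned int x = (static_cast<unsigned int>(value) << 8) | key;
	x ^= x >> 16;
	x *= 0x7feb352dU;
	x ^= x >> 15;
	x *= 0x846ca68bU;
	x ^= x >> 16;
	return static_cast<unsigned char>(x & 0xFF);
}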

To decrypt data with a Feistel network, you break the data into a left and right side and do the same number of rounds of this operation:

Right[i] = Left[i+1];
Left[i] = Right[i+1] ^ RoundFunction(Left[i+1], key);

Ok, onto the question….

Does it Have to use XOR?

Recently a friend of mine was using Feistel networks for something pretty amazing (so amazing, I can’t even talk about it), but in doing so, he asked me an interesting question. He asked “do you think this HAS to be XOR here, where we combine the round functions result back into the data?”. Well, it turns out, it doesn’t!

The operation has to be a reversible operation though, and you have to do the reverse operation when decrypting that you did while encrypting.

For instance, when encrypting you could add the round function result in, but then when decrypting, you would have to subtract the round function result out.

Or, you could do bitwise rotation left when encrypting, and right when decrypting perhaps.

Basically, anything that has a reverse operation can be used.
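
For instance, here is a sketch of the rotation idea, under the assumption that the round function output picks how far to rotate (my own illustration):

// rotate an 8 bit value left or right by 0 to 7 bits
unsigned char RotL (unsigned char value, unsigned char amount)
{
	amount &= 7;
	return (unsigned char)((value << amount) | (value >> ((8 - amount) & 7)));
}

unsigned char RotR (unsigned char value, unsigned char amount)
{
	amount &= 7;
	return (unsigned char)((value >> amount) | (value << ((8 - amount) & 7)));
}

// Encryption round, combining with a left rotation instead of an xor:
//   Left[i+1]  = Right[i];
//   Right[i+1] = RotL(Left[i], RoundFunction(Right[i], key));
// Decryption round, undoing it with a right rotation by the same amount, which
// the decryptor can recompute since it only depends on the half that passed
// through unchanged:
//   Right[i] = Left[i+1];
//   Left[i]  = RotR(Right[i+1], RoundFunction(Left[i+1], key));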

You have to be careful though because you might be lured into the trap of thinking that this includes something like multiplication and division.

If you multiply when you encrypt, you might get an integer overflow and lose data that can’t be corrected by doing a divide. For instance, if you multiply 255*2 in an unsigned 8 bit number you get 254 as a result. If you divide 254 by 2 to “undo” the multiplication, you get 127 which is obviously not 255, so we’ve lost some data. In an unsigned 8 bit number, ((255*2)/2) = 127.

If you go the other way and divide on encryption, and multiply on decryption, that doesn’t work either. For instance, when you divide 3 by 2, you get 1 with integer math, and when you multiply by 2, you get 2. So, with integers… ((3/2)*2) = 2.

Confusing note: you ARE able to do irreversible operations within the round function though. Feel free to do a divide or whatever you want in there. If that is difficult to understand how that could possibly work, you aren’t alone. Step through the code a bit by hand with a simple round function and a low number of rounds and you might be able to understand better how it does what it does.

I’m really not sure if anyone else out there does this variation on the traditional Feistel networks or not, but it is pretty interesting to combine the RoundFunction result back into the data with something other than XOR.

Source Code

Here’s some simple C++ code below to play with if you want to mess around with this stuff.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static const unsigned int c_numRounds = 4;

void PrimeRandomNumberPump ()
{
	// if you are curious about this, check out:
	// https://blog.demofox.org/2013/06/18/wtf-rand/
	srand((unsigned)time(NULL));
	for (unsigned int index = 0; index < 20; ++index)
		rand();
}

unsigned char RoundFunction (unsigned char value, unsigned char key)
{
	// Not a particularly effective round function, but the round function
	// isn't the point of this code.
	// If you want a better round function, try plugging in a hash function
	// or another chaotic function that has big changes in output for
	// small changes in input.  Also, you could change c_numRounds to a
	// higher number if you want better encryption.
	return value + key | (value * key) + 3;
}

void Encrypt (unsigned char &left, unsigned char &right, unsigned char key)
{
	for (unsigned int index = 0; index < c_numRounds; ++index)
	{
		// Feistel Network Encryption:
		//  Left[i+1]  = Right[i];
		//  Right[i+1] = Left[i] ^ RoundFunction(Right[i], key);

		// let's do addition to combine the value of the round function on 
		// encryption, instead of doing xor.  Xor is used in feistel networks
		// because xor is its own inverse operation.
		unsigned char oldLeft = left;
		left = right;
		right = oldLeft + RoundFunction(right, key);
	}
}

void Decrypt (unsigned char &left, unsigned char &right, unsigned char key)
{
	for (unsigned int index = 0; index < c_numRounds; ++index)
	{
		// Feistel Network Decryption:
		//  Right[i] = Left[i+1];
		//  Left[i] = Right[i+1] ^ RoundFunction(Left[i+1], key);

		// let's do subtraction to combine the value of the round function on 
		// decryption, instead of doing xor.  Xor is used in feistel networks
		// because xor is its own inverse operation.
		unsigned char oldRight = right;
		right = left;
		left = oldRight - RoundFunction(left, key);
	}
}

void DoTest (unsigned char plainText1, unsigned char plainText2, unsigned char key, int &tests, int &errors)
{
	// encrypt the plaintext
	unsigned char cipherText1 = plainText1;
	unsigned char cipherText2 = plainText2;
	Encrypt(cipherText1, cipherText2, key);

	// decrypt the cipher text
	unsigned char decryptedData1 = cipherText1;
	unsigned char decryptedData2 = cipherText2;
	Decrypt(decryptedData1, decryptedData2, key);

	// if the decrypted data doesn't match the plaintext data, count it as an error
	// and show the details
	tests++;
	if (decryptedData1 != plainText1 || decryptedData2 != plainText2)
	{
		errors++;
		printf("plaintext = 0x%02X%02Xrn", (unsigned int)plainText1, (unsigned int)plainText2);
		printf("ciphertext = 0x%02X%02Xrn", (unsigned int)cipherText1, (unsigned int)cipherText2);
		printf("decrypteddata = 0x%02X%02Xrnrn", (unsigned int)decryptedData1, (unsigned int)decryptedData2);
	}
}

int main (void)
{
	// generate a key
	PrimeRandomNumberPump();
	unsigned char key = (unsigned char)rand();

	// run tests with the key
	int errors = 0;
	int tests = 0;
	for (unsigned int y = 0; y < 256; ++y)
		for (unsigned int x = 0; x < 256; ++x)
			DoTest((unsigned char)y, (unsigned char)x, key, tests, errors);
		
	// display the test results
	printf("%i tests ran, %i errors encountered. key = 0x%02Xrn", tests, errors, key);
}

How to Test Randomness of Numbers

At first I said the answer was to check this out: Diehard Battery of Tests of Randomness, which is linked to by this page which may also be of interest: Tests for Random Number Generators.

But apparently that is the “old way” and there is a new program from NIST that you can get here: NIST test suite for random numbers, which subsequently is linked to from random.org: Random.org Statistical Analysis.

Getting that program from NIST to compile was a little bit of a chore for me on MSVC 2010. The biggest hurdle I hit was that MSVC 2010 doesn’t have erf() and erfc(), so I had to google “erf.cpp” and find an implementation. If you can’t find one, erf and erfc are part of gcc, which is open sourced, so you can always go that route if you need to!
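
If you hit the same wall, here’s a minimal sketch of the kind of stand-in you could write yourself, based on the Abramowitz & Stegun approximation 7.1.26 (maximum error around 1.5e-7). The my_erf / my_erfc names are just placeholders of mine, and you should double check the constants against the reference before trusting it for the NIST tests.

#include <math.h>

// Abramowitz & Stegun 7.1.26 approximation of erf(x)
double my_erf (double x)
{
	// the approximation is for x >= 0, so save the sign and work with |x|
	double sign = (x < 0.0) ? -1.0 : 1.0;
	x = fabs(x);

	const double a1 =  0.254829592;
	const double a2 = -0.284496736;
	const double a3 =  1.421413741;
	const double a4 = -1.453152027;
	const double a5 =  1.061405429;
	const double p  =  0.3275911;

	double t = 1.0 / (1.0 + p * x);
	double poly = ((((a5 * t + a4) * t + a3) * t + a2) * t + a1) * t;
	return sign * (1.0 - poly * exp(-x * x));
}

double my_erfc (double x)
{
	// erfc is defined as 1 - erf (fine for a sketch, loses precision for large x)
	return 1.0 - my_erf(x);
}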

After compiling, I was able to run the test on my numbers but couldn’t make much sense of the results very easily. There were a few p values and presumably some chi squared scores somewhere, but the “summary file” was very cryptic (pun intended) so I wasn’t really sure…

Anyways, just wanted to put it here for myself and others if anyone’s looking for this in the future 😛

Thanks to my buddy James for the correction and links to the newer NIST program. Thanks man!

Bottom Line

Interestingly, the tests above use the numbers being tested as input to a bunch of different processes, and then measure the statistics of the results.

For instance, it will use the numbers to shuffle a deck of cards, and then it will play poker and see if there is any bias of cards dealt, or players winning.

Or, it will use the numbers as the source of numbers for a roulette wheel and see if players win at the right rate statistically.

I guess the bottom line lesson for testing random numbers is that you should use the numbers how you intend to use them, and see if there are any statistical anomalies.
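
As a tiny example of that idea, here’s a sketch of about the simplest check you could do: bucket a bunch of rand() values and compute a chi squared statistic against a flat distribution. It only looks at frequencies (not ordering or correlation), and the bucket and sample counts are arbitrary choices of mine, so it’s a starting point rather than a real test suite.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main (void)
{
	const int c_numBuckets = 16;
	const int c_numSamples = 100000;
	int counts[c_numBuckets] = { 0 };

	// count how many samples land in each bucket
	srand((unsigned)time(NULL));
	for (int i = 0; i < c_numSamples; ++i)
		counts[rand() % c_numBuckets]++;

	// chi squared = sum over buckets of (observed - expected)^2 / expected
	double expected = (double)c_numSamples / (double)c_numBuckets;
	double chiSquared = 0.0;
	for (int i = 0; i < c_numBuckets; ++i)
	{
		double diff = (double)counts[i] - expected;
		chiSquared += diff * diff / expected;
	}

	// with 16 buckets there are 15 degrees of freedom, so values wildly far
	// from 15 are suspicious.  a real test would convert this to a p value.
	printf("chi squared = %f\r\n", chiSquared);
	return 0;
}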

There doesn’t seem to be a magic bullet test that works for generic randomness, but I’m guessing it’s just sort of… check for patterns in every way you can, or every way you care about, and if you don’t find any, consider it random. If you are using it for purposes where randomness really matters – like security or gambling – you then hope nobody else finds a pattern you didn’t! 😛

On that topic, check this out: Wikipedia: Michael Larson

Why do you hate me, rand()?!

TL;DR – I’ve always heard rand() sucked for generating (cryptographically strong) random numbers, but it turns out it’s just kind of bad in general too LOL.

OK so this is bizarre, I made a default settings console project in MSVC 2012 with the code below:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(int argc, char** argv)
{
	time_t thetime = 0;
	time(&thetime);
	srand((unsigned)thetime);
	int a = rand();
	int b = rand();
	int c = rand();
	int d = rand();

	printf("time = %llu (%llu)rna = %irnb = %irnc =t %irnd = %irn", thetime, thetime % RAND_MAX, a, b, c, d);
	return 0;
}

Here are some sample outputs, can you see what’s wrong?!

time = 1371620230 (26377)
a = 11108
b = 28489
c = 18911
d = 15679
time = 1371620268 (26415)
a = 11232
b = 10944
c = 9621
d = 12581
time = 1371620289 (26436)
a = 11301
b = 7285
c = 24321
d = 26390
time = 1371620310 (26457)
a = 11369
b = 3625
c = 6252
d = 7432
time = 1371620332 (26479)
a = 11441
b = 10714
c = 6048
d = 12537

Five times in a row, you can see that the first number generated is in the 11,000s. You can also see that it’s steadily increasing.

I included the time modulo RAND_MAX in case that was the first number returned but it isn’t. I also looked at the numbers in hex and there isn’t a clear pattern there either. I can’t really discern the correlation between the time and the first random number, but there is definitely a pattern of some kind.

You always hear you shouldn’t use rand() if you need really high quality random numbers (like for encryption), but I always figured if you use srand() with the time, your numbers would be good enough for games at least. Turns out, you might want to throw out the first random number rand() gives you before using the rest for your games too. Maybe throw out a couple just in case! 😛

You might wonder why b, c and d are seemingly more random than a, but that’s likely due to the Avalanche Effect, aka “sensitivity to initial conditions”, which as it turns out is a nice property of cryptographic algorithms as well as pseudo random number generators. It is also a fundamental idea from Chaos Theory.

Essentially, as you ask for more random numbers, they ought to be more unpredictable, and more “random”. You just get some trash in the beginning.

Anyways… I’m super surprised by just how bad rand() is… I guess I never looked at it like this before (or maybe this is some new bad behavior in MSVC 2012?). Also, RAND_MAX is defined for me as 0x7fff. Ouchies, where are the rest of our numbers? 😛
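
As a side note on that last complaint: if you do need more than 15 bits at a time, one common workaround is to glue several rand() calls together. Here’s a minimal sketch (Rand32 is just my name for it); it assumes RAND_MAX is 0x7fff, and of course it inherits all of rand()’s quality problems, it just gives you more bits of them.

#include <stdlib.h>

unsigned int Rand32 ()
{
	// rand() only fills the low 15 bits when RAND_MAX is 0x7fff, so combine
	// three calls to cover all 32 bits.  the ranges overlap a little, which
	// is fine when combining with xor.
	return ((unsigned int)rand() << 17) ^
	       ((unsigned int)rand() << 8)  ^
	        (unsigned int)rand();
}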