This is the third article in a series on the basics of cryptography:
- Cryptography 101: Hashing
- Cryptography 101: Encryption – One Time Pad
- Cryptography 101: Encryption – Symmetric Keys
- Cryptography 101: Encryption – Asymmetric Keys
- Cryptography 101: Realistic Security
DISCLAIMER: These articles are meant for educational purposes only. The methods explained here are meant only to illustrate the basic concepts of cryptography and may or may not be suitable in real world applications. For serious applications such as financial transactions, I recommend hiring security professionals and also getting a lawyer involved. Use this info and code at your own risk, I claim no responsibility!
If you want more in-depth information about cryptography than these introductory articles provide, I highly recommend a book called Applied Cryptography by Bruce Schneier. That book almost didn’t get published because the NSA didn’t want the info getting out into the public. Yay for the 1st amendment!
Symmetric Key Encryption
Symmetric key encryption is a fancy name for the type of encryption you are probably most familiar with, which is using a password to scramble and unscramble data to make sure only certain people can see it.
This is in contrast to asymmetric key encryption, where you have two passwords (keys): one for encrypting and one for decrypting. The next article is going to be on asymmetric key encryption.
There are numerous symmetric key encryption algorithms out there but they all have one thing in common: their security relies on only the right people having the password, and the assumption that the best way attackers have for getting the plaintext from the ciphertext is to guess the password via brute force.
In good (modern) algorithms, people say things like “on average it will take geological or astronomical amounts of time to guess a password with the computing technology of today and the projected future” so they are reasonably sure people won’t be able to brute force the password in any useful amount of time.
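To get a feel for where those "astronomical" numbers come from, here is a rough back-of-the-envelope sketch. The function name and the guess rate you pass in are my own assumptions for illustration (no real attacker is being measured here), and it assumes every key is equally likely, so on average half of the key space gets searched.

```cpp
#include <cassert>
#include <cmath>

// Expected number of years to find a key by brute force, assuming every key
// is equally likely so that half of the key space is searched on average.
// guessesPerSecond is a hypothetical attacker speed, not a measured figure.
double ExpectedBruteForceYears(int nKeyBits, double guessesPerSecond)
{
    assert(nKeyBits > 0 && guessesPerSecond > 0.0);
    const double secondsPerYear = 365.25 * 24.0 * 60.0 * 60.0;
    // 2^(nKeyBits - 1) guesses on average
    const double averageGuesses = std::ldexp(1.0, nKeyBits - 1);
    return averageGuesses / guessesPerSecond / secondsPerYear;
}
```

With an assumed rate of a trillion guesses per second, ExpectedBruteForceYears(56, 1e12) works out to well under a year (roughly ten hours), while ExpectedBruteForceYears(128, 1e12) is on the order of 10^18 years. Every extra bit of key doubles the work.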
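Quantum computers give some forms of cryptography a scare though, because there is a quantum computing algorithm called Grover’s Algorithm that can brute force search N possibilities in roughly the square root of N operations. That is a quadratic speedup, not an exponential one, but it still means a quantum computer could brute force guess the password of an encryption algorithm a lot faster than a normal computer: a 128 bit key would only give about the protection that a 64 bit key gives against classical brute force, which is why using longer keys is the usual defense. (The exponential speedups, such as Simon’s Algorithm and Shor’s Algorithm, only apply to problems with special structure, and Shor’s is mainly a threat to asymmetric key algorithms.) At the time of writing this, I think the record for quantum computing power is something like having 4 qubits work together to do some simple math operation (like multiplication). We could be on the precipice of disaster regarding cryptography, but luckily there are encryption algorithms that quantum computers are not known to attack much faster than classical computers can, so it isn’t all doom and gloom.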
When decrypting data with either symmetric or asymmetric key encryption, there is no built in way to know if you had the right password or not. You can tell by looking at the recovered plaintext and seeing whether you got junk or meaningful data out, but if you don’t know exactly what the data is supposed to be, or what it’s supposed to look like, there’s no way to know if you decrypted it correctly. This means it can sometimes be difficult for attackers to even KNOW if they have guessed the right password or not, which is good for us folk trying to protect data.
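Here is a small sketch of that idea. XorWithKey is a deliberately weak toy cipher I made up as a stand-in, and LooksLikeAsciiText is a hypothetical check an attacker might use if they expect plain ASCII text. Notice that decrypting with the wrong key doesn’t fail or report anything, it just quietly returns different bytes, so the attacker needs some expectation about the plaintext to judge a guess.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy repeating-key XOR, used only as a stand-in cipher for illustration.
// Applying it twice with the same key returns the original data.
std::string XorWithKey(const std::string &data, const std::string &key)
{
    assert(!key.empty());
    std::string out(data.size(), '\0');
    for (std::size_t i = 0; i < data.size(); ++i)
        out[i] = static_cast<char>(data[i] ^ key[i % key.size()]);
    return out;
}

// Heuristic for judging a key guess when the plaintext is expected to be
// ASCII text: accept only if every byte is printable ASCII or whitespace.
bool LooksLikeAsciiText(const std::string &data)
{
    for (unsigned char c : data)
    {
        const bool printable = (c >= 0x20 && c <= 0x7E);
        const bool whitespace = (c == '\n' || c == '\r' || c == '\t');
        if (!printable && !whitespace)
            return false;
    }
    return true;
}
```

If the plaintext were random binary data instead of text, a check like this would be useless, and the attacker would have a much harder time telling a right guess from a wrong one.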
Just like a good hashing algorithm, small changes in input should ideally yield large changes in output (often called the avalanche effect), which makes it a Chaotic Function and makes it so the cipher text gives as little information about the plaintext as possible.
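One way to put a number on "large changes in output" is to count how many bits differ between two ciphertexts. The helpers below are a minimal sketch with names of my own choosing; they only measure the difference between two buffers and don’t assume any particular cipher.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Number of bit positions in which two equal-length buffers differ.
int CountDifferingBits(const Bytes &a, const Bytes &b)
{
    assert(a.size() == b.size());
    int nCount = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        unsigned char diff = static_cast<unsigned char>(a[i] ^ b[i]);
        while (diff != 0)
        {
            diff = static_cast<unsigned char>(diff & (diff - 1)); // clear the lowest set bit
            ++nCount;
        }
    }
    return nCount;
}

// Fraction of all bits that differ, from 0.0 (identical) to 1.0 (every bit flipped).
double FractionOfBitsChanged(const Bytes &a, const Bytes &b)
{
    assert(!a.empty());
    return static_cast<double>(CountDifferingBits(a, b)) / (8.0 * a.size());
}
```

To use it, encrypt two plaintexts that differ in a single bit with the same key and compare the results. A cipher with good diffusion should land somewhere near 0.5, while a cipher that handles each byte independently can never change more than 8 bits no matter how long the message is.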
Sometimes people will use multiple encryption algorithms on a piece of data in the hopes of making it harder to crack, which sometimes works, but can also be fairly dangerous.
To understand the danger, consider how every program, no matter how complex, is essentially a traditional algebraic function (with perhaps lots and lots and lots of terms). For encryption, the input is the plain text and key, and the output is the cipher text.
Now, just like in junior high and high school, sometimes when you plug one function into another like f(g(x)) and perform algebraic substitution, terms from f and g may cancel out. You may end up with a function that is less complex than either f(x) or g(x), or it may just be less complex for certain values of x. An attacker could exploit these simplifications to their advantage, and it might be easier for them to recover some or all of the plaintext because you used two encryption algorithms instead of one.
On the other hand, using multiple algorithms, or the same algorithm multiple times (perhaps with different keys) can also make it a lot more secure. It’s just something to be mindful of.
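To make the cancellation danger concrete, here is a sketch using two toy single-byte "ciphers" that I made up for illustration (they are not real algorithms). Running either one twice with two different keys collapses into a single pass with one combined key, so the second layer adds no strength at all, and XORing twice with the same key hands back the plaintext.

```cpp
#include <cassert>

// Toy single-byte ciphers, hypothetical and for illustration only.
unsigned char AddCipher(unsigned char nByte, unsigned char nKey)
{
    return static_cast<unsigned char>(nByte + nKey); // wraps modulo 256
}

unsigned char XorCipher(unsigned char nByte, unsigned char nKey)
{
    return static_cast<unsigned char>(nByte ^ nKey);
}

// True if XorCipher with nKeyA followed by XorCipher with nKeyB matches a
// single XorCipher pass with (nKeyA ^ nKeyB) for every possible input byte.
bool DoubleXorCollapses(unsigned char nKeyA, unsigned char nKeyB)
{
    const unsigned char nCombined = static_cast<unsigned char>(nKeyA ^ nKeyB);
    for (int n = 0; n < 256; ++n)
    {
        const unsigned char nByte = static_cast<unsigned char>(n);
        if (XorCipher(XorCipher(nByte, nKeyA), nKeyB) != XorCipher(nByte, nCombined))
            return false;
    }
    return true;
}

// Same check for AddCipher, where the combined key is (nKeyA + nKeyB) modulo 256.
bool DoubleAddCollapses(unsigned char nKeyA, unsigned char nKeyB)
{
    const unsigned char nCombined = static_cast<unsigned char>(nKeyA + nKeyB);
    for (int n = 0; n < 256; ++n)
    {
        const unsigned char nByte = static_cast<unsigned char>(n);
        if (AddCipher(AddCipher(nByte, nKeyA), nKeyB) != AddCipher(nByte, nCombined))
            return false;
    }
    return true;
}
```

Both checks return true for any pair of keys. Real algorithms are far more complicated than this, but the same kind of collapse is what you are gambling against when you stack them.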
Clever programmers and mathematicians sometimes come up with encryption techniques where attacking the algorithm itself is equivalent to solving famous unsolved math problems from the ages. These often seem really secure because for some of these problems, the best and brightest minds in all of history have been fighting with them for hundreds or thousands of years and making no progress.
Every now and then, some smarty figures one of these out though, and suddenly, encryption algorithms based on it become essentially worthless.
Another common way that people attack ciphertext is via something called a “known plaintext attack”. What this means is that if the attacker knows any part of the plaintext before it became ciphertext, they can sometimes leverage that knowledge to learn a bit more about the key or algorithm used to encrypt the data. That simplifies their work and makes it more likely that they can get the plaintext back without having to resort to brute force.
One really common way this comes up is if people do something like compress their data before encrypting, or they encrypt known file types like executables, word processing documents, image files etc.
The reason is that all of those file types have a standard, well known header, which allows other programs to use them. That header data is known plaintext and can be used by an attacker to get more information about how to recover the rest of the plaintext.
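Here is a sketch of how bad this can get against a weak cipher. It assumes the data was encrypted with a simple repeating-key XOR (my assumption for illustration, real algorithms are much harder to attack this way) and that the attacker knows the first few bytes of the file, for example the 8 byte signature that every PNG file starts with. RecoverKeyStream is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Given ciphertext from a repeating-key XOR (a deliberately weak, hypothetical
// cipher) and the plaintext bytes the attacker already knows for the start of
// the file, recover that many bytes of the key stream.
Bytes RecoverKeyStream(const Bytes &cipherText, const Bytes &knownPrefix)
{
    assert(knownPrefix.size() <= cipherText.size());
    Bytes keyStream(knownPrefix.size());
    for (std::size_t i = 0; i < knownPrefix.size(); ++i)
    {
        // cipher = plain ^ key, so key = cipher ^ plain
        keyStream[i] = static_cast<unsigned char>(cipherText[i] ^ knownPrefix[i]);
    }
    return keyStream;
}
```

If the key is no longer than the known header, the attacker gets the whole key this way and can decrypt the rest of the file without guessing anything.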
For all the clever people out there trying to make encryption based on super advanced mathematics, in the end, some of the very most secure algorithms out there are based on very simple computing operations such as addition, subtraction, bit rotation, and XOR.
As an example, there is an algorithm called RC5, published in 1994, which only uses those basic operations (you can find the source code for it easily!) and yet is extremely secure. In 1997, RSA Laboratories issued a public challenge with data encrypted using various RC5 key sizes, including 7 byte, 8 byte and 9 byte keys. The first one was cracked via brute force later that same year after about 250 days of distributed computing, the second took almost 5 years and fell in 2002, and the third is still uncracked, with projections that it will take something like hundreds more years at the rates seen so far. More information available here: RC5
A symmetric key algorithm is any deterministic algorithm that, given a key, can obfuscate (hide / scramble) data, and later, given the same key, can undo the operations that it did to get the original data back.
Since all operations have to be reversible, that limits you to non destructive operations. XOR isn’t destructive, because A XOR B XOR B = A. Addition and subtraction aren’t destructive, because A + B – B = A (this is even true when you wrap around the max size of your integer). Division is destructive however, because when you divide on a computer you have finite precision (even with floating point numbers), which means you can’t always recover the original data when trying to undo a division with a multiplication. Bit rotation is another operation that isn’t destructive. NOT isn’t destructive, but AND and OR are destructive, because different inputs can produce the same output. Another operation that isn’t destructive is moving bytes around, since you could just do the moves again in reverse order to get the original data back.
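A few tiny helpers make the reversible versus destructive distinction easy to check for yourself. The function names are mine, and everything works on single 8 bit values to keep it short.

```cpp
#include <cassert>

// Rotate an 8 bit value left or right by nBits (0..7). Bits that fall off one
// end come back in on the other, so nothing is lost.
unsigned char RotateLeft(unsigned char nValue, int nBits)
{
    assert(nBits >= 0 && nBits < 8);
    return static_cast<unsigned char>((nValue << nBits) | (nValue >> ((8 - nBits) & 7)));
}

unsigned char RotateRight(unsigned char nValue, int nBits)
{
    assert(nBits >= 0 && nBits < 8);
    return static_cast<unsigned char>((nValue >> nBits) | (nValue << ((8 - nBits) & 7)));
}

// Each round trip applies an operation and then its inverse, returning nValue.
unsigned char XorRoundTrip(unsigned char nValue, unsigned char nKey)
{
    return static_cast<unsigned char>((nValue ^ nKey) ^ nKey);
}

unsigned char AddRoundTrip(unsigned char nValue, unsigned char nKey)
{
    const unsigned char nSum = static_cast<unsigned char>(nValue + nKey); // may wrap past 255
    return static_cast<unsigned char>(nSum - nKey);                       // wraps back again
}

unsigned char RotateRoundTrip(unsigned char nValue, int nBits)
{
    return RotateRight(RotateLeft(nValue, nBits), nBits);
}

// AND is destructive: different inputs can give the same output, so there is
// no way to know which input you started with.
unsigned char AndMask(unsigned char nValue, unsigned char nMask)
{
    return static_cast<unsigned char>(nValue & nMask);
}
```

For example, AddRoundTrip(250, 200) wraps around on the way up and still returns 250, while AndMask(0xFF, 0x0F) and AndMask(0x0F, 0x0F) both give 0x0F, which is exactly why AND can’t be undone.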
As simple as all this sounds, these are essentially the building blocks of all encryption algorithms.
Here’s an example algorithm that you could use to encrypt and decrypt data. I don’t do any byte swapping (moving bytes around), or bit rotation, but those would be some good ways to improve it.
//Takes a pointer and length so you can encrypt binary data as well as text
//the pOutData parameter should point to memory that is the same size as pData
//If bEncrypt is true, it will encrypt data. If bEncrypt is false, it will decrypt data.
void EncryptData(const unsigned char *pData, int nDataLength, unsigned char *pOutData, const unsigned char *pKey, int nKeyLength, bool bEncrypt)
{
    int nKeyIndex = 0;
    unsigned char nRunningSum = 0;
    for(int nDataIndex = 0; nDataIndex < nDataLength; ++nDataIndex)
    {
        //update our running sum
        nRunningSum += pKey[nKeyIndex % nKeyLength];
        //get our current byte of plaintext or ciphertext
        unsigned char nDataByte = pData[nDataIndex];
        //to decrypt, it subtracts a running sum of the key then xors against the current key byte
        if(!bEncrypt)
            nDataByte -= nRunningSum;
        //do our xor, whether we are encrypting or decrypting
        nDataByte = nDataByte ^ pKey[nKeyIndex % nKeyLength];
        //to encrypt, it xors against the current key byte and then adds a running sum of the key
        if(bEncrypt)
            nDataByte += nRunningSum;
        //set the output data byte
        pOutData[nDataIndex] = nDataByte;
        //move to the next byte in the key
        nKeyIndex++;
    }
}
Also, here’s some example code of how to use this function:
//our key and plain text (this snippet needs <stdio.h> and <string.h>)
const char *pKey = "MyKeyIsFairlyLongButThatIsJustFine!124351 seven";
const char *pPlainText = "This is some plaintext, how do you do?";
//allocate space for our cipher text and recovered plain text
unsigned char *pCipherText = new unsigned char[strlen(pPlainText)];
unsigned char *pRecoveredPlainText = new unsigned char [strlen(pPlainText)+1];
//print out our plain text
printf("Plain text: %s\n", pPlainText);
//encrypt the plain text
EncryptData((unsigned char *)pPlainText,strlen(pPlainText),pCipherText,(unsigned char *)pKey,strlen(pKey),true);
//print out the cipher text as hex digits
printf("Cipher text: ");
for(size_t nIndex = 0; nIndex < strlen(pPlainText); ++nIndex)
    printf("%02x", pCipherText[nIndex]);
printf("\n");
//decrypt the cipher text to recover the plain text
EncryptData(pCipherText,strlen(pPlainText),pRecoveredPlainText,(unsigned char *)pKey,strlen(pKey),false);
//print out the recovered plain text after we null terminate it
pRecoveredPlainText[strlen(pPlainText)] = 0;
printf("Recovered plain text: %s\n", pRecoveredPlainText);
//free the memory we allocated
delete[] pCipherText;
delete[] pRecoveredPlainText;
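Since I mentioned bit rotation as a possible improvement, here is a rough sketch of what that might look like. EncryptDataWithRotation and its two rotate helpers are hypothetical names, and using the low 3 bits of the running sum as the rotation amount is just one choice I picked for illustration. It is still a toy and shouldn’t be mistaken for a secure algorithm.

```cpp
#include <cassert>

// Rotate an 8 bit value left or right by nBits (only the low 3 bits are used).
static unsigned char RotateLeft8(unsigned char nValue, int nBits)
{
    nBits &= 7;
    return static_cast<unsigned char>((nValue << nBits) | (nValue >> ((8 - nBits) & 7)));
}

static unsigned char RotateRight8(unsigned char nValue, int nBits)
{
    nBits &= 7;
    return static_cast<unsigned char>((nValue >> nBits) | (nValue << ((8 - nBits) & 7)));
}

// Same idea as the example algorithm, with one extra step: between the xor and
// the running sum, each byte is rotated by an amount derived from the key.
// Decryption undoes the three steps in the opposite order.
void EncryptDataWithRotation(const unsigned char *pData, int nDataLength, unsigned char *pOutData,
                             const unsigned char *pKey, int nKeyLength, bool bEncrypt)
{
    assert(nKeyLength > 0);
    unsigned char nRunningSum = 0;
    for (int nDataIndex = 0; nDataIndex < nDataLength; ++nDataIndex)
    {
        const unsigned char nKeyByte = pKey[nDataIndex % nKeyLength];
        nRunningSum += nKeyByte;
        const int nRotate = nRunningSum & 7; // rotation amount, 0..7
        unsigned char nDataByte = pData[nDataIndex];
        if (bEncrypt)
        {
            nDataByte ^= nKeyByte;
            nDataByte = RotateLeft8(nDataByte, nRotate);
            nDataByte += nRunningSum;
        }
        else
        {
            nDataByte -= nRunningSum;
            nDataByte = RotateRight8(nDataByte, nRotate);
            nDataByte ^= nKeyByte;
        }
        pOutData[nDataIndex] = nDataByte;
    }
}
```

It takes the same parameters as the example function above, so you could try it in the usage code in place of that one. Each byte is still processed on its own, so it does nothing for diffusion across bytes; moving bytes around would be the next thing to try.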
Until Next Time!
That’s it for symmetric key algorithms. Next up, I’ll be talking about asymmetric key algorithms, which have some pretty interesting uses.