Permutation Iteration and Random Access

This post has an interesting link to the last one (Inverting Gauss’ Formula). The last post was focused on adding all the numbers from 1 to N, this post will feature multiplying all the numbers from 1 to N.

We aren’t going to be doing it in constant time though, if you want that check out this page: https://devblogs.microsoft.com/oldnewthing/20151214-00/?p=92621.

Calculating The Number of Permutations

Let’s say we have 3 letters A,B,C. How many ways are there to arrange them?

If you look at is as if we have 3 slots to fill with letters, the first slot has 3 different possible choices. It can pick A, B or C. When the first slot chooses a letter, the second slot only has 2 letters to choose from, but it can make that choices 3 times, which means there are 3*2=6 options. For the last slot, there is only one letter left, which means there is no choice to make, and we are done.

First Letter ChoicesSecond Letter ChoicesThird Letter Choices
A _ _AB_
AC_
ABC
ACB
B _ _BA_
BC_
BAC
BCA
C _ _CA_
CB_
CAB
CBA

Let’s see the same table for a 4 letter alphabet.

First Letter ChoicesSecond Letter ChoicesThird Letter ChoicesFourth Letter Choices
A_ _ _AB_ _
AC_ _
AD_ _
ABC_
ABD_
ACB_
ACD_
ADB_
ADC_
ABCD
ABDC
ACBD
ACDB
ADBC
ADCB
B_ _ _BA_ _
BC_ _
BD_ _
BAC_
BAD_
BCA_
BCD_
BDA_
BDC_
BACD
BADC
BCAD
BCDA
BDAC
BDCA
C_ _ _CA_ _
CB_ _
CD_ _
CAB_
CAD_
CBA_
CBD_
CDA_
CDB_
CABD
CADB
CBAD
CBDA
CDAB
CDBA
D_ _ _DA_ _
DB_ _
DC_ _
DAB_
DAC_
DBA_
DBC_
DCA_
DCB_
DABC
DACB
DBAC
DBCA
DCAB
DCBA

This is only two data points, but it generalizes. If you have N letters, the first slot has N choices, the second slot has N-1 choices for each of those. The third slot has N-2 for each of those. It continues until the last slot has only 1 choice. The formula for the number of permutations is then:

N * (N-1) * (N-2) * ... * 1

If you know the sigma symbol for adding numbers together (\Sigma), capital pi is the same, but for multiplying numbers together (\Pi). You can use that to rewrite the formula this way:

\Pi^N_{i=1} N

An even easier way is to just write it as a factorial, because that’s what it is:

N!

Number of LettersPermutation CountProduct Formula
111
221*2
361*2*3
4241*2*3*4
51201*2*3*4*5
67201*2*3*4*5*6

Calculating an Index (Lexicographic Rank) From a Permutation String

Let’s say you have a 4 letter alphabet (A, B, C, D), and you want to know the index of the permutation string “CBAD”.

The index you are looking for is called the Lexicographic rank, and it tells you where this string would be if you had all possible strings and sorted them alphabetically. The intuition for calculating this index is realizing that it is also the number of words that would come before it in that sorted list.

So, we look at the first letter in “CBAD” and see that it is a C. Anything with an A in the first letter position would come first. That is all the strings of the form “A _ _ _”. There are 3 empty slots, with 3 letters to choose from, so this means there are 3! = 3 * 2 * 1 = 6 words that start with the letter A. Our index is at least 6 then.

Also, any word with a “B” in the first slot would come earlier, so that’s another 6 words that would come before our word. Our index is at least 12 then, because we have found 12 words that would come before our word.

Moving on from the first digit, we see that the second slot has a “B”. Any word that begins with “CA” would come before our word. That is all words of the form “CA _ _”. There are 2 empty slots with 2 letters to fill them in, so that is 2! = 2 * 1 = 2 more words. We have now found 14 words that would come before ours, so our index is at least 14.

We have an A in the third slot of our word “CBAD”. No letter is less than A so we are ok there.

Lastly we come to the fourth letter D. Plenty of letters are less than D, but we’ve used them all already! D is the only letter that could go here, so we are done.

There are 14 words that would come before “CBAD”, so the index – or lexicographic rank – of that word is 14.

When looking at a letter in a specific slot in a word to see if there are any letters lower, it’s important that you don’t consider letters that have already been used, to the left of that slot. Each letter should appear exactly once in every word. Also, by that logic, the last slot of a word has no alternative choices. It has to take whatever letter is left over after the other slots have been filled. There is not going to be any word that only differs in the last slot!

If anything about this explanation was unclear, the code section at the end should clear things up.

Calculating a Permutation String From an Index (Lexicographic Rank)

Let’s go the other way. Staying with a 4 letter dictionary, that gives you 4! = 4 * 3 * 2 * 1 = 24 permutations. How would we calculate what string is at index 14? We converted a string into this index in the last section so we already know, but let’s pretend that we didn’t yet know.

The process is very similar to converting from decimal to binary or hexadecimal!

We are going to start off by making a sorted list of the letters in our alphabet. We’ll need this in our algorithm:

letterList = [A, B, C, D]

To calculate the letter for the first slot, we calculate how many (N-1)! ‘s fit into the index.

With a 4 letter dictionary, looking for index 14, that means we want to know how many times 6 (aka 3!) goes into 14. The answer is 2, with a remainder of 2. The answer controls the next letter, and the remainder is passed onto the next iteration.

So, the first letter of our string is letterList[2], which is C. We’ll set the index equal to the remainder we calculated above, which was 2. We will also remove C from the letterList. That leaves us with:

letterList = [A, B, D]
index = 2
string = C_ _ _

Then we move onto the next slot. We calculate the next factorial lower which is 2 (aka 2!) and then calculate how many times it goes into our index. our index is 2, so 2/2 = 1 with a remainder of 0.

We set the second letter of our string to letterList[1], which is B. We set the index equal to the remainder, which is 0. We will also remove B from the letterList. That leaves us with:

letterList = [A, D]
index = 0
string = CB_ _

We then move onto the next slot, and calculate the next factorial lower, which is 1! or 1. We calculate how many times 1 goes into 0. 0/1 = 0 with a remainder of 0.

We set the third letter of our string to letterList[0], or A. We set the index equalt to the raminder, which is 0. We also remove A from the letterList. That leaves us with:

letterList = [D]
index = 0
string = CBA_

When filling the last slot we only ever have one choice, so set it equal to letterList[0] to complete the string: CBAD.

You can verify that CBAD gave us an index of 14 in the last section, so our process successfully did a “round trip”.

Code

Here’s code that implements converting from index to string and from string to index. This code allows you to iterate through permutations, get a string for a specific permutation index, and also allows you to go from string back to index. The program output is below the source code. Enjoy!

#include <stdio.h>
#include <vector>

static const int c_numLetters = 4;

// Returns N!
int Factorial(int N)
{
	int ret = 1;
	for (int i = 2; i <= N; ++i)
		ret *= i;
	return ret;
}

// Given a string (an array of values) of size N, assuming the alphabet is [0,N), returns the lexicographic rank.
// The lexicographic rank is the position the string would be in if you had all possible strings and sorted them.
// Also assumes each letter in the alphabet appears only once.
template <size_t SIZE>
int LexicographicRankFromString(const int(&string)[SIZE])
{
	// Keep track of which letters have already been used, so we know how many lower valued letters would
	// be possible at each position.
	bool lettersUsed[SIZE];
	for (int i = 0; i < (int)SIZE; ++i)
		lettersUsed[i] = false;

	// Calculate the rank
	int rank = 0;
	int factorial = Factorial(SIZE - 1);
	for (int pos = 0; pos < SIZE - 1; ++pos)
	{
		// count how many letters could be in this position that are a lower value
		int lowerLetterCount = 0;
		for (int letterIndex = 0; letterIndex < string[pos]; ++letterIndex)
		{
			if (!lettersUsed[letterIndex])
				lowerLetterCount++;
		}

		// Add those lower valued strings to the rank
		rank += lowerLetterCount * factorial;

		// prepare for next loop
		factorial /= (SIZE - 1 - pos);
		lettersUsed[string[pos]] = true;
	}

	return rank;
}

// Returns the string for a given lexicographic rank
template <size_t SIZE>
void StringFromLexicographicRank(int rank, int(&string)[SIZE])
{
	// Make the full alphabet.
	// We could pass this in as a parameter but we need a mutable copy anyways
	int alphabet[SIZE];
	for (int i = 0; i < (int)SIZE; ++i)
		alphabet[i] = i;

	// Calculate each letter for each position, based on the rank
	int factorial = Factorial(SIZE - 1);
	for (int pos = 0; pos < SIZE - 1; ++pos)
	{
		// However many of the current factorial fit into the rank,
		// that is the index of the alphabet character to put there.
		int numFactorialsFit = rank / factorial;
		string[pos] = alphabet[numFactorialsFit];

		// remove those factorials from the rank
		rank = rank % factorial;

		// remove that value from the alphabet
		for (int i = numFactorialsFit; i < SIZE - 1; ++i)
			alphabet[i] = alphabet[i + 1];

		// prepare for next loop
		factorial /= (SIZE - 1 - pos);
	}

	// one letter left, no choice on what it could be!
	string[SIZE - 1] = alphabet[0];
}

int main(int argc, char** argv)
{
	static const int c_numPermutations = Factorial(c_numLetters);

	printf("%i letters have %i permutations\n", c_numLetters, c_numPermutations);

	// For each permutation...
	int string[c_numLetters];
	for (int index = 0; index < c_numPermutations; ++index)
	{
		// Get the string for this index
		StringFromLexicographicRank(index, string);

		// Get the index from the string just to show that round tripping works
		int roundTrip = LexicographicRankFromString(string);

		// print the index, string and round trip index
		printf("  [%i] ", index);
		for (int i = 0; i < c_numLetters; ++i)
			printf("%c", (char)string[i] + 'A');
		printf("   (round trip = %i)\n", roundTrip);
	}

	return 0;
}

Closing

It’s possible to have permutations where the number of letters in the alphabet is more than the number of letters in a word. Here’s a breadcrumb to working with that situation! https://www.mathplanet.com/education/algebra-2/discrete-mathematics-and-probability/permutations-and-combinations

This is unrelated, but this video talks about generalizing the factorial function to non integers, and even negative numbers: https://www.youtube.com/watch?v=v_HeaeUUOnc

There is an algorithm to iterate through permutation strings directly, without having to use indices. It’s called the “Fischer-Krause algorithm” https://mathsanew.com/articles_html/5/generating_permutationsli3.html.

Hacker news discussion of this post: https://news.ycombinator.com/item?id=37232451


One comment


Leave a comment