# What the Heck is Blue Noise?

This is a gentle explanation of blue noise and how it can be useful.

We’ll start with something simple that we can all get behind – not getting eaten by a cheetah!

Let’s talk about our eyes for a minute.

Our eyes have about 126 million photo receptors in them – about 6 million cones, 120 million rods (source). These photo receptors give your brain an image of the world around you. They are a bit like pixels because they are just small points of data that your brain combines into an image.

How those photo receptors are arranged in your eye can make a big difference. Imagine for a second that we only had 10 photo receptors. If they were laid out like these blue dots, we wouldn’t be able to see the cheetah and we’d become a tasty cat snack.

In the image above, white noise random numbers were used to place the points. White noise is what most people are talking about when they talk about random numbers. Using white noise to generate numbers, the numbers can clump up in some spots and leave empty holes in other spots. When using white noise to lay out photo receptors, that makes it so some photo receptors give redundant information when they are too close together, and leave big open spaces in your vision where you are not getting any information at all. Not good!

What if the dots were laid out like this instead?

The points are still randomly placed, but they are roughly evenly spaced. This makes it so we get the most bang for our buck from the photo receptors. We basically have the maximum amount of information we can get for the number of photo receptors we have to work with.

In this case, two of the photo receptors are on the cat, so we have some information about that predator, and we have a better chance at reacting before we become lunch!

Blue noise random numbers were used to place the points on this image, and this example shows exactly why blue noise can be better than white noise – you get maximal information with fewer samples.

Interestingly, our photo receptors (as well as other animals) are in fact laid out this way. Here is an image of a primate (macaque) retina (source)

You might also find this an interesting read about chicken eyes which also have blue noise properties:
https://www.princeton.edu/news/2014/02/24/eye-chicken-new-state-matter-comes-view

That’s blue noise in a nut shell, but continue on if you’d like to go just a tad bit deeper.

## A Little More Technical

If maximizing information is the goal, you might wonder why blue noise is better than putting the sample points in a grid, or in a honeycomb structure or some other regular pattern. The short answer is that regular patterns have a problem called “aliasing”. Random numbers in general trade the problem of aliasing for the problem of noise, but blue noise random numbers in particular still get the benefits of “roughly even coverage”, so blue noise is the best of both worlds.

Blue noise is difficult / computationally intensive to generate though, compared to white noise or regular sampling. Generating better blue noise more efficient is in fact is an ongoing area of research!

For a deeper comparison of white noise, blue noise, and regular sampling, and also how to generate blue noise sample points, give this a read: https://blog.demofox.org/2017/10/20/generating-blue-noise-sample-points-with-mitchells-best-candidate-algorithm/

If you want at least some of the benefits of blue noise, but don’t want to spend the resources to compute it, a nice alternative might be low discrepancy sequences. You can read about them (and how to generate them) here: https://blog.demofox.org/2017/05/29/when-random-numbers-are-too-random-low-discrepancy-sequences/

You often hear about blue noise and low discrepancy sequences in graphics / in numerical integration. For low sample counts, the blue noise / LDS’s give you more even spaces for your samples in the sampling domain, but I’ve heard that white noise gives you better results for larger sample counts.

There is a whole rainbow of noises possible, each with their own unique usage cases. If you want to know a way to transmute white noise to other colors of noises, give this a read: https://blog.demofox.org/2017/10/25/transmuting-white-noise-to-blue-red-green-purple/

Lastly, the other day I found out that Tempurpedic beds are the best, because they have some secret formula/process they bought from NASA. This recipe allows them to make memory foam such that the bubbles are all roughly the same size. The foam is not arranged into any regular structure such as a grid or a honeycomb, so in essence, the memory foam is blue noise. More specifically, it’s basically the Voronoi diagram of blue noise distributed sample points in 3d.

So, Tempurpedic is the best because they have blue noise foam.

Weird, right?!

# C++ Differentiable Programming: Searching For An Optimal Dither Pattern

The simple standalone C++ source code that implements this blog post and replicates the results shown is on github at: https://github.com/Atrix256/DitherFindGradientDescent

Neural networks are a hot topic right now. There is a lot of mystery and mystique surrounding them, but at their core, they are just simple programs where parameters are tuned using gradient descent.

(If curious about neural networks, you might find this interesting: How to Train Neural Networks With Backpropagation)

Gradient descent can be used in a lot of other situations though, and in fact, you can even generalize the core functionality of neural networks to work on other types of programs. That is exactly what we are doing in this post.

To be able to use gradient descent to optimize parameters of a program, your program has to be roughly of the form of:

1. It has parameters that specify how it processes some other data
2. There is some way for you to give a score to how well it did

Beyond those two points, much like as a shader program or a SIMD program, you want your program to be as branchless as possible. The reason for this is because ideally your entire program should be made up of differentiable operations. Branches (if statements) cause discontinuities and are not differentiable. There are ways to deal with branches, and some branches don’t actually impact the result, but it’s a good guideline to keep in mind. Because of this, you also want to stay away from non differentiable functions – such as a “step” function which you might be tempted to use instead of an if statement.

This post is going to go into detail about using differentiable programming in C++ for a specific goal. Results are shown, and the simple / no external dependency C++ code that generated them are at https://github.com/Atrix256/DitherFindGradientDescent.

First, let’s have a short introduction to gradient descent.

If you have a function of the form $f(x)$, it takes one input so is one dimensional.

You can think of a function like this as having a value for every point on the number line.

You can visualize those values as a height, which gives you a function of the form $y=f(x)$ which we are still going to call one dimensional, despite it now having two dimensions.

Let’s look at a function $y=3x+1$

You might remember that the equation of a line is $y=mx+b$ where m is the slope of the line ($\frac{\text{rise}}{\text{run}}$ or $\frac{y}{x}$) and b is where the line crosses the y axis.

In calculus, you learn that the slope m is also the derivative of the function: $\frac{dy}{dx}$

The slope / derivative tells you how much is added to y for every 1 you add to x.

Let’s say that you were on this graph at the point $x=1$ (which puts you at $y=4$), and let’s say that you want to go downhill from where you were at. You could do that by looking at the slope / derivative at that point, which is 3 (it’s 3 for every point on the line). Since the derivative is positive, that means going to the right will make the y value larger (you’ll go up hill) and going to the left will make the y value smaller (you’ll go down hill).

So, if you want to go downhill to a smaller y value, you know that you need to subtract values from x.

A simpler way to think of this is that you need to subtract the derivative from your x value to make your y value smaller.

That is a core fact that will help guide you through things as they get more difficult: subtract the derivative (later, subtract the gradient) to make your value smaller. The value subtracted is often multiplied by some scalar value to make it move faster or slower.

What happens if you have a more complex function, such as $y=(x-2)^2$?

Let’s say that you are on this graph at the point $x=1$, which puts you at $y=1$. Now, which way do you move to go downhill?

The derivative of this function is $y=2x-4$, which you can plug your x value into to get the slope / derivative at that point: -2.

Remembering that we subtract that derivative to go down hill, that means we need to subtract a negative value from our x; aka we need to ADD a value to our x.

As you can see, adding a value to x and making it move to the right does in fact make us go down hill.

The rule works, hooray!

Things do get a little more complex when there’s more than one dimension, but not really that much more complex, so hang in there!

Let’s look at the function $z=xy$

Let’s say that we are at the (x,y) point (1,1) – in the upper right corner – which puts us at $z=1$, and let’s say that we want to go down hill. Instead of just having one variable to take the derivative of (x), we now have two variables (x and y). How are we going to deal with this?

First up, we are going to pretend that y is a constant value, and not actually a variable. This will give us the partial derivative for x: $\frac{\partial z}{\partial x}$. That tells us how much we would add to z if we added one to x. It’s a slope that is specifically down the x axis.

In this case, the partial derivative of z with respect to x is just: y.

Doing the same thing for the other variable, the partial derivative of z with respect to y is just: x.

Now that we have partial derivatives for each variable, we put them into a vector. This vector is called the gradient, and has some intimidating notation that looks like this:

$\nabla z = \nabla f(x,y) = (\frac{\partial z}{\partial x}, \frac{\partial z}{\partial y})$

For this function, the gradient is:

$\nabla z = \nabla f(x,y) = (y,x)$

That makes the gradient at our specific point:

$\nabla z = \nabla f(1,1) = (1,1)$

In the last section we saw that the derivative / slope pointed to where the function got larger. The same thing is true of gradients, they point in the direction where the function gets larger too!

So, if we want to go downhill, we need to subtract values from our x and our y to go there. In fact, we know that the steepest way down from our current point is when we subtract the same value from both x and y. This is because the gradient doesn’t just point to where it gets larger, it points to where it gets larger the FASTEST. So, the reverse of the gradient also points to where it gets smaller the fastest.

Pretty cool huh?

You can confirm this visually by looking at the graph of the function.

One last things about slopes, derivatives and gradients before moving on. While they do point in the direction of greatest increase, they are only valid for an infinitely small point on the graph for functions that are non linear. This will be important later when we move in the opposite direction of the gradients, but do so with very small steps to help make sure we find the lowest points on the graph.

Why do we want to use gradient descent? Imagine that we have a function:

$w=f(x,y,z)$

Sure, we can pick some random starting values for x,y and z, and then use gradient descent to find the smallest w, but who cares?

Let’s give some other names to these variables and see if the value becomes a little more apparent:

$DamageTakenMultiplier = CalculateDamageTakenMultiplier(Armor, Dodge, Resist)$

Now, by only changing the names of the variables, we can see that we could use gradient descent to find what amount of Armor, Dodge and Resist would make it so our character takes the least amount of damage. This can now tell you how to distribute stat points to a character to get the best results 😛

Note that if you are ever trying to find the highest number possible, instead of the lowest, you can just multiply your function by -1 and do everything else the same way. You could also do gradient ASCENT, but it’s equivalent to multiplying by -1 and doing gradient descent.

## Problems

Here are a few common problems you can encounter when doing gradient descent.

• Local minima – when you get to the bottom of a bowl, but it isn’t the deepest bowl.
• Flat derivatives – these make it hard to escape a local area because the derivatives are very small, which will make each movement also very small.
• Discontinuities – The problem space (graph) changes abruptly without warning, making gradient descent do the wrong thing

Here’s an example of a local minima versus a global minima. You can see that depending on where you start on this graph, you might end up in the deeper bowl, or the shallower bowl if your only rule is “move downhill”.

(Image from wikipedia By KSmrq – http://commons.wikimedia.org/wiki/File:Extrema_example.svg, GFDL 1.2, https://commons.wikimedia.org/w/index.php?curid=6870865)

Here’s an example of a flat derivative. You can imagine that if you were at x=1, that you could see that the derivative would tell you to go to the left to decrease the y value, but it’s a very, very small number. This is a problem because it’s common to multiply the derivative or gradient by a multiplier before subtracting it, so you’d only take a very small step towards the goal.

It’s also possible to hit a perfectly flat derivative, which will be exactly 0. In this case, no matter how big or small of a number you multiply the derivative by, you won’t move AT ALL!

Below is a discontinuous function where if x is less than 0.5, the value is 1, otherwise the value is x. This essentially shows you what happens when you use if statements in differentiable programming. If you start on the right side, it’s going to correctly tell you that you should move left to improve your score. However, it’ll keep telling you to move left, until you get to x being less than 0.5, at which point your score will suddenly get a lot worse and your derivative will become 0. You will now be stuck!

There are ways to deal with these problems, but they are deep topics. If nothing else, you should know these problems exist, so you can know when they are affecting you, and/or why you should avoid them if you have a choice.

## What If I Want to Avoid Calculus?

Let’s say that you don’t get a kick out of calculating all these partial derivatives. Or, more pragmatically, you don’t want to sit down and manually calculate the gradient function of some generic C++ code!

I have some great news for you.

While we do need partial derivatives for our gradients, we aren’t going to have to do all this calculus to get them!

Here are a few other ways to get partial derivatives:

• Finite Differences – Conceptually super simple, but slow to calculate and not always very precise. More info: Finite Differences
• Backpropagation – What neural networks use. Also called backwards mode automatic differentiation. Fast but a bit complex mentally. I linked this already but for more info: How to Train Neural Networks With Backpropagation
• Dual Numbers – Also called forward mode automatic differentiation. Not as fast as backwards mode, but in the same neighborhood for speed. Super, super convinient and awesome for programmers. I love these. More info: Dual Numbers & Automatic Differentiation

Care to guess which one we are going to use? Yep, Dual Numbers!

In a nutshell, if you have code that uses floats, you can change it to use a templated type instead. Then, you put dual numbers through your code instead of floats. The output you get will be the specific value output from your code, but also the GRADIENT of your code at that value. Even better, this isn’t a numerical method (it’s not an approximation), it’s analytical (it’s exact).

That is seriously all there is to it. Dual numbers are amazing!

Since you made the code templated, you can still use it for floats when you don’t want or need the gradient.

## Differentiable Programming / Gradient Descent Skeleton

Here’s the general skeleton we are going to be following for using gradient descent with our differentiable program.

1. Initialize the parameters to random (but valid) values, storing them in dual numbers.
2. Run the code that does our work, taking dual numbers as input for the parameters of how it does the work.
3. Put the result (which is dual numbers) into a scoring function to give us a score. Usually the score is such that smaller numbers are better. If not, just multiply the score by -1 so it is.
4. Since we did the work and calculated the score using dual numbers, we now have a gradient which describes how we need to adjust the parameters to make our score better.
5. Adjust our parameters using the gradient and go back to step 2. Repeating until whatever exit condition we want is hit: maybe when a certain number of iterations happen, or maybe when our score gets below a certain value.

That’s our game plan. Let’s dive into the specific problem we are going to be attacking.

## Searching For an Ideal Dithering Pattern

Here is the problem we want to tackle:

We want to find a 3×3 dithering pattern such that when we use it to dither an image (by repeating the 3×3 pattern over and over across the image), and then blur the result by a specific amount, that it’s as close as possible to the original image being blurred by that same amount.

That sounds a bit challenging right? It’s not actually that bad, don’t worry (:

The steps the code has to do (differentiably) are:

1. Dither the source image
2. Blur the results
3. Blur the source image
4. Calculate a score for how similar they are
5. Use all this with Gradient Descent to optimize the dither pattern

Once again, we need to do this stuff differentiably, using dual numbers, so that we get a gradient for how to modify the dither pattern to better our score.

### Step 1 – Dither Source Image

Dithering an image is a pretty simple process.

We are going to be dithering it such that we take a greyscale image as input and convert it to a black and white image using the dither pattern.

(If you are starting with a color image, this shows how to convert it to greyscale: Converting RGB to Grayscale)

For every pixel (x,y) in the source image, you look at pixel (x%3, y%3) in the dither pattern, and if the dither pattern pixel is less than the source, you write a black pixel out, else you write a white pixel out.

if (sourcePixel(x,y) < ditherPixel(x%3, y%3))
pixelOut(x,y) = 0.0;
else
pixelOut(x,y) = 1.0;


There’s a problem though… this is a branch, which makes a discontinuity, which will make it so we can’t have good derivatives to help us get to the goal.

Another way to write the dithering operation above is to write it like this:

difference = ditherPixel(x%3, y%3) - sourcePixel(x,y);
pixelOut(x,y) = step(difference);


Where “step” is the “heaviside step function”, which is 1 if x >= 0, otherwise is 0.

(Image from Wikipedia By Omegatron (Own work) [CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0) or GFDL (http://www.gnu.org/copyleft/fdl.html)%5D, via Wikimedia Commons)

That got rid of the branch (if statement), but we still have a discontinuous function.

Luckily we can approximate a step function with other functions. I decided to use the formula $0.5+atan(100*x)/pi$ which looks like this:

Unfortunately, I found that my results weren’t that good, so i switched it to $0.5+atan(10000*x)/pi$ which ended up working better for me:

This function does have the problem of having flat derivatives, but I found that it worked pretty well anyways. The flat derivatives don’t seem to be a big problem in this case luckily.

To put it all together, the differentiable version of dithering a pixel that I use looks like this:

difference = ditherPixel(x%3, y%3) - sourcePixel(x,y);
pixelOut(x,y) = 0.5+atan(10000.0f * difference) / pi;


As input to this dithering process, we take:

• The source image
• a 3×3 dither pattern, where each pixel is a dual number

As output this dithering process gives us:

• A dithered image that is converted to black and white (either a 1.0 or 0.0 value per pixel)
• It’s the same size as the source image
• Each pixel is a dual number with 9 derivatives. There is one derivative per dither pixel.

### Step 2 – Blur the Results

Blurring the results of the dither wasn’t that difficult. I used a Gaussian blur, but other blurs could be used easily.

I had some Gaussian blur code laying around (from this blog post: Gaussian Blur) and I converted it to using a templated type instead of floats/pixels where appropriate, also making sure there were no branches or anything discontinuous.

It turned out there wasn’t a whole lot to fix up here luckily so wasn’t too difficult.

This allowed me to take the dithered results which are a dual number per pixel, and do a Gaussian blur on them, preserving and correctly modifying the gradient (derivatives) as it did the Blur.

### Step 3 – Blur the Source Image

Blurring the source image was easy since the last step made a generic gaussian blur function. I used the generic Gaussian blur function to blur the image. This doesn’t need to be done as dual numbers, so it was regular pixels in and regular pixels out.

You might wonder why this part doesn’t need to be done as dual numbers.

The simple answer is because these values are in no way dependant on the dither pattern (which are what we are tracking with the derivatives).

The more mathematical explanation is that you could in fact consider these dual numbers, which just have a gradient of zero because they are essentially constants that have nothing to do (yet) with the parameters of the function. The gradient would just implicitly be zero, like any other constant value you might introduce to the function.

### Step 4 – Calculating a Similarity Score

Next up I needed to calculate a similarity score between the dithered then blurred results (which is made up of dual numbers), and the source image which was blurred (and is made up of regular pixels).

The similarity score I went with is just MSE or “Mean Squared Error”.

To calculate MSE, for every pixel you do this:

error = ditheredBlurredImage(x,y) - blurredImage(x,y);
errorSquared = error * error;


After you have the squared error for every pixel, you just take the average of them to get the MSE.

An interesting thing about MSE is that because errors are squared, it will favor smaller errors much more than larger errors, which is a nice property.

A not so nice property about MSE is that it might decide something is a small difference mathematically even though a human would decide that it was a huge difference perceptually. The reverse is also true. Despite this, I chose it because it is simple and I ended up getting decent results with it.

If you want to go down the rabbit hole of looking at “perceptual similarity scores of images” check out these links:

After this step, we have an MSE value which says how similar the images are. A lower value means lower average squared error, so lower numbers are indeed better.

What else is nice is that the MSE value is a dual number with a gradient that has the 9 partial derivatives that describe how much the MSE changes as you adjust each parameter.

That gradient tells us how to adjust the parameters (the 3×3 dither pixels!) to lower the MSE!

### Step 5 – Putting it All Together

Now it’s time to put all of this together and use gradient descent to make our dither pattern better.

Here’s how the program operates:

1. Initialize the 3×3 dither pattern to random values, setting the derivatives to 1.0 in the gradient, for the variable that they represent.
2. do 1000 iterations of this loop:
1. Dither and blur the source image
2. Calculate MSE of this result compared to the source image blurred
3. Using the gradient from the MSE value, subtract the respective partial derivative from each of the pixels in the dither pattern, but scaling the partial derivative by a “learning rate”.
3. Output the best result found

The learning rate starts at 3.0 at loop iteration 0, but decays with each iteration, down to 0.1 at iteration 999. It starts above 1 to help escape local minima, and uses a very small rate at the end to try and get deeper into whatever minimum it has found.

After adjusting the dither pattern pixels, I clamp them to be between 0 and 1.

Something else I ought to mention is that while I’m doing the gradient descent, I keep track of the best scoring dither pattern seen.

This way, after the 1000 iterations are up, if we ever saw anything better than where we are at currently, we just use that instead of the final result.

Presumably, if you tune your parameters (learning rate, iterations, etc!) correctly, this won’t come up often, but it’s always a possibility that your final state is not the best state encountered, so this is a nice way to get better results more often.

## Results

Did you notice that I called this post “searching for an ideal dither pattern” instead of “finding an ideal dither pattern”? (:

The results are decent, but I know they could be better. Even so, I think the techniques talked about here are a good start going down the path of differentiable programming, and similar topics.

Here are some results I was able to get with the code. Click to see the full size images. The shrunken down images have aliasing issues.

The images left to right are: The original, the dither pattern used (repeated), the dithered image, the blurred dither image, and lastly the blurred original image. The program aims to make the last two images look as close as possible as it can, using MSE as the metric for how close they are.

Here is the starting state of using a Gaussian blur with a sigma of 10:

Here it is after the 1000 iterations of gradient descent. Notice the black blob at the top is gone compared to where it started.

Here’s the starting state when using a Gaussian blur sigma of 1:

And here it is after 1000 iterations, which is pretty decent results:

Lastly, here it is with no blurring whatsoever:

And after 1000 iterations, I think it actually looks worse!

Using no blur at all makes for some really awful results. The blur gives the algorithm more freedom on how it can succeed, whereas with no blur, there is a lot less wiggle room for finding a solution.

Another benefit of using the blur before MSE calculation is that a blur is a low pass filter. That means that higher frequencies are gone before the MSE calculation. The result of this is that the algorithm will favor results which are closer to blue noise dithering. Pretty neat right?!

## Closing

I hope you enjoyed this journey through differentiable programming and gradient descent, and I hope you were able to follow along.

Here are some potentially interesting things to do beyond what we talked about here:

• Have it learn from a set of images, instead of only this single image. This should help prevent “over fitting” and let it find a dither pattern which works well for all images instead of just this one specific image.
• Use a separate set of images to gauge the accuracy of the result that weren’t used as part of the training, to help prove that it really hasn’t overfit the training data.
• Try applying “small corruption” in the learning to help prevent overfitting or getting stuck in local minima – one idea would be to have some percentage chance per derivative that you don’t apply the change to the dither pattern pixel. This would add some randomness to the gradient descent instead of it only being down the steepest direction all of the time.
• Instead of optimizing the dithering patterns, you could make a formula that generated the dithering patterns, and instead optimize the coefficients / terms of that formula. If you get good results, you’ll end up with a formula you can use for dithering instead of a pattern, which might be nice for the case of avoiding a texture read in a pixel shader to do the dithering.

I’m not a data scientist or machine learning expert by any means, so there are plenty of improvements to be made. There is a lot of overlap with what is being done here and other algorithms – both in the machine learning realm and outside of the machine learning realm.

For one, you can use Newton’s method for gradient descent. It can find minima faster by using the second derivative in the calculations as well.

However, this algorithm is almost purely “exploitative” in that wherever you start with your parameters, it will try to go from there to the deepest point in whatever valley it’s already in. Some other types of algorithms differ from this in that they are more “explorative” and try to find other valleys, but aren’t always as good at finding the deepest part of the valleys that they do find. More explorative algorithms include simulated annealing, differential evolution, and genetic algorithms.

If you enjoyed this post, check out this book for deeper details on algorithms relating to gradient descent (simulated annealing, genetic algorithms, etc!). It’s a very good book and very easy to read!
Essentials of Metaheuristics

Any corrections to what i’ve said, the code, or suggestions for improvements, please let me know by leaving a comment here, or hitting me up on twitter: https://twitter.com/Atrix256

# Demystifying Floating Point Precision

Floating point numbers have limited precision. If you are a game programmer, you have likely encountered bugs where things start breaking after too much time has elapsed, or after something has moved too far from the origin.

This post aims to show you how to answer the questions:

1. What precision do I have at a number?
2. When will I hit precision issues?

First, a very quick look at the floating point format.

## Floating Point Format

Floating point numbers (Wikipedia: IEEE 754) have three components:

1. Sign bit – whether the number is positive or negative
2. Exponent bits – the magnitude of the number
3. Mantissa bits – the fractional bits

32 bit floats use 1 bit for sign, 8 bits for exponent and 23 bits for mantissa. Whatever number is encoded in the exponent bits, you subtract 127 to get the actual exponent, meaning the exponent can be from -126 to +127.

64 bit doubles use 1 bit for sign, 11 bits for exponent and 52 bits for mantissa. Whatever number is encoded in the exponent bits, you subtract 1023 to get the actual exponent, meaning the exponent can be from -1022 to +1023.

16 bit half floats use 1 bit for sign, 5 bits for exponent and 10 bits for mantissa. Whatever number is encoded in the exponent bits, you subtract 15 to get the actual exponent, meaning the exponent can be from -14 to +15.

For all of the above, an exponent of all zeros has the special meaning “exponent 0” (and this is where the denormals / subnormals come into play) and all ones has the special meaning “infinity”

The exponent bits tell you which power of two numbers you are between – $[2^{exponent}, 2^{exponent+1})$ – and the mantissa tells you where you are in that range.

## What precision do I have at a number?

Let’s look at the number 3.5.

To figure out the precision we have at that number, we figure out what power of two range it’s between and then subdivide that range using the mantissa bits.

3.5 is between 2 and 4. That means we are diving the range of numbers 2 to 4 using the mantissa bits. A float has 23 bits of mantissa, so the precision we have at 3.5 is:

$\frac{4-2}{2^{23}} = \frac{2}{8388608} \approx 0.000000238418579$

3.5 itself is actually exactly representable by a float, double or half, but the amount of precision numbers have at that scale is that value. The smallest number you can add or subtract to a value between 2 and 4 is that value. That is the resolution of the values you are working with when working between 2 and 4 using a float.

Using a double instead of a float gives us 52 bits of mantissa, making the precision:

$\frac{4-2}{2^{52}} = \frac{2}{4503599627370496} \approx 0.00000000000000044408921$

Using a half float with 10 bits of mantissa it becomes:

$\frac{4-2}{2^{10}} = \frac{2}{1024} = 0.001953125$

Here’s a table showing the amount of precision you get with each data type at various exponent values. N/A is used when an exponent is out of range for the specific data type.

$\begin{array}{c|c|c|c|c} exponent & range & half & float & double \\ \hline 0 & [1,2) & 0.0009765625 & 0.00000011920929 & 0.0000000000000002220446 \\ 1 & [2,4) & 0.001953125 & 0.000000238418579 & 0.00000000000000044408921 \\ 2 & [4,8) & 0.00390625 & 0.000000476837158 & 0.00000000000000088817842 \\ 9 & [512, 1024) & 0.5 & 0.00006103515 & 0.00000000000011368684 \\ 10 & [1024,2048) & 1 & 0.00012207031 & 0.00000000000022737368 \\ 11 & [2048,4096) & 2 & 0.00024414062 & 0.00000000000045474735 \\ 12 & [4096,8192) & 4 & 0.00048828125 & 0.0000000000009094947 \\ 15 & [32768, 65536) & 32 & 0.00390625 & 0.0000000000072759576 \\ 16 & [65536, 131072) & N/A & 0.0078125 & 0.0000000000014551915 \\ 17 & [131072, 262144) & N/A & 0.015625 & 0.00000000002910383 \\ 18 & [262144, 524288) & N/A & 0.03125 & 0.000000000058207661 \\ 19 & [524288, 1048576) & N/A & 0.0625 & 0.00000000011641532 \\ 23 & [8388608,16777216) & N/A & 1 & 0.00000000186264515 \\ 52 & [4503599627370496, 9007199254740992) & N/A & 536870912 & 1 \\ \end{array}$

A quick note on the maximum number you can store in floating point numbers, by looking at the half float specifically:

A half float has a maximum exponent of 15, which you can see above puts the number range between 32768 and 65536. The precision is 32 which is the smallest step that can be made in a half float at that scale. That range includes the smaller number but not the larger number. That means that the largest number a half float can store is one step away (32) from the high side of that range. So, the largest number that can be stored is 65536 – 32 = 65504.

## How Many Digits Can I Rely On?

Another helpful way of looking at floating point precision is how many digits of precision you can rely on.

A float has 23 bits of mantissa, and 2^23 is 8,388,608. 23 bits let you store all 6 digit numbers or lower, and most of the 7 digit numbers. This means that floating point numbers have between 6 and 7 digits of precision, regardless of exponent.

That means that from 0 to 1, you have quite a few decimal places to work with. If you go into the hundreds or thousands, you’ve lost a few. When you get up into the tens of millions, you’ve run out of digits for anything beyond the decimal place.

You can actually see that this is true in the table in the last section. With floating point numbers, it’s at exponent 23 (8,388,608 to 16,777,216) that the precision is at 1. The smallest value that you can add to a floating point value in that range is in fact 1. It’s at this point that you have lost all precision to the right of the decimal place. Interestingly, you still have perfect precision of the integers though.

Half floats have 10 mantissa bits and 2^10 = 1024, so they just barely have 3 digits of precision.

Doubles have 52 mantissa bits and 2^52 = 4,503,599,627,370,496. That means doubles have between 15 and 16 digits of precision.

This can help you understand how precision will break down for you when using a specific data type for a specific magnitude of numbers.

## When will I hit precision issues?

Besides the loose rules above about how many digits of precision you can count on, you can also solve to see when precision will break down for you.

Let’s say that you are tracking how long your game has been running (in seconds), and you do so by adding your frame delta (in seconds) to a variable every frame.

If you have a 30fps game, your frame delta is going to be 0.0333.

Adding that each frame to a float will eventually cause the float to reach a value where that number is smaller than the smallest difference representable (smaller than the precision), at which point things will start breaking. At first your accuracy will drop and your time will be wrong, but eventually adding your frame delta to the current time won’t even change the value of the current time. Time will effectively stop!

When will this happen though?

We’ll start with the formula we saw earlier and do one step of simple algebra to get us an equation which can give us this answer.

$\frac{range}{mantissa} = precision \\ \\ range = mantissa * precision$

How we use this formula is we put the precision we want into “precision” and we put the size of the mantissa ($2^{MantissaBits}$) into “mantissa”. The result tells us the range that we’ll get the precision at.

Let’s plug in our numbers:

$range = 8388608 * 0.0333 = 279340.6464$

This tells us the range of the floating point numbers where we’ll have our problems, but this isn’t the value that we’ll have our problems at, so we have another step to do. We need to find what exponent has this range.

Looking at the table earlier in the post you might notice that the range at an exponent also happens to be just $2^{exponent}$.

That’s helpful because that just means we take log2 of the answer we got:

$log2(279340.6464) = 18.0916659875$

Looking at the table again, we can see that floating point numbers have a precision of 0.03125 at exponent value 18. So, exponent 18 is close, but it’s precision is smaller than what we want – aka the precision is still ok.

That means we need to ceil() the number we got from the log2.

Doing that, we see that things break down at exponent 19, which has precision of 0.0625. This actual value it has this problem at is 528,288 (which is $2^{19}$).

So, our final formula for “where does precision become this value?” becomes:

$value = pow(2, ceil(log2(mantissa * precision)))$

Note that at exponent 18, there is still imprecision happening. When adding 1/30 to 264144, It goes from 264144 to 264144.031 to 264144.063, instead of 264144, 264144.033, 264144.066. There is error, but it’s fairly small.

At exponent 19 though, things fall apart a lot more noticeably. When adding 1/30 to 528288, it goes from 528288 to 528288.063 to 528288.125. Time is actually moving almost twice as fast in this case!

At exponent 20, we start at 1056576.00 and adding 1/30 doesn’t even change the value. Time is now stopped.

It does take 6.1 days (528,288 seconds) to reach exponent 19 though, so that’s quite a long time.

If we use half floats, it falls apart at value 64. That’s right, it only takes 64 seconds for this to fall apart when using 16 bit half floats, compared to 6.1 days when using 32 bit floats!

With doubles, it falls apart at value 281,474,976,710,656. That is 8,925,512 years!

Let’s check out that equation again:

$value = pow(2, ceil(log2(mantissa * precision)))$

A possibly more programmer friendly way to do the above would be to calculate mantissa * precision and then round up to the next power of 2. That’s exactly what the formula is doing above, but in math terms, not programming terms.

## Storing Integers

I recently learned that floating point numbers can store integers surprisingly well. It blows my mind that I never knew this. Maybe you are in the same boat 😛

Here’s the setup:

1. For any exponent, the range of numbers it represents is a power of 2.
2. The mantissa will always divide that range into a power of 2 different values.

It might take some time and/or brain power to soak that up (it did for me!) but what that ends up ultimately meaning is that floating point numbers can exactly represent a large number of integers.

In fact, a floating point number can EXACTLY store all integers from $-2^{MantissaBits+1}$ to $+2^{MantissaBits+1}$.

For half floats that means you can store all integers between (and including) -2048 to +2048. ($\pm 2^{11}$)

For floats, it’s -16,777,216 to +16,777,216. ($\pm 2^{24}$)

For doubles it’s -9,007,199,254,740,992 to +9,007,199,254,740,992. ($\pm 2^{53}$)

Doubles can in fact exactly represent any 32 bit unsigned integer, since 2^32 = 4,294,967,296.

Here are some links you might find interesting!

Floating point visually explained:
http://fabiensanglard.net/floating_point_visually_explained/

What Every Computer Scientist Should Know About Floating-Point Arithmetic:
https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html

A matter of precision:
http://tomforsyth1000.github.io/blog.wiki.html#[[A%20matter%20of%20precision]]

Denormal numbers – aka very small numbers that make computations slow when you use them:
https://en.m.wikipedia.org/wiki/Denormal_number

Catastrophic Cancellation – a problem you can run into when doing floating point math:
https://en.wikipedia.org/wiki/Loss_of_significance

A handy web page that lets you play with the binary representation of a float and what number it comes out as:
https://www.h-schmidt.net/FloatConverter/IEEE754.html

Half precision floating point format:
https://en.wikipedia.org/wiki/Half-precision_floating-point_format

What is the first integer that a float is incapable of representing?
https://stackoverflow.com/questions/3793838/which-is-the-first-integer-that-an-ieee-754-float-is-incapable-of-representing-e

Ready to go deeper? Bruce Dawson has some amazing write ups on deeper floating point issues:
https://randomascii.wordpress.com/category/floating-point/

This talks about how to use floating point precision limits as an activation function in a neural network (?!)
https://blog.openai.com/nonlinear-computation-in-linear-networks/

# Animating Noise For Integration Over Time 2: Uniform Over Time

After I put out the last post, Mikkel Gjoel (@pixelmager), made an interesting observation that you can see summarized in his image below. (tweet / thread here)

BTW Mikkel has an amazing presentation about rendering the beautiful game “Inside” that you should check out. Lots of interesting techniques used, including some enlightening uses of noise.
Low Complexity, High Fidelity: The Rendering of INSIDE

The images left to right are:

• One frame of white noise
• N frames of white noise averaged.
• N frames averaged where the first frame is white noise, and a per frame random number is added to all pixels every frame.
• N frames averaged where the first frame is white noise, and 1/N is added to all pixels every frame.
• N frames averaged where the first frame is white noise, and the golden ratio is added to all pixels every frame.

In the above, the smoother and closer to middle grey that an image is, the better it is – that means it converged to the true result of the integral better.

Surprisingly it looks like adding 1/N outperforms the golden ratio, which means that regular spaced samples are outperforming a low discrepancy sequence!

To compare apples to apples, we’ll do the “golden ratio” tests we did last post, but instead do them with adding this uniform value instead.

To be explicit, there are 8 frames and they are:

• Frame 0: The noise
• Frame 1: The noise + 1/8
• Frame 2: The noise + 2/8
• Frame 7: the noise + 7/8

Modulus is used to keep the values between 0 and 1.

Below is how white noise looks animated with golden ratio (top) vs uniform values (bottom). There are 8 frames and it’s played at 8fps so it loops every second.

Interleaved Gradient Noise. Top is golden ratio, bottom is uniform.

Blue Noise. Top is golden ratio, bottom is uniform.

The uniform ones look pretty similar. Maybe a little smoother, but it’s hard to tell by just looking at it. Interestingly, the frequency content of the blue noise seems more stable using these uniform values instead of golden ratio.

The histogram data of the noise was the same for all frames of animation, just like in last post, which is a good thing. The important bit is that adding a uniform value doesn’t modify the histogram shape, other than changing which counts go to which specific buckets. Ideally the histogram would start out perfectly even like the blue noise does, but since this post is about the “adding uniform values” process, and not about the starting noise, this shows that the process does the right thing with the histogram.

• White Noise – min 213, max 306, average 256, std dev 16.51
• Interleaved Gradient Noise – min 245, max 266, average 256, std dev 2.87
• Blue Noise – min, max, average are 256, std dev 0.

Let’s look at the integrated animations.

White noise. Top is golden ratio, bottom is uniform.

Interleaved gradient noise. Top is golden ratio, bottom is uniform.

Blue noise. Top is golden ratio, bottom is uniform.

The differences between these animations are subtle unless you know what you are looking for specifically so let’s check out the final frames and the error graphs.

Each noise comparison below has three images. The first image is the “naive” way to animate the noise. The second uses golden ratio instead. The third one uses 1/N. The first two images (and techniques) are from (and explained in) the last post, and the third image is the technique from this post.

White noise. Naive (top), golden ratio (mid), uniform (bottom).

Interleaved gradient noise. Naive (top), golden ratio (mid), uniform (bottom).

Blue noise. Naive (top), golden ratio (mid), uniform (bottom).

So, what’s interesting is that the uniform sampling over time has lower error and standard deviation (variance) than golden ratio, which has less than the naive method. However, it’s only at the end that the uniform sampling over time has the best results, and it’s actually quite terrible until then.

The reason for this is that uniform has good coverage over the sample space, but it takes until the last frame to get that good coverage because each frame takes a small step over the remaining sample space.

What might work out better would be if our first frame was the normal noise, but then the second frame was the normal noise plus a half, so we get the most information we possibly can from that sample by splitting the sample space in half. We would then want to cut the two halves of the space space in half, and so the next two frames would be the noise plus 1/4 and the noise plus 3/4. We would then continue with 1/8, 5/8, 3/8 and 7/8 (note we didn’t do these 1/8 steps in order. We did it in the order that gives us the most information the most quickly!). At the end of all this, we would have our 8 uniformly spaced samples over time, but we would have taken the samples in an order that makes our intermediate frames look better hopefully.

Now, interestingly, that number sequence I just described has a name. It’s the base 2 Van Der Corput sequence, which is a type of low discrepancy sequence. It’s also the 1D version of the Halton sequence, and is related to other sequences as well. More info here: When Random Numbers Are Too Random: Low Discrepancy Sequences

Mikkel mentioned he thought this would be helpful, and I was thinking the same thing too. Let’s see how it does!

White noise. Uniform (top), Van Der Corput (bottom).

Interleaved gradient noise. Uniform (top), Van Der Corput (bottom).

Blue noise. Uniform (top), Van Der Corput (bottom).

The final frames look the same as before (and the same as each other), so I won’t show those again but here are the updated graphs.

Interestingly, using the Van Der Corput sequence has put intermediate frames more in line with golden ratio, while of course still being superior at the final frame.

I’ve been trying to understand why uniform sampling over time out performs the golden ratio which acts more like blue noise over time. I still don’t grasp why it works as well as it does, but the proof is in the pudding.

Theoretically, this uniform sampling over time should lead to the possibility of aliasing on the time axis, since blue noise / white noise (and other randomness) get rid of the aliasing in exchange for noise.

Noise over the time dimension would mean missing details that were smaller than the sample spacing size. in our case, we are using the time sampled values (noise + uniform value) to threshold a source image to make a sample. It may be that since we are thresholding, that aliasing isn’t possible since our sample represents everything below or equal to the value?

I’m not really sure, but will be thinking about it for a while. If you have any insights please let me know!

It would be interesting to try an actual 1d blue noise sequence and see how it compares. If it does better, it sounds like it would be worth while to try jittering the uniform sampled values on the time axis to try and approximate blue noise a bit. Mikkel tried the jittering and said it gave significantly worse results, so that seems like a no go.

Lastly, some other logical experiments from here seem to be…

• See how other forms of noise and ordered dithers do, including perhaps a Bayer Matrix. IG noise seems to naturally do better on the time axis for some reason I don’t fully understand yet. There may be some interesting properties of other noise waiting to be found.
• Do we get any benefits in this context by arranging the interleaved gradient noise in a spiral like Jorge mentions in his presentation? (Next Generation Post Processing In Call Of Duty: Advanced Warfare
• It would be interesting to see how this works in a more open ended case – such as if you had temporal AA which was averaging a variable number of pixels each frame. Would doing a van Der Corput sequence give good results there? Would you keep track of sample counts per pixel and keep marching the Van Der Corput forward for each pixel individually? Or would you just pick something like an 8 Van Der Corput sequence, adding the current sequence to all pixels and looping that sequence every 8 frames? It really would be interesting to see what is best in that sort of a setup.

I’m sure there are all sorts of other things to try to. This is a deep, interesting and important topic for graphics and beyond (:

## Code

Source code below, but it’s also available on github, along with the source images used: Github:
Atrix256/RandomCode/AnimatedNoise

#define _CRT_SECURE_NO_WARNINGS

#include <windows.h>  // for bitmap headers.  Sorry non windows people!
#include <stdint.h>
#include <vector>
#include <random>
#include <atomic>
#include <complex>
#include <array>

typedef uint8_t uint8;

const float c_pi = 3.14159265359f;

// settings
const bool c_doDFT = true;

// globals
FILE* g_logFile = nullptr;

//======================================================================================
inline float Lerp (float A, float B, float t)
{
return A * (1.0f - t) + B * t;
}

//======================================================================================
struct SImageData
{
SImageData ()
: m_width(0)
, m_height(0)
{ }

size_t m_width;
size_t m_height;
size_t m_pitch;
std::vector<uint8> m_pixels;
};

//======================================================================================
struct SColor
{
SColor (uint8 _R = 0, uint8 _G = 0, uint8 _B = 0)
: R(_R), G(_G), B(_B)
{ }

inline void Set (uint8 _R, uint8 _G, uint8 _B)
{
R = _R;
G = _G;
B = _B;
}

uint8 B, G, R;
};

//======================================================================================
struct SImageDataComplex
{
SImageDataComplex ()
: m_width(0)
, m_height(0)
{ }

size_t m_width;
size_t m_height;
std::vector<std::complex<float>> m_pixels;
};

//======================================================================================
std::complex<float> DFTPixel (const SImageData &srcImage, size_t K, size_t L)
{
std::complex<float> ret(0.0f, 0.0f);

for (size_t x = 0; x < srcImage.m_width; ++x)
{
for (size_t y = 0; y < srcImage.m_height; ++y)
{
// Get the pixel value (assuming greyscale) and convert it to [0,1] space
const uint8 *src = &srcImage.m_pixels[(y * srcImage.m_pitch) + x * 3];
float grey = float(src[0]) / 255.0f;

// Add to the sum of the return value
float v = float(K * x) / float(srcImage.m_width);
v += float(L * y) / float(srcImage.m_height);
ret += std::complex<float>(grey, 0.0f) * std::polar<float>(1.0f, -2.0f * c_pi * v);
}
}

return ret;
}

//======================================================================================
void ImageDFT (const SImageData &srcImage, SImageDataComplex &destImage)
{
// NOTE: this function assumes srcImage is greyscale, so works on only the red component of srcImage.
// ImageToGrey() will convert an image to greyscale.

// size the output dft data
destImage.m_width = srcImage.m_width;
destImage.m_height = srcImage.m_height;
destImage.m_pixels.resize(destImage.m_width*destImage.m_height);

// calculate 2d dft (brute force, not using fast fourier transform) multithreadedly
std::atomic<size_t> nextRow(0);
{
[&] ()
{
bool reportProgress = (row == 0);
int lastPercent = -1;

while (row < srcImage.m_height)
{
// calculate the DFT for every pixel / frequency in this row
for (size_t x = 0; x < srcImage.m_width; ++x)
{
destImage.m_pixels[row * destImage.m_width + x] = DFTPixel(srcImage, x, row);
}

// report progress if we should
if (reportProgress)
{
int percent = int(100.0f * float(row) / float(srcImage.m_height));
if (lastPercent != percent)
{
lastPercent = percent;
printf("            \rDFT: %i%%", lastPercent);
}
}

// go to the next row
}
}
);
}

t.join();

printf("\n");
}

//======================================================================================
void GetMagnitudeData (const SImageDataComplex& srcImage, SImageData& destImage)
{
// size the output image
destImage.m_width = srcImage.m_width;
destImage.m_height = srcImage.m_height;
destImage.m_pitch = 4 * ((srcImage.m_width * 24 + 31) / 32);
destImage.m_pixels.resize(destImage.m_pitch*destImage.m_height);

// get floating point magnitude data
std::vector<float> magArray;
magArray.resize(srcImage.m_width*srcImage.m_height);
float maxmag = 0.0f;
for (size_t x = 0; x < srcImage.m_width; ++x)
{
for (size_t y = 0; y < srcImage.m_height; ++y)
{
// Offset the information by half width & height in the positive direction.
// This makes frequency 0 (DC) be at the image origin, like most diagrams show it.
int k = (x + (int)srcImage.m_width / 2) % (int)srcImage.m_width;
int l = (y + (int)srcImage.m_height / 2) % (int)srcImage.m_height;
const std::complex<float> &src = srcImage.m_pixels[l*srcImage.m_width + k];

float mag = std::abs(src);
if (mag > maxmag)
maxmag = mag;

magArray[y*srcImage.m_width + x] = mag;
}
}
if (maxmag == 0.0f)
maxmag = 1.0f;

const float c = 255.0f / log(1.0f+maxmag);

// normalize the magnitude data and send it back in [0, 255]
for (size_t x = 0; x < srcImage.m_width; ++x)
{
for (size_t y = 0; y < srcImage.m_height; ++y)
{
float src = c * log(1.0f + magArray[y*srcImage.m_width + x]);

uint8 magu8 = uint8(src);

uint8* dest = &destImage.m_pixels[y*destImage.m_pitch + x * 3];
dest[0] = magu8;
dest[1] = magu8;
dest[2] = magu8;
}
}
}

//======================================================================================
bool ImageSave (const SImageData &image, const char *fileName)
{
// open the file if we can
FILE *file;
file = fopen(fileName, "wb");
if (!file) {
printf("Could not save %s\n", fileName);
return false;
}

// write the data and close the file
fclose(file);

return true;
}

//======================================================================================
bool ImageLoad (const char *fileName, SImageData& imageData)
{
// open the file if we can
FILE *file;
file = fopen(fileName, "rb");
if (!file)
return false;

{
fclose(file);
return false;
}

// read in our pixel data if we can. Note that it's in BGR order, and width is padded to the next power of 4
if (fread(&imageData.m_pixels[0], imageData.m_pixels.size(), 1, file) != 1)
{
fclose(file);
return false;
}

imageData.m_pitch = 4 * ((imageData.m_width * 24 + 31) / 32);

fclose(file);
return true;
}

//======================================================================================
void ImageInit (SImageData& image, size_t width, size_t height)
{
image.m_width = width;
image.m_height = height;
image.m_pitch = 4 * ((width * 24 + 31) / 32);
image.m_pixels.resize(image.m_pitch * image.m_height);
std::fill(image.m_pixels.begin(), image.m_pixels.end(), 0);
}

//======================================================================================
template <typename LAMBDA>
void ImageForEachPixel (SImageData& image, const LAMBDA& lambda)
{
size_t pixelIndex = 0;
for (size_t y = 0; y < image.m_height; ++y)
{
SColor* pixel = (SColor*)&image.m_pixels[y * image.m_pitch];
for (size_t x = 0; x < image.m_width; ++x)
{
lambda(*pixel, pixelIndex);
++pixel;
++pixelIndex;
}
}
}

//======================================================================================
template <typename LAMBDA>
void ImageForEachPixel (const SImageData& image, const LAMBDA& lambda)
{
size_t pixelIndex = 0;
for (size_t y = 0; y < image.m_height; ++y)
{
SColor* pixel = (SColor*)&image.m_pixels[y * image.m_pitch];
for (size_t x = 0; x < image.m_width; ++x)
{
lambda(*pixel, pixelIndex);
++pixel;
++pixelIndex;
}
}
}

//======================================================================================
void ImageConvertToLuma (SImageData& image)
{
ImageForEachPixel(
image,
[] (SColor& pixel, size_t pixelIndex)
{
float luma = float(pixel.R) * 0.3f + float(pixel.G) * 0.59f + float(pixel.B) * 0.11f;
uint8 lumau8 = uint8(luma + 0.5f);
pixel.R = lumau8;
pixel.G = lumau8;
pixel.B = lumau8;
}
);
}

//======================================================================================
void ImageCombine2 (const SImageData& imageA, const SImageData& imageB, SImageData& result)
{
// put the images side by side. A on left, B on right
ImageInit(result, imageA.m_width + imageB.m_width, max(imageA.m_height, imageB.m_height));
std::fill(result.m_pixels.begin(), result.m_pixels.end(), 0);

// image A on left
for (size_t y = 0; y < imageA.m_height; ++y)
{
SColor* destPixel = (SColor*)&result.m_pixels[y * result.m_pitch];
SColor* srcPixel = (SColor*)&imageA.m_pixels[y * imageA.m_pitch];
for (size_t x = 0; x < imageA.m_width; ++x)
{
destPixel[0] = srcPixel[0];
++destPixel;
++srcPixel;
}
}

// image B on right
for (size_t y = 0; y < imageB.m_height; ++y)
{
SColor* destPixel = (SColor*)&result.m_pixels[y * result.m_pitch + imageA.m_width * 3];
SColor* srcPixel = (SColor*)&imageB.m_pixels[y * imageB.m_pitch];
for (size_t x = 0; x < imageB.m_width; ++x)
{
destPixel[0] = srcPixel[0];
++destPixel;
++srcPixel;
}
}
}

//======================================================================================
void ImageCombine3 (const SImageData& imageA, const SImageData& imageB, const SImageData& imageC, SImageData& result)
{
// put the images side by side. A on left, B in middle, C on right
ImageInit(result, imageA.m_width + imageB.m_width + imageC.m_width, max(max(imageA.m_height, imageB.m_height), imageC.m_height));
std::fill(result.m_pixels.begin(), result.m_pixels.end(), 0);

// image A on left
for (size_t y = 0; y < imageA.m_height; ++y)
{
SColor* destPixel = (SColor*)&result.m_pixels[y * result.m_pitch];
SColor* srcPixel = (SColor*)&imageA.m_pixels[y * imageA.m_pitch];
for (size_t x = 0; x < imageA.m_width; ++x)
{
destPixel[0] = srcPixel[0];
++destPixel;
++srcPixel;
}
}

// image B in middle
for (size_t y = 0; y < imageB.m_height; ++y)
{
SColor* destPixel = (SColor*)&result.m_pixels[y * result.m_pitch + imageA.m_width * 3];
SColor* srcPixel = (SColor*)&imageB.m_pixels[y * imageB.m_pitch];
for (size_t x = 0; x < imageB.m_width; ++x)
{
destPixel[0] = srcPixel[0];
++destPixel;
++srcPixel;
}
}

// image C on right
for (size_t y = 0; y < imageC.m_height; ++y)
{
SColor* destPixel = (SColor*)&result.m_pixels[y * result.m_pitch + imageA.m_width * 3 + imageC.m_width * 3];
SColor* srcPixel = (SColor*)&imageC.m_pixels[y * imageC.m_pitch];
for (size_t x = 0; x < imageC.m_width; ++x)
{
destPixel[0] = srcPixel[0];
++destPixel;
++srcPixel;
}
}
}

//======================================================================================
float GoldenRatioMultiple (size_t multiple)
{
return float(multiple) * (1.0f + std::sqrtf(5.0f)) / 2.0f;
}

//======================================================================================
void IntegrationTest (const SImageData& dither, const SImageData& groundTruth, size_t frameIndex, const char* label)
{
// calculate min, max, total and average error
size_t minError = 0;
size_t maxError = 0;
size_t totalError = 0;
size_t pixelCount = 0;
for (size_t y = 0; y < dither.m_height; ++y)
{
SColor* ditherPixel = (SColor*)&dither.m_pixels[y * dither.m_pitch];
SColor* truthPixel = (SColor*)&groundTruth.m_pixels[y * groundTruth.m_pitch];
for (size_t x = 0; x < dither.m_width; ++x)
{
size_t error = 0;
if (ditherPixel->R > truthPixel->R)
error = ditherPixel->R - truthPixel->R;
else
error = truthPixel->R - ditherPixel->R;

totalError += error;

if ((x == 0 && y == 0) || error < minError)
minError = error;

if ((x == 0 && y == 0) || error > maxError)
maxError = error;

++ditherPixel;
++truthPixel;
++pixelCount;
}
}
float averageError = float(totalError) / float(pixelCount);

// calculate standard deviation
float sumSquaredDiff = 0.0f;
for (size_t y = 0; y < dither.m_height; ++y)
{
SColor* ditherPixel = (SColor*)&dither.m_pixels[y * dither.m_pitch];
SColor* truthPixel = (SColor*)&groundTruth.m_pixels[y * groundTruth.m_pitch];
for (size_t x = 0; x < dither.m_width; ++x)
{
size_t error = 0;
if (ditherPixel->R > truthPixel->R)
error = ditherPixel->R - truthPixel->R;
else
error = truthPixel->R - ditherPixel->R;

float diff = float(error) - averageError;

sumSquaredDiff += diff*diff;
}
}
float stdDev = std::sqrtf(sumSquaredDiff / float(pixelCount - 1));

// report results
fprintf(g_logFile, "%s %zu error\n", label, frameIndex);
fprintf(g_logFile, "  min error: %zu\n", minError);
fprintf(g_logFile, "  max error: %zu\n", maxError);
fprintf(g_logFile, "  avg error: %0.2f\n", averageError);
fprintf(g_logFile, "  stddev: %0.2f\n", stdDev);
fprintf(g_logFile, "\n");
}

//======================================================================================
void HistogramTest (const SImageData& noise, size_t frameIndex, const char* label)
{
std::array<size_t, 256> counts;
std::fill(counts.begin(), counts.end(), 0);

ImageForEachPixel(
noise,
[&] (const SColor& pixel, size_t pixelIndex)
{
counts[pixel.R]++;
}
);

// calculate min, max, total and average
size_t minCount = 0;
size_t maxCount = 0;
size_t totalCount = 0;
for (size_t i = 0; i < 256; ++i)
{
if (i == 0 || counts[i] < minCount)
minCount = counts[i];

if (i == 0 || counts[i] > maxCount)
maxCount = counts[i];

totalCount += counts[i];
}
float averageCount = float(totalCount) / float(256.0f);

// calculate standard deviation
float sumSquaredDiff = 0.0f;
for (size_t i = 0; i < 256; ++i)
{
float diff = float(counts[i]) - averageCount;
sumSquaredDiff += diff*diff;
}
float stdDev = std::sqrtf(sumSquaredDiff / 255.0f);

// report results
fprintf(g_logFile, "%s %zu histogram\n", label, frameIndex);
fprintf(g_logFile, "  min count: %zu\n", minCount);
fprintf(g_logFile, "  max count: %zu\n", maxCount);
fprintf(g_logFile, "  avg count: %0.2f\n", averageCount);
fprintf(g_logFile, "  stddev: %0.2f\n", stdDev);
fprintf(g_logFile, "  counts: ");
for (size_t i = 0; i < 256; ++i)
{
if (i > 0)
fprintf(g_logFile, ", ");
fprintf(g_logFile, "%zu", counts[i]);
}

fprintf(g_logFile, "\n\n");
}

//======================================================================================
void GenerateWhiteNoise (SImageData& image, size_t width, size_t height)
{
ImageInit(image, width, height);

std::random_device rd;
std::mt19937 rng(rd());
std::uniform_int_distribution<unsigned int> dist(0, 255);

ImageForEachPixel(
image,
[&] (SColor& pixel, size_t pixelIndex)
{
uint8 value = dist(rng);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);
}

//======================================================================================
void GenerateInterleavedGradientNoise (SImageData& image, size_t width, size_t height, float offsetX, float offsetY)
{
ImageInit(image, width, height);

std::random_device rd;
std::mt19937 rng(rd());
std::uniform_int_distribution<unsigned int> dist(0, 255);

for (size_t y = 0; y < height; ++y)
{
SColor* pixel = (SColor*)&image.m_pixels[y * image.m_pitch];
for (size_t x = 0; x < width; ++x)
{
float valueFloat = std::fmodf(52.9829189f * std::fmod(0.06711056f*float(x + offsetX) + 0.00583715f*float(y + offsetY), 1.0f), 1.0f);
size_t valueBig = size_t(valueFloat * 256.0f);
uint8 value = uint8(valueBig % 256);
pixel->R = value;
pixel->G = value;
pixel->B = value;
++pixel;
}
}
}

//======================================================================================
template <size_t NUM_SAMPLES>
void GenerateVanDerCoruptSequence (std::array<float, NUM_SAMPLES>& samples, size_t base)
{
for (size_t i = 0; i < NUM_SAMPLES; ++i)
{
samples[i] = 0.0f;
float denominator = float(base);
size_t n = i;
while (n > 0)
{
size_t multiplier = n % base;
samples[i] += float(multiplier) / denominator;
n = n / base;
denominator *= base;
}
}
}

//======================================================================================
void DitherWithTexture (const SImageData& ditherImage, const SImageData& noiseImage, SImageData& result)
{
// init the result image
ImageInit(result, ditherImage.m_width, ditherImage.m_height);

// make the result image
for (size_t y = 0; y < ditherImage.m_height; ++y)
{
SColor* srcDitherPixel = (SColor*)&ditherImage.m_pixels[y * ditherImage.m_pitch];
SColor* destDitherPixel = (SColor*)&result.m_pixels[y * result.m_pitch];

for (size_t x = 0; x < ditherImage.m_width; ++x)
{
// tile the noise in case it isn't the same size as the image we are dithering
size_t noiseX = x % noiseImage.m_width;
size_t noiseY = y % noiseImage.m_height;
SColor* noisePixel = (SColor*)&noiseImage.m_pixels[noiseY * noiseImage.m_pitch + noiseX * 3];

uint8 value = 0;
if (noisePixel->R < srcDitherPixel->R)
value = 255;

destDitherPixel->R = value;
destDitherPixel->G = value;
destDitherPixel->B = value;

++srcDitherPixel;
++destDitherPixel;
}
}
}

//======================================================================================
void DitherWhiteNoise (const SImageData& ditherImage)
{
printf("\n%s\n", __FUNCTION__);

// make noise
SImageData noise;
GenerateWhiteNoise(noise, ditherImage.m_width, ditherImage.m_height);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// save the results
SImageData combined;
ImageCombine3(ditherImage, noise, dither, combined);
ImageSave(combined, "out/still_whitenoise.bmp");
}

//======================================================================================
{
printf("\n%s\n", __FUNCTION__);

// make noise
SImageData noise;

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// save the results
SImageData combined;
ImageCombine3(ditherImage, noise, dither, combined);
ImageSave(combined, "out/still_ignoise.bmp");
}

//======================================================================================
void DitherBlueNoise (const SImageData& ditherImage, const SImageData& blueNoise)
{
printf("\n%s\n", __FUNCTION__);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, blueNoise, dither);

// save the results
SImageData combined;
ImageCombine3(ditherImage, blueNoise, dither, combined);
ImageSave(combined, "out/still_bluenoise.bmp");
}

//======================================================================================
void DitherWhiteNoiseAnimated (const SImageData& ditherImage)
{
printf("\n%s\n", __FUNCTION__);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/anim_whitenoise%zu.bmp", i);

// make noise
SImageData noise;
GenerateWhiteNoise(noise, ditherImage.m_width, ditherImage.m_height);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
{
printf("\n%s\n", __FUNCTION__);

std::random_device rd;
std::mt19937 rng(rd());
std::uniform_real_distribution<float> dist(0.0f, 1000.0f);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/anim_ignoise%zu.bmp", i);

// make noise
SImageData noise;

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherBlueNoiseAnimated (const SImageData& ditherImage, const SImageData blueNoise[8])
{
printf("\n%s\n", __FUNCTION__);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/anim_bluenoise%zu.bmp", i);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, blueNoise[i], dither);

// save the results
SImageData combined;
ImageCombine2(blueNoise[i], dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherWhiteNoiseAnimatedIntegrated (const SImageData& ditherImage)
{
printf("\n%s\n", __FUNCTION__);

std::vector<float> integration;
integration.resize(ditherImage.m_width * ditherImage.m_height);
std::fill(integration.begin(), integration.end(), 0.0f);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animint_whitenoise%zu.bmp", i);

// make noise
SImageData noise;
GenerateWhiteNoise(noise, ditherImage.m_width, ditherImage.m_height);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// integrate and put the current integration results into the dither image
ImageForEachPixel(
dither,
[&] (SColor& pixel, size_t pixelIndex)
{
float pixelValueFloat = float(pixel.R) / 255.0f;
integration[pixelIndex] = Lerp(integration[pixelIndex], pixelValueFloat, 1.0f / float(i+1));

uint8 integratedPixelValue = uint8(integration[pixelIndex] * 255.0f);
pixel.R = integratedPixelValue;
pixel.G = integratedPixelValue;
pixel.B = integratedPixelValue;
}
);

// do an integration test
IntegrationTest(dither, ditherImage, i, __FUNCTION__);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
{
printf("\n%s\n", __FUNCTION__);

std::vector<float> integration;
integration.resize(ditherImage.m_width * ditherImage.m_height);
std::fill(integration.begin(), integration.end(), 0.0f);

std::random_device rd;
std::mt19937 rng(rd());
std::uniform_real_distribution<float> dist(0.0f, 1000.0f);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animint_ignoise%zu.bmp", i);

// make noise
SImageData noise;

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// integrate and put the current integration results into the dither image
ImageForEachPixel(
dither,
[&](SColor& pixel, size_t pixelIndex)
{
float pixelValueFloat = float(pixel.R) / 255.0f;
integration[pixelIndex] = Lerp(integration[pixelIndex], pixelValueFloat, 1.0f / float(i + 1));

uint8 integratedPixelValue = uint8(integration[pixelIndex] * 255.0f);
pixel.R = integratedPixelValue;
pixel.G = integratedPixelValue;
pixel.B = integratedPixelValue;
}
);

// do an integration test
IntegrationTest(dither, ditherImage, i, __FUNCTION__);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherBlueNoiseAnimatedIntegrated (const SImageData& ditherImage, const SImageData blueNoise[8])
{
printf("\n%s\n", __FUNCTION__);

std::vector<float> integration;
integration.resize(ditherImage.m_width * ditherImage.m_height);
std::fill(integration.begin(), integration.end(), 0.0f);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animint_bluenoise%zu.bmp", i);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, blueNoise[i], dither);

// integrate and put the current integration results into the dither image
ImageForEachPixel(
dither,
[&] (SColor& pixel, size_t pixelIndex)
{
float pixelValueFloat = float(pixel.R) / 255.0f;
integration[pixelIndex] = Lerp(integration[pixelIndex], pixelValueFloat, 1.0f / float(i+1));

uint8 integratedPixelValue = uint8(integration[pixelIndex] * 255.0f);
pixel.R = integratedPixelValue;
pixel.G = integratedPixelValue;
pixel.B = integratedPixelValue;
}
);

// do an integration test
IntegrationTest(dither, ditherImage, i, __FUNCTION__);

// save the results
SImageData combined;
ImageCombine2(blueNoise[i], dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherWhiteNoiseAnimatedGoldenRatio (const SImageData& ditherImage)
{
printf("\n%s\n", __FUNCTION__);

// make noise
SImageData noiseSrc;
GenerateWhiteNoise(noiseSrc, ditherImage.m_width, ditherImage.m_height);

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

SImageDataComplex noiseDFT;
SImageData noiseDFTMag;

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animgr_whitenoise%zu.bmp", i);

// add golden ratio to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// DFT the noise
if (c_doDFT)
{
ImageDFT(noise, noiseDFT);
GetMagnitudeData(noiseDFT, noiseDFTMag);
}
else
{
ImageInit(noiseDFTMag, noise.m_width, noise.m_height);
std::fill(noiseDFTMag.m_pixels.begin(), noiseDFTMag.m_pixels.end(), 0);
}

// Histogram test the noise
HistogramTest(noise, i, __FUNCTION__);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// save the results
SImageData combined;
ImageCombine3(noiseDFTMag, noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
{
printf("\n%s\n", __FUNCTION__);

// make noise
SImageData noiseSrc;

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

SImageDataComplex noiseDFT;
SImageData noiseDFTMag;

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animgr_ignoise%zu.bmp", i);

// add golden ratio to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// DFT the noise
if (c_doDFT)
{
ImageDFT(noise, noiseDFT);
GetMagnitudeData(noiseDFT, noiseDFTMag);
}
else
{
ImageInit(noiseDFTMag, noise.m_width, noise.m_height);
std::fill(noiseDFTMag.m_pixels.begin(), noiseDFTMag.m_pixels.end(), 0);
}

// Histogram test the noise
HistogramTest(noise, i, __FUNCTION__);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// save the results
SImageData combined;
ImageCombine3(noiseDFTMag, noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherBlueNoiseAnimatedGoldenRatio (const SImageData& ditherImage, const SImageData& noiseSrc)
{
printf("\n%s\n", __FUNCTION__);

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

SImageDataComplex noiseDFT;
SImageData noiseDFTMag;

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animgr_bluenoise%zu.bmp", i);

// add golden ratio to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// DFT the noise
if (c_doDFT)
{
ImageDFT(noise, noiseDFT);
GetMagnitudeData(noiseDFT, noiseDFTMag);
}
else
{
ImageInit(noiseDFTMag, noise.m_width, noise.m_height);
std::fill(noiseDFTMag.m_pixels.begin(), noiseDFTMag.m_pixels.end(), 0);
}

// Histogram test the noise
HistogramTest(noise, i, __FUNCTION__);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// save the results
SImageData combined;
ImageCombine3(noiseDFTMag, noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherWhiteNoiseAnimatedUniform (const SImageData& ditherImage)
{
printf("\n%s\n", __FUNCTION__);

// make noise
SImageData noiseSrc;
GenerateWhiteNoise(noiseSrc, ditherImage.m_width, ditherImage.m_height);

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

SImageDataComplex noiseDFT;
SImageData noiseDFTMag;

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animuni_whitenoise%zu.bmp", i);

// add uniform value to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
float add = float(i) / 8.0f;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// DFT the noise
if (c_doDFT)
{
ImageDFT(noise, noiseDFT);
GetMagnitudeData(noiseDFT, noiseDFTMag);
}
else
{
ImageInit(noiseDFTMag, noise.m_width, noise.m_height);
std::fill(noiseDFTMag.m_pixels.begin(), noiseDFTMag.m_pixels.end(), 0);
}

// Histogram test the noise
HistogramTest(noise, i, __FUNCTION__);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// save the results
SImageData combined;
ImageCombine3(noiseDFTMag, noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
{
printf("\n%s\n", __FUNCTION__);

// make noise
SImageData noiseSrc;

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

SImageDataComplex noiseDFT;
SImageData noiseDFTMag;

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animuni_ignoise%zu.bmp", i);

// add uniform value to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
float add = float(i) / 8.0f;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// DFT the noise
if (c_doDFT)
{
ImageDFT(noise, noiseDFT);
GetMagnitudeData(noiseDFT, noiseDFTMag);
}
else
{
ImageInit(noiseDFTMag, noise.m_width, noise.m_height);
std::fill(noiseDFTMag.m_pixels.begin(), noiseDFTMag.m_pixels.end(), 0);
}

// Histogram test the noise
HistogramTest(noise, i, __FUNCTION__);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// save the results
SImageData combined;
ImageCombine3(noiseDFTMag, noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherBlueNoiseAnimatedUniform (const SImageData& ditherImage, const SImageData& noiseSrc)
{
printf("\n%s\n", __FUNCTION__);

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

SImageDataComplex noiseDFT;
SImageData noiseDFTMag;

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animuni_bluenoise%zu.bmp", i);

// add uniform value to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
float add = float(i) / 8.0f;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// DFT the noise
if (c_doDFT)
{
ImageDFT(noise, noiseDFT);
GetMagnitudeData(noiseDFT, noiseDFTMag);
}
else
{
ImageInit(noiseDFTMag, noise.m_width, noise.m_height);
std::fill(noiseDFTMag.m_pixels.begin(), noiseDFTMag.m_pixels.end(), 0);
}

// Histogram test the noise
HistogramTest(noise, i, __FUNCTION__);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// save the results
SImageData combined;
ImageCombine3(noiseDFTMag, noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherWhiteNoiseAnimatedGoldenRatioIntegrated (const SImageData& ditherImage)
{
printf("\n%s\n", __FUNCTION__);

std::vector<float> integration;
integration.resize(ditherImage.m_width * ditherImage.m_height);
std::fill(integration.begin(), integration.end(), 0.0f);

// make noise
SImageData noiseSrc;
GenerateWhiteNoise(noiseSrc, ditherImage.m_width, ditherImage.m_height);

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animgrint_whitenoise%zu.bmp", i);

// add golden ratio to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// integrate and put the current integration results into the dither image
ImageForEachPixel(
dither,
[&] (SColor& pixel, size_t pixelIndex)
{
float pixelValueFloat = float(pixel.R) / 255.0f;
integration[pixelIndex] = Lerp(integration[pixelIndex], pixelValueFloat, 1.0f / float(i+1));

uint8 integratedPixelValue = uint8(integration[pixelIndex] * 255.0f);
pixel.R = integratedPixelValue;
pixel.G = integratedPixelValue;
pixel.B = integratedPixelValue;
}
);

// do an integration test
IntegrationTest(dither, ditherImage, i, __FUNCTION__);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
{
printf("\n%s\n", __FUNCTION__);

std::vector<float> integration;
integration.resize(ditherImage.m_width * ditherImage.m_height);
std::fill(integration.begin(), integration.end(), 0.0f);

// make noise
SImageData noiseSrc;

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animgrint_ignoise%zu.bmp", i);

// add golden ratio to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// integrate and put the current integration results into the dither image
ImageForEachPixel(
dither,
[&] (SColor& pixel, size_t pixelIndex)
{
float pixelValueFloat = float(pixel.R) / 255.0f;
integration[pixelIndex] = Lerp(integration[pixelIndex], pixelValueFloat, 1.0f / float(i+1));

uint8 integratedPixelValue = uint8(integration[pixelIndex] * 255.0f);
pixel.R = integratedPixelValue;
pixel.G = integratedPixelValue;
pixel.B = integratedPixelValue;
}
);

// do an integration test
IntegrationTest(dither, ditherImage, i, __FUNCTION__);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherBlueNoiseAnimatedGoldenRatioIntegrated (const SImageData& ditherImage, const SImageData& noiseSrc)
{
printf("\n%s\n", __FUNCTION__);

std::vector<float> integration;
integration.resize(ditherImage.m_width * ditherImage.m_height);
std::fill(integration.begin(), integration.end(), 0.0f);

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animgrint_bluenoise%zu.bmp", i);

// add golden ratio to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// integrate and put the current integration results into the dither image
ImageForEachPixel(
dither,
[&] (SColor& pixel, size_t pixelIndex)
{
float pixelValueFloat = float(pixel.R) / 255.0f;
integration[pixelIndex] = Lerp(integration[pixelIndex], pixelValueFloat, 1.0f / float(i+1));

uint8 integratedPixelValue = uint8(integration[pixelIndex] * 255.0f);
pixel.R = integratedPixelValue;
pixel.G = integratedPixelValue;
pixel.B = integratedPixelValue;
}
);

// do an integration test
IntegrationTest(dither, ditherImage, i, __FUNCTION__);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherWhiteNoiseAnimatedUniformIntegrated (const SImageData& ditherImage)
{
printf("\n%s\n", __FUNCTION__);

std::vector<float> integration;
integration.resize(ditherImage.m_width * ditherImage.m_height);
std::fill(integration.begin(), integration.end(), 0.0f);

// make noise
SImageData noiseSrc;
GenerateWhiteNoise(noiseSrc, ditherImage.m_width, ditherImage.m_height);

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animuniint_whitenoise%zu.bmp", i);

// add uniform value to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
float add = float(i) / 8.0f;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// integrate and put the current integration results into the dither image
ImageForEachPixel(
dither,
[&] (SColor& pixel, size_t pixelIndex)
{
float pixelValueFloat = float(pixel.R) / 255.0f;
integration[pixelIndex] = Lerp(integration[pixelIndex], pixelValueFloat, 1.0f / float(i+1));

uint8 integratedPixelValue = uint8(integration[pixelIndex] * 255.0f);
pixel.R = integratedPixelValue;
pixel.G = integratedPixelValue;
pixel.B = integratedPixelValue;
}
);

// do an integration test
IntegrationTest(dither, ditherImage, i, __FUNCTION__);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
{
printf("\n%s\n", __FUNCTION__);

std::vector<float> integration;
integration.resize(ditherImage.m_width * ditherImage.m_height);
std::fill(integration.begin(), integration.end(), 0.0f);

// make noise
SImageData noiseSrc;

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animuniint_ignoise%zu.bmp", i);

// add uniform value to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
float add = float(i) / 8.0f;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// integrate and put the current integration results into the dither image
ImageForEachPixel(
dither,
[&] (SColor& pixel, size_t pixelIndex)
{
float pixelValueFloat = float(pixel.R) / 255.0f;
integration[pixelIndex] = Lerp(integration[pixelIndex], pixelValueFloat, 1.0f / float(i+1));

uint8 integratedPixelValue = uint8(integration[pixelIndex] * 255.0f);
pixel.R = integratedPixelValue;
pixel.G = integratedPixelValue;
pixel.B = integratedPixelValue;
}
);

// do an integration test
IntegrationTest(dither, ditherImage, i, __FUNCTION__);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherBlueNoiseAnimatedUniformIntegrated (const SImageData& ditherImage, const SImageData& noiseSrc)
{
printf("\n%s\n", __FUNCTION__);

std::vector<float> integration;
integration.resize(ditherImage.m_width * ditherImage.m_height);
std::fill(integration.begin(), integration.end(), 0.0f);

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animuniint_bluenoise%zu.bmp", i);

// add uniform value to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
float add = float(i) / 8.0f;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// integrate and put the current integration results into the dither image
ImageForEachPixel(
dither,
[&] (SColor& pixel, size_t pixelIndex)
{
float pixelValueFloat = float(pixel.R) / 255.0f;
integration[pixelIndex] = Lerp(integration[pixelIndex], pixelValueFloat, 1.0f / float(i+1));

uint8 integratedPixelValue = uint8(integration[pixelIndex] * 255.0f);
pixel.R = integratedPixelValue;
pixel.G = integratedPixelValue;
pixel.B = integratedPixelValue;
}
);

// do an integration test
IntegrationTest(dither, ditherImage, i, __FUNCTION__);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherWhiteNoiseAnimatedVDCIntegrated (const SImageData& ditherImage)
{
printf("\n%s\n", __FUNCTION__);

std::vector<float> integration;
integration.resize(ditherImage.m_width * ditherImage.m_height);
std::fill(integration.begin(), integration.end(), 0.0f);

// make noise
SImageData noiseSrc;
GenerateWhiteNoise(noiseSrc, ditherImage.m_width, ditherImage.m_height);

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

// Make Van Der Corput sequence
std::array<float, 8> VDC;
GenerateVanDerCoruptSequence(VDC, 2);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animvdcint_whitenoise%zu.bmp", i);

// add uniform value to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// integrate and put the current integration results into the dither image
ImageForEachPixel(
dither,
[&] (SColor& pixel, size_t pixelIndex)
{
float pixelValueFloat = float(pixel.R) / 255.0f;
integration[pixelIndex] = Lerp(integration[pixelIndex], pixelValueFloat, 1.0f / float(i+1));

uint8 integratedPixelValue = uint8(integration[pixelIndex] * 255.0f);
pixel.R = integratedPixelValue;
pixel.G = integratedPixelValue;
pixel.B = integratedPixelValue;
}
);

// do an integration test
IntegrationTest(dither, ditherImage, i, __FUNCTION__);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
{
printf("\n%s\n", __FUNCTION__);

std::vector<float> integration;
integration.resize(ditherImage.m_width * ditherImage.m_height);
std::fill(integration.begin(), integration.end(), 0.0f);

// make noise
SImageData noiseSrc;

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

// Make Van Der Corput sequence
std::array<float, 8> VDC;
GenerateVanDerCoruptSequence(VDC, 2);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animvdcint_ignoise%zu.bmp", i);

// add uniform value to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// integrate and put the current integration results into the dither image
ImageForEachPixel(
dither,
[&] (SColor& pixel, size_t pixelIndex)
{
float pixelValueFloat = float(pixel.R) / 255.0f;
integration[pixelIndex] = Lerp(integration[pixelIndex], pixelValueFloat, 1.0f / float(i+1));

uint8 integratedPixelValue = uint8(integration[pixelIndex] * 255.0f);
pixel.R = integratedPixelValue;
pixel.G = integratedPixelValue;
pixel.B = integratedPixelValue;
}
);

// do an integration test
IntegrationTest(dither, ditherImage, i, __FUNCTION__);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherBlueNoiseAnimatedVDCIntegrated (const SImageData& ditherImage, const SImageData& noiseSrc)
{
printf("\n%s\n", __FUNCTION__);

std::vector<float> integration;
integration.resize(ditherImage.m_width * ditherImage.m_height);
std::fill(integration.begin(), integration.end(), 0.0f);

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

// Make Van Der Corput sequence
std::array<float, 8> VDC;
GenerateVanDerCoruptSequence(VDC, 2);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animvdcint_bluenoise%zu.bmp", i);

// add uniform value to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// integrate and put the current integration results into the dither image
ImageForEachPixel(
dither,
[&] (SColor& pixel, size_t pixelIndex)
{
float pixelValueFloat = float(pixel.R) / 255.0f;
integration[pixelIndex] = Lerp(integration[pixelIndex], pixelValueFloat, 1.0f / float(i+1));

uint8 integratedPixelValue = uint8(integration[pixelIndex] * 255.0f);
pixel.R = integratedPixelValue;
pixel.G = integratedPixelValue;
pixel.B = integratedPixelValue;
}
);

// do an integration test
IntegrationTest(dither, ditherImage, i, __FUNCTION__);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
int main (int argc, char** argv)
{
// load the dither image and convert it to greyscale (luma)
SImageData ditherImage;
{
return 0;
}
ImageConvertToLuma(ditherImage);

// load the blue noise images.
SImageData blueNoise[8];
for (size_t i = 0; i < 8; ++i)
{
char buffer[256];
sprintf(buffer, "src/BN%zu.bmp", i);
{
return 0;
}

// They have different values in R, G, B so make R be the value for all channels
ImageForEachPixel(
blueNoise[i],
[] (SColor& pixel, size_t pixelIndex)
{
pixel.G = pixel.R;
pixel.B = pixel.R;
}
);
}

g_logFile = fopen("log.txt", "w+t");

// still image dither tests
DitherWhiteNoise(ditherImage);
DitherBlueNoise(ditherImage, blueNoise[0]);

// Animated dither tests
DitherWhiteNoiseAnimated(ditherImage);
DitherBlueNoiseAnimated(ditherImage, blueNoise);

// Golden ratio animated dither tests
DitherWhiteNoiseAnimatedGoldenRatio(ditherImage);
DitherBlueNoiseAnimatedGoldenRatio(ditherImage, blueNoise[0]);

// Uniform animated dither tests
DitherWhiteNoiseAnimatedUniform(ditherImage);
DitherBlueNoiseAnimatedUniform(ditherImage, blueNoise[0]);

// Animated dither integration tests
DitherWhiteNoiseAnimatedIntegrated(ditherImage);
DitherBlueNoiseAnimatedIntegrated(ditherImage, blueNoise);

// Golden ratio animated dither integration tests
DitherWhiteNoiseAnimatedGoldenRatioIntegrated(ditherImage);
DitherBlueNoiseAnimatedGoldenRatioIntegrated(ditherImage, blueNoise[0]);

// Uniform animated dither integration tests
DitherWhiteNoiseAnimatedUniformIntegrated(ditherImage);
DitherBlueNoiseAnimatedUniformIntegrated(ditherImage, blueNoise[0]);

// Van der corput animated dither integration tests
DitherWhiteNoiseAnimatedVDCIntegrated(ditherImage);
DitherBlueNoiseAnimatedVDCIntegrated(ditherImage, blueNoise[0]);

fclose(g_logFile);

return 0;
}


# Animating Noise For Integration Over Time

You can use noise textures (like the ones from the last post) to do dithering.

For instance, you can do the process below to make a 1 bit (black and white) dithered image using a gray scale source image and a gray scale noise texture. This would be useful if you had a 1 bit display that you were trying to display an image on.

1. For each pixel in the source image…
2. If the source image pixel is brighter than the noise texture, put a white pixel.
3. Else put a black pixel.

(info on converting images to grey scale here: Converting RGB to Grayscale)

The quality of the result depends on the type of noise you use.

If you use pure random numbers (white noise) it looks like this:

You could also use something called “Interleaved Gradient Noise” which would look like this:

Or you could use blue noise which would look like this:

As you can see, white noise was the worst looking, interleaved gradient noise is is the middle, and blue noise looked the best.

White noise is very cheap to generate and can be done in real time on either the CPU or GPU – you just use random numbers.

Blue noise is more expensive to generate and usually must be done in advance, but gives high quality results.

Interleaved gradient noise, which gives middle results, is actually very similar in generation costs as white noise believe it or not, and so can also be done in real time on either the CPU or GPU.

If you have X and Y pixel coordinates (not uv coordinates), you can generate the noise value for the pixel by using this formula:

float noise = std::fmodf(52.9829189f * std::fmodf(0.06711056f*float(x) + 0.00583715f*float(y), 1.0f), 1.0f);


Next Generation Post Processing in Call Of Duty: Advanced Warfare
Dithering part three – real world 2D quantization dithering (Bart Wronksi)

Dithering still images is fun, but in the context of video games, we are more interested in animated images, so let’s look at things in motion.

## Animated Noise

Let’s start by just animating those three noise types over 8 frames.

For white noise, we’ll generate a new white noise texture every frame.

For interleaved gradient noise, we’ll add a random offset (0 to 1000) to the pixel each frame, so we get 8 different interleaved gradient noise textures.

For blue noise, we’ll just have 8 different blue noise textures that we generate in advance.

These are playing at 8 fps, so loop every second.

White Noise:

IG Noise:

Blue Noise:

Once again we can see that white noise is worst, blue noise is best, and interleaved gradient noise is in the middle.

When you think about it though, despite these animations all using different types of noise over SPACE, they all use white noise over time. What i mean by that is if you isolate any individual pixel in any of the images and look at it over the 8 frames, that single pixel will look like white noise.

Let’s see if we can improve that.

## Golden Ratio Animated Noise

In a conversation on twitter, @R4_Unit told me that in the past he had good success by adding the golden ratio to blue noise textures to make the noise more blue over time.

The background here is that repeatedly adding the golden ratio to any number will make a low discrepancy sequence (details: When Random Numbers Are Too Random: Low Discrepancy Sequences)

The golden ratio is $\frac{1+\sqrt{5}}{2}$ or approximately 1.61803398875, and interestingly is THE MOST irrational number that there is. Weird right?

For each of the noise types, we’ll generate a single texture for frame 0, and each subsequent frame we will add the golden ratio to each pixel. The pixel values are in the 0 to 1 space when adding the golden ratio (not 0 to 255) and we use modulus to wrap it around.

The DFT magnitude is shown on the left to show how adding the golden ratio affects frequency components.

White Noise:

IG Noise:

Blue Noise:

When I look at these side by side with the previous animations, it’s hard for me to see much of a difference. That is interesting for the case of blue noise, where it’s difficult to generate multiple blue noise textures. It means that you can get a fairly decent “blue noise” texture by adding multiples of the golden ratio to an existing blue noise texture (aka recycling!).

It’s interesting that the white noise and interleaved gradient noise don’t change their frequency spectrum much over time. On the other hand, it’s a bit sad to see that the blue noise texture gains some low frequency content so the blue noise becomes lower quality. You aren’t just getting more blue noise textures for free by adding the golden ratio, even though they are blue-ish.

Another important aspect to look at is the histogram of colors used of these images when adding golden ratio. The ideal situation is that the starting images have roughly the same number of every color in the image, and that when adding the golden ratio for each frame, that we still use roughly the same number of every color. That turns out to be the case luckily.

The white noise histogram has a minimum count of 213, a maximum count of 303, an average count of 256 (the image is 256×256), and a standard deviation of 15.64. Those values are the same for each frame of the animation.

For interleaved gradient noise, it has a minimum count of 245, a maximum count of 266, an average count of 256 and a standard deviation of 2.87. Those values are the same for the entire animation.

Lastly, for blue noise, it has a minimum, maximum, and average count of 256, and a standard deviation of 0. This also remains true for the entire animation.

## Integration Over Time

A big reason we might want animated noise in graphics is because we are taking multiple samples and want to numerically integrate them.

Lets analyze how these two types of animations (regular and golden ratio) compare for integration.

These animations are the same as before, but on frame 1, we show the average of frame 0 and 1. On frame 2 we show the average of frame 0, 1 and 2. And so on to frame 7 which is the average of all 8 frames. This is an integration of our black and white sample points we are taking, where the correct value of the integration is the greyscale image we started with.

Here is white noise, IG noise and blue noise animated (new noise each frame), integrated over those 8 frames, playing at 8 frames a second:

Here is the same using the golden ratio to animate the noise instead:

Since it can be a little difficult to compare these things while they are in motion, here is the final frames of each method and some graphs to show the average error and standard deviation of the error, compared to the ground truth source image.

White Noise vs White Noise Golden Ratio:

IG Noise vs IG Noise Golden Ratio:

Blue Noise vs Blue Noise Golden Ratio:

Interestingly, the golden ratio average error and standard deviation (from the ground truth) are pretty even for all types of noise by frame 7, even though the blue noise is perceptually superior. This also happens for the non golden ratio integrations of blue noise and white noise. That’s part of the value of blue noise, that even if it has the same amount of error as say, white noise, it still looks better.

Another interesting observation is that interleaved gradient noise performs better at integration (at least numerically) than white or blue noise, when not using the golden ratio. The only way I can explain this is that when picking random pixel offsets to generate each frame of interleaved gradient noise, it’s somehow more blue over time than the other two methods. It’s a strange but pretty useful property.

Despite IG having success when looking at the numbers, it has very visible directional patterns which are not so nice. The fact that it is as cheap as white noise to generate, but has results much closer to blue noise perceptually is pretty awesome though.

Something else important to note is that white noise beats blue noise in the long run (higher sample counts). It’s only at these lower sample counts that blue noise is the clear winner.

Lastly, it seems like the ideal setup for integrating some values over time with a lower sample count would be to have N blue noise textures to use over N frames, but *somehow* have a constraint on those textures generated such that each individual pixel over time has blue noise distributed values.

I’m not sure how to generate that, or if it’s even possible to do so, but doing that seems like it would be pretty near the ideal for doing integration like the above.

Taking a guess at how the DFT’s would look, each individual slice seems like it should look like a totally normal blue noise texture where it’s black in the middle (low frequencies) and noisy elsewhere (high frequencies). If you had N slices of these it would look like a black cylinder surrounded by noise when taking the 3D DFT. I’m not sure though how having the constraint on individual pixels would modify the DFT, or if it even would.

This “ideal” I’m describing is different than vanilla 3d blue noise. The 3d DFT of 3d blue noise is a black sphere surrounded by noise. What I’m describing is a cylinder instead of a sphere.

3d blue noise turns out not to be great for these needs. You can read about that here:

The problem with 3D blue noise

That author also has some an interesting post on blue noise, and a zip file full of blue noise textures that you can take and use.

Free Blue Noise Textures

I have some thoughts on generating this blue noise cylinder that if they work out may very well be the next blog post.

## Code

Here is the code used to generate the images in this post. It’s also on github, which also contains the source images used.

Atrix256: RandomCode/AnimatedNoise

#define _CRT_SECURE_NO_WARNINGS

#include <windows.h>  // for bitmap headers.  Sorry non windows people!
#include <stdint.h>
#include <vector>
#include <random>
#include <atomic>
#include <complex>
#include <array>

typedef uint8_t uint8;

const float c_pi = 3.14159265359f;

// settings
const bool c_doDFT = true;

// globals
FILE* g_logFile = nullptr;

//======================================================================================
inline float Lerp (float A, float B, float t)
{
return A * (1.0f - t) + B * t;
}

//======================================================================================
struct SImageData
{
SImageData ()
: m_width(0)
, m_height(0)
{ }

size_t m_width;
size_t m_height;
size_t m_pitch;
std::vector<uint8> m_pixels;
};

//======================================================================================
struct SColor
{
SColor (uint8 _R = 0, uint8 _G = 0, uint8 _B = 0)
: R(_R), G(_G), B(_B)
{ }

inline void Set (uint8 _R, uint8 _G, uint8 _B)
{
R = _R;
G = _G;
B = _B;
}

uint8 B, G, R;
};

//======================================================================================
struct SImageDataComplex
{
SImageDataComplex ()
: m_width(0)
, m_height(0)
{ }

size_t m_width;
size_t m_height;
std::vector<std::complex<float>> m_pixels;
};

//======================================================================================
std::complex<float> DFTPixel (const SImageData &srcImage, size_t K, size_t L)
{
std::complex<float> ret(0.0f, 0.0f);

for (size_t x = 0; x < srcImage.m_width; ++x)
{
for (size_t y = 0; y < srcImage.m_height; ++y)
{
// Get the pixel value (assuming greyscale) and convert it to [0,1] space
const uint8 *src = &srcImage.m_pixels[(y * srcImage.m_pitch) + x * 3];
float grey = float(src[0]) / 255.0f;

// Add to the sum of the return value
float v = float(K * x) / float(srcImage.m_width);
v += float(L * y) / float(srcImage.m_height);
ret += std::complex<float>(grey, 0.0f) * std::polar<float>(1.0f, -2.0f * c_pi * v);
}
}

return ret;
}

//======================================================================================
void ImageDFT (const SImageData &srcImage, SImageDataComplex &destImage)
{
// NOTE: this function assumes srcImage is greyscale, so works on only the red component of srcImage.
// ImageToGrey() will convert an image to greyscale.

// size the output dft data
destImage.m_width = srcImage.m_width;
destImage.m_height = srcImage.m_height;
destImage.m_pixels.resize(destImage.m_width*destImage.m_height);

// calculate 2d dft (brute force, not using fast fourier transform) multithreadedly
std::atomic<size_t> nextRow(0);
{
[&] ()
{
bool reportProgress = (row == 0);
int lastPercent = -1;

while (row < srcImage.m_height)
{
// calculate the DFT for every pixel / frequency in this row
for (size_t x = 0; x < srcImage.m_width; ++x)
{
destImage.m_pixels[row * destImage.m_width + x] = DFTPixel(srcImage, x, row);
}

// report progress if we should
if (reportProgress)
{
int percent = int(100.0f * float(row) / float(srcImage.m_height));
if (lastPercent != percent)
{
lastPercent = percent;
printf("            \rDFT: %i%%", lastPercent);
}
}

// go to the next row
}
}
);
}

t.join();

printf("\n");
}

//======================================================================================
void GetMagnitudeData (const SImageDataComplex& srcImage, SImageData& destImage)
{
// size the output image
destImage.m_width = srcImage.m_width;
destImage.m_height = srcImage.m_height;
destImage.m_pitch = 4 * ((srcImage.m_width * 24 + 31) / 32);
destImage.m_pixels.resize(destImage.m_pitch*destImage.m_height);

// get floating point magnitude data
std::vector<float> magArray;
magArray.resize(srcImage.m_width*srcImage.m_height);
float maxmag = 0.0f;
for (size_t x = 0; x < srcImage.m_width; ++x)
{
for (size_t y = 0; y < srcImage.m_height; ++y)
{
// Offset the information by half width & height in the positive direction.
// This makes frequency 0 (DC) be at the image origin, like most diagrams show it.
int k = (x + (int)srcImage.m_width / 2) % (int)srcImage.m_width;
int l = (y + (int)srcImage.m_height / 2) % (int)srcImage.m_height;
const std::complex<float> &src = srcImage.m_pixels[l*srcImage.m_width + k];

float mag = std::abs(src);
if (mag > maxmag)
maxmag = mag;

magArray[y*srcImage.m_width + x] = mag;
}
}
if (maxmag == 0.0f)
maxmag = 1.0f;

const float c = 255.0f / log(1.0f+maxmag);

// normalize the magnitude data and send it back in [0, 255]
for (size_t x = 0; x < srcImage.m_width; ++x)
{
for (size_t y = 0; y < srcImage.m_height; ++y)
{
float src = c * log(1.0f + magArray[y*srcImage.m_width + x]);

uint8 magu8 = uint8(src);

uint8* dest = &destImage.m_pixels[y*destImage.m_pitch + x * 3];
dest[0] = magu8;
dest[1] = magu8;
dest[2] = magu8;
}
}
}

//======================================================================================
bool ImageSave (const SImageData &image, const char *fileName)
{
// open the file if we can
FILE *file;
file = fopen(fileName, "wb");
if (!file) {
printf("Could not save %s\n", fileName);
return false;
}

// write the data and close the file
fclose(file);

return true;
}

//======================================================================================
bool ImageLoad (const char *fileName, SImageData& imageData)
{
// open the file if we can
FILE *file;
file = fopen(fileName, "rb");
if (!file)
return false;

{
fclose(file);
return false;
}

// read in our pixel data if we can. Note that it's in BGR order, and width is padded to the next power of 4
if (fread(&imageData.m_pixels[0], imageData.m_pixels.size(), 1, file) != 1)
{
fclose(file);
return false;
}

imageData.m_pitch = 4 * ((imageData.m_width * 24 + 31) / 32);

fclose(file);
return true;
}

//======================================================================================
void ImageInit (SImageData& image, size_t width, size_t height)
{
image.m_width = width;
image.m_height = height;
image.m_pitch = 4 * ((width * 24 + 31) / 32);
image.m_pixels.resize(image.m_pitch * image.m_height);
std::fill(image.m_pixels.begin(), image.m_pixels.end(), 0);
}

//======================================================================================
template <typename LAMBDA>
void ImageForEachPixel (SImageData& image, const LAMBDA& lambda)
{
size_t pixelIndex = 0;
for (size_t y = 0; y < image.m_height; ++y)
{
SColor* pixel = (SColor*)&image.m_pixels[y * image.m_pitch];
for (size_t x = 0; x < image.m_width; ++x)
{
lambda(*pixel, pixelIndex);
++pixel;
++pixelIndex;
}
}
}

//======================================================================================
template <typename LAMBDA>
void ImageForEachPixel (const SImageData& image, const LAMBDA& lambda)
{
size_t pixelIndex = 0;
for (size_t y = 0; y < image.m_height; ++y)
{
SColor* pixel = (SColor*)&image.m_pixels[y * image.m_pitch];
for (size_t x = 0; x < image.m_width; ++x)
{
lambda(*pixel, pixelIndex);
++pixel;
++pixelIndex;
}
}
}

//======================================================================================
void ImageConvertToLuma (SImageData& image)
{
ImageForEachPixel(
image,
[] (SColor& pixel, size_t pixelIndex)
{
float luma = float(pixel.R) * 0.3f + float(pixel.G) * 0.59f + float(pixel.B) * 0.11f;
uint8 lumau8 = uint8(luma + 0.5f);
pixel.R = lumau8;
pixel.G = lumau8;
pixel.B = lumau8;
}
);
}

//======================================================================================
void ImageCombine2 (const SImageData& imageA, const SImageData& imageB, SImageData& result)
{
// put the images side by side. A on left, B on right
ImageInit(result, imageA.m_width + imageB.m_width, max(imageA.m_height, imageB.m_height));
std::fill(result.m_pixels.begin(), result.m_pixels.end(), 0);

// image A on left
for (size_t y = 0; y < imageA.m_height; ++y)
{
SColor* destPixel = (SColor*)&result.m_pixels[y * result.m_pitch];
SColor* srcPixel = (SColor*)&imageA.m_pixels[y * imageA.m_pitch];
for (size_t x = 0; x < imageA.m_width; ++x)
{
destPixel[0] = srcPixel[0];
++destPixel;
++srcPixel;
}
}

// image B on right
for (size_t y = 0; y < imageB.m_height; ++y)
{
SColor* destPixel = (SColor*)&result.m_pixels[y * result.m_pitch + imageA.m_width * 3];
SColor* srcPixel = (SColor*)&imageB.m_pixels[y * imageB.m_pitch];
for (size_t x = 0; x < imageB.m_width; ++x)
{
destPixel[0] = srcPixel[0];
++destPixel;
++srcPixel;
}
}
}

//======================================================================================
void ImageCombine3 (const SImageData& imageA, const SImageData& imageB, const SImageData& imageC, SImageData& result)
{
// put the images side by side. A on left, B in middle, C on right
ImageInit(result, imageA.m_width + imageB.m_width + imageC.m_width, max(max(imageA.m_height, imageB.m_height), imageC.m_height));
std::fill(result.m_pixels.begin(), result.m_pixels.end(), 0);

// image A on left
for (size_t y = 0; y < imageA.m_height; ++y)
{
SColor* destPixel = (SColor*)&result.m_pixels[y * result.m_pitch];
SColor* srcPixel = (SColor*)&imageA.m_pixels[y * imageA.m_pitch];
for (size_t x = 0; x < imageA.m_width; ++x)
{
destPixel[0] = srcPixel[0];
++destPixel;
++srcPixel;
}
}

// image B in middle
for (size_t y = 0; y < imageB.m_height; ++y)
{
SColor* destPixel = (SColor*)&result.m_pixels[y * result.m_pitch + imageA.m_width * 3];
SColor* srcPixel = (SColor*)&imageB.m_pixels[y * imageB.m_pitch];
for (size_t x = 0; x < imageB.m_width; ++x)
{
destPixel[0] = srcPixel[0];
++destPixel;
++srcPixel;
}
}

// image C on right
for (size_t y = 0; y < imageC.m_height; ++y)
{
SColor* destPixel = (SColor*)&result.m_pixels[y * result.m_pitch + imageA.m_width * 3 + imageC.m_width * 3];
SColor* srcPixel = (SColor*)&imageC.m_pixels[y * imageC.m_pitch];
for (size_t x = 0; x < imageC.m_width; ++x)
{
destPixel[0] = srcPixel[0];
++destPixel;
++srcPixel;
}
}
}

//======================================================================================
float GoldenRatioMultiple (size_t multiple)
{
return float(multiple) * (1.0f + std::sqrtf(5.0f)) / 2.0f;
}

//======================================================================================
void IntegrationTest (const SImageData& dither, const SImageData& groundTruth, size_t frameIndex, const char* label)
{
// calculate min, max, total and average error
size_t minError = 0;
size_t maxError = 0;
size_t totalError = 0;
size_t pixelCount = 0;
for (size_t y = 0; y < dither.m_height; ++y)
{
SColor* ditherPixel = (SColor*)&dither.m_pixels[y * dither.m_pitch];
SColor* truthPixel = (SColor*)&groundTruth.m_pixels[y * groundTruth.m_pitch];
for (size_t x = 0; x < dither.m_width; ++x)
{
size_t error = 0;
if (ditherPixel->R > truthPixel->R)
error = ditherPixel->R - truthPixel->R;
else
error = truthPixel->R - ditherPixel->R;

totalError += error;

if ((x == 0 && y == 0) || error < minError)
minError = error;

if ((x == 0 && y == 0) || error > maxError)
maxError = error;

++ditherPixel;
++truthPixel;
++pixelCount;
}
}
float averageError = float(totalError) / float(pixelCount);

// calculate standard deviation
float sumSquaredDiff = 0.0f;
for (size_t y = 0; y < dither.m_height; ++y)
{
SColor* ditherPixel = (SColor*)&dither.m_pixels[y * dither.m_pitch];
SColor* truthPixel = (SColor*)&groundTruth.m_pixels[y * groundTruth.m_pitch];
for (size_t x = 0; x < dither.m_width; ++x)
{
size_t error = 0;
if (ditherPixel->R > truthPixel->R)
error = ditherPixel->R - truthPixel->R;
else
error = truthPixel->R - ditherPixel->R;

float diff = float(error) - averageError;

sumSquaredDiff += diff*diff;
}
}
float stdDev = std::sqrtf(sumSquaredDiff / float(pixelCount - 1));

// report results
fprintf(g_logFile, "%s %zu error\n", label, frameIndex);
fprintf(g_logFile, "  min error: %zu\n", minError);
fprintf(g_logFile, "  max error: %zu\n", maxError);
fprintf(g_logFile, "  avg error: %0.2f\n", averageError);
fprintf(g_logFile, "  stddev: %0.2f\n", stdDev);
fprintf(g_logFile, "\n");
}

//======================================================================================
void HistogramTest (const SImageData& noise, size_t frameIndex, const char* label)
{
std::array<size_t, 256> counts;
std::fill(counts.begin(), counts.end(), 0);

ImageForEachPixel(
noise,
[&] (const SColor& pixel, size_t pixelIndex)
{
counts[pixel.R]++;
}
);

// calculate min, max, total and average
size_t minCount = 0;
size_t maxCount = 0;
size_t totalCount = 0;
for (size_t i = 0; i < 256; ++i)
{
if (i == 0 || counts[i] < minCount)
minCount = counts[i];

if (i == 0 || counts[i] > maxCount)
maxCount = counts[i];

totalCount += counts[i];
}
float averageCount = float(totalCount) / float(256.0f);

// calculate standard deviation
float sumSquaredDiff = 0.0f;
for (size_t i = 0; i < 256; ++i)
{
float diff = float(counts[i]) - averageCount;
sumSquaredDiff += diff*diff;
}
float stdDev = std::sqrtf(sumSquaredDiff / 255.0f);

// report results
fprintf(g_logFile, "%s %zu histogram\n", label, frameIndex);
fprintf(g_logFile, "  min count: %zu\n", minCount);
fprintf(g_logFile, "  max count: %zu\n", maxCount);
fprintf(g_logFile, "  avg count: %0.2f\n", averageCount);
fprintf(g_logFile, "  stddev: %0.2f\n", stdDev);
fprintf(g_logFile, "  counts: ");
for (size_t i = 0; i < 256; ++i)
{
if (i > 0)
fprintf(g_logFile, ", ");
fprintf(g_logFile, "%zu", counts[i]);
}

fprintf(g_logFile, "\n\n");
}

//======================================================================================
void GenerateWhiteNoise (SImageData& image, size_t width, size_t height)
{
ImageInit(image, width, height);

std::random_device rd;
std::mt19937 rng(rd());
std::uniform_int_distribution<unsigned int> dist(0, 255);

ImageForEachPixel(
image,
[&] (SColor& pixel, size_t pixelIndex)
{
uint8 value = dist(rng);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);
}

//======================================================================================
void GenerateInterleavedGradientNoise (SImageData& image, size_t width, size_t height, float offsetX, float offsetY)
{
ImageInit(image, width, height);

std::random_device rd;
std::mt19937 rng(rd());
std::uniform_int_distribution<unsigned int> dist(0, 255);

for (size_t y = 0; y < height; ++y)
{
SColor* pixel = (SColor*)&image.m_pixels[y * image.m_pitch];
for (size_t x = 0; x < width; ++x)
{
float valueFloat = std::fmodf(52.9829189f * std::fmod(0.06711056f*float(x + offsetX) + 0.00583715f*float(y + offsetY), 1.0f), 1.0f);
size_t valueBig = size_t(valueFloat * 256.0f);
uint8 value = uint8(valueBig % 256);
pixel->R = value;
pixel->G = value;
pixel->B = value;
++pixel;
}
}
}

//======================================================================================
void DitherWithTexture (const SImageData& ditherImage, const SImageData& noiseImage, SImageData& result)
{
// init the result image
ImageInit(result, ditherImage.m_width, ditherImage.m_height);

// make the result image
for (size_t y = 0; y < ditherImage.m_height; ++y)
{
SColor* srcDitherPixel = (SColor*)&ditherImage.m_pixels[y * ditherImage.m_pitch];
SColor* destDitherPixel = (SColor*)&result.m_pixels[y * result.m_pitch];

for (size_t x = 0; x < ditherImage.m_width; ++x)
{
// tile the noise in case it isn't the same size as the image we are dithering
size_t noiseX = x % noiseImage.m_width;
size_t noiseY = y % noiseImage.m_height;
SColor* noisePixel = (SColor*)&noiseImage.m_pixels[noiseY * noiseImage.m_pitch + noiseX * 3];

uint8 value = 0;
if (noisePixel->R < srcDitherPixel->R)
value = 255;

destDitherPixel->R = value;
destDitherPixel->G = value;
destDitherPixel->B = value;

++srcDitherPixel;
++destDitherPixel;
}
}
}

//======================================================================================
void DitherWhiteNoise (const SImageData& ditherImage)
{
printf("\n%s\n", __FUNCTION__);

// make noise
SImageData noise;
GenerateWhiteNoise(noise, ditherImage.m_width, ditherImage.m_height);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// save the results
SImageData combined;
ImageCombine3(ditherImage, noise, dither, combined);
ImageSave(combined, "out/still_whitenoise.bmp");
}

//======================================================================================
{
printf("\n%s\n", __FUNCTION__);

// make noise
SImageData noise;

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// save the results
SImageData combined;
ImageCombine3(ditherImage, noise, dither, combined);
ImageSave(combined, "out/still_ignoise.bmp");
}

//======================================================================================
void DitherBlueNoise (const SImageData& ditherImage, const SImageData& blueNoise)
{
printf("\n%s\n", __FUNCTION__);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, blueNoise, dither);

// save the results
SImageData combined;
ImageCombine3(ditherImage, blueNoise, dither, combined);
ImageSave(combined, "out/still_bluenoise.bmp");
}

//======================================================================================
void DitherWhiteNoiseAnimated (const SImageData& ditherImage)
{
printf("\n%s\n", __FUNCTION__);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/anim_whitenoise%zu.bmp", i);

// make noise
SImageData noise;
GenerateWhiteNoise(noise, ditherImage.m_width, ditherImage.m_height);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
{
printf("\n%s\n", __FUNCTION__);

std::random_device rd;
std::mt19937 rng(rd());
std::uniform_real_distribution<float> dist(0.0f, 1000.0f);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/anim_ignoise%zu.bmp", i);

// make noise
SImageData noise;

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherBlueNoiseAnimated (const SImageData& ditherImage, const SImageData blueNoise[8])
{
printf("\n%s\n", __FUNCTION__);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/anim_bluenoise%zu.bmp", i);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, blueNoise[i], dither);

// save the results
SImageData combined;
ImageCombine2(blueNoise[i], dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherWhiteNoiseAnimatedIntegrated (const SImageData& ditherImage)
{
printf("\n%s\n", __FUNCTION__);

std::vector<float> integration;
integration.resize(ditherImage.m_width * ditherImage.m_height);
std::fill(integration.begin(), integration.end(), 0.0f);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animint_whitenoise%zu.bmp", i);

// make noise
SImageData noise;
GenerateWhiteNoise(noise, ditherImage.m_width, ditherImage.m_height);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// integrate and put the current integration results into the dither image
ImageForEachPixel(
dither,
[&] (SColor& pixel, size_t pixelIndex)
{
float pixelValueFloat = float(pixel.R) / 255.0f;
integration[pixelIndex] = Lerp(integration[pixelIndex], pixelValueFloat, 1.0f / float(i+1));

uint8 integratedPixelValue = uint8(integration[pixelIndex] * 255.0f);
pixel.R = integratedPixelValue;
pixel.G = integratedPixelValue;
pixel.B = integratedPixelValue;
}
);

// do an integration test
IntegrationTest(dither, ditherImage, i, __FUNCTION__);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
{
printf("\n%s\n", __FUNCTION__);

std::vector<float> integration;
integration.resize(ditherImage.m_width * ditherImage.m_height);
std::fill(integration.begin(), integration.end(), 0.0f);

std::random_device rd;
std::mt19937 rng(rd());
std::uniform_real_distribution<float> dist(0.0f, 1000.0f);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animint_ignoise%zu.bmp", i);

// make noise
SImageData noise;

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// integrate and put the current integration results into the dither image
ImageForEachPixel(
dither,
[&](SColor& pixel, size_t pixelIndex)
{
float pixelValueFloat = float(pixel.R) / 255.0f;
integration[pixelIndex] = Lerp(integration[pixelIndex], pixelValueFloat, 1.0f / float(i + 1));

uint8 integratedPixelValue = uint8(integration[pixelIndex] * 255.0f);
pixel.R = integratedPixelValue;
pixel.G = integratedPixelValue;
pixel.B = integratedPixelValue;
}
);

// do an integration test
IntegrationTest(dither, ditherImage, i, __FUNCTION__);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherBlueNoiseAnimatedIntegrated (const SImageData& ditherImage, const SImageData blueNoise[8])
{
printf("\n%s\n", __FUNCTION__);

std::vector<float> integration;
integration.resize(ditherImage.m_width * ditherImage.m_height);
std::fill(integration.begin(), integration.end(), 0.0f);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animint_bluenoise%zu.bmp", i);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, blueNoise[i], dither);

// integrate and put the current integration results into the dither image
ImageForEachPixel(
dither,
[&] (SColor& pixel, size_t pixelIndex)
{
float pixelValueFloat = float(pixel.R) / 255.0f;
integration[pixelIndex] = Lerp(integration[pixelIndex], pixelValueFloat, 1.0f / float(i+1));

uint8 integratedPixelValue = uint8(integration[pixelIndex] * 255.0f);
pixel.R = integratedPixelValue;
pixel.G = integratedPixelValue;
pixel.B = integratedPixelValue;
}
);

// do an integration test
IntegrationTest(dither, ditherImage, i, __FUNCTION__);

// save the results
SImageData combined;
ImageCombine2(blueNoise[i], dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherWhiteNoiseAnimatedGoldenRatio (const SImageData& ditherImage)
{
printf("\n%s\n", __FUNCTION__);

// make noise
SImageData noiseSrc;
GenerateWhiteNoise(noiseSrc, ditherImage.m_width, ditherImage.m_height);

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

SImageDataComplex noiseDFT;
SImageData noiseDFTMag;

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animgr_whitenoise%zu.bmp", i);

// add golden ratio to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// DFT the noise
if (c_doDFT)
{
ImageDFT(noise, noiseDFT);
GetMagnitudeData(noiseDFT, noiseDFTMag);
}
else
{
ImageInit(noiseDFTMag, noise.m_width, noise.m_height);
std::fill(noiseDFTMag.m_pixels.begin(), noiseDFTMag.m_pixels.end(), 0);
}

// Histogram test the noise
HistogramTest(noise, i, __FUNCTION__);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// save the results
SImageData combined;
ImageCombine3(noiseDFTMag, noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
{
printf("\n%s\n", __FUNCTION__);

// make noise
SImageData noiseSrc;

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

SImageDataComplex noiseDFT;
SImageData noiseDFTMag;

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animgr_ignoise%zu.bmp", i);

// add golden ratio to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// DFT the noise
if (c_doDFT)
{
ImageDFT(noise, noiseDFT);
GetMagnitudeData(noiseDFT, noiseDFTMag);
}
else
{
ImageInit(noiseDFTMag, noise.m_width, noise.m_height);
std::fill(noiseDFTMag.m_pixels.begin(), noiseDFTMag.m_pixels.end(), 0);
}

// Histogram test the noise
HistogramTest(noise, i, __FUNCTION__);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// save the results
SImageData combined;
ImageCombine3(noiseDFTMag, noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherBlueNoiseAnimatedGoldenRatio (const SImageData& ditherImage, const SImageData& noiseSrc)
{
printf("\n%s\n", __FUNCTION__);

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

SImageDataComplex noiseDFT;
SImageData noiseDFTMag;

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animgr_bluenoise%zu.bmp", i);

// add golden ratio to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// DFT the noise
if (c_doDFT)
{
ImageDFT(noise, noiseDFT);
GetMagnitudeData(noiseDFT, noiseDFTMag);
}
else
{
ImageInit(noiseDFTMag, noise.m_width, noise.m_height);
std::fill(noiseDFTMag.m_pixels.begin(), noiseDFTMag.m_pixels.end(), 0);
}

// Histogram test the noise
HistogramTest(noise, i, __FUNCTION__);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// save the results
SImageData combined;
ImageCombine3(noiseDFTMag, noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherWhiteNoiseAnimatedGoldenRatioIntegrated (const SImageData& ditherImage)
{
printf("\n%s\n", __FUNCTION__);

std::vector<float> integration;
integration.resize(ditherImage.m_width * ditherImage.m_height);
std::fill(integration.begin(), integration.end(), 0.0f);

// make noise
SImageData noiseSrc;
GenerateWhiteNoise(noiseSrc, ditherImage.m_width, ditherImage.m_height);

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animgrint_whitenoise%zu.bmp", i);

// add golden ratio to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// integrate and put the current integration results into the dither image
ImageForEachPixel(
dither,
[&] (SColor& pixel, size_t pixelIndex)
{
float pixelValueFloat = float(pixel.R) / 255.0f;
integration[pixelIndex] = Lerp(integration[pixelIndex], pixelValueFloat, 1.0f / float(i+1));

uint8 integratedPixelValue = uint8(integration[pixelIndex] * 255.0f);
pixel.R = integratedPixelValue;
pixel.G = integratedPixelValue;
pixel.B = integratedPixelValue;
}
);

// do an integration test
IntegrationTest(dither, ditherImage, i, __FUNCTION__);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
{
printf("\n%s\n", __FUNCTION__);

std::vector<float> integration;
integration.resize(ditherImage.m_width * ditherImage.m_height);
std::fill(integration.begin(), integration.end(), 0.0f);

// make noise
SImageData noiseSrc;

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animgrint_ignoise%zu.bmp", i);

// add golden ratio to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// integrate and put the current integration results into the dither image
ImageForEachPixel(
dither,
[&] (SColor& pixel, size_t pixelIndex)
{
float pixelValueFloat = float(pixel.R) / 255.0f;
integration[pixelIndex] = Lerp(integration[pixelIndex], pixelValueFloat, 1.0f / float(i+1));

uint8 integratedPixelValue = uint8(integration[pixelIndex] * 255.0f);
pixel.R = integratedPixelValue;
pixel.G = integratedPixelValue;
pixel.B = integratedPixelValue;
}
);

// do an integration test
IntegrationTest(dither, ditherImage, i, __FUNCTION__);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
void DitherBlueNoiseAnimatedGoldenRatioIntegrated (const SImageData& ditherImage, const SImageData& noiseSrc)
{
printf("\n%s\n", __FUNCTION__);

std::vector<float> integration;
integration.resize(ditherImage.m_width * ditherImage.m_height);
std::fill(integration.begin(), integration.end(), 0.0f);

SImageData noise;
ImageInit(noise, noiseSrc.m_width, noiseSrc.m_height);

// animate 8 frames
for (size_t i = 0; i < 8; ++i)
{
char fileName[256];
sprintf(fileName, "out/animgrint_bluenoise%zu.bmp", i);

// add golden ratio to the noise after each frame
noise.m_pixels = noiseSrc.m_pixels;
ImageForEachPixel(
noise,
[&] (SColor& pixel, size_t pixelIndex)
{
float valueFloat = (float(pixel.R) / 255.0f) + add;
size_t valueBig = size_t(valueFloat * 255.0f);
uint8 value = uint8(valueBig % 256);
pixel.R = value;
pixel.G = value;
pixel.B = value;
}
);

// dither the image
SImageData dither;
DitherWithTexture(ditherImage, noise, dither);

// integrate and put the current integration results into the dither image
ImageForEachPixel(
dither,
[&] (SColor& pixel, size_t pixelIndex)
{
float pixelValueFloat = float(pixel.R) / 255.0f;
integration[pixelIndex] = Lerp(integration[pixelIndex], pixelValueFloat, 1.0f / float(i+1));

uint8 integratedPixelValue = uint8(integration[pixelIndex] * 255.0f);
pixel.R = integratedPixelValue;
pixel.G = integratedPixelValue;
pixel.B = integratedPixelValue;
}
);

// do an integration test
IntegrationTest(dither, ditherImage, i, __FUNCTION__);

// save the results
SImageData combined;
ImageCombine2(noise, dither, combined);
ImageSave(combined, fileName);
}
}

//======================================================================================
int main (int argc, char** argv)
{
// load the dither image and convert it to greyscale (luma)
SImageData ditherImage;
{
return 0;
}
ImageConvertToLuma(ditherImage);

// load the blue noise images.
SImageData blueNoise[8];
for (size_t i = 0; i < 8; ++i)
{
char buffer[256];
sprintf(buffer, "src/BN%zu.bmp", i);
{
return 0;
}

// They have different values in R, G, B so make R be the value for all channels
ImageForEachPixel(
blueNoise[i],
[] (SColor& pixel, size_t pixelIndex)
{
pixel.G = pixel.R;
pixel.B = pixel.R;
}
);
}

g_logFile = fopen("log.txt", "w+t");

// still image dither tests
DitherWhiteNoise(ditherImage);
DitherBlueNoise(ditherImage, blueNoise[0]);

// Animated dither tests
DitherWhiteNoiseAnimated(ditherImage);
DitherBlueNoiseAnimated(ditherImage, blueNoise);

// Golden ratio animated dither tests
DitherWhiteNoiseAnimatedGoldenRatio(ditherImage);
DitherBlueNoiseAnimatedGoldenRatio(ditherImage, blueNoise[0]);

// Animated dither integration tests
DitherWhiteNoiseAnimatedIntegrated(ditherImage);
DitherBlueNoiseAnimatedIntegrated(ditherImage, blueNoise);

// Golden ratio animated dither integration tests
DitherWhiteNoiseAnimatedGoldenRatioIntegrated(ditherImage);
DitherBlueNoiseAnimatedGoldenRatioIntegrated(ditherImage, blueNoise[0]);

fclose(g_logFile);

return 0;
}


# Calculating the Distance Between Points in “Wrap Around” (Toroidal) Space

Let’s say you are trying to find the distance between two points in 2D, but that these points are in a universe that “wraps around” like old video games – leaving the screen on the right, left, top or bottom side makes you re-appear on the opposite edge.

This universe is actually shaped like a toroid, also known as a doughnut. It’s actually an impossible object, a “flat torus”, so not exactly a doughnut, but whatever.

If you imagine yourself on the surface of a doughnut, it would behave exactly this way. If you go “down” you end up where you previously considered “up”. If you go far enough “left” you end up where you previously considered “right”.

How would you calculate the distance between two points in a universe like this?

Let’s imagine the situation below where we are trying to find the distance between the red point and the green point:

One way to do this would be to pick one of the points (I’m picking red in this case) and clone it 8 times to surround the cell like the below. You’d calculate the distance from the green point to each of the 9 red points, and whatever distance was smallest would be the answer.

Something not so desirable about this is that it takes 9 distance calculations to find the minimum distance. You can work with squared distances instead of regular distances to avoid a square root on each of these distance calculations, but that’s still a bit of calculation to do.

Going up in dimensions makes the problem even worse. In 3D, it requires 27 distance calculations to find the shortest point, and 81 distance calculations in 4D!

Luckily there’s a better way to approach this.

Let’s say that our universe (image) is 1 unit by 1 unit big (aka we are working in texture UVs). If you look at the image with 9 copies of the red dot, you can see that they are just the 9 possible combinations of having -1, +0, +1 on each axis added to the red dot’s coordinates. All possible combinations of the x and y axis having -1, +0 or +1 added to them are valid locations of the red dot.

Looking at the distance formula we can see that if we minimize each axis individually, that we will also end up with the minimal distance overall.

$d = \sqrt{(x_2-x_1)^2+(y_2-y_1)^2}$

So, the better way is to minimize each axis individually.

On the x axis you’d find if the x axis distance between the red and green point is minimal when you subtract 1 from the red dot’s x axis position, leave it alone, or add 1.

Whichever x axis value of the red dot gives you the minimal x axis 1D distance is the x axis location to use.

You’d repeat for the y axis to get the y axis location to use (and would repeat for any further axes for higher dimensions).

This gives you the closest point which you can then plug into the distance formula to get the distance between the points in this wrap around space.

You can actually do better though.

Still working on each axis individually, you can calculate the absoluate value of the 1D distance between the two points on that axis. If that distance is greater than 0.5, the real distance for that axis is 1-distance.

The intuition here is that if you are in a 1d repeating space, if going from A to B is more than half the distance, it means that you went the wrong way, and that going the other way is shorter. The distance of that other way is one minus whatever distance you just calculated since the distance from one point to itself is 1!

Do that for each axis and use those 1d distances in the distance formula to get the actual distance.

This lets you minimize the distance without having to explicitly figure out which combination makes the point closest.

More importantly, it lets you efficiently calculate the distance between the two points in toroidal space (doughnut space!)

The computational complexity is a lot better. It’s now linear in the number of dimensions: $O(N)$, instead of $O(3^N)$.

Here is some C++ to show you how it would work in 2D.

float ToroidalDistance (float x1, float y1, float x2, float y2)
{
float dx = std::abs(x2 - x1);
float dy = std::abs(y2 - y1);

if (dx > 0.5f)
dx = 1.0f - dx;

if (dy > 0.5f)
dy = 1.0f - dy;

return std::sqrt(dx*dx + dy*dy);
}


I hit this problem trying to make a tileable texture. I needed to place a few circles on a texture such that the circles weren’t too close to each other, even when the texture was tiled.

The calculations above gave me the basic tool needed to be able to calculate distances between points. Subtracting circle radii from the distance between points let me get toroidal distance between circles and make sure I didn’t place them too closely to each other.

That let me make an image that kept the distance constraints even when it was tiled.

Here’s an example image by itself:

Here is the image tiled:

# Generating Random Numbers From a Specific Distribution With Rejection Sampling

The last post showed how to transform uniformly generated random numbers into any random number distribution you desired.

It did so by turning the PDF (probability density function) into a CDF (cumulative density function) and then inverting it – either analytically (making a function) or numerically (making a look up table).

This post will show you how to generate numbers from a PDF as well, but will do so using rejection sampling.

# Dice

Let’s say you wanted to simulate a fair five sided die but that you only had a six sided die.

You can use rejection sampling for this by rolling a six sided die and ignoring the roll any time a six came up. Doing that, you do in fact get a fair five sided die roll!

This shows doing that to get 10,000 five sided die rolls:

One disadvantage to this method is that you are throwing away die rolls which can be a source of inefficiency. In this setup it takes 1.2 six sided die rolls on average to get a valid five sided die roll since a roll will be thrown away 1/6 of the time.

Another disadvantage is that each time you need a new value, there are an unknown number of die rolls needed to get it. On average it’s true that you only need 1.2 die rolls, but in reality, it’s possible you may roll 10 sixes in a row. Heck it’s even technically possible (but very unlikely) that you could be rolling dice until the end of time and keep getting sixes. (Using PRNG’s in computers, this won’t happen, but it does take a variable number of rolls).

This is just to say: there is uneven and unpredictable execution time of this algorithm, and it needs an unknown (but somewhat predictable) amount of random numbers to work. This is true of the other forms of sampling methods I talk about lower down as well.

Instead of using a six sided die you could use a die of any size that is greater than (or equal to…) five. Here shows a twenty sided die simulating a five sided die:

It looks basically the same as using a six sided die, which makes sense (that shows that it works), but in this case, it actually took 4 rolls on average to make a valid five sided die roll, since the roll fails 15/20 times (3 out of 4 rolls will fail).

Quick Asides:

• If straying from rejection sampling ideas for a minute, in the case of the twenty sided die, you could use modulus to get a fair five sided die roll each time: $((roll - 1) \% 5) + 1$. This works because there is no remainder for 20 % 5. If there was a remainder it would bias the rolls towards the numbers <= the remainder, making them more likely to come up than the other numbers.
• You could also get a four sided die roll at the same time if you didn’t want to waste any of this precious random information: $((roll - 1) / 5) + 1$
• Another algorithm to check out for discrete (integer) weighted random numbers is Vose’s method: Vose’s Method.

# Box Around PDF

Moving back into the world of continuous valued random numbers and PDF’s, a simple version of how rejection sampling can be used is like this:

2. Draw a box around the PDF
3. Generate a (uniform) random point in that box
4. If the point is under the curve of the PDF, use the x axis value as your random number, else throw it out and go to 1

That’s all there is to it!

This works because the x axis value of your 2d point is the random number you might be choosing. The y axis value of your 2d point is a probability of choosing that point. Since the PDF graph is higher in places that are more probable, those places are more likely to accept your 2d point than places that have lower PDF values.

Furthermore, the average number of rejected samples vs accepted samples is based on the area under the PDF compared to the area of the box.

The number of samples on average will be the area of the box divided by the area of the PDF.

Since PDF’s by definition have to integrate to 1, that means that you are dividing by 1. So, to simplify: The number of samples on average will be the same as the area of the box!

If it’s hard to come up with the exact size of the box for the PDF, the box doesn’t have to fit exactly, but of course the tighter you can fit the box around the PDF, the fewer rejected samples you’ll have.

You don’t actually need to graph the PDF and draw a box to do this though. Just generate a 2d random number (a random x and a random y) and reject the point if $PDF(x) < y$.

Here I'm using this technique with the PDF $y=2x$ where x is in [0,1) and I'm using a box that goes from (0,0) to (1,2) to get 100,000 samples.

As expected, it took on average 2 points to get a single valid point since the area of the box is 2. Here are how many failed tests each histogram bucket had. Unsurprisingly, lower values of the PDF have more failed tests!

Moving to a more complex PDF, let’s look at $y=\frac{x^3-10x^2+5x+11}{10.417}$

Here are 10 million samples (lots of samples to minimize the noise), using a box height of 1.2, which unsurprisingly takes 1.2 samples on average to get a valid sample:

Here is the graph of the failure counts:

Here the box has a height of 2.8. It still works, but uses 2.8 samples on average which is less efficient:

Here’s the graph of failure counts:

Something interesting about this technique is that technically, the distribution you are sampling from doesn’t even have to be a PDF! If you have negative parts of the graph, they will be treated as zero, assuming your box has a minimum y of 0. Also, the fact that your function may not integrate to (have an area of) 1 doesn’t matter at all.

Here we take the PDF from the last examples, and take off the division by a constant, so that it doesn’t integrate to 1: $y=x^3-10x^2+5x+11$

The interesting thing is that we get as output a normalized PDF (the red line), even though the distribution we were using to sample was not normalized (the blue line, which is mostly hidden behind the yellow line).

Here are the rejection counts:

## Generating One PDF from Another PDF

In the last section we showed how to enclose a PDF in a box, make uniformly random 2d points, and use them to generate points from the PDF.

By enclosing it in a box, all we were really doing is putting it under a uniform distribition that was scaled up to be larger than the PDF at all points.

Now here’s the interesting thing: We aren’t limited to using the uniform distribution!

To generalize this technique, if you are trying to sample from a PDF $f(x)$, you can use any PDF $g(x)$ to do so, so long as you multiply $g(x)$ by a scalar value $M$ so that $M*g(x)>= f(x)$ for all values of x. In other words: scale up g so that it’s always bigger than f.

Using this more generalized technique has one or two more steps than the other way, but allows for a tighter fit of a generating function, resulting in fewer samples thrown away.

Here’s how to do it:

1. Generate a random number from the distribution g, and call it x.
2. Calculate the percentage chance of x being chosen by getting a ratio of how likely that number is to be chosen in each PDF: $\frac{f(x)}{M*g(x)}$
3. Generate a uniform random number from 0 to 1. If it’s less than the value you just calculated, accept x as the random number, else reject it and go back to 1.

Let’s see this in action!

We’ll generate numbers in a Gaussian distribution with a mean of 15 and a standard deviation of 5. We’ll truncate it to +/- 3 standard deviations so we want to generate random numbers from [0,30).

To generate these numbers, we’ll draw random numbers from the PDF $y=x*0.002222$. We’ll use an $M$ value of 3 to scale up this PDF to always be greater than the Gaussian one.

Here is how it looks doing this with 20,000 samples:

We generate random numbers along the red line, multiply them by 3 to make them be the yellow line. Then, at whatever point we are at on the x axis, we divide the blue line value by the yellow line value and use that as an acceptance probability. Doing this and counting numbers in a histogram gives us our result – the green line. Since the end goal is the blue line, you can see it is indeed working! With a larger number of samples, the green line would more closely match the blue line.

Here’s the graph of the failed tests:

We have to take on average 3 samples before we get a valid random number. That shouldn’t be too surprising because both PDF’s start with area of 1, but we are multiplying one of them by 3 to make it always be larger than the other.

Something else interesting you might notice is that we have a lot fewer failed tests where the two PDF functions are more similar.

That is the power of this technique: If you can cheaply and easily generate samples that are “pretty close” to a harder distribution to sample from, you can use this technique to more cheaply sample from it.

Something to note is that just like in the last section, the target PDF doesn’t necessarily need to be a real PDF with only positive values and integrating to 1. It would work just the same with a non PDF function, just so long as the PDF generating the random numbers you start with is always above the function.

# Some Other Notes

There is family of techniques called “adaptive rejection sampling” that will change the PDF they are drawing from whenever there is a failed test.

Basically, if you imagine the PDF you are drawing from as being a bunch of line segments connected together, you could imagine that whenever you failed a test, you moved a line segment down to be closer to the curve, so that when you sampled from that area again, the chances would be lower that you’d fail the test again.

Taking this to the limit, your sampling PDF will eventually become the PDF you are trying to sample from, and then using this PDF will be a no-op.

These techniques are a continued area of research.

Something else to note is that rejection sampling can be used to find random points within shapes.

For instance, a random point on a triangle, ellipse or circle could be done by putting a (tight) bounding box around the shape, generating points randomly in that box, and only accepting ones within the inner shape.

This can be extended to 3d shapes as well.

Some shapes have better ways to generate points within them that don’t involve iteration and rejected samples, but if all else fails, rejection sampling does indeed work!

At some point in the future I’d like to look into “Markov Chain Monte Carlo” more deeply. It seems like a very interesting technique to approach this same problem, but I have no idea if it’s used often in graphics, especially real time graphics.

# Code

Here is the code that generated all the data from this post. The data was visualized with open office.

#define _CRT_SECURE_NO_WARNINGS

#include <stdio.h>
#include <random>
#include <array>
#include <unordered_map>

template <size_t NUM_TEST_SAMPLES, size_t SIMULATED_DICE_SIDES, size_t ACTUAL_DICE_SIDES>
void TestDice (const char* fileName)
{
// seed the random number generator
std::random_device rd;
std::mt19937 rng(rd());
std::uniform_int_distribution<size_t> dist(0, ACTUAL_DICE_SIDES-1);

// generate the histogram
std::array<size_t, SIMULATED_DICE_SIDES> histogram = { 0 };
size_t rejectedSamples = 0;
for (size_t i = 0; i < NUM_TEST_SAMPLES; ++i)
{
size_t roll = dist(rng);
while (roll >= SIMULATED_DICE_SIDES)
{
++rejectedSamples;
roll = dist(rng);
}
histogram[roll]++;
}

// write the histogram and rejected sample count to a csv
// an extra 0 data point forces the graph to include 0 in the scale. hack to make the data not look noisier than it really is.
FILE *file = fopen(fileName, "w+t");
fprintf(file, "Actual Count, Expected Count, , %0.2f samples needed per roll on average.\n", (float(NUM_TEST_SAMPLES) + float(rejectedSamples)) / float(NUM_TEST_SAMPLES));
for (size_t value : histogram)
fprintf(file, "%zu,%zu,0\n", value, (size_t)(float(NUM_TEST_SAMPLES) / float(SIMULATED_DICE_SIDES)));
fclose(file);
}

template <size_t NUM_TEST_SAMPLES, size_t NUM_HISTOGRAM_BUCKETS, typename PDF_LAMBDA>
void Test (const char* fileName, float maxPDFValue, const PDF_LAMBDA& PDF)
{
// seed the random number generator
std::random_device rd;
std::mt19937 rng(rd());
std::uniform_real_distribution<float> dist(0.0f, 1.0f);

// generate the histogram
std::array<size_t, NUM_HISTOGRAM_BUCKETS> histogram = { 0 };
std::array<size_t, NUM_HISTOGRAM_BUCKETS> failedTestCounts = { 0 };
size_t rejectedSamples = 0;
for (size_t i = 0; i < NUM_TEST_SAMPLES; ++i)
{
// Generate a sample from the PDF by generating a random 2d point.
// If the y axis of the value is <= the value returned by PDF(x), accept it, else reject it.
// NOTE: this takes an unknown number of iterations, and technically may NEVER finish.
float pointX = 0.0f;
float pointY = 0.0f;
bool validPoint = false;
while (!validPoint)
{
pointX = dist(rng);
pointY = dist(rng) * maxPDFValue;
float pdfValue = PDF(pointX);
validPoint = (pointY <= pdfValue);

// track number of failed tests per histogram bucket
if (!validPoint)
{
size_t bin = (size_t)std::floor(pointX * float(NUM_HISTOGRAM_BUCKETS));
failedTestCounts[std::min(bin, NUM_HISTOGRAM_BUCKETS - 1)]++;
++rejectedSamples;
}
}

// increment the correct bin in the histogram
size_t bin = (size_t)std::floor(pointX * float(NUM_HISTOGRAM_BUCKETS));
histogram[std::min(bin, NUM_HISTOGRAM_BUCKETS -1)]++;
}

// write the histogram and pdf sample to a csv
FILE *file = fopen(fileName, "w+t");
fprintf(file, "PDF, Simulated PDF, Generating Function, Failed Tests, %0.2f samples needed per value on average.\n", (float(NUM_TEST_SAMPLES) + float(rejectedSamples)) / float(NUM_TEST_SAMPLES));
for (size_t i = 0; i < NUM_HISTOGRAM_BUCKETS; ++i)
{
float x = (float(i) + 0.5f) / float(NUM_HISTOGRAM_BUCKETS);
float pdfSample = PDF(x);
fprintf(file, "%f,%f,%f,%f\n",
pdfSample,
NUM_HISTOGRAM_BUCKETS * float(histogram[i]) / float(NUM_TEST_SAMPLES),
maxPDFValue,
float(failedTestCounts[i])
);
}
fclose(file);
}

template <size_t NUM_TEST_SAMPLES, size_t NUM_HISTOGRAM_BUCKETS, typename PDF_LAMBDA>
void TestNotPDF (const char* fileName, float maxPDFValue, float normalizationConstant, const PDF_LAMBDA& PDF)
{
// seed the random number generator
std::random_device rd;
std::mt19937 rng(rd());
std::uniform_real_distribution<float> dist(0.0f, 1.0f);

// generate the histogram
std::array<size_t, NUM_HISTOGRAM_BUCKETS> histogram = { 0 };
std::array<size_t, NUM_HISTOGRAM_BUCKETS> failedTestCounts = { 0 };
size_t rejectedSamples = 0;
for (size_t i = 0; i < NUM_TEST_SAMPLES; ++i)
{
// Generate a sample from the PDF by generating a random 2d point.
// If the y axis of the value is <= the value returned by PDF(x), accept it, else reject it.
// NOTE: this takes an unknown number of iterations, and technically may NEVER finish.
float pointX = 0.0f;
float pointY = 0.0f;
bool validPoint = false;
while (!validPoint)
{
pointX = dist(rng);
pointY = dist(rng) * maxPDFValue;
float pdfValue = PDF(pointX);
validPoint = (pointY <= pdfValue);

// track number of failed tests per histogram bucket
if (!validPoint)
{
size_t bin = (size_t)std::floor(pointX * float(NUM_HISTOGRAM_BUCKETS));
failedTestCounts[std::min(bin, NUM_HISTOGRAM_BUCKETS - 1)]++;
++rejectedSamples;
}
}

// increment the correct bin in the histogram
size_t bin = (size_t)std::floor(pointX * float(NUM_HISTOGRAM_BUCKETS));
histogram[std::min(bin, NUM_HISTOGRAM_BUCKETS -1)]++;
}

// write the histogram and pdf sample to a csv
FILE *file = fopen(fileName, "w+t");
fprintf(file, "Function, Simulated PDF, Scaled Simulated PDF, Generating Function, Failed Tests, %0.2f samples needed per value on average.\n", (float(NUM_TEST_SAMPLES) + float(rejectedSamples)) / float(NUM_TEST_SAMPLES));
for (size_t i = 0; i < NUM_HISTOGRAM_BUCKETS; ++i)
{
float x = (float(i) + 0.5f) / float(NUM_HISTOGRAM_BUCKETS);
float pdfSample = PDF(x);
fprintf(file, "%f,%f,%f,%f,%f\n",
pdfSample,
NUM_HISTOGRAM_BUCKETS * float(histogram[i]) / float(NUM_TEST_SAMPLES),
NUM_HISTOGRAM_BUCKETS * float(histogram[i]) / float(NUM_TEST_SAMPLES) * normalizationConstant,
maxPDFValue,
float(failedTestCounts[i])
);
}
fclose(file);
}

template <size_t NUM_TEST_SAMPLES, size_t NUM_HISTOGRAM_BUCKETS, typename PDF_F_LAMBDA, typename PDF_G_LAMBDA, typename INVERSE_CDF_G_LAMBDA>
void TestPDFToPDF (const char* fileName, const PDF_F_LAMBDA& PDF_F, const PDF_G_LAMBDA& PDF_G, float M, const INVERSE_CDF_G_LAMBDA& Inverse_CDF_G, float rngRange)
{
// We generate a sample from PDF F by generating a sample from PDF G, and accepting it with probability PDF_F(x)/(M*PDF_G(x))

// seed the random number generator
std::random_device rd;
std::mt19937 rng(rd());
std::uniform_real_distribution<float> dist(0.0f, 1.0f);

// generate the histogram
std::array<size_t, NUM_HISTOGRAM_BUCKETS> histogram = { 0 };
std::array<size_t, NUM_HISTOGRAM_BUCKETS> failedTestCounts = { 0 };
size_t rejectedSamples = 0;
for (size_t i = 0; i < NUM_TEST_SAMPLES; ++i)
{
// generate random points until we have one that's accepted
// NOTE: this takes an unknown number of iterations, and technically may NEVER finish.
float sampleG = 0.0f;
bool validPoint = false;
while (!validPoint)
{
// Generate a sample from the soure PDF G
sampleG = Inverse_CDF_G(dist(rng));

// calculate the ratio of how likely we are to accept this sample
float acceptChance = PDF_F(sampleG) / (M * PDF_G(sampleG));

// see if we should accept it
validPoint = dist(rng) <= acceptChance;

// track number of failed tests per histogram bucket
if (!validPoint)
{
size_t bin = (size_t)std::floor(sampleG * float(NUM_HISTOGRAM_BUCKETS) / rngRange);
failedTestCounts[std::min(bin, NUM_HISTOGRAM_BUCKETS - 1)]++;
++rejectedSamples;
}
}

// increment the correct bin in the histogram
size_t bin = (size_t)std::floor(sampleG * float(NUM_HISTOGRAM_BUCKETS) / rngRange);
histogram[std::min(bin, NUM_HISTOGRAM_BUCKETS - 1)]++;
}

// write the histogram and pdf sample to a csv
FILE *file = fopen(fileName, "w+t");
fprintf(file, "PDF F,PDF G,Scaled PDF G,Simulated PDF,Failed Tests,%0.2f samples needed per value on average.\n", (float(NUM_TEST_SAMPLES) + float(rejectedSamples)) / float(NUM_TEST_SAMPLES));
for (size_t i = 0; i < NUM_HISTOGRAM_BUCKETS; ++i)
{
float x = (float(i) + 0.5f) * rngRange / float(NUM_HISTOGRAM_BUCKETS);

fprintf(file, "%f,%f,%f,%f,%f\n",
PDF_F(x),
PDF_G(x),
PDF_G(x)*M,
NUM_HISTOGRAM_BUCKETS * float(histogram[i]) / (float(NUM_TEST_SAMPLES)*rngRange),
float(failedTestCounts[i])
);
}
fclose(file);
}

int main(int argc, char **argv)
{
// Dice
{
// Simulate a 5 sided dice with a 6 sided dice
TestDice<10000, 5, 6>("test1_5_6.csv");

// Simulate a 5 sided dice with a 20 sided dice
TestDice<10000, 5, 20>("test1_5_20.csv");
}

// PDF y=2x, simulated with a uniform distribution
{
auto PDF = [](float x) { return 2.0f * x; };

Test<1000, 100>("test2_1k.csv", 2.0f, PDF);
Test<100000, 100>("test2_100k.csv", 2.0f, PDF);
Test<1000000, 100>("test2_1m.csv", 2.0f, PDF);
}

// PDF y=(x^3-10x^2+5x+11)/10.417, simulated with a uniform distribution
{
auto PDF = [](float x) {return (x*x*x - 10.0f*x*x + 5.0f*x + 11.0f) / (10.417f); };
Test<10000000, 100>("test3_10m_1_15.csv", 1.15f, PDF);
Test<10000000, 100>("test3_10m_1_5.csv", 1.5f, PDF);
Test<10000000, 100>("test3_10m_2_8.csv", 2.8f, PDF);
}

// function (not PDF, Doesn't integrate to 1!) y=(x^3-10x^2+5x+11), simulated with a scaled up uniform distribution
{
auto PDF = [](float x) {return (x*x*x - 10.0f*x*x + 5.0f*x + 11.0f); };
TestNotPDF<10000000, 100>("test4_10m_12_5.csv", 12.5f, 10.417f, PDF);
}

// Generate samples from PDF F using samples from PDF G.  random numbers are from 0 to 30.
// F PDF = gaussian distribution, mean 15, std dev of 5.  Truncated to +/- 3 stddeviations.
// G PDF = x*0.002222
// G CDF = 0.001111 * x^2
// G inverted CDF = (1000 * sqrt(x)) / sqrt(1111)
// M = 3
{
// gaussian PDF F
const float mean = 15.0f;
const float stddev = 5.0f;
auto PDF_F = [=] (float x) -> float
{
return (1.0f / (stddev * sqrt(2.0f * (float)std::_Pi))) * std::exp(-0.5f * pow((x - mean) / stddev, 2.0f));
};

// PDF G
auto PDF_G = [](float x) -> float
{
return x * 0.002222f;
};

// Inverse CDF of G
auto Inverse_CDF_G = [] (float x) -> float
{
return 1000.0f * std::sqrtf(x) / std::sqrtf(1111.0f);
};

TestPDFToPDF<20000, 100>("test5.csv", PDF_F, PDF_G, 3.0f, Inverse_CDF_G, 30.0f);
}

return 0;
}


# Generating Random Numbers From a Specific Distribution By Inverting the CDF

The last post talked about the normal distribution and showed how to generate random numbers from that distribution by generating regular (uniform) random numbers and then counting the bits.

What would you do if you wanted to generate random numbers from a different, arbitrary distribution though? Let’s say the distribution is defined by a function even.

It turns out that in general this is a hard problem, but in practice there are a few ways to approach it. The below are the most common techniques for achieving this that I’ve seen.

• Inverting the CDF (analytically or numerically)
• Rejection Sampling
• Markov Chain Monte Carlo
• Ziggurat algorithm

This post talks about the first one listed: Inverting the CDF.

# What Is A CDF?

The last post briefly explained that a PDF is a probability density function and that it describes the relative probability of numbers being chosen at random. A requirement of a PDF is that it has non negative value everywhere and also that the area under the curve is 1.

It needs to be non negative everywhere because a negative probability doesn’t make any sense. It needs to have an area under the curve of 1 because that means it represents the full 100% probability of all possible outcomes.

CDF stands for “Cumulative distribution function” and is related to the PDF.

A PDF is a function y=f(x) where y is the probability of the number x number being chosen at random from the distribution.

A CDF is a function y=f(x) where y is the probability of the number x, or any lower number, being chosen at random from that distribution.

You get a CDF from a PDF by integrating the PDF. From there you make sure that the CDF has a starting y value of 0, and an ending value of 1. You might have to do a bias (addition or subtraction) and/or scale (multiplication or division) to make that happen.

# Why Invert the CDF? (And Not the PDF?)

With both a PDF and a CDF, you plug in a number, and you get information about probabilities relating to that number.

To get a random number from a specific distribution, we want to do the opposite. We want to plug in a probability and get out the number corresponding to that probability.

Basically, we want to flip x and y in the equation and solve for y, so that we have a function that does this. That is what we have to do to invert the CDF.

Why invert the CDF though and not the PDF? Check out the images below from Wikipedia. The first is some Gaussian PDF’s and the second is the same distributions as CDF’s:

The issue is that if we flip x and y’s in a PDF, there would be multiple y values corresponding to the same x. This isn’t true in a CDF.

Let’s work through sampling some PDFs by inverting the CDF.

# Example 0: y=1

This is the easiest case and represents uniform random numbers, where every number is evenly likely to be chosen.

Our PDF equation is: $y=1$ where $x \in [0,1]$. The graph looks like this:

If we integrate the pdf to get the cdf, we get $y=x$ where $x \in [0,1]$ which looks like this:

Now, to invert the cdf, we flip x and y, and then solve for y again. It’s trivially easy…

$y=x \Leftarrow \text{CDF}\\ x=y \Leftarrow \text{Flip x and y}\\ y=x \Leftarrow \text{Solve for y again}$

Now that we have our inverted CDF, which is $y=x$, we can generate uniform random numbers, plug them into that equation as x and get y which is the actual value drawn from our PDF.

You can see that since we are plugging in numbers from an even distribution and not doing anything to them at all, that the result is going to an even distribution as well. So, we are in fact generating uniformly distributed random numbers using this inverted CDF, just like our PDF asked for.

This is so trivially simple it might be confusing. If so, don’t sweat it. Move onto the next example and you can come back to this later if you want to understand what I’m talking about here.

Note: The rest of the examples are going to have x in [0,1] as well but we are going to stop explicitly saying so. This process still works when x is in a different range of values, but for simplicity we’ll just have x be in [0,1] for the rest of the post.

# Example 1: y=2x

The next easiest case for a PDF is $y=2x$ which looks like this:

You might wonder why it’s $y=2x$ instead of $y=x$. This is because the area under the curve $y=x$ is 0.5. PDF’s need to have an area of 1, so I multiplied by 2 to make it have an area of 1.

What this PDF means is that small numbers are less likely to be picked than large numbers.

If we integrate the PDF $y=2x$ to get the CDF, we get $y=x^2$ which looks like this:

Now let’s flip x and y and solve for y again.

$y=x^2 \Leftarrow \text{CDF}\\ x=y^2 \Leftarrow \text{Flip x and y}\\ y=\sqrt{x} \Leftarrow \text{Solve for y again}$

We now have our inverted CDF which is $y=\sqrt{x}$ and looks like this:

Now, if we plug uniformly random numbers into that formula as x, we should get as output samples that follow the probability of our PDF.

We can use a histogram to see if this is really true. We can generate some random numbers, square root them, and count how many are in each range of values.

Here is a histogram where I took 1,000 random numbers, square rooted them, and put their counts into 100 buckets. Bucket 1 counted how many numbers were in [0, 0.01), bucket 2 counted how many numbers were in [0.01, 0.02) and so on until bucket 100 which counted how many numbers were in [0.99, 1.0).

Increasing the number of samples to 100,000 it gets closer:

At 1,000,000 samples you can barely see a difference:

The reason it doesn’t match up at lower sample counts is just due to the nature of random numbers being random. It does match up, but you’ll have some variation with lower sample counts.

# Example 2: y=3x^2

Let’s check out the PDF $y=3x^2$. The area under that curve where x is in [0,1) is 1.0 and it’s non negative everywhere in that range too, so it’s a valid PDF.

Integrating that, we get $y=x^3$ for the CDF. Then we invert the CDF:

$y=x^3 \Leftarrow \text{CDF}\\ x=y^3 \Leftarrow \text{Flip x and y}\\ y=\sqrt[3]{x} \Leftarrow \text{Solve for y again}$

And here is a 100,000 sample histogram vs the PDF to verify that we got the right answer:

# Example 3: Numeric Solution

So far we’ve been able to invert the CDF to get a nice easy function to transform uniform distribution random numbers into numbers from the distribution described by the PDF.

Sometimes though, inverting a CDF isn’t possible, or gives a complex equation that is costly to evaluate. In these cases, you can actually invert the CDF numerically via a lookup table.

A lookup table may also be desired in cases where eg you have a pixel shader that is drawing numbers from a PDF, and instead of making N shaders for N different PDFs, you want to unify them all into a single shader. Passing a lookup table via a constant buffer, or perhaps even via a texture can be a decent solution here. (Note: if storing in a texture you may be interested in fitting the data with curves and using this technique to store it and recall it from the texture: GPU Texture Sampler Bezier Curve Evaluation)

Let’s invert a PDF numerically using a look up table to see how that would work.

Our PDF will be:

$y=\frac{x^3-10x^2+5x+11}{10.417}$

And looks like this:

It’s non negative in the range we care about and it integrates to 1.0 – or it integrates closely enough… the division by 10.417 is there for that reason, and using more digits would get it closer to 1.0.

What we are going to do is evaluate that PDF at N points to get a probability for those samples of numbers. That will give us a lookup table for our PDF.

We are then going to make each point be the sum of all the PDF samples to the left of it to make a lookup table for a CDF. We’ll also have to normalize the CDF table since it’s likely that our PDF samples don’t all add up (integrate) to 1.0. We do this by dividing every item in the CDF by the last entry in the CDF. If you look at the table after that, it will fully cover everything from 0% to 100% probability.

Below are some histogram comparisons of the lookup table technique vs the actual PDF.

Here is 100 million samples (to make it easier to see the data without very much random noise), in 100 histogram buckets, and a lookup table size of 3 which is pretty low quality:

Increasing it to a lookup table of size 5 gives you this:

Here’s 10:

25:

And here’s 100:

So, not surprisingly, the size of the lookup table affects the quality of the results!

## Code

here is the code I used to generate the data in this post, which i visualized with open office. I visualized the function graphs using wolfram alpha.

#define _CRT_SECURE_NO_WARNINGS

#include
#include
#include
#include

template
void Test (const char* fileName, const PDF_LAMBDA& PDF, const INVERSE_CDF_LAMBDA& inverseCDF)
{
// seed the random number generator
std::random_device rd;
std::mt19937 rng(rd());
std::uniform_real_distribution dist(0.0f, 1.0f);

// generate the histogram
std::array histogram = { 0 };
for (size_t i = 0; i < NUM_TEST_SAMPLES; ++i)
{
// put a uniform random number into the inverted CDF to sample the PDF
float x = dist(rng);
float y = inverseCDF(x);

// increment the correct bin on the histogram
size_t bin = (size_t)std::floor(y * float(NUM_HISTOGRAM_BUCKETS));
histogram[std::min(bin, NUM_HISTOGRAM_BUCKETS -1)]++;
}

// write the histogram and pdf sample to a csv
FILE *file = fopen(fileName, "w+t");
fprintf(file, "PDF, Inverted CDF\n");
for (size_t i = 0; i < NUM_HISTOGRAM_BUCKETS; ++i)
{
float x = (float(i) + 0.5f) / float(NUM_HISTOGRAM_BUCKETS);
float pdfSample = PDF(x);
fprintf(file, "%f,%f\n",
pdfSample,
NUM_HISTOGRAM_BUCKETS * float(histogram[i]) / float(NUM_TEST_SAMPLES)
);
}
fclose(file);
}

template
void TestPDFOnly (const char* fileName, const PDF_LAMBDA& PDF)
{
// make the CDF lookup table by sampling the PDF
// NOTE: we could integrate the buckets by averaging multiple samples instead of just the 1. This bucket integration is pretty low tech and low quality.
std::array CDFLookupTable;
float value = 0.0f;
for (size_t i = 0; i < LOOKUP_TABLE_SIZE; ++i)
{
float x = float(i) / float(LOOKUP_TABLE_SIZE - 1); // The -1 is so we cover the full range from 0% to 100%
value += PDF(x);
CDFLookupTable[i] = value;
}

// normalize the CDF - make sure we span the probability range 0 to 1.
for (float& f : CDFLookupTable)
f /= value;

// make our LUT based inverse CDF
// We will binary search over the y's (which are sorted smallest to largest) looking for the x, which is implied by the index.
// I'm sure there's a better & more clever lookup table setup for this situation but this should give you an idea of the technique
auto inverseCDF = [&CDFLookupTable] (float y) {

// there is an implicit entry of "0%" at index -1
if (y < CDFLookupTable[0])
{
float t = y / CDFLookupTable[0];
return t / float(LOOKUP_TABLE_SIZE);
}

// get the lower bound in the lut using a binary search
auto it = std::lower_bound(CDFLookupTable.begin(), CDFLookupTable.end(), y);

// figure out where we are at in the table
size_t index = it - CDFLookupTable.begin();

// Linearly interpolate between the values
// NOTE: could do other interpolation methods, like perhaps cubic (https://blog.demofox.org/2015/08/08/cubic-hermite-interpolation/)
float t = (y - CDFLookupTable[index - 1]) / (CDFLookupTable[index] - CDFLookupTable[index - 1]);
float fractionalIndex = float(index) + t;
return fractionalIndex / float(LOOKUP_TABLE_SIZE);
};

// call the usual function to do the testing
Test(fileName, PDF, inverseCDF);
}

int main (int argc, char **argv)
{
// PDF: y=2x
// inverse CDF: y=sqrt(x)
{
auto PDF = [] (float x) { return 2.0f * x; };
auto inverseCDF = [] (float x) { return std::sqrt(x); };

Test("test1_1k.csv", PDF, inverseCDF);
Test("test1_100k.csv", PDF, inverseCDF);
Test("test1_1m.csv", PDF, inverseCDF);
}

// PDF: y=3x^2
// inverse CDF: y=cuberoot(x) aka y = pow(x, 1/3)
{
auto PDF = [] (float x) { return 3.0f * x * x; };
auto inverseCDF = [](float x) { return std::pow(x, 1.0f / 3.0f); };

Test("test2_100k.csv", PDF, inverseCDF);
}

// PDF: y=(x^3-10x^2+5x+11)/10.417
// Inverse CDF Numerically via a lookup table
{
auto PDF = [] (float x) {return (x*x*x - 10.0f*x*x + 5.0f*x + 11.0f) / (10.417f); };
TestPDFOnly("test3_100m_3.csv", PDF);
TestPDFOnly("test3_100m_5.csv", PDF);
TestPDFOnly("test3_100m_10.csv", PDF);
TestPDFOnly("test3_100m_25.csv", PDF);
TestPDFOnly("test3_100m_100.csv", PDF);
}

return 0;
}


# Counting Bits & The Normal Distribution

I recently saw some interesting posts on twitter about the normal distribution:

I’m not really a statistics kind of guy, but knowing that probability distributions come up in graphics (Like in PBR & Path Tracing), it seemed like a good time to upgrade knowledge in this area while sharing an interesting technique for generating normal distribution random numbers.

## Basics

Below is an image showing a few normal (aka Gaussian) distributions (from wikipedia).

Normal distributions are defined by these parameters:

• $\mu$ – “mu” is the mean. This is the average value of the distribution. This is where the center (peak) of the curve is on the x axis.
• $\sigma^2$ – “sigma squared” is the variance, and is just the standard deviation squared. I find standard deviation more intuitive to think about.
• $\sigma$ – “sigma” is the standard deviation, which (surprise surprise!) is the square root of the variance. This controls the “width” of the graph. The area under the cover is 1.0, so as you increase standard deviation and make the graph wider, it also gets shorter.

Here’s a diagram of standard deviations to help understand them (also from wikipedia):

I find the standard deviation intuitive because 68.2% of the data is within one standard deviation from the mean (on the plus and minus side of the mean). 95.4% of the data is within two standard deviations of the mean.

Standard deviation is given in the same units as the data itself, so if a bell curve described scores on a test, with a mean of 80 and a standard deviation of 5, it means that 68.2% of the students got between 75 and 85 points on the test, and that 95.4% of the students got between 70 and 90 points on the test.

The normal distribution is what’s called a “probability density function” or pdf, which means that the y axis of the graph describes the likelyhood of the number on the x axis being chosen at random.

This means that if you have a normal distribution that has a specific mean and variance (standard deviation), that numbers closer to the mean are more likely to be chosen randomly, while numbers farther away are less likely. The variance controls how the probability drops off as you get farther away from the mean.

Thinking about standard deviation again, 68.2% of the random numbers generated will be within 1 standard deviation of the mean (+1 std dev or -1 std dev). 95.4% will be within 2 standard deviations.

## Generating Normal Distribution Random Numbers – Coin Flips

Generating uniform random numbers, where every number is as likely as every other number, is pretty simple. In the physical world, you can roll some dice or flip some coins. In the software world, you can use PRNGs.

How would you generate random numbers that follow a normal distribution though?

In C++, there is std::normal_distribution that can do this for you. There is also something called the Box-Muller transform that can turn uniformly distributed random numbers into normal distribution random numbers (info here: Generating Gaussian Random Numbers).

I want to talk about something else though and hopefully build some better intuition.

First let’s look at coin flips.

If you flip a fair coin a million times and keep a count of how many heads and tails you saw, you might get 500014 heads and 499986 tails (I got this with a PRNG – std::mt19937). That is a pretty uniform distribution of values in the range of [0,1]. (breadcrumb: pascal’s triangle row 2 is 1,1)

Let’s flip two coins at a time though and add our values together (say that heads is 0 and tails is 1). Here’s what that graph looks like:

Out of 1 million flips, 250639 had no tails, 500308 had one tail, and 249053 had two tails. It might seem weird that they aren’t all even, but it makes more sense when you look at the outcome of flipping two coins: we can get heads/heads (00), heads/tails (01), tails/heads (10) or tails/tails (11). Two of the four possibilities have a single tails, so it makes sense that flipping two coins and getting one coin being a tail would be twice as likely as getting no tails or two tails. (breadcrumb: pascal’s triangle row 3 is 1,2,1)

What happens when we sum 3 coins? With a million flips I got 125113 0’s, 375763 1’s, 373905 2’s and 125219 3’s.

If you work out the possible combinations, there is 1 way to get 0, 3 ways to get 1, 3 ways to get 2 and 1 way to get 3. Those numbers almost exactly follow that 1, 3, 3, 1 probability. (breadcrumb: pascal’s triangle row 4 is 1,3,3,1)

If we flip 100 coins and sum them, we get this:

That looks a bit like the normal distribution graphs at the beginning of this post doesn’t it?

Flipping and summing coins will get you something called the “Binomial Distribution”, and the interesting thing there is that the binomial distribution approaches the normal distribution the more coins you are summing together. At an infinite number of coins, it is the normal distribution.

## Generating Normal Distribution Random Numbers – Dice Rolls

What if instead of flipping coins, we roll dice?

Well, rolling a 4 sided die a million times, you get each number roughly the same percentage of the time as you’d expect; roughly 25% each. 250125 0’s, 250103 1’s, 249700 2’s, 250072 3’s.

If we sum two 4 sided dice rolls we get this:

If we sum three 4 sided dice rolls we get this:

And if we sum one hundred we get this, which sure looks like a normal distribution:

This isn’t limited to four sided dice though, here’s one hundred 6 sided dice being summed:

With dice, instead of being a “binomial distribution”, it’s called a “multinomial distribution”, but as the number of dice goes to infinity, it also approaches the normal distribution.

This means you can get a normal distribution with not only coins, but any sided dice in general.

An even stronger statement than that is the Central Limit Theorem which says that if you have random numbers from ANY distribution, if you add enough of em together, you’ll often approach a normal distribution.

Strange huh?

## Generating Normal Distribution Random Numbers – Counting Bits

Now comes a fun way of generating random numbers which follow a normal distribution. Are you ready for it?

Simply generate an N bit random number and return how many 1 bits are set.

That gives you a random number that follows a normal distribution!

One problem with this is that you have very low “resolution” random numbers. Counting the bits of a 64 bit random number for instance, you can only return 0 through 64 so there are only 65 possible random numbers.

That is a pretty big limitation, but if you need normal distribution numbers calculated quickly and don’t mind if they are low resolution (like in a pixel shader?), this technique could work well for you.

Another problem though is that you don’t have control over the variance or the mean of the distribution.

That isn’t a super huge deal though because you can easily convert numbers from one normal distribution into another normal distribution.

To do so, you get your normal distribution random number. First you subtract the mean of the distribution to make it centered on 0 (have a mean of 0). You then divide it by the standard deviation to make it be part of a distribution which has a standard deviation of 1.

At this point you have a random number from a normal distribution which has a mean of 0 and a standard deviation of 1.

Next, you multiply the number by the standard deviation of the distribution you want, and lastly you add the mean of the distribution you want.

That’s pretty simple (and is implemented in the source code at the bottom of this post), but to do this you need to know what standard deviation (variance) and mean you are starting with.

If you have some way to generate random numbers in [0, N) and you are summing M of those numbers together, the mean is $M*(N-1)/2$. Note that if you instead are generating random numbers in [1,N], the mean instead is $M*(N+1)/2$.

The variance in either case is $M*(N^2-1)/12$. The standard deviation is the square root of that.

Using that information you have everything you need to generate normal distribution random numbers of a specified mean and variance.

Thanks to @fahickman for the help on calculating mean and variance of dice roll sums.

## Code

Here is the source code I used to generate the data which was used to generate the graphs in this post. There is also an implementation of the bit counting algorithm i mentioned, which converts to the desired mean and variance.

#define _CRT_SECURE_NO_WARNINGS

#include <array>
#include <random>
#include <stdint.h>
#include <stdio.h>
#include <limits>

const size_t c_maxNumSamples = 1000000;
const char* c_fileName = "results.csv";

template <size_t DiceRange, size_t DiceCount, size_t NumBuckets>
void DumpBucketCountsAddRandomNumbers (size_t numSamples, const std::array<size_t, NumBuckets>& bucketCounts)
{
// open file for append if we can
FILE* file = fopen(c_fileName, "a+t");
if (!file)
return;

// write the info
float mean = float(DiceCount) * float(DiceRange - 1.0f) / 2.0f;
float variance = float(DiceCount) * (DiceRange * DiceRange) / 12.0f;
if (numSamples == 1)
{
fprintf(file, "\"%zu random numbers [0,%zu) added together (sum %zud%zu). %zu buckets.  Mean = %0.2f.  Variance = %0.2f.  StdDev = %0.2f.\"\n", DiceCount, DiceRange, DiceCount, DiceRange, NumBuckets, mean, variance, std::sqrt(variance));
fprintf(file, "\"\"");
for (size_t i = 0; i < NumBuckets; ++i)
fprintf(file, ",\"%zu\"", i);
fprintf(file, "\n");
}
fprintf(file, "\"%zu samples\",", numSamples);

// report the samples
for (size_t count : bucketCounts)
fprintf(file, "\"%zu\",", count);

fprintf(file, "\"\"\n");
if (numSamples == c_maxNumSamples)
fprintf(file, "\n");

// close file
fclose(file);
}

template <size_t DiceSides, size_t DiceCount>
{
std::mt19937 rng;
rng.seed(std::random_device()());
std::uniform_int_distribution<size_t> dist(size_t(0), DiceSides - 1);

std::array<size_t, (DiceSides - 1) * DiceCount + 1> bucketCounts = { 0 };

size_t nextDump = 1;
for (size_t i = 0; i < c_maxNumSamples; ++i)
{
size_t sum = 0;
for (size_t j = 0; j < DiceCount; ++j)
sum += dist(rng);

bucketCounts[sum]++;

if (i + 1 == nextDump)
{
nextDump *= 10;
}
}
}

template <size_t NumBuckets>
void DumpBucketCountsCountBits (size_t numSamples, const std::array<size_t, NumBuckets>& bucketCounts)
{
// open file for append if we can
FILE* file = fopen(c_fileName, "a+t");
if (!file)
return;

// write the info
float mean = float(NumBuckets-1) * 1.0f / 2.0f;
float variance = float(NumBuckets-1) * 3.0f / 12.0f;
if (numSamples == 1)
{
fprintf(file, "\"%zu random bits (coin flips) added together. %zu buckets.  Mean = %0.2f.  Variance = %0.2f.  StdDev = %0.2f.\"\n", NumBuckets - 1, NumBuckets, mean, variance, std::sqrt(variance));
fprintf(file, "\"\"");
for (size_t i = 0; i < NumBuckets; ++i)
fprintf(file, ",\"%zu\"", i);
fprintf(file, "\n");
}
fprintf(file, "\"%zu samples\",", numSamples);

// report the samples
for (size_t count : bucketCounts)
fprintf(file, "\"%zu\",", count);

fprintf(file, "\"\"\n");
if (numSamples == c_maxNumSamples)
fprintf(file, "\n");

// close file
fclose(file);
}

template <size_t NumBits> // aka NumCoinFlips!
void CountBitsTest ()
{

size_t maxValue = 0;
for (size_t i = 0; i < NumBits; ++i)
maxValue = (maxValue << 1) | 1;

std::mt19937 rng;
rng.seed(std::random_device()());
std::uniform_int_distribution<size_t> dist(0, maxValue);

std::array<size_t, NumBits + 1> bucketCounts = { 0 };

size_t nextDump = 1;
for (size_t i = 0; i < c_maxNumSamples; ++i)
{
size_t sum = 0;
size_t number = dist(rng);
while (number)
{
if (number & 1)
++sum;
number = number >> 1;
}

bucketCounts[sum]++;

if (i + 1 == nextDump)
{
DumpBucketCountsCountBits(nextDump, bucketCounts);
nextDump *= 10;
}
}
}

float GenerateNormalRandomNumber (float mean, float variance)
{
static std::mt19937 rng;
static std::uniform_int_distribution<uint64_t> dist(0, (uint64_t)-1);

static bool seeded = false;
if (!seeded)
{
seeded = true;
rng.seed(std::random_device()());
}

// generate our normal distributed random number from 0 to 65.
//
float sum = 0.0f;
uint64_t number = dist(rng);
while (number)
{
if (number & 1)
sum += 1.0f;
number = number >> 1;
}

// convert from: mean 32, variance 16, stddev 4
// to: mean 0, variance 1, stddev 1
float ret = sum;
ret -= 32.0f;
ret /= 4.0f;

// convert to the specified mean and variance
ret *= std::sqrt(variance);
ret += mean;
return ret;
}

void VerifyGenerateNormalRandomNumber (float mean, float variance)
{
// open file for append if we can
FILE* file = fopen(c_fileName, "a+t");
if (!file)
return;

// write info
fprintf(file, "\"Normal Distributed Random Numbers. mean = %0.2f.  variance = %0.2f.  stddev = %0.2f\"\n", mean, variance, std::sqrt(variance));

// write some random numbers
fprintf(file, "\"100 numbers\"");
for (size_t i = 0; i < 100; ++i)
fprintf(file, ",\"%f\"", GenerateNormalRandomNumber(mean, variance));
fprintf(file, "\n\n");

// close file
fclose(file);
}

int main (int argc, char **argv)
{
// clear out the file
FILE* file = fopen(c_fileName, "w+t");
if (file)
fclose(file);

// coin flips
{
// flip a fair coin

// flip two coins and sum them

// sum 3 coin flips

// sum 100 coin flips
}

// dice rolls
{
// roll a 4 sided die

// sum two 4 sided dice

// sum three 4 sided dice

// sum one hundred 4 sided dice

// sum one hundred 6 sided dice
}

CountBitsTest<8>();
CountBitsTest<16>();
CountBitsTest<32>();
CountBitsTest<64>();

VerifyGenerateNormalRandomNumber(0.0f, 20.0f);

VerifyGenerateNormalRandomNumber(0.0f, 10.0f);

VerifyGenerateNormalRandomNumber(5.0f, 10.0f);

return 0;
}


# When Random Numbers Are Too Random: Low Discrepancy Sequences

Random numbers can be useful in graphics and game development, but they have a pesky and sometimes undesirable habit of clumping together.

This is a problem in path tracing and monte carlo integration when you take N samples, but the samples aren’t well spread across the sampling range.

This can also be a problem for situations like when you are randomly placing objects in the world or generating treasure for a treasure chest. You don’t want your randomly placed trees to only be in one part of the forest, and you don’t want a player to get only trash items or only godly items when they open a treasure chest. Ideally you want to have some randomness, but you don’t want the random number generator to give you all of the same or similar random numbers.

The problem is that random numbers can be TOO random, like in the below where you can see clumps and large gaps between the 100 samples.

For cases like that, when you want random numbers that are a little bit more well distributed, you might find some use in low discrepancy sequences.

The standalone C++ code (one source file, standard headers, no libraries to link to) I used to generate the data and images are at the bottom of this post, as well as some links to more resources.

# What Is Discrepancy?

In this context, discrepancy is a measurement of the highest or lowest density of points in a sequence. High discrepancy means that there is either a large area of empty space, or that there is an area that has a high density of points. Low discrepancy means that there are neither, and that your points are more or less pretty evenly distributed.

The lowest discrepancy possible has no randomness at all, and in the 1 dimensional case means that the points are evenly distributed on a grid. For monte carlo integration and the game dev usage cases I mentioned, we do want some randomness, we just want the random points to be spread out a little more evenly.

If more formal math notation is your thing, discrepancy is defined as:
$D_{N}(P)=\sup _{{B\in J}}\left|{\frac {A(B;P)}{N}}-\lambda _{s}(B)\right|$

Equidistributed sequence

For monte carlo integration specifically, this is the behavior each thing gives you:

• High Discrepancy: Random Numbers / White Noise aka Uniform Distribution – At lower sample counts, convergance is slower (and have higher variance) due to the possibility of not getting good coverage over the area you integrating. At higher sample counts, this problem disappears. (Hint: real time graphics and preview renderings use a smaller number of samples)
• Lowest Discrepancy: Regular Grid – This will cause aliasing, unlike the other “random” based sampling, which trade aliasing for noise. Noise is preferred over aliasing.
• Low Discrepancy: Low Discrepancy Sequences – In lower numbers of samples, this will have faster convergence by having better coverage of the sampling space, but will use randomness to get rid of aliasing by introducing noise.

Also interesting to note, Quasi Monte Carlo has provably better asymptotic convergence than regular monte carlo integration.

# 1 Dimensional Sequences

We’ll first look at 1 dimensional sequences.

## Grid

Here are 100 samples evenly spaced:

## Random Numbers (White Noise)

This is actually a high discrepancy sequence. To generate this, you just use a standard random number generator to pick 100 points between 0 and 1. I used std::mt19937 with a std::uniform_real_distribution from 0 to 1:

## Subrandom Numbers

Subrandom numbers are ways to decrease the discrepancy of white noise.

One way to do this is to break the sampling space in half. You then generate even numbered samples in the first half of the space, and odd numbered samples in the second half of the space.

There’s no reason you can’t generalize this into more divisions of space though.

This splits the space into 4 regions:

8 regions:

16 regions:

32 regions:

There are other ways to generate subrandom numbers though. One way is to generate random numbers between 0 and 0.5, and add them to the last sample, plus 0.5. This gives you a random walk type setup.

Here is that:

## Uniform Sampling + Jitter

If you take the first subrandom idea to the logical maximum, you break your sample space up into N sections and place one point within those N sections to make a low discrepancy sequence made up of N points.

Another way to look at this is that you do uniform sampling, but add some random jitter to the samples, between +/- half a uniform sample size, to keep the samples in their own areas.

This is that:

I have heard that Pixar invented this technique interestingly.

# Irrational Numbers

Rational numbers are numbers which can be described as fractions, such as 0.75 which can be expressed as 3/4. Irrational numbers are numbers which CANNOT be described as fractions, such as pi, or the golden ratio, or the square root of a prime number.

Interestingly you can use irrational numbers to generate low discrepancy sequences. You start with some value (could be 0, or could be a random number), add the irrational number, and modulus against 1.0. To get the next sample you add the irrational value again, and modulus against 1.0 again. Rinse and repeat until you get as many samples as you want.

Some values work better than others though, and apparently the golden ratio is provably the best choice (1.61803398875…), says Wikipedia.

Here is the golden ratio, using 4 different random (white noise) starting values:

Here I’ve used the square root of 2, with 4 different starting random numbers again:

Lastly, here is pi, with 4 random starting values:

## Van der Corput Sequence

The Van der Corput sequence is the 1d equivelant of the Halton sequence which we’ll talk about later.

How you generate values in the Van der Corput sequence is you convert the index of your sample into some base.

For instance if it was base 2, you would convert your index to binary. If it was base 16, you would convert your index to hexadecimal.

Now, instead of treating the digits as if they are $B^0$, $B^1$, $B^2$, etc (where B is the base), you instead treat them as $B^{-1}$, $B^{-2}$, $B^{-3}$ and so on. In other words, you multiply each digit by a fraction and add up the results.

To show a couple quick examples, let’s say we wanted sample 6 in the sequence of base 2.

First we convert 6 to binary which is 110. From right to left, we have 3 digits: a 0 in the 1’s place, a 1 in the 2’s place, and a 1 in the 4’s place. $0*1 + 1*2 + 1*4 = 6$, so we can see that 110 is in fact 6 in binary.

To get the Van der Corput value for this, instead of treating it as the 1’s, 2’s and 4’s digit, we treat it as the 1/2, 1/4 and 1/8’s digit.

$0 * 1/2 + 1 * 1/4 + 1 * 1/8 = 3/8$.

So, sample 6 in the Van der Corput sequence using base 2 is 3/8.

Let’s try sample 21 in base 3.

First we convert 21 to base 3 which is 210. We can verify this is right by seeing that $0 * 1 + 1 * 3 + 2 * 9 = 21$.

Instead of a 1’s, 3’s and 9’s digit, we are going to treat it like a 1/3, 1/9 and 1/27 digit.

$0 * 1/3 + 1 * 1/9 + 2 * 1/27 = 5/27$

So, sample 21 in the Van der Corput sequence using base 3 is 5/27.

Here is the Van der Corput sequence for base 2:

Here it is for base 3:

Base 4:

Base 5:

## Sobol

One dimensional Sobol is actually just the Van der Corput sequence base 2 re-arranged a little bit, but it’s generated differently.

You start with 0 (either using it as sample 0 or sample -1, doesn’t matter which), and for each sample you do this:

1. Calculate the Ruler function value for the current sample’s index(more info in a second)
2. Make the direction vector by shifting 1 left (in binary) 31 – ruler times.
3. XOR the last sample by the direction vector to get the new sample
4. To interpret the sample as a floating point number you divide it by $2^{32}$

That might sound completely different than the Van der Corput sequence but it actually is the same thing – just re-ordered.

In the final step when dividing by $2^{32}$, we are really just interpreting the binary number as a fraction just like before, but it’s the LEFT most digit that is the 1/2 spot, not the RIGHT most digit.

The Ruler Function goes like: 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, …

It’s pretty easy to calculate too. Calculating the ruler function for an index (starting at 1) is just the zero based index of the right most 1’s digit after converting the number to binary.

1 in binary is 001 so Ruler(1) is 0.
2 in binary is 010 so Ruler(2) is 1.
3 in binary is 011 so Ruler(3) is 0.
4 in binary is 100 so Ruler(4) is 2.
5 in binary is 101 so Ruler(5) is 0.
and so on.

Here is 1D Sobol:

# Hammersley

In one dimension, the Hammersley sequence is the same as the base 2 Van der Corput sequence, and in the same order. If that sounds strange that it’s the same, it’s a 2d sequence I broke down into a 1d sequence for comparison. The one thing Hammersley has that makes it unique in the 1d case is that you can truncate bits.

It doesn’t seem that useful for 1d Hammersley to truncate bits but knowing that is useful info too I guess. Look at the 2d version of Hammersley to get a fairer look at it, because it’s meant to be a 2d sequence.

Here is Hammersley:

With 1 bit truncated:

With 2 bits truncated:

# Poisson Disc

Poisson disc points are points which are densely packed, but have a minimum distance from each other.

Computer scientists are still working out good algorithms to generate these points efficiently.

I use “Mitchell’s Best-Candidate” which means that when you want to generate a new point in the sequence, you generate N new points, and choose whichever point is farthest away from the other points you’ve generated so far.

Here it is where N is 100:

# 2 Dimensional Sequences

Next up, let’s look at some 2 dimensional sequences.

## Grid

Below is 2d uniform samples on a grid.

Note that uniform grid is not particularly low discrepancy for the 2d case! More info here: Is it expected that uniform points would have non zero discrepancy?

## Random

Here are 100 random points:

## Uniform Grid + Jitter

Here is a uniform grid that has random jitter applied to the points. Jittered grid is a pretty commonly used low discrepancy sampling technique that has good success.

## Subrandom

Just like in 1 dimensions, you can apply the subrandom ideas to 2 dimensions where you divide the X and Y axis into so many sections, and randomly choose points in the sections.

If you divide X and Y into the same number of sections though, you are going to have a problem because some areas are not going to have any points in them.

@Reedbeta pointed out that instead of using i%x and i%y, that you could use i%x and (i/x)%y to make it pick points in all regions.

Picking different numbers for X and Y can be another way to give good results. Here’s dividing X and Y into 2 and 3 sections respectively:

If you choose co-prime numbers for divisions for each axis you can get maximal period of repeats. 2 and 3 are coprime so the last example is a good example of that, but here is 3 and 11:

Here is 3 and 97. 97 is large enough that with only doing 100 samples, we are almost doing jittered grid on the y axis.

Here is the other subrandom number from 1d, where we start with a random value for X and Y, and then add a random number between 0 and 0.5 to each, also adding 0.5, to make a “random walk” type setup again:

## Halton

The Halton sequence is just the Van der Corput sequence, but using a different base on each axis.

Here is the Halton sequence where X and Y use bases 2 and 3:

Here it is using bases 5 and 7:

Here are bases 13 and 9:

# Irrational Numbers

The irrational numbers technique can be used for 2d as well but I wasn’t able to find out how to make it give decent looking output that didn’t have an obvious diagonal pattern in them. Bart Wronski shared a neat paper that explains how to use the golden ratio in 2d with great success: Golden Ratio Sequences For Low-Discrepancy Sampling

This uses the golden ratio for the X axis and the square root of 2 for the Y axis. Below that is the same, with a random starting point, to make it give a different sequence.

Here X axis uses square root of 2 and Y axis uses square root of 3. Below that is a random starting point, which gives the same discrepancy.

## Hammersley

In 2 dimensions, the Hammersley sequence uses the 1d Hammersley sequence for the X axis: Instead of treating the binary version of the index as binary, you treat it as fractions like you do for Van der Corput and sum up the fractions.

For the Y axis, you just reverse the bits and then do the same!

Here is the Hammersley sequence. Note we would have to take 128 samples (not just the 100 we did) if we wanted it to fill the entire square with samples.

Truncating bits in 2d is a bit useful. Here is 1 bit truncated:

2 bits truncated:

## Poisson Disc

Using the same method we did for 1d, we can generate points in 2d space:

## N Rooks

There is a sampling pattern called N-Rooks where you put N rooks onto a chess board and arrange them such that no two are in the same row or column.

A way to generate these samples is to realize that there will be only one rook per row, and that none of them will ever be in the same column. So, you make an array that has numbers 0 to N-1, and then shuffle the array. The index into the array is the row, and the value in the array is the column.

Here are 100 rooks:

## Sobol

Sobol in two dimensions is more complex to explain so I’ll link you to the source I used: Sobol Sequences Made Simple.

The 1D sobol already covered is used for the X axis, and then something more complex was used for the Y axis:

Bart Wronski has a really great series on a related topic: Dithering in Games

Wikipedia: Low Discrepancy Sequence

Wikipedia: Halton Sequence

Wikipedia: Van der Corput Sequence

Using Fibonacci Sequence To Generate Colors

Deeper info and usage cases for low discrepancy sequences

Poisson-Disc Sampling

Low discrepancy sequences are related to blue noise. Where white noise contains all frequencies evenly, blue noise has more high frequencies and fewer low frequencies. Blue noise is essentially the ultimate in low discrepancy, but can be expensive to compute. Here are some pages on blue noise:

Free Blue Noise Textures

The problem with 3D blue noise

Stippling and Blue Noise

Vegetation placement in “The Witness”

Here are some links from @marc_b_reynolds:
Sobol (low-discrepancy) sequence in 1-3D, stratified in 2-4D.
Classic binary-reflected gray code.
Sobol.h

Weyl Sequence

## Code

#define _CRT_SECURE_NO_WARNINGS

#include <windows.h>  // for bitmap headers and performance counter.  Sorry non windows people!
#include <vector>
#include <stdint.h>
#include <random>
#include <array>
#include <algorithm>
#include <stdlib.h>
#include <set>

typedef uint8_t uint8;

#define NUM_SAMPLES 100  // to simplify some 2d code, this must be a square
#define NUM_SAMPLES_FOR_COLORING 100

// Turning this on will slow things down significantly because it's an O(N^5) operation for 2d!
#define CALCULATE_DISCREPANCY 0

#define IMAGE1D_WIDTH 600
#define IMAGE1D_HEIGHT 50
#define IMAGE2D_WIDTH 300
#define IMAGE2D_HEIGHT 300

#define AXIS_HEIGHT 40
#define DATA_HEIGHT 20
#define DATA_WIDTH 2

#define COLOR_FILL SColor(255,255,255)
#define COLOR_AXIS SColor(0, 0, 0)

//======================================================================================
struct SImageData
{
SImageData ()
: m_width(0)
, m_height(0)
{ }

size_t m_width;
size_t m_height;
size_t m_pitch;
std::vector<uint8> m_pixels;
};

struct SColor
{
SColor (uint8 _R = 0, uint8 _G = 0, uint8 _B = 0)
: R(_R), G(_G), B(_B)
{ }

uint8 B, G, R;
};

//======================================================================================
bool SaveImage (const char *fileName, const SImageData &image)
{
// open the file if we can
FILE *file;
file = fopen(fileName, "wb");
if (!file) {
printf("Could not save %s\n", fileName);
return false;
}

// write the data and close the file
fclose(file);

return true;
}

//======================================================================================
void ImageInit (SImageData& image, size_t width, size_t height)
{
image.m_width = width;
image.m_height = height;
image.m_pitch = 4 * ((width * 24 + 31) / 32);
image.m_pixels.resize(image.m_pitch * image.m_width);
std::fill(image.m_pixels.begin(), image.m_pixels.end(), 0);
}

//======================================================================================
void ImageClear (SImageData& image, const SColor& color)
{
uint8* row = &image.m_pixels[0];
for (size_t rowIndex = 0; rowIndex < image.m_height; ++rowIndex)
{
SColor* pixels = (SColor*)row;
std::fill(pixels, pixels + image.m_width, color);

row += image.m_pitch;
}
}

//======================================================================================
void ImageBox (SImageData& image, size_t x1, size_t x2, size_t y1, size_t y2, const SColor& color)
{
for (size_t y = y1; y < y2; ++y)
{
uint8* row = &image.m_pixels[y * image.m_pitch];
SColor* start = &((SColor*)row)[x1];
std::fill(start, start + x2 - x1, color);
}
}

//======================================================================================
float Distance (float x1, float y1, float x2, float y2)
{
float dx = (x2 - x1);
float dy = (y2 - y1);

return std::sqrtf(dx*dx + dy*dy);
}

//======================================================================================
SColor DataPointColor (size_t sampleIndex)
{
SColor ret;
float percent = (float(sampleIndex) / (float(NUM_SAMPLES_FOR_COLORING) - 1.0f));

ret.R = uint8((1.0f - percent) * 255.0f);
ret.G = 0;
ret.B = uint8(percent * 255.0f);

float mag = (float)sqrt(ret.R*ret.R + ret.G*ret.G + ret.B*ret.B);
ret.R = uint8((float(ret.R) / mag)*255.0f);
ret.G = uint8((float(ret.G) / mag)*255.0f);
ret.B = uint8((float(ret.B) / mag)*255.0f);

return ret;
}

//======================================================================================
float RandomFloat (float min, float max)
{
static std::random_device rd;
static std::mt19937 mt(rd());
std::uniform_real_distribution<float> dist(min, max);
return dist(mt);
}

//======================================================================================
size_t Ruler (size_t n)
{
size_t ret = 0;
while (n != 0 && (n & 1) == 0)
{
n /= 2;
++ret;
}
return ret;
}

//======================================================================================
float CalculateDiscrepancy1D (const std::array<float, NUM_SAMPLES>& samples)
{
// some info about calculating discrepancy
// https://math.stackexchange.com/questions/1681562/how-to-calculate-discrepancy-of-a-sequence

// Calculates the discrepancy of this data.
// Assumes the data is [0,1) for valid sample range
std::array<float, NUM_SAMPLES> sortedSamples = samples;
std::sort(sortedSamples.begin(), sortedSamples.end());

float maxDifference = 0.0f;
for (size_t startIndex = 0; startIndex <= NUM_SAMPLES; ++startIndex)
{
// startIndex 0 = 0.0f.  startIndex 1 = sortedSamples[0]. etc

float startValue = 0.0f;
if (startIndex > 0)
startValue = sortedSamples[startIndex - 1];

for (size_t stopIndex = startIndex; stopIndex <= NUM_SAMPLES; ++stopIndex)
{
// stopIndex 0 = sortedSamples[0].  startIndex[N] = 1.0f. etc

float stopValue = 1.0f;
if (stopIndex < NUM_SAMPLES)
stopValue = sortedSamples[stopIndex];

float length = stopValue - startValue;

// open interval (startValue, stopValue)
size_t countInside = 0;
for (float sample : samples)
{
if (sample > startValue &&
sample < stopValue)
{
++countInside;
}
}
float density = float(countInside) / float(NUM_SAMPLES);
float difference = std::abs(density - length);
if (difference > maxDifference)
maxDifference = difference;

// closed interval [startValue, stopValue]
countInside = 0;
for (float sample : samples)
{
if (sample >= startValue &&
sample <= stopValue)
{
++countInside;
}
}
density = float(countInside) / float(NUM_SAMPLES);
difference = std::abs(density - length);
if (difference > maxDifference)
maxDifference = difference;
}
}
return maxDifference;
}

//======================================================================================
float CalculateDiscrepancy2D (const std::array<std::array<float, 2>, NUM_SAMPLES>& samples)
{
// some info about calculating discrepancy
// https://math.stackexchange.com/questions/1681562/how-to-calculate-discrepancy-of-a-sequence

// Calculates the discrepancy of this data.
// Assumes the data is [0,1) for valid sample range.

// Get the sorted list of unique values on each axis
std::set<float> setSamplesX;
std::set<float> setSamplesY;
for (const std::array<float, 2>& sample : samples)
{
setSamplesX.insert(sample[0]);
setSamplesY.insert(sample[1]);
}
std::vector<float> sortedXSamples;
std::vector<float> sortedYSamples;
sortedXSamples.reserve(setSamplesX.size());
sortedYSamples.reserve(setSamplesY.size());
for (float f : setSamplesX)
sortedXSamples.push_back(f);
for (float f : setSamplesY)
sortedYSamples.push_back(f);

// Get the sorted list of samples on the X axis, for faster interval testing
std::array<std::array<float, 2>, NUM_SAMPLES> sortedSamplesX = samples;
std::sort(sortedSamplesX.begin(), sortedSamplesX.end(),
[] (const std::array<float, 2>& itemA, const std::array<float, 2>& itemB)
{
return itemA[0] < itemB[0];
}
);

// calculate discrepancy
float maxDifference = 0.0f;
for (size_t startIndexY = 0; startIndexY <= sortedYSamples.size(); ++startIndexY)
{
float startValueY = 0.0f;
if (startIndexY > 0)
startValueY = *(sortedYSamples.begin() + startIndexY - 1);

for (size_t startIndexX = 0; startIndexX <= sortedXSamples.size(); ++startIndexX)
{
float startValueX = 0.0f;
if (startIndexX > 0)
startValueX = *(sortedXSamples.begin() + startIndexX - 1);

for (size_t stopIndexY = startIndexY; stopIndexY <= sortedYSamples.size(); ++stopIndexY)
{
float stopValueY = 1.0f;
if (stopIndexY < sortedYSamples.size())
stopValueY = sortedYSamples[stopIndexY];

for (size_t stopIndexX = startIndexX; stopIndexX <= sortedXSamples.size(); ++stopIndexX)
{
float stopValueX = 1.0f;
if (stopIndexX < sortedXSamples.size())
stopValueX = sortedXSamples[stopIndexX];

// calculate area
float length = stopValueX - startValueX;
float height = stopValueY - startValueY;
float area = length * height;

// open interval (startValue, stopValue)
size_t countInside = 0;
for (const std::array<float, 2>& sample : samples)
{
if (sample[0] > startValueX &&
sample[1] > startValueY &&
sample[0] < stopValueX &&
sample[1] < stopValueY)
{
++countInside;
}
}
float density = float(countInside) / float(NUM_SAMPLES);
float difference = std::abs(density - area);
if (difference > maxDifference)
maxDifference = difference;

// closed interval [startValue, stopValue]
countInside = 0;
for (const std::array<float, 2>& sample : samples)
{
if (sample[0] >= startValueX &&
sample[1] >= startValueY &&
sample[0] <= stopValueX &&
sample[1] <= stopValueY)
{
++countInside;
}
}
density = float(countInside) / float(NUM_SAMPLES);
difference = std::abs(density - area);
if (difference > maxDifference)
maxDifference = difference;
}
}
}
}

return maxDifference;
}

//======================================================================================
void Test1D (const char* fileName, const std::array<float, NUM_SAMPLES>& samples)
{
// create and clear the image
SImageData image;

// setup the canvas
ImageClear(image, COLOR_FILL);

// calculate the discrepancy
#if CALCULATE_DISCREPANCY
float discrepancy = CalculateDiscrepancy1D(samples);
printf("%s Discrepancy = %0.2f%%\n", fileName, discrepancy*100.0f);
#endif

// draw the sample points
size_t i = 0;
for (float f: samples)
{
size_t pos = size_t(f * float(IMAGE1D_WIDTH)) + IMAGE_PAD;
ImageBox(image, pos, pos + 1, IMAGE1D_CENTERY - DATA_HEIGHT / 2, IMAGE1D_CENTERY + DATA_HEIGHT / 2, DataPointColor(i));
++i;
}

// draw the axes lines. horizontal first then the two vertical
ImageBox(image, IMAGE_PAD, IMAGE_PAD + 1, IMAGE1D_CENTERY - AXIS_HEIGHT / 2, IMAGE1D_CENTERY + AXIS_HEIGHT / 2, COLOR_AXIS);
ImageBox(image, IMAGE1D_WIDTH + IMAGE_PAD, IMAGE1D_WIDTH + IMAGE_PAD + 1, IMAGE1D_CENTERY - AXIS_HEIGHT / 2, IMAGE1D_CENTERY + AXIS_HEIGHT / 2, COLOR_AXIS);

// save the image
SaveImage(fileName, image);
}

//======================================================================================
void Test2D (const char* fileName, const std::array<std::array<float,2>, NUM_SAMPLES>& samples)
{
// create and clear the image
SImageData image;

// setup the canvas
ImageClear(image, COLOR_FILL);

// calculate the discrepancy
#if CALCULATE_DISCREPANCY
float discrepancy = CalculateDiscrepancy2D(samples);
printf("%s Discrepancy = %0.2f%%\n", fileName, discrepancy*100.0f);
#endif

// draw the sample points
size_t i = 0;
for (const std::array<float, 2>& sample : samples)
{
size_t posx = size_t(sample[0] * float(IMAGE2D_WIDTH)) + IMAGE_PAD;
size_t posy = size_t(sample[1] * float(IMAGE2D_WIDTH)) + IMAGE_PAD;
ImageBox(image, posx - 1, posx + 1, posy - 1, posy + 1, DataPointColor(i));
++i;
}

// horizontal lines

// vertical lines

// save the image
SaveImage(fileName, image);
}

//======================================================================================
void TestUniform1D (bool jitter)
{
// calculate the sample points
const float c_cellSize = 1.0f / float(NUM_SAMPLES+1);
std::array<float, NUM_SAMPLES> samples;
for (size_t i = 0; i < NUM_SAMPLES; ++i)
{
samples[i] = float(i+1) / float(NUM_SAMPLES+1);
if (jitter)
samples[i] += RandomFloat(-c_cellSize*0.5f, c_cellSize*0.5f);
}

// save bitmap etc
if (jitter)
Test1D("1DUniformJitter.bmp", samples);
else
Test1D("1DUniform.bmp", samples);
}

//======================================================================================
void TestUniformRandom1D ()
{
// calculate the sample points
const float c_halfJitter = 1.0f / float((NUM_SAMPLES + 1) * 2);
std::array<float, NUM_SAMPLES> samples;
for (size_t i = 0; i < NUM_SAMPLES; ++i)
samples[i] = RandomFloat(0.0f, 1.0f);

// save bitmap etc
Test1D("1DUniformRandom.bmp", samples);
}

//======================================================================================
void TestSubRandomA1D (size_t numRegions)
{
const float c_randomRange = 1.0f / float(numRegions);

// calculate the sample points
const float c_halfJitter = 1.0f / float((NUM_SAMPLES + 1) * 2);
std::array<float, NUM_SAMPLES> samples;
for (size_t i = 0; i < NUM_SAMPLES; ++i)
{
samples[i] = RandomFloat(0.0f, c_randomRange);
samples[i] += float(i % numRegions) / float(numRegions);
}

// save bitmap etc
char fileName[256];
sprintf(fileName, "1DSubRandomA_%zu.bmp", numRegions);
Test1D(fileName, samples);
}

//======================================================================================
void TestSubRandomB1D ()
{
// calculate the sample points
std::array<float, NUM_SAMPLES> samples;
float sample = RandomFloat(0.0f, 0.5f);
for (size_t i = 0; i < NUM_SAMPLES; ++i)
{
sample = std::fmodf(sample + 0.5f + RandomFloat(0.0f, 0.5f), 1.0f);
samples[i] = sample;
}

// save bitmap etc
Test1D("1DSubRandomB.bmp", samples);
}

//======================================================================================
void TestVanDerCorput (size_t base)
{
// calculate the sample points
std::array<float, NUM_SAMPLES> samples;
for (size_t i = 0; i < NUM_SAMPLES; ++i)
{
samples[i] = 0.0f;
float denominator = float(base);
size_t n = i;
while (n > 0)
{
size_t multiplier = n % base;
samples[i] += float(multiplier) / denominator;
n = n / base;
denominator *= base;
}
}

// save bitmap etc
char fileName[256];
sprintf(fileName, "1DVanDerCorput_%zu.bmp", base);
Test1D(fileName, samples);
}

//======================================================================================
void TestIrrational1D (float irrational, float seed)
{
// calculate the sample points
std::array<float, NUM_SAMPLES> samples;
float sample = seed;
for (size_t i = 0; i < NUM_SAMPLES; ++i)
{
sample = std::fmodf(sample + irrational, 1.0f);
samples[i] = sample;
}

// save bitmap etc
char irrationalStr[256];
sprintf(irrationalStr, "%f", irrational);
char seedStr[256];
sprintf(seedStr, "%f", seed);
char fileName[256];
sprintf(fileName, "1DIrrational_%s_%s.bmp", &irrationalStr[2], &seedStr[2]);
Test1D(fileName, samples);
}

//======================================================================================
void TestSobol1D ()
{
// calculate the sample points
std::array<float, NUM_SAMPLES> samples;
size_t sampleInt = 0;
for (size_t i = 0; i < NUM_SAMPLES; ++i)
{
size_t ruler = Ruler(i + 1);
size_t direction = size_t(size_t(1) << size_t(31 - ruler));
sampleInt = sampleInt ^ direction;
samples[i] = float(sampleInt) / std::pow(2.0f, 32.0f);
}

// save bitmap etc
Test1D("1DSobol.bmp", samples);
}

//======================================================================================
void TestHammersley1D (size_t truncateBits)
{
// calculate the sample points
std::array<float, NUM_SAMPLES> samples;
size_t sampleInt = 0;
for (size_t i = 0; i < NUM_SAMPLES; ++i)
{
size_t n = i >> truncateBits;
float base = 1.0f / 2.0f;
samples[i] = 0.0f;
while (n)
{
if (n & 1)
samples[i] += base;
n /= 2;
base /= 2.0f;
}
}

// save bitmap etc
char fileName[256];
sprintf(fileName, "1DHammersley_%zu.bmp", truncateBits);
Test1D(fileName, samples);
}

//======================================================================================
float MinimumDistance1D (const std::array<float, NUM_SAMPLES>& samples, size_t numSamples, float x)
{
// Used by poisson.
// This returns the minimum distance that point (x) is away from the sample points, from [0, numSamples).
float minimumDistance = 0.0f;
for (size_t i = 0; i < numSamples; ++i)
{
float distance = std::abs(samples[i] - x);
if (i == 0 || distance < minimumDistance)
minimumDistance = distance;
}
return minimumDistance;
}

//======================================================================================
void TestPoisson1D ()
{
// every time we want to place a point, we generate this many points and choose the one farthest away from all the other points (largest minimum distance)
const size_t c_bestOfAttempts = 100;

// calculate the sample points
std::array<float, NUM_SAMPLES> samples;
for (size_t sampleIndex = 0; sampleIndex < NUM_SAMPLES; ++sampleIndex)
{
// generate some random points and keep the one that has the largest minimum distance from any of the existing points
float bestX = 0.0f;
float bestMinDistance = 0.0f;
for (size_t attempt = 0; attempt < c_bestOfAttempts; ++attempt)
{
float attemptX = RandomFloat(0.0f, 1.0f);
float minDistance = MinimumDistance1D(samples, sampleIndex, attemptX);

if (minDistance > bestMinDistance)
{
bestX = attemptX;
bestMinDistance = minDistance;
}
}
samples[sampleIndex] = bestX;
}

// save bitmap etc
Test1D("1DPoisson.bmp", samples);
}

//======================================================================================
void TestUniform2D (bool jitter)
{
// calculate the sample points
std::array<std::array<float, 2>, NUM_SAMPLES> samples;
const size_t c_oneSide = size_t(std::sqrt(NUM_SAMPLES));
const float c_cellSize = 1.0f / float(c_oneSide+1);
for (size_t iy = 0; iy < c_oneSide; ++iy)
{
for (size_t ix = 0; ix < c_oneSide; ++ix)
{
size_t sampleIndex = iy * c_oneSide + ix;

samples[sampleIndex][0] = float(ix + 1) / (float(c_oneSide + 1));
if (jitter)
samples[sampleIndex][0] += RandomFloat(-c_cellSize*0.5f, c_cellSize*0.5f);

samples[sampleIndex][1] = float(iy + 1) / (float(c_oneSide) + 1.0f);
if (jitter)
samples[sampleIndex][1] += RandomFloat(-c_cellSize*0.5f, c_cellSize*0.5f);
}
}

// save bitmap etc
if (jitter)
Test2D("2DUniformJitter.bmp", samples);
else
Test2D("2DUniform.bmp", samples);
}

//======================================================================================
void TestUniformRandom2D ()
{
// calculate the sample points
std::array<std::array<float, 2>, NUM_SAMPLES> samples;
const size_t c_oneSide = size_t(std::sqrt(NUM_SAMPLES));
const float c_halfJitter = 1.0f / float((c_oneSide + 1) * 2);
for (size_t i = 0; i < NUM_SAMPLES; ++i)
{
samples[i][0] = RandomFloat(0.0f, 1.0f);
samples[i][1] = RandomFloat(0.0f, 1.0f);
}

// save bitmap etc
Test2D("2DUniformRandom.bmp", samples);
}

//======================================================================================
void TestSubRandomA2D (size_t regionsX, size_t regionsY)
{
const float c_randomRangeX = 1.0f / float(regionsX);
const float c_randomRangeY = 1.0f / float(regionsY);

// calculate the sample points
std::array<std::array<float, 2>, NUM_SAMPLES> samples;
for (size_t i = 0; i < NUM_SAMPLES; ++i)
{
samples[i][0] = RandomFloat(0.0f, c_randomRangeX);
samples[i][0] += float(i % regionsX) / float(regionsX);

samples[i][1] = RandomFloat(0.0f, c_randomRangeY);
samples[i][1] += float(i % regionsY) / float(regionsY);
}

// save bitmap etc
char fileName[256];
sprintf(fileName, "2DSubRandomA_%zu_%zu.bmp", regionsX, regionsY);
Test2D(fileName, samples);
}

//======================================================================================
void TestSubRandomB2D ()
{
// calculate the sample points
float samplex = RandomFloat(0.0f, 0.5f);
float sampley = RandomFloat(0.0f, 0.5f);
std::array<std::array<float, 2>, NUM_SAMPLES> samples;
for (size_t i = 0; i < NUM_SAMPLES; ++i)
{
samplex = std::fmodf(samplex + 0.5f + RandomFloat(0.0f, 0.5f), 1.0f);
sampley = std::fmodf(sampley + 0.5f + RandomFloat(0.0f, 0.5f), 1.0f);
samples[i][0] = samplex;
samples[i][1] = sampley;
}

// save bitmap etc
Test2D("2DSubRandomB.bmp", samples);
}

//======================================================================================
void TestHalton (size_t basex, size_t basey)
{
// calculate the sample points
std::array<std::array<float, 2>, NUM_SAMPLES> samples;
const size_t c_oneSide = size_t(std::sqrt(NUM_SAMPLES));
const float c_halfJitter = 1.0f / float((c_oneSide + 1) * 2);
for (size_t i = 0; i < NUM_SAMPLES; ++i)
{
// x axis
samples[i][0] = 0.0f;
{
float denominator = float(basex);
size_t n = i;
while (n > 0)
{
size_t multiplier = n % basex;
samples[i][0] += float(multiplier) / denominator;
n = n / basex;
denominator *= basex;
}
}

// y axis
samples[i][1] = 0.0f;
{
float denominator = float(basey);
size_t n = i;
while (n > 0)
{
size_t multiplier = n % basey;
samples[i][1] += float(multiplier) / denominator;
n = n / basey;
denominator *= basey;
}
}
}

// save bitmap etc
char fileName[256];
sprintf(fileName, "2DHalton_%zu_%zu.bmp", basex, basey);
Test2D(fileName, samples);
}

//======================================================================================
void TestSobol2D ()
{
// calculate the sample points

// x axis
std::array<std::array<float, 2>, NUM_SAMPLES> samples;
size_t sampleInt = 0;
for (size_t i = 0; i < NUM_SAMPLES; ++i)
{
size_t ruler = Ruler(i + 1);
size_t direction = size_t(size_t(1) << size_t(31 - ruler));
sampleInt = sampleInt ^ direction;
samples[i][0] = float(sampleInt) / std::pow(2.0f, 32.0f);
}

// y axis
// uses numbers: new-joe-kuo-6.21201

// Direction numbers
std::vector<size_t> V;
V.resize((size_t)ceil(log((double)NUM_SAMPLES) / log(2.0)));
V[0] = size_t(1) << size_t(31);
for (size_t i = 1; i < V.size(); ++i)
V[i] = V[i - 1] ^ (V[i - 1] >> 1);

// Samples
sampleInt = 0;
for (size_t i = 0; i < NUM_SAMPLES; ++i) {
size_t ruler = Ruler(i + 1);
sampleInt = sampleInt ^ V[ruler];
samples[i][1] = float(sampleInt) / std::pow(2.0f, 32.0f);
}

// save bitmap etc
Test2D("2DSobol.bmp", samples);
}

//======================================================================================
void TestHammersley2D (size_t truncateBits)
{
// figure out how many bits we are working in.
size_t value = 1;
size_t numBits = 0;
while (value < NUM_SAMPLES)
{
value *= 2;
++numBits;
}

// calculate the sample points
std::array<std::array<float, 2>, NUM_SAMPLES> samples;
size_t sampleInt = 0;
for (size_t i = 0; i < NUM_SAMPLES; ++i)
{
// x axis
samples[i][0] = 0.0f;
{
size_t n = i >> truncateBits;
float base = 1.0f / 2.0f;
while (n)
{
if (n & 1)
samples[i][0] += base;
n /= 2;
base /= 2.0f;
}
}

// y axis
samples[i][1] = 0.0f;
{
size_t n = i >> truncateBits;
size_t mask = size_t(1) << (numBits - 1 - truncateBits);

float base = 1.0f / 2.0f;
{
samples[i][1] += base;
base /= 2.0f;
}
}
}

// save bitmap etc
char fileName[256];
sprintf(fileName, "2DHammersley_%zu.bmp", truncateBits);
Test2D(fileName, samples);
}

//======================================================================================
void TestRooks2D ()
{
// make and shuffle rook positions
std::random_device rd;
std::mt19937 mt(rd());
std::array<size_t, NUM_SAMPLES> rookPositions;
for (size_t i = 0; i < NUM_SAMPLES; ++i)
rookPositions[i] = i;
std::shuffle(rookPositions.begin(), rookPositions.end(), mt);

// calculate the sample points
std::array<std::array<float, 2>, NUM_SAMPLES> samples;
for (size_t i = 0; i < NUM_SAMPLES; ++i)
{
// x axis
samples[i][0] = float(rookPositions[i]) / float(NUM_SAMPLES-1);

// y axis
samples[i][1] = float(i) / float(NUM_SAMPLES - 1);
}

// save bitmap etc
Test2D("2DRooks.bmp", samples);
}

//======================================================================================
void TestIrrational2D (float irrationalx, float irrationaly, float seedx, float seedy)
{
// calculate the sample points
std::array<std::array<float, 2>, NUM_SAMPLES> samples;
float samplex = seedx;
float sampley = seedy;
for (size_t i = 0; i < NUM_SAMPLES; ++i)
{
samplex = std::fmodf(samplex + irrationalx, 1.0f);
sampley = std::fmodf(sampley + irrationaly, 1.0f);

samples[i][0] = samplex;
samples[i][1] = sampley;
}

// save bitmap etc
char irrationalxStr[256];
sprintf(irrationalxStr, "%f", irrationalx);
char irrationalyStr[256];
sprintf(irrationalyStr, "%f", irrationaly);
char seedxStr[256];
sprintf(seedxStr, "%f", seedx);
char seedyStr[256];
sprintf(seedyStr, "%f", seedy);
char fileName[256];
sprintf(fileName, "2DIrrational_%s_%s_%s_%s.bmp", &irrationalxStr[2], &irrationalyStr[2], &seedxStr[2], &seedyStr[2]);
Test2D(fileName, samples);
}

//======================================================================================
float MinimumDistance2D (const std::array<std::array<float, 2>, NUM_SAMPLES>& samples, size_t numSamples, float x, float y)
{
// Used by poisson.
// This returns the minimum distance that point (x,y) is away from the sample points, from [0, numSamples).
float minimumDistance = 0.0f;
for (size_t i = 0; i < numSamples; ++i)
{
float distance = Distance(samples[i][0], samples[i][1], x, y);
if (i == 0 || distance < minimumDistance)
minimumDistance = distance;
}
return minimumDistance;
}

//======================================================================================
void TestPoisson2D ()
{
// every time we want to place a point, we generate this many points and choose the one farthest away from all the other points (largest minimum distance)
const size_t c_bestOfAttempts = 100;

// calculate the sample points
std::array<std::array<float, 2>, NUM_SAMPLES> samples;
for (size_t sampleIndex = 0; sampleIndex < NUM_SAMPLES; ++sampleIndex)
{
// generate some random points and keep the one that has the largest minimum distance from any of the existing points
float bestX = 0.0f;
float bestY = 0.0f;
float bestMinDistance = 0.0f;
for (size_t attempt = 0; attempt < c_bestOfAttempts; ++attempt)
{
float attemptX = RandomFloat(0.0f, 1.0f);
float attemptY = RandomFloat(0.0f, 1.0f);
float minDistance = MinimumDistance2D(samples, sampleIndex, attemptX, attemptY);

if (minDistance > bestMinDistance)
{
bestX = attemptX;
bestY = attemptY;
bestMinDistance = minDistance;
}
}
samples[sampleIndex][0] = bestX;
samples[sampleIndex][1] = bestY;
}

// save bitmap etc
Test2D("2DPoisson.bmp", samples);
}

//======================================================================================
int main (int argc, char **argv)
{
// 1D tests
{
TestUniform1D(false);
TestUniform1D(true);

TestUniformRandom1D();

TestSubRandomA1D(2);
TestSubRandomA1D(4);
TestSubRandomA1D(8);
TestSubRandomA1D(16);
TestSubRandomA1D(32);

TestSubRandomB1D();

TestVanDerCorput(2);
TestVanDerCorput(3);
TestVanDerCorput(4);
TestVanDerCorput(5);

// golden ratio mod 1 aka (sqrt(5) - 1)/2
TestIrrational1D(0.618034f, 0.0f);
TestIrrational1D(0.618034f, 0.385180f);
TestIrrational1D(0.618034f, 0.775719f);
TestIrrational1D(0.618034f, 0.287194f);

// sqrt(2) - 1
TestIrrational1D(0.414214f, 0.0f);
TestIrrational1D(0.414214f, 0.385180f);
TestIrrational1D(0.414214f, 0.775719f);
TestIrrational1D(0.414214f, 0.287194f);

// PI mod 1
TestIrrational1D(0.141593f, 0.0f);
TestIrrational1D(0.141593f, 0.385180f);
TestIrrational1D(0.141593f, 0.775719f);
TestIrrational1D(0.141593f, 0.287194f);

TestSobol1D();

TestHammersley1D(0);
TestHammersley1D(1);
TestHammersley1D(2);

TestPoisson1D();
}

// 2D tests
{
TestUniform2D(false);
TestUniform2D(true);

TestUniformRandom2D();

TestSubRandomA2D(2, 2);
TestSubRandomA2D(2, 3);
TestSubRandomA2D(3, 11);
TestSubRandomA2D(3, 97);

TestSubRandomB2D();

TestHalton(2, 3);
TestHalton(5, 7);
TestHalton(13, 9);

TestSobol2D();

TestHammersley2D(0);
TestHammersley2D(1);
TestHammersley2D(2);

TestRooks2D();

// X axis = golden ratio mod 1 aka (sqrt(5)-1)/2
// Y axis = sqrt(2) mod 1
TestIrrational2D(0.618034f, 0.414214f, 0.0f, 0.0f);
TestIrrational2D(0.618034f, 0.414214f, 0.775719f, 0.264045f);

// X axis = sqrt(2) mod 1
// Y axis = sqrt(3) mod 1
TestIrrational2D(std::fmodf((float)std::sqrt(2.0f), 1.0f), std::fmodf((float)std::sqrt(3.0f), 1.0f), 0.0f, 0.0f);
TestIrrational2D(std::fmodf((float)std::sqrt(2.0f), 1.0f), std::fmodf((float)std::sqrt(3.0f), 1.0f), 0.775719f, 0.264045f);

TestPoisson2D();
}

#if CALCULATE_DISCREPANCY
printf("\n");
system("pause");
#endif
}