# Basic Methods For Finding Zeroes and Mins / Maxes of Functions

There are ~500 lines of simple standalone C++ code that go with this post, which generated all the data used and implement the things talked about. You can find it at: https://github.com/Atrix256/BasicRootFinding

Simple equations can be solved using algebra without too much fuss:

$3x-6=0 \\ 3x=6 \\ x=2$

You can even solve more complicated equations by cleverly using identities, the quadratic equation, and other mathematical tools you might have in your tool belt.

Some equations are beyond our ability to solve with simple algebra though, like high degree polynomials:

$3x^5+2x^2-20x+2=0$

We can still find where functions like that equal zero, but we can’t do it analytically; we have to do it numerically.

That is, we do things like roll a ball down hill and hope we find a zero.

(Apparently there are some analytical solutions to be had from that function. I don’t know how to get them though, so would have to resort to numerical methods!)

This post talks about a few methods for finding zeroes of equations we can’t solve analytically, and talks about how to use those methods to find minimum and maximum values of functions as well.

# The Basic Idea

Let’s say I tell you that the value of a function at a specific point is “3” and that for every step to the right you take, it subtracts 1 from that value.

Then I ask: “Where does this function equal zero?”

If you answered “three steps to the right” you are correct, and guess what – you’ve just used Newton’s method for root finding.

There is an asterisk here though. The slope (how the function value changes as you take a step to the right, also called the derivative) was constant in the above example, but it is not going to be constant in the functions we want to solve. That slope is going to change depending on where you are on the graph.

To deal with this, Newton’s method takes the step it thinks it should take, and then asks again for the value and slope (derivative) and takes another step.

The hope is that with a good initial guess, Newton’s method will find a zero without too much effort.

We’re going to get a little more formal, while also showing some results, but all the rest of what we are going to talk about is based on Newton’s method (or is even simpler).

# Newton’s Method

We’ve seen this informally, but formally, a single step of Newton’s method looks like this:

$x = x - \frac{f(x)}{f'(x)}$

That is, we divide the y value at our current point by the derivative at our current point, and subtract that from our current x to get our new x.

Bullet Points:

• Requires f(x) and f'(x)
• Requires a “good guess” for an initial x value
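As a sketch in C++ (the function name, iteration cap and tolerance are my own choices, not from the post's code), a Newton's method loop might look like this:

```cpp
#include <cmath>
#include <functional>

// Iterate Newton's method: x = x - f(x)/f'(x), until |f(x)| is small enough
// or we run out of steps. Names and defaults are mine, not the repo's.
float NewtonRoot(std::function<float(float)> f, std::function<float(float)> fPrime,
                 float guess, int maxSteps = 20, float tolerance = 0.0001f)
{
    float x = guess;
    for (int i = 0; i < maxSteps && std::abs(f(x)) > tolerance; ++i)
        x = x - f(x) / fPrime(x);
    return x;
}
```

With f(x) = x^2 - 1 and a guess of 2, this homes in on the root at x = 1 in a handful of steps.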

# Secant Method

What if you don’t have the analytical first derivative to your function and still want to use Newton’s Method?

Well, you can use finite differences to get the first derivative: move the x over a little bit, see how the y value changes over distance, and use that as your numerical derivative, instead of the analytical one. (A blog post of mine on finite differences: https://blog.demofox.org/2015/08/02/finite-differences/)

If you do that, you are now using the Secant Method.

$m = \frac{f(x+e)-f(x-e)}{2e}$

$x = x - \frac{f(x)}{m}$

The above uses “central differences” to calculate the first derivative numerically. e is a tunable parameter, but should be small (without triggering numerical issues) – perhaps 0.01.

Bullet Points:

• Requires f(x)
• Requires a “good guess” for an initial x value
• Converges at a rate of 1.618… (the golden ratio!) so it is slower than Newton’s method, but can be handy if you don’t have the first derivative.
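Sketched in C++ (again, the names and default parameters are my own choices):

```cpp
#include <cmath>
#include <functional>

// Newton's step, but with a central-difference slope standing in for f'(x),
// as described above. e is the finite difference step size.
float SecantRoot(std::function<float(float)> f, float guess,
                 int maxSteps = 30, float e = 0.01f, float tolerance = 0.0001f)
{
    float x = guess;
    for (int i = 0; i < maxSteps && std::abs(f(x)) > tolerance; ++i)
    {
        float m = (f(x + e) - f(x - e)) / (2.0f * e); // central differences
        x = x - f(x) / m;
    }
    return x;
}
```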

# Halley’s Method (& Householder’s method)

If the first derivative helped us find a zero, surely the second derivative can help us find it even faster, right?

Yep. If you generalize Newton’s method to using the second derivative, you get Halley’s method.

$x = x - \frac{2f(x)f'(x)}{2f'(x)^2-f(x)f''(x)}$

Bullet Points:

• Requires f(x), f'(x) and f''(x). You can get the derivatives numerically, but it will slow down convergence.
• Requires a “good guess” for an initial x value
• Converges cubically (fast!!)

Both Newton and Halley are part of a larger family of methods called “Householder’s Method” which generalizes Newton’s method to any order derivative.
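A single Halley step could be sketched like this (the function name is mine):

```cpp
// One step of Halley's method: x = x - 2*f*f' / (2*f'^2 - f*f''),
// given the function value and its first and second derivatives at x.
float HalleyStep(float x, float f, float fPrime, float fPrime2)
{
    return x - (2.0f * f * fPrime) / (2.0f * fPrime * fPrime - f * fPrime2);
}
```

Iterating it in a loop, just like the Newton step, is all it takes to use it as a root finder.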

# Bisection

Bisection is simpler than the other methods. You start with a left x and a right x, with the requirement that the signs of the y values for these x’s have to be different (one positive, one negative) which means that a zero must be between the two x’s.

The algorithm itself is just a binary search, where you look at the value in the middle of the left and right x, and update the left or right x to be the middle, based on the sign of the y value at the middle.

Bullet Points:

• Requires f(x)
• Requires a min x and a max x that contains a zero between them, and the y values at those x’s have to have opposite signs
• Converges Linearly (slow!)
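A sketch of that binary search (names and step count are my own choices):

```cpp
#include <functional>

// Bisection: assumes f(minX) and f(maxX) have opposite signs, so a zero
// lies between them. Each step halves the interval containing the zero.
float BisectRoot(std::function<float(float)> f, float minX, float maxX, int steps = 50)
{
    for (int i = 0; i < steps; ++i)
    {
        float midX = (minX + maxX) * 0.5f;
        // keep whichever half still straddles the zero
        if ((f(minX) < 0.0f) == (f(midX) < 0.0f))
            minX = midX;
        else
            maxX = midX;
    }
    return (minX + maxX) * 0.5f;
}
```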

# Experimental Results: y = x^2 – 1

Here’s the first equation: $y=x^2-1$

You can see visually that there’s a zero at x=-1 and at x=1.

Here’s how the various methods did at finding the roots. The x axis is how many steps were taken, and the y axis is the absolute value of y at that step, which also happens to be the error, since we are finding zeros.

Halley is the winner, which is unsurprising with its cubic convergence rate, then Newton with its quadratic convergence rate, then secant with its golden ratio convergence rate, and bisect comes in last after a strong start, with its linear convergence rate. Our experimental results match the theory, neat!

The values in the legend next to the result name specify what parameters were used. For Newton and Halley, the value shown is the initial guess. For secant, x is the initial guess and px is the previous initial guess. For bisect it shows the range given to the bisection algorithm.

Choosing different parameters for those can drastically affect performance of the algorithms. Here are some less well chosen values shown along with the previous results. The previous results are the blue shades.

Bisection is now so bad that it dwarfs everything else. Let’s remove that.

Removing bisection, the worst data point dropped from 2500 to 100 for all the “more poorly chosen parameters”, which are orange, yellow and green. The previous values are the blues and the grey line, which look basically flat.

With these worse parameters we are still seeing Halley winning, Newton coming in second, secant coming next, and then bisection, but we are seeing much worse performance. These parameters are important for getting good performance!

# Experimental Results: y = sin(x)

Here’s the next equation: $y=\sin(x)$

Here is the convergence for each technique:

You can’t see it, but the orange, grey and blue are all basically overlapping. Looking at the actual data below though, you can see that the winners are in the same order again.

# Experimental Results: y = x^2-x-1

This is a fun equation, because a zero is at the golden ratio (and the other is at -1/goldenRatio): $y=x^2-x-1$

Here is the convergence graph:

An interesting thing happens here if we choose to start at the value “0.5” for Newton and Halley. The dark blue line for Newton is hidden under the grey line for Halley, which is flat and doesn’t go up and down. The reason this happens is that the first derivative is 0 at x=0.5, so the algorithm doesn’t know which direction to go and gets stuck.

Using other initial guess values keeps it from getting stuck, but you can see that Newton and secant starting at 0.75 have a pretty big jump in error in the beginning.

# Error Case: x^2+1

For the next equation we have: $y=x^2+1$

This graph has no zeroes. Our bisect bounds also can’t be satisfied, since one y value is supposed to be negative and the other positive, but this graph is positive everywhere. That means we don’t have any results for bisect on this one.

Here’s the convergence graph:

Newton goes absolutely off the rails for a few samples. Let’s remove it to see what the others are doing:

Secant goes real bad at the end, so let’s remove it to see that Halley is pretty well behaved despite there not actually being a root to find:

# Ray vs Sphere

Let’s do something more interesting… let’s use these methods to find where a ray intersects a sphere. We can do this analytically (there is a formula to calculate this!) but we can use this as a simpler example of a more complex usage case.

We need a formula that is zero where a ray hits a sphere, and ideally we need to be able to calculate its first and second derivative analytically.

I fought with this for a bit but came up with a solution.

Here are the variables involved:

• rayPos and rayDir, both are vec3’s
• spherePos which is a vec3 and a sphereRadius which is a scalar
• t – how far down the ray we are, and is a scalar

We are going to make the sphere center be the origin by subtracting it out of the ray position (then we don’t need spherePos anymore), so then all we have to find is where the magnitude of the vector from the origin to the ray position at time t is the sphere radius. To simplify calculating the derivatives, we are going to square both sides of the equation.

$f(t) = MagnitudeSquared(rayPos + rayDir * t) - sphereRadius^2$

If we can find where that equation is zero, that will tell us the time t that the ray hits the sphere, and we can get the actual intersected position (then normal, etc) from that.

For a 3d vector, magnitude squared is just the below, which is the distance equation without the square root.

$MagnitudeSquared(P) = P.x^2+P.y^2+P.z^2$

If you expand f(t) out for just the x axis to get the pattern per axis, you get:

$rayPos.x^2+2*rayPos.x*rayDir.x*t+rayDir.x^2*t^2 - sphereRadius^2$

You would add a similar thing for y and z – but not the sphere radius part, because it’s already handled above.

We can take the derivative of the above with respect to t, to get:

$2*rayPos.x*rayDir.x+2*rayDir.x^2*t$

So the first derivative is that equation, but also adding in the y and z axes:

$f'(t) = 2*rayPos.x*rayDir.x+2*rayDir.x^2*t + 2*rayPos.y*rayDir.y+2*rayDir.y^2*t + 2*rayPos.z*rayDir.z+2*rayDir.z^2*t$

That looks a bit like spaghetti but is way cleaner in code with a loop iteration for each axis 😛

We can then take the derivative of that first derivative to get the second derivative (still, with respect to t). That gives us this:

$f''(t) = 2*rayDir.x^2 + 2*rayDir.y^2 + 2*rayDir.z^2$
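Putting those three equations into code might look like this (the vec3 type and function names are my own stand-ins for the repo's code), with the sphere already translated to the origin, so rayPos here is the original ray position minus the sphere position:

```cpp
#include <array>

using vec3 = std::array<float, 3>;

// f(t) = MagnitudeSquared(rayPos + rayDir * t) - sphereRadius^2
float F(const vec3& rayPos, const vec3& rayDir, float sphereRadius, float t)
{
    float ret = -sphereRadius * sphereRadius;
    for (int axis = 0; axis < 3; ++axis)
    {
        float p = rayPos[axis] + rayDir[axis] * t;
        ret += p * p;
    }
    return ret;
}

// f'(t): per axis, 2*rayPos*rayDir + 2*rayDir^2*t, summed over x, y, z
float FPrime(const vec3& rayPos, const vec3& rayDir, float t)
{
    float ret = 0.0f;
    for (int axis = 0; axis < 3; ++axis)
        ret += 2.0f * rayPos[axis] * rayDir[axis] + 2.0f * rayDir[axis] * rayDir[axis] * t;
    return ret;
}

// f''(t): per axis, 2*rayDir^2, summed over x, y, z
float FPrime2(const vec3& rayDir)
{
    float ret = 0.0f;
    for (int axis = 0; axis < 3; ++axis)
        ret += 2.0f * rayDir[axis] * rayDir[axis];
    return ret;
}
```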

Ok cool, now that we have our equations, I used (0,0,0) for the ray position, (0,0,1) for the ray direction, (0.2, 0.3, 5.0) for the sphere position, and a sphere radius of 2.0.

Let’s see how our methods did at finding the intersection time of the ray vs the sphere.

Once again, Halley wins, newton comes in second, then secant, then bisect. Neat!

# Finding Function Mins & Maxes

For being one of the headlines of the blog post title, this is actually a really simple thing to explain now that you know the rest.

The minimum and maximum of a function – by definition! – has a derivative (slope) of zero, because it’s right where the function goes flat. The function has either climbed to the top of a mountain, gone flat, and is heading back down, or it has gone to the bottom of a valley, gone flat, and is heading back up. There is actually also a third option, called a saddle point, where the function pauses its ascent or descent momentarily and then continues.

In any case, to find the minimum or maximum of a function, we need to find where its derivative is zero. We do this by doing root finding on its first derivative.

From there, we can plug that x value found into the original f(x) function to get our extrema value.

The code that goes with this blog post uses this technique to find the maximum value for the function $y=-x^2+x+1$.

It finds that the derivative ($y=-2x+1$) is zero at x=0.5. Plugging 0.5 into the original function for x gives us a value of 1.25. That does look like the correct x and y value for the maximum of the function, doesn’t it?
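That worked example could be sketched like this, running Newton's method on the derivative (the function name is mine; since the derivative is linear here, it converges immediately):

```cpp
// Find the x of the extremum of y = -x^2 + x + 1 by root finding on the
// derivative y' = -2x + 1, whose own derivative (f'' of the original) is -2.
float FindMaxX()
{
    float x = 0.0f; // initial guess
    for (int i = 0; i < 10; ++i)
        x = x - (-2.0f * x + 1.0f) / -2.0f;  // Newton step applied to f'(x)
    return x;
}
```

Plugging the returned x back into -x*x + x + 1 gives the maximum value of 1.25.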

The code that goes with this blog post can be found at: https://github.com/Atrix256/BasicRootFinding

A good 10 minute video on Newton’s Method

Here’s a way to get an analytic derivative of a function without having to do it symbolically. Dual numbers are a form of automatic differentiation, just like backpropagation is.
https://blog.demofox.org/2014/12/30/dual-numbers-automatic-differentiation/

You can extend this stuff to higher dimensions using the gradient instead of the derivative, and maybe a Jacobian matrix or Hessian matrix for higher derivatives. That might make a good future blog post.

We actually do something similar to Newton’s method when doing sphere tracing (ray marching with a distance estimate). We divide the distance value at a point in space (the y value, or f(x)) by the length of the gradient (which is basically the slope aka generalized derivative) to know that a surface can’t be any closer than that value, so we can step that length in whatever direction our ray wants to move. This also has applications to drawing 2d graphics. You can read about that here:
https://www.iquilezles.org/www/articles/distance/distance.htm

# Using White Noise to Choose Between Red Noise and Blue Noise

There is source code that goes with this post, which generated all the images and did all the tests. You can find it at https://github.com/Atrix256/DFTRandomFibonacci

I recently saw a really cool video from @Numberphile which mixed some of my very favorite things: Fibonacci numbers (aka the golden ratio), red noise and blue noise.

The link to the video is below and is very much worth watching.
“Random Fibonacci Numbers”

It had me thinking though… they are using a coin flip (white noise) to determine if they should add the last two numbers (red noise / low pass filter) or subtract them (blue noise / high pass filter).

I was curious what the DFT of that would look like, which would show what sort of frequency content that number sequence had.

BTW I have a post talking about dice, probability distributions and noise color here that is a bit related: https://blog.demofox.org/2019/07/30/dice-distributions-noise-colors/

# White Noise

Just to prime things, here are 100 uniform white noise values on the number line, and the DFT of those points. The points clump together and leave holes, and the frequency spectrum shows all frequencies.

Here is the average DFT of 100,000 such point sets. The average flattens out and the grey lines showing 1 standard deviation are flat as well.

# Regular Fibonacci Numbers

The first 90 Fibonacci numbers look like the below on the number line:

And here’s their DFT, where 0hz (DC) is in the middle:

Nothing super interesting really.

# Randomized Fibonacci Numbers

Here are 90 randomized Fibonacci numbers on the number line, the DFT, and the average DFT of 100,000 such number sets.

It’s interesting to see that the individual randomized Fibonacci sequence has strong low frequency content, but other frequencies too, while the average DFT of the number sets shows only low frequency content.

I think what’s going on here is that since the numbers start out small and grow larger over time, they will always start out clumped together (red noise). Then, as the numbers grow larger, the coin flips (white noise) give each set of numbers different frequency content. This means that the only thing common among the sets is low frequency content, and the rest is just white noise that averages out to be flat.

Maybe not that interesting of a result, but it’s an answer at least 😛

# Prime Numbers

I got a tweet from Tariq wondering what DFTing the prime numbers would look like.

Here are the first 25 on the numberline, and the DFT:

Here are the first 100:

Here are the first 200:

Here are the first 1000:

I don’t really see much of a pattern, but I guess if someone did, they would have gotten some prize by now? 🙂

# “The next coin flip has to be tails!”

If you saw a fair coin flipped and 10 heads came up in a row, you’d probably either think that the coin was a 2 headed coin, or that the next flip HAD to be tails.

In the code that goes with this post (https://github.com/Atrix256/DFTRandomFibonacci check out DoCoinTossTest()), I have a random number generator generate bits until there are 10 ones in a row, and then count how many times the next random bit is a one again.
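The test could be sketched like this (my own version, not necessarily matching DoCoinTossTest() exactly):

```cpp
#include <random>

// After seeing 10 heads (ones) in a row, how often is the next flip heads
// again? For a fair coin this should hover around 0.5 no matter what.
float HeadsAfterTenHeads(int tests, unsigned int seed = 1337)
{
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> coin(0, 1);
    int headsAgain = 0;
    for (int i = 0; i < tests; ++i)
    {
        int run = 0;
        while (run < 10)
            run = (coin(rng) == 1) ? run + 1 : 0;  // wait for 10 heads in a row
        headsAgain += coin(rng);                   // is the 11th flip heads too?
    }
    return float(headsAgain) / float(tests);
}
```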

I do that test 10000 times, and ran that overall test 5 times. Here are the results!

At an infinite number of coin flips, you can expect an even number of heads and tails, but until you reach infinity, all bets are off. When a coin has come up heads 10 times in a row, it still has a 50/50 chance of heads on the next coin flip.

That’s the definition of white noise – independent, random events.

Red noise would tend to have similar values – so, heads would clump together, and tails would clump together. This makes more sense with dice, where similar values would be rolled.

Blue noise would tend to have dissimilar values – so heads and tails would specifically NOT clump. And with dice, you’d rarely get the same, or similar values, in 2 dice rolls.

White noise doesn’t care about what came before, and just gives you a new number based on the probability distribution.

Keep this in mind if you ever play roulette!

# How Do I Calculate Variance in 1 Pass?

If you google “how do i calculate variance?” you’ll get some nice explanations that say:

1. Calculate the mean (average) of your numbers
2. Calculate the average of: each number minus the mean, squared

That’s fine for people trying to just understand how the math works, but if you are calculating variance in a computer program, you might not realize there is a way to do it in a single pass over the data.

That can be significant to the performance and even feasibility of your algorithm!

Here is how you calculate variance in one pass:

1. Calculate the mean (average) of your numbers
2. In the same loop, calculate the mean (average) of your numbers squared
3. After the loop, variance is the absolute value of (#2 minus #1 squared)

That might look like word salad, so here’s a code snippet.

#include <cmath>
#include <vector>

float Lerp(float a, float b, float t)
{
    return a * (1.0f - t) + b * t;
}

float Variance_1Pass(const std::vector<float>& data)
{
    // get the average (value) and average (value*value)
    float average_value = {};
    float average_valueSquared = {};
    for (size_t index = 0; index < data.size(); ++index)
    {
        float value = data[index];
        average_value = Lerp(average_value, value, 1.0f / float(index + 1));
        average_valueSquared = Lerp(average_valueSquared, value * value, 1.0f / float(index + 1));
    }

    // variance is absolute value of average(value*value) - (average_value*average_value)
    return std::abs(average_valueSquared - (average_value * average_value));
}


There is code that goes with this post, that implements it both ways and shows you that they are equivalent. You can find it at: https://github.com/Atrix256/CalculateVariance1Pass/blob/master/main.cpp

If you are wondering why I'm using "lerp" to average numbers, check out this post: https://blog.demofox.org/2016/08/23/incremental-averaging/

It turns out this one pass method can have numerical problems though, so no free lunch. Here is a more numerically robust way to do it, which also allows you to incrementally calculate variance, as numbers come in (Thanks Bart!): https://www.johndcook.com/blog/standard_deviation/

Why might you want to calculate variance?

One reason is if you are analyzing or reporting data, the average value is important to know, but it's also important to know if the numbers were usually pretty close to the average, or if they had lots of spikes above and below the average. You can square root variance to get the standard deviation, which is in the same units as the data you are reporting.

Assuming your data is a Gaussian distribution (due to something called the central limit theorem, a surprising number of things are actually gaussian distributed – like even rolling a bunch of dice and adding them up), 68% of the data points are + or – 1 standard deviation from the mean.

As an example, if the average temperature at your house over a month was 75 degrees Fahrenheit with a standard deviation of 5 degrees, that means that 68% of the days had a temperature between 70 and 80 degrees.

If the average temperature was still 75 but had a standard deviation of 25 degrees (a variance of 625), that means that 68% of the days had a temperature between 50 and 100 degrees. That is quite a difference! Just reporting the average temperature doesn't convey this information the same way as reporting average and standard deviation (or variance) does.

Lastly, I mentioned that doing 2 passes to calculate variance could be a deal breaker for an algorithm.

An example of this is a graphics technique called "Variance Shadow Maps" (paper: http://www.punkuser.net/vsm/vsm_paper.pdf) which ultimately calculates a mean and a variance for how far a group of pixels is away from a light source. When rendering the variance shadow map from the point of view of the light, each pixel stores the depth, as well as the depth squared. A fun property is that you can blur these values with neighboring pixels without harming the mathematical integrity of the values. The result is soft shadows. (more info on soft shadows: https://blog.demofox.org/2017/07/01/why-are-some-shadows-soft-and-other-shadows-hard/)

When lighting a pixel later on and using the shadow map, it knows the pixel's distance from the light source, and can read the two values from the filtered (blurred) shadow map, which allow it to get the mean and variance of the objects in the shadow map (the things that are casting shadows). It then uses something called Chebyshev's inequality to get an estimate for how much in shadow the pixel is.

That is a lot of setup explanation, but if variance required a 2 pass calculation instead of the 1 pass it actually uses, you'd have to run over the full screen of pixels and do logic for each one (subtract the average pixel value) to calculate the variance. In real time graphics, having to do an extra "full screen pass" can be pretty costly, and can easily put a game over budget, meaning the technique would have to be cut so the rest of the game could render fast enough.

This blog post is here in the hopes that the next time someone googles "how do i calculate variance" for use in a computer program, that they see this post, and implement it as a single pass. Fingers crossed! 😛

# Using Low Discrepancy Sequences & Blue Noise in Loot Drop Tables for Games

I never thought I’d be much of a stats person but here we are. This post is low on formalism though, so may the gods of formalism have mercy on my soul!

This post includes the result of experiments showing the value of what is talked about, and includes some simple C++ that you can find at: https://github.com/Atrix256/LootLDS

If you’ve ever played a game that involved grinding for loot, you might have looked online and found the drop rate for a specific item, only to find that while it says the item drops one out of 100 times, it takes you 200-300 runs to get it, while your friends get the drop in the first 10 runs.

What the heck is that about?!

That, my friends, is the nature of regular old random numbers – aka white noise – the kind of random numbers you get from rolling fair dice, flipping fair coins, hashing values using good hash algorithms, or using well made random number generators.

The core issue is that white noise random numbers can take on any valid value with equal probability at any time, regardless of whatever has happened before.

If you were betting on whether a fair coin would come up heads or tails after seeing 10 heads, and you say the next will be tails (because of course it will!), you will still only be right 50% of the time. If you flip the coin an infinite number of times, you will get an even number of heads and tails, but before reaching infinity, all bets are off.

This can be a problem for game designers too. They can collect statistics about how their randomized systems are behaving, analyze the results and come to the conclusion that everything is balanced and even. While that may be true when looking at global averages, the individual player experience may vary wildly and not be balanced at all.

Tangent: This is called variance and is the source of noise in raytraced rendering.

Tangent: There’s a fun related story here about the U.S. air force realizing there is no such thing as an average pilot:
https://www.thestar.com/news/insight/2016/01/16/when-us-air-force-discovered-the-flaw-of-averages.html

In any case, is this “globally balanced but individually unbalanced” something we have to live with, or is there something we can do about it?

Luckily there is something we can do about it, and we can ensure that individual players have a more balanced, more pleasant, and more controlled experience, without sacrificing randomization.

# Enter Low Discrepancy Sequences

A low discrepancy sequence is a sequence of numbers which are neither too close together nor too far apart.

If we put marks evenly spaced on a number line, the sequence of numbers at those marks would have zero discrepancy because they were evenly spaced. Low discrepancy numbers have a low discrepancy value that is greater than zero.

Examples of low discrepancy sequences that you may have heard of are: Sobol, Halton, Van Der Corput.

Some nice links of mine for learning more about low discrepancy sequences are:
https://blog.demofox.org/2017/05/29/when-random-numbers-are-too-random-low-discrepancy-sequences/
https://github.com/Atrix256/SampleZoo

Tangent: Going back to the raytraced noise example, regular sampling makes aliasing, and white noise sampling makes noise. Low discrepancy sequences sort of lie somewhere in the middle, gaining the benefits of both worlds, and actually having mystical powers of making numerical integration converge very quickly.

So what do low discrepancy sequences have to do with our problem?

If you use a low discrepancy sequence to generate 5 “random numbers” between 0 and 1, those 5 numbers will be roughly evenly spaced, which means that if you use those numbers on a loot table, the player is going to get a wider spread on the full possibilities of what the loot table has to offer.

If something has a very small percentage to drop, the player still has a low probability to get that drop, but if it’s a 1 in 100 drop, it’s more likely to happen at the 100th drop mark.

This is in contrast to white noise, where the values may be clumped together and leave big holes in the 0 to 1 range, like: 0.114, 0.081, 0.093, 0.2, 0.95. There is a huge empty gap between 0.2 and 0.95, which is 75% of the possibilities!

There’s a problem with low discrepancy sequences though: They are deterministic – that is, they are predictable and not random. You get the same values from low discrepancy sequences every time.

Before admitting defeat though, there is another way to get randomization from this even though the sequences themselves are not random: You can shuffle the loot table!

Now, if you have thousands of players on an MMO server rolling numbers against a loot table, you probably just barfed in your mouth a little at my suggestion. There is a way to make a “shuffle iterator” though, so that you can get the benefits of shuffling a loot table, without actually having to keep copies of shuffled loot tables around. You’d use some unique player id (and more) as a random seed for the shuffle iterator, then could use a low discrepancy sequence to select loot. This way, each player would see different (randomized) loot drop sequences, but the loot rolls would still be low discrepancy.

Tangent: you can read more about a way to make shuffle iterators using low quality encryption in “Format Preserving Encryption” here: https://blog.demofox.org/2013/07/06/fast-lightweight-random-shuffle-functionality-fixed/

But we aren’t done yet…

# Enter Randomized Low Discrepancy Sequences

The low discrepancy sequences talked about in the last section were deterministic, but what if we were able to make number sequences that had low discrepancy but were randomized?

That exists, and it’s known as “blue noise” because blue noise is random numbers which have only high frequency components (like how blue light is high frequency).

The property of both being low discrepancy, but also randomized, is great for many reasons way outside the scope of this article. For our loot drop purposes, it means that the loot will be both unpredictable, but also a player will get a balanced personalized experience, instead of only the global averages being balanced.

Tangent: Here’s a link about how to generate a blue noise sequence: https://blog.demofox.org/2017/10/20/generating-blue-noise-sample-points-with-mitchells-best-candidate-algorithm/

The other shoe dropping is that blue noise can take a long time to generate, so is computationally expensive. In graphics, it’s common to generate blue noise in advance and just use the pre-made sequence. In loot drops, that is a less viable option because it makes your sequence deterministic and then you are back to shuffling the loot table.

I'm not really sure of the answer here, but it may involve just keeping track of the last N loot drops, and using Mitchell’s best candidate algorithm to generate the N+1th value, adding that to the loot drop RNG list and removing the oldest one. If you get creative you might find a solution that fits your needs.
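As a sketch of that idea, here is Mitchell's best candidate in 1D (the candidate-count heuristic and all names are my own choices, not the repo's code):

```cpp
#include <algorithm>
#include <cmath>
#include <random>
#include <vector>

// Mitchell's best candidate in 1D: generate several random candidates and
// keep the one farthest from every existing value. More existing values
// means more candidates, which keeps the sequence blue as it grows.
float NextBlueNoiseValue(const std::vector<float>& existing, std::mt19937& rng)
{
    std::uniform_real_distribution<float> dist(0.0f, 1.0f);
    int candidateCount = int(existing.size()) + 1;
    float bestCandidate = 0.0f;
    float bestScore = -1.0f;
    for (int i = 0; i < candidateCount; ++i)
    {
        float candidate = dist(rng);
        // score a candidate by its distance to the closest existing value
        float score = 1.0f;
        for (float e : existing)
            score = std::min(score, std::abs(candidate - e));
        if (score > bestScore)
        {
            bestScore = score;
            bestCandidate = candidate;
        }
    }
    return bestCandidate;
}
```

Each loot roll would generate the next value from the recent-history list, use it against the (shuffled) loot table, and age out the oldest entry.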

# Prove it

Before we show experimental results, I wanted to define a couple of terms in regards to low discrepancy sequences.

1. Progressive Sequence – A progressive sequence is a sequence of N values, where if you use less than N of the values, they still have the desired properties. For instance, if you make 100 blue noise distributed numbers, but only use the first 10, if it’s a progressive sequence, those first 10 will also be blue. If it isn’t a progressive sequence, you have to use all 100 before they are blue. This is also a property of deterministic low discrepancy sequences. For our loot drop purposes we NEED to use progressive sequences because otherwise, the loot drops won’t be balanced until the very end, which kind of defeats the point.
2. Open Sequence – An open sequence is one that you can always add more items to. If you regularly space 4 samples from 0 to 1 you are going to get 0, 0.25, 0.5, 0.75. If you want to add a 5th number you can’t! That means that this sequence is not open. Many low discrepancy sequences are open, and using Mitchell’s best candidate to generate blue noise does make an open sequence. For loot drops, we generally do want open sequences, because we usually don’t know how many times the player is going to loot something in advance.

The numbers below are from experiments done using the code that goes with this blog post. It’s ~380 lines of C++ in a single file using only standard includes. You can find it at: https://github.com/Atrix256/LootLDS

I used the following sequences:

• White Noise – Straight up vanilla RNG.
• Blue Noise – Using Mitchell’s Best Candidate to generate a progressive, open, uniform blue noise distributed number sequence.
• Golden Ratio – Starting at 0, I add the golden ratio to the previous loot drop value to get the next loot drop value. I use modulus to keep the value between 0 and 1. The golden ratio has some interesting & good properties as a low discrepancy sequence.
• Sobol – This is the low discrepancy sequence that leads in numerical integration convergence speeds.
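The golden ratio sequence is simple enough to sketch along with a weighted table lookup (the table layout and names are mine, not necessarily how the repo does it):

```cpp
#include <cmath>
#include <vector>

struct LootItem { const char* name; float weight; };

// Advance the golden ratio low discrepancy sequence and use the resulting
// 0 to 1 value to pick a weighted entry from the loot table.
const char* RollLoot(const std::vector<LootItem>& table, float& sequenceValue)
{
    // add the golden ratio to the previous value, mod 1
    const float goldenRatio = 1.61803398875f;
    sequenceValue = std::fmod(sequenceValue + goldenRatio, 1.0f);

    // convert the 0 to 1 value into a weighted table index
    float totalWeight = 0.0f;
    for (const LootItem& item : table)
        totalWeight += item.weight;
    float roll = sequenceValue * totalWeight;
    for (const LootItem& item : table)
    {
        if (roll < item.weight)
            return item.name;
        roll -= item.weight;
    }
    return table.back().name;  // numerical safety net
}
```

Swapping the sequence update for an RNG call gives the white noise baseline, which makes it easy to compare the two in the experiment below.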

For each sequence type, I generated 10 random loot tables which had 2 to 6 items, each item having a random roll weighting between 1 and 20. I then rolled as many loot drops as I needed to make it so the actual loot drop percentages were within 1% of what the loot drop table said they should be.

Higher numbers mean it took more loot drops for the actual loot dropped to reach the probabilities the loot table said they should be. Lower numbers mean it took fewer loot drops to reach the desired probabilities. I did 10 runs so that you can see if things are consistently good or consistently bad. Just doing a single test isn’t enough to show if something just got lucky one time, or if it’s a good choice.

Here are the results….

• White Noise: 50513, 7834, 1859, 516, 8824, 3650, 1380, 24461, 35, 12455
• Blue Noise: 72, 77, 143, 9, 129, 308, 353, 236, 176, 205
• Golden Ratio: 47, 34, 50, 76, 55, 51, 114, 77, 21, 105
• Sobol: 216, 155, 161, 77, 13, 71, 56, 75, 127, 51

It’s important to note that the loot tables themselves are generated with white noise, which is a source of variance in the results above. 10 samples isn’t a whole lot, so in a real analysis you might want to do more (100,000 or more runs?) but hopefully you should see that white noise really is not great. I was also surprised to see that Sobol didn’t do that well compared to blue noise and golden ratio. It must just do better at higher dimensions.

One last thing I wanted to mention is that this isn’t limited to loot drop tables. You can use these concepts for randomized events, procedural content generation, and basically anything else you use random numbers for in games.

The important takeaway here is that even if things look right when looking at global averages, using white noise makes the individual experience very different from that global average. Having better control over crafting a player’s individual experience is possible though, and has the possibility of giving a game a more hand crafted feel, even though you are still using RNG.

FRIENDS DON’T LET FRIENDS USE WHITE NOISE!

# IIR Audio & Data Filters – Featuring Biquads

The last post talked about finite impulse response filters (FIRs) where the output of the filter was based on the current and previous input samples. (https://blog.demofox.org/2020/01/14/fir-audio-data-filters/)

In this post we are going to talk about infinite impulse response filters (IIRs) where the output of the filter is based not only on current and previous input samples, but also previous output samples.

This seemingly minor change makes for much more powerful and interesting filtering possibilities, but since each output depends on previous outputs, the filter must be evaluated serially (no SIMD!), and so is more computationally expensive.

The simple standalone C++ code that goes with this post is at: https://github.com/Atrix256/DSPIIR

The interactive web demo that goes with this post is at: http://demofox.org/DSPIIR/IIR.html

# IIR Difference Equation

As a refresher, here is the difference equation for an order 2 FIR filter from the last post:

$y_n = a_0x_n + a_1x_{n-1} + a_2x_{n-2}$

Now let’s say we want the difference equation to include the previous two output samples too. We can just complete the pattern, by including some previous y terms with coefficient multipliers on the left side of the equation.

$y_n + b_1y_{n-1} + b_2y_{n-2} = a_0x_n + a_1x_{n-1} + a_2x_{n-2}$

We can then move everything but $y_n$ to the right side of the equation to get a function that gives us our current filtered output:

$y_n = a_0x_n + a_1x_{n-1} + a_2x_{n-2} - b_1y_{n-1} - b_2y_{n-2}$

You might be wondering why there is no $b_0$ term. The reason is that it would be a multiplier on $y_n$, which would just scale the output. Do we really need that? Sometimes people do include a $b_0$ term, and divide both sides of the equation by $b_0$ to get an equation like this:

$y_n = \frac{1}{b_0}(a_0x_n + a_1x_{n-1} + a_2x_{n-2} - b_1y_{n-1} - b_2y_{n-2})$

Let’s just pretend that if $b_0$ exists, its value is always 1, and then we can move on without it actually being there, complicating our equations needlessly.

So, to repeat it, here is a difference equation for an order 2 IIR filter, which is built on an order 2 FIR filter.

$y_n = a_0x_n + a_1x_{n-1} + a_2x_{n-2} - b_1y_{n-1} - b_2y_{n-2}$

You can pull out the $a_0$ parameter as a gain parameter again if you want to, but the b parameters don’t get the same sort of benefit, so you can leave them in their raw form.

$y_n = a_0(x_n + \alpha_1x_{n-1} + \alpha_2x_{n-2}) - b_1y_{n-1} - b_2y_{n-2}$
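In code, the difference equation becomes a small amount of state and arithmetic per sample. Here is a minimal sketch (the `Biquad` struct name and default coefficient values are mine, not from the post's code):

```cpp
// One biquad (order 2 IIR) evaluated sample by sample.
// x1/x2 hold the previous two inputs, y1/y2 the previous two outputs.
struct Biquad
{
    float a0 = 1.0f, a1 = 0.0f, a2 = 0.0f; // feed-forward (FIR) coefficients
    float b1 = 0.0f, b2 = 0.0f;            // feed-back (IIR) coefficients
    float x1 = 0.0f, x2 = 0.0f;
    float y1 = 0.0f, y2 = 0.0f;

    float Process(float x)
    {
        // y_n = a0*x_n + a1*x_{n-1} + a2*x_{n-2} - b1*y_{n-1} - b2*y_{n-2}
        float y = a0 * x + a1 * x1 + a2 * x2 - b1 * y1 - b2 * y2;
        x2 = x1; x1 = x;
        y2 = y1; y1 = y;
        return y;
    }
};
```

Because `y1` and `y2` feed back into the next sample, `Process` has to be called in order, one sample at a time.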

# IIR Transfer Function

To calculate the transfer function, let’s start back from where we added the previous outputs on the left side of the equation:

$y_n + b_1y_{n-1} + b_2y_{n-2} = a_0x_n + a_1x_{n-1} + a_2x_{n-2}$

Next, let’s take the z transform:

$y(z) + b_1y(z)z^{-1} + b_2y(z)z^{-2} = a_0x(z) + a_1x(z)z^{-1} + a_2x(z)z^{-2}$

We then factor out y(z) and x(z) to get:

$y(z)(1 + b_1z^{-1} + b_2z^{-2}) = x(z)(a_0 + a_1z^{-1} + a_2z^{-2})$

Since the transfer function is just y(z) divided by x(z), simple algebra gets us there:

$\frac{y(z)}{x(z)}(1 + b_1z^{-1} + b_2z^{-2}) = a_0 + a_1z^{-1} + a_2z^{-2} \\ H(z) = \frac{y(z)}{x(z)} = \frac{a_0 + a_1z^{-1} + a_2z^{-2}}{1 + b_1z^{-1} + b_2z^{-2}}$

You can factor out the $a_0$ term as gain if you want, to get a more familiar looking numerator:

$H(z) = \frac{a_0(1 + \alpha_1z^{-1} + \alpha_2z^{-2})}{1 + b_1z^{-1} + b_2z^{-2}}$

From there you can plug in frequencies and see what sort of frequency and phase response you get, just like in the last blog post for FIRs.
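Plugging in a frequency means evaluating H(z) at z = e^{iω} with complex arithmetic. Here is a sketch, assuming ω is in radians per sample (π = Nyquist); the `Response` function name is mine:

```cpp
#include <cmath>
#include <complex>

// Evaluate the biquad transfer function H(z) at z = e^(i*omega) to get
// the frequency response at angular frequency omega (radians per sample).
std::complex<float> Response(float a0, float a1, float a2,
                             float b1, float b2, float omega)
{
    const std::complex<float> z1 = std::polar(1.0f, -omega); // z^-1
    const std::complex<float> z2 = z1 * z1;                  // z^-2
    std::complex<float> num = a0 + a1 * z1 + a2 * z2;
    std::complex<float> den = 1.0f + b1 * z1 + b2 * z2;
    return num / den;
}
```

`std::abs()` of the result is the amplitude response at that frequency, and `std::arg()` is the phase response.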

You might notice that the transfer function is quadratic in both the numerator and the denominator. A filter like this is in fact called a “biquadratic filter”, or a “biquad” for short.

Often times higher order filters (like EQs) are made by chaining multiple biquads together. Biquads are a pretty important staple of DSP.

# Pole Zero Plot

You might wonder why in the last post we called it a “Pole Zero Plot” when there were zeros but no poles.

IIRs add poles to the pole zero plot and they are where the function shoots to infinity. This happens in a fraction when you divide by zero, so a pole is a place where the denominator of the transfer function is zero.

To make it explicit:

1. You solve the quadratic equation in the numerator to get the zeros of that quadratic equation, which are also the zeros of the filter.
2. You solve the quadratic equation in the denominator to get the zeros of that quadratic equation, which are also the POLES of the filter.

That makes things pretty easy for making a pole zero plot. Calculating the zeros of the filter is the same, and you use the same technique to find the poles.

In the last post, we saw that the zeros of an order 2 FIR filter were at:

$z = \frac{-\alpha_1}{2} \pm \frac{\sqrt{\alpha_1^2-4\alpha_2}}{2}$

That still works for IIRs too. For poles, all you need to do is replace the alphas with bs:

$z = \frac{-b_1}{2} \pm \frac{\sqrt{b_1^2-4b_2}}{2}$
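In code, the zeros and the poles come from the same quadratic formula, using complex math so that negative discriminants are handled. A sketch (the `QuadraticRoots` name is mine): feed it $(\alpha_1, \alpha_2)$ for zeros, or $(b_1, b_2)$ for poles.

```cpp
#include <complex>
#include <utility>

// Roots of z^2 + c1*z + c2 = 0 via the quadratic formula, allowing
// complex results when the discriminant is negative.
std::pair<std::complex<float>, std::complex<float>>
QuadraticRoots(float c1, float c2)
{
    std::complex<float> disc(c1 * c1 - 4.0f * c2, 0.0f);
    std::complex<float> root = std::sqrt(disc);
    return { (-c1 + root) * 0.5f, (-c1 - root) * 0.5f };
}
```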

## Example Filters

Unlike FIRs which are relatively tame and can only make certain frequencies lower, IIRs can cause frequency amplitudes to get much higher and even shoot off to infinity. That makes for some cool sounds by adding distortion to certain frequency notes of an instrument but not others.

Check the links section for the “RBJ cookbook” if you want to get deeper into the filters, but here are a couple interesting filters I found.

This one boosts low frequencies and attenuates high frequencies.

This does the reverse and boosts high frequencies while attenuating low frequencies.

This makes use of both poles and zeros to make a really interesting sounding filter.

It’s also still fun to slowly move the controls back and forth while some simple instrumental loop plays. Filtering white noise is still really interesting too because white noise has all frequency content, which means filtering frequencies out or boosting them up will always affect white noise. That isn’t true of things with simpler frequency components. The extreme of this is the sine wave which only has a single frequency so is unaffected by other frequencies being boosted up or attenuated.

# Crafting a Filter By Placing Zeros

Creating a biquad filter from zero and pole locations is pretty simple if you already can make an FIR filter by placing zeros. In fact, that is exactly how you place the zeros.

To place the poles, you do the exact same steps as placing the zeros, but the coefficients you get out are for b0, b1 and b2 instead of a0, a1 and a2.

# Estimating Frequency Response From a Pole Zero Plot

Doing this for an FIR involved getting the distance from a point on the unit circle to every zero, and multiplying all those distances together; that product is the amount the amplitude is multiplied by.

Adding poles into the mix extends this.

As a second step, you get the distance from the point on the unit circle to every pole. You multiply all those pole distances together as a product and divide the previous product by this amount.

That’s all there is to it, and you should hopefully be able to see why a frequency at a pole shoots off to infinity: the distance to the pole is zero, so you divide by zero.
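The estimate described above can be sketched in a few lines. This is a minimal illustration (the `AmplitudeAt` name is mine), taking the zeros and poles as complex points:

```cpp
#include <cmath>
#include <complex>
#include <vector>

// Estimate the amplitude response at angular frequency omega from a
// pole zero plot: the product of distances from the point e^(i*omega)
// on the unit circle to each zero, divided by the product of distances
// to each pole.
float AmplitudeAt(const std::vector<std::complex<float>>& zeros,
                  const std::vector<std::complex<float>>& poles,
                  float omega)
{
    const std::complex<float> pt = std::polar(1.0f, omega);
    float amplitude = 1.0f;
    for (const auto& z : zeros)
        amplitude *= std::abs(pt - z);
    for (const auto& p : poles)
        amplitude /= std::abs(pt - p);
    return amplitude;
}
```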

# Oscillators

If you want to generate sinusoid waves of specific frequencies, you can use IIRs to do that by putting the poles on the unit circle, and leaving the zeros at the origin.

For a filter to be stable (not shoot off to infinity), the poles need to be inside of the unit circle, so putting them on the edge of the unit circle is playing with fire a bit, but the instability is also what makes the oscillator function.

We could talk about the algebra for calculating polynomials in the numerator and denominator of the transfer function to do this, but let’s jump to the end and look at the simplified result of what to set the parameters to in the difference equation below:

$y_n = a_0x_n + a_1x_{n-1} + a_2x_{n-2} - b_1y_{n-1} - b_2y_{n-2}$

The parameters should be:

• $a_0 = 1$
• $a_1 = 0$
• $a_2 = 0$
• $b_1 = -2 \cos(\omega)$
• $b_2 = 1$

Where omega ($\omega$) is the amount of radians to advance the wave form per sample.

If you plug these into the difference equation, you can simplify it to this:

$y_n = x_n + 2 \cos(\omega)y_{n-1} - y_{n-2}$

These oscillators don’t need input and just want a steady stream of zeros. Because of this, we can remove the input from the equation.

$y_n = 2 \cos(\omega)y_{n-1} - y_{n-2}$

The last thing you need to do however is to give it some initial starting state. If you make it so y[-1] and y[-2] are zero, to calculate y[0] and y[1], the oscillator won’t work correctly.

This is because we need to initialize the state (prior output) to be as if it’s been running all along.

So, you can set y[-1] to be $cos(-\omega*1)$ and y[-2] to be $cos(-\omega*2)$. That will make it so the next sample will be $cos(0)$ which means the wave will start at zero degrees and continue.

You could initialize the state to whatever phase you wanted to, by initializing y[-1] and y[-2] to the prior two values to the desired phase.
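Putting the recurrence and the state initialization together, here is a minimal sketch (the `Oscillate` helper name is mine): seed y[-1] and y[-2] as if the wave had been running all along, then iterate.

```cpp
#include <cmath>
#include <vector>

// Generate cosine samples with the IIR oscillator recurrence,
// y_n = 2*cos(omega)*y_{n-1} - y_{n-2}, where omega is radians per sample.
std::vector<float> Oscillate(float omega, int numSamples)
{
    // seed the state as if the oscillator had been running all along
    float y1 = std::cos(-omega);         // y[-1]
    float y2 = std::cos(-2.0f * omega);  // y[-2]
    std::vector<float> samples;
    for (int i = 0; i < numSamples; ++i)
    {
        float y = 2.0f * std::cos(omega) * y1 - y2;
        y2 = y1;
        y1 = y;
        samples.push_back(y); // should track cos(omega * i)
    }
    return samples;
}
```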

As a quick and dumb example, let’s look at a cosine wave that advances 180 degrees (pi radians) per sample. That means b1 will be 2, which makes our difference equation:

$y_n = -2y_{n-1} - y_{n-2}$

We’ll initialize y[-2] to be cos(-2pi) or 1, and we’ll initialize y[-1] to be cos(-pi) or -1.

Following the difference equation starting at y[0] we get…

$\begin{array}{|c|c|} \hline \text{y index} & \text{value} \\ \hline -2 & 1 \\ -1 & -1 \\ 0 & 1 \\ 1 & -1 \\ 2 & 1 \\ 3 & -1 \\ 4 & 1 \\ 5 & -1 \\ \hline \end{array}$

180 degrees is Nyquist, and we can see that it’s doing the right thing of flipping between 1 and -1. It works with less clean numbers too, and the simple C++ code that goes with this post shows that working (https://github.com/Atrix256/DSPIIR).

Unfortunately, with less clean numbers, this method will start to drift from reality over time due to floating point imprecision. One way to deal with this is to reset the oscillator after every 360 degrees of advancement.

Nick Appleton (@nickappleton) has an alternate method if you are looking for a cheap oscillator.

First you make two complex numbers:

• y = 1 + 0i
• a = e^(i*omega)

Where omega is still the number of radians the wave advances per sample. Another way to calculate a is: std::polar(1.0f, radiansPerSample)

Then, for each sample you do this:

y = y * a

The resulting complex number will have the cosine value in the real portion, and the sine value in the imaginary portion.

This has better numerical stability, which you can see in the C++ code output too.
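Here is that idea as a minimal sketch (the `ComplexOscillator` name is mine): start at 1 + 0i and rotate by e^(i*omega) once per sample.

```cpp
#include <complex>
#include <vector>

// Generate samples by repeatedly rotating a unit complex number by
// e^(i*omega). The real part is cosine, the imaginary part is sine.
std::vector<std::complex<float>> ComplexOscillator(float omega, int numSamples)
{
    std::complex<float> y(1.0f, 0.0f);                     // y = 1 + 0i
    const std::complex<float> a = std::polar(1.0f, omega); // a = e^(i*omega)
    std::vector<std::complex<float>> samples;
    for (int i = 0; i < numSamples; ++i)
    {
        samples.push_back(y); // y is e^(i*omega*i) at this point
        y *= a;
    }
    return samples;
}
```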

# Links

http://dspfirst.gatech.edu/chapters/08feedbac/demos/recur/index.html

There is a famous document called the “RBJ cookbook” which gives recipes for biquads that act in specific ways. RBJ stands for the author’s name Robert Bristow-Johnson. You can find it attached to the message at the link below!

https://www.musicdsp.org/en/latest/Filters/197-rbj-audio-eq-cookbook.html

Marc B Reynolds (@marc_b_reynolds) recently put out an interesting blog post talking about how it’s better to use integers to repeatedly add things (irrational numbers in his case) instead of using floats. There are some good lessons there that apply here as well I bet, probably especially for oscillators.

http://marc-b-reynolds.github.io/distribution/2020/01/24/Rank1Pre.html

# Linear Fit Search

Binary search looks in the middle of a list to make a guess about where a search value is. If that guess is wrong, it can eliminate half of the list (based on whether the search value is less than or greater than the guess location) and try again. It repeats until it’s either found the search value, or runs out of list.

This algorithm works well but it is blind to the actual values it got when making guesses, beyond just checking if they were greater or less than the search value.

I recently wondered: If we knew the min and max value stored in the list, couldn’t we make a more intelligent guess as to where the search value might be? We could fit the data with a line, figure out where our guess would be on that line, and make that be our initial guess. As we iterate, we could use our incorrect guesses as new min or max values of the line as appropriate, updating our line fit as we went, and perhaps arrive at an answer more quickly.

Another way of looking at this: If the guess a binary search made is VERY far from the search value, maybe it should go farther than the midpoint when making the next guess? Or, if it was pretty close to the search value, maybe it shouldn’t go as far as the midpoint? Close vs far measurements depend on the overall magnitude of the numbers in the list, so you’d need to know what sort of values are stored. A min and a max value of the list can give you a rough idea of that, especially if you update those min / max values as you repeatedly cut the list with guesses.

This post explores that idea. The result is something that could be more attractive than binary search, depending on what kind of trade offs are being looked for. While I haven’t heard of this technique before, I wouldn’t be surprised if it’s been tried and written about already. (Know of a source? Let me know!)

UPDATE: @thouis from twitter mentioned the basic idea is called “interpolation search”. This post goes beyond that basic idea, but you can read more about it here if you’d like 🙂 https://www.techiedelight.com/interpolation-search/. There is also a paper about interpolation search that you can read here (it has some relation to discrepancy, as in low discrepancy sequences, oddly!) https://erikdemaine.org/papers/InterpolationSearch_SODA2004/

The post goes a step further to address a problem that is encountered when using this algorithm, and also talks about other ways this algorithm might be extended or generalized.

An implementation, and the code that generated all the data for this post, can be found here: https://github.com/Atrix256/LinearFitSearch

# Initial Problem / Other Possible Avenues

(Feel free to skip this section if you get lost. You won’t miss anything important about the algorithm itself)

If you are wise in the ways of numbers, you might be saying to yourself that this only works if you have roughly evenly distributed numbers – basically, a flat PDF, or a flat histogram. This is because by only knowing the min and max, you are doing a linear fit of the data, and making guesses as if your data is well represented by that line. The less like a line your data actually is, the less good this ought to work.

That is true, and I thought up this idea while trying to think of how to generate 1d blue noise more quickly, which is random but roughly evenly spaced values. For that usage case it does well, but there are many types of non linear data out there that you might want to search through.

Really what you want to do is learn the distribution of the values in the list, and use that knowledge to know where the value you are searching for is likely to be.

I didn’t go that direction in these experiments, but it seems like a data scientist would have plenty of tools in their tool box to attempt something like that. Markov chain Monte Carlo type algorithms come to mind.

There’s another way to look at the problem of searching for a value in a list, and that’s to look at it as strictly a function inversion problem.

If you look at your sorted list as a lookup table, where the index is the x value, and the value stored is the y value, a search tries to tell you the x value for a specific y value that you are searching for.

In this context you only care about integer values of x, and there might be duplicate values in the list. That makes it not a strictly monotonic function – each y value isn’t necessarily larger than the last – but a more relaxed version where each y value is >= the last y value.

Thinking about the search problem as a function inversion problem, and ignoring the monotonicity issue, there are far too many data points to do an analytic inverse, so you would be looking at numerical inverse solutions.

I also didn’t really explore that direction, so it’s another way to go that might yield some better fruit.

Lastly, you could see searching a sorted list as a root finding problem. If you are looking for where the function minus the search value equals zero, numerical root finding functions could maybe help you here. I also did not try anything in that direction.

If anyone ends up exploring any of the alternative avenues, I’d love to hear what kind of techniques you used and what your results were!

# Linear Fit Search

The algorithm works like this…

1. Start with a sorted list, and the minimum and maximum value stored in that list.
2. Calculate a line fitting the min and max. For an equation y=mx+b, you are calculating m and b.
3. Using the inverse of the function, which is x=(y-b)/m, make a guess for what index (x) the search value (y) is at by plugging the search value into that equation as y and getting an x. That x is the index you are guessing the value is at.
4. If your guess was correct, you are done so exit. Otherwise, if the guess was too high, this is your new max. If the guess was too low, this is your new min. If you’ve run out of list to search, the value isn’t there, so exit.
5. Goto 2

This algorithm assumes the sorted list looks like a line if you were to graph it, so it does better when the sorted list actually looks like a line.

Let’s see how it does for a linear list with values in it between 0 and 2000. (Click to see full size image)

The left image shows the items in the array.

In the middle image, x axis is the number of items in the list, and y axis is how many guesses it took to search for a random value. This shows the average of 100 runs.

In the right image, it shows the minimum and maximum guesses it took for each list size, for those same 100 runs.

The linear fit did pretty well, didn’t it? At minimum it took zero guesses (when the search value was less than or equal to the min, or greater than or equal to the max), and at maximum it took 2 guesses to find the search value, regardless of list size.

Binary search took about the usual log2(N), as expected.

Let’s try a list made up of random numbers between 0 and 2000.

That looks pretty similar to the linear case, but the line fit search doesn’t beat binary search by quite as much. The randomness of the list makes it so the guesses are more often wrong, and so it takes a few extra guesses to find the right place.

Let’s try a quadratic function: y=2000x^2:

The average for line fit search still beats binary search, but if you look at the min/max graph, the line fit min and max entirely encompasses the binary search min and max. That means there is a ton of variance about whether it will be faster or slower than binary search, even though on average it will be faster.

Let’s try a cubic function: y=2000x^3:

While the average still (barely) beats binary search, the maximum for line fit search has gotten REALLY erratic.

Let’s try a log function:

Ouch, the line fit is actually doing worse now than the binary search.

Lastly, let’s go back to the linear list, but let’s make the last entry in the table be 200,000 instead of 2000:

Ouch! Linear fit search is super awful now. What happened?!

It turns out that this uneven histogram type of list is really a worst case scenario for the line fit search.

What is happening here is that it sees the min as 0 and the max as 200,000, so it thinks the line is very steep. On its first guess, for anything it could search for (it searches for a random value between 0 and 2000), it will think the value is at index 0. It will very likely be wrong, and eliminate index 0. The next round, it will choose index 1, be very likely wrong again, and repeat by picking 2, then 3, then 4 and so on. This data layout nearly forces this search into a more computationally expensive version of linear search. Binary search doesn’t have this problem because it doesn’t care what the values are; it just cuts the list in half repeatedly until it’s done.

Wouldn’t it be nice if we could know whether it’d be better to use binary search or linear fit search for a data set?

We’d have to analyze the data set to figure that out, and if we are going to go to all that trouble, we probably should just learn the shape of the data set in general and use that knowledge to make a better guess than either binary search or linear fit.

I think going that route could be fruitful, but I didn’t try it. Instead I came up with a Hybrid Search.

Here is my more readable, less optimized code for the linear fit search.

```cpp
TestResults TestList_LineFit(const std::vector<size_t>& values, size_t searchValue)
{
    // The idea of this test is that we keep a fit of a line y=mx+b
    // of the left and right side known data points, and use that
    // info to make a guess as to where the value will be.
    //
    // When a guess is wrong, it becomes the new left or right of the line
    // depending on if it was too low (left) or too high (right).
    //
    // This function returns how many steps it took to find the value
    // but doesn't include the min and max reads at the beginning because
    // those could reasonably be done in advance.

    // get the starting min and max value.
    size_t minIndex = 0;
    size_t maxIndex = values.size() - 1;
    size_t min = values[minIndex];
    size_t max = values[maxIndex];

    TestResults ret;
    ret.found = true;
    ret.guesses = 0;

    // if we've already found the value, we are done
    if (searchValue < min)
    {
        ret.index = minIndex;
        ret.found = false;
        return ret;
    }
    if (searchValue > max)
    {
        ret.index = maxIndex;
        ret.found = false;
        return ret;
    }
    if (searchValue == min)
    {
        ret.index = minIndex;
        return ret;
    }
    if (searchValue == max)
    {
        ret.index = maxIndex;
        return ret;
    }

    // fit a line to the end points
    // y = mx + b
    // m = rise / run
    // b = y - mx
    float m = (float(max) - float(min)) / float(maxIndex - minIndex);
    float b = float(min) - m * float(minIndex);

    while (1)
    {
        // make a guess based on our line fit
        ret.guesses++;
        size_t guessIndex = size_t(0.5f + (float(searchValue) - b) / m);
        guessIndex = Clamp(minIndex + 1, maxIndex - 1, guessIndex);
        size_t guess = values[guessIndex];

        // if we found it, return success
        if (guess == searchValue)
        {
            ret.index = guessIndex;
            return ret;
        }

        // if we were too low, this is our new minimum
        if (guess < searchValue)
        {
            minIndex = guessIndex;
            min = guess;
        }
        // else we were too high, this is our new maximum
        else
        {
            maxIndex = guessIndex;
            max = guess;
        }

        // if we run out of places to look, we didn't find it
        if (minIndex + 1 >= maxIndex)
        {
            ret.index = minIndex;
            ret.found = false;
            return ret;
        }

        // fit a new line
        m = (float(max) - float(min)) / float(maxIndex - minIndex);
        b = float(min) - m * float(minIndex);
    }

    return ret;
}
```


# Hybrid Search

Since binary search and linear fit search both have situationally good properties, I decided to try a hybrid of the two where it switches between the two for each guess. The first guess is a linear fit, the next is a binary search guess, then back to linear fit, and so on.

Here’s where that puts things with the previous worst case scenario: the linear data with a single huge outlier. New graph on top, old on bottom for comparison. Apologies that the colors aren’t consistent between old and new! 😛

There’s quite a bit of variance, and the linear fit min and max contains the binary search min and max, but on average it does beat the binary search now, which is kind of neat.

Let’s analyze the line fit worst performers to best performers and see how the hybrid search compares.

Here’s the log function:

The variance has decreased compared to line fit. The average beats binary search too, where the non hybrid test didn’t.

Next is the cubic function:

With the non hybrid approach, cubic on average was barely beating binary search and had a huge amount of variance. The hybrid average is beating binary search by a larger margin and the variance has dropped a lot.

Here’s the quadratic function:

The line fit search beat binary search, like the hybrid search does. It even beats it by roughly the same amount. The hybrid search has a lot less variance though, which is a nice property: you’ll have more consistent timings as you search.

Here’s random:

The hybrid search does a little worse both for average, and variance, than the linear fit search did.

Last is linear:

It’s impossible to see where the hybrid max line is, but it went up to 3, from the 2 that the line fit max was at, which also brings the average up just a little bit. In my opinion, slightly damaging the perfectly linear and random cases is a fair price for making the search much more robust in the general case.

Here is my more readable, less optimized code for the hybrid search. The only meaningful differences are where it chooses to do a linear fit or binary search step, and where it toggles which one it does next.

```cpp
TestResults TestList_HybridSearch(const std::vector<size_t>& values, size_t searchValue)
{
    // On even iterations, this does a line fit step.
    // On odd iterations, this does a binary search step.
    // Line fit can do better than binary search, but it can also get trapped in situations that it does poorly.
    // The binary search step is there to help it break out of those situations.

    // get the starting min and max value.
    size_t minIndex = 0;
    size_t maxIndex = values.size() - 1;
    size_t min = values[minIndex];
    size_t max = values[maxIndex];

    TestResults ret;
    ret.found = true;
    ret.guesses = 0;

    // if we've already found the value, we are done
    if (searchValue < min)
    {
        ret.index = minIndex;
        ret.found = false;
        return ret;
    }
    if (searchValue > max)
    {
        ret.index = maxIndex;
        ret.found = false;
        return ret;
    }
    if (searchValue == min)
    {
        ret.index = minIndex;
        return ret;
    }
    if (searchValue == max)
    {
        ret.index = maxIndex;
        return ret;
    }

    // fit a line to the end points
    // y = mx + b
    // m = rise / run
    // b = y - mx
    float m = (float(max) - float(min)) / float(maxIndex - minIndex);
    float b = float(min) - m * float(minIndex);

    bool doBinaryStep = false;
    while (1)
    {
        // make a guess based on our line fit, or by binary search, depending on the value of doBinaryStep
        ret.guesses++;
        size_t guessIndex = doBinaryStep ? (minIndex + maxIndex) / 2 : size_t(0.5f + (float(searchValue) - b) / m);
        guessIndex = Clamp(minIndex + 1, maxIndex - 1, guessIndex);
        size_t guess = values[guessIndex];

        // if we found it, return success
        if (guess == searchValue)
        {
            ret.index = guessIndex;
            return ret;
        }

        // if we were too low, this is our new minimum
        if (guess < searchValue)
        {
            minIndex = guessIndex;
            min = guess;
        }
        // else we were too high, this is our new maximum
        else
        {
            maxIndex = guessIndex;
            max = guess;
        }

        // if we run out of places to look, we didn't find it
        if (minIndex + 1 >= maxIndex)
        {
            ret.index = minIndex;
            ret.found = false;
            return ret;
        }

        // fit a new line
        m = (float(max) - float(min)) / float(maxIndex - minIndex);
        b = float(min) - m * float(minIndex);

        // toggle what search mode we are using
        doBinaryStep = !doBinaryStep;
    }

    return ret;
}
```


## Random Odds and Ends

Just like binary search, the linear fit and hybrid search algorithms can return you the index to insert your value into the list, if not present.

Some folks may balk at the idea of having the min and max value of the list before you do a search, from the point of view that it’s sort of like 2 guesses that aren’t being counted against the graph. If that’s your point of view, you can add 2 to the values graphed and you can see that the hybrid search is still compelling. I think it’s perfectly reasonable that you’d know the min and max of a sorted list though. After all, we store the length, why not also the min and max?

It may not be optimal to alternate 1 step of line fit search and 1 step of binary search in the hybrid search method. Doing something like 1 binary step then 3 line fit steps, and repeating that pattern, may give you better results. It may also be a better idea to just do line fit search, but throw in a binary search step if you aren’t making good enough progress. I didn’t explore this at all due to the “nice enough” results I got switching off every time.

I had a thought that it might be good to try doing an “online linear squares fit” while making guesses so that you learned the shape of the list while searching it. If that sounds interesting to you, give this a read: https://blog.demofox.org/2016/12/22/incremental-least-squares-curve-fitting/. I suspect that having a more localized fit (like in this post) performs better, but I might be wrong. I could also see doing a least squares fit of the data offline in advance so you had that data available, like a min and a max, before you started the search. A problem with doing a fit in general though is that you have to be able to invert the function of whatever you fit the data with. Quadratic or cubic seem like they are probably the limit of what you’d want to try to avoid ringing and the complexity of higher order function inversion.

You can make binary searches more cache friendly by putting them into binary trees stored in arrays. This makes it so, for instance, that when you test index 0, you are really testing the halfway point. If the search value is less than index 0, you look at index 1, else you look at index 2. With 0 based indexing, the left and right children of index i are just at 2*i+1 and 2*i+2. I bring this up because the “fixed guess points” of a binary search make this possible. A linear fit search doesn’t have fixed guess points, which makes it impossible to do the same thing. I’m betting that with some creativity, some better cache friendliness could be figured out for a linear fit search.
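Here is a minimal sketch of that binary-tree-in-an-array layout (the function names are mine, and for simplicity it assumes the list length is a power of two minus one, so the tree is complete):

```cpp
#include <cstddef>
#include <vector>

// Build the "binary tree in an array" layout from a sorted list:
// index 0 holds the midpoint, and with 0 based indexing the children
// of index i are at 2*i+1 and 2*i+2. hi is exclusive.
// Assumes sorted.size() is 2^k - 1 so every value gets a slot.
void BuildTreeLayout(const std::vector<int>& sorted, std::vector<int>& out,
                     size_t lo, size_t hi, size_t node)
{
    if (lo >= hi || node >= out.size())
        return;
    size_t mid = lo + (hi - lo) / 2;
    out[node] = sorted[mid];
    BuildTreeLayout(sorted, out, lo, mid, 2 * node + 1);
    BuildTreeLayout(sorted, out, mid + 1, hi, 2 * node + 2);
}

// Binary search in the tree layout: each guess just walks to a child,
// which keeps the hot part of the tree together in memory.
bool FindInTreeLayout(const std::vector<int>& tree, int value)
{
    size_t i = 0;
    while (i < tree.size())
    {
        if (tree[i] == value)
            return true;
        i = (value < tree[i]) ? 2 * i + 1 : 2 * i + 2;
    }
    return false;
}
```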

Following in that idea, is the concept of a cache oblivious b-tree. Check it out here: https://github.com/lodborg/cache-oblivious-btree

Another nice property of binary searching is that you can make it branchless and very SIMD friendly, or very friendly for simple hardware implementations. A linear fit search doesn’t seem as well suited for that, but again, maybe some creativity could help it be so. Here’s more about binary search operating like I just described: https://blog.demofox.org/2017/06/20/simd-gpu-friendly-branchless-binary-search/

Lastly, you might have noticed that the graph for the linear data set showed that the line fit and hybrid searches were taking fewer guesses as the list got larger. It looks impossible, and lets me make this dank meme:

What the heck is going on there?

The x axis of those graphs shows how large the list is, and the y axis is how many guesses were taken. The thing to notice is that every linear list, no matter its size, evenly breaks up the range [0,2000], and the searches are always for random numbers in [0,2000].

In smaller lists, the numbers are more sparse, while in larger lists the numbers are more dense.

If you have a linear data set and are using a linear fit to look for a number that may or may not be in the list, a denser list will contain the search value more often, and the first guess will more often land exactly on the correct location of the search value.

That’s what is happening, and that’s why it’s showing an improvement in the linear case as the list gets larger, because it’s also getting more dense.

Here’s a graph for a version of the test where the density is kept the same for each list. The lists are between [0,5*count] and the search values are in the same range.

It’s interesting and kind of cool that both the average and min/max are flat, but this is a best case scenario for the line fit (and hybrid) search, with the data actually being linear.

## Performance

Ok, finally we get to performance. Many of you fine folks were probably looking at the guess count graphs and thinking “So what? Where are the perf measurements?” TL;DR: I think this is a Pareto frontier advancement, but I’ll explain more.

Here are the perf results, but don’t be too quick to say “aha!”, because they need some explanation and context. These results are from my modern-ish gaming laptop.

Results:

• Linear search takes ~1.5 nanoseconds per guess. (eg, increment the index and read the next value from the array)
• Binary search takes ~5 nanoseconds per guess.
• Both the linear fit and hybrid searches take ~12 nanoseconds per guess.

So, from my tests, binary search would need to take 2.5 times as many guesses as linear fit or hybrid searching to break even. The only case where that is true in my tests is the purely linear list.

Now that I’ve said that, I don’t think the tests I’ve done are really a good apples to apples comparison.

What I did as a test was generate lists of the various types described above, generate a list of random numbers to search for in them, then have each search algorithm do all the searches. I divided the total time by the total number of guesses done to get a time per guess for each algorithm.

It is true that the linear fit is slightly more complicated logic than a binary search, or the linear search, so computationally I do expect it to take longer, and the 2.5x as long seems like a fair measurement.

HOWEVER, searching the same list over and over is an unrealistic pattern for most applications. More of the list would be likely to be in the cache when doing multiple searches back to back like this, so memory reading would be under-reported in the profiling.

Because the linear fit (and hybrid) searches are more computationally expensive but end up doing fewer guesses, they use more CPU but less memory bandwidth. That means that the wins they give would show up in times when memory reads (or wherever the list was stored) were slower. Having the list in the cache is not a time when the reads are going to be slower, so I think the testing is stacked against the linear fit and hybrid searches.

That said, I can’t think of a better “canned performance test” to compare apples to apples. You really would need to drop it in, in a realistic usage case for searching in an application, and see if it was better or worse for that specific usage case.

If you were memory bandwidth bound, and thus had some compute to spare, this search seems like it could possibly be a nice option. Or, in exotic situations where reading a list was VERY VERY slow (remote servers, homomorphic encryption, data stored on disk not in memory?) this could be a better algorithm. In those exotic situations where reads are way more expensive than computation, you’d probably want to go further though, and use more advanced algorithms to really make every guess count, using a lot more CPU to do so.

Lastly on perf: none of this code has been optimized. I wrote it for clarity, not speed. It’s possible that the comparison landscape could change (either for better or worse) with optimized code.

If anyone investigates perf more deeply, I’d love to hear results and in what context those results were found. Thanks!

## Quadratic Fit Search and Beyond?

An obvious question is: can this search technique extend to quadratic and beyond?

I do think so. Let’s look at how that might work, and then I’ll point out some complications that make it more challenging.

Let’s think about the quadratic case. You’d need to start with a quadratic fit of the data, which would require 3 data samples from the list. Two of the samples would be the first and last index, just like in the linear search, but where should the third data point come from?

One place it could be is the middle of the list. If you can afford more processing time than that, you might instead pick whatever index gives the lowest error between the quadratic fit and the actual data stored in the array.

Now you have a quadratic fit of the data in the array and can begin searching. You have some quadratic function y=f(x), and you invert it to get x as a function of y. All is well so far.

You make your first guess by plugging your search value in for y and getting an x out, which is your first guess for where the number is. When you read that number, if it is the search value, you are done. If it doesn’t match though, what do you do?

Your guess point is going to be between your min and max, but it might be to the left or the right of the third point you have in the quadratic fit. That is two possibilities.

Your guess may also be too low, or too high. That is two more possibilities, making for four possible outcomes to your guess.

Let’s say your guess was to the left of the “third point” and deal with these two outcomes first:

• If your guess was less than the search value, it means that your guess is the new minimum.
• If your guess was greater than the search value, it means that your guess is the new maximum. A problem though is that your “third point” is now to the right of the search maximum. This isn’t so bad because it still fits real data on the curve, but it seems a little weird.

If your guess was on the right of the “third point”, we have these two outcomes to deal with:

• If your guess was less than the search value, the guess is the new minimum, and the “third point” in the quadratic fit is to the left and is less than the minimum.
• If your guess was greater than the search value, the guess is the new maximum.

Are you with me so far? The “third point” seems oddly stationary at this point, but the next round of searching fixes that.

On the second step of searching (and beyond), we have some new possibilities to add to the previous four. The “third point” can either be less than the minimum or greater than the maximum. That is two possibilities.

And once again, we have two possibilities in regards to what our guess found: The guess value could be lower than the search value, or it could be higher.

Due to symmetry, let’s just consider the “third point” to be greater than our max, and then we can just consider the less than and greater than case:

• If our guess was too small, it’s the new minimum.
• If our guess was too large, it’s the new maximum, but the old maximum becomes the new “third point”. This moves the “third point” to be more local, giving us a more local quadratic fit of our data, which should help the search make better guesses.

So now, the “third point” moves around, and the quadratic fit is updated to be a localized fit, like we want it to be.

For the cubic case and above, I’ll leave that to you to sort out. It’s just updating the minimum and maximum based on the guess value vs the search value, then doing a dance to make sure to keep the most local points around for the curve fit of the data, throwing out the less local points to make room. I am pretty sure it’s extendable to any degree you want, and that one algorithm could be written to satisfy arbitrary degrees.

Now onto a complication!

Our very first step is to make an initial fit of the data with a curve of whatever degree, and then invert it. To invert the function, it needs to be monotonically increasing – that is, each point on the graph is higher than the point to its left, and moving right never makes the value go down.

The bad news is that, even in the quadratic case, a quadratic curve passing through 3 data points A, B, C where A <= B <= C is very often NOT going to be monotonic.

That means you are going to have a bad time trying to invert that function to make a guess for where a search value should be in the list.

I think a good plan of attack would be to fit it with a monotonic quadratic function that didn't necessarily pass through the 3 data points. That would affect the quality of your guess, but it might (probably should??) do better at guessing than a line fit, at the cost of being more computationally expensive. I'm not sure how to do that specifically, but I'd be surprised if there wasn't an algorithm for it.

Some possibly good leads on dealing with this, and on how even a quadratic fit often isn’t monotonic:

https://math.stackexchange.com/questions/3129051/how-to-restrict-coefficients-of-polynomial-so-the-function-is-strictly-monotoni

https://en.wikipedia.org/wiki/Monotone_cubic_interpolation

## Closing

Thanks for reading. Hopefully you found it enjoyable.

If you use this, or do any related experimentation, I’d love to hear about it.

# Posts TODO

A place to keep track of blog posts I’d like to do:

Chebyshev curve fitting / interpolation – with simple working C++ code. Possibly rational Chebyshev too. Chebyshev is apparently optimal for polynomial interpolation.
https://www.embeddedrelated.com/showarticle/152.php

Optimistic concurrency (in databases). Select data and version id per row. Update versionid = versionid+1, blah where blah and version id = version id. If rows affected is 0, that means something else beat you to the punch and you can deal with it however.

Cordic math. Every iteration in a loop gives you another bit of precision, since it’s basically a binary search.

2D SDFs for vector graphics – using modulus for “free repeating”. anti aliasing. use your shadertoy verlet physics game as an example?

Verlet Physics Keep last and current position to get implicit velocity. Simulate. Iterative constraint solving. Things “just work” pretty well.

Minkowski Portal Refinement A nice & simple algorithm for collision detection. Maybe talk about algorithm to get depth. Mention JGK, possibly do that after.

Deterministic Simulations using deterministic sim to eg decrease network traffic.

Quick Math: phi / golden ratio – show how the golden ratio’s conjugate is the same number past the decimal point, and how this is the only number that can do that. The main point being “I remember this fact, but don’t remember the number”. This fact lets you calculate the number.

Quick math: Euler’s number – show how e^x being its own derivative (and integral) can only work for e. The main point being “I remember this fact, but don’t remember the number”. This fact lets you calculate the number.

Ear clipping – for turning polygons into triangles. extendable to 3d with tetrahedron clipping, and to higher dimensions as well.

Storageless Shuffle With Weights – this is like if you have 3 red marbles and 5 blue marbles, how would you use FPE to storagelessly shuffle them.

Recurrent neural networks (etc) for “time series” learning https://twitter.com/Peter_shirley/status/1066832031043149824?s=03

Markov Chain Monte Carlo – Eg. for decryption maybe try 2nd order or higher chains.
Maybe also try with rendering / numerical integration http://statweb.stanford.edu/~cgates/PERSI/papers/MCMCRev.pdf

Blue Noise AO – It’s common to use white noise for sampling and also sample rotation. Start from there and show how to use blue for sampling and also rotation!
http://john-chapman-graphics.blogspot.com/2013/01/ssao-tutorial.html

Other blue noise usage cases – specific usage cases with easy to follow implementations
* fog shafts
* reflections
* dithering

Data Cache When doing coding experiments, there are often pieces of data that take time to calculate that are based on parameters that don’t often change from run to run. Making a data cache can help. Semi compelling usage cases: 1) next largest prime greater than N. 2) integrate a bilinear function. Compare / contrast to content addressable storage. CAS is the hash of the contents, this is the hash of the params that make the contents. code: https://github.com/Atrix256/ProgressiveProjectiveBlueNoise/blob/master/cache.h

Magic Eye Implementation turn a generic depth buffer into magic eye?
https://steemit.com/steemit/@mynameisbrian/how-to-create-a-steemit-themed-magic-eye-image-using-photoshop

Exposure and fstops
exposure is a multiplication. fstops are 2^(stop). feels linear when using fstops. must do multiplication in linear space, else is wrong (show it’s wrong).

Reaction Diffusion aka Turing patterns
https://en.wikipedia.org/wiki/Turing_pattern
https://www.quantamagazine.org/ancient-turing-pattern-builds-feathers-hair-and-now-shark-skin-20190102/

Kalman Filter
https://home.wlu.edu/~levys/kalman_tutorial/

# Audio Stuff

Biquad – a better frequency filter

Compressor & Limiter – automatic volume adjustment to eg avoid clipping. Include “side chain” stuff.

Statistical Search

Binary search works the way it does because it doesn’t know anything about the sorted list.

If you knew the min and the max in that sorted list, you could take a better guess at where to look first by finding the % that the value you are searching for is between min and max, and start at that % in the list.

The problem with that though is that it assumes an even distribution of numbers. If you have a bunch of small numbers and one huge number, this guess won’t be any good.

So, idea… if you fit the sorted list with some low order monotonic polynomial, you could reverse that to get an initial guess.

You could also update your best guess as to where the number was each time you looked in a place and got something that wasn’t your search value. This maybe using a kalman filter?

Faster 1d blue noise generation

1) brute force
2) binary search
3) linear search from initial guess from fit
4) kalman filter?

Use this stuff in generic data lists, not just in blue noise?

Maybe also a fixed # of buckets to cut search down. basically generate multiple in parallel and then append them together (but neighbors across edges do matter!)

james would…
I’d probably just put uniform buckets
instead of a sorted array of numbers I’d try keeping the numbers ordered like a heap, then binary search is super fast, since it no longer suffers from non locality of memory access
an implicit binary tree where the children of a[i] are at a[i*2+1] and a[i*2+2]

# Not All Blue Noise is Created Equal

Some ways I know of creating blue noise distributed samples are:

To be honest, the void and cluster algorithm is a “top tier” algorithm, while filtering white noise is just kind of a hack, and Mitchell’s best candidate algorithm is decent and simple, but a bit outdated too.

Let’s look at some 128×128 blue noise textures created via void and cluster (top) and white noise filtering (bottom). The images on the right are the frequencies of the images (DFT magnitude). Since blue noise is high frequency noise, having a darker middle means a higher quality result.

Note: the white noise filtering used 25 iterations and a sigma of 1.5 for the blurring.

They look pretty similar don’t they? It turns out they are actually pretty different which I found really surprising when I was told. I had to see this for myself.

Below we threshold both images to 10%. What I mean by that is that if we consider black to be 0 and white to be 1, we make an image where there is a black dot if the color is < 0.1, else we make a white dot.

Void and cluster is top, and white noise filtering is middle. On the bottom is blue noise sample points generated with a discrete version of Mitchell's best candidate algorithm.

As you can see, the filtered white noise has already fallen apart for our purposes. It's basically unusable for this usage case. Mitchell is doing fairly ok though.

Let's move on to 20%:

Mitchell is gaining some low frequencies (it isn't as dark in the middle) but the filtered white noise is starting to look a tiny bit better.

Here are the rest up to 90%:

30%:

40%:

50%:

60%:

70%:

80%:

90%:

So, void and cluster beat the pants off the other two methods.

Filtered white noise used for this purpose is no good and basically fell completely apart.

Mitchell was decent until the sample density got too high, and then it failed. There are some parameters to tune with this algorithm so it's possible that it could do better, but in general the algorithm does poorly for high point densities. As an alternative, above 50% density you could perhaps invert the colors and treat it as (100% minus density), so that it was always working with less than 50% density. Even at 50% density, it isn't that great though, at least with my setup.

shadertoy.com recently got a blue noise texture and luckily the blue noise texture was made with the void and cluster algorithm, so it's "the good stuff".

Another family of algorithms to generate blue noise are based on constrained Voronoi diagrams and relaxation to evolve starting sample points to be more blue. Those are good for generating specific point sets for a specific density, but differ from void and cluster which are designed to make a texture that works well for any density.

There are other algorithms out there as well with different properties, and new ones coming out all the time. SIGGRAPH is starting right now and I bet at least one or two new blue noise algorithms are shown 😛

Have any interesting blue noise info to share? I'd love to hear it! It feels like the rabbit hole here is a lot deeper than it seems.

# Pathtraced Depth of Field & Bokeh

Let’s say you have a path tracer that can generate an image like this:

Adding depth of field (and bokeh) can make an image that looks like this:

The first image is rendered using an impossibly perfect pinhole camera (which is what we usually do in roughly real time graphics, in both rasterization and ray based rendering), and the second image is rendered using a simulated lens camera. This post is meant to explain everything you need to know to go from image 1 to image 2.

There is also a link to the code at the bottom of the post.

We are going to start off by looking at pinhole cameras – which can in fact have Bokeh too! – and then look at lens cameras.

If you don’t yet know path tracing basics enough to generate something like the first image, here are some great introductions:

# Pinhole Camera

A pinhole camera is a box with a small hole – called an aperture – that lets light in. The light goes through the hole and hits a place on the back of the box called the “sensor plane” where you would have film or digital light sensors.

The idea is that the aperture is so small that each sensor has light hitting it from only one direction. When this is true, you have a perfectly sharp image of what’s in front of the camera. The image is flipped horizontally and vertically and is also significantly dimmer, but it’s perfectly sharp and in focus.

As you might imagine, a perfect pinhole camera as described can’t actually exist. The size of the hole is larger than a single photon, the thickness of the material is greater than infinitesimally small, and there are also diffraction effects that bend light as it goes through.

These real world imperfections make it so an individual sensor will get light from more than one direction through the aperture, making it blurrier and out of focus.

Reality is pretty forgiving though. Pinhole cameras that give decent results can be made easily, even with simple materials laying around the house (http://www.instructables.com/id/How-To-Make-A-Pinhole-Camera/).

You can even go deeper and make your own fairly high quality pinhole camera if you want: https://www.diyphotography.net/the-comprehensive-tech-guide-to-pinhole-photography/

As far as aperture size goes, the smaller the aperture, the sharper the image. The larger the aperture, the blurrier the image. However, smaller apertures also let in less light so are dimmer.

This is why if you’ve ever seen a pinhole camera exhibit at a museum, they are always in very dark rooms. That lets a smaller aperture hole be used, giving a sharper and more impressive result.

When using a pinhole camera with film, if you wanted a sharp image that was also bright, you could make this happen by exposing the film to light for a longer period of time. This longer exposure time lets more light hit the film, resulting in a brighter image. You can also decrease the exposure time to make a less bright image.

Real film has a nonlinear reaction to different wavelengths of light, but in the context of rendered images, we can just multiply the resulting colors by a value as a post effect process (so, you can adjust it without needing to re-render the image with different exposure values!). A multiplier between 0 and 1 makes the image darker, while a multiplier greater than 1 makes the image brighter.

It’s important to note that with a real camera, longer exposure times will also result in more motion blur. To counteract this effect, you can get film that reacts more quickly or more slowly to light. This lets you have the aperture size you want for the desired sharpness level, while having the exposure time you want for the desired motion blur, while still having the desired brightness, due to the film’s ISO (film speed).

For a much deeper dive on these concepts, here is a really good read:
https://www.cambridgeincolour.com/tutorials/camera-exposure.htm

While aperture size matters, so does shape. When things are out of focus, they end up taking the shape of the aperture. Usually the aperture is shaped like something simple, such as a circle or a hexagon, but you can exploit this property to make for some really exotic bokeh effects. The image at the top of this post used a star of David shaped aperture for instance and this image below uses a heart shape.

Here’s two articles that talk about how to make your own bokeh mask for custom bokeh shapes for physical cameras:
https://www.diyphotography.net/diy_create_your_own_bokeh/

Ultimately what is happening is convolution between the aperture and the light coming in. When something is in focus, the area of convolution is very small (and not noticeable). As it gets out of focus, it gets larger.

The last property I wanted to talk about is focal length. Adjusting focal length is just moving the sensor plane to be closer or farther away from the aperture. Adjusting the focal length gives counterintuitive results. The smaller the focal length (the closer the sensor plane is to the aperture), the smaller the objects appear. Conversely, the larger the focal length (the farther the sensor plane is from the aperture), the larger the objects appear.

The reason for this is because as the sensor plane gets closer, the field of view increases (the sensor can see a wider angle of stuff), and as it gets farther, the field of view decreases. It makes sense if you think about it a bit!

# Pinhole Camera Settings Visualized

In the below, focal length and aperture radius are in “World Units”. For reference, the red sphere is 3 world units in radius. The path traced image is multiplied by an exposure multiplier before being shown on the screen and is only a post effect, meaning you can change the exposure without having to re-render the scene, since it’s just a color multiplier.

Here is a video showing how changing focal length affects the image. It ranges from 0.5 to 5.0. Wayne’s world, party time, excellent!

These next three images show how changing the aperture size affects brightness. This first image has an aperture size of 0.01 and an exposure of 3000.

This second image has an aperture size of 0.001 and the same exposure amount, making it a lot sharper, but also much darker.

This third image also has an aperture size of 0.001, but an exposure of 300,000. That makes it have the same brightness as the first image, but the same sharpness as the second image.

If you are wondering how to calculate how much exposure you need to get the same brightness with one aperture radius as another, it’s not too difficult. The amount of light coming through the aperture (aka the brightness) is multiplied by the area of the aperture.

When using a circular aperture, we can remember that the area of a circle is $\pi * \text{radius}^2$.

So, let’s say you were changing from a radius 10 aperture to a radius 5 aperture. The radius 10 circle has area of $100\pi$, and the radius 5 circle has an area of $25\pi$. That means that the radius 5 circle has 1/4 the area that the radius 10 circle does, which means you need to multiply your exposure by 4 to get the same brightness.

In the case of moving from radius 0.01 to 0.001, we are making the brightness be 1/100 of what it was, so we multiply the 3,000 by 100 to get the exposure value of 300,000.

Here is a video showing how aperture radius affects the sharpness of the image. The exposure is automatically adjusted to preserve brightness. Aperture radius ranges from 0.001 to 0.2.

In the next section we’ll talk about how to make different aperture shapes actually function, but as far as brightness and exposure goes, it’s the same story. You just need to be able to calculate the area of whatever shape (at whatever size it is) that you are using for your aperture shape. With that info you can calculate how to adjust the exposure when adjusting the aperture size.

Here are some different aperture shapes with roughly the same brightness (I eyeballed it instead of doing the exact math)

Circle:

Gaussian distributed circle:

Star of David:

Triangle:

Square:

Ring:

Even though it’s possible to do bokeh with a pinhole camera as you can see, there is something not so desirable. We get the nice out of focus shapes, but we don’t get any in focus part of the image to contrast it. The reason for this is that pinhole cameras have constant focus over distance. Pinhole camera image sharpness is not affected by an object being closer or farther away.

To get different focus amounts over different distances, we need to use a lens! Before we talk about lenses though, let’s talk about how you’d actually program a pinhole camera as we’ve described it.

# Programming A Pinhole Camera With Bokeh

With the concepts explained, let’s talk about how we’d actually program this.

First you calculate a ray as you normally would for path tracing, where the origin is the camera position, and the direction is the direction of the ray into the world. Adding subpixel jittering for anti aliasing (to integrate over the whole pixel) is fine.

At this point, you have a pinhole camera that has an infinitesimally small aperture. To make a more realistic pinhole camera, we’ll need to calculate a new ray which starts on the sensor plane, and heads towards a random point on the aperture.

Important note: the position of the aperture is the same as the camera position. They are the same point!

Calculating the Point on the Sensor Plane

We first find where the ray would hit the sensor plane if it were 1 unit behind the aperture (which will be a negative amount of time). We put that point into camera space, multiply the z of the camera space by the focal length (this moves the sensor plane), and then put it back into world space to get the actual world space origin of the ray, starting at the sensor plane.

To calculate the plane equation for the sensor plane, the normal for that plane is the camera’s forward direction, and a point on that plane is the camera position minus the camera’s forward direction. Calculating the equation for that plane is just:

sensorPlane.xyz = cameraForward;
sensorPlane.w = -dot(cameraForward, (cameraPos - cameraForward));


Note that xyzw are ABCD in the plane equation $Ax+By+Cz+D=0$.

You can then do this to find the point where the ray hits the sensor plane:

float t = -(dot(cameraPos, sensorPlane.xyz) + sensorPlane.w) / dot(rayDirection, sensorPlane.xyz);
sensorPos = cameraPos + rayDirection * t;


From there, you do this to adjust the focal length and to get the world space starting position of the ray:

// convert the sensorPos from world space to camera space
float3 cameraSpaceSensorPos = mul(float4(sensorPos, 1.0f), viewMtx).xyz;

// elongate z by the focal length
cameraSpaceSensorPos.z *= DOFFocalLength;

// convert back into world space
sensorPos = mul(float4(cameraSpaceSensorPos, 1.0f), invViewMtx).xyz;


Now we know where the ray starts, but we need to know what direction it’s heading in still.

Calculating the Random Point on the Aperture

Now that we have the point on the sensor, we need to find a random point on the aperture to shoot the ray at.

To do that, we first calculate a uniform random point in a circle with radius “ApertureRadius”, since the aperture is a circle. Here is some code that does that (RandomFloat01() returns a random floating point number between 0 and 1):

float angle = RandomFloat01(state) * 2.0f * c_pi;
// the sqrt makes the points uniform over the circle's area,
// instead of clustered towards the center
float radius = sqrt(RandomFloat01(state)) * ApertureRadius;
float2 offset = float2(cos(angle), sin(angle)) * radius;


If you wanted different shaped apertures for different shaped bokeh, you are only limited to whatever shapes you can generate uniformly random points on.

If we add that random offset to the camera position in camera space (multiply offset.x by the camera’s x axis, and offset.y by the camera’s y axis and add those to the camera position), that gives us a random point on the aperture. This is where we want to shoot the ray towards.

rayOrigin = sensorPlanePosition;
rayDirection = normalize(randomAperturePosition - sensorPlanePosition);


You can now use this ray to have a more realistic pinhole camera!

Brightness

If you want to be more physically correct, you would also multiply the result of your raytrace into the scene by the area of the aperture. This is the correct way to do monte carlo integration over the aperture (more info on monte carlo basics: https://blog.demofox.org/2018/06/12/monte-carlo-integration-explanation-in-1d/), but the intuitive explanation here is that a bigger hole lets in more light.

After you do that, you may find that you want to be able to adjust the aperture without affecting brightness, so then you’d go through the math I talked about before, and you’d auto calculate exposure based on aperture size.

When looking at the bigger picture of that setup, you’d be multiplying by a number to account for aperture size, then you’d basically be dividing by that same number to get the desired brightness – with a little extra multiplier to make the baseline a bit darker or brighter.

A more efficient way to do this would be to just not multiply by the aperture area, and apply an exposure to that result. That way, instead of doing something like dividing by 300,000 and then multiplying by 450,000, you would just multiply by 1.5, and it’d be easier for a human to work with.

# Lens Cameras

Finally, onto lenses!

The simplest lens camera that you can make (and what I used) is to just put a convex lens inside the aperture.

Funny tangent: lens comes from the Greek word for lentil. (https://jakubmarian.com/are-lens-and-lentil-related/)

A motivation for using lenses is that unlike pinhole cameras, you can increase the aperture size to let more light in, but still get a focused shot.

This comes at a cost though: there is a specific range of depth that is in focus. Other things that are too close or too far will appear blurry. Also, the larger the aperture, the smaller the “in focus range” will be.

From that perspective, it feels a bit silly simulating lenses in computer graphics, because there is no technical reason to simulate a lens. In computer graphics, it’s easier to make a sharper image than a blurry one, and if we want to adjust the image brightness, we just multiply the pixels by a constant.

Simulating a lens for depth of field and bokeh is purely a stylistic choice, and has nothing to do with a rendering being more correct!

How Convex Lenses Work

Convex lenses are also called converging lenses because they bend incoming light inwards to cross paths. Below is a diagram showing how light travels from objects on the left side, through the lens, to the right side. The light meets on the other side of the lens at a focus point for each object. The orange “F” labels show the focal distance of the lens.

If two points are the same distance from the lens on the axis perpendicular to the lens, their focal points will also be the same distance from the lens on that axis, on the other side of the lens.

This means that if we had a camera with a sensor plane looking through a lens, there would be a focal PLANE on the other side of the lens, made up of the focus points for each sensor on the sensor plane. Things closer or farther than the focus plane would be blurry, while things near the focus plane would be sharp.

The distance from the camera (aperture) to the focal plane is based on the focal distance of the lens, and also how far back the sensor plane is. Once you have those two values, you could calculate where the focal plane is.

There is a simpler way though for us. We can skip the middle man and just define the distance from the camera to the focal plane, pretending like we calculated it from the other values.

This is also a more intuitive setting because it literally tells you where an object has to be to be in focus. It has to be that many units from the camera to be perfectly in focus.

Going this route doesn’t make our renderer any less accurate, it just makes it easier to work with.

Nathan Reed (@Reedbeta) has this information to add, to clarify how focus works on lens cameras (Thanks!):

The thing you change when you adjust focus on your camera is the “image distance”, how far the aperture is from the film, which should be greater than or equal to the lens focal length.

The farther the aperture from the sensor, the nearer the focal plane, and vice versa. 1/i + 1/o = 1/f.
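To make that concrete, here is a small sketch of the thin lens equation rearranged to solve for the focal plane (object) distance. The function name and the example numbers in the comments are mine, not from any particular camera:

```cpp
#include <cassert>
#include <cmath>

// Thin lens equation: 1/i + 1/o = 1/f, where i is the image distance
// (aperture to film), o is the object / focal plane distance, and f is the
// lens focal length. Rearranged to solve for o. Requires i > f, matching
// the "greater than or equal to the lens focal length" note above.
double FocalPlaneDistance(double imageDistance, double focalLength)
{
    return 1.0 / (1.0 / focalLength - 1.0 / imageDistance);
}
```

For example, a 50mm lens with the film 60mm behind the aperture focuses at 300mm, while moving the film in to 55mm pushes the focal plane out to 550mm, matching the "farther aperture from sensor, nearer focal plane" rule.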

And this good info too:

“focal length” of a lens is the distance from film plane at which infinite depth is in sharp focus, and is a property of the lens, eg “18mm lens”, “55mm lens” etc. The focal length to sensor size ratio controls the FOV: longer lens = narrower FOV
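The focal length to sensor size ratio maps to FOV with basic pinhole trigonometry. A small sketch (the formula is standard geometry, not taken from the quote itself):

```cpp
#include <cassert>
#include <cmath>

// FOV in radians from sensor size and lens focal length: half the sensor
// subtends an angle of atan(sensorSize / (2 * focalLength)) at the lens.
double FieldOfView(double sensorSize, double focalLength)
{
    return 2.0 * std::atan(sensorSize / (2.0 * focalLength));
}
```

With a 36mm wide (full frame) sensor, an 18mm lens gives a 90 degree FOV and a 55mm lens gives roughly 36 degrees: longer lens, narrower FOV.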

# Programming A Lens Camera With Bokeh

Programming a lens camera is pretty simple:

1. Calculate a ray like you normally would for a path tracer: the origin is the camera position, and the direction is pointed out into the world. Subpixel jitter is again just fine to mix with this.
2. Find where this ray hits the focal plane. This is the focal point for this ray.
3. Pick a uniform random spot on the aperture
4. Shoot the ray from the random aperture position to the focal point.

That’s all there is to it!
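The four steps above can be sketched in code like this. The `Vec3`/`Ray` types, the rejection-sampled disk aperture, and the +z look direction are my own simplifying assumptions, not code from the post's renderer:

```cpp
#include <cassert>
#include <cmath>
#include <random>

struct Vec3 { double x, y, z; };

static Vec3 operator+(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
static Vec3 operator-(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static Vec3 operator*(Vec3 a, double s) { return { a.x * s, a.y * s, a.z * s }; }

static Vec3 Normalize(Vec3 v)
{
    double len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / len, v.y / len, v.z / len };
}

struct Ray { Vec3 origin, direction; };

// Steps 2-4 from the list above. pixelRayDir is the normalized ray direction
// from step 1; the camera is assumed to look down +z, so the focal plane
// sits at z = cameraPos.z + focalPlaneDistance.
Ray LensCameraRay(Vec3 cameraPos, Vec3 pixelRayDir, double focalPlaneDistance,
                  double apertureRadius, std::mt19937& rng)
{
    // 2. find where the pinhole style ray hits the focal plane
    double t = focalPlaneDistance / pixelRayDir.z;
    Vec3 focalPoint = cameraPos + pixelRayDir * t;

    // 3. pick a uniform random point on the disk shaped aperture
    // (rejection sampling against the unit disk)
    std::uniform_real_distribution<double> dist(-1.0, 1.0);
    double ax, ay;
    do { ax = dist(rng); ay = dist(rng); } while (ax * ax + ay * ay > 1.0);
    Vec3 aperturePos = cameraPos + Vec3{ ax, ay, 0.0 } * apertureRadius;

    // 4. shoot the ray from the aperture position to the focal point
    return { aperturePos, Normalize(focalPoint - aperturePos) };
}
```

Every ray generated this way still passes through its focal point, which is exactly why objects at the focal plane stay sharp no matter where on the aperture the ray originates.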

You could go through a more complex simulation where you shoot a ray from the sensor position to a random spot on the aperture, calculate the refraction ray, and shoot that ray into the world, but you’d come up with the same result.

Doing it the way I described makes it no less accurate(*)(**), but is simpler and computationally less expensive.

* You’ll notice that changing the distance to the focal plane doesn’t affect FOV like changing the focal distance did for the pinhole camera. If you did the “full simulation” it would.
** Ok technically this is a “thin lens approximation”, so isn’t quite as accurate but it is pretty close for most uses. A more realistic lens would also have chromatic aberration and other things so ::shrug::

You can optionally multiply the result of the ray trace by the aperture size like we mentioned in the pinhole camera to make the brightness be properly affected by aperture size. If you’d rather not fight with exposure multiplier calculations as you change aperture size though, feel free to leave it out.

# Lens Camera Settings Visualized

This video shows the effect of the aperture size changing. Notice that the area in focus is smaller with a larger aperture radius.

This video shows the effect of the focal distance changing. Nothing too surprising here, it just changes what depth is in focus.

# Photography is a Skill

Even after I had things implemented correctly, I was having trouble understanding how to set the parameters to get good Bokeh shots, as you can see from my early images below:

Luckily, @romainguy clued me in: “Longer focals, wider apertures, larger distance separation between subjects”

So what I was missing is that the bokeh is the stuff in the background, which you make out of focus, and you put the focal plane at the foreground objects you want in focus.

It’s a bit strange when you’ve implemented something and then need to go ask folks with a different skill set how to use what you’ve made, hehe.

Here’s some other links I found useful while implementing the code and writing this post:

https://en.wikipedia.org/wiki/Pinhole_camera_model#The_geometry_and_mathematics_of_the_pinhole_camera
https://en.wikipedia.org/wiki/Camera_lens#Theory_of_operation
https://www.scratchapixel.com/lessons/3d-basic-rendering/3d-viewing-pinhole-camera/virtual-pinhole-camera-model
https://en.m.wikipedia.org/wiki/Circle_of_confusion

# My Code

My GPU path tracer that generated these images is up on github.

It’s a work in progress, so it’s missing some things, has some todo notes, and maybe has some things in it that are incorrect. Be warned! 🙂

The code is here: https://github.com/Atrix256/FalcorPathTracer/releases/tag/v1.0

The path tracer uses nvidia’s “Falcor” api abstraction layer. As best as I can tell, just pulling Falcor down to your machine and compiling it registers it *somewhere* such that projects that depend on Falcor can find it. I’m not really sure how that works, but strangely enough it worked for me on a couple of machines I tried it on.

This is the version / commit of Falcor I used:
https://github.com/NVIDIAGameWorks/Falcor/commit/0b561caae19e8325853166cc4c93d4763570774a

I wish I had a more foolproof way to share the code, like having it download the right version of Falcor when you try to build it. AFAIK there isn’t a better way, but if there is, I would love to hear about it.

Anyhow, happy rendering!! 🙂
