Dual Numbers & Automatic Differentiation

In the last post, I talked about imaginary numbers, complex numbers, and how to use them to rotate vectors in 2d.

In this post, I want to share another interesting type of number called a “Dual Number” that uses the symbol ε (epsilon) and has a neat trick of automatically calculating the derivative of a function while you calculate the value of the function at the same time.

Dual numbers are pretty similar to imaginary numbers but there is one important difference. With imaginary numbers, i^2 = -1, but with dual numbers, ε^2 = 0 (and ε is not 0!). That may seem like a small difference, but oddly, that opens up a whole interesting world of mathematical usefulness.

Before we dig into automatic differentiation, I want to go over the mathematical basics for how dual numbers behave.

Basic Dual Number Math

Adding dual numbers is the same as adding complex numbers; you just add the real and dual parts separately:

(3 + 4ε) + (1 + 2ε) = 4 + 6ε

Subtraction works the same way as well:

(3 + 4ε) – (1 + 2ε) = 2 + 2ε

To multiply dual numbers, you use F.O.I.L. just like you do with complex numbers:

(3 + 4ε) * (1 + 2ε) =
3 + 6ε + 4ε + 8ε^2 =
3 + 10ε + 8ε^2

However, since ε^2 is zero, the last term 8ε^2 disappears:
3 + 10ε

It’s interesting to note that with complex numbers, the i^2 became -1, so the last term changed from imaginary to real, meaning that the imaginary numbers fed back into the real numbers during multiplication. With dual numbers, that isn’t the case, the dual numbers don’t feed back into the real numbers during multiplication.

In both complex and dual numbers the real terms do affect the non real terms during multiplication.

The division operator relates to the conjugate. I have source code for it below, and some of the links at the end of the post go into the details of that and other operations.

Quick Review: Derivatives (Slope)

If you know the line formula y=mx+b, but you don’t know what a derivative is you are in luck. Remember how “m” is the slope of the line, specifying how steep it is? That is what the derivative is too, it’s just the slope.

Below is a graph of y=2x+1. At every point on that line, the derivative (or slope) is 2. That means that for every step we make on the x axis to the right (positive direction), we make 2 steps up on the y axis (positive direction).

Now, check out this graph of y=x^2-0.2

The derivative (or slope) at every point on this graph is 2x. That means that the slope changes depending on where the x coordinate is!

So, when x=0, the slope is 0. You can see that in the graph where x=0, that it is horizontal, meaning that a step on the x axis becomes no steps on the y axis (only at that point where x is 0, and only if you take an infinitely small step).

When x is 1, the slope is 2, when x is 2, the slope is 4, when x is 3, the slope is 6. Since the numbers increase as we increase x from 0, that tells us that the graph gets steeper as we go to the right, which you can see in the graph.

Alternately, when x is -1, the slope is -2, when x is -2, the slope is -4, and when x is -3, the slope is -6. This shows us that as we decrease x from 0, the graph gets steeper in the opposite direction, which you can see in the graph as well.

What is Automatic Differentiation?

Let’s say you have a function (possibly a curve) describing the path of a rocket, and you want to make the rocket point down the path that it’s traveling.

One way you might do this is to evaluate your function f(T) to get the current location of your rocket (where T is how long the rocket has been flying), and then calculate the derivative f'(T) to find the slope of the graph at that point so that you can orient the rocket in that direction.

You could calculate the value and slope of the function at time T independently easily enough if you know how to get the derivative of a function (a calculus topic), or use wolframalpha.com.

However, if you have a complex equation, or maybe if the equation is controlled by user input, or game data, it might not be so easy to figure out what the derivative is at run time.

For instance… imagine having a function that rolled random numbers to figure out what mathematical operation it should preform on a number next (if we roll a 0, add 3, if we roll a 1 multiply by 2, if we roll a 2, square the number… etc). It isn’t going to be simple to take the derivative of the same mathematical function.

Here enters automatic differentiation (or AD). AD lets you calculate the derivative WHILE you are calculating the value of the function.

That way, you can do whatever math operations you want on your number, and in the end you will have both the value of f(T) as well as the derivative f'(T).

Using ε for Automatic Differentiation

You can use dual number operations on numbers to calculate the value of f(x) while also calculating f'(x) at the same time. I’ll show you how with a simple example using addition and multiplication like we went over above.

the first thing we do is convert our 4 into a dual number, using 1 for the dual component, since we are plugging it in for the value of x, which has a derivative of 1.

4+1ε

Next, we want to multiply that by the constant 3, using 0 for the dual component since it is just a constant (and the derivative of a constant is 0)

(4+1ε) * (3 + 0ε) =
12 + 0ε + 3ε + 0ε^2 =
12 + 3e

Lastly, we need to add the constant 2, using 0 again for the dual component since it’s just a constant.
(12 + 3ε) + (2 + 0ε) =
14 + 3ε

In our result, the real number component (14) is the value of f(4) and the dual component (3) is the derivative f'(4), which is correct if you work it out!

Let’s try f(5). First we convert 5 to a dual number, with the dual component being 1.

5 + 1ε

Next we need to multiply it by the constant 3 (which has a dual component of 0)

(5 + 1ε) * (3 + 0e) =
15 + 0ε + 3ε + 0ε^2 =
15 + 3ε

Now, we add the constant 2 (which has a dual component of 0 again since it’s just a constant)
(15 + 3ε) + (2 + 0ε) =
17 + 3ε

So, our answer says that f(5) = 17, and f'(5) = 3, which again you can verify is true!

The example above worked well but it was a linear function. What if we want to do a function like f(x) = 5x^2 + 4x + 1?

Let’s calculate f(2). We are going to first calculate the 5x^2 term, so we need to start by making a dual number for the function parameter x:
(2 + 1ε)

Next, we need to multiply it by itself to make x^2:
(2 + 1ε) * (2 + 1ε) =
4 + 2ε + 2ε + 1ε^2 =
4 + 4ε

(remember that ε^2 is 0, so the last term disappears)

next, we multiply that by the constant 5 to finish making the 5x^2 term:
(4 + 4ε) * (5 + 0ε) =
20 + 0ε + 20ε + 0ε^2 =
20 + 20ε

Now, putting that number aside for a second we need to calculate the “4x” term by multiplying the value we plugged in for x by the constant 4
(2 + 1ε) * (4 + 0ε) =
8 + 0ε + 4ε + 0ε^2 =
8 + 4ε

Next, we need to add the last 2 values together (the 5x^2 term and the 4x term):
(20 + 20ε) + (8 + 4ε) =
28 + 24ε

Lastly, we need to add in the last term, the constant 1
(28 + 24ε) + (1 + 0ε) =
29 + 24e

There is our answer! For the equation y = 5x^2 + 4x + 1, f(2) = 29 and f'(2) = 24. Check it, it’s correct (:

As one last example let’s calculate f(10) and f'(10) with the same function above y = 5x^2 + 4x + 1.

First, to start calculating the 5x^2 term, we need to make 10 into a dual number and multiply it by itself to make x^2:
(10 + 1ε) * (10 + 1ε) =
100 + 10ε + 10ε + 1ε^2 =
100 + 20ε

Next, we multiply by the constant 5 to finish making the 5x^2 term:
(100 + 20ε) * (5 + 0ε) =
500 + 0ε + 100ε + 0ε^2 =
500 + 100ε

Putting that aside, let’s calculate the 4x term by multiplying our x value by the constant 4:
(10 + 1ε) * (4 + 0ε) =
40 + 0ε + 4ε + 0ε^2 =
40 + 4ε

Lastly, let’s add our terms: 5x^2, 4x and the constant 1
(500 + 100ε) + (40 + 4ε) + (1 + 0ε) =
541 + 104ε

The answer tells us that for the equation y = 5x^2 + 4x + 1, f(10) = 541 and f'(10) = 104.

Sample Code

There are lots of other mathematical operations that you can do with dual numbers. I’ve collected as many as I was able to find and made up some sample code that uses them. The sample code is below, as well as the program output.

```#include
#include

#define PI 3.14159265359f

// In production code, this class should probably take a template parameter for
// it's scalar type instead of hard coding to float
class CDualNumber
{
public:
CDualNumber (float real = 0.0f, float dual = 0.0f)
: m_real(real)
, m_dual(dual)
{
}

float Real () const { return m_real; }
float Dual () const { return m_dual; }

private:
float m_real;
float m_dual;
};

//----------------------------------------------------------------------
// Math Operations
//----------------------------------------------------------------------
inline CDualNumber operator + (const CDualNumber &a, const CDualNumber &b)
{
return CDualNumber(a.Real() + b.Real(), a.Dual() + b.Dual());
}

inline CDualNumber operator - (const CDualNumber &a, const CDualNumber &b)
{
return CDualNumber(a.Real() - b.Real(), a.Dual() - b.Dual());
}

inline CDualNumber operator * (const CDualNumber &a, const CDualNumber &b)
{
return CDualNumber(
a.Real() * b.Real(),
a.Real() * b.Dual() + a.Dual() * b.Real()
);
}

inline CDualNumber operator / (const CDualNumber &a, const CDualNumber &b)
{
return CDualNumber(
a.Real() / b.Real(),
(a.Dual() * b.Real() - a.Real() * b.Dual()) / (b.Real() * b.Real())
);
}

inline CDualNumber sqrt (const CDualNumber &a)
{
float sqrtReal = ::sqrt(a.Real());
return CDualNumber(
sqrtReal,
0.5f * a.Dual() / sqrtReal
);
}

inline CDualNumber pow (const CDualNumber &a, float y)
{
return CDualNumber(
::pow(a.Real(), y),
y * a.Dual() * ::pow(a.Real(), y - 1.0f)
);
}

inline CDualNumber sin (const CDualNumber &a)
{
return CDualNumber(
::sin(a.Real()),
a.Dual() * ::cos(a.Real())
);
}

inline CDualNumber cos (const CDualNumber &a)
{
return CDualNumber(
::cos(a.Real()),
-a.Dual() * ::sin(a.Real())
);
}

inline CDualNumber tan (const CDualNumber &a)
{
return CDualNumber(
::tan(a.Real()),
a.Dual() / (::cos(a.Real()) * ::cos(a.Real()))
);
}

inline CDualNumber atan (const CDualNumber &a)
{
return CDualNumber(
::atan(a.Real()),
a.Dual() / (1.0f + a.Real() * a.Real())
);
}

inline CDualNumber SmoothStep (CDualNumber x)
{
// f(x) = 3x^2 - 2x^3
// f'(x) = 6x - 6x^2
return x * x * (CDualNumber(3) - CDualNumber(2) * x);
}

//----------------------------------------------------------------------
// Test Functions
//----------------------------------------------------------------------

void TestSmoothStep (float x)
{
CDualNumber y = SmoothStep(CDualNumber(x, 1.0f));
printf("smoothstep 3x^2-2x^3(%0.4f) = %0.4fn", x, y.Real());
printf("smoothstep 3x^2-2x^3'(%0.4f) = %0.4fnn", x, y.Dual());
}

void TestTrig (float x)
{
CDualNumber y = sin(CDualNumber(x, 1.0f));
printf("sin(%0.4f) = %0.4fn", x, y.Real());
printf("sin'(%0.4f) = %0.4fnn", x, y.Dual());

y = cos(CDualNumber(x, 1.0f));
printf("cos(%0.4f) = %0.4fn", x, y.Real());
printf("cos'(%0.4f) = %0.4fnn", x, y.Dual());

y = tan(CDualNumber(x, 1.0f));
printf("tan(%0.4f) = %0.4fn", x, y.Real());
printf("tan'(%0.4f) = %0.4fnn", x, y.Dual());

y = atan(CDualNumber(x, 1.0f));
printf("atan(%0.4f) = %0.4fn", x, y.Real());
printf("atan'(%0.4f) = %0.4fnn", x, y.Dual());
}

void TestSimple (float x)
{
CDualNumber y = CDualNumber(3.0f) / sqrt(CDualNumber(x, 1.0f));
printf("3/sqrt(%0.4f) = %0.4fn", x, y.Real());
printf("3/sqrt(%0.4f)' = %0.4fnn", x, y.Dual());

y = pow(CDualNumber(x, 1.0f) + CDualNumber(1.0f), 1.337f);
printf("(%0.4f+1)^1.337 = %0.4fn", x, y.Real());
printf("(%0.4f+1)^1.337' = %0.4fnn", x, y.Dual());
}

int main (int argc, char **argv)
{
TestSmoothStep(0.5f);
TestSmoothStep(0.75f);
TestTrig(PI * 0.25f);
TestSimple(3.0f);
return 0;
}
```

Here is the program output:

Closing Info

When you are thinking what number ε has to be so that ε^2 is 0 but ε is not 0, you may be tempted to think that it is an imaginary number, just like i (the square root of -1) that doesn’t actually exist. This is actually not how it is… I’ve seen ε described in two ways.

One way I’ve seen it described is that it’s an infinitesimal number. That sort of makes sense to me, but not in a concrete and tangible way.

The way that makes more sense to me is to describe it as a matrix like this:
[0, 1]
[0, 0]

If you multiply that matrix by itself, you will get zero(s) as a result.

In fact, an alternate way to implement the dual numbers is to treat them like a matrix like that.

I also wanted to mention that it’s possible to modify this technique to get the 2nd derivative of a function or the 3rd, or the Nth. It isn’t only limited to the 1st derivative. Check the links at the bottom of this post for more info, but essentially, if you want 1st and 2nd derivative, you need to make it so that ε^3 = 0 instead of ε^2 = 0. There is a way to do that with matrices.

Another neat thing is that you can also extend this into multiple dimensions. This is useful for situations like if you have some terrain described by mathematical functions, when you are walking the grid of terrain to make vertex information, you can get the slope / gradient / surface normal at the same time.

Lastly, I wanted to mention a different kind of number called a hyperbolic number.

The imaginary number i^2 = -1 and we can use it to do 2d rotations.

The dual number ε^2 is 0 (and ε is not 0) and we can use it to do automatic differentiation.

Hyperbolic numbers have j, and j^2 = 1 (and j is not 1). I’m not sure, but I bet they have some interesting usefulness to them too. It would be interesting to research that more sometime. If you know anything about them, please post a comment!

This shadertoy is what got me started looking into dual numbers. It’s a mandelbrot viewer done by iq using dual numbers to estimate a distance from a point to the mandelbrot set (as far as I understand it anyhow, ha!). He uses that estimated distance to color the pixels.

I didn’t get very much into the reasons of why this works (has to do with taylor series terms disappearing if ε^2 is 0), or the rigorous math behind deriving the operators, but here are some great links I found researching this stuff and putting this blog post together.

Using Imaginary Numbers To Rotate 2D Vectors

I’m a big fan of “exotic” math and programming techniques. There’s nothing I like more than seeing something common used in an uncommon way to do something that I didn’t know was possible.

In this post I share a technique that lets you use imaginary numbers (complex numbers more specifically) to be able to rotate vectors in 2d. This technique is so simple that you can even use it to rotate vectors by hand on paper!

Quick Review: Imaginary and Complex Numbers

The imaginary number “i” is the square root of -1. Without using i, you can’t take the square root of a negative number because if the answer is negative, multiplying a negative by a negative is positive, and if the answer is a positive, multiplying a positive by a positive is still a positive.

But, using i, you CAN take the square root of negative numbers.

The square root of 25 is 5.

The square root of -25 is 5*i, or just 5i, which means “5 times the square root of -1”.

You can combine a real and imaginary number into something called a complex number like this: 3 + 5i

Quick Review: Multiplying Complex Numbers

We’ll need to be able to multiply complex numbers together to do our rotations.

(3+5i)*(2+3i) = ???

Luckily, multiplying complex numbers together is as simple as using F.O.I.L. if you remember that from high school math class, it stands for First, Outer, Inner, Last.

Using that, we can multiply and then combine term, remembering that i^2 is -1:

(3+5i)*(2+3i) =
6 + 9i + 10i + 15i^2 =
6 + 19i + 15i^2 =
6 + 19i – 15 =
-9 + 19i

There we go, that’s all there is to multiplying complex numbers together!

Let’s say that we want to rotate the vector (0,1) by the angle of the vector (1,1). To do that, we just convert the vectors to complex numbers, using the x axis as the real number component, and the y axis as the imaginary number component, then multiply them, and convert back to vectors.

That’s a mouth full, so here it is, step by step.

1) Convert Vectors to Complex Numbers

(0,1) = 0 + 1i = i
(1,1) = 1 + 1i = 1 + i

2) Multiply the Complex Numbers

i * (1 + i) =
i + i^2 =
i – 1 =
-1 + i

In the above we change i – 1 to -1 + i to make the next step easier. The real component of the complex number is the X axis and the imaginary component is the Y axis.

3) Convert from Complex Number to Vector

-1 + i =
-1 + 1i =
(-1, 1)

The diagram below shows this operation:

As you can see we rotated the purple vector (0,1) which has an angle of 90 degrees and length of 1, by the blue vector (1,1) which has an angle of 45 degrees and a length of sqrt(2), and as a result we got the tan vector (-1,1) which has an angle of 135 degrees and a length of sqrt(2). So, we essentially rotated the 90 degree angle by the 45 degree angle and ended up with a 135 degree angle.

Note that order of operations doesn’t matter, you could rotate the 90 degree angle by 45 degrees, or you could rotate the 45 degree angle by 90 degrees, either way you are going to end up with 135 degrees.

A caveat with this technique though is that when you do the rotation, the resulting vector’s length will be the the product of the two source vectors used; you won’t get a normalized vector out as a result unless your source vectors are normalized, or you normalize after you are done with your operations.

As another example, let’s rotate the vector (4,3) by the vector (-12,5).

The first step is to convert them to complex numbers:
(4,3) = 4 + 3i
(-12,5) = -12 + 5i

Then we multiply them:
(4 + 3i) * (-12 + 5i) =
-48 + 20i – 36i + 15i^2 =
-48 – 16i – 15 =
-63 – 16i

Then convert back to a vector to get: (-63, -16)

In the image, you can see that we started with the blue vector (4,3) which is about 37 degrees and has a length of 5.

We rotated that vector by the purple vector (-12,5) which is about 157 degrees and has a length of 13.

That gave us the tan vector of (-63, -16) which is about 194 degrees and has a length of 65.

The resulting vector has the rotations added, and the lengths multiplied together.

Unrotating

If we multiply the complex numbers together to add rotations, does that mean that we divide the complex numbers to subtract rotations? Sort of…

If you want to subtract a rotation, you multiply the imaginary part of the vector you want to subtract by -1 (or just flip the sign) and then multiply as normal.

When you flip the sign of the imaginary part, this is actually called the “complex conjugate” but if that term scares you, feel free to ignore it 😛

As an example of unrotation, let’s take the vector (5,1) and subtract the vector (2,2) from it to see what we get.

The first step is to convert them into complex numbers.
(5,1) = 5 + i
(2,2) = 2 + 2i

Next up, we are going to flip the imaginary part of the vector we want to subtract.
2 + 2i becomes 2 – 2i

Then, we multiply as per normal:
(5 + i) * (2 – 2i) =
10 – 10i + 2i – 2i^2 =
10 – 8i + 2 =
12 – 8i

Finally, we convert back to a vector to get (12, -8).

In the image above, we started with the blue vector (5,1) which is about 11 degrees and has a length of sqrt(26).

We then unrotated by the purple vector (2,2) which is 45 degrees and has a length of sqrt(8).

That gave us the tan vector as a result (12,-8) which is about -34 degrees and has a length of sqrt(208).

Our resulting vector is the result of the second vector’s angle subtracted from the first vector’s angle, but the length of the vectors are still multiplied together, not divided.

Unrotating to Get Vector Length

As a fun aside, if you unrotate a vector by itself, the result will be that the imaginary components (the Y component) will disappear, and in the result, the real component (the x component) will be the squared length of the vector.

It’s easy enough to calculate the squared length of a vector by adding x^2 and y^2 but this is an interesting property.

In fact, in the last post I mentioned CORDIC math using imaginary numbers to rotate vectors to try and find sine, cosine, etc. This is related to how it does that work. It basically rotates or unrotates a vector by smaller and smaller angles iteratively, based on the sign of the y (imaginary) component to try and get y to zero, which leaves the answer in the x component. It also has some other magic sprinkled in to where it only has to deal with integer math.

Hopefully before too long I’ll be able to make a CORDIC blog post and talk more about that in detail.

Example Code

Ok so this theory stuff is all well and good but how about some code?

Before we get to that I want to give the naked formulas for rotating or unrotating vectors.

Given two vectors (A.x, A.y) and (B.x, B.y)…

Rotating A by B to get C:
C.x = A.x*B.x – A.y*B.y
C.y = A.x*B.y + A.y*B.x

Note: In the resulting vector C, the angles will be added, but the vector lengths will be multiplied.

Unrotating A by B to get C, we just flip the sign of any terms that contain B.y:
C.x = A.x*B.x + A.y*B.y
C.y = -A.x*B.y + A.y*B.x

Note: In the resulting vector C, the angles will be subtracted, but the vector lengths will be multiplied.

Below is some sample code, along with the output, showing our examples above working correctly.

```#include

struct SVector
{
float x;
float y;
};

void Rotate (const SVector &A, const SVector &B, SVector &C)
{
C.x = A.x * B.x - A.y * B.y;
C.y = A.x * B.y + A.y * B.x;
}

void Unrotate (const SVector &A, const SVector &B, SVector &C)
{
C.x = A.x * B.x + A.y * B.y;
C.y = -A.x * B.y + A.y * B.x;
}

void print (const SVector &V)
{
printf("(%0.2f,%0.2f)", V.x, V.y);
}

int main(int argc, char **argv)
{
{
SVector testA = {0.0f, 1.0f};
SVector testB = {1.0f, 1.0f};
SVector testC = {0.0f, 0.0f};
Rotate(testA, testB, testC);
printf("Rotating ");
print(testA);
printf(" by ");
print(testB);
printf(" = ");
print(testC);
printf("n");
}
{
SVector testA = {4.0f, 3.0f};
SVector testB = {-12.0f, 5.0f};
SVector testC = {0.0f, 0.0f};
Rotate(testA, testB, testC);
printf("Rotating ");
print(testA);
printf(" by ");
print(testB);
printf(" = ");
print(testC);
printf("n");
}
{
SVector testA = {5.0f, 1.0f};
SVector testB = {2.0f, 2.0f};
SVector testC = {0.0f, 0.0f};
Unrotate(testA, testB, testC);
printf("Unrotating ");
print(testA);
printf(" by ");
print(testB);
printf(" = ");
print(testC);
printf("n");
}
return 0;
}
```

Program output:

Check out these links for more details:
Better Explained: A Visual, Intuitive Guide to Imaginary Numbers
Better Explained: Intuitive Arithmetic With Complex Numbers

Four Ways to Calculate Sine Without Trig

Is it possible to sin without trig? That is a question that has plagued theologians for centuries. As evil as trigonometry is, modern science shows us that yes, it is possible to sin without trig. Here are some ways that I’ve come across.

1 – Slope Iteration Method

The above image uses 1024 samples from 0 to 2pi to aproximate sine iteratively using it’s slope. Red is true sin, green is the aproximation, and yellow is where they overlap.

This method comes from Game Programming Gems 8, which you can snag a copy of from amazon below if you are interested. It’s mentioned in chapter 6.1 A Practical DSP Radio Effect (which is a really cool read btw!).
Amazon: Game Programming Gems 8

This method uses calculus but is actually pretty straightforward and intuitive – and very surprising to me that it works so well!

The derivative of sin(x) is cos(x). That means, for any x you plug into sin, the slope of the function at that point is the cosine value of that same x.

In other words, sin(0) is 0, but it has a slope of cos(0) which is 1. Since slope is rise over run (change in y / change in x) that means that at sin(0), if you take an infinitely small step forward on x, you need to take the same sized step on y. That will get you to the next value of sine.

Let’s test that out!
sin(0) = 0 so we start at (0,0) on the graph.

If we try a step size of 0.1, our approximation is:
sin(0.1) = 0.1

The actual value according to google is 0.09983341664. so our error was about 0.0002. That is actually pretty close!

sin(0.25) = 0.25

The real value is 0.24740395925, so our error is about 0.003. We have about 10 times the error that we did at 0.1.

what if we try it with 0.5?
sin(0.5) = 0.5

The actual value is 0.4794255386, which leaves us with an error of about 0.02. Our error is 100 times as much as it was at 0.1. As you can see, the error increases as our step size gets larger.

If we wanted to, we could get the slope (cosine value) at the new x value and take another step. we could continue doing this to get our sine approximation, knowing that the smaller the step that we use, the more accurate our sine approximation would be.

We haven’t actually won anything at this point though, because we are just using cosine to take approximated steps through sine values. We are paying the cost of calculating cosine, so we might as well just calculate sine directly instead.

Well luckily, cosine has a similar relationship with sine; the derivative of cos(x) is -sin(x).

Now, we can use cosine values to step through sine values, and use those same sine values to step through cosine values.

Since we know that cos(0) = 1.0 and sin(0) = 0.0, we can start at an angle of 0 with those values and we can iteratively step through the values.

Here is a sample function in C++

```// theta is the angle we want the sine value of.
// higher resolution means smaller step size AKA more accuracy but higher computational cost.
// I used a resolution value of 1024 in the image at the top of this section.
float SineApproximation (float theta, float resolution)
{
// calculate the stepDelta for our angle.
// resolution is the number of samples we calculate from 0 to 2pi radians
const float TwoPi = 6.28318530718f;
const float stepDelta = (TwoPi / resolution);

// initialize our starting values
float angle = 0.0;
float vcos = 1.0;
float vsin = 0.0;

// while we are less than our desired angle
while(angle < theta) {

// calculate our step size on the y axis for our step size on the x axis.
float vcosscaled = vcos * stepDelta;
float vsinscaled = vsin * stepDelta;

// take a step on the x axis
angle += stepDelta;

// take a step on the y axis
vsin += vcosscaled;
vcos -= vsinscaled;
}

// return the value we calculated
return vsin;
}
```

Note that the higher the angle you use, the more the error rate accumulates. One way to help this would be to make sure that theta was between 0 and 2pi, or you could even just calculate between 0 and pi/2 and mirror / reflect the values for the other quadrants.

This function is quite a bit of work to calculate a single value of sine but it’s real power comes in the fact that it’s iterative. If you ever find yourself in a situation where you need progressive values of sine, and have some fixed angle step size through the sine values, this algorithm just needs to do a couple multiplies and adds to get to the next value of sine.

One great use of this could be in DSP / audio synthesis, for sine wave generation. Another good use could be in efficiently evaluating trigonometry based splines (a future topic I plan to make a post about!).

You can see this in action in this shadertoy or look below at the screenshots:

64 Samples – Red is true sine, green is our approximation, and yellow is where they are the same

128 Samples

256 Samples

1024 Samples

2 – Taylor Series Method

Another way to calculate sine is by using an infinite Taylor series. Thanks to my friend Yuval for showing me this method.

You can get the Taylor series for sine by typing “series sin(x)” into wolfram alpha. You can see that here: Wolfram Alpha: series sin(x).

Wolfram alpha says the series is: x-x^3/6+x^5/120-x^7/5040+x^9/362880-x^11/39916800 ….

what this means is that if you plug a value in for x, you will get an approximation of sine for that x value. It’s an infinite series, but you can do as few or as many terms as you want to be able to trade off speed for accuracy.

For instance check out these graphs.

Google: graph y = x-x^3/6+x^5/120-x^7/5040+x^9/362880-x^11/39916800, y = sin(x)
(Note that I had to zoom out a bit to show where it became inaccurate)

When looking at these graphs, you’ll probably notice that very early on, the approximation is pretty good between -Pi/2 and + Pi/2. I leveraged that by only using those values (with modulus) and mirroring them to be able to get a sine value of any angle with more accuracy.

When using just x-x^3/6, there was an obvious problem at the top and bottom of the sine waves:

When i boosted the equation with another term, bringing it to x-x^3/6+x^5/120, my sine approximation was much better:

You can see this method in action in this shadertoy:

3 – Smoothstep Method

The third method may be my favorite, due to it’s sheer elegance and simplicity. Thanks to P_Malin on shadertoy.com for sharing this one with me.

There’s a function in graphics called “smoothstep” that is used to take the hard linear edge off of things, and give it a smoother, more organic feel. You can read more about it here: Wikipedia: Smoothstep.

BTW if you haven’t read the last post, I talk about how smooth step is really just a 1d bezier curve with specific control points (One Dimensional Bezier Curves). Also, smoothstep is just this function: y = (3x^2 – 2x^3).

Anyhow, if you have a triangle wave that has values from 0 to 1 on the y axis, and put it through a smoothstep, then scale it to -1 to +1 on the y axis, you get a pretty darn good sine approximation out.

Here is a step by step recipe:

Step 1 – Make a triangle wave that has values on the y axis from 0 to 1

Step 2 – Put that triangle wave through smoothstep (3x^2 – 2x^3)

Step 3 – Scale the result to having values from -1 to +1 on the axis.

That is pretty good isn’t it?

You can see this in action in this shadertoy (thanks to shadertoy’s Dave_Hoskins for some help with improving the code):

After I made that shadertoy, IQ, the creator of shadertoy who is an amazing programmer and an impressive “demoscene” guy, said that he experimented with removing the error from that technique to try to get a better sin/cos aproximation.

You can see that here: Shadertoy: Sincos approximation

Also, I recommend checking out IQ’s website. He has a lot of interesting topics on there: IQ’s Website

4 – CORDIC Mathematics

This fourth way is a method that cheaper calculators use to calculate trig functions, and other things as well.

I haven’t taken a whole lot of time to understand the details, but it looks like it’s using imaginary numbers to rotate vectors iteratively, doing a binary search across the search space to find the desired values.

The benefit of this technique is that it can be implemented with VERY little hardware support.

The number of logic gates for the implementation of a CORDIC is roughly comparable to the number required for a multiplier as both require combinations of shifts and additions.
Wikipedia: Coordinate Rotation Digital Computer

One Dimensional Bezier Curves

I was recently looking at the formula for bezier curves:

A * (1-T)^2 + B * 2 * (1-T) * T + C * T ^2

Cubic Bezier curve:
A*(1-T)^3+3*B*(1-T)^2*T+3*C*(1-T)*T^2+D*T^3

And I wondered… since you can have a Bezier curve in any dimension, what would happen if you made the control points (A,B,C,D) scalar? In other words, what would happen if you made bezier curves 1 dimensional?

Well it turns out something pretty interesting happens. If you replace T with x, you end up with f(x) functions, like the below:

y = A * (1-x)^2 + B * 2 * (1-x) * x + C * x ^2

Cubic Bezier curve:
y = A*(1-x)^3+3*B*(1-x)^2*x+3*C*(1-x)*x^2+D*x^3

What makes that so interesting is that most math operations you may want to do on a bezier curve are a lot easier using y=f(x), instead of the parameterized formula Point = F(S,T).

For instance, try to calculate where a line segment intersects with a parameterized bezier curve, and then try it again with a quadratic equation. Or, try calculating the indefinite integral of the parameterized curve so that you can find the area underneath it (like, for Analytic Fog Density). Or, try to even find the given Y location that the curve has for a specific X value. These things are pretty difficult with a parameterized function, but a lot easier with the y=f(x) style function.

This ease of use does come at a price though. Specifically, you can’t move control points freely, you can only move them up and down and cannot move them left and right. If you are ok with that trade off, these 1 dimensional curves can be pretty powerful.

Below is an image of a 1 dimensional cubic bezier curve that has control points A = 0.5 B = 0.25 C = 0.75 D = 0.5. The function of this curve is y = 0.5 * (1-x)^3 + 0.25 * 3x(1-x)^2 + 0.75 * 3x^2(1-x) + 0.5 * x^3.

You can ask google to graph that for you to see that it is in fact the same: Google: graph y = 0.5 * (1-x)^3 + 0.25 * 3x(1-x)^2 + 0.75 * 3x^2(1-x) + 0.5 * x^3

Another benefit to these one dimensional bezier curves is that you could kind of use them as a “curve fitting” method. If you have some data that you wanted to approximate with a quadratic function, you could adjust the control points of a one dimensional quadratic Bezier curve to match your data set. If you need more control points to get a better aproximation of the data, you can just increase the degree of the bezier curve (check this out for more info on how to do that: Bezier Curves Part 2 (and Bezier Surfaces)).

Smoothstep as a Cubic 1d Bezier Curve

It’s kind of out of the scope of this post to talk about why smoothstep is awesome, but to give you strong evidence that it is, it’s a GLSL built in function. You may have also seen it used in the post I made about distance fields (Distance Field Textures), because one of it’s common uses is to make the edges of things look smoother. Here’s a wikipedia page on it as well if you want more info: Wikipedia: Smoothstep

Anyhow, I had no idea, but apparently the smoothstep equation is the same as if you take a 1d cubic bezier curve and make the first two control points 0, and the last two control points 1.

The equation for smoothstep is: y = 3*x^2 – 2*x^3

The equation for the bezier curve i mentioned is: y = 0*(1-x)^3+3*0*(1-x)^2*x+3*1*(1-x)*x^2+1*x^3

otherwise known as: y = 3*1*(1-x)*x^2+1*x^3

If you work it out, those are the same equations! Wolfram alpha can verify that for us even: Wolfram Alpha: does 3*x^2 – 2*x^3 = 3*1*(1-x)*x^2+1*x^3.

Kinda neat 😛

Moving Control Points on the X Axis

There’s a way you could fake moving control points on the X axis (left and right) if you really needed to. What I’m thinking is basically that you could scale X before you plug it into the equation.

For instance, if you moved the last control point to the left so that it was at 0.9 instead of 1.0, the space between the 3rd and 4th control point is now .23 instead of .33 on the x axis. You could simulate that by having some code like this:

```if (x > 0.66)
x = 0.66 + (x - 0.66) / 0.33 * 0.23
```

Basically, we are squishing the X values that are between 0.66 and 1.0 into 0.66 to 0.9. This is the x value we want to use, but we’d still plug the raw, unscaled x value into the function to get the y value for that x.

As another example, let’s say you moved the 3rd control point left from 0.66 to 0.5. In this situation, you would squish the X values that were between 0.33 and 0.66 into 0.33 to 0.5. HOWEVER, you would also need to EXPAND the values that were between 0.66 and 1.0 to be from 0.5 and 1.0. Since you only moved the 3rd control point left, you’d have to squish the section to the left of the control point, and expand the section to the right to make up the difference to keep the 4th control point in the same place. The code for converting X values is a little more complex, but not too bad.

What happens if you move the first control point left or right? Well, basically you have to expand or squish the first section, but you also need to add or subtract an offset for the x as well.

I’ll leave the last 2 conversions as an exercise for whoever might want to give this a shot 😛

Another complication that comes up with the above is, what if you tried to move the 3rd control point to the left, past the 2nd control point? Here are a couple ways I can think of off hand to deal with it, but there are probably other ways too, and the right way to deal with it depends on your needs.

1. Don’t let them do it – If a control point tries to move past another control point, just prevent it from happening
2. Switch the control points – If you move control point 3 to the left past control point 2, secretly have them start controling control point 2 as they drag it left. As far as the user is concerned, it’s doing what they want, but as far as we are concerned, the control points never actually crossed
3. Move both – if you move control point 3 to the left past control point 2, take control point 2 along for the ride, keeping it to the left of control point 3

When allowing this fake x axis movement, it does complicate the math a bit, which might bite you depending on what you are doing with the curve. Intersecting a line segment with it for example is going to be more complex.

An alternative to this would be letting the control points move on the X axis by letting a user control a curve that controls the X axis location of the control points – hopefully this would happen behind the scenes and they would just move points in X & Y, not directly editing the curve that controls X position of control points. This is a step towards making the math simpler again, by modifying the bezier curve function, instead of requiring if statements and loops, but as far as all the possibly functions I can think of, moving one control point on the X axis is probably going to move other control points around. Also, it will probably change the shape of the graph. It might change in a reasonable way, or it might be totally unreasonable and not be a viable alternative. I’m not really sure, but maybe someday I’ll play around with it – or if you do, please post a comment back and let us know how it went for you!

Here are some links to experiment with these guys and see them in action:
HTML5 Interactive 1D Quadratic Bezier Demo
HTML5 Interactive 1D Cubic Bezier Demo
Shadertoy: Interactive 1D Cubic Bezier Demo

Counting in Binary is a Fractal

I can’t remember for sure but I think I read about this first in A New Kind Of Science by Stephen Wolfram (yes, the same guy who made Wolfram Alpha!). Counting in binary is actually a fractal, check out these 4 sequential images…

Real Time Demo

To see it in action, check out this WebGL shadertoy:

Wang Tiling

Wang tiling is a really cool concept… it’s a good way to use 2d tiled graphics in such a way that can look very organic, without discernable patterns.

The basic idea of how they work is that each tile has a type of edge. You start by placing a random tile, and then you start putting down it’s neighboring tiles. When you place a tile, the rule is you can only put down a tile that has compatible edge types (aka the tiles can go together seamlessly). Rinse and repeat and pretty soon you have tile based graphics that don’t look tiled at all.

Specifically here is a strategy I like to use for filling a grid with wang tiles:

1. Place any random tile in the upper left corner
2. Put a tile below it that has an edge on it’s top that is compatible with the already placed tile’s bottom edge
3. Continue placing tiles downward until you reach the bottom of the column
4. Now, move back to the top, move over to the next column, and now place a tile such that the left edge is compatible with the right edge of the tile it is next to.
5. Moving down, you now have to find a tile which is compatible with both the tile above it, and the tile to the left. Since there are going to be multiple tiles that fit these constraints, just choose randomly from the ones that do.
6. Rinse and repeat until the grid is filled

There is a lot more info out there (links at bottom of post) so I’ll leave it at that and show you some results I got with some simple tiles.

The tiles I used are very geometric, but if you have more organic looking tiles, the resulting tile grid will look a lot more organic as well.

Also, as the links at the bottom will tell you, if you have wang tiles where each axis has only 2 edge types, even though the number of permutations of tiles in that situation is 16 (XVariation^2 * YVariation^2), you can actually get away with just using 8 tiles (XVariation * YVariation * 2). In my example below I had to use all 16 though because I’m just generating edge types in a pixel shader without deeply analyzing neighboring tiles, and it would be a lot more complex to limit my generation to just the 8 tiles. If you can think of a nice way to generate a wang tile grid using only the 8 tiles though, please let me know!

The 16 wang tiles used:

A resulting grid:

Here’s a more complex set of 16 wang tiles:

And a resulting grid:

Wang Tiling Research Paper

Introduction to Wang Tiles

By the way… something really crazy about wang tiles is that apparently they can be used to do computation and they are turing complete. Seriously? Yes! Check out the link below:

Computing with Tiles

Temporal supersampling, flipquads and real time raytracing

Follow me on this train of thought 😛

1) There’s this thing called super sampling where you render an image at a larger resolution, so that you can properly downsample it to the right size (the size of your screen for instance) to avoid aliasing problems. The problem here is that you are rendering more pixels, so it is more expensive to render, which is usually a deal breaker for real time applications that are trying to push the envelope of performance – like modern games.

2) There’s a way to get around this with something called Temporal Supersampling where you use the last frame rendered to provide extra information for the current frame, so that in a way, you get supersampled data by spreading it out over 2 frames. (More info on supersampling: Temporal supersampling). You get better results by jittering (offseting) the pixels you render from frame to frame, by a sub-pixel amount. This is the usual monte carlo sampling kind of situation… find some cheap but well behaving pseudorandom number generator you can run in your pixel shader to offset each pixel by, or use a regular pattern of some sort that gives good enough results.

3) That gives you 2 samples if you only compost the last and current frame, but more samples is better of course. You could keep more frames from the past around, but that takes up the precious resource of memory. Apparently, when the hardware does MSAA (multisampling antialiasing), it has different configurations for different numbers of samples and it’s configurable somehow. If you have 2 samples, they may be 2 vertical dots, or 2 horizontal dots. If you have 3 samples, it might look like a “3” on a domino. If you have 5 samples it might look like a “5” on a domino.

4) Sometimes a corner will be sampled so that a sample can be shared across multiple pixels to increase efficiency. There is this really interesting thing called “flipquads” that samples on an edge for that same reason. You can see some info on here: An Extremely Inexpensive Multisampling Scheme. Basically, you only do two samples per pixel, sampling at 2 of the 4 sample locations on the edge of a pixel, so that the pixels that share the edge can use the results. Effectively, you are doing 2 sample per pixel, but getting 4 samples per pixel due to sample sharing.

5) If you combine flipquads with temporal supersampling, it means that you get 4 samples for the cost of 2, amortized over 2 frames. So, you essentially just render the normal amount of pixels (1 sample per pixel), compost frame N against frame N-1, and get the benefit of a 4 tap MSAA. So, it’s really cheap, and yes… it does actually help significantly, despite the fact that so many samples are redundant.

None of the above is anything new… I watched it all in various SIGGRAPH 2014 presentations earlier today from big name modern games – and man am i amazed what people are doing these days!

Now for the new part…

One way for raytracers to get better visual quality is to do multiple rays per sample, doing monte carlo sampling, where each of the rays in the group is perturbed by tiny amounts. Some details here: Advanced Topics in Computer Graphics: Sampling Techniques

In my own personal OpenCL real time raytracer, I don’t have the luxury of doing multiple rays per pixel – and in fact, I have a graphics option that allows you to render only half the screen (top / bottom) each frame alternating, to cut the number of rays down so that it runs faster!

What if a person was able to do temporal supersampling with a realtime raytracer, using flipquads to make it so it could get the information of 4 rays per pixel, while only taking a single ray cast per pixel each frame? Wouldn’t that be something?

There are some technical details to work out but I think there is some real magic here waiting to happen.

The biggest technical problem I foresee is reprojecting the pixels from the last frame to the current frame. This probably would work ok if your rays had a strict projection matrix governing them, but there may be difficulties with reflection and refraction, and honestly, I personally want to distort camera rays for game effects (like being underwater) so wouldn’t want to be stuck with a strict projection matrix. Maybe there’s some clever solution to make it all ok though…

Also – the link to flipquads is actually an explanation of “fliptris” a technique using 1.25 samples per pixel. If that were amortized across 2 frames, that means you would only need to cast 62% of your rays theoretically. That might be a nice performance win, while gaining the benefits of temporal supersampling and ultimately having 3 samples for each pixel!

What if My Equation DOESN’T Equal Zero??

Take a simple equation such as y = 2x. You can transform that into the equation 2x – y = 0, and then could write it as f(x,y) = 2x – y = 0.

Now we have some function of x and y that equals zero. If we plug in numbers that are valid points in the original equation y = 2x (in other words, coordinates where y is double x), we’ll get zero out as a result:

```f(x,y) = 2x - y

f(1,2) = 2 * 1 - 2 = 0
f(2,4) = 2 * 2 - 4 = 0
f(0.5, 1.0) = 2 * 0.5 - 1.0 = 0
```

If we plug numbers in that don’t fit that pattern, we get non zero answers as a result:

```f(x,y) = 2x - y

f(2,1) = 2 * 2 - 1 = 3
f(5,5) = 2 * 5 - 5 = 5
f(-10, 20) = 2 * -10 - 20 = -40
```

What do these numbers mean? Are they useful for anything? As it turns out, they do have meaning and they are useful! They are part of what is needed to calculate the distance from the point (x,y) to the closest point on the curve. The other part of what we need is called the “gradient vector”.

As you probably know, there is a tangent vector at every point on a curve (let’s ignore things like asymptotes for now), and the tangent vector basically points from one point on the curve to the next point on the curve. The tangent vector can also be seen as a slope of that point on the curve.

A gradient vector is another property of a point, but it is different than a tangent vector in a couple ways. First off, it is perpendicular to the tangent vector, so sticks out away from the curve, and could be seen as a normal vector. Secondly, every point in the graph has a gradient vector, even if that point isn’t part of the curve and is just floating by itself in empty space.

The interesting part here is that a gradient vector of a point not on the curve will actually lead you straight to the closest point on the curve to the point. This is because of a handy property where the line connecting a point to the closest point on the curve will be perpendicular to the curve (perpendicular to the tangent).

So, if we can calculate a gradient vector for a point, we have the vector that points towards the closest point on the curve. Technically it might be pointing away from the closest point on the curve, instead of towards it but that gets cleared up in the math.

The gradient vector is just a partial derivative for each of the variables of the equation. If you have no idea what I’m talking about, here are 2 options:

1) You can watch some videos and learn about it here: Khan Academy: Multivariable calculus
2) You can have wolfram alpha calculate it for you, by typing “gradient” before the function. Click this link to see what I mean: Wolfram Alpha: gradient x^2-y

Here are some example equations, along with their gradient vector, to make sure that you either understand what’s going on, or are properly entering your equations into wolfram alpha. Note that a gradient vector may not always be constant, in which case, it changes value for each point! If it is constant, it’s the same gradient vector across the entire graph space.

line:
f(x,y) = 2x – y

cubic function:
f(x,y) = x^3 – 10y

sine:
f(x,y) = sin(x) – y

circle:
f(x,y) = x^2 + y^2 – 5

Calculating Distance

Ok, so now that we have the value of our equation at some point (x,y) and we have the ability to find our gradient vector at that point, we have everything we need to calculate the distance from the point to the closest point on the curve.

All you need to do is divide the absolute value of the equation value at that point by the length of the gradient vector at that point.

Example 1 – Line

Let’s start with something super simple… y = x, aka x – y = 0 aka f(x,y) = x – y.

the gradient of that function is (1, -1) at every point on the graph. The magnitude of the gradient vector is the square root of 2.

What is the distance from the line to point P(2,0)?

Well.. the value of the equation at that point f(2,0) is 2, and the absolute value of that is still 2.

Next we divide that value by the magnitude of the gradient to get 2 / square root(2), which is just square root (2). So, the distance is square root of 2.

Since the value of the function at that point was positive before taking an absolute value, that means that the gradient points from the line to the point, so you can flip the gradient around to get (-1,1), and that is the direction that the closest point on the line is from the point P(2,0). If we travel square root of 2 units from P(2,0) down the vector (-1,1) we will get to the point (1,1) which is on the graph. That is the closest point on the graph to our point.

Note that it works out in this case, but usually the gradient vector won’t be the right magnitude since it’s not normalized. It happened that in this case it is, by dumb luck, but it usually isn’t.

Example 2 – Another Line

Let’s try a more complex line. y = 2x aka 2x – y = 0 aka f(x,y) = 2x – y.

the gradient of that function is (2, -1) with a magnitude of square root of 5 at all points on the graph.

What is the distance from that line to point P(1,5)?

Well, the value of the function at (1,5) is -3. When you take the absolute value of that, it becomes 3.

So, the distance is 3 / square root (5).

In this case, since the value of the function at the point was negative before we took the absolute value, it means that the gradient points from the point to the graph, so the line y=2x is 3/square root(5) units of length away from the point P(1,5) and the direction to travel from the point to get to the line is the same direction as the gradient vector at that point aka (2,1).

For the last example lets do y=x^2 aka x^2 – y = 0 aka f(x,y) = x^2 – y.

The gradient of that function is (2x,-1) and since it isn’t constant at all points, we can’t get the magnitude of the gradient yet.

What is the distance from that function to point P(2,0)?

Well, the value of f(2,0) is 4, which is still 4 if you take the absolute value.

The gradient at P(2,0) is (4,-1) with a magnitude of square root of 5.

So, the distance from point P(2,0) to the curve f(x,y) = x^2 – y is 4/square root(5) and to get from the point to the curve you should travel that distance down the same direction as the vector (4,-1).

Caveat

Ok ok… the other shoe dropping here is that this distance calculation is only a distance ESTIMATE, and not always the right answer.

This is because if the gradient is a constant or linear function, it will be correct, but higher order gradients don’t always follow straight lines from the point to the curve.

The good news though is that this estimation will be less than or equal to the actual distance. Maybe not as useful as a for sure distance calculation, but an upper bound estimate still has it’s uses (like in ray marching, or drawing vector graphics).

Basically, the straight line tells you the closest it could possibly be, but the actual gradient may take a longer, curving path, from the point to the curve.

Here’s the article that got me started down this path, trying to find the answer to this question, while trying to understand the main topic:
IQ: Distance Estimation

And here’s a link that really helped me understand what a gradient was, and what it was all about:

This is a little bit different than the normal kinds of topics I write about, but I found it pretty interesting so wanted to share with you guys. (Thanks for sharing the idea with me Paul!)

This is a “theory of management” type of thing that helps explain why people act certain ways in specific situations, and what sort of management style they need to be successful. I might butcher the details a little bit, but if you want to read about it in more depth, you might start with this wikipedia page: Wikipedia: Situation Leadership.

Let’s kick things off with a diagram:

The idea is that basically, when someone is put into a new situation, they will generally progress from 1 to 4(*). This can be either hiring a programmer fresh out of college, or taking an experienced programmer from one team and putting him onto another team, or presumably could also refer to an entire team tackling a new type of problem.

(*) I sort of hate generalizing about how people act, and cookie-cutter-ing people and situations, but I take this as a sort of general guideline, instead of a hard and fast rule for every person and every situation.

Step 1

At step 1, a person is overly confident and think that they know what needs to be done, but in reality they don’t have the skills and/or information needed to actually do what is needed – but they don’t know it yet. A person at this stage needs guidance or pairing with another person to keep from doing reckless things and to keep them on the path towards success.

Step 2

At step 2, a person has learned a bit more about things and realizes it’s a bit more challenging than they thought. There is some drop in morale at this stage, and is where people might contemplate giving up, switching teams, switching companies, or switching careers. At this stage, a person needs “beer and hugs” I’m told. Maybe they also need some smaller, isolated tasks, to give them a sense of accomplishment. Maybe tasks drawing on their passions or previous experience to give them some victories to help them get over the hump.

How often have you found yourself in the mindset? Maybe on a new team, at a new company, working with new technology, and feeling completely overwhelmed, thinking things are too difficult and maybe even wanting to give up, or feeling like you are not preforming well enough?

Chances are, everyone’s felt that way. I know I have! I’ve worked at something like 7 companies in 13 years and early in my career i felt that way A LOT, and OFTEN. It happens less as the years go on, and I find that more skills carry over than in previous days, but it still happens from time to time. That’s a good thing though, because if it didn’t, it would mean I wasn’t learning and growing, and stagnation is no good.

The good news is that this feeling is normal, it’s typical, and if you have a good manager, they’ve seen it many times before and expect it. No, you aren’t under-performing, you aren’t under-qualified, and you aren’t a slacker. But keep working hard anyways so you can get out of this stage!

Step 3

At step 3, a person has begun learning more and is getting more proficient, and starting to feel better about things. Keep on trucking!

Step 4

At step 4, a person has mastery over the subject matter and is feeling good about things. At this point they know what they are doing and are confident, and need a bit of a longer leash to be able to go out into the weeds a little bit to feel good about their work, even if it seems a bit silly. People who are truly passionate about what they do want to be trusted to do the right thing, and they want the freedom to pursue their interests. They’ve worked hard, through challenges both with the work and psychological, and now they are effective and hopefully pleasant – so hopefully they’ve deserved a little bit of diversion time hehe.

Plus! if you’ve done any programming in the areas of genetic algorithms, or training neural networks, you may remember that if you only let the winners reproduce (in genetic algorithms), or you only activate the winning neurons while training (for neural networks), you will likely end up in a local minima, instead of the global minimum. That “longer leash” time of letting people wander into the woods a little bit is kind of like mutation, or letting some of the losers reproduce. They may very well come back with something way better than you ever considered.

Super tangent – someone recently told me that path finding has the same “local minima” problem, and that if you pursue some of the “not so great” paths while searching a pathing space, you can get better results.

Outside of the Box

Anyways….

While this may be a useful tool for helping to understand people’s motivations, and helping to give them what they need to succeed and be happy and effective, this isn’t the whole story of course. Just like there is no such thing as a straight line in nature (AFAIK, but not sure what happens below the plank scale!), this doesn’t exactly match reality. It’s just a useful guideline.

You might also find this interesting, as a different take on the transition from step 1 to step 2: Wikipedia: Dunning-Kruger Effect

Recently I’ve been thinking about a few topics like this that are game dev related, but not about specific algorithms. You’ll probably see some more of this sort of thing smattered in amongst the cool algorithms I stumble on going forward (:

Distance Field Textures

A friend recently turned me onto a really cool paper (thanks James!) that Valve wrote that allows you to encode monochromatic (black & white) textures in a way that they can be incredibly low resolution, but when you scale them up, they still look crisp and smooth, not blurry or pixelated.

It is really quite amazing and is perfect for things like fonts or decals.

I recommend reading the paper, but below are some details to help you implement this in your own application, and also some examples of things taken to the extreme.

Here’s a really easy to use program that can turn fonts or SVG files into distance field images: signed-distance-field-font-generator

Implementation

Ok so, in a signed distance field texture, the alpha value of each pixel is a value of how far that pixel is from the edge of the shape. In a signed distance field, you essentially take the value which is from 0 to 1, and you subtract 0.5 and multiply by 2 so that you change it from 0-1 to -1 to +1. Negative distances mean the pixel is inside the shape, Positive distances mean the pixel is outside the shape.

You only need to do that math if you care about the exact distance though. If you only care about whether the pixel is inside or outside the shape, you can just consider values less than 0.5 to be inside the shape, and values greater than 0.5 to be outside the shape. In other words, you could just do an ALPHA TEST against 0.5 to render these guys.

Here’s an excerpt of some OpenCL code that does this:

```float alpha = read_imagef(tex3dIn, g_textureSampler, textureCoords).w;
float3 color = (alpha < 0.5f) ? (float3)(1.0f) : (float3)(0.0f);
```

I'll refer to that code as the "Alpha Test" code.

Another way to do it would be to use smoothstep to smooth the jaggies out a bit. Here's an excerpt of some OpenCL code that does that:

const float smoothing = 1.0/64.0;
float distance = read_imagef(tex3dIn, g_textureSampler, textureCoords).w;
float alpha = Saturate(smoothstep(0.5 – smoothing, 0.5 + smoothing, distance));
float3 color = (float3)(1.0f – alpha);

In the above, the smoothing constant can be adjusted to change how it smooths out the jaggies.

Note that even though the texture is monochromatic, you could use the color channel in the texture if you wanted to, or multiply the color by some other color to make it a colored image.

Here are the two source images I used. The first one is of the "Comic Sans" font which I doubled vertically since my textures have to be square, and the second one is a mustache SVG vector graphics image I found online. The font image is 512×512 and the mustache is 128×128.

Distance Field Textures in Action

Here’s a shot of the texture usages rendered from a distance:

Font in Action

Here’s a shot of the text close up with the alpha test code:

Here’s the same shot, using the smooth step code. Keep in mind that the “8” you are looking at is about 32×32 pixels 😛

Here’s the text taken from 512×512 down to 256×256, rendered with the alpha test code. You can already see degradation unfortunately but the look at the pictures above and remember that the full font texture is essentially 512×256 (I doubled it because my textures have to be square) and looks great up close:

Here’s the 256×256 font texture again, this time rendered with smooth step. A little bit better, but still pretty bad (but not bad for the resolution of the source font texture!):

Decal in Action

Here’s the mustache decal, which has a source image size of 128×128, rendered with the alpha test code:

Here’s the mustache rendered with the smooth step code:

Now it starts to get interesting. Here it is at 64×64 with alpha test code:

And now 64×64 with smooth step:

Here’s 32×32 with alpha test:

Here’s 32×32 with smooth step:

Here’s 16×16 with alpha test:

And lastly, here’s 16×16 with smooth step. Not freaking bad for a 16×16 texture right??!!!