Scaling Points In a Specific Direction

In this post I’ll be showing two different ways to scale points along a specific direction (a vector). There isn’t anything novel here, but it gives an example of the type of math encountered in game development, and how you might approach a solution. The first method involves matrices, and the second involves vector projections.

The points in the red circle below are scaled along the dark green vector, with the magnitude of the bright green vector.

The C++ code that made this diagram can be found at https://github.com/Atrix256/ScalePoints. It implements the vector projection method.

This problem came up for me when implementing a Petzval lens effect (https://bbhphoto.org/2017/04/24/the-petzval-lens/) in a bokeh depth of field (https://blog.demofox.org/2018/07/04/pathtraced-depth-of-field-bokeh/) post processing rendering technique. The setup is that I take point samples in the shape of the camera aperture for bokeh and depth of field, and I needed to stretch this sampled shape in a specific direction and magnitude, based on where the pixel is on the screen.

In the end, I just needed to scale points in a specific direction, by a specific amount. If we can do this operation to 1 point, we can do it to N points, so we’ll focus on doing this to a single point.

Method 1 – Matrices

Using matrices to scale a point along a specific direction involves three steps:

  1. Rotate the point so that the direction we want to scale is aligned with the X axis.
  2. Multiply the x axis value by the scaling amount.
  3. Unrotate the point back to the original orientation.

Our vectors are going to be row vectors. For a deep dive on other matrix and vector conventions and reasons to choose one way or another, read this great post by Jasper St. Pierre: https://blog.mecheye.net/2024/10/the-ultimate-guide-to-matrix-multiplication-and-ordering/

Let’s say we want to scale a point by 4 along the vector [3,1].

First we normalize that vector to [3/sqrt(10), 1/sqrt(10)]. Then we need to get the rotation matrix that will transform [3/sqrt(10), 1/sqrt(10)] to [1,0]. We can do that by making a matrix where the first column is where we want the x axis to point, and the second column is where we want the y axis to point. For the x axis, we use the normalized vector we already have. For the y axis, we just need a perpendicular vector, which we can get by swapping the x and y components of the vector and negating one, to get [-1/sqrt(10), 3/sqrt(10)]. This operation of swapping x and y and negating one is sometimes called the “2D Cross Product”, even though it isn’t really a cross product, and there is no cross product in 2D. That gives us this rotation matrix:

R = \begin{bmatrix} \frac{3}{\sqrt{10}} & \frac{-1}{\sqrt{10}} \\ \frac{1}{\sqrt{10}} & \frac{3}{\sqrt{10}} \end{bmatrix}

Next we need to make the scaling matrix, which is just the identity matrix with the x axis scaled by 4. In other words, a diagonal matrix with [4, 1] on the diagonal:

S = \begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix}

Lastly we need to calculate the unrotation matrix, which rotates by the negative amount of R. In other words, we need the inverse matrix of R. The inverse of a rotation matrix is just its transpose, so we can transpose R to make it. Another way to think about it: when we made the rotation matrix before, the first column was where we wanted the x axis to point and the second column was where we wanted the y axis to point; this time the first row is the x axis and the second row is the y axis. Rows instead of columns. Whichever explanation makes the most sense to you, we end up with this:

R' = \begin{bmatrix} \frac{3}{\sqrt{10}} & \frac{1}{\sqrt{10}} \\ \frac{-1}{\sqrt{10}} & \frac{3}{\sqrt{10}} \end{bmatrix}

Now that we have all the transformations, we can calculate R * S * R’ to get a final matrix that does the transformation we want. I’ll do it in 2 steps in case that helps you follow along, to make sure you get the same numbers.

R*S = \begin{bmatrix} \frac{12}{\sqrt{10}} & \frac{-1}{\sqrt{10}} \\ \frac{4}{\sqrt{10}} & \frac{3}{\sqrt{10}} \end{bmatrix}

R*S*R' = \begin{bmatrix} \frac{37}{10} & \frac{9}{10} \\ \frac{9}{10} & \frac{13}{10} \end{bmatrix}

That is our matrix which scales a point along the vector [3,1], with a magnitude of 4. Let’s put the vector [4,5] through this transformation by multiplying it by the matrix.

\begin{bmatrix} 4 & 5 \end{bmatrix} \begin{bmatrix} \frac{37}{10} & \frac{9}{10} \\ \frac{9}{10} & \frac{13}{10} \end{bmatrix} = \begin{bmatrix} \frac{193}{10} & \frac{101}{10} \end{bmatrix} = \begin{bmatrix} 19.3 & 10.1 \end{bmatrix}
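If it helps to see this in code, here is a minimal C++ sketch of the matrix method. The repo linked above implements the projection method instead, so this is just an illustration with ad hoc Vec2 / Mtx2 types, not the repo’s code.

```cpp
// A minimal sketch of the matrix method. Row vectors, so points are transformed as p * M.
#include <array>
#include <cmath>
#include <cstdio>

using Vec2 = std::array<float, 2>;
using Mtx2 = std::array<Vec2, 2>; // Mtx2[row][column]

// multiply two 2x2 matrices: result = a * b
Mtx2 Multiply(const Mtx2& a, const Mtx2& b)
{
    Mtx2 ret;
    for (int r = 0; r < 2; ++r)
        for (int c = 0; c < 2; ++c)
            ret[r][c] = a[r][0] * b[0][c] + a[r][1] * b[1][c];
    return ret;
}

// multiply a row vector by a 2x2 matrix: result = p * m
Vec2 Multiply(const Vec2& p, const Mtx2& m)
{
    return {
        p[0] * m[0][0] + p[1] * m[1][0],
        p[0] * m[0][1] + p[1] * m[1][1]
    };
}

// build the matrix that scales points along "direction" (not necessarily normalized) by "scale"
Mtx2 MakeScaleAlongDirection(Vec2 direction, float scale)
{
    // normalize the direction and get the perpendicular vector (the "2D cross product")
    float len = std::sqrt(direction[0] * direction[0] + direction[1] * direction[1]);
    Vec2 axisX = { direction[0] / len, direction[1] / len };
    Vec2 axisY = { -axisX[1], axisX[0] };

    // rotation matrix: the columns are where we want the x and y axes to point
    Mtx2 R = {{ { axisX[0], axisY[0] },
                { axisX[1], axisY[1] } }};

    // scaling matrix: scale along x only
    Mtx2 S = {{ { scale, 0.0f },
                { 0.0f,  1.0f } }};

    // unrotation matrix: the transpose of R
    Mtx2 Rt = {{ { R[0][0], R[1][0] },
                 { R[0][1], R[1][1] } }};

    // final matrix: rotate, scale, unrotate
    return Multiply(Multiply(R, S), Rt);
}

int main()
{
    Mtx2 m = MakeScaleAlongDirection({ 3.0f, 1.0f }, 4.0f);
    Vec2 p = { 4.0f, 5.0f };
    Vec2 result = Multiply(p, m);
    printf("[%0.2f, %0.2f]\n", result[0], result[1]); // prints [19.30, 10.10]
    return 0;
}
```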

For those who are counting instructions, processing a point with this method is 4 multiplies and 2 adds, to do that 2d vector / matrix product.

Creating the matrix took 16 multiplies and 8 adds (two matrix / matrix multiplies, aka eight 2d dot products), but usually you calculate a matrix like this once and re-use it for many points, so when amortized across all the points, the matrix creation is basically zero as a percentage of the total calculations done.

Method 2 – Vector Projection

I’m a fan of vector projection techniques. There is a certain intuitiveness in them that is missing from matrix operations, I find.

Using vector projection to scale a point along a specific direction involves these three steps:

  1. Project the point onto the scaling vector and multiply that by the scaling amount.
  2. Project the point onto the perpendicular vector.
  3. Add the scaling vector projection times the scaling vector to the perpendicular vector projection times the perpendicular vector.

We will use the same values from the last section, so we want to scale a point by 4 along the vector [3,1], which we normalize to [3/sqrt(10), 1/sqrt(10)]. We will put the point [4, 5] through this process.

Step 1 is to project our point onto the scaling vector. We do that by doing a dot product between our normalized vector [3/sqrt(10), 1/sqrt(10)], and our point [4, 5]. That gives us the value 17/sqrt(10). We then multiply that by the scaling amount 4 to get 68/sqrt(10).

Step 2 is to project our point onto the perpendicular vector. We can once again use the “2D Cross Product” to get the perpendicular vector. We just flip the x and y component and negate one, to get the vector perpendicular to the scaling vector: [-1/sqrt(10), 3/sqrt(10)]. We can dot product that with our point [4, 5] to get: 11/sqrt(10).

Step 3 is to multiply our projections by the vectors we projected onto, and add the results together. Our scaling vector contribution is 68/sqrt(10) * [3/sqrt(10), 1/sqrt(10)] or [204/10, 68/10]. Our perpendicular vector contribution is 11/sqrt(10)*[-1/sqrt(10), 3/sqrt(10)] or [-11/10, 33/10].

When we add the two values together, we get [193/10, 101/10] or [19.3, 10.1].

That result matches what we got with the matrix operations!
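Here is the projection method as a minimal C++ sketch. The repo follows the same idea, though its code may be organized differently.

```cpp
// A minimal sketch of the vector projection method.
#include <array>
#include <cmath>
#include <cstdio>

using Vec2 = std::array<float, 2>;

float Dot(const Vec2& a, const Vec2& b)
{
    return a[0] * b[0] + a[1] * b[1];
}

// scale "point" along "direction" (not necessarily normalized) by "scale"
Vec2 ScaleAlongDirection(const Vec2& point, Vec2 direction, float scale)
{
    // normalize the direction and get the perpendicular vector (the "2D cross product")
    float len = std::sqrt(Dot(direction, direction));
    Vec2 axis = { direction[0] / len, direction[1] / len };
    Vec2 perp = { -axis[1], axis[0] };

    // step 1: project onto the scaling vector and multiply by the scaling amount
    float projAxis = Dot(point, axis) * scale;

    // step 2: project onto the perpendicular vector
    float projPerp = Dot(point, perp);

    // step 3: multiply each projection by the vector it was projected onto, and add them
    return {
        projAxis * axis[0] + projPerp * perp[0],
        projAxis * axis[1] + projPerp * perp[1]
    };
}

int main()
{
    Vec2 result = ScaleAlongDirection({ 4.0f, 5.0f }, { 3.0f, 1.0f }, 4.0f);
    printf("[%0.2f, %0.2f]\n", result[0], result[1]); // prints [19.30, 10.10]
    return 0;
}
```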

As far as instruction counts, steps 1 and 2 are two dot products, which is 4 multiplies and 2 adds, plus 1 more multiply in step 1 for the scaling amount. Step 3 multiplies each projection by the vector it was projected onto (4 multiplies) and adds the two results together (2 adds). That is a total of 9 multiplies and 4 adds, which is quite a bit more than the matrix version's 4 multiplies and 2 adds.

If you optimized this process to do fewer operations by combining work where you could, you’d eventually end up at the same operations done in the matrix math. Algebra is fun that way.

Higher Dimensions?

Using the matrix method in higher dimensions, making the scaling matrix is easy, and making the unrotation matrix is still just taking the transpose of the rotation matrix. Making the rotation matrix is more difficult though.

In 3D, the scaling direction will be a 3D vector, and you need to come up with two other vectors that are perpendicular to that scaling direction. One way to do this is to take any vector which is different from the scaling vector, cross product it with the scaling vector, and normalize the result. That gives you a vector perpendicular to both, which you can take as your second vector. To get the third vector, cross product that second vector with the scaling vector. You will then have 3 mutually perpendicular unit vectors, an orthonormal basis that you can use to fill out your rotation matrix. The first column is the scaling vector, the second column is the second vector found, and the third column is the third vector found.
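Here is a sketch of building that 3D orthonormal basis, assuming a simple Vec3 type made up for this example. The “any vector” is picked so it can never be parallel to the scaling direction:

```cpp
// A sketch of building a 3D orthonormal basis around a normalized scaling direction.
// The three vectors returned are the columns of the rotation matrix described above.
#include <array>
#include <cmath>

using Vec3 = std::array<float, 3>;

Vec3 Cross(const Vec3& a, const Vec3& b)
{
    return {
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0]
    };
}

Vec3 Normalize(const Vec3& v)
{
    float len = std::sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
    return { v[0] / len, v[1] / len, v[2] / len };
}

void MakeBasis(const Vec3& scaleDir, Vec3& axis1, Vec3& axis2, Vec3& axis3)
{
    // first axis: the (already normalized) scaling direction
    axis1 = scaleDir;

    // pick any vector that can't be parallel to the scaling direction
    Vec3 other = (std::abs(scaleDir[0]) < 0.9f) ? Vec3{ 1.0f, 0.0f, 0.0f } : Vec3{ 0.0f, 1.0f, 0.0f };

    // second axis: perpendicular to both, normalized
    axis2 = Normalize(Cross(other, axis1));

    // third axis: perpendicular to the first two (already unit length)
    axis3 = Cross(axis2, axis1);
}
```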

The cross product only exists in the 3rd and 7th dimension though, so if you are working in a different dimension, or if you don’t want to use the cross product for some reason, another way you can make an orthonormal basis is by using the Gram-Schmidt process. There’s a great video on it here: https://www.youtube.com/watch?v=KOkuTXrv5Gg
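For completeness, here is a sketch of Gram-Schmidt in C++ for building an orthonormal basis in any dimension. It seeds the process with the scaling direction, then uses the standard basis vectors as candidates to fill out the rest, skipping any candidate that is left (nearly) zero after removing its projections onto the vectors already found:

```cpp
// A sketch of the Gram-Schmidt process for building an orthonormal basis in any dimension,
// starting from the scaling direction.
#include <cmath>
#include <vector>

using VecN = std::vector<double>;

double Dot(const VecN& a, const VecN& b)
{
    double ret = 0.0;
    for (size_t i = 0; i < a.size(); ++i)
        ret += a[i] * b[i];
    return ret;
}

std::vector<VecN> GramSchmidtBasis(const VecN& scaleDir)
{
    const size_t N = scaleDir.size();

    // candidate vectors: the scaling direction first, then the standard basis vectors
    std::vector<VecN> candidates;
    candidates.push_back(scaleDir);
    for (size_t i = 0; i < N; ++i)
    {
        VecN e(N, 0.0);
        e[i] = 1.0;
        candidates.push_back(e);
    }

    std::vector<VecN> basis;
    for (VecN v : candidates)
    {
        // subtract the projection onto every basis vector found so far
        for (const VecN& b : basis)
        {
            double proj = Dot(v, b);
            for (size_t i = 0; i < N; ++i)
                v[i] -= proj * b[i];
        }

        // normalize what's left, unless it's (nearly) zero
        double len = std::sqrt(Dot(v, v));
        if (len < 1e-8)
            continue;
        for (size_t i = 0; i < N; ++i)
            v[i] /= len;
        basis.push_back(v);

        if (basis.size() == N)
            break;
    }
    return basis;
}
```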

For the vector projection method, you also need the orthonormal basis vectors to do all the vector projections before you scale the projection along the scaling direction and re-combine the projections, so it boils down to the same issue as the matrix method.

From Readers

Nick Appleton (https://mastodon.gamedev.place/@nickappleton) says:

shameless plug of my last blog post (regarding higher order rotation matrices) https://www.appletonaudio.com/blog/2023/high-dimension-rotation-matrices/

This has methods for generating a high order rotation that moves a point to a particular axis. There is rarely a need for a Gram-Schmidt process and computing the matrix can be made quite cheap 🙂

I think the most efficient way to find a rotation matrix that takes a unit vector A and moves it to another unit vector B (in any dimension) is to find the reflection matrix that maps A to C (where C=B with a single component negated – doesn’t matter which one) and then flip the sign of the corresponding row of the matrix to turn it into a rotation.

Finding a reflection matrix that does this requires only a single division in an efficient implementation for any dimension.

Mastodon link: https://mastodon.gamedev.place/@nickappleton/113315462425042217

Andrew Gang (https://vis.social/@pteromys) says:

if your use case doesn’t need accuracy for scaling amounts near zero, method 2 has a variant that saves you from having to find perpendicular vectors: point + (scaling amount – 1) * dot(normalized scaling vector, point) * normalized scaling vector.

Mastodon link: https://vis.social/@pteromys/113317364225437813
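Plugging the running example into that formula (scaling the point [4, 5] by 4 along [3, 1]): dot([3/sqrt(10), 1/sqrt(10)], [4, 5]) is 17/sqrt(10), so the result is [4, 5] + 3 * (17/sqrt(10)) * [3/sqrt(10), 1/sqrt(10)] = [4, 5] + [153/10, 51/10] = [19.3, 10.1], which matches the result from both methods above.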

A Two Dimensional Low Discrepancy Shuffle Iterator (+Random Access & Inversion)

The C++ code that implements this blog post and generated the images can be found at https://github.com/Atrix256/GoldenRatioShuffle2D.

I previously wrote about how to make a one dimensional low discrepancy shuffle iterator at https://blog.demofox.org/2024/05/19/a-low-discrepancy-shuffle-iterator-random-access-inversion/. That shuffle iterator also supported random access and inversion. That is a lot of words, so breaking it down:

  • Shuffle Iterator – This means you have an iterator that can walk through the items in a shuffle without actually shuffling the list. It’s seedable too, to have different shuffles.
  • Low Discrepancy – This means that if you are doing something like numerical integration on the items in the list, you’ll get faster convergence than using a white noise random shuffle.
  • Random Access – You can ask for any item index in the shuffle and get the result in constant time. If you are shuffling 1,000,000 items and want to know what item will be at the 900,000th place, you don’t have to walk through 900,000 items to find out what’s there. It is just as happy telling you what is at the 900,000th place in the shuffle, as what is at the 0th place.
  • Inversion – You can also ask the reverse question: At what point in the shuffle does item 1000 come up? It also gives this answer in constant time.

After writing that post, I spent some time trying to find a way to do the same thing in two dimensions. I tried some things with limited success, but didn’t find anything worth sharing.

A couple people suggested I try putting the 1D shuffle iterator output through the Hilbert curve, which is a way of mapping 1d points to 2d points. I finally got around to trying it recently, and yep, it works!
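For reference, here is the standard mapping between a 1d index and 2d Hilbert curve coordinates, adapted to C++ from the pseudocode on Wikipedia’s Hilbert curve page (the code in the repo may differ in details). n is the side length of the grid and must be a power of 2; the inverse mapping is also included, which is what makes inversion possible, as discussed further down.

```cpp
// Standard Hilbert curve mappings between a 1d index d and 2d coordinates (x, y),
// adapted from the pseudocode on Wikipedia's Hilbert curve article.
// n is the grid side length and must be a power of 2.
#include <utility>

// rotate / flip a quadrant appropriately
void HilbertRot(int n, int& x, int& y, int rx, int ry)
{
    if (ry == 0)
    {
        if (rx == 1)
        {
            x = n - 1 - x;
            y = n - 1 - y;
        }
        std::swap(x, y);
    }
}

// convert a 1d index d into 2d coordinates (x, y)
void HilbertD2XY(int n, int d, int& x, int& y)
{
    x = 0;
    y = 0;
    for (int s = 1; s < n; s *= 2)
    {
        int rx = 1 & (d / 2);
        int ry = 1 & (d ^ rx);
        HilbertRot(s, x, y, rx, ry);
        x += s * rx;
        y += s * ry;
        d /= 4;
    }
}

// convert 2d coordinates (x, y) into a 1d index d (the inverse mapping)
int HilbertXY2D(int n, int x, int y)
{
    int d = 0;
    for (int s = n / 2; s > 0; s /= 2)
    {
        int rx = (x & s) > 0 ? 1 : 0;
        int ry = (y & s) > 0 ? 1 : 0;
        d += s * s * ((3 * rx) ^ ry);
        HilbertRot(n, x, y, rx, ry);
    }
    return d;
}
```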

You can also use the Z-order curve (or Morton curve) to do the same thing, and it works, but it doesn’t give as nice results as the Hilbert curve does.
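The Z-order mapping is even simpler. Here is a sketch that de-interleaves the bits of the index, which handles grids up to 65536×65536 with a 32 bit index:

```cpp
// Decode a Morton (Z-order) index into 2d coordinates by de-interleaving its bits:
// the even bits of the index become x, the odd bits become y.
#include <cstdint>

void MortonToXY(uint32_t index, uint32_t& x, uint32_t& y)
{
    x = 0;
    y = 0;
    for (uint32_t bit = 0; bit < 16; ++bit)
    {
        x |= ((index >> (2 * bit + 0)) & 1) << bit;
        y |= ((index >> (2 * bit + 1)) & 1) << bit;
    }
}
```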

People might also wonder how Martin Roberts’ R2 sequence (from https://extremelearning.com.au/unreasonable-effectiveness-of-quasirandom-sequences/) would work here, since it generalizes the golden ratio to 2d, and the 1d shuffle iterator is based on the golden ratio, but I couldn’t find a way to make it work. For example, multiplying the R2 sequence by the width and height of the 2D grid and casting to integer causes the same 2d location to be chosen multiple times, which also leaves other 2d locations never chosen. That is fine for many applications, but if you want a shuffle, it really needs to visit each item exactly once during the shuffle. Below is how many duplicates it had at various resolutions. Random is also included (white noise, but not a shuffle) to compare to.

Texture Size | Pixel Count | R2 Duplicate Pixels | R2 Duplicate Percent | Random Duplicate Percent
64×64        | 4,096       | 847                 | 20.68%               | 36.89%
128×128      | 16,384      | 3,379               | 20.62%               | 37.01%
256×256      | 65,536      | 16,375              | 24.99%               | 36.78%
512×512      | 262,144     | 29,203              | 11.14%               | 36.80%
1024×1024    | 1,048,576   | 392,921             | 37.47%               | 36.78%
2048×2048    | 4,194,304   | 1,831,114           | 43.66%               | 36.78%
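For the curious, here is roughly the test that produced the R2 column of the table above, as a sketch (the repo’s version and exact counting details may differ, and the counts depend on things like whether an offset is added to the sequence):

```cpp
// A sketch of counting how many R2 sequence samples land on an already used pixel.
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

int main()
{
    const size_t width = 64;
    const size_t height = 64;
    const size_t count = width * height;

    // R2 sequence constants: g is the plastic number, the 2d generalization of the golden ratio
    const double g = 1.32471795724474602596;
    const double a1 = 1.0 / g;
    const double a2 = 1.0 / (g * g);

    // generate count points of the R2 sequence and map each one to an integer pixel
    std::vector<int> hits(count, 0);
    for (size_t i = 0; i < count; ++i)
    {
        double x = std::fmod(double(i) * a1, 1.0);
        double y = std::fmod(double(i) * a2, 1.0);
        hits[size_t(y * double(height)) * width + size_t(x * double(width))]++;
    }

    // count the samples that landed on a pixel that was already hit
    size_t duplicates = 0;
    for (int h : hits)
        if (h > 1)
            duplicates += size_t(h - 1);

    printf("%zu duplicate pixels out of %zu samples\n", duplicates, count);
    return 0;
}
```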

So first up, here’s an RMSE (root mean squared error) graph, integrating an image from the repo, cabin.png (aka finding the average pixel color). RMSE is averaged over 1000 tests to reduce noise in the plot, where each test used a different random seed for the shuffles. White uses a standard white noise shuffle. Hilbert and ZOrder use a 1D low discrepancy shuffle, then use their respective curves to turn it into a 2D point.

Here is the 512×512 image being integrated:

So that’s great… Hilbert gives the best results generally, but Z Order also does well, compared to a white noise shuffle.

Some things worth noting:

  • Both Hilbert and Z order curves can be reversed – they can map 1d to 2d, and then 2d back to 1d. That means that this 2d shuffle is reversible as well. To figure out at what point in a shuffle a specific 2D point will appear, you first do the inverse curve transformation to get the 1D index of that point, and then ask the low discrepancy shuffle when that 1D index will show up.
  • As I’ve written it, this is limited to powers of 2. There are ways of making Hilbert curves that are not powers of 2 in size, but I’ll leave that to you to implement 😛

I did omit something from the previous test though… What if we didn’t put the 1d shuffle through any curve? What if we treated the image as one long line of pixels (like it is, in memory) and used a 1d shuffle on that? Well, it does pretty well!

However, you should see the actual point sets we are talking about before you make a final judgement.

Below are the white noise shuffled points (left), Z order points (middle left), Hilbert points (middle right) and 1DShuffler points (right) at different numbers of points, to show how they fill in the space. The last image in each uses a point’s order as a greyscale color to show how the values show up over time.

Click on the images to see them full sized.

64×64

128×128

256×256

512×512

1024×1024

So, while the 1DShuffler may give better convergence, especially near the full sample count, the point set is very regularly ordered and would make noticeable rendering artifacts. This is very similar to the usual trade off of low discrepancy sequences converging faster but having aliasing problems, compared to blue noise which converges more slowly but does not suffer from aliasing, and is isotropic.

You might notice a strange thing in Hilbert at 1024×1024 where it seems to fill in “in clumps”. 2048×2048 seems to do the same, but 4096×4096 goes back to not clumping. It’s strange and my best guess as to what is going on is that Hilbert mapping 1d to 2d means that points nearby in 1d are also nearby in 2d, but the reverse does not hold. Points far away in 1d are not necessarily far away in 2d? I’m not certain though.

Bonus Convergence Tests

Here’s a 512×512 white noise image to integrate, and the convergence graph below.

Here is the same with a blue noise texture. This blue noise was made with FAST (https://github.com/electronicarts/fastnoise). These are the same pixel values as the white noise texture above, just re-arranged into a blue noise pattern.

So interestingly, integrating the white noise texture made the “good samplers” do less well. Integrating the blue noise texture made them do even worse, to the point of being equal to white noise sampling.

What gives?

This isn’t proven, and isn’t anything that I’ve seen talked about in literature, but here’s my intuition of what’s going on:

“Good sampling” relies on small changes in sampling location giving small changes in sampled value. This way, when it samples a domain in a roughly evenly spaced way, it can be confident that it got fairly representative values of the whole image.

This is true of regular images, where moving a pixel to the left or right is going to usually give you the same color.

This may or may not be true in a white noise texture, randomly.

A blue noise texture however, is made to BREAK this assumption. Small changes in location on a blue noise texture will give big changes in value.

Just a weird observation – good sampling tends to be high frequency, while good integrands tend to be low frequency. Good sampling tends to have negative correlation, while good integrands tend to have positive correlation.