Transformation Matrix Basics

Here are some interesting tidbits of info that I’ve found useful for being able to think about matrix math in a more intuitive way. We start off with 2d matrix math but extend it to 3d at the end.

Why Use A Matrix?

You might ask why you might go through all the trouble of using a matrix for doing transformations like translation, rotation, scaling and shearing / skewing.

Why not just manually modify the points, putting them through equations to get the results. Well, there are two main reasons.

The first one is for performance. The function for rotating a point in 2d looks like this:

x’ = x * cos(theta) – y * sin(theta)
y’ = x * sin(theta) + y * cos(theta)

If you have 1000 points, that means you are calculating sin twice and cos twice for each point which is 4000 trig operations. If you are smart (or your compiler is!), you’ll only do sin and cos once for each point, but that’s still 2000 trig operations.

If you are super duper smart (or your compiler is…), you’ll notice that theta is the same for all 1000 points, and perhaps you’ll calculate sin(theta) and cos(theta) once ahead of time and use those values for each point.

That last step is basically what matrix math does for you. A 2d rotation matrix looks like the below:

[ cos(theta), sin(theta) ]
[-sin(theta), cos(theta) ]

That means that once you have calculated your rotation matrix, you don’t need to keep performing trig operations. You have your values and can use them over and over very cheaply.

This especially saves processor time when you combine multiple transformations together. If you needed to perform an operation that did some crazy combination of 2000 rotations, 2000 translations and 2000 scale adjustments, instead of needing to do those 6000 operations on each point, you can just calculate the final matrix (by combining the matrix of each of those 6000 operations into one matrix) and then use that single matrix to your hearts content.

Another reason why you might want to use matrices instead of doing transforms by hand is that it’s a lot simpler writing code that does general transformations instead of deciding after the fact “hey i want to add scaling now and i need to touch all my transformation related code to implement it”.

Using a matrix, you don’t have to know or care what the transform is, it will just do it and you can move on with your life.

Matrix Vector Multiplication Is Just Dot Products!

As a quick refresher, in 2d, a dot product is just 2 multiplies and an add (A.X * B.X + A.Y * B.Y) and in 3d, a dot product is just 3 multiplies and 2 adds (A.X * B.X + A.Y * B.Y + A.Z * B.Z).

A lot of modern hardware (both CPU and GPU) has varying amounts of support built in for vector and matrix math to make it faster, but matrix vs vector multiplies are really not that bad to begin with. In 2d, a matrix * vector operation is just 2 dot products! In 3d, it’s 3 dot products. Look at the below to see what I mean:

VX’ = AX * VX + BX * VY
VY’ = AY * VX + BY * VY

and in 3d:

VX’ = AX * VX + BX * VY + CX * VZ
VY’ = AY * VX + BY * VY + CY * VZ
VZ’ = AZ * VX + BZ * VY + CZ * VZ

You Can Make a Matrix From Basis Vectors

Let’s say that you are working in 2d and you want to rotate a point. Let’s say that for some reason you know what the rotated X and Y axis are supposed to be. You can actually create a rotation matrix from that knowledge alone without having to do any trig or geometry type math.

Like for instance, if you wanted an object’s x axis to point parallel to the vector of a moving object, and you knew the object’s normalized velocity vector (the direction it is moving, normalized to have a vector length of 1) was (0.34, 0.93).

That normalized velocity vector would be your X axis, and you could use the “faked 2d cross product” of flipping x and y and negating one to get a vector perpendicular to that X axis (aka your Y axis) . So…

X axis = ( 0.34, 0.93)
Y axis = (-0.93, 0.34)

(note, which one you negate on the Y axis matters only as much as you care which of the 2 directions you want the Y axis to point. Basically if you don’t care if it points up or down, it doesn’t matter which one you flip. If you do care, you need to pick the right one for the direction you want. This may not be the best way, but i do it by visual inspection, or by evaluating the math and seeing if it’s pointing in the way that i want or not. In other words… try it one way, and if it’s backwards, do it the other way.)

Now that we have an X and a Y axis, we just use the X axis as the first row, and the Y axis as the second row and get our rotation matrix:

[ 0.34, 0.93]
[-0.93, 0.34]

To see that it really works, try multiplying the vector (1,0) by that matrix to see if we get the right number out (it should be the same vector as the velocity of the object we are orienting to). We are basically verifying here that our X axis comes out to what it should.

[1 0]
[ 0.34, 0.93]
[-0.93, 0.34]
X’ = 0.34 * 1 – 0.93 * 0 = 0.34
Y’ = 0.93 * 1 + 0.34 * 0 = 0.93

now, let’s check our y axis

[0 1]
[ 0.34, 0.93]
[-0.93, 0.34]
X’ = 0.34 * 0 – 0.93 * 1 = -0.93
Y’ = 0.93 * 0 + 0.34 * 1 = 0.34

Note that when you put your X and Y axis basis vectors into the matrix, that they should be normalized, otherwise they will do strange things to your point – like introduce scaling and skewing.

You Can Get Basis Vectors From a Matrix

The process works backwards too which is really handy. If you have some matrix of unknown rotation, you can get the basis vectors out the same way you put them in.

You might have already seen this when looking at the matrix from the last example

[ 0.34, 0.93]
[-0.93, 0.34]

The first row (0.34, 0.93) is the X axis, and the second row (-0.93, 0.34) is the Y axis.

One caveat to be aware of though is that if you are working with a “matrix from the wild” where you don’t know if it’s a rotation matrix only, or if it might have some other transforms in it (scaling or skewing), the rows may not be normalized.

If you know for sure that it’s just a rotation matrix, you can take the basis vectors right out of the matrix. If you don’t know for sure, you need to normalize the basis vectors after you pull them out.

Why is this useful?

If in 3d, you had the matrix representing the camera transform, you could grab the 3rd row to get the forward vector. You could use this vector when launching a projectile from the player’s position so that it would go where they were aiming.

Again, in 3d if you had the camera matrix, you could grab the first row to get the “left vector” and you could add or subtract that from the player’s position to do strafing left and right.

No complex math required (:

2d Translation Matrix

To be able to have a matrix that can do translation, you need to go to a 3×3 matrix.

Below is what a translation matrix looks like. (TX,TY) is the translation.

[1 0 0]
[0 1 0]
[TX TY 1]

When you want to transform a 2d point by a 3×3 matrix like the above, you need to use a 1 for the Z component. Let’s see what happens when we transform a 2d point by this 3×3 translation matrix.

[X Y 1]
[1 0 0]
[0 1 0]
[TX TY 1]
X’ = X * 1 + Y * 0 + 1 * TX
Y’ = Y * 0 + Y * 1 + 1 * TY
Z’ = 1 * 0 + 1 * 0 + 1 * 1
X’ = X + TX
Y’ = Y * TY
Z’ = 1

If you want to transform a 2d VECTOR (something that represents a direction, not a location) by a 3×3 matrix, you need to use a zero in the Z component instead of a 1. You may have heard this before, but let’s see why:

[X Y 0]
[1 0 0]
[0 1 0]
[TX TY 1]
X’ = X * 1 + Y * 0 + 0 * TX
Y’ = Y * 0 + Y * 1 + 0 * TY
Z’ = 1 * 0 + 1 * 0 + 0 * 1
X’ = X
Y’ = Y
Z’ = 0

As you can see, the vector was not affected by the translation of the matrix. If the 3×3 matrix had scaling and rotation in it as well as the translation, the vector WOULD be affected by those things as it should be. The translation is the only thing that doesn’t apply when you use a Z value of zero.

Note that if you want to get the 2d basis vectors from a 3×3 matrix, you just ignore the 3rd row and the 3rd column and do the same thing you would do with a 2×2 matrix. Again, making sure to normalize the basis vectors when it’s appropriate.

Combining Matrices

To combine transforms together (whether 2×2 or 3×3 matrices) you just multiply them together.

The order of matrix multiplication matters though. A * B is not the same as B * A.

To make things confusing, OpenGL and DirectX use different representations of matrices (one is “column major” the other is “row major”) which means that the matrices in each API are transposes of the other.

To make things even more confusing, if AT and BT are the transpose of A and B, then A * B = BT * BA. This means that premultiplication and postmultiplication (aka is A on the left or the right in A * B) swap meanings when going from one API to the other.

I’m not sure who is to blame for that one, but here’s an interesting link on the subject:

Note that row vs column major matrices also change which direction the basis vectors are stored in… so instead of X axis being the 3 top numbers, it would be the 3 left numbers! It should be easy enough to tell which is which by inspection, but keep it in mind!

Anyways, I’ll continue to use the conventions set in this article above in the vector / matrix multiplication.

If you multiply a translation matrix by a rotation matrix, you’ll get a matrix that rotates a point, and then translates it.

[1 0 3]
[0 1 5]
[0 0 1]
[ 0.34 0.93 0]
[-0.93 0.34 0]
[ 0 0 1]
[ 0.34 0.93 3]
[-0.93 0.34 5]
[ 0 0 1]

If, however, you multiply a rotation matrix by a translation matrix, you’ll get a matrix that translates a point, then rotates it. Going that direction, the translated point is rotated.

[ 0.34 0.93 0]
[-0.93 0.34 0]
[ 0 0 1]
[1 0 3]
[0 1 5]
[0 0 1]
[ 0.34 0.93 5]
[-0.93 0.34 1.4]
[ 0 0 1 ]

Which way you multiply entirely depends on what it is you are trying to achieve. And, well… it also depends on whether you are dealing with row major or column major matrices!

Multiplying a 3×3 matrix by another 3×3 matrix is the same as doing nine 3d dot products.

Inverting Matrices

Taking the transpose of a matrix doesn’t have any intuitive geometrical (or other) meaning that I’m aware of. I’ve looked on the net and all I could find was some “simple” explanations involving general relativity. Awesome right? LOL.

On the other hand, inverse matrices have a very intuitive and very useful meaning. Inverse matrices do the reverse of whatever the matrix does.

That means if you have a matrix that translates by (7,5) and then rotates by (45, 30) degrees, the inverse matrix will rotate by (-45, -30) degrees and then translate by (-7,-5).

This is super useful sometimes (:

Inverting a 2×2 matrix is actually really easy. I could explain it but you really ought to check out this page to see how. I recommend doing the exercises at the bottom to make sure you firmly understand how to do it!

Inverting a 3×3 matrix is fairly easy too, but kind of tedious. Here’s a page that explains how:

After you are done with that, here are some problems to run through to make sure you really do understand it:

Not all matrices are invertible. If you read the links and walk through the exercises, you’ll see why. Basically, an uninvertable matrix will cause a divide by zero in the inversion process. I believe this comes up when you have a matrix with parallel basis vectors, or if you have a zero scaling matrix (a multiply by zero can’t be reversed!) but i could be mistaken or not have the full picture there. If I’ve mis-spoken, post a comment!

Extending to 3d

Extending the above to 3d matrices is pretty simple and there aren’t really many suprises.

One difference is that in the above where we use a 2×2 matrix in 2d, we would use a 3×3 matrix in 3d because of the extra Z coordinate.

If you want to do translation in 3d, you have to use a 4v4 matrix instead of a 3×3 (just like in 2d how you had to move from a 2×2 to a 3v3, in 3d you have to move from a 3×3 to a 4×4). A 3d translation matrix looks like this:

[1 0 0 0]
[0 1 0 0]
[0 0 1 0]
[TX TY TZ 1]

Inverting a 4×4 matrix is pretty tedious, but follows the same patterns as inverting a 3×3 and a 2×2.

You can find info on a 3d rotation matrix here, if you don’t want to build it up with basis vectors:

In our example above where we had the X axis and used a “fake 2d cross product” to get the perpendicular vector, when you move into 3d you’ll probably want to use the cross product to get perpendicular vectors.

Like for instance if you know 2 of the basis vectors, you can use cross product of those 2 to get the third.

Z = X x Y

If, however, you only have one basic vector (say “Z” because maybe you have a “camera direction”), you can use cross product to get 2 other vectors so long as you can make certain assumptions about the orientations involved. Like for instance, you might do this:

Fwd = normalize(Camera.Forward)
Left = normalize(Fwd x (0, 1, 0))
Up = normalize(Fwd x Left)

rotation matrix =
[Left.X Left.Y Left.Z]
[Up.X Up.Y Up.Z]
[Fwd.X Fwd.Y Fwd.Z]

The above only works if the camera can never look straight up, and it also assumes that your camera doesn’t have any roll – but it is a useful technique if those assumptions are ok.

That’s about it! I hope you found at least some of this information useful (:

if I missed anything you think belongs here, post a comment and share with the rest of us!