The Discrete Cosine Transform, and Derivatives

First off, fuck Putin. I wish the world was giving more direct support to Ukraine against the invasion (It seems like today, that is starting to happen though luckily!). There’s too much tolerance happening for bad behavior IMO. Violence is terrible, and that’s why it has to be stopped as quickly and decisively as possible. You stop the trouble makers, you don’t just hope they’ll stop making trouble. Ukraine is fighting back hard and IMO we are all hoping they are successful, but they shouldn’t have to do this alone.

Onto the math!

For some reason, the discrete cosine transform (DCT) has been confusing to me for a long time, even though I have been intimately familiar with the discrete Fourier transform (DFT). I expected that there would be more to it than there was.

This post is a follow up to my last post, which talks about how to adjust the position of points in a point set to change the frequencies present in the point set. The last post is actually significantly more complex than this one!

Adjusting Point Sets in Frequency Space Using a Differentiable Fourier Transform

There are two web demos that go with this post, that work like the demos from last post, but using the DCT instead of the DFT:

1D Point Set Frequencies (Cosine Transform) http://demofox.org/PointSetsCT1D.html
2D Point Set Frequencies (Cosine Transform) http://demofox.org/PointSetsCT2D.html

1D Signal DCT

Let’s start with a quick overview of the DFT. Here’s the formula for calculating the DFT of 1D signals.

$X_k = \sum_{n=0}^{N-1} x_n * e^{2 \pi i k n/N}$

N is the length of the signal, k is the frequency being evaluated (0 for DC, 1 for 1hz, 2 for 2hz, etc), n is the index of the current value in the signal, $x_n$ is the value at that index, and $X_k$ is the complex valued coefficient representing the phase and magnitude of frequency k in the signal.

You can also express the equation like this, which explicitly breaks the sum into a sum of imaginary and real parts:

$X_k = \sum_{n=0}^{N-1} x_n * e^{2 \pi i k n/N} = \sum_{n=0}^{N-1} \left[ x_n * \cos(2 \pi k n/N) + i * x_n * \sin(2 \pi k n/N) \right]$

From here, if you wanted to get the magnitude of the frequency in the signal, you’d treat the real and imaginary parts of the coefficient as x and y values of a vector and get the length of the vector. If you wanted to get the phase of the frequency (how much it is offset in the signal), you use atan2(imaginary, real).
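To make the magnitude and phase steps concrete, here is a minimal Python sketch. It uses the same sign convention as the formula above, and the function name `dft_coefficient` and the 16 sample test signal are my own assumptions for illustration, not code from the demos.

```python
import cmath
import math

def dft_coefficient(signal, k):
    """X_k for a 1D signal, using the e^{+2 pi i k n / N} convention from the formula above."""
    N = len(signal)
    return sum(x * cmath.exp(2j * math.pi * k * n / N) for n, x in enumerate(signal))

# Hypothetical test signal: a cosine at frequency 2 over N = 16 samples.
N = 16
signal = [math.cos(2 * math.pi * 2 * n / N) for n in range(N)]

X2 = dft_coefficient(signal, 2)
magnitude = math.hypot(X2.real, X2.imag)  # length of the (real, imaginary) vector
phase = math.atan2(X2.imag, X2.real)      # how much the frequency is offset in the signal
```

Because the sum is not normalized, the magnitude here comes out as N/2 = 8 rather than 1. Dividing by N/2 would give the amplitude of the cosine.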

Things get a lot simpler for the DCT: we only look at the real / cosine term:

$X_k =\sum_{n=0}^{N-1} x_n * \cos(2 \pi k n/N)$

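All the symbols are the same except for $X_k$, which is now a real scalar: the signed amplitude of the cosine at frequency k. You don’t get phase information like you do with the Fourier transform, but there's no more complex math (ha!) needed to calculate the magnitude. (Strictly speaking, this formula is the real part of the DFT. The DCT-II used in formats like JPEG samples its cosines at half-sample offsets, $\cos(\pi k (n + 1/2)/N)$.)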

That fact makes it a lot easier to get the derivative, which tells us how changing the value at a specific index in the signal affects a specific frequency. If we want to know how changing the value at index m affects frequency magnitude $X_k$, all terms of the sum go to zero as constants except for the one involving index m, which makes the derivative this:

$\frac{dX_k}{dx_m} = \cos(2 \pi k m/N)$

You can gather up this value for frequency k for each of the N values and get a gradient which will tell you how to adjust all values in the signal to increase or decrease frequency k.

You can also gather up this value for index m, for each frequency k, to get a gradient that tells you how changing this signal value affects all frequencies.
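Here is a small Python sketch of the gradient for a single frequency k, checked against a finite difference. The function names and the sample signal are made up for illustration.

```python
import math

def dct_coefficient(signal, k):
    """X_k = sum of x_n * cos(2 pi k n / N), matching the formula in this section."""
    N = len(signal)
    return sum(x * math.cos(2 * math.pi * k * n / N) for n, x in enumerate(signal))

def dct_gradient(N, k):
    """dX_k/dx_m for every index m. It does not depend on the signal values."""
    return [math.cos(2 * math.pi * k * m / N) for m in range(N)]

# Hypothetical 8 sample signal.
signal = [0.2, 0.9, 0.4, 0.7, 0.1, 0.5, 0.8, 0.3]
k = 3
grad = dct_gradient(len(signal), k)

# Finite difference check on one index.
m, eps = 5, 1e-6
bumped = list(signal)
bumped[m] += eps
numeric = (dct_coefficient(bumped, k) - dct_coefficient(signal, k)) / eps

# One gradient step that decreases frequency k.
step = 0.1
before = dct_coefficient(signal, k)
lowered = [x - step * g for x, g in zip(signal, grad)]
after = dct_coefficient(lowered, k)
```

Since $X_k$ is linear in the signal values, the step changes it by exactly `step` times the squared length of the gradient, which is 0.4 for these numbers.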

1D Point Set DCT

We can change the DCT formula to be for sparse values in 1D, instead of a dense N valued signal.

$X_k =\sum_{p \in P} \cos(2 \pi k p)$

And once again, it’s super easy to get the derivative, which tells us how much moving a specific point q affects the frequency. Only the term for q depends on q, so the rest of the sum drops out.

$\frac{dX_k}{dq} =-2 \pi k \sin(2 \pi k q)$

That’s it, we are done!

You can see this in action here: http://demofox.org/PointSetsCT1D.html
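If you'd rather poke at this in code than in the demo, here is a rough Python sketch of the 1D point set version. It is not the demo's source; the function names, the point positions, and the step size are all assumptions picked for illustration.

```python
import math

def point_set_dct(points, k):
    """X_k = sum over points p of cos(2 pi k p)."""
    return sum(math.cos(2 * math.pi * k * p) for p in points)

def point_set_dct_derivative(q, k):
    """dX_k/dq = -2 pi k sin(2 pi k q) for a single point q."""
    return -2 * math.pi * k * math.sin(2 * math.pi * k * q)

# Hypothetical point set.
points = [0.13, 0.41, 0.58, 0.77, 0.92]
k = 2

# Central finite difference check for one point.
q_index, eps = 1, 1e-6
analytic = point_set_dct_derivative(points[q_index], k)
plus = list(points)
plus[q_index] += eps
minus = list(points)
minus[q_index] -= eps
numeric = (point_set_dct(plus, k) - point_set_dct(minus, k)) / (2 * eps)

# Move every point a small step against its derivative to decrease frequency k.
step = 1e-3
before = point_set_dct(points, k)
moved = [p - step * point_set_dct_derivative(p, k) for p in points]
after = point_set_dct(moved, k)
```

Unlike the dense signal case, $X_k$ is not linear in the point positions, so the step size needs to stay small for the gradient to be a good guide.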

2D Signal DCT

To calculate a DCT of an MxN image, we have frequency j across the x axis, multiplied by frequency k across the y axis.

$X_{jk} = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} x_{mn} * \cos(2 \pi j m/M) * \cos(2 \pi k n/N)$

If we want to take the derivative of how this frequency pair changes as the signal value at a specific location pq changes, it’s again pretty easy. All values in the sum are constants except for the one involving the value at pq. The cosine terms are also constants in this context.

$\frac{dX_{jk}}{dx_{pq}} = \cos(2 \pi j p/M) * \cos(2 \pi k q/N)$
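The same thing as a Python sketch, with a finite difference check. I'm assuming the image is stored as `image[m][n]`, with m as the first index, and the tiny 3x4 image is made up.

```python
import math

def dct_2d(image, j, k):
    """X_jk for an M x N image indexed as image[m][n] (an assumed layout)."""
    M, N = len(image), len(image[0])
    return sum(
        image[m][n]
        * math.cos(2 * math.pi * j * m / M)
        * math.cos(2 * math.pi * k * n / N)
        for m in range(M)
        for n in range(N)
    )

def dct_2d_derivative(M, N, j, k, p, q):
    """dX_jk/dx_pq: just the two cosine factors evaluated at location (p, q)."""
    return math.cos(2 * math.pi * j * p / M) * math.cos(2 * math.pi * k * q / N)

# Hypothetical 3x4 image.
image = [
    [0.1, 0.5, 0.3, 0.9],
    [0.7, 0.2, 0.8, 0.4],
    [0.6, 0.0, 0.5, 0.1],
]
j, k = 1, 1
p, q = 2, 2

analytic = dct_2d_derivative(len(image), len(image[0]), j, k, p, q)

# Finite difference check: bump one pixel and see how X_jk changes.
eps = 1e-6
bumped = [row[:] for row in image]
bumped[p][q] += eps
numeric = (dct_2d(bumped, j, k) - dct_2d(image, j, k)) / eps
```

For this location the derivative is cos(4 pi / 3) * cos(pi) = 0.5, and it is the same no matter what values the image holds.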

2D Point Set DCT

We can change the 2D dense signal DCT into a 2D point set DCT that looks like this:

$X_{jk} = \sum_{p \in P} \cos(2 \pi j p_x) * \cos(2 \pi k p_y)$

If we want to get the derivative of frequency $X_{jk}$ as we move a specific 2D point q around, we need to take a partial derivative on the x axis, and a partial derivative on the y axis. This tells us how much the frequency magnitude changes as we move the point on the x and y axis.

$\frac{dX_{jk}}{dq_x} = -2 \pi j \sin(2 \pi j q_x) * \cos(2 \pi k q_y)$

$\frac{dX_{jk}}{dq_y} = \cos(2 \pi j q_x) * -2 \pi k \sin(2 \pi k q_y)$

You can see this in action here: http://demofox.org/PointSetsCT2D.html
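Here is a hedged Python sketch of the 2D point set case, again checked with finite differences. It is not taken from the demo; the names and the three sample points are assumptions.

```python
import math

def point_set_dct_2d(points, j, k):
    """X_jk = sum over points p of cos(2 pi j p_x) * cos(2 pi k p_y)."""
    return sum(
        math.cos(2 * math.pi * j * px) * math.cos(2 * math.pi * k * py)
        for px, py in points
    )

def point_set_dct_2d_gradient(q, j, k):
    """Partial derivatives of X_jk with respect to q_x and q_y for one point q."""
    qx, qy = q
    d_dx = -2 * math.pi * j * math.sin(2 * math.pi * j * qx) * math.cos(2 * math.pi * k * qy)
    d_dy = math.cos(2 * math.pi * j * qx) * -2 * math.pi * k * math.sin(2 * math.pi * k * qy)
    return d_dx, d_dy

# Hypothetical point set.
points = [(0.2, 0.7), (0.55, 0.1), (0.8, 0.35)]
j, k = 1, 2
qx, qy = points[0]
d_dx, d_dy = point_set_dct_2d_gradient(points[0], j, k)

# Central finite differences: nudge the first point along each axis.
eps = 1e-5
rest = points[1:]
numeric_dx = (
    point_set_dct_2d([(qx + eps, qy)] + rest, j, k)
    - point_set_dct_2d([(qx - eps, qy)] + rest, j, k)
) / (2 * eps)
numeric_dy = (
    point_set_dct_2d([(qx, qy + eps)] + rest, j, k)
    - point_set_dct_2d([(qx, qy - eps)] + rest, j, k)
) / (2 * eps)
```

The pair `(d_dx, d_dy)` is the direction to move the point to increase $X_{jk}$ the fastest. Move the opposite way to decrease it.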

Differences Between DCT and DFT

There are some differences between using the DCT and DFT for frequency analysis and similar.

For one, the DFT assumes that the data you give it repeats infinitely. The DCT also assumes that the data repeats forever, but that each repetition is flipped, like in a mirror.

Another difference is that because the DFT has phase and the DCT doesn’t, translating the data changes the DCT frequency magnitudes. With the DFT, translation leaves the frequency magnitudes alone and only changes the phases.

More concretely, imagine you have a 2hz cosine wave starting at x=0 (so, has a phase of 0 degrees). Both DFT and DCT will recognize this as a 2hz frequency with amplitude 1.

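If you shift this wave by half of its period (a phase shift of pi), the DFT will still show a 2hz frequency with amplitude 1, but will now show a phase of pi. The DCT, however, will show a 2hz frequency with amplitude -1! Shift it by only a quarter of a period (a phase of pi/2) and the DCT amplitude drops to 0, while the DFT magnitude is still 1.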

If you play around in the demos that go with this post you can see this in action, that translation matters for DCT, but not for DFT, when looking at frequency magnitudes.
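You can also check the translation behavior numerically. This is a small Python sketch using the formulas from the 1D signal section. The names, the 32 sample length, and the normalization by N/2 are my own choices for illustration. It measures a frequency 2 cosine shifted by a phase of 0, a quarter period, and half a period.

```python
import cmath
import math

N = 32    # number of samples (an arbitrary choice)
FREQ = 2  # the "2hz" wave

def shifted_cosine(phase):
    """A frequency FREQ cosine over N samples, delayed by the given phase."""
    return [math.cos(2 * math.pi * FREQ * n / N - phase) for n in range(N)]

def dft(signal, k):
    return sum(x * cmath.exp(2j * math.pi * k * n / N) for n, x in enumerate(signal))

def dct(signal, k):
    return sum(x * math.cos(2 * math.pi * k * n / N) for n, x in enumerate(signal))

rows = []
for phase in (0.0, math.pi / 2, math.pi):
    s = shifted_cosine(phase)
    dft_magnitude = abs(dft(s, FREQ)) / (N / 2)  # normalized so a unit cosine gives 1
    dct_value = dct(s, FREQ) / (N / 2)
    rows.append((phase, dft_magnitude, dct_value))
```

The DFT magnitude stays at 1 for every shift, while the DCT value goes from 1 to 0 to -1.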

Lastly, DCT frequency magnitudes can be negative, whereas DFT magnitudes can’t be. The demos are adjusted to account for this.

Once again, thanks for reading, and I hope you find this useful or at least interesting.

And here’s hoping Ukraine comes out on top soon, with friends coming to their aid and helping them rebuild. What a terrible and senseless situation.

The blog at the bottom of the sea

Programming, Graphics, Gamedev, Exotic Computation, Audio Synthesis
