What’s up with convolution (the second part)?

Previously, we motivated the use of convolution through probabilities, and it seemed to make some kind of intuitive sense. But do the same concepts carry over to Fourier analysis?

With the convolution product defined as

(f \star g)(x) = \int f(y) g(x-y) dy,

(where we leave the limits of integration up to the specific implementation), the key property of the convolution product we will consider is the following:

\widehat{ (f \star g)}(n) = \hat{f} (n) \hat{g} (n).

One possibility that immediately presents itself is that, if the Fourier coefficients of f and g are easy to compute, we have a quicker way to compute the Fourier coefficients of the convolution, one that doesn’t involve computing some ostensibly more difficult integrals. Which raises the question: why do we want to calculate the convolution in the first place?
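As a quick sanity check (a sketch only: I use the discrete Fourier transform of periodic, sampled signals as a stand-in for the Fourier coefficients), the key property can be verified numerically:

```python
import numpy as np

# Discrete analogue of the convolution theorem: for periodic (circular)
# convolution, the DFT of f * g equals the pointwise product of the DFTs.
rng = np.random.default_rng(0)
n = 64
f = rng.standard_normal(n)
g = rng.standard_normal(n)

# Circular convolution, straight from the definition (f * g)(k) = sum_j f(j) g(k - j)
conv = np.array([sum(f[j] * g[(k - j) % n] for j in range(n)) for k in range(n)])

lhs = np.fft.fft(conv)               # coefficients of the convolution
rhs = np.fft.fft(f) * np.fft.fft(g)  # product of the coefficients
print(np.allclose(lhs, rhs))         # True
```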

Let’s take some easy functions and see what happens when we take the convolution. Let

f (t) = \sin 2\pi t \textrm{ and } g (t) = \cos 2\pi t.

The Fourier coefficients of these functions are really easy to compute, since we have

\sin 2\pi  t = \frac{e^{2\pi i t} - e^{-2\pi i t}}{2i}

and

\cos 2\pi t = \frac{e^{2\pi i t} + e^{-2\pi i t}}{2},

meaning that \hat{f}(1) = 1/2i, \hat{f} (-1) = -1/2i, \hat{g} (1) = 1/2 and \hat{g} (-1) = 1/2. This implies that

\widehat{(f \star g)} (-1) = -\frac{1}{4i}, \widehat{(f \star g)}(1) = \frac{1}{4i}.

This allows us to immediately write down the convolution product:

(f \star g)(t) = \frac{1}{4i} \left( e^{2\pi i t} - e^{-2\pi i t} \right) = \frac{1}{2} \sin 2\pi t.
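This is easy to corroborate numerically (a sketch, approximating the integral over one period by an equispaced Riemann sum and checking an arbitrary point x):

```python
import numpy as np

# Check that sin(2 pi t) convolved with cos(2 pi t) over one period
# equals (1/2) sin(2 pi t) at an arbitrary point x.
n = 1000
y = np.arange(n) / n   # grid on [0, 1)
x = 0.1
conv = np.sum(np.sin(2 * np.pi * y) * np.cos(2 * np.pi * (x - y))) / n
print(conv, 0.5 * np.sin(2 * np.pi * x))  # the two values agree
```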

Clearly, our two signals modify each other. How is this significant? The best way to think of this is as an operation taking place in the frequency domain. Suppose f is our “original” signal, and it is being “modified” by g, yielding a signal h = f \star g. The contribution of a frequency n to h (in other words, the n-th Fourier coefficient) is the n-th Fourier coefficient of f multiplied by the n-th coefficient of g. We can see this as a means of affecting different frequencies selectively; something we often refer to as filtering.
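As an illustration (a minimal sketch; the test signal, noise level, and cutoff are invented for the example), a low-pass filter is exactly this kind of multiplication of Fourier coefficients:

```python
import numpy as np

# Low-pass filtering: multiply the Fourier coefficients of a noisy
# signal by 0/1 weights (playing the role of the g-hat's above).
n = 256
t = np.arange(n) / n
rng = np.random.default_rng(1)
clean = np.sin(2 * np.pi * t)
noisy = clean + 0.3 * rng.standard_normal(n)

coeffs = np.fft.fft(noisy)
freqs = np.fft.fftfreq(n, d=1 / n)
keep = np.abs(freqs) <= 3            # keep only frequencies |n| <= 3
filtered = np.fft.ifft(coeffs * keep).real

# Filtering should bring us closer to the clean signal
print(np.mean((filtered - clean) ** 2) < np.mean((noisy - clean) ** 2))  # True
```

In the time domain, this is precisely convolution of the noisy signal with the function whose Fourier coefficients are the 0/1 weights.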

We often think of the function in the time domain as the primary object, but looking at the Fourier coefficients is, in a way, far more natural when the function is considered as a signal. You hear through the activation of certain hairs in the inner ear, which respond to specific frequencies. In order to generate a specific sound, a loudspeaker has to combine certain frequencies. To modify a signal, it is therefore by far easiest to work directly at the level of frequency. Instead of trying to justify the usual definition of convolution, we see the multiplication of Fourier coefficients as the starting point, and then try to see what must be done to the “original” functions to make this possible. So, we suppose that we have made a new function h whose Fourier coefficients are formed by the product of the Fourier coefficients of f and g, and try to find an expression for h:

h(x) = \int \hat{f} (\xi ) \hat{g} (\xi ) e^{2\pi i \xi x} d\xi.

If you simply write out the definitions in the above, and you remember that

\int e^{2\pi i y (\xi - \zeta)} dy = 0 when \xi \neq \zeta,

you will get the expression for the convolution product of f and g. As such, the definition of convolution has to be seen as a consequence of what we would like to do to the signal, rather than the starting point.
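Spelling that computation out (purely formally, ignoring all convergence issues), substituting the definitions of the Fourier coefficients gives

```latex
h(x) = \int \hat{f}(\xi)\, \hat{g}(\xi)\, e^{2\pi i \xi x}\, d\xi
     = \iiint f(y)\, e^{-2\pi i y \xi}\, g(z)\, e^{-2\pi i z \xi}\, e^{2\pi i \xi x}\, dy\, dz\, d\xi.
```

Collecting the exponentials produces the factor \int e^{2\pi i \xi (x - y - z)} d\xi, which vanishes unless z = x - y; what survives is h(x) = \int f(y) g(x-y) dy, the convolution product.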

We still have not related the definition to how convolution originated in probability, as detailed in the previous post. Unfortunately, the comparison between the two cases is not exact, because in the probabilistic case we obtain a completely new measure space after the convolution, whereas in the present case we require our objects to live in the same function space. Again, the solution is to think in frequency space: to find all ways of getting e^{-2\pi i x \xi}, we need to multiply e^{-2\pi i (x-y)\xi} and e^{-2\pi i y \xi} for all values of y, which leads to our definition.

(As always, I have been rather lax with certain vitally important technicalities – such as the spaces we’re working in, the measures involved, and whether we’re working with a sum or an integral. I leave these for the curious reader to sort out.)

Fourier analysis – the real story III

We’re trying to get inside Fourier’s head, so to speak, and explore the origins of his methods. To do this, we’re going to look at his derivation of the identity

\frac{\pi}{4} = \cos x - \frac{1}{3} \cos 3x + \frac{1}{5} \cos 5x - \frac{1}{7} \cos 7x + \cdots.

(Again, this is from Paul J. Nahin’s wonderful “Hot Molecules, Cold Electrons”.)

First, we need the indefinite integral

\int \frac{1}{1+x^2} dx = \tan^{-1}x + C.

(Deriving this integral is an easy exercise, but worth knowing.) Now suppose you have a right-angled triangle with the two non-right angles being \theta and, by necessity, \pi/2 - \theta. Letting the hypotenuse be 1, the side adjacent to \theta be x and the remaining side (necessarily) \sqrt{1-x^2}, we have that

\frac{\pi}{2} = \tan^{-1} \left( \frac{\sqrt{1-x^2}}{x}\right)+ \tan^{-1} \left( \frac{x}{\sqrt{1-x^2}}\right).

Substituting u = \sqrt{1-x^2}/x, we get

\frac{\pi}{2} = \tan^{-1} u + \tan^{-1} \frac{1}{u}.
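A quick numerical spot-check of this identity for a few positive values of u:

```python
import math

# tan^{-1}(u) + tan^{-1}(1/u) should equal pi/2 for u > 0.
for u in (0.3, 1.0, 7.5):
    print(math.atan(u) + math.atan(1 / u) - math.pi / 2)  # ~0 each time
```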

Now, using the “fact” that

\frac{1}{1+u^2} = 1 - u^2 + u^4 - u^6 + \cdots

(which you can get from, e.g., long division), we integrate to get the indefinite integral

\int \frac{du}{1+u^2} = \tan^{-1} u + C = u - \frac{1}{3}u^3 + \frac{1}{5}u^5 - \frac{1}{7}u^7 + \cdots.
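The series can be sanity-checked for some |u| < 1, where the geometric series (and hence the term-by-term integration) is valid:

```python
import math

# Partial sum of u - u^3/3 + u^5/5 - ... against atan(u), for |u| < 1.
u = 0.5
partial = sum((-1) ** k * u ** (2 * k + 1) / (2 * k + 1) for k in range(200))
print(partial - math.atan(u))  # ~0
```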

Comparing the two expressions on the right, we see that C = 0. By replacing u with 1/u, we get

\tan^{-1} \left( \frac{1}{u} \right) = \frac{1}{u} - \frac{1}{3} \left( \frac{1}{u} \right)^3 + \frac{1}{5} \left( \frac{1}{u} \right)^5 - \frac{1}{7} \left( \frac{1}{u} \right)^7 + \cdots.

Combining our previous expressions yields

\frac{\pi}{2} = \left( u + \frac{1}{u} \right) - \frac{1}{3} \left( u^3 + \frac{1}{u^3} \right) + \frac{1}{5}\left( u^5 + \frac{1}{u^5} \right) - \cdots.

Replacing u by e^{ix}, so that u^n + u^{-n} = 2\cos nx, and dividing by two, we immediately get Fourier’s identity:

\frac{\pi}{4} = \cos x -\frac{1}{3}\cos 3x +\frac{1}{5}\cos 5x - \frac{1}{7} \cos 7x + \cdots.

This is a remarkable identity, but is it true? It is very instructive to graph the right-hand side to see what happens. As it turns out, the identity is only kind-of true. It can be improved, and I suggest finding the mistakes in the derivation as a first step to doing so.
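A minimal numerical experiment (partial sums; the number of terms is an arbitrary choice) already hints at where the identity does and does not hold:

```python
import math

# Partial sums of cos x - cos(3x)/3 + cos(5x)/5 - ... at two points.
def partial_sum(x, terms=100_000):
    return sum((-1) ** k * math.cos((2 * k + 1) * x) / (2 * k + 1)
               for k in range(terms))

print(partial_sum(0.0) - math.pi / 4)   # ~0: the identity holds here
print(partial_sum(2.0) - math.pi / 4)   # far from 0: it fails here
```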

Fourier analysis – the real story I

Instead of going on with nonstandard analysis (which I might do later in a new format), I thought I would try another series of posts, on something very near and dear to my heart – Fourier analysis. I have (partly) read many books on Fourier analysis, and even published some work in it. But I have struggled to understand the essence of it – not surprising, since it is a subject both vast and deep. At the moment, I am solving differential equations and having much fun with the process. In doing so, I realized that I build a much deeper understanding when I get my hands dirty, so to speak. This is not a great revelation, but it is a principle I have ignored for too long. To understand Fourier analysis, I shouldn’t be reading the most advanced, abstract books on the subject and looking at the theoretical research; I need to go back to the origin. Why was it formulated this way? Why were kernels introduced? Why do we use Poisson summation? Why is it important that convergence takes place in certain spaces?

In this series, I intend to go back through history, even though I do not intend this as a historical work. It is more about following the evolution of ideas, and so I do not intend it to be 100% chronological. Mathematics is a messy place, and the narrative is not always clear (just like history itself, really). Things are rarely as neat as we make them out to be in our textbooks, and I think we do our students a disservice by not exposing them to these ideas. Fourier techniques are usually presented to the student as if ex nihilo, but there is a fascinating evolution of thought to explore. One of the few books that do not try to hide this is that of Körner, and I’m sure I will be referring to it extensively as I write this.

Let’s go back to the beginning then, and see what Fourier actually did, why he did it and whether anyone had done it before. There is probably no place better to go than Fourier’s “The Analytical Theory of Heat”. Now, I’m obviously not going to read the whole thing, but looking through the table of contents we see that Section II of Chapter 3 starts with “First example of the use of trigonometric series in the theory of heat”, which seems like the kind of thing we’re after.

Perhaps the best place to start would be to discuss Fourier’s heat equation itself:

a^2 \frac{\partial^2 U}{\partial x^2} = \frac{\partial  U}{\partial t}.

Here, U is some function of distance, x, and time, t. We’re not going to discuss how this equation came about and are just going to accept it as it is. There’s a good (but short) post on some of the history here. I am only presenting the one-dimensional form of the equation, but of course the higher-dimensional version can be expressed in terms of the Laplacian. Interestingly, Fourier’s solution of the heat equation was inspired by that of Laplace, which in turn was inspired by work of Poisson.

For a moment, let us think about the content of this equation. What does it mean, and why should it be applicable to heat? Since we have a first derivative in one variable, which denotes rate of change, and a second derivative in the other, we know that the change in time is somehow driven by the curvature of the function, seen as a function of space. Instead of fully exploring this idea myself, I’ll direct you to the video at https://youtu.be/b-LKPtGMdss.
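To make this concrete, one can check numerically that a classical separable solution satisfies the equation (a sketch; the constants a and k, and the point where we check, are arbitrary choices):

```python
import math

# U(x, t) = exp(-a^2 k^2 t) sin(k x) is a classical solution of
# a^2 U_xx = U_t; verify it at one point with central finite differences.
a, k = 0.7, 3.0

def U(x, t):
    return math.exp(-a * a * k * k * t) * math.sin(k * x)

x0, t0, h = 0.4, 0.2, 1e-4
U_xx = (U(x0 + h, t0) - 2 * U(x0, t0) + U(x0 - h, t0)) / h ** 2
U_t = (U(x0, t0 + h) - U(x0, t0 - h)) / (2 * h)
print(a * a * U_xx - U_t)  # ~0
```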

If you have had any courses in differential equations, the solution is quite obvious. It is now so completely accepted that it obfuscates how much of a revolution it actually was. As explained in the post I referred to, mathematicians had certain ideas of what a function should be, and it didn’t look like Fourier’s series conformed to those ideas. One might dismiss this as foolish conservatism by the mathematicians of old, but we must never fall into the trap of thinking our ancestors were ignorant or stupid. Rather, it indicates that Fourier analysis was something truly revolutionary, with tremendous implications for the very foundations of mathematics (more on that later). Even today, simply answering whether a trigonometric series does indeed define a certain kind of function is no simple task. One of the greatest theorems of twentieth-century mathematics answers an ostensibly simple question on the convergence of Fourier series of quite well-behaved functions…

Next time, we’ll look at the solutions of the equation.