Fourier analysis – the real story IV: How did it start?

Dedicated to the memory of Willem L Fouché who, amongst many other stellar contributions to my life, told me to go read Dym & McKean, and also taught me the connections between Fourier analysis and descriptive set theory.

It’s all well and good to learn advanced mathematics because it’s interesting or useful to you. Not everyone is interested in the history or origin of a field, and that is perfectly fine. For myself, I do not feel I fully understand something until I have some sense of how it originated and, more importantly, why. I occasionally discuss Fourier analysis on this blog, because I think it is kind of magical and because I feel I don’t have a deep, intuitive understanding of it yet. Recently, I’ve been considering the question of whether Fourier analysis was inevitable. Without necessarily going back through the historical record in detail, what would be my guess as to how it came about? This post should perhaps be regarded as historical fiction – an account of how things might have happened.

Since little (but not nothing) happens in a vacuum, let us start with the heat equation, which – as we’ve established – Fourier was obsessed with. Without giving initial or boundary conditions then, we’re looking at the equation

\frac{\partial u}{\partial t} = \alpha \frac{\partial^2 u}{\partial x^2}

when we’re only considering one dimension.

How do you solve a problem in mathematics? By some combination of two things:

  • solving an easier but related problem, or
  • breaking it up into bits you can solve.

We’ll use this combination to try to puzzle out how Fourier analysis could have come about.

The heat equation is not difficult to get a general solution for, if we leave out the initial and boundary conditions. Separating variables – that is, setting u(x,t) = X(x)T(t), substituting into the equation, and dividing through by \alpha X(x)T(t) – we get

\frac{T'(t)}{\alpha T(t)} = \frac{X''(x)}{X(x)} = -\lambda.

Since the two sides depend on different variables, they must both be equal to a constant. Using a minus sign in front of \lambda is a bit of a cheat, anticipating the form of the solution, but it will make things easier. Solving T'(t) = -\alpha \lambda T(t) and X''(x) = -\lambda X(x) separately, we see that the following function is a solution:

u(x,t) = \sin (\sqrt{\lambda} x) e^{-\alpha \lambda t}.

Of course, this works for \cos as well, and we can multiply by arbitrary constants without breaking the solution, so we have

u(x,t) = A\sin (\sqrt{\lambda} x) e^{-\alpha \lambda t} + B\cos (\sqrt{\lambda} x) e^{-\alpha \lambda t}.
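
If you want to check this without grinding through the derivatives by hand, a few lines of sympy will do it. A minimal sketch (the symbol names are my own):

import sympy as sp

x, t, lam, alpha, A, B = sp.symbols('x t lam alpha A B', positive=True)
u = (A*sp.sin(sp.sqrt(lam)*x) + B*sp.cos(sp.sqrt(lam)*x)) * sp.exp(-alpha*lam*t)
# The residual u_t - alpha*u_xx should simplify to zero
print(sp.simplify(sp.diff(u, t) - alpha*sp.diff(u, x, 2)))  # prints 0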

Now, a differential equation usually isn’t much use without some initial and boundary conditions, so let’s suppose we’re looking at something (a rod, perhaps) of unit length, whose ends are kept at zero degrees:

u(0,t) = u(1,t) = 0.

To ensure this is satisfied, we can set B = 0 and replace x by 2\pi x. On its own, this would break our solution, but we can compensate by rescaling the time exponent to match, giving u(x,t) = A\sin (2\pi \sqrt{\lambda} x) e^{-4\pi^2 \alpha \lambda t}, which is again a solution. (Some members of the audience are screaming out that I’m missing a whole bunch of solutions – don’t worry, we’ll get to it!)

It’s going well so far, but what about the initial conditions? This is where it gets interesting. But making it too interesting makes it very difficult, so let’s start with the easiest possible case. Since the initial condition is given by

u(x,0) = f(x),

we surely can’t get any simpler than setting f(x) = \sin 2\pi x. The problem is solved! At least, as long as we can set \lambda = 1, which we can do because it was, after all, arbitrary. Unfortunately, things usually don’t stay this simple, but let’s make it incrementally more difficult. What if

f(x) = a_1 \sin 2\pi x + a_2 \sin 4\pi x?

(We only take integer multiples in the argument of \sin since we still have to satisfy the boundary conditions.) That’s absolutely no problem, either. The sum of two solutions is still a solution (because our differential equation is linear), so we can solve separately for the two terms and just add them, with each term carrying its own value of \lambda (and hence its own rate of decay in time). We can do this for any number of terms. In other words, we can solve the problem for any initial condition of the form

f(x) = \sum_{n=1}^{N} a_n \sin 2\pi n x.
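
Putting the pieces together, the full solution for such an initial condition is

u(x,t) = \sum_{n=1}^{N} a_n \sin (2\pi n x) e^{-4\pi^2 n^2 \alpha t}.

Here is a minimal numerical sketch of this in Python (the function name heat_solution and its parameters are my own invention):

import numpy as np

def heat_solution(x, t, a, alpha=1.0):
    # u_t = alpha*u_xx on [0,1], with u(0,t) = u(1,t) = 0 and
    # initial condition sum_n a[n-1]*sin(2*pi*n*x)
    u = np.zeros_like(x)
    for n, a_n in enumerate(a, start=1):
        u += a_n * np.sin(2*np.pi*n*x) * np.exp(-alpha * (2*np.pi*n)**2 * t)
    return u

x = np.linspace(0, 1, 201)
print(heat_solution(x, 0.01, [1.0, 0.5]))  # the two-term example above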

Great! We’ve solved the problem for a whole class of functions. The question becomes: exactly how big is this class? And can we use this method to expand this class?

Fourier’s audacity was to say that, if we allow N to be infinite, all functions can be written this way. Now, this isn’t exactly true, and much of Fourier analysis has focused on finding out specifically how true it is, and more modern notions of convergence are necessary to even frame the question properly (back in Fourier’s day, they played a bit fast and loose with these issues). Certainly, most “nice” functions on the unit interval can be written this way, meaning the heat equation is solved for all of them. That is a great accomplishment!

Fair enough, but this part of the story is still not obvious. Perhaps Fourier had a suspicion that functions can be expanded as sums of trigonometric functions, but how do you go about verifying that? I can only imagine that an enormous amount of work went into this. Nowadays, it is the work of a few minutes to write a script that will output the visualization of a trigonometric sum of the above type. In Fourier’s day, this all had to be done by hand. This might seem like a disadvantage, but I’m not so sure. You would need to think deeply on what you would spend your time on, and choose your problems with care. I’m not against the widespread use of computing in mathematics, but I do think we can learn something from the work habits of the old masters.
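
For instance, something like the following plots partial sums in a few lines (the coefficients a_n = 1/n are made up for illustration; they happen to approximate a sawtooth):

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 1, 1000)
for N in (1, 3, 10, 50):
    # Partial sum of sum_n sin(2*pi*n*x)/n up to N terms
    s = sum(np.sin(2*np.pi*n*x) / n for n in range(1, N + 1))
    plt.plot(x, s, label=f'N = {N}')
plt.legend()
plt.show()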

So, we can assume that Fourier (like his contemporaries) was an absolute wizard at calculus. If I had to reconstruct his thought process – albeit through a modern lens – I would imagine it went something like the following.

Fourier didn’t have function spaces and orthonormal bases to play around with – even the basic concepts of set theory were still decades away (and Fourier analysis played a crucial role in Cantor’s work, too). But he probably would have been exquisitely aware of the following integrals:

\int_0^1 \sin 2\pi nx \sin 2\pi mx \, dx = 0 \quad \textrm{when } n \neq m

and

\int_0^1 \sin^2 2\pi nx \, dx = \frac{1}{2}.
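
These are easy to confirm by hand with the product-to-sum identities, or with a quick symbolic check – a sketch using sympy:

import sympy as sp

x = sp.symbols('x')
for n in (1, 2, 3):
    for m in (1, 2, 3):
        # Orthogonality of sin(2*pi*n*x) on the unit interval
        val = sp.integrate(sp.sin(2*sp.pi*n*x) * sp.sin(2*sp.pi*m*x), (x, 0, 1))
        print(n, m, val)  # 1/2 when n == m, 0 otherwise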

If we now take f to be the trigonometric sum given above, we can say that

\int_0^1 f(x) \sin 2\pi m x \, dx = \sum_{n=1}^{N} a_n \int_0^1 \sin 2\pi n x \sin 2\pi m x \, dx.

Using the identities above, we can conclude that, for m \leq N,

\int_0^1 f(x) \sin 2\pi m x \, dx = \frac{a_m}{2}.

In other words, if we assume that f is a linear combination of “nice” \sin terms, we can recover the coefficients of that linear combination by “projecting” f onto each term. If we assume that N can be infinite, and that any function (appropriate to the boundary conditions) can be written as such an infinite linear combination, we have a way of finding each coefficient. This means that if we are able to disregard a whole mess of convergence issues, we can solve the heat equation for a very wide range of initial conditions. To make this more general, we can do something very similar with \cos, which will allow us to handle other boundary conditions.
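
As a sanity check, here is that projection carried out numerically on a two-term example (a minimal sketch; np.trapz is just simple quadrature):

import numpy as np

x = np.linspace(0, 1, 10001)
f = np.sin(2*np.pi*x) + 0.5*np.sin(2*np.pi*2*x)  # a_1 = 1, a_2 = 0.5
for m in range(1, 4):
    a_m = 2 * np.trapz(f * np.sin(2*np.pi*m*x), x)
    print(m, round(a_m, 4))  # approximately 1.0, 0.5, 0.0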

Of course, the mathematical world did not immediately accept Fourier’s methods, with good reason. A lot of work remained to be done. Even today, the convergence of Fourier-type series is an active area of investigation. I can imagine that those who had to use the heat equation in practice welcomed this advance, though. Indeed, it was only a short while before the consequences of this stretched far beyond application to the heat equation…

Fourier analysis – the real story II

One thing I didn’t know about Fourier was his obsession with heat – and not just in the mathematical sense. He literally thought that heat was some sort of panacea. So much so that it contributed to his death! See https://engines.egr.uh.edu/episode/186, for instance.

So, we have the heat equation. How do we solve it? More importantly, how did Fourier solve it? This is somewhat difficult to ascertain in general, for Fourier’s book is written in a manner quite different from what is today accepted as a mathematical text. First of all, there are many more words and lengthy physical explanations – Fourier’s aim, after all, was not to promulgate a mathematical theory, but to give a real and useful way of solving the problem of heat in many bodies under many conditions. Of course, we also cannot expect a book from the early 19th century to adhere to our standards of mathematical rigour.

Fortunately, I have come across a great book which starts off with Fourier’s approach to the heat equation, namely “Hot Molecules, Cold Electrons” by Paul J. Nahin. If you haven’t read any of Nahin’s books, do yourself a favour. He’s an electrical engineer with a keen appreciation for mathematics, and he makes it a lot of fun. Over the next few posts then, since we’re exploring the origins of Fourier analysis, I thought I would present an argument by Fourier that is recounted in Nahin’s book. I’m not going to go into too much detail, because working that out is part of the fun.

What Fourier was so very good at was solving problems by representing functions as infinite series. It does not seem at first that this would make things simpler, but it does – a lot. Many (including Laplace and Lagrange) were skeptical of Fourier’s methods, and honestly, the techniques did not exist yet to fully justify them. Nevertheless, Fourier knew they worked, and he became a titan of mathematics because of them. Here, we will look at his derivation of the identity

\frac{\pi}{4} = \cos x - \frac{1}{3} \cos 3x + \frac{1}{5} \cos 5x - \frac{1}{7} \cos 7x + \cdots.

Of course, the first striking thing about this identity is that the right-hand side depends on x, whilst the left does not. This immediately raises suspicion – as it should, since the identity is not actually correct. Before delving into the derivation, let us approximate the right-hand side with some software to see how wrong the identity is. We can use the following in MATLAB:

% Plot the first 100 terms of cos x - cos(3x)/3 + cos(5x)/5 - ...
test = @piover4;
figure
fplot(test, [-5 5])

function f = piover4(x)
f = 0;  % accumulate the partial sum
for i = 1:100
    f = f + ((-1)^(i+1)) * cos((2*i - 1)*x) / (2*i - 1);
end
end

Clearly, the equation is not exactly correct, but it definitely works over a large range of x. By the way, if we want to do this in Python we can use

import numpy as np
import matplotlib.pyplot as plt

def piover4(x):
    # Partial sum of cos x - cos(3x)/3 + cos(5x)/5 - ... (first 100 terms)
    total = np.zeros_like(x)
    for i in range(1, 101):
        total += (-1)**(i + 1) * np.cos((2*i - 1) * x) / (2*i - 1)
    return total

xx = np.linspace(-5, 5, 1000)
yy = piover4(xx)
plt.plot(xx, yy)
plt.show()

The two different values the series converges to over most of the range of x are probably not too concerning – those are just sign reversals in the series, and we’ll probably see what’s going on there during the derivation. Much more interesting are the errors near the discontinuities. For the moment, we will park that discussion, although it turns out to be very important. In the next post, we will look at Fourier’s derivation of the identity, and figure out why it’s wrong, but also kind of correct.