I promised myself that I would make at least every second post a proper mathematical one, so here goes… In fact, I will break this topic up into sections, so there will be a continuation of this post. Note that some of the links open up pdf files.
I am reading the proof of the ergodic theorem in the book Nonstandard Methods in Ramsey Theory and Combinatorial Number Theory by Mauro Di Nasso, Isaac Goldbring and Martino Lupini. Since the proof is clearly presented in the book and is freely available, I will not go into detail here. There is however one part of the proof which is not presented in the book – one assumes because it would take one too far afield. This concerns the existence of “typical elements”:
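Definition 1. Let $([0,1]^{\mathbb{N}}, \mathcal{B}, T, \nu)$ be a measure-preserving dynamical system, where $\mathcal{B}$ is the Borel $\sigma$-algebra on $[0,1]^{\mathbb{N}}$, $T$ is the unilateral shift operator and $\nu$ is a $T$-invariant probability measure. An element $\alpha$ of $[0,1]^{\mathbb{N}}$ is called typical if, for any $f \in C([0,1]^{\mathbb{N}})$,

$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} f(T^i \alpha) = \int f \, d\nu.$$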
I have recently spent quite some time on ideas surrounding the concept of equidistribution, which is why I immediately found typical elements appealing. The idea that you can have a single element which can be used, via an ergodic average, to approximate the integral of any continuous function is perhaps not shocking, but pleasing nonetheless. My immediate reaction is to wonder what else we can say about these elements and collections of them. For instance, how many are there? How accessible are these elements in constructive terms? I have not yet explored these notions, but am eager to do so.
Existence can be proved with the ergodic theorem, but more interesting to me is that it can also be done without. The proof I will present here is from the paper by Kamae (who came up with the nonstandard proof of the ergodic theorem), who in turn states that he found it in Ville. The proof relies on little but some basic measure theory. I will stick closely to the Kamae paper, but hopefully clear up some details.
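The key to the proof is to construct a sequence of periodic elements of $[0,1]^{\mathbb{N}}$ which can be used to approximate the measure $\nu$. We say that $\alpha$ is periodic if there is some $p \geq 1$ such that $\alpha(n+p) = \alpha(n)$ for all $n \in \mathbb{N}$. Given a periodic $\alpha$ with period $p$, we construct a measure $\mu_\alpha$ on $[0,1]^{\mathbb{N}}$ by setting

$$\mu_\alpha = \frac{1}{p} \sum_{i=0}^{p-1} \delta_{T^i \alpha},$$

with $\delta_x$, as usual, denoting the Dirac measure at $x$. We want to show that we can find a sequence $(\alpha_n)$ of periodic elements such that the associated measures $\mu_{\alpha_n}$ converge weakly to $\nu$.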
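To make the measure attached to a periodic element concrete, here is a minimal Python sketch. It assumes, purely for illustration, that the test function depends only on the first `window` coordinates, and it represents a periodic sequence by a single period (the list `block`). The names `shift_window` and `integrate_periodic` are hypothetical. Integrating against the averaged Dirac measures then amounts to averaging the function over the shifts within one period.

```python
from fractions import Fraction

def shift_window(block, i, window):
    """First `window` coordinates of the i-th shift of the periodic
    sequence whose single period is `block`."""
    p = len(block)
    return tuple(block[(i + j) % p] for j in range(window))

def integrate_periodic(block, f, window):
    """Integral of f against the measure of a periodic sequence:
    the average of f over the p shifts within one period."""
    p = len(block)
    total = sum(Fraction(f(shift_window(block, i, window))) for i in range(p))
    return total / p

# Toy example: period (0, 0, 1, 1); f indicates that the first two
# coordinates are (0, 1), which happens for exactly one shift in four.
block = [0, 0, 1, 1]
indicator = lambda w: 1 if w == (0, 1) else 0
result = integrate_periodic(block, indicator, window=2)
print(result)  # 1/4
```

Exact fractions are used so that the average can be compared with the expected value without rounding.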
The idea of the proof is now to encode a cylinder set in $[0,1]^{\mathbb{N}}$ as a finite Cartesian product of a finite alphabet, then find a measure on that product which is very close to $\nu$. The new measure will yield a kind of “maximal sequence” in the product, which we can use to construct our periodic element. But for now I am going to skip straight to the end and show how we can use such elements to get a typical one. In a follow-up post, we will get to the construction of the periodic elements, which is the real meat of the proof.
Suppose now that we have a sequence $(\alpha_n)$ of periodic elements, with $\alpha_n$ assumed to have period $p_n$, such that the associated measures $\mu_{\alpha_n}$ converge weakly to $\nu$ as $n \to \infty$. Since each of these elements is determined by a finite number of “bits”, we can get the full information of each one in a finite string. To get our typical element $\alpha$ then, we might be tempted to stick the first $p_2$ bits of string $\alpha_2$ onto the end of the first $p_1$ bits of string $\alpha_1$, and so on, but this will lead to convergence problems when we look at the averages in Definition 1 again. Rather, we can think of the $\alpha_n$ as increasingly good approximations to $\alpha$, and so would want $\alpha_n$ to play a greater role in $\alpha$ than $\alpha_m$, when $n > m$. So we take the first $t_1 p_1$ elements of $\alpha_1$ to form the first elements of $\alpha$, follow them with the first $t_2 p_2$ bits of $\alpha_2$ and so on, where $t_1 < t_2 < \cdots$. We will also require that $p_{n+1}$ is sufficiently small with respect to $t_n p_n$. To be a little more formal about it, we set

$$\alpha(T_n + i) = \alpha_{n+1}(i)$$

for any $n \geq 0$, with $0 \leq i < t_{n+1} p_{n+1}$ and $T_n = \sum_{j=1}^{n} t_j p_j$, with $T_0 = 0$.
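The block construction can be sketched in a few lines of Python. This is an illustration under assumptions of my own: each periodic element is given by one period (a list), `repeats[k]` plays the role of the number of full periods taken from the k-th element, and the helper names `build_prefix` and `block_offsets` are hypothetical. Only a finite prefix of the limiting sequence is ever built.

```python
def build_prefix(blocks, repeats):
    """Concatenate repeats[k] full periods of blocks[k], in order.
    Taking the first t*p entries of a sequence with period p is the
    same as repeating one period t times."""
    prefix = []
    for block, t in zip(blocks, repeats):
        prefix.extend(block * t)
    return prefix

def block_offsets(blocks, repeats):
    """Positions where each block starts: 0, then the running sums
    of (number of repeats) * (period length)."""
    offsets = [0]
    for block, t in zip(blocks, repeats):
        offsets.append(offsets[-1] + t * len(block))
    return offsets

blocks = [[0, 1], [0, 0, 1, 1]]   # periods 2 and 4
repeats = [2, 3]                  # increasing repeat counts
alpha = build_prefix(blocks, repeats)
offsets = block_offsets(blocks, repeats)
print(alpha)    # [0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1]
print(offsets)  # [0, 4, 16]
```

The offsets are the points at which the sequence switches from one periodic element to the next, so the entry at `offsets[k] + i` is entry `i` of the k-th periodic element.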
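In order to show that $\alpha$ is indeed typical, it seems reasonable to write the sum in Definition 1 as some form of integral with respect to the $\mu_{\alpha_n}$. Given that $f$ is a continuous function on a compact space, we know that its range is bounded and, for convenience, we may take it to be positive. What we want then is something of the form

$$\frac{1}{N} \sum_{i=0}^{N-1} f(T^i \alpha) \approx \int f \, d\mu_{\alpha_n}$$

for large $N$ and $n$. We can now see why it is necessary for the $t_n$ to increase quickly. It is a useful exercise to write out the integral $\int f \, d\mu_{\alpha_n}$ as a sum according to the definition of a Lebesgue integral, to see that, for large $n$,

$$\int f \, d\mu_{\alpha_n} = \frac{1}{p_n} \sum_{i=0}^{p_n - 1} f(T^i \alpha_n) \approx \int f \, d\nu.$$

Due to the use of the periods of the $\alpha_n$ in the construction of $\alpha$, we see that

$$\frac{1}{T_n} \sum_{i=0}^{T_n - 1} f(T^i \alpha) \approx \frac{1}{p_n} \sum_{i=0}^{p_n - 1} f(T^i \alpha_n) + \varepsilon_n,$$

where $\varepsilon_n$ indicates the error due to the terms for $i < T_{n-1}$. Since $f$ is bounded, this error is of the order of $T_{n-1}/T_n$, so it can be made as small as we please by allowing the $t_n$ to increase rapidly, which yields the result.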
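As a numerical sanity check of the averaging argument (not part of Kamae's proof), here is a small Python experiment. The choices are mine and purely illustrative: the target measure is the fair-coin product measure on 0/1-valued sequences, and the periodic approximants are binary de Bruijn cycles, which contain every word of length $n$ exactly once per period, so their block frequencies match the fair coin exactly up to length $n$. The helper names are hypothetical.

```python
def de_bruijn_binary(n):
    """One period of a binary de Bruijn cycle of order n (length 2**n),
    built with the standard Lyndon-word recursion."""
    a = [0] * (n + 1)
    out = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                out.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, 2):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return out

def ergodic_average(seq, f, window, N):
    """(1/N) * sum over i < N of f applied to the first `window`
    coordinates of the i-th shift of seq."""
    return sum(f(tuple(seq[i:i + window])) for i in range(N)) / N

orders = range(1, 6)
blocks = [de_bruijn_binary(n) for n in orders]
repeats = [4 ** n for n in orders]   # rapidly increasing repeat counts
alpha = []
for block, t in zip(blocks, repeats):
    alpha.extend(block * t)

both_ones = lambda w: 1 if w == (1, 1) else 0
N = len(alpha) - 1                   # leave room for the length-2 window
avg = ergodic_average(alpha, both_ones, 2, N)
print(avg)  # compare with 1/4, the fair-coin probability of seeing (1, 1)
```

Slowing the growth of `repeats` gives the early, cruder blocks more weight in the average, which is the effect the error term is meant to control.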
Some of the fine detail has of course been left out here, but not, I think, anything that is too difficult to supply oneself. In a future post, I will discuss how we find the $\alpha_n$ using the measure $\nu$.