Typical elements

I promised myself that I would make at least every second post a proper mathematical one, so here goes… In fact, I will break this topic up into sections, so there will be a continuation of this post. Note that some of the links open up pdf files.

I am reading the proof of the ergodic theorem in the book Nonstandard Methods in Ramsey Theory and Combinatorial Number Theory by Mauro Di Nasso, Isaac Goldbring and Martino Lupini. Since the proof is clearly presented in the book and is freely available, I will not go into detail here. There is, however, one part of the proof which is not presented in the book – presumably because it would take one too far afield. This concerns the existence of “typical elements”:

Definition 1. Let ([0,1]^{\mathbb{N}}, \mathcal{C}, T, \nu) be a measure-preserving dynamical system, where \mathcal{C} is the Borel \sigma-algebra on [0,1]^{\mathbb{N}}, T is the unilateral shift operator and \nu is a T-invariant probability measure. An element \alpha of [0,1]^{\mathbb{N}} is called typical if, for any f \in C([0,1]^{\mathbb{N}}),

\displaystyle \lim_{n\to \infty} \frac{1}{n}\sum_{i=0}^{n-1} f(T^i \alpha ) = \int_{[0,1]^{\mathbb{N}}}f(y)d\nu.
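
To get a feel for the definition, here is a small numerical sketch, entirely my own and not from the book or the paper: take \nu to be the product of Lebesgue measure on each coordinate (which is shift-invariant) and f a continuous function depending on only two coordinates; by the ergodic theorem, a point with independent, uniformly distributed coordinates is \nu-almost surely typical, so its ergodic averages should settle down to \int f d\nu = 1/4.

import numpy as np

# Sketch: the shift on [0,1]^N with nu the product of Lebesgue measures
# (a shift-invariant probability measure).  A point whose coordinates are
# drawn independently and uniformly from [0,1] is nu-almost surely typical;
# this fact rests on the ergodic theorem, which is exactly what the
# construction discussed below avoids.
rng = np.random.default_rng(0)
n = 200_000
alpha = rng.random(n + 1)   # the first n+1 coordinates of a "random" alpha

# f(x) = x_0 * x_1 is continuous in the product topology and depends on
# only two coordinates, so f(T^i alpha) = alpha_i * alpha_{i+1}.
ergodic_average = np.mean(alpha[:-1] * alpha[1:])
print(ergodic_average)      # should be close to int f dnu = 1/4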

I have recently spent quite some time on ideas surrounding the concept of equidistribution, which is why I immediately found typical elements appealing. The idea that you can have a single element which can be used, via an ergodic average, to approximate the integral of any continuous function is perhaps not shocking, but pleasing nonetheless. My immediate reaction is to wonder what else we can say about these elements and collections of them. For instance, how many are there? How accessible are these elements in constructive terms? I have not yet explored these notions, but am eager to do so.

Existence can be proved with the ergodic theorem, but more interesting to me is that it can also be done without. The proof I will present here is from the paper by Kamae (who came up with the nonstandard proof of the ergodic theorem), who in turn states that he found it in Ville. The proof relies on little but some basic measure theory. I will stick closely to the Kamae paper, but hopefully clear up some details.

The key to the proof is to construct a sequence of periodic elements of [0,1]^{\mathbb{N}} which can be used to approximate the measure \nu. We say that \alpha = (\alpha_1, \alpha_2, \alpha_3, \dots) \in [0,1]^{\mathbb{N}} is periodic if there is some k such that \alpha_i = \alpha_{i+k} for all i\in \mathbb{N}. Given a periodic \alpha with period p, we construct a measure on [0,1]^{\mathbb{N}} by setting

\displaystyle \mu_{\alpha} = \frac{1}{p}(\delta_{\alpha} + \delta_{T\alpha}+\dots +\delta_{T^{p-1}\alpha}),

\delta_{\beta} as usual denoting the Dirac measure at \beta. We want to show that we can find a sequence of periodic elements such that the associated measures converge weakly to \nu.
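
A small remark that is not spelled out above, but is easy to check: each \mu_{\alpha} is itself a T-invariant probability measure. Pushing each Dirac measure forward by T shifts the orbit points along by one place, and the periodicity T^p \alpha = \alpha closes the cycle:

\displaystyle T_{*}\mu_{\alpha} = \frac{1}{p}(\delta_{T\alpha} + \delta_{T^2\alpha}+\dots +\delta_{T^{p}\alpha}) = \mu_{\alpha}.

So the \mu_{\alpha} live in the same class of shift-invariant probability measures as \nu, which makes them reasonable candidates for approximating it.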

The idea of the proof is now to encode cylinder sets in [0,1]^{\mathbb{N}} using a finite Cartesian power of a finite alphabet, then find a measure on that product which is very close to \nu. The new measure will yield a kind of “maximal sequence” in the product, which we can use to construct our periodic element. But for now I am going to skip straight to the end and show how we can use such elements to get a typical one. In a follow-up post, we will get to the construction of the periodic elements, which is the real meat of the proof.

Suppose now that we have a sequence \alpha_1, \alpha_2, \alpha_3, \dots of periodic elements, with \alpha_i having period c_i, such that the associated measures \mu_{\alpha_i} converge weakly to \nu as i\to \infty. Since each of these elements is determined by a finite number of “bits”, we can capture the full information of each one in a finite string. To get our typical element, we might be tempted simply to stick the first c_{i+1} bits of \alpha_{i+1} onto the end of the first c_i bits of \alpha_i, but this leads to convergence problems when we look at \int f d\nu again. Rather, we should think of the \mu_{\alpha_i} as increasingly good approximations to \nu, and so would want \alpha_i to play a greater role in \alpha than \alpha_j when i>j. So we take the first c_1 t_1 bits of \alpha_1 to form the first c_1 t_1 bits of \alpha, follow them with the first c_2 t_2 bits of \alpha_2, and so on, where t_1 < t_2 < t_3 < \cdots. We will also require that c_i t_i is sufficiently small with respect to c_{i+1}t_{i+1}. To be a little more formal about it, we set

\displaystyle \alpha (n) = \alpha_m (n - T_{m-1}),

for any n\in \mathbb{N} with T_{m-1} \leq n < T_{m}, where T_0 = 0 and T_{m} = T_{m-1} + c_m t_m for m \geq 1.
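
As a sanity check on the bookkeeping, here is a small sketch of the concatenation; the periodic blocks used are arbitrary stand-ins of my own, not the elements the actual construction produces.

import numpy as np

# Build the first T_M coordinates of alpha by taking, for each m,
# t_m full periods of the periodic element alpha_m (period c_m).
def build_alpha(periods, reps):
    # periods[m] holds one period of alpha_{m+1}, reps[m] is t_{m+1}
    return np.concatenate([np.tile(block, t) for block, t in zip(periods, reps)])

periods = [np.array([0.1, 0.9]),            # c_1 = 2
           np.array([0.2, 0.5, 0.8]),       # c_2 = 3
           np.array([0.3, 0.4, 0.6, 0.7])]  # c_3 = 4
alpha = build_alpha(periods, reps=[1, 10, 100])
print(len(alpha))   # T_3 = 2*1 + 3*10 + 4*100 = 432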

In order to show that \alpha is indeed typical, it seems reasonable to rewrite the sum in Definition 1 as some form of integral with respect to the measures \mu_{\alpha_i}. Since f is a continuous function on a compact space, we know that it is bounded and, for convenience, we may take it to be positive. What we want then is something of the form

\displaystyle \frac{1}{n} \sum_{i=0}^{n-1} f(T^{i} \alpha) \approx \int f d\mu_{\alpha_m} \to \int f d\nu,

where m is the index of the block of the construction in which a large n falls.

We can now see why it is necessary for the t_i to increase quickly. It is a useful exercise to write out the integral \int f d \mu_{\alpha_m} as a sum according to the definition of the Lebesgue integral; since \mu_{\alpha_m} is a finite average of Dirac measures, one finds that

\displaystyle \int f d \mu_{\alpha_m} = \frac{1}{c_m}\sum_{i=0}^{c_m -1} f(T^i \alpha_m).
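
For the record, the exercise amounts to a single line: by linearity,

\displaystyle \int f d \mu_{\alpha_m} = \frac{1}{c_m}\left( \int f d\delta_{\alpha_m} + \int f d\delta_{T\alpha_m} + \dots + \int f d\delta_{T^{c_m -1}\alpha_m}\right),

and since \int f d\delta_{\beta} = f(\beta), each term is just f evaluated along the orbit of \alpha_m.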

Due to the use of the periods of the \alpha_m in the construction of \alpha, the average of f along the first T_m terms of the orbit of \alpha essentially reproduces this period average:

\displaystyle \frac{1}{T_m} \sum_{i=0}^{T_m -1} f(T^i \alpha) = \int f d\mu_{\alpha_m} + \varepsilon ,

where \varepsilon collects the error coming from the terms f(T^i \alpha) with i=0,\dots ,T_{m-1} -1, that is, from the part of \alpha built out of the earlier \alpha_j, together with a small contribution from the terms near the end of the block, where the tail of \alpha no longer matches \alpha_m. This can be made as small as we please by allowing the t_i to increase rapidly, which yields the result.
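
To see more explicitly why rapid growth of the t_i controls the first source of error, write \|f\|_{\infty} for the supremum of |f|. Comparing the average over all T_m terms with the average over the m-th block alone gives the crude estimate

\displaystyle \left| \frac{1}{T_m} \sum_{i=0}^{T_m -1} f(T^i \alpha) - \frac{1}{T_m - T_{m-1}} \sum_{i=T_{m-1}}^{T_m -1} f(T^i \alpha) \right| \leq \frac{2\|f\|_{\infty} T_{m-1}}{T_m},

which tends to zero provided T_{m-1}/T_m \to 0, that is, provided each c_m t_m is large relative to the total length of the preceding blocks.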

Some of the fine detail has of course been left out here, but not, I think, anything that is too difficult to supply oneself. In a future post, I will discuss how we find the periodic elements \alpha_i using the measure \nu.


Trying to be a mathematician

The title of this blog has changed recently, as previous visitors may have noticed. I have been trying to decide on a direction for some time, and wanted it to be something close to the thing most important to me, namely mathematics. More than that, I want it to be about my mathematics: not just technical exposition, although there will be some of that as well, but my struggle to become a good mathematician, whatever that means.

The blog will therefore be of a somewhat mixed nature. Some of the posts will be more personal, and may not always be of interest to someone looking for mathematics. But I suspect there are quite a few people out there in positions similar to mine who may benefit from knowing that the struggle is a shared one. This applies not only to mathematicians, but to scientists and academics in general. In particular, it may help some of those new to the business to at least avoid some of the mistakes I have made…

In addition, there will be purely mathematical entries. I have discovered (at a late stage, it must be said) that to understand something I have to write about it. I think I have avoided doing so mostly because writing mathematical documents is not easy. Yet one cannot be said to understand a topic to a reasonable degree unless one can convey it in writing. Lecturing is also good, but one usually doesn’t have the leisure to lecture on everything one wants to understand, and it does not leave a permanent record to consult at a later stage. Writing a blog on the subject is a good compromise, since one has to make the material presentable to an audience, who can (at best) offer insight and correction and (at worst) savage one’s lack of understanding.

I make no claims of originality here. Much of what I write will have been stated by others. For instance, Terry Tao has a blog post from years ago that gives pretty much the exact advice of the previous paragraph. He has a great page of career advice at https://terrytao.wordpress.com/career-advice/, which should be read. All I can do here is offer my own take on things.

This all sounds quite serious, which will hopefully not be the dominant tone of this blog. One piece of advice for budding mathematicians would be not to take yourself too seriously. I have to keep reminding myself of this. Mathematics should be fun, and if it is not, find yourself another job. I tend to fall into a deep depression and blame myself endlessly when a solution I have been working on for months turns out to have a trivial counterexample. The answer is to get away from the problem, take a walk in a park, and realise that you are not a failure as a human being because of this. Sure, not every minute of mathematics will be pleasant, but if you can’t periodically remind yourself that you are doing this because you like it, you will suffer under a great psychological burden.

Thus, if I post something which seems silly, realise that this is just me reminding myself to lighten the hell up.