Graphs and arithmetic progressions

In a previous post I wrote about hypergraph complexity, which I encountered in the paper by Conlon, Fox and Zhao [1]. In that same paper, the hypergraph techniques were used to prove a theorem on the existence of arithmetic progressions. In this post, I want to discuss what arithmetic progressions have to do with graphs, which will perhaps give an indication of why the hypergraph approach is useful. To recap, here is the definition of a hypergraph:

Definition 1. Let J be a finite set and r a positive integer. Define {J \choose r} = \{ e\subseteq J: |e| =r\} to be the set of all r-element subsets of J. An runiform hypergraph on J is defined to be any subset H \subseteq  {J \choose r}.

Note that the usual definition of a graph corresponds to the case r=2 in the above definition. One could picture an r-uniform hypergraph as a normal graph, if we require that an edge between v_1 ,\dots v_r exists if and only if there is an edge (in the usual sense of a graph) between v_i and v_{i+1} for i = 1,2,\dots , r-1, plus an edge between v_r and v_1.

Using graphs in the context of arithmetic progressions goes back (at least) to the paper of Ruzsa and Szemeredi [2]. In this, they show that the triangle removal lemma implies Roth’s theorem. For this, I am going to follow the discussion in the book [3]. Consider the following theorem of Ajtai and Szemeredi [3, p23]:

Theorem 2. Let R\subset [N]^2. For \delta >0 there exists an N_0 = N_0 (\delta) such that if |R|\geq \delta N^2 and N\geq N_0, then there exist (x,y), (x+d,y), (x,y+d) \in R for some integer d\neq 0.

Getting from Theorem 2 to Roth’s theorem is not hard. Define R\subseteq [N]^2 by letting (x,y)\in R if x-y \in A, where we assume A is some set of integers with density. We assume that [N] is large enough so that A has a density of \delta in [N]. We can see this as a graph with the vertices given by [N], or we can consider [N]^2 as a grid. In the latter case, the “lower left” (under the diagonal) part of the grid contains all possible differences a subset of [N] could generate, and hence each element of A must be represented. We will now try to minimise the occurrences of elements of A, in order to be able to apply Theorem 2. Looking at one half of the grid (the easiest thing is to draw a small one of your own), we see that 1 occurs N-1 times, 2 occurs N-2 times, and so on. Clearly then, if we want to minimise the elements of A, they must occur as the largest numbers available in the grid. The number of occurrences is then at least 1 + 2+ \cdots + \delta N (approximately), which means we have \frac{1}{2} (\delta N)^2 occurrences of A in the worst-case scenario. This means that we can apply the Ajtai-Szemeredi theorem.

Another way of using graphs for 3-arithmetic progressions is to consider tripartite graphs. For that, we will need the following:

Theorem 3. [2] (Triangle removal lemma for tripartite graphs) Let V_1, V_2, V_3 be finite non-empty sets of vertices, and let G = (V_1,V_2,V_3, E_{12}, E_{23}, E_{31}) be a tripartite graph on these sets of vertices, thus E_{ij} \subseteq V_i \times V_j for ij = 12,23,31. Suppose that the number of triangles in this graph does not exceed \delta |V_1| |V_2| |V_3| for some 0< \delta <1. Then there exists a graph G' = G'(V_1, V_2, V_3, E_{12}', E_{23}' , E_{31}') which contains no triangle whatsoever, and such that |E_{ij} \setminus E_{ij}'| = o_{\delta \to 0}(|V_i \times V_j |) for ij = 12, 23, 31.

In this case, o_{\delta \to 0}(X) denotes a quantity that is bounded in magnitude by c(\delta), where c(\delta) \to 0 as \delta \to 0. The following discussion is similar to that in

Now, suppose that A is a subset of \{1,2,\dots ,n\}=[n] so that |A| \geq \epsilon n. Suppose we have three elements of A. Is there a way of describing these elements that will only be possible if they are in arithmetic progression, in terms of general integers? Of course, one way to do so would be to say that if x,y, (x+y)/2 are all in A, we have a 3AP. However, if we want to use the Triangle Removal Lemma and specifically tripartite graphs, we should use 3 quantities to describe a progressions. It is quick to check that if we have integers x,y,z such that

y-x, \ \frac{z-x}{2}, \ z-y

are all in A, we have a 3AP. Suppose now we pick sets X,Y,Z = \{1,2, \dots ,2n\}.

Consider three disjoint vertex sets X, Y,Z, each of cardinality 3n, with the vertices of each labelled as \{ 1,2,\dots ,3n \}. The important thing about these sets is that the difference (as follows) will let us recover all elements of A. Let us connect a vertex x\in X and y\in Y if y-x \in A, z \in Z and x\in X if (z-x)/2 \in A, and z\in Z and y\in Y if z-y \in A. (Check that this representation captures all three APs, and no more.)

We now show how the triangle lemma implies Roth’s theorem. We let our graph be constructed from A_n =  A\cap [1,n] for some n, and assume that A_n = \epsilon n. Also assume that A_n contains no non-trivial arithmetic progressions. What does this mean for the tripartite graph? Especially, how many triangles does our graph contain? Since there are no non-trivial arithmetic progressions, we must have

y-x = z-y = \frac{z-x}{2} \in A.

From this, we can easily see that if we pick x, y such that y-x \in A, we can set z=2y -x to satisfy the above equations and obtain a triangle. How many ways are there to do this? Given some a\in A, we can pick y = a+ m, x = m and z= 2a+m. We therefore look at all triples of the form m,m+a,m+2a for a\in A, m\in [3n]. Given that A\subseteq [n], we should get at least \epsilon n^2 triangles. But how do we apply the triangle removal lemma here? The form it is in does not immediately lend itself to our problem. Let us rephrase it as follows.

Triangle Removal Lemma (2). For each \delta >0 there exists some \epsilon >0 such that if removing \delta n^2 edges does not make the graph triangle-free, then the graph must have more than \epsilon n^3 triangles.

This gives us enough for Roth’s theorem. We have constructed our triangles in a very specific way, so that we know they are the only triangles in the graph, and they do not share any edges. Given the lower bound on the number of triangles in the graph from above, this means that we will need to remove at least \epsilon n^2 edges to make the graph triangle-free. Therefore, removing \epsilon' n^2 edges will not make the graph triangle-free, where \epsilon' is slightly smaller than \epsilon. We must then have some \delta' such that the graph contains at least \delta' n^3 triangles. This is impossible however, since it is easily shown that the upper bound on the number of triangles is also O(n^2). For n large enough, we have a contradiction.

What is the real essence of this proof? It lies in the fact that to have a lot of triangles in such a graph, we would expect many of them to share edgea. This is why we can remove fewer edges than there are triangles in order to make the graph triangle-free. Since the set A has density, we know that there will be a significant number of triangles, but not too many, since they have to be disjoint.


  1. David Conlon, Jacob Fox and Yufei Zhao: A relative Szemeredi theorem, GAFA Vol.25 (2015)
  2. Ruzsa and Szemeredi: Triple systems with no six points carrying three triangles, Combinatorics (Keszthely, 1976), Coll. Math. Soc. J. Bolyai 18 (1978): 939-945.
  3. Terence Tao: A variant of the hypergraph removal lemma, Journal of combinatorial theory, Series A 113.7 (2006): 1257-1280.
  4. Fan Chung Graham: The Combinatorics of Patterns in Subsets and Graphs, 2004.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s