Ramsey’s Theorem and Ultrafilters

In this post, I will go through a proof of one of my favorite results in combinatorics using a technique that is not necessarily well known outside of logic.

Ramsey’s Theorem

There are many ways to describe (Infinite) Ramsey’s Theorem. One description uses graph theory. A graph is a set of vertices (nodes) and edges (connections between the nodes). A clique in a graph is a subset X of the set of vertices such that every two distinct vertices in X are joined by an edge. An anti-clique is a subset X of the set of vertices such that no two vertices in X are joined by an edge.

Theorem: Every infinite graph has an infinite clique or an infinite anti-clique.

We can formalize Ramsey’s Theorem in another way. A two-coloring of a set X is a function f : X \to \{ 0, 1 \}; in other words, a partition of X into two sets (the elements 0 and 1 are the “colors”, often representing blue and red). Given a set X, the set [X]^2 is the set of unordered pairs of distinct elements of X. For example, [\mathbb{N}]^2 is the set of unordered pairs of natural numbers: its elements are \{0, 1\}, \{0, 2\}, \{1, 2\}, \{0, 3\}, and so on.

Figure: A two-coloring of [\mathbb{N}]^2.

Theorem (Ramsey’s Theorem for Pairs): If f is a two-coloring of [\mathbb{N}]^2, there is an infinite set H such that all pairs of elements of H get the same color.

These two theorems are equivalent: given a two-coloring f of the set of pairs of natural numbers, you can form an infinite graph by letting the set of vertices be \mathbb{N}, and by putting an edge between numbers n and m if and only if f(n, m) = 1 (here and below, f(n, m) is shorthand for the color f(\{n, m\})). You can also go the other way: given a graph, you can form a two-coloring of the set of pairs of vertices by giving a pair color 1 if it is an edge and color 0 otherwise.
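The translation between colorings and graphs can be sketched in code, restricted to a finite set of vertices so that it can actually run. This is only an illustration: the helper names `coloring_to_graph` and `graph_to_coloring` and the example coloring `parity` are made up for this sketch.

```python
from itertools import combinations

def coloring_to_graph(f, n):
    """Edge set of the graph on vertices 0..n-1 induced by the pair coloring f."""
    return {frozenset(p) for p in combinations(range(n), 2) if f(*p) == 1}

def graph_to_coloring(edges):
    """Pair coloring induced by an edge set: 1 for edges, 0 for non-edges."""
    return lambda x, y: 1 if frozenset((x, y)) in edges else 0

# Hypothetical example coloring: color a pair by the parity of its sum.
def parity(x, y):
    return (x + y) % 2

edges = coloring_to_graph(parity, 6)
recovered = graph_to_coloring(edges)
```

Going from the coloring to the graph and back gives the same coloring on these vertices, which is the finite shadow of the equivalence between the two statements.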

This second statement, while more difficult to parse, is the one we will focus on for this post.

First, let’s prove an easier statement: if f is a two-coloring of \mathbb{N}, there is an infinite set X such that every element of X gets the same color. (This would be referred to as “Ramsey’s Theorem for Singletons”.)

Figure: A two-coloring of \mathbb{N}.

This result is fairly easy to prove: let A = \{ n : f(n) = 0 \} and B = \{ n : f(n) = 1 \}. One of these two sets must be infinite, and every element of A gets color 0 (blue), while every element of B gets 1 (red).
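A finite version of this argument is easy to run: on \{0, \ldots, n-1\}, one of the two color classes has at least half of the elements. The function name `larger_color_class` and the example coloring below are assumptions made for this sketch, not part of the proof.

```python
def larger_color_class(f, n):
    """Return (color, members) for the larger color class of f on 0..n-1."""
    a = [k for k in range(n) if f(k) == 0]
    b = [k for k in range(n) if f(k) == 1]
    return (0, a) if len(a) >= len(b) else (1, b)

# Hypothetical coloring: multiples of 3 are red (1), everything else is blue (0).
color, members = larger_color_class(lambda k: 1 if k % 3 == 0 else 0, 10)
# color == 0 and members == [1, 2, 4, 5, 7, 8]
```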

If we try to generalize this proof to Ramsey’s Theorem for Pairs, we can look at the sets A = \{ \{ x, y \} : f(x, y) = 0 \} and B = \{ \{ x, y \} : f(x, y) = 1 \}; clearly one of these is infinite. But these sets are sets of pairs, and Ramsey’s Theorem states that there is an infinite set of numbers where all the pairs of numbers get the same color (in graph-theoretic terms, it’s easy to find an infinite set of edges or non-edges in an infinite graph; we want an infinite set of vertices which are either all mutually connected or mutually disconnected).

Ramsey’s Theorem for Singletons is easy to prove because we know the color of each number; we know f(0), f(1), etc. But if we are coloring pairs, then perhaps f(0, 1) = 0, f(0, 2) = 1, f(0, 3) = 0, \ldots. In other words, the pairs containing 0 might be blue infinitely often, and they might be red infinitely often. We need a way to decide the color of each natural number “on average”. That is, given a natural number n, we would like to be able to say either “for most numbers m, f(n, m) = 0” or “for most numbers m, f(n, m) = 1”.

Of course, it is not obvious that it is possible to make such a claim for each number n. Sometimes it’s clear, for example if the color “stabilizes”: say f(0, 1) = 0, f(0, 2) = 1, f(0, 3) = 0, and for all n > 3, f(0, n) = 1. In that case, it is clear that the color of 0 is “usually” 1. But perhaps the color does not stabilize: maybe there are infinitely many numbers m such that f(0, m) = 0 and infinitely many n such that f(0, n) = 1. In that case, how would you decide what the color of 0 is on average?

Ultrafilters: “averaging” over infinity

This idea of averaging over an infinite set can be studied formally with the concept of an ultrafilter. An ultrafilter is a way of choosing which sets are “large”.

Definition: A filter on the natural numbers is a family of sets \mathcal{F} with the following properties:

  • for any sets A and B, if A \in \mathcal{F} and A \subseteq B \subseteq \mathbb{N}, then B \in \mathcal{F}
  • for any sets A, B \in \mathcal{F}, the intersection A \cap B \in \mathcal{F}
  • \emptyset \not \in \mathcal{F}.

An ultrafilter is a filter \mathcal{U} with the additional property that for all X \subseteq \mathbb{N}, either X \in \mathcal{U} or \mathbb{N} \setminus X \in \mathcal{U}.

Again, the idea is that the sets in an ultrafilter are considered “large”. Each of these properties represents some “largeness” principle. If a set is large, any set that contains it should also be large; if two sets are large, their intersection is large; the empty set is not large; and if a set is not large, the complement of it should be large. We say that a property “almost always” happens if it happens on a large set, and it “almost never” happens if it happens on a set which is not large.

There are some easy examples of ultrafilters: take any number n, and let \mathcal{U}_n = \{ A \subseteq \mathbb{N} : n \in A \}. It’s not hard to verify that all the properties are satisfied. Ultrafilters like these (the ones generated in some sense by a single number) are called principal. Non-principal ultrafilters are harder to construct, but using the axiom of choice it is possible to show that they also exist. Non-principal ultrafilters have a crucial property: they contain no finite sets. That means that if A is the complement of a finite set (“cofinite”), then A is in every non-principal ultrafilter, since one of A and its complement must be in the ultrafilter and the complement is finite.
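The ultrafilter properties can be checked by brute force on a small finite set. The sketch below does this for a principal ultrafilter on \{0, 1, 2, 3\}; the function names `powerset` and `is_ultrafilter` are made up for this illustration. On a finite set every ultrafilter is principal, so this cannot illustrate the non-principal case, only the definition.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of the universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_ultrafilter(family, universe):
    """Check the three filter properties plus the ultrafilter property."""
    universe = frozenset(universe)
    subsets = powerset(universe)
    family = set(family)
    upward = all(b in family for a in family for b in subsets if a <= b)
    meets = all((a & b) in family for a in family for b in family)
    proper = frozenset() not in family
    decides = all(x in family or (universe - x) in family for x in subsets)
    return upward and meets and proper and decides

universe = range(4)
# The principal ultrafilter generated by 2: all sets containing 2.
principal = [a for a in powerset(universe) if 2 in a]
# "At least 3 of the 4 elements" is not closed under intersection.
mostly = [a for a in powerset(universe) if len(a) >= 3]

print(is_ultrafilter(principal, universe))  # True
print(is_ultrafilter(mostly, universe))     # False
```

The second family shows why “large” cannot simply mean “contains most elements”: \{0, 1, 2\} and \{1, 2, 3\} both qualify, but their intersection does not.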

Proof of Ramsey’s Theorem

Let f : [\mathbb{N}]^2 \to \{0, 1\} be a two-coloring. Let \mathcal{U} be a non-principal ultrafilter. We use the ultrafilter \mathcal{U} to assign colors to each number as follows: g : \mathbb{N} \to \{ 0, 1 \} is defined as g(n) = 0 if and only if \{ x : f(n, x) = 0 \} \in \mathcal{U}, and g(n) = 1 otherwise. Notice that g(n) = 1 if and only if \{ x : f(n, x) = 1 \} \in \mathcal{U}: the sets \{ x : f(n, x) = 0 \} and \{ x : f(n, x) = 1 \} partition \mathbb{N} \setminus \{ n \}, which is cofinite and hence in \mathcal{U}, so exactly one of them is in \mathcal{U}. In other words, think of f as assigning an infinite sequence of colors to n. Then, using the ultrafilter, we pick out the color that n gets “almost always”, and call that color g(n).

We will define a sequence by induction. Let a_0 = 0. Given a_0, \ldots, a_n, let a_{n+1} be the least a > a_n such that f(a_i, a) = g(a_i) for each i \leq n. We must show that such an a exists. The idea here is that the function g assigns the “correct” color according to the ultrafilter; that is, for each i \leq n, the set of those x such that f(a_i, x) = g(a_i) is large. Since the intersection of finitely many large sets is also large, the set X = \{ x : f(a_i, x) = g(a_i) \text{ for all } i \leq n \} is large. Furthermore, in a non-principal ultrafilter, large sets are always infinite, so there must be an a \in X greater than a_n.

Let Y = \{ a_n : n \in \mathbb{N} \}, B = \{ a \in Y: g(a) = 0 \} and R = \{ a \in Y : g(a) = 1 \}. Clearly Y is infinite, so one of B or R is infinite. Further, if x < y are both in Y, then x = a_i and y = a_j for some i < j, so f(x, y) = g(x) by the choice of a_j. Hence for all x < y \in B, f(x, y) = 0 and for all x < y \in R, f(x, y) = 1, so one of B or R is the infinite set required by the statement of Ramsey’s Theorem.
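A non-principal ultrafilter cannot be written down explicitly, so the proof above cannot be run as a program. Its shape does survive in a finite analog, sketched below under a substitution that is not the construction from the proof: “large” is replaced by “the majority of the candidates still in play”, and the pool of candidates is shrunk at each step. The function name `homogeneous_set` and the example coloring are assumptions for this illustration.

```python
def homogeneous_set(f, n):
    """Greedy finite analog of the ultrafilter argument on {0, ..., n-1}.

    Returns (color, H) with f(x, y) == color for all x < y in H.
    """
    pool = list(range(n))
    chosen = []  # pairs (a_i, g(a_i)), with the a_i in increasing order
    while pool:
        a, rest = pool[0], pool[1:]
        zeros = [x for x in rest if f(a, x) == 0]
        ones = [x for x in rest if f(a, x) == 1]
        # "Majority of the remaining pool" stands in for "large".
        g_a, pool = (0, zeros) if len(zeros) >= len(ones) else (1, ones)
        chosen.append((a, g_a))
    # Every later choice y satisfies f(a, y) == g(a), as in the proof.
    blue = [a for a, g in chosen if g == 0]
    red = [a for a, g in chosen if g == 1]
    return (0, blue) if len(blue) >= len(red) else (1, red)

# Hypothetical example coloring: color a pair by the parity of its sum.
color, H = homogeneous_set(lambda x, y: (x + y) % 2, 10)
# color == 0 and H == [1, 3, 5, 7, 9]
```

Because the pool at least roughly halves at every step, this only finds a homogeneous set of size about \log_2 n inside \{0, \ldots, n-1\} in general; the ultrafilter is what lets the infinite version keep going forever without the “large” sets running out.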

Other applications of ultrafilters

Ultrafilters have applications all throughout mathematics, including in model theory, social choice, and non-standard analysis. I hope to explore non-standard analysis, in particular, in a future post, where I will discuss ideas like formalizing the notion of a limit using infinitesimals (instead of using epsilons and deltas).