The goal of this note is to present a proof of the following fundamental result. A theory is said to be *satisfiable* iff there is a model of

Theorem 1 (Compactness)Let be a first order theory. Suppose that any finite subtheory is satisfiable. Then itself is satisfiable.

As indicated on the set of notes on the completeness theorem, compactness is an immediate consequence of completeness. Here I want to explain a purely semantic proof, that does not rely on the notion of proof.

The argument I present uses the notion of *ultraproducts*. Although their origin is in model theory, ultraproducts have become an essential tool in modern set theory, so it seems a good idea to present them here. We will require the axiom of choice, in the form of Zorn’s lemma.

The notion of ultraproduct is a bit difficult to absorb the first time one encounters it. I recommend working out through some examples in order to understand it well. Here I confine myself to the minimum necessary to make sense of the argument.

**1. Ultrafilters **

Definition 2Let be a nonempty set. A collection of subsets of is said to have the finite intersection property (fip) iff the intersection of any (nonzero but) finitely many of the sets in is nonempty.

Note in particular that if has the fip, then the sets are all nonempty,

Here are some typical examples of such collections. First, we could fix a nonempty subset and let

Or we could consider an infinite set and let

this is called the *Fréchet* filter on

For a more interesting example, consider the interval and let

where denotes Lebesgue measure.

All these examples share the following additional properties: First, if and then Second, not only it is the case that whenever but, in fact, as well. Families like this are called *filters* and will be essential in what follows.

Definition 3Given a nonempty set a filter is a collection of subsets of with the following properties:

- Whenever then
- Whenever and then

Generalizing from the last example above, a typical example of a filter is obtained by considering a complete measure space and letting be the collection of subsets of of -measure 1. The completeness of the measure is needed to ensure that property (4) holds. This is an important example, and a working understanding of filters is obtained by thinking of its elements as being “large” in some sense. A similar example is obtained by considering a, say, complete, separable metric space and letting consist of those subsets of whose complement is meager (first-category).

Lemma 4Suppose that is nonempty and has the fip. Write for the collection of (nonempty) finite intersections of sets in Let

Then is a filter.

One usually refers to the filter as *the filter generated* by

*Proof:* All properties are trivially verified. (1) follows from being nonempty. (2) follows from the fact that has the fip. Given say that is a *witness* iff and Clearly, if with witness then is also a witness that for any such that This gives us (4). To see (3), simply notice that so if with witnesses then witnesses that

Note that, conversely, any filter has the fip, and if is a filter, then as above is just

Definition 5An ultrafilter on a set is a filter on that is maximal under containment, i.e., if and is also a filter on then

There is an equivalent definition of ultrafilters that is widely used in practice.

Lemma 6A filter on a set is an ultrafilter iff for any either or

*Proof:* If a filter is as given in the lemma, it must be maximal: For any either is already in or else is disjoint from some element of namely Hence no larger subclass of is a filter.

Conversely, given a filter suppose that neither nor are in We argue that is not maximal. For this, it suffices to check that at least one of and has the fip. Suppose otherwise, so (since is closed under finite intersections) there are sets such that

and

But then and contradicting that has the fip.

A routine application of Zorn’s lemma now shows that any filter is contained in an ultrafilter.

Lemma 7Let be a filter on a set Then there is an ultrafilter on extending

*Proof:* Consider the collection of all filters on that contain This is obviously nonempty, since itself is one of them. Given a -chain of such filters, its union clearly has the fip, and therefore generates a filter that contains all the members of the chain. This shows that Zorn’s lemma applies, and there is therefore a maximal member of the collection. It is easy to see that this is indeed a maximal filter on and we are done.

A typical example of ultrafilters are the *principal* ones, those ultrafilters such that It is easy to check that an ultrafilter on is principal iff there is some such that

Ultrafilters not of this form are called nonprincipal or {em free}. It is obvious that if is finite, any ultrafilter on is principal. On the other hand, for any infinite set there are nonprincipal ultrafilters on For example, consider any ultrafilter extending the Fréchet filter.

**2. Ultraproducts **

At the moment, our interest on ultrafilters is purely utilitarian: They will allow us to define an “average” operation on structures in a given language (think of it as an abstract integration of such structures). The properties of this operation will easily imply the compactness theorem.

Suppose that is an ultrafilter on a set Let be a given first order language, and suppose that for all we are given an -structure We want to define an -structure that somehow averages together all the This will be the *ultraproduct*

of the structures by

The definition of the ultraproduct will take us some time.

First, we define the product.

Definition 8The set theoretic product of a collection of nonempty sets is the set of all “choice functions” on this collection,

The axiom of choice is equivalent to the statement that these products are never empty. Note that if is finite, one can naturally identify and the Cartesian product

We can think of the “threads” in the product as “variable elements” (thinking of as time, we have that varies with time, being the element at time ).

Suppose now that rather than just having nonempty sets we actually have structures The basic idea is that the ultrafilter will allow us to identify “typical” properties the threads maintain as they change in time, and we will ensure that these properties hold in the “average” structure we are defining.

I proceed to make this precise:

First, we define the *universe* of the ultraproduct structure to be the quotient of by the equivalence relation

This means that the elements of are the equivalence classes of What we are saying is that two threads are identified if they coincide almost everywhere.

This is easily seen to be an equivalence relation. To prove transitivity, note that the intersection of sets in is in

Now we define the interpretation in the ultraproduct of relation, function, and constant symbols in

Suppose that is a constant symbol. Then for all we have an interpretation Let be the choice function given by for all Now define

where denotes the -equivalence class of the choice function

Suppose that is a -ary relation symbol. Given any -tuple of elements of the ultraproduct, say set

iff

One needs to verify that this is well defined, i.e., that if for then iff

This is immediate from the fact that

which follows from the fact that is closed under finite intersections.

Finally, suppose that is a -ary function symbol. Given a -tuple of elements of the ultraproduct, define

as the equivalence class of the function that to assigns

As before, one needs to verify that this is well-defined, but the argument is almost identical to the one for relations.

This completes the definition of the ultraproduct structure.

If all the structures are equal to some fixed structure we talk of the *ultrapower* of by and write

Of course, this notion has been designed with the goal in mind that the ultrapower structure must satisfy any property that holds of almost all the structures (in the sense that the set of such that has the property is in That this is indeed the case is the content of the following result:

*Proof:* We argue by induction in the complexity of To simplify, assume that are the only primitive connectives and quantiiers, and all others are abbreviations.

- If is atomic, the result holds by definition. (This requires in turn an induction on the complexity of terms.)
- Assume the result for Suppose that Then the result also holds for since is both upwards closed and closed under intersections.
- Assume the result holds for Suppose that Then the result also holds for since for any either or else
- Finally, assume the result holds for and Let be elements of the product. Then we have, by definition, that
iff there is some in the product such that

This is equivalent, by assumption, to the statement that there is some in the product such that

But this last statement is equivalent to

In one direction, note that

for any In the other, for each such that pick a witness Define a function by if and let be an arbitrary element of otherwise. Then

and we are done.

This completes the proof.

This proves, for example, the existence of nonstandard models of Simply consider the ultrapower for any free ultrafilter on and note that an easy argument shows that contains infinite elements.

**3. Compactness **

We are finally ready to prove the compactness theorem. Suppose a theory is given with the property that for any finite there is a models of We want to find a model for the whole of

Begin by replacing with the theory consisting of the finite (nonempty) conjunctions of sentences in Obviously, has a model iff this new theory has a model, and any finite subset of the new theory has models. To simplify notation, I will then simply call this extension as well.

Now let be the collection of finite subsets of For let

The following is then immediate:

Fact 1has the fip.

There is then an ultrafilter on such that all the sets are -large for

Let This is the model of we were trying to find, for if then and therefore by ‘s lemma,

*Typeset using LaTeX2WP. Here is a printable version of this post.*