The goal of this note is to present a proof of the following fundamental result. A theory is said to be satisfiable iff there is a model of
Theorem 1 (Compactness) Let
be a first order theory. Suppose that any finite subtheory
is satisfiable. Then
itself is satisfiable.
As indicated on the set of notes on the completeness theorem, compactness is an immediate consequence of completeness. Here I want to explain a purely semantic proof, that does not rely on the notion of proof.
The argument I present uses the notion of ultraproducts. Although their origin is in model theory, ultraproducts have become an essential tool in modern set theory, so it seems a good idea to present them here. We will require the axiom of choice, in the form of Zorn’s lemma.
The notion of ultraproduct is a bit difficult to absorb the first time one encounters it. I recommend working out through some examples in order to understand it well. Here I confine myself to the minimum necessary to make sense of the argument.
1. Ultrafilters
Definition 2 Let
be a nonempty set. A collection
of subsets of
is said to have the finite intersection property (fip) iff the intersection of any (nonzero but) finitely many of the sets in
is nonempty.
Note in particular that if has the fip, then the sets
are all nonempty,
Here are some typical examples of such collections. First, we could fix a nonempty subset and let
Or we could consider an infinite set and let
this is called the Fréchet filter on
For a more interesting example, consider the interval and let
where denotes Lebesgue measure.
All these examples share the following additional properties: First, if and
then
Second, not only it is the case that
whenever
but, in fact,
as well. Families like this are called filters and will be essential in what follows.
Definition 3 Given a nonempty set
a filter
is a collection of subsets of
with the following properties:
- Whenever
then
- Whenever
and
then
Generalizing from the last example above, a typical example of a filter is obtained by considering a complete measure space and letting
be the collection of subsets of
of
-measure 1. The completeness of the measure is needed to ensure that property (4) holds. This is an important example, and a working understanding of filters is obtained by thinking of its elements as being “large” in some sense. A similar example is obtained by considering a, say, complete, separable metric space
and letting
consist of those subsets of
whose complement is meager (first-category).
Lemma 4 Suppose that
is nonempty and has the fip. Write
for the collection of (nonempty) finite intersections of sets in
Let
Then
is a filter.
One usually refers to the filter as the filter generated by
Proof: All properties are trivially verified. (1) follows from being nonempty. (2) follows from the fact that
has the fip. Given
say that
is a witness iff
and
Clearly, if
with witness
then
is also a witness that
for any
such that
This gives us (4). To see (3), simply notice that
so if
with witnesses
then
witnesses that
Note that, conversely, any filter has the fip, and if is a filter, then
as above is just
Definition 5 An ultrafilter
on a set
is a filter on
that is maximal under containment, i.e., if
and
is also a filter on
then
![]()
There is an equivalent definition of ultrafilters that is widely used in practice.
Lemma 6 A filter
on a set
is an ultrafilter iff for any
either
or
![]()
Proof: If a filter is as given in the lemma, it must be maximal: For any
either
is already in
or else
is disjoint from some element of
namely
Hence no larger subclass of
is a filter.
Conversely, given a filter suppose that neither
nor
are in
We argue that
is not maximal. For this, it suffices to check that at least one of
and
has the fip. Suppose otherwise, so (since
is closed under finite intersections) there are sets
such that
and
But then
and
contradicting that
has the fip.
A routine application of Zorn’s lemma now shows that any filter is contained in an ultrafilter.
Lemma 7 Let
be a filter on a set
Then there is an ultrafilter on
extending
![]()
Proof: Consider the collection of all filters on that contain
This is obviously nonempty, since
itself is one of them. Given a
-chain of such filters, its union clearly has the fip, and therefore generates a filter that contains all the members of the chain. This shows that Zorn’s lemma applies, and there is therefore a maximal member of the collection. It is easy to see that this is indeed a maximal filter on
and we are done.
A typical example of ultrafilters are the principal ones, those ultrafilters such that
It is easy to check that an ultrafilter
on
is principal iff there is some
such that
Ultrafilters not of this form are called nonprincipal or {em free}. It is obvious that if is finite, any ultrafilter on
is principal. On the other hand, for any infinite set
there are nonprincipal ultrafilters on
For example, consider any ultrafilter extending the Fréchet filter.
2. Ultraproducts
At the moment, our interest on ultrafilters is purely utilitarian: They will allow us to define an “average” operation on structures in a given language (think of it as an abstract integration of such structures). The properties of this operation will easily imply the compactness theorem.
Suppose that is an ultrafilter on a set
Let
be a given first order language, and suppose that for all
we are given an
-structure
We want to define an
-structure that somehow averages together all the
This will be the ultraproduct
of the structures by
The definition of the ultraproduct will take us some time.
First, we define the product.
Definition 8 The set theoretic product of a collection
of nonempty sets is the set of all “choice functions” on this collection,
The axiom of choice is equivalent to the statement that these products are never empty. Note that if is finite, one can naturally identify
and the Cartesian product
We can think of the “threads” in the product as “variable elements” (thinking of
as time, we have that
varies with time, being the element
at time
).
Suppose now that rather than just having nonempty sets we actually have structures
The basic idea is that the ultrafilter will allow us to identify “typical” properties the threads maintain as they change in time, and we will ensure that these properties hold in the “average” structure we are defining.
I proceed to make this precise:
First, we define the universe of the ultraproduct structure to be the quotient of
by the equivalence relation
This means that the elements of are the equivalence classes of
What we are saying is that two threads are identified if they coincide almost everywhere.
This is easily seen to be an equivalence relation. To prove transitivity, note that the intersection of sets in is in
Now we define the interpretation in the ultraproduct of relation, function, and constant symbols in
Suppose that is a constant symbol. Then for all
we have an interpretation
Let
be the choice function given by
for all
Now define
where denotes the
-equivalence class of the choice function
Suppose that is a
-ary relation symbol. Given any
-tuple of elements of the ultraproduct, say
set
iff
One needs to verify that this is well defined, i.e., that if for
then
iff
This is immediate from the fact that
which follows from the fact that is closed under finite intersections.
Finally, suppose that is a
-ary function symbol. Given a
-tuple
of elements of the ultraproduct, define
as the equivalence class of the function that to assigns
As before, one needs to verify that this is well-defined, but the argument is almost identical to the one for relations.
This completes the definition of the ultraproduct structure.
If all the structures are equal to some fixed structure
we talk of the ultrapower of
by
and write
Of course, this notion has been designed with the goal in mind that the ultrapower structure must satisfy any property that holds of almost all the structures (in the sense that the set of such that
has the property is in
That this is indeed the case is the content of the following result:
Proof: We argue by induction in the complexity of To simplify, assume that
are the only primitive connectives and quantiiers, and all others are abbreviations.
- If
is atomic, the result holds by definition. (This requires in turn an induction on the complexity of terms.)
- Assume the result for
Suppose that
Then the result also holds for
since
is both upwards closed and closed under intersections.
- Assume the result holds for
Suppose that
Then the result also holds for
since for any
either
or else
- Finally, assume the result holds for
and
Let
be elements of the product. Then we have, by definition, that
iff there is some
in the product such that
This is equivalent, by assumption, to the statement that there is some
in the product such that
But this last statement is equivalent to
In one direction, note that
for any
In the other, for each
such that
pick a witness
Define a function
by
if
and let
be an arbitrary element of
otherwise. Then
and we are done.
This completes the proof.
This proves, for example, the existence of nonstandard models of Simply consider the ultrapower
for any free ultrafilter
on
and note that an easy argument shows that
contains infinite elements.
3. Compactness
We are finally ready to prove the compactness theorem. Suppose a theory is given with the property that for any finite
there is a models
of
We want to find a model
for the whole of
Begin by replacing with the theory consisting of the finite (nonempty) conjunctions of sentences in
Obviously,
has a model iff this new theory has a model, and any finite subset of the new theory has models. To simplify notation, I will then simply call this extension
as well.
Now let be the collection of finite subsets of
For
let
The following is then immediate:
Fact 1
has the fip.
![]()
There is then an ultrafilter on
such that all the sets
are
-large for
Let This is the model of
we were trying to find, for if
then
and therefore by
‘s lemma,
Typeset using LaTeX2WP. Here is a printable version of this post.