The goal of this note is to show the following result:
Theorem 1 The following statements are equivalent in
- The axiom of choice: Every set can be well-ordered.
- Every collection of nonempty set admits a choice function, i.e., if
for all
then there is
such that
for all
- Zorn’s lemma: If
is a partially ordered set with the property that every chain has an upper bound, then
has maximal elements.
- Any family of pairwise disjoint nonempty sets admits a selector, i.e., a set
such that
for all
in the family.
- Any set is a well-ordered union of finite sets of bounded size, i.e., for every set
there is a natural
an ordinal
and a function
such that
for all
and
- Tychonoff’s theorem: The topological product of compact spaces is compact.
- Every vector space (over any field) admits a basis.
Parts of this note have been discussed elsewhere in the blog, sometimes in a different form (see here and here, for example), but I haven’t examined before the equivalence of 5–7 with choice. In the course of the proof of the equivalence of item 7 with the other statements, a few additional equivalent versions of independent interest will be identified.
The equivalence of 1–3 is classic. Zermelo’s original axiomatization of set theory intended to formalize the proof that Statements 2 and 4 are only irrelevantly different.
The equivalence of 5 with 1 is due to Levy. On the other hand, the version of 5 where we simply require that each is finite does not suffice to recover choice.
The equivalence of Tychonoff’s theorem and choice is due to Kelley.
That the existence of bases implies choice is due to Blass, who proved that 7 implies the axiom of multiple choices. That this statement implies choice is due to Pincus. Pincus’s argument uses the axiom of foundation, and Levy showed that this is essential. The proof I indicate follows a suggestion of Felgner-Jech and uses a result of H. Rubin.
Many other results have been studied and shown to be equivalent to choice. For example, the textbook mentions the innocuous looking statement that every nonempty set admits an abelian group structure. Another important equivalence is that for any infinite set
Now we proceed to the proof:
Assume 2. Let be a family of pairwise disjoint nonempty sets, and let
be a choice function. Let
and note
for each
Now assume 4. Given a family of nonempty sets, let
and let
be a selector for
Define
by
and note that
is a choice function on
Assume 1. Let be a collection of noenempty sets. and let
be well-ordered. For
define
as the
-least element of
Now assume 2, and let be a set. Let
be a choice function, and define
by recursion as follows: As long as set
Note that there must be a such that
is not defined, or else we can easily find a surjection of
onto
contradicting replacement. Letting
be the least such
we have that
can be well-ordered by setting
for
iff
and
for some
Assume 1. Let satisfy the assumption of Zorn’s lemma, and fix a well-ordering
of
We show that every element of
that is not already maximal has at least one maximal element above. Note first that
is a chain, and therefore
Now let
Define a strictly increasing sequence of elements of
by recursion as follows: Given
if it is not maximal, let
be the
-least element of
strictly
-larger than
(If
is maximal, let
and this completes the construction.) Given
for
limit, since
satisfies the assumption of Zorn’s lemma, there must be elements of
strictly
-larger than all the
and let
be the
-least one. Note that this process must stop at some point, or we can find a surjection of
onto
a contradiction. This means that there is some
such that
is
-maximal. Since
we are done.
Now assume 3. Given a set consider the family
of injective sequences into
ordered by
iff
This partial order satisfies the assumption of Zorn’s lemma, so it has maximal elements. But any maximal
must be surjective onto
Clearly, with
For the converse, note that any is contained in some
such that
Simply take
where
and
Let
and
We argue that if then
by showing that if
and
then
as well. Assuming 5 amounts to asserting that
for any
and our initial observation shows that this then gives us 1.
Suppose, then, that and
where
Let
and
be witnesses that
For
define
Case 1:
For any with
let
be the lexicographically least pair
witnessing Case 1 for
Now define
as follows:
Given if
set
If
let
and
Note that for all
and
partition
so
Also, if
and
then
and
are both nonempty and therefore of size at most
It follows that
witness that
Case 2:
Let be the least such
and fix an
For any
such that
note that there is some
such that
This is because
We can then define as the least such
and note that
is a relation with domain a subset of
of size
i.e.,
but then
so for each
there is exactly one
such that
This means that
is a function.
Now define as follows: Given
if
set
Otherwise, set
and
As in Case 1, witness that
This completes the proof.
Assume 6. Let be a family of nonempty sets. We need to show that
Let
and set
for each
Define a topology on each
by letting
and finite sets be the only closed sets. Clearly, each
is compact, so
is compact as well.
For each let
and note that the
are closed and have the finite intersection property. By compactness of
and we are done.
Note that Zorn’s lemma gives us that any filter over a set can be extended to an ultrafilter over
The following is easy:
Fact 1 Assume
Let
be a topological space, and suppose that
whenever
is an ultrafilter over
Then
is compact.
Proof: Let be a family of closed subsets of
with the finite intersection property. Let
Then is a filter and
Let
be an ultrafilter extending
By assumption,
and it follows that is compact.
Let be a family of compact spaces, and set
Assume 2 and 3. By Fact 1, to show that
is compact, it suffices to show that whenever
is an ultrafilter over
then
To see this, fix such and, for each
set
where denotes the projection onto the
-th coordinate. Note that each
is nonempty, by the compactness of
Now use 2 to pick, for each an element
of
Let
be the map given by
It suffices to check that
To see this, let be an arbitrary open neighborhood of
We may assume that there is a finite set
such that
where
for all
and each
(for
) is open. For
set
where if
and
Then
We claim that each
is in
This gives the result, because then
as well, and therefore
whenever
Since
was arbitrary, this means that
for each
as we needed to show.
To see the claim, fix We argue that
has the finite intersection property. By maximality of
it follows that
and we are done (it is here that we use that
is an ultrafilter rather than just a collection of sets with the finite intersection property). But it suffices to check that
whenever
and this is obvious, as it amounts to noticing that
which holds since is a neighborhood of
in
and
This completes the proof.
Remark 1 Suppose that each
is Hausdorff. Then the maximality of
ensures that
is a singleton, and therefore we do not need to appeal to 2 in the argument above. This shows that the fact that each filter can be extended to an ultrafilter implies Tychonoff’s theorem for compact Hausdorff spaces. Conversely, this version of Tychonoff’s result suffices to prove in
that filters can be extended to ultrafilters.
This version of choice (usually referred to as the Boolean prime ideal theorem) is strictly weaker than 3. It has been recently shown by S. Rossi that Tychonoff’s theorem for compact Hausdorff spaces is also equivalent to the Banach-Alaoglu theorem from functional analysis. It also implies the Hahn-Banach theorem, which (by a result of Foreman and Wehrung) gives that there are non-Lebesgue measurable subsets of
Assume 3, and let be a vector space over a field
Recall the following basic notions and results from linear algebra:
Definition 2 If
a
-linear combination of elements of
is a vector that can be written in the form
where
is finite, and each
is a scalar in
The span of
(or, more precisely, the
-span of
),
is by definition the collection of all
-linear combinations of elements of
Naturally, we say that
spans
iff
is independent iff
for all
A basis for
is an independent set that spans
![]()
A basic result from linear algebra is Steinitz exchange lemma, that states that if then
In particular, if
is independent and
then
is independent.
Let be the collection of independent subsets of
ordered by containment. It is clear that
satisfies the assumption of Zorn’s lemma, and therefore there is a maximal independent subset
of
But then
is a basis, i.e, it spans
Otherwise, by Steinitz lemma,
can be extended to a larger independent set, contradicting its maximality.
Now we argue that if every vector space admits a basis, then every set is well-orderable. This takes some work, and we do this by stages. First, we show Blass’ result that 7 implies what is usually called the axiom of multiple choices, a weak version of 2.
Definition 3 The axiom of multiple choices,
is the statement that for every family of nonempty sets there is a function that assigns to each set in the family a finite, nonempty, subset.
Lemma 4 (Blass)
![]()
Proof: Let be a collection of nonempty sets. We need a function
such that
is nonempty and finite for each
We may as well assume that the are pairwise disjoint, and let
We will treat the elements of
as formal variables. Let
be a field, which we also assume disjoint from
and consider the field
consisting of rational functions (i.e., formal quotients of polynomials) with coefficients in
and variables from
Given and a monomial
in the ring of polynomials
define the
-degree of
to be the sum of the exponents of the members of
that appear in
A polynomial is
-homogeneous of degree
iif all the distinct monomials whose sum makes up
are of the
-degree
Finally, is
-homogeneous of degree
iff
can be written as a quotient
of polynomials with
-homogeneous of degree
and
-homogeneous of degree
for some
Note that this notion is well-defined.
Let be the collection of
that are
-homogeneous of degree 0 for all
This is a subfield of
As such,
is therefore a vector space over
and we can consider the vector subspace
By assumption, has a basis
Given any and any
we can then write
where is finite, and
for all
Given any we have
Note that is also an element of
It follows, since
is a basis, that in the representation
we must have and
for all
This means that
is actually independent of
and only depends on
and
is also independent of
and only depends on
and
Note that if then
is
-homogeneous of degree
This means that if
is written in reduced form as a quotient
then each monomial in the polynomial
must have variables from
Let
denote this finite set of variables.
Now set
This is a finite nonempty subset of and the assignment
is precisely the function we were looking for.
We must now show that implies choice. Here I follow Felgner-Jech and conclude with a result of Rubin.
Definition 5 The Antichain Principle is the statement that every poset
has a maximal antichain. An antichain here denotes a subset
of the poset any two elements of which are incomparable, i.e., if
are in
then
and
![]()
For example, consider as a partial order under inclusion. Antichains in this case are usually called Sperner systems, and Sperner showed in 1928 that any Sperner system has size at most
where, as usual, if
is a real, then
denotes the largest integer
To see this: Note that the collection of subsets of of size
forms a maximal antichain of the stated size
To prove that no antichain can be longer, I follow an argument of Lubell:
Note first that for any
as can be easily, though perhaps not elegantly, shown simply from the definition
Let be an antichain, and write
for the collection of sets in
of size
Since
for each
we have
and we are done once we check that the latter sum adds up to at most 1, since
The stated claim is usually called the Lubell-Yamamoto-Meshalkin (LYM) inequality. Note that there are permutations of the set
A permutation is a bijection
Given a set
in
there are
permutations
of
such that
Note that if
are in
none of the permutations associated to
coincides with a permutation associated to
because
and
are
-incomparable. It follows that
and dividing by we obtain the desired result.
Remark 2 There is another notion of antichain commonly used in set theory, that does not coincide with the notion we are using here. In the other version, we say that points
in the poset
are incompatible iff there is no
such that
and
An antichain is now a set of pairwise incompatible elements.
For example, let
consist of the quotient of the collection of Borel subsets of
by the equivalence relation that identifies two Borel sets
and
iff
has measure 0. Turn this into a partial order by setting that
iff
has measure 0. This is well-defined, i.e., it does not depend on representatives. Then any antichain in this poset is countable. This is usually expressed by saying that
is ccc, which means that it satisfies the countable chain condition. Perhaps this name ought to be changed to cac, but I believe the community is by now stuck with this misnomer.
To see this fact, suppose
is an uncountable antichain, and for each
let
denote the subset of
consisting of classes
with
of measure at least
It must be that for some
![]()
is uncountable, since
is. But this is impossible, since
being an antichain, can in fact have size at most
![]()
Lemma 6
Antichain Principle.
Proof: Let be a poset.
guarantees that there is a multiple choice function
such that
is finite for each nonempty
subset of
Define
on the nonempty subsets of
by setting
to be the collection of
-minimal elements of
Then
is a nonempty antichain contained in
Now define a maximal antichain in by recursion, by setting
where, as long as
has been defined, we set
and, as long as we set
Finally, is the least ordinal
such that
One easily checks that exists, that each
is an antichain, and that
is maximal.
Lemma 7 The antichain principle implies that every linearly orderable set is well-orderable.
Proof: Let be a linearly ordered set. Consider the poset
where
and
iff
and
The antichain principle grants that there is a maximal antichain
in
But then
is a choice function on
and so
is well-orderable, by the argument given above that
Remark 3 It is not yet immediate that we are done, since it is not clear that every set can be linearly orderable. An example of a set that cannot be expected to be linearly orderable without some appeal to choice is
where
is the collection of infinite sequences of zeros and ones, and
is the equivalence relation given by
iff
![]()
Now note the following easy observation:
Lemma 8 If
is an ordinal, then
is linearly orderable.
Proof: We can linearly order lexicographically: Given
subsets of
let
be the minimum element in the symmetric difference of
and
Now set
iff
It is straightforward to check that this is a linear order.
finally follows, in view of the following theorem which I have included previously in the blog. For the sake of completeness, I reproduce here the argument:
Theorem 9 (Rubin) If
is well-orderable for each ordinal
then
holds.
Proof: Assume in that the power set of any ordinal is well-orderable. We want to conclude that choice holds, i.e., that every set is well-orderable. A natural strategy is to proceed inductively, showing that each
is well-orderable: Clearly, if the result is true, each
would be well-orderable. But also, given any set
it belongs to some
and, since the latter is transitive, in fact
and therefore
is well-orderable as well. The strategy is suggested by the fact that for all
so a well-ordering of
gives us a well-ordering of
thanks to our initial assumption.
We argue by induction: Clearly is well-ordered by the well-ordering
Given a well-ordering
of
there is a unique ordinal
and a unique order isomorphism
By assumption, is well-orderable, and any well-ordering of it induces (via
) a well-ordering of
We are left with the task of showing that is well-orderable for
limit. The natural approach is to patch together the well-orderings of the previous
into a well-ordering of
This approach meets two obstacles.
The first one is that the well-orderings of different are not necessarily compatible, so we need to be careful on how we “patch them together.” This is not too much of an obstacle: The natural solution is to simply order the sets as they appear inside
More precisely, define
for
iff
- Either
or else
say, and if
is the well-ordering of
then
It is easy to see that this is indeed a well-ordering of : Given a non-empty
let
be least such that
has an element of rank
Then the
-first among these elements would be the
-least element of
The second obstacle is more serious. Namely, the assumption is simply that there is a well-ordering of each not that there is any canonical way of choosing one. In order for the argument above to work, we need not just that each
for
is well-orderable, but in fact we need to have selected a sequence
of well-orderings of the
with respect to which we proceeded to define the well-ordering
of
The way we began the proof suggests a solution: When we argued that it suffices to well-order each we considered an arbitrary set
and noticed that if
then a well-ordering of
gives us a well-ordering of
Similarly, given
limit, if we can find
large enough so each
for
is below
then we can use a well-ordering of
to induce the required well-ordering
We now proceed to implement this idea: Let (Notice that this makes sense since, inductively, each
with
is well-orderable and therefore isomorphic to a unique cardinal.) Let
be a well-ordering of
We use
to define a sequence
so that
well-orders
for all
We use recursion on
to define this sequence. Again,
At limit stages
we copy the strategy with which we tried to well-order
to define
: For
set
iff
- Either
or else
say, and
Finally, given we describe how to define
: Let
be the unique ordinal such that there is an order isomorphism
Since then
so
and the well-ordering
of
also well-orders
Via
this induces the well-ordering
of
we were looking for.
Typeset using LaTeX2WP. Here is a printable version of this post.
[Edit, June 18, 2012: Removed at the author’s request. A.C.]
[…] there is an injection from one into the other. The standard argument for this uses that choice is equivalent to the well-ordering theorem: One can prove (without choice) that any two well-ordered sets are […]