The goal of this note is to show the following result:
Theorem 1 The following statements are equivalent in
- The axiom of choice: Every set can be well-ordered.
- Every collection of nonempty set admits a choice function, i.e., if for all then there is such that for all
- Zorn’s lemma: If is a partially ordered set with the property that every chain has an upper bound, then has maximal elements.
- Any family of pairwise disjoint nonempty sets admits a selector, i.e., a set such that for all in the family.
- Any set is a well-ordered union of finite sets of bounded size, i.e., for every set there is a natural an ordinal and a function such that for all and
- Tychonoff’s theorem: The topological product of compact spaces is compact.
- Every vector space (over any field) admits a basis.
Parts of this note have been discussed elsewhere in the blog, sometimes in a different form (see here and here, for example), but I haven’t examined before the equivalence of 5–7 with choice. In the course of the proof of the equivalence of item 7 with the other statements, a few additional equivalent versions of independent interest will be identified.
The equivalence of 1–3 is classic. Zermelo’s original axiomatization of set theory intended to formalize the proof that Statements 2 and 4 are only irrelevantly different.
The equivalence of 5 with 1 is due to Levy. On the other hand, the version of 5 where we simply require that each is finite does not suffice to recover choice.
The equivalence of Tychonoff’s theorem and choice is due to Kelley.
That the existence of bases implies choice is due to Blass, who proved that 7 implies the axiom of multiple choices. That this statement implies choice is due to Pincus. Pincus’s argument uses the axiom of foundation, and Levy showed that this is essential. The proof I indicate follows a suggestion of Felgner-Jech and uses a result of H. Rubin.
Many other results have been studied and shown to be equivalent to choice. For example, the textbook mentions the innocuous looking statement that every nonempty set admits an abelian group structure. Another important equivalence is that for any infinite set
Now we proceed to the proof:
Assume 2. Let be a family of pairwise disjoint nonempty sets, and let be a choice function. Let and note for each
Now assume 4. Given a family of nonempty sets, let and let be a selector for Define by and note that is a choice function on
Assume 1. Let be a collection of noenempty sets. and let be well-ordered. For define as the -least element of
Now assume 2, and let be a set. Let be a choice function, and define
by recursion as follows: As long as set
Note that there must be a such that is not defined, or else we can easily find a surjection of onto contradicting replacement. Letting be the least such we have that can be well-ordered by setting for iff and for some
Assume 1. Let satisfy the assumption of Zorn’s lemma, and fix a well-ordering of We show that every element of that is not already maximal has at least one maximal element above. Note first that is a chain, and therefore Now let Define a strictly increasing sequence of elements of
by recursion as follows: Given if it is not maximal, let be the -least element of strictly -larger than (If is maximal, let and this completes the construction.) Given for limit, since satisfies the assumption of Zorn’s lemma, there must be elements of strictly -larger than all the and let be the -least one. Note that this process must stop at some point, or we can find a surjection of onto a contradiction. This means that there is some such that is -maximal. Since we are done.
Now assume 3. Given a set consider the family of injective sequences into ordered by iff This partial order satisfies the assumption of Zorn’s lemma, so it has maximal elements. But any maximal must be surjective onto
For the converse, note that any is contained in some such that Simply take where and Let
We argue that if then by showing that if and then as well. Assuming 5 amounts to asserting that for any and our initial observation shows that this then gives us 1.
Suppose, then, that and where Let and be witnesses that For define
For any with let be the lexicographically least pair witnessing Case 1 for Now define as follows:
Given if set If let and
Note that for all and partition so Also, if and then and are both nonempty and therefore of size at most It follows that witness that
Let be the least such and fix an For any such that note that there is some such that This is because
We can then define as the least such and note that is a relation with domain a subset of of size i.e., but then so for each there is exactly one such that This means that is a function.
Now define as follows: Given if set Otherwise, set and
As in Case 1, witness that This completes the proof.
Assume 6. Let be a family of nonempty sets. We need to show that Let and set for each Define a topology on each by letting and finite sets be the only closed sets. Clearly, each is compact, so is compact as well.
For each let and note that the are closed and have the finite intersection property. By compactness of
and we are done.
Note that Zorn’s lemma gives us that any filter over a set can be extended to an ultrafilter over The following is easy:
Fact 1 Assume Let be a topological space, and suppose that
whenever is an ultrafilter over Then is compact.
Proof: Let be a family of closed subsets of with the finite intersection property. Let
Then is a filter and Let be an ultrafilter extending By assumption,
and it follows that is compact.
Let be a family of compact spaces, and set Assume 2 and 3. By Fact 1, to show that is compact, it suffices to show that whenever is an ultrafilter over then
To see this, fix such and, for each set
where denotes the projection onto the -th coordinate. Note that each is nonempty, by the compactness of
Now use 2 to pick, for each an element of Let be the map given by It suffices to check that
To see this, let be an arbitrary open neighborhood of We may assume that there is a finite set such that where for all and each (for ) is open. For set
where if and Then We claim that each is in This gives the result, because then as well, and therefore whenever Since was arbitrary, this means that for each as we needed to show.
To see the claim, fix We argue that has the finite intersection property. By maximality of it follows that and we are done (it is here that we use that is an ultrafilter rather than just a collection of sets with the finite intersection property). But it suffices to check that whenever and this is obvious, as it amounts to noticing that
which holds since is a neighborhood of in and This completes the proof.
Remark 1 Suppose that each is Hausdorff. Then the maximality of ensures that
is a singleton, and therefore we do not need to appeal to 2 in the argument above. This shows that the fact that each filter can be extended to an ultrafilter implies Tychonoff’s theorem for compact Hausdorff spaces. Conversely, this version of Tychonoff’s result suffices to prove in that filters can be extended to ultrafilters.
This version of choice (usually referred to as the Boolean prime ideal theorem) is strictly weaker than 3. It has been recently shown by S. Rossi that Tychonoff’s theorem for compact Hausdorff spaces is also equivalent to the Banach-Alaoglu theorem from functional analysis. It also implies the Hahn-Banach theorem, which (by a result of Foreman and Wehrung) gives that there are non-Lebesgue measurable subsets of
Assume 3, and let be a vector space over a field Recall the following basic notions and results from linear algebra:
Definition 2 If a -linear combination of elements of is a vector that can be written in the form
where is finite, and each is a scalar in
The span of (or, more precisely, the -span of ), is by definition the collection of all -linear combinations of elements of Naturally, we say that spans iff
is independent iff for all
A basis for is an independent set that spans
A basic result from linear algebra is Steinitz exchange lemma, that states that if then In particular, if is independent and then is independent.
Let be the collection of independent subsets of ordered by containment. It is clear that satisfies the assumption of Zorn’s lemma, and therefore there is a maximal independent subset of But then is a basis, i.e, it spans Otherwise, by Steinitz lemma, can be extended to a larger independent set, contradicting its maximality.
Now we argue that if every vector space admits a basis, then every set is well-orderable. This takes some work, and we do this by stages. First, we show Blass’ result that 7 implies what is usually called the axiom of multiple choices, a weak version of 2.
Definition 3 The axiom of multiple choices, is the statement that for every family of nonempty sets there is a function that assigns to each set in the family a finite, nonempty, subset.
Lemma 4 (Blass)
Proof: Let be a collection of nonempty sets. We need a function such that is nonempty and finite for each
We may as well assume that the are pairwise disjoint, and let We will treat the elements of as formal variables. Let be a field, which we also assume disjoint from and consider the field consisting of rational functions (i.e., formal quotients of polynomials) with coefficients in and variables from
Given and a monomial in the ring of polynomials define the -degree of to be the sum of the exponents of the members of that appear in
A polynomial is -homogeneous of degree iif all the distinct monomials whose sum makes up are of the -degree
Finally, is -homogeneous of degree iff can be written as a quotient of polynomials with -homogeneous of degree and -homogeneous of degree for some Note that this notion is well-defined.
Let be the collection of that are -homogeneous of degree 0 for all This is a subfield of As such, is therefore a vector space over and we can consider the vector subspace
By assumption, has a basis
Given any and any we can then write
where is finite, and for all
Given any we have
Note that is also an element of It follows, since is a basis, that in the representation
we must have and for all This means that is actually independent of and only depends on and is also independent of and only depends on and
Note that if then is -homogeneous of degree This means that if is written in reduced form as a quotient then each monomial in the polynomial must have variables from Let denote this finite set of variables.
This is a finite nonempty subset of and the assignment is precisely the function we were looking for.
We must now show that implies choice. Here I follow Felgner-Jech and conclude with a result of Rubin.
Definition 5 The Antichain Principle is the statement that every poset has a maximal antichain. An antichain here denotes a subset of the poset any two elements of which are incomparable, i.e., if are in then and
For example, consider as a partial order under inclusion. Antichains in this case are usually called Sperner systems, and Sperner showed in 1928 that any Sperner system has size at most where, as usual, if is a real, then denotes the largest integer
To see this: Note that the collection of subsets of of size forms a maximal antichain of the stated size To prove that no antichain can be longer, I follow an argument of Lubell:
Note first that for any as can be easily, though perhaps not elegantly, shown simply from the definition
Let be an antichain, and write for the collection of sets in of size Since for each we have
and we are done once we check that the latter sum adds up to at most 1, since
The stated claim is usually called the Lubell-Yamamoto-Meshalkin (LYM) inequality. Note that there are permutations of the set A permutation is a bijection Given a set in there are permutations of such that Note that if are in none of the permutations associated to coincides with a permutation associated to because and are -incomparable. It follows that
and dividing by we obtain the desired result.
Remark 2 There is another notion of antichain commonly used in set theory, that does not coincide with the notion we are using here. In the other version, we say that points in the poset are incompatible iff there is no such that and An antichain is now a set of pairwise incompatible elements.
For example, let consist of the quotient of the collection of Borel subsets of by the equivalence relation that identifies two Borel sets and iff has measure 0. Turn this into a partial order by setting that iff has measure 0. This is well-defined, i.e., it does not depend on representatives. Then any antichain in this poset is countable. This is usually expressed by saying that is ccc, which means that it satisfies the countable chain condition. Perhaps this name ought to be changed to cac, but I believe the community is by now stuck with this misnomer.
To see this fact, suppose is an uncountable antichain, and for each let denote the subset of consisting of classes with of measure at least It must be that for some is uncountable, since is. But this is impossible, since being an antichain, can in fact have size at most
Lemma 6 Antichain Principle.
Proof: Let be a poset. guarantees that there is a multiple choice function such that is finite for each nonempty subset of Define on the nonempty subsets of by setting to be the collection of -minimal elements of Then is a nonempty antichain contained in
Now define a maximal antichain in by recursion, by setting where, as long as has been defined, we set
and, as long as we set
Finally, is the least ordinal such that
One easily checks that exists, that each is an antichain, and that is maximal.
Lemma 7 The antichain principle implies that every linearly orderable set is well-orderable.
Proof: Let be a linearly ordered set. Consider the poset where and iff and The antichain principle grants that there is a maximal antichain in But then is a choice function on and so is well-orderable, by the argument given above that
Remark 3 It is not yet immediate that we are done, since it is not clear that every set can be linearly orderable. An example of a set that cannot be expected to be linearly orderable without some appeal to choice is where is the collection of infinite sequences of zeros and ones, and is the equivalence relation given by iff
Now note the following easy observation:
Lemma 8 If is an ordinal, then is linearly orderable.
Proof: We can linearly order lexicographically: Given subsets of let be the minimum element in the symmetric difference of and Now set iff It is straightforward to check that this is a linear order.
finally follows, in view of the following theorem which I have included previously in the blog. For the sake of completeness, I reproduce here the argument:
Theorem 9 (Rubin) If is well-orderable for each ordinal then holds.
Proof: Assume in that the power set of any ordinal is well-orderable. We want to conclude that choice holds, i.e., that every set is well-orderable. A natural strategy is to proceed inductively, showing that each is well-orderable: Clearly, if the result is true, each would be well-orderable. But also, given any set it belongs to some and, since the latter is transitive, in fact and therefore is well-orderable as well. The strategy is suggested by the fact that for all so a well-ordering of gives us a well-ordering of thanks to our initial assumption.
We argue by induction: Clearly is well-ordered by the well-ordering Given a well-ordering of there is a unique ordinal and a unique order isomorphism
By assumption, is well-orderable, and any well-ordering of it induces (via ) a well-ordering of
We are left with the task of showing that is well-orderable for limit. The natural approach is to patch together the well-orderings of the previous into a well-ordering of This approach meets two obstacles.
The first one is that the well-orderings of different are not necessarily compatible, so we need to be careful on how we “patch them together.” This is not too much of an obstacle: The natural solution is to simply order the sets as they appear inside More precisely, define for iff
- Either or else
- say, and if is the well-ordering of then
It is easy to see that this is indeed a well-ordering of : Given a non-empty let be least such that has an element of rank Then the -first among these elements would be the -least element of
The second obstacle is more serious. Namely, the assumption is simply that there is a well-ordering of each not that there is any canonical way of choosing one. In order for the argument above to work, we need not just that each for is well-orderable, but in fact we need to have selected a sequence of well-orderings of the with respect to which we proceeded to define the well-ordering of
The way we began the proof suggests a solution: When we argued that it suffices to well-order each we considered an arbitrary set and noticed that if then a well-ordering of gives us a well-ordering of Similarly, given limit, if we can find large enough so each for is below then we can use a well-ordering of to induce the required well-ordering
We now proceed to implement this idea: Let (Notice that this makes sense since, inductively, each with is well-orderable and therefore isomorphic to a unique cardinal.) Let be a well-ordering of We use to define a sequence so that well-orders for all We use recursion on to define this sequence. Again, At limit stages we copy the strategy with which we tried to well-order to define : For set iff
- Either or else
- say, and
Finally, given we describe how to define : Let be the unique ordinal such that there is an order isomorphism
Since then so and the well-ordering of also well-orders Via this induces the well-ordering of we were looking for.
Typeset using LaTeX2WP. Here is a printable version of this post.
[Edit, June 18, 2012: Removed at the author’s request. A.C.]
[…] there is an injection from one into the other. The standard argument for this uses that choice is equivalent to the well-ordering theorem: One can prove (without choice) that any two well-ordered sets are […]