502 – Equivalents of the axiom of choice

The goal of this note is to show the following result:

Theorem 1 The following statements are equivalent in {{\sf ZF}:}

  1. The axiom of choice: Every set can be well-ordered.
  2. Every collection of nonempty set admits a choice function, i.e., if {x\ne\emptyset} for all {x\in I,} then there is {f:I\rightarrow\bigcup I} such that {f(x)\in x} for all {x\in I.}
  3. Zorn’s lemma: If {(P,\le)} is a partially ordered set with the property that every chain has an upper bound, then {P} has maximal elements.
  4. Any family of pairwise disjoint nonempty sets admits a selector, i.e., a set {S} such that {|S\cap x|=1} for all {x} in the family.
  5. Any set is a well-ordered union of finite sets of bounded size, i.e., for every set {x} there is a natural {m,} an ordinal {\alpha,} and a function {f:\alpha\rightarrow{\mathcal P}(x)} such that {|f(\beta)|\le m} for all {\beta<\alpha,} and {\bigcup_{\beta<\alpha}f(\beta)=x.}
  6. Tychonoff’s theorem: The topological product of compact spaces is compact.
  7. Every vector space (over any field) admits a basis.

Parts of this note have been discussed elsewhere in the blog, sometimes in a different form (see here and here, for example), but I haven’t examined before the equivalence of 5–7 with choice. In the course of the proof of the equivalence of item 7 with the other statements, a few additional equivalent versions of independent interest will be identified.

The equivalence of 1–3 is classic. Zermelo’s original axiomatization of set theory intended to formalize the proof that {2\Rightarrow1.} Statements 2 and 4 are only irrelevantly different.

The equivalence of 5 with 1 is due to Levy. On the other hand, the version of 5 where we simply require that each {f(\beta)} is finite does not suffice to recover choice.

The equivalence of Tychonoff’s theorem and choice is due to Kelley.

That the existence of bases implies choice is due to Blass, who proved that 7 implies the axiom of multiple choices. That this statement implies choice is due to Pincus. Pincus’s argument uses the axiom of foundation, and Levy showed that this is essential. The proof I indicate follows a suggestion of Felgner-Jech and uses a result of H. Rubin.

Many other results have been studied and shown to be equivalent to choice. For example, the textbook mentions the innocuous looking statement that every nonempty set admits an abelian group structure. Another important equivalence is that {|A\times A|=|A|} for any infinite set {A.}

Now we proceed to the proof:

  • {2\Leftrightarrow 4:}

Assume 2. Let {I} be a family of pairwise disjoint nonempty sets, and let {f:I\rightarrow\bigcup I} be a choice function. Let {S=f[I],} and note {S\cap x=\{f(x)\}} for each {x\in I.}

Now assume 4. Given a family {I} of nonempty sets, let {J=\{x\times\{x\}\mid x\in I\},} and let {S} be a selector for {J.} Define {f:I\rightarrow\bigcup I} by {\{(f(x),x)\}=S\cap(x\times\{x\}),} and note that {f} is a choice function on {I.}

  • {1\Leftrightarrow 2:}

Assume 1. Let {I} be a collection of noenempty sets. and let {(I,<)} be well-ordered. For {x\in I,} define {f(x)} as the {<}-least element of {x.}

Now assume 2, and let {x} be a set. Let {f:{\mathcal P}(x)\setminus\{\emptyset\}\rightarrow x} be a choice function, and define

\displaystyle (a_\beta\mid\beta<\alpha)

by recursion as follows: As long as {\{a_\gamma\mid\gamma<\beta\}\ne x,} set

\displaystyle a_\beta=f(x\setminus\{a_\gamma\mid\gamma<\beta\}).

Note that there must be a {\beta} such that {a_\beta} is not defined, or else we can easily find a surjection of {x} onto {{\sf ORD},} contradicting replacement. Letting {\alpha} be the least such {\beta,} we have that {x} can be well-ordered by setting {y<z} for {y,z\in x,} iff {y=a_\beta} and {z=a_\gamma} for some {\beta<\gamma<\alpha.}

  • {1\Leftrightarrow 3:}

Assume 1. Let {(P,\le)} satisfy the assumption of Zorn’s lemma, and fix a well-ordering {\prec} of {P.} We show that every element of {P} that is not already maximal has at least one maximal element above. Note first that {\emptyset} is a chain, and therefore {P\ne\emptyset.} Now let {x\in P.} Define a strictly increasing sequence of elements of {P,}

\displaystyle (a_\beta\mid\beta<\alpha),

by recursion as follows: {a_0=x.} Given {a_\beta,} if it is not maximal, let {a_{\beta+1}} be the {\prec}-least element of {P} strictly {<}-larger than {a_\beta.} (If {a_\beta} is maximal, let {\alpha=\beta+1} and this completes the construction.) Given {(a_\beta\mid\beta<\lambda)} for {\lambda} limit, since {(P,\le)} satisfies the assumption of Zorn’s lemma, there must be elements of {P} strictly {<}-larger than all the {a_\beta,} and let {a_\lambda} be the {\prec}-least one. Note that this process must stop at some point, or we can find a surjection of {P} onto {{\sf ORD},} a contradiction. This means that there is some {\beta} such that {a_\beta} is {<}-maximal. Since {x=a_0\le a_\beta,} we are done.

Now assume 3. Given a set {x,} consider the family {P} of injective sequences into {x,} ordered by {f\le g} iff {g\upharpoonright {\rm dom}(f)=f.} This partial order satisfies the assumption of Zorn’s lemma, so it has maximal elements. But any maximal {f} must be surjective onto {x.}

  • {1\Leftrightarrow 5:}

Clearly, {1\Rightarrow 5,} with {m=1.}

For the converse, note that any {x} is contained in some {y} such that {y\times y\subseteq y:} Simply take {y=\bigcup_{n<\omega} x_n,} where {x_0=x} and {x_{n+1}=x_n\cup(x_n\times x_n).} Let

\displaystyle N_y=\{m\in\omega\mid\exists\alpha,f:\alpha\rightarrow{\mathcal P}(y)\,\bigl(\bigcup_{\beta<\alpha}f(\beta)=y and \displaystyle\forall\beta<\alpha\,(|f(\beta)|\le m)\bigr)\}.

We argue that if {N_y\ne\emptyset,} then {1\in N_y,} by showing that if {m\in N_y} and {m>1,} then {m-1\in N_y} as well. Assuming 5 amounts to asserting that {N_y\ne\emptyset} for any {y,} and our initial observation shows that this then gives us 1.

Suppose, then, that {m>1} and {m\in N_y,} where {y\times y\subseteq y.} Let {\alpha} and { f:\alpha\rightarrow{\mathcal P}(y)} be witnesses that {m\in N_y.} For {\beta,\gamma,\delta<\alpha,} define

\displaystyle u_{\beta,\gamma,\delta}=\bigl(f(\beta)\times f(\gamma)\bigr)\cap f(\delta).

Case 1: {\forall\beta<\alpha\,\bigl(f(\beta)\ne\emptyset\Rightarrow\exists\gamma,\delta<\alpha\,(0<|{\rm dom}(u_{\beta,\gamma,\delta})|<m)\bigr).}

For any {\beta<\alpha} with {f(\beta)\ne\emptyset,} let {(\gamma_\beta,\delta_\beta)} be the lexicographically least pair {(\gamma,\delta)} witnessing Case 1 for {\beta.} Now define {g:\alpha+\alpha\rightarrow{\mathcal P}(y)} as follows:

Given {\beta<\alpha,} if {f(\beta)=\emptyset,} set {g(\beta)=g(\alpha+\beta)=\emptyset.} If {f(\beta)\ne\emptyset,} let {g(\beta)={\rm dom}(u_{\beta,\gamma_\beta,\delta_\beta})} and {g(\alpha+\beta)=f(\beta)\setminus g(\beta).}

Note that for all {\beta<\alpha,} {g(\beta)} and {g(\alpha+\beta)} partition {f(\beta),} so {\bigcup_{\beta<\alpha+\alpha}g(\beta)=y.} Also, if {\beta<\alpha} and {f(\beta)\ne\emptyset,} then {g(\beta)} and {g(\alpha+\beta)} are both nonempty and therefore of size at most {m-1.} It follows that {\alpha+\alpha,g} witness that {m-1\in N_y.}

Case 2: {\exists\beta<\alpha\,\bigl(f(\beta)\ne\emptyset\mbox{ and }\forall\gamma,\delta<\alpha\,(\emptyset\ne{\rm dom}(u_{\beta,\gamma,\delta})\Rightarrow} {|{\rm dom}(u_{\beta,\gamma,\delta})|=m)\bigr).}

Let {\beta_0} be the least such {\beta,} and fix an {s\in f(\beta_0).} For any {\gamma<\alpha} such that {f(\gamma)\ne\emptyset,} note that there is some {\delta} such that {u_{\beta_0,\gamma,\delta}\ne\emptyset.} This is because

\displaystyle \emptyset\ne f(\beta_0)\times f(\gamma)\subseteq y\times y\subseteq y=\bigcup_\delta f(\delta).

We can then define {\delta_\gamma} as the least such {\delta,} and note that {u_{\beta_0,\gamma,\delta_\gamma}} is a relation with domain a subset of {f(\beta_0)} of size {m,} i.e., {{\rm dom}(u_{\beta_0,\gamma,\delta_\gamma})=f(\beta_0);} but then {m\le |u_{\beta_0,\gamma,\delta_\gamma}|\le |f(\delta_\gamma)|\le m,} so for each {\tau\in f(\beta_0)} there is exactly one {\rho\in f(\gamma)} such that {(\tau,\rho)\in u_{\beta_0,\gamma,\delta_\gamma}=f(\delta_\gamma).} This means that {u_{\beta_0,\gamma,\delta_\gamma}:f(\beta_0)\rightarrow f(\gamma)} is a function.

Now define {g:\alpha+\alpha\rightarrow{\mathcal P}(y)} as follows: Given {\gamma<\alpha,} if {f(\gamma)=\emptyset,} set {g(\gamma)=g(\alpha+\gamma)=\emptyset.} Otherwise, set {g(\gamma)=\{u_{\beta_0,\gamma,\delta_\gamma}(s)\}} and {g(\alpha+\gamma)=f(\gamma)\setminus g(\gamma).}

As in Case 1, {\alpha+\alpha,g} witness that {m-1\in N_y.} This completes the proof.

  • {6\Rightarrow 2:}

Assume 6. Let {(X_i\mid i\in I)} be a family of nonempty sets. We need to show that {\prod_i X_i\ne\emptyset.} Let {a\notin\bigcup_i X_i} and set {Y_i=X_i\cup\{a\}} for each {i.} Define a topology on each {Y_i} by letting {Y_i,X_i} and finite sets be the only closed sets. Clearly, each {Y_i} is compact, so {Y=\prod_i Y_i} is compact as well.

For each {i\in I,} let {Z_i=\{f\in Y\mid f(i)\in X_i\},} and note that the {Z_i} are closed and have the finite intersection property. By compactness of {Y,}

\displaystyle \prod_i X_i=\bigcap_i Z_i\ne\emptyset,

and we are done.

  • {(2\mbox{ and }3)\Rightarrow6:}

Note that Zorn’s lemma gives us that any filter over a set {S} can be extended to an ultrafilter over {S.} The following is easy:

Fact 1 Assume {{\sf ZF}+3.} Let {X} be a topological space, and suppose that

\displaystyle \bigcap\{\bar Y\mid Y\in{\mathcal U}\}\ne\emptyset

whenever {{\mathcal U}} is an ultrafilter over {X.} Then {X} is compact.

Proof: Let {{\mathcal A}} be a family of closed subsets of {X} with the finite intersection property. Let

\displaystyle {\mathcal F}=\{Y\subseteq X\mid\mbox{ there is a finite }S\subseteq{\mathcal A}\mbox{ such that }Y\supseteq\bigcap S\}.

Then {{\mathcal F}} is a filter and {{\mathcal A}\subseteq{\mathcal F}.} Let {{\mathcal U}} be an ultrafilter extending {{\mathcal F}.} By assumption,

\displaystyle \bigcap{\mathcal A}\supseteq\bigcap\{\bar Y\mid Y\in{\mathcal U}\}\ne\emptyset,

and it follows that {X} is compact. \Box

Let {(X_i\mid i\in I)} be a family of compact spaces, and set {X=\prod_i X_i.} Assume 2 and 3. By Fact 1, to show that {X} is compact, it suffices to show that whenever {{\mathcal U}} is an ultrafilter over {X,} then {\bigcap\{\bar Y\mid Y\in{\mathcal U}\}\ne\emptyset.}

To see this, fix such {{\mathcal U}} and, for each {i\in I,} set

\displaystyle A_i=\bigcap\{\overline{\pi_i(Y)}\mid Y\in{\mathcal U}\},

where {\pi_i:X\rightarrow X_i} denotes the projection onto the {i}-th coordinate. Note that each {A_i} is nonempty, by the compactness of {X_i.}

Now use 2 to pick, for each {i\in I,} an element {x_i} of {A_i.} Let {x:I\rightarrow\bigcup_i X_i } be the map given by {x(i)=x_i.} It suffices to check that

\displaystyle x\in\bigcap\{\bar Y\mid Y\in{\mathcal U}\}.

To see this, let {A} be an arbitrary open neighborhood of {x.} We may assume that there is a finite set {I_0\subseteq I} such that {A=\prod_i B_i,} where {B_i=X_i} for all {i\notin I_0,} and each {B_i} (for {i\in I_0}) is open. For {i\in I_0,} set

\displaystyle C_i=\prod_{j\in I} D_j,

where {D_j=X_j} if {j\ne i,} and {D_i=B_i.} Then {A=\bigcap_{i\in I_0}C_i.} We claim that each {C_i} is in {{\mathcal U}.} This gives the result, because then {A\in{\mathcal U}} as well, and therefore {A\cap Y\ne\emptyset} whenever {Y\in{\mathcal U}.} Since {A} was arbitrary, this means that {x\in\bar Y} for each {Y\in{\mathcal U},} as we needed to show.

To see the claim, fix {i\in I_0.} We argue that {\{C_i\}\cup{\mathcal U}} has the finite intersection property. By maximality of {{\mathcal U},} it follows that {C_i\in{\mathcal U},} and we are done (it is here that we use that {{\mathcal U}} is an ultrafilter rather than just a collection of sets with the finite intersection property). But it suffices to check that {C_i\cap Y\ne\emptyset} whenever {Y\in{\mathcal U},} and this is obvious, as it amounts to noticing that

\displaystyle D_i\cap \pi_i(Y)\ne\emptyset,

which holds since {D_i} is a neighborhood of {x_i} in {X_i,} and {x_i\in A_i.} This completes the proof.

Remark 1 Suppose that each {X_i} is Hausdorff. Then the maximality of {{\mathcal U}} ensures that

\displaystyle \bigcap\{\bar Y\mid Y\in{\mathcal U}\}

is a singleton, and therefore we do not need to appeal to 2 in the argument above. This shows that the fact that each filter can be extended to an ultrafilter implies Tychonoff’s theorem for compact Hausdorff spaces. Conversely, this version of Tychonoff’s result suffices to prove in {{\sf ZF}} that filters can be extended to ultrafilters.

This version of choice (usually referred to as the Boolean prime ideal theorem) is strictly weaker than 3. It has been recently shown by S. Rossi that Tychonoff’s theorem for compact Hausdorff spaces is also equivalent to the Banach-Alaoglu theorem from functional analysis. It also implies the Hahn-Banach theorem, which (by a result of Foreman and Wehrung) gives that there are non-Lebesgue measurable subsets of {{\mathbb R}.}

  • {3\Rightarrow7:}

Assume 3, and let {V} be a vector space over a field {K.} Recall the following basic notions and results from linear algebra:

Definition 2 If {A\subseteq V,} a {K}-linear combination of elements of {A} is a vector that can be written in the form

\displaystyle \sum_{a\in A_0}k_aa,

where {A_0\subseteq A} is finite, and each {k_a} is a scalar in {K.}

The span of {A} (or, more precisely, the {K}-span of {A}), {{\rm span}(A),} is by definition the collection of all {K}-linear combinations of elements of {A.} Naturally, we say that {A} spans {V} iff {{\rm span}(A)=V.}

{A} is independent iff {a\notin{\rm span}(A\setminus\{a\})} for all {a\in A.}

A basis for {V} is an independent set that spans {V.}

A basic result from linear algebra is Steinitz exchange lemma, that states that if {a\notin{\rm span}(A\cup\{b\}),} then {b\notin{\rm span}(A\cup\{a\})\setminus{\rm span}(A).} In particular, if {A} is independent and {a\notin{\rm span}(A),} then {A\cup\{a\}} is independent.

Let {P} be the collection of independent subsets of {V,} ordered by containment. It is clear that {P} satisfies the assumption of Zorn’s lemma, and therefore there is a maximal independent subset {A} of {V.} But then {A} is a basis, i.e, it spans {V.} Otherwise, by Steinitz lemma, {A} can be extended to a larger independent set, contradicting its maximality.

  • {7\Rightarrow1:}

Now we argue that if every vector space admits a basis, then every set is well-orderable. This takes some work, and we do this by stages. First, we show Blass’ result that 7 implies what is usually called the axiom of multiple choices, a weak version of 2.

Definition 3 The axiom of multiple choices, {{\sf AMC},} is the statement that for every family of nonempty sets there is a function that assigns to each set in the family a finite, nonempty, subset.

Lemma 4 (Blass) {7\Rightarrow{\sf AMC}.}

Proof: Let {\{X_i\mid i\in I\}} be a collection of nonempty sets. We need a function {F:I\rightarrow\bigcup_i{\mathcal P}(X_i)} such that {F(i)\subseteq X_i} is nonempty and finite for each {i\in I.}

We may as well assume that the {X_i} are pairwise disjoint, and let {X=\bigcup_iX_i.} We will treat the elements of {X} as formal variables. Let {{\mathbb F}} be a field, which we also assume disjoint from {X,} and consider the field {{\mathbb F}(X)} consisting of rational functions (i.e., formal quotients of polynomials) with coefficients in {{\mathbb F}} and variables from {X.}

Given {i\in I} and a monomial {p} in the ring of polynomials {{\mathbb F}[X],} define the {i}-degree of {p} to be the sum of the exponents of the members of {X_i} that appear in {p.}

A polynomial {p\in{\mathbb F}[X]} is {i}-homogeneous of degree {n} iif all the distinct monomials whose sum makes up {p} are of the {i}-degree {n.}

Finally, {f\in{\mathbb F}(X)} is {i}-homogeneous of degree {k} iff {f} can be written as a quotient {p/q} of polynomials with {p} {i}-homogeneous of degree {n+k} and {q} {i}-homogeneous of degree {n,} for some {n.} Note that this notion is well-defined.

Let {K} be the collection of {f\in{\mathbb F}(X)} that are {i}-homogeneous of degree 0 for all {i.} This is a subfield of {{\mathbb F}(X).} As such, {{\mathbb F}(X)} is therefore a vector space over {K,} and we can consider the vector subspace {V={\rm span}(X).}

By assumption, {V} has a basis {B.}

Given any {i\in I} and any {x\in X_i,} we can then write

\displaystyle x=\sum_{b\in B(x)}a_b(x)\cdot b

where {\emptyset \ne B(x)\subseteq B} is finite, and {0\ne a_b(x)\in K} for all {b\in B(x).}

Given any {y\in B_i,} we have

\displaystyle y=\sum_{b\in B(x)}a_b(x)\frac yx\cdot b.

Note that {0\ne a_b(x)\frac yx} is also an element of {K.} It follows, since {B} is a basis, that in the representation

\displaystyle y=\sum_{b\in B(y)}a_b(y)\cdot b,

we must have {B(y)=B(x),} and {a_b(y)=a_b(x)\frac yx} for all {b\in B(y).} This means that {B(x)=B_i} is actually independent of {x} and only depends on {i,} and {\frac{a_b(x)}x=a_{b,i}} is also independent of {x,} and only depends on {b\in B_i} and {i.}

Note that if {b\in B_i} then {a_{b,i}} is {i}-homogeneous of degree {-1.} This means that if {a_{b,i}} is written in reduced form as a quotient {p/q,} then each monomial in the polynomial {q} must have variables from {X_i.} Let {A(b,i)} denote this finite set of variables.

Now set

\displaystyle F(i)=\bigcup_{b\in B_i}A(b,i).

This is a finite nonempty subset of {X_i,} and the assignment {i\mapsto F(i)} is precisely the function we were looking for. \Box

We must now show that {{\sf AMC}} implies choice. Here I follow Felgner-Jech and conclude with a result of Rubin.

Definition 5 The Antichain Principle is the statement that every poset {(P,\le)} has a maximal antichain. An antichain here denotes a subset {A} of the poset any two elements of which are incomparable, i.e., if {a\ne b} are in {A,} then {a\not\le b} and {b\not\le a.}

For example, consider {{\mathcal P}(n)} as a partial order under inclusion. Antichains in this case are usually called Sperner systems, and Sperner showed in 1928 that any Sperner system has size at most {\binom n{\lfloor n/2\rfloor}} where, as usual, if {a} is a real, then {\lfloor a\rfloor} denotes the largest integer {n\le a.}

To see this: Note that the collection of subsets of {n} of size {\lfloor n/2\rfloor} forms a maximal antichain of the stated size {m=\binom n{\lfloor n/2\rfloor}.} To prove that no antichain can be longer, I follow an argument of Lubell:

Note first that {m\ge\binom nk} for any {k,} as can be easily, though perhaps not elegantly, shown simply from the definition {\displaystyle \binom nk=\frac{n!}{k!(n-k)!}.}

Let {A} be an antichain, and write {A_k} for the collection of sets in {A} of size {k.} Since {m\ge\binom nk} for each {k,} we have

\displaystyle \sum_k\frac{|A_k|}m\le\sum_k \frac{|A_k|}{\binom nk},

and we are done once we check that the latter sum adds up to at most 1, since {|A|=\sum_k|A_k|.}

The stated claim is usually called the Lubell-Yamamoto-Meshalkin (LYM) inequality. Note that there are {n!} permutations of the set {n.} A permutation is a bijection {f:n\rightarrow n.} Given a set {B} in {A,} there are {|B|!(n-|B|)!} permutations {f} of {n} such that {f\upharpoonright|B|:|B|\rightarrow B.} Note that if {B\ne C} are in {A,} none of the permutations associated to {B} coincides with a permutation associated to {C,} because {B} and {C} are {\subseteq}-incomparable. It follows that

\displaystyle \sum_k|A_k| k!(n-k)!\le n!,

and dividing by {n!} we obtain the desired result.

Remark 2 There is another notion of antichain commonly used in set theory, that does not coincide with the notion we are using here. In the other version, we say that points {p,q} in the poset {(P,\le)} are incompatible iff there is no {r\in P} such that {r\le p} and {r\le q.} An antichain is now a set of pairwise incompatible elements.

For example, let {P} consist of the quotient of the collection of Borel subsets of {{}[0,1]} by the equivalence relation that identifies two Borel sets {A} and {B} iff {A\bigtriangleup B} has measure 0. Turn this into a partial order by setting that {{}[A]\le[B]} iff {A\setminus B} has measure 0. This is well-defined, i.e., it does not depend on representatives. Then any antichain in this poset is countable. This is usually expressed by saying that {(P,\le)} is ccc, which means that it satisfies the countable chain condition. Perhaps this name ought to be changed to cac, but I believe the community is by now stuck with this misnomer.

To see this fact, suppose {A} is an uncountable antichain, and for each {n<\omega} let {A_n} denote the subset of {A} consisting of classes {[B]} with {B} of measure at least {1/(n+1).} It must be that for some {n,} {A_n} is uncountable, since {A} is. But this is impossible, since {A_n,} being an antichain, can in fact have size at most {n+1.}

Lemma 6 {{\sf AMC}\Rightarrow} Antichain Principle.

Proof: Let {\mathop{\mathbb P}=(P,\le)} be a poset. {{\sf AMC}} guarantees that there is a multiple choice function {F:{\mathcal P}(P)\setminus\{\emptyset\}\rightarrow{\mathcal P}(P)} such that {\emptyset\ne F(X)\subseteq X} is finite for each nonempty {X} subset of {P.} Define {G} on the nonempty subsets of {P} by setting {G(X)} to be the collection of {\le}-minimal elements of {F(X).} Then {G(X)} is a nonempty antichain contained in {X.}

Now define a maximal antichain in {\mathop{\mathbb P}} by recursion, by setting {A=\bigcup_{\alpha<\Theta}A_\alpha,} where, as long as {(A_\beta\mid\beta<\alpha)} has been defined, we set

\displaystyle B_\alpha=\{x\in\mathop{\mathbb P}\mid x\mbox{ is incomparable with each element of }\bigcup_{\beta<\alpha}A_\beta\},

and, as long as {B_\alpha\ne\emptyset,} we set

\displaystyle A_\alpha=G(B_\alpha).

Finally, {\Theta} is the least ordinal {\alpha} such that {B_\alpha=\emptyset.}

One easily checks that {\Theta} exists, that each {A_\alpha,} {\alpha<\Theta,} is an antichain, and that {A} is maximal. \Box

Lemma 7 The antichain principle implies that every linearly orderable set is well-orderable.

Proof: Let {(L,<)} be a linearly ordered set. Consider the poset {\mathop{\mathbb P}=(P,\prec),} where {P=\{(X,x)\mid \emptyset\ne X\subseteq L,x\in X\}} and {(X,x)\prec(Y,y)} iff {X=Y} and {x<y.} The antichain principle grants that there is a maximal antichain {A} in {\mathop{\mathbb P}.} But then {A:{\mathcal P}(L)\setminus\{\emptyset\}\rightarrow L} is a choice function on {L,} and so {L} is well-orderable, by the argument given above that {2\Rightarrow 1.} \Box

Remark 3 It is not yet immediate that we are done, since it is not clear that every set can be linearly orderable. An example of a set that cannot be expected to be linearly orderable without some appeal to choice is {2^{\mathbb N}/E_0,} where {2^{\mathbb N}={}^{\mathbb N}2} is the collection of infinite sequences of zeros and ones, and {E_0} is the equivalence relation given by {p\sim q} iff {\exists m\,\forall n>m\,\bigl(p(n)=q(n)\bigr).}

Now note the following easy observation:

Lemma 8 If {\alpha} is an ordinal, then {{\mathcal P}(\alpha)} is linearly orderable.

Proof: We can linearly order {{\mathcal P}(\alpha)} lexicographically: Given {A\ne B} subsets of {\alpha,} let {\beta=\min(A\bigtriangleup B)} be the minimum element in the symmetric difference of {A} and {B.} Now set {A<B} iff {\beta\in A.} It is straightforward to check that this is a linear order. \Box

{{\sf AC}} finally follows, in view of the following theorem which I have included previously in the blog. For the sake of completeness, I reproduce here the argument:

Theorem 9 (Rubin) If {{\mathcal P}(\alpha)} is well-orderable for each ordinal {\alpha,} then {{\sf AC}} holds.

Proof: Assume in { {\sf ZF}} that the power set of any ordinal is well-orderable. We want to conclude that choice holds, i.e., that every set is well-orderable. A natural strategy is to proceed inductively, showing that each { V_\alpha} is well-orderable: Clearly, if the result is true, each { V_\alpha} would be well-orderable. But also, given any set { x,} it belongs to some { V_\alpha} and, since the latter is transitive, in fact { x\subseteq V_\alpha} and therefore { x} is well-orderable as well. The strategy is suggested by the fact that for all { \alpha,} { V_{\alpha+1}={\mathcal P}(V_\alpha),} so a well-ordering of { V_\alpha} gives us a well-ordering of { V_{\alpha+1}} thanks to our initial assumption.

We argue by induction: Clearly { V_0} is well-ordered by the well-ordering { <_0=\emptyset.} Given a well-ordering { <} of { V_\alpha,} there is a unique ordinal { \beta} and a unique order isomorphism

\displaystyle \pi : (V_\alpha , <)\rightarrow (\beta, \in).

By assumption, { {\mathcal P}(\beta)} is well-orderable, and any well-ordering of it induces (via { \pi^{-1}}) a well-ordering of { V_{\alpha+1}.}

We are left with the task of showing that { V_\alpha} is well-orderable for { \alpha} limit. The natural approach is to patch together the well-orderings of the previous { V_\beta} into a well-ordering of { V_\alpha.} This approach meets two obstacles.

The first one is that the well-orderings of different { V_\beta} are not necessarily compatible, so we need to be careful on how we “patch them together.” This is not too much of an obstacle: The natural solution is to simply order the sets as they appear inside { V_\alpha.} More precisely, define { x<y} for { x,y\in V_\alpha,} iff

  • Either { {\rm rk}(x)<{\rm rk}(y),} or else
  • { {\rm rk}(x)={\rm rk}(y)=\beta,} say, and if { <_{\beta+1}} is the well-ordering of { V_{\beta+1},} then { x<_{\beta+1}y.}

It is easy to see that this is indeed a well-ordering of { V_\alpha}: Given a non-empty { A\subseteq V_\alpha,} let { \gamma} be least such that { A} has an element of rank { \gamma.} Then the { <_{\gamma+1}}-first among these elements would be the { <}-least element of { A.}

The second obstacle is more serious. Namely, the assumption is simply that there is a well-ordering of each { {\mathcal P}(\delta),} not that there is any canonical way of choosing one. In order for the argument above to work, we need not just that each { V_\beta} for { \beta<\alpha} is well-orderable, but in fact we need to have selected a sequence { (<_{\beta+1}:\beta<\alpha)} of well-orderings of the { V_{\beta+1},} with respect to which we proceeded to define the well-ordering { <} of { V_\alpha.}

The way we began the proof suggests a solution: When we argued that it suffices to well-order each { V_\gamma,} we considered an arbitrary set { x} and noticed that if { x\subseteq V_\beta,} then a well-ordering of { V_\beta} gives us a well-ordering of { x.} Similarly, given { \alpha} limit, if we can find { \delta} large enough so each { |V_\beta|} for { \beta<\alpha} is below { \delta,} then we can use a well-ordering of { {\mathcal P}(\delta)} to induce the required well-ordering { <_\beta.}

We now proceed to implement this idea: Let { \delta=\sup_{\beta<\alpha}|V_\beta|^+.} (Notice that this makes sense since, inductively, each { V_\beta} with { \beta<\alpha} is well-orderable and therefore isomorphic to a unique cardinal.) Let { <^*} be a well-ordering of { {\mathcal P}(\delta).} We use { <^*} to define a sequence { (<_\beta:\beta<\alpha)} so that { <_\beta} well-orders { V_\beta} for all { \beta<\alpha.} We use recursion on { \beta<\alpha} to define this sequence. Again, { <_0=\emptyset.} At limit stages { \gamma<\alpha} we copy the strategy with which we tried to well-order { V_\alpha} to define { <_\gamma}: For { x,y\in V_\gamma,} set { x<_\gamma y} iff

  • Either { {\rm rk}(x)<{\rm rk}(y),} or else
  • { {\rm rk}(x)={\rm rk}(y)=\beta,} say, and { x<_{\beta+1}y.}

Finally, given { <_\beta,} we describe how to define { <_{\beta+1}}: Let { \xi=\xi_\beta} be the unique ordinal such that there is an order isomorphism

\displaystyle \pi : (V_\beta,<_\beta)\rightarrow (\xi,\in).

Since { |\xi|=|V_\beta|,} then { \xi<\delta,} so { \xi\subset\delta} and the well-ordering { <^*} of { {\mathcal P}(\delta)} also well-orders { {\mathcal P}(\xi).} Via { \pi^{-1},} this induces the well-ordering { <_{\beta+1}} of { V_{\beta+1}} we were looking for. \Box

Typeset using LaTeX2WP. Here is a printable version of this post.


2 Responses to 502 – Equivalents of the axiom of choice

  1. Ewa Graczynska says:

    [Edit, June 18, 2012: Removed at the author’s request. A.C.]

  2. […] there is an injection from one into the other. The standard argument for this uses that choice is equivalent to the well-ordering theorem: One can prove (without choice) that any two well-ordered sets are […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: