On anti-foundation and coding the hereditarily finite sets

August 27, 2016

I would like to highlight a cute question in a recent paper,

Giovanna D’Agostino, Alberto Policriti, Eugenio G. Omodeo, and Alexandru I. Tomescu.
Mapping sets and hypersets into numbers.
Fund. Inform. 140 (2015), no. 3-4, 307–328.

Recall that W. Ackermann verified what in modern terms we call the bi-interpretability of \mathsf{ZFfin} and \mathsf{PA}, where the latter is (first-order) Peano arithmetic, and the former is finite set theory, the result of replacing in \mathsf{ZF} the axiom of infinity with its negation (and with foundation formulated as the schema of \in-induction). The reference is

Wilhelm Ackermann.
Die Widerspruchsfreiheit der allgemeinen Mengenlehre.
Math. Ann. 114 (1937), no. 1, 305–315.

I have written about this before. Briefly, one exhibits (definable) translations between the collection \mathsf{HF} of hereditarily finite sets and \mathbb{N}, and verifies that the translation extends to a definable translation of the relations, functions and constants of the language of each structure in a way that \mathsf{PA} verifies that \mathsf{ZFfin} holds in the translation of (\mathsf{HF},\in), and \mathsf{ZFfin} verifies that \mathsf{PA} holds in the translation of {\mathbb N}=(\omega,+,\times,<,0,1). Recall that \mathsf{HF} consists of those sets a whose transitive closure is finite, that is, a is finite, and all its elements are finite, and all the elements of its elements are finite, and so on. Using foundation, one easily verifies that \mathsf{HF}=V_\omega=\bigcup_{n\in\omega}V_n, that is, it is the collection of sets resulting from iterating the power-set operation (any finite number of times) starting from the empty set.

In the direction relevant here, one defines a map h:\mathsf{HF}\to\mathbb{N} by

h(a)=\sum_{b\in a}2^{h(b)}.

One easily verifies, using induction on the set-theoretic rank of the sets involved, that this recursive definition makes sense and is injective (and, indeed, bijective).

Of course this argument uses foundation. In the D’Agostino-Policriti-Omodeo-Tomescu paper they consider instead the theory resulting from replacing foundation with the  anti-foundation axiom, and proceed to describe a suitable replacement for h that injects (codes) \mathsf{HF} into the real numbers. They do quite a bit more in the paper but, for the coding itself, I highly recommend the nice review by Randall Holmes in MathSciNet, linked to above.

The anti-foundation axiom \mathsf{AFA} became known thanks to the work of Peter Aczel, and it is his formulation that I recall below, although it was originally introduced in work of Forti and Honsell from 1983, where they call it X_1. Aczel’s presentation appears in the excellent book

MR0940014 (89j:03039)
Peter Aczel.
Non-well-founded sets. With a foreword by Jon Barwise.
CSLI Lecture Notes, 14. Stanford University, Center for the Study of Language and Information, Stanford, CA, 1988. xx+137 pp.
ISBN: 0-937073-22-9.

The original paper is

MR0739920 (85f:03054)
Marco Forti, Furio Honsell.
Set theory with free construction principles.
Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 10 (1983), no. 3, 493–522.

Given a binary relation R, its field \mathrm{fld}(R) is the union of its domain and codomain. A decoration of R is a function d:\mathrm{fld}(R)\to V satisfying

d(x)=\{d(y)\mid y\mathrel{R}x\}

for all x,y\in\mathrm{fld}(R). When R is \in and the sets in question are well-founded, the only decoration is the identity. Similarly, any well-founded relation R admits a unique decoration. Define \mathsf{AFA} as the statement that any binary R (whether well-founded or not) admits a unique decoration.

In \mathsf{ZF} with foundation replaced with \mathsf{AFA} one can prove the existence of many non-well-founded sets. One of the appealing aspects of \mathsf{AFA} is that the resulting univere is actually quite structured: Other anti-foundation axioms allow the existence of infinitely many Quine atoms, sets x such that x=\{x\}, for instance. Under \mathsf{AFA}, there is exactly one such x, usually called \Omega. The axiom is sometimes described as saying that it provides solutions to many “equations” among sets. For instance, consider the system of equations x=\{y\} and y=\{x\}. Under \mathsf{AFA} the system has x=y=\Omega as its unique solution. Note that assuming \mathsf{AFA}, \Omega is in \mathsf{HF}, as are many other non-well-founded sets.

Here is the open question from the D’Agostino-Policriti-Omodeo-Tomescu paper: Work in set theory with \mathsf{AFA} instead of foundation. Is there a unique, injective, function h:\mathsf{HF}\to \mathbb{R}  satisfying

h(x)=\sum_{y\in x}2^{-h(y)}

for all x,y\in\mathsf{HF}?

Note that there is a unique such h on the well-founded hereditarily finite sets, and it is in fact injective. In general, existence, uniqueness and injectivity of h appear to be open. The claim that there is such a function h is a statement about solutions of certain equations on the reals, and the claim that h is unique requires moreover uniqueness of such solutions. The expectation is that h(x) is transcendental for all non-well-founded hereditarily finite x but, even assuming this, the injectivity of h seems to require additional work.

For example, consider x=\Omega. The function h must satisfy


and, indeed h(\Omega)=0.6411857\dots is the unique solution x of the equation x=2^{-x}

I would be curious to hear of any progress regarding this problem.


August 11, 2016

I have just posted on my papers page a preprint of a review of

Smullyan, Raymond
Reflections—the magic, music and mathematics of Raymond Smullyan.
World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2015. x+213 pp.
ISBN: 978-981-4644-58-7; 978-981-4663-19-9

that I have submitted to Mathematical Reviews.


July 5, 2016


Help us identify all mathematicians in this picture (click on it for a larger version). Please post comments here, on G+, or email me or Paul Larson.

The picture will appear in the book of proceedings of the Woodin conference, http://logic.harvard.edu/woodin_meeting.html. (Thanks to David Schrittesser for allowing us to use it.)

Path decompositions

April 2, 2016

On Thursday March 10, Peter Cholak gave a beautiful talk at the Logic Seminar at the University of Michigan, on Rado’s path decomposition theorem and its effective content. I want to review here some of the results covered by Peter. Slides for another version of the talk can be found in Peter’s page. This is joint work by Peter, Greg Igusa, Ludovic Patey, and Mariya Soskova. 

As usual, given a set X, let {}[X]^2 denote the collection of 2-sized subsets of X. If r is a positive integer, an rcoloring of {}[X]^2 (or simply, a coloring, if r is understood) is a map c:[X]^2\to r (where we use ordinal notation, so r=\{0,1,\dots,r-1\}).  We can think of this as a coloring using r colors of the edges of the complete graph whose underlying set of vertices is X. When r=2, we have an even simpler interpretation: A 2-coloring is just a graph on X.

Given an r-coloring  c of {}[X]^2, a path of color i\in r is a sequence a_0,a_1,\dots of distinct elements of X (which may be finite or infinite, or even empty, or of length 1) such that for all k, if a_{k+1} is defined, then c(a_k,a_{k+1})=i. Note that this is a much weaker requirement than asking that \{a_0,a_1,\dots\} be monochromatic (which would mean that c(a_k,a_j)=i for all k\ne j).  Also, in what follows X is either a finite number or \mathbb N. However, we do not require that the elements in the sequence be listed in their natural order: We may very well have that a_k>a_{k+1} for some k.

The starting point is the following observation:

Fact (Erdős).  If m is finite and c is a 2-coloring of {}[m]^2, then there are paths P_0 of color 0 and P_1 of color 1 such that every (vertex) n\in m appears in exactly one of the P_i.

In general, if c is an r-coloring of {}[X]^2, we say that P_i, i<r, is a path decomposition of c iff each P_i is a path of color i and every vertex x\in X appears in exactly one of the P_i. Using this notion, what the fact states is that for any finite m, any 2-coloring of {}[m]^2 admits a path decomposition.

Proof. Suppose the result holds for m and c is a 2-coloring of {}[m+1]^2. We can then find paths P_0 and P_1 of color 0 and 1 respectively such that each n<m appears in exactly one of the P_i. We want to show that the same holds for the full coloring (which includes edges one of whose vertices is m) at the possible expense of having to modify the partial paths we currently have. If one of the P_i is empty, this is clear. Assume then that P_0=(a_1,\dots,a_k,a_{k+1}) and P_1=(b_1,\dots,b_l,b_{l+1}). The result is also clear if c(a_{k+1},m)=0 or c(b_{l+1},m)=1. Finally, if c(a_{k+1},m)=1 and c(b_{l+1},m)=0, consider c(a_{k+1},b_{l+1}). If this color is 0, we can let the paths be P_0'=P_0{}^\frown(b_{l+1},m) and P_1'=(b_1,\dots,b_l). Similarly, if c(a_{k+1},b_{l+1})=1, we can let the paths be P_0'=(a_1,\dots,a_k) and P_1'=P_1{}^\frown(a_{k+1},m). (This is perhaps most obvious if a picture is drawn.) \Box

Rado’s paper is a generalization of this result and its countable version. The reference is

MR0485504 (58 #5334)
Rado, Richard
Monochromatic paths in graphs
Advances in graph theory (Cambridge Combinatorial Conf., Trinity College, Cambridge, 1977).
Ann. Discrete Math. 3 (1978), 191–194.

The paper opens indicating that Erdős sketched his proof to Rado; there does not seem to be an actual reference for Erdős’s proof. Rado proceeds to prove a more general version. I will only discuss here a particular case.

First, it should be noted that, unlike typical results in Ramsey theory where, once the case of two colors is handled, the argument easily generalizes to any number of colors, the proof above does not lift to more than two. The usual way of doing this lifting is by identifying all but one of the colors. This would result in two paths P_0 and P_1, where along P_0 we only see color 0 and along P_1 we only see the other colors, but not 0. Let c be the given coloring and V be the set of vertices appearing in P_1. If the restriction of c to {}[V]^2 does not use color 0 we could indeed proceed inductively. But there is nothing to prevent 0 from being present as well, so the “easy” lifting argument actually breaks down.

The situation is indeed worse:

Theorem (Pokrovskiy). For any r>2 and any m there is an M>m and an r-coloring of {}[M]^2 that does not admit a path decomposition.

The proof can be found in:

Pokrovskiy, Alexey
Partitioning edge-coloured complete graphs into monochromatic cycles and paths
J. Combin. Theory Ser. B 106 (2014), 70–97.

On the other hand, we have:

Theorem (Rado). For any finite r, any r-coloring of {}[\mathbb N]^2 admits a path decomposition.

As already mentioned, Rado’s result is more general, in particular allowing the use of countably many colors. However, the arguments that follow only apply directly to the stated version.

Before sketching the proof, note that even for r=2, the result does not follow as usual from the finite version: Given a 2-coloring c of \mathbb N, the standard approach would consist of letting P_0,P_1 be the paths resulting from successively applying Erdős’s theorem to the restrictions of c to m=2,3,\dots. But the inductive argument we presented allows the paths to be modified from one value of m to the next, which means that we cannot ensure that the process will successfully identify (via initial segments) paths for the full coloring (the partial paths do not “stabilize”). Together with Pokrovskiy’s negative result just indicated, this leaves us with a curious Ramsey-theoretic statement to which the usual compactness arguments do not apply. (Its finite counterpart, Erdős’s result, is weaker in the sense that it only applies to two colors, and requires a different argument.)

Proof. Consider a nonprincipal ultrafilter \mathcal U on \mathbb N. The ultrafilter provides us with a notion of largeness. Given r\in\mathbb N and a coloring c:[\mathbb N]^2\to r, define for x\in\mathbb N and i\in r the set of neighbors of x in color i as 

N(x,i)=\{y\mid c(x,y)=i\}.

Note that for any x the N(x,i) partition \mathbb N\setminus\{x\} and therefore there is exactly one i such that N(x,i) is large (that is, it is in $\mathcal U$). For i\in r, define

A_i=\{x\mid N(x,i)\in\mathcal U\},

and note that the A_i partition \mathbb N.

We proceed by stages to define the paths P_0,\dots,P_{r-1} as required. We set P^0_i=\emptyset for all i. In general, at the beginning of any given stage n we have defined (finite) partial approximations P^n_i to each path P_i, say P^n_i has length n_i, with P^n_i=(a_{i,1},\dots,a_{i,n_i}), using the convention that n_i=0 indicates that P^n_i=\emptyset. For each i, we will ensure that P^{n+1}_i end extends P^n_i (for all n), and simply set P_i as the resulting path. Inductively, we require that each P^n_i is a path of color i, and that if n_i>0, then a_{i,n_i}\in A_i.

Now, at stage n, we simply consider the least y not yet in any of the P^n_i. There is a unique i with y\in A_i. We set P^{n+1}_j=P^n_j for all j\ne i. If P^n_i=\emptyset, then set P^{n+1}_i=(y). Finally, if n_i>0, the point is that since N(y,i) and N(a_{i,n_i},i) are both large, then so is their intersection (all we really need is that the intersection of sets in \mathcal U is nonempty). Let x be a point in their intersection, and set P^{n+1}_i=P^n_i{}^\frown(x,y). The induction hypothesis is preserved, and this completes stage n of the construction.

It should be immediate that the P_i so constructed indeed provide a path decomposition of c, and this completes the proof. \Box

It is interesting to note that the notation just developed allows us to give a quick proof of Ramsey’s theorem for pairs: Given a coloring c:[\mathbb N]^2\to r, use notation as above, and note that for exactly one i\in r, the set A_i is in \mathcal U. We argue that there is an infinite subset H\subseteq A_i that is homogeneous for c with color i, that is, c''[H]^2=\{i\}. Indeed, we can simply set H=\{h_n\mid n\in\mathbb N\}, where the h_i are defined recursively so that h_0\in A_i and h_{j+1}\in A_i\cap\bigcap_{k\le j}N(h_k,i) for all j.

As Peter indicated in his talk, these pretty arguments are somewhat dissatisfying in that invoking a nonprincipal ultrafilter is too strong a tool for the task at hand. He then proceeded to indicate how we can in fact do better, computationally speaking. For instance, if the coloring c is computable, then we can find a path decomposition below 0''. The key to this improvement comes from two observations.

First, we do not really need an ultrafilter to carry out the argument. It suffices to consider a set C\subseteq\mathbb N that is cohesive with respect to all the N(x,i), meaning that C is infinite and, for any x,i, either C\subseteq^* N(x,i) or C\subseteq^* \mathbb N\setminus N(x,i), where \subseteq^* is the eventual containment relation: A\subseteq^* B iff there is a finite subset s of A such that A\setminus s\subseteq B, in which case we say that A is almost contained in B.

The point is that we can replace all instances where we required that a set N is in \mathcal U by the new largeness condition stating that C is almost contained in N. For instance, note that if N_1,N_2 are large, then so is their intersection. As before, for any x there is a unique i with N(x,i) large, and we can redefine A_i as the set of x such that

\exists k\,\forall y>k\,(y\in C\to c(x,y)=i),\qquad (a)

or, equivalently,

\forall k\,\exists y>k\,(y\in C\land c(x,y)=i).\qquad (b)

With these modifications, it is straightforward to verify that the proof above goes through. This shows that a path decomposition of c is \Delta^0_2(C).

In more detail: Note first that these two conditions are indeed equivalent, and second, clearly the A_i are pairwise disjoint since C is infinite and, moreover, for all x there is a unique i such that x\in A_i:

Suppose that (a) holds, and let k_0 be such that \forall y>k_0\,(y\in C\to(c(x,y)=i). Since C is infinite, we can indeed find elements y of C larger than k_0, and any such y witnesses (b).

Conversely, if (b) holds, then C\subseteq^* N(x,i), because C is cohesive and has infinite intersection with N(x,i). But then (a) holds, as wanted.

To see that any x is in a unique A_i, fix x and use that x is cohesive to conclude that if C\subseteq^*\mathbb N\setminus N(x,i) for all i, then C\subseteq^*\{x\}, which contradicts the infinitude of C. It follows that C\subseteq^* N(x,i) for some i and, since the N(x,j) are pairwise disjoint, this i is unique. This proves that (a) holds and therefore x\in A_i. Uniqueness follows from this same observation: If x\in A_i, then (as shown above) C\subseteq^* N(x,i). But there is only one i for which this is true.

The second observation is that there is an easy recursive construction of a set that is cohesive with respect to all the N(x,i): Consider first N(0,0),\dots,N(0,r-1). One of these sets is infinite (since their union is \mathbb N\setminus\{0\}), say N(0,j_0), and let a_0 be its first element. Consider now

N(0,j_0)\cap N(1,0),\dots,N(0,j_0)\cap N(1,r-1).

Their union is N(0,j_0)\setminus\{1\}, so one of these sets is infinite, say N(0,j_0)\cap N(1,j_1). Let a_1 be its first element above a_0. Etc. The set C=\{a_0,a_1,\dots\} so constructed is as wanted. Note that this construction explicitly obtains an infinite set C that, for each x, is almost contained in one the N(x,i), which is superficially stronger than being cohesive. However, as verified above, any set cohesive for all the N(x,i) must actually have this property. 

Computationally, the advantage of this construction is that it makes explicit that all we need to access a cohesive set is an oracle deciding of any N(x,i) whether it is infinite. For computable c, these are all \Pi^0_2 questions.

Peter further refined this analysis in his talk via the notion of a set being \mathsf{PA} over 0': This is any set X such that for any uniformly computable sequence of pairs of \Pi^0_2 sentences \phi_{i,0},\phi_{i,1} for i\in\mathbb N such that at least one is true, there is an f\le_T X that predicts the true sentence of each pair in the sense that for all i, if f(i)=j, then \phi_{i,j} is true. In symbols, say that X>>0'. The point of the notion is that a result of Jockusch and Stephen gives us that if X>>0' then there is a cohesive set C such that C'\le_1 X. The relevant paper is:

MR1270396 (95d:03078)
Jockusch, Carl; Stephan, Frank
A cohesive set which is not high.
Math. Logic Quart. 39 (1993), no. 4, 515–530.

MR1477624 (99a:03044)
Math. Logic Quart. 43 (1997), no. 4, 569.

 This shows that a path decomposition for a computable coloring c can actually be found below 0''  (and more).

Peter concluded his talk by indicating how for special colorings the complexity can be further improved. For instance, say that a coloring c:[\mathbb N]^2\to r is stable iff \lim_y c(x,y) exists for all x. One can check that for stable c, we can use cofinite as a notion of largeness in the preceding arguments, and that a path decomposition can accordingly be found when c is computable below 0'. On the other hand, this is optimal, in that one can find a stable computable c such that any path decomposition computes 0'.

Woodin meeting

April 1, 2015



(Don’t know who to credit for the group picture, but as pointed out by Paul in the comments, it was with David Schrittesser‘s camera. Toast picture by Paul Larson. Toast by Ted Slaman.)

I’m sad I had to miss the meeting, although it was for obvious reasons.

Cryptic marks

February 5, 2015

New scientist recently ran a series on articles on “How to think about…” One of them, by Richard Webb and published December 13, 2014, was about infinity. It contains this quote:

Woodin’s notepads consist mainly of cryptic marks he uses to focus his attention, to the occasional consternation of fellow plane passengers. “If they don’t try to change seats they ask me if I’m an artist,” he says.

David Roberts wondered on Google+ what these cryptic marks look like. This reminded me of some pictures I took of them at the Conference on inner model theory at UC Berkeley last year.

2014-06-10 17.59.20

2014-06-10 17.37.00

2014-06-10 17.36.08

Neugebauer’s theorem

December 30, 2014

The problem of characterizing when a given function is a derivative has been studied at least since 1911 when Young published a paper recognizing its relevance:

W. H. Young. A note on the property of being a differential coefficient, Proc. Lond. Math. Soc., 9 (2), (1911), 360–368.

The paper opens with the following comment (Young refers to derivatives as “differential coefficients”):

Recent research has provided us with a set of necessary and sufficient conditions that a function may be an indefinite integral, in the generalised sense, of another function, and the way has thus been opened to important developments. The corresponding, much more difficult, problem of determining necessary and sufficient conditions that a function may be a differential coefficient, has barely been mooted; indeed, though we know a number of necessary conditions, no set even of sufficient conditions has to my knowledge ever been formulated, except that involved in the obvious statement that a continuous function is a differential coefficient. 

It is more or less understood nowadays that no completely satisfactory characterization is possible. We know that derivatives are Darboux continuous (that is, they satisfy the intermediate value property), and are Baire one functions (that is, they are the pointwise limit of a sequence of continuous functions). But this is not enough: For instance, any function f such that f(x)=\sin(1/x) for x\ne 0, and f(0)\in[-1,1] is Darboux continuous and Baire one, but only the function with f(0)=0 is a derivative.

Andrew Bruckner has written excellent surveys on derivatives and the characterization problem. See for instance:

Andrew M. Bruckner and John L. Leonard. Derivatives. Amer. Math. Monthly, 73 (4, part II), (1966), 24–56. MR0197632 (33 #5797).

Andrew M. Bruckner. Differentiation of real functions. Second edition. CRM Monograph Series, 5. American Mathematical Society, Providence, RI, 1994. xii+195 pp. ISBN: 0-8218-6990-6. MR1274044 (94m:26001).

Andrew M. Bruckner. The problem of characterizing derivatives revisited. Real Anal. Exchange, 21 (1),  (1995/96), 112–133. MR1377522 (97g:26004).

Here I want to discuss briefly a characterization obtained by Neugebauer, see

Christoph Johannes Neugebauer. Darboux functions of Baire class one and derivatives. Proc. Amer. Math. Soc., 13 (6), (1962), 838–843. MR0143850 (26 #1400).

For concreteness, I will restrict discussion to functions f:[0,1]\to\mathbb R, although this makes no real difference. Whenever an interval is mentioned, it is understood to be nondegenerate. For any closed subinterval I\subseteq [0,1], we write I^\circ for its interior, and \mathrm{lh}(I) for its length. Given a point x\in[0,1], we write I\to x to indicate that \mathrm{lh}(I)\to 0 and x\in I.

Theorem (Neugebauer). A function f:[0,1]\to\mathbb R is a derivative iff to each closed subinterval I of {}[0,1] we can associate a point x_I\in I^\circ in such a way that the following hold: 

  1. For all x\in[0,1], if I\to x, then f(x_I)\to f(x), and 

  2. For all  closed subintervals I,I_1,I_2 of {}[0,1], if I=I_1\cup I_2 and {I_1}^\circ\cap {I_2}^\circ=\emptyset, then \mathrm{lh}(I)f(x_I)=\mathrm{lh}(I_1)f(x_{I_1})+\mathrm{lh}(I_2)f(x_{I_2}).

Read the rest of this entry »