On anti-foundation and coding the hereditarily finite sets

August 27, 2016

I would like to highlight a cute question in a recent paper,

MR3400774
Giovanna D’Agostino, Alberto Policriti, Eugenio G. Omodeo, and Alexandru I. Tomescu.
Mapping sets and hypersets into numbers.
Fund. Inform. 140 (2015), no. 3-4, 307–328.

Recall that W. Ackermann verified what in modern terms we call the bi-interpretability of \mathsf{ZFfin} and \mathsf{PA}, where the latter is (first-order) Peano arithmetic, and the former is finite set theory, the result of replacing in \mathsf{ZF} the axiom of infinity with its negation (and with foundation formulated as the schema of \in-induction). The reference is

MR1513141
Wilhelm Ackermann.
Die Widerspruchsfreiheit der allgemeinen Mengenlehre.
Math. Ann. 114 (1937), no. 1, 305–315.

I have written about this before. Briefly, one exhibits (definable) translations between the collection \mathsf{HF} of hereditarily finite sets and \mathbb{N}, and verifies that the translation extends to a definable translation of the relations, functions and constants of the language of each structure in a way that \mathsf{PA} verifies that \mathsf{ZFfin} holds in the translation of (\mathsf{HF},\in), and \mathsf{ZFfin} verifies that \mathsf{PA} holds in the translation of {\mathbb N}=(\omega,+,\times,<,0,1). Recall that \mathsf{HF} consists of those sets a whose transitive closure is finite, that is, a is finite, and all its elements are finite, and all the elements of its elements are finite, and so on. Using foundation, one easily verifies that \mathsf{HF}=V_\omega=\bigcup_{n\in\omega}V_n, that is, it is the collection of sets resulting from iterating the power-set operation (any finite number of times) starting from the empty set.

In the direction relevant here, one defines a map h:\mathsf{HF}\to\mathbb{N} by

h(a)=\sum_{b\in a}2^{h(b)}.

One easily verifies, using induction on the set-theoretic rank of the sets involved, that this recursive definition makes sense and is injective (and, indeed, bijective).

Of course this argument uses foundation. In the D’Agostino-Policriti-Omodeo-Tomescu paper they consider instead the theory resulting from replacing foundation with the  anti-foundation axiom, and proceed to describe a suitable replacement for h that injects (codes) \mathsf{HF} into the real numbers. They do quite a bit more in the paper but, for the coding itself, I highly recommend the nice review by Randall Holmes in MathSciNet, linked to above.

The anti-foundation axiom \mathsf{AFA} became known thanks to the work of Peter Aczel, and it is his formulation that I recall below, although it was originally introduced in work of Forti and Honsell from 1983, where they call it X_1. Aczel’s presentation appears in the excellent book

MR0940014 (89j:03039)
Peter Aczel.
Non-well-founded sets. With a foreword by Jon Barwise.
CSLI Lecture Notes, 14. Stanford University, Center for the Study of Language and Information, Stanford, CA, 1988. xx+137 pp.
ISBN: 0-937073-22-9.

The original paper is

MR0739920 (85f:03054)
Marco Forti, Furio Honsell.
Set theory with free construction principles.
Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 10 (1983), no. 3, 493–522.

Given a binary relation R, its field \mathrm{fld}(R) is the union of its domain and codomain. A decoration of R is a function d:\mathrm{fld}(R)\to V satisfying

d(x)=\{d(y)\mid y\mathrel{R}x\}

for all x,y\in\mathrm{fld}(R). When R is \in and the sets in question are well-founded, the only decoration is the identity. Similarly, any well-founded relation R admits a unique decoration. Define \mathsf{AFA} as the statement that any binary R (whether well-founded or not) admits a unique decoration.

In \mathsf{ZF} with foundation replaced with \mathsf{AFA} one can prove the existence of many non-well-founded sets. One of the appealing aspects of \mathsf{AFA} is that the resulting univere is actually quite structured: Other anti-foundation axioms allow the existence of infinitely many Quine atoms, sets x such that x=\{x\}, for instance. Under \mathsf{AFA}, there is exactly one such x, usually called \Omega. The axiom is sometimes described as saying that it provides solutions to many “equations” among sets. For instance, consider the system of equations x=\{y\} and y=\{x\}. Under \mathsf{AFA} the system has x=y=\Omega as its unique solution. Note that assuming \mathsf{AFA}, \Omega is in \mathsf{HF}, as are many other non-well-founded sets.

Here is the open question from the D’Agostino-Policriti-Omodeo-Tomescu paper: Work in set theory with \mathsf{AFA} instead of foundation. Is there a unique, injective, function h:\mathsf{HF}\to \mathbb{R}  satisfying

h(x)=\sum_{y\in x}2^{-h(y)}

for all x,y\in\mathsf{HF}?

Note that there is a unique such h on the well-founded hereditarily finite sets, and it is in fact injective. In general, existence, uniqueness and injectivity of h appear to be open. The claim that there is such a function h is a statement about solutions of certain equations on the reals, and the claim that h is unique requires moreover uniqueness of such solutions. The expectation is that h(x) is transcendental for all non-well-founded hereditarily finite x but, even assuming this, the injectivity of h seems to require additional work.

For example, consider x=\Omega. The function h must satisfy

h(\Omega)=2^{-h(\Omega)}

and, indeed h(\Omega)=0.6411857\dots is the unique solution x of the equation x=2^{-x}

I would be curious to hear of any progress regarding this problem.

Advertisements

Interpretations

August 18, 2013

(This started as an answer on Math.Stackexchange. This version has been lightly edited and expanded. Also posted at fff.)

Throughout this post, theory means first-order theory. In fact, we are concerned with theories that are recursively presented, though the abstract framework applies more generally. Thanks to Fredrik Engström Ellborg for suggesting in Google+ the reference Kaye-Wong, and to Ali Enayat for additional references and many useful conversations on this topic.

1.

Informally, to say that a theory T interprets a theory S means that there is a procedure for associating structures \mathcal N in the language of S to structures \mathcal M in the language of T in such a way that if \mathcal M is a model of T, then \mathcal N is a model of S.

Let us be a bit more precise, and do this syntactically to reduce the requirements of the metatheory. The original notion is due to Tarski, see

Alfred Tarski. Undecidable theories. In collaboration with Andrzej Mostowski and Raphael M. Robinson. Studies in Logic and the Foundations of Mathematics. North-Holland Publishing Company, Amsterdam, 1953. MR0058532 (15,384h).

I follow here the modern reference on interpretations,

Albert Visser, Categories of theories and interpretations, in Logic in Tehran, Lecture Notes in Logic, vol. 26, Association for Symbolic Logic, La Jolla, CA, 2006, pp. 284–341. MR2262326 (2007j:03083).

One can take “the theory T interprets the theory S to mean that there are

  1. A map i that assigns formulas in the language of T to the symbols of the language \mathcal L of S, and
  2. A formula \mathrm{Dom}(x) in the language of T,

with the following properties: We can extend i to all \mathcal L-formulas recursively: i(\phi\land\psi)=i(\phi)\land i(\psi), etc, and i(\forall x\phi)=\forall x(\mathrm{Dom}(x)\to i(\phi)). It then holds that T proves

  1. \exists x\,\mathrm{Dom}(x), and
  2. i(\phi) for each axiom \phi of S (including the axioms of first-order logic).

Here, T,S are taken to be recursive, and so is i.

If the above happens, then we can see i as a strong witness to the fact that the consistency of T implies the consistency of S.

Two theories are mutually interpretable iff each one interprets the other. By the above, this is a strong version of the statement that they are equiconsistent.

Two theories are bi-interpretable iff they are mutually interpretable, and in fact, the interpretations i from T is S and j from S in T can be taken to be “inverses” of each other, in the sense that T proves that \phi and j(i(\phi)) are equivalent for each \phi in the language of T, and similarly for S, \psi and i(j(\psi)). In a sense, two theories that are bi-interpretable are very much “the same”, only differing in their presentation.

Read the rest of this entry »