502 – Formal systems

1. Formal systems

Before introducing first order logic, I want to present a “toy example” of the issues we will face, and the results we are after.

In a formal system we present a set of rules about manipulation of finite strings of symbols, and attempt to study which strings can be obtained through these manipulations.

Informally, this corresponds to the syntactic part of logic, and the beginning of proof theory.

I will follow an example from Richard Kaye’s book The Mathematics of Logic.

Denote the empty string by {{\perp}.} The reason for choosing this symbol will be explained later.

The example I want to study is a tree system. The name will be explained later.

We begin with a given subset {\Sigma\subseteq 2^{<{\mathbb N}}.} Consider the following “game” where one can apply the following rules:

  1. (Axioms) One can write down any string {\sigma\in\Sigma.}
  2. (Extension) If {\sigma} can be written down, one can also write down (one or both of) {\sigma0} and {\sigma1.}
  3. (Compression) If {\sigma0} and {\sigma1} can be written down, one can also write down {\sigma.}

Given such {\Sigma} and a {\tau\in 2^{<{\mathbb N}},} the goal of the game is to see whether {\tau} can be written down using the rules just described. In that case, we write

\displaystyle \Sigma \vdash \tau,

which can be read as “{\tau} can be derived from {\Sigma}” or “{\tau} is provable from {\Sigma,}” or a variant of these expressions. (This is why we call axioms the strings in {\Sigma,} as they are the “basic assumptions” in our proofs.) We can think of {\tau} as a (formal) theorem derivable from the set of axioms {\Sigma.}

A (formal) proof of {\tau} from {\Sigma} is then a sequence {\sigma_0,\dots,\sigma_n} where:

  • {\sigma_n=\tau,} and
  • Each {\sigma_i,} {i\le n,} is in {\Sigma}, or comes from {\sigma_j} for some {j<i} by applying the extension rule (i.e., {\sigma_i=\sigma_j0} or {\sigma_i=\sigma_j1}), or comes from {\sigma_j} and {\sigma_k} for some {j,k<i,} by applying the compression rule (i.e., {\sigma_j=\sigma_i0} and {\sigma_k=\sigma_i1}).

We can think of the proof as a certificate that, indeed, {\Sigma\vdash\tau.} If {\Sigma} does not prove {\tau,} we write {\Sigma\not\vdash\tau.}

Some immediate observations are useful.

  • Note first that if {s=(\sigma_0,\dots,\sigma_n)} is a proof of some string {\tau} from {\Sigma,} then each {\sigma_i,} {i\le n,} is also provable from {\Sigma,} as witnessed by the proof {s\upharpoonright i.}
  • {\Sigma} proves something iff {\Sigma\ne\emptyset.}
  • {\Sigma\vdash{\perp}} iff {\Sigma\vdash\tau} for all {\tau.}

In logic, it is customary to write {{\perp}} to denote falsehood. It is also standard that if one can prove something false, then one can prove anything. This explains why we are denoting by {{\perp}} the empty string. One calls a system {\Sigma} that proves anything inconsistent, and it is a desirable feature that {\Sigma} is consistent. On the other hand, we want {\Sigma} to be somewhat useful, so we can derive at least some statements from it.

Here is a simple example: {\{00,01\}\vdash\tau} iff {\tau} begins with {0.}

One direction of this assertion is clear: The proof {00,01,0} shows that {0} can be derived from the given {\Sigma,} and repeated applications of the extension rule show now that any {\tau} that begins with {0} is indeed derivable.

On the other hand, an argument is needed to check that these are the only provable statements. Typically in these arguments, one proceeds by induction in either the length of proofs, or the length of the strings involved. Here we use induction on the length of proofs, to show that, for any {n,} any proof from {\Sigma=\{00,01\}} of length {n} is a proof of a string that begins with {0.}

Once the right induction hypothesis has been identified, the argument is in many cases straightforward, and that is indeed the case here: If {s} is a proof of length {1,} then {s=(00)} or {s=(01),} and in both cases the result holds. Assume now the result for all proofs of length at most {n,} and suppose {s} is a proof of length {n+1.}

Say {s=(\sigma_0,\dots,\sigma_n).} If {\sigma_n\in\Sigma,} we are done. If {\sigma_n} comes from a previous {\sigma_i} by extension, then {\sigma_i,} having a proof of length {i+1\le n,} starts with {0,} by the induction hypothesis, and then so does {\sigma_n,} since {\sigma_i\sqsubset\sigma_n.} Finally, if {\sigma_n} comes from previous {\sigma_i,\sigma_j} by compression, then {\sigma_i=\sigma_n0} and {\sigma_j=\sigma_n1} begin with {0,} by hypothesis. In particular, considering {\sigma_j,} it follows that {\sigma_n} must begin with {0.} This completes the inductive argument, and therefore the proof.

(It is customary to refer to these statements about proofs as metatheorems, to help us distinguish between formal proofs (the strings such as {s} above) and proofs (such as the argument of the last four paragraphs). One also talks about reasoning in the metatheory and similar expressions. For a while at least these distinctions should be obvious and hopefully do not lead to confusion.)

Now we proceed to establish some metatheorems explaining some of the properties of the formal system.

If {\Sigma\vdash\tau,} then there is a finite {\Sigma_0\subseteq\Sigma} such that {\Sigma_0\vdash\tau.} {\Box}

Of course, {\Sigma_0} as above depends in general on {\tau.}

If {\Sigma\vdash\tau,} then there is a proof of {\tau} from {\Sigma} where first rule (1) is used (perhaps several times), then rule (3) (perhaps none, perhaps several times), and finally rule (2) (perhaps none, perhaps several times).

Proof: Let {s_0=(\sigma_0,\dots,\sigma_n)} be a proof of {\tau} from {\Sigma.} We describe a procedure that transforms such a proof {s} into another such proof {s'.} We will argue that if we start with {s_0} and iteratively apply the procedure, thus producing a sequence of proofs of {\tau,}

\displaystyle s_0,s_1,\dots,

eventually (after finitely many steps) we reach a proof {s_k} to which the procedure does not apply. We then argue that this means that {s_k} is as in the statement of the fact.

First, say that {s} has property {(a)} iff {s} is a proof of {\tau} from {\Sigma} where some application of rule (1) comes after applying either rule (2) or (3).

Suppose {s} has property {(a),} say

\displaystyle s=(\tau_0,\dots,\tau_k,\tau_{k+1}){}^\frown \rho,

where {k+1} is the first place where this occurs, i.e., {\tau_{k+1}\in\Sigma,} but some {\tau_i} with {i<k+1} comes from extension or compression, and {k+1} is least with this property. In this case, we say that {k+1} is a witness to property {(a)} in {s.}

For such an {s,} if {\tau_{k+1}=\tau,} replace {s} with {s'=(\tau).} Otherwise, replace {s} with

\displaystyle s'=(\tau_{k+1},\tau_0,\dots,\tau_k){}^\frown\rho.

Note that, in either case, {s'} is a proof of {\tau} from {\Sigma} and {{\rm lh}(s')\le{\rm lh}(s).} Moreover, either {s'} does not have property {(a),} or else property {(a)} is witnessed in {s'} by some {j} with {k+1<j.}

Second, say that {s} has property {(b)} iff {s} is a proof of {\tau} from {\Sigma} that does not have property {(a),} but there is in {s} some application of rule (3) after some application of rule (2).

Suppose {s} has property {(b),} say

\displaystyle s=(\tau_0,\dots,\tau_k,\tau_{k+1}){}^\frown \rho,

where {k+1} is the first place where this occurs, i.e., {\tau_{k+1}} is the result of applying compression to previous strings {\tau_i,\tau_j,} but some {\tau_l} with {l\le k} is the result of applying extension to some {\tau_m} with {m<l;} moreover, {k+1} is least with this property. In this case, we say that {k+1} witnesses property {(b)} in {s.}

For such an {s,} consider {\tau_i} and {\tau_j,} so {\tau_i=\tau_{k+1}0} and {\tau_j=\tau_{k+1}1.} There are several cases:

Case 1: If both {\tau_i,\tau_j} are in {\Sigma,} let {t<k+1} be largest such that {\tau_t\in\Sigma.} If {\tau_{k+1}=\tau,} replace {s} with

\displaystyle s'=(\tau_0,\dots,\tau_t){}^\frown(\tau).

Otherwise, replace {s} with

\displaystyle s'=(\tau_0,\dots,\tau_t){}^\frown(\tau_{k+1}){}^\frown(\tau_{t+1},\dots,\tau_k){}^\frown\rho.

Case 2: Either {\tau_i} or {\tau_j} comes from extension. Say it is {\tau_i,} so there is {l<i} such that {\tau_i=\tau_l0.} Then {\tau_l=\tau_{k+1}.} (Similarly if it is {\tau_j} that comes from extension.) If {\tau_{k+1}=\tau,} replace {s} with

\displaystyle s'= (\tau_0,\dots,\tau_l).

Otherwise, replace {s} with

\displaystyle s'= (\tau_0,\dots,\tau_k){}^\frown\rho.

Case 3: Neither {\tau_i} nor {\tau_j} comes from extension, but at least one of them comes from compression. Without loss, {i\le j.} Note that in {(\tau_0,\dots,\tau_j)} there are no applications of rule (2), by minimality of {k+1.} If {\tau_{k+1}=\tau,} replace {s} with

\displaystyle s'=(\tau_0,\dots,\tau_j){}^\frown(\tau).

Otherwise, replace {s} with

\displaystyle s'=(\tau_0,\dots,\tau_j){}^\frown(\tau_{k+1}){}^\frown(\tau_{j+1},\dots,\tau_k){}^\frown\rho.

Note that these cases are exhaustive, and that in all three cases, {s'} is a proof of {\tau} from {\Sigma} without property {(a),} and {{\rm lh}(s')\le{\rm lh}(s).} Moreover, either {s'} does not have property {(b),} or else property {(b)} is witnessed in {s'} by some {r} with {k+1<r.}

Finally, we describe the procedure: If {s} has property {(a),} apply the transformation explained in that case. If {s} has property {(b),} apply the transformation described in that case. Note that if {s} has neither property, then it is a proof as wanted.

Now we explain that iterative application of this procedure, starting with {s,} should stop after finitely many steps. This is because each application of the procedure either shortens the length of the proof, or increases the size of the witnesses (which are bounded by the length of the proof). We are done. \Box

Typeset using LaTeX2WP. Here is a printable version of this post.

Advertisement

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: