## Monochromatic colorings

August 20, 2016

Caïus Wojcik and Luca Zamboni recently posted a paper on the arXiv solving an interesting problem in combinatorics on words.

http://arxiv.org/abs/1608.03519
Monochromatic factorisations of words and periodicity.
Caïus Wojcik, Luca Q. Zamboni.

I had recently learned of the problem through another paper by Zamboni and a collaborator,

MR3425965
Aldo de Luca, Luca Q. Zamboni
On prefixal factorizations of words.
European J. Combin. 52 (2016), part A, 59–73.

It is a nice result and I think it may be enjoyable to work through the argument here. Everything that follows is either straightforward, standard, or comes from these papers.

1. The problem

To make the post reasonably self-contained, I begin by recalling some conventions, not all of which we need here.

By an alphabet we simply mean a set $A$, whose elements we refer to as letters. A word $w$ is a sequence $w:N\to A$ of letters from $A$ where $N$ is a (not necessarily non-empty, not necessarily proper) initial segment of $\mathbb N$. If we denote $w_i=w(i)$ for all $i\in N$, it is customary to write the word simply as

$w_0w_1\dots$

and we will follow the convention. The empty word is typically denoted by $\Lambda$ or $\varepsilon$. By $A^*$ we denote the collection of all finite words from $A$, and $A^+=A^*\setminus \{\varepsilon\}.$ By $|x|$ we denote the length of the word $x$ (that is, the size of the domain of the corresponding function).

We define concatenation of words in the obvious way, and denote by $x_0x_1$ the word resulting from concatenating the words $x_0$ and $x_1$, where $x_0\in A^*$. This operation is associative, and we extend it as well to infinite concatenations.

If a word $w$ can be written as the concatenation of words $x_0,x_1,\dots,$

$w=x_0x_1\dots,$

we refer to the right-hand side as a factorization of $w$. If $w=xy$ and $x$ is non-empty, we say that $x$ is a prefix of $w$. Similarly, if $y$ is non-empty, it is a suffix of $w$. By $x^n$ for $n\in\mathbb N$ we denote the word resulting from concatenating $n$ copies of $x$. Similarly, $x^{\mathbb N}$ is the result of concatenating infinitely many copies.

By a coloring we mean here a function $c:A^+\to C$ where $C$ is a finite set of “colors”.

Apparently the problem I want to discuss was first considered by T.C. Brown around 2006 and, independently, by Zamboni around 2010. It is a question about monochromatic factorizations of infinite words. To motivate it, let me begin with a cute observation.

Fact. Suppose $w=w_0w_1\dots$ is an infinite word, and $c$ is a coloring. There is then a factorization

$w=px_0x_1\dots$

where all the $x_i\in A^+$ have the same color.

Proof. The proof is a straightforward application of Ramsey’s theorem: Assign to $c$ the coloring $d$ of the set $[\mathbb N]^2$ of $2$-sized subsets of $\mathbb N$ given by $d(\{i,j\})=c(w_iw_{i+1}\dots w_{j-1})$ whenever $i<j$. Ramsey’s theorem ensures that there is an infinite set $I=\{n_0<n_1<\dots\}$ such that all $w_{n_i}w_{n_i+1}\dots w_{n_j-1}$ with $i<j$ have the same color. We can then take $p=w_0\dots w_{n_0-1}$ and $x_i=w_{n_i}\dots w_{n_{i+1}-1}$ for all $i$. $\Box$

In the fact above, the word $w$ was arbitrary, and we obtained a monochromatic factorization of a suffix of $w$. However, without additional assumptions, it is not possible to improve this to a monochromatic factorization of $w$ itself. For example, consider the word $w=01^{\mathbb N}$ and the coloring

$c(x)=\left\{\begin{array}{cl}0&\mbox{if }0\mbox{ appears in }x,\\ 1&\mbox{otherwise.}\end{array}\right.$

If nothing else, it follows that if $w$ is an infinite word that admits a monochromatic factorization for any coloring, then the first letter of $w$ must appear infinitely often. The same idea shows that each letter in $w$ must appear infinitely often.
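As a sanity check on the example $w=01^{\mathbb N}$, here is a small Python sketch (mine, not from the papers) that works with the finite truncation $01^6$ instead of the infinite word. The helper names `color` and `factorizations` are just illustrative. It enumerates every factorization of the truncation and confirms that the only monochromatic one is the trivial factorization with a single factor, which is not available for the infinite word.

```python
from itertools import combinations

def color(x):
    """The coloring from the example: 0 if the letter 0 appears in x, 1 otherwise."""
    return 0 if "0" in x else 1

def factorizations(u):
    """Yield every factorization of the finite word u into non-empty factors."""
    n = len(u)
    for r in range(n):  # r = number of interior cut points
        for cuts in combinations(range(1, n), r):
            bounds = (0, *cuts, n)
            yield [u[a:b] for a, b in zip(bounds, bounds[1:])]

# Finite stand-in for the infinite word 0 1 1 1 ... (an assumption of this sketch).
u = "0" + "1" * 6

monochromatic = [f for f in factorizations(u)
                 if len({color(x) for x in f}) == 1]
print(monochromatic)
```

With two or more factors, the first factor contains the letter $0$ and the last one does not, so the two receive different colors; this is exactly the obstruction used in the text.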

Actually, significantly more should be true. For example, consider the word

$w=010110111\dots 01^n0 1^{n+1}\dots,$

and the coloring

$c(x)=\left\{\begin{array}{cl}0&\mbox{if }x\mbox{ is a prefix of }w,\\1&\mbox{otherwise.}\end{array}\right.$

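Under this coloring, the first factor of any factorization of $w$ is a prefix of $w$, so it has color $0$, and in a monochromatic factorization every other factor must then be a prefix of $w$ as well. This example shows that in fact any such $w$ must admit a prefixal factorization, that is, a factorization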

$w=x_0x_1\dots$

where each $x_i$ is a prefix of $w$.

Problem. Characterize those infinite words $w$ with the property P that given any coloring, there is a monochromatic factorization of $w$.

The above shows that any word with property P admits a prefixal factorization. But it is easy to see that this is not enough. For a simple example, consider

$w=010^210^31\dots0^n10^{n+1}1\dots$

Consider the coloring $c$ where $c(x)=0$ if $x$ is not a prefix of $w$, $c(0)=1$, and $c(x)=2$ otherwise. If

$w=x_0x_1\dots$

is a monochromatic factorization of $w$, then $x_0=01\dots$ so $c(x_0)=2$ and each $x_i$ must be a prefix of $w$ of length at least $2$. But it is easy to see that $w$ admits no such factorization: For any $n>2$, consider the first appearance in $w$ of $0^{n+1}$ and note that none of the first $n$ zeros can be the beginning of an $x_i$, so for some $j$ we must have $x_j=01\dots 10^n$ and since $n>2$, in fact $x_j=01\dots 10^n10^n$, but this string only appears once in $w$, so actually $j=0$. Since $n$ was arbitrary, we are done.

Here is a more interesting example: The Thue-Morse word

$t=0110100110010110\dots$

was defined by Axel Thue in 1906 and became known through the work of Marston Morse in the 1920s. It is defined as the limit (in the natural sense) of the sequence $x_0,x_1,\dots$ of finite words given by $x_0=0$ and $x_{n+1}=x_n\bar{x_n}$ where, for $x\in\{0,1\}^*$, $\bar x$ is the result of replacing each letter $i$ in $x$ with $1-i$.

This word admits a prefixal factorization, namely

$t=(011)(01)0(011)0(01)(011)(01)0(01)(011)0(011)(01)0\dots$

To see this, note that the sequence of letters of $t$ can be defined recursively by $t_0=0$, $t_{2n}=t_n$ and $t_{2n+1}=1-t_n$. Indeed, the sequence given by this recursive definition satisfies that $t_n$ is the parity of the number of $1$s in the binary expansion of $n$, from which its agreement with the description above as the limit of the $x_n$ should be clear. The relevance of this observation is that no three consecutive letters in $t$ can be the same (since $t_{2n+1}=1-t_{2n}$ for all $n$), and from this it is clear that $t$ can be factored using only the words $0$, $01$, and $011$: simply cut $t$ immediately before each $0$.
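The three descriptions of $t$ and the factorization above are easy to check by machine on a finite prefix. The following Python sketch is my own illustration, with made-up function names. It builds the prefix of length $2^k$ by the doubling rule $x_{n+1}=x_n\bar{x_n}$, by the recursion $t_{2n}=t_n$, $t_{2n+1}=1-t_n$, and by the parity of the binary digits, and then cuts the word immediately before each $0$.

```python
def by_doubling(k):
    """Return x_k, where x_0 = 0 and x_{n+1} = x_n followed by its complement."""
    x = [0]
    for _ in range(k):
        x = x + [1 - b for b in x]
    return x

def by_recursion(length):
    """First `length` letters via t_0 = 0, t_{2n} = t_n, t_{2n+1} = 1 - t_n."""
    t = [0] * length
    for n in range(1, length):
        t[n] = t[n // 2] if n % 2 == 0 else 1 - t[n // 2]
    return t

def by_parity(length):
    """First `length` letters as the parity of the number of 1s in binary n."""
    return [bin(n).count("1") % 2 for n in range(length)]

def cut_before_zeros(letters):
    """Factor a 0/1 word that starts with 0 by cutting just before each 0."""
    factors = []
    for b in letters:
        if b == 0:
            factors.append("0")   # a new factor starts at every 0
        else:
            factors[-1] += "1"    # a 1 extends the current factor
    return factors

# The three definitions agree on the first 32 letters.
assert by_doubling(5) == by_recursion(32) == by_parity(32)

# Only the prefixes 0, 01 and 011 of t show up as factors.
factors = cut_before_zeros(by_parity(1024))
print(sorted(set(factors)))
print(factors[:15])
```

Of course this only inspects a finite prefix; the argument in the text is what guarantees that the pattern persists along all of $t$.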

But it is not so straightforward as in the previous example to check whether $t$ admits a factorization into prefixes of length larger than $1$.

Instead, I recall a basic property of $t$ and use it to exhibit an explicit coloring for which $t$ admits no monochromatic factorization.