## The Binary “Look-and-Say” Sequence

The look-and-say sequence (which I talked about here) is the sequence that you get by starting with the number 1 and constructing the next term in the sequence by “reading” the previous term. So 1 becomes “one one”, or 11. That becomes “two ones”, or 21. That becomes “one two, one one”, or 1211, and so on.

In this post, I am going to investigate the related binary version of the sequence, which starts off 1, 11 much like the regular sequence. But then when reading 11, we read it as “two ones”. Since two in binary is 10, the next term in the sequence is 101. When reading that term, we read it as “one one, one zero, one one”, so the next term is 111011. That term is read as “three ones, one zero, two ones”, and since three is 11 in binary and two is 10 in binary, the next term is 11110101, and so on. In this post we will answer two questions in particular about this sequence:

1) On average, how much longer is the (n+1)^{th} term in the sequence than the n^{th} term in the sequence?

2) On average, what is the ratio of the number of ones to the number of zeroes in the sequence?

### Non-Interacting Subsequences

Much like the regular look-and-say sequence, we are able to study this sequence by constructing a “basis” of non-interacting subsequences that every term in the binary look-and-say sequence is made up of. Fortunately, constructing such a family of subsequences for the binary version of the look-and-say sequence is much simpler than it is for the decimal version of the sequence – here we only need ten different basic subsequences (whereas we needed 92 different subsequences for the regular look-and-say sequence!). These ten subsequences, and the subsequences they evolve into, are summarized in the following table.

# | Subsequence | Evolves Into |
---|---|---|

1 |
1 | (2) |

2 |
11 | (3)(1) |

3 |
10 | (5) |

4 |
110 | (3)(4) |

5 |
1110 | (6) |

6 |
11110 | (7)(4) |

7 |
100 | (9) |

8 |
1100 | (3)(8) |

9 |
11100 | (10) |

10 |
111100 | (7)(8) |

So for example, the first term in the sequence, 1, evolves into the subsequence (2), which is 11. That term then evolves into subsequence (3) followed by subsequence (1), or 101. That term then evolves into the subsequence (5) followed by the subsequence (2), or 111011, and so on. The reason that this representation of the sequence is useful is we can use it to describe the evolution of the binary look-and-say sequence entirely within a matrix T. In particular, we let T be the matrix with 1 in its (i,j) entry if the subsequence (i) appears in the evolution rule for subsequence (j), and 0 in its (i,j) entry otherwise:

Now if **v** is a 10-dimensional vector whose ith entry indicates how many times the subsequence (i) appears in a particular term of the binary look-and-say sequence, it follows that the entries of T**v** tell us how many times each subsequence appears in the *next *term of the binary look-and-say sequence. So it follows from standard theory of linear homogeneous recurrence relations that we can now read off all of the long-term behaviour of the binary look-and-say sequence from the eigenvalues and eigenvectors of T.

### Rate of Growth of the Sequence

The asymptotic rate of growth of the number of digits in the terms of the binary look-and-say sequence is simply the magnitude of the largest eigenvalue of the transition matrix T above. Using Maple it is simple to derive this value. If L_{n} is the number of digits in the n^{th} term of the binary look-and-say sequence, then

This limit is approximately 1.465571, which means that the binary version of this sequence grows much faster than the decimal version of the sequence (recall that the growth rate of the number of digits of the regular look and say sequence is approximately 1.303577). This limit is also the unique real root of the cubic x^{3} – x^{2} – 1, which follows from the fact that the characteristic polynomial of T is

### Ratio of Number of Ones to Zeroes

If we let N_{n} denote the number of ones in the n^{th} term of the binary look-and-say sequence, and if we let Z_{n} denote the number of zeroes in the n^{th} term of the sequence, what is

_{}

In other words, what is the average ratio of ones to zeroes in this sequence? The following table shows the value of N_{n}/Z_{n} for n = 3, 4, …, 25, which might give some intuition to the problem:

n | N_{n}/Z_{n} |
---|---|

3 |
2.000 |

4 |
5.000 |

5 |
3.000 |

6 |
2.000 |

7 |
2.000 |

8 |
2.000 |

9 |
1.786 |

10 |
1.762 |

11 |
1.742 |

12 |
1.717 |

13 |
1.691 |

14 |
1.690 |

15 |
1.680 |

16 |
1.676 |

17 |
1.672 |

18 |
1.671 |

19 |
1.669 |

20 |
1.668 |

21 |
1.667 |

22 |
1.667 |

23 |
1.666 |

24 |
1.666 |

25 |
1.666 |

Based on numerical estimates like those given in the table above, it has been conjectured that the limiting ratio is 5/3 (or some nearby value). We will now show that the limit does indeed exist, but its value is *not* 5/3 — it just happens to be really close to 5/3.

Much like the maximal eigenvalue of T tells us the overall growth rate of the sequence, the corresponding eigenvector tells us the distribution of the different subsequences that are present in the limit. Once we know the distribution of the individual subsequences, it is not difficult to find out the overall ratio of ones to zeroes by weighing the different subsequences appropriately. So our first step is to find the eigenvector corresponding to the maximal eigenvalue. To this end, it will be convenient to let

α is the same as in the previous section, and β is exactly the growth rate limit that we computed. Then the eigenvector corresponding to the maximal eigenvalue of T is:

What this means is that, in the limit, the fifth subsequence, 1110, is β times as frequently-occurring as the sixth subsequence, 11110 (for example). Now we just weigh each subsequence according to how many zeros and ones they contain, and we find the limiting ratio of ones to zeroes is

In particular, this ratio does *not *equal 5/3, but rather its decimal expansion begins 1.6657272222676… (which is less than 1/1000 away from 5/3).

Nice job! It’s really a good idea to develope the ”look and say ” sequence and have differents versions! And the version on binary is much simpler than the old version as you say. Just a question that i’m not sure: are these 10 subsequences really a basis of a vectorial space of dimension 10 ? I have a little difficulty in identifing all these with what we learn in our maths lessons. Thanks a lot and sorry for the errors of orthaugraphe. I’m not a native English speaker. I’m Chinese and I study now in France.

@Sun Fangyan – Nope, when I referred to those subsequences as “basic”, I didn’t mean to imply any relationship with vector bases — I’m using the term very informally. They are “basic” in that, once you understand how they evolve, you understand how *all* binary sequences evolve. However, the operator T is not invertible, so these subsequences are not a basis in the linear algebra sense.

Still a question.For the evolution of the length of the sequence,why don’t you use the same kind of matrix for the matrix classique and the binary one.I mean, why don’t you ,for the coefficents of matrix T, use the division of the lengths*the time that it appears in the next sequence. Why are there 1 everywhere?

so i use the same method of derivation of the conway sequence and i have P := (x^2-2)*(x^8-2*x^7+x^6-x^5+(34/25)*x^4-(9/25)*x^3)as characteristic polynomial,so that way i have 2^1/2 as grow race,around 1.412.

so what’s the problem?

@Sun Fangyan – You can do it either way, either with 1’s everywhere or with the ratio of the lengths. If you use the ratio of the lengths, then the transition matrix becomes the transpose of:

T =

0 2 0 0 0 0 0 0 0 0

1/2 0 1 0 0 0 0 0 0 0

0 0 0 0 2 0 0 0 0 0

0 0 2/3 1 0 0 0 0 0 0

0 0 0 0 0 5/4 0 0 0 0

0 0 0 3/5 0 0 3/5 0 0 0

0 0 0 0 0 0 0 0 5/3 0

0 0 1/2 0 0 0 0 1 0 0

0 0 0 0 0 0 0 0 0 6/5

0 0 0 0 0 0 1/2 2/3 0 0

The maximal eigenvalue of this matrix is 1.465571, just like the matrix presented in the post. I’m not sure why you’re getting a value of sqrt(2) — perhaps if you posted the matrix you’re using we could find out where the discrepancy lies.

Thanks a lot. I have found my error, I made a mistake in one of the coefficient.But why do the two methods give the same answer? And what’s the difference between them?

@Sun Fangyan

Multiplying by the length quotients corresponds to changing the vector basis of the vector space – essentially a renaming of the vectors. This changes which vectors are eigenvectors, but

notthe eigenvalue.Hi, Can you site some properties of this binary look and say sequence. This variant is awesome and I would like to know more about it. Anyway, I’m planning to make a small research on this and want to know if you have any established properties on this binary look and say. I would be very happy if you can send information about it. Thanks and happy new year! Hope for a fast respond.

@Jason T. Isaiah – Unfortunately, I don’t know much more about it than what’s in this blog post and on its OEIS page.

Good luck with your project though, and let me know if you find out anything else interesting about the sequence!

No worries Nathaniel. Thank you, your post is sure a head start for me. :))

@Nathaniel

What are the applications of the look and say sequence in real life?

Cool stuff! I cited your work in my paper https://arxiv.org/abs/2004.06414/.

I’m working on a similar recursion, where each term of the sequence is the binary description of the bit-wise compliment of the previous term.