Radiolab has a show on randomness (the embed feature doesn’t work) where they discuss an experiment of flipping a coin 100 times. A string of tails appeared 7 times, which seems to be relatively small. Indeed, given 7 coin tosses, the probability of all 7 coming up tails is

$$

\frac{1}{2^7} = \frac{1}{128} \approx 0.007812 \text{.}

$$

As an aside, the probability of all coming up the same side is because the initial toss does not matter. All that matters is that all subsequent tosses match.

The discussion sets up an experiment where one group flips a coin 100 times and a second group says they flip a coin 100 times and each reports the results. A statistics professor can then tell the difference because the group that actually flipped the coin got a string of 7 tails which seems unlikely, but is more likely than you think. The show notes that the probability of getting 7 tails is 1 in 6 because there are 14 groups of 7 in 100. Which in one sense is true. But with overlap, there are 94 groups of 7 any one of which can be all tails. So the probability of getting a string of exactly 7 tails is really

$$

1 – \big(\frac{127}{128}\big)^{94} \approx 0.521576 \text{.}

$$

Or just more than half of all sets of 100 flips will contain a string of 7 tails. And if we can allow that we don’t care about whether the string of 7 is either heads or tails, the probability jumps to more than three-quarters:

$$

1 – \big(\frac{63}{64}\big)^{94} \approx 0.772440 \text{.}

$$

These are all a lot more than the 1 in 6 Radiolab claims.

All due respect, Ton’ (to quote from the The Sopranos), but unless I’m missing something this calculation doesn’t sound correct. As I understand your reasoning, if we were dealing with 10 coin flips rather than 100 then your estimate would be 1 – (127/128)^4 = ~0.030886 for the probability of getting 7 tails in a row.

Per my own calculation there are 12 ways to get exactly 7 tails in a row in flipping 10 coins: TTTTTTTHHH, TTTTTTTHHT, TTTTTTTHTH, TTTTTTTHTT, HTTTTTTTHH, HTTTTTTTHT, HHTTTTTTTH, THTTTTTTTH, HHHTTTTTTT, THHTTTTTTT, HTHTTTTTTT, TTHTTTTTTT. Since there are 2^10 = 1024 possible outcomes, this corresponds to a probability of 10/1024 = ~0.011719, much lower than the value by your method.

If we’re talking about the probability of getting 7 *or more* tails in a row in 10 coin flips then there are 20 ways to do this (see below), for a probability of 20/1024 = ~0.019531, again lower.

R code for calculation in previous paragraph:

x <- rep("", 1024)

for (i in 1:1024) {

x[i] <- paste(as.numeric(intToBits(i)[1:10]), collapse = "")

}

sum(grepl("1111111", x))

One good discussion of how to compute the probability of a run of K or more tails in N coin flips is at http://www.askamathematician.com/2010/07/q-whats-the-chance-of-getting-a-run-of-k-successes-in-n-bernoulli-trials-why-use-approximations-when-the-exact-answer-is-known/

Using the Python code from that page the probability of getting a run of 7 or more tails in 100 coin flips is 0.317520. The probability of getting a run of exactly 7 tails is of course less than that number.

Well, a few things. First, your count is off. Let’s assume this os limited 7 tails, just to simplify the discussion. There are, essentially, 4 ways to get 7 tails in a row. TTTTTTTXXX, XTTTTTTTXX, XXTTTTTTTX, and XXXTTTTTTT, where X can be either T or F. There are $2^3 = 8$ possibilities for X, so that’s $8 * 4 = 32$ total out of the 1024 options.

Your count is off. Let’s assume this is limited 7 tails, just to simplify the discussion. There are, essentially, 4 ways to get 7 tails in a row. TTTTTTTXXX, XTTTTTTTXX, XXTTTTTTTX, and XXXTTTTTTT, where X can be either T or F. There are $2^3 = 8$ possibilities for X, so that’s $8 * 4 = 32$ total out of the 1024 options.

So that is $32/1024 = 0.03125$. That’s pretty close to my estimate of $1 – (127/128)^4 approx 0.03088569268$, but I am reasonably convinced my original method is correct, still. I am open to suggestion about why it might not be.

Aren’t you double-counting a bit? For example, if you take TTTTTTTXXX and choose XXX = TFF you get TTTTTTTTFF. But if you take XTTTTTTTXX and choose XXX = TFF you also get TTTTTTTTFF. More obviously, if you choose XXX = TTT then all four base sequences produce the sequence TTTTTTTTTT. So the eight possibilities for XXX will not produce 32 distinct sequences from the four base sequences.

That’s why I did the thing with the R code: Every integer from 0 to 1023 corresponds to a unique bit string of length 10. So finding all the bit strings in that list with 7 or more 1’s will produce the correct count.

After much pondering…I think you’re right.