Radiolab loses at statistics

This coin will land on the edge (Jeff Golden / Flickr)
This coin will land on the edge (Jeff Golden / Flickr)

Radiolab has a show on randomness (the embed feature doesn’t work) where they discuss an experiment of flipping a coin 100 times. A string of tails appeared 7 times, which seems to be relatively small. Indeed, given 7 coin tosses, the probability of all 7 coming up tails is
$$
\frac{1}{2^7} = \frac{1}{128} \approx 0.007812 \text{.}
$$
As an aside, the probability of all coming up the same side is \frac{1}{2^6} because the initial toss does not matter. All that matters is that all subsequent tosses match.

The discussion sets up an experiment where one group flips a coin 100 times and a second group says they flip a coin 100 times and each reports the results. A statistics professor can then tell the difference because the group that actually flipped the coin got a string of 7 tails which seems unlikely, but is more likely than you think. The show notes that the probability of getting 7 tails is 1 in 6 because there are 14 groups of 7 in 100. Which in one sense is true. But with overlap, there are 94 groups of 7 any one of which can be all tails. So the probability of getting a string of exactly 7 tails is really
$$
1 – \big(\frac{127}{128}\big)^{94} \approx 0.521576 \text{.}
$$
Or just more than half of all sets of 100 flips will contain a string of 7 tails. And if we can allow that we don’t care about whether the string of 7 is either heads or tails, the probability jumps to more than three-quarters:
$$
1 – \big(\frac{63}{64}\big)^{94} \approx 0.772440 \text{.}
$$
These are all a lot more than the 1 in 6 Radiolab claims.

  • All due respect, Ton’ (to quote from the The Sopranos), but unless I’m missing something this calculation doesn’t sound correct. As I understand your reasoning, if we were dealing with 10 coin flips rather than 100 then your estimate would be 1 – (127/128)^4 = ~0.030886 for the probability of getting 7 tails in a row.

    Per my own calculation there are 12 ways to get exactly 7 tails in a row in flipping 10 coins: TTTTTTTHHH, TTTTTTTHHT, TTTTTTTHTH, TTTTTTTHTT, HTTTTTTTHH, HTTTTTTTHT, HHTTTTTTTH, THTTTTTTTH, HHHTTTTTTT, THHTTTTTTT, HTHTTTTTTT, TTHTTTTTTT. Since there are 2^10 = 1024 possible outcomes, this corresponds to a probability of 10/1024 = ~0.011719, much lower than the value by your method.

    If we’re talking about the probability of getting 7 *or more* tails in a row in 10 coin flips then there are 20 ways to do this (see below), for a probability of 20/1024 = ~0.019531, again lower.

    R code for calculation in previous paragraph:

    x <- rep("", 1024)
    for (i in 1:1024) {
    x[i] <- paste(as.numeric(intToBits(i)[1:10]), collapse = "")
    }
    sum(grepl("1111111", x))

    One good discussion of how to compute the probability of a run of K or more tails in N coin flips is at http://www.askamathematician.com/2010/07/q-whats-the-chance-of-getting-a-run-of-k-successes-in-n-bernoulli-trials-why-use-approximations-when-the-exact-answer-is-known/

    Using the Python code from that page the probability of getting a run of 7 or more tails in 100 coin flips is 0.317520. The probability of getting a run of exactly 7 tails is of course less than that number.

    • Your count is off. Let’s assume this is limited 7 tails, just to simplify the discussion. There are, essentially, 4 ways to get 7 tails in a row. TTTTTTTXXX, XTTTTTTTXX, XXTTTTTTTX, and XXXTTTTTTT, where X can be either T or F. There are $2^3 = 8$ possibilities for X, so that’s $8 * 4 = 32$ total out of the 1024 options.

      So that is $32/1024 = 0.03125$. That’s pretty close to my estimate of $1 – (127/128)^4 approx 0.03088569268$, but I am reasonably convinced my original method is correct, still. I am open to suggestion about why it might not be.

      • Aren’t you double-counting a bit? For example, if you take TTTTTTTXXX and choose XXX = TFF you get TTTTTTTTFF. But if you take XTTTTTTTXX and choose XXX = TFF you also get TTTTTTTTFF. More obviously, if you choose XXX = TTT then all four base sequences produce the sequence TTTTTTTTTT. So the eight possibilities for XXX will not produce 32 distinct sequences from the four base sequences.

        That’s why I did the thing with the R code: Every integer from 0 to 1023 corresponds to a unique bit string of length 10. So finding all the bit strings in that list with 7 or more 1’s will produce the correct count.

        • After much pondering…I think you’re right.