Pen Pals
Probability, Calculus ; Difficulty: medium
What percentage of the trip duration will you receive letters?
This week’s Riddler Classic is a statistical parameter estimation exercise. The Classic reproduced:
Graydon is about to depart on a boating expedition that seeks to catch footage of the rare aquatic creature, F. Riddlerius. Every day he is away, he will send a hand-written letter to his new best friend, David Hacker.2 But if Graydon still has not spotted the creature after N days (where N is some very, very large number), he will return home.
Knowing the value of N, Graydon confides to David there is only a 50 percent chance of the expedition ending in success before the N days have passed. But as soon as any footage is collected, he will immediately return home (after sending a letter that day, of course).
On average, for what fraction of the N days should David expect to receive a letter?
*spoiler alert* solution and answer follow below
Solution Set up
We are only told that the probability of Graydon’s mission succeeding in \(N\) days is 0.5. Since we are not given the actual distribution, we’re going to have to make an assumption. A reasonable one is a constant and independent probability \(p\) of capturing footage of F. Riddlerius on any day.
In this case, the probability of first success on day \(k\) is just the probability of \(k-1\) successive failures followed by a success, which is equal to \( (1-p)^{k-1}p \)
(This by the way is the Geometric distribution)
Given the mission success probability of 0.5, this means:
\begin{align*}
P(\textit{Mission Success in N days or less}) &= \sum_{k=1}^{N}p(1-p)^{k-1} \\
&= \frac{p[1-(1-p)^N)]}{1-(1-p)} \qquad \textit{(sum of N terms of a Geometric series)} \\
&= 1 – (1-p)^N \\
& = \frac{1}{2} \\
\Rightarrow \qquad (1-p)^N &= \frac{1}{2} \\
\Rightarrow \qquad \qquad p &= 1-\frac{1}{2^{\frac{1}{N}}}
\end{align*}
Calculating the expected number of letters and required fraction
The probability that Graydon mails \(k\) letters when \( k < N \) is the same as having first success on the \(k^{th}\) day. This is equal to \( (1-p)^{k-1}p \) as shown before.
If Graydon has not had success in \(N-1\) days, he will send a letter on the \(N^{th} \) day regardless of success on that last day. Hence, the probability that he sends N letters is the sum of the probabilities 1.) that he has success on the \(N^{th}\) day and 2.) the probability that he has no success in \(N\) days. The former is just \( (1-p)^{N-1}p \) and the latter is 0.5.
So the expected number of letters E is calculated as:
\begin{align*}
E &= p + 2p(1-p)+3p(1-p)^2 +\quad … \quad + N\left ( \ p(1-p)^{N-1} + \frac{1}{2} \ \right ) \\
&= \sum_{k=1}^{N}kp(1-p)^{k-1} + \frac{N}{2} \\
&= \sum_{j=1}^{N} \left ( \sum_{k=j}^{N}p(1-p)^{k-1} \right) + \frac{N}{2} \qquad \textit{(breaking into sums of geometric series shown below)} \\
&= \sum_{k=1}^{N}p(1-p)^{k-1} + \sum_{k=2}^{N}p(1-p)^{k-1} +\quad … \quad + \sum_{k=N}^{N}p(1-p)^{k-1} + \frac{N}{2} \\
&= \left [1-(1-p)^N\right ] + \left [(1-p) -(1-p)^N\right ] +\quad … \quad + \left [(1-p)^{N-1}-(1-p)^N \right ] + \frac{N}{2} \\
&= \sum_{k=1}^{N}(1-p)^{N-1} – N(1-p)^N + \frac{N}{2} \qquad \textit{(regrouping terms)} \\
&= \left [ \frac{1-(1-p)^N}{p} \ – \ N(1-p)^N \right ] + \frac{N}{2} \\
&= \frac{1}{2p} \ – \ \frac{N}{2} + \frac{N}{2} \\
&= \frac{1}{2p}
\end{align*}
Where in the second last line above, we substitute the value for \( (1-p)^N = \frac{1}{2} \) that we obtained in the previous section.
But what we are really looking for is the expected proportion of the \(N\) days that Graydon mails a letter. Dividing the above result by N we get:
\begin{align*}
\frac{E}{N} &= \frac{1}{2pN} \\
&= \frac{1}{2N\left ( 1 -\frac{1}{2^\frac{1}{N}} \right )} \\
\end{align*}
where we use the value of \(p\) in terms of \(N\) from the previous section.
Calculating the limiting behavior as N becomes large
Consider the denominator (without the 2) from above as \(N\) becomes large:
\begin{align*}
&= \lim_{N\to \infty } N\left ( 1 – \frac{1}{2^{\frac{1}{N}}} \right ) \\
&= \lim_{N\to \infty }\frac{N(2^\frac{1}{N} – 1)}{2^\frac{1}{N}} \\
&= \frac{\lim_{N\to \infty }N(2^\frac{1}{N} – 1)}{\lim_{N\to \infty }2^\frac{1}{N}} \qquad \textit{(can split num and denom since denom limit is not zero)}\\
&= \lim_{N\to \infty }N(2^\frac{1}{N} – 1) \qquad \textit{(denom limit above = 1)}\\
&= \lim_{N\to \infty }\frac{2^\frac{1}{N} – 1}{\frac{1}{N}} \\
& \textit{now substitute} \ \ y = 2^\frac{1}{N} – 1 \\ \\
&= \lim_{y\to 0 }\frac{y}{\frac{log(y+1)}{log2}} \\
&= log2 \lim_{y\to 0 }\frac{1}{\frac{1}{y}log(y+1)} \\
& = log2 \lim_{y\to 0 }\frac{1}{log(y+1)^\frac{1}{y}} \\
&= log2 \frac{1}{log e} \\
&= log 2 \\
\end{align*}
We can now plug this result into our expression for E/N to arrive at the required answer as N grows large:
\begin{align*}
\lim_{N\to \infty } \frac{E}{N} &= {\color{Red} { \frac{1}{2log2} }} \\ \\
& \approx \mathbf{{\color{Red} {72.13\%} }}
\end{align*}
72.13% is the required answer. Graydon better carry a lot of stationery and postage for his extended trip.
The chart below plots the fraction of the N days a letter is sent vs. N, showing fairly quick convergence.
2 thoughts on “Pen Pals”
Really nice, thank you! Also noteworthy that the step about calculating the limit in N can be done with a Taylor expansion of 1 / 2 ^ (1 /N), and the step about computing the sum of the k . (1 – p) ^ (k – 1) can be done by deriving the formula for the sum of the j . x ^ (j – 1) (by differencing the formula for the sum of the x ^ j). This avoids having to find all the smart substitutions and calculation tricks…
Great observations. For the limit, I wanted to use first principles. Thanks for reading and the comment.