This is a puzzle I’ve known for quite a while, I originally stumbled upon it when doing a research project for my International Baccalaureate extended essay in high school. There are many equivalent formulations but I’ve chosen to use balls and urns as these are a standard for many probability puzzles. Here’s the problem:
An urn initially contains 2 black balls and 3 white balls. The following process is repeated N times.
A ball is selected at random from the urn and it’s color is inspected then the ball is put back in the urn and another ball of the same color is added to the urn.
Let P_N be the probability that at the end of the process the majority of the balls in the urn are white.
find lim N→inf P_N
The solution is coming so if you want to try to solve it yourself don’t scroll down
.
.
.
.
Straight forward solution: We’ll start with a rather straightforward solution. We first notice that we start with exactly 1 extra white ball so to end up with a white majority we need at least N/2 of the draws to be white balls. So lets write an expression for the probability that exactly K of the draws are white and sum from N/2 to N to get an expression for P_N.
So what’s the probability of exactly K white draws? Well first we note that K white draws can occur in (N choose K) orders. Do each of these orders have the same probability? at first it might seem like the answer is no since the probability of each draw is affected by the previous draws. But it turns out that due to some nice cancellation the answer is actually yes. Here’s how we can see that. Imagine writing down a particular sequence of White and Black of length N with K terms of the sequence White. Now for each term of the sequence write down a fraction corresponding to the probability of the next draw equaling that term of the sequence conditional upon all previous draws aligning with the sequence. The probability of this particular sequence of draws occurring will of course be the product of these fractions. Now consider the denominators of these fractions, they will range from 5 to N+4 and thus their product will be (N+4)!/24, now consider the numerators of the fractions corresponding to white draws, they will range from 3 to (K+2) thus their product will be (K+2)!/2 and finally consider the numerators corresponding to black draws, they will range from 2 to N-K+1 and thus their product will be (N-K+1)! We can see that the overall product does not depend on the order of the sequence. Now putting this all together we can calculate that the probability of exactly K white draws is
Thus we can write the following expression for our limit
Since N and K are both going to infinity we can simplify this to
This step may seem unrigorous but if you want you can write out the error term and show that it is O(1/N^2) and thus goes to 0 even after summing. Now here’s the clever part. let
then we notice that our limit can be rewritten as
This is a Riemann sum for the function f on the interval [1/2,1] and thus it will converge to the integral
And there’s our answer.
The following problem is equivalent:
Consider a coin with a probability p of heads. Your prior distribution on p is originally uniform, but then you flip the coin three times and observe two heads and one tails. What is your posterior probability that p > 1/2?
The equivalence results from the fact that the process of adding balls to your urn exactly matches the evolution of the posterior mean of p, which you can check is always (h+1) / (t + h + 2) if you've observed h heads and t tails so far (hence why the 3 white and 2 black balls become 2 heads and 1 tails). And of course, the posterior mean will converge in the limit to the actual value of p by the law of large numbers.
The answer is then simply the same Beta function integral you calculated at the end.
Here's another solution. Note that in order for black to exceed white it must pass through some state (black,white) = (m,m), where m>=3. In such a state, the probability in question is obviously 1/2 by symmetry. Counting the Dyck paths (to avoid double counting states) and noting each has the same probability of reaching (m,m), we get p = 1-1/2*(sum (5-1)!/(3-1)!/(2-1)!*CatalanNumber[n]*(n+2)!*(n+2)!/(2*n+5)! for n=0 to oo), which telescopes to 11/16 after a partial fraction decomposition.