Unexpected expected value

Let’s say that random variables \(x_i\) are independent and distributed uniformly on the segment \([0,1]\). How many such variables do we have to add to get a sum bigger than 1? Formally, if \(n\) is the minimum value such that \(\sum_{i=1}^{n}x_i > 1\), what is the expected value of \(n\)? Surprisingly, it is \(e\).

Let \(m\) be a natural number. We denote by \(p_m\) the probability that we need more than \(m\) variables for their sum to exceed 1, i.e. that \(\sum_{i=1}^m x_i < 1\). Geometrically, this probability is just the volume of the \(m\)-dimensional simplex \(X = \{x \mid \sum_{i=1}^m x_i < 1\}\). In particular, \(p_0 = 1\). This volume can be calculated directly, but there is an elegant probabilistic proof which I’ll demonstrate.

Consider the random variables

$$\begin{eqnarray}y_1 &=& x_1\\ y_2 &=& x_1 + x_2\\ y_3 &=& x_1 + x_2 + x_3\\ \ldots\\ y_m &=& x_1 + x_2 + \ldots + x_m \end{eqnarray}$$

It is obvious that this sequence is increasing and that \(x\in X\) iff \(y \in Y = \{0 \le y_1 \le y_2 \le \ldots \le y_m \le 1\}\). The linear operator which takes \(\{x_i\}\) to \(\{y_i\}\) has determinant 1, so the volumes of \(X\) and \(Y\) are exactly the same.

The volume of \(Y\) is easily calculated. It is the probability that \(m\) independent uniform values appear in sorted order, and since all \(m!\) orderings are equally likely, this probability is \(\frac{1}{m!}\). We conclude that \(p_m = \frac{1}{m!}\).
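This sorted-order probability is easy to verify numerically. A minimal Monte Carlo sketch (the function name `prob_sorted` is my own):

```python
import random
from math import factorial

def prob_sorted(m, trials=200_000):
    """Estimate the probability that m uniform draws appear in sorted order."""
    hits = 0
    for _ in range(trials):
        xs = [random.random() for _ in range(m)]
        if all(xs[i] <= xs[i + 1] for i in range(m - 1)):
            hits += 1
    return hits / trials

for m in (2, 3, 4):
    print(m, prob_sorted(m), 1 / factorial(m))  # estimate vs. exact 1/m!
```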

The probability that the sum of the first \(m\) variables exceeds 1 is \(q_m = 1 - p_m\), so the probability that we need exactly \(m\) variables to exceed 1 is given by \(q_m - q_{m-1} = p_{m-1} - p_m\). The expected number of variables which we have to add is then

$$E\left[n\right] = \sum_{m=1}^\infty m (p_{m-1} - p_m)$$

We can get an easier equivalent expression if we consider how we can rearrange a partial sum:

$$\begin{eqnarray} \sum_{m=1}^k m(p_{m-1} - p_m) &=& p_0 - p_1 + 2p_1 - 2p_2 + 3p_2 - 3p_3 + \ldots + kp_{k-1} - kp_k\\ &=& p_0 + p_1 + p_2 + \ldots + p_{k-1} - kp_k\\ &=& \left(\sum_{m=0}^{k-1} \frac{1}{m!}\right) - kp_k \end{eqnarray}$$

Since \(kp_k = \frac{k}{k!} \to 0 \) as \(k \to \infty\), we conclude that

$$E\left[n\right] = \sum_{m=0}^\infty \frac{1}{m!} = e$$
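The whole result is easy to check by simulation. A minimal sketch (function names are my own):

```python
import random
from math import e

def draws_to_exceed_one():
    """Count how many uniform draws the running sum needs to exceed 1."""
    total, n = 0.0, 0
    while total <= 1.0:
        total += random.random()
        n += 1
    return n

trials = 200_000
avg = sum(draws_to_exceed_one() for _ in range(trials)) / trials
print(avg, e)  # the average approaches e = 2.71828...
```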

Arbitrary distribution using biased coin

Problem. Let’s assume that you have a biased coin, but you don’t know the probabilities of its sides. Using the coin, how would you choose one of the numbers 1, 2 or 3 with equal probability?

Stop reading here and try solving the problem yourself before moving on.

Solution. First of all, I have to note that this problem has a trivial solution that works only for this particular question, not in the general case. One such solution was proposed by a reader in the comments section. In this post, we present a solution that will work for arbitrary distributions, no matter how asymmetric they are.

The problem becomes easier if we split it into two parts. First, we will learn how to emulate a fair coin using a biased one. Then we will describe a procedure which allows us to get numbers from an arbitrary discrete distribution.

Let’s denote the probability of tails as \(p\). Then if we toss the coin twice, we can get one of the four outcomes:

  1. Tails both times, with probability \(p^2\),
  2. Heads both times, with probability \((1-p)^2\),
  3. Tails first, then heads, with probability \(p(1-p)\),
  4. Heads first, then tails, with probability \(p(1-p)\).

As you can see, outcomes 3 and 4 have equal chances of appearing. Thus, in order to emulate an experiment with probability 50/50 (a fair coin), we toss the biased coin twice. If both tosses give the same result, we discard them and toss twice again. When one toss gives tails and the other heads, we take the result of the first toss as the outcome.

Example. Consider the sequence of tosses T, T, H, H, T, H, where T denotes tails and H heads. This sequence describes an experiment in which we toss the coin twice and get tails both times. We discard this result and toss twice again. This time we get two heads and discard them as well. We toss twice one more time: first tails, then heads. We write down “tails” as our outcome.
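This discard-and-retoss procedure (von Neumann’s classic trick) can be sketched in a few lines; the bias \(p = 0.7\) and the function names are arbitrary choices for the demo:

```python
import random

def biased_coin(p=0.7):
    """A biased coin: returns 'T' with probability p, 'H' otherwise."""
    return 'T' if random.random() < p else 'H'

def fair_toss(coin):
    """Toss twice; discard equal pairs; report the first of an unequal pair."""
    while True:
        a, b = coin(), coin()
        if a != b:
            return a

tosses = [fair_toss(biased_coin) for _ in range(100_000)]
print(tosses.count('T') / len(tosses))  # close to 0.5 despite the bias
```

Note that the number of biased tosses needed per fair toss is random: the more biased the coin, the more pairs we expect to discard.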

From now on, we can assume that we have a fair coin. We will toss it repeatedly and interpret the tosses as digits in the binary representation of a number \(s\) in the interval \([0, 1]\):

$$s=\sum\limits_{i=1}^{+\infty} \frac{t_i}{2^i}$$

Here \(t_i\) denotes the outcome of the \(i\)-th toss and can be either 0 or 1. In order to get one of three values at random, we split \([0,1]\) into three subintervals \([0, 1/3), [1/3, 2/3), [2/3, 1]\). (Since any particular point like \(1/3\) has zero probability, we don’t care which interval we include it in.) The outcome of the experiment is the number of the subinterval in which \(s\) lies. Since each binary digit of \(s\) is 0 or 1 with equal probability, \(s\) itself is uniform in \([0,1]\).

In fact, we don’t need the exact value of \(s\). The only thing that matters is the subinterval in which \(s\) lies, so we don’t have to make an infinite number of tosses. Let’s say that the first toss gave us 1. We know that \(s = 0.1\ldots\) (in binary), so \(s \in [1/2, 1]\), which straddles the boundary \(2/3\) between the second and third subintervals. If the next toss gives us 0, then \(s = 0.10\ldots\) and \(s \in [1/2, 3/4]\), which still straddles \(2/3\). If the next toss gives us 0 again, then \(s = 0.100\ldots\) and \(s \in [1/2, 5/8]\), which lies entirely inside \([1/3, 2/3)\). The outcome is the second value.

This scheme can be generalized in an obvious way. If we want to get an arbitrary distribution we just have to split \([0,1]\) into a given number of subintervals of lengths that match the probabilities of the values.
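The interval-narrowing procedure can be sketched as follows; `sample` and `fair_bit` are my own names, exact fractions keep comparisons with boundaries like \(1/3\) precise, and a true fair bit stands in for the emulated one:

```python
import random
from fractions import Fraction

def sample(probabilities, fair_bit=lambda: random.getrandbits(1)):
    """Draw an index with the given probabilities using only fair coin flips."""
    # Cumulative boundaries 0 = c_0 < c_1 < ... < c_k = 1 of the subintervals.
    bounds = [Fraction(0)]
    for p in probabilities:
        bounds.append(bounds[-1] + Fraction(p))
    lo, hi = Fraction(0), Fraction(1)
    while True:
        # Stop as soon as [lo, hi] fits inside a single subinterval.
        for i in range(len(probabilities)):
            if bounds[i] <= lo and hi <= bounds[i + 1]:
                return i
        mid = (lo + hi) / 2
        if fair_bit():
            lo = mid  # the next binary digit of s is 1
        else:
            hi = mid  # the next binary digit of s is 0

counts = [0, 0, 0]
for _ in range(30_000):
    counts[sample([Fraction(1, 3)] * 3)] += 1
print([c / 30_000 for c in counts])  # each close to 1/3
```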

Monty Hall problem

I’ll start my blog with the Monty Hall problem. It is a well-known classic problem and quite a trivial one, but I still want to present it here, because for me personally, it means a lot. It was one of the first “paradoxes” I learned in my life and it has greatly affected my interests and mathematical taste, so I have some special kind of respect for this problem.

Imagine that you are on a TV show in front of three closed doors. There is a car behind one of the doors and if you pick the right door, you win the car. After you select the door, the host doesn’t open it immediately but instead opens another door without a prize. Now you have a choice out of two doors and you are given a chance to change your initial guess. Would it make sense to change the door or not?

This problem doesn’t need any maths knowledge except basic arithmetic. If you don’t know this problem yet, stop reading here and try to solve it on your own.

One incorrect response I saw is the following. We have three doors and the car is behind one of them, so the chance to win is 1/3. (This is correct.) Since each door has the same chance of having the car behind it, it doesn’t matter whether you change the door.

In response to the idea above, one may propose another incorrect line of reasoning: after one losing door is eliminated, each of the remaining doors has a probability of 1/2 of being the winning one. So it still doesn’t matter whether you change the door; either way you get a 1/2 chance of winning. Can you see what’s wrong with this solution?

The correct answer is the following: if you don’t switch the door, then the probability of winning is 1/3. If you switch the door, the probability is 2/3.

One way to see how this works is to consider the probability that your initial choice is incorrect, which is 2/3. The phrase “the door I’ve chosen is incorrect” may be rephrased as “one of the doors I haven’t chosen is correct”. This probability stays the same after the host eliminates one of the doors. After that, you know which remaining door is “the door I haven’t chosen”, and the probability that it hides the car is the same as the probability that you initially made a mistake: 2/3.

This solution gives the right answer, but it doesn’t give the right intuition. It is not immediately clear why this solution is more correct than the one which gives the answer 1/2. I’ll try to explain this below.

Before I go to the solution, you can try playing this game several times with your kid. You will be the host and your kid will try to win. Play the game 20 times and check how many times your kid was able to win. Then suggest a strategy to her: always keep the initial choice. Play another 20 times. After this, she may realize that she is winning less than before (if she was making her choices more or less randomly). If you’re lucky, she will suggest playing again, but this time switching the door each time. If not, you can suggest this idea yourself (or this time she can be the host and you can demonstrate a good result by switching the door each time). Amusingly, when you or your kid switches the door every time, you may expect the results to be better than in the two previous tries. Now you can move on to the explanation.

The problem can be trivially explained by enumeration of all of the possible cases. Let’s assume that we’ve chosen the first door. There are three possibilities where the car can be:

1. A car is behind the first door (the one you chose). If we don’t change the door when we have the opportunity, we win. If we change the door, we lose.

2. A car is behind the second door. If we don’t change the door, we lose. Otherwise, we win.

3. A car is behind the third door. If we don’t change, we lose. Otherwise, we win.

As you can see, if we keep the initial door, we win only in one case out of three. If we change the door, we win in two cases out of three. This gives a probability of 1/3 for keeping our choice and 2/3 for changing the door.

What if initially we chose the second or the third door? The situation will be symmetric to what we’ve just seen, so it doesn’t affect the probability.
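The case enumeration above is also easy to turn into a simulation; this is a minimal sketch with my own naming:

```python
import random

def play(switch):
    """One round of the game; returns True if the player wins the car."""
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick = random.choice(doors)
    # The host opens a door that is neither the pick nor the car.
    opened = random.choice([d for d in doors if d != pick and d != car])
    if switch:
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

trials = 100_000
stay = sum(play(False) for _ in range(trials)) / trials
switch = sum(play(True) for _ in range(trials)) / trials
print(stay, switch)  # close to 1/3 and 2/3
```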

If you know the basics of probability theory, you can easily solve this problem with the law of total probability and see that the solution just repeats the solution above. But if we used this law initially, it wouldn’t look so convincing (at least if you’re just learning basics). This is not a coincidence: probability theory is just about counting possibilities at its heart, and it is always useful to remember this. This often makes reasoning simpler and helps you to distinguish the correct reasoning from an incorrect one.

Another approach to this kind of problem is to generalize it. Let’s say that instead of 3 doors, the TV show uses 100 doors. Only one has a prize behind it. You select one door, then the host opens 98 doors which don’t have prizes. Would you switch the door in this case? The answer should be obvious.

Problem. Let’s say that there are N doors in total, W of them with prizes. You are allowed to choose one door. After this, the host opens K doors without a prize, and you’re allowed to choose another door. What is your probability of winning a prize in this case?

This is quite a simple problem if you know about binomial coefficients, so I’ll leave it as an exercise. Feel free to post your solution in the comments.
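If you want to check your closed-form answer, here is a simulation sketch of the generalized game (the names and sample parameters are my own; it assumes the player switches to a uniformly random unopened door):

```python
import random

def play(n, w, k, switch=True):
    """One round with n doors, w prizes; the host opens k prize-free doors."""
    prizes = set(random.sample(range(n), w))
    pick = random.randrange(n)
    # Doors the host may open: neither the pick nor a prize door.
    closed_losers = [d for d in range(n) if d != pick and d not in prizes]
    opened = set(random.sample(closed_losers, k))
    if switch:
        pick = random.choice([d for d in range(n) if d != pick and d not in opened])
    return pick in prizes

trials = 100_000
wins = sum(play(10, 3, 4) for _ in range(trials)) / trials
print(wins)  # compare against your closed-form answer for N=10, W=3, K=4
```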

Not only the problem itself but also its history is interesting.

Initially, this problem was proposed by Steve Selvin in 1975, but it became well-known only when a reader of Parade magazine asked this question in the “Ask Marilyn” column. Marilyn vos Savant gave the correct answer, but she immediately started receiving letters explaining why she was wrong. Most of these letters were from people with Ph.D. degrees. Here are a couple of examples, taken from Marilyn vos Savant’s website:

[…] You blew it! Let me explain. If one door is shown to be a loser, that information changes the probability of either remaining choice, neither of which has any reason to be more likely, to 1/2. As a professional mathematician, I’m very concerned with the general public’s lack of mathematical skills. Please help by confessing your error and in the future being more careful.
Robert Sachs, Ph.D.
George Mason University

You are utterly incorrect about the game show question, and I hope this controversy will call some public attention to the serious national crisis in mathematical education. If you can admit your error, you will have contributed constructively towards the solution of a deplorable situation. How many irate mathematicians are needed to get you to change your mind?
E. Ray Bobo, Ph.D.
Georgetown University

I am in shock that after being corrected by at least three mathematicians, you still do not see your mistake.
Kent Ford
Dickinson State University

It seems that a Ph.D. doesn’t guarantee sanity, or at least basic knowledge of arithmetic, unfortunately. At the same time, Ms. vos Savant was also receiving letters from primary school teachers who tried this experiment with young kids. And these kids and teachers did solve the problem correctly and were amused by the result. (My advice above to try it with your kid wasn’t random.)

What is more interesting is the difference between the confusion it caused in 1990 and the acceptance it enjoys now. This paradox is taught in many schools and universities around the world, and I haven’t heard that it provokes any doubts among the students. Mathematical education hasn’t changed much in these 28 years, so what’s going on here? I am afraid the answer is that students are simply taught to believe that the proof from the textbook is correct, without understanding it. If people with Ph.D.s weren’t able to solve the problem, then how can 17-year-old kids solve it in an exam? This is why I not only present the correct solution but first give incorrect ones to spark a doubt.

The last bit. As far as I know, the problem originally didn’t mention any particular TV show explicitly (I may be wrong, because I didn’t find reliable sources). Monty Hall was the host of a real TV show called “Let’s Make a Deal” in the seventies, and Steve Selvin wrote a letter to him. I don’t know what was in that letter (if you know where I can read it, please let me know), but the response from Monty became popular, and since then the problem has carried Monty Hall’s name:

May 12, 1975

Mr. Steve Selvin
Asst. Professor of Biostatistics
University of California, Berkeley

Dear Steve:

Thank you for sending me the problem from “The American Statistician.”

Although I am not a student of a statistics problems, I do know that these figures can always be used to one’s advantage, if I wished to manipulate same. The big hole in your argument of problems is that once the first box is seen to be empty, the contestant cannot exchange his box. So the problems still remain the same, don’t they? one out of three. Oh, and incidentally, after one is seen to be empty, his chances are no longer 50/50 but remain what they were in the first place, one out of three. It just seems to the contestant that one box having been eliminated, he stands a better chance. Not so. It was always two to one against him. And if you ever get on my show, the rules hold fast for you-no trading boxes after the selection.

Next time let’s play on my home grounds. I graduated in chemistry and zoology. You want to know your chances of surviving with our polluted air and water?

Sincerely,
Monty