Last week I posed an interview question about the ratio of baby boys to girls in a hypothetical country.
In a country in which people only want boys, every family continues to have children until they have a boy. If they have a girl, they have another child. If they have a boy, they stop. What is the proportion of boys to girls in the country?
The correct conclusion is "probably pretty close to 50-50," but the path to the conclusion is far more interesting and important than the conclusion itself.
This question interests me (though I would never use it as an interview question) because there are many ways to get the answer. There are many different kinds of assumptions that are entailed by the problem statement, and the mathematical rigor of the solution can vary tremendously.
First, check out Business Insider's answer, which is perfectly adequate. The approach is to solve the more specific problem "... when there are 10 families, and they all start at once." (This is an excellent approach, recommended [among others] by the famous mathematician George Polya.)
This one caused quite the debate, but we figured it out following these steps:
- Imagine you have 10 couples who have 10 babies. 5 will be girls. 5 will be boys. (Total babies made: 10, with 5 boys and 5 girls)
- The 5 couples who had girls will have 5 babies. Half (2.5) will be girls. Half (2.5) will be boys. Add 2.5 boys to the 5 already born and 2.5 girls to the 5 already born. (Total babies made: 15, with 7.5 boys and 7.5 girls.)
- The 2.5 couples that had girls will have 2.5 babies. Half (1.25) will be boys and half (1.25) will be girls. Add 1.25 boys to the 7.5 boys already born and 1.25 girls to the 7.5 already born. (Total babies: 17.5 with 8.75 boys and 8.75 girls).
- And so on, maintianing a 50/50 population.
This is a good back-of-the-envelope calculation, but it talks about things like "2.5 babies" which is silly and "and so on, maintaining a 50/50 population" which begs the question. Nevertheless, it's a good answer for a short interview.
Before I get to my own solutions, a digression on another related problem:
Two trains are heading toward each other on the same track. They are 120 miles apart (as the track is laid) at time t = 0. Each train is moving at 30 mph.
At time t = 0 a fly takes off from the front of the first train and starts flying toward the second train, following the track, at 60 mph (yes, this fly is faster than the train.) As soon as it reaches the second train it turns around (losing no time on the turnaround, and without being crushed on the windshield) and starts back toward the first train. When it reaches the first train again it turns around and heads back toward the second train again, and so on, back and forth, until the trains collide and the poor creature's miserable existence is mercifully ended between them.
The question is: how many miles did the fly's odometer advance between time t = 0 and the collision? (I know, flies don't usually have odometers. This fly does. He's a very peculiar fly. But you knew that already.)
There are two canonical ways to answer this question. One way is to start calculating from the fly's point of view:
Well, the fly F starts at x = -60 and heads toward train T2 at 60 mph... but the train is heading toward the origin at 30 mph... so I solve the system of two linear equations F(t) = -60 + 60t and T2(t) = 60 - 30t and I get the point of intersection is at x = 20 (t = 1 1/3 but this is less relevant.) So the fly goes 60 + 20 = 80 miles on the first leg. Then he turns around and heads back toward T1, which by this time is at x = -20... so I solve the new system of two linear equations and I get the point of intersection is at x = -6 2/3. So the fly goes 26 2/3 miles on the second leg, or 80 + 26 2/3 so far. Now I do the third leg... hmm, it looks like the distance he goes on each leg is 1/3 the distance of the leg before, that's interesting... I wonder if I can prove it... [proof elided]... OK, it boils down to:
... and I know the geometric series formula...
for |r| < 1 (Geometric series theorem.)
... if I pull out the 80 and substitute r = 1/3 I get...
which comes out to... 120 miles!
This is absolutely correct. But the other way is to back off a step and look at it from the director's point of view.
OK, the trains start at time t = 0 and they head toward each other at 30 miles per hour each. So the distance between them is getting smaller at a rate of 60 miles per hour. There's 120 miles between them at the start, so the crash will happen at time t = 2 hours. The little fly is going at a constant 60 miles per hour throughout those two hours (poor little fella) so he'll have racked up exactly 120 miles of travel at the time of the collision. I don't know how far he goes on any given leg (and I don't particularly care.)
Legend has it that legendary mathematician John von Neumann was asked this question during class by a student eager to show him up. Von Neumann thought for a few moments and came up with the correct answer.
Student (somewhat embarrassed:) Darn... I was hoping you wouldn't see the shortcut, and you'd have to sum the series.
Von Neumann (puzzled:) I did sum the series.
Student: (awed silence)
Heh. Gets me every time.
Anyway, back to boys and girls. The "easy" way to solve this is to think of it from the point of view of the hospital maternity ward:
Every day, women come in and give birth. The midwife doesn't know (or care) how many siblings the newborn has, and nothing in the problem statement changes the assumption that the midwife is going to see a (roughly) equal number of boys and girls being delivered. So the ratio of boys to girls is (roughly) 50-50.
Changing the point of view from the family to the hospital reveals that almost all of the information in the problem statement was extraneous:
In a country in which people only want boys, every family continues to have children until they have a boy. If they have a girl, they have another child. If they have a boy, they stop.What is the proportion of boys to girls in the country?
All well and good. But let's try the summing-the-series way (I'm a glutton for punishment.) We'll start by figuring out how many boys there are, and then figure out how many girls there are, and then calculate the ratio directly. This way we'll actually use all the information in the problem.
-- How many boys are there? --
Well, the number of families in the country is not given. Let's call it F, and we'll only count families that actually have children. About half of the families would have had a boy first... that's F / 2 boys. The remainder would have tried again. About half of these would have gotten a boy on the second try... that's F / 4 boys. The remainder would have tried again. About half of these would have gotten a boy on the third try... that's F / 8 boys.
At this point we can use an inductive argument to say that for any n from 1 to infinity, there are F / 2n boys that result from the a family getting a boy on the nth try. Key to this inductive argument are the perfectly natural assumptions that a family can support arbitrarily many children, and that a woman is capable of generating children arbitrarily quickly.
The total number of boys B can then be calculated as:
(Note the use of the geometric series formula again.)
That is to say, there are precisely as many boys as there are families. Which makes sense because each family has one boy. Hmm, perhaps there was an easier way to solve this part...
-- How many girls are there? --
The F / 2 families that got a boy first shot out of the box have 0 girls: 0 * F / 2.
The F / 4 families that got a boy on the second try have 1 girl each: 1 * F / 4.
The F / 8 families that got a boy on the third try have 2 girls each: 2 * F / 8.
The total number of girls G is thus:
It's not immediately clear how to sum the series. An elegant way is to use a trick worth remembering:
I am assuming both |x| < 1 (for convergence) and x ≠ 0 (for convenience.)
Differentiate (this is the trick:)
Multiply both sides by x:
Now substitute x = 1/2:
Back to the calculation:
Heh. The total number of girls is also equal to the total number of families... they're just distributed differently (F / 2 families have no girls; F / 4 have one; F / 8 have two; F / 16 have three, etc.)
-- What is the ratio of boys to girls? --
Since B = G = F, the ratio B / G = 1.
All this beautiful analysis to the contrary, the answer is actually wrong. More boys are born every year than girls (at a ratio of about 106 to 100), and this is countered gradually due to a higher mortality rate among boys than girls (there are fewer girls born every year but they have a higher life expectancy.) The cutoff age is 36... below this age there are more men than women, above that age there are more women than men.