If you haven't read them yet, please see Monday's and Tuesday's posts before starting here. They're the start of this journey that I'm chronicling, ending with today.
We stopped on Tuesday with a change of strategy in estimating the odds of different family compositions. See my long list of all 32 possible series of boy/girl in a five-kid family and add up the ways to achieve the six different family compositions. Here are our results:
What are the odds that you'll get...
5 girls, 0 boys? 1/32
5 boys, 0 girls? 1/32
4 girls, 1 boy? 5/32 (there are 5 possible series out of 32 that make up this boy/girl ratio in a family)
4 boys, 1 girl? 5/32
3 girls, 2 boys? 10/32 (there are 10 possible series out of 32 that make up this boy/girl ratio in a family)
3 boys, 2 girls? 10/32
(Psst. I googled how to calculate probabilities and found this website and DINGALING! they're actually using my example. And here's a nice site showing how to work with a binomial equation rather than list all the possible 32 outcomes like I did Tuesday.)
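For anyone who'd rather let a computer do the scratching, here is a minimal sketch of that counting. It assumes every birth is an independent 50/50 boy/girl event, lists all 32 series, and checks the tallies against the binomial coefficients. The names `KIDS`, `series` and `girls_count` are just mine for illustration.

```python
from itertools import product
from collections import Counter
from math import comb

KIDS = 5  # family size used in this post

# Every possible birth-order series, e.g. "ggbgb" (2**5 = 32 of them).
series = ["".join(s) for s in product("gb", repeat=KIDS)]

# Tally the series by how many girls each one contains.
girls_count = Counter(s.count("g") for s in series)

for girls in range(KIDS, -1, -1):
    ways = girls_count[girls]
    # The binomial coefficient "5 choose girls" should give the same tally.
    assert ways == comb(KIDS, girls)
    print(f"{girls} girls, {KIDS - girls} boys: {ways}/{len(series)}")
```

The loop prints one line per family composition, so the six fractions can be compared directly with the list above.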
This sort of thinking about probabilities should remind you of how the odds of the outcomes of rolling the dice are not uniform across all numbers. Your best bet is a 6, 7, or 8 because there are more ways to get those three numbers than the others.
(The following list was edited thanks to a very nice comment, February 5, 2015)
to roll a ...
2 ... there is 1 way: 1+1
3 ... there are 2 ways: 2 + 1; 1 + 2
4 ... there are 3 ways: 3 + 1; 1 + 3; 2 + 2
5 ... there are 4 ways: 3 + 2; 2 + 3; 4 + 1; 1 + 4
6 ... there are 5 ways: 3 + 3; 2 + 4; 4 + 2; 5 + 1; 1 + 5
7 ... there are 6 ways: 6 + 1; 1 + 6; 5 + 2; 2 + 5; 4 + 3; 3 + 4
8 ... there are 5 ways: 4 + 4; 5 + 3; 3 + 5; 6 + 2; 2 + 6
9 ... there are 4 ways: 3 + 6; 6 + 3; 5 + 4; 4 + 5
10 ... there are 3 ways: 5 + 5; 6 + 4; 4 + 6
11 ... there are 2 ways: 5 + 6; 6 + 5
12 ... there is 1 way: 6 + 6
(Psst. If you still think 7 is lucky for rolling the dice, then you should have more of a think about probability.)
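The dice tally can be double-checked the same way. This is a small sketch assuming two fair six-sided dice; `ways` is just an illustrative name.

```python
from itertools import product
from collections import Counter

# Count how many of the 36 ordered (first die, second die) pairs give each total.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for total in range(2, 13):
    print(f"{total}: {ways[total]} ways out of 36")
```

Counting ordered pairs (2 + 1 and 1 + 2 separately) is what makes the 36 outcomes equally likely.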
And just like 6, 7, and 8 from rolling the dice, having three boys and two girls (or three girls and two boys) is the "luckier," more probable sex ratio in a family of five children.
How do we know which of the two sets of probabilities that I calculated--Tuesday's or today's--is correct?
All girls, no boys: 1/6 or 1/32? (17% or 3%)
All boys, no girls: 1/6 or 1/32? (17% or 3%)
Four girls, one boy: 1/6 or 5/32? (17% or 16%)
Four boys, one girl: 1/6 or 5/32? (17% or 16%)
Three girls, two boys: 1/6 or 10/32? (17% or 31%)
Three boys, two girls: 1/6 or 10/32? (17% or 31%)
I see very clearly why our second method (the second figure on each line) is superior to our first, which was to incorrectly divvy up the odds in sixths. That is, I can see clearly why the odds of having five girls are still 1/32 and not 1/6. There are so many more ways to make a family of five with four girls or with three girls or with two girls than to make one with five girls, so you can't possibly have evenly distributed 1/6 odds for all those types of families of five children.
Initiate mind-blowing sequence.
But when you take the long view, 1/6 (or at least higher odds than 1/32) for a streak of five girls still seems not so crazy.
After all, the odds of having five children of all the same sex are only the lowest, the rarest, because we've arbitrarily decided that our family in question maxes out at five!
Would we find those same low odds of 1/32 for five girls in a row if the family had six kids--having more opportunities to have streaks of five girls during that span?
That's (a + b)^6, and if you scratch it out on a piece of paper you don't need to expand the binomial equation. The odds of having six straight girls are 1/64. Same for any single series you can make out of six births (there are 64 different series of boy/girl adding up to six kids).
And then by just sketching or scribbling (but if you're fancy, you can also just use the binomial) you can see how you can get only three series (gggggg; gggggb; bggggg) to have five girls in a row to occur in a family with six births.
That means the odds of having a streak of five girls in a six child family is 3/64 which is 4.6875% (compared to 1/32 or 3.125% in a five child family).
So the odds are slightly larger in a bigger family.
Wait. Did I just do that right?
Let's try a family of seven to make sure I did.
Here are all possible series with a streak of five girls in a family of seven: ggggggg; ggggggb; gggggbg; gggggbb; bgggggg; bgggggb; gbggggg; bbggggg.
The odds of having five girls in a row in a family of seven = 8/128 = 6.25%
Okay, with a bigger family, the odds are even larger.
What about a family of eight? The odds of having five girls in a row in a family of eight = 20/256 = 7.8%
(Trust me... I scratched it out. It's easy to miss one; there are 20 series, not the 19 I found before my contacts fogged up.)
Okay, yes. The odds of having a streak of five girls increase as the size of the family increases.
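Scratching these out by hand gets error-prone fast, so here is a brute-force sketch that lists every series for a given family size and counts the ones containing the streak. It assumes independent 50/50 births; `count_streak_series` and its `streak` parameter are hypothetical names of mine.

```python
from itertools import product

def count_streak_series(kids, streak=5):
    """Count boy/girl series of length `kids` with at least `streak` girls in a row."""
    target = "g" * streak
    return sum(target in "".join(s) for s in product("gb", repeat=kids))

for kids in range(5, 9):
    hits = count_streak_series(kids)
    total = 2 ** kids  # number of possible series for this family size
    print(f"{kids} kids: {hits}/{total} = {hits / total:.4%}")
```

Run this way, the counts come out as 1/32, 3/64, 8/128 and 20/256 for families of five through eight. Passing `streak=4` does the same check for streaks of four girls.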
Wait. What?! How do odds change? Odds are odds?
Instead of going up in scale again to check, making calculations even harder, let's go down in scale to check our math. We already know from Tuesday that the odds of a five-kid family having a streak of four girls is 3/32 (ggggg; ggggb; bgggg) = 9%.
WHAT?! Just by making a fifth baby, you've just seriously upped your chances of having a streak of four girls. Your odds go from 6% if you max out at four kids to 9% if you max out at five kids. That sounds reasonable, but...
This means your odds of having a streak of four girls or five girls (or anything!) depend on what DIDN'T YET HAPPEN IN THE FUTURE.
I'm sorry. Hold on. Time out for a sec. My brain is literally inside out right now.
Am I seriously figuring out now--Today. This minute.--that probability is vulnerable to what hasn't yet happened in the future? And that the present can change past probabilities?
That sounds so familiar. That idea. But never do I think I've ever come to it by myself.
Until now I think it was always just a sentiment that Deepak Chopra hugged into Oprah, who gifted it to Martha Stewart, who baked it into a lemon zest fortune cookie.*
So predicting or estimating frequencies can change by the very nature of the present? Very interesting.
Doesn't it sound like we're crossing streams with the whole quantum mechanics pickle about changing a particle's state the moment it's observed? (and here)
Are people just particles?!?! Am I on psychedelic drugs and where can you get some too?
This shouldn't be so bleeping mind-blowing should it?
Unless... unless... As my repulsed reaction to a snappy "1/32" indicated at the outset back on Monday: Small scale probabilities are different from large scale ones. Probabilities become different the bigger and bigger that you get.
And it's no secret that people who think evolutionarily think big. We're transcending space and time constantly. What? We are. You're welcome to join us. It's fun here. No vomit comets necessary either.
So if we approach 100, 1,000, or say... um... just to pull a random number from the air... SEVEN BILLION births, we should expect to have a much higher than 1/32 chance of finding a streak of five girls.
True. Nobody's making a family of seven billion children. So the question is, do we treat each family as defined by a finite probability or do we see births and families in our species as part of one big series with vastly different probabilities at that level than at the level of the family?
If it's the latter, we should expect what, exactly? Greater odds than 1/32 for having five girls in a row, that's for sure ... Greater than the odds we found for a family of eight, that's for sure ... The odds are x (where x = the number of series of 7 billion births that contain 5+ girls in a row) divided by the total number of possible series (that's 2 multiplied by itself 7 billion times, not 7 billion itself), and so they're going to be greater than 1/32 by a long shot! It may even be close to our earlier totally gauche calculation on Tuesday of 1/6 or it could be even higher!**
So why do we even calculate odds at the family unit level? Just to practice our algebra? Are they really as meaningless as my gut was screaming out in Monday's post?
No no no. I know why we calculate them in our math workbooks and our homeworks. It's not just algebra practice, these are hypotheses we can test. We can use these expectations to see whether there is any factor skewing the outcomes of some families, perhaps there is something biochemical in the babymaking process that results in one kind of offspring for some parents. We'd have to look at families (to account for genes, etc) or at clusters of people living in the same environment (to account for bio-enviro interactions). If we find that within those sorts of sample populations people are having an unexpectedly high number of all-girl families (i.e. there are significantly more than 1/32 families of five children who are girl-only), for example, then we might suspect that there is not a 50/50 boy/girl probability with these folks each time they make a baby and that might entice us to investigate further into their genes or into their ground water, etc.
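To make that testing idea concrete, here is a rough sketch of one way to ask "is this more all-girl families than chance would give us?" It uses an exact binomial tail and assumes each five-child family independently has a 1/32 chance of being all girls. The figures 320 and 20 are made up purely for illustration; they are not data. `tail_probability` is a hypothetical helper name.

```python
from math import comb

def tail_probability(n_families, k_all_girl, p=1 / 32):
    """Chance of seeing at least `k_all_girl` all-girl families among
    `n_families` five-child families, if each has probability `p`."""
    return sum(
        comb(n_families, j) * p ** j * (1 - p) ** (n_families - j)
        for j in range(k_all_girl, n_families + 1)
    )

# Made-up illustration: 320 families, so about 10 all-girl families are expected.
n, observed = 320, 20
print(f"expected all-girl families: {n / 32:.1f}")
print(f"chance of {observed} or more by luck alone: {tail_probability(n, observed):.4f}")
```

A very small tail probability would be the kind of "unexpectedly high" result that might send us off to look at genes or ground water. A large one would say the 50/50 expectation is holding up fine.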
But back to these issues about small versus large perspectives that we've uncovered here...
In general, we might find 1/32 five-girl families of five kids max in our species, but if you look at a hospital register, for example, we'll find streaks of five girls much much more frequently than 1/32 (3%) of the time.
There is something misleading about the way we calculate probability in a closed and narrow view of the world. And there is something subtly different about thinking probabilistically about a series of independent events and thinking probabilistically about their outcomes, instead, especially when many separate series can have the same outcomes (e.g. rolling a 7 with the dice or having 3 boys and 2 girls).
I think I've located my trouble with probability. It's just a small one with having a large denominator. You know, something pretty easily surmounted--it's just grasping silly little old infinity's all.
O! Maybe later I'll see if I can dig up what the demographic data say. I can ask: How do the frequencies of sex-ratios in human families fit these tight little closed and narrow probabilities/hypotheses? And do birth registers in hospitals show something much different, probably larger? I shall hope to find out. Not because it's a mystery; I already believe I know the answer. But because I can't simply believe to know an answer if there is a real way to see one, and there is a way in this case, so I should go and see in order to believe.
Thanks for reading. Hope our little journey back in time to the fundamentals of statistics blew your mind even a fraction of the way it blew mine!
Further humbling questions and related thoughts are increasingly probable to appear in future posts....
**Anybody know how to calculate this? Is a super computer necessary? Are there shortcuts for working with such large numbers--something like Pascal's Triangle perhaps?
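One possible shortcut, offered as a sketch rather than the definitive method: no supercomputer and no listing of series is needed if you track the chance of NOT yet having a streak and build it up one birth at a time. In any streak-free series the first boy has to arrive within the first five births, and everything after him has to be streak-free too. The sketch assumes independent 50/50 births; `streak_probability` is a hypothetical name of mine.

```python
from collections import deque

def streak_probability(births, streak=5):
    """Chance that `births` fair 50/50 births contain at least `streak` girls in a row."""
    if births < streak:
        return 0.0
    # q(i) = chance that i births contain no such streak; q(i) = 1 while i < streak.
    # The window always holds the last `streak` values: q(n - streak) .. q(n - 1).
    window = deque([1.0] * streak, maxlen=streak)
    for _ in range(streak, births + 1):
        # If the first boy is birth number j (chance 1/2**j), the remaining
        # n - j births must also be streak-free: q(n) = sum of q(n - j) / 2**j.
        q = sum(window[-j] / 2 ** j for j in range(1, streak + 1))
        window.append(q)
    return 1.0 - window[-1]

for births in (5, 6, 7, 8, 100, 1000):
    print(f"{births} births: {streak_probability(births):.6f}")
```

For small families this gives the same fractions as counting series by hand, and it climbs quickly from there: well over half by 100 births and indistinguishable from 1 by 1,000. A plain Python loop would be slow for seven billion births, but by then the answer has long since settled at "practically certain."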