Created
May 30, 2019 13:20
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# How many women are taller than their husbands?\n", | |
"\n", | |
"\n", | |
"Copyright 2019 Allen Downey\n", | |
"\n", | |
"[MIT License](https://opensource.org/licenses/MIT)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"%matplotlib inline\n", | |
"\n", | |
"import numpy as np" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"[Chris Anderson tweeted](https://twitter.com/thetalperry/status/1133964553597927425):\n", | |
"\n", | |
">\"My daughter asked what percentage of the world's couples have the woman taller than the man and I'm trying to turn it into a math estimation lesson. I can calculate what fraction of women are taller than the average man, but how to figure out the pairings?\"\n", | |
"\n", | |
"What a great question!\n", | |
"\n", | |
"According to [Our World in Data: Human Height](https://ourworldindata.org/human-height)\n", | |
"the global average adult height is 171 cm for men and 159 cm for women.\n", | |
"\n", | |
"Standard deviation is harder to come by. Using data from the BRFSS, [I have estimated](https://allendowney.blogspot.com/2017/01/more-notebooks-for-think-stats.html) that the standard deviation of adult height in the U.S. is 7.7 cm for men and 7.3 cm for women.\n", | |
"\n", | |
"I'll assume that the global standard deviation is a little bigger, 9 cm for men and 8 cm for women.\n", | |
"\n", | |
"Now I can generate a random sample of 1000 men and 1000 women." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"from scipy.stats import norm\n", | |
"\n", | |
"male_sample = norm(171, 9).rvs(1000)\n", | |
"female_sample = norm(159, 8).rvs(1000)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"If men and women got married at random, without regard to height, the wife would be taller in about 16% of couples." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"0.15" | |
] | |
}, | |
"execution_count": 3, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"np.mean(female_sample > male_sample)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"But if people lined up by height, with the tallest man marrying the tallest woman, and so on down the line, it's possible the wife would never be taller than the husband." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"male_sample.sort()\n", | |
"female_sample.sort()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"0.0" | |
] | |
}, | |
"execution_count": 5, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"np.mean(female_sample > male_sample)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"So that's the lower bound. What's the upper bound? \n", | |
"\n", | |
"What if the tallest woman married the shortest man, the second tallest woman married the second shortest man, and so on?\n", | |
"\n", | |
"We can sort the men in reverse order\n", | |
"using this [not at all intuitive idiom](https://stackoverflow.com/questions/26984414/efficiently-sorting-a-numpy-array-in-descending-order)." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"male_sample[::-1].sort()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"With this ordering, about 24% of women would be taller than their husbands." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"0.237" | |
] | |
}, | |
"execution_count": 7, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"np.mean(female_sample > male_sample)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"But that's not the upper bound. To get that, we can start with the tallest woman and pair her with the tallest man shorter than her. Then we remove that man from the list, and repeat with the next tallest woman.\n", | |
"\n", | |
"How many women would find a husband shorter than them?" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 8, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"male_list = list(male_sample)\n", | |
"pairs = []\n", | |
"\n", | |
"# female_sample is sorted ascending, so reverse it to start with the tallest woman\n", | |
"for female in female_sample[::-1]:\n", | |
"    # men still available who are shorter than this woman\n", | |
"    shorter = [male for male in male_list if male < female]\n", | |
"    if shorter:\n", | |
"        # choose the tallest male shorter than the woman\n", | |
"        husband = max(shorter)\n", | |
"    else:\n", | |
"        # no shorter man is left, so choose the tallest remaining male\n", | |
"        husband = max(male_list)\n", | |
"\n", | |
"    male_list.remove(husband)\n", | |
"    pairs.append((female, husband))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 9, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"wives, husbands = np.transpose(pairs)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 10, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"0.466" | |
] | |
}, | |
"execution_count": 10, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"np.mean(wives > husbands)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"So the range of possibilities is pretty wide, from 0% with perfect assortative mating to about 50% with a deliberate strategy of maximization.\n", | |
"\n", | |
"Where are we in reality? Assortative mating might be moderate based on height, but it is strong on several factors that correlate with height (particularly nationality). Worldwide, I would guess that fewer than 10% of women marry a shorter man." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.6.7" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 2 | |
} |