@bmtgoncalves
Last active November 21, 2021 17:34
Nightingale.ipynb
{
"cells": [
{
"metadata": {},
"cell_type": "markdown",
"source": "<div style=\"width: 100%; overflow: hidden;\">\n <div style=\"width: 150px; float: left;\"> <img src=\"https://raw.githubusercontent.com/DataForScience/Graphs4Sci/master/data/D4Sci_logo_ball.png\" alt=\"Data For Science, Inc\" align=\"left\" border=\"0\" width=150px> </div>\n <div style=\"float: left; margin-left: 10px;\"><h1>Visualization for Science</h1>\n<h1>Nightingale Plot</h1>\n <a href=\"http://www.data4sci.com/\">www.data4sci.com</a><br/>\n @bgoncalves, @data4sci</p></div>\n</div>"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "import pandas as pd\nimport numpy as np\n\nimport matplotlib\nimport matplotlib.pyplot as plt \nfrom matplotlib.patches import Wedge, Patch\nfrom matplotlib.collections import PatchCollection\nimport matplotlib.font_manager as font_manager\n\nimport watermark\n\n%load_ext watermark\n%matplotlib inline",
"execution_count": 1,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "We start by print out the versions of the libraries we're using for future reference"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "%watermark -n -v -m -g -iv",
"execution_count": 2,
"outputs": [
{
"output_type": "stream",
"text": "Python implementation: CPython\nPython version : 3.8.5\nIPython version : 7.19.0\n\nCompiler : Clang 10.0.0 \nOS : Darwin\nRelease : 20.6.0\nMachine : x86_64\nProcessor : i386\nCPU cores : 16\nArchitecture: 64bit\n\nGit hash: 9fa3dbd9d128a84e679d23456a0b79a628fff382\n\nwatermark : 2.1.0\njson : 2.0.9\nmatplotlib: 3.3.2\npandas : 1.1.3\nnumpy : 1.19.2\n\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "# Load the dataset"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "data = pd.read_excel('https://github.com/DataForScience/Viz4Sci/raw/master/data/Nightingale.xlsx')",
"execution_count": 3,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "data.head()",
"execution_count": 4,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 4,
"data": {
"text/plain": " Month Year Cause of Death Monthly Death Total \\\n0 April 1854 Infectious Disease 1 \n1 May 1854 Infectious Disease 12 \n2 June 1854 Infectious Disease 11 \n3 July 1854 Infectious Disease 359 \n4 August 1854 Infectious Disease 828 \n\n Annual Mortality Rate (per 1000 soldiers) \n0 1.4 \n1 6.2 \n2 4.7 \n3 150.0 \n4 328.5 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Month</th>\n <th>Year</th>\n <th>Cause of Death</th>\n <th>Monthly Death Total</th>\n <th>Annual Mortality Rate (per 1000 soldiers)</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>April</td>\n <td>1854</td>\n <td>Infectious Disease</td>\n <td>1</td>\n <td>1.4</td>\n </tr>\n <tr>\n <th>1</th>\n <td>May</td>\n <td>1854</td>\n <td>Infectious Disease</td>\n <td>12</td>\n <td>6.2</td>\n </tr>\n <tr>\n <th>2</th>\n <td>June</td>\n <td>1854</td>\n <td>Infectious Disease</td>\n <td>11</td>\n <td>4.7</td>\n </tr>\n <tr>\n <th>3</th>\n <td>July</td>\n <td>1854</td>\n <td>Infectious Disease</td>\n <td>359</td>\n <td>150.0</td>\n </tr>\n <tr>\n <th>4</th>\n <td>August</td>\n <td>1854</td>\n <td>Infectious Disease</td>\n <td>828</td>\n <td>328.5</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Combine month and year into a single column"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "data['date'] = data[['Month', 'Year']].apply(lambda x:x['Month'] + ' ' + str(x['Year']), axis=1)",
"execution_count": 5,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Pivot data into a table with each cause of death as a column. We're keeping only the mortality rate."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "infectious = pd.pivot_table(data, index='date', columns='Cause of Death', \n values='Annual Mortality Rate (per 1000 soldiers)', aggfunc='sum')",
"execution_count": 6,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "infectious",
"execution_count": 7,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 7,
"data": {
"text/plain": "Cause of Death All Other Causes Infectious Disease Wounds and Injuries\ndate \nApril 1854 7.0 1.4 0.0\nApril 1855 21.2 177.5 17.9\nAugust 1854 11.9 328.5 0.4\nAugust 1855 6.7 129.9 44.1\nDecember 1854 48.0 631.5 41.7\nDecember 1855 7.8 25.3 5.0\nFebruary 1855 140.1 822.8 16.3\nFebruary 1856 5.2 6.6 0.0\nJanuary 1855 120.0 1022.8 30.7\nJanuary 1856 13.0 11.4 0.5\nJuly 1854 9.6 150.0 0.0\nJuly 1855 9.3 107.5 37.7\nJune 1854 2.5 4.7 0.0\nJune 1855 9.6 247.6 64.5\nMarch 1855 68.6 480.3 12.8\nMarch 1856 9.1 3.9 0.0\nMay 1854 4.6 6.2 0.0\nMay 1855 12.5 171.8 16.6\nNovember 1854 42.8 340.6 115.8\nNovember 1855 10.1 56.4 10.5\nOctober 1854 50.1 197.0 51.7\nOctober 1855 4.6 32.8 13.6\nSeptember 1854 27.7 312.2 32.1\nSeptember 1855 5.0 47.5 69.4",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th>Cause of Death</th>\n <th>All Other Causes</th>\n <th>Infectious Disease</th>\n <th>Wounds and Injuries</th>\n </tr>\n <tr>\n <th>date</th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>April 1854</th>\n <td>7.0</td>\n <td>1.4</td>\n <td>0.0</td>\n </tr>\n <tr>\n <th>April 1855</th>\n <td>21.2</td>\n <td>177.5</td>\n <td>17.9</td>\n </tr>\n <tr>\n <th>August 1854</th>\n <td>11.9</td>\n <td>328.5</td>\n <td>0.4</td>\n </tr>\n <tr>\n <th>August 1855</th>\n <td>6.7</td>\n <td>129.9</td>\n <td>44.1</td>\n </tr>\n <tr>\n <th>December 1854</th>\n <td>48.0</td>\n <td>631.5</td>\n <td>41.7</td>\n </tr>\n <tr>\n <th>December 1855</th>\n <td>7.8</td>\n <td>25.3</td>\n <td>5.0</td>\n </tr>\n <tr>\n <th>February 1855</th>\n <td>140.1</td>\n <td>822.8</td>\n <td>16.3</td>\n </tr>\n <tr>\n <th>February 1856</th>\n <td>5.2</td>\n <td>6.6</td>\n <td>0.0</td>\n </tr>\n <tr>\n <th>January 1855</th>\n <td>120.0</td>\n <td>1022.8</td>\n <td>30.7</td>\n </tr>\n <tr>\n <th>January 1856</th>\n <td>13.0</td>\n <td>11.4</td>\n <td>0.5</td>\n </tr>\n <tr>\n <th>July 1854</th>\n <td>9.6</td>\n <td>150.0</td>\n <td>0.0</td>\n </tr>\n <tr>\n <th>July 1855</th>\n <td>9.3</td>\n <td>107.5</td>\n <td>37.7</td>\n </tr>\n <tr>\n <th>June 1854</th>\n <td>2.5</td>\n <td>4.7</td>\n <td>0.0</td>\n </tr>\n <tr>\n <th>June 1855</th>\n <td>9.6</td>\n <td>247.6</td>\n <td>64.5</td>\n </tr>\n <tr>\n <th>March 1855</th>\n <td>68.6</td>\n <td>480.3</td>\n <td>12.8</td>\n </tr>\n <tr>\n <th>March 1856</th>\n <td>9.1</td>\n <td>3.9</td>\n <td>0.0</td>\n </tr>\n <tr>\n <th>May 1854</th>\n <td>4.6</td>\n <td>6.2</td>\n <td>0.0</td>\n </tr>\n <tr>\n <th>May 1855</th>\n 
<td>12.5</td>\n <td>171.8</td>\n <td>16.6</td>\n </tr>\n <tr>\n <th>November 1854</th>\n <td>42.8</td>\n <td>340.6</td>\n <td>115.8</td>\n </tr>\n <tr>\n <th>November 1855</th>\n <td>10.1</td>\n <td>56.4</td>\n <td>10.5</td>\n </tr>\n <tr>\n <th>October 1854</th>\n <td>50.1</td>\n <td>197.0</td>\n <td>51.7</td>\n </tr>\n <tr>\n <th>October 1855</th>\n <td>4.6</td>\n <td>32.8</td>\n <td>13.6</td>\n </tr>\n <tr>\n <th>September 1854</th>\n <td>27.7</td>\n <td>312.2</td>\n <td>32.1</td>\n </tr>\n <tr>\n <th>September 1855</th>\n <td>5.0</td>\n <td>47.5</td>\n <td>69.4</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Put the columns in the right order"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "infectious = infectious[['Wounds and Injuries', 'All Other Causes', 'Infectious Disease', ]]",
"execution_count": 8,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Normalize values"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "infectious = (infectious**2).cumsum(axis=1)\ninfectious /= infectious.max().max()\ninfectious = np.sqrt(infectious)",
"execution_count": 9,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Subset and order the rows"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "order = [\n 'April 1854', 'May 1854', 'June 1854', \n 'July 1854', 'August 1854', 'September 1854', \n 'October 1854', 'November 1854', 'December 1854',\n 'January 1855', 'February 1855', 'March 1855',]\n\ninfectious = infectious.loc[order].reset_index()\ninfectious = infectious[::-1]",
"execution_count": 10,
"outputs": []
},
{
"metadata": {
"trusted": false
},
"cell_type": "markdown",
"source": "Color list"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "colors =['#51a7f9', 'black', '#f9517b', ]",
"execution_count": 11,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Generate the figure"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "fig, ax = plt.subplots(1, figsize=(10, 10))\nax.set_aspect(1.)\n\npatches = []\ncolor = []\ntotal = 90 # First wedge starts off at 90'\nangle = 30 # Each wedge covers 30'\n\nfor i in range(infectious.shape[0])[::1]:\n for j, col in enumerate(['Infectious Disease', \n 'All Other Causes', \n 'Wounds and Injuries', ]):\n value = infectious[col].iloc[i]\n patches.append(Wedge((0, 0), np.sqrt(value), \n total,\n total+angle\n ))\n color.append(colors[j])\n \n length = np.max([np.sqrt(infectious.iloc[i, 1:4].max())+0.02, 0.4])\n x = length * np.cos((total+angle/2)*np.pi/180)\n y = length * np.sin((total+angle/2)*np.pi/180)\n \n label = infectious['date'].iloc[i]\n \n if label[:3] not in ['Mar', 'Jan', 'Apr']:\n label = label.split()[0]\n elif label[:3] == 'Apr':\n label = '\\n'.join(label.split())\n \n ax.text(x, y, label.upper(), rotation=(total+angle/2-90), \n ha='center', va='center', fontsize=12)\n \n total += angle\n\np = PatchCollection(patches, color=color, alpha=0.5)\nax.add_collection(p)\nax.set_xlim(-1.2, 1.2)\nax.set_ylim(-1.2, 1.2)\n\npatches = []\nlegend = ['deaths from preventable diseases', \n 'deaths from wounds', \n 'deaths from all other sources']\n\n# you'll need to change the font location\nfont = font_manager.FontProperties(fname='Lucida Calligraphy Italic.ttf',\n weight='normal',\n style='italic', size=16)\n\nfor i, leg in enumerate(legend):\n patches.append(Patch(color=colors[i], label=leg, alpha=0.5))\n\nax.legend(handles=patches, loc=(0.5, 0.7), \n fancybox=False, frameon=False, prop=font)\n\nax.axis('off')\nfig.tight_layout()",
"execution_count": 12,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 720x720 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Which compares nicely with the original"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "<img src='https://pbs.twimg.com/media/BqzygKNCUAEHaNf?format=png&name=small'>"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "<div style=\"width: 100%; overflow: hidden;\">\n <img src=\"https://raw.githubusercontent.com/DataForScience/Graphs4Sci/master/data/D4Sci_logo_full.png\" alt=\"Data For Science, Inc\" align=\"center\" border=\"0\" width=300px> \n</div>"
}
],
"metadata": {
"kernelspec": {
"name": "python3",
"display_name": "Python 3",
"language": "python"
},
"language_info": {
"name": "python",
"version": "3.8.5",
"mimetype": "text/x-python",
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"pygments_lexer": "ipython3",
"nbconvert_exporter": "python",
"file_extension": ".py"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
},
"varInspector": {
"window_display": false,
"cols": {
"lenName": 16,
"lenType": 16,
"lenVar": 40
},
"kernels_config": {
"python": {
"library": "var_list.py",
"delete_cmd_prefix": "del ",
"delete_cmd_postfix": "",
"varRefreshCmd": "print(var_dic_list())"
},
"r": {
"library": "var_list.r",
"delete_cmd_prefix": "rm(",
"delete_cmd_postfix": ") ",
"varRefreshCmd": "cat(var_dic_list()) "
}
},
"types_to_exclude": [
"module",
"function",
"builtin_function_or_method",
"instance",
"_Feature"
]
},
"gist": {
"id": "f16d157de3323e19dee6ad9c05b841b7",
"data": {
"description": "Nightingale.ipynb",
"public": true
}
},
"_draft": {
"nbviewer_url": "https://gist.github.com/f16d157de3323e19dee6ad9c05b841b7"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
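
The "Normalize values" step and the label placement in the notebook above can be tried in isolation. What follows is a minimal sketch rather than the notebook's own code: the function names `cumulative_radii` and `label_position` are hypothetical, the three input rows are copied from the pivot table output, and the scaling uses the maximum of just those rows instead of all 24 months.

```python
import numpy as np
import pandas as pd


def cumulative_radii(rates: pd.DataFrame) -> pd.DataFrame:
    """Square each rate, accumulate across columns, scale to [0, 1], take the root."""
    squared = (rates ** 2).cumsum(axis=1)
    squared /= squared.max().max()
    return np.sqrt(squared)


def label_position(start_deg: float, span_deg: float, length: float):
    """Return the (x, y) point at distance `length` along the middle of a wedge."""
    mid = np.deg2rad(start_deg + span_deg / 2)
    return length * np.cos(mid), length * np.sin(mid)


# Three months copied from the pivot table, columns in the notebook's order
rates = pd.DataFrame(
    {
        "Wounds and Injuries": [0.0, 30.7, 12.8],
        "All Other Causes": [7.0, 120.0, 68.6],
        "Infectious Disease": [1.4, 1022.8, 480.3],
    },
    index=["April 1854", "January 1855", "March 1855"],
)

radii = cumulative_radii(rates)
print(radii.round(3))

# Label anchor for a wedge spanning 90 to 120 degrees
x, y = label_position(90, 30, 0.5)
print(round(x, 3), round(y, 3))
```

Because the squares are accumulated from left to right, each row is non-decreasing and the last column holds the combined value for the month. Note that the notebook applies `np.sqrt` once more to these values when it builds each `Wedge`.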