Last active
September 2, 2024 14:21
Bar Chart With a Broken Y Axis in Python Using Seaborn
{
"cells": [
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"# import seaborn, pyplot (for plotting), and pandas (to build the dataframe)\n",
"import seaborn as sns\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, we input our data as the following table: \n",
"\n",
"| | X1 | X2 | X3 |\n",
"| :------------- | :----------: | :----------: | -----------: |\n",
"| Test T1 | 36.08911234 | 35.44650908 | 13.28387507 |\n",
"| Test T2 | 334.0905209 | 332.8183322 | 114.2073644 |\n",
"| Test T3 | 125.7836401 | 331.9770472 | 132.2351763 |\n",
"\n",
"Then, we make a pandas dataframe from it."
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"  Experiment Setup         T1          T2          T3\n",
"0               X1  36.089112  334.090521  125.783640\n",
"1               X2  35.446509  332.818332  331.977047\n",
"2               X3  13.283875  114.207364  132.235176\n"
]
}
],
"source": [
"# input data (one list per test, one entry per experiment setup X1...X3)\n",
"data = {\"Experiment Setup\": [\"X1\", \"X2\", \"X3\"],\n",
"        \"T1\": [36.08911234, 35.44650908, 13.28387507],\n",
"        \"T2\": [334.0905209, 332.8183322, 114.2073644],\n",
"        \"T3\": [125.7836401, 331.9770472, 132.2351763]\n",
"        }\n",
"\n",
"# make a dataframe\n",
"data = pd.DataFrame(\n",
"    data, columns=[\"Experiment Setup\", \"T1\", \"T2\", \"T3\"])\n",
"\n",
"print(data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, melt the dataframe to make data easier to handle for seaborn.\n",
"It will look something like this: \n",
"\n",
"| Test | Experiment Setup | Latency in ms |\n",
"| :------------- | :----------: | -----------: |\n",
"| Test T1 | X1 | 36.08911234 |\n",
"| Test T1 | X2 | 35.44650908 |\n",
"| Test T1 | X3 | 13.28387507 |\n",
"| Test T2 | X1 | 334.0905209 |\n",
"| Test T2 | X2 | 332.8183322 |\n",
"| Test T2 | X3 | 114.2073644 |\n",
"| Test T3 | X1 | 125.7836401 |\n",
"| Test T3 | X2 | 331.9770472 |\n",
"| Test T3 | X3 | 132.2351763 |\n"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"  Experiment Setup Test  Latency in ms\n",
"0               X1   T1      36.089112\n",
"1               X2   T1      35.446509\n",
"2               X3   T1      13.283875\n",
"3               X1   T2     334.090521\n",
"4               X2   T2     332.818332\n",
"5               X3   T2     114.207364\n",
"6               X1   T3     125.783640\n",
"7               X2   T3     331.977047\n",
"8               X3   T3     132.235176\n"
]
}
],
"source": [
"# transform dataframe\n",
"\n",
"data_M = pd.melt(data, id_vars=\"Experiment Setup\", var_name=\"Test\",\n",
"                 value_name=\"Latency in ms\")\n",
"\n",
"print(data_M)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"# set style for seaborn plot\n",
"sns.set(style=\"whitegrid\", font=\"CMU Sans Serif\")\n",
"# create a color palette (we only need three different colors, one for each experiment setup X1...X3)\n",
"pal = sns.color_palette(n_colors=3)"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
"<Figure size 432x288 with 2 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# let's create a figure for our two plots to live in\n",
"# we need a lower part (anything below the cutoff), which will be ax2\n",
"# and an upper part (anything above the cutoff) which will be ax1\n",
"# because we have only two plots above each other, we set ncols=1 and nrows=2\n",
"# also, they should share an x axis, which is why we set sharex=True\n",
"f, (ax1, ax2) = plt.subplots(ncols=1, nrows=2,\n",
"                             sharex=True)\n",
"\n",
"# we want the \"Test\" to appear on the x axis as individual parameters\n",
"# \"Latency in ms\" should be what is shown on the y axis as a value\n",
"# hue should be the \"Experiment Setup\"\n",
"# this will result in three ticks on the x axis for T1...T3, each with three bars for X1...X3\n",
"# (you could turn this around if you need to, depending on what kind of data you want to show)\n",
"ax1 = sns.barplot(x=\"Test\", y=\"Latency in ms\",\n",
"                  hue=\"Experiment Setup\", data=data_M, palette=pal, ax=ax1)\n",
"\n",
"# we basically do the same thing again for the second plot\n",
"ax2 = sns.barplot(x=\"Test\", y=\"Latency in ms\",\n",
"                  hue=\"Experiment Setup\", data=data_M, palette=pal, ax=ax2)\n",
"\n",
"# here is the fun part: setting the limits for the individual y axes\n",
"# the upper part (ax1) should show only values from 250 to 400\n",
"# the lower part (ax2) should only show 0 to 150\n",
"# you can define your own limits, but the range (150) should be the same so the scale is the same across both plots\n",
"# a different range also works if the subplot heights are set proportional to the ranges,\n",
"# e.g. with gridspec_kw=dict(height_ratios=[...]) in plt.subplots\n",
"ax1.set_ylim(250, 400)\n",
"ax2.set_ylim(0, 150)\n",
"\n",
"# the upper part does not need its own x axis as it shares one with the lower part\n",
"ax1.get_xaxis().set_visible(False)\n",
"\n",
"# by default, each part will get its own \"Latency in ms\" label, but we want to set a common one for the whole figure\n",
"# first, remove the y label for both subplots\n",
"ax1.set_ylabel(\"\")\n",
"ax2.set_ylabel(\"\")\n",
"# then, set a new label on the plot (basically just a piece of text) and move it to where it makes sense (requires trial and error)\n",
"f.text(0.05, 0.55, \"Latency in ms\", va=\"center\", rotation=\"vertical\")\n",
"\n",
"# by default, seaborn also gives each subplot its own legend, which makes no sense at all\n",
"# so remove both default legends first\n",
"ax1.get_legend().remove()\n",
"ax2.get_legend().remove()\n",
"# then create a new legend and put it to the side of the figure (also requires trial and error)\n",
"ax2.legend(loc=(1.025, 0.5), title=\"Design\")\n",
"\n",
"# let's put some ticks on the top of the upper part and bottom of the lower part for style\n",
"ax1.xaxis.tick_top()\n",
"ax2.xaxis.tick_bottom()\n",
"\n",
"# finally, adjust everything a bit to make it prettier (this just moves everything, best to try and iterate)\n",
"f.subplots_adjust(left=0.15, right=0.85, bottom=0.15, top=0.85)\n",
"\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
"<Figure size 432x288 with 2 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# now for the fun part, this is just copied from https://matplotlib.org/examples/pylab_examples/broken_axis.html\n",
"# (most of this is, actually)\n",
"# here, we create these little diagonal lines that bring this chart to a whole new level:\n",
"\n",
"# This looks pretty good, and was fairly painless, but you can get that\n",
"# cut-out diagonal lines look with just a bit more work. The important\n",
"# thing to know here is that in axes coordinates, which are always\n",
"# between 0-1, spine endpoints are at these locations (0,0), (0,1),\n",
"# (1,0), and (1,1). Thus, we just need to put the diagonals in the\n",
"# appropriate corners of each of our axes, using the right transform\n",
"# and disabling clipping.\n",
"\n",
"d = .01  # how big to make the diagonal lines in axes coordinates\n",
"# arguments to pass to plot, just so we don't keep repeating them\n",
"kwargs = dict(transform=ax1.transAxes, color=\"k\", clip_on=False)\n",
"ax1.plot((-d, +d), (-d, +d), **kwargs)  # top-left diagonal\n",
"ax1.plot((1 - d, 1 + d), (-d, +d), **kwargs)  # top-right diagonal\n",
"\n",
"kwargs.update(transform=ax2.transAxes)  # switch to the bottom axes\n",
"ax2.plot((-d, +d), (1 - d, 1 + d), **kwargs)  # bottom-left diagonal\n",
"ax2.plot((1 - d, 1 + d), (1 - d, 1 + d), **kwargs)  # bottom-right diagonal\n",
"\n",
"# display new plot again\n",
"# https://stackoverflow.com/questions/50452455/plt-show-does-nothing-when-used-for-the-second-time\n",
"from IPython.display import display\n",
"display(f)  # Shows plot again"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.10"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
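The plotting cell keeps both y ranges at 150 so that the scale is the same in the upper and lower part. A minimal sketch of one alternative, assuming you want unequal ranges: make each subplot's height proportional to its y range via the `gridspec_kw` argument of `plt.subplots`. The helper names `height_ratios` and `make_broken_axes` and the limits `(0, 300)` are hypothetical and not part of the notebook.

```python
def height_ratios(upper_limits, lower_limits):
    """Return subplot height ratios proportional to the two y ranges.

    With these ratios, one data unit takes up the same height in both
    parts, so bar lengths stay comparable across the break.
    """
    upper_range = upper_limits[1] - upper_limits[0]
    lower_range = lower_limits[1] - lower_limits[0]
    if upper_range <= 0 or lower_range <= 0:
        raise ValueError("each limit pair must be (low, high) with low < high")
    return [upper_range, lower_range]


def make_broken_axes(upper_limits, lower_limits):
    """Create two stacked axes whose heights match their y ranges."""
    # imported here so that height_ratios() can be used without matplotlib
    import matplotlib.pyplot as plt

    f, (ax1, ax2) = plt.subplots(
        ncols=1, nrows=2, sharex=True,
        gridspec_kw=dict(height_ratios=height_ratios(upper_limits, lower_limits)),
    )
    ax1.set_ylim(*upper_limits)
    ax2.set_ylim(*lower_limits)
    return f, ax1, ax2


# hypothetical limits: the upper part as in the notebook, a taller lower part
print(height_ratios((250, 400), (0, 300)))  # [150, 300]
```

Under these assumptions, `f, ax1, ax2 = make_broken_axes((250, 400), (0, 300))` would stand in for the `plt.subplots` call and the two `set_ylim` calls, with the `sns.barplot` calls unchanged. Note that the diagonal markers are drawn in axes coordinates, so with unequal subplot heights their slopes will differ slightly between the two parts.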
I wonder whether the break-like look between the two graphs comes from the subplot definition "f, (ax1, ax2) = plt.subplots(ncols=1, nrows=2, sharex=True)". In particular, the common y-axis label and the absence of an x-axis title on the upper graph create the break-like effect.
I hope this insight is helpful.
@rafmora Yes, I think you're exactly right. The "broken" look is exactly what we wanted to achieve, so that we can present the data without cutting off outliers or using a logarithmic scale, both of which can be misleading.
Officially the most savage plot I've seen today