@bmtgoncalves
Last active August 29, 2021 18:51
Candlestick Chart.ipynb
{
"cells": [
{
"metadata": {},
"cell_type": "markdown",
"source": "<div style=\"width: 100%; overflow: hidden;\">\n <div style=\"width: 150px; float: left;\"> <img src=\"https://raw.githubusercontent.com/DataForScience/Graphs4Sci/master/data/D4Sci_logo_ball.png\" alt=\"Data For Science, Inc\" align=\"left\" border=\"0\" width=150px> </div>\n <div style=\"float: left; margin-left: 10px;\"><h1>Visualization for Science</h1>\n<h1>Candlestick chart</h1>\n <a href=\"http://www.data4sci.com/\">www.data4sci.com</a><br/>\n @bgoncalves, @data4sci</p></div>\n</div>"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "from collections import Counter\nfrom pprint import pprint\n\nimport pandas as pd\nimport numpy as np\n\nimport matplotlib\nimport matplotlib.pyplot as plt \nimport yfinance as yf\n\nimport tqdm as tq\nfrom tqdm import tqdm\n\nimport watermark\n\n%load_ext watermark\n%matplotlib inline",
"execution_count": 1,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "We start by print out the versions of the libraries we're using for future reference"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "%watermark -n -v -m -g -iv",
"execution_count": 2,
"outputs": [
{
"output_type": "stream",
"text": "Python implementation: CPython\nPython version : 3.8.5\nIPython version : 7.19.0\n\nCompiler : Clang 10.0.0 \nOS : Darwin\nRelease : 20.6.0\nMachine : x86_64\nProcessor : i386\nCPU cores : 16\nArchitecture: 64bit\n\nGit hash: 2ca189e6e9c9af68fc07169751f33298eef96492\n\njson : 2.0.9\npandas : 1.1.3\nyfinance : 0.1.63\ntqdm : 4.62.0\nwatermark : 2.1.0\nmatplotlib: 3.3.2\nnumpy : 1.19.2\n\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Set the default colors we'll be using"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "colors = np.array(['#70bf41', '#f9517b'])",
"execution_count": 3,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "# DJIA data"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "We start by downling the Dow Jones Industrial Average data using the Yahoo! Finance API"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "data = yf.download(\"DJI\", start=\"2020-01-01\", end=\"2021-07-31\", interval='1mo')",
"execution_count": 4,
"outputs": [
{
"output_type": "stream",
"text": "[*********************100%***********************] 1 of 1 completed\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "We collected data from Jan 1st, 2020 until Jul 31st, 2021 at one month resolution, so we get the values indexed by the first day of the month"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "data.head()",
"execution_count": 5,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 5,
"data": {
"text/plain": " Open High Low Close \\\nDate \n2020-01-01 28634.900391 29348.099609 28256.000000 28256.000000 \n2020-02-01 28399.800781 29551.400391 25409.400391 27960.800781 \n2020-03-01 26703.300781 27090.900391 18591.900391 21917.199219 \n2020-04-01 20943.500000 24633.900391 20943.500000 24345.699219 \n2020-05-01 23723.699219 25548.300781 23248.000000 25383.099609 \n\n Adj Close Volume \nDate \n2020-01-01 28256.000000 0 \n2020-02-01 27960.800781 0 \n2020-03-01 21917.199219 0 \n2020-04-01 24345.699219 0 \n2020-05-01 25383.099609 0 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Open</th>\n <th>High</th>\n <th>Low</th>\n <th>Close</th>\n <th>Adj Close</th>\n <th>Volume</th>\n </tr>\n <tr>\n <th>Date</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>2020-01-01</th>\n <td>28634.900391</td>\n <td>29348.099609</td>\n <td>28256.000000</td>\n <td>28256.000000</td>\n <td>28256.000000</td>\n <td>0</td>\n </tr>\n <tr>\n <th>2020-02-01</th>\n <td>28399.800781</td>\n <td>29551.400391</td>\n <td>25409.400391</td>\n <td>27960.800781</td>\n <td>27960.800781</td>\n <td>0</td>\n </tr>\n <tr>\n <th>2020-03-01</th>\n <td>26703.300781</td>\n <td>27090.900391</td>\n <td>18591.900391</td>\n <td>21917.199219</td>\n <td>21917.199219</td>\n <td>0</td>\n </tr>\n <tr>\n <th>2020-04-01</th>\n <td>20943.500000</td>\n <td>24633.900391</td>\n <td>20943.500000</td>\n <td>24345.699219</td>\n <td>24345.699219</td>\n <td>0</td>\n </tr>\n <tr>\n <th>2020-05-01</th>\n <td>23723.699219</td>\n <td>25548.300781</td>\n <td>23248.000000</td>\n <td>25383.099609</td>\n <td>25383.099609</td>\n <td>0</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "The height of each bar will be given by the difference between the Open and Close values so we compute the difference"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "data['height'] = data['Open']-data['Close']",
"execution_count": 6,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "And because we want to color the bars by wether the price went up or down, we create another column with the respective color"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "data['colors'] = colors[(data['Open'] > data['Close']).astype('int')] ",
"execution_count": 7,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "data.head()",
"execution_count": 8,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 8,
"data": {
"text/plain": " Open High Low Close \\\nDate \n2020-01-01 28634.900391 29348.099609 28256.000000 28256.000000 \n2020-02-01 28399.800781 29551.400391 25409.400391 27960.800781 \n2020-03-01 26703.300781 27090.900391 18591.900391 21917.199219 \n2020-04-01 20943.500000 24633.900391 20943.500000 24345.699219 \n2020-05-01 23723.699219 25548.300781 23248.000000 25383.099609 \n\n Adj Close Volume height colors \nDate \n2020-01-01 28256.000000 0 378.900391 #f9517b \n2020-02-01 27960.800781 0 439.000000 #f9517b \n2020-03-01 21917.199219 0 4786.101562 #f9517b \n2020-04-01 24345.699219 0 -3402.199219 #70bf41 \n2020-05-01 25383.099609 0 -1659.400391 #70bf41 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Open</th>\n <th>High</th>\n <th>Low</th>\n <th>Close</th>\n <th>Adj Close</th>\n <th>Volume</th>\n <th>height</th>\n <th>colors</th>\n </tr>\n <tr>\n <th>Date</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>2020-01-01</th>\n <td>28634.900391</td>\n <td>29348.099609</td>\n <td>28256.000000</td>\n <td>28256.000000</td>\n <td>28256.000000</td>\n <td>0</td>\n <td>378.900391</td>\n <td>#f9517b</td>\n </tr>\n <tr>\n <th>2020-02-01</th>\n <td>28399.800781</td>\n <td>29551.400391</td>\n <td>25409.400391</td>\n <td>27960.800781</td>\n <td>27960.800781</td>\n <td>0</td>\n <td>439.000000</td>\n <td>#f9517b</td>\n </tr>\n <tr>\n <th>2020-03-01</th>\n <td>26703.300781</td>\n <td>27090.900391</td>\n <td>18591.900391</td>\n <td>21917.199219</td>\n <td>21917.199219</td>\n <td>0</td>\n <td>4786.101562</td>\n <td>#f9517b</td>\n </tr>\n <tr>\n <th>2020-04-01</th>\n <td>20943.500000</td>\n <td>24633.900391</td>\n <td>20943.500000</td>\n <td>24345.699219</td>\n <td>24345.699219</td>\n <td>0</td>\n <td>-3402.199219</td>\n <td>#70bf41</td>\n </tr>\n <tr>\n <th>2020-05-01</th>\n <td>23723.699219</td>\n <td>25548.300781</td>\n <td>23248.000000</td>\n <td>25383.099609</td>\n <td>25383.099609</td>\n <td>0</td>\n <td>-1659.400391</td>\n <td>#70bf41</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Let's see what our plot looks like so far"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "data.plot(kind='bar', # Bar plot\n y='height', # Use the Open/Close difference as the height\n bottom=data['Close'], # Start plotting the bar at the Close value\n color=data['colors'], # Color each bar by whether the value went up or down\n legend=None) # Remove the legend",
"execution_count": 9,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 9,
"data": {
"text/plain": "<AxesSubplot:xlabel='Date'>"
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x288 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Not a bad start, but there's still plenty of room for improvement"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Since we'll be using a bar plot, we have to manually specify the axis labels to use as the bar plot doesn't play nicely with datetime objects"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "data.index = (day.strftime('%b')[0] \n if day.strftime('%b') != 'Jan' \n else day.strftime('%b\\n%Y') \n for day in data.index)",
"execution_count": 10,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "ax = data.plot(kind='bar', y='height', bottom=data['Close'], \n color=data['colors'], legend=None)\n\n# Set the ticklabels, to undo the default rotation\nax.set_xticklabels(data.index, rotation=0); ",
"execution_count": 11,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x288 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Finally, we add the vertical lines representing the range of values from the heighest to the lowest"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "ax = data.plot(kind='bar', y='height', bottom=data['Close'], \n color=data['colors'], legend=None)\n\n# Add the vertical lines from High to Low\nax.plot([np.arange(data.shape[0]), # x-values\n np.arange(data.shape[0])], \n data[['High', 'Low']].T.values, # y-values\n color='darkgray', # Color\n zorder=-2, #Plot below the current bars\n lw=2) # Make the lines thicker\n\n# Set the ticklabels, to undo the default rotation\nax.set_xticklabels(data.index, rotation=0); ",
"execution_count": 12,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x288 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "And increase figure and fontsizes to make the figure more legible and remove the upper and right side spines to make it sexier"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "fig, ax = plt.subplots(1, figsize=(14, 10))\ndata.plot(kind='bar', y='height', bottom=data['Close'], \n color=data['colors'], legend=None, ax=ax)\n\n# Add the vertical lines from High to Low\nax.plot([np.arange(data.shape[0]), # x-values\n np.arange(data.shape[0])], \n data[['High', 'Low']].T.values, # y-values\n color='darkgray', # Color\n zorder=-2, #Plot below the current bars\n lw=2) # Make the lines thicker\n\n# Set the correct labels\nax.set_ylabel('DJIA', fontsize=24.0)\n\n# Increase tick mark font size\nax.tick_params(axis='x', labelsize=20)\nax.tick_params(axis='y', labelsize=20)\n\n# Set the ticklabels, to undo the default rotation\nax.set_xticklabels(data.index, rotation=0)\n\n# Add commas to the y-labels\nax.get_yaxis().set_major_formatter(\n matplotlib.ticker.FuncFormatter(lambda x, p: format(int(x), ',')))\n\n# Hide the right and top spines\nax.spines['right'].set_visible(False)\nax.spines['top'].set_visible(False)",
"execution_count": 13,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 1008x720 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "<div style=\"width: 100%; overflow: hidden;\">\n <img src=\"https://raw.githubusercontent.com/DataForScience/Graphs4Sci/master/data/D4Sci_logo_full.png\" alt=\"Data For Science, Inc\" align=\"center\" border=\"0\" width=300px> \n</div>"
}
],
"metadata": {
"kernelspec": {
"name": "python3",
"display_name": "Python 3",
"language": "python"
},
"varInspector": {
"window_display": false,
"cols": {
"lenName": 16,
"lenType": 16,
"lenVar": 40
},
"kernels_config": {
"python": {
"library": "var_list.py",
"delete_cmd_prefix": "del ",
"delete_cmd_postfix": "",
"varRefreshCmd": "print(var_dic_list())"
},
"r": {
"library": "var_list.r",
"delete_cmd_prefix": "rm(",
"delete_cmd_postfix": ") ",
"varRefreshCmd": "cat(var_dic_list()) "
}
},
"types_to_exclude": [
"module",
"function",
"builtin_function_or_method",
"instance",
"_Feature"
]
},
"language_info": {
"name": "python",
"version": "3.8.5",
"mimetype": "text/x-python",
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"pygments_lexer": "ipython3",
"nbconvert_exporter": "python",
"file_extension": ".py"
},
"gist": {
"id": "f132f4030ee38963ebd4e7f8dc11587e",
"data": {
"description": "Candlestick Chart.ipynb",
"public": true
}
},
"_draft": {
"nbviewer_url": "https://gist.github.com/f132f4030ee38963ebd4e7f8dc11587e"
}
},
"nbformat": 4,
"nbformat_minor": 4
}