Created
August 15, 2014 14:04
Informal, scraper based pandas API for UN data website
{ | |
"metadata": { | |
"name": "", | |
"signature": "sha256:0178a777e1fc03ee651a06ff67307768e5efadfa1d9869714cba9db19201e2af" | |
}, | |
"nbformat": 3, | |
"nbformat_minor": 0, | |
"worksheets": [ | |
{ | |
"cells": [ | |
{ | |
"cell_type": "heading", | |
"level": 1, | |
"metadata": {}, | |
"source": [ | |
"UNdata Informal API" | |
] | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The UNdata website offers an [official API](http://data.un.org/Host.aspx?Content=API) but it doesn't look overly welcoming to someone not versed in the XML protocol it supports. So here's a hacked solution based on scraping a web search that lets you search the site for datasets, and then download the one you want as a zipped CSV file that gets automatically parsed into a *pandas* dataframe.\n",
"\n",
"The UN data search form lets you download data directly from the results page:\n",
"\n",
"\n",
"\n",
"So let's write a simple scraper to grab the results and see if we can download a selected data file automatically..."
] | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we *View Source* on the results page we can look for the individual results items - and see what we need to parse out.\n",
"\n", | |
"" | |
] | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We also need to have a look at what the HTTP request for a data download looks like, to make sure we get what we need when we do scrape the results...\n",
"\n", | |
"" | |
] | |
}, | |
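{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough sketch of the parsing step, the next cell pulls the datamart ID and the data filter out of a link of the kind the search results appear to use. The example `href` is an assumption made up for illustration (it is not copied from the site), and `example_href` and `example_params` are hypothetical names."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#Sketch only: example_href is an assumed link format, not one copied from the UNdata site\n",
"from urllib.parse import parse_qs, urlparse\n",
"\n",
"example_href='Data.aspx?q=carbon+dioxide&d=MDG&f=seriesRowID%3a752'\n",
"\n",
"#Split off the query string, then parse it into a dict of lists (percent-encoding is decoded)\n",
"example_params=parse_qs(urlparse(example_href).query)\n",
"\n",
"#d is the datamart ID and f is the data filter\n",
"print(example_params['d'][0], example_params['f'][0])"
],
"language": "python",
"metadata": {},
"outputs": []
},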
{
"cell_type": "code",
"collapsed": false,
"input": [
"#Load in some libraries to handle the web page requests and the web page parsing...\n",
"import requests\n",
"from bs4 import BeautifulSoup\n",
"\n",
"#Note - I'm in Python3\n",
"from urllib.parse import parse_qs, urlparse\n",
"\n",
"#The scraper will be limited to just the first results page...\n",
"def searchUNdata(q):\n",
"    ''' Run a search on the UN data website and scrape the results '''\n",
"\n",
"    params={'q':q}\n",
"    url='http://data.un.org/Search.aspx'\n",
"\n",
"    response = requests.get(url,params=params)\n",
"\n",
"    #Name the parser explicitly so the result does not depend on which parsers are installed\n",
"    soup=BeautifulSoup(response.content,'html.parser')\n",
"\n",
"    results={}\n",
"\n",
"    #Get the list of results\n",
"    searchresults=soup.findAll('div',{'class':'Result'})\n",
"\n",
"    #For each result, parse out the name of the dataset, the datamart ID and the data filter ID\n",
"    for result in searchresults:\n",
"        h2=result.find('h2')\n",
"        #We can find everything we need in the <a> tag...\n",
"        a=h2.find('a')\n",
"        #Parse just the query string part of the link, so the keys are d and f whatever their order\n",
"        p=parse_qs(urlparse(a.attrs['href']).query)\n",
"        results[a.text]=(p['d'][0],p['f'][0])\n",
"\n",
"    return results"
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 94 | |
}, | |
{
"cell_type": "code",
"collapsed": false,
"input": [
"#A couple of helper functions to let us display the results\n",
"\n",
"results=searchUNdata('carbon dioxide')\n",
"\n",
"def printResults(results):\n",
"    ''' Nicely print the search results '''\n",
"\n",
"    for result in results.keys():\n",
"        print(result)\n",
"\n",
"\n",
"def unDataSearch(q):\n",
"    ''' Simple function to take a search phrase, run the search on the UN data site, and print and return the results. '''\n",
"\n",
"    results=searchUNdata(q)\n",
"    printResults(results)\n",
"    return results\n",
"\n",
"printResults(results)\n",
"\n",
"#q='carbon dioxide'\n",
"#unDataSearch(q)"
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"Carbon dioxide emissions (CO2), thousand metric tons of CO2 (CDIAC)\n", | |
"Carbon dioxide (CO2) Emissions without Land Use, Land-Use Change and Forestry (LULUCF), in Gigagrams (Gg)\n", | |
"Carbon dioxide emissions (CO2), kg CO2 per $1 GDP (PPP) (CDIAC)\n", | |
"Carbon dioxide emissions (CO2), metric tons of CO2 per capita (UNFCCC)\n", | |
"Trade of goods , US$, HS 1992, 28 Inorganic chemicals, precious metal compound, isotope\n", | |
"Carbon dioxide emissions (CO2), kg CO2 per $1 GDP (PPP) (UNFCCC)\n", | |
"Carbon dioxide emissions (CO2), thousand metric tons of CO2 (UNFCCC)\n", | |
"Carbon dioxide emissions (CO2), metric tons of CO2 per capita (CDIAC)\n" | |
] | |
} | |
], | |
"prompt_number": 98 | |
}, | |
{
"cell_type": "code",
"collapsed": false,
"input": [
"#Just in case - a helper routine for working with the search results data\n",
"def search(d, substr):\n",
"    ''' Partial string match search within dict key names '''\n",
"    #via http://stackoverflow.com/a/10796050/454773\n",
"\n",
"    result = []\n",
"    for key in d:\n",
"        if substr.lower() in key.lower():\n",
"            result.append((key, d[key]))\n",
"\n",
"    return result"
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 67 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"search(results, 'per capita')" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 68, | |
"text": [ | |
"[('Carbon dioxide emissions (CO2), metric tons of CO2 per capita (UNFCCC)',\n", | |
" ('MDG', 'seriesRowID:752')),\n", | |
" ('Carbon dioxide emissions (CO2), metric tons of CO2 per capita (CDIAC)',\n", | |
" ('MDG', 'seriesRowID:751'))]" | |
] | |
} | |
], | |
"prompt_number": 68 | |
}, | |
{
"cell_type": "code",
"collapsed": false,
"input": [
"#Note - I'm in Python3\n",
"from io import BytesIO\n",
"\n",
"import zipfile\n",
"import pandas as pd\n",
"\n",
"def getUNdata(undataSearchResults,dataset):\n",
"    ''' Download a named dataset from the UN Data website and load it into a pandas dataframe '''\n",
"\n",
"    datamartID,seriesRowID=undataSearchResults[dataset]\n",
"\n",
"    url='http://data.un.org/Handlers/DownloadHandler.ashx?DataFilter='+seriesRowID+'&DataMartId='+datamartID+'&Format=csv'\n",
"\n",
"    r = requests.get(url)\n",
"    #Fail early on an HTTP error rather than trying to unzip an error page\n",
"    r.raise_for_status()\n",
"\n",
"    s=BytesIO(r.content)\n",
"    z = zipfile.ZipFile(s)\n",
"\n",
"    #Show the files in the zip file\n",
"    #z.namelist()\n",
"\n",
"    #Let's assume we just get one file per zip...\n",
"    #Drop any all blank columns\n",
"    df=pd.read_csv( BytesIO( z.read( z.namelist()[0] ) )).dropna(axis=1,how='all')\n",
"    return df"
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 78 | |
}, | |
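{
"cell_type": "markdown",
"metadata": {},
"source": [
"A possible variation, shown only as a sketch: the download URL can be assembled from a dict of parameters rather than by string concatenation, which leaves the escaping to the standard library. `buildDownloadUrl` is a hypothetical helper, not part of the original code, and it assumes the download handler accepts percent-encoded parameter values."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#Sketch only: buildDownloadUrl is a hypothetical helper\n",
"from urllib.parse import urlencode\n",
"\n",
"def buildDownloadUrl(datamartID,seriesRowID):\n",
"    ''' Build the URL for a zipped CSV download from a datamart ID and a data filter '''\n",
"    params={'DataFilter':seriesRowID,'DataMartId':datamartID,'Format':'csv'}\n",
"    #urlencode escapes reserved characters, so the : in the filter becomes %3A\n",
"    return 'http://data.un.org/Handlers/DownloadHandler.ashx?'+urlencode(params)\n",
"\n",
"print(buildDownloadUrl('MDG','seriesRowID:752'))"
],
"language": "python",
"metadata": {},
"outputs": []
},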
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"results=unDataSearch('carbon dioxide')" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"Carbon dioxide emissions (CO2), thousand metric tons of CO2 (CDIAC)\n", | |
"Carbon dioxide (CO2) Emissions without Land Use, Land-Use Change and Forestry (LULUCF), in Gigagrams (Gg)\n", | |
"Carbon dioxide emissions (CO2), kg CO2 per $1 GDP (PPP) (CDIAC)\n", | |
"Carbon dioxide emissions (CO2), metric tons of CO2 per capita (UNFCCC)\n", | |
"Trade of goods , US$, HS 1992, 28 Inorganic chemicals, precious metal compound, isotope\n", | |
"Carbon dioxide emissions (CO2), kg CO2 per $1 GDP (PPP) (UNFCCC)\n", | |
"Carbon dioxide emissions (CO2), thousand metric tons of CO2 (UNFCCC)\n", | |
"Carbon dioxide emissions (CO2), metric tons of CO2 per capita (CDIAC)\n" | |
] | |
} | |
], | |
"prompt_number": 100 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"dd=getUNdata(results,'Carbon dioxide emissions (CO2), metric tons of CO2 per capita (UNFCCC)')\n", | |
"dd" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"html": [ | |
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Country or Area</th>\n", | |
" <th>Year</th>\n", | |
" <th>Value</th>\n", | |
" <th>Value Footnotes</th>\n", | |
" <th>Value Footnotes.1</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2010</td>\n", | |
" <td> 18.042955</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2009</td>\n", | |
" <td> 18.394162</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2008</td>\n", | |
" <td> 18.680381</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2007</td>\n", | |
" <td> 18.700552</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2006</td>\n", | |
" <td> 18.660320</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>5 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2005</td>\n", | |
" <td> 18.741587</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>6 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2004</td>\n", | |
" <td> 18.887782</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>7 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2003</td>\n", | |
" <td> 18.833971</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>8 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2002</td>\n", | |
" <td> 18.382553</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>9 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2001</td>\n", | |
" <td> 18.369852</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>10 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2000</td>\n", | |
" <td> 18.249353</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>11 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1999</td>\n", | |
" <td> 18.110541</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>12 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1998</td>\n", | |
" <td> 17.790195</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>13 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1997</td>\n", | |
" <td> 17.276190</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>14 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1996</td>\n", | |
" <td> 17.017798</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>15 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1995</td>\n", | |
" <td> 16.791157</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>16 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1994</td>\n", | |
" <td> 16.382401</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>17 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1993</td>\n", | |
" <td> 16.292707</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>18 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1992</td>\n", | |
" <td> 16.247799</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>19 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1991</td>\n", | |
" <td> 16.145272</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>20 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1990</td>\n", | |
" <td> 16.274010</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>21 </th>\n", | |
" <td> Austria</td>\n", | |
" <td> 2010</td>\n", | |
" <td> 8.612525</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>22 </th>\n", | |
" <td> Austria</td>\n", | |
" <td> 2009</td>\n", | |
" <td> 8.032090</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>23 </th>\n", | |
" <td> Austria</td>\n", | |
" <td> 2008</td>\n", | |
" <td> 8.861574</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>24 </th>\n", | |
" <td> Austria</td>\n", | |
" <td> 2007</td>\n", | |
" <td> 8.948816</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>25 </th>\n", | |
" <td> Austria</td>\n", | |
" <td> 2006</td>\n", | |
" <td> 9.311083</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>26 </th>\n", | |
" <td> Austria</td>\n", | |
" <td> 2005</td>\n", | |
" <td> 9.684401</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>27 </th>\n", | |
" <td> Austria</td>\n", | |
" <td> 2004</td>\n", | |
" <td> 9.555359</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>28 </th>\n", | |
" <td> Austria</td>\n", | |
" <td> 2003</td>\n", | |
" <td> 9.559380</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>29 </th>\n", | |
" <td> Austria</td>\n", | |
" <td> 2002</td>\n", | |
" <td> 8.872889</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>...</th>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>834</th>\n", | |
" <td> United Kingdom</td>\n", | |
" <td> 1995</td>\n", | |
" <td> 9.525717</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>835</th>\n", | |
" <td> United Kingdom</td>\n", | |
" <td> 1994</td>\n", | |
" <td> 9.695512</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>836</th>\n", | |
" <td> United Kingdom</td>\n", | |
" <td> 1993</td>\n", | |
" <td> 9.824330</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>837</th>\n", | |
" <td> United Kingdom</td>\n", | |
" <td> 1992</td>\n", | |
" <td> 10.087042</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>838</th>\n", | |
" <td> United Kingdom</td>\n", | |
" <td> 1991</td>\n", | |
" <td> 10.399337</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>839</th>\n", | |
" <td> United Kingdom</td>\n", | |
" <td> 1990</td>\n", | |
" <td> 10.301924</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>840</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2010</td>\n", | |
" <td> 18.115315</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>841</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2009</td>\n", | |
" <td> 17.613975</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>842</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2008</td>\n", | |
" <td> 19.134200</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>843</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2007</td>\n", | |
" <td> 19.938306</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>844</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2006</td>\n", | |
" <td> 19.790435</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>845</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2005</td>\n", | |
" <td> 20.264177</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>846</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2004</td>\n", | |
" <td> 20.319687</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>847</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2003</td>\n", | |
" <td> 20.138513</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>848</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2002</td>\n", | |
" <td> 20.141864</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>849</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2001</td>\n", | |
" <td> 20.227848</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>850</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2000</td>\n", | |
" <td> 20.813767</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>851</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1999</td>\n", | |
" <td> 20.421978</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>852</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1998</td>\n", | |
" <td> 20.371008</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>853</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1997</td>\n", | |
" <td> 20.500903</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>854</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1996</td>\n", | |
" <td> 20.458236</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>855</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1995</td>\n", | |
" <td> 20.023986</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>856</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1994</td>\n", | |
" <td> 19.986553</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>857</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1993</td>\n", | |
" <td> 19.879693</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>858</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1992</td>\n", | |
" <td> 19.647320</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>859</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1991</td>\n", | |
" <td> 19.437409</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>860</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1990</td>\n", | |
" <td> 19.801924</td>\n", | |
" <td> 1</td>\n", | |
" <td> 1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>861</th>\n", | |
" <td> NaN</td>\n", | |
" <td> NaN</td>\n", | |
" <td> NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>862</th>\n", | |
" <td> footnoteSeqID</td>\n", | |
" <td> Footnote</td>\n", | |
" <td> NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>863</th>\n", | |
" <td> 1</td>\n", | |
" <td> For Denmark, France, United Kingdom and United...</td>\n", | |
" <td> NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"<p>864 rows \u00d7 5 columns</p>\n", | |
"</div>" | |
], | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 81, | |
"text": [ | |
" Country or Area Year \\\n", | |
"0 Australia 2010 \n", | |
"1 Australia 2009 \n", | |
"2 Australia 2008 \n", | |
"3 Australia 2007 \n", | |
"4 Australia 2006 \n", | |
"5 Australia 2005 \n", | |
"6 Australia 2004 \n", | |
"7 Australia 2003 \n", | |
"8 Australia 2002 \n", | |
"9 Australia 2001 \n", | |
"10 Australia 2000 \n", | |
"11 Australia 1999 \n", | |
"12 Australia 1998 \n", | |
"13 Australia 1997 \n", | |
"14 Australia 1996 \n", | |
"15 Australia 1995 \n", | |
"16 Australia 1994 \n", | |
"17 Australia 1993 \n", | |
"18 Australia 1992 \n", | |
"19 Australia 1991 \n", | |
"20 Australia 1990 \n", | |
"21 Austria 2010 \n", | |
"22 Austria 2009 \n", | |
"23 Austria 2008 \n", | |
"24 Austria 2007 \n", | |
"25 Austria 2006 \n", | |
"26 Austria 2005 \n", | |
"27 Austria 2004 \n", | |
"28 Austria 2003 \n", | |
"29 Austria 2002 \n", | |
".. ... ... \n", | |
"834 United Kingdom 1995 \n", | |
"835 United Kingdom 1994 \n", | |
"836 United Kingdom 1993 \n", | |
"837 United Kingdom 1992 \n", | |
"838 United Kingdom 1991 \n", | |
"839 United Kingdom 1990 \n", | |
"840 United States 2010 \n", | |
"841 United States 2009 \n", | |
"842 United States 2008 \n", | |
"843 United States 2007 \n", | |
"844 United States 2006 \n", | |
"845 United States 2005 \n", | |
"846 United States 2004 \n", | |
"847 United States 2003 \n", | |
"848 United States 2002 \n", | |
"849 United States 2001 \n", | |
"850 United States 2000 \n", | |
"851 United States 1999 \n", | |
"852 United States 1998 \n", | |
"853 United States 1997 \n", | |
"854 United States 1996 \n", | |
"855 United States 1995 \n", | |
"856 United States 1994 \n", | |
"857 United States 1993 \n", | |
"858 United States 1992 \n", | |
"859 United States 1991 \n", | |
"860 United States 1990 \n", | |
"861 NaN NaN \n", | |
"862 footnoteSeqID Footnote \n", | |
"863 1 For Denmark, France, United Kingdom and United... \n", | |
"\n", | |
" Value Value Footnotes Value Footnotes.1 \n", | |
"0 18.042955 NaN NaN \n", | |
"1 18.394162 NaN NaN \n", | |
"2 18.680381 NaN NaN \n", | |
"3 18.700552 NaN NaN \n", | |
"4 18.660320 NaN NaN \n", | |
"5 18.741587 NaN NaN \n", | |
"6 18.887782 NaN NaN \n", | |
"7 18.833971 NaN NaN \n", | |
"8 18.382553 NaN NaN \n", | |
"9 18.369852 NaN NaN \n", | |
"10 18.249353 NaN NaN \n", | |
"11 18.110541 NaN NaN \n", | |
"12 17.790195 NaN NaN \n", | |
"13 17.276190 NaN NaN \n", | |
"14 17.017798 NaN NaN \n", | |
"15 16.791157 NaN NaN \n", | |
"16 16.382401 NaN NaN \n", | |
"17 16.292707 NaN NaN \n", | |
"18 16.247799 NaN NaN \n", | |
"19 16.145272 NaN NaN \n", | |
"20 16.274010 NaN NaN \n", | |
"21 8.612525 NaN NaN \n", | |
"22 8.032090 NaN NaN \n", | |
"23 8.861574 NaN NaN \n", | |
"24 8.948816 NaN NaN \n", | |
"25 9.311083 NaN NaN \n", | |
"26 9.684401 NaN NaN \n", | |
"27 9.555359 NaN NaN \n", | |
"28 9.559380 NaN NaN \n", | |
"29 8.872889 NaN NaN \n", | |
".. ... ... ... \n", | |
"834 9.525717 1 1 \n", | |
"835 9.695512 1 1 \n", | |
"836 9.824330 1 1 \n", | |
"837 10.087042 1 1 \n", | |
"838 10.399337 1 1 \n", | |
"839 10.301924 1 1 \n", | |
"840 18.115315 1 1 \n", | |
"841 17.613975 1 1 \n", | |
"842 19.134200 1 1 \n", | |
"843 19.938306 1 1 \n", | |
"844 19.790435 1 1 \n", | |
"845 20.264177 1 1 \n", | |
"846 20.319687 1 1 \n", | |
"847 20.138513 1 1 \n", | |
"848 20.141864 1 1 \n", | |
"849 20.227848 1 1 \n", | |
"850 20.813767 1 1 \n", | |
"851 20.421978 1 1 \n", | |
"852 20.371008 1 1 \n", | |
"853 20.500903 1 1 \n", | |
"854 20.458236 1 1 \n", | |
"855 20.023986 1 1 \n", | |
"856 19.986553 1 1 \n", | |
"857 19.879693 1 1 \n", | |
"858 19.647320 1 1 \n", | |
"859 19.437409 1 1 \n", | |
"860 19.801924 1 1 \n", | |
"861 NaN NaN NaN \n", | |
"862 NaN NaN NaN \n", | |
"863 NaN NaN NaN \n", | |
"\n", | |
"[864 rows x 5 columns]" | |
] | |
} | |
], | |
"prompt_number": 81 | |
}, | |
{
"cell_type": "code",
"collapsed": false,
"input": [
"#One thing to note is that footnotes may appear at the bottom of a dataframe\n",
"#We can spot the first all-empty row and drop everything from that row onwards\n",
"#We can also drop the footnote related columns\n",
"def dropFootnotes(df):\n",
"    #Use the dataframe passed in (not the global dd) to find the first all-empty row\n",
"    firstEmptyRow=pd.isnull(df).all(axis=1).values.nonzero()[0][0]\n",
"    return df[:firstEmptyRow].drop(['Value Footnotes','Value Footnotes.1'], axis=1)\n",
"\n",
"dropFootnotes(dd)"
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"html": [ | |
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Country or Area</th>\n", | |
" <th>Year</th>\n", | |
" <th>Value</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2010</td>\n", | |
" <td> 18.042955</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2009</td>\n", | |
" <td> 18.394162</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2008</td>\n", | |
" <td> 18.680381</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2007</td>\n", | |
" <td> 18.700552</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2006</td>\n", | |
" <td> 18.660320</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>5 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2005</td>\n", | |
" <td> 18.741587</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>6 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2004</td>\n", | |
" <td> 18.887782</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>7 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2003</td>\n", | |
" <td> 18.833971</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>8 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2002</td>\n", | |
" <td> 18.382553</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>9 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2001</td>\n", | |
" <td> 18.369852</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>10 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 2000</td>\n", | |
" <td> 18.249353</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>11 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1999</td>\n", | |
" <td> 18.110541</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>12 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1998</td>\n", | |
" <td> 17.790195</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>13 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1997</td>\n", | |
" <td> 17.276190</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>14 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1996</td>\n", | |
" <td> 17.017798</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>15 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1995</td>\n", | |
" <td> 16.791157</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>16 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1994</td>\n", | |
" <td> 16.382401</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>17 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1993</td>\n", | |
" <td> 16.292707</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>18 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1992</td>\n", | |
" <td> 16.247799</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>19 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1991</td>\n", | |
" <td> 16.145272</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>20 </th>\n", | |
" <td> Australia</td>\n", | |
" <td> 1990</td>\n", | |
" <td> 16.274010</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>21 </th>\n", | |
" <td> Austria</td>\n", | |
" <td> 2010</td>\n", | |
" <td> 8.612525</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>22 </th>\n", | |
" <td> Austria</td>\n", | |
" <td> 2009</td>\n", | |
" <td> 8.032090</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>23 </th>\n", | |
" <td> Austria</td>\n", | |
" <td> 2008</td>\n", | |
" <td> 8.861574</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>24 </th>\n", | |
" <td> Austria</td>\n", | |
" <td> 2007</td>\n", | |
" <td> 8.948816</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>25 </th>\n", | |
" <td> Austria</td>\n", | |
" <td> 2006</td>\n", | |
" <td> 9.311083</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>26 </th>\n", | |
" <td> Austria</td>\n", | |
" <td> 2005</td>\n", | |
" <td> 9.684401</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>27 </th>\n", | |
" <td> Austria</td>\n", | |
" <td> 2004</td>\n", | |
" <td> 9.555359</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>28 </th>\n", | |
" <td> Austria</td>\n", | |
" <td> 2003</td>\n", | |
" <td> 9.559380</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>29 </th>\n", | |
" <td> Austria</td>\n", | |
" <td> 2002</td>\n", | |
" <td> 8.872889</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>...</th>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>831</th>\n", | |
" <td> United Kingdom</td>\n", | |
" <td> 1998</td>\n", | |
" <td> 9.480194</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>832</th>\n", | |
" <td> United Kingdom</td>\n", | |
" <td> 1997</td>\n", | |
" <td> 9.435972</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>833</th>\n", | |
" <td> United Kingdom</td>\n", | |
" <td> 1996</td>\n", | |
" <td> 9.877613</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>834</th>\n", | |
" <td> United Kingdom</td>\n", | |
" <td> 1995</td>\n", | |
" <td> 9.525717</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>835</th>\n", | |
" <td> United Kingdom</td>\n", | |
" <td> 1994</td>\n", | |
" <td> 9.695512</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>836</th>\n", | |
" <td> United Kingdom</td>\n", | |
" <td> 1993</td>\n", | |
" <td> 9.824330</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>837</th>\n", | |
" <td> United Kingdom</td>\n", | |
" <td> 1992</td>\n", | |
" <td> 10.087042</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>838</th>\n", | |
" <td> United Kingdom</td>\n", | |
" <td> 1991</td>\n", | |
" <td> 10.399337</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>839</th>\n", | |
" <td> United Kingdom</td>\n", | |
" <td> 1990</td>\n", | |
" <td> 10.301924</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>840</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2010</td>\n", | |
" <td> 18.115315</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>841</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2009</td>\n", | |
" <td> 17.613975</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>842</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2008</td>\n", | |
" <td> 19.134200</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>843</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2007</td>\n", | |
" <td> 19.938306</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>844</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2006</td>\n", | |
" <td> 19.790435</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>845</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2005</td>\n", | |
" <td> 20.264177</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>846</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2004</td>\n", | |
" <td> 20.319687</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>847</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2003</td>\n", | |
" <td> 20.138513</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>848</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2002</td>\n", | |
" <td> 20.141864</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>849</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2001</td>\n", | |
" <td> 20.227848</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>850</th>\n", | |
" <td> United States</td>\n", | |
" <td> 2000</td>\n", | |
" <td> 20.813767</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>851</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1999</td>\n", | |
" <td> 20.421978</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>852</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1998</td>\n", | |
" <td> 20.371008</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>853</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1997</td>\n", | |
" <td> 20.500903</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>854</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1996</td>\n", | |
" <td> 20.458236</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>855</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1995</td>\n", | |
" <td> 20.023986</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>856</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1994</td>\n", | |
" <td> 19.986553</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>857</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1993</td>\n", | |
" <td> 19.879693</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>858</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1992</td>\n", | |
" <td> 19.647320</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>859</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1991</td>\n", | |
" <td> 19.437409</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>860</th>\n", | |
" <td> United States</td>\n", | |
" <td> 1990</td>\n", | |
" <td> 19.801924</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"<p>861 rows \u00d7 3 columns</p>\n", | |
"</div>" | |
], | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 102, | |
"text": [ | |
" Country or Area Year Value\n", | |
"0 Australia 2010 18.042955\n", | |
"1 Australia 2009 18.394162\n", | |
"2 Australia 2008 18.680381\n", | |
"3 Australia 2007 18.700552\n", | |
"4 Australia 2006 18.660320\n", | |
"5 Australia 2005 18.741587\n", | |
"6 Australia 2004 18.887782\n", | |
"7 Australia 2003 18.833971\n", | |
"8 Australia 2002 18.382553\n", | |
"9 Australia 2001 18.369852\n", | |
"10 Australia 2000 18.249353\n", | |
"11 Australia 1999 18.110541\n", | |
"12 Australia 1998 17.790195\n", | |
"13 Australia 1997 17.276190\n", | |
"14 Australia 1996 17.017798\n", | |
"15 Australia 1995 16.791157\n", | |
"16 Australia 1994 16.382401\n", | |
"17 Australia 1993 16.292707\n", | |
"18 Australia 1992 16.247799\n", | |
"19 Australia 1991 16.145272\n", | |
"20 Australia 1990 16.274010\n", | |
"21 Austria 2010 8.612525\n", | |
"22 Austria 2009 8.032090\n", | |
"23 Austria 2008 8.861574\n", | |
"24 Austria 2007 8.948816\n", | |
"25 Austria 2006 9.311083\n", | |
"26 Austria 2005 9.684401\n", | |
"27 Austria 2004 9.555359\n", | |
"28 Austria 2003 9.559380\n", | |
"29 Austria 2002 8.872889\n", | |
".. ... ... ...\n", | |
"831 United Kingdom 1998 9.480194\n", | |
"832 United Kingdom 1997 9.435972\n", | |
"833 United Kingdom 1996 9.877613\n", | |
"834 United Kingdom 1995 9.525717\n", | |
"835 United Kingdom 1994 9.695512\n", | |
"836 United Kingdom 1993 9.824330\n", | |
"837 United Kingdom 1992 10.087042\n", | |
"838 United Kingdom 1991 10.399337\n", | |
"839 United Kingdom 1990 10.301924\n", | |
"840 United States 2010 18.115315\n", | |
"841 United States 2009 17.613975\n", | |
"842 United States 2008 19.134200\n", | |
"843 United States 2007 19.938306\n", | |
"844 United States 2006 19.790435\n", | |
"845 United States 2005 20.264177\n", | |
"846 United States 2004 20.319687\n", | |
"847 United States 2003 20.138513\n", | |
"848 United States 2002 20.141864\n", | |
"849 United States 2001 20.227848\n", | |
"850 United States 2000 20.813767\n", | |
"851 United States 1999 20.421978\n", | |
"852 United States 1998 20.371008\n", | |
"853 United States 1997 20.500903\n", | |
"854 United States 1996 20.458236\n", | |
"855 United States 1995 20.023986\n", | |
"856 United States 1994 19.986553\n", | |
"857 United States 1993 19.879693\n", | |
"858 United States 1992 19.647320\n", | |
"859 United States 1991 19.437409\n", | |
"860 United States 1990 19.801924\n", | |
"\n", | |
"[861 rows x 3 columns]" | |
] | |
} | |
], | |
"prompt_number": 102 | |
}, | |
{ | |
"cell_type": "heading", | |
"level": 2, | |
"metadata": {}, | |
"source": [ | |
"Summary" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"This notebook demonstrates a simple, informal scraper based API to the UN data website. Searches can be run on the UN data website to obtain a list of named datasets, and then a specified named dataset can be automatically downloaded into a *pandas* dataframe." | |
] | |
} | |
], | |
"metadata": {} | |
} | |
] | |
} |
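As a rough illustration of the final step summarised above (turning a downloaded zipped CSV into tabular rows), here is a minimal sketch. It is not the notebook's own code: the function names `read_zipped_csv` and `make_sample_zip` are hypothetical, the archive is assumed to contain a single CSV file, and the sample archive is built in memory from two rows copied from the output above so that the example runs without network access. It uses only the standard library.

```python
import csv
import io
import zipfile


def read_zipped_csv(raw_bytes):
    """Return the rows of the first CSV file in a zip archive as a list of dicts.

    Hypothetical helper: assumes the archive holds a single CSV of interest.
    """
    with zipfile.ZipFile(io.BytesIO(raw_bytes)) as archive:
        first_name = archive.namelist()[0]
        with archive.open(first_name) as member:
            # archive.open() yields bytes, so wrap it for text-mode parsing.
            text = io.TextIOWrapper(member, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


def make_sample_zip():
    """Build a tiny in-memory zip standing in for a downloaded file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "sample.csv",
            "Country or Area,Year,Value\n"
            "United States,2010,18.115315\n"
            "United States,2009,17.613975\n",
        )
    return buffer.getvalue()


rows = read_zipped_csv(make_sample_zip())
print(rows[0]["Country or Area"], rows[0]["Year"], rows[0]["Value"])
```

To get a dataframe instead of a list of dicts, the file object returned by `archive.open(...)` can be passed straight to `pandas.read_csv`, which accepts file-like objects.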