Last active
December 17, 2018 20:34
Glottolog with Python
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Exploring *Glottolog* with Python" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Sebastian Bank ([email protected]) http://www.uni-leipzig.de/~sbank/\n", | |
"\n", | |
"The latest version of this [IPython Notebook](http://ipython.org/notebook.html) is available at http://gist.github.com/xflr6/9050337.\n", | |
"\n", | |
"[Glottolog](http://glottolog.org) provides its comprehensive catalog of the world's languages, language families and dialects for [download](http://glottolog.org/meta/downloads) in linked data format.\n", | |
"\n", | |
"In this notebook, I will process this data set using the following tools:\n", | |
"\n", | |
"* [Python](http://www.python.org) (2.7)\n", | |
"* [rdflib](http://github.com/RDFLib/rdflib)\n", | |
"* [sqlite3](http://docs.python.org/2/library/sqlite3.html) (included with Python)\n", | |
"* [pandas](http://pandas.pydata.org) (using [matplotlib](http://matplotlib.org) for visualization)\n", | |
"\n", | |
"If you are new to scientific Python, the [Anaconda Python Distribution](http://continuum.io/downloads) is probably the fastest way to get Python installed with all the commonly used scientific packages. It supports all platforms (Linux, Mac, and Windows).\n", | |
"\n", | |
"If you are on Windows, there are [Unofficial Windows Binaries](http://www.lfd.uci.edu/~gohlke/pythonlibs/) for a lot of Python extension packages used in scientific computing." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Getting the file" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Download the [RDF](http://en.wikipedia.org/wiki/Resource_Description_Framework) export file with Python's built-in `urllib` module ([docs](http://docs.python.org/2/library/urllib.html))." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"import urllib\n", | |
"\n", | |
"URL = 'http://glottolog.org/static/download/2.7/glottolog-language.n3.gz'\n", | |
"\n", | |
"filename, headers = urllib.urlretrieve(URL, URL.rpartition('/')[2])" | |
] | |
}, | |
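The cell above targets Python 2.7. For readers on Python 3, where `urlretrieve` lives in `urllib.request`, a minimal sketch of the same step could look like this. The helper names `local_name` and `download` are illustrative and not part of the notebook, and whether the URL is still served is not checked here.

```python
from urllib.request import urlretrieve

URL = 'http://glottolog.org/static/download/2.7/glottolog-language.n3.gz'


def local_name(url):
    """Return the last path segment of the URL, used as the local filename."""
    return url.rpartition('/')[2]


def download(url=URL):
    """Fetch the export and return (filename, headers). Needs network access."""
    return urlretrieve(url, local_name(url))
```

Calling `download()` returns the same `(filename, headers)` pair that the Python 2 cell unpacks.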
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The file contains RDF in [Notation3](http://en.wikipedia.org/wiki/Notation3) compressed with gzip." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"glottolog-language.n3.gz\n" | |
] | |
} | |
], | |
"source": [ | |
"print filename" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Display the size in megabytes." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"2.63014793396\n" | |
] | |
} | |
], | |
"source": [ | |
"size = int(headers['Content-Length'])\n", | |
"\n", | |
"print size / 1024.0 ** 2" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## A first look" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
" Read the first few bytes from the file with `gzip` ([docs](http://docs.python.org/2/library/gzip.html)) so we can get an impression of the format." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"import gzip\n", | |
"\n", | |
"with gzip.open(filename) as fd:\n", | |
" sample = fd.read(4000)" | |
] | |
}, | |
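On Python 3, `gzip.open` returns bytes unless a text mode is requested, so the string operations in the following cells would need decoded text. A minimal sketch, assuming UTF-8 content and using a small in-memory payload as a stand-in for the downloaded file:

```python
import gzip
import io

# Build a tiny gzip payload in memory as a stand-in for the real download.
text = '@prefix owl: <http://www.w3.org/2002/07/owl#> .\n\n<x> a owl:Thing .\n'
payload = gzip.compress(text.encode('utf-8'))

# 'rt' decodes to str; the default 'rb' would return bytes on Python 3.
with gzip.open(io.BytesIO(payload), 'rt', encoding='utf-8') as fd:
    sample = fd.read(20)  # in text mode this counts characters, not bytes

print(sample)
```

With the real file, the `io.BytesIO(payload)` argument would simply be replaced by `filename`.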
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Split the sample into the namespaces definitions and the actual RDF triples. They are separated by a blank line." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"(932, 3066)" | |
] | |
}, | |
"execution_count": 5, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"head, _, body = sample.partition('\\n\\n')\n", | |
"\n", | |
"len(head), len(body)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Inspect the start of the namespaces." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"@prefix bibo: <http://purl.org/ontology/bibo/> .\n", | |
"@prefix dc: <http://purl.org/dc/elements/1.1/> .\n", | |
"@prefix dcterms: <http://purl.org/dc/terms/> .\n", | |
"@prefix dctype: <http://purl.org/dc/dcmitype/> .\n", | |
"@prefix foaf: <http://xmlns.com/foaf/0.1/> .\n", | |
"@prefix frbr: <http://purl.org/vocab/frbr/core#> .\n", | |
"@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .\n", | |
"@prefix gold: <http://purl.org/linguistics/gold/> .\n", | |
"@prefix isbd: <http://iflastandards.info/ns/isbd/elements/> .\n", | |
"@prefix lexvo: <http://lexvo.org/ontology#> .\n", | |
"@prefix owl: <http://www.w3.org/2002/07/owl#> .\n", | |
"@prefix rdf: <http://www.w3.org/1999/02/22-r...\n" | |
] | |
} | |
], | |
"source": [ | |
"print head[:600] + '...'" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Glottolog uses well-known ontologies and some which are dedicated to linguistics like [Lexvo](http://www.lexvo.org/) and [GOLD](http://linguistics-ontology.org/)." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Display the first RDF triples." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"<http://glottolog.org/resource/languoid/id/muni1258> a dcterms:LinguisticSystem,\n", | |
" gold:Language ;\n", | |
" rdfs:label \"Muniche\"@en ;\n", | |
" lexvo:iso639P3PCode \"myr\"^^xsd:string ;\n", | |
" dcterms:description <http://glottolog.org/resource/reference/id/10167>,\n", | |
" <http://glottolog.org/resource/reference/id/132589>,\n", | |
" <http://glottolog.org/resource/reference/id/135495>,\n", | |
" <http://glottolog.org/resource/reference/id/300702>,\n", | |
" <http://glottolog.org/resource/reference/id/303200>,\n", | |
" <http://glottolog.org/resource/reference/id/34227>,\n", | |
" <http://glottolog.org/resource/re...\n" | |
] | |
} | |
], | |
"source": [ | |
"print body[:600] + '...'" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The entry starts with the full URI of the languoid, followed by its types, label, ISO 639-3 code and description." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Let's try to extract some meaningful information from this string just using Pythons regular expressions." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Using text processing" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Load the whole file uncompressed into memory." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 8, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"with gzip.open(filename) as fd:\n", | |
" data = fd.read()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Display the size in megabytes." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 9, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"32.698595047\n" | |
] | |
} | |
], | |
"source": [ | |
"print len(data) / 1024.0 ** 2" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Extract the glottocode from the start of all `dcterms:LinguisticSystem` entries with the `re` module ([docs](http://docs.python.org/2/library/re.html)) and count them." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 10, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"24393" | |
] | |
}, | |
"execution_count": 10, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"import re\n", | |
"\n", | |
"GLOTTOCODE = '<http://glottolog.org/resource/languoid/id/(\\w+)> a dcterms:LinguisticSystem'\n", | |
"\n", | |
"gcodes = re.findall(GLOTTOCODE, data)\n", | |
"\n", | |
"len(gcodes)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Display the glottocodes of the first five entries." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 11, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"['muni1258', 'west1503', 'port1278', 'west2205', 'nilo1247']" | |
] | |
}, | |
"execution_count": 11, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"gcodes[:5]" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Looks unordered, sort them alphabetically and display the first and last five entries." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 12, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"['aala1237', 'aant1238', 'aari1238', 'aari1239', 'aari1240']\n", | |
"['zuti1239', 'zuwa1238', 'zwal1238', 'zyph1238', 'zyud1238']\n" | |
] | |
} | |
], | |
"source": [ | |
"gcodes.sort()\n", | |
"\n", | |
"print gcodes[:5] \n", | |
"print gcodes[-5:]" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Extract everything that looks like an ISO code. Count the results." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 13, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"7822" | |
] | |
}, | |
"execution_count": 13, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"ISO_CODE = 'iso639P3PCode \"(\\w+)\"'\n", | |
"\n", | |
"icodes = re.findall(ISO_CODE, data)\n", | |
"\n", | |
"len(icodes)" | |
] | |
}, | |
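The two patterns above leave the dots in the URI unescaped, so each one matches any character, and newer Python 3 versions warn about `\w` inside a non-raw string. A slightly stricter sketch uses `re.escape` and raw strings. The two-entry `data` string is an illustrative stand-in in the shape of the export, not actual file content.

```python
import re

# Illustrative sample in the same shape as the export (not real file content).
data = (
    '<http://glottolog.org/resource/languoid/id/muni1258> a dcterms:LinguisticSystem,\n'
    '    gold:Language ;\n'
    '    lexvo:iso639P3PCode "myr"^^xsd:string .\n'
    '\n'
    '<http://glottolog.org/resource/languoid/id/aala1237> a dcterms:LinguisticSystem,\n'
    '    gold:Dialect .\n'
)

PREFIX = 'http://glottolog.org/resource/languoid/id/'

# re.escape keeps the dots in the URI from matching arbitrary characters.
GLOTTOCODE = re.compile('<' + re.escape(PREFIX) + r'(\w+)> a dcterms:LinguisticSystem')
ISO_CODE = re.compile(r'iso639P3PCode "(\w+)"')

gcodes = sorted(GLOTTOCODE.findall(data))
icodes = ISO_CODE.findall(data)

print(gcodes)
print(icodes)
```

On the real file the unescaped versions should give the same matches in practice. The escaping mainly documents intent.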
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Display the first ten ISO codes." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 14, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"['myr', 'pko', 'oki', 'mwy', 'kqh', 'mwx', 'aam', 'spy', 'tec', 'kpz']" | |
] | |
}, | |
"execution_count": 14, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"icodes[:10]" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Sort them as well and display the start and end." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 15, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"['aaa', 'aab', 'aac', 'aad', 'aae', 'aaf', 'aag', 'aah', 'aai', 'aak']\n", | |
"['zun', 'zuy', 'zwa', 'zyb', 'zyg', 'zyj', 'zyn', 'zyp', 'zza', 'zzj']\n" | |
] | |
} | |
], | |
"source": [ | |
"icodes.sort()\n", | |
"\n", | |
"print icodes[:10]\n", | |
"print icodes[-10:]" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Glottocodes" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Glottocodes consist of four letters and some apparently recurring digit combinations.\n", | |
"\n", | |
"Display the five most common of those digits and their frequency with `collections.Counter` ([docs](http://docs.python.org/2/library/collections.html#collections.Counter))." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 16, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[('1238', 3022), ('1239', 1192), ('1242', 1039), ('1241', 997), ('1237', 903)]" | |
] | |
}, | |
"execution_count": 16, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"import collections\n", | |
"\n", | |
"collections.Counter(g[4:] for g in gcodes).most_common(5)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Show the most common inital parts." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 17, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[('nort', 563), ('sout', 560), ('nucl', 508), ('west', 461), ('east', 425)]" | |
] | |
}, | |
"execution_count": 17, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"collections.Counter(g[:4] for g in gcodes).most_common(5)" | |
] | |
}, | |
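The slicing in the two cells above relies on every glottocode being exactly four characters followed by four digits. A small Python 3 sketch that makes this assumption explicit before counting, using only the first five sorted codes shown earlier:

```python
import collections

# First five sorted glottocodes, copied from the earlier output.
gcodes = ['aala1237', 'aant1238', 'aari1238', 'aari1239', 'aari1240']

# Assumed format: four characters, then four digits.
well_formed = all(len(g) == 8 and g[4:].isdigit() for g in gcodes)

# Count the leading four characters and the trailing digits separately.
stems = collections.Counter(g[:4] for g in gcodes)
numbers = collections.Counter(g[4:] for g in gcodes)

print(well_formed)
print(stems.most_common(1))
print(numbers.most_common(1))
```

If `well_formed` came out false on the full list, the fixed `[:4]` and `[4:]` slices would need to be replaced by a pattern-based split.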
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Loading into RDFlib" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Use `rdflib` ([docs](http://rdflib.readthedocs.org/en/latest/)) to load the whole graph into memory.\n", | |
"\n", | |
"This will take a while and fill a couple hundred megabytes of RAM." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 18, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"<Graph identifier=N5f0224c79a154d14bd437619ecf4e397 (<class 'rdflib.graph.Graph'>)>" | |
] | |
}, | |
"execution_count": 18, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"import rdflib\n", | |
"\n", | |
"graph = rdflib.Graph()\n", | |
"\n", | |
"with gzip.open(filename) as fd:\n", | |
" graph.parse(fd, format='n3')\n", | |
"\n", | |
"graph" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Count the number of triples." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 19, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"670194" | |
] | |
}, | |
"execution_count": 19, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"len(graph)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Using the RDF graph" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Display some of the triples (subject, predicate, object)." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 20, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"pwon1235 rdf:type http://purl.org/dc/terms/LinguisticSystem\n", | |
"guin1260 dcterms:spatial http://www.geonames.org/countries/GW/\n", | |
"mogu1251 dcterms:description http://glottolog.org/resource/reference/id/156942\n", | |
"tibe1272 dcterms:description http://glottolog.org/resource/reference/id/26288\n", | |
"barr1251 dcterms:isReferencedBy http://glottolog.org/valuesets/vitality-barr1251\n", | |
"nang1261 skos:altLabel nang1261\n", | |
"choc1278 dcterms:spatial North America\n", | |
"song1308 rdf:type http://purl.org/linguistics/gold/Dialect\n", | |
"pato1242 void:inDataset http://glottolog.org/\n", | |
"nort2855 rdf:type http://purl.org/dc/terms/LinguisticSystem\n", | |
"chil1280 skos:broader http://glottolog.org/resource/languoid/id/nort2940\n", | |
"sate1242 dcterms:title Saterfriesisch\n", | |
"stan1290 dcterms:description http://glottolog.org/resource/reference/id/37004\n", | |
"bord1246 skos:broader http://glottolog.org/resource/languoid/id/komb1273\n", | |
"marg1251 dcterms:description http://glottolog.org/resource/reference/id/54615\n" | |
] | |
} | |
], | |
"source": [ | |
"import itertools\n", | |
"\n", | |
"for s, p, o in itertools.islice(graph, 15):\n", | |
" print s[42:], graph.qname(p), o" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Show all available predicates." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 21, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"lexvo:iso639P3PCode\n", | |
"dcterms:description\n", | |
"dcterms:isReferencedBy\n", | |
"dcterms:isReplacedBy\n", | |
"dcterms:spatial\n", | |
"dcterms:title\n", | |
"void:inDataset\n", | |
"rdf:type\n", | |
"rdfs:label\n", | |
"owl:sameAs\n", | |
"geo:lat\n", | |
"geo:long\n", | |
"skos:altLabel\n", | |
"skos:broader\n", | |
"skos:broaderTransitive\n", | |
"skos:changeNote\n", | |
"skos:editorialNote\n", | |
"skos:narrower\n", | |
"skos:prefLabel\n", | |
"skos:scopeNote\n" | |
] | |
} | |
], | |
"source": [ | |
"for p in sorted(set(graph.predicates())):\n", | |
" print graph.qname(p)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Create shortcuts for querying glottocodes and ISO codes. Translate glottocodes into ISO codes." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 22, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"aala1237 -> ___, aant1238 -> ___, aari1238 -> ___, aari1239 -> aiw, aari1240 -> aay,\n" | |
] | |
} | |
], | |
"source": [ | |
"glottocode = rdflib.Namespace('http://glottolog.org/resource/languoid/id/')\n", | |
"lexvo = rdflib.Namespace('http://lexvo.org/ontology#')\n", | |
"iso639 = lexvo.iso639P3PCode\n", | |
"\n", | |
"for g in gcodes[:5]:\n", | |
" i = graph.value(glottocode[g], iso639, default='___')\n", | |
" print '%s -> %s,' % (g, i)," | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Translate ISO codes into glottocodes" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 23, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"aaa -> ghot1243, aab -> alum1246, aac -> arii1243, aad -> amal1242, aae -> arbe1236,\n" | |
] | |
} | |
], | |
"source": [ | |
"string = rdflib.namespace.XSD.string\n", | |
"\n", | |
"for i in icodes[:5]:\n", | |
" g = graph.value(None, iso639, rdflib.Literal(i, datatype=string))\n", | |
" print '%s -> %s,' % (i, g[42:])," | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Retrieve the preferred label of languoids." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 24, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"aala1237 -> Aalawa, aant1238 -> Aantantara, aari1238 -> Aari-Gayil, aari1239 -> Aari, aari1240 -> Aariya,\n" | |
] | |
} | |
], | |
"source": [ | |
"label = rdflib.namespace.RDFS.label\n", | |
"\n", | |
"for g in gcodes[:5]:\n", | |
" l = graph.value(glottocode[g], label)\n", | |
" print '%s -> %s,' % (g, l)," | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Lookup an arbitrary languoid with a given label. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 25, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"http://glottolog.org/resource/languoid/id/aala1237\n" | |
] | |
} | |
], | |
"source": [ | |
"print graph.value(None, label, rdflib.Literal('Aalawa', lang='en'))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Show the predicates and objects of an individual languoid." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 26, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"skos:prefLabel Aalawa\n", | |
"rdfs:label Aalawa\n", | |
"skos:broaderTransitive http://glottolog.org/resource/languoid/id/aust1307\n", | |
"dcterms:isReferencedBy http://glottolog.org/valuesets/fc42061\n", | |
"dcterms:title Aalawa\n", | |
"dcterms:isReferencedBy http://glottolog.org/valuesets/sc42061\n", | |
"skos:altLabel aala1237\n", | |
"void:inDataset http://glottolog.org/\n", | |
"skos:scopeNote language\n", | |
"skos:broader http://glottolog.org/resource/languoid/id/ramo1244\n", | |
"rdf:type http://purl.org/dc/terms/LinguisticSystem\n", | |
"dcterms:spatial Papunesia\n", | |
"rdf:type http://purl.org/linguistics/gold/Dialect\n" | |
] | |
} | |
], | |
"source": [ | |
"for p, o in graph[glottocode['aala1237']]:\n", | |
" print graph.qname(p), o" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Display the nodes along a languoid's path up the tree." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 27, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Aalawa -> Ramoaaina -> Kandas-Duke of York -> Label-Bilur -> St George linkage -> New Ireland-Northwest Solomonic linkage -> Meso Melanesian linkage -> Western Oceanic linkage -> Oceanic -> Eastern Malayo-Polynesian -> Central-Eastern Malayo-Polynesian -> Malayo-Polynesian -> Nuclear Austronesian -> Austronesian\n" | |
] | |
} | |
], | |
"source": [ | |
"broader = rdflib.namespace.SKOS.broader\n", | |
"\n", | |
"aalawa = graph.resource(glottocode['aala1237'])\n", | |
"\n", | |
"print ' -> '.join(b.label() for b in aalawa.transitive_objects(broader))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Display the nodes immediately below a languoid." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 28, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Atlantic-Congo <- Volta-Congo, North-Central Atlantic, Nalu, Mansoanka-Fore-Mboteni, Limba, Mel\n" | |
] | |
} | |
], | |
"source": [ | |
"narrower = rdflib.namespace.SKOS.narrower\n", | |
"\n", | |
"atlaco = graph.resource(glottocode['atla1278'])\n", | |
"\n", | |
"print '%s <- %s' % (atlaco.label(), ', '.join(n.label() for n in atlaco.objects(narrower)))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Count all nodes below a languoid." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 29, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"4608" | |
] | |
}, | |
"execution_count": 29, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"len(list(atlaco.transitive_objects(narrower)))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Querying with SPARQL" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Retrieve rows of glottocode, ISO code, and label with RDFs query language [SPARQL](http://www.w3.org/TR/sparql11-query/). Also display the annotated language of the label." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 30, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"aala1237 | None | Aalawa | en\n", | |
"aant1238 | None | Aantantara | en\n", | |
"aari1238 | None | Aari-Gayil | en\n", | |
"aari1239 | aiw | Aari | en\n", | |
"aari1240 | aay | Aariya | en\n", | |
"aari1244 | aiz | Aari | en\n", | |
"aasa1238 | aas | Aasax | en\n", | |
"aata1238 | None | Aatasaara | en\n", | |
"abaa1238 | None | Aba | en\n", | |
"abab1239 | None | Ababda | en\n" | |
] | |
} | |
], | |
"source": [ | |
"GIL = \"\"\"\n", | |
"SELECT\n", | |
" (substr(str(?s), 43) AS ?glottocode) ?iso ?label\n", | |
"WHERE\n", | |
" { ?s a dcterms:LinguisticSystem ; skos:prefLabel ?label \n", | |
" OPTIONAL { ?s lexvo:iso639P3PCode ?iso } }\n", | |
"ORDER BY ?s LIMIT 10\"\"\"\n", | |
"\n", | |
"for g, i, l in graph.query(GIL):\n", | |
" print '%s | %-4s | %-10s | %s' % (g, i, l, l.language)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Display the result as CSV (`json` and `xml` format are also supported)." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 31, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"glottocode,iso,label\r\n", | |
"aala1237,,Aalawa\r\n", | |
"aant1238,,Aantantara\r\n", | |
"aari1238,,Aari-Gayil\r\n", | |
"aari1239,aiw,Aari\r\n", | |
"aari1240,aay,Aariya\r\n", | |
"aari1244,aiz,Aari\r\n", | |
"aasa1238,aas,Aasax\r\n", | |
"aata1238,,Aatasaara\r\n", | |
"abaa1238,,Aba\r\n", | |
"abab1239,,Ababda\r\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"print graph.query(GIL).serialize(format='csv')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Determine the language families with the most child languages." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 32, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Atlantic-Congo\t1430\n", | |
"Austronesian\t1274\n", | |
"Indo-European\t583\n", | |
"Sino-Tibetan\t475\n", | |
"Bookkeeping\t391\n", | |
"Afro-Asiatic\t372\n", | |
"Nuclear Trans New Guinea\t315\n", | |
"Pama-Nyungan\t241\n", | |
"Otomanguean\t179\n", | |
"Sign Language\t168\n" | |
] | |
} | |
], | |
"source": [ | |
"FAMILIES = \"\"\"\n", | |
"SELECT\n", | |
" ?label (count(*) as ?n)\n", | |
"WHERE\n", | |
" { ?s a gold:LanguageFamily ; rdfs:label ?label ; skos:narrower+/a gold:Language }\n", | |
"GROUP BY ?s\n", | |
"ORDER BY desc(?n) LIMIT 10\"\"\"\n", | |
"\n", | |
"for f, n in graph.query(FAMILIES):\n", | |
" print '%s\\t%s' % (f, n)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Display the immediate children for some families." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 33, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Arawan <- Suruahá, Paumarí, Aruá (Amazonas State), Madi-Madiha\n", | |
"Artificial Language <- Neo, Efate group based (Artificial Language), Kotava, Esperanto, Lingua Franca Nova, Talossan, Interlingua (International Auxiliary Language Association), Rennellese Sign Language, Ladakhi Sign\n", | |
"Athapaskan-Eyak-Tlingit <- Athapaskan-Eyak, Tlingit\n", | |
"Atlantic-Congo <- Volta-Congo, North-Central Atlantic, Nalu, Mansoanka-Fore-Mboteni, Limba, Mel\n", | |
"Austroasiatic <- Nicobaric, Monic, Khmuic, Vietic, Mangic, Pearic, Bahnaric, Khasi-Palaung, Katuic, Mundaic, Aslian, Khmeric\n" | |
] | |
} | |
], | |
"source": [ | |
"CHILDREN = \"\"\"\n", | |
"SELECT\n", | |
" ?label (group_concat(?o; separator=\", \") as ?children)\n", | |
"WHERE\n", | |
" { ?s a gold:LanguageFamily ; rdfs:label ?label ; skos:narrower/rdfs:label ?o }\n", | |
"GROUP BY ?s\n", | |
"ORDER BY ?label OFFSET 10 LIMIT 5\"\"\"\n", | |
"\n", | |
"for f, c in graph.query(CHILDREN):\n", | |
" print '%s <- %s' % (f, c)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Do the same for a specific languoid." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 34, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Atlantic-Congo <- Volta-Congo, North-Central Atlantic, Nalu, Mansoanka-Fore-Mboteni, Limba, Mel\n" | |
] | |
} | |
], | |
"source": [ | |
"for l, c in graph.query(\"\"\"BASE <http://glottolog.org/resource/languoid/id/>\n", | |
"SELECT\n", | |
" ?label (group_concat(?o; separator=\", \") as ?children)\n", | |
"WHERE\n", | |
" { <atla1278> rdfs:label ?label ; skos:narrower/rdfs:label ?o }\"\"\"):\n", | |
" print '%s <- %s' % (l, c)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Here's a SPARQL query that retrieves most of the [functional properties](http://www.w3.org/TR/owl-ref/#FunctionalProperty-def) of the languoids." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 35, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"LANGUOIDS = \"\"\"\n", | |
"SELECT\n", | |
" (substr(str(?s), 43) AS ?id) ?label\n", | |
" (substr(str(?type), 34) AS ?level)\n", | |
" (substr(str(?broader), 43) AS ?parent)\n", | |
" (if(bound(?change_note), 1, 0) AS ?obsolete)\n", | |
" ?status ?iso639 ?latitude ?longitude\n", | |
"WHERE\n", | |
" { ?s a dcterms:LinguisticSystem ; skos:prefLabel ?label .\n", | |
" ?s a ?type FILTER (strstarts(str(?type), \"http://purl.org/linguistics/gold/\"))\n", | |
" OPTIONAL { ?s skos:broader ?broader }\n", | |
" OPTIONAL { ?s skos:changeNote ?change_note FILTER (?change_note = \"obsolete\") }\n", | |
" OPTIONAL { ?s skos:editorialNote ?status }\n", | |
" OPTIONAL { ?s lexvo:iso639P3PCode ?iso639 }\n", | |
" OPTIONAL { ?s geo:lat ?latitude; geo:long ?longitude } }\"\"\"" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Display some results." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 36, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"pwon1235 Pwo Northern Karen Language nort2704 0 established pww 18.016 98.2709\n", | |
"kwes1244 Kwese Language phee1234 0 established kws -5.60445 18.5759\n", | |
"abaw1238 Abawa Dialect gupa1248 0 None None None None\n", | |
"roto1247 Rotorua-Taupo Dialect maor1246 0 None None None None\n", | |
"nort2855 North Coast Mengen Dialect meng1267 0 None None None None\n", | |
"maca1260 Maca Language mata1290 0 established mca -25.0119 -57.3694\n", | |
"nyon1241 Nyong Language pere1234 0 established muo 7.27419 11.0615\n", | |
"fars1254 Farsic-Caucasian Tat LanguageSubfamily sout3157 0 established None None None\n", | |
"yeng1243 Yengi Hissar Dialect uigh1240 0 None None None None\n", | |
"thui1238 Thui Phum Dialect ngal1291 0 None None None None\n", | |
"west2339 Western Asturian Dialect astu1245 0 None None None None\n", | |
"kele1254 Kele (C.60) LanguageFamily None 1 established None None None\n", | |
"zumu1241 Zumu Dialect bata1314 0 None None None None\n", | |
"nort2742 Northern Isan Dialect nort2741 0 None None None None\n", | |
"tezo1238 Tezoatlán Mixtec Language mixt1427 0 established mxb 17.6155 -97.9002\n", | |
"sund1254 Sundi-Kamba LanguageSubfamily laad1234 0 established None None None\n", | |
"long1404 Long Bento' Dialect moda1244 0 None None None None\n", | |
"pouy1238 Pouye Language ramm1241 0 established bye -3.72704 141.864\n", | |
"gola1255 Gola Language mela1257 0 established gol 7.06193 -10.8138\n", | |
"supp1238 Suppire-Mamara LanguageFamily None 1 established None None None\n" | |
] | |
} | |
], | |
"source": [ | |
"for row in itertools.islice(graph.query(LANGUOIDS), 20):\n", | |
" print '%s %-20s %-17s %-8s %s %-11s %-4s %-8s %s' % row" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Write the results into a CSV file. Show the beginning of the file." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 37, | |
"metadata": { | |
"collapsed": false, | |
"scrolled": true | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"id,label,level,parent,obsolete,status,iso639,latitude,longitude\n", | |
"pwon1235,Pwo Northern Karen,Language,nort2704,0,established,pww,18.016,98.2709\n", | |
"kwes1244,Kwese,Language,phee1234,0,established,kws,-5.60445,18.5759\n", | |
"abaw1238,Abawa,Dialect,gupa1248,0,,,,\n", | |
"roto1247,Rotorua-Taupo,Dialect,maor1246,0,,,,\n", | |
"nort2855,North Coast Mengen,Dialect,meng1267,0,,,,\n", | |
"maca1260,Maca,Language,mata1290,0,established,mca,-25.0119,-57.3694\n", | |
"nyon1241,Nyong,Language,pere1234,0,established,muo,7.27419,11.0615\n", | |
"fars1254,Farsic-Cau...\n" | |
] | |
} | |
], | |
"source": [ | |
"CSV = 'glottolog.csv'\n", | |
"\n", | |
"graph.query(LANGUOIDS).serialize(CSV, format='csv')\n", | |
"\n", | |
"with open(CSV) as fd:\n", | |
" sample = fd.read(500)\n", | |
"\n", | |
"print sample + '...'" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Let's put that into a relational database so we can reuse it later." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Export to SQLite" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Create an [SQLite](http://www.sqlite.org/) database file connecting with `sqlite3` ([docs](http://docs.python.org/2/library/sqlite3.html)). Activate [foreign key checks](http://www.sqlite.org/foreignkeys.html) so we notice if something is inconsistent." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 38, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"<sqlite3.Connection at 0x28c60858>" | |
] | |
}, | |
"execution_count": 38, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"import sqlite3\n", | |
"\n", | |
"DB = 'glottolog.sqlite3'\n", | |
"\n", | |
"conn = sqlite3.connect(DB)\n", | |
"conn.execute('PRAGMA foreign_keys = ON')\n", | |
"\n", | |
"conn.execute('PRAGMA synchronous = OFF')\n", | |
"conn.execute('PRAGMA journal_mode = MEMORY')\n", | |
"\n", | |
"conn" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Create a table for the results of the languoids query with some additional sanity checks. Insert the query rows. Count them." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 39, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"(24393,)" | |
] | |
}, | |
"execution_count": 39, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"conn.execute(\"\"\"\n", | |
"CREATE TABLE languoid (\n", | |
" id TEXT NOT NULL PRIMARY KEY,\n", | |
" label TEXT NOT NULL,\n", | |
" level TEXT NOT NULL,\n", | |
" parent TEXT,\n", | |
" obsolete BOOLEAN NOT NULL,\n", | |
" status TEXT,\n", | |
" iso TEXT UNIQUE,\n", | |
" latitude REAL,\n", | |
" longitude REAL,\n", | |
" FOREIGN KEY(parent) REFERENCES languoid(id) DEFERRABLE INITIALLY DEFERRED,\n", | |
" CHECK (level IN ('LanguageFamily', 'LanguageSubfamily', 'Language', 'Dialect')),\n", | |
" CHECK (obsolete IN (0, 1)),\n", | |
" CHECK (status IN ('established', 'spurious', 'spurious retired', 'unattested',\n", | |
" 'provisional', 'retired'))\n", | |
")\"\"\")\n", | |
"\n", | |
"conn.executemany('INSERT INTO languoid VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)',\n", | |
" graph.query(LANGUOIDS))\n", | |
"conn.commit()\n", | |
"\n", | |
"conn.execute('SELECT count(*) FROM languoid').fetchone()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Languoids may have *n* alternative labels. \n", | |
"\n", | |
"Create a table for the labels and their language. Retrieve them with SPARQL. Insert the query results into the table. Count rows." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 40, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"(86463,)" | |
] | |
}, | |
"execution_count": 40, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"conn.execute(\"\"\"\n", | |
"CREATE TABLE label (\n", | |
" id TEXT NOT NULL,\n", | |
" lang TEXT NOT NULL,\n", | |
" label TEXT NOT NULL,\n", | |
" PRIMARY KEY (id, lang, label),\n", | |
" FOREIGN KEY(id) REFERENCES languoid(id)\n", | |
")\"\"\")\n", | |
"\n", | |
"LABELS = \"\"\"\n", | |
"SELECT\n", | |
" (substr(str(?s), 43) AS ?id) (lang(?label) AS ?lang) ?label\n", | |
"WHERE\n", | |
" { ?s a dcterms:LinguisticSystem ; skos:altLabel ?label }\"\"\"\n", | |
"\n", | |
"conn.executemany('INSERT INTO label VALUES (?, ?, ?)',\n", | |
" graph.query(LABELS))\n", | |
"conn.commit()\n", | |
"\n", | |
"conn.execute('SELECT count(*) FROM label').fetchone()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Languoids may have *n* references.\n", | |
"\n", | |
"Create a table for the references. Retrieve them with SPARQL. Insert the query results into the table. Count." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 41, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"(212614,)" | |
] | |
}, | |
"execution_count": 41, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"conn.execute(\"\"\"\n", | |
"CREATE TABLE reference (\n", | |
" id TEXT NOT NULL,\n", | |
" reference INTEGER NOT NULL,\n", | |
" PRIMARY KEY (id, reference),\n", | |
" FOREIGN KEY(id) REFERENCES languoid(id)\n", | |
")\"\"\")\n", | |
"\n", | |
"REFERENCES = \"\"\"\n", | |
"SELECT\n", | |
" (substr(str(?s), 43) AS ?id) (substr(str(?o), 44) AS ?reference)\n", | |
"WHERE\n", | |
" { ?s a dcterms:LinguisticSystem ; dcterms:description ?o\n", | |
" FILTER (strstarts(str(?o), \"http://glottolog.org/resource/reference/id/\")) }\"\"\"\n", | |
"\n", | |
"conn.executemany('INSERT INTO reference VALUES (?, ?)',\n", | |
" graph.query(REFERENCES))\n", | |
"conn.commit()\n", | |
"\n", | |
"conn.execute('SELECT count(*) FROM reference').fetchone()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Querying with SQLite" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Display the number of languoids. Break it down by type and check the proportion of superseded entries. Most of the family entries are obsolete." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 42, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"(24393,)\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/plain": [ | |
"[(u'Dialect', 10599, 185),\n", | |
" (u'Language', 8418, 21),\n", | |
" (u'LanguageFamily', 1505, 1263),\n", | |
" (u'LanguageSubfamily', 3871, 0)]" | |
] | |
}, | |
"execution_count": 42, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"print conn.execute('SELECT count(*) FROM languoid').fetchone()\n", | |
"\n", | |
"conn.execute('SELECT level, count(*), sum(obsolete) FROM languoid GROUP BY level').fetchall()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Check the distribution of status values by type. Only language entries distinguish it. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 43, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[(u'Dialect', None, 10599),\n", | |
" (u'Language', u'established', 7945),\n", | |
" (u'Language', u'spurious', 199),\n", | |
" (u'Language', u'spurious retired', 192),\n", | |
" (u'Language', u'unattested', 61),\n", | |
" (u'Language', u'retired', 19),\n", | |
" (u'Language', u'provisional', 2),\n", | |
" (u'LanguageFamily', u'established', 1505),\n", | |
" (u'LanguageSubfamily', u'established', 3871)]" | |
] | |
}, | |
"execution_count": 43, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"conn.execute(\"\"\"SELECT level, status, count(*) AS n\n", | |
"FROM languoid GROUP BY level, status ORDER BY level, n DESC\"\"\").fetchall()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Display the number ISO codes. Break the proportions down by languoid type. ISO 639-3 also contains macrolanguages." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 44, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"(24393, 7822)\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/plain": [ | |
"[(u'Dialect', 10599, 5),\n", | |
" (u'Language', 8418, 7789),\n", | |
" (u'LanguageFamily', 1505, 1),\n", | |
" (u'LanguageSubfamily', 3871, 27)]" | |
] | |
}, | |
"execution_count": 44, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"print conn.execute('SELECT count(*), count(iso) FROM languoid').fetchone()\n", | |
"\n", | |
"conn.execute('SELECT level, count(*), count(iso) FROM languoid GROUP BY level').fetchall()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Check how many entries specify location. Only language entries do so." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 45, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[(u'Dialect', 0),\n", | |
" (u'Language', 7634),\n", | |
" (u'LanguageFamily', 0),\n", | |
" (u'LanguageSubfamily', 1)]" | |
] | |
}, | |
"execution_count": 45, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"conn.execute('SELECT level, count(latitude) FROM languoid GROUP BY level').fetchall()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Display the first and last glottocodes and ISO codes." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 46, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"aala1237, aant1238, aari1238, aari1239, aari1240, aari1244, aasa1238, aata1238, abaa1238, abab1239\n", | |
"zyud1238, zyph1238, zwal1238, zuwa1238, zuti1239, zurr1238, zuri1238, zura1238, zuoj1238, zuni1245\n", | |
"aaa, aab, aac, aad, aae, aaf, aag, aah, aai, aak, aal, aam, aan, aao, aap, aaq, aar, aas, aat, aau\n", | |
"zzj, zza, zyp, zyn, zyj, zyg, zyb, zwa, zuy, zun, zum, zul, zuh, zua, zty, ztx, ztu, ztt, zts, ztq\n" | |
] | |
} | |
], | |
"source": [ | |
"GLOTTOCODES = 'SELECT id FROM languoid ORDER BY id %s LIMIT 10'\n", | |
"\n", | |
"print ', '.join(g for g, in conn.execute(GLOTTOCODES % 'ASC'))\n", | |
"print ', '.join(g for g, in conn.execute(GLOTTOCODES % 'DESC'))\n", | |
"\n", | |
"ISO_CODES = 'SELECT iso FROM languoid WHERE iso NOT NULL ORDER BY iso %s LIMIT 20'\n", | |
"\n", | |
"print ', '.join(i for i, in conn.execute(ISO_CODES % 'ASC'))\n", | |
"print ', '.join(i for i, in conn.execute(ISO_CODES % 'DESC'))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Labels" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Display the number of labels. Break them down by language and entry type." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 47, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"(86463,)\n", | |
"[(u'en', 45862), (u'x-clld', 24393), (u'fr', 899), (u'br', 671), (u'ru', 589)]\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/plain": [ | |
"[(u'Dialect', 14827),\n", | |
" (u'Language', 65685),\n", | |
" (u'LanguageFamily', 1723),\n", | |
" (u'LanguageSubfamily', 4228)]" | |
] | |
}, | |
"execution_count": 47, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"print conn.execute('SELECT count(*) FROM label').fetchone()\n", | |
"\n", | |
"print conn.execute(\"\"\"SELECT lang, count(*) AS n\n", | |
"FROM label GROUP BY lang ORDER BY n DESC LIMIT 5\"\"\").fetchall()\n", | |
"\n", | |
"conn.execute(\"\"\"SELECT languoid.level, count(*) AS n\n", | |
"FROM label JOIN languoid ON languoid.id=label.id\n", | |
"GROUP BY languoid.level\"\"\").fetchall()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Show the minimal, mean, and maximal number of labels per entry. Check the languoids with the most labels." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 48, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"(1, 3.5445824621817734, 174)\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/plain": [ | |
"[(u'Standard French', 174), (u'Standard Spanish', 154), (u'Russian', 144)]" | |
] | |
}, | |
"execution_count": 48, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"print conn.execute(\"\"\"SELECT min(n), avg(n), max(n) FROM\n", | |
"(SELECT count(*) AS n FROM label GROUP BY id)\"\"\").fetchone()\n", | |
"\n", | |
"conn.execute(\"\"\"SELECT languoid.label, count(*) AS n\n", | |
"FROM label JOIN languoid ON languoid.id=label.id\n", | |
"GROUP BY label.id ORDER BY n DESC LIMIT 3\"\"\").fetchall()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Show the minimal, mean, and maximal label length. Check the frequencies of the most common lengths." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 49, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[(1, 9.443553890103281, 65)]\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/plain": [ | |
"[(4, 3882), (5, 6174), (6, 7481), (7, 6887), (8, 29717), (9, 4054), (10, 3336)]" | |
] | |
}, | |
"execution_count": 49, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"print conn.execute(\"\"\"SELECT min(s), avg(s), max(s) FROM\n", | |
"(SELECT length(label) AS s FROM label)\"\"\").fetchall()\n", | |
"\n", | |
"conn.execute(\"\"\"SELECT length(label) AS l, count(*) AS n\n", | |
"FROM label GROUP BY l HAVING n > 3200 ORDER BY l\"\"\").fetchall()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### References" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Display the number of references. Break them down by entry type. There are much less references for non-languages." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 50, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"(212614,)\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/plain": [ | |
"[(u'Dialect', 43),\n", | |
" (u'Language', 210178),\n", | |
" (u'LanguageFamily', 1663),\n", | |
" (u'LanguageSubfamily', 730)]" | |
] | |
}, | |
"execution_count": 50, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"print conn.execute('SELECT count(*) FROM reference').fetchone()\n", | |
"\n", | |
"conn.execute(\"\"\"SELECT l.level, count(*) AS n\n", | |
"FROM reference AS r JOIN languoid AS l ON l.id=r.id GROUP BY l.level\"\"\").fetchall()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Show the minimal, mean, and maximal number of references per entry. Check the most referenced languoids." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 51, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"(1, 25.640858658948385, 2728)\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/plain": [ | |
"[(u'Luxembourgish', 2728), (u'Standard French', 2160), (u'Swahili', 1840)]" | |
] | |
}, | |
"execution_count": 51, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"print conn.execute(\"\"\"SELECT min(n), avg(n), max(n) FROM\n", | |
"(SELECT count(*) AS n FROM reference GROUP BY id)\"\"\").fetchone()\n", | |
"\n", | |
"conn.execute(\"\"\"SELECT l.label, count(*) AS n FROM reference AS r\n", | |
"JOIN languoid AS l ON l.id=r.id GROUP BY r.id ORDER BY n DESC LIMIT 3\"\"\").fetchall()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Building the tree" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The languoids table only specifies the direct parent of each entry. However, we want to be able to traverse the tree and query the whole path.\n", | |
"\n", | |
"As SQLite supports [hierarchical queries](http://en.wikipedia.org/wiki/Hierarchical_and_recursive_queries_in_SQL) only with version 3.8.3+, we will use a more general approach and generate a table with all tree paths.\n", | |
"\n", | |
"In other words, we will compute the *transitive closure* of the parent relation, a.k.a. tree closure table.\n", | |
"\n", | |
"Since we won't use recursion *inside* the database, we will simply put together a bunch of SQL queries and feed the results back into a new table of our database." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 52, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"PATH = \"\"\"SELECT\n", | |
" i0 AS child, %(depth)d AS steps, i%(depth)d AS parent, i%(next)d IS NULL AS terminal\n", | |
"FROM (\n", | |
" SELECT %(select)s\n", | |
" FROM languoid AS l0\n", | |
" %(joins)s\n", | |
") WHERE parent IS NOT NULL\"\"\"\n", | |
"\n", | |
"def path_query(depth):\n", | |
" select = ', '.join('l%(step)d.id AS i%(step)d' % {'step': i} for i in range(depth + 2))\n", | |
" joins = ' '.join('LEFT JOIN languoid AS l%(next)d ON l%(step)d.parent = l%(next)d.id'\n", | |
" % {'step': i, 'next': i + 1} for i in range(depth + 1))\n", | |
" return PATH % {'depth': depth, 'next': depth + 1, 'select': select, 'joins': joins}" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The `path_query` function generates a query for a tree walk of the length given by `depth`. Note that we will omit zero step (*reflexive*) walks." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 53, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"SELECT\n", | |
" i0 AS child, 1 AS steps, i1 AS parent, i2 IS NULL AS terminal\n", | |
"FROM (\n", | |
" SELECT l0.id AS i0, l1.id AS i1, l2.id AS i2\n", | |
" FROM languoid AS l0\n", | |
" LEFT JOIN languoid AS l1 ON l0.parent = l1.id LEFT JOIN languoid AS l2 ON l1.parent = l2.id\n", | |
") WHERE parent IS NOT NULL\n" | |
] | |
} | |
], | |
"source": [ | |
"print path_query(1)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Each query returns the start glottocode, number of steps, end glottocode and a boolean indicating if there is no grandparent. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 54, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[(u'aala1237', 1, u'ramo1244', 0), (u'aant1238', 1, u'nort2920', 0), (u'aari1238', 1, u'ahkk1235', 0)]\n", | |
"[(u'aala1237', 2, u'kand1307', 0), (u'aant1238', 2, u'tair1260', 0), (u'aari1238', 2, u'sout2845', 1)]\n" | |
] | |
} | |
], | |
"source": [ | |
"print conn.execute('%s ORDER BY i0 LIMIT 3' % path_query(1)).fetchall()\n", | |
"print conn.execute('%s ORDER BY i0 LIMIT 3' % path_query(2)).fetchall()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"When all paths in the query are terminal, we have arrived at the maximal depth." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 55, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[(u'patw1249', 18, u'indo1319', 1),\n", | |
" (u'yeri1239', 18, u'atla1278', 1),\n", | |
" (u'cher1272', 18, u'atla1278', 1),\n", | |
" (u'wile1238', 18, u'atla1278', 1),\n", | |
" (u'biri1258', 18, u'atla1278', 1),\n", | |
" (u'doli1238', 18, u'atla1278', 1),\n", | |
" (u'fufu1238', 18, u'atla1278', 1),\n", | |
" (u'bule1242', 18, u'atla1278', 1),\n", | |
" (u'pato1243', 18, u'indo1319', 1)]" | |
] | |
}, | |
"execution_count": 55, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"conn.execute(path_query(18)).fetchall()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Create a table for the results. Insert path walks of increasing depth until all walks have ended. Count the walks. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 56, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"(145822,)" | |
] | |
}, | |
"execution_count": 56, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"conn.execute(\"\"\"\n", | |
"CREATE TABLE tree (\n", | |
" child TEXT NOT NULL,\n", | |
" steps INTEGER NOT NULL,\n", | |
" parent TEXT NOT NULL,\n", | |
" terminal BOOLEAN NOT NULL,\n", | |
" PRIMARY KEY (child, steps),\n", | |
" UNIQUE (child, parent),\n", | |
" UNIQUE (parent, child),\n", | |
" FOREIGN KEY (child) REFERENCES languoid (id),\n", | |
" FOREIGN KEY (parent) REFERENCES languoid (id),\n", | |
" CHECK (terminal IN (0, 1))\n", | |
")\"\"\")\n", | |
"\n", | |
"depth = 1\n", | |
"while True:\n", | |
" rows = conn.execute(path_query(depth)).fetchall()\n", | |
" if not rows:\n", | |
" break\n", | |
" conn.executemany('INSERT INTO tree VALUES (?, ?, ?, ?)', rows)\n", | |
" depth += 1\n", | |
"conn.commit()\n", | |
"\n", | |
"conn.execute('SELECT count(*) FROM tree').fetchone()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"If the SQlite we use from Python is version 3.8.3 or later, we can also get the rows for the tree closure table with a single query:\n", | |
"\n", | |
"```sql\n", | |
"WITH RECURSIVE tree(child, steps, parent, terminal) AS (\n", | |
" SELECT l.id, 1, l.parent, 0\n", | |
" FROM languoid AS l\n", | |
" WHERE l.parent IS NOT NULL\n", | |
"UNION ALL\n", | |
" SELECT t.child, t.steps + 1, p.parent, gp.parent IS NULL\n", | |
" FROM languoid AS p\n", | |
" JOIN tree AS t ON p.id=t.parent\n", | |
" LEFT JOIN languoid AS gp ON gp.id=p.parent\n", | |
" WHERE p.parent IS NOT NULL\n", | |
") \n", | |
"SELECT * FROM tree```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Querying the tree" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Show the minimal, mean, and maximal number of languages per family. Display the language familes with the most child languages." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 57, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"(1, 33.781893004115226, 1430)\n" | |
] | |
}, | |
{ | |
"data": { | |
"text/plain": [ | |
"[(u'Atlantic-Congo', 1430), (u'Austronesian', 1274), (u'Indo-European', 583)]" | |
] | |
}, | |
"execution_count": 57, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"print conn.execute(\"\"\"SELECT min(n), avg(n), max(n) FROM\n", | |
"(SELECT count(*) AS n FROM languoid AS p\n", | |
"JOIN tree AS w ON w.parent=p.id AND w.terminal\n", | |
"JOIN languoid AS c ON w.child=c.id AND c.level='Language'\n", | |
"WHERE p.level='LanguageFamily' GROUP BY p.id)\"\"\").fetchone()\n", | |
"\n", | |
"conn.execute(\"\"\"SELECT p.label, count(*) AS n FROM languoid AS p\n", | |
"JOIN tree AS w ON w.parent=p.id AND w.terminal\n", | |
"JOIN languoid AS c ON w.child=c.id AND c.level='Language'\n", | |
"WHERE p.level='LanguageFamily' GROUP BY p.id ORDER BY n DESC LIMIT 3\"\"\").fetchall()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Determine the languages with the most dialects." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 58, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[(u'Gumuz', 19), (u'Basque', 11), (u'Kunama', 9), (u'Berta', 7)]" | |
] | |
}, | |
"execution_count": 58, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"conn.execute(\"\"\"SELECT p.label, count(*) AS n FROM languoid AS p\n", | |
"JOIN tree AS w ON w.parent=p.id AND w.terminal\n", | |
"JOIN languoid AS c ON w.child=c.id AND c.level='Dialect'\n", | |
"WHERE p.level='Language' GROUP BY p.id ORDER BY n DESC LIMIT 4\"\"\").fetchall()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Display some of the longest paths." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 59, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Atlantic-Congo <- Volta-Congo <- North Volta-Congo <- Gur <- Central Gur <- Northern Central Gur <- Bwamu-Oti-Volta <- Oti-Volta <- Nuclear Oti-Volta <- Gurma-Yom-Oti-Volta Occidental <- Western Oti-Volta <- Nuclear Oti-Volta Occidental <- Northwest Oti-Volta <- Safaliba-Dagaare <- Dagaaric <- North-West Dagaric <- Birifor <- Malba Birifor <= Birifor\n", | |
"\n", | |
"Atlantic-Congo <- Volta-Congo <- North Volta-Congo <- Gur <- Central Gur <- Northern Central Gur <- Bwamu-Oti-Volta <- Oti-Volta <- Nuclear Oti-Volta <- Gurma-Yom-Oti-Volta Occidental <- Western Oti-Volta <- Nuclear Oti-Volta Occidental <- Northwest Oti-Volta <- Safaliba-Dagaare <- Dagaaric <- Central-South Dagaric <- South Dagaric <- Wali (Ghana) <= 'Bulengee\n", | |
"\n", | |
"Atlantic-Congo <- Volta-Congo <- North Volta-Congo <- Gur <- Central Gur <- Northern Central Gur <- Bwamu-Oti-Volta <- Oti-Volta <- Nuclear Oti-Volta <- Gurma-Yom-Oti-Volta Occidental <- Western Oti-Volta <- Nuclear Oti-Volta Occidental <- Northwest Oti-Volta <- Safaliba-Dagaare <- Dagaaric <- Central-South Dagaric <- South Dagaric <- Wali (Ghana) <= Cherii\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"for child, path in conn.execute(\"\"\"SELECT c.label, (SELECT group_concat(parent, ' <- ')\n", | |
" FROM (SELECT g.child AS child , p.label AS parent\n", | |
" FROM tree AS g JOIN languoid AS p ON g.parent=p.id\n", | |
" WHERE child=c.id ORDER BY g.steps DESC)\n", | |
" GROUP BY child)\n", | |
"FROM languoid AS c JOIN tree AS w ON w.child=c.id AND w.terminal\n", | |
"ORDER BY w.steps DESC, c.id LIMIT 3\"\"\"):\n", | |
" print '%s <= %s\\n' % (path, child)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Note that with SPARQL the [number of steps is not available](http://www.w3.org/TR/sparql11-property-paths/#Outstanding_Issues), so it might be [difficult](http://stackoverflow.com/questions/5198889/calculate-length-of-path-between-nodes) to get the path in the right order like this." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Analysis with pandas" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Activate [inline plotting](http://nbviewer.ipython.org/github/jrjohansson/scientific-python-lectures/blob/master/Lecture-4-Matplotlib.ipynb#The-IPython-notebook-inline-backend) in this notebook." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 60, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"%matplotlib inline" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Load the language labels into a `pandas` ([docs](http://pandas.pydata.org/pandas-docs/stable/)) `DataFrame`. Display the result." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 61, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>lang</th>\n", | |
" <th>label</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>id</th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>aari1239</th>\n", | |
" <td>an</td>\n", | |
" <td>Luenga aari</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>aari1239</th>\n", | |
" <td>ar</td>\n", | |
" <td>لغة آري</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>aari1239</th>\n", | |
" <td>en</td>\n", | |
" <td>Aari language</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>aari1239</th>\n", | |
" <td>en</td>\n", | |
" <td>Ara</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>aari1239</th>\n", | |
" <td>en</td>\n", | |
" <td>Ari</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>aari1239</th>\n", | |
" <td>en</td>\n", | |
" <td>Ari-Galila</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>aari1239</th>\n", | |
" <td>en</td>\n", | |
" <td>Aro</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>...</th>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>zuoj1238</th>\n", | |
" <td>en</td>\n", | |
" <td>Zuojiang</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>zuoj1238</th>\n", | |
" <td>x-clld</td>\n", | |
" <td>zuoj1238</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>zyph1238</th>\n", | |
" <td>br</td>\n", | |
" <td>Zac'hringeg</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>zyph1238</th>\n", | |
" <td>en</td>\n", | |
" <td>Zophei</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>zyph1238</th>\n", | |
" <td>en</td>\n", | |
" <td>Zoptei</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>zyph1238</th>\n", | |
" <td>en</td>\n", | |
" <td>Zyphe language</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>zyph1238</th>\n", | |
" <td>x-clld</td>\n", | |
" <td>zyph1238</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"<p>65685 rows × 2 columns</p>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" lang label\n", | |
"id \n", | |
"aari1239 an Luenga aari\n", | |
"aari1239 ar لغة آري\n", | |
"aari1239 en Aari language\n", | |
"aari1239 en Ara\n", | |
"aari1239 en Ari\n", | |
"aari1239 en Ari-Galila\n", | |
"aari1239 en Aro\n", | |
"... ... ...\n", | |
"zuoj1238 en Zuojiang\n", | |
"zuoj1238 x-clld zuoj1238\n", | |
"zyph1238 br Zac'hringeg\n", | |
"zyph1238 en Zophei\n", | |
"zyph1238 en Zoptei\n", | |
"zyph1238 en Zyphe language\n", | |
"zyph1238 x-clld zyph1238\n", | |
"\n", | |
"[65685 rows x 2 columns]" | |
] | |
}, | |
"execution_count": 61, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"import pandas as pd\n", | |
"\n", | |
"pd.set_option('max_rows', 15)\n", | |
"\n", | |
"labels = pd.read_sql_query(\"\"\"SELECT label.*\n", | |
"FROM label JOIN languoid ON label.id=languoid.id\n", | |
"WHERE languoid.level='Language' ORDER BY label.id\"\"\", conn, index_col='id')\n", | |
"\n", | |
"labels" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Break the number of labels down by the language of the label (the `lang` column), plotting only label languages with more than 400 labels." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 62, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYEAAAEgCAYAAABSGc9vAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAHPVJREFUeJzt3X+wXGWd5/H3BxiIPwIFoyEajOBCMOhaECHqurs0UkNA\ntyDrSvYqK6EMsy4/RlZrrUncmcrFcteFKjA6s8nuCCNJRo2RWZao2RAZaH/sIAkCBkyEW6VBbiSZ\nGSKMP3acJHz2j35uPLnc5PbN7Xu7yfm8qro4/e3ndD+nO/TnnOec249sExER9XRUtzsQERHdkxCI\niKixhEBERI0lBCIiaiwhEBFRYwmBiIgaazsEJB0l6RFJ68r9pZIGJT1cbhdX2i6RNCBpm6SLKvU5\nkrZIelLSskr9WElryjoPSJrZqQ2MiIiDG8uRwA3AD4fVbrU9p9w2AEiaDSwAZgOXAMslqbRfASyy\nPQuYJWleqS8Cdts+A1gG3Hx4mxMREWPRVghIOgV4N3Db8IdGaH4ZsMb2XtvbgQFgrqTpwFTbm0u7\nVcD8yjory/KdwIVtb0FERBy2do8EPgN8HBj+58XXS3pU0m2STii1GcDTlTY7Sm0GMFipD5baAevY\n3gc8J+mktrciIiIOyzGjNZD0HmCX7UclNSoPLQc+aduSPgXcAlzdoX6NdISBpPzGRUTEYbA94vdq\nO0cC7wQulfRj4MvAuyStsv23/u0PD30emFuWdwCvq6x/SqkdrH7AOpKOBo63vfsgG9KR29KlSzv2\nXOlT+tSLferVfqVPk9+nQxk1BGx/wvZM228A+oD7bF9ZxviHvBd4vCyvA/rKFT+nAacDm2zvBJ6X\nNLecKL4SuLuyzsKyfDlw32j9ioiI8Rt1OOgQbpZ0NvACsB34MIDtrZLWAluBPcC1/m0UXQfcAUwB\n1rtcUQTcDqyWNAA8SytsIiJigo0pBGx/C/hWWb7yEO0+DXx6hPr3gX86Qv03tC4rnTSNRmMyX64t\n6VN70qf29WK/0qf2TFafNNp4US+R5JdSfyMieoEkPI4TwxERcYRKCERE1FhCICKixhICERE1lhCI\niKixhEBERI0lBCIiaiwhEBFRYwmBiIgaSwhERNRYQiAiosYSAhERNZYQiIiosYRARESNHXEhMH36\nqUjqyG369FO7vTkREROq7RCQdJSkhyWtK/dPlLRR0hOS7pF0QqXtEkkDkrZJuqhSnyNpi6QnJS2r\n1I+VtKas84CkmYe7Qbt2PQW4I7fWc0VEHLnGciRwA60pI4csBu61fSatOYGXAEg6i9YsYbOBS4Dl\nZU5hgBXAItuzgFmS5pX6ImC37TOAZcDNh7k9ERExBm2FgKRTgHcDt1XKlwEry/JKYH5ZvhRYY3uv\n7e3AADC3TEw/1fbm0m5VZZ3qc90JXDj2TYmIiLFq90jgM8DHaY2TDDnZ9i4A2zuBaaU+A3i60m5H\nqc0ABiv1wVI7YB3b+4DnJJ3U/mZERMThGHWieUnvAXbZflRS4xBNOzn574hzYQL09/fvX240Gj05\nQXRERDc1m02azWZbbUedaF7SfwX+HbAXeBkwFbgLOBdo2N5Vhnrutz1b0mLAtm8q628AlgJPDbUp\n9T7gfNvXDLWx/aCko4FnbE8b1pW2JppvnX7oVB6JTGwfES9145po3vYnbM+0/QagD7jP9geBrwFX\nlWYLgbvL8jqgr1zxcxpwOrCpDBk9L2luOVF85bB1Fpbly2mdaI6IiAk26nDQIfw3YK2kD9Hay18A\nYHurpLW0riTaA1xb2X2/DrgDmAKst72h1G8HVksaAJ6lFTYRETHBRh0O6iUZDoqIGLtxDQdFRMSR\nKyEQEVFjCYGIiBpLCERE1FhCICKixhICERE1lhCIiKixhEBERI0lBCIiaiwhEBFRYwmBiIgaSwhE\nRNRYQiAiosYSAhERNZYQiIiosYRARESNjRoC
ko6T9KCkRyQ9JmlpqS+VNCjp4XK7uLLOEkkDkrZJ\nuqhSnyNpi6QnJS2r1I+VtKas84CkmZ3e0IiIeLF25hj+DXCB7XOAs4FLJM0tD99qe065bQCQNJvW\nVJOzgUuA5WVOYYAVwCLbs4BZkuaV+iJgt+0zgGXAzR3avoiIOIS2hoNs/7osHkdrXuKhORdHmq7s\nMmCN7b22twMDwFxJ04GptjeXdquA+ZV1VpblO4ELx7IRERFxeNoKAUlHSXoE2Al8s/JFfr2kRyXd\nJumEUpsBPF1ZfUepzQAGK/XBUjtgHdv7gOcknXQ4GxQREe07pp1Gtl8AzpF0PHCXpLOA5cAnbVvS\np4BbgKs71K8RJ0QG6O/v37/caDRoNBodesmIiCNDs9mk2Wy21Va2R29VXUH6Y+BXtm+t1F4PfM32\nWyQtBmz7pvLYBmAp8BRwv+3Zpd4HnG/7mqE2th+UdDTwjO1pI7y2R+tv6/TD2LbpEM/GWN+fiIhe\nIwnbI+5ct3N10KuGhnokvQz4PeBHZYx/yHuBx8vyOqCvXPFzGnA6sMn2TuB5SXPLieIrgbsr6yws\ny5cD941pCyMi4rC0Mxz0GmClpKNohcZXbK+XtErS2cALwHbgwwC2t0paC2wF9gDXVnbfrwPuAKYA\n64euKAJuB1ZLGgCeBfo6sXEREXFoYx4O6qYMB0VEjN24hoMiIuLIlRCIiKixhEBERI0lBCIiaiwh\nEBFRYwmBiIgaSwhERNRYQiAiosYSAhERNZYQiIiosYRARESNJQQiImosIRARUWMJgYiIGksIRETU\nWEIgIqLG2ple8jhJD0p6RNJjkpaW+omSNkp6QtI9Q1NQlseWSBqQtE3SRZX6HElbJD0paVmlfqyk\nNWWdByTN7PSGRkTEi40aArZ/A1xg+xzgbOASSXOBxcC9ts+kNSfwEgBJZwELgNnAJcDyMqcwwApg\nke1ZwCxJ80p9EbDb9hnAMuDmTm1gREQcXFvDQbZ/XRaPozUvsYHLgJWlvhKYX5YvBdbY3mt7OzAA\nzC0T00+1vbm0W1VZp/pcdwIXHtbWRETEmLQVApKOkvQIsBP4ZvkiP9n2LgDbO4FppfkM4OnK6jtK\nbQYwWKkPltoB69jeBzwn6aTD2qKIiGjbMe00sv0CcI6k44G7JL2JF8/m3skZ2UecEBmgv79//3Kj\n0aDRaHTwZSMiXvqazSbNZrOttrLH9t0t6Y+BXwNXAw3bu8pQz/22Z0taDNj2TaX9BmAp8NRQm1Lv\nA863fc1QG9sPSjoaeMb2tBFe26P1t3X6oVN5JMb6/kRE9BpJ2B5x57qdq4NeNXTlj6SXAb8HbAPW\nAVeVZguBu8vyOqCvXPFzGnA6sKkMGT0vaW45UXzlsHUWluXLaZ1ojoiICdbOcNBrgJWSjqIVGl+x\nvV7S94C1kj5Eay9/AYDtrZLWAluBPcC1ld3364A7gCnAetsbSv12YLWkAeBZoK8jWxcREYc05uGg\nbspwUETE2I1rOCgiIo5cCYGIiBpLCERE1FhCICKixhICERE1lhCIiKixhEBERI0lBCIiaiwhEBFR\nYwmBiIgaSwhERNRYQiAiosYSAhERNZYQiIiosYRARESNtTOz2CmS7pP0Q0mPSfqDUl8qaVDSw+V2\ncWWdJZIGJG2TdFGlPkfSFklPSlpWqR8raU1Z5wFJMzu9oRER8WLtHAnsBT5m+03AO4DrJb2xPHar\n7TnltgFA0mxas4zNBi4BlpfpJAFWAItszwJmSZpX6ouA3bbPAJYBN3di4yIi4tBGDQHbO20/WpZ/\nSWt+4Rnl4ZFmqrkMWGN7r+3twAAwt0xGP9X25tJuFTC/ss7KsnwncOFhbEtERIzRmM4JSDoVOBt4\nsJSul/SopNuGJqOnFRBPV1bbUWozgMFKfZDfhsn+dWzvA56TdNJY+hYREWPXdghIeiWtvfQbyhHB\ncuANts8G
dgK3dLBfI86FGRERnXVMO40kHUMrAFbbvhvA9t9Wmnwe+FpZ3gG8rvLYKaV2sHp1nZ9J\nOho43vbukfrS39+/f7nRaNBoNNrZhIiI2mg2mzSbzbbayvbojaRVwN/Z/lilNt32zrL8UeA82x+Q\ndBbwReBttIZ5vgmcYduSvgd8BNgMfAP4nO0Nkq4F3mz7Wkl9wHzbfSP0w6P1t3UOevRtao9o5/2J\niOhlkrA94gjLqEcCkt4JXAE8JukRWt+wnwA+IOls4AVgO/BhANtbJa0FtgJ7gGsr39zXAXcAU4D1\nQ1cUAbcDqyUNAM8CLwqAiIjovLaOBHpFjgQiIsbuUEcC+YvhiIgaSwhERNRYQiAiosYSAhERNZYQ\niIiosYRARESNJQQiImosIRARUWMJgYiIGksIRETUWEIgIqLGEgIRETWWEIiIqLGEQEREjSUEIiJq\nLCEQEVFjo4aApFMk3Sfph5Iek/SRUj9R0kZJT0i6R9IJlXWWSBqQtE3SRZX6HElbJD0paVmlfqyk\nNWWdByTN7PSGRkTEi7VzJLAX+JjtNwHvAK6T9EZgMXCv7TOB+4AlAGWO4QXAbOASYLla030BrAAW\n2Z4FzJI0r9QXAbttnwEsA27uyNZFRMQhjRoCtnfafrQs/xLYBpwCXAasLM1WAvPL8qXAGtt7bW8H\nBoC5kqYDU21vLu1WVdapPtedwIXj2aiIiGjPmM4JSDoVOBv4HnCy7V3QCgpgWmk2A3i6stqOUpsB\nDFbqg6V2wDq29wHPSTppLH2LiIixO6bdhpJeSWsv/Qbbv5Q0fAb2Ts7IPuKEyAD9/f37lxuNBo1G\no4MvGxHx0tdsNmk2m221lT36d7ekY4CvA//H9mdLbRvQsL2rDPXcb3u2pMWAbd9U2m0AlgJPDbUp\n9T7gfNvXDLWx/aCko4FnbE8boR8erb+t0w+dyiPRzvsTEdHLJGF7xJ3rdoeD/hzYOhQAxTrgqrK8\nELi7Uu8rV/ycBpwObCpDRs9LmltOFF85bJ2FZflyWieaIyJigo16JCDpncC3gcdo7WIb+ASwCVgL\nvI7WXv4C28+VdZbQuuJnD63ho42l/lbgDmAKsN72DaV+HLAaOAd4FugrJ5WH9yVHAhERY3SoI4G2\nhoN6RUIgImLsOjEcFBERR6CEQEREjSUEIiJqLCEQEVFjCYGIiBpLCERE1FhCICKixhICERE1lhCI\niKixhEBERI0lBCIiaiwhEBFRYwmBiIgaSwhERNRYQiAiosYSAhERNTZqCEi6XdIuSVsqtaWSBiU9\nXG4XVx5bImlA0jZJF1XqcyRtkfSkpGWV+rGS1pR1HpA0s5MbGBERB9fOkcAXgHkj1G+1PafcNgBI\nmg0sAGYDlwDLy3zCACuARbZnAbMkDT3nImC37TOAZcDNh785ERExFqOGgO3vAj8f4aGRpiq7DFhj\ne2+ZI3gAmCtpOjDV9ubSbhUwv7LOyrJ8J3Bh+92PiIjxGM85geslPSrpNkknlNoM4OlKmx2lNgMY\nrNQHS+2AdWzvA56TdNI4+hUREW065jDXWw580rYlfQq4Bbi6Q30acTLkIf39/fuXG40GjUajQy8b\nEXFkaDabNJvNttrK9uiNpNcDX7P9lkM9JmkxYNs3lcc2AEuBp4D7bc8u9T7gfNvXDLWx/aCko4Fn\nbE87SD88Wn9bpyBG36b2iHben4iIXiYJ2yPuYLc7HCQqe+hljH/Ie4HHy/I6oK9c8XMacDqwyfZO\n4HlJc8uJ4iuBuyvrLCzLlwP3tdmniIgYp1GHgyR9CWgAvyvpp7T27C+QdDbwArAd+DCA7a2S1gJb\ngT3AtZVd9+uAO4ApwPqhK4qA24HVkgaAZ4G+jmxZRESMqq3hoF6R4aCIiLHrxHBQREQcgRICERE1\nlhCIiKixhEBERI0lBCIiaiwhEBFRYwmBiIgaSwhERNRYQiAiosYSAhERNZ
YQiIiosYRARESNJQQi\nImosIRARUWMJgYiIGksIRETU2KghIOl2SbskbanUTpS0UdITku6RdELlsSWSBiRtk3RRpT5H0hZJ\nT0paVqkfK2lNWecBSTM7uYEREXFw7RwJfAGYN6y2GLjX9pm05gReAiDpLGABMBu4BFhe5hQGWAEs\nsj0LmCVp6DkXAbttnwEsA24ex/ZERMQYjBoCtr8L/HxY+TJgZVleCcwvy5cCa2zvtb0dGADmlonp\np9reXNqtqqxTfa47gQsPYzsiIuIwHO45gWm2dwHY3glMK/UZwNOVdjtKbQYwWKkPltoB69jeBzwn\n6aTD7FdERIzBMR16nk7Oxj7iZMhD+vv79y83Gg0ajUYHXzoi4qWv2WzSbDbbanu4IbBL0sm2d5Wh\nnr8p9R3A6yrtTim1g9Wr6/xM0tHA8bZ3H+yFqyEQEREvNnwH+cYbbzxo23aHg8SBe+jrgKvK8kLg\n7kq9r1zxcxpwOrCpDBk9L2luOVF85bB1Fpbly2mdaI6IiEkg+9AjOZK+BDSA3wV2AUuB/w18ldYe\n/FPAAtvPlfZLaF3xswe4wfbGUn8rcAcwBVhv+4ZSPw5YDZwDPAv0lZPKI/XFbfSXzo1OidFeLyKi\n10nC9ohD7aOGQC9JCEREjN2hQiB/MRwRUWMJgYiIGksIRETUWEJgkkyffiqSxn2bPv3Ubm9KRBxB\ncmL40M/WsRPDnetXTlZHxNjkxHBERIwoIRARUWMJgYiIGksIRETUWEIgIqLGEgIRETWWEIiIqLGE\nQEREjSUEIiJqLCEQEVFjCYGIiBobVwhI2i7pB5IekbSp1E6UtFHSE5LukXRCpf0SSQOStkm6qFKf\nI2mLpCclLRtPnyIion3jPRJ4AWjYPsf23FJbDNxr+0xa8wUvAZB0FrAAmA1cAiwv8w0DrAAW2Z4F\nzJI0b5z9ioiINow3BDTCc1wGrCzLK4H5ZflSYI3tvWUO4QFgrqTpwFTbm0u7VZV1IiJiAo03BAx8\nU9JmSVeX2sm2dwHY3glMK/UZwNOVdXeU2gxgsFIfLLWIiJhgx4xz/XfafkbSq4GNkp7gxT+a39Ef\nv+/v79+/3Gg0aDQanXz6iIiXvGazSbPZbKttxyaVkbQU+CVwNa3zBLvKUM/9tmdLWgzY9k2l/QZg\nKfDUUJtS7wPOt33NCK+RSWUyqUxEjNGETCoj6eWSXlmWXwFcBDwGrAOuKs0WAneX5XVAn6RjJZ0G\nnA5sKkNGz0uaW04UX1lZJyIiJtB4hoNOBu6S5PI8X7S9UdJDwFpJH6K1l78AwPZWSWuBrcAe4NrK\nbv11wB3AFGC97Q3j6FdERLQpcwwf+tkyHBQRL3mZYzgiIkaUEIiIqLGEQEREjSUEIiJqLCEQEVFj\nCYGIiBpLCERE1FhCICKixhICERE1lhCIiKixhEBERI0lBCIiaiwhEBFRYwmBiIgaSwhERNRYz4SA\npIsl/UjSk5L+sNv9qYPp009F0rhv06ef2u1NiYjD1BMhIOko4E+BecCbgPdLeuPEvmpzYp/+sDQn\n9dV27XqK1kQ3h7rdP2qb1vN0RqeCabLDqd1JvSdbL/YrfWrPZPWpJ0IAmAsM2H7K9h5gDXDZxL5k\nc2Kf/rA0u92BETQn9dXaC6albbTpXDi1E0wXXHDBpAZTu2HZTr86GZadeq+O9D61o24hMAN4unJ/\nsNQiuq4Xg6m9PrXXr04eyXXqvTrS+9ROMN14442TsmPRKyEQEVEbvbRj0RMTzUt6O9Bv++JyfzFg\n2zcNa9f9zkZEvAQdbKL5XgmBo4EngAuBZ4BNwPttb+tqxyIijnDHdLsDALb3Sboe2EhriOr2BEBE\nxMTriSOBiIjojpwYjoiosYRARESNJQRiP0lHS/pit/sxnKSjJC3odj8ijkS1OScg6dXA7wOnUjkh\nbvtDXejLnEM9bvvhyerLcJK+C7zL9j
92qw8jkfSQ7XO73Y8qSV+gdbH2Abrxbwp6/t/VX9m+cLTa\nJPfpBuALwC+A24BzgMW2N3axT7OAFcDJtt8s6S3ApbY/NVGv2RNXB02Su4HvAPcC+7rcl1vKf6cA\n5wI/AAS8BXgIeEeX+gXwY+D/SloH/GqoaPvW7nUJgHsl/SfgKxzYr93d6xJfryxPAf418LMu9QV+\n++8KDgwnlfvvmtzugKQpwMuBV0k6sfQF4Hi6/6sAH7L9WUnzgBOBDwKraV2l2C2fBz4O/E8A21sk\nfQlICHTAy233xK+T2r4AQNL/AubYfqzcfzPQ340+SVpt+4PApcBnaA0VTu1GXw7i39L6Irt2WP0N\nXegLALb/snpf0peB73apO9V/Vy+j9T79c1rv2Xdo7V12w4eB/wi8Fvh+pf4LWj8a2U1DgfQeYLXt\nH0oa8Q+qJtHLbW8a1o29E/mCdQqBr0t6t+313e5IxZlDAQBg+3FJs7vUl7dKei3wU+BPutSHQzmL\nF3+x/Y+u9ujFzgCmdbsTwErg74HPlfsfAFYBk35exfZngc9K+gPgWA78/G6b7P4M831J99DakVgs\naSrwQpf79HeS/gnlSE7S+2j9Ae2EqdM5gV/QOiz9R2AP5RDZ9vFd7NOXaQ1t/EUpXQG80vb7u9CX\njwDXAKdx4JDG0PvUtT1uAElraX2xDZ24/gBwgu2unDAue4z7gF9WyjuBJcOPECabpK22zxqtNsl9\n+irwPD3y+ZU+HQX8EXCi7Y9Kmgm83vZ3utinNwB/Bvwz4OfAT4ArbHfu1+uGv2aNQuAoWl+yp9n+\nZPnAX2P7wS72aQqtL95/WUrfBlbY/ocu9mmF7Wu69foH06NfbI/bfnO3Xv9gJP0F8Ke2v1fuvw24\nzvaVXexTL35+K2jt+b/L9uxyzmKj7fO60JePDSu9jNaQ7K9gYs/J1Wk46L9TPnDgk7TGJP8SmPQP\nfEj5sv9MufWEXgyA4mFJbx/2xfZQl/v0fUnn2d7c5X4M91bgryX9tNyfCTwh6TFaR3Vv6UKfevHz\ne5vtOZIeAbD9c0nHdqkvQ+ffzqT1nXQ3raPwD9L6LbUJU6cQ6JkPfOh/xoM93qX/SXtS5b36HX77\nxWbg9cCPutk34G3AFZKeorXHNjR01u3P7+Iuv/5IejGY9pQfrxwaf381XTonYPvG0odv07pY5Bfl\nfj/wjYl87TqFQM984MC/6tLrvhT18ns1r9sdGMlEjh+PQy8G0+eAu4Bpkv4L8D5a5wi66WRa5y2H\n/GOpTZg6nRO4gtZlhnNoXT3xPuCPbH+1qx2LiK5Ray7zC2kdxf1Vt3+9WNJ/pnUV112lNB/4iu1P\nT9hr1iUEoHc+8HKl0khvfNevWIqI7ip/+f0vyt1v235kQl+vTiEQEREHyg/IdZGkt5c/UBm6P7Vc\nNRERMSlyJNBF5UqlOS4fQvlbhodsH/KHwCIiOiVHAt0lV1LY9gvU64qtiOiyhEB3/VjSRyT9Trnd\nQOtXPCMiJkVCoLv+A63fCNkBDNL646N/39UeRUSt5JxARESN5UigR0jq2qxPEVFfCYHe0e3JLCKi\nhhICXSSp+jO63yi1Rnd6ExF1lHMCXSTpcVpzmt5M6/fDbwLOtd3NOYYjokZyJNBdbwNeB/w1rd8M\n/xnwzq72KCJqJSHQXXuA/0frKGAK8JPyB2MREZMiIdBdm2mFwHm0fjXw/WUu1oiISZFzAl0k6Vzb\nDw2rfdD26m71KSLqJSEQEVFjGQ6KiKixhEBERI0lBCIiaiwhEDGKMid0xBEpIRAxulw9EUeshEBE\nmyS9QtK9kh6S9ANJl5b66yVtlfRnkh6XtEHSceWx80rbhyXdLOmx7m5FxIESAhHt+wdgvu1zgXcB\nt1QeOx34E9tvBp4H/k2p/znw+2Xe6H3kqCJ6TEIgon0CPi3pB8C9wGslTSuP/cT20F7+94FTJZ0A\nvN
L2plL/0uR2N2J0mdQ8on1XAK8CzrH9gqSf0PrNJ4DfVNrtq9QzT0T0tBwJRIxu6Iv8BOBvSgBc\nALx+hDb72X4e+HtJ55VS38R2M2LsciQQMbqhcfwvAl8rw0EPAdtGaDPc1cBtkvYB36J1viCiZ+S3\ngyImkKRX2P5VWf5DYLrtj3a5WxH75UggYmK9R9ISWv+vbQeu6mpvIobJkUBERI3lxHBERI0lBCIi\naiwhEBFRYwmBiIgaSwhERNTY/wfN8iQiZTgNlAAAAABJRU5ErkJggg==\n", | |
"text/plain": [ | |
"<matplotlib.figure.Figure at 0x28c083c8>" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"labels_lang = labels.groupby('lang').size().sort_values(ascending=False)\n", | |
"labels_lang[labels_lang > 400].plot.bar();" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Show summary statistics on the **number of labels per languoid** (language level only). Plot how often each label count occurs, restricted to counts shared by more than 30 languoids." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 63, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"count 8418.000000\n", | |
"mean 7.802922\n", | |
"std 10.729925\n", | |
"min 1.000000\n", | |
"25% 2.000000\n", | |
"50% 5.000000\n", | |
"75% 9.000000\n", | |
"max 174.000000\n", | |
"dtype: float64\n" | |
] | |
}, | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAEACAYAAAC9Gb03AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAHURJREFUeJzt3XuQXWWZ7/Hvr7vTSYAkJASSIQkQhECCiIJEQNAWCRcv\nhJpyEM8BBSnGAsR4GYsEq4ZYU8WA5VEUgRkPyACKCM5IImIIDGzGOEA43IIkQEYNhAhNuAQCgaQ7\n+zl/vGvbO013utO9u9e+/D5Vq3r1u9da+8muzrPWft53vUsRgZmZ1a+mvAMwM7Oh5URvZlbnnOjN\nzOqcE72ZWZ1zojczq3NO9GZmda7PRC/pWkntklZ0a79A0ipJT0i6tKx9gaTV2WvHl7UfKmmFpGck\nXV7Zf4aZmfWmP1f01wEnlDdIagM+DRwcEQcD383aZwKnAjOBk4CrJCnb7Wrg7IiYAcyQtM0xzcxs\naPSZ6CNiGfBat+ZzgUsjojPb5uWsfS5wc0R0RsQaYDUwW9JkYExEPJRtdwNwSgXiNzOzPgy0Rj8D\n+IikByTdK+mwrH0KsLZsu3VZ2xTg+bL257M2MzMbYi2D2G98RBwh6XDgVmDfyoVlZmaVMtBEvxb4\nD4CIeEjSVkm7ka7g9yrbbmrWtg6Y1kN7jyR5Ah4zswGICHVv62/pRtlSchtwLICkGUBrRLwCLAY+\nK6lV0nRgP2B5RLwIvC5pdtY5+3lgUR/B1v1y8cUX5x5DtS/+jPz5+DPq/9KbPq/oJd0EtAG7SXoO\nuBj4CXCdpCeAzVniJiJWSroFWAl0AOdF17ufD/wbMAq4IyKW9PXeZmY2eH0m+oj4X728dEYv2/8z\n8M89tD8MHLxD0ZmZ2aD5ztgctbW15R1C1fNntH3+fPrmzwi0vbpOXiRFNcZlZlbNJBGD6Iw1M7Ma\n5URvZlbnnOjNzOqcE72ZWZ1zojczq3NO9GZmdc6J3syszjnRm5nVOSd6M7M650RvZlbnnOjNzOpc\n1Sb6LVvyjsDMrD5UbaJfsSLvCMzM6kPVJvply/KOwMysPlRtov/1r/OOwMysPlTtfPS77hq89lre\nkZiZ1Y6am4/+9ddh48a8ozAzq319JnpJ10pql/Su7lFJ35BUlDShrG2BpNWSVkk6vqz9UEkrJD0j\n6fK+3xceeWRH/ilmZtaT/lzRXwec0L1R0lRgDvBsWdtM4FRgJnAScJWk0teIq4GzI2IGMEPSu45Z\nrliEQqE//wQzM9uePhN9RCwDeqqWfx/4Zre2ucDNEdEZEWuA1cBsSZOBMRHxULbdDcApfb337bf3\ntYWZmfVlQDV6SScDayPiiW4vTQHWlv2+LmubAjxf1v581rZdK1cOJDozMyvXsqM7SBoNXEQq2wyp\nt9+G9eth992H+p3MzOrXDid64D3APsDjWf19KvCIpNmkK/i9yradmrWtA6b10L4dC4mAr34Vzjmn\njba2tgGEamZWvwqFAoV+dGb2axy9pH2AX0fEwT289mfg0Ih4TdIs4GfAh0ilmbuA/SMiJD0AfAV4\nCPgN8MOIWNLL+wWkuObNg8v7HKNjZmYDHkcv6Sbgv0kjZZ6TdFa3TQIQQESsBG4BVgJ3AOdF15nk\nfOBa4BlgdW9JvrulS/uzlZmZ9aZq74wtXdGPGAGbN6dx9WZm1ruauzO2ZOtWePbZvrczM7OeVX2i\nLxZh+fK8ozAzq11Vn+gBfvObvCMwM6tdNZHo77sv7wjMzGpX1XfGAjQ3pw7Z5uYcgzIzq3I12xkL\nEAFPPZV3FGZmtakmEn2xCPffn3cUZma1qSYSPfjRgmZmA1UTNXqAPfaA9vacAjIzqwG91ehrJtFL\naTbLkSNzCsrMrMrVdGcspET/+ON5R2FmVntqJtEXi7BsWd5RmJnVnppJ9ACLF+cdgZlZ7amZGj3A\nuHGwYUMOAZmZ1YCar9EDvPFGWszMrP9qKtFL
8PDDeUdhZlZbairRF4tw7715R2FmVltqKtGDpyw2\nM9tRNdUZCzB6NGzaNMwBmZnVgME8HPxaSe2SVpS1fUfSKkmPSfp3SWPLXlsgaXX2+vFl7YdKWiHp\nGUmXD/Qfsnmzp0IwM9sR/SndXAec0K1tKXBQRLwfWA0sAJA0CzgVmAmcBFwl/fWx3lcDZ0fEDGCG\npO7H7JcIP1rQzGxH9JnoI2IZ8Fq3trsjopj9+gAwNVs/Gbg5IjojYg3pJDBb0mRgTEQ8lG13A3DK\nQAKOgKVLB7KnmVljqkRn7BeBO7L1KcDastfWZW1TgOfL2p/P2gbEid7MrP9aBrOzpG8BHRHx8wrF\nU2Zh2XpbtiR//nO6ste7uhzMzBpHoVCgUCj0uV2/Rt1I2hv4dUS8r6ztTOAc4NiI2Jy1zQciIi7L\nfl8CXAw8C9wbETOz9tOAj0bEub28X6+jbgCammD1ath33z5DNzNrGIOdAkHZUjrYicA3gZNLST6z\nGDhNUquk6cB+wPKIeBF4XdLsrHP288CiAf5bKBbhwQcHureZWWPpz/DKm4D/Jo2UeU7SWcAVwC7A\nXZIekXQVQESsBG4BVpLq9udF11eG84FrgWeA1RGxZDCB+8YpM7P+qbkbpkqmTYPnnhumgMzMakDN\nP0qwu6amdPNUy6C6k83M6kddTFPc3apVeUdgZlb9ajbRF4tw//15R2FmVv1qNtGDHy1oZtYfNVuj\nB5g4EdavH4aAzMxqQN11xqbt0pTFo0YNQ1BmZlWuLjtjJXjssbyjMDOrbjWd6ItFWLYs7yjMzKpb\nTSd6gEUDnkjBzKwx1HSNHmDsWHj99SEOyMysBtRljR5g40bYsCHvKMzMqlfNJ3qAhx/OOwIzs+pV\n84k+Au69N+8ozMyqV80nevCUxWZm21PznbGQbph6++0hDMjMrAbUbWcswJYt8MILeUdhZlad6iLR\nR8Dy5XlHYWZWneom0d95Z95RmJlVp7pI9AB33ZV3BGZm1ak/Dwe/VlK7pBVlbeMlLZX0tKQ7JY0r\ne22BpNWSVkk6vqz9UEkrJD0j6fJK/0PWrElX9mZmtq3+XNFfB5zQrW0+cHdEHADcAywAkDQLOBWY\nCZwEXCWp1AN8NXB2RMwAZkjqfsxBKRbhj3+s5BHNzOpDn4k+IpYBr3Vrngtcn61fD5ySrZ8M3BwR\nnRGxBlgNzJY0GRgTEQ9l291Qtk9FFIvw4IOVPKKZWX0YaI1+j4hoB4iIF4E9svYpwNqy7dZlbVOA\n58van8/aKur22yt9RDOz2tdSoeMMQXV8Ydl6W7Zs3+9+V/kozMyqVaFQoFAo9LndQBN9u6RJEdGe\nlWVeytrXAdPKtpuatfXWvh0LdzioF16Azk5oqdTpy8ysirW1tdHW1vbX37/97W/3uF1/SzfKlpLF\nwJnZ+heARWXtp0lqlTQd2A9YnpV3Xpc0O+uc/XzZPhUTAf04uZmZNZQ+57qRdBOpbrIb0A5cDNwG\n3Eq6Sn8WODUiNmTbLwDOBjqAeRGxNGs/DPg3YBRwR0TM28577tBcN+XGjIGXX4bW1gHtbmZWs3qb\n66YuJjXbdl847jhYurTCQZmZVbm6ntSsXES6S/b66/ve1sysEdTdFX1JczM89xzsuWeFgjIzq3IN\nc0VfUizCUUd5WgQzs7pN9BHpiv6CC/KOxMwsX3Vbuuk6Ftx3HxxzTEUOZ2ZWtRpm1E1PdtopDbkc\nPbpihzQzqzoNV6Mv9/bbMGdO3lGYmeWjIRJ9BPz+93DllXlHYmY2/BqidFPS1AT/8z8wfXrFD21m\nlruGrtF3HRcmTYJ161LSNzOrJw1doy+JgPZ2+OIX847EzGz4NNQVfbklS+CEij7M0MwsXy7ddDNy\nJKxfn2a7NDOrBy7ddLNlC5TN129mVrcaNtFHwCOPwKWX5h2JmdnQatjSTdd7wZNPwsyZw/J2ZmZD\nxjX6Xt8L
JkxIo3Gam4flLc3MhoRr9L2IgFdfhc98Ju9IzMyGRsMnekjJ/rbb4Ior8o7EzKzyBpXo\nJX1N0h8krZD0M0mtksZLWirpaUl3ShpXtv0CSaslrZJ0/ODDr6x58+CBB/KOwsyssgZco5e0J7AM\nODAitkj6BXAHMAt4JSK+I+lCYHxEzJc0C/gZcDgwFbgb2D96CGA4a/TdjRyZpkjYbbdc3t7MbMCG\nqkbfDOwsqQUYDawD5gKlR3NfD5ySrZ8M3BwRnRGxBlgNzB7k+1dcRwccckh6FKGZWT0YcKKPiL8A\n/wd4jpTgX4+Iu4FJEdGebfMisEe2yxRgbdkh1mVtVaVYhL/8BT75ybwjMTOrjJaB7ihpV9LV+97A\n68Ctkv437665DLAGs7BsvS1bhkdEmgvnkkvgoouG7W3NzHZIoVCgUCj0ud1gavSfAU6IiHOy388A\njgCOBdoiol3SZODeiJgpaT4QEXFZtv0S4OKIeLCHY+dWo982Dli6FI47Lu9IzMz6NhQ1+ueAIySN\nkiTg48BKYDFwZrbNF4BF2fpi4LRsZM50YD9g+SDef8hFwCc+kTpnzcxq1YBLNxGxXNIvgUeBjuzn\nj4ExwC2Svgg8C5yabb9S0i2kk0EHcF5PI26qzdat8IEPpGQ/YkTe0ZiZ7biGnwKhPyT48Ifhd7/L\nOxIzs955CoRBiIBly+DCC/OOxMxsx/mKfgfddhvMnZt3FGZm7+bZKyukuRmeeQb23TfvSMzMtuVE\nXyESjB0LL74Io0blHY2ZWRfX6CskAjZuhKOOyjsSM7P+caIfgGIRHnsMzj0370jMzPrm0s0g3Xgj\nnH563lGYmblGP2QkOPZY+NSn0s+DDvIjCc0sH070Q6SpCVpbYfNmaGlJiX/mTJgzB046CWbPhl12\nyTtKM2sETvTDaPRo2LIlJf0I2HNPOPpo+PSn4ZhjYOrUvCM0s3rkRJ+j1taU9Ds60jeAnXdO8+dc\ncgkceWTe0ZlZvXCiryLNzanMs2UL/OEPMGtW3hGZWT1woq9CEuy+O7zwQrrSNzMbDN8wVYUiYP16\nj8c3s6HlK/oqcf/9cMQReUdhZrXMpZsqN3YsvPJKqt2bmQ2ESzdVbuNG+Oxn847CzOqRr+irzJIl\ncMIJeUdhZrXIpZsaMWpUKuHstFPekZhZrRmS0o2kcZJulbRK0pOSPiRpvKSlkp6WdKekcWXbL5C0\nOtv++MG8d73avDnNm2NmVimDrdH/ALgjImYChwBPAfOBuyPiAOAeYAGApFnAqcBM4CTgKknvOvM0\nugi49174+c/zjsTM6sWASzeSxgKPRsR7urU/BXw0ItolTQYKEXGgpPlARMRl2Xa/BRZGxIM9HLth\nSzclI0ZAezuMH593JGZWK4aidDMdeFnSdZIekfRjSTsBkyKiHSAiXgT2yLafAqwt239d1mY96OyE\nj3887yjMrB4MZtR2C3AocH5E/D9J3yeVbbpfig/w0nxh2XpbtjSOCHj0UbjySjj//LyjMbNqVCgU\nKBQKfW43mNLNJOD+iNg3+/1oUqJ/D9BWVrq5NyJm9lC6WQJc7NLN9jU3w3PPpamOzcy2p+Klm6w8\ns1bSjKzp48CTwGLgzKztC8CibH0xcJqkVknTgf2A5QN9/0ZRLMJHPpKu8M3MBmJQ4+glHQJcA4wA\n/gScBTQDtwDTgGeBUyNiQ7b9AuBsoAOYFxFLezmur+jLSPCP/wgLF+YdiZlVM98wVeMkePpp2H//\nvCMxs2rlRF/jJJg0Cf7yl7RuZtadJzWrcRFpXP0FF+QdiZnVGl/R1xgJHnoIDjss70jMrNq4dFNH\ndt0VXnop3T1rZlbi0k0deeMNOOgg2LQp70jMrBY40degYhH+9Kd0E9WaNXlHY2bVzom+Rm3dCm++\nmYZb/ud/5h2NmVUzJ/oatnVrWubMgR/+MO9ozKxauTO2jpx1FvzkJ3lHYW
Z58aibBiDBBz8Iv/+9\nR+SYNSIn+gbR1AS77w6PP57upDWzxuHhlQ2iWISXX4a994blnhvUzHCir0tbt0JHBxx5JPz0p3lH\nY2Z5c6KvU8ViWs44A775zbyjMbM8uUbfACQ47jj47W/TE6vMrD65M7bBNTXBXnvBY4/BuHF5R2Nm\nQ8GdsQ2uWIS1a9NInG9/GzZvzjsiMxsuvqJvME3Zqb21Fb761fSIwtGj843JzCrDpRvbhpSWESPg\nvPPgn/4Jdt4576jMbDCGrHQjqUnSI5IWZ7+Pl7RU0tOS7pQ0rmzbBZJWS1ol6fjBvrcNXEQq52zZ\nAj/4AYwfD+efDxs35h2ZmVVaJWr084CVZb/PB+6OiAOAe4AFAJJmAacCM4GTgKskP/00b6WE39EB\n//qvMGECnH02bNiQd2RmVimDSvSSpgKfAK4pa54LXJ+tXw+ckq2fDNwcEZ0RsQZYDcwezPtbZW3d\nCp2dcMMNaRqF009Pd9maWW0b7BX994Fvsm1BfVJEtANExIvAHln7FGBt2XbrsjarMp2dafnFL2Dy\nZPi7v0sPJjez2tQy0B0lfRJoj4jHJLVtZ9MB9qouLFtvyxYbTp2d6eeiRfCrX8Fll8E3vpFvTGbW\npVAoUCgU+txuwKNuJF0CnA50AqOBMcCvgA8CbRHRLmkycG9EzJQ0H4iIuCzbfwlwcUQ82MOxPeqm\nSl1yCSxYkHcUZtaTio+6iYiLImKviNgXOA24JyLOAH4NnJlt9gVgUba+GDhNUquk6cB+gOdXrDEX\nXQTf+U7eUZjZjhhw6WY7LgVukfRF4FnSSBsiYqWkW0gjdDqA86IaB/Fbny68MI3B92RpZrXBN0zZ\ngH33u67Zm1UTz3VjFfcP/wDf/37eUZhZX5zobVC+/nW4/PK8ozCz7XGit0H72tfghz/MOwoz640T\nvVXEvHlwxRV5R2FmPXGit4r5ylfgyivzjsLMunOit4r68pfhqqvyjsLMyjnRW8Wdfz78y7/kHYWZ\nlTjR25A491z48Y/zjsLMwInehtCXvgTXXNP3dmY2tJzobUidc04aZ1+aCdPMhp+nQLAh19QELS3w\n/vfDZz4Dc+fC/vun+XLMrHL8cHDL3ejRsHlzSvw77QRHHw2f+xyccEJ6opWZDY4TvVWV5mYYMSI9\nnFxKT7I6/ng47TQ45ph0UjCzHeNEb1WttbXrQeUSzJgBp5wCF1yQTgJm1jcneqspo0alq32AE0+E\nq6+GvfbKNyazaudpiq2mvPNOurovFuGuu2D6dJgzB/74x7wjM6s9TvRW9To6UsK/775U0vnoR+Gp\np/KOyqx2ONFbzSgl/Pvvh1mz4IgjYMWKvKMyq34DTvSSpkq6R9KTkp6Q9JWsfbykpZKelnSnpHFl\n+yyQtFrSKknHV+IfYI2noyN13D76aBqbf9hh8PDDeUdlVr0G3BkraTIwOSIek7QL8DAwFzgLeCUi\nviPpQmB8RMyXNAv4GXA4MBW4G9i/pweEuzPWdkRra0r+731vmkztqKN2bP933oFXX4XXXoPXX4dD\nDoGddx6aWM2G0pCPupF0G/CjbPloRLRnJ4NCRBwoaT4QEXFZtv1vgYUR8WAPx3Kitx1WSvgHHADf\n+lZqKyXw9evhpZfSz1degQ0bYONGeOutVA5qbU1375Y6gC+/HP7+7/P995jtqCFN9JL2AQrAe4G1\nETG+7LVXI2KCpCuA+yPipqz9GuCOiPiPHo7nRG8DNmJEStYjR6Y5djo7U6mnpSXdqNXcnMbqF4vp\ntdIwzpKmprT9gQfC7bfDvvvm8+8w21FDNrwyK9v8EpgXEW/y7gztjG3DqqMDtm6FTZtSEi8WU+Lu\n6Ehlmrfegjff7Hq9u9L2q1enOXnmzUvHM6tVLYPZWVILKcnfGBGLsuZ2SZPKSjcvZe3rgGllu0/N\n2nqxsGy9LVvMhk9pxs0f/Qh++lO49V
Y49th8YzIrVygUKBQKfW43qNKNpBuAlyPi62VtlwGvRsRl\nvXTGfgiYAtyFO2OtRjQ3pyv9OXNSwh87Nu+IzN6t4jV6SR8G/gt4gpSVA7gIWA7cQrp6fxY4NSI2\nZPssAM4GOkilnqW9HNuJ3qpSS0uq4X/ve+mRiWbVxHPdmFVIaR79/fdPnbX7759vPGYlnuvGrEIi\n0vLnP6eROeee6ydoWXXzFb3ZIDU1pZr9jTfCJz/pJ2dZfly6MRtCpc7anXdOT84644w0vfKECXlH\nZo3Eid5sGDQ1pbtsS0/OmjYNPvWp9MjE2bNTZ67ZUHGiN8tBa2u60od0EjjssJT05871g1Ss8pzo\nzapA6clZEowfn8bln356OgFMnJhKQGYD5URvVmVKY/K3bu36ucsuqa4/aRJMnZqerLX33rDnnunZ\nuX/zN+mnH55uPXGiN6sBI0akBVLi7+xMpZ8RI9LVfkT6RtDaCrvuCnvskaZV/tjH0reCmTO79rfG\n40RvVkekNDtnU1PXDJ0tLenkMHUqfOADaV6eww+H970Pdtop74htODjRmzWIkSPTzy1b0tX91q2p\n/n/wwel5u0cemU4EHvpZf5zozRpYa2sq/Wze3DXmf+RIGDMmdQpPnJjKQHvuCVOmwO67p7bddks/\nJ05M27mzuLo50ZvZNkpj/puyiVBKfQJbt6aEXuosLr3W0ZE6gceMSZ3Fe+8NM2akZa+90jJtWnq9\nUopFeOONtEyZ4hNNX5zozWzQmpvTyaH0hK7Sg1tGjEhtHR1pfeLElJinT0/zAb3nPekkMHZserRj\n+fLqq9Denh71+MorqW3DhvRwmLff7noyWLEIH/pQuuv45JPTNxDblhO9mQ2L7h3FHR1d3xCam7se\n1VjqRC4NLy29XporqHQiKX+616hRqfzU1JROJCefnBL/Bz/Y9e2jkTnRm1ldKb/ruKUFjjoqJf1P\nfzr1LTQiJ3ozq1ulbxGlu4732itNM3H66WmEUamstHlzWt55p+/1zZth3LiuvocJE6p/ZlInejNr\nGCNHprJQqS8hIq2XSkelBbZN3qV0GLHtfqXy0YQJaWTSPvukvof99us6EUyblv/9Ck70ZtawpK4k\nPhijRnVNV1H69lDqiN6yJb2+++7pZDBhQhqSOmFCKiXtumsakTR2bM8/x4wZ/KgiJ3ozsyHW1JS+\nTZS+SRSL6aRQ6nAu/0ZR3ulcGtra0pKGsI4enZ5tsMsu6UQwblzXSWPixPR7TyeMgw6qkkQv6UTg\nctJjDK+NiMt62MaJ3swaUunmtlJpqVRG6s9JY9OmKnhmrKQm4EfACcBBwOckHTicMVSXQt4B1IBC\n3gFUuULeAdSAQt4B7JAtW9L9A2+9lZZNm1IncflQ02Kxq3P57bfTNps29X7M4R55OhtYHRHPRkQH\ncDMwd5hjqCKFvAOoAYW8A6hyhbwDqAGFvAPI3XAn+inA2rLfn8/azMxsiPheMjOzOjesnbGSjgAW\nRsSJ2e/zgejeIZs6Y83MbEflPupGUjPwNPBx4AVgOfC5iFg1bEGYmTWYluF8s4jYKunLwFK6hlc6\nyZuZDaGqvGHKzMwqx52xOZG0RtLjkh6VtDzvePIm6VpJ7ZJWlLWNl7RU0tOS7pQ0Ls8Y89bLZ3Sx\npOclPZItJ+YZY94kTZV0j6QnJT0h6StZe0P/LTnR56cItEXEByJidt7BVIHrSDfSlZsP3B0RBwD3\nAAuGParq0tNnBPC9iDg0W5YMd1BVphP4ekQcBBwJnJ/dlNnQf0tO9PkR/vz/KiKWAa91a54LXJ+t\nXw+cMqxBVZlePiNIf0sGRMSLEfFYtv4msAqYSoP/LTnR5CeAuyQ9JOmcvIOpUntERDuk/8CAHx7X\nsy9LekzSNY1WktgeSfsA7wceACY18t+SE31+PhwRhwKfIH29PDrvgGqARw6821XAvhHxfuBF4Hs5\nx1
MVJO0C/BKYl13Zd//baai/JSf6nETEC9nP9cCvSPMA2bbaJU0CkDQZeCnneKpORKyPrqFz/xc4\nPM94qoGkFlKSvzEiFmXNDf235ESfA0k7ZVccSNoZOB74Q75RVQWxbb15MXBmtv4FYFH3HRrQNp9R\nlrRK/hb/HQH8BFgZET8oa2vovyWPo8+BpOmkq/gg3bT2s4i4NN+o8iXpJqAN2A1oBy4GbgNuBaYB\nzwKnRsSGvGLMWy+f0cdIdegisAb4UqkW3YgkfRj4L+AJ0v+vAC4i3YV/Cw36t+REb2ZW51y6MTOr\nc070ZmZ1zonezKzOOdGbmdU5J3ozszrnRG9mVuec6M3M6pwTvZlZnfv/ULnuxbWxzxcAAAAASUVO\nRK5CYII=\n", | |
"text/plain": [ | |
"<matplotlib.figure.Figure at 0x8bb7080>" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"nlabels = labels.groupby(level='id').size()\n", | |
"nlabels_hist = nlabels.value_counts().sort_index()\n", | |
"\n", | |
"print(nlabels.describe())\n", | |
"nlabels_hist[nlabels_hist > 30].plot.area();" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Compute summary statistics on the **string length of the labels**." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 64, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"count 65685.000000\n", | |
"mean 9.834315\n", | |
"std 5.168938\n", | |
"min 1.000000\n", | |
"25% 6.000000\n", | |
"50% 8.000000\n", | |
"75% 13.000000\n", | |
"max 65.000000\n", | |
"Name: label, dtype: float64\n" | |
] | |
}, | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYEAAAEACAYAAABVtcpZAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3X+U3HV97/HnazfZJECA8CvRBCEYQ4PaKtbUU9vjtt7y\nw54CvZ4i1RZQTn8IiKCHQvQcCe2xFtt7CW3F1hY1/LDcBOslVgzIgbW1VwuIiBBMojSbZLOzJBES\nQiDZ7L7vH5/vZId1f87M7vc7M6/HOXP2u5/5fmff3wzMez6/FRGYmVlrass7ADMzy4+TgJlZC3MS\nMDNrYU4CZmYtzEnAzKyFOQmYmbWwcZOApNsk9Ul6coTnPi5pUNJxFWUrJG2W9IyksyrKz5T0pKRN\nklZVlHdIuju75ruSXlePGzMzs/FNpCbwJeDs4YWSFgG/BXRXlC0DLgSWAecCt0pS9vTngcsiYimw\nVFL5NS8DfhYRbwBWAZ+t8l7MzGySxk0CEfEd4PkRnroZuHZY2fnA3RFxKCK2AJuB5ZIWAHMj4tHs\nvNuBCyquWZ0d3wO8e1J3YGZmVauqT0DSecC2iPjRsKcWAtsqfu/JyhYC2yvKt2dlr7omIgaAFyqb\nl8zMbOrMmOwFkuYAnyA1BU0FjX+KmZnVw6STAPB64FTgh1l7/yLgcUnLSd/8Kzt2F2VlPcDJI5RT\n8dwOSe3A0RHxs5H+sCQvdGRmVoWIGPEL9kSbg5Q9iIinImJBRJwWEYtJTTtvjYjngHXA+7IRP4uB\nJcAjEVEC9khaniWOi4F7s9deB1ySHf8e8NA4N5LL44Ybbsjtb/uefD/N8mi2e2qU+xnLRIaIfgX4\nf6QRPVslfXD453JFgtgArAE2APcBl8dQBFcAtwGbgM0RsT4rvw04QdJm4Grg+vFiMjOz+hi3OSgi\n3j/O86cN+/0zwGdGOO/7wJtHKD9AGlZqZmbTzDOGJ6izszPvEOqu2e7J91N8zXZPzXA/Gq+9qEgk\nRSPFa2ZWBJKIGjuGzcysCTkJmJm1MCcBM7MW5iRgZtbCnATMzFqYk0AD2LwZ9u7NOwoza0YeItoA\nzj0X9u2D//iPvCMxs0Y01hDRahaQs2m2dSt0d49/npnZZLk5qAHs3An790MLVoLMbIo5CRRcBDz/\nfPr50kt5R2NmzcZJoOD27IFDh9Lxz0bcZcHMrHpOAgVXKg0dPz/STs9mZjVwEii4UgnasnfJNQEz\nqzcngYIrlWDmzHTsJGBm9eYkUHCl0tCoIDcHmVm9OQkUXG8v9Pen41278o3FzJqPk0DBdXdDe3s6\n3r4931jMrPk4CRTc9u1DfQLbtuUbi5k1HyeBguvtHaoJ9PXlG4uZNR8ngYLbvRsGB4eOzczqyQvI\nFdjAALz44tA8gT178o3HzJrPuDUBSbdJ6pP0ZEXZZyU9I+kJSV+VdHTFcyskbc6eP6ui/ExJT0ra\nJGlVRXmHpLuza74r6XX1vMFGtnMnzJo1tGzE/v35xmNmzWcizUFfAs4eVvYA8MaIeAuwGVgBIOkM\n4EJgGXAucKuk8hrWnwcui4ilwFJJ5de8DPhZRLwBWAV8tob7aSqVE8UAXnklv1jMrDmNmwQi4jvA\n88PKHoyIrKWa7wGLsuPzgLsj4lBEbCEliOWSFgBzI+LR7LzbgQuy4/OB1dnxPcC7q7yXplO5bhCk\n5qHynAEzs3qoR8fwh4D7suOFQOVAxp6sbCFQOcp9e1b2qmsiYgB4QdJxdYir4fX2DjUFQRol5FnD\nZlZPNXUMS/ok0B8R/1KneABG3AKtbOXKlYePOzs76ezsrOOfLpZSCQ4eHPpdSkngpJPyi8nMiq+r\nq4uurq4JnVt1EpB0KfAe4DcrinuAkyt+X5SVjVZeec0OSe3A0REx6lJplUmg2fX0pCagsggvImdm\n4xv+BfnGG28c9dyJNgeJim/oks4BrgXOi4gD
FeetAy7KRvwsBpYAj0RECdgjaXnWUXwxcG/FNZdk\nx78HPDTBmJped/erO4adBMys3satCUj6CtAJHC9pK3AD8AmgA/hWNvjnexFxeURskLQG2AD0A5dH\nHN4Z9wrgy8Bs4L6IWJ+V3wbcIWkzsBu4qE731vB6elISKDcJOQmYWb0pGmj3cknRSPHWatGiNEFs\n376hsltugauuyi8mM2s8koiIEftbPWO4wHbvTp3BlYYPGzUzq4WTQEHt35+Gh1Z2DEPqJzAzqxcv\nIFdQfX0we/bQrmJlvb35xGNmzclJoKAqN5ivtHPn9MdiZs3LSaCgKvcWruQZw2ZWT04CBVUqjbxO\nUOVIITOzWjkJFFRv76uXjCh7+eXpj8XMmpeTQEGNtqn8wYMjNxOZmVXDSaCgtm6Fjo6fL29rS7uN\nmZnVg5NAQe3YMbTBfCXJS0eYWf04CRTUzp0/P1u4zCOEzKxePGO4gCLSB/1INQFwTcDM6sc1gQJ6\n4QWYMWPk0UGDg04CZlY/TgIFVCqN3CkMQ7UEM7N6cBIooFJp9P4AgF27pi8WM2tuTgIFVCqlZp/R\nbN06fbGYWXNzEiig4RvMDzfaRDIzs8lyEiig3t6R1w0q6+ubvljMrLk5CRRQd/fow0PBo4PMrH6c\nBApo+/bRRwcB7N07fbGYWXNzEiig0TaUKXvppemLxcyam5NAAe3aNfbooLE6jc3MJsNJoGAOHUob\nxxw4MPo5AwNjP29mNlHjJgFJt0nqk/RkRdk8SQ9I2ijpfknHVDy3QtJmSc9IOqui/ExJT0raJGlV\nRXmHpLuza74r6XX1vMFGs3Nn2mB+YGD0c9rbPWvYzOpjIjWBLwFnDyu7HngwIk4HHgJWAEg6A7gQ\nWAacC9wqHZ77+nngsohYCiyVVH7Ny4CfRcQbgFXAZ2u4n4ZXKqV1g8bjJGBm9TBuEoiI7wDDP3LO\nB1Znx6uBC7Lj84C7I+JQRGwBNgPLJS0A5kbEo9l5t1dcU/la9wDvruI+mkapNLHzPEzUzOqh2j6B\nkyKiDyAiSsBJWflCYFvFeT1Z2UKgcp7r9qzsVddExADwgqTjqoyr4ZVKqV9gLBFOAmZWH/XaT6Ce\nu96OsXQarFy58vBxZ2cnnZ2ddfzT+RtvyQhwEjCzsXV1ddHV1TWhc6tNAn2S5kdEX9bU81xW3gOc\nXHHeoqxstPLKa3ZIageOjohRP+Iqk0Az6ukZu1MYnATMbGzDvyDfeOONo5470eYg8epv6OuAS7Pj\nS4B7K8ovykb8LAaWAI9kTUZ7JC3POoovHnbNJdnx75E6mltWdzfMnDn+eb29Ux+LmTW/cWsCkr4C\ndALHS9oK3AD8FbBW0oeAbtKIICJig6Q1wAagH7g8IspNRVcAXwZmA/dFxPqs/DbgDkmbgd3ARfW5\ntcbU0zP6rmKVurunJx4za24a+owuPknRSPFWY9GitL3keEtDvOtdMMEmPzNrcZKIiBH7W73RfMHs\n3j32rmJlzz03/jlmZuNxEiiQ/fvT8NDxhohCqi2YmdXKawcVSF9fWjJiIvbtm9pYzKw1OAkUyHhL\nSFd65ZWpjcXMWoOTQIGUSmkOwET094+93LSZ2UQ4CRTIeHsLV2pr8w5jZlY7J4EC6e2d+IYxkmcN\nm1ntnAQKZNu28c+p5OWkzaxWTgIFsnXr2BvMV/L6QWZWD04CBdLbm3YNmwgnATOrByeBAtm5c+Ln\nRrg5yMxq5xnDBRGRZgFPtCYAk0saZmYjcU2gIJ5/fmKrh1baunXq4jGz1uAkUBCl0sQ7hcu2bx//\nHDOzsTgJFESpNLHVQyt5JVEzq5WTQEGUSuNvKzmcRweZWa2cBAqiVJr4khFlL744NbGYWetwEiiI\nHTsmnwT275+aWMysdTgJFMTWrWl00GQcODA1sZhZ63ASKIht22DmzMldEwEvvzw18ZhZa3ASKIi+\nvolvKFPW
1uZZw2ZWGyeBgti1a/KbxEhOAmZWGy8bUQD9/fDSS5OfJ+BF5MysVjXVBCRdI+kpSU9K\nuktSh6R5kh6QtFHS/ZKOqTh/haTNkp6RdFZF+ZnZa2yStKqWmBrRzp1pg/nJzhNwEjCzWlWdBCS9\nFvgIcGZE/CKpVvH7wPXAgxFxOvAQsCI7/wzgQmAZcC5wq3T4u+/ngcsiYimwVNLZ1cbViEqlyY8M\ngtR85CRgZrWotU+gHThS0gxgDtADnA+szp5fDVyQHZ8H3B0RhyJiC7AZWC5pATA3Ih7Nzru94pqW\nMJkN5odzEjCzWlSdBCJiB/C/gK2kD/89EfEgMD8i+rJzSsBJ2SULgcoNFHuysoVA5VJo27OyllHN\nkhFlO3bUNxYzay1VdwxLOpb0rf8UYA+wVtIHgOHfaav8jjuylStXHj7u7Oyks7Ozni+fi1JpcktI\nV/Jy0mY2XFdXF11dXRM6t5bRQf8DeDYifgYg6WvArwJ9kuZHRF/W1FNe67IHOLni+kVZ2WjlI6pM\nAs2ip6f6mkCpVN9YzKzxDf+CfOONN456bi19AluBd0ianXXwvhvYAKwDLs3OuQS4NzteB1yUjSBa\nDCwBHsmajPZIWp69zsUV17SE7u7JzxYu27WrvrGYWWupuiYQEY9Iugf4AdCf/fwCMBdYI+lDQDdp\nRBARsUHSGlKi6AcujzjcHXoF8GVgNnBfRKyvNq5G1NOTkkA1TUIvvFD/eMysdSiqHZaSA0nRSPFO\n1MKFsGdPmjA2WXPnwt699Y/JzJqHJCJixOmonjFcALUM83zllfrFYWatx0kgZy+9BIcOpUc1+vtT\np3J7e33jMrPW4AXkctbXl5aMqFZbW2pKMjOrhpNAzkqlyS8hXUnyrGEzq56TQM5KpckvIT2cl5M2\ns2o5CeSsVKq+PwC8kqiZ1cZJIGe9vdUvGQFOAmZWGyeBnG3bNv45Y4lwc5CZVc9JIGdbt0JHR22v\n8dxz459jZjYSJ4Gc9fbWPsbfK4maWbWcBHK2c2ftr7F9+/jnmJmNxDOGcxSRFoCrtSZQj0RiZq3J\nNYEcPf982lu4ltFB4NFBZlY9J4EclUq1dwoDvPhi7a9hZq3JSaDOBgYmvml8qZSWfajV/v21v4aZ\ntSYngTp773thyRJ49tnxz61lg/lKBw9OPPGYmVVyEqijf/93+Pa3YcsWWLoU1qwZ+/ze3rQUdD28\n/HJ9XsfMWouTQJ0MDsLVV6cP48HB9Hjf++BP/mT0b+k7dtQnCbS1uXPYzKrjJFAna9akJSDKH/jl\nn//0T/CmN43cebt1axodVCvJS0eYWXWcBOrgwAG49tq01+/w4Z4RsHEjLFgAjz326ue2bUsbzNfK\ni8iZWbWcBOrg1ltTIhitk3dgID2/fDnccstQeV9fbRvKlDkJmFm1nARq9Pzz8Od/npp7xhrpUx46\nes018Du/k/oMdu+ufUMZSK/hJGBm1fCyETX69KfTt/lXXpnY+RFw331w8slpk/l6zBOAlFDMzCar\nppqApGMkrZX0jKSnJf2KpHmSHpC0UdL9ko6pOH+FpM3Z+WdVlJ8p6UlJmyStqiWm6bRlC3zhC5Mf\nnjk4mJqCBgfrM08A0kgjM7PJqrU56BbgvohYBvwS8GPgeuDBiDgdeAhYASDpDOBCYBlwLnCrdPh7\n8OeByyJiKbBU0tk1xjUtVqxIP6sZo1+vD/+y7u76vp6ZtYaqk4Cko4Ffj4gvAUTEoYjYA5wPrM5O\nWw1ckB2fB9ydnbcF2Awsl7QAmBsRj2bn3V5xTWE99lhq1qn3h3m1+vryjsDMGlEtNYHFwC5JX5L0\nuKQvSDoCmB8RfQARUQJOys5fCFRuptiTlS0EKlfE356VFVa5g/fgweKs2+M+ATOrRi0dwzOAM4Er\nIuIxSTeTmoKGz4+t66o2K1euPHzc2dlJZ2dnPV9+Qr7xDdiwYdr/7Jj27M
k7AjMriq6uLrq6uiZ0\nrqLKlcckzQe+GxGnZb//GikJvB7ojIi+rKnn4YhYJul6ICLipuz89cANQHf5nKz8IuBdEfHhEf5m\nVBtvvRw6BMuWpdm+te4DUE9HHgn79uUdhZkVkSQiYsSxiFU3B2VNPtskLc2K3g08DawDLs3KLgHu\nzY7XARdJ6pC0GFgCPJI1Ge2RtDzrKL644prC+eIX07fueozvr6cDB/KOwMwaUdU1AQBJvwT8MzAT\neBb4INAOrAFOJn3LvzAiXsjOXwFcBvQDH42IB7LytwFfBmaTRht9dJS/l2tNYN8+OO20tDxEET90\n+/vrsxaRmTWXsWoCNSWB6ZZ3Eli5Ej73udQJW7R/tra2NELohBPyjsTMisZJoA56e+H001OfQBHX\n7m9vT53VS5eOf66ZtZYp6RNoNZ/6VPq2XcQEUOblpM1sstyCPAE7dsC//EvxmoAqeSVRM6uGawIT\n8NWvQkdHcSaGjcRJwMyq4SQwAXfeOfLOYEUS4eYgM5s8J4Fx9PTAU081xtDLUinvCMys0TgJjOOe\ne2DWrInvF5CnrVvzjsDMGo2TwDjuuqv4TUFlPT15R2BmjcZJYAzbt8PTTzdGUxDArl15R2BmjcZJ\nYAz33JNGBTVCUxC4Y9jMJs9JYAx33pn2AW4UjdJsZWbF4SQwiq1b4Zln0nIMjaLIs5nNrJicBEbR\naE1BkPY3KPKsZjMrHieBUTRaU1BZI8ZsZvlxEhhBdzds2tRYTUGQFrjz0hFmNhlOAiO45x6YObOx\nmoLKPELIzCbDSWAEd9zRuM0qrgmY2WQ4CQyzZQv85CepaaXRDA46CZjZ5DTgR93UWrs2zRAu4h7C\n4/Fy0mY2WU4Cw9x5Z7H3DRjP7t15R2BmjcRJoMJ//zc8+2xjNgWVbd+edwRm1kga+OOu/tauTcNC\nG7EpqGzbtrwjMLNG4iRQ4Y47Gn/phb6+vCMws0ZScxKQ1CbpcUnrst/nSXpA0kZJ90s6puLcFZI2\nS3pG0lkV5WdKelLSJkmrao2pGj/9aWoOkvL46/XjPgEzm4x61AQ+Cmyo+P164MGIOB14CFgBIOkM\n4EJgGXAucKt0+CP388BlEbEUWCrp7DrENSnN0BQEsGdP3hGYWSOpKQlIWgS8B/jniuLzgdXZ8Wrg\nguz4PODuiDgUEVuAzcBySQuAuRHxaHbe7RXXTJs772zMGcLDNfLIJjObfrXWBG4GrgUq166cHxF9\nABFRAk7KyhcCld2WPVnZQqByTMv2rGza/OQnab2gZtAMiczMpk/VGydK+m2gLyKekNQ5xql1Xdx4\n5cqVh487Ozvp7BzrT0/M2rVpWOjBgzW/VO4GBqC/P619ZGatqauri66urgmdq6hyAXpJfwn8AXAI\nmAPMBb4G/DLQGRF9WVPPwxGxTNL1QETETdn164EbgO7yOVn5RcC7IuLDI/zNqDbesZxxRuoYboYk\n0N4OO3bASSeNf66ZtQZJRMSIw16qbg6KiE9ExOsi4jTgIuChiPhD4OvApdlplwD3ZsfrgIskdUha\nDCwBHsmajPZIWp51FF9ccc2U27y5uSZYSV46wswmrurmoDH8FbBG0odI3/IvBIiIDZLWkEYS9QOX\nV3ytvwL4MjAbuC8i1k9BXCNauzZ9cDZDLQDS+kFeTtrMJqrq5qA8TEVz0LJlaamIZkkCElx5Jfzt\n3+YdiZkVxZQ0BzWDjRuhp6e59uWNgL/7O/iLv8g7EjNrBC2dBL761fTNub8/70jq71Ofgr/8y7yj\nMLOia+kk8LWvNfe4+k9+Ej7zmbyjMLMia9k+gT17YMGCVBNo9EXjxnPTTfBnf5Z3FGaWF/cJjOCh\nh+DII5s/AQBcdx38zd/kHYWZFVHLJoFvfhP27s07iulz7bVw8815R2FmRdOSzUERsHBhSgIvvVSH\nwBrIzTfD1VfnHYWZTSc3Bw2zeTPs29
fcncKjueaaNITUzAxaNAmsX58WjBsYyDuSfFx1FXzuc3lH\nYWZF0JJJ4Otfb71moOGuvBI+9jHvP2DW6lquT+DAAZg3L622uW9fnQJrUG1t6d/hD/4gDSM98cS8\nIzKzqeA+gQrf+Q4ccYQTAMDgYJotfccd8JrXwNlnw49/nHdUZjadpmIV0UJbv94JYLhDh9LPhx+G\nN74R3vxmWLUKJrJfz759sGlTWofppz+FY46B178eTjsNTj0VZs+eysjNrFYt1xx0+umwbVtrTBKr\nVkdHqiEsXAif/jS8//3p32zjxvTYsAF++MO0LefevalmNTiYRlsdOpR2NSuvyTR3Lrz2tbBkCbzp\nTWkDn9NOg+OOS7O29+5NP8uPF15I+yHs2pV+vvQSvPe98JGPeLc0s2qN1RzUUkmgtzd9Sx0YaJ6l\no6fSzJlDtYQjj0z9BwcPpg/7WbNgxoyUTMcaZTVjRkoqg4PpWim9brk/IiI9BgbSo/z32tvTtVLq\nx5kzJ818vu669HpmNnFOApnVq9M4eW+6MjlS/sttlxPGrFlpstunPuWmJrOJcsdw5hvfaK2lIuol\n7wQAqZYwOJhqBTfdBMceCx//uIf6mtWqZWoCg4NpaOjgoDuGm4GUHu3t8Ed/lJbMPvrovKMyKybX\nBIDHH0/t0J4c1Rwihoa4/uM/wvHHp87jwcG8IzNrLC2TBO6/P31g+EOi+ZQ7lG+9FV73ujSSycwm\npmWSwLp1zbmNpA0ZHIS+Pli8GG6/Pe9ozBpDS/QJlHcRc3NQaznvPPjXf039BmatrOX7BMq7iDkB\ntJZ/+7e0HMamTXlHYlZcVScBSYskPSTpaUk/knRVVj5P0gOSNkq6X9IxFdeskLRZ0jOSzqooP1PS\nk5I2SVpV2y39vG9+M9UGrLUMDqY5IcuWwd//fd7RmBVT1c1BkhYACyLiCUlHAd8Hzgc+COyOiM9K\nug6YFxHXSzoDuAt4O7AIeBB4Q0SEpP8CroyIRyXdB9wSEfeP8Dcn3RxU3kVszx7XBFqZBL/xG3Df\nfWnCmVkrmZLmoIgoRcQT2fE+4BnSh/v5wOrstNXABdnxecDdEXEoIrYAm4HlWTKZGxGPZufdXnFN\nzcq7iB04UK9XtEYUAd/+Nsyfn9Y9MrOkLn0Ckk4F3gJ8D5gfEX2QEgVwUnbaQqBy8F5PVrYQ2F5R\nvj0rq4v772/tXcRsyMBA+kLw1remuQVmVoelpLOmoHuAj0bEPknD22vqOvxo5cqVh487OzvpHGe9\n469/3TOEbUj5y8Cf/mlaqfSTn8w3HrOp0NXVRVdX14TOrWmIqKQZwL8B34yIW7KyZ4DOiOjLmnoe\njohlkq4HIiJuys5bD9wAdJfPycovAt4VER8e4e9Nqk/Au4jZeK6+Gm6+Oe8ozKbWVA4R/SKwoZwA\nMuuAS7PjS4B7K8ovktQhaTGwBHgkazLaI2m5JAEXV1xTk//8z7QEsROAjWbVKvjDP8w7CrP8VN0c\nJOmdwAeAH0n6AanZ5xPATcAaSR8ifcu/ECAiNkhaA2wA+oHLK77WXwF8GZgN3BcR66uNq9L69V5l\n0sZ3112we3daZVYjflcya15NPWP4F34BurvTJihmY5Hg7W+H7343DSQwayYtOWO4VIKtW71gnE1M\nBDz2WNr+0mtMWStp2iTwwANp5ylvI2kTNTiY5pUsXux+JGsdTZsEvIuYVWNwMNUiTzklDSE1a3ZN\n2ScwOAjHHTc0Ochsstra0siyp59OCcGskbVcn8Djj6eOPq8VZNUaHEwDCpYsgb/+62Lss2w2FZoy\nCXgXMauHgYH0uO46OPlkeOSRvCMyq7+mTALr1rlD2OojIj2eew7e8Q54z3u8LLk1l6ZLAo8/Dhs2\nwMyZeUdizaS/PyWDb30LTjwRPvMZNxFZc2iqjuGDB9MKkc8+6wliNrWktGXp2rXwznfmHY3Z2Fqm\nY/
imm2DnTi8bbVMvIv239uu/DmedlXYwM2tETVMTePpp+NVfTbUB1wJsOs3IVuD67d+Ga65JicFL\nT1iRjFUTaIokMDAAv/IrqS/g5ZdzCMwM6OiAQ4fgiCPg/PPTMtVve5sXpbP8NX1z0C23wLZtbgay\nfB08mIYl798Pa9akLybHHQd//Mfw1FN5R2c2soavCfzkJ+nblpuBrIja21NNIAKOPx4+8AG46io4\n9dS8I7NW0rTNQYOD8K53wQ9+4H0DrPhmzBiad7BkCXz842lDmzlz8o7Mml3TNgd94QuwcWNqhzUr\nukOHUpPl4CBs2ZL2OT7mGDjvPHj0Uc87sHw0bE1g2zZ485tTE9CBAzkHZlaDcofyCSfAhz8MH/lI\najoyq5emaw6KgHPOSbtAvfhi3lGZ1ceMGamWIMEv/zJ87GPpi86iRTB3bt7RWSNruiRw++2pPXXf\nPncGW3OaNSstVdHRkX7OmJFqBwsWpKWt3/AGeP3r08J2ixal4yOOyDtqK6qmSgK9vcEb35g6gt0M\nZK1kzpxUSxgYGNoCc+bMVNbfD8cemzqczzwzLXb3pjfBsmXueLYmSwK/+7vBww/DCy/kHY1Zscyc\nmR7l+QozZ6a+hmOPTTWHt70tNTMtXJhqFPPnp34Iz25ufk2VBI4/Pti/3zODzSaqMjlEDNUeBgZS\nkjjqqDSp7cQTU4I45ZT0OPFEmDdv6HHccelnR0fed2ST1RBJQNI5wCrSsNXbIuKmEc6Jjo7wXgFm\ndVROEpBqEOUNmWbMGJrsVvlcezsceWRKHscem5LD/Pmpb2LRopQ8yo8TTkg/3V+Rr8InAUltwCbg\n3cAO4FHgooj48bDzQoqcxlN3AZ15/OEp1EVz3VMXvp/p0dGRkkFbW6pdlGsVg4OpfMaMoZnS/f3p\nvKOOgvb2Lo4+upNZs2D27NRfMWdOShJz5gwll6OPTglk3ryUaCprJMcem84rwppMXV1ddHZ25h3G\nuMZKAjOmO5hRLAc2R0Q3gKS7gfOBHw8/Mb+c1UVR/4esXhfNdU9d+H6mx1i18UOHfn4C58BAGswx\nONjFyy93Hi4vz6COSAmk/LO8NeyMGSmBlPstykklIiWNo45KCalca6n8Ofwxa1Yaanv00SmRHHts\n+v3II4eST/m4nJwqE9Xs2T+/WdVkk0BE2pnuuefSY+fOoeMdO2D79nTOKaekTv6FC+G1r02P17xm\najr5i5IEFgLbKn7fTkoMZtYkyoljMjP8y6OgRnLgQEoWY3Vsl780ln+WE0x55rY0VKNpa0u/V9Yw\nKpNSOe5hUpZ/AAADrklEQVRy81lHR4rhH/5h6Drp1a9T/j0C9u5NCaC9PV3b1jb0uuUmuI6OdE15\ndnk5qZX//WbNSs1vCxakxHDMMakWdcQRKXmVa1SzZ7/6MZaiJAEzs0kZqcYxWRGTf43+/nTNK6+k\nD+7J9FFGpPNHu2b4sPf+/lcnwpdfhr4+6O2F739/cnGPpih9Au8AVkbEOdnv1wMxvHNYUv7Bmpk1\noKJ3DLcDG0kdw73AI8DvR8QzuQZmZtbkCtEcFBEDkq4EHmBoiKgTgJnZFCtETcDMzPLhCeMTIGmL\npB9K+oGkR/KOpxqSbpPUJ+nJirJ5kh6QtFHS/ZKOyTPGyRjlfm6QtF3S49njnDxjnAxJiyQ9JOlp\nST+SdFVW3pDv0Qj385GsvCHfI0mzJP1X9hnwI0k3ZOUN+f5Uck1gAiQ9C7wtIp7PO5ZqSfo1YB9w\ne0T8YlZ2E7A7Ij4r6TpgXkRcn2ecEzXK/dwAvBgR/zvX4KogaQGwICKekHQU8H3SXJkP0oDv0Rj3\n8z4a9z06IiL2Z32Y/wlcBbyXBnx/KrkmMDGiwf+tIuI7wPAkdj6wOjteDVwwrUHVYJT7gfReNZyI\nKEXEE9nxPuAZYBEN+h6Ncj8Ls6cb9T3anx3OIvWnBg36/lRq6A+2
aRTAtyQ9KumP8g6mjk6KiD5I\n/9MCJ+UcTz1cKekJSf/ciFVzAEmnAm8BvgfMb/T3qOJ+/israsj3SFKbpB8AJeBbEfEoTfD+OAlM\nzDsj4kzgPcAVWVNEM2r0tsFbgdMi4i2k/1EbscnhKOAe4KPZN+jh70lDvUcj3E/DvkcRMRgRbyXV\n0JZLeiMN/v6Ak8CERERv9nMn8DWaZ0mLPknz4XAb7nM5x1OTiNh5eBNq+Cfg7XnGM1mSZpA+MO+I\niHuz4oZ9j0a6n0Z/jwAiYi9pYadzaOD3p8xJYBySjsi+zSDpSOAs4Kl8o6qaeHV77Drg0uz4EuDe\n4RcU3KvuJ/ufsOx/0njv0xeBDRFxS0VZI79HP3c/jfoeSTqh3HQlaQ7wW6R+jkZ+fwCPDhqXpMWk\nb/9B6gy6KyL+Kt+oJk/SV0hLUh4P9AE3AP8XWAucDHQDF0ZEQ+zZNsr9/Aap7XkQ2AL8Sbm9tugk\nvRP4d+BHpP/WAvgEafb8GhrsPRrjft5PA75Hkt5M6vhtyx7/JyI+Lek4GvD9qeQkYGbWwtwcZGbW\nwpwEzMxamJOAmVkLcxIwM2thTgJmZi3MScDMrIU5CZiZtTAnATOzFvb/AUL0kJu9GPxnAAAAAElF\nTkSuQmCC\n", | |
"text/plain": [ | |
"<matplotlib.figure.Figure at 0x3c6e78d0>" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"slabel = labels['label'].str.len()\n", | |
"slabel_hist = slabel.value_counts().sort_index()\n", | |
"\n", | |
"print slabel.describe()\n", | |
"slabel_hist[slabel_hist > 30].plot.area();" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Load the languages and the full paths into data frames. Join them into one data frame and show the result." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 65, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>label</th>\n", | |
" <th>level</th>\n", | |
" <th>parent</th>\n", | |
" <th>obsolete</th>\n", | |
" <th>status</th>\n", | |
" <th>iso</th>\n", | |
" <th>latitude</th>\n", | |
" <th>longitude</th>\n", | |
" <th>steps</th>\n", | |
" <th>parent_tree</th>\n", | |
" <th>terminal</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>id</th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>aari1239</th>\n", | |
" <td>Aari</td>\n", | |
" <td>Language</td>\n", | |
" <td>aari1238</td>\n", | |
" <td>0</td>\n", | |
" <td>established</td>\n", | |
" <td>aiw</td>\n", | |
" <td>5.95034</td>\n", | |
" <td>36.5721</td>\n", | |
" <td>3</td>\n", | |
" <td>sout2845</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>aari1240</th>\n", | |
" <td>Aariya</td>\n", | |
" <td>Language</td>\n", | |
" <td>book1242</td>\n", | |
" <td>0</td>\n", | |
" <td>spurious</td>\n", | |
" <td>aay</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1</td>\n", | |
" <td>book1242</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>aari1244</th>\n", | |
" <td>Aari</td>\n", | |
" <td>Language</td>\n", | |
" <td>book1242</td>\n", | |
" <td>0</td>\n", | |
" <td>spurious retired</td>\n", | |
" <td>aiz</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1</td>\n", | |
" <td>book1242</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>aasa1238</th>\n", | |
" <td>Aasax</td>\n", | |
" <td>Language</td>\n", | |
" <td>uncl1457</td>\n", | |
" <td>0</td>\n", | |
" <td>established</td>\n", | |
" <td>aas</td>\n", | |
" <td>-4.00679</td>\n", | |
" <td>36.8648</td>\n", | |
" <td>4</td>\n", | |
" <td>afro1255</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>abad1241</th>\n", | |
" <td>Abadi</td>\n", | |
" <td>Language</td>\n", | |
" <td>west2850</td>\n", | |
" <td>0</td>\n", | |
" <td>established</td>\n", | |
" <td>kbt</td>\n", | |
" <td>-9.03389</td>\n", | |
" <td>146.9920</td>\n", | |
" <td>11</td>\n", | |
" <td>aust1307</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>abag1245</th>\n", | |
" <td>Abaga</td>\n", | |
" <td>Language</td>\n", | |
" <td>kama1374</td>\n", | |
" <td>0</td>\n", | |
" <td>established</td>\n", | |
" <td>abg</td>\n", | |
" <td>-6.12028</td>\n", | |
" <td>145.6650</td>\n", | |
" <td>6</td>\n", | |
" <td>nucl1709</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>abai1240</th>\n", | |
" <td>Abai Sungai</td>\n", | |
" <td>Language</td>\n", | |
" <td>pait1248</td>\n", | |
" <td>0</td>\n", | |
" <td>established</td>\n", | |
" <td>abf</td>\n", | |
" <td>5.55394</td>\n", | |
" <td>118.3060</td>\n", | |
" <td>7</td>\n", | |
" <td>aust1307</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>...</th>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>zulg1242</th>\n", | |
" <td>Zulgo-Gemzek</td>\n", | |
" <td>Language</td>\n", | |
" <td>meri1245</td>\n", | |
" <td>0</td>\n", | |
" <td>established</td>\n", | |
" <td>gnd</td>\n", | |
" <td>10.82700</td>\n", | |
" <td>14.0578</td>\n", | |
" <td>7</td>\n", | |
" <td>afro1255</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>zulu1248</th>\n", | |
" <td>Zulu</td>\n", | |
" <td>Language</td>\n", | |
" <td>zulu1251</td>\n", | |
" <td>0</td>\n", | |
" <td>established</td>\n", | |
" <td>zul</td>\n", | |
" <td>-25.33050</td>\n", | |
" <td>31.3512</td>\n", | |
" <td>12</td>\n", | |
" <td>atla1278</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>zuma1239</th>\n", | |
" <td>Zumaya</td>\n", | |
" <td>Language</td>\n", | |
" <td>masa1324</td>\n", | |
" <td>0</td>\n", | |
" <td>established</td>\n", | |
" <td>zuy</td>\n", | |
" <td>10.55800</td>\n", | |
" <td>14.4445</td>\n", | |
" <td>5</td>\n", | |
" <td>afro1255</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>zumb1240</th>\n", | |
" <td>Zumbun</td>\n", | |
" <td>Language</td>\n", | |
" <td>west2712</td>\n", | |
" <td>0</td>\n", | |
" <td>established</td>\n", | |
" <td>jmb</td>\n", | |
" <td>10.82700</td>\n", | |
" <td>9.9683</td>\n", | |
" <td>5</td>\n", | |
" <td>afro1255</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>zuni1245</th>\n", | |
" <td>Zuni</td>\n", | |
" <td>Language</td>\n", | |
" <td>None</td>\n", | |
" <td>0</td>\n", | |
" <td>established</td>\n", | |
" <td>zun</td>\n", | |
" <td>35.00560</td>\n", | |
" <td>-108.7820</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>zuoj1238</th>\n", | |
" <td>Zuojiang Zhuang</td>\n", | |
" <td>Language</td>\n", | |
" <td>nort3180</td>\n", | |
" <td>0</td>\n", | |
" <td>established</td>\n", | |
" <td>zzj</td>\n", | |
" <td>21.83750</td>\n", | |
" <td>107.3620</td>\n", | |
" <td>5</td>\n", | |
" <td>taik1256</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>zyph1238</th>\n", | |
" <td>Zyphe</td>\n", | |
" <td>Language</td>\n", | |
" <td>nucl1757</td>\n", | |
" <td>0</td>\n", | |
" <td>established</td>\n", | |
" <td>zyp</td>\n", | |
" <td>22.52400</td>\n", | |
" <td>93.2640</td>\n", | |
" <td>5</td>\n", | |
" <td>sino1245</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"<p>8397 rows × 11 columns</p>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" label level parent obsolete status \\\n", | |
"id \n", | |
"aari1239 Aari Language aari1238 0 established \n", | |
"aari1240 Aariya Language book1242 0 spurious \n", | |
"aari1244 Aari Language book1242 0 spurious retired \n", | |
"aasa1238 Aasax Language uncl1457 0 established \n", | |
"abad1241 Abadi Language west2850 0 established \n", | |
"abag1245 Abaga Language kama1374 0 established \n", | |
"abai1240 Abai Sungai Language pait1248 0 established \n", | |
"... ... ... ... ... ... \n", | |
"zulg1242 Zulgo-Gemzek Language meri1245 0 established \n", | |
"zulu1248 Zulu Language zulu1251 0 established \n", | |
"zuma1239 Zumaya Language masa1324 0 established \n", | |
"zumb1240 Zumbun Language west2712 0 established \n", | |
"zuni1245 Zuni Language None 0 established \n", | |
"zuoj1238 Zuojiang Zhuang Language nort3180 0 established \n", | |
"zyph1238 Zyphe Language nucl1757 0 established \n", | |
"\n", | |
" iso latitude longitude steps parent_tree terminal \n", | |
"id \n", | |
"aari1239 aiw 5.95034 36.5721 3 sout2845 1 \n", | |
"aari1240 aay NaN NaN 1 book1242 1 \n", | |
"aari1244 aiz NaN NaN 1 book1242 1 \n", | |
"aasa1238 aas -4.00679 36.8648 4 afro1255 1 \n", | |
"abad1241 kbt -9.03389 146.9920 11 aust1307 1 \n", | |
"abag1245 abg -6.12028 145.6650 6 nucl1709 1 \n", | |
"abai1240 abf 5.55394 118.3060 7 aust1307 1 \n", | |
"... ... ... ... ... ... ... \n", | |
"zulg1242 gnd 10.82700 14.0578 7 afro1255 1 \n", | |
"zulu1248 zul -25.33050 31.3512 12 atla1278 1 \n", | |
"zuma1239 zuy 10.55800 14.4445 5 afro1255 1 \n", | |
"zumb1240 jmb 10.82700 9.9683 5 afro1255 1 \n", | |
"zuni1245 zun 35.00560 -108.7820 NaN NaN NaN \n", | |
"zuoj1238 zzj 21.83750 107.3620 5 taik1256 1 \n", | |
"zyph1238 zyp 22.52400 93.2640 5 sino1245 1 \n", | |
"\n", | |
"[8397 rows x 11 columns]" | |
] | |
}, | |
"execution_count": 65, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"languages = pd.read_sql_query(\"\"\"SELECT * FROM languoid\n", | |
"WHERE level='Language' AND NOT obsolete ORDER BY id\"\"\", conn, index_col='id')\n", | |
"\n", | |
"tree = pd.read_sql_query('SELECT * FROM tree WHERE terminal', conn, index_col='child')\n", | |
"\n", | |
"langs = languages.join(tree, how='left', rsuffix='_tree')\n", | |
"\n", | |
"langs" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Analyze the **number of languages per top-level family**." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 66, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"count 243.000000\n", | |
"mean 33.781893\n", | |
"std 137.796044\n", | |
"min 1.000000\n", | |
"25% 2.000000\n", | |
"50% 5.000000\n", | |
"75% 12.000000\n", | |
"max 1430.000000\n", | |
"dtype: float64\n" | |
] | |
}, | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAE7CAYAAADTpEpZAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xu8XFV99/HPFwJyFaMVoqBcC4ZW1CABW1qPWm69AK2I\neIVC1QdQeKoPmog2aW0FrK1aLbQKRbxS8EZU5FY4bSlCQC6JJkCKclXSioio1RL4Pn+sdZLJ5Jyc\n5Jy9J5yd7/v1mtfZs2bP/u05M/s3a9Zea23ZJiIiumuTDb0DERHRriT6iIiOS6KPiOi4JPqIiI5L\noo+I6Lgk+oiIjhs30Us6T9JySYv6yt8maamkxZLO7CmfK2lZfezgnvJZkhZJulPSh5t9GRERMZZ1\nqdGfDxzSWyBpCPgD4Pm2nw98sJbPBI4GZgKHAWdLUn3aOcAJtvcE9pS02jYjIqId4yZ629cCD/cV\nnwicaXtFXeeHtfwI4ELbK2zfDSwDZkuaAWxr+8a63qeAIxvY/4iIGMdE2+j3BH5b0vWSrpG0by3f\nEbivZ70HatmOwP095ffXsoiIaNm0STxvuu0DJO0HXAzs1txuRUREUyaa6O8DvgRg+0ZJj0t6BqUG\n/9ye9XaqZQ8AzxmlfFSSMgFPRMQE2FZ/2bo23ajeRnwFeDmApD2BzW0/BCwAXi1pc0m7AnsAC20/\nCDwiaXY9OftG4JJxdna9b/PmzZvQ8yZ6G2S8Lr+2xEu8xGsm3ljGrdFL+hwwBDxD0r3APOCfgPMl\nLQZ+WRM3tpdIughYAjwGnORV0U8GPglsAVxq+7LxYkdExOSNm+htv3aMh94wxvpnAGeMUv4t4Pnr\ntXcRETFpnRoZOzQ01Nl4XX5tiZd4idduPK2tXWdDkeQn435FRDyZScKTOBkbERFTVBJ9RETHJdFH\nRHRcEn1ERMcl0UdEdFwSfURExyXRR0R0XBJ9RETHJdFHRHRcEn1ERMcl0UdEdFwSfURExyXRR0R0\nXBJ9RETHJdFHRHRcEn1ERMcl0UdEdNy4iV7SeZKWS1o0ymPvkPSEpKf3lM2VtEzSUkkH95TPkrRI\n0p2SPjzRHZ4xYxckrfdtxoxdJhoyImJKW5ca/fnAIf2FknYCDgLu6SmbCRwNzAQOA86WNHJZq3OA\nE2zvCewpaY1trovly+8BvN638ryIiI3PuIne9rXAw6M89CHgtL6yI4ALba+wfTewDJgtaQawre0b\n63qfAo6c8F5HRMQ6m1AbvaTDgftsL+57aEfgvp77D9SyHYH7e8rvr2UREdGyaev7BElbAu+mNNu0\nZv78+SuXh4aGGBoaajNcRMSUMzw8zPDw8Ljryfb4K0k7A1+1vY+kXweuAn4OCNiJUnOfDRwPYPvM\n+rzLgHmUdvxrbM+s5ccAL7V94hjxPNZ+lSb/8fd5lGeyLq81ImKqkoRt9Zeva9ON6g3b37Y9w/Zu\ntnelNMO8yPZ/AQuAV0vaXNKuwB7AQtsPAo9Iml1Pzr4RuKSB1xUREeNYl+6VnwOuo/SUuVfSH/et\nYlZ9CSwBLgKWAJcCJ/VUzU8GzgPuBJbZvqyZlxAREWuzTk03g5amm4iI9TfZppuIiJiikugjIjou\niT4iouOS6CMiOi6JPiKi45LoIyI6Lok+IqLjkugjIjouiT4iouOS6CMiOi6JPiKi45LoIyI6Lok+\nIqLjkujHMWPGLkha79uMGbts6F2PiAAyTfG67MtA40VETFSmKY6I2Egl0UdEdFwSfURExyXRR0R0\n3LpcHPw8ScslLeop+4CkpZJulfRFSU/teWyupGX18YN7ymdJWiTpTkkfbv6lRETEaNalRn8+cEhf\n2RXAr9l+IbAMmAsgaW/gaGAmcBhwtkq3FYBzgBNs7wnsKal/mxER0YJxE73ta4GH+8qusv1EvXs9\nsFNdPhy40PYK23dTvgRmS5oBbGv7xrrep4Aj
G9j/iIgYRxNt9McDl9blHYH7eh57oJbtCNzfU35/\nLYuIiJZNm8yTJZ0OPGb78w3tz0rz589fuTw0NMTQ0FDTISIiprTh4WGGh4fHXW+dRsZK2hn4qu19\nesqOA94EvNz2L2vZHMC2z6r3LwPmAfcA19ieWcuPAV5q+8Qx4mVkbETEeprsyFjV28jGDgVOAw4f\nSfLVAuAYSZtL2hXYA1ho+0HgEUmz68nZNwKXTPC1RETEehi36UbS54Ah4BmS7qXU0N8NbA5cWTvV\nXG/7JNtLJF0ELAEeA07qqZqfDHwS2AK41PZlDb+WiIgYRSY1G39fBhovImKiMqlZRMRGKok+IqLj\nkugjIjouiT4iouOS6CMiOi6JPiKi45LoIyI6Lok+IqLjkugjIjouiT4iouOS6CMiOi6JPiKi45Lo\nIyI6Lok+IqLjkugjIjouiT4iouOS6CMiOi6JPiKi45LoIyI6btxEL+k8ScslLeopmy7pCkl3SLpc\n0nY9j82VtEzSUkkH95TPkrRI0p2SPtz8S4mIiNGsS43+fOCQvrI5wFW29wKuBuYCSNobOBqYCRwG\nnK1ydW2Ac4ATbO8J7Cmpf5sREdGCcRO97WuBh/uKjwAuqMsXAEfW5cOBC22vsH03sAyYLWkGsK3t\nG+t6n+p5TkREtGiibfTb214OYPtBYPtaviNwX896D9SyHYH7e8rvr2UREdGyaQ1txw1tZ6X58+ev\nXB4aGmJoaKjpEBERU9rw8DDDw8Pjrid7/BwtaWfgq7b3qfeXAkO2l9dmmWtsz5Q0B7Dts+p6lwHz\ngHtG1qnlxwAvtX3iGPE81n6VJv+JfK+IdXmtGzpeRMREScK2+svXtelG9TZiAXBcXT4WuKSn/BhJ\nm0vaFdgDWFibdx6RNLuenH1jz3MiIqJF4zbdSPocMAQ8Q9K9lBr6mcDFko6n1NaPBrC9RNJFwBLg\nMeCknqr5ycAngS2AS21f1uxLiYiI0axT082gpekmImL9TbbpJiIipqgk+oiIjkuij4jouCT6iIiO\nS6KPiOi4JPqIiI5Loo+I6Lgk+oiIjkuij4jouCT6iIiOS6KPiOi4JPqIiI5Loo+I6Lgk+oiIjkui\nj4jouCT6iIiOS6KPiOi4JPqIiI5Loo+I6LhJJXpJfyrp25IWSfqspM0lTZd0haQ7JF0uabue9edK\nWiZpqaSDJ7/7ERExngkneknPBt4GzLK9DzANeA0wB7jK9l7A1cDcuv7ewNHATOAw4GyVK29HRESL\nJtt0symwtaRpwJbAA8ARwAX18QuAI+vy4cCFtlfYvhtYBsyeZPyIiBjHhBO97e8DfwPcS0nwj9i+\nCtjB9vK6zoPA9vUpOwL39WzigVoWEREtmjbRJ0p6GqX2vjPwCHCxpNcB7lu1//46mT9//srloaEh\nhoaGJrSfERFdNTw8zPDw8LjryZ5QHkbSUcAhtt9U778BOAB4OTBke7mkGcA1tmdKmgPY9ll1/cuA\nebZvGGXbHmu/SrP+RPZZTOS1DjpeRMREScL2Guc+J9NGfy9wgKQt6knVVwBLgAXAcXWdY4FL6vIC\n4JjaM2dXYA9g4STiR0TEOphw043thZK+ANwCPFb/fhzYFrhI0vHAPZSeNtheIukiypfBY8BJY1bb\nIyKiMRNuumlTmm4iItZfG003ERExBSTRR0R0XBJ9RETHJdFHRHRcEn1ERMcl0UdEdFwSfURExyXR\nR0R0XBJ9RETHJdFHRHRcEn1ERMcl0UdEdFwSfURExyXRR0R0XBJ9RETHJdFHRHRcEn1ERMcl0UdE\ndFwSfUREx00q0UvaTtLFkpZK+o6k/SVNl3SFpDskXS5pu57150paVtc/ePK7HxER45lsjf4jwKW2\nZwIvAG4H5gBX2d4LuBqYCyBpb+BoYCZwGHC2ypW3IyKiRRNO9JKeCvyW7fMBbK+w/QhwBHBBXe0C\n4Mi6fDhw
YV3vbmAZMHui8SMiYt1Mpka/K/BDSedLulnSxyVtBexgezmA7QeB7ev6OwL39Tz/gVoW\nEREtmjbJ584CTrZ9k6QPUZpt3Lde//11Mn/+/JXLQ0NDDA0NTWwvIyI6anh4mOHh4XHXkz2hPIyk\nHYBv2t6t3j+Qkuh3B4ZsL5c0A7jG9kxJcwDbPquufxkwz/YNo2zbY+1XadafyD6LibzWQceLiJgo\nSdhe49znhJtuavPMfZL2rEWvAL4DLACOq2XHApfU5QXAMZI2l7QrsAewcKLxu2jGjF2QtN63GTN2\n2dC7HhFPYhOu0QNIegFwLrAZ8F3gj4FNgYuA5wD3AEfb/nFdfy5wAvAYcKrtK8bY7kZZo8+vh4iY\njLFq9JNK9G1Jom8/VkR0T+NNNxERMTUk0UdEdFwSfURExyXRR0R0XBJ9RETHJdFHRHRcEn1ERMcl\n0UdEdFwSfURExyXRR0R0XBJ9RETHJdFHRHRcEn1ERMcl0W/EMv99xMYh0xSPvy8Di9fl1xYR7cs0\nxRERG6kk+oiIjkuij4jouCT6iIiOm3Sil7SJpJslLaj3p0u6QtIdki6XtF3PunMlLZO0VNLBk40d\nERHja6JGfyqwpOf+HOAq23sBVwNzASTtDRwNzAQOA85W6fYREREtmlSil7QT8LvAuT3FRwAX1OUL\ngCPr8uHAhbZX2L4bWAbMnkz8iIgY32Rr9B8CTmP1ztg72F4OYPtBYPtaviNwX896D9SyiIho0bSJ\nPlHS7wHLbd8qaWgtq05oZM38+fNXLg8NDTE0tLYQEREbn+HhYYaHh8ddb8IjYyW9H3g9sALYEtgW\n+DLwYmDI9nJJM4BrbM+UNAew7bPq8y8D5tm+YZRtZ2Rsy7E2RLwZM3Zh+fJ71vt5O+ywMw8+ePd6\nPy9iYzPWyNhGpkCQ9FLgHbYPl/QB4CHbZ0l6FzDd9px6MvazwP6UJpsrgV8dLaMn0bcfa2OIF7Gx\nGSvRT7jpZi3OBC6SdDxwD6WnDbaXSLqI0kPnMeCkMbN5REQ0JpOajb8vA4vX5de2IeKlqSg2Nq02\n3TQtib79WInXfLyIDS2zV0ZEbKSS6CMiOi6JPiKi45LoIyI6Lok+IqLjkugjIjouiT6iITNm7IKk\n9b7NmLHLht716Lj0ox9/XwYWr8uvLfGajxfRL/3oIyI2Ukn0EREdl0QfEdFxSfQRER2XRB8R0XFJ\n9BERHZdEHxHRcUn0EREdl0QfMUVlJG6sq4yMHX9fBhavy68t8aZ+vHjya3xkrKSdJF0t6TuSFks6\npZZPl3SFpDskXS5pu57nzJW0TNJSSQdPNHZERKy7yTTdrADebvvXgJcAJ0t6HjAHuMr2XsDVwFwA\nSXsDRwMzgcOAs1WqJBER0aIJJ3rbD9q+tS7/FFgK7AQcAVxQV7sAOLIuHw5caHuF7buBZcDsicaP\niIh108jJWEm7AC8Ergd2sL0cypcBsH1dbUfgvp6nPVDLIiKiRdMmuwFJ2wBfAE61/VNJ/Wd5JnTW\nZ/78+SuXh4aGGBoamuguRkR00vDwMMPDw+OuN6leN5KmAV8DvmH7I7VsKTBke7mkGcA1tmdKmgPY\n9ll1vcuAebZvGGW76XXTcqzES7zonrbmo/8nYMlIkq8WAMfV5WOBS3rKj5G0uaRdgT2AhZOMHxER\n45hM98rfBF4HvFzSLZJulnQocBZwkKQ7gFcAZwLYXgJcBCwBLgVOGrPaHhFPOhmgNXVlwNT4+zKw\neF1+bYmXeOtrxoxdWL78nvV+3g477MyDD9693s/rglxKMCKmlJLkvd63iXw5QLd/saRGP/6+DCxe\nl19b4iVe4rUvNfqIiI1UEn1ERMcl0UdEdFwSfURExyXRR0RsAIPs5ZNeN+Pvy8Didfm1JV7iJV77\n8dLrJiJiI5VEHxHRcUn0EREdl0QfEdFxSfQRER2XRB8R0XFJ9BERHZdEHx
HRcUn0EREdl0QfEdFx\nA0/0kg6VdLukOyW9a9DxIyI2NgNN9JI2AT4GHAL8GvAaSc9rLsJwc5t60sUbZKzES7zE61K8Qdfo\nZwPLbN9j+zHgQuCI5jY/3NymnnTxBhkr8RIv8boUb9CJfkfgvp7799eyiIhoSU7GRkR03EDno5d0\nADDf9qH1/hzAts/qW+/JN0l+RMQUMNp89INO9JsCdwCvAH4ALAReY3vpwHYiImIjM22QwWw/Lumt\nwBWUZqPzkuQjItr1pLyUYERENCcnYyMiOi6JPiKi4wbaRh9Th6Sn2/7Rht6PeHKS9ExgJ+Bx4Lu2\nf7qBdynWYkrW6CU9XdKfSfoTFadL+pqkv5Y0vYV4+zS9zXWM+2JJfyjp8Ganilgjznt6lveWdCfw\nLUl3S9q/pZiHSDpB0i595ce3Ea8vxoGS3i7p4Ja2v4mk4yV9XdJtkm6WdKGkoZbiTZP0FkmXSVpU\nb9+Q9H8kbdZwrL0lXQV8E7gB+ASwWNInJW3XZKwab3NJb5T0O/X+ayV9TNLJTb+2MeJvI2mWpKe1\ntP2tJL1T0mmStpB0nKQFkj4gaZvG4kzFk7GSLgUWA08FZtbli4CDgBfYbnBaBZD0OPBdypQNn7e9\npMntjxLvpcDfAD8G9gX+A5gOPAa8wfZ9a3n6ROLdbHtWXf468DHb35A0G/iw7d9oON77gQOBm4E/\nqDE+2r8vDcZbaHt2XX4TcDLwZeBg4Ku2z2w43vnAPcBVwFHAT4B/B94FXDLyWhuM93nKZ+UCymhz\nKLXtY4Gn2351g7GuB461fUf9fJxs+9j6fz3E9lFNxarxPktpediK8hq3Ab5E6aIt28c2HO9s2yfV\n5QOBzwF3AXsAb7F9acPxLqLMFrAlsBewFPhn4HBghu03NBLI9pS7AbfWvwIeGO2xhuPdAvw68FfA\nfwK3AXOAXVp6fbcAz6zLuwJfrssHAVe0EO/msf5/wC0txFsMTKvLTwMuBT7UYrxbepZv7Pnfbg0s\nbiHeor7719e/TwGWthDvzok8NsFYt63ls9PGa1tU/04DlgOb1vvq/z83FK/39VwDzKrLuwE3tRCv\nN5c9yKrKd6Ovb0o23QCb1Caa5wDbjPz8l/QMYPMW4tn2t22fbnsP4E3A9sC1kq5rId6mtv+7Lt8L\n7Fx34kramRtot/pz8avATpK26nmsjZ/H02yvALD9Y0qt/qmSLqad928TSdPr52Pl/9b2z4AVLcR7\nTNLuAJJmAf9b4/0SaOMn9I8kvarODkuNu4mkVwMPNxzrLknvlfSbkv4GuLXG24x2moI3kbQ5sC2l\nVj/SPPQU2vls9trO9s0Atr9Li03dLtn90vp35H5jn5WpejL2DOD2unw8cG6dNmFv4M9biLfakGLb\nC4GFkt4B/HYL8W6SdB5wNeUn3DCU9jxg0xbi9Td1bVLj7QCc00K8uyS91Pa/QhlIB5wg6S+BV7YQ\nbzvgW5T30ZKeZfsHtQ10jeHiDTgNuEbSLynH2DGw8gTm11qIdwxwFnC2pIcpr2k7So30mIZjHQ+8\nG5hL+WV7ai3fitJU1LTzKMf6psDpwMWSvgscQGlKbdrzJC2i/A93kTTd9sP1S7SNSshNkrax/VPb\nK89P1YrCo00FmZJt9LByOgXZXiFpGvBCSjPOD1qI9Vrbn2t6u2uJtxnlV8PelIPpn1xGFW8JbG/7\nnkHtSxvq68D2/4zy2I62HxjQfmwF7GD7ey1sW8AzbP+w6W2PE/cZALYfGmTcNkl6NoDt79eTor8D\n3FsrXE3H2rmv6Ae2/1fSrwC/bftLTcdcy77IDSXoKZnoJe1je9GG3o8uk3Sn7T0HGO8k22cPMF6r\n3UdrD5RDWdXU9gBweW2qapWkXYEXAUts3z7e+uu57U0oNfdXUppOHwfuBP7B9nCTsdayD53q+lt/\nWR7K6v/PK2w/0VSMqdpGf4ukZZLeJ2
nvtoNJ2k7SmSqXQPyRpIckLa1lrXS7Wsu+fKOFbT4q6Sf1\n76OSHgV2HylvId7b+27vAP5i5H4L8QbafVTSGyk9ioYoTRpbAS+rMd/YQryv9CwfQWny+wNggaTj\nGg53HuWc0ZmUpqGv1bL3SHpbw7Go5wKWSvqOpP0lXQncKOk+SS9pId5Aj3VJR1Per0OBtwL7AW8A\nblWT3bqbPos8iBuD7wVzOaVr3Iyeshm1rI1eMLPGuO1L+SnZdLy/Az5FacYYKftei+/fo5QuZH8G\nzKu3h0eWW4jX25Pi68BhdXk2cF0L8e4AnjZK+XQa7gVTt9vbq+g6YNe6/Cv09ZJpINagexQtBJ4P\nvAT4IXBgLZ8F/EcL8QZ9rC8Ctup5vy6vy/s0+dmcqidjbfvblJMzp9f+vMdQesHc64b7fVO+QFab\nM9/2g8BZameAz43AvzL6icLGaxW2T5G0L/D5Wjv8GO30Dhnxa5RxAlsDf27755KOtd3GifR+O9r+\nBpST6iPnCxomRv//PUE7J397Y23ues7B9g8lNfbzv3pM0u627+rvUaR2riOxme3FAJL+2/a1Nd7N\nLb13gz7WBYycq/oZpTcfthdJempTQaZqoh90L5h7JL0TuMD2cljZI+U4Vr80YlOWUgZnLOt/QFIb\n8bD9LZXRh2+lfMls0UacGute4FW1meFKSR9qK1a1m6QFlM/NTpK2sv3z+lgbXfT+CrhZ0hWs+nw8\nlzIO4n0txHtBbWIT8JSeXkWb03wvrUH3KOptXp7b91gbvWAGfaxfClwm6d8ozTcX15hPp8FKwVQ9\nGTvoXjDTKU1DR1C/cSmDNxYAZ7nhE0OSjqIM5LljlMeOtP2VUZ7WZPxnAS9yw6MAx4i1NTAf2N92\nG1/SIyONe33L9k/rAXyU7b9vIeZ04BDWPBnbdL/2te3D04CZtr/Z8HYH1qNI0uHAVT1fzCPluwOv\ntP2BhuMN9FivMX+X2sPOZazMyEnvzVzGXkw+xlRM9NG8+jPxmbbv6itPD6dYTT4r7ZI0y3WgVlOm\nZK8bSU+VdIakT0t6bd9jrXTRkzRb0n51ee/aQ+R324g16Hj1zP/twBdr74b9eh7+ZAvxBvr+SdpU\nZdKv90n6zb7H3jPW8yYR70eSzpX0ilr7bZWk56hMmvbvkt6tnsm+envkNBRr0J+Vgb53dbuDPPZm\n9d32pfSWelE9B9JMnKlYo5f0RWAZcD1lpN5jwGvrCaE2JsWaBxxGaZO8Etif0rXsIMrP8b+a4vFu\npfRE+UE9sf0pYK7tL0u6xfaLGo436PfvXEoXx4WUrmv/avvt9bE24t0BfBR4DbAL8AXKZHjXNxmn\nJ96VwBcp/88TKL2z/sD2Q02/fxvgszLo927Qx94TlPett4nmgFpm2y9vJFDT3YUGcWPNibdOp8zw\n+Ax6utI1GG8x5aTWVpSZCJ9ay7eknYmVBh6v7/6zKFMGnNLS/3PQ79+inuVpwMcpMyA+hXYmUevt\nzvlc4J2UfvXfBd4/gP/n64HvALs3/f/cAJ+VQb93gz72Xknp/HBYT9n3mo4zJZtuKD0LVu67y7fs\nJ4B/oySLpq2w/bjLCaG7bP+kxv0fSpe5qR7v0XpyixrnB5TBPkdQukI2bdDv38reGbZX2H4zZTKu\nqynT3jZtZXON7Xttf8Cl5vm7rF5za8pmklb2krL9GcocNJdTEnGTBv1ZGfR7N9Bjz/YXgd8DDpZ0\nsaTn0kLX5qma6L8KrPaTxvYngXdQ+/U27H+1akbHfUcKVYa5t5F4Bx3vRNbssvoopbtXG32HB/3+\n3STp0L54fwGcT2laado1oxXavt3tjBU4l9LE0BvrKuBVwLcbjjXoz8qg37tBH3u4TGj2p8D7KdcU\naPwLbEq20Q+apKd4lG5OKhMdPct1QMdUjRcRxYY+9urJ+21Hfkk0ZarW6Mck6Y+b3uZob3wt/yHQ\n+M
yHg463NpIG+qXSxvs3TryDBhzvz7oabwN8Vhp/7zb0sefiJ9Dse9e5Gr3KFAjPTbz12uYfjfUQ\nZVbCZzYZb5x9mfL/zy7Hy2dlasabklMgqFwYYNSHgB1aiDfWjIqijfa0AcejTDD2WUY/CdT4VAgb\n4P1bsJZ4jZ/81dgzforSe2Mqxxv0Z2XQ792gj/WBvHdTMtFTksEhrHmZNFFm72va+4G/ZvTLzrXR\n/DXoeIuAD7pMFLcalflvmjbo9++3KF0OfzpKvNktxPsxsJ/rXCmrBWxnrqJBxhv0Z2XQ792gj72B\nvHdTNdF/DdjG9q39D0gabiHezcBXbH9rlHh/0oF4/5fSZ3g0f9hCvEG/f9cDP3e9dGFfvDXmE2rA\npyhztq9x8AJtzNE0yHiD/qwM+r0b9LE3kPeuc230bZC0F/CQR5nESdIOo30bT6V4EVF09djrRKKX\ntD097YMu0+C2EedVti8er6yFuNtA6W/bcpydKEP3D6S0wf47cKrt+1uKd4Lt8/rKzrQ9p414dfs7\nUK7iA7DQ9n+1EGOtw/Ld8IRVfbGnA7/K6sfDv7UQZ1fgbZS+7CtbBmwf3nSsDWFQx/qgPitTOtGr\nTGH6N8Czgf+i/ARaaruNEXqjzq3RxnwbPdt+PuWn3cjc1P8NHDta+2hD8a6k/Fz8dC16PfA62610\nQZR0KfBZ25+t9/8e2ML2CS3FO5rS/jpM+X/+FnCa7S80HGfUAVOV3dT8JWvG/RPKiNidKKNHDwC+\n2UY8SbdRLiG4mJ6BRKM1sTQU74+AsyhTB6vebLuxi3P0xRvIsT6oz8pUT/S3UUZYXmX7RZJeBry+\n6UQh6TDK8PWjKb0ORjwV2Nt2GyeFkHQdcLrta+r9IcpcKU1fQWsk3q22XzheWYPxtqTM8/1PlJGV\nP7Z9ahuxarzbgINGavEqF8u4yvYL2oo5SLUf+36Uy/u9UNLzKJ+XsbpETibWDbYbv97uWuL9J2Wi\ntqUtx9kgx3rbpvqAqcdsPwRsImmTmhBf3EKc7wM3Ab+gTOA0cltA6T3Slq1HkjyA7WHK5ffa8pCk\n16tMDbuppNcDDzUdRNLTVa6gsyXwJ5RJvx4F/ryWt2WTvqaah2jxGJB0snouKC1puqST2ooH/ML2\nL2qsp9i+HdirpVgfkTRP0kvUM81uS7EAlred5KsNcqy3/VmZ6jX6q4AjgTMoF9b9L0pXpbZqvJvZ\nfqwuTwee4xYvtCDpy5ReAL1NKfvabqN3A5J2prTRv4TSRn8dcErT5zwkfa9uXz1/R9j2bk3G64n7\n15SLLn++2ARKAAALRUlEQVS+Fr2aMiPhu1qKN9ovpMan8u3Z9peBP6b0jHk5pfvqZrYbn0td0hmU\naYPvYlXTTZvNUh+hXKT7K/RMDGf7Sy3FG/Sx3upnZaon+q0p37wCXgdsR2nzbbwWWuMNA4dTTj59\ni/LFcp3LhERtxJsO/Dnl5CiUk6PzPcDL0XVNbetd+f+0/eUWYy0G9nE9yCRtSvliaeUcUl/sl1KO\nh8tsNz5RXG1K2buNbY8R7/xRim27jYnUNsSx3upnZUon+kEb+YatJ72eY3uepEW299nQ+9aE2mb9\nJtbsSdHWwbQZZTbEkWvFDgP/OFKTajjWppT2+Jc1ve21xPxrSgeBf6xFbwHus/2OFmNOB57D6u9f\n4718VK5c9eY2ei09GQz6WG/7szIlB0xJepTVf/qvfIgWz8QD01QunH005WIZrZD0VdYyJ3WLXdgu\nofxquAp4vKUYvc4BNgNGLh/4hlrW+MAU249LekLSdrYfaXr7Y3gX5YA9sd6/kjKlcCskvQ84jnKB\nk5XNKfRNCd2QpwG3S7qR1ZtSWvlsSvq7UYofAW6yfUkLIQdyrPdo9bMyJRO97W03UOi/oFzM4Vrb\nN0rajXJJvKZ9sP79I0q75Gfq/dcw+gi6pmzVVnv1GPbr6/Fyde0Z
05afAotrN9KfjRTaPqWNYLaf\noHxxndPG9kdxNLD7gJpT5g0gRq8tgOcBI/3YX0mZTfIFkl5m+/82HG9QxzrQ/mdlSjfdSPq07TeM\nVzZVSbrJ9ovHK2sw3l9S2iEvbWP7o8S7GXiV7bvq/d2AL7Q4LuHY0cptX9BSvJGTzv3x2jrZ/EXg\nxEE1pwxi8FlPrOuB37T9eL0/jfLr80DK5Q33biv2IKhc+Hw+pflmGqtaJxr5rEzJGn2P1U5U1Dd/\n3zHWnbR6Qmi0A7eVNmxga0m72f5ujb8r7XavPBV4t6RfUi7Y3XZT2GnANZK+W2PtTOk10ihJ/2L7\nFZSTh4P8xdL7hbwF5YpPbXYfPQO4RdK3abk5ZZTBZx+V1Pjgsx7TKbNHjjS7bQ08vTbJNX55xg1w\nrJ8H/CnlxG/jzaZTMtFLmgu8G9hSq0/z+Rjl4sFt+VrP8haUSZy+32K8PwWG+xLhW9oKNugmMdv/\nIulXWdXX+w6PceGHSXqWpN8ADpd0Iat352xtSoJRen99WNK3gLYuBnIBZfToaqNVW3I6pelttcFn\nQFuJ/gPArbU3jCgn8N9fe95d1UK8QR/rj9j+Rlsbn+pNN2dQPgB7smpuD7uFuT3GiL8JpQ2vlX77\nNcZTKG2TALe3kQglPc/27WMNeGkrEQ6q142ko4ATKD/zb+p7uM2+373/z00oNfwT2xqJK+lG2/uN\nv2YjsRbbfn7P/U2A23rLWoj5LFZNTXyj7TYTb3/sVo91SWcCmwJfYvVfY5nrRtKbgFMYwNweY8Tf\nC/i67T1ajPEbrNnd8VMNx/i47Tdr9Xk3Vn4wWkyE51J63Yy0kb8BeNx2G9PBIum9wMdYVTEwtDPp\nV413Dav+jyuAuylzud/ZUry/pSSJBbSQLPpijTb4bLHtdzYcZ4NUQkbZj1aP9Z5jb+TzMtJs2six\nNyWbbnqcwqq5PV6mOrdHW8F6unVS/y6nDN9vK96ngd0pX2Ij7XamTHTWGNtvrovnUAbY/KQmxVnA\n+5qM1WfQvW4eBP6N1SsG1wGvaCneYZTeIbuw6lg7htKjow0joygP6ClrpXul7dP6Bp99vKXBZ28H\n3kyZvHBl+J7ltiohAz3WKb9m+zVWC5/qif4Xtn8haeXcHvWbtxW2t1WZi6V3Gtg2fxK9mHICcVA/\nu95j+yJJB1IOoA9Skn9bk1c9Lmn3vl43bfbfH2jFgDJc/8eUaSx+0WIcAAY8GOysemL7S6OUNWZD\nVUI2wLHeOwX5FsDvA43N7TPVE/39KhMBfQW4UtLDwD1tBdMY08DSUq0C+DalH/0PWtp+v5Ek+3vA\nJ2x/vXa5bEtvrxsoNd/Ge930GGjFANjJ9qEtbn8Nkn6P0hutdz76Nn5BHEQZ5NPrsFHKmjLQSsig\nj3Xbvb9YkPRBSj/+Rkzp2Stt/6HtH9ueD7yX0kXpyBZDnkqpEd5Ta08votTY2vIrwBJJl0taMHJr\nMd4Dkv6R0t56aT0R3OZn5D8oQ76fAH5Ul7/ZYrz+isEltFgxAK5TuabAQEj6B8p79zZKG++rKD21\nmoxxosq8LM+TtKjn9j1Kb5+2rFEJATZvMd6gj/V+W1G+ZBoxpU/GDtpIrwZJtwL72/6lpO+4vQud\nvHS0crd3cYetKPPCL7a9rPZyeL7tK1qKdxHl+qOfrUWvBZ5m+1VtxOuL3eqkXzXGEmAPygjOX7Lq\nBFtb86Ussr1Pz99tgG/Y/q0GY2xH6dN+BnAmq3pMXWv7lqbijBL3a8ADlF8Ss4D/oQzSarUH0wCP\n9cWsahraFHgm8Be2P9bE9qd6082gDbSpqK2EvpZ4P6enzdX2D2i32ejX+0Y0XlOTY+sG9L89bAAx\nev1P/ftzSc+mzLf/rCYDuMwT9EgdqfoZyudFwAWSPmH7o03G63E0pRLyQds/rpWQ01qKBQM+1ilt\n8iNWUObfX9HUxlOjn6A2a4SS
rrV9YN+Zf2h/pOpASfoM8DHb19f7+wMn237jht2zqamepPwopR35\n72vxubbf20KsRcBLbP+s3t+a0rW5EzO59hrEr7+2JdHHwPX8TN2MMir23np/Z8qgsCk9b8mGonJp\nxhMp18Idubj7Oa5XnWo41mJK99iRK1ptQRnENLBzErHu0nQTG8Lvj79KTMAFlEsyjkzp+1rKmIuj\nW4h1PnCDylWtoHSCOK+FONGA1OgjOkLSkv5fQ6OVNRhvFqtfrau1k7ExOanRR3THzZIO6Dvn0T+3\nT2Pq9AMDmYIgJic1+oiOkLSUVec8AJ4L3EHpxdFat8548kuij+gISWsdHGW7ze6B8SSWRB8R0XFT\negqEiIgYXxJ9RETHJdFHrAdJ20k6cZx1dpb0mkHtU8R4kuhjoyVp0wk8bTpw0jjr7EoZrNRUzIhJ\nSaKPKa3WnpdK+oykJZIukrSlpPdKuqFOofsPPetfI+lDkm4ETpH0K5K+UNe9QdJL6nrzJJ1X1/9P\nSW+tmzgD2E3SzZLOGmO3zgAOrOucKulYSZdI+hfqhawl/T9JCyXdKmlez/69ru7HzZLOkaQxYkSs\nsyT66IK9KJOj7U2ZAuBE4KO29699x7eqF+QYsZnt/Wx/CPgI8Le29weOYvVh/HtRpsXdH5hfa+Nz\ngLtsz1rL1ZTmUEaKzrL9kVr2IuCP6pWtDgJ+1fbsWv5iSQeqXPHq1cBv2J5Fmaf/dZP830RkZGx0\nwr0jo0EpU+eeAtwt6Z2UCzhMp1yt6+t1nX/uee7vADN7as7b1Hn5oVwMegXwkKTlwA6T2Mcr6xS/\nAAcDB0m6mTIj6daUS9a9ANgXuLHuzxaUa5VGTEoSfXSRKdP07mv7+7VpZIuex3/WsyzKhSUe691A\nzfu/7Cl6gskdL/0xz7D9ib6YbwU+afv0ScSJWEOabqILnlvndYFyEvTf6/JD9SpLR63luVdQLhsH\ngKTxrlj0KLDtJNe5HDi+zuGOpGdLeibwL8BRdRlJ0yU9d5xYEeNKoo8uuAM4uV6dajvKRaPPBb4D\nfANY2LNu/1DwUylt5LdJ+jbwljFiGMD2j4D/qCd5xzoZuwh4QtItkk7tj2n7SuBzwDfrBTwuBrax\nvRR4D3CFpNsoX0Izxn/5EWuXKRBiSqvzu3wtF7yIGFtq9NEFqa1ErEVq9BETJOnXgU+z6otGwC9s\nv2TD7VXEmpLoIyI6Lk03EREdl0QfEdFxSfQRER2XRB8R0XFJ9BERHZdEHxHRcf8fb59SpijUGvgA\nAAAASUVORK5CYII=\n", | |
"text/plain": [ | |
"<matplotlib.figure.Figure at 0x478b65c0>" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"famsizes = langs.groupby('parent_tree').size().sort_values(ascending=False)\n", | |
"\n", | |
"print famsizes.describe()\n", | |
"famsizes[famsizes > 100].plot.bar();" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Analyze the **number of steps from languages to their top-level family**." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 67, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"count 8397.000000\n", | |
"mean 5.691080\n", | |
"std 3.476092\n", | |
"min 0.000000\n", | |
"25% 3.000000\n", | |
"50% 5.000000\n", | |
"75% 8.000000\n", | |
"max 17.000000\n", | |
"Name: steps, dtype: float64\n" | |
] | |
}, | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAEACAYAAAC9Gb03AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XmUXGWd//H3tzs7S0hYEggIERQCGDFKggLSsslySJhR\nGRYdkDng+aESWZTEGU1wHCUoCuMAyqCYgfADDLLoEIkh9E9R9rDEpLN2ZScdgZAQknS6u76/P55q\n0km600stz61bn9c5dbr69q2qT5b+1q3vfZ7nmrsjIiLpVRU7gIiIFJcKvYhIyqnQi4iknAq9iEjK\nqdCLiKScCr2ISMp1WujN7Jdm1mBmr7fZNsjMZprZQjN70swGtvnZRDNbbGZ1ZnZmm+2jzOx1M1tk\nZrcW/o8iIiLt6coR/T3AZ3faNgGY5e5HArOBiQBmdjRwATACOBu4w8ws95g7gX9x9w8DHzaznZ9T\nRESKoNNC7+7PAOt32jwOmJq7PxU4P3d/LPCAuze7+zJgMTDazIYCe7n7i7n9/qfNY0REpIh62qM/\nwN0bANx9LXBAbvswYGWb/Vbntg0DVrXZviq3TUREiqxQJ2O1joKISEL16uHjGsxsiLs35Noy63Lb\nVwOHtNnv4Ny2jra3y8z0xiEi0gPubjtv6+oRveVurR4HLsvdvxR4rM32C82sj5kNB44AXsi1dzaY\n2ejcydl/bvOYjsIm8jZp0qToGcoxW9LzJTlb0vMpW3LydaTTI3ozux+oAfY1sxXAJOAm4Ddmdjmw\nnDDSBnefb2YPAfOBJuAq3/7qXwV+DfQDnnD3P3T22iIikr9OC727X9zBj07vYP8fAj9sZ/vLwEe6\nlU5ERPKmmbHdVFNTEztCh5KcDZKdL8nZINn5lK3nSpXPdtfXicXMPIm5RESSzMzwPE7GiohImVKh\nFxFJORV6EZGUU6EXEUk5FXoRkZRToRcRSTkVehGRlFOhl12sXw/XXw/jxsE778ROIyL50oQped+2\nbXDnnXDjjWAWCv7ee8OCBTB0aOx0ItKZjiZMqdAL7vDII3DttbBlC2zYAI2N4WfV1dC3L8ybB4cd\nFjWmiHRChV7a9fzzMH48LFkCGzeGot/cvOM+VVXQqxe8/DIce2ycnCLSORV62UEmAzfcADNnwtat\nYVvrUXx7qqrC7U9/gk9+sjQZRaR7tNaNANtPtI4cCTNmQFNTKPC7K/IA2Sy0tMBJJ8GTT5Ymq4gU\nhgp9hdi2DW67DQ4/HO65JxTtTZtg8+auP4d7uJ19Njz4YPGyikhh9fSasVIm3OG3v4XrrgsnWjdv\n7vzovbPnA7joonDS9sorC5NTRIpHhT7FunKitafc4StfCePsv/WtwjyniBSHTsamUHdPtOZrwgT4\n4S4XjxSRUtOomwqwfj18//tw113h+2y2ez34fFx5JfziF6V5LRFpnwp9yq1dCx/5SCjuW7aEWymZ\nwec/Dw89VNrXFZHtNLwy5b773TBU8u23S1/kIfTsp0+HM87YfsJWRJJBR/QpMG9emMTU0lK6Vk1H\nzODjHw8ngqt0GCFSUjqiT7HrrgujaWIXeQhH83PmwDHHhE8YIhKfCn2ZmzULXnghdoodZbOwaFGY\nnBWjjSQiO0p1oV+3Dj70IfjLX2InKY6WFvjGN+C995JXULNZWLMGDj1Ua9qLxJbqQr9gASxdCqee\nGk5Sps2994Y3s6SezmhpCX/vhx4Kb7wRO41I5Up1oc9kYMCA0CsePTq5BbEnNm8OE5U2bkx2L7yl\nJXziOPzw8KYrIqWX6kJfXx9mhLqH+2lal+UnPwntkSQX+VYtLeHf4aij4OmnY6cRqTypLvQLF24/\nineHu++GRx+Nm6kQ1q6Fm28Oq09ms7HTdE3rMsennQb/+Z+x04hUltQX+r59d9z2hS+EQlnOJk0K\nl/hL2gnYzrQuczx+PHz5y7HTiFSOVE+YGjw4
tDY2bdq+raoKDjoIli8vzwk98+fDCSckY3JUPsxg\n1Ch49lno3Tt2GpF0qLgJU1u3hhOVras3tspmYfVquPjiOLnydd115V/kIRzZv/IKDBumETkixZba\nQr98eRhx09766+7hCkn33Vf6XPl46qmwtEACP4T1SDYbhl8edhj89a+x04ikV2oLfX19562ZSy+F\nZctKEidv2WxyJ0flo6UlvBmfdFI4WS4ihZfaQp/JdD700D2Mr29pKU2mfNx3HzQ0pOdovq1sNvy5\nrrgCvva12GlE0ievQm9m15jZ38zsdTObZmZ9zGyQmc00s4Vm9qSZDWyz/0QzW2xmdWZ2Zv7xO1Zf\nv2t/fmfu8OabMHZsMZPkb/PmcMWoDRvKY9x8Pu64IxzdF+qShyKSR6E3s4OArwOj3H0k4fqzFwET\ngFnufiQwG5iY2/9o4AJgBHA2cIeZ7XJ2uFDq6sIQxM64wxNPwJ13FitJ/m69dXuLI+3cw0icD3wg\nvAmLSP7ybd1UA3uYWS+gP7AaGAdMzf18KnB+7v5Y4AF3b3b3ZcBiYHSer9+hxYuhT5+u7/+1r4U3\nh6RpaIApU+Ddd8tnclS+stmwhs8hh4Qlj0UkPz0u9O6+BrgFWEEo8BvcfRYwxN0bcvusBQ7IPWQY\nsLLNU6zObSuK1au7t797uHjHtm3FydNTkyaFk8qdtaHSpqUl/FscfzxMmxY7jUh569XTB5rZPoSj\n90OBDcBvzOwSYOfThT06fTh58uT379fU1FBTU9Plx65fH9oc3Sna7mHc/emnw5/+1PXHFdP8+XD/\n/eVxsrgYWj/BfPGL8NprYdkHEdmutraW2traTvfr8cxYM/s88Fl3vyL3/ZeAE4BTgRp3bzCzocDT\n7j7CzCYA7u5Tcvv/AZjk7s+389x5zYydMycsTbxhQ88eP2UKfOtbPX75gjnnHKitTddwyp4yC/+m\nTz7ZtXMvIpWoGDNjVwAnmFm/3EnV04D5wOPAZbl9LgUey91/HLgwNzJnOHAEUJRrI2Uy+Q1DnDAB\nXn65cHl6YvZseO65uBmSxD2sfHn44T1/AxepVD1u3bj7C2Y2HXgFaMp9vQvYC3jIzC4HlhNG2uDu\n883sIcKbQRNwVbGuAJ7JhGVxe8odTjklnBAcMKBwubqq7eSopJ0ziCmbhVWrwrIJzz0Hxx4bO5FI\neUjlomZf+Qr893/nd1RfVQXHHRfnyP7ee8OaNhs2qNC3p7V18+KL8LGPxc0ikiQVtahZXd2uyxN3\nVzYbev3/9m+FydRVW7aE8wMq8h1raQm3Cy+MnUSkPKSy0Gcy0KvHTakd/eAHpR2FU0mTo/K1eHGY\nZyAiu5e6Qp/NhguLFKpQusOZZ4ahl8W2bh3cdFNlTY7Khzv87GexU4gkX+oK/RtvhBmxhZxg1NQE\nY8YUf0GxyZMrc3JUPu64I3YCkeRLXaHPZAp/xaJsNlyWcPz4wj5vWwsWhBUq1ZfvnnfeCZOpRKRj\nqSv09fXFmUna2iaYMaPwzw1w/fXpuHJUqbnDv/977BQiyZa6Qp/JFLf1cc45MHx4uLj1o4/CW2/l\n/5y1teEKSwkc6VoWfvc7nbwW2Z3UFfqFC4tfMN94Iyy09bnPwZAhMHQo/OM/wtSpsGJF954rmw0t\nobRdOaqUtm2D3/8+dgqR5ErdhKlRo0KxL2ULpHfvcBJ127YwmWePPeATnwgXNDn9dBgxIqzV0p77\n7oNrrw295rRfVKSYPv5xeOml2ClE4upowlTqCv1++4XlDzZtKnCobqiuDsW/sTGM56+uhpEj4dxz\n4ayzwmzO3r3DEfzhh4f2j07C5qeqKlxofODAzvcVSauKKPSNjeFouqoqWUfHZtCvX8hXXR1aS0ce\nCQceGEaMvPWWxs0Xwo9/HJaOEKlUFVHoFy8OrZuYR/Nd1b//9laPjuYL48ADYc2a2ClE4qmItW7q\n68tnrfIt
W7ZfRUkKY+1aWLo0dgqR5ElVoc9kktWykdJyD0tIiMiOUlXo6+u1fEClmzZN8xFEdpaq\nQj9/fvm0bqQ4tm4NE9BEZLtUFfolS8KCZlK53OF734udQiRZUjXqZs89wy+61oupbNXVYann/v1j\nJxEprdSPumm9IpOWEZCWFrj//tgpRJIjNYU+kwlHcAn8gCIR/PCHsROIJEeqCr1Iq/r6sPiciKSo\n0NfXhyUGRCB8svvpT2OnEEmG1BT6JUs0y1R2dNddauWJQIoKfV0d9O0bO4UkycaNMGdO7BQi8aWm\n0NfXhyWBRVrpMoMiQSrG0buHo/nqai2BIDvq3TtcvavQF4wXSaJUj6NfuzYczavIy86amuCxx2Kn\nEIkrFYW+vl5LH0jH1L6RSpeKQp/JhNmQIu2ZOxfWr4+dQiSe1BR6tW2kI+7w85/HTiESTyoK/aJF\nuuaq7N5tt8VOIBJPKgp9XV24+LZIR9atg4ULY6cQiSMVhX75cqhKxZ9EisVdC51J5Sr7cfTbtsGA\nAaHQ63qxsjv9+oUx9TookLRK7Tj6FSvC8sQq8tKZxkZ46qnYKURKL69Cb2YDzew3ZlZnZvPMbIyZ\nDTKzmWa20MyeNLOBbfafaGaLc/ufmX/8MOJG14mVrnCHG2+MnUKk9PI9or8NeMLdRwAfBRYAE4BZ\n7n4kMBuYCGBmRwMXACOAs4E7zGyXjxjdlcnoaF667tlnQ/tGpJL0uNCb2d7Aye5+D4C7N7v7BmAc\nMDW321Tg/Nz9scADuf2WAYuB0T19/VZLl2oMvXRdNgv33hs7hUhp5XNEPxx408zuMbM5ZnaXmQ0A\nhrh7A4C7rwUOyO0/DFjZ5vGrc9vysmCBWjfSPVOmxE4gUlr5FPpewCjgdncfBbxHaNvsPFymqMN6\nFi/WOjfSPcuXw6pVsVOIlE4+K7ivAla6+0u57x8mFPoGMxvi7g1mNhRYl/v5auCQNo8/OLetXZMn\nT37/fk1NDTU1Ne3ut3Jlu5tFOuQOt9yiSw1K+autraW2trbT/fIaR29m/w+4wt0XmdkkYEDuR2+7\n+xQzuwEY5O4TcidjpwFjCC2bPwIfam/AfFfH0b/7Luy7b1jQTEsgSHfstRds2AD5DwcQSY6OxtHn\ne02mq4FpZtYbqAe+DFQDD5nZ5cBywkgb3H2+mT0EzAeagKu6dXWRdmQyYQz9xo35PItUok2b4MUX\nYXTewwFEkq+sZ8Y++ihceqkKvfTMOefA//5v7BQihZPKmbGZTFgCQaQnZs7U/x+pDGVd6JcuDdPa\nRXqiuRkefjh2CpHiK+tCP3++hlZKfr7//dgJRIqvrAt9fT307h07hZSzujp4883YKUSKq2wLvTu8\n8YauFSv5cYc77oidQqS4ynbUTUMDDB8OW7aUKJSk1n77wd//HjuFSP5SN+qmvl79eSmMt94K53tE\n0qpsC30mo7aNFIY7/Md/xE4hUjxlXeg1tFIKZfp0XddA0qtsC/3ChTqil8JpboZhw+CZZ2InESm8\nsi30CxZA376xU0haZLOwfj18+tNw1llaVkPSpWwL/fLluuCIFFZzc+jXP/VUGIlz663he5FyV5bD\nK5uawqqV1dVaq0SKo3X54kMPhd/9Do49Nm4eka5I1fDKlStDoVeRl2JxD7dVq2DkSLjkEp38l/JV\nloU+k1HbRkqjtZ3z4IMweDBMmxY7kUj3lWWhr6/XUDgprZYW2LoVvvSlcIS/fHnsRCJdV7aFfuvW\n2Cmk0mSz4ei+rg4++EG4+upwxC+SdGVZ6BcsUOtG4mluDkX/9tth//3hD38o7PNv3AjPPgt33w3j\nx4fhnq+9VtjXkMpSlqNujjkmfHR+770ShhJpR3V1KPonngiPPBKGZXbVe++FTwfz5sHcufDSS+H+\nhg2wxx7hDWXz5jACqE+fMM5fc0dkdzoadVOWhX7gwNAzVaGXpOjVK3z9zn
fCzdr8qjU2hk+hrQV9\nzpzw9c03Q0FvaQkFvbo6FPQtW3ad9W0Gl10Gv/pVyf5IUoZSU+g3bQqjH1pawpGUSJKYwZAh4aL1\n8+aFlsvatTBgQOjvb94MVVWhoDc2dm9QgRksWRLOD4i0JzWFfu5cOOkkTVGX5OrVKxyIVFdDv36h\nmBdiDL4ZHHWUllSWjqVmwlQmEzuByO61jr1vbg6fQAs10ap1xM+DDxbm+aRylGWh14xYqWSXX655\nJNI9ZVfoly7VVHSpbFu2wNe/HjuFlJOyK/Tz5+sSglLZ3OGuu2DFithJpFyUXaFfuhR6946dQiS+\n886LnUDKRVkVend44w1dWUrEHV5/PUzSEulMWQ2vXLcurA/e2KgLQogA7LlnmDHbOmFLKlsqhldm\nMmEKuIq8SPDee3DttbFTSNKVXaHXbFiR7dzD4mpr1sROIklWVoW+vl5DK0V25q4Ts7J7ZVXoFy3S\n+t8iO3MPC6U98UTsJJJUZVXoFyzQMq0iHbnoIo1Ik/aVVaFftiys/Cciu3r3XZgwIXYKSaKyGV7Z\n3BxWAqyu1lo3Ih2pqgonZocMiZ1EYija8EozqzKzOWb2eO77QWY208wWmtmTZjawzb4TzWyxmdWZ\n2ZndeZ1Vq0KhV5EX6Zg7jBsXO4UkTSEaIeOBtitkTwBmufuRwGxgIoCZHQ1cAIwAzgbuMLNd3nk6\nUl+vSSEinXGH55+HP/4xdhJJkrwKvZkdDJwD3N1m8zhgau7+VOD83P2xwAPu3uzuy4DFwOiuvlYm\noxE3Il31T/+kOSeyXb5H9D8Fvgm0bagPcfcGAHdfCxyQ2z4MWNlmv9W5bV1SXw9bt+YXVqRSvPNO\nuHatCORR6M3sXKDB3V8FdteCKcjZ3gULdrzgsoh0zB1uuilcgFwkn673icBYMzsH6A/sZWb3AmvN\nbIi7N5jZUGBdbv/VwCFtHn9wblu7Jk+e/P79mpoaFi2qoU8ftW9Eusod/uEf4M9/jp1EiqW2tpba\n2tpO9yvI8EozOwW4zt3HmtnNwFvuPsXMbgAGufuE3MnYacAYQsvmj8CH2lumsr3hlfvsE4r8e+/l\nHVekotTWwimnxE4hpdDR8MpijGO5CXjIzC4HlhNG2uDu883sIcIInSbgqnbXIm7H5s2hwCdwyL9I\n4n3uc2GJb002rFxlMWFq3jz41Kdg48aIoUTK2I03wne/GzuFFFtZr0efycROIFLevve9MBJHKlPZ\nFPqmptgpRMpXNhtaOFKZyqLQL1miMfQi+XCH2bPh2WdjJ5EYyqLQ19VB796xU4iUv/PP16CGSlQW\nhX7pUujTJ3YKkfK3bh386EexU0ipJX7UjTv07x+WJ968OXIwkRTo1QsaGmDw4NhJpNDKdtTNW2+F\nr1u2xM0hkhbZLIwZoxZOJUl8oc9kwjr0+k8pUhjZbGiHXnxx7CRSKokv9PX1Wm5VpNDc4YEH4Oc/\nj51ESiHxhT6TgcbG2ClE0umqq+Cll2KnkGJLfKFfvFgrVooUizucfDJs2BA7iRRT4gt9XR307Rs7\nhUh6bdsGo0bpPFiaJb7QL1sWhlaKSHFks6FFesEFsZNIsSS60Le0hAkeWudGpLjcYfp0uP322Emk\nGBI9YWr5cjjmGF1sRKRUzOC552D06NhJpCfKcsJUJhNm8YlIabiHq1GtXx87iRRS4gu9RtyIlNa2\nbfCxj+nkbJokutDX12t5YpFSy2ZhxQqtX58miS70CxbETiBSmdzhkUfgtttiJ5FCSHShX7RIY+hF\nYrrmGl2sJA0SPepm8ODQL9SoG5F4+vaF1ath331jJ5HOlN2omy1b4N131aMXia2pSTNny11iC/2y\nZeGCIy0tsZOIVLZsFlauDJchlPKU2EKfyYTJGyISnzs8/jj85Cexk0hPJLbQ19dr6QORpLn+enjm\nmdgppLsSW+iXLlV/XiRp3OG00+DNN2
Mnke5IbKGvq4PevWOnEJGdNTfDccfpym/lJLGFfskSFXqR\nJMpmYc0aOO+82EmkqxJb6Nes0XAukaRyhyeegClTYieRrkjshKm+fV3XihVJODOorYVPfzp2EoGO\nJ0wlttDvvbezcWPsJCLSmT59YO1aGDQodhIpu5mxCXz/EZF2NDfDmDH6nU2yxBZ6tW1EykM2GwZP\njB8fO4l0JLGtGzPXEYJImZk5E844I3aKylV2Pfr+/Z0tW2InEZHuUL8+rrLr0VdXx04gIt3V3Ayf\n/KT69UnT40JvZgeb2Wwzm2dmc83s6tz2QWY208wWmtmTZjawzWMmmtliM6szszN39/xa50ak/GSz\n4YJB114bO4m01ePWjZkNBYa6+6tmtifwMjAO+DLwlrvfbGY3AIPcfYKZHQ1MA44HDgZmAR/ydgKY\nmYMOCUTK2axZYV0cKZ2Ct27cfa27v5q7vwmoIxTwccDU3G5TgdZVrMcCD7h7s7svAxYDo3v6+iKS\nbOeeCxs2xE4hUKAevZkdBhwHPAcMcfcGCG8GwAG53YYBK9s8bHVum4ikUFOT+vVJ0SvfJ8i1baYD\n4919U2i77KCH/8yT29yvyd1EpFxks7BgQVjD/pZbYqdJp9raWmprazvdL6/hlWbWC/g9MMPdb8tt\nqwNq3L0h18d/2t1HmNkEwN19Sm6/PwCT3P35dp5XPXqRlDCDp5+GU06JnST9ijW88lfA/NYin/M4\ncFnu/qXAY222X2hmfcxsOHAE8EKery8iCecOn/0sWrsqonxG3ZwI/AmYSzj8duDbhOL9EHAIsBy4\nwN3fyT1mIvAvQBOh1TOzg+fWEb1IilRVwYgR8Le/xU6SbmU3M1aFXiRdzEK//uabYydJLxV6EYlO\n69cXlwq9iCRCv36wbh3stVfsJOlTdmvdiEg6NTbCiSfGTlFZVOhFpKTcYe5cmDgxdpLKodaNiERh\nBn/+s47uC0k9ehFJnP79Q79+zz1jJ0kH9ehFJHEaG+Hkk2OnSD8VehGJJpuFV1+F73wndpJ0U+tG\nRKIzg7/8Jax2KT2nHr2IJFr//vD3v8Mee8ROUr7UoxeRRGts1IzZYlGhF5FEyGZhzhw477xwZC+F\no0IvIokyYwYceCCMGwcrV3a+v3ROhV5EEqWlJdxmzIDDDoNTTw1XqpKeU6EXkURqagrtnGeegWOO\ngdGj4aWXYqcqTyr0IpJorQX/tddCsT/6aHjqqdipyosKvYiUhW3bwoJoS5bAGWeEts7DD4dtsnsq\n9CJSVpqaQnFfswa+8AUYOhR++ctw1C/tU6EXkbLUWvDffhuuuAL23RduuSVslx1pZqyIpEJ1dSj8\n/frBNdfAv/5rmG1bSbQEgohUhKpcn6JXL7jyynAx8kop+Cr0IlJxqquhd2/40Y/gq18Ni6elmQq9\niFSkqqrQ0hk2DB58ED71qdiJikeLmolIRcpmQ6FvaICTToLPfCZc1aqSqNCLSEVoHaXzzDNw0EHw\njW9UzggdtW5EpCJVVcGAAXDnnfDFL8ZOUxjq0YuI7KS6OrR2jjgCpk+HkSNjJ8qPevQiIjtpaQnt\nnGXL4LjjYOxYeOed2KkKT4VeRCpea/9+xgzYf3+YNCm8CaSFWjciIjupqoJ99oFf/zpc8apcqEcv\nItINrf37kSND//6II2In6px69CIi3dDav58/H448Ei65BDZujJ2qZ3RELyLSBa1H+NXVsNdeobUz\neDAMGRKucXvIIWHJ5P32C7f99w9fBw8O6+6Uglo3IiIF0qfP9uKdzYaj/6am8CbQq9f2hdVat/fv\nH94cBg0KbwBDhoRJWx/9KIwZA0cdFR6bLxV6EZFIWhdXa/sG0Nwc3iR69w4touHD4YQTwtWzRo8O\n5wSqutlcT0yhN7OzgFsJ5wd+6e5T2tlHhV5EKkq/fqHwNzdv/7RwxBFhEbYzzoDjjw+XT9zdCpyJ\nKP
RmVgUsAk4D1gAvAhe6+4Kd9ktwoa8FaiJn6Egtyc0Gyc5XS3KzQbLz1aJsPVXL7vL17x8Kf0tL\n+FRQXR1ODJ90Epx2Wij+w4ZtL/5JGXUzGljs7svdvQl4ABhX4gx5qo0dYDdqYwfoRG3sALtRGztA\nJ2pjB9iN2tgBdqM2doBO1O72p1u2hB5/Nhu+NjbCokXwi1+E6+UOHx56/8cfD9dd1/HzlOhc8PuG\nASvbfL+KUPxFRKQT7qH4t5XNwrx58MorHT+u1IVeREQKKJvdtfjvrNQ9+hOAye5+Vu77CYDvfEI2\n9OhFRKS7knAythpYSDgZ+wbwAnCRu9eVLISISIUpaevG3VvM7GvATLYPr1SRFxEpokROmBIRkcJJ\n1KJmZnaWmS0ws0VmdkPsPG2Z2cFmNtvM5pnZXDO7OnamnZlZlZnNMbPHY2dpy8wGmtlvzKwu9/c3\nJnamtszsGjP7m5m9bmbTzKxPxCy/NLMGM3u9zbZBZjbTzBaa2ZNmNjBh+W7O/du+amYPm9neScnW\n5mfXmVnWzAbHyJbL0G4+M/t67u9vrpndVIzXTkyhz02m+i/gs8AxwEVmdlTcVDtoBq5192OATwJf\nTVg+gPHA/Ngh2nEb8IS7jwA+CiSmXWdmBwFfB0a5+0hCO/PCiJHuIfwOtDUBmOXuRwKzgYklT7Vd\ne/lmAse4+3HAYuLlay8bZnYwcAawvOSJdrRLPjOrAc4DPuLuHwF+XIwXTkyhJ+GTqdx9rbu/mru/\niVCshsVNtV3uP/M5wN2xs7SVO7o72d3vAXD3ZndP2mKv1cAeZtYLGECYtR2Fuz8DrN9p8zhgau7+\nVOD8koZqo7187j7L3bO5b58DDi55MDr8uwP4KfDNEsfZRQf5/g9wk7s35/Z5sxivnaRC395kqsQU\n0rbM7DDgOOD5uEl20PqfOWknXYYDb5rZPbm20l1m1j92qFbuvga4BVgBrAbecfdZcVPt4gB3b4Bw\nwAEcEDnP7lwOzIgdopWZjQVWuvvc2Fk68GHg02b2nJk9bWafKMaLJKnQlwUz2xOYDozPHdlHZ2bn\nAg25TxyWuyVFL2AUcLu7jwI2E1oRiWBm+xCOmA8FDgL2NLOL46bqVNLezAEws38Fmtz9/thZAHIH\nFN8GJrXdHClOR3oBg9z9BOBbwEPFeJEkFfrVwAfafH9wblti5D7aTwfudffHYudp40RgrJnVA/8X\n+IyZ/U/kTK1WEY6oXsp9P51Q+JPidKDe3d929xbgt8CnImfaWYOZDQEws6HAush5dmFmlxFah0l6\nkzwcOAzQdzbGAAABXElEQVR4zcwyhJryspkl6RPRSsL/Odz9RSBrZvsW+kWSVOhfBI4ws0Nzox4u\nBBI1egT4FTDf3W+LHaQtd/+2u3/A3T9I+Hub7e7/HDsXQK7lsNLMPpzbdBrJOmG8AjjBzPqZmRHy\nxT5ZvPOnsseBy3L3LwViH2TskC+39Pg3gbHu3hgtVS5O7oa7/83dh7r7B919OOGg42PuHvONcud/\n20eBUwFyvyO93f2tQr9oYgp97miqdTLVPOCBJE2mMrMTgUuAU83slVy/+azYucrE1cA0M3uVMOrm\nB5HzvM/dXyB8yngFeI3wS3hXrDxmdj/wV+DDZrbCzL4M3AScYWats8qLMgQvj3w/A/YE/pj7vbgj\nQdnaciK2bjrI9yvgg2Y2F7gfKMoBmiZMiYikXGKO6EVEpDhU6EVEUk6FXkQk5VToRURSToVeRCTl\nVOhFRFJOhV5EJOVU6EVEUu7/AymgAxnSM72VAAAAAElFTkSuQmCC\n", | |
"text/plain": [ | |
"<matplotlib.figure.Figure at 0x49665f28>" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"langs['steps'] = langs['steps'].fillna(0)\n", | |
"print langs['steps'].describe()\n", | |
"langs['steps'].value_counts().sort_index().plot.area();" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Inspect the **geographical distribution** of languages." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 68, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAEACAYAAAC9Gb03AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAHWVJREFUeJzt3X+MHPd93vH3Q1KkLf/gURXES0WbJ0GyLaV1z27FGFVS\nHfSDVRSAFFIgYWwkOhv+o1UcG2qRinRasA0CUBRiuwYK/9HK9jGuVFZ27IgKFIkipA0QB5bkSBfK\nIsMysClTbLhuJEu1IkOxrE//2Lm7udvbu93Z3Zv5zj4vgODM7Oztc3u73519dmZWEYGZmdXXurID\nmJnZcHmgNzOrOQ/0ZmY154HezKzmPNCbmdWcB3ozs5rreqCXtE7S05KOZPNbJB2VdErSI5I259bd\nJ+m0pJOSdg4juJmZdaeXLfpPASdy83uBYxHxXuAxYB+ApKuBXwGuAn4R+IIkDSaumZn1qquBXtI2\n4Bbgntzi3cChbPoQcGs2vQs4HBFvRMQZ4DSwYyBpzcysZ91u0X8O+G0gfxjt1ohoAkTEeeCSbPml\nwNnceueyZWZmVoJVB3pJvwQ0I2IWWKmC8bkUzMwqaEMX61wL7JJ0C/BW4B2SvgKcl7Q1IpqSxoEf\nZOufA96Vu/62bNkikvzCYGZWQET09Lnnqlv0EfHpiHh3RFwO7AEei4hfBx4EprPVbgMeyKaPAHsk\nbZR0GXAF8GSHn53sv/3795eewfnLzzGK+VPOXof8RXSzRd/JXcD9kj4GPE9rTxsi4oSk+2ntofMT\n4PYomq7Czpw5U3aEvjh/uVLOn3J2SD9/ET0N9BHxp8CfZtMvATd2WO8AcKDvdGZm1jcfGVvQ9PR0\n2RH64vzlSjl/ytkh/fxFqKxWRVIdGx0zs6GSRAz6w1hbXqPRKDtCX5y/XCnnTzk7pJ+/CA/0ZmY1\n5+rGzCwhrm7MzKyNB/qCUu/5nL9cKedPOTukn78ID/RmZjXnjt7MLCHu6M3MrI0H+oJS7/mcv1wp\n5085O6SfvwgP9GZmNeeO3swsIe7ozcysjQf6glLv+Zy/XCnnTzk7pJ+/CA/0ZmY1547eKmV8fIJm\n8/n5+a1bt3P+/JnyAplVTJGO3gO9VYokIP+4UOHvyTSrI38Yu4ZS7/mcv1wp5085O6Sfv4hVB3pJ\nmyQ9IekZSc9K2p8t3y/pBUlPZ/9uzl1nn6TTkk5K2jnMX8DSND4+gSQkMT4+UXYcs1rrqrqRdGFE\nvCZpPfBN4JPALwI/iojPLln3KuA+4BpgG3AMuHJpT+PqZrQtrmgW6hlXN2YrG1p1ExGvZZObgA3k\nn6HtdgOHI+KNiDgDnAZ29BLKzMwGp6uBXtI6Sc8A54FHI+Kp7KJPSJqVdI+kzdmyS4Gzuaufy5bV\nSuo9n/OXK+X8KWeH9PMX0e0W/ZsR8QFaVcwOSVcDXwAuj4hJWi8AnxleTDMzK2pDLytHxP+T1ABu\nXtLN/3fgwWz6HPCu3GXbsmVtpqenmZiYAGBsbIzJyUmmpqaAhVfdqs7PLatKntTyZwmAxZcvviw3\nV7H8qd///cxPTU1VKk/d8zcaDWZmZgDmx8terfphrKSLgZ9ExCuS3go8AtwFPB0R57N17gCuiYgP\nZ1v79wI/R6uyeRR/GGtL+MNYs2KG9WHszwCPS5oFngAeiYiHgLslHc+WXwfcARARJ4D7gRPAQ8Dt\ndRzR27dA0+L85Uo5f8rZIf38Raxa3UTEs8AHl1n+Gytc5wBwoL9oZmY2CD4FgpXC1Y1ZMT4FgpmZ\ntfFAX1DqPZ/zlyvl/Clnh/TzF+GB3sys5tzRWync0ZsV447ezMzaeKAvKPWez/nLlXL+lLND+vmL\n8EBvZlZz7uitFO7ozYpxR29mZm080BeUes9Xrfyb5r9WsFvVyt+7lPOnnB3Sz1+EB3qrgNdp1TWu\naMyGwR29lWJpR7/8dGvejxOzBe7ozcysjQf6
glLv+Zy/XCnnTzk7pJ+/CA/0ZmY1547eSuGO3qwY\nd/RmfRgfn5jfzXN8fKLsOGYD44G+oNR7Pudv12w+z9xunq3p4Un5/k85O6Sfv4hVB3pJmyQ9IekZ\nSc9K2p8t3yLpqKRTkh6RtDl3nX2STks6KWnnMH8BMzNbWVcdvaQLI+I1SeuBbwKfBP4V8GJE3C3p\nTmBLROyVdDVwL3ANsA04Bly5tJB3Rz/aqtjRdzr/jlmVDK2jj4jXsslNwAZaz4bdwKFs+SHg1mx6\nF3A4It6IiDPAaWBHL6HMzGxwuhroJa2T9AxwHng0Ip4CtkZEEyAizgOXZKtfCpzNXf1ctqxWUu/5\nnL9cKedPOTukn7+IDd2sFBFvAh+Q9E7gG5J+lvYTk/T8Pnd6epqJiQkAxsbGmJycZGpqClj4Y1R1\nfnZ2tlJ5Usvf0gCmctMsuSw3t0b5F1wwf5K1rVu3c/jwzEBvr+z73/PpzDcaDWZmZgDmx8te9bwf\nvaT/CLwGfByYioimpHHg8Yi4StJeICLiYLb+w8D+iHhiyc9xRz/CUujo3ddbFQ2lo5d08dweNZLe\nCtwEnASOANPZarcBD2TTR4A9kjZKugy4Aniyl1BmZjY43XT0PwM8LmkWeAJ4JCIeAg4CN0k6BdwA\n3AUQESeA+4ETwEPA7XXcdG9/q5+WUc5fhQOjUr7/U84O6ecvYtWOPiKeBT64zPKXgBs7XOcAcKDv\ndGZDsHBgFDSbPb0DNkuSz3VjpSizo+/u+2rd0Vs1+Vw3ZmbWxgN9Qan3fM5frpTzp5wd0s9fhAd6\nGwn5D2DNRo07eivFWnf03fTv7ugtBe7ozcysjQf6glLv+Zy/XCnnTzk7pJ+/CA/0ZmY1547eSuGO\n3qwYd/RmZtbGA31Bqfd8zl+ulPOnnB3Sz1+EB3ozs5pzR2+lcEdvVow7ejMza+OBvqDUez7nL1fK\n+VPODunnL8IDvZlZzbmjt1K4ozcrxh29mZm18UBfUOo9n/OXK+X8KWeH9PMXsepAL2mbpMckPSfp\nWUm/lS3fL+kFSU9n/27OXWefpNOSTkraOcxfwOpuU+lf5G2WulU7eknjwHhEzEp6O/AXwG7gV4Ef\nRcRnl6x/FXAfcA2wDTgGXLm0kHdHP9p66egH0ZW7o7e6GEpHHxHnI2I2m34VOAlcOneby1xlN3A4\nIt6IiDPAaWBHL6HMzGxweuroJU0Ak8AT2aJPSJqVdI+kzdmyS4GzuaudY+GFoTZS7/mcv1wp5085\nO6Sfv4gN3a6Y1TZfAz4VEa9K+gLwuxERkn4P+Azw8V5ufHp6momJCQDGxsaYnJxkamoKWPhjVHV+\ndna2UnlSy9/SAKZy0yy5rPN8r/lXu73ln/wL69ft/vd8OvONRoOZmRmA+fGyV13tRy9pA/DHwJ9E\nxOeXuXw78GBEvF/SXiAi4mB22cPA/oh4Ysl13NGPMHf0ZsUMcz/6LwEn8oN89iHtnF8GvpNNHwH2\nSNoo6TLgCuDJXkKZmdngdLN75bXAR4DrJT2T25XybknHJc0C1wF3AETECeB+4ATwEHB7HTfdl3+r\nn461yj8+PjGU3SN9/5cn5eyQfv4iVu3oI+KbwPplLnp4hescAA70kctqotl8nrkKpNns6d2mmQ2I\nz3VjQ7W09+6mD3dHb9aZz3VjZmZtPNAXlHrP5/zlSjl/ytkh/fxFeKA3M6s5d/Q2VO7ozQbLHb2Z\nmbXxQF9Q6j2f85cr5fwpZ4f08xfhgd7MrObc0dtQuaM3Gyx39GZm1sYDfUGp93zOX66U86ecHdLP\nX4QHejOzmnNHb0Pljt5ssNzRm5lZGw/0BaXe8zl/uVLOn3J2SD9/ER7ozcxqzh29DZU7erPBckdv\nZmZtPNAXlHrP5/zlSjl/ytkh/fxFdPPl4NskPSbpOUnPSvpktnyLpKOSTkl6RNLm3HX2STot6aSk\nncP8Bcw6
yX8xeWebuljHLG2rdvSSxoHxiJiV9HbgL4DdwEeBFyPibkl3AlsiYq+kq4F7gWuAbcAx\n4Mqlhbw7+tFQZkff3W10N+3HqlXFUDr6iDgfEbPZ9KvASVoD+G7gULbaIeDWbHoXcDgi3oiIM8Bp\nYEcvoczMbHB66uglTQCTwLeArRHRhNaLAXBJttqlwNnc1c5ly2ol9Z7P+cuVcv6Us0P6+YvY0O2K\nWW3zNeBTEfGqpKXvZXt+bzs9Pc3ExAQAY2NjTE5OMjU1BSz8Mao6Pzs7W6k8Vc2/YOn83LKpDpev\nPN9t/uK3t3j9VO9/z6c/32g0mJmZAZgfL3vV1X70kjYAfwz8SUR8Plt2EpiKiGbW4z8eEVdJ2gtE\nRBzM1nsY2B8RTyz5me7oR4A7erPBGuZ+9F8CTswN8pkjwHQ2fRvwQG75HkkbJV0GXAE82UsoMzMb\nnG52r7wW+AhwvaRnJD0t6WbgIHCTpFPADcBdABFxArgfOAE8BNxex0335auBdDh/uVLOn3J2SD9/\nEat29BHxTWB9h4tv7HCdA8CBPnKZmdmA+Fw3NlSj2tGPj0/QbD4PwNat2zl//kxP1zfrpEhH74He\nhmpUB/pOv7dZv3xSszWUes/n/OVKOX/K2SH9/EV4oDczqzlXNzZUrm6KXd+sE1c3ZmbWxgN9Qan3\nfM5frpTzp5wd0s9fhAd6M7Oac0dvQ+WOvtj1zTpxR2+V0N03O5nZWvFAX1DqPd8w87eOCA0KnLm6\na77/y5Nydkg/fxEe6M3Mas4dvQ3cSt14mh39W4DXge7PW+OO3oalSEff9TdMmY2u15kbtJtNf+5g\n6XF1U1DqPV/q+S+6aHz+A9/x8Ymy4/Qs5fs/5eyQfv4ivEVva2jTwPbE+eEPm3gr26w77uht4AbX\njbdf1qnjX+6xNMj96Hvt293R27B4P3qzEZE/ViHF6srWlgf6glLv+VLPn7p+7//8sQpz32S1VlJ/\n7KSev4huvhz8i5Kako7nlu2X9EL2ReFzXxY+d9k+SaclnZS0c1jBzcysO6t29JJ+HngV+IOIeH+2\nbD/wo4j47JJ1rwLuA64BtgHHgCuXK+Pd0deXO/rhd/T+DGB0DaWjj4g/A3643O0ts2w3cDgi3oiI\nM8BpYEcvgczMbLD66eg/IWlW0j2SNmfLLgXO5tY5ly2rndR7vtTzpy7l+z/l7JB+/iKK7kf/BeB3\nIyIk/R7wGeDjvf6Q6elpJiYmABgbG2NycpKpqSlg4Y9R1fnZ2dlK5alafmiw2NL5uWXdrr/y9Tvl\nKX57K6+/+u+/cr5+7//Vfr7n6zPfaDSYmZkBmB8ve9XVfvSStgMPznX0nS6TtBeIiDiYXfYwsD8i\nnljmeu7oa8odvTt6G55h7kcvcp28pPHcZb8MfCebPgLskbRR0mXAFcCTvQQyM7PB6mb3yvuAPwfe\nI+n7kj4K3C3puKRZ4DrgDoCIOAHcD5wAHgJur+tm+/LVQDpSz7/YpuQOHkr5/k85O6Sfv4hVO/qI\n+PAyi7+8wvoHgAP9hDLrjc8uabYSn+vGBq6Mjr74Oe/d0VtafK4bMzNr44G+oNR7vtTzpy7l+z/l\n7JB+/iI80JtVmM9SaYPgjt4Gzh394Dr0Tj/HHf3ockdvVgP5rXizQfBAX1DqPV/q+VO30v2fP9d8\nFaX+2Ek9fxEe6M3Mas4dvQ2cO/r+OvQiv5ufS6PDHb1ZJaV3igarFw/0BaXe86Wev7NNFfwgc+4U\nDQvf75ry/Z9ydkg/fxEe6K1mFgZVM2txR28DV3ZHP+zpIh19L9d3R28rcUdvZmZtPNAXlHrPt1p+\nH3q/uvx91Ov9lPLjJ+XskH7+Iop+Z6zV3MJBOz7Heyf5+6g17/vJqskdvS1rWPuB16mjX5yh+3zu\n6K0fRTp6b9GbJWNTxXYbtVS4oy8o9Z4vzfxV3Ec+r/t8xe7/auw6muZjZ0
Hq+Yvo5svBvyipKel4\nbtkWSUclnZL0iKTNucv2STot6aSkncMKbqOoGgNdZ1XPZ6Nq1Y5e0s8DrwJ/EBHvz5YdBF6MiLsl\n3QlsiYi9kq4G7gWuAbYBx4Arlyvj3dFXW1U7+ipMr9TRF90Pf3x8Yv6o2ZbB79tv9TCU/egj4s+A\nHy5ZvBs4lE0fAm7NpncBhyPijYg4A5wGdvQSyGwUVf3UxJa2oh39JRHRBIiI88Al2fJLgbO59c5l\ny2on9Z4v9fypS/n+Tzk7pJ+/iEHtdVNoM2R6epqJiQkAxsbGmJycZGpqClj4Y1R1fnZ2tlJ5Bp2/\npQEU+/mt6+YtnV/881dfv9/rD3b99sGit+svvf/7zVf248nzw5tvNBrMzMwAzI+XvepqP3pJ24EH\ncx39SWAqIpqSxoHHI+IqSXuBiIiD2XoPA/sj4ollfqY7+gpzR7+2HX2/95mfS6NjmOe6UfZvzhFg\nOpu+DXggt3yPpI2SLgOuAJ7sJZBV0fLnU8+fAmD9+rdVfNfH6vF3w9pa6Wb3yvuAPwfeI+n7kj4K\n3AXcJOkUcEM2T0ScAO4HTgAPAbfXdbM99Z6vt/zt51OHxR8gvvnma/PTtrpGo5HsB7Cj9divh1U7\n+oj4cIeLbuyw/gHgQD+hrMpG/ejMUf/9LUU+140tq8xzyVS9o+836zDO1+Pn0ujw+ejNzKyNB/qC\nUu/5Us+fupTv/5SzQ/r5i/BAb2ZWc+7obVnu6N3RWzW5ozczszYe6AtKvedLPX+6qn5O/dWl/thJ\nPX8RHujN1tTcwWePlx3ERog7eluWO/qUsr6F1gsIbN26nfPnz2D15Y7ebCQtf4qKbuXPuZM/l5HV\nhwf6glLv+ZbL75NsraVG2QHm5c+5080LRR0f+3Xngd7mpXqSLTNbmTt6m1eNXr4qvXe6WXt9Xi3+\nu7vvr7oiHf2gvmHKzGphru+HZtMVXl24uiko9Z4v9fzpa5QdoLDUHzup5y/CA71ZTXlvGpvjjt7m\nuaOvR9ZO59Lp9Hxb6e/u52j1eD96MzNr09dAL+mMpL+U9IykJ7NlWyQdlXRK0iOSNg8marWk3vOl\nnj99jbIDFJb6Yyf1/EX0u0X/JjAVER+IiB3Zsr3AsYh4L/AYsK/P27A+5bta97Vmo6evjl7S94B/\nFhEv5pb9FXBdRDQljQONiHjfMtd1R79GFnew0Kl7dUdfj6zu6OutjI4+gEclPSXp49myrRHRBIiI\n88Alfd6GmZn1od+B/tqI+CBwC/Cbkn6B9uPna7lJkHrPl3r+9DXKDlBY6o+d1PMX0deRsRHxN9n/\n/1fSHwE7gKakrbnq5gedrj89Pc3ExAQAY2NjTE5OMjU1BSz8Mao6Pzs7W6k8q80vDCwr51+sMb9+\n+8C01uuv9e31un6v15/t8/ZWXn+5v2ej0eji8bH87XV6fO3ZMz1/IjRpExGt0yds2bKVr3/9cGUe\n/ynPNxoNZmZmAObHy14V7uglXQisi4hXJb0NOAr8Z+AG4KWIOCjpTmBLROxd5vru6NeIO/pRyrpw\nrpqW4Xb07vfX3lqf62Yr8A1Jkf2ceyPiqKRvA/dL+hjwPPArfdyGmfVk4Vw1rYF3eePjE4XOXW9p\nKtzRR8T3ImIy27XyH0fEXdnylyLixoh4b0TsjIiXBxe3OtLu+TZ5V8vSNdb49jYt2sW2n1NSp/3Y\nTz9/ET4ytqZW/hKR12l9Z2mxbySyFC18C1VN94+wFfhcNzXVuTtdOt+50y1/uio5Rjfrcs/R9tqn\nWEef/zk+9333inT0HuhrqvuBvsrTVckxuln7+dC+1w9yPR50xyc1W0Pp93yNsgOMuEbZAXpSp+8T\nTv+52zt/w5SZdbBpycC++t48Vk2ubhLXqed0deOsZWede37nH6Pr1l3Im2++lsvh6qZX7uhHUKee\n0wO9s5addfXHojv6ItzRr6H0e75G2Q
FGXKPsAJU17NNqp//c7Z0HejOrlMUHcwXN5nkf4NcnVzeJ\nW/y2uJt94pfOV3m6Kjmctffp3o/PWL7q6bzeqHJ1M/LyRz+alcmPxSrxQF9Q+j1fo+wAI65RdoCR\nlf5zt3ce6M2sAjZ1eUCWT8hXhDv6BHV7rhF3yc5a76wLnwOM0rly3NGPiH5OMWtWHwufA/RzFtb8\n7px1fZfggb6gte75Bn+ukcaAfo4V0yg7QM10X+ksfe7mN5zqetpun+umwlauaMxswcI3azWbfn4s\n5Y6+Yvrr30epn63adFVyOCv0di78lvbrVvV8+e7oK66bLtD9u1m/Vq9xunme1anSGdpAL+lmSX8l\n6X9LunNYt1OWbjv6/OC++IGzcFj3+vVvK+Fc3401vC1r1yg7QI0t/yFtd59zdbubZ1qGMtBLWgf8\nV+BfAj8L/Jqk9w3jtsoyOzu7aL7T1nrnLYeFB2PrtK1rvRU/u/oqNkS+/9fGpmU2tD63wvqrH9Hb\n6bk+7JOx9WNYW/Q7gNMR8XxE/AQ4DOwu+sOuv34XF164hQsv3MLFF2/j3LlzAwta1Msvv7xovtPW\nenW9vPoqNkS+/9fGcgN3kft+uReMxe8Y2k/GVp26Z1gD/aXA2dz8C9myQo4fP86Pf9zgxz/+Ln//\n9xfRbDb7DriS/CtzvlbJv0L//u//l0Wv3ov5PB9m9dLpOZ1G1ZPEh7EbN17AO97xb3nnO3+D119/\nnp07d6369qibwbrT+vlX5nytkt9S/7u/e4X8q3d6zpQdYMSdKTvACDszwJ+10kbdpp7GoGEayu6V\nkj4E/KeIuDmb3wtERBzMrZPi6GhmVrpKfJWgpPXAKeAG4G+AJ4Ffi4iTA78xMzNb0VCOjI2In0r6\nBHCUVj30RQ/yZmblKO3IWDMzWxulfBgr6bcknZT0rKS7csv3STqdXbazjGzdkvTvJL0p6aLcssrn\nl3R3lm9W0h9KemfushTyJ3UgnqRtkh6T9Fz2eP9ktnyLpKOSTkl6RNLmsrOuRNI6SU9LOpLNJ5Nf\n0mZJX80e189J+rlU8ku6Q9J3JB2XdK+kjYWyR8Sa/gOmaFU6G7L5i7P/rwKeoVUnTQB/TfaOo2r/\ngG3Aw8D3gItSyg/cCKzLpu8CDmTTV1c9P60Nk78GtgMX0Drq6H1l51ol8zgwmU2/ndZnV+8DDgL/\nPlt+J3BX2VlX+T3uAP4HcCSbTyY/MAN8NJveAGxOIT/wD4HvAhuz+f8F3FYkexlb9P8mC/YGQET8\nbbZ8N3A4It6IiDPAaVoHXlXR54DfXrIsifwRcSwi3sxmv0XrRQtgF9XPP9AD8dZCRJyPiNls+lXg\nJK37fDdwKFvtEHBrOQlXJ2kbcAtwT25xEvmzd6y/EBFfBsge36+QSH5gPfA2SRuAtwLnKJC9jIH+\nPcC/kPQtSY9L+qfZ8qUHWZ2jj4OshkXSLuBsRDy75KIk8i/xMeChbDqF/AM9EG+tSZoAJmm9wG6N\niCa0XgyAS8pLtqq5DZv8B3qp5L8M+FtJX86qp/8m6UISyB8R/wf4DPB9Ws/HVyLiGAWyD2WvG0mP\nAlvzi2g9SP5DdptbIuJDkq4BvgpcPowcRa2S/9PATWXk6tYK+X8nIh7M1vkd4CcR8T9LiDhyJL0d\n+BrwqYh4dZnjSCq5V4SkXwKaETEraWqFVSuZn9Z480HgNyPi25I+B+ylPW/l8ksao7X1vh14Bfiq\npI9QIPuwdq/sOBBK+tfA17P1npL0U0n/gNYr1rtzq27Llq25Tvkl/SNa/fVfqnXM8zbgaUk7SCD/\nHEnTtN6KX59bfA54V26+tPwrqMx93IvsbffXgK9ExAPZ4qakrRHRlDQO/KC8hCu6Ftgl6RZa1cE7\nJH0FOJ9I/hdovQP/djb/h7QG+hTu/xuB70bESwCSvgH8cwpkL6O6+SOyAUbSe2h90PAicAT41exT\n5c
uAK2gdaFUZEfGdiBiPiMsj4jJaD6IPRMQPSCA/tPZaofU2fFdEvJ676Aiwp+L5nwKukLRd0kZg\nD63cVfcl4EREfD637AgwnU3fBjyw9EpVEBGfjoh3R8TltO7vxyLi14EHSSN/EzibjTXQOojzOdK4\n/78PfEjSW7INyxuAExTJXsInyRcAXwGeBb4NXJe7bB+tvSpOAjvXOluB3+W7ZHvdpJKf1oeszwNP\nZ/++kFj+m2ntuXIa2Ft2ni7yXgv8lNYeQs9k9/nNwEXAsex3OQqMlZ21i9/lOhb2ukkmP/BPaG0k\nzNJqEzankh/Ynz0fj9P64PWCItl9wJSZWc0lcfZKMzMrzgO9mVnNeaA3M6s5D/RmZjXngd7MrOY8\n0JuZ1ZwHejOzmvNAb2ZWc/8f3U1lJXKiHrMAAAAASUVORK5CYII=\n", | |
"text/plain": [ | |
"<matplotlib.figure.Figure at 0x478b6518>" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"langs['latitude'].hist(bins=100);" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 69, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAX0AAAEACAYAAABfxaZOAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3X2QJPV93/H358SDHmFPkrlNcRKLgoIOV1QLsY4/kEuL\nJSHAKaCoFIUfIq1lUqpQilHkUriTy3WOSg6gMlguOfePULIgmSCiSAIqEhwUN07JZYEkWIN0J3KO\nfQguurUsQBalgA/dN3907+3s7sxNz3bPdP+mP6+qqevu6en57K/nftPz7SdFBGZm1g6b6g5gZmbj\n407fzKxF3OmbmbWIO30zsxZxp29m1iLu9M3MWqRwpy9pk6THJN2Tj++S9IykR/PHxV3z7pR0QNJ+\nSReNIriZmQ3vhCHmvQ74HnBK17RbIuKW7pkkbQOuArYBW4EHJb01fEKAmVntCm3pS9oKXArcuvap\nHrNfDtwZES9HxEHgALC9TEgzM6tG0fLOHwMfA9ZurX9Y0qKkWyWdmk87HXi6a55D+TQzM6vZwE5f\n0q8CSxGxyOot+93AWyJiFjgM3DyaiGZmVpUiNf0LgMskXQq8CnidpNsj4v1d83wWuDcfPgS8qeu5\nrfm0VSS5xm9mtgER0au0XsjALf2I+HhEvDki3gJcDTwUEe+XNN0125XAd/Phe4CrJZ0k6UzgLOCR\nPstu/GPXrl21Z3BO50w5ZwoZU8pZ1jBH76z1KUmzwFHgIPChvCPfJ+kuYB9wBLg2qkhak4MHD9Yd\noRDnrJZzVieFjJBOzrKG6vQj4s+BP8+H33+c+W4AbigXzczMquYzcgeYn5+vO0Ihzlmtfjmnp2eQ\nhCSmp2fGmqmXFNozhYyQTs6yVFflRVLKVR9rKUmsHLmsSmqsZsOQRIxyR27bdTqduiMU4pzVcs7q\npJAR0slZljt9M7MWcXnHbAgu71jdXN4xM7PC3OkPkEqdzzmr5ZzVSSEjpJOzLHf6ZmYt4pq+2RBc\n07e6uaZvZmaFudMfIJU6n3NWyzmrk0JGSCdnWe70zcxaxDV9syG4pm91c03fzMwKc6c/QCp1Pues\nlnNWp4kZe10ttYk5R6HMTVTMzJK0tPQUy2W6paUNV0qSVLimL2kT8G3gmYi4TNJm4IvAGWR3zroq\nIn6Sz7sT+CDwMnBdROzpsTzX9C05rulPhpTX4zhr+teR3QJx2Q7gwYg4G3gI2JkHOge4CtgGXALs\nVtbCZmZWs0KdvqStwKXArV2TLwduy4dvA67Ihy8D7oyIlyPiIHAA2F5J2hqkUudzzmo5Z3VSyAjp\n5Cyr6Jb+HwMfY+X3EMCWiFgCiIjDwGn59NOBp7vmO5RPMzOzmg3ckSvpV4GliFiUNHecWYcuis3P\nzzMzMwPA1NQUs7OzzM1lb7H8revxYuPL05qSJ/Xx5Wlrn1+xerzuvE0en5uba1SeFR1g9fPHnmlQ\n3k6nw8LCAsCx/rKMgTtyJf0n4DfJdsq+Cngd8BXgl4C5iFiSNA3sjYhtknYAERE35a+/D9gVEQ+v\nWa535FpyUt4BaCtSXo8j35EbER+PiDdHxFuAq4GHIuJfA/cC8/lsHwDuzofvAa6WdJKkM4GzgEc2\nGrBu67fwmsk5q+Wc1UkhI6STs6wyx+nfCNwl6YPAU2RH7BAR+yTdRXakzxHgWm/Sm5k1g6+9YzaE\nlMsCtiLl9ehr75iZWWHu9AdIpc7nnNVyzuqkkBHSyVmWO30zsxZxTd9sCCnXgm1FyuvRNX0zMyvM\nnf4AqdT5nLNazlmdFDJCOjnLcqdvZtYirumbDSHlWrCtSHk9uqZvZmaFudMfIJU6n3NWyzmrk0JG\nSCdnWe70zcxaxDV9syGkXAu2FSmvR9f0zcysMHf6A6RS53POajlndVLICOnkLMudvplZi7imbzaE\nlGvBtiLl9Tjymr6kkyU9LOkxSU9I2pVP3yXp
GUmP5o+Lu16zU9IBSfslXbTRcGZmVq0i98h9Cbgw\nIs4FZoFLJG3Pn74lIs7LH/cBSNpGduvEbcAlwG5lX6tJSqXO55zVcs7qpJAR0slZVqGafkT8LB88\nmey+uiu/i9a7HLgzIl6OiIPAAWB7j/nMzGzMCtX0JW0CvgP8U+A/R8TOvMwzD/wE+DbwuxHxE0mf\nAf4yIu7IX3sr8LWI+PKaZbqmb8lJuRbcdtPTMywtPdU1Jc31WLamf0KRmSLiKHCupFOAr0g6B9gN\nfCIiQtIngZuBa4Z58/n5eWZmZgCYmppidnaWubk5YOWnlsc93qTxFavHm5LP4/3Hsw6/u0jRAZqT\nr994p9NhYWEB4Fh/WUpEDPUAfh/46JppZwCP58M7gOu7nrsPOL/HciIFe/furTtCIc5ZrX45gYDI\nH/V/hlNoz6ZkXLvu1q7HpuQcJM87dN+9/Chy9M4bJZ2aD78KeC/wfUnTXbNdCXw3H74HuFrSSZLO\nBM4CHtn415KZmVVlYE1f0j8HbiPb6bsJ+GJE/KGk28mO5jkKHAQ+FBFL+Wt2Ar8NHAGui4g9PZYb\ng97brGlc00/X2nWX6nosW9P3yVlmQ3Cnny53+hlfhmGA9Tvwmsk5q+Wc1UkhI6STsyx3+mZmLeLy\njtkQXN5Jl8s7GW/pm5m1iDv9AVKp8zlntZyzOilkhHRyluVO38ysRVzTNxuCa/rpck0/4y19M7MW\ncac/QCp1PueslnNWJ4WMkE7Ostzpm5m1iGv6ZkNwTT9drulnvKVvZtYi7vQHSKXO55zVcs7qpJAR\n0slZljt9M7MWcU3fbAiu6afLNf2Mt/TNzFqkyO0ST5b0sKTHJD0haVc+fbOkPZKelHT/8i0V8+d2\nSjogab+ki0b5B4xaKnU+56yWc1YnhYyQTs6yBnb6EfEScGFEnEt2e8RLJG0nuwH6gxFxNvAQsBNA\n0jnAVcA24BJgt7LfVWZmVrOhavqSXg38L+DfAp8H3hURS/lN0jsR8TZJO8ju1n5T/pqvA38QEQ+v\nWZZr+pYc1/TT5Zp+plBNX9ImSY8Bh4EHIuJbwJblG6FHxGHgtHz204Gnu15+KJ9mNmFORhKSmJ6e\nqTuMWSEnFJkpIo4C50o6BfiKpF9k5Wvy2GzDvvn8/DwzMzMATE1NMTs7y9zcHLBSX6t7fHlaU/L0\nG//0pz/dyPabtPZc0QFeYvljv7QkOp2O27PH+NqsdeZZrQOsPL+4uMhHPvKRWvP1a7+FhQWAY/1l\nKREx1AP4feB3gf1kW/sA08D+fHgHcH3X/PcB5/dYTqRg7969dUcoxDmr1S8nEBD5Y/VwHVJoz6Zk\nHLTumpJzkDzv0H338mNgTV/SG4EjEfETSa8C7gduBN4FPBsRN0m6HtgcETvyHbl/BpxPVtZ5AHhr\nrHkj1/QtRZNSF26jSVl3ZWv6Rco7/wS4TdImsn0AX4yIr0n6JnCXpA8CT5EdsUNE7JN0F7APOAJc\n697dzKwZihyy+UREnBcRsxHx9oj4w3z6sxHxnog4OyIuiojnu15zQ0ScFRHbImLPKP+AUetdC2we\n56yWc1YnhYyQTs6yfEaumVmL+No7ZkOYlLpwG03KuvO1d8zMrDB3+gOkUudzzmo5Z3VSyAjp5CzL\nnb6ZWYu4pm82hEmpC7fRpKw71/TNzKwwd/oDpFLnc85qOWd1UsgI6eQsy52+mVmLuKZvNoRJqQu3\n0aSsO9f0zcysMHf6A6RS53POajlndVLICOnkLMudvplZi7imbzaESakLt9GkrDvX9M3MrLCBnb6k\nrZIekvQ9SU9I+nf59F2SnpH0aP64uOs1OyUdkLRf0kWj/ANGLZU6n3NWyzmrk0JGSCdnWUXunPUy\n8NGIWJT0WuA7kh7In7slIm7pnlnSNrK7aG0DtgIPSlp3u0QzMxu/oWv6kr4KfAZ4J/BCRNy85vkd\nZDfuvSkf
/zrwBxHx8Jr5/D1gyZmUunAbTcq6G2tNX9IMMAssd+AflrQo6VZJp+bTTgee7nrZoXya\nmZnVrHCnn5d2vgRcFxEvALuBt0TELHAYuPl4r09VKnU+56yWc1YnhYyQTs6yitT0kXQCWYf/+Yi4\nGyAiftQ1y2eBe/PhQ8Cbup7bmk9bZ35+npmZGQCmpqaYnZ1lbm4OWFkBdY8va0qefuOLi4uNyjOp\n7dn1F7BWp9NxezZ8fLUOsPL84uJi7fl6jXc6HRYWFgCO9ZdlFKrpS7od+PuI+GjXtOmIOJwP/3vg\nHRHx65LOAf4MOJ+srPMAsG5Hrmv6lqJJqQu30aSsu7I1/YFb+pIuAH4DeELSY2Qt9XHg1yXNAkeB\ng8CHACJin6S7gH3AEeBa9+5mZs0wsKYfEX8REa+IiNmIODcizouI+yLi/RHx9nz6FRGx1PWaGyLi\nrIjYFhF7RvsnjFbvn4XN45zVcs7qpJAR0slZls/INTNrEV97x2wIk1IXbqNJWXe+9o6ZmRXmTn+A\nVOp8zlkt56xOChkhnZxludM3M2sR1/TNhjApdeE2mpR155q+mZkV5k5/gFTqfM5ZLeesTgoZIZ2c\nZbnTN7NGm56eQRKSmJ6eqTtO8lzTNxvCpNSFU7K2zTfazpOy7lzTNzOzwtzpD5BKnc85q+Wc1Ukh\nI6STsyx3+mZmLeKavtkQJqUunBLX9FdzTd/MzApzpz9AKnU+56yWc1YnhYyQTs6yBnb6krZKekjS\n9yQ9Iel38umbJe2R9KSk+yWd2vWanZIOSNov6aJR/gFmZlbcwJq+pGlgOiIWJb0W+A5wOfBbwI8j\n4lOSrgc2R8SOrnvkvoPspugP4nvk2oSYlLpwSlzTX23kNf2IOBwRi/nwC8B+ss78cuC2fLbbgCvy\n4cuAOyPi5Yg4CBwAtm80oJmZVWeomr6kGWAW+CawZfm+uBFxGDgtn+104Omulx3KpyUplTqfc1bL\nOauTQkZIJ2dZhTv9vLTzJeC6fIt/7e+hdH4fmZm11AlFZpJ0AlmH//mIuDufvCRpS0Qs5XX/v8un\nHwLe1PXyrfm0debn55mZmQFgamqK2dlZ5ubmgJVvXY8XG1+e1pQ8qY8vT1v7/Iq1427/fuNzc3MV\nrI8O3Ta6vNU6QO/nm9R+nU6HhYUFgGP9ZRmFTs6SdDvw9xHx0a5pNwHPRsRNfXbknk9W1nkA78i1\nCTEpOwPHbXp6hqWlpwDYsuUMDh8+WPi13pG72sh35Eq6APgN4FckPSbpUUkXAzcB75X0JPBu4EaA\niNgH3AXsA74GXJty7957C6F5nLNazlmdTqeTd/gBxLHOv2lSaMsqDCzvRMRfAK/o8/R7+rzmBuCG\nErnMzGwEfO2dCVHm57MVNyklgnErU6JxeWe1suUdd/oToqr/GHZ8k9JxjJs7/er4gmsj1pY637ik\n0p7OWZ0UMkI6Octyp29m1iIu70wIl3fGY1JKBOPW7PLOK4GXgDT2h5Ut7xQ6OcvMbHK9xPIXwNLS\nhvvSZLi8M0Bb6nzjkkp7jjPn9PQMkpDE9PTMUK9NoT1TyJjp1B1gLLylb1azlROX2rGlafVyTX9C\nuKY/HqOo6bdh3VVX0994/f146y6l9vchm2aNcPKGSzQ2jOX6e3Mv59B07vQHSKcemYZU2nP4nPV0\nRim0ZwoZM526A4yFO30zsxZxTX9CtKEu3ARV1YW7r5WUmex1V+Vx+qNYTkrt75q+WYK6LzXcPt7/\nUSd3+gOkU49MQyrt6ZzVWZ+xqTtjO3UHGAt3+mZmLTKwpi/pc8C/BJYi4u35tF3Av2Hlvrgfj4j7\n8ud2Ah8EXia7ifqePst1Tb9CrumPR1V14bZdw6fM31vmtcfbd5Jq+4/8evqS3gm8ANy+ptP/aUTc\nsmbebcAdwDvIboj+ID3uj5vP606/Qu70x8Od/sbU1ekXXV8ptf/Id+RGxD
eA53q9d49plwN3RsTL\nEXEQOABs32i4JkihZpqSVNqzO2f3tXGaJoX2TCFjplN3gLEoU9P/sKRFSbdKOjWfdjrwdNc8h/Jp\nZslq95E2Nmk2esG13cAnIiIkfRK4Gbhm2IXMz88zMzMDwNTUFLOzs8zNzQErWwceLzae6QDNyJP6\n+PK0lfEOq60dX542x/IhibByfZhB62vt8ur++6scX92GG/t7B7V/v9f3m79snnGOdzodFhYWAI71\nl2UUOjlL0hnAvcs1/X7PSdoBRETclD93H7ArIh7u8TrX9Cvkmv7obKQuPGhduKbvmv5GjevkLNFV\nw5c03fXclcB38+F7gKslnSTpTOAs4JGNhmuC9VsMVkYq7VlfzuFOXEqhPVPImOnUHWAsBpZ3JN1B\n9hvoDZJ+AOwCLpQ0CxwFDgIfAoiIfZLuAvYBR4BrvTlvNox23cXJxs/X3pkQLu+MzrjLO5O4Hl3e\nqY6vvWNmVpMyt7qsizv9AdKpR6YhlfZ0zuqkkDHTGfoV3YfzNus6Qv250zczaxHX9CeEa/qj45p+\neZNa06/j/51r+mZmVpg7/QHSqUemIZX2dM7qpJAx06k7wFi40zczaxHX9CeEa/qj45p+eaOpy7+S\n7GS2lWscDfO+rumbtVCKx1nbsqbedrHZ3OkPkE49Mg1Na89+x1k3LWc/KeQcT8YqbrbeqTBPc7nT\nNxupKjojG8xb/UW5pj8hXNPfmCLtVram32v5rumP51h71/TX85a+mY1EfbeZPLmm902DO/0BUqiZ\npiSV9nTO8lb2l+wd8zuvlHqG06k+SgO50zczaxHX9CdEVbXF6emZYzvC+h37PEnGW9NfOa48M3k1\n/e7PT2Zjf+NGllP1fpciJrKmL+lzkpYkPd41bbOkPZKelHS/pFO7ntsp6YCk/ZIu2mgwq0eKl4pN\nx0bLDuno/vw0YTm2XpHyzn8F3rdm2g7gwYg4G3gI2Akg6RzgKmAbcAmwW4nvTWlyzTRFqbSnc1ap\nU3eAgjp1BxiLgZ1+RHwDeG7N5MuB2/Lh24Ar8uHLgDsj4uWIOAgcALZXE9XMmqi+o3RsIwrV9CWd\nAdwbEW/Px5+NiNd3Pf9sRLxe0meAv4yIO/LptwJfi4gv91ima/oVqqq22Lbj/Vf/vb2v5VLlcfop\n3K912P06dV6bqO42T7Gmf0JFOTb0l87PzzMzMwPA1NQUs7OzzM3NASs/Wz1ebDzTAcotb/WyTjy2\n9bZ58xa+/OU7G/P3VjW++u9drrnD0pLodDpd7dtr/rU6LLd/1fOPs32yDj87zHJp6cJCry/79w76\nPI+7/Tf2+Rn+9UXGO50OCwsLAMf6y1IiYuADOAN4vGt8P7AlH54G9ufDO4Dru+a7Dzi/zzIjBXv3\n7q07QiFAQOSPjbft2uWsDJ+8vFcttmw5Y8PLb1p79v97KTTPqIcHGUV7Dpth8N+yd+i/sZ423zv0\n/52q/t8N+55RoN/u9yh6nL7yx7J7gPl8+APA3V3Tr5Z0kqQzgbOARwq+hzVa265tcqLr1DaRBtb0\nJd1B9hvoDcASsAv4KvDfgTcBTwFXRcTz+fw7gd8GjgDXRcSePsuNQe9txY2qpl9F3bPJ6qwdN7Wd\nh/0suaY/3vVVtqbvk7MmhDv9jXGnv547/cnu9H0ZhgHSOA46HW7PaqXRnp26AxTUqTvAWLjTN7Oh\n+dj8dLm8MyHqKu+kfq0el3fWq+seA6Navss7q1V1nL611Mo1UrJj282s2VzeGSCNmmk63J7VGmd7\nbryk0xlFnBHo1B1gLNzpm1kh7bjy5eTf09g1/QlR5Boywy+n33D/68Kntk5d01+vX516NG3VvOsd\nlTlc1TV9q8nqa8iMcvnZfxKzjRr1Z9XWcnlnANegq+X2rFYa7dmpO0BBnboDjIU7fTOzFnFNf0Ic\nrwZa5rjjFGrQZTSldtyk9hxvTb++92
prTd9b+mbWkDNsT25AhsnnTn+AUdRMu/+DTephYf2kUYNO\nR1Xt2f9wzCo64k7B+eq+cXynpvcdLx+9UwOfxWrp8JFak8Y1/RqMog7omv7GNKV2PKg9R32No6a3\nQ93t30+KNX1v6ZslwL8OrSqlavqSDkr6K0mPSXokn7ZZ0h5JT0q6X9Kp1USth2vQwxh8Crvbs1pp\ntGen7gAFdeoOMBZld+QeBeYi4tyI2J5P2wE8GBFnAw8BO0u+hyWjbffRrcvKl+uVV1694aU044gd\nG7dSNX1Jfwv8UkT8uGva94F3RcSSpGmgExFv6/Fa1/SzsYmq6adW3296LbvIceujvndCe4cHX8Mq\nxZp+2S39AB6Q9C1J1+TTtkTEEkBEHAZOK/keZpXyFq4VM5m/XMvuyL0gIn4o6ReAPZKeZP1Btn2/\n+ubn55mZmQFgamqK2dlZ5ubmgJVaZd3jy9OqXv7a+mH55S0vs9zyVy9rrf7L7zV/p9MZW3sOM579\nB96bp7mwb/5h/9465y/z+R5FntXjna5/R7H8quZfPO78/T7Pveav8vPa6XRYWFgAONZfllHZIZuS\ndgEvANeQ1fmXyzt7I2Jbj/mTKO90r+iqjLe8M9xllkdd3hlFew4rpbLGZJR3OmRfrvW3Z5mcRW7r\nmEJ5Z8OdvqRXA5si4gVJrwH2AP8ReDfwbETcJOl6YHNE7Ojx+iQ6/VEYd02/zHHHk1jTd6fPwGV6\neP3wpHT6ZWr6W4BvSHoM+CZwb0TsAW4C3puXet4N3FjiPSxZ9d+BqM2XuzDrx2fkDpB+eae+Lf1e\n7zvO8k6dV4usZrj/HcrS2tLv4PJOdeo+esdKq3+L2Jqq7guQ2STyln4N6qzP1r2lP0797hucacrW\nZfnh5m/pT8JwsXv5prCl72vvWEv4apFWxuTcy9flnQHSuLZJ062UsF7/+um6w9jYdeoOUFCn7gBj\n4U7fxnCG6kpt+rnnlkb0HmZWhGv6NWhaTX/ctd1xrfe21Kxd0x//cL8jwVKo6XtL38ysRdzpD9CE\nmv4oTjLyRccmxYkJHPLbqTtAQZ26A4yFj95JQHV3TTp5TSfvo1nSd4RJOaokHWv/H6XFNf0aDFvT\n73e8+fGOF27u8HAXgCsjnTapp77fxvZpUvtvlGv6rZP6Nb575/d1cszGw53+AOOt6bf3kgwrJaxU\nv8ysv07dAQrq1B1gLNzpN0rqW/FVae+XXzlut3ql0f6u6ddg2GPqi9wgJVN/TbOq4519HHr54V7t\nNj09s2aDov6ckzo8qv7NNf2JdXKBQyp9FUbrp/dWZ3cZzdppZJ2+pIslfV/S/87voJWk+o7Tb0OH\nXuSLzTamu1R4uOZ27tT0vsPqjGSpTTtIYSSdvqRNwJ8C7wN+Efg1SW+r+n2OHj3K448/fuzx4osv\nVv0WLC4uDp6pAJ8M1UsbvtiaoO52rub/0OiNJmfTDlIY1Zb+duBARDwVEUeAO4HLq36TL3zhC2zf\nfiG//Mu/yfnnX8wb3rB1qG/UIt/Azz//fM/5X/GK1/R8bb95/LPa2uv5wbM0Qio5yxlVp3868HTX\n+DP5tEr99Kc/Rbqaf/iHx3nxxY/zs5/9mEHfqN2d8upv4MM9O/E/+qNP95z/6NGf9Xxtv3nMzJog\n6cswnHjiicD/5JRTfsA//uPfsrq6s3Kq9KZNr8474GXde9uXdd8k4ZVDXq7AN+ioU/cRKevXtdXv\nYN0BCjpY4bKae6mGUXX6h4A3d41vzaetUlWjvPhi91b9+mWu7wRUYHjY+T08quFhPicbW9ce9vAo\nhuk5ve4vg5Ecpy/pFcCTwLuBHwKPAL8WEfsrfzMzMytsJFv6EfFzSR8G9pDtN/icO3wzs/rVdkau\nmZmN31jOyJX0KUn7JS1K+h+STul6bqekA/nzF3VNP0/S4/nJXZ8eU85/Jem7kn4u6byu6WdI+pmk\nR/
PH7rpy9suYP9eYtlyTa5ekZ7ra7+JBmevS5JMKJR2U9FeSHpP0SD5ts6Q9kp6UdL+kU2vI9TlJ\nS5Ie75rWN1dd67xPzkZ9NiVtlfSQpO9JekLS7+TTq2vPiBj5A3gPsCkfvhG4IR8+B3iMrMw0A/w1\nK78+HgbekQ9/DXjfGHKeDbwVeAg4r2v6GcDjfV4z1pzHybitSW25JvMu4KM9pvfNXMeDbCPor/P1\nfSLZ2TpvqytPj3x/A2xeM+0m4D/kw9cDN9aQ653AbPf/kX65jvd/vqacjfpsAtPAbD78WrJ9o2+r\nsj3HsqUfEQ9GxNF89JtkR/MAXAbcGREvR8RB4ACwXdI08LqI+FY+3+3AFWPI+WREHKD3Lvh10+rI\neZyMl9OgtuyhV5v2zDzWVKuN5aTCEsT6X+eXA7flw7dRw7qNiG8Az62Z3C9Xz//zNeaEBn02I+Jw\nRCzmwy8A+8n6y8ras44Lrn2QbGsT1p/EdSifdjrZCV3LRnJy15Bm8p9/eyW9M5/WpJxNb8sP5+W9\nW7t+mvbLXJexnFRYQgAPSPqWpGvyaVsiYgmyDgM4rbZ0q53WJ1fT1jk09LMpaYbsl8k36b+eh85Z\n2dE7kh4AtnRPIvuQ/l5E3JvP83vAkYj4b1W977CK5Ozh/wJvjojn8jr6VyWd07CMtTpeZmA38ImI\nCEmfBG4Grlm/FBvggoj4oaRfAPZIepL1p3s39ciMpuZq5GdT0muBLwHXRcQLkipbz5V1+hHx3uM9\nL2keuBT4la7Jh4A3dY0vn8TVb/rIc/Z5zRHyn4UR8aik/wP8s1Hl3EjG42QZWVt2GyLzZ4HlL66x\nZBtCoZMK6xIRP8z//ZGkr5L9jF+StCUilvJS3t/VGnJFv1yNWucR8aOu0UZ8NiWdQNbhfz4i7s4n\nV9ae4zp652LgY8BlEdF91497gKslnSTpTOAs4JH858tPJG2XJOD9wN3rFjzi2McGpDcqu3Iokt6S\n5/ybBuTsrkU2ti3zD+myK4HvHi/zOLOt8S3gLGVHa50EXJ1nrJ2kV+dbf0h6DXAR8ARZvvl8tg8w\n/v8ny8T6z+N8Ptydq+51vipnQz+b/wXYFxF/0jWtuvYc017zA8BTwKP5Y3fXczvJ9jjvBy7qmv4v\nyD7UB4A/GVPOK8jqY/+P7Ezir+fTlz8MjwLfBi6tK2e/jE1ryzWZbwceJzsa5qtk9cnjZq7rAVxM\ndsTEAWBH3Xm6cp2Zt99j+brckU9/PfBgnnkPMFVDtjvISqAvAT8AfgvY3C9XXeu8T85GfTaBC4Cf\nd63rR/NbftCNAAAAQUlEQVTPZN/1PGxOn5xlZtYivl2imVmLuNM3M2sRd/pmZi3iTt/MrEXc6ZuZ\ntYg7fTOzFnGnb2bWIu70zcxa5P8DOYPfc02Iu0MAAAAASUVORK5CYII=\n", | |
"text/plain": [ | |
"<matplotlib.figure.Figure at 0xe549358>" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"langs['longitude'].hist(bins=100);" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 70, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAs4AAAFwCAYAAACoxP20AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzsvX9sHGl65/eURP0iRyI5JEeaoUyl6V4P19JEHBzpdXOz\n3QrQM75w5uREFxgwJ74NW0BwcY0TYIPW2rgcWnNIYJvCboC7SQIIhzAL/yJ8yAUWYvXZu8nIRuC+\nhYNb+xx7e7yX+A7GAnGa6wsO0QIX/3jzh/jUPPXyraq3fnVVdX8/QEFUd3XVW1VvPe/3fd7nfV5H\nKUUAAAAAAACAcM4UXQAAAAAAAACqAIQzAAAAAAAAFkA4AwAAAAAAYAGEMwAAAAAAABZAOAMAAAAA\nAGABhDMAAAAAAAAWZCKcHcf5Tx3H+b2T7T85+WzRcZxfdxznY8dxfs1xnPkszgUAAAAAAEARpBbO\njuPcJKL7RLRFRJtE9K7jON9LRD9JRF9TSr1ORP8LEf1U2nMBAAAAAABQFFl4nD9NRF9XSv1rpdRf\nENFvEtE9IrpLRF852ecrRPTvZnAuAAAAAAAACiEL4fy/E9HnTkIzZolol4i+h4iuKqX+hIhIKfV/\nEdErGZwLAAAAAACAQphJewCl1NBxnJ8loq8S0f9LRN8gor8w7Zr2XAAAAAAAABRFauFMRKSUOiSi\nQyIix3H+SyL6YyL6E8dxriql/sRxnGtE9H+bfus4DgQ1AAAAAAAYC0opJ+lvs8qqsXLy7xoR/XtE\n9ItE9ISI/sOTXT5PRL8S9HulFLaKbr1er/AyYMPzm8YNz67aG55fdTc8u2pvacnE40xE/4PjOC8T\n0Z8R0Y8rpf7VSfjGLzuO0yGif0FEP5LRuQAAAAAAABg7WYVqNA2f/SkRtbM4PgAAAAAAAEWDlQNB\nKu7cuVN0EUAK8PyqC55dtcHzqy54dtONk0W8R6oCOI4qugwAAAAAAGDycRyHVNGTAwEAAAAAAJh0\nIJwBAAAAAACwAMIZAAAAAAAACyCcAQAAAAAAsADCGQAAAAAAAAsgnAEAAAAAALAAwhkAAAAAAAAL\nIJwBAAAAAACwAMIZAAAAAAAACyCcAQAAAAAAsADCGQAAAAAAAAsgnAEAAAAAALAAwhkAAAAAAAAL\nIJwBAAAAAACwAMIZAAAAAAAACyCcAQAAAAAAsADCGQAAAAAAAAsgnAEAAAAAALAAwhkAAAAAAAAL\nIJwBAAAAAACwAMIZAAAAAAAACyCcAQAAAAAAsADCGQAAAAAAAAsgnAEAAAAAALAAwhkAAAAA1hwf\nH9OjR4/o+Pi46KIAMHYgnAEAAABgzeHhIT148IAODw+LLgoAY2em6AIAAAAAoDrs7+/7/gVgmnCU\nUsUWwHFU0WUAAAAAAACTj+M4pJRykv4eoRoAAAAAGBuIkQZVBsIZAAAAAGMDMdJ+0JGoFhDOGYLK\nDwAAYNKIatvitn37+/t0cHCAGOkT0JGoFhDOGZK28kN4g7KDOgryQNYr1LHyEdW2xW37lpeXqdvt\n0vLycpbFrCzoSFQLZNXIkLQzjdn4EBF1u93MygVAVqCOgjyQ9er58+f0wQcf0PPnz+nhw4fFFgwQ\nUXTbhiwb6eCOBKgGEM4Zkrbyw/iAsoM6Cmw5Pj6mw8ND2t/fj/Qsynr14YcfjqN4IAZRbRuEH8iT\nOLZkHCAdHQAAgMx59OgRPXjwgA4ODmKJKlMjGdZwlq1RBQBkS1JbEgTS0U0YiO8DAFSBKFvFcZt3\n796NZdNM8a9hMbSm72BHAZgcyhYDjlCNkoEYUgBAFYiyVSyA2VsUtJ+NxzgsRMj0HewoAJND2UKB\nIJxLBmJIAQBV4O7du/Ts2TO6e/du6H5RNs1G5IY1nKbvps2OIlwFgPGBUI0U5DEciDQ9ABQLhvnt\nePLkCT19+pSePHkSul+UTctjGHba7CjyAAMw
PiotnItu4GCsAJg88F7bkZXgDRO5Rdv4qlC2GFAw\nGeD9M1Mp4aw/xKwauKSVw8ZYZVHxUHkBGB8QIXbIEIm49snWpqETY8e0edjBeMD7F4BSqtDtRRHs\nODg4UESkDg4OlFJKjUYjdXBwoEajkfUxbI6bJVkcO8/yTRNZ1RcwPaDORJPEPtn+pmr3P255q3Z9\nYLow1c9JqLMnujO5bk3z4yy2OMI56IGlfZB5VoQsDOkkVNQygA4IiIusM3gPzSS5L5N6L+PamEmw\nSZP6LIGZPOvsuOrSVAnnICbB+DBlu5ZJMoqTdC1gPMg6U7Z3cxyMRiPV6/VUr9fDe2PBNHqcp/G9\nmGbyrLPjqksQzir7B1mkMZPntilH3mXNqiJPQgMBpptJrMNR18TvP4RRcZS93pW9fKA6TJXHmYjm\niegfENE3iej3iegzRLRIRL9ORB8T0a8R0XzAb/O8P7EZjUZqd3c3sKEYp5GwEa1R+5QljAVeiWDQ\n8Ew3RT5/G/vR7XZVu91Ww+Ew03Mnve5pe19gOwGww9Y2lEU4//dEtH/y98yJkP5ZInpw8tkXiehn\nAn6b8lZlCxup3d1d483Pw4ilid229RgVbXSnrbGLQ1meESiGIp+/zXuZV/lMxy2yPGUFtnM6wXOP\nj61tKFw4E9EVIvo/DJ8Piejqyd/XiGgY8PtUNyproiprHpU5TUNQRHlBtuAZTTdFPX8+73A4TG1D\nspogaGML8b6Um+FwqHZ3dzMfoZg2pq2DmAWV8TgT0W0i+joRHRLRPyGix0Q0S0T/UtvvTwN+n/JW\nZcM44qSDzpHG0GDWPwAgCXJ0bZwpM8PsVBwbBntXTrg+7e7uFl2USjNN9Xvc11oG4fxXiOjPiGjr\n5P//FRH9HV0oE9F3An6f282JQ9a9O9Pxgs6Rlce5DD3UaXrZAagyth7nsN/yb+K891nZqaDjlGVy\nd9VJei3wOIO4jFu7lEE4XyWi/1P8/98iov/pZKKgDNX4ZsDvvXRHvV5PffTRR7ndrDCK9Dhnce7R\nqBxpo8og3gEA+ZJHeFlcIR90nCJt0CTZv0m6FlBu8u5wfvTRRz6dWbhwVi/E728Q0fed/N07mRj4\ns0T0xZPPKjM5sKqUxchNkscFAOAnjZc66vdZhY7kZYPyivUuK5N0LQBIyiKcbxPRbxPR7xDRPzzJ\nqvEyEX2NXqSj+3UiWgj4bZ73Z2qwNXJlM4ZlKw8AIJi0HfQwcZxWlOdNWZwTVQM2HpSNtMJ5hjJA\nKfW7RLRt+KqdxfFBNMvLy9TtdiP3Ozw8pAcPHhARWe2fN2UrDwAgmP39fd+/YRwfH9Ph4SHt7+/T\n8vKy73d3796lO3fu+I4jbVgZbUGcawefABsPJg3nhfgusACOo4ouwzRhaszifD/u8pTtuAAAOx49\nekQPHjygg4MDCKYTptEu8TXfvXuXnjx5MlXXDsqJ4ziklHKS/v5MloUB5Ye9OkGGi70Dh4eHocc5\nPj6mR48e0fHxca7lSYrtdQBQJrJ6r7IiTXn29/fp4ODA89AeHx/Tw4cP6eHDh6W5vnEzjXaJbfyT\nJ0+m7tqnhbLZrdxJE+eRxUaIcS4VtvFoZY/3yypX7KQxzddeBXjGd7fbLcVzyvI952OVyW6kfR/i\n/n6a37+yx7CD5JRdD+hQGSYHpioAhPNYycpwV7kBqNpLniXTfO1lZzQaqXa7rYjI+7eI5yTz8Gb5\nnpclZaYkKhd0lMjD+xQf3LPJo2p6AMIZxIKNVq/XO9UwyMpftRchDmW4tqKXWZ7E51p1ZMaJcXnl\nTItV5LXyWxkXxojKBS2zf8TJzQ+CKWM9ANMFhDMIRRfDehJw2TBIT4BsONAoZA+8LkBnnB1XPj57\ntqVIjitsbMvKtqbdbpdebJo8znhn7YiqD7iPIIq87R+EMwjFJIa5QvZ6PdXtdr2hU73hTrsQQdXJ\n8+WFpwoo
Vdzqd3z8breb2vtnW1YW5N1ut1C7kvTdwztrR1R9iLqPWYcH4ZlVj7ztH4QzCEV6mfXh\n36jKOe1DavCMgLwJmgyYR4Ofl0c7yQS5ImOdx/VeT6toS3vdaYV3nGOBcgKPM4Rz4URNgMGQmplp\nbfjA+GDhbDsZMEmdHA6Hqt1uq2az6Z1LitYi6nmRtiXoerO+D9NuP5OSZbs0DTZ8Gq4xa6ZGOKNy\nJMfm3mHiCwDjJyx7g+n9SyIaWq2WlwZuaWnpVEo4m2NmbQvKaFuyFrplvMY8sc1EktV5puW+RqGH\nYOLeRDM1whm99/yIimfGy/gC3AcwTtgb3ev1vM/Ye9ztdiPrIf9+Z2fHE8s2Hmf5f5u4ZD30oqrv\nSVXLXRZMmUhA/sgOC+69HVMjnGHU8iMqg0aaTktSb3cZSXIfwkQJAGGYhDPXwSAPkxSxLHa73a4X\nCtJsNiNji6U94N/V6/XA3+llgpNjOhmXxxmYiWrHwSdMjXAG+RE1CTDucLKEX2bOG52lMB+3CE1y\nPv3aICqADUET6PTPg+oXi91ut6sGg4Fqt9vKdV0vbEOKcdO52XPV7XbVxsZGZC7jSfA4g+TkPZk1\nzr5lqX9VaJ+mFQhnkBpbMafvFxXiwfscHBx43rMsQ0GqIELhcQZJsK3ben3iUI5areaFZbDw5X9N\nWTzCjis71uN85/CulB8b+56UOHVN7luWdiHvcuD9SA6EM0iNqbfOjWTQyoJKxRsaKtojAUCVSNuZ\n5PzMHK5Rr9dVp9NRrVbLy90eV5SMc0VDed6iBRAIxmZEMSnwOIeD9yM5EM4gU/hl5NhGubKYbgCC\nhpPDSGJMkhrQSWZarhPYY3ofdY8gi5y471QRk45Qx8sPnlFxFDl/qOrPHcIZZAq/ELYTi6J6vUFe\natv9bX4jmZZe+LRcJ7DHNJmQGY1G3nscFt/M+8aZyxDViFa9kQXpKTLeN+25i66/ac6fVztR9fYH\nwhnkwmAwUBsbG6rf73sNLm+u63r7ydAOk/fZFBdtM5lQvpDT5nGGJwEoFb/e88S/brcbebywY4cN\nv5ti9qU3Oqrji/o1nYxLaJlirtOeO+7vs67jScufZEQ4zrGr/B5DOINc4MZwZWXFi5G8fPmyIiK1\nuLh46oXhl1tvcPWY6SxjoUejkecRT7IseNyXP0tjYSNciujNV92TMEkkGWmx8ShHHTtswpc+ryHo\n/0EdX9Sv6WRcQsvU6Ru3xzmt8yft+cPKAV4A4QxyYTgcqnq9rojIm6G/tbXly9cqkb3bsAY3y5dY\nioXd3d3Ev7ctU5bXYCNciujNV92TMEnE7URy7uYgD5M8nmkhFZM32dQBNnmYecQp7PxR15Tl8HoS\nUPerTxmeYZxwwzzLW4Z7UVYgnEFusABuNBpeiAYL4zAPsmmIKA9v7XA4LI3HuUjvNQBMWIdMficn\nC7LYDZsILOOnw4RB3PkLQeUrwlsGDx3Ii6C6b1vn0F5kC4QzyA1+WXmioGkFMxm7HLYwQ5aUsYHL\nokxhHr+kxwDTRdDz199PKZx5ZEnmetbjm8MmHpqObyLqHYHHGUwbtnWujG1elYFwBrljerl5yHYw\nGPjEdRaTgKIa0DI2cFmUSTeOUWLFdG4YWKBUdDYbDtXgCYXtdts3kiQ7y7oo1t9923ezjO8tAFUA\n7062QDiDXNFfWG40uWHleEeZ9znty130kK2OrdFKKxr0pc+j0ouZxHLRXjtQDkwjQiYhHbRwBdc9\nFtTyWPzOy6W4g84bRpq6iroNAEgKhDPIFW40XddVu7u7Pg/VwcGBGgwGgd6npJRN/NmKAdv4UqXs\nJpCEXXuQWE5SbjB5yHkApgl+QWEV0pvMAlkX17Ye5zxjmlG3QZ6Uod0B+QHhDHKFhTPHQTabTZ9X\ndBoaMJv4zah9oobO4x7DxrDD+AOTOA1bBZC/k+nlktafKK923h7noPdJ71
QE/ZvHe1PWd9KmUzVN\nTEK7Vta6VgYgnEFuyJRV7IHSs1dk/XKW9WWPMqRxJz4FNU5xvNZg+kgjME1eaH1fzlKTxQgSn08P\n9RjXO87viz4yw+WRnYSgf7MuY1nfYb1TVcYyjpOytkNxKGtdKwMQziA3OA3d1tbW2EInyvqyR123\n/r3+/7CYUdvzTIIxB+mI8uJmdewwrzQT1gEM8ybbTnpNi14+vr5ms+l1DmTu6cFgoDY3N9XFixfV\nrVu3ciljGd5hUxmmweNchns/TqbteuMA4Qxy48aNG4qI1MLCwtgm6+X5so/TkOj3Sab/ykP0gGqS\nNPwmqPOVpo7zbzudjiJ6kb897Di6R1fC5et2u6c6lDyROG/hrDMajU55VGU5uVxyq9VqiXLElwWb\nuRTTwrReNzgNhDPIDdd1FRGp9957zwvZyDP+L2+SGs4kYsTkgY7jyYkaVgeTga2XN2pEQz+eTR0P\nOgbPZ6jX65G/D6rT+twILo8MCSiiTsvwM/5bZgUybe12e+zlZLKKM9fnUkyjTZnW6wangXAGucEx\nj9evXzfGChYxMS0Lj1pcAWwzbJ2UKAGU57lB8Zjibk1D6GHhPabj2dTxoJCJwWCgNjY21GAwSHQt\n7D2XaevYa6unXNR/n3eYgKmjwuXp9Xrq3XffPSWc5+fnY9+LNMj7mNZLCrEIwGkgnEGuyKHNS5cu\nefGAeoOok9ewWBYetSTny8tDFnQ98DhPF1EdKA7vMaWXS9qJ5NSS3W43k2vQ67Kp02kz+XUcoWBh\nNoxz1HOoGu83LuQ9gvAFIHsgnEGuDIdDtb29rS5duuQ1It1uV62srHjDmEEz9G0Nfl77mhqgOMJj\nHB4wNIzTS1TnSIYV8HeyTqfpnEqRmlWssU14Ulh9j3rfsnxXpKjnCYLtdlsNh0Pfc3FdV9Xr9bF5\nnPV7gA50McAuTzYQziB32AOzsLCg9vb2vLjFjY0N77s0HqK8vNOmIU/pAYs677gnk8BYTw9SuAWF\n48jv5e+y8jgn7RTanjfr9yfr4+neZ7ZpRb5/+jVmHbIFG2MHJhJONhDOIFdGo5FaW1vzGhbOsMEN\nTFjMYpxz5G3Mk3qc84zT1r1IMNbTgxREQQtvZPFuhZG0ftumk8v6/TEdL8qDbXN+nsuhT2RMcqy0\nmLz2prqRtONTJRsT93mnOW6SfUB1gXAGuSKHdIlIvfLKK2p3d1f1+321u7ubicc5D8ps+ExepHGE\nhYDyYBLF4xY1Sc8XNw9znnXbJmbadn6C7s3nEA6Zqq4sk/T0eHAW/xxuMq5y5Inp2WbxjlSp8wDy\nAcIZ5MpoNFK1Ws0z0rdu3VK7u7tqe3tbEZHa2dkppSEuq3GUIkJ6kcpaXpAPpuc97hGQpMeL+zsp\n8rKaiGhTFtPkRNvjyPANmZYu6b2Per/j3lO9MyLvsQztSXr8MmATM5/FccH0AeEMcmc4HHrimf9d\nXFxURC8WSShiCDPr/cdFUANa1vKCfEjyvIvoXKWtlyx22G6MOyeyqfycbq/f7wd2XFk4t1ot1Ww2\nFRGpTqeTKE2fLEfQJD+bZxt2DPY4t1ot32RSPY67yh3zqjsXMKpYHiCcwVjgl951XdVut8cSqmGa\n3JdmqNQ0W33cBkyeN+8YVjBZZBHPG/e4ad87/n232y28rvN1cizzuXPnvJATXdDwuzkYDJTrumpt\nbU1duHBBEZFaWlrKxPOsh4ZEPb+wiYJs3/SQEv4NpzOssmCruvDUw2tAcUA4g7GhN4KDwcAnpgeD\nQaZDv7KRSSsy5dDrwUH+S4dHMRqN1MbGRuDQKgBxMAky2xRmYTHLWXicyyLY+B7t7Ox46TWXlpa8\nFVLl9esiVd9kiESca5PPJm6mDPlbjr/mckj7Vq/XPTtpG/tcFYq222nIS/iX6R2rChDOYGzwC8pe\nDRZ++r9x4/jYa63HP7LRbzab3mINSQ
0mNyzcgBRlwPSh07BFZACwxTQ6YyvMoib7TcrIiIx5ZnvC\nNoHtj+4BHg6Hqtvt+jILcYga/67T6ailpSXV6XRix30nXVhJ916yDTXFY5vSGlYVOWmTHTZVr5dp\nqXJnoiggnMHY4YUZXNf1GTD2QAcJ0qAXnBsgNvbSW8bxhXEbmdFopFzXVbVaTbmuqwaDgS/1Vx6G\nxmbC1yQNnYL8MHXCbDtmpsmnYb+POm6ZhFdWHnDZeWbbEBZ2xve10Wj4BOru7q56+eWXrYfg444G\nhB1Hep339vZOecVfffVV1Wq1vLC6KgtMk6c+bDLkJBFV5+Fxjg+EMygE9lItLS0ZY+pM4jFshTRd\n1LJx5Iaq0WgkmkgljasptjBLTMfV7weMHIhCekble2TT4Qvbx6ZjZ6JMHuc8Or22ndnR6JNlyhuN\nhucg6HQ6iojU1tZW5HuddflN2T/0jR0O0ptepVhh+T7wv+y4mQaPMzzK2QPhDApBGuxareabdKcb\n5agX3zRJhkU0e4Fsc8bKY+7s7CgiUtvb2754QP2ceQKhDOISNIwfJxTIdgJh1UZAknjiTd/LzoCN\nJ16KN90exRGiSb3+QeghGo7jqHfeecdbqGphYcEbGZTzU+R1lP25y/dhGpceRxuSPRDOoBCk90WP\nP+bJNq7revuGNXZhwjqN0dDjieN679KglxvGD5hIGpKR9hwMp2bjd7aKXq2oeGHTu24TfiJ/x3+3\n221jZp609iTpyBTbtgsXLqj5+XnfqBqHwBGRF/LGGUX4OnTbWEY7VcYygWoD4QwKIygEg41zvV73\n7RcUwmDjMUoztDgOcaKjN4QYbgMmxlEvws4hh76rNHwvCQptkd+HeZyD9pd2La0NCypH0Hc29UK3\ni/r/5aig9DTX63Xluq4xDr5KdgqCGiQFwhmUjn6/r1ZWVlS/31dKmWftxzF6Ml65CgZdKXicgR22\n8cNx6k+cumeaX1CVd2w4HHoZd7JMhamHrwSFB8R9p+PcX5tjm+yqLualkB4Oh2p9fd0X2rG4uOhb\n0KUKdoqvi73occP4AChcOBPRBSL6OhF9g4h+j4h6J58vEtGvE9HHRPRrRDQf8Pscbw8oAr2BSCuc\n43qc08QRVqHhAJNDVCo4Jo7oiusJZapW97e2tnwZJJKI5qDFRGSIF+8TFA8sPdRhdirr+2tK4xl2\nDumAkNu5c+cSrYZoS9bXrV8HhDOIS+HC+UUZaPbk37NE9I+J6AeI6GeJ6MHJ518kop8J+G1+dwcU\ngj7UafLYJG3cbQgSGVHiw9SQVk1MgGphK5yTCuCqeZHjwDG9+miUbceZ7z2nwTR56ntigaegdHV8\njzmPfZL7ncTO2NYdeQ6Zv1puq6urscobhyzrII8ybG1tqZ2dHd/y4gDYUgrh7B2MaJaI/jci2iai\nIRFdPfn8GhENA36T280BxRJmMPNs3G0aTv3vnliuVk4yylPgA5B3nap6nQ0r/+bmpif89vb2IuN0\no0bCTL+T3s0wj7MeKx73ftvGNKcN/xqNRr4lx+fm5hQRqStXroTGc9uWKek+Nufp9Xpe2csSulf1\n92taKYVwJqIzJ6Ea/4qIfvrks3+p7fOnAb/N696AgrENsRiH8QmbfCMbx6WlJdXv942NiK3AhzEF\nVaLM9TXsnQtbcTQqpMI0EmbqWJvCL4I802kmVto8g6B7EVe8DodDn2BeWFhQ165dOzWJ0GbBqXGN\naMiJjrxKYxkmsk7yiM4kUwrh7B2M6AoR/c9EdFMXykT0nYDf5HRrQBmIMizjWlxBL4feSMrUTbyo\niz4EaiswYEyBLWUQrWWrr0EjQ/r3SXP62lxv2D78HXugk+aajwOLc1M+epvFnbjMLIYHg4FaWVk5\nFWJy/fp1RUTeEuNlcRLwPeZyjTuuOWy0suj3F8SnVML5RXnobxPRf0ZE39RCNb4ZsL/XW+/1euqj\njz
7K6VaBIjA1ctLYjGs53ygDNxqNVLfb9Q0FJjXOMKbAlqSjGFnWsaRD82G/s3nfgr6PCo9KK/TT\nhhfwd+wF5bhh3fOdJVL4ymvXBXGYV1r3JPP+g8FANZtNdePGDV/MM7fJUZMe02Bb9/he86JWce91\n2vAT0whlWTqaIJqPPvrIpzMLF85EtEwnGTOI6BIR/SYR7dKLyYFfPPkckwOnnCDDwx7nwWCQu6G2\nbTDzaiQA0Ek6ilFU420rIKLKF/Z9VHhUWTqmXA7d45xVTK+poxQVXmLySstjBuW7luFqRC8WVGk0\nGoqIvFUIiUjdvn1btdvtxOn/ZFmko0KKmqDOSq/X8zKp8OJatugTQcOIqnNlqX8gOWUQzm8Q0T8h\not8hon9KRH/r5POXiehr9CId3a8T0ULA7/O8P6AkRBkeabjlkrBphUFUIwxAWsbRkObpcU5ajrw8\nznKfvDuxae+jqYxx7Uwcr3pYeaU4tOmUmK6DxTJvi4uLvv/zdubMmUBxORwOVbfbVe122xeCJ7+X\nNn5jY8MXw9ztdo11XS4v3mq1rO4tI38bJp7H6TgZDodethaI8PFSuHBOu0E4A6X8MXzdble5rqua\nzWZqoxI17AtAWtAhy4dx3Ne059B/n0R4xfGqh6Wg4+904aljOrYUta7rqrW1NdVoNHyZS/Tt/Pnz\nnoA+e/as+vSnP+2LQ+bPb968qT71qU8px3F8oS2XLl1Sm5ub6r333lNra2vqtddeU0Sktre3vesY\nDAbenBMZSsLtgu29lsKcnTNRaVLzgsstwwJhO8YLhDOYKNhwycl6aYwKxDLImzLUsTKUIWuK8OSn\n/X0S4RUnDCAs5CDst1ErROqCnPfh1fk4fCNIRMfZ5ufn1aVLl7y/9e9XV1e970zf87XLUUqTY0R2\nBqRTRqYdte2wZFkXpYiv1+vwOBcAhDOYGKQHwXVdRfRiMgjijQEIB17vcpBWYEU9x6A45SjBx2KR\n80zzXBL+jS6cddHZ6/XU0dGRJ56Xl5dDxTGntwvbzpw5o86dO+f9nzN66OEiRKQ2NzdVq9XyROZg\nMFC1Wk1tb2+rbrfrhT2wx7pWq3nx0PrS3EHx4mmeiy2j0cj3LNCuFQOEM5gYpHGyideLIomBzJpJ\n9ASC8oF6Vj6SpNq0jf12XVfV63VvqWy2nT1tkRb+nPMzy9UP+bugsAUTbI/ZYxy0NRoN1Ww21cbG\nhrp48aJvf92LfP78eeW6rleGwWCgXn75ZUVE6rXXXlP1et1zpLDneHZ21udt1ic3yo0zcTSbzdht\ngd6GpG1VFZRgAAAgAElEQVRL4kxSBPkB4QwmBtOQZRpDJRuGorxxiLEGIBlVf1+yTrUpbSLH/V68\neFG1Wi0vywULM1PM9WAw8CajsSc5rmNiOBx6eZ+XlpaU67qeqCV6MZlwe3vbE6szMzOK6MVCK0Tk\n8zDzdvHiRZ/N52uYm5tTt27dUkQv4ptd11W1Ws373czMjOp0Ot5vZXgfn2t7e9vnweay2yzuotRp\n+522LQmLUQfjA8IZgADK5nHGcDoA9lT9fQnyOCftELC3WMYd88bi3GTz+D7yZLRWq+UJwKg0cEHX\nxQLUdV11/fp1NT8/rzqdzqmMGby99NJLoTHSUjDLNHWXL1/29tGX22YhzJ2G4XAYeI6lpSVPQHPZ\nde+86RmZ7qMU3YPBwPOI28RFV70zOClAOIPKk5cxKZuRKlt5ACgzVXpf4pQ1aYeAPaoc67u9va2u\nXbumbty44YVsMDIkgMUfe4ZZeEctmhIk/OUEbvZ8syCV33U6HVWr1YwT/OTGYQvcMVhYWPCyechY\n6fv373shH2fOnPHOzbHR7XZb7e3tKSJS9+7d84Q279dut31iWHp/o7Jr6Bk8+P8yXZ/eCal6x2+S\ngXAGlScvAwPDBQCISxLBbmNr0o6ASSGrp1fTz6vnLZaTAcMWUZH/
Z6G+sbHh+04KRBaxly9fVv1+\n33eN7HnmLBmcuk5ur776qndsWWYiUisrK54QXltb88I/ZKiGfrxOp+Olrdve3vZCVHZ3d1W/3/d1\nBAaDgdrY2FCdTudU+ERUthT9/t++fdv3GULzyg2EM6g8em8+bL84DQ8MFwAgLmlTygWRZXwrH6vZ\nbBrtJn8vwxvCQhNMv3Vd1wtr0CcXcqy0fnxGDw8xbbOzsz5vthSeHKJx/fr1wGPIeGfe9Bhq+Qz1\nmHP+v1y9MAj9+erCudVqWbdjoHggnMFEYNNY6Y1BWT3JEOyg7FSpjiYt67h/F0WWwjnIS6x7hnkx\nKRa6Nl5x9jSzAOz1ep5IrdVqvkl4+rLZevmkF1l6nFdWVk6Fl0jhqaeka7fboYuxyO3ixYuK6EU2\nDfkM9dATTmHnuq4vBMPm2Q+HQ3X27FnffQDVAcIZTARxPDZsSIP2LVoUBHUCii4XAExcr2qRdTdp\nyFXZQrXyvIf6tcoJfLYTAWXWCDkBTveustDsdru+EJCgY+rZLkweap3hcOjFD8/Pz6t3333XW3kw\nbGOPs23nRE/jx9caVWdk+rvLly/HSjkIigfCOWcgdsqDNMJhwlk3huN+dkF1JmlDjjoIsiZunSpS\nhJbNc1xG9GtlO7m4uKhc1/VErh6GwV7fMNuqxx63222llD/dZ1SIHU/gk5vJSyu9zjLNnWmTQnpm\nZka9+eabXrzyO++8o+bm5tTR0ZHVfQubKBj0O146vEydM2AHhHPOlM1rMe1Ir0iQR7ff76t6va62\nt7cjRfY4SdqQow6CoilShJZNAJetPCZ00ckT9FzXVbu7u17oxcrKiidWgzzBLJy3t7dVu932pX+z\nzWvMoRfnz58P9ThLT2632zXmfWYvtBTNfDwW//y7ubk5q/uV5Jkiprm6QDjnTBWM5LQR5dHlIUq5\nraysJB5OK7oOFH1+UE0mpd6UreNYtvKY0L3E0lvMAnplZcW7DpkjWqZb45AMFqa648K2jvExeCGU\nxcVFX4yzjDfWveOmbW1tTRF9Eje9vr7updlrt9vq3r17anZ2NtLjnAWT8p5NExDOAJzAxv69995T\ni4uL6rXXXvMZ2/X19UTGrQoNJZhskjTOk1JvyyZMylYeEybRyTmdOQyi1Wr54pWJXkyoazQavowV\nUsjycW1X3mNGo9Eph0a9XvdEusycsb6+rrrdrhoMBkaP8+3bt70Jj/p3u7u7gXHKeT23SXnPpgkI\n5wIISgwPxo9uDDnGjYjUzZs3TxnWJDPaq9BQgmIYV93IK0UayI+iw1t4iepOp+MLJ5DhELw4SLfb\nPSVE+TspmLvdbuJrYq+yzI5xcHDgK4/JO87imb3LLLK73a4XZ7y6uqparZYaDoeZzzGJAu9Z9Ugr\nnM8QiM0XvvAFevr0KX3hC18ouihTz+HhIT148IA+97nP0ccff0y/8iu/4n338ccfExHRzMwMLS8v\nExHRd7/73ULKGZfj42N6+PAhPXz4kI6Pj+n4+JgePXpEx8fHRRcNCLj+HR4e5nqe/f19Ojg4oP39\nfevfLC8vU7fb9ep+HFDf0jOuumHiO9/5Dv3RH/0R/eIv/iJtbGzQ+++/79WD/f196na7VK/X6Tvf\n+Q4REc3OztJnP/tZIiK6ceMGNZtN+vmf/3lqNBr0cz/3c/Rbv/VbRET0C7/wC3T37t1Eder111+n\nr371q/TDP/zDRETUbrdpf3+f9vf3aXV11dvv6tWr1Ol06A//8A+JiOjHfuzH6MaNG/SXf/mXtLa2\nRj/0Qz9ERETf+MY36NatW0RE9O1vf5t+4zd+g548eeKr91yPP/74Y3r+/Dn1er1Y7xAARtKo7iw2\ngscZJIB7+XrapX6/r+bm5k6tUsUzsBuNRqx8nUoVMxQnvTDSKxNUBhmTmHRlMhCfSfU2lWn4uar3\nuMhycxwyxzHv7u56K+hxu6VP
bpM2VWaa0DfOqpEUU/u5vr7uHX9hYcHn/ZZ/12o1X5w1xzUTvVgs\nRbft+kIsWeTQloRNVgflhRCqAaYR2bCzIebZ3mxoZ2dnjYZ/Z2dH3bt3TxG9SFskc5IG5SLNugGM\nmpEd1KhFpeDjRlIa8qoKD1Acep0psg7Jdx112Q7pUND/bbfbofdQhmVcuHDhlPMh7WIfpk7ZYDDw\nznXt2jUvC8f6+rrq9/veMtqu6yqlPqmPMnvIuXPnVKvV8i30IoVt1sJ5OByqV199VRG9WDocjrTq\nAOEMphJTA8oGWV8K1rSxR1r3TJsMax6Nte5RTkuYx7lM3kNQTdLWoTTvkMmDGDbxa5ziuswjPboH\nmT3OUSsIyomCPMH6ypUrnmMi7fUFPZ/hcOiz2/oCLuyllr81ecV3dnZOdfrySBvHKf54w+qB1QHC\nGYATpEGOSp7PHpS33nrL99nW1tYpL6/talJxyzquHKDw0uVHEffWdE6bcqSpc2mvM6vOW9A1yOOP\ns6MoO8ByqeoyE/UsdTG6sbExNm8qn5szgEjhLztOvFDJYDBQCwsLPifIzs7OWMqqtyccRgLKD4Qz\nAAaClnnVt729PdXtdtXLL7/s+1w2wO12G0nugZE8RVqc7AA25ZCCaNyjD1l2MEzXWgaPs8x3XGZM\nYTh6WFhU+Nq4yhb0vUyLx3Wasyjdvn3bC9nIo+xSwOsbRvWqAYQzAAaGw6G6fft2pHBeWlry9uc4\nuosXL6p+vx858aOKntwqlrnM5Hk/bcIS4pRDevOSlrcM9cfWux5HSGdxXWW4Nzbo9UqKwDKLftMz\nHQ6HXko6jouOCr1LS5Bo5rzUoPxAOAMQwuPHjwMF89LSkur3+96+0vCePXtWNRoN1e/3T81EDxo2\nrILRZPFU5gayCNKKnqxF02g08nLrDgaDTDyAWXhlxxkGkaSMJo+kTahV2HVVRRCbsOlkjUajUoWZ\n2ITjSPj5uq6rLl686LPz165dy9zzPBqNvJULeVtdXa3M5EBkBYNwBsCK4XCoNjc31YULF9Tm5qbR\naMjURrxxCAc3KDwcy8Y4rxjovMjC6ziJxBWEo9FI3b9/3+t8ZS0opVfr7NmzqYaDwybShh0rTSx1\nFh7hJPdUZoTg0SKbjm2YmKjy5FrbspepcxA0cTooxIRXEQxaZpyIVKfTybSM165d8zlh5PLhZUdO\ntpxWIJwByIjhcKgWFxeNhpdXzOJhQZnLtEo9eJmmqopCIC/iCgfZuK+srPiGjm1iNG0m8V25csVX\nBxcWFpTrulbZG2SdjIoJDiLp6AT/rtlsBnrJbSbzJRFzssy2z0Qvj06ZRGVcqlh2XmGw2+16S4Tr\ndUiG0cmwDFO4BjtAsuT69euKiNTMzIxnA7LKOpInciSrCu1VXkA4A5Ahw+FQNRoNX5q6xcVFbynv\nra0tn9GJioMuGywQdnd3S23gyw57nOfn59X29ra31C/XhaAJpVEeQHkMbpzlJkMPwgSt9Col9RKb\nRKjNMUyevzBRnKW4S+NdH1eWmyIpUkjbntvUqdLrO3/ebDZVo9FQzWbTE9mmUZqsPc6ctWlvb89b\nZIYdEmfPnlVvvvlmKYVplUdPsgTCGYAc6Pf76tKlS57hlflFZYwzG+qNjY1THocyenukJ7KM5asS\no9HIqxfNZtMTq9yAmsIEokYnZMdGplRcXFxUrVbLJw504SyfZ9xRkDCvNB/LVvB3Oh01Pz+vrl69\nqra3t3PLbmBLEsE2aZQhrEyGuYWh12PTqn/69Ui7rHuil5aWcolz7na7amdnR21tbalWq6UGg4E6\nd+6cd960KyzmAWz+CyCcAcgJ6YUaDAaq3W4r13W9z9iY64JJN+BZNlJpDZ/NEDmwg1efJPpkVTVe\n9r1Wq6nt7W0vbl7PYCA7X/JZskjt9/uq2Wx6OWql4MgiJlg/xmAwUPV6Xbmue+q4XNfr9boaDo
e+\nMvKQuuxA1mq1QG9znPobV/yn9WKnebfKLkhYtK6urnreWZ28r+HWrVuKiNStW7esf2MaIdOFNddB\nfu69Xk91Oh21trbmG7XJ2s7JlKdra2uq1+v5bMLe3l6m5wPZAeEMwBhhQ86Tj/Qhejk0zkIiyyG7\nNGJXH44ue2NfdqR3iejFAgzSO8ubrANRscf8GQ//EpGXKSBqYYc4k/l0r7UssxTovV5PbW1ted/x\nRCz2gvPnMzMznjB677331OLiorp165ZqNBperKr0EJpChdiLt7m5qebn572V2ebm5nyTr4IEcl5x\n0zaUvROq57XXR0LGEXLGS2pfuHDB+jdhoTdyFUSe7MzfyZEg2enLAj6PaZEt13U9u1Cv1zM5H8ge\nCGcAxkjQkKfuHUvbEAU18Gkafm5UqpQ+r8w8fvz4VCxl72TpZfa68r82HmcWjpzqamFhQXU6Hc9r\ntrCwYOUpZC+c67qecNDDOnThLCfG8gpoMr6Ut/n5eU8wbG5u+r5jrzufU2apkenOgt4L0/lk58NU\n9qA4bP6bl5nm+5aXwC17J3Q4HHqjF7LO6vUyr7kPw+FQvfTSS7E9zib0uQTyWkajke+zN954wwu5\nyyrVnrSj6+vr3rkuXbrkvbtzc3Netp2y1olpBsIZgAKQ8Z+9Xs8z1kmHpSVSeMm0d2kNcBniHCcN\nFrvcgF66dEl1Oh1fPKdt3LuM1+SNvXRySLjT6aharaZc11WDweBUiJCMsebtypUrPq/tYDBQ169f\n97zZGxsb6r333vP2v3fvnlpaWlKzs7OnjsXXZJoEKNOCcQYa/T4EhWCMRqNTYR68dTod757JibpB\n95nRU2+VXeDmiel5jWs+hmlEIw08GtLtdk9lbglKS5eVcNZDRVqtlpqfn/d18oJGlEA5gHAGoECk\nlyyNt0YKcSl86vV65FK+eghG0LFlDOCkZw8YF9L7JBdfkFk1bETJaDQy5hGP2lhYszeay7O4uKhe\neeWVUx7hra0t9e677xqPJTPJsPc4SIAMh0PVbDbV1atXfd9xjnT29sr4U/2emQTFcDg8FQKje0el\nuK7Vat7yyqZjcseiSnl284I7eexxnp+f9z7PuzMRFkOflKB6JDug29vbXiczi/NyveeJugy/u4uL\ni174HuxseYFwBqBAuDFqtVrexME0YRTsmVlbW/MWX4la1UuKd5MYkceGtzlbpOjgmFw5UZDDJvg5\nBoXemLzNvLGQDBOz7MnTO15Zb+fOnVMbGxunckzrwlpekymeNkyoHR0dnTqmjFFljzMP/ZvOwSA8\n6TQc+nP58mVfuEueNiGu99W2sxkUzpaXaJVhIHIBEX2UsAhv8zSPpsQFwhmAgtFjM4Ny+CoVHbts\nmkRlWsAhjhdZiikIiPwYDAZqbm5OEZH3r0xjaBLP/Jx5yHl7e1udP3/eSsSyUGdP8ZtvvunVRek9\nluXJc5uZmVF7e3unBHPYhEATfA2rq6tqYWHBi6XmjiMfj73tjUYj8Lh6xwT1/5OOR1SnI0viitky\nhjmMRiMv/KhWq/k8zqPRJ6kpOc8/Z57Ja3EsvS0p4z0rKxDOABQMNwqu6/qGkV3XDZwRHmXcTJMN\nkxrJvCf+gE+QnZSlpSWjwJT1Qhd0cr9r1655k43ktr6+rlzX9VZJY5F8+fJl38QpGfLQ6XR8oSR5\niOagjp5pomy/31eLi4tqa2vrVE5xva7rExlHo5Hv3prmAZj+HzQXYdqQoxthnY6siWOzyuI95RFF\nGbtvsqN8bXp8M88RyCOns34/s54LM6749yKAcAagJOjCh0W0DLGwNUK6UUxjJCfR8JUZfWZ/s9n0\nxS9zdoN2u+1lfRgMBur+/ftqcXHR87DW63U1GAzU9va299sbN24oIjIuDX/16lUv/lhOXOVOXVpx\nzF5rx3F8iwPxxmLW5MHU8+3K8u/u7oaKKt1bORqN1M7Ozi
mPKf9+OBx698l1Xe846EC+YDQaefcn\nKsVhltjGm5fJXkmb3m63A8ulT1Z0XdfXudM91FmQ132KansmAQhnAEqC9DxzOjBdOEvCFnjQjWLc\nxSDilhmTWLJFZpbg2fcsVuQmh3f5s7W1Nd/nMu1bo9EI9GRLISl5/PhxKsGsx1bfvn3bJ+Z563Q6\ngVlb9Dh8zg194cIF1e/3I+ugbLz5+LzUsswTfXR05Hm/+f4yZRJkRSNTBurkdZ/kQlFhxy5TB4ez\nZgQtGiP308OyuJPMsfiNRmOMJU8OPM4QzgAURpjBGY0+SRsmF00J2teU+9YkeuMauaiJhSAZeuyy\nzPfKm/S6bm1teTG7a2trXqer2+16+83MzASm2pKbHiJk2md1ddWbfBq03bhxIzA9HHvRdnZ2vJAR\nGUphCpOQacNkSAvX7ZWVlVO5b/XYfxneIjsb7NHTF45BNg0zpo74aDRSrut69S2ow5/mnLrNMxFk\n74pADxMKQqbbM73r/M7FJausMFFza4KeRV4Om6KBcAagggTFxOnCuNvtep4MfRVCk+hNMnsdHufs\n0Sel8fbKK6+otbU1T2wSkU/AcgiEzEvc7/c9USlHNYIELaeI4zqQ1NPMdULWQaIXKb5YtHKYCYdh\nhMU3cz2TIRcsitmLzv/yypymNHPcmA8GA59IOX/+vG9xj6p4+MqCXlezFs5K+cOYwo7PdUNfwGac\nDAYDrxMRlXtahiINBgPjqND29nbsMnBHQy4AlISgdiEqBFDPgx5F2FyFMgHhDEAFkT15KV6l8JDC\nmONipaAyefJsjNU0DMWVBRaeekxyp9NRS0tLp0IeOFMGhyDoCzzws9IzI+ib9OjJ9HTXr18P9IjJ\n7cKFC776ICdJDQYD30Q76W3Tl55n4dtsNn3CTI+D5o5Ap9NRPbEi2+LiYmAeaLnKoilV3/3798f7\nsCsOP5/FxcVM8y0HncdGmHN94nz24+rgD4dDXxw/T2KV70CQzeTr40m7PDcgycIv4/Y460I6rsdZ\n/p7/LmMmGwhnACqIKWZTztzmRoLjYvlf6a3RBbbu0bM5t/w/p09iQ20aJgfJkMtgd7tdz8v82muv\nqaWlJfXGG294jXRQ9guZh1tmzJDhCUQvYpDlsxoMBl4YCMcWR21XrlxRSvkbVllPZB2UkxC5oeXv\nZay27CDKOGi94yiFOHvbdBF/cHDgC9XQ0++xWAf2jKsDHec8ptjhLFYeDGMwGBjfQVkOfudMI3u6\n5351dbUyI3pJ64CprdBH3bLO/pEGCGcAKoj0Fsthe7lYBjcSLLhYWEhvHx9DTkbUhbU+fGaKi5Zx\nptxISO+hFDlFG72qI0cS2FMalrv5woUL3qIVrVbLJxgXFhaMIw/MaORfxjpoVT5dNB8dHfnqmXz2\nulCWdct1XS9dHncCa7XaKa+x9GRJQc6NLIsUTufF55UhId1uNzQGO2+BBcaDLkTX1tZyE6Kj0Sh0\n8u2VK1c8UR0Up83vo7TrRcdq501UVhzZZsQZccgLCGcAKogUq1I0y1Ri/K8+UYaNVLvd9qU5k949\nuUx30PCZKd6UPc4swvlYLMp07wGIDzesYaIvaslrDvXo9/uh55Ii3Xa7ePGiT6zU63XfQg6y7sqO\nnpy4ePHiRS8Mha/TlGWDO3xcd7kT2Ol0TtVnGWIiO466F31mZsYL+bAVV+gQFk9YOIEpY0WWNmg0\nGqlOp+N11PRtc3PTJ6ijMoPwMbk+T9rkOp047w+EM4QzAImQwoFFqfTk6TFn0rthmnjG4oT3k8ZJ\nxsmZfhvmJeD9TMPzEBrp0DNk1Go1b2Th6OhILS0tqXv37qlms+l5cON62rjxNoUyhG1SzMpOGcea\nsuePPeGyDpo23Ttn6jhKjzNvzWbT52HnrdFoePeCv79+/bqX+zrIAxbUwMedVFt2qtQRYHsVtjjN\ncDhUzWbTq2/r6+uJxK
jpvsg6FLSxoK7VapGp6SRljvNNQ5r6VYa6CeEMQAUxxYSZsmQEGXo5dM2e\nuqAJf/qELX24PciABU1aHI1Gnqcbw+HJYVHbarVOhTJkTb/fV5cvX/Y82Y7j+LzaMhsF0Scx9Rxm\nIcOATPmoeWu1Wl6KOvY4NxoNX+dQ1iNeyGRzc9PXQeDfcmz2+vr6qYmUHDcdlFotSiDrIUxFN+ZZ\nUqWOgLR77XY70CbpoydJrs10X/TOGtGL7DYctsbvRpK80kFxvlWnSvXLBIQzABWGDSoP5+mxqqb4\nVf6NDJ8Iil9Wyp8qSU4+DMusIQW37i3RGzpQPfjZysVC2u22b0U+KVBkyjjpXa7Vat4CLWtra77l\ns1lsS0+0HnbEHT+ZdWRxcdFbNEIXyvrxZMNtI35Nkx2r2viHUaWOgLRPcqRML//R0ZEnYq9du6Za\nrVZsr7PpvujCmTtrpk5aUqr0PGwo4nqyPCeEMwAVRqYV02OYZUyy3EeKbd2THCQo9JjUVqvlaxB0\nEWFKIya94NMSuzepyFzIHJbBaea4frCnWU485bSI7Blut9vqzTff9Alq3lfGJHM90kc6+v2+Wlpa\nUrdu3fI8fSZPNg/NcyfSdV0vZV9YWFMYplEfMH6kWGbbwqFB8lnq2WP4+yTIUYrBYOAbidnb2/Oy\n3vBEwCxH1iZ1UZG8ybKjW3rhTER/lYiGRPSHRPRFw/epbwIAVUVmD2BDasqCYcrlKycOyswaeiws\nGxyOcz44OPDF15o813oasUkd0p429OcnV9vj+iEbda5f7G2+ffu2IiK1t7fn5YfWM4JIzzTXo1ar\n5QlnU7o6Pv/R0ZG6fv26eumll9Tly5d9wluOtsiYVBZXcmU628ZVdion0fNcBfTJYnqd48/7/b5a\nXFxUm5ub6t1331Wzs7Pq8ePHieyRPIcpNp87k9wp5PqXBq6/slMA7JkajzMRnSGif0ZEN4joHBH9\nDhFtaPukvgkAVAV94RPXdb0GQqbnCgqhkMOaMr6TMwvUajWjh1hfrUt+xoZc93hLoa2LcIiMaqI/\nP45VZ0+vLiL1ZZLlwiTyX97YQ2cSyixAuG7Jzturr77qCRZ9wRb+ze7urufNvnr1qpqfn/fCRHho\nn0W1zSRK2yWgQb6YbJx81qbsC/zcOL1ilD3q9/vqypUram5uTl2/fl0dHR35PNhLS0ve6J+eIYY3\nvRxhQu7x48dqZmZGvfHGG14nVM4TuHbtGjzOBVJ24fyDRNQX//9J3esM4QymAZO3gQXuzs6ON+GK\nQyhMYRuj0chbWY7oxSpr/L1ctlkKBzbuMj2dLJM+cZA/N8WBBsVQg+qgN/aca3Zra8sYtqB31DhD\nC4sMOWGPwzhM9Yzr/cLCgufhk5kUWJDrqxq+9NJL3gRK+X7wxtlCGo2GV3ZTXTfBdd8mtRjIH9M8\ni6CQsMFg4K3IZ/P89NzMHPojvc3z8/Oq0Wiofr/vGwnhTB4yXGM4HHp1lWOte72ecl1XNRoN38Rb\n9izLDDSO41itBljWEb6ylCtpOcounP86ET0W//8PiOjvavvEumAAqoiMP15ZWVGDwcDo1Wg2m0bv\nC+dX1r17LGTYcxwkvIPSIUUZnmmZTDXpBMVVysmipv3kksemSagsvBcXF70YaVPHSp9cyB01Ppbr\nuqrVavkWjZAbC+xGo+EL4dA9gr1ez/NQBwlneU7E6ZcHk32RHXudOMtR60vUX7lyxRO7ej1iQcwd\nQVN9MqVSNNXJq1evevVLOj34nUpyT8pAWcqVtBwTIZzZ4PV6PfXRRx/FugEAVAHTpBddOC8tLZ1q\nBNiwB61mJb1mHP6hxyvHyTQQdQ1l8DKA+AQJEP2Z6vvpSx7Lesux97o3TiKFOA9VNxoNXx2Sk/pY\nUHAIxurqqrdSYJA4ISJ1//59X9xzWJ7foEm0oFhM9iVsIl0c0SRD0/TOW7fbVVtbW2p1
dVU1Gg0v\n7G1hYcGre7Jcg8FAra+ve+FFnG6RJ62ura15nmh5LTJUg+iTEUP2cLuuGxmqVxbyLFecyZO25fjo\no498OrPswvkHiegfif8jVANMLWzoObau2Wz60nrpXo3RaOQbmpbLJV+7du2UoDCFdwQhBbYeEx2X\nshp38AlhjZF8fjLbBtcNFsdyaJrrb9SiLDKbix5nz+fm8+iTXfWJfrwvewDloi78O5mVQ46UyPIh\n5GgyiGt3pHf5xo0bXviR3nFzXTc0DIRt9fz8vM9+drtd790xlUuGhnB+dCL/PAF2kJjerazsbNnt\nddgoQ1aUXTifpU8mB56nF5MDP63tk9e9AaBUDAYDzzCGefF0gaDvv7a25kvDxUOW/X7fOr2W9LpJ\nAZ3EmJZl2A6YCWsoZZy7zMwi654eltHpdHxhQ0Hx8Up9Iti5fssJp1Ios6jvdruq2Wx65+eVA6Xg\nHw6HvqWROZWe9EjXajXvPQgT0aCc5PGspNeZO38yoxDXP7a17OCQdm04HPpChfb29rw5AFxng3Lb\ny4rch2cAACAASURBVEnX/F4dHBycWjJe39gTLTuSsoMb5z4NBgPjdcl7VPQ7Mo50faUWzi/KR3+V\niD4mom8R0U8avs/nzgBQMmTDzgabh/h4kx4MXRjoYpo9dzJcI87QpRTLJvFra8DgwSs3YR0b/q7Z\nbPommPJ25coVbxlw/owFdK1W8/7muhh0LinQWbzwvzJGX557d3fXtw97xOV+7BnkeionHLLA0BfX\nQAev/JjqUZz5GEHok6X1Zb65Pst6L4+nx+BzFhnZOWSnBtt3PWxO5jIfDAaq1WoF5i9nL7QMMzl3\n7pwxp/UHH3wQeU/4XVtaWvKNZm5ubk7VPJbSC+fIAkA4gymBF3u4f/++Z8ik51fmt2WjK0WGHDZn\nA31wcOB52+J6HySmRifOkJkcrjR5HDEBqziiPM4HBwe+RnR1dVXNz8/7GuVWq+UNM8t9eaVBFs6m\nzAh6GAhP1Op0Ol691eOX2cuszwvQ41RNHUbThFZT3QTlxVRnZUiFaVQtbsyzaXROz+msH+u1117z\nfX/p0iU1Nzen5ufn1dHRkbHeyfdDLyvX3Uajoer1ui8bx+zsrLeKYafT8YUmhW1nz55Vb7755qkR\nF6U+GZ3U462586yL+km13RDOAFQEPfWb9D5Ij7MUHdKDFiSqObYujoGz8c4EiV7T5yycdQ9OHPFd\nhmHCaYU7ZDyxyeRV4+coQ44ajYaxjvCzlF5ebsR1jzM/cx5N4VAk/q3MFiMF/OLiojeMze/D0dHR\nqbAlU2w1qB563m1pS5XKZpIzT/zb2tryhcPx73iEI8gzLPeVI4Sc13wwGPhGTM6ePavOnz+v3nvv\nPTUajdT7779vPPaNGzdOpWK02eQ7ondo33rrrVP7y/eU5zTMzMyovb29iRpRhHAGoCLIWGQZVyqH\n+aQHjj/jGdcchyfFtPTCxZlMETakHuWd1M8nQzV0L0Ucj/O0DBOWEb2hr9Vq3jLY3PDroUTSc6WL\nUt7HdV3fgj9hIyR6lplut+ubSKvHhkoRoKdq5BjoTqfjy0MNqovseLmu65tQHafTHWVngr7Xvch6\neMXt27dPnV+GKBGR50E2ba7rGkMwbH4btbG3mkMymOFwqJrNppcZRIaUmFJDBo0sVQ0IZwAqgu5x\na7VavqWtZTouXSywZyVoqeGsPM428bC8uhunzmORxIJGehHjxD5X3RhXHenhZc8v1zWe5c9/8zPl\nz1zX9Q19B8WQSiHBdYPzKcvhYw5NYoEuO5YyDKjZbHrllEKGh7xZQGN54+qjj2JIMZckTCNurPRo\nNPKFavBIDC9epddP+TtTiJG+1et11e/31dmzZxMLZNvt8ePHkfdpOByq7e1tX/jI5uamr/NcVScH\nhDMAFSFoYsrKyorq9/teCAavxsZDbKY40GazmYvA
tPE460ZTnxRjivPDksblR8ap93o930pnsr7K\n58m/0VPVsac5aMl2uXKmPL7rut5kWFmXTLHTupf59u3b3t/f+73f66X5qtfrY43TRAcwX7jeyQVs\nxjWXQo6K3Lt3Tx0cHHgdPtMKg7LMehrF1dVVdf36dXXz5k21vr6ujo6OvMWu+H3a2dlRnU5H1Wo1\n7/iLi4teGxE2qTBqe+211wJDrGTndGZm5tRv9WXtTXW+zO8BhDMAJcZkPKQYZjEiPWPSk8Kw4GAx\nELTgRN6YvMhBEwP14f+oGNMyG9ppIKjTs729HRgmJFeuJCIvtRY/c33Cnoy1lBky2NvGQ9VyHoCp\ncebyyfjrRqNxak5AEe8KQo7yw1QHer3eqVz2eSFzMbPzgu0fC+KrV6+GOh6CJjTK7EimiXmyc8B/\nP3782HrSoGlrt9u+Msm5KqPRyBhXze/s3Nyc6vf7p67DdG1lA8IZgBIjwxgYafBlermVlZXAyYJ6\nrJxNrFkeQjQsTRRfq1xOWXrYo4Rz2LEhpseLHF42dYj0sCLu0PGwtQzdkGJWih69vvAx+T3Q64Me\nBqRn4uB5AN1u1xefOe5Jgaiz+aHXAX0EL+97rsc5czn0VI62I2yyrsjJj/y+dDodtb6+rhqNhk9E\nczmClqiPu128eFG9/fbbpxbZkqJ8dnbWW3Tr/PnzXmdBzl8wOYjKmJUDwhmAEmPyxrKXQnrG5HLb\nQbHM0tvLwiIo1kwKC93rl6ZxCYv/k9ckvUBc5qhY5yiPDBgvQaE5SvkzHPAm/89i1RSuIzuO+rCw\nnEjImTK44TV5oLvdrjdxUQ6js6csbOltUD1MNkI6J0z1Juvzs42TNlsKWI6DTmKz9M7AwsKCb0RF\n309Pj5fXpr9H+kRF9jxLymy7IZwBKDFB4oP/32g0TokD6UljT17QDG8pSuUwuIwl1Yfh8vDA6WJI\n9ybqRtRGxMedXAjyISw8h4etuQ7rdVF+FhYPqSMFd1jHS9YrmVJPrsZWVFgTGA9BIUZBK/gF/TbO\n+eSkbaWUb6Edm5VbbcslO6Nra2un9ltdXc1UIOthH2fOnFGdTufUtezt7fn2O3/+vDc3Qb+GMtpu\nCGcAKoAe42nKOiDz40pPgx76oP8tRTIfSxrycQjnsHhU0/9tvBFBnkkwfvTnFRTXru8vV/QL6gAF\nTSzi30hPsqzz+nugp7OT3mcwHci4XNlhChJxSb2iaYRy0HLZpjLKBVlc1z11rPv375+q77wkeNyJ\ng8vLy2o4HKrNzU114cIF9eqrr/qEsER2HuRWr9dj3ceigHAGoCJIbzLPSpZG12SIOL8miwiTkWeR\nIWOipeeF9+eJXCZjGNc7ELV/2u/lPrr3Gowf3ets83zlIhV6XZToHTpdGMsFU8JGL1g4s9jg1Q/h\ncZ4ubEO+RqOR6nQ6amlpyRhqEIXJYWEqS7fbVTs7O2pzc9O3TD1nybhy5Yrq9/vGhXp4wRTXdU+9\na/ry87LuE72YpBhHOB8dHcW6/tHok0whvDAMi+2yjxRCOANQEdjQytzN0rBIj7McApcehyDhaxIf\nuvEKW8XPFNphcy1Bnpo08W0ydtV1Xd8qiWWMl5sW4obbyFGWsHANve7K//M52bsnN+54ct3gXNOc\n0o7oRdw1YpxBmJjmejIajby6wytP6vH37CUeDoee6H78+LFqtVrq9u3bXmgdby+99NIpgXrx4kXV\narV8o4pXrlzx/rYdEeR6z57lZrOpNjc3FREF5oKWOZn1zSa0xXRf5QR3IvJNMLx06ZJ68OCBmpub\n84R5USEc8rxTLZzH8QDKEqdTlnKA5OieMfaGyRAO/leursbCkQ2UKVWXKQxDimdelpgnUun1SHoI\nbcRpFh7loN/p2RK4s4HV34ojLM7Z1NBHPX8pwsPqMgsWuTgKiwS9Y6n/n0dw0NkCJqRQZkHMdYfT\nIuqjgfqS30k3
ztfMMcUsnONkBpHtiCkvur4tLi6qfr+vNjc31cWLF30Lt0iPdavVCsyUYcK0uItM\nEylFuylbzriQ551q4TyOB1CWmaFlKQdIj56/Vv6re914SW6iF7GaPAnL5P0LEjbS0EtxHHf43Ya0\nx+DrajabnseZ7w1WfysOk/0JE85R8fRh9cQUv8+NM4dfsNDgUCYZ18p12pQLF4AggjzOXAc3NjbU\ngwcP1JkzZ9Rbb71l9N5GxRXrk8E5fE73cNuWV7ff/H7o24ULF045HvQwj6CNJwdKx448Dnduv/Sl\nL3mLpcjJuXqHFx7nCfM4B01SKYOnN6zigmrBhpi9GhwfJg0LCw8WBDwEyELElCNTD7dgjwSLCz03\ntCkOOqtrS3o80/vGSyvzdaDuj48wuxNmG20nokbZNfl9kJdP1mcWED0tdzQASZF1S4Yh3Lx5U83M\nzKjl5WXV6XS8zhqH2e3t7XkT9V555ZWx2K7BYKDW19fV66+/rubn59XGxoaq1WreiI18FwaDgZVw\nnpubs4rnlveKw7NkOArRi8wgNqFdeWucqRbOcbARwFXw6lahjCAcGY7AHgqZF1N6lfVMARzaYZoo\nKI/b6/U80c3/spBotVreMHjcSRz6exT1/6yImmAG8oHtDccb245M2NYDU4Mc9FsWxPqiD7zYStBI\nC5wN08E4nFyPHz9W586dUzdv3jxVX4+OjnzCWk5mLaLu6Z1JUzkeP34cKZwvXLgQmAkkCj3Eitsf\n0+85xR3Haedp5yGcLbERnGXxLodRhTKCaEajkTE2TApkNlQ9kTFDnzgV5KGT4qHT6RjjRHXRHeVR\nNMVB23oW02JKrwfyR69PBwcHRjEd51imOmbKQW6y1Syeu92uNxy8s7PjG2nhsnIHdFzLMYNiGadT\nyRROxBPyzpw548UJJzluXGdGEOx0CRKqzGAwUDdu3FArKyvG5bv39vZOjWbGuZ5ut6uazaZaW1vz\njRLpzM7OKqIXsdCmLCJZAuFsCQQnKBvcwPNyrRyOUa/XPa9as9k0Th6UXoSguLP19fVTHmfuzW9t\nbfn2NzU68rPhcOgNQdZqNa9RyFs4470tBybvlWnUIwxb50WYcJDH4Hq9s7MTuEplWIOPujVZFBk7\nK0fErl+/nugYBwcHqtPp+EZL0lyLdMzYvqMsdL/ne77HJ7rlNXIqVZuyyZBCObfH1F4cHR155V1a\nWso1ZBbCGYCKIoe/2MixYWGRyr10U7ymPswtj7W9ve0bymYRzZOqarWaryxRHmc9vlROSszCoAUd\nB6FJ5SXusw8aOpZiOUqMy315BIVHIaSYZ294WJiGnIALAQ3SMBr5M3TERV9imycebm1tqVqtpt57\n7z0v3aJN6BGXh+Obs6jbejtgY5P1FKhRNkOOikqPftxOehQQzgBUFBaF3W7XE8wyi4ZcEEWGaujL\nGcs8x/oiKouLi77VpV599VVFROr27dunjhVm0AaDgTeM99JLL2UuMoI8g/AKFk8ez0B2iPhvW8Eb\nFipiamiDyi9DOtA5A0UxHA69UcfLly8H5mDmzaa+ylC9tbW1zDLLxA0lMU1ij3P8pGFhfKyg30ys\ncEaDCSYdruOck7Zer58KwWAjwmmLpIdZCgROc3R0dKTq9bq6deuWZzjlSmoyzozj8fRjyrKZJlwl\nGYqMugdyIYu846WBHTbenrBwIZtj6x5n/n1UnHOcyYlBxzK9Z0HHRHsE8mA0Gp1KC8de5/n5eW8S\novRC65NhTUh7LduWsqML7TTvXZgNmVjhPCmT+QCIgofp5JCzXqelF1qKFJOBJCLfak6u66rd3V0v\nJrRer/uWZj137twp4aO/f/I8ruuGXk+c91KeZ1wTDYEdNt4ePVwoSJzGtdG2v4vaz9ZDFtXeIGQI\n5ME777zjs98vv/yyOjo68pwkbM954hx7nMPqcr/fVwsLC2ppaUmdP3/eN5pTdt
iOzM7OqpmZGfX4\n8ePEx5oaj3NQDFwQMGZgEgjyjkmPGIdj8IpOMmxDD9fQQzv4fZLHGA6HXh7Se/fuBU6c0sNCZHhH\n0LsZ572M+86D8RHHsxuU59kmzVwaosS9bV20EeComyAJuu2VmLJYsD1n0czZhHiJ76j6zOsDyHA9\n9mDnna0iLcPh0Fd+x3FyKe9ECee4QhjGDEwaJhGtiw85QcN1Xc+wShGte9hM3lwZWxokbExC3mYC\nV5r3Eu/1ZCBjiPX6bHq2ScRrUJ1MGkICQJbo9rrdbvu+W15ePjViKHP3b2xseBPmWEhvbGxEepzD\nYqTTZuvIk6Ojo1PLdXe73czPM1HCWTeMaEDBNGNKOs/iQ1+qm0XDcDj0hvnkRCtT/LCeG9kkrk1C\nPmqiRpr3VvdSgmKxCXUIet56fQryQOuL/dTrdWOMo8mxEiSQs3TCoB0CcdE7dJzViEWgLqjX1tZ8\ny8a3223f3BcZcmcz0e7tt9/2eZv5b05zWlbbKkNLeNvZ2cn8PBMlnHV04wcDBqYJU+MvxTSn5OI0\nRdIYb2xseEabDfLW1pZvuDAojjkoFs72/UsTQsVlaDab1jO3QbaYxGqQpyqso2NatCZsREWGGrXb\nbZ/wkKkW5fFMIyIy7MfW4xxWZxESCOLC9ZZTHQZ17Nrtts+p0RNLxfd6PV86tzijJ2zzt7e3fe9M\n3CwX48aUUWRjYyPz80y0cNYbahgwME1EhU6Yhr6lYeTfs+eC4+OCcmpmNeKTpoOLFGHFo4tR2Zjr\nzyMo/EL3qJk6Y6Y4eva2sfdZFw/yeBw72mw2leu6nliQ57KdcAqPM0iLtL3SjgUtUKXbaVM4U9SI\ni1Lm+snvT7PZNK7KWdZwDdMS4DMzM6mPq9+jiRbOURcPwDQhh81ZZPAwnskrzV4F/n+/34/lbci7\no6pfjy6i8K4XQ1gsscnjHDYpj9Mf8qQkm2eqh4dIgS29cdIbLjcZGhI3xSHaGJCE0Wjky2gkO5um\nEZmgkZIgQc3nMP1fjsrw+xJ0/sFg4DlQyppl46233jr1Tqd9H/W2bKqEcxWBIQZZYfI2m7wH+mpN\nQSSZjJUl+jXYxE+D8RFHLAc16hxT2Wq1vOe9srLiW10tzsiiLix6vZ43osK5zHlOgC4obMCoJkiC\ntGVc5+UcE90ZEORJDhpR1B0Jci6LvoAWC2aOiZZzZOSoTR6T7rJgNBqpa9euGTvDaY45kR7nSRWY\nMMQgK/QYzqBJW7ZxbKa6Oc730ORxlkP0oFiC5pjIDBb8mXxusg7pOco51ZSMWww6jy4wguKpgzxs\nSTphk9oOgfwYjV4sb80T8eQkPpNA1uulTX3X6z3X9Xq97r2P/H/+m0OX5HvV7Xa9HP5lt7Hz8/OZ\nCGc514LbmYkRzpMqMGGIQZ6k8RqbvIRFZ7SwjUkF+RPkCZYNP39mCqHQh6GVUt4KmIPB4FSMs6mO\nmjxw7Xb7VIcxzDsHQJ5Ib/P6+rpvPoCpXsqOZxRB7winqOMQKNlxlN5nKcL5ez3Hf1np9/vq5Zdf\nVjdv3lStVivxhEY56Vh63NUkCOe0AhMCFUwjUR3OqAklpokjJmGS5P2KE9PKZamCQZ9WwkI0goah\neZ+eIb+4TSfJ5I0zCY8s7T/aEhCH0WjkhQvxvzLeWHdGxLWLpv3kSI6+v6njKMU929gqOChkuWUO\n7DiYJkpOjHBOy6R6rAEII6nHWXpC+L3hEA82NPJdSvJ+2f7GVBZQXsLChPh7k6eaN36+MvNAkg7Z\nYDDIJbWWrI+mjgAAOmGiNWkKuDD7GSZ8gzq4cvVCk/AuI6PRSN26dUsRkXrnnXcSvYem+wHhHHJz\nynxcAPLGJkzD5HE2xYZm6XE2eSSjhuxBeZBC2NSom2KWTR5nPYbT5rnLumM7CTYuupcwrVMGbchk\nENVhDCIsdEkeO47NNYnxoBEZ/XPpHInbac
0aGbrFZdXvMWcBOXfuXGaOFQjnnIEnG1SFIE+fbSyd\nKQVYHka1KsOEwExcj3PUseKMNMg6nediDlmGD6ENqSZ6PTd1GG3qutwnqC5EhdQFjRjKcgRNipUh\neKbFVYqsl6urq949/dKXvuSb6MjtQ7/fV3Nzc7E62FFMvHAuurde9PkBsMVkTGV8aJgA0MVsno09\nn0tOXgGTQZzRgyQhF+O2x0neg6ARFdTzaqB7aNkummxo3PoRx+McdHzTCA7vt7GxcWq5eu5g6qK6\nDPVyYWHBF8bV6XS8v/XJjVmWdeKFM3rrMLzAjrDhPtkImN4lPeYtzzoXNKQIqkfQKAfnmA3zEMkY\n5zwn+6UhSTnQZlUbGbJmWkBHd0jkaSdNzo6g+Sn8znG9swkRKZp+v+8TzrOzs6pWqynXdXNtHyZe\nOJf1gY8TGGKQBDl812q1VKPR8GYW6/tFrbCWx3uId7u6hMVTcp3jtFhBdUofeZBe6nGF86BeA4kU\nqxymo8fmj6M9Dgthkh5nmVpO76hWpR4eHR35xDPRi0Vk8iz/xAtnkE36GjBZ2Dxr3funx44xpuwG\naeKl45QddbaaSK9c0ChH1GI2YXVMF8551RM4JQCji1Xds8tpOqNWAswCeW7TMfWydbvdTOL9i7LH\n/X5fLS0tebHMtVot1/NBOJeccVZENALTQ5hwYWSMm0wCHySc5edh8dJRRMWuymOjzlaTuJOibI4V\nJETCvG/juA4w+cg6pqchlDHCQaEQaexYWIhdkCjX35mswhqKtsd6lo28gHAuOeOsiGgEpoe4ccKj\nkT+Pp+lYQYY7LqY416Dzoc5OHmHZLkzPO8pGhnnfsqw/qIvTC9cxffEnKVD1eOOs7Jip/svj2WiI\nsPPH+W5a3gEI55IzLRURFEOc+pVkskjQBBWbMuW1QAUoN2H5lU2jG1F1Mez7LB0TRXvbQHEEOSJs\nYu3TtvFRnUmTQI+T8z7sGqJE+6QC4QxiMQ0vBTAjPYGmGNKwGdxJwzbAZGOqC2Ee56yESBIBEXUd\nsv6jjk8n+nO3qa95dLiCvNl6bLONTQ67hjiifZKAcAaxgFdlegmbfCUFcrPZ9ATEcDhUzWZTtVot\nTwihDgEmbiy8TTynFAVBx8u6DurHQx2fXuKGYMSN5Y8rQrkucsdOTlDkY4V5jk2dS5u4aRnbPWki\nGsIZxGKSKj+IR1iDMBqNfBMI9Ukv0njGDd0Ak4tej+IKaaU+8YjV63UvEwd7rIOW5M7ajk1rrCf4\nhKBwjaxI2hmzKVfcuQOm74I6jxz3nWde5XED4QwAyAQWxTJvqS6Ug4wxhAZQKlkaQ26QuZHWh6V5\nkmlYBhkA0iI9u3HqWV4e5ySduShveZgXWg+x0gX7JL1/EM5gYoD4Gj9pjTmDoW1gwraxN2V80Rv5\nSfJ4gfIRFpsfRha2L8xjbJNy1LajKmOkbW141vMJykChwpmI/g4R/S4RfYOI/hERXRPf/RQRfYuI\nvklEb4ccI8fbUx0gGiG+8iCqXsW950HHQ/0FeYM6BvIkbShFmnoZFqMc1mFMMscgKCd6VBrJqNVl\nq0TRwvkl8fdPENF/e/L395+I6Rki+jeI6J8RkRNwjBxvT3WAaETDmAdZexFQT0EUiBcGVSRq4uq4\nzx31nWm+SZrJjGG2XYZTQThnGKpBRD9JRP+1+PuL4rs+EX0m4Hd53ZtKgcYF5EFQY5B0JTbUUxBF\n0CQj20lNAJQFm3R048D0ntis+BrnmGHvIp+r3W5PxLuaVjjPUEocx/kviOhvENH/Q0T/9snHq0Q0\nELt9++QzEMDy8jJ1u92iiwEmDFO9Ojw8pKdPn9Lu7i7t7++nPh4AEq5TQf9KDg8P6cGDB0REqFcA\nBBD1nhwfH9Ph4SF97nOfo93dXbp79+6p7/b392l5eTnwmCbbzr/90R/9UZqbmzt1jKklSlkT0VeJ\n6J+K7f
dO/v1r2n5fJKKHJ3//PSLaE9/9fSK6F3D8nPsWAAAJvHwga5LWKdRFUGbyTnuY5ncyVIM9\nwqaRROmZts26YZpsmHTyZBmhvD3OSqm3LDX4LxLRrxLRQ3rhYf4e8d31k8+MPHz40Pv7zp07dOfO\nHctTAgDiAq8xyBr2Xj179oy+8pWvWHuluC4eHx/To0eP4NECpSJrW5l0hMVUjuXlZZqbm6MHDx5Q\nr9ejg4MDunv3Lt25cydwJJHP//z5c6MHWS+fHC36/Oc/T0+fPiUiol/91V+1v+gS8OzZM3r27Fl2\nB0yjuomoLv7+CSL6ZeWfHHieiGqEyYEAADCxpImbV6o8saQA5EkWHuwor7HN/kGZOsImIppSRlYV\nKjjG+Wccx/k+IvpLIvoXRPQ3T5TwHziO88tE9AdE9GdE9OMnhQUAADBhLC8v01e+8hX68MMP6fnz\n53R8fAzPMZgogmKF4xDmwbY9vu4VjvJc6+fkER72ONuU7/DwkB49ekQHBwf0+uuvh55vGkglnJVS\n/37Idz9NRD+d5vgAgHCyMOYAZIEcOp6bm4s1FP3+++8bG3IAykKcMIskdlk/ftAxwibb2hI3BCWL\nc04SqbNqAADGg8mQIisBKBNJG1jE3YOyE6duJ7HL+vGDjhHnXcnCsQLnjIE0cR5ZbIQYZwCsCFtd\nClkJQNlA3QSTQty6nHUsc9Jjpl0NkTNpJDlGmaGyLICSuAAQzgBYgRXZQJXAKpNgUihDXU4ygTZp\nGyEXPKGTNHeT1M6kFc5nivByAwDiw0N0epjG4eFhbufkNGHHx8e5nQNMJvv7+3RwcIC4SFB5ylqX\nTfZZfra8vEz7+/t0eHgYuE8Yb775Jh0cHMRKMTkNIMYZgIqSxYQNU/za8fExffjhh94+H3zwAREh\nhhrEA3HLYFIoQ12WE2g//vhj+sIXvkA3b96kR48e+fIy67HRplhp02eyLZDngmA+jaMKzhLnOI4q\nugwATCuPHj2iBw8e0MHBgWdA+TMiol6vBwMKAAAl4p133qGnT59Su92mt99+m54/f04ffPAB7e7u\n0pe//GV68uSJZ7ODnCOHh4d09+5db18W0+12mz772c/S+++/P7E233EcUko5SX8PjzMAU4zJa72/\nv0/Pnz8nIppo4wkAAFXky1/+svfv66+/TsfHx/Tbv/3b9PTpU7pz505gFg4porvdrs9Jsr+/T8+e\nPaOnT5/S1772tdgpJacJeJwBAAAAACoMh2+wmDahjzBKIU1E9OGHH9J3v/tdmp2dnWinSVqPMyYH\nAjDhYIIfAABMNk+ePKGnT5/SkydPAvcJm+R4eHhIH3zwAc3OztLc3FyeRa08CNUAYMLBIikAADDZ\nhE0Wl55ljmeWcc3Pnj3zwj+eP3+O9iICCGcAJhwslwoAAJNNWOYP6TwhImNcM8dGHx8fexPCgRnE\nOANQAbDsKQAAAFtk5oxf+qVfIqIXk72JyNeWTGLbEnVNyKoBwATDBoDTDRFh+AwAAEA4Mgzj6dOn\ndHBw4InIoKwbk8Dx8TF9/vOfp6dPnxJRPu0lhDMAJYaNX6/XK+XKVQAAAMoHtxV3796lO3fuTE3b\ncXh4SE+fPqXd3d3crhmhGgCUFLmC3ySnBgIAAACywCb0BOnoAJhQOD3Q3NwcRDMAAAAQAYee5Nlm\nIlQDgJKCbBgAAADygEc0p2HBk6xBqAYAAAAAwIQjwxj0FHW8muA0gKwaAAAAAAAgFCmW9/f36i2G\nagAAC8dJREFUaTQa0de//nX6zGc+g5HNGEA4AwAAAABMOHr43+///u/Tb/7mb9K7776LMI0YYHIg\nABXi+PiYHj16RMfHx0UXBQAAQIWQE+c4bdvGxgbdvXu36KJVCghnACoED7UdHh4WXRQAAAAVZX9/\nn3Z3d2k4HNKTJ0+KLk6lwORAAEpGWB7KSVweFQAAwPiZ1vYk7eRACGcA
SsajR4/owYMHUzXLGQAA\nABgHyKoBwISB/M0AAABAOYHHGQAAAAAATAVYchsAAAAAAIAxAOEMAAAAAACABRDOAFQU5HQGAAAA\nxguEMwAVBTmdAQAAgPGCrBoAVBRk3wAAAADGC7JqAAAAAACAqQBZNQCYABCvDAAAAJQfCGcASgDi\nlQEAAIDygxhnAEoA4pUBAACA8oMYZwAAAAAAMBUgxhkAAAAAAIAxAOEMAAAAAACABRDOAAAAAAAA\nWADhDEAFQLo6AAAAoHggnAGoAKZ0dRDTAAAAwHiBcAagAuzv79PBwQHdvXvXE8sQ0wAAAMB4QR5n\nACrA8vIydbtdevToET148ICIiO7evUvPnj2ju3fvevuxmCYi6na7hZQVAAAAmFQgnAGoEHKhlMPD\nQ3r69CnduXPHE8lYSAUAAADIDyyAAkBF4XCN/f19Wl5eLro4AAAAQOlJuwAKhDMAAAAAAJgKsHIg\nAAAAAAAAYyC1cHYc5yccx/mm4zi/5zjOz4jPf8pxnG+dfPd22vMAAAAAAABQJKkmBzqOc4eI/hoR\nvaGU+nPHcZZPPv80Ef0IEX2aiK4T0dccx/kUYjIAAAAAAEBVSetx/o+J6GeUUn9ORKSU4uSxP0xE\nR0qpP1dK/XMi+hYR/UDKcwEAAAAAAFAYaYXz9xFR03Gcf+w4zkeO4/yVk89XieiPxX7fPvkMAAAA\nAACAShIZquE4zleJ6Kr8iIgUEf3nJ79fVEr9oOM420T0D4hoPY+CAgAAAAAAUCSRwlkp9VbQd47j\n/E0i+ocn+/224zh/4TjOEr3wMK+JXa+ffGbk4cOH3t937tyhO3fuRBULAAAAAACAUJ49e0bPnj3L\n7Hip8jg7jvMfEdGqUqrnOM73EdFXlVI3HMf5fiL6BSL6DL0I0fgqERknByKPMwAAAAAAGAdp8zin\nXXL7kIj+O8dxfo+I/jUR/Q0iIqXUHziO88tE9AdE9GdE9ONQxwAAAAAAoMpg5UAAAAAAADAVYOVA\nAAAAAAAAxgCEMwAAAAAAABZAOAMAAAAAAGABhDMAAAAAAAAWQDgDAAAAAABgAYQzAAAAAAAAFkA4\nAwAAAAAAYAGEMwAAAAAAABZAOAMAAAAAAGABhDMAAAAAAAAWQDgDAAAAAABgAYQzAAAAAAAAFkA4\nAwAAAAAAYAGEMwAAAAAAABZAOAMAAAAAAGABhDMAAAAAAAAWQDgDAAAAAABgAYQzAAAAAAAAFkA4\nAwAAAAAAYAGEMwAAAAAAABZAOAMAAAAAAGABhDMAAAAAAAAWQDgDAAAAAABgAYQzAAAAAAAAFkA4\nAwAAAAAAYAGEMwAAAAAAABZAOAMAAAAAAGABhDMAAAAAAAAWQDgDAAAAAABgAYQzAAAAAAAAFkA4\nAwAAAAAAYAGEMwAAAAAAABZAOAMAAAAAAGABhDMAAAAAAAAWQDgDAAAAAABgAYQzAAAAAAAAFkA4\nAwAAAAAAYAGEMwAAAAAAABZAOAMAAAAAAGABhDMAAAAAAAAWQDgDAAAAAABgAYQzAAAAAAAAFkA4\nAwAAAAAAYAGEMwAAAAAAABZAOAMAAAAAAGABhDMAAAAAAAAWQDgDAAAAAABgQSrh7DjOv+k4zm85\njvO7juP8iuM4L4nvfspxnG85jvNNx3HeTl9UAAAAAAAAiiOtx/nvE9EDpdRtIvofiegBEZHjON9P\nRD9CRJ8mon+HiP4bx3GclOcCJeTZs2dFFwGkAM+vuuDZVRs8v+qCZzfdpBXOn1JK/a8nf3+NiP76\nyd93iehIKfXnSql/TkTfIqIfSHkuUEJgQKoNnl91wbOrNnh+1QXPbrpJK5x/33Gcuyd//wgRXT/5\ne5WI/ljs9+2TzwAAAAAAAKgkM1E7OI7zVSK6Kj8iIkVEf4uIOkT09xzH+dtE9ISI/r88CgkAAAAA\nAEDROEqpbA7kOJ8iop9TSv2g4zg/
+f+3d2exdk1xHMe/P7QRYorxwRAiNTSUoryYpxJjYk5QPJgi\njYhQJB4VjTmRCG5LDEHFFKEaIojSxlyURAwVrlYMD0KKn4e1bnt603t7enqdsy+/z1Pv2msnK/l1\n77Oy91r7D9j2jfXYC8D1tt9axXkjM4CIiIiIiNWw3fG+u7WaOEva0vYSSesAfcArtmfWzYEPAftT\nlmi8RFkPnUlyRERERIxKa7vG+UxJi4CPgW9tzwSw/THwWG1/Hrgkk+aIiIiIGM1GbKlGRERERMR/\nWVcrB0o6RdJHkv6SNLGlfT1JMyV9IGlhXSM9cGxibf9M0m3dHG+sMEx2R0haUIvgzJd0aMuxZNdQ\nki6rxYk+lDS9pT2Fi0YBSZMlfVqvrat6PZ4YnqRNJD1er6uFkvaXtJmkOZIWSXpR0ia9HmcUku6T\n1C/pg5a2m2p+70maLWnjlmO5bzbEENlNkPSmpHclvS1p35Zja5xdt0tufwicDLw6qP1UYKztPYF9\ngQslbV+P3Q1cYHscME7S0V0bbbQaKrslwHG1CM4U4MGWY8mugSQdAhwP7GF7D2BGbd+NFC5qvLqn\n5C7gaGA8Zcncrr0dVazG7cDztncDJgCfAlcDc23vArwMTOvh+GJlfZTrq9UcYLztvSi1KaZBCr41\n0Kqyu4nygYq9geuBm6Hz7Lo6cba9yPbnlE/arXQI2FDSusAGwB/Ar5K2ATayPb/2ewA4qWsDjuWG\nys72+7a/r/9eCKwvaUyya7SLgem2/wSwvbS2n0gKF40Gk4DPbX9lexnwKCW7aKD6ZPJA230A9fr6\nhZLZrNptFrk/NkYt7PbToLa5tv+uf85jRd2KFHxrkFVlB/wNDLzR2ZRSWwQ6zK7bT5yH8gTwG/Ad\n8CUww/bPlC9yLG7pt5gUUmksSacA79Qf82TXXOOAgyTNk/SKpH1qewoXjQ6Dc8q11Ww7Aksl9Ul6\nR9I9kjYAtrbdD1AfPmzV01HGmjif8uEDyH1zNLgcmCHpa8rT54G3Ox1lt9oCKGtquIIptp8d4rRJ\nwJ/ANsDmwGuS5o702GJ4HWY3cO544AbgyH9vhNGuYbK8jnLdb1a/ub4f8DiwU/dHGfG/sB4wEbjU\n9gJJt1KWaQzemZ+d+qOApGuBZbYf6fVYom0XA1NtP1Uf8N3PWsxVRnzibLuTwZwFvFBfgyyR9AZl\nrfPrwHYt/bZlxSP2GGEdZoekbYEngbPr6w4oOSW7HhkuS0kXUfLC9vy64XNzSj7bt3RNZs2UnEaX\nxcA3thfUv2dTJs79kra23V+Xtv3QsxFGWyRNAY4FDmtpzm9d851reyqA7Sck3VvbO8qul0s1WtfK\nfk39jyhpQ+AA4JP6+uoXSZPqgu1zgKe7PtIYbHl2dSf4c8BVtucNtCe7RnuKFdfbOMrG3B+BZ4DT\nJY2VtCOwM/B274YZQ5gP7CxpB0ljgTMo2UUD1eUY39RrDeBwYCElsym17Vxyf2wasfJv3WTgSuAE\n23+09HsGOCP3zUZZKTvgW0kHA0g6nLKWGTrMrqvfcZZ0EnAnsAXwM/Ce7WPqZLkP2L12vd/2LfWc\nfYCZwPqUXclTuzbgWG6Y7K6lPD0Z2Dho4CjbS5NdM0kaQ3lVtRdlI+4Vtl+tx6YBFwDLKK+25vRs\noDGk+iN+O+Xhx322p6/mlOghSROAe4ExwBfAecC6lEJh2wFfAafVvT3RY5IeBg6hLB3tp3yJ4Rpg\nLPBj7TbP9iW1f+6bDTFEdouAOyjX3O+Uonzv1v5rnF0KoEREREREtKEpX9WIiIiIiGi0TJwjIiIi\nItqQiXNERERERBsycY6IiIiIaEMmzhERERERbcjEOSIiIiKiDZk4R0RERES0IRPniIiIiIg2/APa\n/dPXsEBivgAAAABJRU5ErkJggg==\n", | |
"text/plain": [ | |
"<matplotlib.figure.Figure at 0x2c5bb710>" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"import matplotlib.pyplot as plt\n", | |
"\n", | |
"plt.figure(figsize=(12, 6))\n", | |
"plt.axis([-180, 180, -90, 90])\n", | |
"plt.xticks(range(-180, 181, 60))\n", | |
"plt.yticks(range(-90, 91, 30))\n", | |
"\n", | |
"plt.scatter(langs['longitude'], langs['latitude'], 1);" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 2", | |
"language": "python", | |
"name": "python2" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 2 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython2", | |
"version": "2.7.11" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 0 | |
} |