@Shivani29sheth
Created June 20, 2020 23:43
Created on Skills Network Labs
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h1>Final Capstone Project - Analyze and Cluster Neighborhoods in British Columbia </h1>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Table of contents\n",
"* [Introduction and Project Proposal](#introduction)\n",
"* [Obtaining and cleaning data](#data)\n",
"* [Methodology](#methodology)\n",
"* [Analyze each neighborhood](#analysis)\n",
"* [Cluster Neighborhoods](#cluster)\n",
"* [Results and Conclusion](#results)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"## Introduction and Project Proposal <a name=\"introduction\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this project, we leverage the Foursquare API to find the most popular places in each neighborhood of British Columbia. The Foursquare API marks a place as “happening” according to the number of people present there, so the result is updated in real time and may change every few minutes. We then cluster the neighborhoods based on the most popular places in their surrounding areas. This gives a clear picture of the vibe of each neighborhood, which helps an individual know what to expect there and choose a suitable location according to their preferences."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"## Obtaining and cleaning data <a name=\"data\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### In this module, we obtain data from various sources and clean it to build the data frame needed for further processing. Two methods of obtaining the required data are explained below."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### First method to acquire the dataset needed:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### In this method, we scrape the required table from the Wikipedia page at the given URL and clean it for better readability. Once we obtain the location data, we pass each address through the Nominatim geocoder of the geopy library to extract its latitude and longitude values. This gives us the complete dataset needed for the methodology."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### We start by importing and installing the necessary libraries.\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Collecting package metadata (current_repodata.json): done\n",
"Solving environment: done\n",
"\n",
"## Package Plan ##\n",
"\n",
" environment location: /home/jupyterlab/conda/envs/python\n",
"\n",
" added / updated specs:\n",
" - lxml\n",
"\n",
"\n",
"The following packages will be downloaded:\n",
"\n",
" package | build\n",
" ---------------------------|-----------------\n",
" libxslt-1.1.33 | h7d1a2b0_0 426 KB\n",
" lxml-3.8.0 | py36_0 3.8 MB conda-forge\n",
" ------------------------------------------------------------\n",
" Total: 4.2 MB\n",
"\n",
"The following NEW packages will be INSTALLED:\n",
"\n",
" libxslt pkgs/main/linux-64::libxslt-1.1.33-h7d1a2b0_0\n",
" lxml conda-forge/linux-64::lxml-3.8.0-py36_0\n",
"\n",
"\n",
"\n",
"Downloading and Extracting Packages\n",
"lxml-3.8.0 | 3.8 MB | ##################################### | 100% \n",
"libxslt-1.1.33 | 426 KB | ##################################### | 100% \n",
"Preparing transaction: done\n",
"Verifying transaction: done\n",
"Executing transaction: done\n"
]
}
],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"import requests\n",
"!conda install -c conda-forge lxml --yes "
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Collecting package metadata (current_repodata.json): done\n",
"Solving environment: done\n",
"\n",
"# All requested packages already installed.\n",
"\n",
"Collecting package metadata (current_repodata.json): done\n",
"Solving environment: done\n",
"\n",
"## Package Plan ##\n",
"\n",
" environment location: /home/jupyterlab/conda/envs/python\n",
"\n",
" added / updated specs:\n",
" - geocoder\n",
"\n",
"\n",
"The following packages will be downloaded:\n",
"\n",
" package | build\n",
" ---------------------------|-----------------\n",
" brotlipy-0.7.0 |py36h8c4c3a4_1000 346 KB conda-forge\n",
" chardet-3.0.4 |py36h9f0ad1d_1006 188 KB conda-forge\n",
" click-7.1.2 | pyh9f0ad1d_0 64 KB conda-forge\n",
" cryptography-2.9.2 | py36h45558ae_0 613 KB conda-forge\n",
" future-0.18.2 | py36h9f0ad1d_1 714 KB conda-forge\n",
" geocoder-1.38.1 | py_1 53 KB conda-forge\n",
" pysocks-1.7.1 | py36h9f0ad1d_1 27 KB conda-forge\n",
" ratelim-0.1.6 | py_2 6 KB conda-forge\n",
" requests-2.24.0 | pyh9f0ad1d_0 47 KB conda-forge\n",
" ------------------------------------------------------------\n",
" Total: 2.0 MB\n",
"\n",
"The following NEW packages will be INSTALLED:\n",
"\n",
" brotlipy conda-forge/linux-64::brotlipy-0.7.0-py36h8c4c3a4_1000\n",
" chardet conda-forge/linux-64::chardet-3.0.4-py36h9f0ad1d_1006\n",
" click conda-forge/noarch::click-7.1.2-pyh9f0ad1d_0\n",
" cryptography conda-forge/linux-64::cryptography-2.9.2-py36h45558ae_0\n",
" decorator conda-forge/noarch::decorator-4.4.2-py_0\n",
" future conda-forge/linux-64::future-0.18.2-py36h9f0ad1d_1\n",
" geocoder conda-forge/noarch::geocoder-1.38.1-py_1\n",
" idna conda-forge/noarch::idna-2.9-py_1\n",
" pyopenssl conda-forge/noarch::pyopenssl-19.1.0-py_1\n",
" pysocks conda-forge/linux-64::pysocks-1.7.1-py36h9f0ad1d_1\n",
" ratelim conda-forge/noarch::ratelim-0.1.6-py_2\n",
" requests conda-forge/noarch::requests-2.24.0-pyh9f0ad1d_0\n",
" urllib3 conda-forge/noarch::urllib3-1.25.9-py_0\n",
"\n",
"\n",
"\n",
"Downloading and Extracting Packages\n",
"future-0.18.2 | 714 KB | ##################################### | 100% \n",
"chardet-3.0.4 | 188 KB | ##################################### | 100% \n",
"cryptography-2.9.2 | 613 KB | ##################################### | 100% \n",
"brotlipy-0.7.0 | 346 KB | ##################################### | 100% \n",
"pysocks-1.7.1 | 27 KB | ##################################### | 100% \n",
"geocoder-1.38.1 | 53 KB | ##################################### | 100% \n",
"requests-2.24.0 | 47 KB | ##################################### | 100% \n",
"ratelim-0.1.6 | 6 KB | ##################################### | 100% \n",
"click-7.1.2 | 64 KB | ##################################### | 100% \n",
"Preparing transaction: done\n",
"Verifying transaction: done\n",
"Executing transaction: done\n"
]
}
],
"source": [
"!conda install -c conda-forge html5lib --yes \n",
"!conda install -c conda-forge geocoder --yes "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Next, we scrape the table on this Wikipedia page using the read_html method of the pandas library. Each cell of this table contains a postal code along with the neighborhood for that location.\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>V1AKimberley</th>\n",
" <th>V2APenticton</th>\n",
" <th>V3ALangley Township(Langley City)</th>\n",
" <th>V4ASurreySouthwest</th>\n",
" <th>V5ABurnaby(Government Road / Lake City / SFU / Burnaby Mountain)</th>\n",
" <th>V6AVancouver(Strathcona / Chinatown / Downtown Eastside)</th>\n",
" <th>V7ARichmondSouth</th>\n",
" <th>V8APowell River</th>\n",
" <th>V9AVictoria(Vic West / Esquimalt)Canadian Forces(MARPAC)</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>V1BVernonEast</td>\n",
" <td>V2BKamloopsNorthwest</td>\n",
" <td>V3BPort CoquitlamCentral</td>\n",
" <td>V4BWhite Rock</td>\n",
" <td>V5BBurnaby(Parkcrest-Aubrey / Ardingley-Sprott)</td>\n",
" <td>V6BVancouver(NE Downtown / Gastown / Harbour C...</td>\n",
" <td>V7BRichmond(Sea Island / YVR)</td>\n",
" <td>V8BSquamish</td>\n",
" <td>V9BVictoria(West Highlands / North Langford / ...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>V1CCranbrook</td>\n",
" <td>V2CKamloopsCentral and Southeast</td>\n",
" <td>V3CPort CoquitlamSouth</td>\n",
" <td>V4CDeltaNortheast</td>\n",
" <td>V5CBurnaby(Burnaby Heights / Willingdon Height...</td>\n",
" <td>V6CVancouver(Waterfront / Coal Harbour / Canad...</td>\n",
" <td>V7CRichmondNorthwest</td>\n",
" <td>V8CKitimat</td>\n",
" <td>V9CVictoria(Colwood / South Langford / Metchosin)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>V1ESalmon Arm</td>\n",
" <td>V2EKamloopsSouth and West</td>\n",
" <td>V3ECoquitlamNorth</td>\n",
" <td>V4EDeltaEast</td>\n",
" <td>V5EBurnaby(Lakeview-Mayfield / Richmond Park /...</td>\n",
" <td>V6EVancouver(SE West End / Davie Village)</td>\n",
" <td>V7ERichmondSouthwest</td>\n",
" <td>V8EWhistler</td>\n",
" <td>V9EVictoria(East Highlands / NW Saanich)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>V1GDawson Creek</td>\n",
" <td>V2GWilliams Lake</td>\n",
" <td>V3GAbbotsfordEast</td>\n",
" <td>V4GDeltaEast Central</td>\n",
" <td>V5GBurnaby(Cascade-Schou / Douglas-Gilpin)</td>\n",
" <td>V6GVancouver(NW West End / Stanley Park)</td>\n",
" <td>V7GNorth Vancouver (district municipality)Oute...</td>\n",
" <td>V8GTerrace</td>\n",
" <td>V9GLadysmith</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>V1HVernonWest</td>\n",
" <td>V2HKamloopsNorth</td>\n",
" <td>V3HPort Moody</td>\n",
" <td>V4HNot assigned</td>\n",
" <td>V5HBurnaby(Maywood / Marlborough / Oakalla / W...</td>\n",
" <td>V6HVancouver(West Fairview / Granville Island ...</td>\n",
" <td>V7HNorth Vancouver (district municipality)Inne...</td>\n",
" <td>V8HNot assigned</td>\n",
" <td>V9HCampbell RiverOutskirts</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" V1AKimberley V2APenticton \\\n",
"0 V1BVernonEast V2BKamloopsNorthwest \n",
"1 V1CCranbrook V2CKamloopsCentral and Southeast \n",
"2 V1ESalmon Arm V2EKamloopsSouth and West \n",
"3 V1GDawson Creek V2GWilliams Lake \n",
"4 V1HVernonWest V2HKamloopsNorth \n",
"\n",
" V3ALangley Township(Langley City) V4ASurreySouthwest \\\n",
"0 V3BPort CoquitlamCentral V4BWhite Rock \n",
"1 V3CPort CoquitlamSouth V4CDeltaNortheast \n",
"2 V3ECoquitlamNorth V4EDeltaEast \n",
"3 V3GAbbotsfordEast V4GDeltaEast Central \n",
"4 V3HPort Moody V4HNot assigned \n",
"\n",
" V5ABurnaby(Government Road / Lake City / SFU / Burnaby Mountain) \\\n",
"0 V5BBurnaby(Parkcrest-Aubrey / Ardingley-Sprott) \n",
"1 V5CBurnaby(Burnaby Heights / Willingdon Height... \n",
"2 V5EBurnaby(Lakeview-Mayfield / Richmond Park /... \n",
"3 V5GBurnaby(Cascade-Schou / Douglas-Gilpin) \n",
"4 V5HBurnaby(Maywood / Marlborough / Oakalla / W... \n",
"\n",
" V6AVancouver(Strathcona / Chinatown / Downtown Eastside) \\\n",
"0 V6BVancouver(NE Downtown / Gastown / Harbour C... \n",
"1 V6CVancouver(Waterfront / Coal Harbour / Canad... \n",
"2 V6EVancouver(SE West End / Davie Village) \n",
"3 V6GVancouver(NW West End / Stanley Park) \n",
"4 V6HVancouver(West Fairview / Granville Island ... \n",
"\n",
" V7ARichmondSouth V8APowell River \\\n",
"0 V7BRichmond(Sea Island / YVR) V8BSquamish \n",
"1 V7CRichmondNorthwest V8CKitimat \n",
"2 V7ERichmondSouthwest V8EWhistler \n",
"3 V7GNorth Vancouver (district municipality)Oute... V8GTerrace \n",
"4 V7HNorth Vancouver (district municipality)Inne... V8HNot assigned \n",
"\n",
" V9AVictoria(Vic West / Esquimalt)Canadian Forces(MARPAC) \n",
"0 V9BVictoria(West Highlands / North Langford / ... \n",
"1 V9CVictoria(Colwood / South Langford / Metchosin) \n",
"2 V9EVictoria(East Highlands / NW Saanich) \n",
"3 V9GLadysmith \n",
"4 V9HCampbell RiverOutskirts "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"url = \"https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_V\"\n",
"df = pd.read_html(url, header=0)\n",
"df= df[0]\n",
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### As we see, the table scraped isn't very readable. Hence, we convert the table to a list and iterate through each value to separate the postal codes and neighborhood values, and store them in a data frame.\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"list_var = df.values.tolist()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"postal_code = []\n",
"neighborhood = []\n",
"for i in list_var:\n",
"    for j in i:\n",
"        # The first three characters of each cell are the postal code\n",
"        postal_code.append(j[0:3])\n",
"        j = j[3:]\n",
"        # Keep only the text before any parenthesized sub-area list\n",
"        neigh = j.split(\"(\")[0]\n",
"        neighborhood.append(neigh)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Postal Codes</th>\n",
" <th>Neighborhood</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>V1B</td>\n",
" <td>VernonEast</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>V2B</td>\n",
" <td>KamloopsNorthwest</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>V3B</td>\n",
" <td>Port CoquitlamCentral</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>V4B</td>\n",
" <td>White Rock</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>V5B</td>\n",
" <td>Burnaby</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Postal Codes Neighborhood\n",
"0 V1B VernonEast\n",
"1 V2B KamloopsNorthwest\n",
"2 V3B Port CoquitlamCentral\n",
"3 V4B White Rock\n",
"4 V5B Burnaby"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.DataFrame({'Postal Codes':postal_code,'Neighborhood':neighborhood})\n",
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Int64Index([39, 43, 48, 92, 156, 168], dtype='int64')\n",
"(165, 3)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>index</th>\n",
" <th>Postal Codes</th>\n",
" <th>Neighborhood</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>0</td>\n",
" <td>V1B</td>\n",
" <td>VernonEast</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1</td>\n",
" <td>V2B</td>\n",
" <td>KamloopsNorthwest</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2</td>\n",
" <td>V3B</td>\n",
" <td>Port CoquitlamCentral</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>3</td>\n",
" <td>V4B</td>\n",
" <td>White Rock</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>4</td>\n",
" <td>V5B</td>\n",
" <td>Burnaby</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" index Postal Codes Neighborhood\n",
"0 0 V1B VernonEast\n",
"1 1 V2B KamloopsNorthwest\n",
"2 2 V3B Port CoquitlamCentral\n",
"3 3 V4B White Rock\n",
"4 4 V5B Burnaby"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"remove = df[df['Neighborhood']=='Not assigned'].index\n",
"print(remove)\n",
"df.drop(remove,axis=0, inplace=True)\n",
"df.reset_index(inplace=True)\n",
"print(df.shape)\n",
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(165, 2)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Postal Codes</th>\n",
" <th>Neighborhood</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>V1B</td>\n",
" <td>VernonEast</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>V2B</td>\n",
" <td>KamloopsNorthwest</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>V3B</td>\n",
" <td>Port CoquitlamCentral</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>V4B</td>\n",
" <td>White Rock</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>V5B</td>\n",
" <td>Burnaby</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Postal Codes Neighborhood\n",
"0 V1B VernonEast\n",
"1 V2B KamloopsNorthwest\n",
"2 V3B Port CoquitlamCentral\n",
"3 V4B White Rock\n",
"4 V5B Burnaby"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.drop(['index'], axis=1, inplace=True)\n",
"print(df.shape)\n",
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Now that we have the address for each location, we install the geopy library, which provides the geocoder used in the next step.\n"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Collecting package metadata (current_repodata.json): done\n",
"Solving environment: done\n",
"\n",
"## Package Plan ##\n",
"\n",
" environment location: /home/jupyterlab/conda/envs/python\n",
"\n",
" added / updated specs:\n",
" - geopy\n",
"\n",
"\n",
"The following packages will be downloaded:\n",
"\n",
" package | build\n",
" ---------------------------|-----------------\n",
" geographiclib-1.50 | py_0 34 KB conda-forge\n",
" geopy-1.22.0 | pyh9f0ad1d_0 63 KB conda-forge\n",
" ------------------------------------------------------------\n",
" Total: 97 KB\n",
"\n",
"The following NEW packages will be INSTALLED:\n",
"\n",
" geographiclib conda-forge/noarch::geographiclib-1.50-py_0\n",
" geopy conda-forge/noarch::geopy-1.22.0-pyh9f0ad1d_0\n",
"\n",
"\n",
"\n",
"Downloading and Extracting Packages\n",
"geopy-1.22.0 | 63 KB | ##################################### | 100% \n",
"geographiclib-1.50 | 34 KB | ##################################### | 100% \n",
"Preparing transaction: done\n",
"Verifying transaction: done\n",
"Executing transaction: done\n"
]
}
],
"source": [
"!conda install -c conda-forge geopy --yes "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Using geopy's Nominatim geocoder, we pass the location of each neighborhood and obtain its latitude and longitude values.\n"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"from geopy.geocoders import Nominatim\n",
"from geopy.extra.rate_limiter import RateLimiter\n",
"\n",
"geolocator = Nominatim(user_agent=\"My_App\")\n",
"# Wait at least one second between requests to respect Nominatim's usage policy\n",
"obj = RateLimiter(geolocator.geocode, min_delay_seconds=1)\n",
"lat=[]\n",
"long=[]\n",
"for i in neighborhood:\n",
"    location = obj(\"'\"+i+\"'', Canada\")\n",
"    # geocode returns None when no match is found\n",
"    if location is None:\n",
"        lat.append(\"None\")\n",
"        long.append(\"None\")\n",
"    else:\n",
"        lat.append(str(location.latitude))\n",
"        long.append(str(location.longitude))"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"79\n",
"79\n"
]
}
],
"source": [
"print(lat.count('None'))\n",
"print(long.count('None'))"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"dict_keys(['None', '49.0235357', '49.2433804', '49.2608724', '49.163168', '49.6980743', '48.4283182', '49.5107477', '54.0535577', '50.7005059', '50.1171903', '55.7605306', '52.129081', '49.3207133', '54.5172715', '48.9936579', '49.2822243', '56.2524039', '52.9794279', '54.3126572', '50.111704', '48.8296672', '49.3479861', '49.494891', '48.6505788', '48.7786872', '48.5946782', '49.6727575', '49.316171', '49.3179514', '49.099049', '49.857464', '49.1637594', '52.966077', '49.2207623', '49.2343668', '48.3825724'])"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from collections import Counter\n",
"Counter(lat).keys()"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"dict_keys(['None', '-122.7979246', '-122.9725459', '-123.1139529', '-123.137414', '-123.1558612', '-123.3649533', '-115.7672772', '-128.6540519', '-119.2790529', '-122.9543022', '-120.2364453', '-122.1397346', '-123.0737831', '-128.5995482', '-123.8157964', '-122.8293424', '-120.846943', '-122.4936273', '-130.32549', '-120.7884227', '-123.51516139891447', '-124.4439409', '-117.290039', '-123.3983246', '-123.7080446', '-123.4207265', '-124.9276204', '-117.663574', '-124.3117397', '-117.713013', '-119.580688', '-123.9379719', '-114.4216167', '-122.6901534', '-124.8056517', '-123.7315177'])"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Counter(long).keys()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Hence, we extract the coordinate values and can now store them in the data frame to obtain the required dataset. Note that 79 of the lookups returned no coordinates. This completes our first method of obtaining the required dataset.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Second method to acquire the dataset needed:\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### In this method, we scrape a table containing the postal code, the neighborhood name, and the coordinate values from the website URL given below. Next, we separately extract the coordinate values and the address values via a temporary table, clean them individually, and merge them back into a single data frame to obtain our required dataset.\n"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"url = \"https://www.geonames.org/postal-codes/CA/BC/british-columbia.html\"\n",
"df = pd.read_html(url, header=0)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(385, 7)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Unnamed: 0</th>\n",
" <th>Place</th>\n",
" <th>Code</th>\n",
" <th>Country</th>\n",
" <th>Admin1</th>\n",
" <th>Admin2</th>\n",
" <th>Admin3</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1.0</td>\n",
" <td>Port Moody</td>\n",
" <td>V3H</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>NaN</td>\n",
" <td>49.323/-122.863</td>\n",
" <td>49.323/-122.863</td>\n",
" <td>49.323/-122.863</td>\n",
" <td>49.323/-122.863</td>\n",
" <td>49.323/-122.863</td>\n",
" <td>49.323/-122.863</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2.0</td>\n",
" <td>Pitt Meadows</td>\n",
" <td>V3Y</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>NaN</td>\n",
" <td>49.221/-122.69</td>\n",
" <td>49.221/-122.69</td>\n",
" <td>49.221/-122.69</td>\n",
" <td>49.221/-122.69</td>\n",
" <td>49.221/-122.69</td>\n",
" <td>49.221/-122.69</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>3.0</td>\n",
" <td>White Rock</td>\n",
" <td>V4B</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Unnamed: 0 Place Code Country \\\n",
"0 1.0 Port Moody V3H Canada \n",
"1 NaN 49.323/-122.863 49.323/-122.863 49.323/-122.863 \n",
"2 2.0 Pitt Meadows V3Y Canada \n",
"3 NaN 49.221/-122.69 49.221/-122.69 49.221/-122.69 \n",
"4 3.0 White Rock V4B Canada \n",
"\n",
" Admin1 Admin2 Admin3 \n",
"0 British Columbia NaN NaN \n",
"1 49.323/-122.863 49.323/-122.863 49.323/-122.863 \n",
"2 British Columbia NaN NaN \n",
"3 49.221/-122.69 49.221/-122.69 49.221/-122.69 \n",
"4 British Columbia NaN NaN "
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"print(df[2].shape)\n",
"df[2].head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### As we see, the table scraped from this website is much more readable but also contains garbage values. Hence, we first clean the table by dropping unnecessary columns and indexes.\n"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"df1 = df[2]"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(385, 7)\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/jupyterlab/conda/envs/python/lib/python3.6/site-packages/pandas/core/ops/array_ops.py:253: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison\n",
" res_values = method(rvalues)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Unnamed: 0</th>\n",
" <th>Place</th>\n",
" <th>Code</th>\n",
" <th>Country</th>\n",
" <th>Admin1</th>\n",
" <th>Admin2</th>\n",
" <th>Admin3</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1.0</td>\n",
" <td>Port Moody</td>\n",
" <td>V3H</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>NaN</td>\n",
" <td>49.323/-122.863</td>\n",
" <td>49.323/-122.863</td>\n",
" <td>49.323/-122.863</td>\n",
" <td>49.323/-122.863</td>\n",
" <td>49.323/-122.863</td>\n",
" <td>49.323/-122.863</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2.0</td>\n",
" <td>Pitt Meadows</td>\n",
" <td>V3Y</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>NaN</td>\n",
" <td>49.221/-122.69</td>\n",
" <td>49.221/-122.69</td>\n",
" <td>49.221/-122.69</td>\n",
" <td>49.221/-122.69</td>\n",
" <td>49.221/-122.69</td>\n",
" <td>49.221/-122.69</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>3.0</td>\n",
" <td>White Rock</td>\n",
" <td>V4B</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Unnamed: 0 Place Code Country \\\n",
"0 1.0 Port Moody V3H Canada \n",
"1 NaN 49.323/-122.863 49.323/-122.863 49.323/-122.863 \n",
"2 2.0 Pitt Meadows V3Y Canada \n",
"3 NaN 49.221/-122.69 49.221/-122.69 49.221/-122.69 \n",
"4 3.0 White Rock V4B Canada \n",
"\n",
" Admin1 Admin2 Admin3 \n",
"0 British Columbia NaN NaN \n",
"1 49.323/-122.863 49.323/-122.863 49.323/-122.863 \n",
"2 British Columbia NaN NaN \n",
"3 49.221/-122.69 49.221/-122.69 49.221/-122.69 \n",
"4 British Columbia NaN NaN "
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
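],
"source": [
"# NaN is a float, so comparing with the string 'NaN' matches no rows;\n",
"# the coordinate rows are therefore kept and separated in the next steps.\n",
"remove = df1[df1['Unnamed: 0']=='NaN'].index\n",
"df1.drop(remove,axis=0, inplace=True)\n",
"print(df1.shape)\n",
"df1.head()"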
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>index</th>\n",
" <th>Place</th>\n",
" <th>Code</th>\n",
" <th>Country</th>\n",
" <th>Admin1</th>\n",
" <th>Admin2</th>\n",
" <th>Admin3</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1.0</td>\n",
" <td>Port Moody</td>\n",
" <td>V3H</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>NaN</td>\n",
" <td>49.323/-122.863</td>\n",
" <td>49.323/-122.863</td>\n",
" <td>49.323/-122.863</td>\n",
" <td>49.323/-122.863</td>\n",
" <td>49.323/-122.863</td>\n",
" <td>49.323/-122.863</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2.0</td>\n",
" <td>Pitt Meadows</td>\n",
" <td>V3Y</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>NaN</td>\n",
" <td>49.221/-122.69</td>\n",
" <td>49.221/-122.69</td>\n",
" <td>49.221/-122.69</td>\n",
" <td>49.221/-122.69</td>\n",
" <td>49.221/-122.69</td>\n",
" <td>49.221/-122.69</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>3.0</td>\n",
" <td>White Rock</td>\n",
" <td>V4B</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" index Place Code Country Admin1 \\\n",
"0 1.0 Port Moody V3H Canada British Columbia \n",
"1 NaN 49.323/-122.863 49.323/-122.863 49.323/-122.863 49.323/-122.863 \n",
"2 2.0 Pitt Meadows V3Y Canada British Columbia \n",
"3 NaN 49.221/-122.69 49.221/-122.69 49.221/-122.69 49.221/-122.69 \n",
"4 3.0 White Rock V4B Canada British Columbia \n",
"\n",
" Admin2 Admin3 \n",
"0 NaN NaN \n",
"1 49.323/-122.863 49.323/-122.863 \n",
"2 NaN NaN \n",
"3 49.221/-122.69 49.221/-122.69 \n",
"4 NaN NaN "
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df1.rename(columns = {'Unnamed: 0':'index'}, inplace = True)\n",
"df1.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Now, we see that each row is followed by a row containing its coordinates. Hence, we extract the alternate rows to store the address data of each neighborhood in a temporary data frame, and then remove the null values.\n"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>index</th>\n",
" <th>Place</th>\n",
" <th>Code</th>\n",
" <th>Country</th>\n",
" <th>Admin1</th>\n",
" <th>Admin2</th>\n",
" <th>Admin3</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1.0</td>\n",
" <td>Port Moody</td>\n",
" <td>V3H</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2.0</td>\n",
" <td>Pitt Meadows</td>\n",
" <td>V3Y</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>3.0</td>\n",
" <td>White Rock</td>\n",
" <td>V4B</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>4.0</td>\n",
" <td>Penticton</td>\n",
" <td>V2A</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>5.0</td>\n",
" <td>Westbank</td>\n",
" <td>V4T</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" index Place Code Country Admin1 Admin2 Admin3\n",
"0 1.0 Port Moody V3H Canada British Columbia NaN NaN\n",
"2 2.0 Pitt Meadows V3Y Canada British Columbia NaN NaN\n",
"4 3.0 White Rock V4B Canada British Columbia NaN NaN\n",
"6 4.0 Penticton V2A Canada British Columbia NaN NaN\n",
"8 5.0 Westbank V4T Canada British Columbia NaN NaN"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df2 = df1.iloc[::2]\n",
"df2.head()"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(193, 5)\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/jupyterlab/conda/envs/python/lib/python3.6/site-packages/pandas/core/frame.py:3997: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" errors=errors,\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>index</th>\n",
" <th>Place</th>\n",
" <th>Code</th>\n",
" <th>Country</th>\n",
" <th>Admin1</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>376</th>\n",
" <td>189.0</td>\n",
" <td>Vancouver (NE Downtown / Harbour Centre / Gast...</td>\n",
" <td>V6B</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" <tr>\n",
" <th>378</th>\n",
" <td>190.0</td>\n",
" <td>Richmond South</td>\n",
" <td>V7A</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" <tr>\n",
" <th>380</th>\n",
" <td>191.0</td>\n",
" <td>Duncan</td>\n",
" <td>V9L</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" <tr>\n",
" <th>382</th>\n",
" <td>192.0</td>\n",
" <td>Parksville</td>\n",
" <td>V9P</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" <tr>\n",
" <th>384</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" index Place Code Country \\\n",
"376 189.0 Vancouver (NE Downtown / Harbour Centre / Gast... V6B Canada \n",
"378 190.0 Richmond South V7A Canada \n",
"380 191.0 Duncan V9L Canada \n",
"382 192.0 Parksville V9P Canada \n",
"384 NaN NaN NaN NaN \n",
"\n",
" Admin1 \n",
"376 British Columbia \n",
"378 British Columbia \n",
"380 British Columbia \n",
"382 British Columbia \n",
"384 NaN "
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df2.drop(['Admin2', 'Admin3'], axis=1, inplace=True)\n",
"print(df2.shape)\n",
"df2.tail()"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(192, 5)\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/jupyterlab/conda/envs/python/lib/python3.6/site-packages/ipykernel_launcher.py:1: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" \"\"\"Entry point for launching an IPython kernel.\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>index</th>\n",
" <th>Place</th>\n",
" <th>Code</th>\n",
" <th>Country</th>\n",
" <th>Admin1</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>374</th>\n",
" <td>188.0</td>\n",
" <td>Vancouver (Strathcona / Chinatown / Downtown E...</td>\n",
" <td>V6A</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" <tr>\n",
" <th>376</th>\n",
" <td>189.0</td>\n",
" <td>Vancouver (NE Downtown / Harbour Centre / Gast...</td>\n",
" <td>V6B</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" <tr>\n",
" <th>378</th>\n",
" <td>190.0</td>\n",
" <td>Richmond South</td>\n",
" <td>V7A</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" <tr>\n",
" <th>380</th>\n",
" <td>191.0</td>\n",
" <td>Duncan</td>\n",
" <td>V9L</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" <tr>\n",
" <th>382</th>\n",
" <td>192.0</td>\n",
" <td>Parksville</td>\n",
" <td>V9P</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" index Place Code Country \\\n",
"374 188.0 Vancouver (Strathcona / Chinatown / Downtown E... V6A Canada \n",
"376 189.0 Vancouver (NE Downtown / Harbour Centre / Gast... V6B Canada \n",
"378 190.0 Richmond South V7A Canada \n",
"380 191.0 Duncan V9L Canada \n",
"382 192.0 Parksville V9P Canada \n",
"\n",
" Admin1 \n",
"374 British Columbia \n",
"376 British Columbia \n",
"378 British Columbia \n",
"380 British Columbia \n",
"382 British Columbia "
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df2.dropna(subset = [\"index\"], inplace=True)\n",
"print(df2.shape)\n",
"df2.tail()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### In the main data frame, we are left with the coordinate values corresponding to each address in the temp data frame. All the cells of the row contain the same coordinate values given by latitude/longitude. Hence we drop all duplicate column, rename the column name as 'Coordinates', and merge the address values from the temporary data frame to the main data frame.\n"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"j=1.0\n",
"for i in range(df1.shape[0]):\n",
" remove = df1[df1[\"index\"]==j].index\n",
" df1.drop(remove, axis=0, inplace=True)\n",
" j+=1.0"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>index</th>\n",
" <th>Place</th>\n",
" <th>Code</th>\n",
" <th>Country</th>\n",
" <th>Admin1</th>\n",
" <th>Admin2</th>\n",
" <th>Admin3</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>377</th>\n",
" <td>NaN</td>\n",
" <td>49.279/-123.114</td>\n",
" <td>49.279/-123.114</td>\n",
" <td>49.279/-123.114</td>\n",
" <td>49.279/-123.114</td>\n",
" <td>49.279/-123.114</td>\n",
" <td>49.279/-123.114</td>\n",
" </tr>\n",
" <tr>\n",
" <th>379</th>\n",
" <td>NaN</td>\n",
" <td>49.12/-123.117</td>\n",
" <td>49.12/-123.117</td>\n",
" <td>49.12/-123.117</td>\n",
" <td>49.12/-123.117</td>\n",
" <td>49.12/-123.117</td>\n",
" <td>49.12/-123.117</td>\n",
" </tr>\n",
" <tr>\n",
" <th>381</th>\n",
" <td>NaN</td>\n",
" <td>48.783/-123.703</td>\n",
" <td>48.783/-123.703</td>\n",
" <td>48.783/-123.703</td>\n",
" <td>48.783/-123.703</td>\n",
" <td>48.783/-123.703</td>\n",
" <td>48.783/-123.703</td>\n",
" </tr>\n",
" <tr>\n",
" <th>383</th>\n",
" <td>NaN</td>\n",
" <td>49.316/-124.319</td>\n",
" <td>49.316/-124.319</td>\n",
" <td>49.316/-124.319</td>\n",
" <td>49.316/-124.319</td>\n",
" <td>49.316/-124.319</td>\n",
" <td>49.316/-124.319</td>\n",
" </tr>\n",
" <tr>\n",
" <th>384</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" index Place Code Country \\\n",
"377 NaN 49.279/-123.114 49.279/-123.114 49.279/-123.114 \n",
"379 NaN 49.12/-123.117 49.12/-123.117 49.12/-123.117 \n",
"381 NaN 48.783/-123.703 48.783/-123.703 48.783/-123.703 \n",
"383 NaN 49.316/-124.319 49.316/-124.319 49.316/-124.319 \n",
"384 NaN NaN NaN NaN \n",
"\n",
" Admin1 Admin2 Admin3 \n",
"377 49.279/-123.114 49.279/-123.114 49.279/-123.114 \n",
"379 49.12/-123.117 49.12/-123.117 49.12/-123.117 \n",
"381 48.783/-123.703 48.783/-123.703 48.783/-123.703 \n",
"383 49.316/-124.319 49.316/-124.319 49.316/-124.319 \n",
"384 NaN NaN NaN "
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df1.tail()"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(192, 7)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>index</th>\n",
" <th>Place</th>\n",
" <th>Code</th>\n",
" <th>Country</th>\n",
" <th>Admin1</th>\n",
" <th>Admin2</th>\n",
" <th>Admin3</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>375</th>\n",
" <td>NaN</td>\n",
" <td>49.278/-123.091</td>\n",
" <td>49.278/-123.091</td>\n",
" <td>49.278/-123.091</td>\n",
" <td>49.278/-123.091</td>\n",
" <td>49.278/-123.091</td>\n",
" <td>49.278/-123.091</td>\n",
" </tr>\n",
" <tr>\n",
" <th>377</th>\n",
" <td>NaN</td>\n",
" <td>49.279/-123.114</td>\n",
" <td>49.279/-123.114</td>\n",
" <td>49.279/-123.114</td>\n",
" <td>49.279/-123.114</td>\n",
" <td>49.279/-123.114</td>\n",
" <td>49.279/-123.114</td>\n",
" </tr>\n",
" <tr>\n",
" <th>379</th>\n",
" <td>NaN</td>\n",
" <td>49.12/-123.117</td>\n",
" <td>49.12/-123.117</td>\n",
" <td>49.12/-123.117</td>\n",
" <td>49.12/-123.117</td>\n",
" <td>49.12/-123.117</td>\n",
" <td>49.12/-123.117</td>\n",
" </tr>\n",
" <tr>\n",
" <th>381</th>\n",
" <td>NaN</td>\n",
" <td>48.783/-123.703</td>\n",
" <td>48.783/-123.703</td>\n",
" <td>48.783/-123.703</td>\n",
" <td>48.783/-123.703</td>\n",
" <td>48.783/-123.703</td>\n",
" <td>48.783/-123.703</td>\n",
" </tr>\n",
" <tr>\n",
" <th>383</th>\n",
" <td>NaN</td>\n",
" <td>49.316/-124.319</td>\n",
" <td>49.316/-124.319</td>\n",
" <td>49.316/-124.319</td>\n",
" <td>49.316/-124.319</td>\n",
" <td>49.316/-124.319</td>\n",
" <td>49.316/-124.319</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" index Place Code Country \\\n",
"375 NaN 49.278/-123.091 49.278/-123.091 49.278/-123.091 \n",
"377 NaN 49.279/-123.114 49.279/-123.114 49.279/-123.114 \n",
"379 NaN 49.12/-123.117 49.12/-123.117 49.12/-123.117 \n",
"381 NaN 48.783/-123.703 48.783/-123.703 48.783/-123.703 \n",
"383 NaN 49.316/-124.319 49.316/-124.319 49.316/-124.319 \n",
"\n",
" Admin1 Admin2 Admin3 \n",
"375 49.278/-123.091 49.278/-123.091 49.278/-123.091 \n",
"377 49.279/-123.114 49.279/-123.114 49.279/-123.114 \n",
"379 49.12/-123.117 49.12/-123.117 49.12/-123.117 \n",
"381 48.783/-123.703 48.783/-123.703 48.783/-123.703 \n",
"383 49.316/-124.319 49.316/-124.319 49.316/-124.319 "
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df1.dropna(subset = [\"Place\"], inplace=True)\n",
"print(df1.shape)\n",
"df1.tail()"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(192, 1)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Place</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>49.323/-122.863</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>49.221/-122.69</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>49.026/-122.806</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>49.481/-119.586</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>49.866/-119.739</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Place\n",
"1 49.323/-122.863\n",
"3 49.221/-122.69\n",
"5 49.026/-122.806\n",
"7 49.481/-119.586\n",
"9 49.866/-119.739"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df1.drop(['index','Code','Country','Admin1','Admin2','Admin3'], axis=1, inplace=True)\n",
"print(df1.shape)\n",
"df1.head()"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Coordinates</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>49.323/-122.863</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>49.221/-122.69</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>49.026/-122.806</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>49.481/-119.586</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>49.866/-119.739</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Coordinates\n",
"1 49.323/-122.863\n",
"3 49.221/-122.69\n",
"5 49.026/-122.806\n",
"7 49.481/-119.586\n",
"9 49.866/-119.739"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df1.rename(columns={'Place':'Coordinates'}, inplace=True)\n",
"df1.head()"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [],
"source": [
"df1.reset_index(inplace = True, drop = True) \n",
"df2.reset_index(inplace = True, drop = True) "
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Coordinates</th>\n",
" <th>Place</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>49.323/-122.863</td>\n",
" <td>Port Moody</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>49.221/-122.69</td>\n",
" <td>Pitt Meadows</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>49.026/-122.806</td>\n",
" <td>White Rock</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>49.481/-119.586</td>\n",
" <td>Penticton</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>49.866/-119.739</td>\n",
" <td>Westbank</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Coordinates Place\n",
"0 49.323/-122.863 Port Moody\n",
"1 49.221/-122.69 Pitt Meadows\n",
"2 49.026/-122.806 White Rock\n",
"3 49.481/-119.586 Penticton\n",
"4 49.866/-119.739 Westbank"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df1[\"Place\"]=df2[[\"Place\"]]\n",
"df1.head()"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>index</th>\n",
" <th>Place</th>\n",
" <th>Code</th>\n",
" <th>Country</th>\n",
" <th>Admin1</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1.0</td>\n",
" <td>Port Moody</td>\n",
" <td>V3H</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2.0</td>\n",
" <td>Pitt Meadows</td>\n",
" <td>V3Y</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3.0</td>\n",
" <td>White Rock</td>\n",
" <td>V4B</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4.0</td>\n",
" <td>Penticton</td>\n",
" <td>V2A</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5.0</td>\n",
" <td>Westbank</td>\n",
" <td>V4T</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" index Place Code Country Admin1\n",
"0 1.0 Port Moody V3H Canada British Columbia\n",
"1 2.0 Pitt Meadows V3Y Canada British Columbia\n",
"2 3.0 White Rock V4B Canada British Columbia\n",
"3 4.0 Penticton V2A Canada British Columbia\n",
"4 5.0 Westbank V4T Canada British Columbia"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df2.head()"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Coordinates</th>\n",
" <th>Place</th>\n",
" <th>Code</th>\n",
" <th>Country</th>\n",
" <th>Admin1</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>49.323/-122.863</td>\n",
" <td>Port Moody</td>\n",
" <td>V3H</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>49.221/-122.69</td>\n",
" <td>Pitt Meadows</td>\n",
" <td>V3Y</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>49.026/-122.806</td>\n",
" <td>White Rock</td>\n",
" <td>V4B</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>49.481/-119.586</td>\n",
" <td>Penticton</td>\n",
" <td>V2A</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>49.866/-119.739</td>\n",
" <td>Westbank</td>\n",
" <td>V4T</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Coordinates Place Code Country Admin1\n",
"0 49.323/-122.863 Port Moody V3H Canada British Columbia\n",
"1 49.221/-122.69 Pitt Meadows V3Y Canada British Columbia\n",
"2 49.026/-122.806 White Rock V4B Canada British Columbia\n",
"3 49.481/-119.586 Penticton V2A Canada British Columbia\n",
"4 49.866/-119.739 Westbank V4T Canada British Columbia"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df1[\"Code\"]=df2[[\"Code\"]]\n",
"df1[\"Country\"]=df2[[\"Country\"]]\n",
"df1[\"Admin1\"]=df2[[\"Admin1\"]]\n",
"df1.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Now, we have obtained all the required data in one data frame. Upon checking the data types of each column, we find that that the coordinate values are in the 'string' data type,. Hence to use them in further processing, we split the values into latitude and longitude values, store them separately in the data frame and convert them to the float data type.\n"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [],
"source": [
"coordinates = df1[[\"Coordinates\"]].values.tolist()"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
"lat=[]\n",
"long=[]\n",
"for i in range(len(coordinates)):\n",
" coord = coordinates[i][0].split(\"/\")\n",
" lat.append(coord[0])\n",
" long.append(coord[1])"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"df1[\"Latitude\"] = lat\n",
"df1[\"Longitude\"] = long"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [],
"source": [
"df1.rename(columns={'Admin1':'Province'}, inplace=True)"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Coordinates</th>\n",
" <th>Place</th>\n",
" <th>Code</th>\n",
" <th>Country</th>\n",
" <th>Province</th>\n",
" <th>Latitude</th>\n",
" <th>Longitude</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>49.323/-122.863</td>\n",
" <td>Port Moody</td>\n",
" <td>V3H</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.323</td>\n",
" <td>-122.863</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>49.221/-122.69</td>\n",
" <td>Pitt Meadows</td>\n",
" <td>V3Y</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.221</td>\n",
" <td>-122.69</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>49.026/-122.806</td>\n",
" <td>White Rock</td>\n",
" <td>V4B</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.026</td>\n",
" <td>-122.806</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>49.481/-119.586</td>\n",
" <td>Penticton</td>\n",
" <td>V2A</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.481</td>\n",
" <td>-119.586</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>49.866/-119.739</td>\n",
" <td>Westbank</td>\n",
" <td>V4T</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.866</td>\n",
" <td>-119.739</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Coordinates Place Code Country Province Latitude \\\n",
"0 49.323/-122.863 Port Moody V3H Canada British Columbia 49.323 \n",
"1 49.221/-122.69 Pitt Meadows V3Y Canada British Columbia 49.221 \n",
"2 49.026/-122.806 White Rock V4B Canada British Columbia 49.026 \n",
"3 49.481/-119.586 Penticton V2A Canada British Columbia 49.481 \n",
"4 49.866/-119.739 Westbank V4T Canada British Columbia 49.866 \n",
"\n",
" Longitude \n",
"0 -122.863 \n",
"1 -122.69 \n",
"2 -122.806 \n",
"3 -119.586 \n",
"4 -119.739 "
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df1.head()"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [],
"source": [
"df1[['Latitude','Longitude']] = df1[['Latitude','Longitude']].astype(np.float16) "
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(192, 7)\n"
]
},
{
"data": {
"text/plain": [
"Coordinates object\n",
"Place object\n",
"Code object\n",
"Country object\n",
"Province object\n",
"Latitude float16\n",
"Longitude float16\n",
"dtype: object"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"print(df1.shape)\n",
"df1.dtypes"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### We also notice that several different locations inside the same cities having different postal codes have been included in our dataset. Under the assumption that the popular places that we aim to explore in each city will be similar, we remove redundant city values from our data frame. For example, if we have Vancouver East and Vancouver West in our data frame, then we remove the 'East' and the 'West' part and combine them into one 'Vancouver.\n"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [],
"source": [
"place = df1['Place'].tolist()"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [],
"source": [
"new_place = []\n",
"for i in place:\n",
" new_place.append(i.split(\" (\")[0])"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [],
"source": [
"df1['Place'] = new_place"
]
},
{
"cell_type": "code",
"execution_count": 73,
"metadata": {},
"outputs": [],
"source": [
"aggregation_functions = {'Code': 'first', 'Country': 'first', 'Province': 'first', 'Latitude' : 'mean', 'Longitude' : 'mean'}\n",
"df_new = df1.groupby(df1['Place']).aggregate(aggregation_functions)"
]
},
{
"cell_type": "code",
"execution_count": 74,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(155, 5)\n"
]
}
],
"source": [
"print(df_new.shape)\n",
"df_new.reset_index(inplace=True)"
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Place</th>\n",
" <th>Code</th>\n",
" <th>Country</th>\n",
" <th>Province</th>\n",
" <th>Latitude</th>\n",
" <th>Longitude</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Abbotsford East</td>\n",
" <td>V3G</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.06250</td>\n",
" <td>-122.1875</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Abbotsford Southeast</td>\n",
" <td>V2S</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.03125</td>\n",
" <td>-122.3125</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Abbotsford Southwest</td>\n",
" <td>V2T</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.03125</td>\n",
" <td>-122.3750</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Abbotsford West</td>\n",
" <td>V4X</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.09375</td>\n",
" <td>-122.3750</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Atlin Region</td>\n",
" <td>V0W</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>59.62500</td>\n",
" <td>-133.5000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>150</th>\n",
" <td>Westbank</td>\n",
" <td>V4T</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.87500</td>\n",
" <td>-119.7500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>151</th>\n",
" <td>Whistler</td>\n",
" <td>V8E</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>50.12500</td>\n",
" <td>-122.9375</td>\n",
" </tr>\n",
" <tr>\n",
" <th>152</th>\n",
" <td>White Rock</td>\n",
" <td>V4B</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.03125</td>\n",
" <td>-122.8125</td>\n",
" </tr>\n",
" <tr>\n",
" <th>153</th>\n",
" <td>Williams Lake</td>\n",
" <td>V2G</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>52.15625</td>\n",
" <td>-122.1250</td>\n",
" </tr>\n",
" <tr>\n",
" <th>154</th>\n",
" <td>Winfield</td>\n",
" <td>V4V</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>50.03125</td>\n",
" <td>-119.3750</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>155 rows × 6 columns</p>\n",
"</div>"
],
"text/plain": [
" Place Code Country Province Latitude Longitude\n",
"0 Abbotsford East V3G Canada British Columbia 49.06250 -122.1875\n",
"1 Abbotsford Southeast V2S Canada British Columbia 49.03125 -122.3125\n",
"2 Abbotsford Southwest V2T Canada British Columbia 49.03125 -122.3750\n",
"3 Abbotsford West V4X Canada British Columbia 49.09375 -122.3750\n",
"4 Atlin Region V0W Canada British Columbia 59.62500 -133.5000\n",
".. ... ... ... ... ... ...\n",
"150 Westbank V4T Canada British Columbia 49.87500 -119.7500\n",
"151 Whistler V8E Canada British Columbia 50.12500 -122.9375\n",
"152 White Rock V4B Canada British Columbia 49.03125 -122.8125\n",
"153 Williams Lake V2G Canada British Columbia 52.15625 -122.1250\n",
"154 Winfield V4V Canada British Columbia 50.03125 -119.3750\n",
"\n",
"[155 rows x 6 columns]"
]
},
"execution_count": 75,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_new"
]
},
{
"cell_type": "code",
"execution_count": 80,
"metadata": {},
"outputs": [],
"source": [
"place1 = df_new['Place'].tolist()"
]
},
{
"cell_type": "code",
"execution_count": 81,
"metadata": {},
"outputs": [],
"source": [
"new_place1 = []\n",
"for i in place1:\n",
" new_place1.append(i.split(\" \")[0])"
]
},
{
"cell_type": "code",
"execution_count": 82,
"metadata": {},
"outputs": [],
"source": [
"df_new['Distinct place'] = new_place1"
]
},
{
"cell_type": "code",
"execution_count": 83,
"metadata": {},
"outputs": [],
"source": [
"aggregation_functions = {'Place' : 'first' ,'Code': 'first', 'Country': 'first', 'Province': 'first', 'Latitude' : 'mean', 'Longitude' : 'mean'}\n",
"df_new = df_new.groupby(df_new['Distinct place']).aggregate(aggregation_functions)"
]
},
{
"cell_type": "code",
"execution_count": 84,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['Abbotsford',\n",
" 'Atlin Region',\n",
" 'Burnaby',\n",
" 'Campbell River Central',\n",
" 'Cariboo and West Okanagan',\n",
" 'Castlegar',\n",
" 'Cedar',\n",
" 'Central Island',\n",
" 'Chilcotin',\n",
" 'Chilliwack Central',\n",
" 'Comox',\n",
" 'Coquitlam',\n",
" 'Courtenay Central',\n",
" 'Cranbrook',\n",
" 'Dawson Creek',\n",
" 'Delta Central',\n",
" 'Duncan',\n",
" 'East Kootenays',\n",
" 'Esquimalt',\n",
" 'Fort St. John',\n",
" 'Harrison Lake Region',\n",
" 'Highlands',\n",
" 'Inside Passage and the Queen Charlottes',\n",
" 'Juan de Fuca Shore',\n",
" 'Kamloops Central and',\n",
" 'Kelowna Central',\n",
" 'Kimberley',\n",
" 'Kitimat',\n",
" 'Ladysmith',\n",
" 'Langley City',\n",
" 'Lower Skeena',\n",
" 'Maple Ridge',\n",
" 'Merritt',\n",
" 'Metchosin',\n",
" 'Mission',\n",
" 'Nanaimo Central',\n",
" 'Nelson',\n",
" 'New Westminster Northeast',\n",
" 'North Central Island and Bute Inlet Region',\n",
" 'Northern British Columbia',\n",
" 'Oak Bay',\n",
" 'Omineca and Yellowhead',\n",
" 'Parksville',\n",
" 'Penticton',\n",
" 'Pitt Meadows',\n",
" 'Port Alberni',\n",
" 'Powell River',\n",
" 'Prince George Central',\n",
" 'Qualicum Beach',\n",
" 'Quesnel',\n",
" 'Richmond',\n",
" 'Saanich Central',\n",
" 'Salmon Arm',\n",
" 'Saltspring Island',\n",
" 'Sidney',\n",
" 'Similkameen',\n",
" 'Sooke',\n",
" 'South Okanagan',\n",
" 'Squamish',\n",
" 'Surrey',\n",
" 'Terrace',\n",
" 'Trail',\n",
" 'Upper Columbia Region',\n",
" 'Vancouver',\n",
" 'Vernon Central',\n",
" 'Victoria Central British Columbia Provincial Government',\n",
" 'West Kootenays',\n",
" 'Westbank',\n",
" 'Whistler',\n",
" 'White Rock',\n",
" 'Williams Lake',\n",
" 'Winfield']"
]
},
"execution_count": 84,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"modified_places = df_new['Place'].tolist()\n",
"for index, word in enumerate(modified_places):\n",
"    # the first four branches leave names unchanged when the direction word is part of the name;\n",
"    # every other name has its first matching direction word stripped\n",
"    if \" North Island, Sunshine Coast\" in word:\n",
"        pass\n",
"    elif \" Westminster\" in word:\n",
"        pass\n",
"    elif \" Northern\" in word:\n",
"        pass\n",
"    elif \" West Okanagan\" in word:\n",
"        pass\n",
"    elif \" East\" in word:\n",
"        modified_places[index]=modified_places[index].replace(\" East\",\"\")\n",
"    elif \" West\" in word:\n",
"        modified_places[index]=modified_places[index].replace(\" West\",\"\")\n",
"    elif \" Southeast\" in word:\n",
"        modified_places[index]=modified_places[index].replace(\" Southeast\",\"\")\n",
"    elif \" Northeast\" in word:\n",
"        modified_places[index]=modified_places[index].replace(\" Northeast\",\"\")\n",
"    elif \" North\" in word:\n",
"        modified_places[index]=modified_places[index].replace(\" North\",\"\")\n",
"    elif \" South\" in word:\n",
"        modified_places[index]=modified_places[index].replace(\" South\",\"\")\n",
"modified_places"
]
},
{
"cell_type": "code",
"execution_count": 85,
"metadata": {},
"outputs": [],
"source": [
"df_new['Place'] = modified_places"
]
},
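{
"cell_type": "code",
"execution_count": 86,
"metadata": {},
"outputs": [],
"source": [
"# NOTE: df_new is still indexed by 'Distinct place' here, so label 38 is missing and .at appends a new, otherwise-NaN row (dropped later by dropna on 'Code')\n",
"df_new.at[38, 'Place'] = \"North Island, Sunshine Coast\""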
]
},
{
"cell_type": "code",
"execution_count": 89,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(73, 6)\n"
]
}
],
"source": [
"print(df_new.shape)\n",
"df_new.reset_index(inplace=True)"
]
},
{
"cell_type": "code",
"execution_count": 90,
"metadata": {},
"outputs": [],
"source": [
"df_new.drop(['Distinct place'], axis=1, inplace=True)"
]
},
{
"cell_type": "code",
"execution_count": 92,
"metadata": {},
"outputs": [],
"source": [
"pd.set_option('display.max_rows', df_new.shape[0]+1)"
]
},
{
"cell_type": "code",
"execution_count": 94,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(72, 6)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Place</th>\n",
" <th>Code</th>\n",
" <th>Country</th>\n",
" <th>Province</th>\n",
" <th>Latitude</th>\n",
" <th>Longitude</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Abbotsford</td>\n",
" <td>V3G</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.06250</td>\n",
" <td>-122.3125</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Atlin Region</td>\n",
" <td>V0W</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>59.62500</td>\n",
" <td>-133.5000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Burnaby</td>\n",
" <td>V3N</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.25000</td>\n",
" <td>-123.0000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Campbell River Central</td>\n",
" <td>V9W</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>50.00000</td>\n",
" <td>-125.5625</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Cariboo and West Okanagan</td>\n",
" <td>V0K</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>51.43750</td>\n",
" <td>-121.6250</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Castlegar</td>\n",
" <td>V1N</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.31250</td>\n",
" <td>-117.6875</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>Cedar</td>\n",
" <td>V9X</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.03125</td>\n",
" <td>-124.0000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>Central Island</td>\n",
" <td>V0R</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.68750</td>\n",
" <td>-122.5000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>Chilcotin</td>\n",
" <td>V0L</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>52.09375</td>\n",
" <td>-123.6250</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>Chilliwack Central</td>\n",
" <td>V2P</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.12500</td>\n",
" <td>-121.8125</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>Comox</td>\n",
" <td>V9M</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.68750</td>\n",
" <td>-124.9375</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>Coquitlam</td>\n",
" <td>V3J</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.25000</td>\n",
" <td>-122.8750</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>Courtenay Central</td>\n",
" <td>V9N</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.68750</td>\n",
" <td>-125.1250</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>Cranbrook</td>\n",
" <td>V1C</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.50000</td>\n",
" <td>-115.7500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>Dawson Creek</td>\n",
" <td>V1G</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>55.78125</td>\n",
" <td>-120.2500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>Delta Central</td>\n",
" <td>V4K</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.09375</td>\n",
" <td>-123.0000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>Duncan</td>\n",
" <td>V9L</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.78125</td>\n",
" <td>-123.6875</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>East Kootenays</td>\n",
" <td>V0B</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.68750</td>\n",
" <td>-115.5625</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>Esquimalt</td>\n",
" <td>V9A</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.43750</td>\n",
" <td>-123.4375</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>Fort St. John</td>\n",
" <td>V1J</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>56.25000</td>\n",
" <td>-120.8750</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20</th>\n",
" <td>Harrison Lake Region</td>\n",
" <td>V0M</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.62500</td>\n",
" <td>-122.0625</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21</th>\n",
" <td>Highlands</td>\n",
" <td>V9B</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.46875</td>\n",
" <td>-123.5000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>22</th>\n",
" <td>Inside Passage and the Queen Charlottes</td>\n",
" <td>V0T</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>52.21875</td>\n",
" <td>-126.1875</td>\n",
" </tr>\n",
" <tr>\n",
" <th>23</th>\n",
" <td>Juan de Fuca Shore</td>\n",
" <td>V0S</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.53125</td>\n",
" <td>-123.6875</td>\n",
" </tr>\n",
" <tr>\n",
" <th>24</th>\n",
" <td>Kamloops Central and</td>\n",
" <td>V2C</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>50.68750</td>\n",
" <td>-120.4375</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25</th>\n",
" <td>Kelowna Central</td>\n",
" <td>V1Y</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.90625</td>\n",
" <td>-119.4375</td>\n",
" </tr>\n",
" <tr>\n",
" <th>26</th>\n",
" <td>Kimberley</td>\n",
" <td>V1A</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.68750</td>\n",
" <td>-116.0000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>27</th>\n",
" <td>Kitimat</td>\n",
" <td>V8C</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>54.06250</td>\n",
" <td>-128.6250</td>\n",
" </tr>\n",
" <tr>\n",
" <th>28</th>\n",
" <td>Ladysmith</td>\n",
" <td>V9G</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.96875</td>\n",
" <td>-123.8125</td>\n",
" </tr>\n",
" <tr>\n",
" <th>29</th>\n",
" <td>Langley City</td>\n",
" <td>V3A</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.09375</td>\n",
" <td>-122.5625</td>\n",
" </tr>\n",
" <tr>\n",
" <th>30</th>\n",
" <td>Lower Skeena</td>\n",
" <td>V0V</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>53.96875</td>\n",
" <td>-129.8750</td>\n",
" </tr>\n",
" <tr>\n",
" <th>31</th>\n",
" <td>Maple Ridge</td>\n",
" <td>V2W</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.25000</td>\n",
" <td>-122.5625</td>\n",
" </tr>\n",
" <tr>\n",
" <th>32</th>\n",
" <td>Merritt</td>\n",
" <td>V1K</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>50.12500</td>\n",
" <td>-120.8125</td>\n",
" </tr>\n",
" <tr>\n",
" <th>33</th>\n",
" <td>Metchosin</td>\n",
" <td>V9C</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.37500</td>\n",
" <td>-123.5625</td>\n",
" </tr>\n",
" <tr>\n",
" <th>34</th>\n",
" <td>Mission</td>\n",
" <td>V2V</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.43750</td>\n",
" <td>-122.4375</td>\n",
" </tr>\n",
" <tr>\n",
" <th>35</th>\n",
" <td>Nanaimo Central</td>\n",
" <td>V9S</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.18750</td>\n",
" <td>-124.0000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36</th>\n",
" <td>Nelson</td>\n",
" <td>V1L</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.50000</td>\n",
" <td>-117.3125</td>\n",
" </tr>\n",
" <tr>\n",
" <th>37</th>\n",
" <td>New Westminster Northeast</td>\n",
" <td>V3L</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.18750</td>\n",
" <td>-122.8750</td>\n",
" </tr>\n",
" <tr>\n",
" <th>38</th>\n",
" <td>North Central Island and Bute Inlet Region</td>\n",
" <td>V0P</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.56250</td>\n",
" <td>-123.6250</td>\n",
" </tr>\n",
" <tr>\n",
" <th>39</th>\n",
" <td>Northern British Columbia</td>\n",
" <td>V0C</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>58.37500</td>\n",
" <td>-125.6875</td>\n",
" </tr>\n",
" <tr>\n",
" <th>40</th>\n",
" <td>Oak Bay</td>\n",
" <td>V8R</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.43750</td>\n",
" <td>-123.3125</td>\n",
" </tr>\n",
" <tr>\n",
" <th>41</th>\n",
" <td>Omineca and Yellowhead</td>\n",
" <td>V0J</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>56.00000</td>\n",
" <td>-126.8750</td>\n",
" </tr>\n",
" <tr>\n",
" <th>42</th>\n",
" <td>Parksville</td>\n",
" <td>V9P</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.31250</td>\n",
" <td>-124.3125</td>\n",
" </tr>\n",
" <tr>\n",
" <th>43</th>\n",
" <td>Penticton</td>\n",
" <td>V2A</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.46875</td>\n",
" <td>-119.5625</td>\n",
" </tr>\n",
" <tr>\n",
" <th>44</th>\n",
" <td>Pitt Meadows</td>\n",
" <td>V3Y</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.21875</td>\n",
" <td>-122.6875</td>\n",
" </tr>\n",
" <tr>\n",
" <th>45</th>\n",
" <td>Port Alberni</td>\n",
" <td>V9Y</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.28125</td>\n",
" <td>-123.1875</td>\n",
" </tr>\n",
" <tr>\n",
" <th>46</th>\n",
" <td>Powell River</td>\n",
" <td>V8A</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>50.03125</td>\n",
" <td>-124.3125</td>\n",
" </tr>\n",
" <tr>\n",
" <th>47</th>\n",
" <td>Prince George Central</td>\n",
" <td>V2L</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>53.96875</td>\n",
" <td>-124.3750</td>\n",
" </tr>\n",
" <tr>\n",
" <th>48</th>\n",
" <td>Qualicum Beach</td>\n",
" <td>V9K</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.34375</td>\n",
" <td>-124.4375</td>\n",
" </tr>\n",
" <tr>\n",
" <th>49</th>\n",
" <td>Quesnel</td>\n",
" <td>V2J</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>52.96875</td>\n",
" <td>-122.5000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50</th>\n",
" <td>Richmond</td>\n",
" <td>V7B</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.15625</td>\n",
" <td>-123.1250</td>\n",
" </tr>\n",
" <tr>\n",
" <th>51</th>\n",
" <td>Saanich Central</td>\n",
" <td>V8Z</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.50000</td>\n",
" <td>-123.3750</td>\n",
" </tr>\n",
" <tr>\n",
" <th>52</th>\n",
" <td>Salmon Arm</td>\n",
" <td>V1E</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>50.68750</td>\n",
" <td>-119.2500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>53</th>\n",
" <td>Saltspring Island</td>\n",
" <td>V8K</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.81250</td>\n",
" <td>-123.5000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>54</th>\n",
" <td>Sidney</td>\n",
" <td>V8L</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.65625</td>\n",
" <td>-123.3750</td>\n",
" </tr>\n",
" <tr>\n",
" <th>55</th>\n",
" <td>Similkameen</td>\n",
" <td>V0X</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.37500</td>\n",
" <td>-120.6875</td>\n",
" </tr>\n",
" <tr>\n",
" <th>56</th>\n",
" <td>Sooke</td>\n",
" <td>V9Z</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.37500</td>\n",
" <td>-123.7500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>57</th>\n",
" <td>South Okanagan</td>\n",
" <td>V0H</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.40625</td>\n",
" <td>-119.0000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>58</th>\n",
" <td>Squamish</td>\n",
" <td>V8B</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.68750</td>\n",
" <td>-123.1250</td>\n",
" </tr>\n",
" <tr>\n",
" <th>59</th>\n",
" <td>Surrey</td>\n",
" <td>V3Z</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.12500</td>\n",
" <td>-122.8125</td>\n",
" </tr>\n",
" <tr>\n",
" <th>60</th>\n",
" <td>Terrace</td>\n",
" <td>V8G</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>54.53125</td>\n",
" <td>-128.6250</td>\n",
" </tr>\n",
" <tr>\n",
" <th>61</th>\n",
" <td>Trail</td>\n",
" <td>V1R</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.09375</td>\n",
" <td>-117.6875</td>\n",
" </tr>\n",
" <tr>\n",
" <th>62</th>\n",
" <td>Upper Columbia Region</td>\n",
" <td>V0A</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>51.31250</td>\n",
" <td>-116.9375</td>\n",
" </tr>\n",
" <tr>\n",
" <th>63</th>\n",
" <td>Vancouver</td>\n",
" <td>V5K</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.25000</td>\n",
" <td>-123.1250</td>\n",
" </tr>\n",
" <tr>\n",
" <th>64</th>\n",
" <td>Vernon Central</td>\n",
" <td>V1T</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>50.21875</td>\n",
" <td>-119.2500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>65</th>\n",
" <td>Victoria Central British Columbia Provincial G...</td>\n",
" <td>V8W</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.43750</td>\n",
" <td>-123.3750</td>\n",
" </tr>\n",
" <tr>\n",
" <th>66</th>\n",
" <td>West Kootenays</td>\n",
" <td>V0G</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.50000</td>\n",
" <td>-122.0625</td>\n",
" </tr>\n",
" <tr>\n",
" <th>67</th>\n",
" <td>Westbank</td>\n",
" <td>V4T</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.87500</td>\n",
" <td>-119.7500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>68</th>\n",
" <td>Whistler</td>\n",
" <td>V8E</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>50.12500</td>\n",
" <td>-122.9375</td>\n",
" </tr>\n",
" <tr>\n",
" <th>69</th>\n",
" <td>White Rock</td>\n",
" <td>V4B</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.03125</td>\n",
" <td>-122.8125</td>\n",
" </tr>\n",
" <tr>\n",
" <th>70</th>\n",
" <td>Williams Lake</td>\n",
" <td>V2G</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>52.15625</td>\n",
" <td>-122.1250</td>\n",
" </tr>\n",
" <tr>\n",
" <th>71</th>\n",
" <td>Winfield</td>\n",
" <td>V4V</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>50.03125</td>\n",
" <td>-119.3750</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Place Code Country \\\n",
"0 Abbotsford V3G Canada \n",
"1 Atlin Region V0W Canada \n",
"2 Burnaby V3N Canada \n",
"3 Campbell River Central V9W Canada \n",
"4 Cariboo and West Okanagan V0K Canada \n",
"5 Castlegar V1N Canada \n",
"6 Cedar V9X Canada \n",
"7 Central Island V0R Canada \n",
"8 Chilcotin V0L Canada \n",
"9 Chilliwack Central V2P Canada \n",
"10 Comox V9M Canada \n",
"11 Coquitlam V3J Canada \n",
"12 Courtenay Central V9N Canada \n",
"13 Cranbrook V1C Canada \n",
"14 Dawson Creek V1G Canada \n",
"15 Delta Central V4K Canada \n",
"16 Duncan V9L Canada \n",
"17 East Kootenays V0B Canada \n",
"18 Esquimalt V9A Canada \n",
"19 Fort St. John V1J Canada \n",
"20 Harrison Lake Region V0M Canada \n",
"21 Highlands V9B Canada \n",
"22 Inside Passage and the Queen Charlottes V0T Canada \n",
"23 Juan de Fuca Shore V0S Canada \n",
"24 Kamloops Central and V2C Canada \n",
"25 Kelowna Central V1Y Canada \n",
"26 Kimberley V1A Canada \n",
"27 Kitimat V8C Canada \n",
"28 Ladysmith V9G Canada \n",
"29 Langley City V3A Canada \n",
"30 Lower Skeena V0V Canada \n",
"31 Maple Ridge V2W Canada \n",
"32 Merritt V1K Canada \n",
"33 Metchosin V9C Canada \n",
"34 Mission V2V Canada \n",
"35 Nanaimo Central V9S Canada \n",
"36 Nelson V1L Canada \n",
"37 New Westminster Northeast V3L Canada \n",
"38 North Central Island and Bute Inlet Region V0P Canada \n",
"39 Northern British Columbia V0C Canada \n",
"40 Oak Bay V8R Canada \n",
"41 Omineca and Yellowhead V0J Canada \n",
"42 Parksville V9P Canada \n",
"43 Penticton V2A Canada \n",
"44 Pitt Meadows V3Y Canada \n",
"45 Port Alberni V9Y Canada \n",
"46 Powell River V8A Canada \n",
"47 Prince George Central V2L Canada \n",
"48 Qualicum Beach V9K Canada \n",
"49 Quesnel V2J Canada \n",
"50 Richmond V7B Canada \n",
"51 Saanich Central V8Z Canada \n",
"52 Salmon Arm V1E Canada \n",
"53 Saltspring Island V8K Canada \n",
"54 Sidney V8L Canada \n",
"55 Similkameen V0X Canada \n",
"56 Sooke V9Z Canada \n",
"57 South Okanagan V0H Canada \n",
"58 Squamish V8B Canada \n",
"59 Surrey V3Z Canada \n",
"60 Terrace V8G Canada \n",
"61 Trail V1R Canada \n",
"62 Upper Columbia Region V0A Canada \n",
"63 Vancouver V5K Canada \n",
"64 Vernon Central V1T Canada \n",
"65 Victoria Central British Columbia Provincial G... V8W Canada \n",
"66 West Kootenays V0G Canada \n",
"67 Westbank V4T Canada \n",
"68 Whistler V8E Canada \n",
"69 White Rock V4B Canada \n",
"70 Williams Lake V2G Canada \n",
"71 Winfield V4V Canada \n",
"\n",
" Province Latitude Longitude \n",
"0 British Columbia 49.06250 -122.3125 \n",
"1 British Columbia 59.62500 -133.5000 \n",
"2 British Columbia 49.25000 -123.0000 \n",
"3 British Columbia 50.00000 -125.5625 \n",
"4 British Columbia 51.43750 -121.6250 \n",
"5 British Columbia 49.31250 -117.6875 \n",
"6 British Columbia 49.03125 -124.0000 \n",
"7 British Columbia 49.68750 -122.5000 \n",
"8 British Columbia 52.09375 -123.6250 \n",
"9 British Columbia 49.12500 -121.8125 \n",
"10 British Columbia 49.68750 -124.9375 \n",
"11 British Columbia 49.25000 -122.8750 \n",
"12 British Columbia 49.68750 -125.1250 \n",
"13 British Columbia 49.50000 -115.7500 \n",
"14 British Columbia 55.78125 -120.2500 \n",
"15 British Columbia 49.09375 -123.0000 \n",
"16 British Columbia 48.78125 -123.6875 \n",
"17 British Columbia 49.68750 -115.5625 \n",
"18 British Columbia 48.43750 -123.4375 \n",
"19 British Columbia 56.25000 -120.8750 \n",
"20 British Columbia 49.62500 -122.0625 \n",
"21 British Columbia 48.46875 -123.5000 \n",
"22 British Columbia 52.21875 -126.1875 \n",
"23 British Columbia 48.53125 -123.6875 \n",
"24 British Columbia 50.68750 -120.4375 \n",
"25 British Columbia 49.90625 -119.4375 \n",
"26 British Columbia 49.68750 -116.0000 \n",
"27 British Columbia 54.06250 -128.6250 \n",
"28 British Columbia 48.96875 -123.8125 \n",
"29 British Columbia 49.09375 -122.5625 \n",
"30 British Columbia 53.96875 -129.8750 \n",
"31 British Columbia 49.25000 -122.5625 \n",
"32 British Columbia 50.12500 -120.8125 \n",
"33 British Columbia 48.37500 -123.5625 \n",
"34 British Columbia 49.43750 -122.4375 \n",
"35 British Columbia 49.18750 -124.0000 \n",
"36 British Columbia 49.50000 -117.3125 \n",
"37 British Columbia 49.18750 -122.8750 \n",
"38 British Columbia 49.56250 -123.6250 \n",
"39 British Columbia 58.37500 -125.6875 \n",
"40 British Columbia 48.43750 -123.3125 \n",
"41 British Columbia 56.00000 -126.8750 \n",
"42 British Columbia 49.31250 -124.3125 \n",
"43 British Columbia 49.46875 -119.5625 \n",
"44 British Columbia 49.21875 -122.6875 \n",
"45 British Columbia 49.28125 -123.1875 \n",
"46 British Columbia 50.03125 -124.3125 \n",
"47 British Columbia 53.96875 -124.3750 \n",
"48 British Columbia 49.34375 -124.4375 \n",
"49 British Columbia 52.96875 -122.5000 \n",
"50 British Columbia 49.15625 -123.1250 \n",
"51 British Columbia 48.50000 -123.3750 \n",
"52 British Columbia 50.68750 -119.2500 \n",
"53 British Columbia 48.81250 -123.5000 \n",
"54 British Columbia 48.65625 -123.3750 \n",
"55 British Columbia 49.37500 -120.6875 \n",
"56 British Columbia 48.37500 -123.7500 \n",
"57 British Columbia 49.40625 -119.0000 \n",
"58 British Columbia 49.68750 -123.1250 \n",
"59 British Columbia 49.12500 -122.8125 \n",
"60 British Columbia 54.53125 -128.6250 \n",
"61 British Columbia 49.09375 -117.6875 \n",
"62 British Columbia 51.31250 -116.9375 \n",
"63 British Columbia 49.25000 -123.1250 \n",
"64 British Columbia 50.21875 -119.2500 \n",
"65 British Columbia 48.43750 -123.3750 \n",
"66 British Columbia 49.50000 -122.0625 \n",
"67 British Columbia 49.87500 -119.7500 \n",
"68 British Columbia 50.12500 -122.9375 \n",
"69 British Columbia 49.03125 -122.8125 \n",
"70 British Columbia 52.15625 -122.1250 \n",
"71 British Columbia 50.03125 -119.3750 "
]
},
"execution_count": 94,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_new.dropna(subset = [\"Code\"], inplace=True)\n",
"print(df_new.shape)\n",
"df_new"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### After the necessary cleaning and preprocessing, we obtain the final data frame of 72 places with their latitude and longitude values. This completes our second method of obtaining data.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"## Methodology <a name=\"methodology\"></a>\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### In the methodology section, we begin by drawing a map of British Columbia and marking every neighborhood from our data frame on it. This requires the Folium visualization library, so our first step is to install and import it. We also import the KMeans class from the scikit-learn library for the clustering step later on.\n"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Collecting package metadata (current_repodata.json): done\n",
"Solving environment: done\n",
"\n",
"# All requested packages already installed.\n",
"\n",
"Collecting package metadata (current_repodata.json): done\n",
"Solving environment: failed with initial frozen solve. Retrying with flexible solve.\n",
"Collecting package metadata (repodata.json): done\n",
"Solving environment: done\n",
"\n",
"## Package Plan ##\n",
"\n",
" environment location: /home/jupyterlab/conda/envs/python\n",
"\n",
" added / updated specs:\n",
" - folium=0.5.0\n",
"\n",
"\n",
"The following packages will be downloaded:\n",
"\n",
" package | build\n",
" ---------------------------|-----------------\n",
" altair-4.1.0 | py_1 614 KB conda-forge\n",
" branca-0.4.1 | py_0 26 KB conda-forge\n",
" folium-0.5.0 | py_0 45 KB conda-forge\n",
" pandas-1.0.4 | py36h830a2c2_0 10.1 MB conda-forge\n",
" toolz-0.10.0 | py_0 46 KB conda-forge\n",
" vincent-0.4.4 | py_1 28 KB conda-forge\n",
" ------------------------------------------------------------\n",
" Total: 10.9 MB\n",
"\n",
"The following NEW packages will be INSTALLED:\n",
"\n",
" altair conda-forge/noarch::altair-4.1.0-py_1\n",
" attrs conda-forge/noarch::attrs-19.3.0-py_0\n",
" branca conda-forge/noarch::branca-0.4.1-py_0\n",
" entrypoints conda-forge/linux-64::entrypoints-0.3-py36h9f0ad1d_1001\n",
" folium conda-forge/noarch::folium-0.5.0-py_0\n",
" importlib_metadata conda-forge/noarch::importlib_metadata-1.6.0-0\n",
" jinja2 conda-forge/noarch::jinja2-2.11.2-pyh9f0ad1d_0\n",
" jsonschema conda-forge/linux-64::jsonschema-3.2.0-py36h9f0ad1d_1\n",
" markupsafe conda-forge/linux-64::markupsafe-1.1.1-py36h8c4c3a4_1\n",
" pandas conda-forge/linux-64::pandas-1.0.4-py36h830a2c2_0\n",
" pyrsistent conda-forge/linux-64::pyrsistent-0.16.0-py36h8c4c3a4_0\n",
" pytz conda-forge/noarch::pytz-2020.1-pyh9f0ad1d_0\n",
" toolz conda-forge/noarch::toolz-0.10.0-py_0\n",
" vincent conda-forge/noarch::vincent-0.4.4-py_1\n",
"\n",
"\n",
"\n",
"Downloading and Extracting Packages\n",
"toolz-0.10.0 | 46 KB | ##################################### | 100% \n",
"folium-0.5.0 | 45 KB | ##################################### | 100% \n",
"altair-4.1.0 | 614 KB | ##################################### | 100% \n",
"branca-0.4.1 | 26 KB | ##################################### | 100% \n",
"pandas-1.0.4 | 10.1 MB | ##################################### | 100% \n",
"vincent-0.4.4 | 28 KB | ##################################### | 100% \n",
"Preparing transaction: done\n",
"Verifying transaction: done\n",
"Executing transaction: done\n",
"Libraries imported.\n"
]
}
],
"source": [
"!conda install -c conda-forge geopy --yes # install geopy if it is not already available\n",
"from geopy.geocoders import Nominatim # convert an address into latitude and longitude values\n",
"\n",
"import json # library to handle JSON files\n",
"\n",
"import requests # library to handle requests\n",
"from pandas import json_normalize # transform a JSON file into a pandas dataframe\n",
"\n",
"# Matplotlib and associated plotting modules\n",
"import matplotlib.cm as cm\n",
"import matplotlib.colors as colors\n",
"\n",
"# import k-means for the clustering stage\n",
"from sklearn.cluster import KMeans\n",
"\n",
"!conda install -c conda-forge folium=0.5.0 --yes # install folium 0.5.0 if it is not already available\n",
"import folium # map rendering library\n",
"\n",
"print('Libraries imported.')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### We obtain the coordinates of the province of British Columbia using the Nominatim geocoder from the geopy package, as we did before in the data section.\n"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The lat long values for British Columbia are: 55.001251, -125.002441\n"
]
}
],
"source": [
"address = 'British Columbia, CA'\n",
"\n",
"N_obj = Nominatim(user_agent='british_columbia_explorer')\n",
"geo_obj = N_obj.geocode(address)\n",
"lat = geo_obj.latitude\n",
"long = geo_obj.longitude\n",
"print(\"The lat long values for British Columbia are: {}, {}\".format(lat,long))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Using the Folium library, we plot each neighborhood's coordinates as a circle marker on the map of British Columbia.\n"
]
},
{
"cell_type": "code",
"execution_count": 95,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"55.001251 , -125.002441\n"
]
},
{
"data": {
"text/html": [
"<div style=\"width:100%;\"><div style=\"position:relative;width:100%;height:0;padding-bottom:60%;\"><span style=\"color:#565656\">Make this Notebook Trusted to load map: File -> Trust Notebook</span><iframe src=\"about:blank\" style=\"position:absolute;width:100%;height:100%;left:0;top:0;border:none !important;\" data-html= onload=\"this.contentDocument.open();this.contentDocument.write(atob(this.getAttribute('data-html')));this.contentDocument.close();\" allowfullscreen webkitallowfullscreen mozallowfullscreen></iframe></div></div>"
],
"text/plain": [
"<folium.folium.Map at 0x7fef7f14d550>"
]
},
"execution_count": 95,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"map_bc = folium.Map(location=[lat, long], zoom_start=5)\n",
"print(\"{} , {}\".format(lat,long))\n",
"for lati, lng, label in zip(df_new['Latitude'], df_new['Longitude'], df_new['Place']):\n",
"    label = folium.Popup(label, parse_html=True)\n",
"    folium.CircleMarker(\n",
"        [lati, lng],\n",
"        radius=5,\n",
"        popup=label,\n",
"        color='blue',\n",
"        fill=True,\n",
"        fill_color='#3186cc',\n",
"        fill_opacity=0.7,\n",
"        parse_html=False).add_to(map_bc)\n",
"\n",
"map_bc"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### We leverage the Foursquare API to explore the venues near each neighborhood. To do so, we first define our Foursquare credentials and API version.\n"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Your credentials:\n",
"CLIENT_ID: YOUR_FOURSQUARE_CLIENT_ID\n",
"CLIENT_SECRET: YOUR_FOURSQUARE_CLIENT_SECRET\n"
]
}
],
"source": [
"CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID\n",
"CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret\n",
"VERSION = '20180605' # Foursquare API version\n",
"\n",
"print('Your credentials:')\n",
"print('CLIENT_ID: ' + CLIENT_ID)\n",
"print('CLIENT_SECRET: ' + CLIENT_SECRET)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### We wrap the whole process (building the request URL for each location, extracting the venues from the JSON response, and storing them in a data frame) in a single function named 'getNearbyVenues'.\n"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [],
"source": [
"def getNearbyVenues(names, latitudes, longitudes, radius=500, limit=150):\n",
"    \"\"\"Return a data frame of the Foursquare venues found within `radius` meters of each location.\"\"\"\n",
"\n",
"    venues_list=[]\n",
"    for name, lat, lng in zip(names, latitudes, longitudes):\n",
"        print(name)\n",
"\n",
"        # create the API request URL\n",
"        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(\n",
"            CLIENT_ID,\n",
"            CLIENT_SECRET,\n",
"            VERSION,\n",
"            lat,\n",
"            lng,\n",
"            radius,\n",
"            limit)\n",
"\n",
"        # make the GET request and pull the venue items out of the JSON response\n",
"        results = requests.get(url).json()[\"response\"]['groups'][0]['items']\n",
"\n",
"        # keep only the relevant information for each nearby venue\n",
"        venues_list.append([(\n",
"            name,\n",
"            lat,\n",
"            lng,\n",
"            v['venue']['name'],\n",
"            v['venue']['location']['lat'],\n",
"            v['venue']['location']['lng'],\n",
"            v['venue']['categories'][0]['name']) for v in results])\n",
"\n",
"    # flatten the per-neighborhood lists into a single data frame\n",
"    nearby_venues = pd.DataFrame([item for var in venues_list for item in var])\n",
"    nearby_venues.columns = ['Neighborhood',\n",
"                             'Neighborhood Latitude',\n",
"                             'Neighborhood Longitude',\n",
"                             'Venue',\n",
"                             'Venue Latitude',\n",
"                             'Venue Longitude',\n",
"                             'Venue Category']\n",
"\n",
"    return nearby_venues"
]
},
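{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### To illustrate the flattening step inside `getNearbyVenues` without calling the API, the sketch below applies the same list comprehension to a hand-written stand-in for a response. The `mock_items` structure, the place name and all values are illustrative assumptions that mimic only the fields the function reads.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hand-written stand-in for response['groups'][0]['items'] (illustrative values only)\n",
"mock_items = [\n",
"    {'venue': {'name': 'Example Trail',\n",
"               'location': {'lat': 49.06, 'lng': -122.31},\n",
"               'categories': [{'name': 'Trail'}]}},\n",
"    {'venue': {'name': 'Example Coffee',\n",
"               'location': {'lat': 49.07, 'lng': -122.32},\n",
"               'categories': [{'name': 'Coffee Shop'}]}},\n",
"]\n",
"\n",
"# Same tuple layout as in getNearbyVenues: neighborhood fields first, then venue fields\n",
"mock_rows = [('Example Place', 49.0625, -122.3125,\n",
"              v['venue']['name'],\n",
"              v['venue']['location']['lat'],\n",
"              v['venue']['location']['lng'],\n",
"              v['venue']['categories'][0]['name']) for v in mock_items]\n",
"\n",
"mock_venues = pd.DataFrame(mock_rows, columns=['Neighborhood',\n",
"                                               'Neighborhood Latitude',\n",
"                                               'Neighborhood Longitude',\n",
"                                               'Venue',\n",
"                                               'Venue Latitude',\n",
"                                               'Venue Longitude',\n",
"                                               'Venue Category'])\n",
"mock_venues"
]
},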
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Call the function using values from our data frame.\n"
]
},
{
"cell_type": "code",
"execution_count": 96,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Abbotsford\n",
"Atlin Region\n",
"Burnaby\n",
"Campbell River Central\n",
"Cariboo and West Okanagan\n",
"Castlegar\n",
"Cedar\n",
"Central Island\n",
"Chilcotin\n",
"Chilliwack Central\n",
"Comox\n",
"Coquitlam\n",
"Courtenay Central\n",
"Cranbrook\n",
"Dawson Creek\n",
"Delta Central\n",
"Duncan\n",
"East Kootenays\n",
"Esquimalt\n",
"Fort St. John\n",
"Harrison Lake Region\n",
"Highlands\n",
"Inside Passage and the Queen Charlottes\n",
"Juan de Fuca Shore\n",
"Kamloops Central and\n",
"Kelowna Central\n",
"Kimberley\n",
"Kitimat\n",
"Ladysmith\n",
"Langley City\n",
"Lower Skeena\n",
"Maple Ridge\n",
"Merritt\n",
"Metchosin\n",
"Mission\n",
"Nanaimo Central\n",
"Nelson\n",
"New Westminster Northeast\n",
"North Central Island and Bute Inlet Region\n",
"Northern British Columbia\n",
"Oak Bay\n",
"Omineca and Yellowhead\n",
"Parksville\n",
"Penticton\n",
"Pitt Meadows\n",
"Port Alberni\n",
"Powell River\n",
"Prince George Central\n",
"Qualicum Beach\n",
"Quesnel\n",
"Richmond\n",
"Saanich Central\n",
"Salmon Arm\n",
"Saltspring Island\n",
"Sidney\n",
"Similkameen\n",
"Sooke\n",
"South Okanagan\n",
"Squamish\n",
"Surrey\n",
"Terrace\n",
"Trail\n",
"Upper Columbia Region\n",
"Vancouver\n",
"Vernon Central\n",
"Victoria Central British Columbia Provincial Government\n",
"West Kootenays\n",
"Westbank\n",
"Whistler\n",
"White Rock\n",
"Williams Lake\n",
"Winfield\n"
]
}
],
"source": [
"neighborhood_venues = getNearbyVenues(names=df_new['Place'],\n",
" latitudes=df_new['Latitude'],\n",
" longitudes=df_new['Longitude']\n",
" )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Checking the size of the resulting data frame.\n"
]
},
{
"cell_type": "code",
"execution_count": 97,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(145, 7)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Neighborhood</th>\n",
" <th>Neighborhood Latitude</th>\n",
" <th>Neighborhood Longitude</th>\n",
" <th>Venue</th>\n",
" <th>Venue Latitude</th>\n",
" <th>Venue Longitude</th>\n",
" <th>Venue Category</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Abbotsford</td>\n",
" <td>49.0625</td>\n",
" <td>-122.3125</td>\n",
" <td>Discovery Trail</td>\n",
" <td>49.060245</td>\n",
" <td>-122.315565</td>\n",
" <td>Trail</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Abbotsford</td>\n",
" <td>49.0625</td>\n",
" <td>-122.3125</td>\n",
" <td>Grandmas Market Gladwin Rd</td>\n",
" <td>49.066149</td>\n",
" <td>-122.313659</td>\n",
" <td>Grocery Store</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Burnaby</td>\n",
" <td>49.2500</td>\n",
" <td>-123.0000</td>\n",
" <td>BCITSA's Stand Central SE2</td>\n",
" <td>49.251424</td>\n",
" <td>-123.001384</td>\n",
" <td>Snack Place</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Burnaby</td>\n",
" <td>49.2500</td>\n",
" <td>-123.0000</td>\n",
" <td>BCIT Bookstore</td>\n",
" <td>49.251548</td>\n",
" <td>-123.001364</td>\n",
" <td>Bookstore</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Burnaby</td>\n",
" <td>49.2500</td>\n",
" <td>-123.0000</td>\n",
" <td>The Rix @ BCIT</td>\n",
" <td>49.251153</td>\n",
" <td>-123.000636</td>\n",
" <td>Coffee Shop</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Neighborhood Neighborhood Latitude Neighborhood Longitude \\\n",
"0 Abbotsford 49.0625 -122.3125 \n",
"1 Abbotsford 49.0625 -122.3125 \n",
"2 Burnaby 49.2500 -123.0000 \n",
"3 Burnaby 49.2500 -123.0000 \n",
"4 Burnaby 49.2500 -123.0000 \n",
"\n",
" Venue Venue Latitude Venue Longitude Venue Category \n",
"0 Discovery Trail 49.060245 -122.315565 Trail \n",
"1 Grandmas Market Gladwin Rd 49.066149 -122.313659 Grocery Store \n",
"2 BCITSA's Stand Central SE2 49.251424 -123.001384 Snack Place \n",
"3 BCIT Bookstore 49.251548 -123.001364 Bookstore \n",
"4 The Rix @ BCIT 49.251153 -123.000636 Coffee Shop "
]
},
"execution_count": 97,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"print(neighborhood_venues.shape)\n",
"neighborhood_venues.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Checking how many venues were returned for each neighborhood.\n"
]
},
{
"cell_type": "code",
"execution_count": 98,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Neighborhood Latitude</th>\n",
" <th>Neighborhood Longitude</th>\n",
" <th>Venue</th>\n",
" <th>Venue Latitude</th>\n",
" <th>Venue Longitude</th>\n",
" <th>Venue Category</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Neighborhood</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>Abbotsford</th>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Burnaby</th>\n",
" <td>9</td>\n",
" <td>9</td>\n",
" <td>9</td>\n",
" <td>9</td>\n",
" <td>9</td>\n",
" <td>9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Comox</th>\n",
" <td>6</td>\n",
" <td>6</td>\n",
" <td>6</td>\n",
" <td>6</td>\n",
" <td>6</td>\n",
" <td>6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Coquitlam</th>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Cranbrook</th>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Duncan</th>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Esquimalt</th>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Fort St. John</th>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Highlands</th>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Kelowna Central</th>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Kimberley</th>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Kitimat</th>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Langley City</th>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Nanaimo Central</th>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Nelson</th>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>New Westminster Northeast</th>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Oak Bay</th>\n",
" <td>9</td>\n",
" <td>9</td>\n",
" <td>9</td>\n",
" <td>9</td>\n",
" <td>9</td>\n",
" <td>9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Parksville</th>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Pitt Meadows</th>\n",
" <td>6</td>\n",
" <td>6</td>\n",
" <td>6</td>\n",
" <td>6</td>\n",
" <td>6</td>\n",
" <td>6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Qualicum Beach</th>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Richmond</th>\n",
" <td>27</td>\n",
" <td>27</td>\n",
" <td>27</td>\n",
" <td>27</td>\n",
" <td>27</td>\n",
" <td>27</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Saanich Central</th>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Salmon Arm</th>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Sooke</th>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>South Okanagan</th>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Surrey</th>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Terrace</th>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Trail</th>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Vancouver</th>\n",
" <td>20</td>\n",
" <td>20</td>\n",
" <td>20</td>\n",
" <td>20</td>\n",
" <td>20</td>\n",
" <td>20</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Victoria Central British Columbia Provincial Government</th>\n",
" <td>9</td>\n",
" <td>9</td>\n",
" <td>9</td>\n",
" <td>9</td>\n",
" <td>9</td>\n",
" <td>9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Whistler</th>\n",
" <td>7</td>\n",
" <td>7</td>\n",
" <td>7</td>\n",
" <td>7</td>\n",
" <td>7</td>\n",
" <td>7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>White Rock</th>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Neighborhood Latitude \\\n",
"Neighborhood \n",
"Abbotsford 2 \n",
"Burnaby 9 \n",
"Comox 6 \n",
"Coquitlam 5 \n",
"Cranbrook 1 \n",
"Duncan 3 \n",
"Esquimalt 1 \n",
"Fort St. John 2 \n",
"Highlands 4 \n",
"Kelowna Central 2 \n",
"Kimberley 4 \n",
"Kitimat 1 \n",
"Langley City 1 \n",
"Nanaimo Central 2 \n",
"Nelson 1 \n",
"New Westminster Northeast 3 \n",
"Oak Bay 9 \n",
"Parksville 2 \n",
"Pitt Meadows 6 \n",
"Qualicum Beach 4 \n",
"Richmond 27 \n",
"Saanich Central 2 \n",
"Salmon Arm 2 \n",
"Sooke 1 \n",
"South Okanagan 1 \n",
"Surrey 2 \n",
"Terrace 1 \n",
"Trail 1 \n",
"Vancouver 20 \n",
"Victoria Central British Columbia Provincial Go... 9 \n",
"Whistler 7 \n",
"White Rock 4 \n",
"\n",
" Neighborhood Longitude \\\n",
"Neighborhood \n",
"Abbotsford 2 \n",
"Burnaby 9 \n",
"Comox 6 \n",
"Coquitlam 5 \n",
"Cranbrook 1 \n",
"Duncan 3 \n",
"Esquimalt 1 \n",
"Fort St. John 2 \n",
"Highlands 4 \n",
"Kelowna Central 2 \n",
"Kimberley 4 \n",
"Kitimat 1 \n",
"Langley City 1 \n",
"Nanaimo Central 2 \n",
"Nelson 1 \n",
"New Westminster Northeast 3 \n",
"Oak Bay 9 \n",
"Parksville 2 \n",
"Pitt Meadows 6 \n",
"Qualicum Beach 4 \n",
"Richmond 27 \n",
"Saanich Central 2 \n",
"Salmon Arm 2 \n",
"Sooke 1 \n",
"South Okanagan 1 \n",
"Surrey 2 \n",
"Terrace 1 \n",
"Trail 1 \n",
"Vancouver 20 \n",
"Victoria Central British Columbia Provincial Go... 9 \n",
"Whistler 7 \n",
"White Rock 4 \n",
"\n",
" Venue Venue Latitude \\\n",
"Neighborhood \n",
"Abbotsford 2 2 \n",
"Burnaby 9 9 \n",
"Comox 6 6 \n",
"Coquitlam 5 5 \n",
"Cranbrook 1 1 \n",
"Duncan 3 3 \n",
"Esquimalt 1 1 \n",
"Fort St. John 2 2 \n",
"Highlands 4 4 \n",
"Kelowna Central 2 2 \n",
"Kimberley 4 4 \n",
"Kitimat 1 1 \n",
"Langley City 1 1 \n",
"Nanaimo Central 2 2 \n",
"Nelson 1 1 \n",
"New Westminster Northeast 3 3 \n",
"Oak Bay 9 9 \n",
"Parksville 2 2 \n",
"Pitt Meadows 6 6 \n",
"Qualicum Beach 4 4 \n",
"Richmond 27 27 \n",
"Saanich Central 2 2 \n",
"Salmon Arm 2 2 \n",
"Sooke 1 1 \n",
"South Okanagan 1 1 \n",
"Surrey 2 2 \n",
"Terrace 1 1 \n",
"Trail 1 1 \n",
"Vancouver 20 20 \n",
"Victoria Central British Columbia Provincial Go... 9 9 \n",
"Whistler 7 7 \n",
"White Rock 4 4 \n",
"\n",
" Venue Longitude \\\n",
"Neighborhood \n",
"Abbotsford 2 \n",
"Burnaby 9 \n",
"Comox 6 \n",
"Coquitlam 5 \n",
"Cranbrook 1 \n",
"Duncan 3 \n",
"Esquimalt 1 \n",
"Fort St. John 2 \n",
"Highlands 4 \n",
"Kelowna Central 2 \n",
"Kimberley 4 \n",
"Kitimat 1 \n",
"Langley City 1 \n",
"Nanaimo Central 2 \n",
"Nelson 1 \n",
"New Westminster Northeast 3 \n",
"Oak Bay 9 \n",
"Parksville 2 \n",
"Pitt Meadows 6 \n",
"Qualicum Beach 4 \n",
"Richmond 27 \n",
"Saanich Central 2 \n",
"Salmon Arm 2 \n",
"Sooke 1 \n",
"South Okanagan 1 \n",
"Surrey 2 \n",
"Terrace 1 \n",
"Trail 1 \n",
"Vancouver 20 \n",
"Victoria Central British Columbia Provincial Go... 9 \n",
"Whistler 7 \n",
"White Rock 4 \n",
"\n",
" Venue Category \n",
"Neighborhood \n",
"Abbotsford 2 \n",
"Burnaby 9 \n",
"Comox 6 \n",
"Coquitlam 5 \n",
"Cranbrook 1 \n",
"Duncan 3 \n",
"Esquimalt 1 \n",
"Fort St. John 2 \n",
"Highlands 4 \n",
"Kelowna Central 2 \n",
"Kimberley 4 \n",
"Kitimat 1 \n",
"Langley City 1 \n",
"Nanaimo Central 2 \n",
"Nelson 1 \n",
"New Westminster Northeast 3 \n",
"Oak Bay 9 \n",
"Parksville 2 \n",
"Pitt Meadows 6 \n",
"Qualicum Beach 4 \n",
"Richmond 27 \n",
"Saanich Central 2 \n",
"Salmon Arm 2 \n",
"Sooke 1 \n",
"South Okanagan 1 \n",
"Surrey 2 \n",
"Terrace 1 \n",
"Trail 1 \n",
"Vancouver 20 \n",
"Victoria Central British Columbia Provincial Go... 9 \n",
"Whistler 7 \n",
"White Rock 4 "
]
},
"execution_count": 98,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"neighborhood_venues.groupby('Neighborhood').count()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Counting the unique venue categories across British Columbia.\n"
]
},
{
"cell_type": "code",
"execution_count": 99,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"There are 76 unique categories.\n"
]
}
],
"source": [
"print('There are {} unique categories.'.format(len(neighborhood_venues['Venue Category'].unique())))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"## Analyze each neighborhood <a name=\"analysis\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### We encode the category of each venue with one-hot encoding, so that every venue becomes a row of 0/1 indicators with one column per category.\n"
]
},
{
"cell_type": "code",
"execution_count": 134,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(145, 76)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>American Restaurant</th>\n",
" <th>Asian Restaurant</th>\n",
" <th>Athletics &amp; Sports</th>\n",
" <th>Auto Workshop</th>\n",
" <th>Bakery</th>\n",
" <th>Bank</th>\n",
" <th>Baseball Field</th>\n",
" <th>Beach</th>\n",
" <th>Boat or Ferry</th>\n",
" <th>Bookstore</th>\n",
" <th>Breakfast Spot</th>\n",
" <th>Brewery</th>\n",
" <th>Bubble Tea Shop</th>\n",
" <th>Burger Joint</th>\n",
" <th>Bus Station</th>\n",
" <th>Bus Stop</th>\n",
" <th>Business Service</th>\n",
" <th>Café</th>\n",
" <th>Chinese Restaurant</th>\n",
" <th>Coffee Shop</th>\n",
" <th>Construction &amp; Landscaping</th>\n",
" <th>Convenience Store</th>\n",
" <th>Dessert Shop</th>\n",
" <th>Dim Sum Restaurant</th>\n",
" <th>Dog Run</th>\n",
" <th>Elementary School</th>\n",
" <th>Falafel Restaurant</th>\n",
" <th>Fast Food Restaurant</th>\n",
" <th>Fish &amp; Chips Shop</th>\n",
" <th>Fried Chicken Joint</th>\n",
" <th>Garden</th>\n",
" <th>Gas Station</th>\n",
" <th>Gastropub</th>\n",
" <th>Gift Shop</th>\n",
" <th>Golf Course</th>\n",
" <th>Greek Restaurant</th>\n",
" <th>Grocery Store</th>\n",
" <th>Gym</th>\n",
" <th>Gym / Fitness Center</th>\n",
" <th>Home Service</th>\n",
" <th>Hot Dog Joint</th>\n",
" <th>Hotel</th>\n",
" <th>Indian Restaurant</th>\n",
" <th>Juice Bar</th>\n",
" <th>Korean Restaurant</th>\n",
" <th>Lake</th>\n",
" <th>Liquor Store</th>\n",
" <th>Malay Restaurant</th>\n",
" <th>Men's Store</th>\n",
" <th>Motorcycle Shop</th>\n",
" <th>Mountain</th>\n",
" <th>Park</th>\n",
" <th>Pet Store</th>\n",
" <th>Pharmacy</th>\n",
" <th>Pizza Place</th>\n",
" <th>Playground</th>\n",
" <th>Plaza</th>\n",
" <th>Pub</th>\n",
" <th>Recreation Center</th>\n",
" <th>Restaurant</th>\n",
" <th>Salon / Barbershop</th>\n",
" <th>Sandwich Place</th>\n",
" <th>Shipping Store</th>\n",
" <th>Ski Area</th>\n",
" <th>Ski Lodge</th>\n",
" <th>Snack Place</th>\n",
" <th>Sporting Goods Shop</th>\n",
" <th>Sushi Restaurant</th>\n",
" <th>Theme Park</th>\n",
" <th>Tourist Information Center</th>\n",
" <th>Toy / Game Store</th>\n",
" <th>Trail</th>\n",
" <th>Vacation Rental</th>\n",
" <th>Vietnamese Restaurant</th>\n",
" <th>Wine Shop</th>\n",
" <th>Zoo</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" American Restaurant Asian Restaurant Athletics & Sports Auto Workshop \\\n",
"0 0 0 0 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
"\n",
" Bakery Bank Baseball Field Beach Boat or Ferry Bookstore \\\n",
"0 0 0 0 0 0 0 \n",
"1 0 0 0 0 0 0 \n",
"2 0 0 0 0 0 0 \n",
"3 0 0 0 0 0 1 \n",
"4 0 0 0 0 0 0 \n",
"\n",
" Breakfast Spot Brewery Bubble Tea Shop Burger Joint Bus Station \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Bus Stop Business Service Café Chinese Restaurant Coffee Shop \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 1 \n",
"\n",
" Construction & Landscaping Convenience Store Dessert Shop \\\n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"2 0 0 0 \n",
"3 0 0 0 \n",
"4 0 0 0 \n",
"\n",
" Dim Sum Restaurant Dog Run Elementary School Falafel Restaurant \\\n",
"0 0 0 0 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
"\n",
" Fast Food Restaurant Fish & Chips Shop Fried Chicken Joint Garden \\\n",
"0 0 0 0 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
"\n",
" Gas Station Gastropub Gift Shop Golf Course Greek Restaurant \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Grocery Store Gym Gym / Fitness Center Home Service Hot Dog Joint \\\n",
"0 0 0 0 0 0 \n",
"1 1 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Hotel Indian Restaurant Juice Bar Korean Restaurant Lake Liquor Store \\\n",
"0 0 0 0 0 0 0 \n",
"1 0 0 0 0 0 0 \n",
"2 0 0 0 0 0 0 \n",
"3 0 0 0 0 0 0 \n",
"4 0 0 0 0 0 0 \n",
"\n",
" Malay Restaurant Men's Store Motorcycle Shop Mountain Park Pet Store \\\n",
"0 0 0 0 0 0 0 \n",
"1 0 0 0 0 0 0 \n",
"2 0 0 0 0 0 0 \n",
"3 0 0 0 0 0 0 \n",
"4 0 0 0 0 0 0 \n",
"\n",
" Pharmacy Pizza Place Playground Plaza Pub Recreation Center \\\n",
"0 0 0 0 0 0 0 \n",
"1 0 0 0 0 0 0 \n",
"2 0 0 0 0 0 0 \n",
"3 0 0 0 0 0 0 \n",
"4 0 0 0 0 0 0 \n",
"\n",
" Restaurant Salon / Barbershop Sandwich Place Shipping Store Ski Area \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Ski Lodge Snack Place Sporting Goods Shop Sushi Restaurant Theme Park \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 1 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Tourist Information Center Toy / Game Store Trail Vacation Rental \\\n",
"0 0 0 1 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
"\n",
" Vietnamese Restaurant Wine Shop Zoo \n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"2 0 0 0 \n",
"3 0 0 0 \n",
"4 0 0 0 "
]
},
"execution_count": 134,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"one_hot = pd.get_dummies(neighborhood_venues[['Venue Category']], prefix=\"\", prefix_sep=\"\")\n",
"print(one_hot.shape)\n",
"one_hot.head()"
]
},
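{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### As a small illustration of the encoding above, the sketch below applies the same `pd.get_dummies` call to a three-row toy frame. The frame `toy` and its values are made up for this example and are not part of the project data.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Toy data: two venues in neighborhood A and one in neighborhood B (illustrative only)\n",
"toy = pd.DataFrame({'Neighborhood': ['A', 'A', 'B'],\n",
"                    'Venue Category': ['Trail', 'Park', 'Trail']})\n",
"\n",
"# One indicator column per distinct category, one row per venue\n",
"toy_one_hot = pd.get_dummies(toy[['Venue Category']], prefix=\"\", prefix_sep=\"\")\n",
"\n",
"# Cast to int so the indicators print as 0/1 regardless of the pandas version\n",
"print(toy_one_hot.columns.tolist())\n",
"print(toy_one_hot.astype(int).values.tolist())"
]
},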
{
"cell_type": "code",
"execution_count": 135,
"metadata": {},
"outputs": [],
"source": [
"pd.set_option('display.max_columns', one_hot.shape[1]+1)"
]
},
{
"cell_type": "code",
"execution_count": 136,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>American Restaurant</th>\n",
" <th>Asian Restaurant</th>\n",
" <th>Athletics &amp; Sports</th>\n",
" <th>Auto Workshop</th>\n",
" <th>Bakery</th>\n",
" <th>Bank</th>\n",
" <th>Baseball Field</th>\n",
" <th>Beach</th>\n",
" <th>Boat or Ferry</th>\n",
" <th>Bookstore</th>\n",
" <th>Breakfast Spot</th>\n",
" <th>Brewery</th>\n",
" <th>Bubble Tea Shop</th>\n",
" <th>Burger Joint</th>\n",
" <th>Bus Station</th>\n",
" <th>Bus Stop</th>\n",
" <th>Business Service</th>\n",
" <th>Café</th>\n",
" <th>Chinese Restaurant</th>\n",
" <th>Coffee Shop</th>\n",
" <th>Construction &amp; Landscaping</th>\n",
" <th>Convenience Store</th>\n",
" <th>Dessert Shop</th>\n",
" <th>Dim Sum Restaurant</th>\n",
" <th>Dog Run</th>\n",
" <th>Elementary School</th>\n",
" <th>Falafel Restaurant</th>\n",
" <th>Fast Food Restaurant</th>\n",
" <th>Fish &amp; Chips Shop</th>\n",
" <th>Fried Chicken Joint</th>\n",
" <th>Garden</th>\n",
" <th>Gas Station</th>\n",
" <th>Gastropub</th>\n",
" <th>Gift Shop</th>\n",
" <th>Golf Course</th>\n",
" <th>Greek Restaurant</th>\n",
" <th>Grocery Store</th>\n",
" <th>Gym</th>\n",
" <th>Gym / Fitness Center</th>\n",
" <th>Home Service</th>\n",
" <th>Hot Dog Joint</th>\n",
" <th>Hotel</th>\n",
" <th>Indian Restaurant</th>\n",
" <th>Juice Bar</th>\n",
" <th>Korean Restaurant</th>\n",
" <th>Lake</th>\n",
" <th>Liquor Store</th>\n",
" <th>Malay Restaurant</th>\n",
" <th>Men's Store</th>\n",
" <th>Motorcycle Shop</th>\n",
" <th>Mountain</th>\n",
" <th>Park</th>\n",
" <th>Pet Store</th>\n",
" <th>Pharmacy</th>\n",
" <th>Pizza Place</th>\n",
" <th>Playground</th>\n",
" <th>Plaza</th>\n",
" <th>Pub</th>\n",
" <th>Recreation Center</th>\n",
" <th>Restaurant</th>\n",
" <th>Salon / Barbershop</th>\n",
" <th>Sandwich Place</th>\n",
" <th>Shipping Store</th>\n",
" <th>Ski Area</th>\n",
" <th>Ski Lodge</th>\n",
" <th>Snack Place</th>\n",
" <th>Sporting Goods Shop</th>\n",
" <th>Sushi Restaurant</th>\n",
" <th>Theme Park</th>\n",
" <th>Tourist Information Center</th>\n",
" <th>Toy / Game Store</th>\n",
" <th>Trail</th>\n",
" <th>Vacation Rental</th>\n",
" <th>Vietnamese Restaurant</th>\n",
" <th>Wine Shop</th>\n",
" <th>Zoo</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>140</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>141</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>142</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>143</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>144</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>145 rows × 76 columns</p>\n",
"</div>"
],
"text/plain": [
" American Restaurant Asian Restaurant Athletics & Sports Auto Workshop \\\n",
"0 0 0 0 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
".. ... ... ... ... \n",
"140 0 0 0 0 \n",
"141 0 0 0 0 \n",
"142 0 0 0 0 \n",
"143 0 0 0 0 \n",
"144 0 0 1 0 \n",
"\n",
" Bakery Bank Baseball Field Beach Boat or Ferry Bookstore \\\n",
"0 0 0 0 0 0 0 \n",
"1 0 0 0 0 0 0 \n",
"2 0 0 0 0 0 0 \n",
"3 0 0 0 0 0 1 \n",
"4 0 0 0 0 0 0 \n",
".. ... ... ... ... ... ... \n",
"140 0 0 0 0 0 0 \n",
"141 0 0 0 0 0 0 \n",
"142 0 0 0 0 0 0 \n",
"143 0 0 0 0 0 0 \n",
"144 0 0 0 0 0 0 \n",
"\n",
" Breakfast Spot Brewery Bubble Tea Shop Burger Joint Bus Station \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
".. ... ... ... ... ... \n",
"140 0 0 0 0 0 \n",
"141 0 0 0 0 0 \n",
"142 0 0 0 0 0 \n",
"143 0 0 0 0 0 \n",
"144 0 0 0 0 0 \n",
"\n",
" Bus Stop Business Service Café Chinese Restaurant Coffee Shop \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 1 \n",
".. ... ... ... ... ... \n",
"140 0 0 0 0 0 \n",
"141 0 0 0 0 0 \n",
"142 0 0 0 0 0 \n",
"143 0 0 0 0 0 \n",
"144 0 0 0 0 0 \n",
"\n",
" Construction & Landscaping Convenience Store Dessert Shop \\\n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"2 0 0 0 \n",
"3 0 0 0 \n",
"4 0 0 0 \n",
".. ... ... ... \n",
"140 0 0 0 \n",
"141 0 0 0 \n",
"142 0 0 0 \n",
"143 0 0 0 \n",
"144 0 0 0 \n",
"\n",
" Dim Sum Restaurant Dog Run Elementary School Falafel Restaurant \\\n",
"0 0 0 0 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
".. ... ... ... ... \n",
"140 0 0 0 0 \n",
"141 0 0 0 0 \n",
"142 0 0 0 0 \n",
"143 0 0 0 0 \n",
"144 0 0 0 0 \n",
"\n",
" Fast Food Restaurant Fish & Chips Shop Fried Chicken Joint Garden \\\n",
"0 0 0 0 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
".. ... ... ... ... \n",
"140 0 0 0 0 \n",
"141 0 0 0 0 \n",
"142 0 0 0 0 \n",
"143 0 0 0 0 \n",
"144 0 0 0 0 \n",
"\n",
" Gas Station Gastropub Gift Shop Golf Course Greek Restaurant \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
".. ... ... ... ... ... \n",
"140 0 0 0 0 0 \n",
"141 0 0 0 0 0 \n",
"142 0 0 0 0 0 \n",
"143 0 0 0 0 0 \n",
"144 0 0 0 0 0 \n",
"\n",
" Grocery Store Gym Gym / Fitness Center Home Service Hot Dog Joint \\\n",
"0 0 0 0 0 0 \n",
"1 1 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
".. ... ... ... ... ... \n",
"140 0 0 0 0 0 \n",
"141 0 0 0 0 0 \n",
"142 0 0 0 0 0 \n",
"143 0 0 1 0 0 \n",
"144 0 0 0 0 0 \n",
"\n",
" Hotel Indian Restaurant Juice Bar Korean Restaurant Lake \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
".. ... ... ... ... ... \n",
"140 1 0 0 0 0 \n",
"141 0 0 0 0 0 \n",
"142 0 0 0 0 0 \n",
"143 0 0 0 0 0 \n",
"144 0 0 0 0 0 \n",
"\n",
" Liquor Store Malay Restaurant Men's Store Motorcycle Shop Mountain \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
".. ... ... ... ... ... \n",
"140 0 0 0 0 0 \n",
"141 0 0 0 0 0 \n",
"142 0 0 0 0 0 \n",
"143 0 0 0 0 0 \n",
"144 0 0 0 0 0 \n",
"\n",
" Park Pet Store Pharmacy Pizza Place Playground Plaza Pub \\\n",
"0 0 0 0 0 0 0 0 \n",
"1 0 0 0 0 0 0 0 \n",
"2 0 0 0 0 0 0 0 \n",
"3 0 0 0 0 0 0 0 \n",
"4 0 0 0 0 0 0 0 \n",
".. ... ... ... ... ... ... ... \n",
"140 0 0 0 0 0 0 0 \n",
"141 1 0 0 0 0 0 0 \n",
"142 1 0 0 0 0 0 0 \n",
"143 0 0 0 0 0 0 0 \n",
"144 0 0 0 0 0 0 0 \n",
"\n",
" Recreation Center Restaurant Salon / Barbershop Sandwich Place \\\n",
"0 0 0 0 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
".. ... ... ... ... \n",
"140 0 0 0 0 \n",
"141 0 0 0 0 \n",
"142 0 0 0 0 \n",
"143 0 0 0 0 \n",
"144 0 0 0 0 \n",
"\n",
" Shipping Store Ski Area Ski Lodge Snack Place Sporting Goods Shop \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 1 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
".. ... ... ... ... ... \n",
"140 0 0 0 0 0 \n",
"141 0 0 0 0 0 \n",
"142 0 0 0 0 0 \n",
"143 0 0 0 0 0 \n",
"144 0 0 0 0 0 \n",
"\n",
" Sushi Restaurant Theme Park Tourist Information Center \\\n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"2 0 0 0 \n",
"3 0 0 0 \n",
"4 0 0 0 \n",
".. ... ... ... \n",
"140 0 0 0 \n",
"141 0 0 0 \n",
"142 0 0 0 \n",
"143 0 0 0 \n",
"144 0 0 0 \n",
"\n",
" Toy / Game Store Trail Vacation Rental Vietnamese Restaurant \\\n",
"0 0 1 0 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
".. ... ... ... ... \n",
"140 0 0 0 0 \n",
"141 0 0 0 0 \n",
"142 0 0 0 0 \n",
"143 0 0 0 0 \n",
"144 0 0 0 0 \n",
"\n",
" Wine Shop Zoo \n",
"0 0 0 \n",
"1 0 0 \n",
"2 0 0 \n",
"3 0 0 \n",
"4 0 0 \n",
".. ... ... \n",
"140 0 0 \n",
"141 0 0 \n",
"142 0 0 \n",
"143 0 0 \n",
"144 0 0 \n",
"\n",
"[145 rows x 76 columns]"
]
},
"execution_count": 136,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"one_hot"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Add neighborhood values corresponding to their nearby locations.\n"
]
},
{
"cell_type": "code",
"execution_count": 138,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>American Restaurant</th>\n",
" <th>Asian Restaurant</th>\n",
" <th>Athletics &amp; Sports</th>\n",
" <th>Auto Workshop</th>\n",
" <th>Bakery</th>\n",
" <th>Bank</th>\n",
" <th>Baseball Field</th>\n",
" <th>Beach</th>\n",
" <th>Boat or Ferry</th>\n",
" <th>Bookstore</th>\n",
" <th>Breakfast Spot</th>\n",
" <th>Brewery</th>\n",
" <th>Bubble Tea Shop</th>\n",
" <th>Burger Joint</th>\n",
" <th>Bus Station</th>\n",
" <th>Bus Stop</th>\n",
" <th>Business Service</th>\n",
" <th>Café</th>\n",
" <th>Chinese Restaurant</th>\n",
" <th>Coffee Shop</th>\n",
" <th>Construction &amp; Landscaping</th>\n",
" <th>Convenience Store</th>\n",
" <th>Dessert Shop</th>\n",
" <th>Dim Sum Restaurant</th>\n",
" <th>Dog Run</th>\n",
" <th>Elementary School</th>\n",
" <th>Falafel Restaurant</th>\n",
" <th>Fast Food Restaurant</th>\n",
" <th>Fish &amp; Chips Shop</th>\n",
" <th>Fried Chicken Joint</th>\n",
" <th>Garden</th>\n",
" <th>Gas Station</th>\n",
" <th>Gastropub</th>\n",
" <th>Gift Shop</th>\n",
" <th>Golf Course</th>\n",
" <th>Greek Restaurant</th>\n",
" <th>Grocery Store</th>\n",
" <th>Gym</th>\n",
" <th>Gym / Fitness Center</th>\n",
" <th>Home Service</th>\n",
" <th>Hot Dog Joint</th>\n",
" <th>Hotel</th>\n",
" <th>Indian Restaurant</th>\n",
" <th>Juice Bar</th>\n",
" <th>Korean Restaurant</th>\n",
" <th>Lake</th>\n",
" <th>Liquor Store</th>\n",
" <th>Malay Restaurant</th>\n",
" <th>Men's Store</th>\n",
" <th>Motorcycle Shop</th>\n",
" <th>Mountain</th>\n",
" <th>Park</th>\n",
" <th>Pet Store</th>\n",
" <th>Pharmacy</th>\n",
" <th>Pizza Place</th>\n",
" <th>Playground</th>\n",
" <th>Plaza</th>\n",
" <th>Pub</th>\n",
" <th>Recreation Center</th>\n",
" <th>Restaurant</th>\n",
" <th>Salon / Barbershop</th>\n",
" <th>Sandwich Place</th>\n",
" <th>Shipping Store</th>\n",
" <th>Ski Area</th>\n",
" <th>Ski Lodge</th>\n",
" <th>Snack Place</th>\n",
" <th>Sporting Goods Shop</th>\n",
" <th>Sushi Restaurant</th>\n",
" <th>Theme Park</th>\n",
" <th>Tourist Information Center</th>\n",
" <th>Toy / Game Store</th>\n",
" <th>Trail</th>\n",
" <th>Vacation Rental</th>\n",
" <th>Vietnamese Restaurant</th>\n",
" <th>Wine Shop</th>\n",
" <th>Zoo</th>\n",
" <th>Neighborhood</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>Abbotsford</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>Abbotsford</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>Burnaby</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>Burnaby</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>Burnaby</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" American Restaurant Asian Restaurant Athletics & Sports Auto Workshop \\\n",
"0 0 0 0 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
"\n",
" Bakery Bank Baseball Field Beach Boat or Ferry Bookstore \\\n",
"0 0 0 0 0 0 0 \n",
"1 0 0 0 0 0 0 \n",
"2 0 0 0 0 0 0 \n",
"3 0 0 0 0 0 1 \n",
"4 0 0 0 0 0 0 \n",
"\n",
" Breakfast Spot Brewery Bubble Tea Shop Burger Joint Bus Station \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Bus Stop Business Service Café Chinese Restaurant Coffee Shop \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 1 \n",
"\n",
" Construction & Landscaping Convenience Store Dessert Shop \\\n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"2 0 0 0 \n",
"3 0 0 0 \n",
"4 0 0 0 \n",
"\n",
" Dim Sum Restaurant Dog Run Elementary School Falafel Restaurant \\\n",
"0 0 0 0 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
"\n",
" Fast Food Restaurant Fish & Chips Shop Fried Chicken Joint Garden \\\n",
"0 0 0 0 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
"\n",
" Gas Station Gastropub Gift Shop Golf Course Greek Restaurant \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Grocery Store Gym Gym / Fitness Center Home Service Hot Dog Joint \\\n",
"0 0 0 0 0 0 \n",
"1 1 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Hotel Indian Restaurant Juice Bar Korean Restaurant Lake Liquor Store \\\n",
"0 0 0 0 0 0 0 \n",
"1 0 0 0 0 0 0 \n",
"2 0 0 0 0 0 0 \n",
"3 0 0 0 0 0 0 \n",
"4 0 0 0 0 0 0 \n",
"\n",
" Malay Restaurant Men's Store Motorcycle Shop Mountain Park Pet Store \\\n",
"0 0 0 0 0 0 0 \n",
"1 0 0 0 0 0 0 \n",
"2 0 0 0 0 0 0 \n",
"3 0 0 0 0 0 0 \n",
"4 0 0 0 0 0 0 \n",
"\n",
" Pharmacy Pizza Place Playground Plaza Pub Recreation Center \\\n",
"0 0 0 0 0 0 0 \n",
"1 0 0 0 0 0 0 \n",
"2 0 0 0 0 0 0 \n",
"3 0 0 0 0 0 0 \n",
"4 0 0 0 0 0 0 \n",
"\n",
" Restaurant Salon / Barbershop Sandwich Place Shipping Store Ski Area \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Ski Lodge Snack Place Sporting Goods Shop Sushi Restaurant Theme Park \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 1 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Tourist Information Center Toy / Game Store Trail Vacation Rental \\\n",
"0 0 0 1 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
"\n",
" Vietnamese Restaurant Wine Shop Zoo Neighborhood \n",
"0 0 0 0 Abbotsford \n",
"1 0 0 0 Abbotsford \n",
"2 0 0 0 Burnaby \n",
"3 0 0 0 Burnaby \n",
"4 0 0 0 Burnaby "
]
},
"execution_count": 138,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"one_hot['Neighborhood'] = neighborhood_venues['Neighborhood'] \n",
"one_hot.head()"
]
},
{
"cell_type": "code",
"execution_count": 139,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Neighborhood</th>\n",
" <th>American Restaurant</th>\n",
" <th>Asian Restaurant</th>\n",
" <th>Athletics &amp; Sports</th>\n",
" <th>Auto Workshop</th>\n",
" <th>Bakery</th>\n",
" <th>Bank</th>\n",
" <th>Baseball Field</th>\n",
" <th>Beach</th>\n",
" <th>Boat or Ferry</th>\n",
" <th>Bookstore</th>\n",
" <th>Breakfast Spot</th>\n",
" <th>Brewery</th>\n",
" <th>Bubble Tea Shop</th>\n",
" <th>Burger Joint</th>\n",
" <th>Bus Station</th>\n",
" <th>Bus Stop</th>\n",
" <th>Business Service</th>\n",
" <th>Café</th>\n",
" <th>Chinese Restaurant</th>\n",
" <th>Coffee Shop</th>\n",
" <th>Construction &amp; Landscaping</th>\n",
" <th>Convenience Store</th>\n",
" <th>Dessert Shop</th>\n",
" <th>Dim Sum Restaurant</th>\n",
" <th>Dog Run</th>\n",
" <th>Elementary School</th>\n",
" <th>Falafel Restaurant</th>\n",
" <th>Fast Food Restaurant</th>\n",
" <th>Fish &amp; Chips Shop</th>\n",
" <th>Fried Chicken Joint</th>\n",
" <th>Garden</th>\n",
" <th>Gas Station</th>\n",
" <th>Gastropub</th>\n",
" <th>Gift Shop</th>\n",
" <th>Golf Course</th>\n",
" <th>Greek Restaurant</th>\n",
" <th>Grocery Store</th>\n",
" <th>Gym</th>\n",
" <th>Gym / Fitness Center</th>\n",
" <th>Home Service</th>\n",
" <th>Hot Dog Joint</th>\n",
" <th>Hotel</th>\n",
" <th>Indian Restaurant</th>\n",
" <th>Juice Bar</th>\n",
" <th>Korean Restaurant</th>\n",
" <th>Lake</th>\n",
" <th>Liquor Store</th>\n",
" <th>Malay Restaurant</th>\n",
" <th>Men's Store</th>\n",
" <th>Motorcycle Shop</th>\n",
" <th>Mountain</th>\n",
" <th>Park</th>\n",
" <th>Pet Store</th>\n",
" <th>Pharmacy</th>\n",
" <th>Pizza Place</th>\n",
" <th>Playground</th>\n",
" <th>Plaza</th>\n",
" <th>Pub</th>\n",
" <th>Recreation Center</th>\n",
" <th>Restaurant</th>\n",
" <th>Salon / Barbershop</th>\n",
" <th>Sandwich Place</th>\n",
" <th>Shipping Store</th>\n",
" <th>Ski Area</th>\n",
" <th>Ski Lodge</th>\n",
" <th>Snack Place</th>\n",
" <th>Sporting Goods Shop</th>\n",
" <th>Sushi Restaurant</th>\n",
" <th>Theme Park</th>\n",
" <th>Tourist Information Center</th>\n",
" <th>Toy / Game Store</th>\n",
" <th>Trail</th>\n",
" <th>Vacation Rental</th>\n",
" <th>Vietnamese Restaurant</th>\n",
" <th>Wine Shop</th>\n",
" <th>Zoo</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Abbotsford</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Abbotsford</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Burnaby</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Burnaby</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Burnaby</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Neighborhood American Restaurant Asian Restaurant Athletics & Sports \\\n",
"0 Abbotsford 0 0 0 \n",
"1 Abbotsford 0 0 0 \n",
"2 Burnaby 0 0 0 \n",
"3 Burnaby 0 0 0 \n",
"4 Burnaby 0 0 0 \n",
"\n",
" Auto Workshop Bakery Bank Baseball Field Beach Boat or Ferry \\\n",
"0 0 0 0 0 0 0 \n",
"1 0 0 0 0 0 0 \n",
"2 0 0 0 0 0 0 \n",
"3 0 0 0 0 0 0 \n",
"4 0 0 0 0 0 0 \n",
"\n",
" Bookstore Breakfast Spot Brewery Bubble Tea Shop Burger Joint \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 1 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Bus Station Bus Stop Business Service Café Chinese Restaurant \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Coffee Shop Construction & Landscaping Convenience Store Dessert Shop \\\n",
"0 0 0 0 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 1 0 0 0 \n",
"\n",
" Dim Sum Restaurant Dog Run Elementary School Falafel Restaurant \\\n",
"0 0 0 0 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
"\n",
" Fast Food Restaurant Fish & Chips Shop Fried Chicken Joint Garden \\\n",
"0 0 0 0 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
"\n",
" Gas Station Gastropub Gift Shop Golf Course Greek Restaurant \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Grocery Store Gym Gym / Fitness Center Home Service Hot Dog Joint \\\n",
"0 0 0 0 0 0 \n",
"1 1 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Hotel Indian Restaurant Juice Bar Korean Restaurant Lake Liquor Store \\\n",
"0 0 0 0 0 0 0 \n",
"1 0 0 0 0 0 0 \n",
"2 0 0 0 0 0 0 \n",
"3 0 0 0 0 0 0 \n",
"4 0 0 0 0 0 0 \n",
"\n",
" Malay Restaurant Men's Store Motorcycle Shop Mountain Park Pet Store \\\n",
"0 0 0 0 0 0 0 \n",
"1 0 0 0 0 0 0 \n",
"2 0 0 0 0 0 0 \n",
"3 0 0 0 0 0 0 \n",
"4 0 0 0 0 0 0 \n",
"\n",
" Pharmacy Pizza Place Playground Plaza Pub Recreation Center \\\n",
"0 0 0 0 0 0 0 \n",
"1 0 0 0 0 0 0 \n",
"2 0 0 0 0 0 0 \n",
"3 0 0 0 0 0 0 \n",
"4 0 0 0 0 0 0 \n",
"\n",
" Restaurant Salon / Barbershop Sandwich Place Shipping Store Ski Area \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Ski Lodge Snack Place Sporting Goods Shop Sushi Restaurant Theme Park \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 1 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Tourist Information Center Toy / Game Store Trail Vacation Rental \\\n",
"0 0 0 1 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
"\n",
" Vietnamese Restaurant Wine Shop Zoo \n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"2 0 0 0 \n",
"3 0 0 0 \n",
"4 0 0 0 "
]
},
"execution_count": 139,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Move 'Neighborhood' from the last column to the first\n",
"fixed_columns = [one_hot.columns[-1]] + list(one_hot.columns[:-1])\n",
"one_hot = one_hot[fixed_columns]\n",
"one_hot.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Check the size of the new data frame: one row per venue, and one column for the neighborhood plus one per venue category.\n"
]
},
{
"cell_type": "code",
"execution_count": 140,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(145, 77)"
]
},
"execution_count": 140,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"one_hot.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Next, let's group the rows by neighborhood and take the mean of each one-hot column, which gives the relative frequency of each venue category in that neighborhood.\n"
]
},
{
"cell_type": "code",
"execution_count": 141,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Neighborhood</th>\n",
" <th>American Restaurant</th>\n",
" <th>Asian Restaurant</th>\n",
" <th>Athletics &amp; Sports</th>\n",
" <th>Auto Workshop</th>\n",
" <th>Bakery</th>\n",
" <th>Bank</th>\n",
" <th>Baseball Field</th>\n",
" <th>Beach</th>\n",
" <th>Boat or Ferry</th>\n",
" <th>Bookstore</th>\n",
" <th>Breakfast Spot</th>\n",
" <th>Brewery</th>\n",
" <th>Bubble Tea Shop</th>\n",
" <th>Burger Joint</th>\n",
" <th>Bus Station</th>\n",
" <th>Bus Stop</th>\n",
" <th>Business Service</th>\n",
" <th>Café</th>\n",
" <th>Chinese Restaurant</th>\n",
" <th>Coffee Shop</th>\n",
" <th>Construction &amp; Landscaping</th>\n",
" <th>Convenience Store</th>\n",
" <th>Dessert Shop</th>\n",
" <th>Dim Sum Restaurant</th>\n",
" <th>Dog Run</th>\n",
" <th>Elementary School</th>\n",
" <th>Falafel Restaurant</th>\n",
" <th>Fast Food Restaurant</th>\n",
" <th>Fish &amp; Chips Shop</th>\n",
" <th>Fried Chicken Joint</th>\n",
" <th>Garden</th>\n",
" <th>Gas Station</th>\n",
" <th>Gastropub</th>\n",
" <th>Gift Shop</th>\n",
" <th>Golf Course</th>\n",
" <th>Greek Restaurant</th>\n",
" <th>Grocery Store</th>\n",
" <th>Gym</th>\n",
" <th>Gym / Fitness Center</th>\n",
" <th>Home Service</th>\n",
" <th>Hot Dog Joint</th>\n",
" <th>Hotel</th>\n",
" <th>Indian Restaurant</th>\n",
" <th>Juice Bar</th>\n",
" <th>Korean Restaurant</th>\n",
" <th>Lake</th>\n",
" <th>Liquor Store</th>\n",
" <th>Malay Restaurant</th>\n",
" <th>Men's Store</th>\n",
" <th>Motorcycle Shop</th>\n",
" <th>Mountain</th>\n",
" <th>Park</th>\n",
" <th>Pet Store</th>\n",
" <th>Pharmacy</th>\n",
" <th>Pizza Place</th>\n",
" <th>Playground</th>\n",
" <th>Plaza</th>\n",
" <th>Pub</th>\n",
" <th>Recreation Center</th>\n",
" <th>Restaurant</th>\n",
" <th>Salon / Barbershop</th>\n",
" <th>Sandwich Place</th>\n",
" <th>Shipping Store</th>\n",
" <th>Ski Area</th>\n",
" <th>Ski Lodge</th>\n",
" <th>Snack Place</th>\n",
" <th>Sporting Goods Shop</th>\n",
" <th>Sushi Restaurant</th>\n",
" <th>Theme Park</th>\n",
" <th>Tourist Information Center</th>\n",
" <th>Toy / Game Store</th>\n",
" <th>Trail</th>\n",
" <th>Vacation Rental</th>\n",
" <th>Vietnamese Restaurant</th>\n",
" <th>Wine Shop</th>\n",
" <th>Zoo</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Abbotsford</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.5</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.5</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Burnaby</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.111111</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.111111</td>\n",
" <td>0.111111</td>\n",
" <td>0.222222</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.111111</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.111111</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.111111</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.111111</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Comox</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.166667</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.333333</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.166667</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.166667</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.166667</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Coquitlam</td>\n",
" <td>0.0</td>\n",
" <td>0.2</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.200000</td>\n",
" <td>0.0</td>\n",
" <td>0.2</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.2</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.2</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Cranbrook</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>1.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.000000</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Neighborhood American Restaurant Asian Restaurant Athletics & Sports \\\n",
"0 Abbotsford 0.0 0.0 0.0 \n",
"1 Burnaby 0.0 0.0 0.0 \n",
"2 Comox 0.0 0.0 0.0 \n",
"3 Coquitlam 0.0 0.2 0.0 \n",
"4 Cranbrook 0.0 0.0 0.0 \n",
"\n",
" Auto Workshop Bakery Bank Baseball Field Beach Boat or Ferry \\\n",
"0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
"1 0.0 0.0 0.0 0.0 0.0 0.0 \n",
"2 0.0 0.0 0.0 0.0 0.0 0.0 \n",
"3 0.0 0.0 0.0 0.0 0.0 0.0 \n",
"4 0.0 0.0 0.0 0.0 0.0 0.0 \n",
"\n",
" Bookstore Breakfast Spot Brewery Bubble Tea Shop Burger Joint \\\n",
"0 0.000000 0.0 0.0 0.0 0.000000 \n",
"1 0.111111 0.0 0.0 0.0 0.111111 \n",
"2 0.000000 0.0 0.0 0.0 0.000000 \n",
"3 0.000000 0.0 0.0 0.0 0.000000 \n",
"4 0.000000 0.0 0.0 0.0 0.000000 \n",
"\n",
" Bus Station Bus Stop Business Service Café Chinese Restaurant \\\n",
"0 0.000000 0.000000 0.0 0.0 0.0 \n",
"1 0.111111 0.222222 0.0 0.0 0.0 \n",
"2 0.000000 0.000000 0.0 0.0 0.0 \n",
"3 0.000000 0.000000 0.0 0.0 0.0 \n",
"4 0.000000 0.000000 0.0 0.0 0.0 \n",
"\n",
" Coffee Shop Construction & Landscaping Convenience Store Dessert Shop \\\n",
"0 0.000000 0.0 0.0 0.0 \n",
"1 0.111111 0.0 0.0 0.0 \n",
"2 0.166667 0.0 0.0 0.0 \n",
"3 0.200000 0.0 0.2 0.0 \n",
"4 0.000000 1.0 0.0 0.0 \n",
"\n",
" Dim Sum Restaurant Dog Run Elementary School Falafel Restaurant \\\n",
"0 0.0 0.0 0.0 0.0 \n",
"1 0.0 0.0 0.0 0.0 \n",
"2 0.0 0.0 0.0 0.0 \n",
"3 0.0 0.0 0.0 0.0 \n",
"4 0.0 0.0 0.0 0.0 \n",
"\n",
" Fast Food Restaurant Fish & Chips Shop Fried Chicken Joint Garden \\\n",
"0 0.000000 0.0 0.0 0.0 \n",
"1 0.000000 0.0 0.0 0.0 \n",
"2 0.333333 0.0 0.0 0.0 \n",
"3 0.000000 0.0 0.0 0.0 \n",
"4 0.000000 0.0 0.0 0.0 \n",
"\n",
" Gas Station Gastropub Gift Shop Golf Course Greek Restaurant \\\n",
"0 0.0 0.0 0.0 0.0 0.0 \n",
"1 0.0 0.0 0.0 0.0 0.0 \n",
"2 0.0 0.0 0.0 0.0 0.0 \n",
"3 0.2 0.0 0.0 0.2 0.0 \n",
"4 0.0 0.0 0.0 0.0 0.0 \n",
"\n",
" Grocery Store Gym Gym / Fitness Center Home Service Hot Dog Joint \\\n",
"0 0.5 0.0 0.0 0.0 0.0 \n",
"1 0.0 0.0 0.0 0.0 0.0 \n",
"2 0.0 0.0 0.0 0.0 0.0 \n",
"3 0.0 0.0 0.0 0.0 0.0 \n",
"4 0.0 0.0 0.0 0.0 0.0 \n",
"\n",
" Hotel Indian Restaurant Juice Bar Korean Restaurant Lake Liquor Store \\\n",
"0 0.0 0.0 0.000000 0.0 0.0 0.0 \n",
"1 0.0 0.0 0.000000 0.0 0.0 0.0 \n",
"2 0.0 0.0 0.166667 0.0 0.0 0.0 \n",
"3 0.0 0.0 0.000000 0.0 0.0 0.0 \n",
"4 0.0 0.0 0.000000 0.0 0.0 0.0 \n",
"\n",
" Malay Restaurant Men's Store Motorcycle Shop Mountain Park \\\n",
"0 0.0 0.0 0.0 0.0 0.000000 \n",
"1 0.0 0.0 0.0 0.0 0.111111 \n",
"2 0.0 0.0 0.0 0.0 0.000000 \n",
"3 0.0 0.0 0.0 0.0 0.000000 \n",
"4 0.0 0.0 0.0 0.0 0.000000 \n",
"\n",
" Pet Store Pharmacy Pizza Place Playground Plaza Pub \\\n",
"0 0.0 0.000000 0.0 0.0 0.0 0.0 \n",
"1 0.0 0.000000 0.0 0.0 0.0 0.0 \n",
"2 0.0 0.166667 0.0 0.0 0.0 0.0 \n",
"3 0.0 0.000000 0.0 0.0 0.0 0.0 \n",
"4 0.0 0.000000 0.0 0.0 0.0 0.0 \n",
"\n",
" Recreation Center Restaurant Salon / Barbershop Sandwich Place \\\n",
"0 0.0 0.0 0.0 0.000000 \n",
"1 0.0 0.0 0.0 0.111111 \n",
"2 0.0 0.0 0.0 0.166667 \n",
"3 0.0 0.0 0.0 0.000000 \n",
"4 0.0 0.0 0.0 0.000000 \n",
"\n",
" Shipping Store Ski Area Ski Lodge Snack Place Sporting Goods Shop \\\n",
"0 0.0 0.0 0.0 0.000000 0.0 \n",
"1 0.0 0.0 0.0 0.111111 0.0 \n",
"2 0.0 0.0 0.0 0.000000 0.0 \n",
"3 0.0 0.0 0.0 0.000000 0.0 \n",
"4 0.0 0.0 0.0 0.000000 0.0 \n",
"\n",
" Sushi Restaurant Theme Park Tourist Information Center Toy / Game Store \\\n",
"0 0.0 0.0 0.0 0.0 \n",
"1 0.0 0.0 0.0 0.0 \n",
"2 0.0 0.0 0.0 0.0 \n",
"3 0.0 0.0 0.0 0.0 \n",
"4 0.0 0.0 0.0 0.0 \n",
"\n",
" Trail Vacation Rental Vietnamese Restaurant Wine Shop Zoo \n",
"0 0.5 0.0 0.0 0.0 0.0 \n",
"1 0.0 0.0 0.0 0.0 0.0 \n",
"2 0.0 0.0 0.0 0.0 0.0 \n",
"3 0.0 0.0 0.0 0.0 0.0 \n",
"4 0.0 0.0 0.0 0.0 0.0 "
]
},
"execution_count": 141,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"british_columbia_grouped = one_hot.groupby('Neighborhood').mean().reset_index()\n",
"british_columbia_grouped.head()"
]
},
{
"cell_type": "code",
"execution_count": 142,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(32, 77)"
]
},
"execution_count": 142,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"british_columbia_grouped.shape"
]
},
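{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### To see why the mean gives a frequency, here is a minimal sketch on made-up data. The frame `toy` and its neighborhoods and categories are hypothetical and are not part of the project data.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical one-hot data: one row per venue, exactly one 1 per row\n",
"toy = pd.DataFrame({\n",
"    'Neighborhood': ['A', 'A', 'B'],\n",
"    'Cafe': [1, 0, 1],\n",
"    'Park': [0, 1, 0],\n",
"})\n",
"\n",
"# The mean of a 0/1 column within a group is the share of that\n",
"# group's venues that fall in the category\n",
"toy_grouped = toy.groupby('Neighborhood').mean().reset_index()\n",
"print(toy_grouped)\n",
"```\n",
"\n",
"Neighborhood A has one Cafe and one Park, so both columns average to 0.5; neighborhood B has a single Cafe, so Cafe is 1.0 and Park is 0.0. Because every venue belongs to exactly one category, the category values in each grouped row sum to 1.\n"
]
},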
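{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### We define a function that sorts a neighborhood's venue categories by frequency in descending order and returns the names of the top `num_top_venues` categories.\n"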
]
},
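{
"cell_type": "code",
"execution_count": 143,
"metadata": {},
"outputs": [],
"source": [
"def return_most_common_venues(row, num_top_venues):\n",
"    # Skip the first entry, which holds the neighborhood name\n",
"    row_categories = row.iloc[1:]\n",
"    # Sort the category frequencies from highest to lowest\n",
"    row_categories_sorted = row_categories.sort_values(ascending=False)\n",
"\n",
"    # Return the names of the top categories\n",
"    return row_categories_sorted.index.values[0:num_top_venues]"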
]
},
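{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### As a quick check of the sorting function, the sketch below applies it to a made-up row. The row `toy_row` and the helper `return_nonzero_venues` are hypothetical additions for illustration and are not used elsewhere in this notebook.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"def return_most_common_venues(row, num_top_venues):\n",
"    row_categories = row.iloc[1:]\n",
"    row_categories_sorted = row_categories.sort_values(ascending=False)\n",
"    return row_categories_sorted.index.values[0:num_top_venues]\n",
"\n",
"# Hypothetical row shaped like one row of the grouped data frame:\n",
"# the neighborhood name first, then one frequency per category\n",
"toy_row = pd.Series({'Neighborhood': 'A', 'Cafe': 0.2, 'Park': 0.5,\n",
"                     'Zoo': 0.0, 'Gym': 0.3})\n",
"\n",
"# The two most frequent categories are Park and Gym\n",
"print(return_most_common_venues(toy_row, 2))\n",
"\n",
"def return_nonzero_venues(row, num_top_venues):\n",
"    # Hypothetical variant: drop categories that never occur before ranking\n",
"    row_categories = row.iloc[1:].astype(float)\n",
"    nonzero = row_categories[row_categories > 0]\n",
"    return nonzero.sort_values(ascending=False).index.values[0:num_top_venues]\n",
"\n",
"# Only three categories occur, so only three names come back\n",
"print(return_nonzero_venues(toy_row, 4))\n",
"```\n",
"\n",
"The variant can return fewer than `num_top_venues` names, so any code that expects a fixed number of columns would need to pad the result before using it.\n"
]
},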
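{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Now we create the new data frame and display the 10 most common venue categories for each neighborhood. When a neighborhood has fewer than 10 categories with a nonzero frequency, the remaining slots are filled with zero-frequency categories, so only the leading entries are meaningful for such rows.\n"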
]
},
{
"cell_type": "code",
"execution_count": 147,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Neighborhood</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Abbotsford</td>\n",
" <td>Grocery Store</td>\n",
" <td>Trail</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Burnaby</td>\n",
" <td>Bus Stop</td>\n",
" <td>Bookstore</td>\n",
" <td>Snack Place</td>\n",
" <td>Park</td>\n",
" <td>Bus Station</td>\n",
" <td>Burger Joint</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Falafel Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Comox</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Pharmacy</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Juice Bar</td>\n",
" <td>Elementary School</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Coquitlam</td>\n",
" <td>Asian Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Golf Course</td>\n",
" <td>Gas Station</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Zoo</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Cranbrook</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Duncan</td>\n",
" <td>Convenience Store</td>\n",
" <td>Gas Station</td>\n",
" <td>Dog Run</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>Esquimalt</td>\n",
" <td>Boat or Ferry</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Zoo</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>Fort St. John</td>\n",
" <td>American Restaurant</td>\n",
" <td>Gas Station</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>Highlands</td>\n",
" <td>Zoo</td>\n",
" <td>Theme Park</td>\n",
" <td>Boat or Ferry</td>\n",
" <td>Wine Shop</td>\n",
" <td>Auto Workshop</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>Kelowna Central</td>\n",
" <td>Park</td>\n",
" <td>Mountain</td>\n",
" <td>Zoo</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Fast Food Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>Kimberley</td>\n",
" <td>American Restaurant</td>\n",
" <td>Ski Lodge</td>\n",
" <td>Hotel</td>\n",
" <td>Ski Area</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>Kitimat</td>\n",
" <td>Business Service</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>Langley City</td>\n",
" <td>Baseball Field</td>\n",
" <td>Zoo</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fast Food Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>Nanaimo Central</td>\n",
" <td>Tourist Information Center</td>\n",
" <td>Brewery</td>\n",
" <td>Zoo</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>Nelson</td>\n",
" <td>Trail</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>New Westminster Northeast</td>\n",
" <td>Hot Dog Joint</td>\n",
" <td>Garden</td>\n",
" <td>Playground</td>\n",
" <td>Zoo</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>Oak Bay</td>\n",
" <td>Gym</td>\n",
" <td>Bookstore</td>\n",
" <td>Gift Shop</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" <td>Men's Store</td>\n",
" <td>Breakfast Spot</td>\n",
" <td>Café</td>\n",
" <td>Bakery</td>\n",
" <td>Toy / Game Store</td>\n",
" <td>Bank</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>Parksville</td>\n",
" <td>Bookstore</td>\n",
" <td>Home Service</td>\n",
" <td>Zoo</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fast Food Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>Pitt Meadows</td>\n",
" <td>Gym / Fitness Center</td>\n",
" <td>Plaza</td>\n",
" <td>Elementary School</td>\n",
" <td>Pub</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>Qualicum Beach</td>\n",
" <td>American Restaurant</td>\n",
" <td>Restaurant</td>\n",
" <td>Pharmacy</td>\n",
" <td>Grocery Store</td>\n",
" <td>Gastropub</td>\n",
" <td>Gas Station</td>\n",
" <td>Garden</td>\n",
" <td>Fried Chicken Joint</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" <td>Fast Food Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20</th>\n",
" <td>Richmond</td>\n",
" <td>Gym</td>\n",
" <td>Grocery Store</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Sushi Restaurant</td>\n",
" <td>Pizza Place</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Pub</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Indian Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21</th>\n",
" <td>Saanich Central</td>\n",
" <td>Bank</td>\n",
" <td>Zoo</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fast Food Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>22</th>\n",
" <td>Salmon Arm</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>23</th>\n",
" <td>Sooke</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>24</th>\n",
" <td>South Okanagan</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25</th>\n",
" <td>Surrey</td>\n",
" <td>Recreation Center</td>\n",
" <td>Auto Workshop</td>\n",
" <td>Zoo</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" </tr>\n",
" <tr>\n",
" <th>26</th>\n",
" <td>Terrace</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>27</th>\n",
" <td>Trail</td>\n",
" <td>Pub</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Zoo</td>\n",
" </tr>\n",
" <tr>\n",
" <th>28</th>\n",
" <td>Vancouver</td>\n",
" <td>Bank</td>\n",
" <td>Sushi Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Park</td>\n",
" <td>Gym</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Greek Restaurant</td>\n",
" <td>Grocery Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Liquor Store</td>\n",
" </tr>\n",
" <tr>\n",
" <th>29</th>\n",
" <td>Victoria Central British Columbia Provincial G...</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Brewery</td>\n",
" <td>Restaurant</td>\n",
" <td>Motorcycle Shop</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Gym</td>\n",
" <td>Gastropub</td>\n",
" <td>Fried Chicken Joint</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" <td>Fast Food Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>30</th>\n",
" <td>Whistler</td>\n",
" <td>Hotel</td>\n",
" <td>Vacation Rental</td>\n",
" <td>Beach</td>\n",
" <td>Lake</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" </tr>\n",
" <tr>\n",
" <th>31</th>\n",
" <td>White Rock</td>\n",
" <td>Park</td>\n",
" <td>Gym / Fitness Center</td>\n",
" <td>Athletics &amp; Sports</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Neighborhood \\\n",
"0 Abbotsford \n",
"1 Burnaby \n",
"2 Comox \n",
"3 Coquitlam \n",
"4 Cranbrook \n",
"5 Duncan \n",
"6 Esquimalt \n",
"7 Fort St. John \n",
"8 Highlands \n",
"9 Kelowna Central \n",
"10 Kimberley \n",
"11 Kitimat \n",
"12 Langley City \n",
"13 Nanaimo Central \n",
"14 Nelson \n",
"15 New Westminster Northeast \n",
"16 Oak Bay \n",
"17 Parksville \n",
"18 Pitt Meadows \n",
"19 Qualicum Beach \n",
"20 Richmond \n",
"21 Saanich Central \n",
"22 Salmon Arm \n",
"23 Sooke \n",
"24 South Okanagan \n",
"25 Surrey \n",
"26 Terrace \n",
"27 Trail \n",
"28 Vancouver \n",
"29 Victoria Central British Columbia Provincial G... \n",
"30 Whistler \n",
"31 White Rock \n",
"\n",
" 1st Most Common Venue 2nd Most Common Venue 3rd Most Common Venue \\\n",
"0 Grocery Store Trail Falafel Restaurant \n",
"1 Bus Stop Bookstore Snack Place \n",
"2 Fast Food Restaurant Coffee Shop Pharmacy \n",
"3 Asian Restaurant Convenience Store Golf Course \n",
"4 Construction & Landscaping Zoo Fast Food Restaurant \n",
"5 Convenience Store Gas Station Dog Run \n",
"6 Boat or Ferry Fish & Chips Shop Convenience Store \n",
"7 American Restaurant Gas Station Fast Food Restaurant \n",
"8 Zoo Theme Park Boat or Ferry \n",
"9 Park Mountain Zoo \n",
"10 American Restaurant Ski Lodge Hotel \n",
"11 Business Service Zoo Fast Food Restaurant \n",
"12 Baseball Field Zoo Fish & Chips Shop \n",
"13 Tourist Information Center Brewery Zoo \n",
"14 Trail Zoo Fast Food Restaurant \n",
"15 Hot Dog Joint Garden Playground \n",
"16 Gym Bookstore Gift Shop \n",
"17 Bookstore Home Service Zoo \n",
"18 Gym / Fitness Center Plaza Elementary School \n",
"19 American Restaurant Restaurant Pharmacy \n",
"20 Gym Grocery Store Dim Sum Restaurant \n",
"21 Bank Zoo Fish & Chips Shop \n",
"22 Construction & Landscaping Zoo Fast Food Restaurant \n",
"23 Construction & Landscaping Zoo Fast Food Restaurant \n",
"24 Construction & Landscaping Zoo Fast Food Restaurant \n",
"25 Recreation Center Auto Workshop Zoo \n",
"26 Construction & Landscaping Zoo Fast Food Restaurant \n",
"27 Pub Falafel Restaurant Coffee Shop \n",
"28 Bank Sushi Restaurant Coffee Shop \n",
"29 Coffee Shop Brewery Restaurant \n",
"30 Hotel Vacation Rental Beach \n",
"31 Park Gym / Fitness Center Athletics & Sports \n",
"\n",
" 4th Most Common Venue 5th Most Common Venue \\\n",
"0 Coffee Shop Construction & Landscaping \n",
"1 Park Bus Station \n",
"2 Sandwich Place Juice Bar \n",
"3 Gas Station Coffee Shop \n",
"4 Convenience Store Dessert Shop \n",
"5 Zoo Fast Food Restaurant \n",
"6 Dessert Shop Dim Sum Restaurant \n",
"7 Construction & Landscaping Convenience Store \n",
"8 Wine Shop Auto Workshop \n",
"9 Falafel Restaurant Convenience Store \n",
"10 Ski Area Construction & Landscaping \n",
"11 Convenience Store Dessert Shop \n",
"12 Convenience Store Dessert Shop \n",
"13 Falafel Restaurant Construction & Landscaping \n",
"14 Construction & Landscaping Convenience Store \n",
"15 Zoo Falafel Restaurant \n",
"16 Fish & Chips Shop Men's Store \n",
"17 Convenience Store Dessert Shop \n",
"18 Pub Coffee Shop \n",
"19 Grocery Store Gastropub \n",
"20 Sushi Restaurant Pizza Place \n",
"21 Convenience Store Dessert Shop \n",
"22 Convenience Store Dessert Shop \n",
"23 Convenience Store Dessert Shop \n",
"24 Convenience Store Dessert Shop \n",
"25 Falafel Restaurant Construction & Landscaping \n",
"26 Convenience Store Dessert Shop \n",
"27 Construction & Landscaping Convenience Store \n",
"28 Park Gym \n",
"29 Motorcycle Shop Sandwich Place \n",
"30 Lake Zoo \n",
"31 Zoo Fast Food Restaurant \n",
"\n",
" 6th Most Common Venue 7th Most Common Venue \\\n",
"0 Convenience Store Dessert Shop \n",
"1 Burger Joint Sandwich Place \n",
"2 Elementary School Construction & Landscaping \n",
"3 Zoo Falafel Restaurant \n",
"4 Dim Sum Restaurant Dog Run \n",
"5 Construction & Landscaping Dessert Shop \n",
"6 Dog Run Elementary School \n",
"7 Dessert Shop Dim Sum Restaurant \n",
"8 Fish & Chips Shop Dessert Shop \n",
"9 Dessert Shop Dim Sum Restaurant \n",
"10 Convenience Store Dessert Shop \n",
"11 Dim Sum Restaurant Dog Run \n",
"12 Dim Sum Restaurant Dog Run \n",
"13 Convenience Store Dessert Shop \n",
"14 Dessert Shop Dim Sum Restaurant \n",
"15 Construction & Landscaping Convenience Store \n",
"16 Breakfast Spot Café \n",
"17 Dim Sum Restaurant Dog Run \n",
"18 Construction & Landscaping Convenience Store \n",
"19 Gas Station Garden \n",
"20 Sandwich Place Pub \n",
"21 Dim Sum Restaurant Dog Run \n",
"22 Dim Sum Restaurant Dog Run \n",
"23 Dim Sum Restaurant Dog Run \n",
"24 Dim Sum Restaurant Dog Run \n",
"25 Convenience Store Dessert Shop \n",
"26 Dim Sum Restaurant Dog Run \n",
"27 Dessert Shop Dim Sum Restaurant \n",
"28 Sandwich Place Greek Restaurant \n",
"29 Gym Gastropub \n",
"30 Fast Food Restaurant Convenience Store \n",
"31 Convenience Store Dessert Shop \n",
"\n",
" 8th Most Common Venue 9th Most Common Venue 10th Most Common Venue \n",
"0 Dim Sum Restaurant Dog Run Elementary School \n",
"1 Coffee Shop Fast Food Restaurant Falafel Restaurant \n",
"2 Convenience Store Dessert Shop Dim Sum Restaurant \n",
"3 Dessert Shop Dim Sum Restaurant Dog Run \n",
"4 Elementary School Falafel Restaurant Fish & Chips Shop \n",
"5 Dim Sum Restaurant Elementary School Falafel Restaurant \n",
"6 Falafel Restaurant Fast Food Restaurant Zoo \n",
"7 Dog Run Elementary School Falafel Restaurant \n",
"8 Dim Sum Restaurant Dog Run Elementary School \n",
"9 Dog Run Elementary School Fast Food Restaurant \n",
"10 Dim Sum Restaurant Dog Run Elementary School \n",
"11 Elementary School Falafel Restaurant Fish & Chips Shop \n",
"12 Elementary School Falafel Restaurant Fast Food Restaurant \n",
"13 Dim Sum Restaurant Dog Run Elementary School \n",
"14 Dog Run Elementary School Falafel Restaurant \n",
"15 Dessert Shop Dim Sum Restaurant Dog Run \n",
"16 Bakery Toy / Game Store Bank \n",
"17 Elementary School Falafel Restaurant Fast Food Restaurant \n",
"18 Dessert Shop Dim Sum Restaurant Dog Run \n",
"19 Fried Chicken Joint Fish & Chips Shop Fast Food Restaurant \n",
"20 Fast Food Restaurant Falafel Restaurant Indian Restaurant \n",
"21 Elementary School Falafel Restaurant Fast Food Restaurant \n",
"22 Elementary School Falafel Restaurant Fish & Chips Shop \n",
"23 Elementary School Falafel Restaurant Fish & Chips Shop \n",
"24 Elementary School Falafel Restaurant Fish & Chips Shop \n",
"25 Dim Sum Restaurant Dog Run Elementary School \n",
"26 Elementary School Falafel Restaurant Fish & Chips Shop \n",
"27 Dog Run Elementary School Zoo \n",
"28 Grocery Store Dessert Shop Liquor Store \n",
"29 Fried Chicken Joint Fish & Chips Shop Fast Food Restaurant \n",
"30 Dessert Shop Dim Sum Restaurant Dog Run \n",
"31 Dim Sum Restaurant Dog Run Elementary School "
]
},
"execution_count": 147,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"indicators = ['st', 'nd', 'rd']\n",
"\n",
"# create columns according to number of top venues\n",
"columns = ['Neighborhood']\n",
"for ind in np.arange(10):\n",
" try:\n",
" columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))\n",
" except:\n",
" columns.append('{}th Most Common Venue'.format(ind+1))\n",
"\n",
"# create a new dataframe\n",
"neigh_top_venues = pd.DataFrame(columns=columns)\n",
"neigh_top_venues['Neighborhood'] = british_columbia_grouped['Neighborhood']\n",
"\n",
"for ind in np.arange(british_columbia_grouped.shape[0]):\n",
" neigh_top_venues.iloc[ind, 1:] = return_most_common_venues(british_columbia_grouped.iloc[ind, :], 10)\n",
"\n",
"neigh_top_venues"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"## Cluster Neighborhoods <a name=\"cluster\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Here, we cluster the neighborhoods using the k-means clustering based on similar nearby locations. One of the most important factor in K-Means Clustering is to choose the value of 'k', or the number of clusters for th dataset. To decide this, we plot a graph with the 'sum of squared distances' on the y-axis and the 'k' value on the x-axis. Ideally, from this graph, the best 'k' value would be the one with the least sum of squared distance but that usually results in a very high k value. Hence, we apply the elbow method to choose the optimum k value. In the graph, as we increase the value of k, the distance decreases. The point which shows a steep decline in the distance, after which the distance decrease is comparatively less is known as the optimal value of k ( also known as the elbow point ). \n"
]
},
{
"cell_type": "code",
"execution_count": 176,
"metadata": {},
"outputs": [],
"source": [
"british_columbia_clustering = british_columbia_grouped.drop('Neighborhood', 1)"
]
},
{
"cell_type": "code",
"execution_count": 177,
"metadata": {},
"outputs": [],
"source": [
"Sum_of_squared_distances = []\n",
"K = range(1,15)\n",
"for k in K:\n",
" km = KMeans(n_clusters=k)\n",
" km = km.fit(british_columbia_clustering)\n",
" Sum_of_squared_distances.append(km.inertia_)"
]
},
{
"cell_type": "code",
"execution_count": 178,
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "code",
"execution_count": 179,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.plot(K, Sum_of_squared_distances, 'bx-')\n",
"plt.xlabel('k')\n",
"plt.ylabel('Sum_of_squared_distances')\n",
"plt.title('Elbow Method For Optimal k')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### In this case, the elbow point is k=8, and we run the run k-means to cluster the neighborhood into 8 clusters.\n"
]
},
{
"cell_type": "code",
"execution_count": 180,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([1, 1, 1, 1, 2, 1, 3, 1, 3, 1, 1, 4, 0, 1, 7], dtype=int32)"
]
},
"execution_count": 180,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"obj = KMeans(n_clusters=8, random_state=0).fit(british_columbia_clustering)\n",
"\n",
"obj.labels_[0:15] "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Now, we create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.\n"
]
},
{
"cell_type": "code",
"execution_count": 187,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Cluster Labels</th>\n",
" <th>Neighborhood</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>Abbotsford</td>\n",
" <td>Grocery Store</td>\n",
" <td>Trail</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1</td>\n",
" <td>Burnaby</td>\n",
" <td>Bus Stop</td>\n",
" <td>Bookstore</td>\n",
" <td>Snack Place</td>\n",
" <td>Park</td>\n",
" <td>Bus Station</td>\n",
" <td>Burger Joint</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Falafel Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1</td>\n",
" <td>Comox</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Pharmacy</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Juice Bar</td>\n",
" <td>Elementary School</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1</td>\n",
" <td>Coquitlam</td>\n",
" <td>Asian Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Golf Course</td>\n",
" <td>Gas Station</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Zoo</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2</td>\n",
" <td>Cranbrook</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Cluster Labels Neighborhood 1st Most Common Venue \\\n",
"0 1 Abbotsford Grocery Store \n",
"1 1 Burnaby Bus Stop \n",
"2 1 Comox Fast Food Restaurant \n",
"3 1 Coquitlam Asian Restaurant \n",
"4 2 Cranbrook Construction & Landscaping \n",
"\n",
" 2nd Most Common Venue 3rd Most Common Venue 4th Most Common Venue \\\n",
"0 Trail Falafel Restaurant Coffee Shop \n",
"1 Bookstore Snack Place Park \n",
"2 Coffee Shop Pharmacy Sandwich Place \n",
"3 Convenience Store Golf Course Gas Station \n",
"4 Zoo Fast Food Restaurant Convenience Store \n",
"\n",
" 5th Most Common Venue 6th Most Common Venue \\\n",
"0 Construction & Landscaping Convenience Store \n",
"1 Bus Station Burger Joint \n",
"2 Juice Bar Elementary School \n",
"3 Coffee Shop Zoo \n",
"4 Dessert Shop Dim Sum Restaurant \n",
"\n",
" 7th Most Common Venue 8th Most Common Venue 9th Most Common Venue \\\n",
"0 Dessert Shop Dim Sum Restaurant Dog Run \n",
"1 Sandwich Place Coffee Shop Fast Food Restaurant \n",
"2 Construction & Landscaping Convenience Store Dessert Shop \n",
"3 Falafel Restaurant Dessert Shop Dim Sum Restaurant \n",
"4 Dog Run Elementary School Falafel Restaurant \n",
"\n",
" 10th Most Common Venue \n",
"0 Elementary School \n",
"1 Falafel Restaurant \n",
"2 Dim Sum Restaurant \n",
"3 Dog Run \n",
"4 Fish & Chips Shop "
]
},
"execution_count": 187,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#neigh_top_venues.insert(0, 'Cluster Labels', obj.labels_)\n",
"neigh_top_venues['Cluster Labels'] = obj.labels_\n",
"neigh_top_venues.head()"
]
},
{
"cell_type": "code",
"execution_count": 188,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Place</th>\n",
" <th>Code</th>\n",
" <th>Country</th>\n",
" <th>Province</th>\n",
" <th>Latitude</th>\n",
" <th>Longitude</th>\n",
" <th>Cluster Labels</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Abbotsford</td>\n",
" <td>V3G</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.0625</td>\n",
" <td>-122.3125</td>\n",
" <td>1.0</td>\n",
" <td>Grocery Store</td>\n",
" <td>Trail</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Atlin Region</td>\n",
" <td>V0W</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>59.6250</td>\n",
" <td>-133.5000</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Burnaby</td>\n",
" <td>V3N</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.2500</td>\n",
" <td>-123.0000</td>\n",
" <td>1.0</td>\n",
" <td>Bus Stop</td>\n",
" <td>Bookstore</td>\n",
" <td>Snack Place</td>\n",
" <td>Park</td>\n",
" <td>Bus Station</td>\n",
" <td>Burger Joint</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Falafel Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Campbell River Central</td>\n",
" <td>V9W</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>50.0000</td>\n",
" <td>-125.5625</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Cariboo and West Okanagan</td>\n",
" <td>V0K</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>51.4375</td>\n",
" <td>-121.6250</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Place Code Country Province Latitude \\\n",
"0 Abbotsford V3G Canada British Columbia 49.0625 \n",
"1 Atlin Region V0W Canada British Columbia 59.6250 \n",
"2 Burnaby V3N Canada British Columbia 49.2500 \n",
"3 Campbell River Central V9W Canada British Columbia 50.0000 \n",
"4 Cariboo and West Okanagan V0K Canada British Columbia 51.4375 \n",
"\n",
" Longitude Cluster Labels 1st Most Common Venue 2nd Most Common Venue \\\n",
"0 -122.3125 1.0 Grocery Store Trail \n",
"1 -133.5000 NaN NaN NaN \n",
"2 -123.0000 1.0 Bus Stop Bookstore \n",
"3 -125.5625 NaN NaN NaN \n",
"4 -121.6250 NaN NaN NaN \n",
"\n",
" 3rd Most Common Venue 4th Most Common Venue 5th Most Common Venue \\\n",
"0 Falafel Restaurant Coffee Shop Construction & Landscaping \n",
"1 NaN NaN NaN \n",
"2 Snack Place Park Bus Station \n",
"3 NaN NaN NaN \n",
"4 NaN NaN NaN \n",
"\n",
" 6th Most Common Venue 7th Most Common Venue 8th Most Common Venue \\\n",
"0 Convenience Store Dessert Shop Dim Sum Restaurant \n",
"1 NaN NaN NaN \n",
"2 Burger Joint Sandwich Place Coffee Shop \n",
"3 NaN NaN NaN \n",
"4 NaN NaN NaN \n",
"\n",
" 9th Most Common Venue 10th Most Common Venue \n",
"0 Dog Run Elementary School \n",
"1 NaN NaN \n",
"2 Fast Food Restaurant Falafel Restaurant \n",
"3 NaN NaN \n",
"4 NaN NaN "
]
},
"execution_count": 188,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"clustered_df = df_new\n",
"clustered_df = clustered_df.join(neigh_top_venues.set_index('Neighborhood'), on='Place')\n",
"clustered_df.head()"
]
},
{
"cell_type": "code",
"execution_count": 189,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(72, 6)\n"
]
}
],
"source": [
"clustered_df.dropna(subset = [\"Cluster Labels\"], inplace=True)\n",
"print(df_new.shape)\n",
"clustered_df = clustered_df.astype({\"Cluster Labels\": int})"
]
},
{
"cell_type": "code",
"execution_count": 190,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Place</th>\n",
" <th>Code</th>\n",
" <th>Country</th>\n",
" <th>Province</th>\n",
" <th>Latitude</th>\n",
" <th>Longitude</th>\n",
" <th>Cluster Labels</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Abbotsford</td>\n",
" <td>V3G</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.06250</td>\n",
" <td>-122.3125</td>\n",
" <td>1</td>\n",
" <td>Grocery Store</td>\n",
" <td>Trail</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Burnaby</td>\n",
" <td>V3N</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.25000</td>\n",
" <td>-123.0000</td>\n",
" <td>1</td>\n",
" <td>Bus Stop</td>\n",
" <td>Bookstore</td>\n",
" <td>Snack Place</td>\n",
" <td>Park</td>\n",
" <td>Bus Station</td>\n",
" <td>Burger Joint</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Falafel Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Comox</td>\n",
" <td>V9M</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.68750</td>\n",
" <td>-124.9375</td>\n",
" <td>1</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Pharmacy</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Juice Bar</td>\n",
" <td>Elementary School</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Coquitlam</td>\n",
" <td>V3J</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.25000</td>\n",
" <td>-122.8750</td>\n",
" <td>1</td>\n",
" <td>Asian Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Golf Course</td>\n",
" <td>Gas Station</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Zoo</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Cranbrook</td>\n",
" <td>V1C</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.50000</td>\n",
" <td>-115.7500</td>\n",
" <td>2</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Duncan</td>\n",
" <td>V9L</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.78125</td>\n",
" <td>-123.6875</td>\n",
" <td>1</td>\n",
" <td>Convenience Store</td>\n",
" <td>Gas Station</td>\n",
" <td>Dog Run</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>Esquimalt</td>\n",
" <td>V9A</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.43750</td>\n",
" <td>-123.4375</td>\n",
" <td>3</td>\n",
" <td>Boat or Ferry</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Zoo</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>Fort St. John</td>\n",
" <td>V1J</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>56.25000</td>\n",
" <td>-120.8750</td>\n",
" <td>1</td>\n",
" <td>American Restaurant</td>\n",
" <td>Gas Station</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>Highlands</td>\n",
" <td>V9B</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.46875</td>\n",
" <td>-123.5000</td>\n",
" <td>3</td>\n",
" <td>Zoo</td>\n",
" <td>Theme Park</td>\n",
" <td>Boat or Ferry</td>\n",
" <td>Wine Shop</td>\n",
" <td>Auto Workshop</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>Kelowna Central</td>\n",
" <td>V1Y</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.90625</td>\n",
" <td>-119.4375</td>\n",
" <td>1</td>\n",
" <td>Park</td>\n",
" <td>Mountain</td>\n",
" <td>Zoo</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Fast Food Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>Kimberley</td>\n",
" <td>V1A</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.68750</td>\n",
" <td>-116.0000</td>\n",
" <td>1</td>\n",
" <td>American Restaurant</td>\n",
" <td>Ski Lodge</td>\n",
" <td>Hotel</td>\n",
" <td>Ski Area</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>Kitimat</td>\n",
" <td>V8C</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>54.06250</td>\n",
" <td>-128.6250</td>\n",
" <td>4</td>\n",
" <td>Business Service</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>Langley City</td>\n",
" <td>V3A</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.09375</td>\n",
" <td>-122.5625</td>\n",
" <td>0</td>\n",
" <td>Baseball Field</td>\n",
" <td>Zoo</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fast Food Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>Nanaimo Central</td>\n",
" <td>V9S</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.18750</td>\n",
" <td>-124.0000</td>\n",
" <td>1</td>\n",
" <td>Tourist Information Center</td>\n",
" <td>Brewery</td>\n",
" <td>Zoo</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>Nelson</td>\n",
" <td>V1L</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.50000</td>\n",
" <td>-117.3125</td>\n",
" <td>7</td>\n",
" <td>Trail</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>New Westminster Northeast</td>\n",
" <td>V3L</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.18750</td>\n",
" <td>-122.8750</td>\n",
" <td>1</td>\n",
" <td>Hot Dog Joint</td>\n",
" <td>Garden</td>\n",
" <td>Playground</td>\n",
" <td>Zoo</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>Oak Bay</td>\n",
" <td>V8R</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.43750</td>\n",
" <td>-123.3125</td>\n",
" <td>1</td>\n",
" <td>Gym</td>\n",
" <td>Bookstore</td>\n",
" <td>Gift Shop</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" <td>Men's Store</td>\n",
" <td>Breakfast Spot</td>\n",
" <td>Café</td>\n",
" <td>Bakery</td>\n",
" <td>Toy / Game Store</td>\n",
" <td>Bank</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>Parksville</td>\n",
" <td>V9P</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.31250</td>\n",
" <td>-124.3125</td>\n",
" <td>1</td>\n",
" <td>Bookstore</td>\n",
" <td>Home Service</td>\n",
" <td>Zoo</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fast Food Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>Pitt Meadows</td>\n",
" <td>V3Y</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.21875</td>\n",
" <td>-122.6875</td>\n",
" <td>1</td>\n",
" <td>Gym / Fitness Center</td>\n",
" <td>Plaza</td>\n",
" <td>Elementary School</td>\n",
" <td>Pub</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>Qualicum Beach</td>\n",
" <td>V9K</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.34375</td>\n",
" <td>-124.4375</td>\n",
" <td>1</td>\n",
" <td>American Restaurant</td>\n",
" <td>Restaurant</td>\n",
" <td>Pharmacy</td>\n",
" <td>Grocery Store</td>\n",
" <td>Gastropub</td>\n",
" <td>Gas Station</td>\n",
" <td>Garden</td>\n",
" <td>Fried Chicken Joint</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" <td>Fast Food Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20</th>\n",
" <td>Richmond</td>\n",
" <td>V7B</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.15625</td>\n",
" <td>-123.1250</td>\n",
" <td>1</td>\n",
" <td>Gym</td>\n",
" <td>Grocery Store</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Sushi Restaurant</td>\n",
" <td>Pizza Place</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Pub</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Indian Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21</th>\n",
" <td>Saanich Central</td>\n",
" <td>V8Z</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.50000</td>\n",
" <td>-123.3750</td>\n",
" <td>6</td>\n",
" <td>Bank</td>\n",
" <td>Zoo</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fast Food Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>22</th>\n",
" <td>Salmon Arm</td>\n",
" <td>V1E</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>50.68750</td>\n",
" <td>-119.2500</td>\n",
" <td>2</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>23</th>\n",
" <td>Sooke</td>\n",
" <td>V9Z</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.37500</td>\n",
" <td>-123.7500</td>\n",
" <td>2</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>24</th>\n",
" <td>South Okanagan</td>\n",
" <td>V0H</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.40625</td>\n",
" <td>-119.0000</td>\n",
" <td>2</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25</th>\n",
" <td>Surrey</td>\n",
" <td>V3Z</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.12500</td>\n",
" <td>-122.8125</td>\n",
" <td>1</td>\n",
" <td>Recreation Center</td>\n",
" <td>Auto Workshop</td>\n",
" <td>Zoo</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" </tr>\n",
" <tr>\n",
" <th>26</th>\n",
" <td>Terrace</td>\n",
" <td>V8G</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>54.53125</td>\n",
" <td>-128.6250</td>\n",
" <td>2</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>27</th>\n",
" <td>Trail</td>\n",
" <td>V1R</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.09375</td>\n",
" <td>-117.6875</td>\n",
" <td>5</td>\n",
" <td>Pub</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Zoo</td>\n",
" </tr>\n",
" <tr>\n",
" <th>28</th>\n",
" <td>Vancouver</td>\n",
" <td>V5K</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.25000</td>\n",
" <td>-123.1250</td>\n",
" <td>1</td>\n",
" <td>Bank</td>\n",
" <td>Sushi Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Park</td>\n",
" <td>Gym</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Greek Restaurant</td>\n",
" <td>Grocery Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Liquor Store</td>\n",
" </tr>\n",
" <tr>\n",
" <th>29</th>\n",
" <td>Victoria Central British Columbia Provincial G...</td>\n",
" <td>V8W</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>48.43750</td>\n",
" <td>-123.3750</td>\n",
" <td>1</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Brewery</td>\n",
" <td>Restaurant</td>\n",
" <td>Motorcycle Shop</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Gym</td>\n",
" <td>Gastropub</td>\n",
" <td>Fried Chicken Joint</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" <td>Fast Food Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>30</th>\n",
" <td>Whistler</td>\n",
" <td>V8E</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>50.12500</td>\n",
" <td>-122.9375</td>\n",
" <td>1</td>\n",
" <td>Hotel</td>\n",
" <td>Vacation Rental</td>\n",
" <td>Beach</td>\n",
" <td>Lake</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" </tr>\n",
" <tr>\n",
" <th>31</th>\n",
" <td>White Rock</td>\n",
" <td>V4B</td>\n",
" <td>Canada</td>\n",
" <td>British Columbia</td>\n",
" <td>49.03125</td>\n",
" <td>-122.8125</td>\n",
" <td>1</td>\n",
" <td>Park</td>\n",
" <td>Gym / Fitness Center</td>\n",
" <td>Athletics &amp; Sports</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Place Code Country \\\n",
"0 Abbotsford V3G Canada \n",
"1 Burnaby V3N Canada \n",
"2 Comox V9M Canada \n",
"3 Coquitlam V3J Canada \n",
"4 Cranbrook V1C Canada \n",
"5 Duncan V9L Canada \n",
"6 Esquimalt V9A Canada \n",
"7 Fort St. John V1J Canada \n",
"8 Highlands V9B Canada \n",
"9 Kelowna Central V1Y Canada \n",
"10 Kimberley V1A Canada \n",
"11 Kitimat V8C Canada \n",
"12 Langley City V3A Canada \n",
"13 Nanaimo Central V9S Canada \n",
"14 Nelson V1L Canada \n",
"15 New Westminster Northeast V3L Canada \n",
"16 Oak Bay V8R Canada \n",
"17 Parksville V9P Canada \n",
"18 Pitt Meadows V3Y Canada \n",
"19 Qualicum Beach V9K Canada \n",
"20 Richmond V7B Canada \n",
"21 Saanich Central V8Z Canada \n",
"22 Salmon Arm V1E Canada \n",
"23 Sooke V9Z Canada \n",
"24 South Okanagan V0H Canada \n",
"25 Surrey V3Z Canada \n",
"26 Terrace V8G Canada \n",
"27 Trail V1R Canada \n",
"28 Vancouver V5K Canada \n",
"29 Victoria Central British Columbia Provincial G... V8W Canada \n",
"30 Whistler V8E Canada \n",
"31 White Rock V4B Canada \n",
"\n",
" Province Latitude Longitude Cluster Labels \\\n",
"0 British Columbia 49.06250 -122.3125 1 \n",
"1 British Columbia 49.25000 -123.0000 1 \n",
"2 British Columbia 49.68750 -124.9375 1 \n",
"3 British Columbia 49.25000 -122.8750 1 \n",
"4 British Columbia 49.50000 -115.7500 2 \n",
"5 British Columbia 48.78125 -123.6875 1 \n",
"6 British Columbia 48.43750 -123.4375 3 \n",
"7 British Columbia 56.25000 -120.8750 1 \n",
"8 British Columbia 48.46875 -123.5000 3 \n",
"9 British Columbia 49.90625 -119.4375 1 \n",
"10 British Columbia 49.68750 -116.0000 1 \n",
"11 British Columbia 54.06250 -128.6250 4 \n",
"12 British Columbia 49.09375 -122.5625 0 \n",
"13 British Columbia 49.18750 -124.0000 1 \n",
"14 British Columbia 49.50000 -117.3125 7 \n",
"15 British Columbia 49.18750 -122.8750 1 \n",
"16 British Columbia 48.43750 -123.3125 1 \n",
"17 British Columbia 49.31250 -124.3125 1 \n",
"18 British Columbia 49.21875 -122.6875 1 \n",
"19 British Columbia 49.34375 -124.4375 1 \n",
"20 British Columbia 49.15625 -123.1250 1 \n",
"21 British Columbia 48.50000 -123.3750 6 \n",
"22 British Columbia 50.68750 -119.2500 2 \n",
"23 British Columbia 48.37500 -123.7500 2 \n",
"24 British Columbia 49.40625 -119.0000 2 \n",
"25 British Columbia 49.12500 -122.8125 1 \n",
"26 British Columbia 54.53125 -128.6250 2 \n",
"27 British Columbia 49.09375 -117.6875 5 \n",
"28 British Columbia 49.25000 -123.1250 1 \n",
"29 British Columbia 48.43750 -123.3750 1 \n",
"30 British Columbia 50.12500 -122.9375 1 \n",
"31 British Columbia 49.03125 -122.8125 1 \n",
"\n",
" 1st Most Common Venue 2nd Most Common Venue 3rd Most Common Venue \\\n",
"0 Grocery Store Trail Falafel Restaurant \n",
"1 Bus Stop Bookstore Snack Place \n",
"2 Fast Food Restaurant Coffee Shop Pharmacy \n",
"3 Asian Restaurant Convenience Store Golf Course \n",
"4 Construction & Landscaping Zoo Fast Food Restaurant \n",
"5 Convenience Store Gas Station Dog Run \n",
"6 Boat or Ferry Fish & Chips Shop Convenience Store \n",
"7 American Restaurant Gas Station Fast Food Restaurant \n",
"8 Zoo Theme Park Boat or Ferry \n",
"9 Park Mountain Zoo \n",
"10 American Restaurant Ski Lodge Hotel \n",
"11 Business Service Zoo Fast Food Restaurant \n",
"12 Baseball Field Zoo Fish & Chips Shop \n",
"13 Tourist Information Center Brewery Zoo \n",
"14 Trail Zoo Fast Food Restaurant \n",
"15 Hot Dog Joint Garden Playground \n",
"16 Gym Bookstore Gift Shop \n",
"17 Bookstore Home Service Zoo \n",
"18 Gym / Fitness Center Plaza Elementary School \n",
"19 American Restaurant Restaurant Pharmacy \n",
"20 Gym Grocery Store Dim Sum Restaurant \n",
"21 Bank Zoo Fish & Chips Shop \n",
"22 Construction & Landscaping Zoo Fast Food Restaurant \n",
"23 Construction & Landscaping Zoo Fast Food Restaurant \n",
"24 Construction & Landscaping Zoo Fast Food Restaurant \n",
"25 Recreation Center Auto Workshop Zoo \n",
"26 Construction & Landscaping Zoo Fast Food Restaurant \n",
"27 Pub Falafel Restaurant Coffee Shop \n",
"28 Bank Sushi Restaurant Coffee Shop \n",
"29 Coffee Shop Brewery Restaurant \n",
"30 Hotel Vacation Rental Beach \n",
"31 Park Gym / Fitness Center Athletics & Sports \n",
"\n",
" 4th Most Common Venue 5th Most Common Venue \\\n",
"0 Coffee Shop Construction & Landscaping \n",
"1 Park Bus Station \n",
"2 Sandwich Place Juice Bar \n",
"3 Gas Station Coffee Shop \n",
"4 Convenience Store Dessert Shop \n",
"5 Zoo Fast Food Restaurant \n",
"6 Dessert Shop Dim Sum Restaurant \n",
"7 Construction & Landscaping Convenience Store \n",
"8 Wine Shop Auto Workshop \n",
"9 Falafel Restaurant Convenience Store \n",
"10 Ski Area Construction & Landscaping \n",
"11 Convenience Store Dessert Shop \n",
"12 Convenience Store Dessert Shop \n",
"13 Falafel Restaurant Construction & Landscaping \n",
"14 Construction & Landscaping Convenience Store \n",
"15 Zoo Falafel Restaurant \n",
"16 Fish & Chips Shop Men's Store \n",
"17 Convenience Store Dessert Shop \n",
"18 Pub Coffee Shop \n",
"19 Grocery Store Gastropub \n",
"20 Sushi Restaurant Pizza Place \n",
"21 Convenience Store Dessert Shop \n",
"22 Convenience Store Dessert Shop \n",
"23 Convenience Store Dessert Shop \n",
"24 Convenience Store Dessert Shop \n",
"25 Falafel Restaurant Construction & Landscaping \n",
"26 Convenience Store Dessert Shop \n",
"27 Construction & Landscaping Convenience Store \n",
"28 Park Gym \n",
"29 Motorcycle Shop Sandwich Place \n",
"30 Lake Zoo \n",
"31 Zoo Fast Food Restaurant \n",
"\n",
" 6th Most Common Venue 7th Most Common Venue \\\n",
"0 Convenience Store Dessert Shop \n",
"1 Burger Joint Sandwich Place \n",
"2 Elementary School Construction & Landscaping \n",
"3 Zoo Falafel Restaurant \n",
"4 Dim Sum Restaurant Dog Run \n",
"5 Construction & Landscaping Dessert Shop \n",
"6 Dog Run Elementary School \n",
"7 Dessert Shop Dim Sum Restaurant \n",
"8 Fish & Chips Shop Dessert Shop \n",
"9 Dessert Shop Dim Sum Restaurant \n",
"10 Convenience Store Dessert Shop \n",
"11 Dim Sum Restaurant Dog Run \n",
"12 Dim Sum Restaurant Dog Run \n",
"13 Convenience Store Dessert Shop \n",
"14 Dessert Shop Dim Sum Restaurant \n",
"15 Construction & Landscaping Convenience Store \n",
"16 Breakfast Spot Café \n",
"17 Dim Sum Restaurant Dog Run \n",
"18 Construction & Landscaping Convenience Store \n",
"19 Gas Station Garden \n",
"20 Sandwich Place Pub \n",
"21 Dim Sum Restaurant Dog Run \n",
"22 Dim Sum Restaurant Dog Run \n",
"23 Dim Sum Restaurant Dog Run \n",
"24 Dim Sum Restaurant Dog Run \n",
"25 Convenience Store Dessert Shop \n",
"26 Dim Sum Restaurant Dog Run \n",
"27 Dessert Shop Dim Sum Restaurant \n",
"28 Sandwich Place Greek Restaurant \n",
"29 Gym Gastropub \n",
"30 Fast Food Restaurant Convenience Store \n",
"31 Convenience Store Dessert Shop \n",
"\n",
" 8th Most Common Venue 9th Most Common Venue 10th Most Common Venue \n",
"0 Dim Sum Restaurant Dog Run Elementary School \n",
"1 Coffee Shop Fast Food Restaurant Falafel Restaurant \n",
"2 Convenience Store Dessert Shop Dim Sum Restaurant \n",
"3 Dessert Shop Dim Sum Restaurant Dog Run \n",
"4 Elementary School Falafel Restaurant Fish & Chips Shop \n",
"5 Dim Sum Restaurant Elementary School Falafel Restaurant \n",
"6 Falafel Restaurant Fast Food Restaurant Zoo \n",
"7 Dog Run Elementary School Falafel Restaurant \n",
"8 Dim Sum Restaurant Dog Run Elementary School \n",
"9 Dog Run Elementary School Fast Food Restaurant \n",
"10 Dim Sum Restaurant Dog Run Elementary School \n",
"11 Elementary School Falafel Restaurant Fish & Chips Shop \n",
"12 Elementary School Falafel Restaurant Fast Food Restaurant \n",
"13 Dim Sum Restaurant Dog Run Elementary School \n",
"14 Dog Run Elementary School Falafel Restaurant \n",
"15 Dessert Shop Dim Sum Restaurant Dog Run \n",
"16 Bakery Toy / Game Store Bank \n",
"17 Elementary School Falafel Restaurant Fast Food Restaurant \n",
"18 Dessert Shop Dim Sum Restaurant Dog Run \n",
"19 Fried Chicken Joint Fish & Chips Shop Fast Food Restaurant \n",
"20 Fast Food Restaurant Falafel Restaurant Indian Restaurant \n",
"21 Elementary School Falafel Restaurant Fast Food Restaurant \n",
"22 Elementary School Falafel Restaurant Fish & Chips Shop \n",
"23 Elementary School Falafel Restaurant Fish & Chips Shop \n",
"24 Elementary School Falafel Restaurant Fish & Chips Shop \n",
"25 Dim Sum Restaurant Dog Run Elementary School \n",
"26 Elementary School Falafel Restaurant Fish & Chips Shop \n",
"27 Dog Run Elementary School Zoo \n",
"28 Grocery Store Dessert Shop Liquor Store \n",
"29 Fried Chicken Joint Fish & Chips Shop Fast Food Restaurant \n",
"30 Dessert Shop Dim Sum Restaurant Dog Run \n",
"31 Dim Sum Restaurant Dog Run Elementary School "
]
},
"execution_count": 190,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"clustered_df.reset_index(inplace=True, drop=True)\n",
"clustered_df"
]
},
{
"cell_type": "code",
"execution_count": 191,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1 20\n",
"2 5\n",
"3 2\n",
"7 1\n",
"6 1\n",
"5 1\n",
"4 1\n",
"0 1\n",
"Name: Cluster Labels, dtype: int64"
]
},
"execution_count": 191,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"clustered_df['Cluster Labels'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 192,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The geographical coordinates of British Columbia are: 55.001251 , -125.002441\n"
]
}
],
"source": [
"print(\"The geographical coordinates of British Columbia are: {} , {}\".format(lat,long))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Finally, we visualize the resulting clusters on a map of British Columbia using Folium.\n"
]
},
{
"cell_type": "code",
"execution_count": 204,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"width:100%;\"><div style=\"position:relative;width:100%;height:0;padding-bottom:60%;\"><span style=\"color:#565656\">Make this Notebook Trusted to load map: File -> Trust Notebook</span><iframe src=\"about:blank\" style=\"position:absolute;width:100%;height:100%;left:0;top:0;border:none !important;\" data-html= onload=\"this.contentDocument.open();this.contentDocument.write(atob(this.getAttribute('data-html')));this.contentDocument.close();\" allowfullscreen webkitallowfullscreen mozallowfullscreen></iframe></div></div>"
],
"text/plain": [
"<folium.folium.Map at 0x7fef7c7ece48>"
]
},
"execution_count": 204,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Base map centered on British Columbia\n",
"map_bc_neigh = folium.Map(location=[lat, long], zoom_start=5)\n",
"\n",
"# One distinct color per cluster label (labels run from 0 to 7)\n",
"colors_array = cm.rainbow(np.linspace(0, 1, 8))\n",
"rainbow = [colors.rgb2hex(i) for i in colors_array]\n",
"\n",
"# Add one circle marker per neighborhood, colored by its cluster label\n",
"for lati, lon, neigh, cluster in zip(clustered_df['Latitude'], clustered_df['Longitude'], clustered_df['Place'], clustered_df['Cluster Labels']):\n",
"    label = folium.Popup(str(neigh) + ' Cluster ' + str(cluster), parse_html=True)\n",
"    folium.CircleMarker(\n",
"        [lati, lon],\n",
"        radius=5,\n",
"        popup=label,\n",
"        color=rainbow[cluster],\n",
"        fill=True,\n",
"        fill_color=rainbow[cluster],\n",
"        fill_opacity=0.7).add_to(map_bc_neigh)\n",
"\n",
"map_bc_neigh"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"## Results and Conclusion <a name=\"results\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### We cluster the neighborhoods into 8 clusters according to the similarity of their nearby venues, then examine each cluster to identify the venue categories that distinguish it. The cluster labels in the data run from 0 to 7, so Cluster 1 below corresponds to label 0, Cluster 2 to label 1, and so on.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Cluster 1 (label 0)"
]
},
{
"cell_type": "code",
"execution_count": 205,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(1, 11)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Place</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Langley City</td>\n",
" <td>Baseball Field</td>\n",
" <td>Zoo</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fast Food Restaurant</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Place 1st Most Common Venue 2nd Most Common Venue \\\n",
"0 Langley City Baseball Field Zoo \n",
"\n",
" 3rd Most Common Venue 4th Most Common Venue 5th Most Common Venue \\\n",
"0 Fish & Chips Shop Convenience Store Dessert Shop \n",
"\n",
" 6th Most Common Venue 7th Most Common Venue 8th Most Common Venue \\\n",
"0 Dim Sum Restaurant Dog Run Elementary School \n",
"\n",
" 9th Most Common Venue 10th Most Common Venue \n",
"0 Falafel Restaurant Fast Food Restaurant "
]
},
"execution_count": 205,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"label0 = clustered_df.loc[clustered_df['Cluster Labels'] == 0,\n",
" clustered_df.columns[[0] + list(range(7, clustered_df.shape[1]))]]\n",
"print(label0.shape)\n",
"label0.head().reset_index(drop=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Cluster 2 (label 1)"
]
},
{
"cell_type": "code",
"execution_count": 206,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(20, 11)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Place</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Abbotsford</td>\n",
" <td>Grocery Store</td>\n",
" <td>Trail</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Burnaby</td>\n",
" <td>Bus Stop</td>\n",
" <td>Bookstore</td>\n",
" <td>Snack Place</td>\n",
" <td>Park</td>\n",
" <td>Bus Station</td>\n",
" <td>Burger Joint</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Falafel Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Comox</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Pharmacy</td>\n",
" <td>Sandwich Place</td>\n",
" <td>Juice Bar</td>\n",
" <td>Elementary School</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Coquitlam</td>\n",
" <td>Asian Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Golf Course</td>\n",
" <td>Gas Station</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Zoo</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Duncan</td>\n",
" <td>Convenience Store</td>\n",
" <td>Gas Station</td>\n",
" <td>Dog Run</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Place 1st Most Common Venue 2nd Most Common Venue \\\n",
"0 Abbotsford Grocery Store Trail \n",
"1 Burnaby Bus Stop Bookstore \n",
"2 Comox Fast Food Restaurant Coffee Shop \n",
"3 Coquitlam Asian Restaurant Convenience Store \n",
"4 Duncan Convenience Store Gas Station \n",
"\n",
" 3rd Most Common Venue 4th Most Common Venue 5th Most Common Venue \\\n",
"0 Falafel Restaurant Coffee Shop Construction & Landscaping \n",
"1 Snack Place Park Bus Station \n",
"2 Pharmacy Sandwich Place Juice Bar \n",
"3 Golf Course Gas Station Coffee Shop \n",
"4 Dog Run Zoo Fast Food Restaurant \n",
"\n",
" 6th Most Common Venue 7th Most Common Venue \\\n",
"0 Convenience Store Dessert Shop \n",
"1 Burger Joint Sandwich Place \n",
"2 Elementary School Construction & Landscaping \n",
"3 Zoo Falafel Restaurant \n",
"4 Construction & Landscaping Dessert Shop \n",
"\n",
" 8th Most Common Venue 9th Most Common Venue 10th Most Common Venue \n",
"0 Dim Sum Restaurant Dog Run Elementary School \n",
"1 Coffee Shop Fast Food Restaurant Falafel Restaurant \n",
"2 Convenience Store Dessert Shop Dim Sum Restaurant \n",
"3 Dessert Shop Dim Sum Restaurant Dog Run \n",
"4 Dim Sum Restaurant Elementary School Falafel Restaurant "
]
},
"execution_count": 206,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
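"# Rows with cluster label 1: keep column 0 (Place) plus columns 7 onward (the 10 most-common-venue columns)\n",
"label1 = clustered_df.loc[clustered_df['Cluster Labels'] == 1,\n",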
" clustered_df.columns[[0] + list(range(7, clustered_df.shape[1]))]]\n",
"print(label1.shape)\n",
"label1.head().reset_index(drop=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Cluster 3"
]
},
{
"cell_type": "code",
"execution_count": 207,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(5, 11)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Place</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Cranbrook</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Salmon Arm</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Sooke</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>South Okanagan</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Terrace</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Place 1st Most Common Venue 2nd Most Common Venue \\\n",
"0 Cranbrook Construction & Landscaping Zoo \n",
"1 Salmon Arm Construction & Landscaping Zoo \n",
"2 Sooke Construction & Landscaping Zoo \n",
"3 South Okanagan Construction & Landscaping Zoo \n",
"4 Terrace Construction & Landscaping Zoo \n",
"\n",
" 3rd Most Common Venue 4th Most Common Venue 5th Most Common Venue \\\n",
"0 Fast Food Restaurant Convenience Store Dessert Shop \n",
"1 Fast Food Restaurant Convenience Store Dessert Shop \n",
"2 Fast Food Restaurant Convenience Store Dessert Shop \n",
"3 Fast Food Restaurant Convenience Store Dessert Shop \n",
"4 Fast Food Restaurant Convenience Store Dessert Shop \n",
"\n",
" 6th Most Common Venue 7th Most Common Venue 8th Most Common Venue \\\n",
"0 Dim Sum Restaurant Dog Run Elementary School \n",
"1 Dim Sum Restaurant Dog Run Elementary School \n",
"2 Dim Sum Restaurant Dog Run Elementary School \n",
"3 Dim Sum Restaurant Dog Run Elementary School \n",
"4 Dim Sum Restaurant Dog Run Elementary School \n",
"\n",
" 9th Most Common Venue 10th Most Common Venue \n",
"0 Falafel Restaurant Fish & Chips Shop \n",
"1 Falafel Restaurant Fish & Chips Shop \n",
"2 Falafel Restaurant Fish & Chips Shop \n",
"3 Falafel Restaurant Fish & Chips Shop \n",
"4 Falafel Restaurant Fish & Chips Shop "
]
},
"execution_count": 207,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
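"# Rows with cluster label 2: keep column 0 (Place) plus columns 7 onward (the 10 most-common-venue columns)\n",
"label2 = clustered_df.loc[clustered_df['Cluster Labels'] == 2,\n",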
" clustered_df.columns[[0] + list(range(7, clustered_df.shape[1]))]]\n",
"print(label2.shape)\n",
"label2.head().reset_index(drop=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Cluster 4"
]
},
{
"cell_type": "code",
"execution_count": 208,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(2, 11)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Place</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Esquimalt</td>\n",
" <td>Boat or Ferry</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Zoo</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Highlands</td>\n",
" <td>Zoo</td>\n",
" <td>Theme Park</td>\n",
" <td>Boat or Ferry</td>\n",
" <td>Wine Shop</td>\n",
" <td>Auto Workshop</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Place 1st Most Common Venue 2nd Most Common Venue \\\n",
"0 Esquimalt Boat or Ferry Fish & Chips Shop \n",
"1 Highlands Zoo Theme Park \n",
"\n",
" 3rd Most Common Venue 4th Most Common Venue 5th Most Common Venue \\\n",
"0 Convenience Store Dessert Shop Dim Sum Restaurant \n",
"1 Boat or Ferry Wine Shop Auto Workshop \n",
"\n",
" 6th Most Common Venue 7th Most Common Venue 8th Most Common Venue \\\n",
"0 Dog Run Elementary School Falafel Restaurant \n",
"1 Fish & Chips Shop Dessert Shop Dim Sum Restaurant \n",
"\n",
" 9th Most Common Venue 10th Most Common Venue \n",
"0 Fast Food Restaurant Zoo \n",
"1 Dog Run Elementary School "
]
},
"execution_count": 208,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
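"# Rows with cluster label 3: keep column 0 (Place) plus columns 7 onward (the 10 most-common-venue columns)\n",
"label3 = clustered_df.loc[clustered_df['Cluster Labels'] == 3,\n",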
" clustered_df.columns[[0] + list(range(7, clustered_df.shape[1]))]]\n",
"print(label3.shape)\n",
"label3.head().reset_index(drop=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Cluster 5"
]
},
{
"cell_type": "code",
"execution_count": 209,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(1, 11)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Place</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Kitimat</td>\n",
" <td>Business Service</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Place 1st Most Common Venue 2nd Most Common Venue 3rd Most Common Venue \\\n",
"0 Kitimat Business Service Zoo Fast Food Restaurant \n",
"\n",
" 4th Most Common Venue 5th Most Common Venue 6th Most Common Venue \\\n",
"0 Convenience Store Dessert Shop Dim Sum Restaurant \n",
"\n",
" 7th Most Common Venue 8th Most Common Venue 9th Most Common Venue \\\n",
"0 Dog Run Elementary School Falafel Restaurant \n",
"\n",
" 10th Most Common Venue \n",
"0 Fish & Chips Shop "
]
},
"execution_count": 209,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
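"# Rows with cluster label 4: keep column 0 (Place) plus columns 7 onward (the 10 most-common-venue columns)\n",
"label4 = clustered_df.loc[clustered_df['Cluster Labels'] == 4,\n",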
" clustered_df.columns[[0] + list(range(7, clustered_df.shape[1]))]]\n",
"print(label4.shape)\n",
"label4.head().reset_index(drop=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Cluster 6"
]
},
{
"cell_type": "code",
"execution_count": 210,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(1, 11)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Place</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Trail</td>\n",
" <td>Pub</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Coffee Shop</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Zoo</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Place 1st Most Common Venue 2nd Most Common Venue 3rd Most Common Venue \\\n",
"0 Trail Pub Falafel Restaurant Coffee Shop \n",
"\n",
" 4th Most Common Venue 5th Most Common Venue 6th Most Common Venue \\\n",
"0 Construction & Landscaping Convenience Store Dessert Shop \n",
"\n",
" 7th Most Common Venue 8th Most Common Venue 9th Most Common Venue \\\n",
"0 Dim Sum Restaurant Dog Run Elementary School \n",
"\n",
" 10th Most Common Venue \n",
"0 Zoo "
]
},
"execution_count": 210,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
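"# Rows with cluster label 5: keep column 0 (Place) plus columns 7 onward (the 10 most-common-venue columns)\n",
"label5 = clustered_df.loc[clustered_df['Cluster Labels'] == 5,\n",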
" clustered_df.columns[[0] + list(range(7, clustered_df.shape[1]))]]\n",
"print(label5.shape)\n",
"label5.head().reset_index(drop=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Cluster 7"
]
},
{
"cell_type": "code",
"execution_count": 211,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(1, 11)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Place</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Saanich Central</td>\n",
" <td>Bank</td>\n",
" <td>Zoo</td>\n",
" <td>Fish &amp; Chips Shop</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" <td>Fast Food Restaurant</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Place 1st Most Common Venue 2nd Most Common Venue \\\n",
"0 Saanich Central Bank Zoo \n",
"\n",
" 3rd Most Common Venue 4th Most Common Venue 5th Most Common Venue \\\n",
"0 Fish & Chips Shop Convenience Store Dessert Shop \n",
"\n",
" 6th Most Common Venue 7th Most Common Venue 8th Most Common Venue \\\n",
"0 Dim Sum Restaurant Dog Run Elementary School \n",
"\n",
" 9th Most Common Venue 10th Most Common Venue \n",
"0 Falafel Restaurant Fast Food Restaurant "
]
},
"execution_count": 211,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
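"# Rows with cluster label 6: keep column 0 (Place) plus columns 7 onward (the 10 most-common-venue columns)\n",
"label6 = clustered_df.loc[clustered_df['Cluster Labels'] == 6,\n",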
" clustered_df.columns[[0] + list(range(7, clustered_df.shape[1]))]]\n",
"print(label6.shape)\n",
"label6.head().reset_index(drop=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Cluster 8"
]
},
{
"cell_type": "code",
"execution_count": 212,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(1, 11)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Place</th>\n",
" <th>1st Most Common Venue</th>\n",
" <th>2nd Most Common Venue</th>\n",
" <th>3rd Most Common Venue</th>\n",
" <th>4th Most Common Venue</th>\n",
" <th>5th Most Common Venue</th>\n",
" <th>6th Most Common Venue</th>\n",
" <th>7th Most Common Venue</th>\n",
" <th>8th Most Common Venue</th>\n",
" <th>9th Most Common Venue</th>\n",
" <th>10th Most Common Venue</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Nelson</td>\n",
" <td>Trail</td>\n",
" <td>Zoo</td>\n",
" <td>Fast Food Restaurant</td>\n",
" <td>Construction &amp; Landscaping</td>\n",
" <td>Convenience Store</td>\n",
" <td>Dessert Shop</td>\n",
" <td>Dim Sum Restaurant</td>\n",
" <td>Dog Run</td>\n",
" <td>Elementary School</td>\n",
" <td>Falafel Restaurant</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Place 1st Most Common Venue 2nd Most Common Venue 3rd Most Common Venue \\\n",
"0 Nelson Trail Zoo Fast Food Restaurant \n",
"\n",
" 4th Most Common Venue 5th Most Common Venue 6th Most Common Venue \\\n",
"0 Construction & Landscaping Convenience Store Dessert Shop \n",
"\n",
" 7th Most Common Venue 8th Most Common Venue 9th Most Common Venue \\\n",
"0 Dim Sum Restaurant Dog Run Elementary School \n",
"\n",
" 10th Most Common Venue \n",
"0 Falafel Restaurant "
]
},
"execution_count": 212,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
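"# Rows with cluster label 7: keep column 0 (Place) plus columns 7 onward (the 10 most-common-venue columns)\n",
"label7 = clustered_df.loc[clustered_df['Cluster Labels'] == 7,\n",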
" clustered_df.columns[[0] + list(range(7, clustered_df.shape[1]))]]\n",
"print(label7.shape)\n",
"label7.head().reset_index(drop=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Thus, these clusters characterize the type of locality and can help different people choose an area that matches their interests. For example, the first cluster covers localities whose most popular nearby venues include baseball fields and zoos, and hence would be more suitable for a younger generation, such as middle-school children. In contrast, the second cluster covers localities whose popular nearby venues include grocery stores, fast food restaurants and convenience stores, and hence would be more suitable for college students, who would need these on an almost daily basis."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### This marks the end of the project. Thank you for your time!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python",
"language": "python",
"name": "conda-env-python-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.10"
}
},
"nbformat": 4,
"nbformat_minor": 4
}