sf_merged.loc[sf_merged['Cluster Labels'] == 1, sf_merged.columns[[1] + list(range(5, sf_merged.shape[1]))]]
sf_merged.loc[sf_merged['Cluster Labels'] == 0, sf_merged.columns[[1] + list(range(5, sf_merged.shape[1]))]]
# Cluster 1
sf_merged.loc[sf_merged['Cluster Labels'] == 0, sf_merged.columns[[1] + list(range(5, sf_merged.shape[1]))]]
# Cluster 2
sf_merged.loc[sf_merged['Cluster Labels'] == 1, sf_merged.columns[[1] + list(range(5, sf_merged.shape[1]))]]
# Cluster 3
sf_merged.loc[sf_merged['Cluster Labels'] == 2, sf_merged.columns[[1] + list(range(5, sf_merged.shape[1]))]]
# Cluster 4
# create map
map_clusters = folium.Map(location = [latitude, longitude], zoom_start = 11)
# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i * x) ** 2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]
# add markers to the map
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
sf_merged = sf_data
sf_merged = sf_merged.merge(neighborhoods_venues_sorted, on = 'Neighborhood')
sf_merged.head()
from sklearn.cluster import KMeans
# set number of clusters
kclusters = 5
# drop the label column by keyword; recent pandas releases reject the positional axis form drop('Neighborhood', 1)
sf_grouped_clustering = sf_grouped.drop(columns = 'Neighborhood')
# run k-means clustering
kmeans = KMeans(n_clusters = kclusters, random_state = 0).fit(sf_grouped_clustering)
# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]
num_top_venues = 10
indicators = ['st', 'nd', 'rd']
# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        # append 'st', 'nd', 'rd' to the top 3 venues
        columns.append('{}{} Most Common Venue'.format(ind + 1, indicators[ind]))
    except IndexError:
        # assumed completion of the truncated block: ranks past the third take 'th'
        columns.append('{}th Most Common Venue'.format(ind + 1))
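The column-naming loop above can also be written without exception handling. The following is a minimal sketch, not the author's code: `rank_column_labels` is a hypothetical helper name, and the plain 'th' fallback is an assumption that only holds for small rank counts such as the 10 used here (it would produce "21th" for larger values).

```python
def rank_column_labels(num_top_venues):
    """Build ranked venue column names (hypothetical helper, not from the source)."""
    suffixes = ['st', 'nd', 'rd']
    labels = ['Neighborhood']
    for rank in range(1, num_top_venues + 1):
        # ranks 1-3 take 'st', 'nd', 'rd'; every later rank falls back to 'th'
        suffix = suffixes[rank - 1] if rank <= len(suffixes) else 'th'
        labels.append('{}{} Most Common Venue'.format(rank, suffix))
    return labels


labels = rank_column_labels(5)
print(labels)
```

Passing `num_top_venues = 10` should give the same list that the loop builds in `columns`.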
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending = False)
    return row_categories_sorted.index.values[0:num_top_venues]
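To see what `return_most_common_venues` returns, it may help to run it on a tiny frame shaped like `sf_grouped` (first column `Neighborhood`, remaining columns per-category frequencies). The sketch below repeats the function so it runs on its own; the neighborhood names, categories, and frequencies are made up for illustration and are not from the San Francisco data.

```python
import pandas as pd


def return_most_common_venues(row, num_top_venues):
    # skip the leading 'Neighborhood' entry, then rank categories by frequency
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]


# toy stand-in for sf_grouped (hypothetical values)
toy_grouped = pd.DataFrame({
    'Neighborhood': ['A', 'B'],
    'Cafe': [0.5, 0.1],
    'Park': [0.3, 0.7],
    'Bar': [0.2, 0.2],
})

# map each neighborhood to its two most frequent venue categories
top_two = {
    row['Neighborhood']: list(return_most_common_venues(row, 2))
    for _, row in toy_grouped.iterrows()
}
print(top_two)
```

The function relies on the neighborhood name being the first entry of the row, which is why the one-hot frame is reordered before grouping.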
sf_grouped = sf_onehot.groupby('Neighborhood').mean().reset_index()
sf_grouped.head()
# one hot encoding
sf_onehot = pd.get_dummies(sf_venues[['Venue Category']], prefix = "", prefix_sep = "")
# add neighborhood column back to dataframe
sf_onehot['Neighborhood'] = sf_venues['Neighborhood']
# move neighborhood column to the first column
fixed_columns = [sf_onehot.columns[-1]] + list(sf_onehot.columns[:-1])
sf_onehot = sf_onehot[fixed_columns]
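The one-hot encoding above, followed by a per-neighborhood mean, turns a list of venues into category frequencies for each neighborhood. Here is a minimal sketch on a three-row toy table; the `toy_*` names and values are hypothetical, and `dtype = float` is an added assumption so the averaged columns are numeric regardless of pandas version.

```python
import pandas as pd

# toy stand-in for sf_venues (hypothetical values)
toy_venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'B'],
    'Venue Category': ['Cafe', 'Park', 'Cafe'],
})

# one indicator column per venue category
toy_onehot = pd.get_dummies(toy_venues[['Venue Category']],
                            prefix="", prefix_sep="", dtype=float)
# put the neighborhood label in the first column
toy_onehot.insert(0, 'Neighborhood', toy_venues['Neighborhood'])

# the mean of the indicators is the share of each category per neighborhood
toy_grouped = toy_onehot.groupby('Neighborhood').mean().reset_index()
print(toy_grouped)
```

Neighborhood A has one cafe and one park, so both columns average to 0.5; B has only a cafe, so `Cafe` is 1.0 and `Park` is 0.0.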