%matplotlib inline
import seaborn as sns
# sns.set() resets the style to its default, so pass both options in one call
sns.set(style="whitegrid", font_scale=1.5)
from pyspark.sql.functions import col, concat, lit
custom_concat = [col('appName'), lit('|'), col('platform'), lit('|'),
                 col('carrier'), lit('|'), col('connectionType'), lit('|'),
                 col('country'), lit('|'), col('city'), lit('|'),
                 col('userAgent')]
# Add a new column entitled "custom_col"
union_df = union_df.withColumn('custom_col', concat(*custom_concat))
# Create a Vertex DataFrame with unique ID column "id"
v = sqlContext.createDataFrame([
    ("a", "Alice", 34),
    ("b", "Bob", 36),
    ("c", "Charlie", 30),
], ["id", "name", "age"])
# Create an Edge DataFrame with "src" and "dst" columns
e = sqlContext.createDataFrame([
    ("a", "b", "friend"),
    ("b", "c", "follow"),
# Check out:
# https://gist.github.com/adrianorsouza/df4759b0583dcd112da4
# http://olivierlacan.com/posts/launch-sublime-text-3-from-the-command-line/
# To /usr/bin
sudo ln -s /Applications/Sublime\ Text.app/Contents/SharedSupport/bin/subl /usr/bin/subl
# To /usr/local/bin
ln -s "/Applications/Sublime Text.app/Contents/SharedSupport/bin/subl" /usr/local/bin/subl
# Widen width of notebook
from IPython.display import display, HTML
display(HTML("<style>.container { width:98% !important; }</style>"))
# Set pandas display options
import pandas as pd
pd.set_option('display.max_columns', 50)
pd.set_option('display.max_colwidth', 200)
def normalize(train, test):
    # Compute both statistics on the training split only
    mean, std = train.mean(), train.std()
    train = (train - mean) / std
    test = (test - mean) / std
    return train, test
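A quick way to check this kind of normalization is sketched below. It assumes NumPy arrays (the snippet does not say which array library is used), and the sample values are made up for illustration. Both splits are scaled with the mean and standard deviation of the training split, so the scaled training data ends up with mean 0 and standard deviation 1, while the test data is only shifted and scaled by the same constants.

```python
import numpy as np

def normalize(train, test):
    # Use statistics from the training split only, then apply them to both splits
    mean, std = train.mean(), train.std()
    return (train - mean) / std, (test - mean) / std

# Made-up illustration data (not from the original snippet)
train = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
test = np.array([3.0, 6.0])

train_n, test_n = normalize(train, test)
print(train_n.mean(), train_n.std())
print(test_n)
```

Reusing the training statistics for the test split keeps information from the test data out of the preprocessing step.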
import matplotlib.pyplot as plt
import sklearn.linear_model

# Train the logistic regression classifier
# (X, y and plot_decision_boundary are assumed to be defined earlier)
clf = sklearn.linear_model.LogisticRegressionCV()
clf.fit(X, y)
# Plot the decision boundary
plot_decision_boundary(lambda x: clf.predict(x))
plt.title("Logistic Regression")
"""
### Problem Statement ###
Let's say you have a square matrix which consists of cosine similarities (values between 0 and 1).
This square matrix can be of any size.
You want to get clusters which maximize the similarity between elements within each cluster.
For example, for the following matrix:
  | A   | B   | C   | D
A | 1.0 | 0.1 | 0.6 | 0.4
B | 0.1 | 1.0 | 0.1 | 0.2
def fix_encoding(some_str):
    # Keep only printable ASCII: space (0x20) through tilde (0x7E)
    return ''.join([c for c in some_str if 0x20 <= ord(c) <= 0x7E])
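The filtering idea can be tried on a small string, as in the sketch below. It uses the full printable ASCII range (0x20 to 0x7E); the function name `keep_printable_ascii` and the sample string are hypothetical and only serve the illustration.

```python
def keep_printable_ascii(some_str):
    # Keep only characters from space (0x20) through tilde (0x7E)
    return ''.join(c for c in some_str if 0x20 <= ord(c) <= 0x7E)

# Hypothetical sample: an accented letter, a tab, and a NUL byte mixed into ASCII text
sample = "caf\u00e9\tz~\x00!"
cleaned = keep_printable_ascii(sample)
print(cleaned)
```

Note that this approach drops characters rather than re-encoding them, so accented letters such as "é" are removed instead of being converted to an ASCII equivalent.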
# ===================================================================================
# Many thanks to:
# https://uoa-eresearch.github.io/eresearch-cookbook/recipe/2014/11/20/conda/
#
# More info:
# https://www.continuum.io/blog/developer-blog/python-packages-and-environments-conda
# https://conda-forge.github.io/#about
# ===================================================================================
# conda info --env