wget https://repo.anaconda.com/archive/Anaconda3-5.2.0-Linux-x86_64.sh
sh Anaconda3-5.2.0-Linux-x86_64.sh
conda update -n base conda
# Activate an environment, then inside it run:
conda install pandas matplotlib jupyter notebook scipy scikit-learn nb_conda seaborn
# Add some extra things...
# jupyter extensions
conda install -c conda-forge jupyter_contrib_nbextensions
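
After the install steps above, it can help to confirm that the packages are visible from the active environment. The following is a minimal sketch, not part of the original notes: `EXPECTED_MODULES` and `check_modules` are hypothetical names, and the list only covers import names I am sure of (scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Import names for some of the conda packages installed above.
# Note that the scikit-learn package is imported as `sklearn`.
EXPECTED_MODULES = ["pandas", "matplotlib", "scipy", "sklearn", "seaborn"]


def check_modules(names):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(EXPECTED_MODULES).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Run it with the Python interpreter of the environment you just activated; a `MISSING` entry usually means the install went into a different environment.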
import pandas as pd
from multiprocessing import Pool  # for reading the CSVs faster


def my_read_csv(filename):
    # Helper function for the parallel load_csvs
    return pd.read_csv(filename)


def load_csvs(prefix):
    """Reads and joins all our CSV files into one big dataframe.

    We do it in parallel to make it faster, since otherwise it takes some time.
    Idea from: https://stackoverflow.com/questions/36587211/easiest-way-to-read-csv-files-with-multiprocessing-in-pandas
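
The body of `load_csvs` is cut off in this listing. The following is a sketch of how such a parallel load could look, not the original implementation. It assumes that `prefix` is a path prefix shared by all the CSV files (matched as `prefix*.csv`); the names `read_one_csv`, `load_csvs_sketch`, and the `pool_factory` parameter are hypothetical.

```python
import glob
from multiprocessing import Pool

import pandas as pd


def read_one_csv(filename):
    # Must be defined at module level so worker processes can pickle it.
    return pd.read_csv(filename)


def load_csvs_sketch(prefix, pool_factory=Pool):
    """Read every file matching `prefix*.csv` in parallel and concatenate them.

    Assumption: `prefix` is a path prefix shared by all the CSV files.
    `pool_factory` is a hypothetical hook for swapping in another pool type,
    for example multiprocessing.pool.ThreadPool.
    """
    filenames = sorted(glob.glob(prefix + "*.csv"))
    if not filenames:
        raise FileNotFoundError(f"no CSV files match {prefix}*.csv")
    with pool_factory() as pool:
        # One task per file; map() returns the frames in filename order.
        frames = pool.map(read_one_csv, filenames)
    return pd.concat(frames, ignore_index=True)
```

On platforms that start worker processes with `spawn` (Windows, macOS), call the function from under an `if __name__ == "__main__":` guard. Whether processes beat a plain loop depends on file sizes, since each dataframe has to be pickled back to the parent process.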
# taken from https://stackoverflow.com/a/20538655
git init
git remote add origin PATH/TO/REPO
git fetch
git checkout -t origin/master
"""Famous kaggle reduce mem usage script.
NOT MINE - taken from https://www.kaggle.com/gemartin/load-data-reduce-memory-usage
"""
import pandas as pd
import numpy as np


def reduce_mem_usage(df):
    """ iterate through all the columns of a dataframe and modify the data type
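
The `reduce_mem_usage` listing is truncated after its first docstring line. As an illustration of the general idea (shrinking each numeric column to a smaller dtype), here is a simplified sketch built on `pd.to_numeric(..., downcast=...)`. It is not the Kaggle script, and `downcast_numeric` is a hypothetical name.

```python
import pandas as pd


def downcast_numeric(df):
    """Return a copy of `df` with numeric columns downcast to smaller dtypes.

    Integer columns go to the smallest integer dtype that holds their values;
    float columns go to float32 when pandas considers that safe.
    Non-numeric columns are left untouched.
    """
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_integer_dtype(out[col]):
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif pd.api.types.is_float_dtype(out[col]):
            out[col] = pd.to_numeric(out[col], downcast="float")
    return out
```

Comparing `df.memory_usage(deep=True).sum()` before and after shows how much was saved. Keep in mind that float32 carries less precision than float64, so this is a trade-off rather than a free win.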
def embed_plotly(fig):
    """
    See https://plotly.com/python/static-image-export/
    """
    # not sure if this import will work outside jupyter
    from IPython.display import Image
    img_bytes = fig.to_image(format="png")
    return Image(img_bytes)
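
On the import question raised in the comment: `IPython.display` can be imported wherever the IPython package is installed, not only inside Jupyter. If the helper should also work where IPython is missing, one option is to fall back to the raw PNG bytes. This is a sketch under that assumption, and `to_png_display` is a hypothetical name. Note that `fig.to_image` itself needs a static export engine such as kaleido installed, per the plotly page linked above.

```python
def to_png_display(fig):
    """Render `fig` to PNG and wrap it for notebook display when IPython is available.

    `fig` is expected to offer a plotly-style `to_image(format=...)` method.
    """
    img_bytes = fig.to_image(format="png")
    try:
        from IPython.display import Image
    except ImportError:
        # IPython is not installed: hand back the raw PNG bytes instead.
        return img_bytes
    return Image(data=img_bytes, format="png")
```

Outside a notebook the returned bytes can be written to disk with `open("figure.png", "wb")`.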