@accessnash
accessnash / ts_plots.py
Created August 7, 2018 00:58
Time series visualization in Python - Datacamp
# Visualizing time-series
import matplotlib.pyplot as plt
# Plot the aapl time series in blue
plt.plot(aapl, color='blue', label='AAPL')
# Plot the ibm time series in green
plt.plot(ibm, color='green', label='IBM')
accessnash / classification_tree.py
Created August 10, 2018 21:54
Machine Learning with Tree-Based Models in Python : Ch 1 : Classification & Regression trees (Datacamp)
from sklearn.tree import DecisionTreeClassifier
# Set SEED for reproducibility (assumed value; SEED is not defined in this excerpt)
SEED = 1
# Instantiate a DecisionTreeClassifier 'dt' with a maximum depth of 6
dt = DecisionTreeClassifier(max_depth=6, random_state=SEED)
# Fit dt to the training set
dt.fit(X_train, y_train)
# Predict test set labels
y_pred = dt.predict(X_test)
accessnash / bias_var_tradeoff.py
Last active August 15, 2018 21:46
Machine Learning with Tree-Based Models in Python : Ch 2 : Bias-variance trade-off, Ensemble learning - Datacamp
from sklearn.model_selection import train_test_split
# Set SEED for reproducibility
SEED = 1
# Split the data into 70% train and 30% test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=SEED)
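# Import DecisionTreeRegressor
from sklearn.tree import DecisionTreeRegressor
# Instantiate a DecisionTreeRegressor dt (a float min_samples_leaf is read as a fraction of the training samples)
dt = DecisionTreeRegressor(max_depth=4, min_samples_leaf=0.26, random_state=SEED)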
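The 70% / 30% split in this snippet can be pictured with a standard-library sketch. This is not scikit-learn's implementation: `split_rows` is a hypothetical helper, the data is made up, and the exact shuffling and rounding inside `train_test_split` may differ.

```python
import random

def split_rows(rows, test_size=0.3, seed=1):
    """Shuffle a copy of rows, then hold out a test_size fraction (hypothetical helper)."""
    rng = random.Random(seed)          # seeded generator, so the split is reproducible
    shuffled = list(rows)              # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]   # (train, test)

rows = list(range(10))                 # made-up data: ten sample ids
train, test = split_rows(rows)
print(len(train), len(test))
```

Calling `split_rows` again with the same seed returns the same partition, which is the role `random_state=SEED` plays in the snippet.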
accessnash / bagging_n_randfor.py
Created August 15, 2018 23:45
Machine Learning with Tree-Based Models in Python : Ch 3 : Bagging & Random Forests - Datacamp
from sklearn.tree import DecisionTreeClassifier
# Import BaggingClassifier
from sklearn.ensemble import BaggingClassifier
# Instantiate dt
dt = DecisionTreeClassifier(random_state=1)
# Instantiate bc (scikit-learn 1.2+ calls this parameter 'estimator'; older versions use 'base_estimator')
bc = BaggingClassifier(estimator=dt, n_estimators=50, random_state=1)
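Bagging trains each estimator on a bootstrap sample of the training set and combines the predictions by voting. The sketch below shows only those two ingredients with the standard library; `bootstrap_sample` and `majority_vote` are hypothetical helpers on made-up data, not what `BaggingClassifier` does internally.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) items from rows with replacement."""
    return rng.choices(rows, k=len(rows))

def majority_vote(labels):
    """Return the most common label among the ensemble's predictions."""
    return Counter(labels).most_common(1)[0][0]

rng = random.Random(1)
rows = ["a", "b", "c", "d", "e"]                              # made-up training rows
samples = [bootstrap_sample(rows, rng) for _ in range(50)]    # one sample per estimator
print(majority_vote([1, 0, 1, 1, 0]))                         # three votes for 1, two for 0
```

Because sampling is with replacement, a given sample usually repeats some rows and omits others, which is what makes the 50 estimators differ from one another.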
accessnash / boosting.py
Created August 22, 2018 21:14
Machine Learning with Tree-Based Models in Python : Ch - 4 - Adaboosting, Gradient boosting and Stochastic Gradient boosting - Datacamp
# Import DecisionTreeClassifier
from sklearn.tree import DecisionTreeClassifier
# Import AdaBoostClassifier
from sklearn.ensemble import AdaBoostClassifier
# Instantiate dt
dt = DecisionTreeClassifier(max_depth=2, random_state=1)
# Instantiate ada
accessnash / model_tuning.py
Last active August 22, 2018 21:54
Machine Learning with Tree-Based Models in Python : Ch - 5 - Model Tuning - Datacamp
# Define params_dt
params_dt = {
'max_depth': [2, 3, 4],
'min_samples_leaf': [0.12, 0.14, 0.16, 0.18],
}
# Import GridSearchCV
from sklearn.model_selection import GridSearchCV
# Instantiate grid_dt
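A grid search evaluates every combination of the values listed in `params_dt`. As a rough illustration (it fits and scores nothing, and is not how `GridSearchCV` is implemented), the candidates can be enumerated with `itertools.product`; the name `candidates` is an assumption made for this sketch.

```python
from itertools import product

params_dt = {
    "max_depth": [2, 3, 4],
    "min_samples_leaf": [0.12, 0.14, 0.16, 0.18],
}

names = list(params_dt)
# One dict per point in the Cartesian product of the value lists
candidates = [dict(zip(names, values)) for values in product(*params_dt.values())]
print(len(candidates))
print(candidates[0])
```

With 3 depths and 4 leaf fractions the grid holds 3 × 4 = 12 candidates, each of which would be cross-validated.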
print(basetable.head())
# Assign the number of rows in the basetable to the variable 'population_size'.
population_size = len(basetable)
# Print the population size.
print(population_size)
# Assign the number of targets to the variable 'targets_count'.
targets_count = sum(basetable["target"])
# Print the number of targets.
print(targets_count)
# Print the incidence, i.e. the number of targets divided by the population size.
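The incidence described in the last comment is a single division. A minimal sketch, assuming a made-up 0/1 target list in place of the `basetable["target"]` column:

```python
targets = [0, 1, 0, 0, 1, 0, 0, 0]    # made-up 0/1 target column

population_size = len(targets)        # number of rows
targets_count = sum(targets)          # number of rows with target == 1
incidence = targets_count / population_size
print(incidence)
```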
accessnash / predictive_analytics2.py
Created September 4, 2018 00:30
Forward stepwise variable selection for logistic regression - Chapter 2 - Predictive Analytics - Datacamp
# Import the linear_model module and the roc_auc_score function
from sklearn import linear_model
from sklearn.metrics import roc_auc_score
# Consider two sets of variables
variables_1 = ["mean_gift", "income_low"]
variables_2 = ["mean_gift", "income_low", "gender_F", "country_India", "age"]
# Make predictions using the first set of variables and assign the AUC to auc_1
X_1 = basetable[variables_1]
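Forward stepwise selection compares variable sets by their AUC, so it helps to see what that number measures: the chance that a randomly chosen target receives a higher score than a randomly chosen non-target. The sketch below computes it by brute force over all pairs; `pairwise_auc` is a hypothetical helper on made-up scores, intended as an illustration and not as a replacement for `roc_auc_score`.

```python
def pairwise_auc(targets, scores):
    """AUC as the share of (target, non-target) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(targets, scores) if t == 1]
    neg = [s for t, s in zip(targets, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

targets = [0, 0, 1, 1]                 # made-up labels
scores = [0.1, 0.4, 0.35, 0.8]         # made-up predicted probabilities
auc = pairwise_auc(targets, scores)
print(auc)
```

Three of the four target / non-target pairs are ordered correctly here, so the sketch yields 0.75. A variable set is worth keeping if it raises this value on data the model has not seen.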
accessnash / predictive_analytics3.py
Created September 4, 2018 15:25
Cumulative gains curve and Lift curve to explain model performance to business - Chapter 3 - Predictive Analytics - Datacamp
# Cumulative Gains curve
import matplotlib.pyplot as plt
# Import the scikitplot module
import scikitplot as skplt
# Plot the cumulative gains graph
skplt.metrics.plot_cumulative_gain(targets_test, predictions_test)
plt.show()
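The points behind a cumulative gains curve can be computed by hand: rank observations by predicted score, then track what share of all targets has been captured after each one. A minimal sketch under those assumptions, with a hypothetical `cumulative_gains` helper and made-up data (it does not reproduce scikit-plot's plotting code):

```python
def cumulative_gains(targets, scores):
    """Return (share of population contacted, share of targets captured) per rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total = sum(targets)
    captured = 0
    points = []
    for rank, i in enumerate(order, start=1):
        captured += targets[i]                         # targets found so far
        points.append((rank / len(scores), captured / total))
    return points

targets = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]               # made-up labels
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]   # made-up scores
gains = cumulative_gains(targets, scores)
print(gains[2])                                        # point after the top 30%
```

Reading a point such as the one at 30% tells a business audience how many of the targets would be reached by contacting only the top-scored 30% of the population.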
accessnash / predictive_analytics4.py
Created September 9, 2018 22:13
Predictor insight graph - Chapter 4 - Predictive Analytics - Datacamp
# Predictor insight graph
# Inspect the predictor insight graph table of Country
print(pig_table)
# Print the number of UK donors
print(pig_table["Size"][pig_table["Country"]=="UK"])
# Check the target incidence of USA and India donors
print(pig_table["Incidence"][pig_table["Country"]=="USA"])
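The `Size` and `Incidence` columns used above suggest how a predictor insight graph table is put together: one row per predictor value, with the group size and the share of targets in that group. A sketch with made-up data and a plain dict in place of a DataFrame (the `rows` and `pig` names are assumptions for this illustration):

```python
from collections import defaultdict

# Made-up (Country, target) pairs
rows = [("UK", 1), ("UK", 0), ("USA", 0), ("USA", 0), ("India", 1), ("India", 1)]

groups = defaultdict(list)
for country, target in rows:
    groups[country].append(target)     # collect targets per country

# Size = group count, Incidence = mean of the 0/1 target within the group
pig = {c: {"Size": len(t), "Incidence": sum(t) / len(t)} for c, t in groups.items()}
print(pig["UK"])
```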