| # Visualizing time-series | |
| import matplotlib.pyplot as plt | |
| # Plot the aapl time series in blue | |
| plt.plot(aapl, color='blue', label='AAPL') | |
| # Plot the ibm time series in green | |
| plt.plot(ibm, color='green', label='IBM') |
| from sklearn.tree import DecisionTreeClassifier | |
| # Instantiate a DecisionTreeClassifier 'dt' with a maximum depth of 6 | |
| # (SEED is assumed to be defined beforehand, e.g. SEED = 1) | |
| dt = DecisionTreeClassifier(max_depth=6, random_state=SEED) | |
| # Fit dt to the training set | |
| dt.fit(X_train, y_train) | |
| # Predict test set labels | |
| y_pred = dt.predict(X_test) |
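A natural follow-up to the prediction step is to check how many test labels were predicted correctly. The sketch below computes plain accuracy by hand. The label lists are made-up stand-ins for `y_test` and `y_pred`, and the helper name `accuracy` is hypothetical. scikit-learn's `accuracy_score` in `sklearn.metrics` reports the same fraction.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy of empty inputs")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


# Made-up labels standing in for y_test and dt.predict(X_test)
y_test_demo = [1, 0, 1, 1, 0]
y_pred_demo = [1, 0, 0, 1, 0]

acc = accuracy(y_test_demo, y_pred_demo)
print(f"Test set accuracy: {acc:.2f}")
```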
| from sklearn.model_selection import train_test_split | |
| # Set SEED for reproducibility | |
| SEED = 1 | |
| # Split the data into 70% train and 30% test | |
| X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=SEED) | |
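| # Import DecisionTreeRegressor | |
| from sklearn.tree import DecisionTreeRegressor | |
| # Instantiate a DecisionTreeRegressor dt; the float min_samples_leaf is a fraction of the training samples | |
| dt = DecisionTreeRegressor(max_depth=4, min_samples_leaf=0.26, random_state=SEED) |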
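When `min_samples_leaf` is a float, scikit-learn reads it as a fraction: every leaf must hold at least `ceil(fraction * n_samples)` training samples. The arithmetic sketch below shows what 0.26 implies. The training-set size of 280 rows is a made-up assumption, and `min_leaf_size` is a hypothetical helper, not a scikit-learn function.

```python
import math


def min_leaf_size(fraction, n_samples):
    """Smallest leaf size allowed when min_samples_leaf is given as a fraction."""
    return math.ceil(fraction * n_samples)


n_train = 280  # hypothetical number of training rows
leaf_floor = min_leaf_size(0.26, n_train)

# Leaves are disjoint, so the tree cannot have more leaves than this.
max_leaves = n_train // leaf_floor

print(f"Each leaf needs at least {leaf_floor} samples")
print(f"The tree can have at most {max_leaves} leaves")
```

Under these assumptions the leaf-size constraint caps the tree at three leaves, so it restricts the tree far more than `max_depth=4` does.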
| from sklearn.tree import DecisionTreeClassifier | |
| # Import BaggingClassifier | |
| from sklearn.ensemble import BaggingClassifier | |
| # Instantiate dt | |
| dt = DecisionTreeClassifier(random_state=1) | |
| # Instantiate bc ('estimator' is the parameter name in scikit-learn >= 1.2; older releases call it 'base_estimator') | |
| bc = BaggingClassifier(estimator=dt, n_estimators=50, random_state=1) |
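Bagging trains each of the 50 estimators on a bootstrap sample, that is, rows drawn with replacement from the training set, and then combines their predictions. The sketch below illustrates only those two ideas in plain Python. The helper names, the 10-row dataset size, and the example votes are all made up. Majority voting is the textbook aggregation; scikit-learn's `BaggingClassifier` averages predicted probabilities instead when the base estimator provides them.

```python
import random
from collections import Counter


def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def majority_vote(predictions):
    """Most common label among the estimators' predictions for one row."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(1)  # fixed seed so the draws are repeatable
n_samples = 10          # hypothetical training-set size
n_estimators = 50       # same count as in the BaggingClassifier above

# One bootstrap sample per estimator
samples = [bootstrap_indices(n_samples, rng) for _ in range(n_estimators)]
print("First bootstrap sample:", samples[0])

# Hypothetical predictions of five estimators for a single test row
vote = majority_vote([1, 0, 1, 1, 0])
print("Aggregated label:", vote)
```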
| # Import DecisionTreeClassifier | |
| from sklearn.tree import DecisionTreeClassifier | |
| # Import AdaBoostClassifier | |
| from sklearn.ensemble import AdaBoostClassifier | |
| # Instantiate dt | |
| dt = DecisionTreeClassifier(max_depth=2, random_state=1) | |
| # Instantiate ada |
| # Define params_dt | |
| params_dt = { | |
| 'max_depth': [2, 3, 4], | |
| 'min_samples_leaf': [0.12, 0.14, 0.16, 0.18], | |
| } | |
| # Import GridSearchCV | |
| from sklearn.model_selection import GridSearchCV | |
| # Instantiate grid_dt |
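A grid search tries every combination of the listed values. The sketch below enumerates the combinations that `params_dt` produces, which is a quick way to see how many models will be trained. The fold count of 5 is an assumption, since the preview cuts off before the `GridSearchCV` call.

```python
from itertools import product

params_dt = {
    'max_depth': [2, 3, 4],
    'min_samples_leaf': [0.12, 0.14, 0.16, 0.18],
}

# Cartesian product of all value lists, one dict per candidate setting
names = list(params_dt)
candidates = [dict(zip(names, values)) for values in product(*params_dt.values())]

n_folds = 5  # hypothetical cv value; not shown in the original snippet
total_fits = len(candidates) * n_folds

print(f"{len(candidates)} candidate settings, {total_fits} cross-validation fits")
print("First candidate:", candidates[0])
```

With `refit` left at its default, `GridSearchCV` trains one more model on the full training set using the best candidate.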
| print(basetable.head()) | |
| # Assign the number of rows in the basetable to the variable 'population_size'. | |
| population_size = len(basetable) | |
| # Print the population size. | |
| print(population_size) | |
| # Assign the number of targets to the variable 'targets_count'. | |
| targets_count = sum(basetable["target"]) | |
| # Print the number of targets. | |
| print(targets_count) | |
| # Print the incidence, i.e. the number of targets divided by the population size. |
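The incidence described in the comments is a single division. The sketch below works through it with a plain list standing in for `basetable["target"]`; the ten values are made up for illustration.

```python
# Made-up 0/1 target column standing in for basetable["target"]
targets = [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]

population_size = len(targets)
targets_count = sum(targets)

if population_size == 0:
    raise ValueError("incidence is undefined for an empty population")

# Incidence: share of the population that is a target
incidence = targets_count / population_size
print(incidence)
```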
| # Import the linear_model module and the roc_auc_score function | |
| from sklearn import linear_model | |
| from sklearn.metrics import roc_auc_score | |
| # Consider two sets of variables | |
| variables_1 = ["mean_gift","income_low"] | |
| variables_2 = ["mean_gift","income_low","gender_F","country_India","age"] | |
| # Make predictions using the first set of variables and assign the AUC to auc_1 | |
| X_1 = basetable[variables_1] |
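The AUC used to compare the two variable sets has a direct interpretation: it is the probability that a randomly chosen target receives a higher score than a randomly chosen non-target, with ties counted as half. The sketch below computes it by brute force over all pairs. The function name `auc_by_pairs`, the labels, and both score lists are made up; in practice `roc_auc_score` does this work.

```python
def auc_by_pairs(y_true, scores):
    """AUC as the share of (target, non-target) pairs ranked correctly."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one target and one non-target")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Made-up labels and scores from two hypothetical models
y_demo = [1, 0, 1, 0, 0]
scores_small = [0.6, 0.7, 0.4, 0.3, 0.5]  # model with fewer variables
scores_large = [0.9, 0.4, 0.8, 0.2, 0.7]  # model with more variables

auc_1 = auc_by_pairs(y_demo, scores_small)
auc_2 = auc_by_pairs(y_demo, scores_large)
print(round(auc_1, 2), round(auc_2, 2))
```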
| # Cumulative Gains curve | |
| import matplotlib.pyplot as plt | |
| # Import the scikitplot module | |
| import scikitplot as skplt | |
| # Plot the cumulative gains graph | |
| skplt.metrics.plot_cumulative_gain(targets_test, predictions_test) | |
| plt.show() |
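The curve drawn above answers one question: after contacting the top x% of the population ranked by predicted probability, what share of all targets has been reached? The sketch below computes those points by hand. The function name `cumulative_gains` and the demo data are made up, and the scores are single target probabilities rather than the per-class probability columns that `plot_cumulative_gain` works with.

```python
def cumulative_gains(y_true, scores):
    """Return (share of population, share of targets found) points."""
    total_targets = sum(y_true)
    if total_targets == 0:
        raise ValueError("no targets present, gains are undefined")
    # Rank observations from highest to lowest score
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    points = [(0.0, 0.0)]
    found = 0
    for rank, i in enumerate(order, start=1):
        found += y_true[i]
        points.append((rank / len(order), found / total_targets))
    return points


# Made-up targets and predicted target probabilities
targets_demo = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
scores_demo = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.4, 0.5, 0.6, 0.05]

points = cumulative_gains(targets_demo, scores_demo)
for share_population, share_targets in points:
    print(f"{share_population:.1f} -> {share_targets:.1f}")
```

In this made-up data, both targets are reached within the top 30% of the ranking, whereas a random ordering would be expected to reach about 30% of them at that point.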
| # Predictor insight graph | |
| # Inspect the predictor insight graph table of Country | |
| print(pig_table) | |
| # Print the number of UK donors | |
| print(pig_table["Size"][pig_table["Country"]=="UK"]) | |
| # Check the target incidence of USA and India donors | |
| print(pig_table["Incidence"][pig_table["Country"]=="USA"]) |