Created
May 5, 2020 14:51
HW5
{ | |
"nbformat": 4, | |
"nbformat_minor": 0, | |
"metadata": { | |
"colab": { | |
"name": "HW5", | |
"provenance": [], | |
"collapsed_sections": [], | |
"machine_shape": "hm", | |
"authorship_tag": "ABX9TyNO4uSND0utIPOOA0cToGtM", | |
"include_colab_link": true | |
}, | |
"kernelspec": { | |
"name": "python3", | |
"display_name": "Python 3" | |
} | |
}, | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "view-in-github", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"<a href=\"https://colab.research.google.com/gist/tuffacton/d9d8ac1bfe8d8f8d9aa779f990e5a340/hw5.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "I7GL9WXDog-b", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"# Homework 5\n", | |
"Nicolas Acton\n", | |
"\n", | |
"You can run your own copy of this Colaboratory notebook from this gist:\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "wcY97ecRo1xv", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"## Task\n", | |
"Investigate the effect of different neural network architectures on the performance of a classification problem. Specifically, which activation functions, and number and sizes of hidden layers, give the best performance for a given dataset.\n", | |
"\n", | |
"Here is what we'll do:\n", | |
"1. Load the \"ccpp\" dataset using the data on the sheet named \"allBin\". The ID field is \"ID\" and the binary target is \"TG\".\n", | |
"2. Normalize/scale the data appropriately\n", | |
"3. Divide the data into training and test sets\n", | |
"4. For a variety of architectures:\n", | |
"\n", | |
" a. Train an MLPClassifer on the training data.\n", | |
"\n", | |
" b. Measure the performance on the test set using two different measures: AUROC and misclassification rate.\n", | |
"5. Build two tables: the ten best model architectures by AUROC, and the ten best model architectures by misclassification rate.\n", | |
"6. Identify the best model using each of the two measures of performance: do they agree or do they indicate different models?\n", | |
"7. Summarize your results and your findings.\n", | |
"\n", | |
"Note: We want to see all combinations of number of hiddens layers (from 1 to 3), number of nodes in hidden layers (1 to 20 without assumption that the layers have the same number of nodes), and all combinations of activation features (relu, identity, and tanh). This program could take hours to run, so keep that in mind." | |
] | |
}, | |
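As a rough sense of scale for the note above, the number of models in the full grid can be counted directly. This is a minimal sketch, assuming every layer size from 1 to 20 is tried independently for each of 1 to 3 hidden layers and for each of the three activations; the names `layer_sizes`, `architectures`, and `n_models` are illustrative only.

```python
from itertools import product

layer_sizes = range(1, 21)  # 1 to 20 nodes per hidden layer
activations = ("relu", "identity", "tanh")

# One tuple per architecture: (n1,), (n1, n2), (n1, n2, n3)
architectures = [
    combo
    for depth in (1, 2, 3)
    for combo in product(layer_sizes, repeat=depth)
]

# Every architecture is trained once per activation function
n_models = len(architectures) * len(activations)
print(len(architectures), n_models)
```

That is 20 + 400 + 8000 = 8420 architectures, or 25260 model fits in total, which is why the run time is measured in hours.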
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "uT1xYpnpEo6I", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"import sklearn\n", | |
"import pandas as pd\n", | |
"import numpy as np\n", | |
"import math" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "JyLhRZgno0-v", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"### Data Preparation\n", | |
"Luckily we're using just one sheet from an otherwise larger set of sheets. We'll transfer to csv for easier data retrieval and load from a gist." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "HiEaoOlRuFoP", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"data = pd.read_csv('https://gist.githubusercontent.com/tuffacton/a115eeab803f0eaff766b6a943c8d760/raw/26bd4ad3f8a1b950010953ecd3fe8dd72424189b/ccpp.csv')" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "nIYWD6ycuTCt", | |
"colab_type": "code", | |
"outputId": "fe31f9d5-40fa-4cc4-cb93-d0a248c343ad", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 406 | |
} | |
}, | |
"source": [ | |
"# Using the ID as the ID was troublesome for further normalization, so we'll just allow position to be utilized\n", | |
"data.drop(['ID'], axis=1)" | |
], | |
"execution_count": 0, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>AT</th>\n", | |
" <th>V</th>\n", | |
" <th>AP</th>\n", | |
" <th>RH</th>\n", | |
" <th>TG</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>14.96</td>\n", | |
" <td>41.76</td>\n", | |
" <td>1024.07</td>\n", | |
" <td>73.17</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>25.18</td>\n", | |
" <td>62.96</td>\n", | |
" <td>1020.04</td>\n", | |
" <td>59.08</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>5.11</td>\n", | |
" <td>39.40</td>\n", | |
" <td>1012.16</td>\n", | |
" <td>92.14</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>20.86</td>\n", | |
" <td>57.32</td>\n", | |
" <td>1010.24</td>\n", | |
" <td>76.64</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>10.82</td>\n", | |
" <td>37.50</td>\n", | |
" <td>1009.23</td>\n", | |
" <td>96.62</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>...</th>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>9563</th>\n", | |
" <td>15.12</td>\n", | |
" <td>48.92</td>\n", | |
" <td>1011.80</td>\n", | |
" <td>72.93</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>9564</th>\n", | |
" <td>33.41</td>\n", | |
" <td>77.95</td>\n", | |
" <td>1010.30</td>\n", | |
" <td>59.72</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>9565</th>\n", | |
" <td>15.99</td>\n", | |
" <td>43.34</td>\n", | |
" <td>1014.20</td>\n", | |
" <td>78.66</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>9566</th>\n", | |
" <td>17.65</td>\n", | |
" <td>59.87</td>\n", | |
" <td>1018.58</td>\n", | |
" <td>94.65</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>9567</th>\n", | |
" <td>23.68</td>\n", | |
" <td>51.30</td>\n", | |
" <td>1011.86</td>\n", | |
" <td>71.24</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"<p>9568 rows × 5 columns</p>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" AT V AP RH TG\n", | |
"0 14.96 41.76 1024.07 73.17 1\n", | |
"1 25.18 62.96 1020.04 59.08 0\n", | |
"2 5.11 39.40 1012.16 92.14 1\n", | |
"3 20.86 57.32 1010.24 76.64 0\n", | |
"4 10.82 37.50 1009.23 96.62 1\n", | |
"... ... ... ... ... ..\n", | |
"9563 15.12 48.92 1011.80 72.93 1\n", | |
"9564 33.41 77.95 1010.30 59.72 0\n", | |
"9565 15.99 43.34 1014.20 78.66 1\n", | |
"9566 17.65 59.87 1018.58 94.65 1\n", | |
"9567 23.68 51.30 1011.86 71.24 1\n", | |
"\n", | |
"[9568 rows x 5 columns]" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 3 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "1jYUrPu2uoQ4", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"We'll do a little normalization using the sklearn scaling functionality." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "2vplwFHNuaXr", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"def normalize(x):\n", | |
" # Change the attributes of the scaler to experiment\n", | |
" from sklearn.preprocessing import StandardScaler\n", | |
" scaler = StandardScaler()\n", | |
" x = pd.DataFrame(scaler.fit_transform(x), columns=x.columns)\n", | |
" return x" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
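With its default settings, `StandardScaler` rescales each column to zero mean and unit variance, i.e. z = (x - mean) / std with the population standard deviation. The sketch below reproduces that arithmetic in plain Python for a single column. `zscore` is a hypothetical helper, and the sample values are the first five `AT` readings shown earlier, so the numbers differ from the notebook's, which uses statistics from the full column.

```python
from statistics import fmean, pstdev

def zscore(values):
    """Standardize a list of numbers to zero mean and unit (population) variance."""
    mu = fmean(values)
    sigma = pstdev(values)  # population std (ddof=0), matching StandardScaler's default
    return [(v - mu) / sigma for v in values]

at_sample = [14.96, 25.18, 5.11, 20.86, 10.82]  # first five AT values from the table above
scaled = zscore(at_sample)
print([round(z, 3) for z in scaled])
```

After scaling, the sample has mean 0 and standard deviation 1, which keeps features with large raw ranges (such as `AP`, around 1000) from dominating the network's weight updates.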
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "Ifec-81PvBD7", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"X = normalize(data.drop([\"TG\"], axis=1))\n", | |
"Y = data['TG']" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "-alTGbU8wyaJ", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"Lets look now that we've done some normalization." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "AJa27SvLv_R8", | |
"colab_type": "code", | |
"outputId": "d1267b78-fb13-4799-f3ff-00f26dec7343", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 197 | |
} | |
}, | |
"source": [ | |
"X.head()" | |
], | |
"execution_count": 0, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>ID</th>\n", | |
" <th>AT</th>\n", | |
" <th>V</th>\n", | |
" <th>AP</th>\n", | |
" <th>RH</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>-1.731870</td>\n", | |
" <td>-0.621065</td>\n", | |
" <td>-0.985689</td>\n", | |
" <td>1.809755</td>\n", | |
" <td>-0.015536</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>-1.731508</td>\n", | |
" <td>0.751781</td>\n", | |
" <td>0.686880</td>\n", | |
" <td>1.129478</td>\n", | |
" <td>-0.986488</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>-1.731146</td>\n", | |
" <td>-1.944209</td>\n", | |
" <td>-1.171880</td>\n", | |
" <td>-0.200691</td>\n", | |
" <td>1.291699</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>-1.730784</td>\n", | |
" <td>0.171478</td>\n", | |
" <td>0.241914</td>\n", | |
" <td>-0.524794</td>\n", | |
" <td>0.223584</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>-1.730422</td>\n", | |
" <td>-1.177188</td>\n", | |
" <td>-1.321781</td>\n", | |
" <td>-0.695285</td>\n", | |
" <td>1.600419</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" ID AT V AP RH\n", | |
"0 -1.731870 -0.621065 -0.985689 1.809755 -0.015536\n", | |
"1 -1.731508 0.751781 0.686880 1.129478 -0.986488\n", | |
"2 -1.731146 -1.944209 -1.171880 -0.200691 1.291699\n", | |
"3 -1.730784 0.171478 0.241914 -0.524794 0.223584\n", | |
"4 -1.730422 -1.177188 -1.321781 -0.695285 1.600419" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 6 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "2140HTd0MOfv", | |
"colab_type": "code", | |
"outputId": "3b0aef78-c833-4fbb-c6f7-dec897829d78", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 123 | |
} | |
}, | |
"source": [ | |
"Y.head()" | |
], | |
"execution_count": 0, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"0 1\n", | |
"1 0\n", | |
"2 1\n", | |
"3 0\n", | |
"4 1\n", | |
"Name: TG, dtype: int64" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 7 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "sVr5COee0lml", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"We will use sklearn tooling to create randomly sampled train and test sets." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "4sTcOqmbw4Uk", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"from sklearn.model_selection import train_test_split\n", | |
"X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.3, random_state=42)" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "Up-s5U8u1ZZT", | |
"colab_type": "code", | |
"outputId": "27611843-eb8d-444f-b3a1-b856cab58f78", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 197 | |
} | |
}, | |
"source": [ | |
"X_train.head()" | |
], | |
"execution_count": 0, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>ID</th>\n", | |
" <th>AT</th>\n", | |
" <th>V</th>\n", | |
" <th>AP</th>\n", | |
" <th>RH</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>8759</th>\n", | |
" <td>1.439333</td>\n", | |
" <td>-1.714505</td>\n", | |
" <td>-1.043282</td>\n", | |
" <td>1.480589</td>\n", | |
" <td>0.980912</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1434</th>\n", | |
" <td>-1.212689</td>\n", | |
" <td>0.609392</td>\n", | |
" <td>0.347633</td>\n", | |
" <td>-0.303661</td>\n", | |
" <td>-0.994757</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>7320</th>\n", | |
" <td>0.918342</td>\n", | |
" <td>0.158045</td>\n", | |
" <td>0.377613</td>\n", | |
" <td>-0.141610</td>\n", | |
" <td>0.826552</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2579</th>\n", | |
" <td>-0.798141</td>\n", | |
" <td>-1.341070</td>\n", | |
" <td>-0.980955</td>\n", | |
" <td>3.305352</td>\n", | |
" <td>-0.052748</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>9142</th>\n", | |
" <td>1.577998</td>\n", | |
" <td>-1.804506</td>\n", | |
" <td>-1.063795</td>\n", | |
" <td>1.531230</td>\n", | |
" <td>1.045688</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" ID AT V AP RH\n", | |
"8759 1.439333 -1.714505 -1.043282 1.480589 0.980912\n", | |
"1434 -1.212689 0.609392 0.347633 -0.303661 -0.994757\n", | |
"7320 0.918342 0.158045 0.377613 -0.141610 0.826552\n", | |
"2579 -0.798141 -1.341070 -0.980955 3.305352 -0.052748\n", | |
"9142 1.577998 -1.804506 -1.063795 1.531230 1.045688" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 9 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "-vvJGL-2OT0z", | |
"colab_type": "code", | |
"outputId": "f9a331be-67bb-4c80-b388-bcf22fa965d0", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 123 | |
} | |
}, | |
"source": [ | |
"Y_train.head()" | |
], | |
"execution_count": 0, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"8759 1\n", | |
"1434 1\n", | |
"7320 0\n", | |
"2579 1\n", | |
"9142 1\n", | |
"Name: TG, dtype: int64" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 10 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "aGU4Yit10tB7", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"### MLP Classifier\n", | |
"MLP stands for Multi-layer Perceptron classifier and it optimizes the log-loss function using either LBFGS or stochastic gradient descent. [source](https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "az0VbBke1Qoc", | |
"colab_type": "code", | |
"outputId": "7dfbdd78-c6a8-4f5d-b59a-319735c4f9bb", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 315 | |
} | |
}, | |
"source": [ | |
"from sklearn.neural_network import MLPClassifier\n", | |
"\n", | |
"hls = (5,10)\n", | |
"regpenalty = 0.001\n", | |
"clf = MLPClassifier(solver='adam', alpha=regpenalty, hidden_layer_sizes=hls, early_stopping=True, validation_fraction=0.42)\n", | |
"clf.fit(X_train, Y_train)\n", | |
"\n", | |
"# Make predictions\n", | |
"predY = clf.predict(X_test)\n", | |
"print(\"\\n\\rANN: %d mislabeled out of %d points\" % ((Y_test != predY).sum(), X_test.shape[0]))\n", | |
"trainingLoss = np.asarray(clf.loss_curve_)\n", | |
"validationLoss = np.sqrt(1- np.asarray(clf.validation_scores_))\n", | |
"factor = trainingLoss[1]/validationLoss[1]\n", | |
"validationLoss = validationLoss*factor\n", | |
"\n", | |
"# Plot ROC\n", | |
"import matplotlib.pyplot as plt\n", | |
"%matplotlib inline\n", | |
"# create figure and axis objects with subplots()\n", | |
"xlabel= \"epochs\"\n", | |
"fig,ax= plt.subplots()\n", | |
"ax.plot(trainingLoss, color=\"blue\")\n", | |
"ax.set_xlabel(xlabel,fontsize=10, color=\"blue\")\n", | |
"ax.set_ylabel(\"training loss\",color=\"blue\",fontsize=10)\n", | |
"ax2=ax.twinx()\n", | |
"ax2.plot(validationLoss,color=\"red\")\n", | |
"ax2.set_ylabel(\"validation score\",color=\"red\",fontsize=10)\n", | |
"plt.show()" | |
], | |
"execution_count": 0, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"text": [ | |
"\n", | |
"\rANN: 157 mislabeled out of 2871 points\n" | |
], | |
"name": "stdout" | |
}, | |
{ | |
"output_type": "display_data", | |
"data": { | |
"image/png": "\n", | |
"text/plain": [ | |
"<Figure size 432x288 with 2 Axes>" | |
] | |
}, | |
"metadata": { | |
"tags": [], | |
"needs_background": "light" | |
} | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "YXvqhyrjwdnr", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"### Exploring Various MLP Architectures" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "p739fuWWuvRT", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"from sklearn.neural_network import MLPClassifier\n", | |
"from sklearn.metrics import roc_auc_score\n", | |
"import pandas as pd\n", | |
"import numpy as np\n", | |
"import random\n", | |
"\n", | |
"# Define our trial space for hidden layer architectures\n", | |
"from itertools import product\n", | |
"ones = list(range(1,21))\n", | |
"twos = list(product(range(1,21),repeat=2))\n", | |
"threes = list(product(range(1,21),repeat=3))\n", | |
"combos = ones + twos + threes\n", | |
"\n", | |
"# Define list of activations\n", | |
"activations = ('relu', 'identity', 'tanh')\n", | |
"\n", | |
"# Store our results in a DataFrame for later\n", | |
"\n", | |
"results = pd.DataFrame(data=None, columns=['hidden layers',\n", | |
" 'activation', 'error_rate', \n", | |
" 'auroc']) # We'll want to store all our results to analyze later\n", | |
"\n", | |
"# Define our lists of possible things to create our architecture\n", | |
"for i in combos:\n", | |
" for a in activations:\n", | |
" hidden_layers = i\n", | |
" solvers = 'adam'\n", | |
" activate = a\n", | |
" alphas = 0.001\n", | |
" early_stoppings = True # We always want it to stop early if we're not improving to save computation\n", | |
"\n", | |
" clf = MLPClassifier(solver=solvers, \n", | |
" activation=activate,\n", | |
" alpha=alphas, \n", | |
" hidden_layer_sizes=hidden_layers, \n", | |
" early_stopping=early_stoppings, \n", | |
" validation_fraction=0.42)\n", | |
" \n", | |
" # Train the classifier\n", | |
" clf.fit(X_train, Y_train)\n", | |
"\n", | |
" # Predict against the test data\n", | |
" predictions = clf.predict(X_test)\n", | |
" actual = Y_test\n", | |
"\n", | |
" # Determine Basic Misclassification Rate\n", | |
" error_rate = ((predictions == actual).value_counts()[False]/actual.count())\n", | |
"\n", | |
" # Determine the AUROC\n", | |
" auroc = roc_auc_score(actual, predictions)\n", | |
"\n", | |
" # Provide progress\n", | |
" print(\"Running \", str(i), \" hidden layers for the solver: \", str(a))\n", | |
"\n", | |
" # Store all results for later\n", | |
" results = results.append({'hidden layers': hidden_layers, \n", | |
" 'activation': activate, \n", | |
" 'error_rate': error_rate,\n", | |
" 'auroc': auroc\n", | |
" }, ignore_index=True)\n", | |
" \n", | |
" # We'll probably clear the output when all is said and done since it would be massive" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
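One detail of the loop above is worth illustrating: AUROC is computed there from hard 0/1 predictions, which collapses the ROC curve to a single operating point. The metric is more commonly computed from continuous scores, for example `clf.predict_proba(X_test)[:, 1]`. The plain-Python sketch below computes AUROC as the probability that a random positive is ranked above a random negative, and shows how thresholding changes the result. The helper `auroc` and the toy labels and scores are assumptions made up for illustration, not values from this experiment.

```python
def auroc(y_true, scores):
    """AUROC as P(score_pos > score_neg), counting ties as 1/2."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy data (made up for illustration)
y_true = [0, 0, 1, 1, 0, 1]
proba = [0.10, 0.40, 0.35, 0.80, 0.55, 0.90]
labels = [1 if p >= 0.5 else 0 for p in proba]  # hard predictions at a 0.5 threshold

# Misclassification rate: fraction of hard predictions that are wrong
error_rate = sum(l != y for l, y in zip(labels, y_true)) / len(y_true)
print(auroc(y_true, proba), auroc(y_true, labels), error_rate)
```

With hard labels the AUROC reduces to the average of the true-positive and true-negative rates, so it tracks the misclassification rate closely. That is consistent with how similar the two columns in the results table look.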
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "EPSGDHvzw732", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"results.head()" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "KidCgDS80-SL", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"We could run this overnight and lose connection to our run-time, so lets make online and local copies of the serialized results dataframe so we can run analysis on it later." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "2jikitG67b2Y", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"We can use https://file.io to save a copy to the cloud. This is a private/public framework but we're not particularly concerned with locking this data down since there's nothing inherently bad about losing it." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "Zvw224UanEaI", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"results.to_pickle('/content/results.pickle')\n", | |
"# Lets not lose these results in case we need to run this exploration overnight\n", | |
"push_response = !curl -F \"[email protected]\" https://file.io\n", | |
"print(push_response)" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "QEUYhHvb7n7Z", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"Colaboratory also has tooling integrated with Chrome to simply download right to your local computer! We'll do this as well to ensure we keep a local copy separate from this run-time." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "P6Rnpsr802m_", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"from google.colab import files\n", | |
"files.download('/content/results.pickle') " | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "pQznvKvbnD9y", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"### Analysis & Discussion" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "QUAq2X9S7H_2", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"As I said, exploring the trial space of all possible products of 1-3 hidden layers with 1-20 nodes per layer for three possible activations took a really long time (I wish I timed it but I'm not doing that again).\n", | |
"\n", | |
"We re-load our local copy into the runtime and read it for analysis." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "Ru7najiy7Hf9", | |
"colab_type": "code", | |
"outputId": "5e147759-6895-4923-ac9c-67c97aeafd2d", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 197 | |
} | |
}, | |
"source": [ | |
"results = pd.read_pickle('/content/results.pickle')\n", | |
"results.head()" | |
], | |
"execution_count": 0, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>hidden layers</th>\n", | |
" <th>activation</th>\n", | |
" <th>error_rate</th>\n", | |
" <th>auroc</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>1</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.475096</td>\n", | |
" <td>0.500000</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>1</td>\n", | |
" <td>identity</td>\n", | |
" <td>0.058865</td>\n", | |
" <td>0.941772</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>1</td>\n", | |
" <td>tanh</td>\n", | |
" <td>0.072449</td>\n", | |
" <td>0.927858</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>2</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.068617</td>\n", | |
" <td>0.932238</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>2</td>\n", | |
" <td>identity</td>\n", | |
" <td>0.061303</td>\n", | |
" <td>0.939623</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" hidden layers activation error_rate auroc\n", | |
"0 1 relu 0.475096 0.500000\n", | |
"1 1 identity 0.058865 0.941772\n", | |
"2 1 tanh 0.072449 0.927858\n", | |
"3 2 relu 0.068617 0.932238\n", | |
"4 2 identity 0.061303 0.939623" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 13 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "RzQK1S3I8eed", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"As we can see, we ran a TON of trials!" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "cx5-bJH18X4a", | |
"colab_type": "code", | |
"outputId": "9846ec7f-699f-48a3-b03b-8db48370a810", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 34 | |
} | |
}, | |
"source": [ | |
"results.shape" | |
], | |
"execution_count": 0, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"(25260, 4)" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 15 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "g7pavtlF-x69", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"#### AUROC Scoring" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "uO_XvCqr-1YT", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"Lets take a look at some of our best trials when using the AUROC score as the criteria (closer to 1 is better)." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "SzbKprrs8aP1", | |
"colab_type": "code", | |
"outputId": "3d905d4f-4c41-4c33-8852-fcd2b205d100", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 347 | |
} | |
}, | |
"source": [ | |
"results.sort_values('auroc',ascending=False).head(10)" | |
], | |
"execution_count": 0, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>hidden layers</th>\n", | |
" <th>activation</th>\n", | |
" <th>error_rate</th>\n", | |
" <th>auroc</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>22293</th>\n", | |
" <td>(18, 11, 12)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.047022</td>\n", | |
" <td>0.953470</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>20868</th>\n", | |
" <td>(17, 7, 17)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.046674</td>\n", | |
" <td>0.953280</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>22536</th>\n", | |
" <td>(18, 15, 13)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.047022</td>\n", | |
" <td>0.953087</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>24792</th>\n", | |
" <td>(20, 13, 5)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.048067</td>\n", | |
" <td>0.952440</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>20229</th>\n", | |
" <td>(16, 17, 4)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.048067</td>\n", | |
" <td>0.952370</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>10080</th>\n", | |
" <td>(8, 8, 1)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.048415</td>\n", | |
" <td>0.952177</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>13214</th>\n", | |
" <td>(10, 20, 5)</td>\n", | |
" <td>tanh</td>\n", | |
" <td>0.048067</td>\n", | |
" <td>0.952161</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>24873</th>\n", | |
" <td>(20, 14, 12)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.048415</td>\n", | |
" <td>0.952038</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>23208</th>\n", | |
" <td>(19, 6, 17)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.048415</td>\n", | |
" <td>0.951969</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>24309</th>\n", | |
" <td>(20, 5, 4)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.048415</td>\n", | |
" <td>0.951899</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" hidden layers activation error_rate auroc\n", | |
"22293 (18, 11, 12) relu 0.047022 0.953470\n", | |
"20868 (17, 7, 17) relu 0.046674 0.953280\n", | |
"22536 (18, 15, 13) relu 0.047022 0.953087\n", | |
"24792 (20, 13, 5) relu 0.048067 0.952440\n", | |
"20229 (16, 17, 4) relu 0.048067 0.952370\n", | |
"10080 (8, 8, 1) relu 0.048415 0.952177\n", | |
"13214 (10, 20, 5) tanh 0.048067 0.952161\n", | |
"24873 (20, 14, 12) relu 0.048415 0.952038\n", | |
"23208 (19, 6, 17) relu 0.048415 0.951969\n", | |
"24309 (20, 5, 4) relu 0.048415 0.951899" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 19 | |
} | |
] | |
}, | |
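To check whether the two measures agree, the rows can be re-ranked by misclassification rate. The sketch below does this in plain Python for only the ten rows shown above, with values copied from that table. The name `top_by_auroc` and the tuple layout are illustrative, and a full comparison would use the complete `results` DataFrame instead.

```python
# (hidden layers, activation, error_rate, auroc) copied from the table above
top_by_auroc = [
    ((18, 11, 12), "relu", 0.047022, 0.953470),
    ((17, 7, 17), "relu", 0.046674, 0.953280),
    ((18, 15, 13), "relu", 0.047022, 0.953087),
    ((20, 13, 5), "relu", 0.048067, 0.952440),
    ((16, 17, 4), "relu", 0.048067, 0.952370),
    ((8, 8, 1), "relu", 0.048415, 0.952177),
    ((10, 20, 5), "tanh", 0.048067, 0.952161),
    ((20, 14, 12), "relu", 0.048415, 0.952038),
    ((19, 6, 17), "relu", 0.048415, 0.951969),
    ((20, 5, 4), "relu", 0.048415, 0.951899),
]

# Higher AUROC is better; lower error rate is better
best_auroc = max(top_by_auroc, key=lambda row: row[3])
best_error = min(top_by_auroc, key=lambda row: row[2])

print("best by AUROC:     ", best_auroc[0])
print("best by error rate:", best_error[0])
print("same model?", best_auroc[0] == best_error[0])
```

Within these ten rows the two measures pick different architectures, (18, 11, 12) by AUROC and (17, 7, 17) by error rate, although the gap between them is small.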
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "wI8Xi7CSF7ss", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"Just for comparison, lets look at the worst architectures for AUROC score as well (farther from 1 is worse)." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "0glTRptj8s5_", | |
"colab_type": "code", | |
"outputId": "67560c81-42a3-4ba1-bee0-53d513c15faa", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 347 | |
} | |
}, | |
"source": [ | |
"results.sort_values('auroc',ascending=True).head(10)" | |
], | |
"execution_count": 0, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>hidden layers</th>\n", | |
" <th>activation</th>\n", | |
" <th>error_rate</th>\n", | |
" <th>auroc</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>10890</th>\n", | |
" <td>(9, 1, 11)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.475444</td>\n", | |
" <td>0.499668</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>20496</th>\n", | |
" <td>(17, 1, 13)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.475444</td>\n", | |
" <td>0.499668</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>16950</th>\n", | |
" <td>(14, 2, 11)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.475444</td>\n", | |
" <td>0.499668</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>10881</th>\n", | |
" <td>(9, 1, 8)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.475444</td>\n", | |
" <td>0.499668</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>11823</th>\n", | |
" <td>(9, 17, 2)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.524904</td>\n", | |
" <td>0.500000</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>18903</th>\n", | |
" <td>(15, 15, 2)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.524904</td>\n", | |
" <td>0.500000</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>780</th>\n", | |
" <td>(13, 1)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.475096</td>\n", | |
" <td>0.500000</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>24195</th>\n", | |
" <td>(20, 3, 6)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.524904</td>\n", | |
" <td>0.500000</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>1</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.475096</td>\n", | |
" <td>0.500000</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>7746</th>\n", | |
" <td>(6, 9, 3)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.524904</td>\n", | |
" <td>0.500000</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" hidden layers activation error_rate auroc\n", | |
"10890 (9, 1, 11) relu 0.475444 0.499668\n", | |
"20496 (17, 1, 13) relu 0.475444 0.499668\n", | |
"16950 (14, 2, 11) relu 0.475444 0.499668\n", | |
"10881 (9, 1, 8) relu 0.475444 0.499668\n", | |
"11823 (9, 17, 2) relu 0.524904 0.500000\n", | |
"18903 (15, 15, 2) relu 0.524904 0.500000\n", | |
"780 (13, 1) relu 0.475096 0.500000\n", | |
"24195 (20, 3, 6) relu 0.524904 0.500000\n", | |
"0 1 relu 0.475096 0.500000\n", | |
"7746 (6, 9, 3) relu 0.524904 0.500000" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 20 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "8Hbp4eJvKa-b", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"Looking only at the tail ends of this sorted distribution, it's hard to identify distinct patterns that would let us shrink the trial space in a future run. The top AUROC belongs to a `relu` network with `(18,11,12)` hidden layers, but much simpler architectures in the same list score almost as well. So although `(18,11,12)` tops the list, I would lean towards the `(8,8,1)` architecture: it is simpler, should be easier and faster to train, and is less likely to overfit when generalizing to future datasets. This would, of course, require further testing and analysis.\n", | |
"\n", | |
"It is interesting that `relu` is prevalent among the worst AUROC values as well, which suggests the other activations sit closer to the middle of the AUROC range. While some of the worst performers are simpler architectures with fewer layers, some relatively complex ones appear there too. This supports the idea that added complexity does not by itself give better performance." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "gEA_Uoc8Mxb2", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"#### Misclassification Rate Performance" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "oFVfr5fiM1cd", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"Let's look at performance under the simpler misclassification rate (lower is better)." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "bJMSJtdvJ7x1", | |
"colab_type": "code", | |
"outputId": "f76684d6-6489-4fb3-ea84-fa722bfc15f5", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 347 | |
} | |
}, | |
"source": [ | |
"results.sort_values('error_rate',ascending=True).head(10)" | |
], | |
"execution_count": 0, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>hidden layers</th>\n", | |
" <th>activation</th>\n", | |
" <th>error_rate</th>\n", | |
" <th>auroc</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>20868</th>\n", | |
" <td>(17, 7, 17)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.046674</td>\n", | |
" <td>0.953280</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>22293</th>\n", | |
" <td>(18, 11, 12)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.047022</td>\n", | |
" <td>0.953470</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>22536</th>\n", | |
" <td>(18, 15, 13)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.047022</td>\n", | |
" <td>0.953087</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>24792</th>\n", | |
" <td>(20, 13, 5)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.048067</td>\n", | |
" <td>0.952440</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>20229</th>\n", | |
" <td>(16, 17, 4)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.048067</td>\n", | |
" <td>0.952370</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>13214</th>\n", | |
" <td>(10, 20, 5)</td>\n", | |
" <td>tanh</td>\n", | |
" <td>0.048067</td>\n", | |
" <td>0.952161</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>24699</th>\n", | |
" <td>(20, 11, 14)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.048415</td>\n", | |
" <td>0.951760</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>23763</th>\n", | |
" <td>(19, 16, 2)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.048415</td>\n", | |
" <td>0.951899</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>24873</th>\n", | |
" <td>(20, 14, 12)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.048415</td>\n", | |
" <td>0.952038</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>23208</th>\n", | |
" <td>(19, 6, 17)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.048415</td>\n", | |
" <td>0.951969</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" hidden layers activation error_rate auroc\n", | |
"20868 (17, 7, 17) relu 0.046674 0.953280\n", | |
"22293 (18, 11, 12) relu 0.047022 0.953470\n", | |
"22536 (18, 15, 13) relu 0.047022 0.953087\n", | |
"24792 (20, 13, 5) relu 0.048067 0.952440\n", | |
"20229 (16, 17, 4) relu 0.048067 0.952370\n", | |
"13214 (10, 20, 5) tanh 0.048067 0.952161\n", | |
"24699 (20, 11, 14) relu 0.048415 0.951760\n", | |
"23763 (19, 16, 2) relu 0.048415 0.951899\n", | |
"24873 (20, 14, 12) relu 0.048415 0.952038\n", | |
"23208 (19, 6, 17) relu 0.048415 0.951969" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 22 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "BwbxVC6RNHvo", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"Again, let's also look at the worst performers." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "MdUVOPChM808", | |
"colab_type": "code", | |
"outputId": "d4bd2810-1c25-41e6-8a06-9dffbaa11438", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 347 | |
} | |
}, | |
"source": [ | |
"results.sort_values('error_rate',ascending=False).head(10)" | |
], | |
"execution_count": 0, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>hidden layers</th>\n", | |
" <th>activation</th>\n", | |
" <th>error_rate</th>\n", | |
" <th>auroc</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>3671</th>\n", | |
" <td>(3, 1, 4)</td>\n", | |
" <td>tanh</td>\n", | |
" <td>0.524904</td>\n", | |
" <td>0.5</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1140</th>\n", | |
" <td>(19, 1)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.524904</td>\n", | |
" <td>0.5</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>16680</th>\n", | |
" <td>(13, 18, 1)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.524904</td>\n", | |
" <td>0.5</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4802</th>\n", | |
" <td>(3, 20, 1)</td>\n", | |
" <td>tanh</td>\n", | |
" <td>0.524904</td>\n", | |
" <td>0.5</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4803</th>\n", | |
" <td>(3, 20, 2)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.524904</td>\n", | |
" <td>0.5</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>6486</th>\n", | |
" <td>(5, 8, 3)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.524904</td>\n", | |
" <td>0.5</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>22392</th>\n", | |
" <td>(18, 13, 5)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.524904</td>\n", | |
" <td>0.5</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>24426</th>\n", | |
" <td>(20, 7, 3)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.524904</td>\n", | |
" <td>0.5</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>22803</th>\n", | |
" <td>(18, 20, 2)</td>\n", | |
" <td>relu</td>\n", | |
" <td>0.524904</td>\n", | |
" <td>0.5</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>22802</th>\n", | |
" <td>(18, 20, 1)</td>\n", | |
" <td>tanh</td>\n", | |
" <td>0.524904</td>\n", | |
" <td>0.5</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" hidden layers activation error_rate auroc\n", | |
"3671 (3, 1, 4) tanh 0.524904 0.5\n", | |
"1140 (19, 1) relu 0.524904 0.5\n", | |
"16680 (13, 18, 1) relu 0.524904 0.5\n", | |
"4802 (3, 20, 1) tanh 0.524904 0.5\n", | |
"4803 (3, 20, 2) relu 0.524904 0.5\n", | |
"6486 (5, 8, 3) relu 0.524904 0.5\n", | |
"22392 (18, 13, 5) relu 0.524904 0.5\n", | |
"24426 (20, 7, 3) relu 0.524904 0.5\n", | |
"22803 (18, 20, 2) relu 0.524904 0.5\n", | |
"22802 (18, 20, 1) tanh 0.524904 0.5" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 23 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "vNL-ZNzkNNf0", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"While the two lists differ slightly, they tell much the same story: architectures with a poor AUROC also tend to have a high misclassification rate." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "05wMB8NQNbW9", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"### Conclusion\n", | |
"So what have we done here, apart from using hours and hours of Google's compute infrastructure? We've essentially brute-forced a trial space of artificial neural network architectures and assessed each one on two criteria, AUROC and misclassification rate. We learned that there is likely no simple pattern in the parameters that can be assumed in order to find the best model.\n", | |
"\n", | |
"That being said, if I were a production machine learning engineer, would I do this again to re-train my model? Probably not. I would more likely use some form of Bayesian optimization for hyperparameter tuning, or even a basic GridSearch over a much coarser grid, to reduce my trial space significantly, and accept that even if my architecture is not 100% optimal according to my objective criteria, it is likely good enough for its intended generalization purposes.\n", | |
"\n", | |
"Of course, something that performs well on a validation set might do worse on future data anyway as the domain changes. It's important to consider these external elements and not just rely on your models as they stand at one point in time." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "T7dxI9CYNMOX", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
} | |
] | |
} |
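The conclusion's point about shrinking the trial space can be illustrated with a small standard-library sketch. It is not the notebook's own code. It assumes the grid covered one to three hidden layers of 1 to 20 units each, with three activations; this is an inference from the row indices in the result tables, and `logistic` is only a guess for the activation that never appears in them. It also swaps in plain random search as a simpler stand-in for the Bayesian optimization mentioned above. The names `build_trial_space` and `sample_trials` are hypothetical.

```python
import itertools
import random

# Assumptions (not confirmed by the notebook excerpt): layer sizes 1..20,
# one to three hidden layers, and three activations. "logistic" is a guess
# for the activation that never shows up in the tables above.
LAYER_SIZES = range(1, 21)
MAX_DEPTH = 3
ACTIVATIONS = ("relu", "logistic", "tanh")


def build_trial_space():
    """Enumerate every (hidden_layers, activation) pair in the full grid."""
    space = []
    for depth in range(1, MAX_DEPTH + 1):
        for layers in itertools.product(LAYER_SIZES, repeat=depth):
            for activation in ACTIVATIONS:
                space.append((layers, activation))
    return space


def sample_trials(space, n_trials, seed=0):
    """Draw a random subset of the grid without replacement (random search)."""
    rng = random.Random(seed)
    return rng.sample(space, n_trials)


full_space = build_trial_space()
subset = sample_trials(full_space, n_trials=200)

print(len(full_space))   # size of the exhaustive grid: 25260
print(len(subset))       # configurations a random search would fit: 200
print(full_space[780])   # ((13, 1), 'relu'), matching row 780 in the tables
```

Under these assumptions, each tuple in `subset` could be passed to a model constructor in place of looping over all 25,260 configurations. Whether 200 trials is enough to land near the best scores reported above would need to be checked empirically.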