Created
April 8, 2018 19:32
PCA analysis on Swiss Bank Notes
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The file SwissBankNote_HW2_BINF5112.txt contains six variables measured on 200 old Swiss 1,000-franc bank notes. The first 100 notes are genuine and the next 100 are counterfeit. The six variables are: length of the bank note; height of the bank note, measured on the left; height of the bank note, measured on the right; distance of the inner frame to the lower border; distance of the inner frame to the upper border; and length of the diagonal." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Length</th>\n", | |
" <th>Left</th>\n", | |
" <th>Right</th>\n", | |
" <th>Bottom</th>\n", | |
" <th>Top</th>\n", | |
" <th>Diagonal</th>\n", | |
" <th>Y</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>214.8</td>\n", | |
" <td>131.0</td>\n", | |
" <td>131.1</td>\n", | |
" <td>9.0</td>\n", | |
" <td>9.7</td>\n", | |
" <td>141.0</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>214.6</td>\n", | |
" <td>129.7</td>\n", | |
" <td>129.7</td>\n", | |
" <td>8.1</td>\n", | |
" <td>9.5</td>\n", | |
" <td>141.7</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>214.8</td>\n", | |
" <td>129.7</td>\n", | |
" <td>129.7</td>\n", | |
" <td>8.7</td>\n", | |
" <td>9.6</td>\n", | |
" <td>142.2</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>214.8</td>\n", | |
" <td>129.7</td>\n", | |
" <td>129.6</td>\n", | |
" <td>7.5</td>\n", | |
" <td>10.4</td>\n", | |
" <td>142.0</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>5</th>\n", | |
" <td>215.0</td>\n", | |
" <td>129.6</td>\n", | |
" <td>129.7</td>\n", | |
" <td>10.4</td>\n", | |
" <td>7.7</td>\n", | |
" <td>141.8</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Length Left Right Bottom Top Diagonal Y\n", | |
"1 214.8 131.0 131.1 9.0 9.7 141.0 0\n", | |
"2 214.6 129.7 129.7 8.1 9.5 141.7 0\n", | |
"3 214.8 129.7 129.7 8.7 9.6 142.2 0\n", | |
"4 214.8 129.7 129.6 7.5 10.4 142.0 0\n", | |
"5 215.0 129.6 129.7 10.4 7.7 141.8 0" | |
] | |
}, | |
"execution_count": 1, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"import numpy as np\n", | |
"import pandas as pd\n", | |
"\n", | |
"# Read whitespace-delimited data (raw string avoids an invalid-escape warning)\n", | |
"df = pd.read_csv(\"SwissBankNote_HW2_BINF5112.txt\", sep=r\"\\s+\")\n", | |
"\n", | |
"df.head()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Discuss which matrix (the sample covariance matrix or the sample correlation matrix) seems more appropriate for PCA of this example." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The correlation matrix is usually more appropriate when the variables are measured in different units or have very different variances, because using it is equivalent to standardizing each variable first. In this case, the units are the same, so it may be okay to use the covariance matrix, provided the variances turn out to be comparable." | |
] | |
}, | |
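The link between the two matrices can be checked numerically. The following is a minimal sketch on synthetic data, not on the bank-note file: the random seed, the column scales, and the helper name `standardize` are assumptions made for illustration. It shows that the correlation matrix is the covariance matrix of the z-scored columns.

```python
import numpy as np

def standardize(x):
    """Z-score each column: subtract the mean, divide by the sample std."""
    return (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)

# Synthetic stand-in for a feature matrix with very different column scales
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3)) * np.array([0.4, 1.5, 5.0])

cov_of_z = np.cov(standardize(x), rowvar=False)  # covariance of z-scores
corr = np.corrcoef(x, rowvar=False)              # correlation of raw data

# The two matrices agree up to floating-point error
print(np.allclose(cov_of_z, corr))
```

Choosing the correlation matrix for PCA is therefore the same as standardizing every variable before a covariance-based PCA.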
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Length</th>\n", | |
" <th>Left</th>\n", | |
" <th>Right</th>\n", | |
" <th>Bottom</th>\n", | |
" <th>Top</th>\n", | |
" <th>Diagonal</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>Length</th>\n", | |
" <td>0.141793</td>\n", | |
" <td>0.031443</td>\n", | |
" <td>0.023091</td>\n", | |
" <td>-0.103246</td>\n", | |
" <td>-0.177737</td>\n", | |
" <td>0.084306</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>Left</th>\n", | |
" <td>0.031443</td>\n", | |
" <td>0.130339</td>\n", | |
" <td>0.108427</td>\n", | |
" <td>0.215803</td>\n", | |
" <td>-0.024207</td>\n", | |
" <td>-0.209342</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>Right</th>\n", | |
" <td>0.023091</td>\n", | |
" <td>0.108427</td>\n", | |
" <td>0.163274</td>\n", | |
" <td>0.284132</td>\n", | |
" <td>0.067082</td>\n", | |
" <td>-0.240470</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>Bottom</th>\n", | |
" <td>-0.103246</td>\n", | |
" <td>0.215803</td>\n", | |
" <td>0.284132</td>\n", | |
" <td>2.086878</td>\n", | |
" <td>0.117303</td>\n", | |
" <td>-1.036996</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>Top</th>\n", | |
" <td>-0.177737</td>\n", | |
" <td>-0.024207</td>\n", | |
" <td>0.067082</td>\n", | |
" <td>0.117303</td>\n", | |
" <td>30.915678</td>\n", | |
" <td>-0.100771</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>Diagonal</th>\n", | |
" <td>0.084306</td>\n", | |
" <td>-0.209342</td>\n", | |
" <td>-0.240470</td>\n", | |
" <td>-1.036996</td>\n", | |
" <td>-0.100771</td>\n", | |
" <td>1.327716</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Length Left Right Bottom Top Diagonal\n", | |
"Length 0.141793 0.031443 0.023091 -0.103246 -0.177737 0.084306\n", | |
"Left 0.031443 0.130339 0.108427 0.215803 -0.024207 -0.209342\n", | |
"Right 0.023091 0.108427 0.163274 0.284132 0.067082 -0.240470\n", | |
"Bottom -0.103246 0.215803 0.284132 2.086878 0.117303 -1.036996\n", | |
"Top -0.177737 -0.024207 0.067082 0.117303 30.915678 -0.100771\n", | |
"Diagonal 0.084306 -0.209342 -0.240470 -1.036996 -0.100771 1.327716" | |
] | |
}, | |
"execution_count": 2, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"features = df.iloc[:, 0:6]\n", | |
"\n", | |
"# Display covariance matrix\n", | |
"features.cov()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "(base64-encoded PNG omitted: heat map of the covariance matrix)", | |
"text/plain": [ | |
"<matplotlib.figure.Figure at 0x7f20c7ca22e8>" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"import matplotlib.pyplot as plt\n", | |
"\n", | |
"# Display covariance heatmap\n", | |
"plt.matshow(features.cov())\n", | |
"plt.show()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"In this display, we can easily see that the variance of \"Top\" (30.9) is far larger than every other entry in the covariance matrix, so a covariance-based PCA would be dominated by that single variable." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Length</th>\n", | |
" <th>Left</th>\n", | |
" <th>Right</th>\n", | |
" <th>Bottom</th>\n", | |
" <th>Top</th>\n", | |
" <th>Diagonal</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>Length</th>\n", | |
" <td>1.000000</td>\n", | |
" <td>0.231293</td>\n", | |
" <td>0.151763</td>\n", | |
" <td>-0.189801</td>\n", | |
" <td>-0.084891</td>\n", | |
" <td>0.194301</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>Left</th>\n", | |
" <td>0.231293</td>\n", | |
" <td>1.000000</td>\n", | |
" <td>0.743263</td>\n", | |
" <td>0.413781</td>\n", | |
" <td>-0.012059</td>\n", | |
" <td>-0.503229</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>Right</th>\n", | |
" <td>0.151763</td>\n", | |
" <td>0.743263</td>\n", | |
" <td>1.000000</td>\n", | |
" <td>0.486758</td>\n", | |
" <td>0.029858</td>\n", | |
" <td>-0.516476</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>Bottom</th>\n", | |
" <td>-0.189801</td>\n", | |
" <td>0.413781</td>\n", | |
" <td>0.486758</td>\n", | |
" <td>1.000000</td>\n", | |
" <td>0.014604</td>\n", | |
" <td>-0.622983</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>Top</th>\n", | |
" <td>-0.084891</td>\n", | |
" <td>-0.012059</td>\n", | |
" <td>0.029858</td>\n", | |
" <td>0.014604</td>\n", | |
" <td>1.000000</td>\n", | |
" <td>-0.015729</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>Diagonal</th>\n", | |
" <td>0.194301</td>\n", | |
" <td>-0.503229</td>\n", | |
" <td>-0.516476</td>\n", | |
" <td>-0.622983</td>\n", | |
" <td>-0.015729</td>\n", | |
" <td>1.000000</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Length Left Right Bottom Top Diagonal\n", | |
"Length 1.000000 0.231293 0.151763 -0.189801 -0.084891 0.194301\n", | |
"Left 0.231293 1.000000 0.743263 0.413781 -0.012059 -0.503229\n", | |
"Right 0.151763 0.743263 1.000000 0.486758 0.029858 -0.516476\n", | |
"Bottom -0.189801 0.413781 0.486758 1.000000 0.014604 -0.622983\n", | |
"Top -0.084891 -0.012059 0.029858 0.014604 1.000000 -0.015729\n", | |
"Diagonal 0.194301 -0.503229 -0.516476 -0.622983 -0.015729 1.000000" | |
] | |
}, | |
"execution_count": 5, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"# Display correlation matrix\n", | |
"df.iloc[:, 0:6].corr()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "(base64-encoded PNG omitted: heat map of the correlation matrix)", | |
"text/plain": [ | |
"<matplotlib.figure.Figure at 0x7f20c7c2bdd8>" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"# Display correlation heatmap\n", | |
"plt.matshow(features.corr())\n", | |
"plt.show()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"In this case, we can see that the correlation values are much more evenly spread between the minimum and the maximum. Hence, we can conclude that for this data the correlation matrix is probably the better choice for PCA." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Consulting the sample correlation matrix, discuss the usefulness of PCA for dimension reduction." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"PCA re-expresses the data in a new set of coordinate axes given by the eigenvectors of the covariance or correlation matrix. Because these axes are orthogonal and ordered by the variance they capture, we can reduce the dimension by dropping the axes that contribute least to the total variance.\n", | |
"\n", | |
"In the correlation matrix and heat map, we can see that several variables are clearly related. \"Left\" and \"Right\" are strongly correlated (0.74), and \"Diagonal\" is moderately negatively correlated with \"Left\", \"Right\", and \"Bottom\" (-0.50 to -0.62). This shared variance is what PCA can compress into fewer components. \"Top\", by contrast, is nearly uncorrelated with every other variable (all |r| < 0.09), so it will need a component largely to itself." | |
] | |
}, | |
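To make the idea concrete, here is a small sketch restricted to just two variables, "Left" and "Right", using their correlation of 0.743263 from the matrix above. Reducing the problem to a 2x2 matrix is an illustrative simplification, not part of the original analysis.

```python
import numpy as np

r = 0.743263  # Left-Right correlation taken from the correlation matrix
corr = np.array([[1.0, r],
                 [r, 1.0]])

# eigh returns eigenvalues in ascending order for symmetric matrices
evals, evecs = np.linalg.eigh(corr)
evals = evals[::-1]                 # largest first

# Share of the total variance carried by each component
explained = evals / evals.sum()
print(explained)
```

For a 2x2 correlation matrix the eigenvalues are `1 + r` and `1 - r`, so the stronger the correlation, the more of the variance a single component captures and the less is lost by dropping the second one.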
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Carry out a PCA with 200 bank notes. Choose the number of PCs you want to keep based on the cumulative proportion (of your choice) or the scree diagram. Is PCA effective for dimensional reduction?" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"PCA can be very effective for dimensionality reduction, but the result depends on how the data are normalized.\n", | |
"\n", | |
"Here we define a function for calculating PCA:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import sys\n", | |
"\n", | |
"# Print full arrays without truncation (newer NumPy versions reject np.nan here)\n", | |
"np.set_printoptions(threshold=sys.maxsize)\n", | |
"\n", | |
"def PCA(data, matrix_type, threshold):\n", | |
"    \"\"\"Function to compute PCA.\n", | |
"    Supports both covariance and correlation matrices.\"\"\"\n", | |
"    # Center data on axis\n", | |
"    data = data - np.mean(data, axis=0)\n", | |
"    if matrix_type == \"covariance\":\n", | |
"        # Compute covariance matrix\n", | |
"        mat = np.cov(data, rowvar=False)\n", | |
"    elif matrix_type == \"correlation\":\n", | |
"        # Compute correlation matrix\n", | |
"        mat = np.corrcoef(data, rowvar=False)\n", | |
"    else:\n", | |
"        raise ValueError(\"matrix_type must be 'covariance' or 'correlation'\")\n", | |
"    evals, evecs = np.linalg.eigh(mat)\n", | |
"    # Sort by decreasing magnitude\n", | |
"    idx = np.argsort(evals)[::-1]\n", | |
"    evals = evals[idx]\n", | |
"    evecs = evecs[:, idx]\n", | |
"    # Find the variance retained at each level\n", | |
"    variance_retained = np.cumsum(evals) / np.sum(evals)\n", | |
"    print(\"Cumulative variance retained:\", variance_retained)\n", | |
"    print(\"_\" * 70)\n", | |
"    print(\"All components:\")\n", | |
"    print(\"eigenvals:\", evals)\n", | |
"    print(\"eigenvecs:\")\n", | |
"    print(evecs)\n", | |
"    # Feature selection: keep the first components that reach the threshold\n", | |
"    index = np.argmax(variance_retained >= threshold)\n", | |
"    evals = evals[:index + 1]\n", | |
"    evecs = evecs[:, :index + 1]\n", | |
"    # Project the centered data onto the retained components (not returned)\n", | |
"    reduced_data = np.dot(evecs.T, data.T).T\n", | |
"    print(\"_\" * 70)\n", | |
"    print(\"Top components with at least {0:.0f}% of the variance:\".format(threshold * 100))\n", | |
"    print(\"eigenvals:\", evals)\n", | |
"    print(\"eigenvecs:\")\n", | |
"    print(evecs)\n", | |
"    # Scale each eigenvector by its eigenvalue to get the transformation matrix\n", | |
"    coefficients = np.dot(evecs, np.diag(evals))\n", | |
"    return coefficients" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 8, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Cumulative variance retained: [ 0.44112039 0.65656584 0.82062644 0.89786062 0.95929831 1. ]\n", | |
"______________________________________________________________________\n", | |
"All components:\n", | |
"eigenvals: [ 2.64672233 1.29267268 0.98436364 0.46340507 0.36862616 0.24421012]\n", | |
"eigenvecs:\n", | |
"[[-0.00608924 0.79765636 0.16172325 0.54422009 0.1865996 0.080981 ]\n", | |
" [-0.50728436 0.31018807 0.08184041 -0.3802684 0.0009923 -0.70366401]\n", | |
" [-0.5244321 0.21444822 0.10525394 -0.33628115 -0.33334729 0.66610743]\n", | |
" [-0.46980657 -0.29743752 -0.12458421 0.65338254 -0.47176944 -0.16067461]\n", | |
" [-0.01439042 -0.25131735 0.9639207 0.07506323 0.03200387 -0.02882109]\n", | |
" [ 0.49666001 0.2644053 0.10679422 -0.11658551 -0.79402049 -0.16719134]]\n", | |
"______________________________________________________________________\n", | |
"Top components with at least 90% of the variance:\n", | |
"eigenvals: [ 2.64672233 1.29267268 0.98436364 0.46340507 0.36862616]\n", | |
"eigenvecs:\n", | |
"[[-0.00608924 0.79765636 0.16172325 0.54422009 0.1865996 ]\n", | |
" [-0.50728436 0.31018807 0.08184041 -0.3802684 0.0009923 ]\n", | |
" [-0.5244321 0.21444822 0.10525394 -0.33628115 -0.33334729]\n", | |
" [-0.46980657 -0.29743752 -0.12458421 0.65338254 -0.47176944]\n", | |
" [-0.01439042 -0.25131735 0.9639207 0.07506323 0.03200387]\n", | |
" [ 0.49666001 0.2644053 0.10679422 -0.11658551 -0.79402049]]\n" | |
] | |
} | |
], | |
"source": [ | |
"# Calculate PCA with a cumulative proportion of 0.90\n", | |
"pca_coeff = PCA(features, \"correlation\", .9)\n", | |
"# pca_coeff = PCA(features, \"covariance\", .9)" | |
] | |
}, | |
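The selection step inside `PCA` can be looked at in isolation. The sketch below reuses the six eigenvalues printed above; the helper name `components_needed` is hypothetical and introduced only for illustration.

```python
import numpy as np

# Eigenvalues of the correlation matrix, copied from the printed output
evals = np.array([2.64672233, 1.29267268, 0.98436364,
                  0.46340507, 0.36862616, 0.24421012])

def components_needed(evals, threshold):
    """Smallest number of leading components whose cumulative share
    of the variance reaches the threshold (hypothetical helper)."""
    cumulative = np.cumsum(evals) / np.sum(evals)
    # argmax on a boolean array returns the index of the first True
    return int(np.argmax(cumulative >= threshold)) + 1

for t in (0.80, 0.90):
    print(t, components_needed(evals, t))
```

With a 90% cutoff, five of the six components are kept, so the reduction is modest for this data. An 80% cutoff keeps three.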
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Interpret PC1 and PC2" | |
] | |
}, | |
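{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Each principal component is a linear combination of the original features. PC1 is the direction of greatest variance (about 44% of the total), and PC2 is the direction of second-greatest variance (about a further 22%). In PC1, \"Left\", \"Right\", and \"Bottom\" have weights near -0.5 and \"Diagonal\" has a weight near +0.5, so PC1 contrasts the diagonal length with the two heights and the lower margin. PC2 is dominated by \"Length\" (weight 0.80)." | |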
] | |
}, | |
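One way to read the components is to rank the features by the absolute size of their weights. This sketch copies the first two eigenvector columns from the printed output of the `PCA` call. The feature order is assumed to match the column order of the data frame.

```python
import numpy as np

features = ["Length", "Left", "Right", "Bottom", "Top", "Diagonal"]

# First two eigenvector columns (PC1, PC2), copied from the printed output
pc = np.array([
    [-0.00608924,  0.79765636],
    [-0.50728436,  0.31018807],
    [-0.5244321,   0.21444822],
    [-0.46980657, -0.29743752],
    [-0.01439042, -0.25131735],
    [ 0.49666001,  0.2644053 ],
])

for j in range(pc.shape[1]):
    # Indices of the features, largest absolute weight first
    order = np.argsort(-np.abs(pc[:, j]))
    ranked = [(features[i], round(float(pc[i, j]), 3)) for i in order]
    print(f"PC{j + 1}:", ranked)
```

Sorting by absolute weight keeps the sign visible, which matters here: in PC1 the weight of "Diagonal" has the opposite sign to those of "Left", "Right", and "Bottom".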
{ | |
"cell_type": "code", | |
"execution_count": 9, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[[ -1.61165282e-02 1.03110859e+00 1.59194486e-01 2.52194350e-01\n", | |
" 6.87854928e-02]\n", | |
" [ -1.34264085e+00 4.00971638e-01 8.05607256e-02 -1.76218306e-01\n", | |
" 3.65786904e-04]\n", | |
" [ -1.38802616e+00 2.77211351e-01 1.03608155e-01 -1.55834392e-01\n", | |
" -1.22880533e-01]\n", | |
" [ -1.24344754e+00 -3.84489361e-01 -1.22636168e-01 3.02780783e-01\n", | |
" -1.73906558e-01]\n", | |
" [ -3.80874363e-02 -3.24871072e-01 9.48848490e-01 3.47846835e-02\n", | |
" 1.17974626e-02]\n", | |
" [ 1.31452115e+00 3.41789511e-01 1.05124347e-01 -5.40263144e-02\n", | |
" -2.92696726e-01]]\n" | |
] | |
}, | |
{ | |
"data": { | |
"image/png": "(base64-encoded PNG omitted: heat map of the PCA transformation matrix; x-axis: PCA component, y-axis: Features)", | |
"text/plain": [ | |
"<matplotlib.figure.Figure at 0x7f20c7bd3278>" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"# Show transformation matrix\n", | |
"print(pca_coeff)\n", | |
"\n", | |
"# Plot transformation matrix\n", | |
"plt.matshow(pca_coeff)\n", | |
"plt.xlabel(\"PCA component\")\n", | |
"plt.ylabel(\"Features\")\n", | |
"plt.show()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Make a scatter plot of PC scores (use different symbols for genuine and counterfeit bank note) for PC1 vs. PC2. Which PC variable is more important in discriminating counterfeit vs. genuine? Furthermore, which of the six original measurement variables contribute the most for the counterfeit detection?" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"From the heat map above, we can observe that feature 5 (zero-indexed), corresponding to \"Diagonal\", is the most influential, with a coefficient of 1.314 in PC1, and is the most useful for separating the data." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 10, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# Transform original feature matrix\n", | |
"z = features.dot(pca_coeff)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 11, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYgAAAEKCAYAAAAIO8L1AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJzt3XeYVPX1x/H3mbqNLgiICoodFSP2gjWxd6MmxhJ7NBr9JZbYYjTGQiyxo0ZRY8WOGsVCVFQUFBBEDYodpcP2MnN+f9wLbLmzBXZY0M/reeZ5Zm6bM7M799xvvebuiIiINBbr6ABERGTlpAQhIiKRlCBERCSSEoSIiERSghARkUhKECIiEkkJQkREIilBiIhIJCUIERGJlOjoAJbHaqut5v379+/oMEREVikTJkyY4+49W9pulU4Q/fv3Z/z48R0dhojIKsXMvmzNdqpiEhGRSEoQIiISSQlCREQiKUGIiEgkJQiRFax8YTnf/G8mtTW1HR1K3mUyGSpKK9F9Z1ZNq3QvJpFVSU11LTeccgdjHnmLeDJOzIzj/3YkB52xT0eH1u4ymQz3XvQwT938ArXVdXTr3YWjLzqMzyd/yYTRk+m2ehfW3KAvY596j4pFFfQd2Idz7jqVTbbboKNDl3psVc7sQ4YMcXVzlVXFsBNv5bWH3qSmcmnJIV2U5vz7f8+OB2+zQmL4+pNveWnEGMrml7Ptfluy1d5bEIu1vSLB3Xnuzpd58IrHmff9fNbcYA1O+cexDPn55gDcdvY9PHfny1RX1DTYLxaPkc1kow9qcO3Ll7LezwYw6o7RvPvCB/RaazUO+v3e9BmwOgUlBaTSyTbHuryy2Sxmhpmt8PfOFzOb4O5DWtxOCUIk/yrLKjms1wnUVDWtVlrvZwO4dfw1eY/hmdv+w61n3UsmkwEPktNmQzfm8mfOIx6Pt+lYI697lnsveYTqiuoly9KFKa4YdQEbbrMeh652fORnbUn33l1JpBMsmLWImsoaCM/J8XiMeCLOHr8Zyuk3Hk+qINXmY7fVzBk/cOOpw/ng1SnEYjF2PGRrfn/ziXTu0anJtuWLKiidV0bPNXu06rtcNLeUkdc9yzujJtClZ2cOO3s/ttl3y3x8jEitTRCqYhJZAUrnl2Ox6CvQOd/Nz/v7T584g5tOv7vBsuqKaiaPmcqbj49j6C+3b/WxMnUZHrh8ZIPkAFBdWcM9Fz3E+Q+cSV1tZpninPf9AhKpOHU14f6++D2zZOqyvHz/61SUVnLhg39YpuPnMv6lSTxw+WPM/OwHBv5sAEeedzCXHTaMRXNL8ayTzWR584lxzJjyNcMnDVtS6qqqqOa6k27nzSfGEY/HSBUmOe2G49nj1zvnfK/S+WWcssUfWTi7lNrqIIlOe+d/HH3xoRx53sHt+rmWlxqp5Udr7sz5zJjy1UrRGNyjT7fIq16LGZtsn/969+tPviNyeXVlDa8+9Gbkullfz2HYCbdyxBonc8ImZ/P8Xa/g7pTOL8tZOvhq2rcsmluWuxqpFZYkhwg1VTWMffJd5s9a2OwxamtqeeXfb3DFkddz0+/vZsaHuQcOv/bIWP5yyDVMHfsJ875fwHsvfMC5e/6VytJKPLu0hqWuNsOsL2cz8bWpS5Zdc9zNjH1yHLXVtVRVVLNobhnXnXgbrz/+Ts73e/rmF1g0Z2lygCBZ33/ZSMoXljf7uVY0lSB+RLzqRbz0Bsh8C4kBWKdzsfQOHR3WClc6v4wrjrieD9+YRiIVJxaL8bsbj+fnx+zSYTHFE3FOGXYMN/7uTmrDk6sZFBSlOe7yIwFYNK+UOd/Mo/eAXhR1Kow8TvmiCqZ/MIOuvbqw9kb9Wv3+n0/OfYJMFTZNXPNnLeS0Lc+lbH452UyWeTPnc9sf7uGLKV9xyrBjSCTjDU5wi/Ud2JuP3vqEeCJGpq7tScJi1uCkHCWZTjD323l069Ulcn1NdS3/N/
QSvpj6NVXl1cTiMV7816v84Y6T2ePooQ22dXduPuPuBm0l7lBXUxd57Ewmy9cff8vPdt+UBbMX8s6zE5p8D7XVdVxxxHXsdfyunHX7yU2qnN77z8TIBBuLGxfsfSXVFdVssfsgDv/jgfTo063Z7yLflCB+JLIVT8Oii4GqYEHdNHz+adDtViy9Y4fGtqL95ZBr+ejtT6mrqVvy4/3n7+6k7zqrM2jHjTokpprqWl57eCx4cBLEnXRRmstHXcAaA3sz7IRbefXBN0mmE9TVZDjkD/vw27/9qkHD6KPDnmHEpY+QTCWoq61jrQ3X4IpRF9C9d8snkYLiNGU5Tnp7n7Bbk2VP3/Q8laVVDUoCVRXVjBo+mqIuRVRX1jTZJ12U4vgrjmLR3FJShSkqS6ta89U00FJygKCE0Wfd1Ze8rq2p5ZYz/8XLD7xBpraOLr26sHDOIuqqg8+bzWSprqzhupPuYPOhm9BzzdWW7PvSiDEsmlva6vji8Rj9B60JwNzv5pOr3dqzzqsPvUnPtVaj5xo9+HTCZ6y9cT/2OHpnirsUR+5TVV7NtHc+BeCrj79h9H2vc/sH19KzX49Wx9fe1Ej9I+Du+OydIDur6crEhsRWe2bFB9VBZs74gRMHnRM0cDay/YFbcdmT5+btvd2dVx98k8dvGLWkl9BRfz6Ebr268MDlj/HwVU81OLHG4jE2G7ox62y2Ns/dMbrBukQqQd91gp47Ox2yDf026MtVv7mpQb1/PBFjvZ+tw03v/L3F2O46/wEev2FUk+qbAZutxfCJ/yBTl2HsU+/y3osT6bZ6Vya8NJFPx3/e5DjJgmSQdBudNmLxGBc/eg47HrwNVRXVHNXvFMoWtFxdkkglGPKLzfl43P9YMGtRi9sXFKU55A/7cvwVRwFB0vplnxNbnYxiiRi7HbUjC35YyKxv5vDtpzNbXdKxmLHu4P7c+t7VmBnDz72fx4a18NsKS4lV5dWki9J4Nhu2p7TcRhNPxNn7xN0469aTWxVfW6iR+ielBrJzolfVNf2R/5jN/34ByVQiMkH88OXsvL738HPvZ9RtL1EVnsSfve1FXh/5Nnd+eB0v3PVqk6vubCbLh29MY9rbnzZZV1dTx1cffwvAF1O/xsyaNApn6rLM+PArvvvse/qu27vZ2I657Ai+mPo177/yIWZGNpNh3cEDuGb0JdRU1/LHXf/CjClfUVVWRSKZIJvNBj2IGiWC2hxtD9lMlqLOQbVYQVGaa16+hEsOupo5385rcozGn3Piq1OoKq/OvdFiBtvuvyU7H74dM2f8QJ8Bq3PPhQ+2qaSSDRu6l0WqIMkhZ+3LM7e+yIBN1+Lpm19oeSdnyWdr/PerL6pqLVOXYfyLk5Yp1vaiBPGjkALrBB7RcBfvs+LDaYZX/QcvvRGyMyGxXtBOktqq3Y7ff9BakfXHiVSCLffcfLmO7Znv8fJ7oXZiEHvxb7HEACCos3/6lv80OIHW1WYonV/Os7e/1GxDeW2Oqp/FaiprcvbBjyfjLJpbuiRBTHlzGnf/+UFmfPgVq6/dk2MvO4LtD9yKVDrJ2cNP5fk7RzNv5gJ2OWJ7Nt9lEADP3Poin0/+Ykk9fF1tdDzxZJxMM72TLj7gam56+0rW2Wxt1vvZOjz45e289cx7XHPszVQsqsy5X6uSA4DDmEfe4q2n3wMz1t5oDWbOiCg150l1RQ3Dfnvr0mq3dhoWkSpMka3LUJdt+t127921fd5kGakX04+AmUHJaUDjhs0CKGnf7oDLI1vxGL7gPMh8Bl4BtZPweSfgNe+123sUdSrkN5ceTkFxesmyeDJOSZciDjtnv2U+rtd9hs/ZFyruh9r3oXIkPucgvGYCANPf/5xUuun1Vk1lDe+PnsyOh2xDItm0f3y/9fvSa63Vmixv8v7ukfXd2awzYNO1APjwjWmc/4srmPLmx5QvrODzyV9y5a9v4KX7xjBq+GiOWfd0HrnmaUbf/zp/3ufvvHjvawC8+tCbTQ
a0RVlrwzVIpHL38a+prOHSg65ZMq2GmbHDgVvz5Lx7OeD0vVo8fmvVVNVSU1nD9IlfUL6wot2O2xoNeme1U+18Ihln4BYDSKYa/v8UFKc5/I8HAEFpYvR9/+VPu1/GaVuey5W/vpHXH3s77z30VIL4kbCi43Ecym4HL4NYVyg5h1jhvnl7T/csVD2FVzwIXg0F+2PFv8GsaQ8c9yyUDgMaX0lW4aXXYj0ebbe4jjj3INbcYA0eHfYM839YwNZ7bcGRFxxMt9XbdjWWrXgaym+BzA9gCfD6jZkZoBJfdAm22nP06Ns9si47Fo/RZ0Avjvvrkbz3wkQWzF5IVXk1qcIUiUSc8+/7PfO+X8CF+17ZbCyxmJEsSOHZLDVVtZgFV56nXXcs6cIgGd51/gNNqqqqK2q444/3UVlaSW11w5LBP393J1vuuRnpotYNOvtq2jct1tcvmL2QLz/6hv6brMnCOYsY99z7AHTt2Zl4It6quvfWak2D9qqgrjbDPifvwUv3juHT8Z+RSCXI1Gb49UWHsuPB2+DuXHrwNUx8beqSaqrpH8zg9cfeolvvbtw49gp6rdnyRcayUILIk+AqKoPZivmKzQwrPgEvOh68EqwoL1MDuNdB7WTA8YqHoGo0S076ZV/iVc9Dj8eCNpHsAkisi1kqSFpeFn3QuuntHuf2B27F9gcue9VVtvz+hgkt17mo7jPcq1hns7Xpt35fZkz5qkE1TDKd4OCz9qVzj07cNfU63n7qWXp1v4+11/uUqsokE9/+O90GnM6G26zHx+P+lzOeZEGSK0adz5Q3pvHuCx/Qo293Dv3Dvg16Zc348KvIfcuaGaT3xuPj2O+Un/PRW5+0WNXTmsZcM6Outo7R9/+XG065g1giKHHUVtWQWY6xET8GBUXp4LwQM6rLq4OuwJksuHP7OSOora5j7xN2ZfejhzJg0JoUlgQXWpPGTGXSmKmRbVBzv5vHP064jatfujgvMStBtDN3xysegLJbwOfhsT7Q6U/ECpe9eqMtzGJg0d3olpfXvIfPO52gK60DjU8oVVA3A5+zH2S+C666ieGdLsEK9wNLgUcUiXO0k7hn8fL7oOJeyJZDwY5Yyf9hidb3/18W7nVQdiNNSztRkuEDrnzhQv525PV89PanxBMx0oVpzrnzVNbZbG0AUqkydt7tGrKZhcRiTmFRJdvt9l/++8xUuvc+kILiNNUV1dTvWJgsSJJIxjnzlhMZvMsgBu8yiKMvPjwyktX69eDrsGG7vngiFtl24O7U1WbY8eCtmfTarrxw9yvE4jGqK2qWefbVdHGakm7F3HDq8LCvf+uqQBKpBJm6zI+mVNDY+kPW5YQrf8WmQzdi/H8mMe65Cbzx+Lglgw4Xj4t4acR/2fLngyksWX/Jvu+/Mjln8vasM2nMVGqqavIy/YgSRDvzivug9DqWnFyyM2Hhn3FLYQU/79DYlodnF+LzjqPlH3wlZL4APKh2gmB8RqI/FJ8AZXfS8MRbgJWciWcr8Monofq/EO+DFR2NL7wU6up1Y656Dq96A3q+gMVbvN/60tjrvsYXXQY1bwEJKNwP6/RnLFYSvUN2wdLYm5WGwgMwC66Su/XqwrBX/8K87+dTvrCCvgN7Nxgk5aVX4tkF1J8br6DI2eWgOTx803jOvPV0Xh/5Dp+On06vtXryi+N3ZZ3N1mbg4P6t+vEf+5dfcu1vb204P1JRmp8fuwsv3vtak55dZsb2Bw7BzDjjphM45A/7Mvn1aXz+4Rc8ecPzrfj8S8USMVLpJBc9dDZjn3g38kQfiwcT3tUviVjM6LlmD/Y5aQ92+eX2/PXwf/D5pFbdLrkJM2tTYkumk2TqMsSTMWqrmu8osDziyThnDz+FgYODDg2Dd92ET96bTvnC8ibfU1V5Nc/c8h+2P2Bp6bdLj86kCpLNzG3leZtOXQmiHbk7lN1MdD379R2SIDxbBtVjwGsgvVPkidXrZgRVQokNsFjn6OOU30Nrrwab1s
dU4xX3YF2uwzEovxu8CmJdoOSPkNoen3tQUNdPJRDDKx8Don60ZXjFfVin/8v97nVf4OV3QO0UiA+A6reAMiAbHLPyabz2E+gxMroaLtYZLN5CI2QcUttinS9qsqZ7725NBq957SdQ9VxkQ3NtjdFn7YV8Ne0bLn/6vObetFlDf7k9ZQvL+defH6Qy7K566Dn78ZtLDqdT92Iev35U0MvKjGQqwVEXHMwaA5eW3vqu23tJb6ixj49j1tdzG37iRBzwJlVN8USMoy8+jH1O2oPuvbsx7Z1PI6fasFiMfU7cAzP4/ovZDNh0LfY8ZmiDEeF3fDCMbz79jneem8BDVz7Z6kFsyYKgFJerG25j6aIUV/3nIvqs25snbhjFY/94Bl/OGrBco8CTqQTxeHBVMOfbuZy+9QWUzS/LWWVX2mj8yK5H7cA9Fz8c/Z4Gm+yw4ZJ2qPamBNGuqnLXs2eaFv3zzavH4PPPAosBDosyeKc/ESs+JlifnYfPPxVqPw4bYWvxktOIlfyu6cGqo+fraWUkkJkZVH8V/xYvOgmjBqw46JNfehNkZrK0yiobPqJkofpdaDqhZvBOtR/h834VJCCyUPcpTc/0tUFPqtoPIPWzJscwS+FFx0L5veSuZioi1v3OZj91g7jK7yLXZ0qmnLnfF7JpcUGrj5fLviftyd4n7E7ZgnKKOxeFJ3U4/vKj2OnQbXn9sbcxM3Y5YnsGbLp2zuNc+sSf+NNul1FXl6GmsobCkgK69+nGHr/ZmYeufAKLGbFYMHX3hQ+fzXb7Lx1zte3+Q3jg8pFNGqQTiTgHnrFXi1OE9Fu/L4et35e1NuzHXw8fRk1VLZ51EskE6aIU+526J0/f/J8l1S7JdJIuPTrRY43ufPJuy+1ZZnDefWcuab/p2W81Eslk5NQhrdGjbzf6rd+Xrr268Paz7zWYzh2gpFsx/QcFPc1u/7/7WDBrYc65qlKFKYYevl2DZd17d+OyJ//E5UdcR8XCpTdfShWmKOpUwB/vjvi9thMliHZVEFwVZ+c1XZXon/d3d3fw+UEcZILkQGXD82Pp1WQzs7CCofjCv0FmGkF1ULi+7A48MbBBacdrPoC6qeSWCB/V5Lzsrp1K9oetwRcBSbzwUKzzBUAaql+iaXtGc2/Xt95nziyp4gHwRX8LutAuXRJ9DPegcTwiQQBYyVnBnuW3Re/f1vb/us9yhvH19DTffVnMbke1z5QosViMzt2bZtCBgwcsqeZoyfpbrsuI6Tfx0ogxfDf9BwbtuCE7H74dqXSSvY7flXef/4BkOsl2BwyhpGvDNq/+m6zJQWfuw1M3vbBkyu5UQZKDz9ynTfNHbb33FtzwxhU8cu3TfPvpdwzacSMO+7/96bXmamw+dBMev34U839YyLb7bcmhZ+/HjClfceE+V0ZOA1LfJjtsyE6HLL3/xk6Hbcud5z3QZLt4Ms62+w1h6tiPicWNeTMXRH9XQ9blr0+dR11tHRftfxVTx35MdUUN6aIUsViMSx7745KS6rjnJuRMDumiNL3792T/U5vWNGy55+aM/OFupoz9mI/e+pTK8mrWXL8POx++HQVF+Sk9gKbaaHfZikdg0d9YMicSAAVYt5uw9NBcuy0Xr/0UL7seqscSVAPFILFR2Dsoqp94LHzkqHdNbkmsx0PBsb0Gn7V9eGLPofMwKP07+Nzc2zQRh9SuxLrfSnbur6C2DX/Hbk9D3dtQdkeQEOP9oOQCYoV7kv1+ENByn36sCOt2N5Zqfg7+7Jwjoe79RksTUHggsS4tT3Gx5DgLL4PKR2j8nWcycMpum3LEn8/gF8fu2urjrQo+fvd/vPbwWMyMXY/cgQ22Gpj39/zg1Q+5cN8rm3TpXSxVkOTaVy5l40Z3rnvtkbEM++2txBOxoMBbl+Wcu05l91/tBEB1ZTWH9TqhSWNxQXGaP91zOjsfFlz1uzuT//sRH74xjW6rd2HoL7dvkEAP7n5c9BQkBmfdehJ7HjM0b9VFDd
6uo28YZGYFwOtAmuDycqS7X2pm9wJDgcXDfo9z94kWpNgbgX0IzmrHuXvjX2YDHZkg3LN4xSNQMSLoH5/eGSs5C4v3DvvP3wiZ7yG+djBauKDhj99rpwXVNrESKNgLi7V+1kbPzIHq18ESuNfCor/Q9Ap8cUvoMlSsxvsT6/lS0CNr0RVQ+QC5K+QX/zO3oQRQX9GZUPNqWEJp6X/RoOj4oJRWdjtNGru73YIvPC/3tCNLpIKR0D2eaLErsNfNwOceETZaB92HiXXDejyOxbq3+PGWHCfzLT5nf/ByFn/OTCbJ99/vRNd1htGpW44Gc2mzh/7+BA9c8XiTRvl0YYprX/0LG22zXuR+pfPLeO+FD3CHrfYe3KQU9vaz4/nbkdeTzTq11bUUFKfZaq8tuOiRs1t9V75bzvoXzw1/uUF1ViIZZ7sDt+KSR3O3q7W3lSFBGFDs7mVmlgTeBM4CTgVGufvIRtvvA/yeIEFsA9zo7s3eh7EjE0R24V+g8kmWnqTiYF2wns8DyaA3DpmgYbjeiSQ46V4ClU8TXE0GtXzW7RYsvVOD93CvgdppECuG+LpBfX35v6H0qqBdwY3oEsJyKjoO0nvBwj9C9pv2P34DCXKWZJYogMS6UHIWlt4Rn7VVeKJtJNYX0rtD5UgaJo80WHfw74OX1g1KTseKjmrVOBXPlkHVKLzuMyw5KEjo1varPK/7DF90NdS+B9YFio/Fio4N2mZ+ohbNK2XWV3Pos87qFHcuapdj1lTVcO6ef+XzSV+Gk+SlSCQT/GPMZUu6HC+rOd/O5bWHxlK6oIytfrEFg3bcsE3jjSrLqzj/F5fz+aQvcQ8GQPZauyfXjbks8k51+dLhCaJRMEUECeK08BGVIO4Axrj7Q+HrT4Bd3H1mruN2VILwzCx89m40rcpIQ2oPqHkl7AUTDJaj88VY4eFQ824w6rh6NE1OilaM9XpnyYknW/kcLLoIMPAMxPtCemgwJmBZSgVtYV3Bo+tbO0YBFPycWNdhQcP6rJ3JXY2UglivYGZbS4e9t/YEyqH6HZYmjkJIbYV1u7PBD9zrvgqr694OSirFJ2CFh/+o7ke8MqirreOGU4fz2kNvkkglqKup48Az9uKkq3/TLt91Nptl4qtT+OjtT1ltje7sfPh2Oe+xsaK5O9Pe+ZQZH37FGuv1YfNdNlnh/18rRYKwoPVwAjAQuMXdzwurmLYjqJN4BTjf3avNbBRwlbu/Ge77CnCeu49vdMyTgZMB1lprrS2//LJtfaa97nO87GaoeR/ia2Alp7X5fgle/V98wTmNpl5YEiFNq0rSkBoSzOHjOXrFWAnW9XosPZRszXiYdxytqkv/yUhhvV4H64zP2ibHd7+YQfHpWHrnoH0i83U4hqPRd29FWLfhWGprIJyMb85+YU+0xUm4EIp+Razzsnc/laZu/+MIRt32UoMG5XRRmt/+7SgOOSt/08NIoLUJIq9lW3fPuPtgoB+wtZkNAi4ANgS2AroDi395USm0SfZy9+HuPsTdh/Ts2frBUhAW8eceClXPQ/Y7qH0Pn/87shVP1j8+XjOB7KKryJbegEf1PomvET0iODpkoA5qxuVODvX2zZbfDfOORsmhsSRkvg+qhAoPamFbh4oRWGowFl8Nat4l8vv0yuDvsvhl+b/CHlD1S2iVUPEAnl2ZSlSrtmw22+T+FxBMh93i/RVkhVohlZ/uvgAYA+zl7jM9UA3cA2wdbvYNsGa93foB37VrHKU3hCfp+ieAKij9O+6ZsH3gQnzeb6HiHigfjs85OKj3r8cSAyG5EYunWGhZhpbr2bO4G5T+k7xXIa2SKiEe9CXP1WW0AS9dOro01h2IGomcDteFat4j8u9kKajLPU+StE1dTV3k/ToASuflGEckHSJvCcLMeppZ1/B5IbAH8LGZ9QmXGXAQMCXc5RngGAtsCyxsrv1hmdS+T+TJ16uCOuuad6HqOYKqCCc4WVRB6VV4pmEXTus2HBo1KufWXP
1iEijAulwHlY/Suvl/fmxaMUDMipdOjdHqyf3CBFGwVzhYsLFaPF2vz3miP5E/Ca8J2oCkXaQKUvTJcYOjDbbOf1dYab18liD6AK+Z2WTgPWC0u48C/m1mHwIfAqsBV4TbPw98DkwH7gTaf3hgrFeOFR7UbVe9EI7AbcTiUPPfhotiXYh1uz24UU+rRJ0E01ByDtbzZaxgN8jOb+WxfmQKj4Dkts1vE683wCrRmpNIfEnvIIt1gsJfRm5D+dLR0FZ8Ek1LGilIbY3F12jFe0pr/f7mE0gXpZZMPRKLGQXFaU79x7EdG5g0kLeR1O4+GdgiYnnTO6QHyx04PV/xAFjJqfjCcxu1BaSDydtixbiliG5kNqKrKIDE+lA7oaV3hvjakPmSoASTACPoQVP/bmoFPw+mf6D95sxfeTTTnbV6bIPR0U2lsKKlJw4rOQufN4Fmx16kG/2bVUbdHrIGKh7GO52HWRxLbgxdbwy6IS9O1gV7YJ3/1kxssiy23HNz/jHmrzx05eN8+dG3rD9kHX514aFtGmkt+feTG0mdLR8BZTcQTC9RBwV7Y12uwCyN107F5x5Fw1HQAIVYr7GRs396zfigzaLJPo2k98JKfg81Y8E6Q8HPmxzPsxVBH/9WT4q3skgRVJXVEiTXRvFbTyg5G0ovIrp9JQnFp0D5XUR+jwUHYl2uadAVMBgPcnmO48WwnmOweFCNEXSNzVVCiWO9JmCxpX3w3T0YbGdFWCw/U6eLdKSVohfTyihWfCzWaxzW40ms11vEul67ZOyBJTeBkjMIRgcXAEVAAdb1hpxTQ1tqCNb9bkhuFu4T1d5QiKV3wpLrYcXHYUWHYLESshVPk539C7I/bEl23nGQmdGw0TTiOBQdT37+bMllOG4hdL0F63Yb1u1m6PVO8D3Y4pNtQdB20O1WYsWHgeWq4gMKDw9mUW1QqE1Ber/wb9Twew3mispRAE5tvyQ5AHjZbeRsB4r1bZAcILz5UrynkoP85P0kJ+szS0EietKyWMnJeOF+4VQWKUjvjsW6NH+81FZYj2DcX3bhpVD5FEsbm9PBDXEK92+wT7bsLii7ael2NW/hc38FBXtC1Qs0LUUYJDfBOv0JJxFM8YET/AlrwufL0ftcaeIUAAAUUklEQVQp1hmKToKya1p5nEJIDcHSuy+t6we8+7+h5k28ZjwW6xVW34W3+iw+Ipg/qUEpIQ6pIcQSffAeT+Flt0L1K8EUJIXHYEXRN8exeE88vTNUv0HDqqYCrKRRTWXVaHJO49F4WxFZ4idXxZRv7g5VzwZ3lcuWQ+HeWNFxDUogwQR420RMF2GQ2jmYijozlyB5WPAoPhUrOZ1g1pLgngdBEivEE+vAvONpeOK1oCrLioMJ7byWpW0A4fTfpMK7vhnW7R4stTmeLcMrHg1O0nWf5JikLw2dr8QK927TLVXda/D5vwvHJVjQsyi2Gtb931i8mdJFruNlK/BFF0HVS8FnihVDp4uJFe7TYLvsnH1zdFNNYr3eaNOcSiI/BivFSOp8WxkTRGt43Vf43AMaTUsdivXGer4Ilc/iNe9BfC2s6PAGVSZRspUvwqILgUzQtpIYiHW7FYv3wT0bJq2Hgi6bhQcE00zUTIBYt7CU1HQeHK/7IhhY6JUEySUGpLFuNzeZN6pNn7/2I6j9KOg6mtp2ueci8mxZMLI61qvB1N+LZcsfDmabbdCFOA7JnxHr8e8m24v82ClBrMQ8Wx42mkb0wklutcwnLffa4ErZSrDEWssX5OJjZmbi5XcHU5MkBmDFJ2LJjdrl2CuKezYoaVQ+G5aYPGh76H7vMpVcRFZ1rU0QP8k2iI5msWK88JBwNthG941YjjpxsyQkN17u+BocM94n8raaqxKzGNblSrz4d1A3BWK9Ibm5JuATaYESRAexzhcF4y4qHgEyQVVPpz9j6e07OrQfLUv0g4T62Yu0lhJEBzFLYp0vxDudGzRWWxdd0YrISkUJooOZJYP7L4iIrG
R+cgPlRESkdZQgREQkkhKEiIhEUoIQEZFIShAiIhJJCUJERCIpQYiISCQlCBERiaQEISIikZQgREQkkhKEiIhEUoIQEZFIShAiIhJJCUJERCIpQYiISCQlCBERiaQEISIikfKWIMyswMzeNbNJZjbVzC5rtP4mMyur9/o4M5ttZhPDx4n5ik1ERFqWz1uOVgO7uXuZmSWBN83sBXd/x8yGAFH32XzE3c/IY0wiItJKeStBeGBxCSEZPtzM4sC1wLn5em8REVl+eW2DMLO4mU0EZgGj3X0ccAbwjLvPjNjlUDObbGYjzWzNHMc82czGm9n42bNn5zF6EZGftrwmCHfPuPtgoB+wtZntDBwO3BSx+bNAf3ffDHgZGJHjmMPdfYi7D+nZs2e+QhcR+clbIb2Y3H0BMAbYFRgITDezL4AiM5sebjPX3avDXe4EtlwRsYmISLR89mLqaWZdw+eFwB7ABHfv7e793b0/UOHuA8Nt+tTb/QBgWr5iExGRluWzF1MfYETYKB0DHnX3Uc1sf6aZHQDUAfOA4/IYm4iItCBvCcLdJwNbtLBNSb3nFwAX5CseERFpG42kFhGRSEoQIiISSQlCREQiKUGIiEgkJQgREYmkBCEiIpGUIEREJJIShIiIRFKCEBGRSEoQIiISSQlCREQiKUGIiEgkJQgREYmkBCEiIpGUIEREJJIShIiIRFKCEBGRSEoQIiISSQlCREQiKUGIiEgkJQgREYmkBCEiIpGUIEREJJIShIiIRFKCEBGRSEoQIiISSQlCREQi5S1BmFmBmb1rZpPMbKqZXdZo/U1mVlbvddrMHjGz6WY2zsz65ys2ERFpWT5LENXAbu6+OTAY2MvMtgUwsyFA10bbnwDMd/eBwPXA1XmMTUREWtBsgjCzDc1sdzMrabR8r5YO7IHFJYRk+HAziwPXAuc22uVAYET4fCSwu5lZKz6DiIjkQc4EYWZnAk8DvwemmNmB9VZf2ZqDm1nczCYCs4DR7j4OOAN4xt1nNtp8DeBrAHevAxYCPSKOebKZjTez8bNnz25NGCIisgwSzaw7CdjS3cvC9oCRZtbf3W8EWnVl7+4ZYLCZdQWeNLOdgcOBXSI2jzqmRxxzODAcYMiQIU3Wi4hI+2guQcQXVxG5+xdmtgtBklibViaIxdx9gZmNAXYFBgLTw9qjIjObHrY7fAOsCXxjZgmgCzCvjZ9HRETaSXNtEN+b2eDFL8JksR+wGrBpSwc2s55hyQEzKwT2ACa4e2937+/u/YGKMDkAPAMcGz4/DHjV3VVCEBHpIM2VII4B6uovCNsGjjGzO1px7D7AiLBROgY86u6jmtn+buB+M5tOUHI4shXvISIieZIzQbj7N82sG9vSgd19MrBFC9uU1HteRdA+ISIiKwGNpBYRkUhKECIiEqm5cRADzWyHiOU7mdm6+Q1LREQ6WnMliBuA0ojlleE6ERH5EWsuQfQPG5obcPfxQP+8RSQiIiuF5hJEQTPrCts7EBERWbk0lyDeM7OTGi80sxOACfkLSUREVgbNDZT7A8H8Sb9maUIYAqSAg/MdmIiIdKzmBsr9AGxvZrsCg8LFz7n7qyskMhER6VA5E4SZFQCnEkyu9yFwdzjVhoiI/AQ01wYxgqBK6UNgb2DYColIRERWCs21QWzs7psCmNndwLsrJiQREVkZNFeCqF38RFVLIiI/Pc2VIDY3s0XhcwMKw9dGcMvpznmPTkREOkxzvZjiKzIQERFZuWg2VxERiaQEISIikZQgREQkkhKEiIhEUoIQEZFIShAiIhJJCUJERCIpQYiISCQlCBERiaQEISIikZQgREQkkhKEiIhEyluCMLMCM3vXzCaZ2VQzuyxcfne4bLKZjTSzknD5cWY228wmho8T8xWbiIi0rLnpvpdXNbCbu5eZWRJ408xeAM5290UAZnYdcAZwVbjPI+5+Rh5jEhGRVspbgnB3B8rCl8nw4fWSgwGFgOcrBhERWXZ5bYMws7iZTQ
RmAaPdfVy4/B7ge2BD4KZ6uxxar+ppzXzGJiIizctrgnD3jLsPBvoBW5vZoHD58UBfYBpwRLj5s0B/d98MeBkYEXVMMzvZzMab2fjZs2fnM3wRkZ+0FdKLyd0XAGOAveotywCPAIeGr+e6e3W4+k5gyxzHGu7uQ9x9SM+ePfMat4jIT1k+ezH1NLOu4fNCYA/gEzMbGC4zYH/g4/B1n3q7H0BQuhARkQ6Sz15MfYARZhYnSESPAs8Bb5hZZ8CAScBp4fZnmtkBQB0wDzguj7GJiEgL8tmLaTKwRcSqHXJsfwFwQb7iERGRttFIahERiaQEISIikZQgREQkkhKEiIhEUoIQEZFIShAiIhJJCUJERCIpQYiISCQlCBERiaQEISIikZQgREQkkhKEiIhEUoIQEZFIShAiIhJJCUJERCIpQYiISCQlCBERiaQEISIikZQgREQkkhKEiIhEUoIQEZFIShAiIhJJCUJERCIpQYiISCQlCBERiaQEISIikZQgREQkUt4ShJkVmNm7ZjbJzKaa2WXh8rvDZZPNbKSZlYTL02b2iJlNN7NxZtY/X7GJiEjL8lmCqAZ2c/fNgcHAXma2LXC2u2/u7psBXwFnhNufAMx394HA9cDVeYxNRERakLcE4YGy8GUyfLi7LwIwMwMKAQ+3ORAYET4fCewebiMiIh0gr20QZhY3s4nALGC0u48Ll98DfA9sCNwUbr4G8DWAu9cBC4Ee+YxPRERyy2uCcPeMuw8G+gFbm9mgcPnxQF9gGnBEuHlUacEbLzCzk81svJmNnz17dp4iFxGRFdKLyd0XAGOAveotywCPAIeGi74B1gQwswTQBZgXcazh7j7E3Yf07Nkzz5GLiPx05bMXU08z6xo+LwT2AD4xs4HhMgP2Bz4Od3kGODZ8fhjwqrs3KUGIiMiKkcjjsfsAI8wsTpCIHgWeA94ws84EVUqTgNPC7e8G7jez6QQlhyPzGJuIiLQgbwnC3ScDW0Ss2iHH9lXA4fmKR0RE2kYjqUVEJJIShIiIRFKCEBGRSEoQIiISSQlCREQiKUGIiEgkJQgREYmkBCEiIpGUIEREJJIShIiIRFKCEBGRSEoQIiISSQlCREQiKUGIiEgkJQgREYmkBCEiIpGUIEREJJIShIiIRFKCEBGRSEoQIiISSQlCREQiKUGIiEgkJQgREYmkBCEiIpGUIEREJFKiowMQWRVMfesTnr3tRRbNK2PnQ7dlt1/vRCqd7OiwRPJKCUKkBU/cOIp/XfgQNZU1uMOHr3/EqDte4rrXL1eSkB+1vFUxmVmBmb1rZpPMbKqZXRYu/7eZfWJmU8zsX2aWDJfvYmYLzWxi+LgkX7GJtFbp/DLuvuBBqiuC5ABQVV7Nlx99w6sPvtmxwYnkWT7bIKqB3dx9c2AwsJeZbQv8G9gQ2BQoBE6st88b7j44fPw1j7GJtMrUsZ+QSDUtaFeVV/PmE+90QEQiK07eqpjc3YGy8GUyfLi7P794GzN7F+iXrxhElldR50J8cdGhHjOjc49OHRCRyIqT115MZhY3s4nALGC0u4+rty4J/Ab4T71dtgurpF4ws03yGZtIa2yywwYUdS5qsjxVmGS/U3/eARGJrDh5TRDunnH3wQSlhK3NbFC91bcCr7v7G+Hr94G1wyqpm4Cnoo5pZieb2XgzGz979ux8hi9CPB7n6hcvosca3SnsVEBR50JSBSlOvOpoNt52/Y4OTySvLKr4nJc3MrsUKHf3YeHzLYBD3D2bY/svgCHuPifXMYcMGeLjx4/PS7wi9WWzWT566xPKF1awyQ4bUtK1uKNDEllmZjbB3Ye0tF3e2iDMrCdQ6+4LzKwQ2AO42sxOBH4B7F4/OZhZb+AHd3cz25qgdDM3X/GJtEUsFmPQjht1dBgiK1Q+x0H0AUaYWZzgZP+ou48yszrgS+BtMwN4IuyxdBhwWri+EjjSV1TxRkREmshnL6bJBNVIjZdHvqe73wzcnK94RESkbTQXk4
iIRFKCEBGRSEoQIiISaYV1c80HM5tN0ODdEVYDcnbBXcmtyrGD4u9Iq3LsoPgXW9vde7a00SqdIDqSmY1vTT/ildGqHDso/o60KscOir+tVMUkIiKRlCBERCSSEsSyG97RASyHVTl2UPwdaVWOHRR/m6gNQkREIqkEISIikZQgWmBmh4e3TM2a2ZB6y1Nmdo+ZfRjew2KXcHmRmT1nZh+H+13VYcHT9vjDdVuGy6eb2T8tnDSrIzQTf9LMRoRxTjOzC+qtOzvcZ4qZPWRmBatQ7F3NbGT4/zPNzLbriNjDWNocf7g+bmYfmNmoFR91gzjaFL+ZrWlmr4XLpprZWatK7OG6vSy4nfN0Mzu/XQJxdz2aeQAbARsAYwimH1+8/HTgnvB5L2ACQcItAnYNl6eAN4C9V5X4w9fvAtsBBrywksb/K+Dh8HkR8AXQH1gDmAEUhuseBY5bFWIPX48ATqz3/9N1Vfnu660/B3gQGNVRsS/j/04f4Gfh8k7Ap8DGq0jsceAzYJ3w/2ZSe8SuEkQL3H2au38SsWpj4JVwm1nAAoI/ZIW7vxYuryG4EVKH3Va1rfGbWR+gs7u/7cF/4X3AQSss4Eaaid+BYjNLENzbvAZYFK5LAIXhuiLguxUSbOMA2xi7mXUGdgbuDvevcfcFKyzgxkEuw3dvZv2AfYG7VligObQ1fnef6e7vh/uWAtMILjhWuGX47rcGprv75+F552HgwOWNQwli2U0CDjSzhJkNALYE1qy/gZl1BfYnPBGvZHLFvwbwTb3tvqGDfiQtGAmUAzOBr4Bh7j7P3b8FhoXLZgIL3f2ljgszUmTsBFd/s4F7wiqau8xsZbwzUa74AW4AzgUibwS2kmgufgDMrD/BbNTjGu/cwXLFvgbwdb3t2uV3m8/7QawyzOxloHfEqgvd/ekcu/2LoBg4nmC6j7eAunrHTAAPAf9098/bN+KG2jn+qPaGvHZ1W8b4twYyQF+gG/BGeJz5BFdOAwhKRY+Z2dHu/kD7R97usSeAnwG/d/dxZnYjcD5wcftHHmjn+DcGZrn7hPptWvnUnvEv/p2aWQnwOPAHd1+U4xjLrZ2/+7z8bpUgAHffYxn2qQPOXvzazN4C/ldvk+HA/9z9huWPsMVY2jP++TSsEutHnqtoliV+grrY/7h7LTDLzMYCQwh+FDPcfTaAmT0BbA/kJUG0c+yvA9+4++Kr1pEECSJv2jn+LYADzGwfoADobGYPuPvR7RdxQ+0c/+dmliRIDv929yfaMdQm2jn2r2lYg9Euv1tVMS0jC3orFYfP9wTq3P2j8PUVQBfgDx0YYrNyxe/uM4FSM9s27L10DJDraqYjfQXsZoFiYFvg43D5tuHnM2B3grrklUlk7O7+PfC1mW0Qbrc78FFHBdmMXPFf4O793L0/cCTwaj6Tw3KIjD/8f7kbmObu13VohLnl+r9/D1jPzAaYWYrg+39mud+tI1roV6UHcDBBfV418APwYri8P/AJwcnnZYLZESHI3B4unxg+TlxV4g/XDQGmEPSKuJlwQOVKFn8J8BgwleAk+qd6+1wW/mimAPcD6VUo9sEE1X6TgaeAbqvSd19v313o+F5MbYof2DH87U6u99vdZ1WIPVy3D0HPq88IqqmWOw6NpBYRkUiqYhIRkUhKECIiEkkJQkREIilBiIhIJCUIERGJpAQh0gZmljGziRbMFPuYmRWFy3ub2cNm9pmZfWRmz5vZ+hH7/8vMZpnZlBUfvUjbKEGItE2luw9290EEE6WdGg6wehIY4+7ruvvGwJ+B1SP2vxfYa4VFK7IcNNWGyLJ7A9gM2BWodffbF69w94lRO7j76+FEcCIrPZUgRJZBOBnj3sCHwCCC+2mI/KgoQYi0TaGZTSSYDuMrwns3iPwYqYpJpG0q3X1w/QVmNhU4rIPiEckblSBElt+rQNrMTlq8wMy2MrOhHRiTyHJTghBZTh7MeHkwsGfYzXUq8Bci5uM3s4eAt4ENzOwbMz
thhQYr0gaazVVERCKpBCEiIpGUIEREJJIShIiIRFKCEBGRSEoQIiISSQlCREQiKUGIiEgkJQgREYn0/59sj0llbvFtAAAAAElFTkSuQmCC\n", | |
"text/plain": [ | |
"<matplotlib.figure.Figure at 0x7f20c58b1470>" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"true_labels = df['Y']\n", | |
"\n", | |
"# Plot the first two components\n", | |
"plt.scatter(z[0], z[1], c=true_labels)\n", | |
"plt.xlabel(\"PC 1\")\n", | |
"plt.ylabel(\"PC 2\")\n", | |
"plt.show()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"From this plot, we can observe that PC1 captures most of the variance and is the best separator of the two classes. There is also one outlier along PC2." | |
] | |
}, | |
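{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Using the scikit-learn library (**scikit-learn's `PCA` centers the data but does not scale it, so it is equivalent to PCA on the covariance matrix; for correlation-based PCA, standardize the features first**):" | |
] | |
}, | |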
{ | |
"cell_type": "code", | |
"execution_count": 12, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"from sklearn.decomposition import PCA as pca\n", | |
"\n", | |
"pca = pca(n_components=2)\n", | |
"pca.fit(features)\n", | |
"features_pca = pca.transform(features)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 13, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYQAAAEKCAYAAAASByJ7AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJzt3XmYHGW59/Hv3d2zT1YybAkQkIgshsWwKzsStqDigqKo8IJwRODoAeTlQg/H9QgHkQPi4RWEoywKiqCyQxAXFAYIkBBW2QKEDElIMpmZnl7u94/uwCTp6ekxU/V0pn6f6+or3VVPTf2mpzP3VNVTz2PujoiISCp0ABERqQ8qCCIiAqggiIhImQqCiIgAKggiIlKmgiAiIoAKgoiIlKkgiIgIoIIgIiJlmdABhmPSpEk+derU0DFERNYrjzzyyFvu3jFUu/WqIEydOpXOzs7QMURE1itm9nIt7XTKSEREABUEEREpU0EQERFABUFERMpUEEREBFjPehmNFM89i3dfCvl5kN4Sa/8y1rhz6FgiIkElriB47kl88WeBLFCEwqv4kodgwiVY036B04mIhJO4U0a+/PtAL1AcsLQPX/4fgRKJiNSHxBUEcnMrLy+8gXtvvFlEROpI8gpCakLl5dYINMYaRUSkniSvILSdCLSssbAR0u/Buy/Gc8+FSCUiElziCoK1fgbavgg0g7UBaaAA+bmw8qf44qMprrwqcEoRkfglryCYkRpzBrbh32DMt4AGoFBeWwD6YMUP8cLCcCFFRAJIXEFYxVKtkH+KUvfTtdZCdnbckUREgkpsQQDAGqj4FphROnIQEUmORBcEazmSir/43aH5wNjziIiElOyCkNkaxnyNUnfTFrBWoAnGXYAN1j1VRGSUStzQFWtKtX0ebz4EsvcDDdB8AJaagOdfgsLLkNkaS08OnFJEJHqJLwgAlt4YWo8BwL2X4pLjob+zdI3B+/HmA7FxF2Cm6woiMnoFO2VkZs1m9pCZPW5m88zs/FBZBvLl34H+h4E+8BVAFvruw7svDx1NRCRSIa8hZIED3H1HYCdgppntETAP7kXo/S1rd0Xtg57rQkQSEYlNsFNG7u5Ad/llQ/nhofIAuPcD+UFWrow1i4hI3IL2MjKztJnNARYBd7v730Pk8OxfKXYdCoums/qw2KsYNO4edywRkVgFLQjuXnD3nYApwG5mtsOabczsJDPrNLPOrq6ukc/Q/zi+9GQovDBIiwxYOzb2nBHft4hIPamL+xDc/W3gfmBmhXVXuPsMd5/R0dEx8vvuvozKw1eU2Vhs0m1Y5j0jvm8RkXoSspdRh5mNLz9vAQ4Cno49SOF5ql668GVgY2KLIyISSsj7EDYBrjGzNKXC9Ct3/33sKdJbQmFBlQYpMN2uISKjX8heRk8AO4fa/zsKr1VZmYHmmZhpJjURGf3q4hpCKF5YCIVXBm+QmoiNrYv75UREIpfogkBxRfX1rZ/FUu3xZBERCSzZBSGzZfX1TfvHk0NEpA4kuiCYZWDQYa7T4IVB1omIjD6JLggANO4FWIUVBVj+zfJwFiIio1/iC4K1nwKkK6/MP46/OZ1i7z2xZhIRCUEFIbMVZKZVaVGEZafhhddjyyQiEkLiC4J7EfIvDtEqj/f+NpY8IiKhqCD0/groG7phcUnkWUREQkp8QaDn+hoaZbCmD0YeRUQkJBUE7x26TWYnaNwn+iwiIgGpIDQfQuVupwM0fQgzvVUiMrol/rectZ0IjKveqPfWWLKIiISkgpAaC5mpQzQa5D4FEZFRJPEFoSRffXXrJ+KJISISkAoCQNNaM3eurnlWPDlERAJKfEEorrwaVl5cvdGSY/DisljyiIiEkuiC4LlnYcVFDHnKqLAAX3FhLJlEREJJdkHouxXI1dAyB323RR1HRCSoRBcEvB8o1th4iHsVRETWc4kuCNZ8MNBcQ8sGaD4i6jgiIkElui
DQMANajgAyVRplIDMVG/PVuFKJiASR6IJgZtjYb0N688EbpaZgG9xauoFNRGQUC1YQzGwzM5ttZvPNbJ6ZnR4oB1jr4A2KL+FLjsFz8+MLJSISQMgjhDzwNXffFtgD+LKZbRckiTVVX597HF/yGbywMJ48IiIBBCsI7v6Guz9afr4CmA9MDpOmcegmnsN7fh59FBGRQOriGoKZTQV2Bv5eYd1JZtZpZp1dXV3RBGjev4ZG/aDTRiIyigUvCGbWDvwaOMPdl6+53t2vcPcZ7j6jo6MjmgwtnwAbM3RDD/52iYhEJuhvODNroFQMrnX33wTLkWqDiXcN3TD3EO4efSARkQBC9jIy4EpgvrtfFCoHgOeegKUfr6FlH15cGXkeEZEQQh4h7A18DjjAzOaUH4fFHcJz8/HFn4Pia7VuEWkeEZFQqt2iGyl3/zN1MECQd/830Fdj6+bS6SURkVFIV0lz86jtr/4MtJ+Bmd4yERmd9NttqPmUAWw8jDkHa/ti5HFEREJJfEGw9lMZcsTTxt1JtX2uNMyFiMgopYLQuCuM/yGkNh28UfZOirlX4gslIhJA4gsCQKr5QKxjNlBlkLueq+OKIyIShApCWel0UJXpNPufiC2LiEgIKgirqdYLN4/nntadyiIyaiW+ILgXKHb/D8VFewPZwRsWXsCXfAp/ayae1/UEERl9VBCWfxO6L4NiF1Cs0jIL3guFl/GlX9SRgoiMOokuCF5YDL2/pfY7lQGKUFwMOV1TEJHRJdEFgcI/hp4traIUFJeMeBwRkZCSXRDSm4H3D387z0HjTiOfR0QkoEQXBEtvDE37Aenhbdj2JSw1IYpIIiLBJLogANj4C6H5qGFs0YQ1HxJZHhGRUFQQrInU+O9DenqNW2Rx7440k4hICIkvCO8oPFt72xU/ji6HiEggKgir2DCuI+T+El0OEZFAVBBWaZ41jMb5yGKIiISiglBmY84END2miCSXCkKZpdph4q9qbR1pFhGREFQQBkg1ToP0ljW1da8yVLaIyHpIBWFN7WfW0KgJ+h+JPIqISJyCFgQzu8rMFpnZ3JA5VlN4o9aGkcYQEYlb6COEq4GZgTO8w70AKy+poWUeGneNPI+ISJyCFgR3fwCon2FDfVlpzoOhG0L+ucjjiIjEKfQRQn2xMWDVptFcpYAvOy/yOCIicar7gmBmJ5lZp5l1dnV1RbyvBmg9DmgZunF+Hu5VptwUEVnP1H1BcPcr3H2Gu8/o6OiIfH/Wfga0Hc/Q9xqkGPaw2SIidazuC0LczFKkxpwO435UvWHj/lhNp5dERNYPobudXg88CGxjZgvM7ISQeQZKtcwE22SQtQ3Y+B/EmkdEJGpB/8R190+H3P+a3B0As/LpovHfg6XHA8UBrZqg457SUBciIqOIThkBXlhMcelp+Jvb429uT3HpKRTzr8GyswFfo3UW8i8FSCkiEq3EnwR3z+NLPlm+Q7k8rHV2NmT/DAzSi+jtk/ENO7HhzKEgIlLndISQvR+KS1h9joMigxYDAM9C7tFoc4mIxEwFIf8CeN8wN8qUioKIyCiigpDZGqx5+Ns17DLyWUREAlJBaNoXUh0M63JK66lYqjWySCIiISS+IJhlsA1ugKaDat8o/+foAomIBFK1IJjZ+8zsQDNrX2N53QxZPRIsNRFrOwGo8a/+/icjzSMiEsKgBcHMTgNuAb4CzDWzowas/m7UwWKXnsLqPY2qSI2LNIqISAjVTpyfCHzA3bvNbCpwk5lNdfcfMQpnmbf0BnjzTOi7Cxiq11Hiz7SJyChU7Tdb2t27Adz9JWA/4FAzu4hRWBAAbNx3ofXTYC2UvsUxlRsW38Lzz8cZTUQkctUKwkIz22nVi3JxOAKYBLw/6mAhmDWSGnsOtuEcbKN50LDdIA0boPBavOFERCJWrSAcBywcuMDd8+5+HLBPpKkCM7PS0NbpyVQ8GPIsZLaNPZeISJQGvYbg7guqrPtLNHHqh+eegL4/sPbgdh
lo+TiW3jBELBGRyOjq6CB8xSUMOp7RmLNjzSIiEgcVhMHkn6683BqxYrRzO4uIhFDtPoStzWzvCss/ZGbviTZWHUhvWnm5FyEd/dzOIiJxq3aEcDGwosLy3vK6UcuLb0P+uQprDFqOxawl9kwiIlGrVhCmuvsTay50905gamSJ6oD3/Aa8UGFNBpr3jz2PiEgcqhWEamNCj+4/kQvPUPlu5QxWeDnuNCIisahWEB42sxPXXGhmJwCPRBepDmTeT+WaV8Qz0+JOIyISi2pjGZ0B3Gxmx/JuAZgBNAIfjTpYSNbyEXzlj6GYpTSd5io5WHoGvsE1WGaLUPFERCIx6BGCu7/p7nsB5wMvlR/nu/ue7r5wsO1GA0u1w8SbKtyNXAR/HV9yPO5r3rAmIrJ+q9bttNnMzgCOBvqBy939vpHcuZnNNLNnzOx5M/v6SH7tdeH5V2Dp/4H8/MoNigvw3GPxhhIRiVi1U0bXADngT8ChwLaUTiONCDNLA5cBBwMLKF2zuNXdnxqpfQA8P+dF/nDF3Sx7awV7zdqVfT+5Jw2NDYO2dy/iS46D4kJWP120WivovR0aNa+yiIwe1QrCdu7+fgAzuxJ4aIT3vRvwvLv/o7yPG4CjgBErCHf87D4uPfVKcv15ioUiD9/+GLdcdgf/df/5NDYNUhRyneDLGLwYlBVeHKmYIiJ1oVovo9yqJ+5e41RiwzIZeHXA6wXlZSOit7uXS79yJdnefoqF0i/3vpVZXnzyFe79xQODb1hcXMNXN0hvNjJBRUTqRLWCsKOZLS8/VgDTVz03s+UjsO9Kk+ysdaXWzE4ys04z6+zqqn0MoacefJZMZu0DoGxPlvt/WWWw1oadwXODrwegCWv9TM1ZRETWB9V6GaXdfWz5McbdMwOejx2BfS8ABv6ZPQV4vUKOK9x9hrvP6OiofQyhlvZmil75tE/7+LZBt7P0xtB6bHnWtHeWlh7WCjYOG38R1qD7EURkdKl2DSFqDwPTzGxL4DXgGGDE/ux+3+7TaBvbSu+K1e84bm5t4oiTP1x1WxvzdWjcBV/5c/BuaD4UmvbHKEBmWmnyHBGRUSbYbzZ3z5vZqcCdQBq4yt3njdTXT6VSfPf2czn74P8g29MPQC6X51NnH8XOB7w7A+jSRcu465rZvDxvAZOnbcyBx+7DxlM3xJsOwBp3BxuLmUYJF5HRz9anG6xmzJjhnZ2dw9qmkC/w+P3zWLF0JdP32ZYJG40HwN25+hs3cMP3bqZYfPc9aGrLcOm9k9h88weAPKTGwZhzSLUcOZLfiohIbMzsEXefMVS7UX/uI51Js8tB09da/tdbHubGC3+3WjEA+MKZL7PhpMd4p9tp8S1Ydi6eGoc1jeqppEUk4RJ7LuTmS24jl129N1Fjc5HDPreY5tY1L0b34d3/HV84EZEAElsQVizpXmvZuIn5Ch1fy/KvDrJCRGR0SGxB2Osju5JKr/7tL1nUQD5f6fYIwJdS7F9rviARkVEjsQVhq+lT8eLqp4YKeeN/f7Ax7oPcM7fkWLy49pGFiMhokNiCcNNFv6NSB6vbr9u0ylZZfOU1kWUSEQkpkQVh5bKVPP23ZyuuSzekcW8cfOPsPRGlEhEJK5EFYfYNf8XSlb/1XDaPN1W5k9mqTTUtIrL+SmRBePOVLgq5QsV1Xizyq8t3AVorb5x7nGL3j6MLJyISSCILwra7T6OlvfJf+oV8keu/fzs3/uwksE0qtMhD9//g/XOiDSkiErNEFoTdD9+FKe/dhFS6chfTbE+Wq867nx/+Wwf92Uo3c/fhvb+ONqSISMwSWRDS6TRnXnMqZoPccwAUi07P8mXk+isNoe3gfRWWi4isvxJZEACeeegFGgabRrPssT+1k8lU6JtqrVjzYRElExEJI7EFYezEdiw1+BECwIq3M/z4vMlke438qmGPrBUaPwRN+0YfUkQkRqN+tNPB7PDBbcj2ZIdsd8d1G/DUw2185Ev9HH7CnljzwdC4V9
XTTSIi66PEFoTb/t+9pDJpioX8kG1fea6ZeY8dzJHjToshmYhIGIk9ZfTHGx8knx26GEDp7uVt93hvxIlERMJKbEFoHdtSc9t0Js2Hjt49wjQiIuEltiDM+pdDhmyTaczQ2NLIWVefysSNJ8SQSkQknMReQ9hj1q5Dtpl5/P6c8N1jaR/fFkMiEZGwElcQli9ewcWnXMFfb3m4ekODjadupGIgIomRqIJQLBb56r7f4LXn3hh0cLt3OEyaotNEIpIcibqG8Pj983jz5S7yQxWDsivO/Dl9NdyrICIyGgQpCGb2CTObZ2ZFM5sR135ffeZ1sj39NbdfuayH2df/OcJEIiL1I9QRwlzgY8ADce/YK82bOYhsTz9PP/R8hGlEROpHkGsI7j4fiH34Byvvs9ai0NTayObbTo42lIhInaj7awhmdpKZdZpZZ1dX1zp9rcnTNqGpranm9rlsnv5sjlx/bujGIiLrucgKgpndY2ZzKzyOGs7Xcfcr3H2Gu8/o6OhYp0w7HbADHVM2IJWp7dsuFopc+62bOPew7w7rVJOIyPoosoLg7ge5+w4VHrdEtc+hpFIpLvrj+Wyy1UY1b5Pt6Wf+Q88zZ/bcCJOJiIRX96eMRloqnSLfX9ugdqv0rexj7p+fjiiRiEh9CNXt9KNmtgDYE/iDmd0Zx36LxSJf2++bvPnS8K5FNLc0MXHj8RGlEhGp7OE7HuPkXc5k1rjjOOUDZ/HwnXMi3V+QguDuN7v7FHdvcveN3H3okeZGwOP3zxt2MYDSUcW+n9wrgkQiIpX99daHOf/oC3lhzkv0rujl+cde5PyjL+DB33VGts9EnTJ69enXKeSLNbdPN6TZcPNJ/Ofd52lMIxGJ1RVn/pxs7+o30mZ7+rnizP+NbJ+JKghbbD+FdI09jKB0ZHDOtafzvt2mRZhKRGRtrz+/sOLy1wZZPhISVRCm77Mdk6dtgqVquyEul81x33V/ijiViMjaJmw0ruLyKK9nJqogmBkX3vdNttxh89o2cOi86/FoQ4mIVHDseR+nqXX1G2mbWpv47Hkfj2yfiSoIAG3j2jj98hNrbt/16mIWPPdGhIlERNZ25Mkf5ovfPob28W1kGjO0j2/j+O8cw+EnHRzZPhM1H8Iq2+7xXprbm+jrHnpo60xjmhefeJkp0zaJIZmISImZcfQZR/DR0w6jZ3kvrWNbSKWi/Rs+kQWhZ3nP0BPklPV1Z2kb1xpxIhGRylKpVGy9HBN3yghgycK3yTTUWAutdP+CiMhol8iCsNEWHRQKNd6P4PDovU9GG0hEpA4ksiD09WTxWgsCpe6nIiKjXSILwtXn3UBuGAPcFYu1Fw8RkfVVIgvCfdcNb57kKdM2jSiJiEj9SFxBKBQK9Kzorbl9Q3MDR//rEREmEhGpD4krCPMffJZUjUNXABx+0sFsv9c2ESYSEakPiSsIPSv6aGhqqLn9rZfezuxf/iXCRCIi9SFxBWGbXbeiv6/2XkPFonPpV67UhWURGfUSVxB+8a1fY1b7KSOAFUu6WbGkO6JEIiL1IVEFoVAocPtP76WQr23YincYtIxpiSaUiEidSFRByGXzw7r/YJUPHb0HjcO47iAisj5KVEFobm1ig00nDGubptZGPnbaYfT39Q/dWERkPZaogpDtzbJiycrhbdPTz9kf/jZHb3gCd/zsvoiSiYiEl6iC8OCtndg/8R1ne7L0dfdx6Veu5KkHnxn5YCIidSBIQTCzC8zsaTN7wsxuNrPoJgkd4O2u5RRy/3z30f7efn5zyW0jmEhEpH6EOkK4G9jB3acDzwLnxLHTHffbnmH2OF2NO7y1YMnIBRIRqSNBCoK73+Xuq7r7/A2YEsd+t9xhc/b5xJ40tTb+U9s3tjSwx+G7jHAqEZH6UA/XEI4Hbo9rZ/921b/w2fM+QSo9vEOFhqYGJmw4niNP+XBEyUREwoqsIJjZPWY2t8LjqAFtzgXywLVVvs5JZtZpZp1dXV3rnCuVSvGx0w
+jqaWp5m0mTZnIMV//CJc/+gPaxsUzt6mISNzM3cPs2OzzwMnAge7eU8s2M2bM8M7OzhHZ/62X38mlX7kSLw79/Y/fcCw3LrxyRPYrIhI3M3vE3WcM1S5UL6OZwNnArFqLwUibdcoh7DlryPcHgLcXLSfbm404kYhIWKGuIVwKjAHuNrM5ZvaTECG+dMFxZBozNbXt7e6LOI2ISFihehlt7e6buftO5cfJIXIsfXNZzd1Q7/zZ7GjDiIgEVg+9jIL53eV3kq9xsLs7rlJBEJHRLdEF4e2u5dR6TV0T5IjIaJfogpBO1/7tt41rpVAY5jwKIiLrkcQWhIUvLeKx+56suf3LT73Kry64JcJEIiJhJbYgPHLX4zWfLgLo781x8480sJ2IjF6JLQgtY1ooFoZ3XaD77SC3TIiIxCKxBWHPIz8w7JFPt9vzvdGEERGpA4ktCC3tLXz5kuNrapvOpGhpb+aUH34h2lAiIgHVdpvuKHXkyYdQyBe57PSrYJDrCVtO35z37TaNT511FJO33iTegCIiMQo2uN0/YyQHt1ulryfLke2fHXT9da/8hI4pG4zoPkVE4lTXg9vVkzdf7sIGuZhgZnQv7Y45kYhIGIkvCJM2nQCDXFx2d5rbm+MNJCISSOILQtu4NvY/Zu9B15+881m83bUsxkQiImEkviBAaVrNhuaGiut6VvRw64/vjDmRiEj8VBAALzr5bG6QlfDEH5+KN5CISAAqCEBDUwOtY1sHXb/p1hvHmEZEJAwVBEq9iT511lEV1zU0Zjj6jMNjTiQiEj8VBEpzHdzziwdIpVbvbmQG5/7yq2yx3WaBkomIxEcFAXjs3ifpWrCYYnH1m/SaWptYtkg9jEQkGVQQgFeffp18/9qT3/StzPKPJ18JkEhEJH4qCMDUHTYj05Bea3lzWzNb7zQ1/kAiIgGoIAA77rc9m269MQ2N7471l0qnaB3bwn5VbloTERlNVBAo9TL6r9n/zkHH7UtzWxONzQ3s/ZFdueyh79Hc2hQ6nohILIKMdmpm3wKOAorAIuAL7v76UNtFMdqpiMhoV++jnV7g7tPdfSfg98A3AuUQEZGyIAXB3ZcPeNnGoNPTiIhIXILNmGZm3wGOA5YB+4fKISIiJZEdIZjZPWY2t8LjKAB3P9fdNwOuBU6t8nVOMrNOM+vs6uqKKq6ISOIFn0LTzLYA/uDuOwzVVheVRUSGr64vKpvZtAEvZwFPh8ghIiLvCtXt9NfANpS6nb4MnOzur9WwXVe5fVwmAW/FuL9a1GMmUK7hqMdMoFzDUY+ZYPBcW7h7x1AbBz9lVM/MrLOWw6w41WMmUK7hqMdMoFzDUY+ZYN1z6U5lEREBVBBERKRMBaG6K0IHqKAeM4FyDUc9ZgLlGo56zATrmEvXEEREBNARgoiIlKkgVGBmM83sGTN73sy+HjDHVWa2yMzmDlg20czuNrPnyv9OCJBrMzObbWbzzWyemZ0eOpuZNZvZQ2b2eDnT+eXlW5rZ38uZfmlmjXFlWiNf2sweM7Pf10suM3vJzJ40szlm1lleFvTzZWbjzewmM3u6/Pnasw4ybVN+j1Y9lpvZGXWQ61/Ln/W5ZnZ9+f/AOn2uVBDWYGZp4DLgUGA74NNmtl2gOFcDM9dY9nXgXnefBtxbfh23PPA1d98W2AP4cvk9CpktCxzg7jsCOwEzzWwP4D+BH5YzLQVOiDHTQKcD8we8rpdc+7v7TgO6Kob+fP0IuMPd3wfsSOk9C5rJ3Z8pv0c7AR8AeoCbQ+Yys8nAacCM8igPaeAY1vVz5e56DHgAewJ3Dnh9DnBOwDxTgbkDXj8DbFJ+vgnwTB28Z7cAB9dLNqAVeBTYndJNOplKP9sY80yh9AvjAErDvVud5HoJmLTGsmA/Q2As8CLla5v1kKlCxg8DfwmdC5gMvApMpDRI6e+BQ9b1c6UjhLWteqNXWVBeVi82cvc3AMr/bhgyjJlNBXYG/k7gbOXTMnMoTbp0N/AC8L
a758tNQv0sLwbOonRnPsAGdZLLgbvM7BEzO6m8LOTPcCugC/hZ+fTaT82sLXCmNR0DXF9+HiyXl0Z2uBB4BXiD0qjRj7COnysVhLVZhWXqilWBmbUDvwbO8NXnuAjC3QteOqyfAuwGbFupWZyZzOwIYJG7PzJwcYWmIT5je7v7LpROj37ZzPYJkGGgDLALcLm77wysJMwp0YrK5+NnATfWQZYJlGad3BLYlNK8ModWaDqsz5UKwtoWAJsNeD0FGHJ6zxi9aWabAJT/XRQihJk1UCoG17r7b+opm7u/DdxP6frGeDNbNe9HiJ/l3sAsM3sJuIHSaaOL6yAXXp621t0XUTonvhthf4YLgAXu/vfy65soFYi6+FxR+oX7qLu/WX4dMtdBwIvu3uXuOeA3wF6s4+dKBWFtDwPTylfrGykdIt4aONNAtwKfLz//PKXz97EyMwOuBOa7+0X1kM3MOsxsfPl5C6X/MPOB2cDHQ2QCcPdz3H2Ku0+l9Fm6z92PDZ3LzNrMbMyq55TOjc8l4M/Q3RcCr5rZNuVFBwJPhcy0hk/z7ukiCJvrFWAPM2st/39c9V6t2+cq1MWZen4AhwHPUjoHfW7AHNdTOj+Yo/TX0wmUzj/fCzxX/ndigFwfpHQo+gQwp/w4LGQ2YDrwWDnTXOAb5eVbAQ8Bz1M61G8K+PPcD/h9PeQq7//x8mPeqs956M8XpR5ineWf42+BCaEzlXO1AouBcQOWhX6vzqc0dcBc4OdA07p+rnSnsoiIADplJCIiZSoIIiICqCCIiEiZCoKIiAAqCCIiUqaCIFKFmRXKI1zONbMbzay1vHxjM7vBzF4ws6fM7DYze2+F7dcasVakXqkgiFTX66WRLncA+oGTyzcC3Qzc7+7vcfftgP8LbFRh+6tZe8RakbqUGbqJiJT9idINcPsDOXf/yaoV7j6n0gbu/kB5AECRuqcjBJEalMeHORR4EtiB0siSIqOKCoJIdS3lIbU7KY0fc2XgPCKR0Skjkep6vTSk9jvMbB7vDiAmMmroCEFk+O4DmszsxFULzGxXM9s3YCaRdaaCIDJMXhoR8qPAweVup/OAf6fC2PNmdj3wILCNmS0ws1BzJ4sMSaOdiogIoCMEEREpU0FkLckYAAAAKklEQVQQERFABUFERMpUEEREBFBBEBGRMhUEEREBVBBERKRMBUFERAD4/++tJBXFbSkYAAAAAElFTkSuQmCC\n", | |
"text/plain": [ | |
"<matplotlib.figure.Figure at 0x7f20bad59630>" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"plt.scatter(features_pca[:, 0], features_pca[:, 1], c=true_labels)\n", | |
"plt.xlabel(\"PC 1\")\n", | |
"plt.ylabel(\"PC 2\")\n", | |
"plt.show()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"In this plot of the first two principal components of the covariance matrix, the outlier effect is more pronounced. The outlier has caused the entire plot to be rotated by 90 degrees relative to the previous one, so PC2 is now the best separator of the data." | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.6.3" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 2 | |
} |
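
The contrast the notebook draws between PCA on the correlation matrix and PCA on the covariance matrix can be sketched with NumPy alone. The block below is a minimal illustration on synthetic data, not on the bank note file: the helper `pca_loadings` and the three made-up columns are assumptions introduced here for the example, and the numbers it prints say nothing about the real measurements. It only shows that, without standardization, the feature with the largest variance tends to dominate PC1.

```python
import numpy as np


def pca_loadings(X, use_correlation=False):
    """Eigenvalues (descending) and eigenvectors (as columns) of the
    covariance or correlation matrix of X, where rows are samples."""
    if use_correlation:
        M = np.corrcoef(X, rowvar=False)
    else:
        M = np.cov(X, rowvar=False)
    # eigh is meant for symmetric matrices and returns ascending eigenvalues
    eigvals, eigvecs = np.linalg.eigh(M)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


# Hypothetical stand-in data: two low-variance columns and one
# high-variance column (NOT the Swiss bank note measurements).
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.normal(215.0, 0.4, n),
    rng.normal(130.0, 0.4, n),
    rng.normal(9.0, 1.5, n),
])

cov_vals, cov_vecs = pca_loadings(X)
corr_vals, corr_vecs = pca_loadings(X, use_correlation=True)

# Scores: project centered data (covariance PCA) or
# standardized data (correlation PCA) onto the eigenvectors.
Xc = X - X.mean(axis=0)
scores_cov = Xc @ cov_vecs
scores_corr = (Xc / X.std(axis=0, ddof=1)) @ corr_vecs

print("covariance PCA, share of variance per PC: ", cov_vals / cov_vals.sum())
print("correlation PCA, share of variance per PC:", corr_vals / corr_vals.sum())
print("|PC1 loadings| under covariance PCA:      ", np.abs(cov_vecs[:, 0]))
```

With real data, the same two calls could be applied to the feature matrix to compare which original variables carry the largest loadings under each scaling before attributing a change in the plot to a single cause.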