Created
March 23, 2022 11:47
Lab_13_intro_to_ML_regression_part_II.ipynb
{ | |
"nbformat": 4, | |
"nbformat_minor": 0, | |
"metadata": { | |
"colab": { | |
"name": "Lab_13_intro_to_ML_regression_part_II.ipynb", | |
"provenance": [], | |
"authorship_tag": "ABX9TyMHlW9igur20r8kmGAUdbdr", | |
"include_colab_link": true | |
}, | |
"kernelspec": { | |
"name": "python3", | |
"display_name": "Python 3" | |
}, | |
"language_info": { | |
"name": "python" | |
} | |
}, | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "view-in-github", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"<a href=\"https://colab.research.google.com/gist/jamm1985/d423d4820e835ae640e0a09ec853d024/lab_13_intro_to_ml_regression_part_ii.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"source": [ | |
"Lab video: https://youtu.be/A3LE-ZmtVGs\n", | |
"\n", | |
"TG: https://t.me/data_science_news\n", | |
"\n", | |
"\n", | |
"\n", | |
"---" | |
], | |
"metadata": { | |
"id": "DWH294RVmYHR" | |
} | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"!pip install -U tensorflow-addons" | |
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "zfPIoGDkN_O6", | |
"outputId": "e3b82070-faff-4f22-c30b-c182dda278e3" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Requirement already satisfied: tensorflow-addons in /usr/local/lib/python3.7/dist-packages (0.16.1)\n", | |
"Requirement already satisfied: typeguard>=2.7 in /usr/local/lib/python3.7/dist-packages (from tensorflow-addons) (2.7.1)\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"id": "dO65TOXRqVrK", | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"outputId": "ec1e8b31-a762-47f2-a24f-761d4cfe79b5" | |
}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"2.8.0\n" | |
] | |
} | |
], | |
"source": [ | |
"import pandas as pd\n", | |
"import numpy as np\n", | |
"import matplotlib.pylab as plt\n", | |
"\n", | |
"from sklearn.linear_model import LinearRegression\n", | |
"from sklearn.model_selection import cross_val_score\n", | |
"from sklearn.model_selection import train_test_split\n", | |
"from sklearn.metrics import mean_squared_error\n", | |
"\n", | |
"from sklearn.preprocessing import PolynomialFeatures\n", | |
"from sklearn.model_selection import KFold\n", | |
"\n", | |
"import tensorflow as tf\n", | |
"from tensorflow import keras\n", | |
"from tensorflow.keras import layers\n", | |
"import tensorflow_addons as tfa\n", | |
"\n", | |
"print(tf.__version__)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"source": [ | |
"## Part one (regression)\n", | |
"[Lab video](https://youtu.be/r-z1cjvpwBE)\n", | |
"- Cross-validation\n", | |
"- Regularization\n", | |
"- Objective function\n", | |
"- Feature engineering\n", | |
"\n", | |
"## Part two (regression)\n", | |
"- Feature engineering\n", | |
"- Model assessment\n", | |
"- Model selection" | |
], | |
"metadata": { | |
"id": "o1qNRLQ-FaTQ" | |
} | |
}, | |
{ | |
"cell_type": "markdown", | |
"source": [ | |
"# The bias-variance trade-off\n", | |
"\n", | |
"Let the MSE of an estimator $\\hat{\\theta}$ of a parameter $\\theta$ be defined as $\\mathrm{MSE}(\\hat{\\theta})=E_\\theta[(\\hat{\\theta}-\\theta)^2]$\n", | |
"\n", | |
"The MSE of $\\hat{\\theta}$ can be decomposed into a variance term and a squared bias term:\n", | |
"\n", | |
"$\\mathrm{MSE}(\\hat{\\theta})=E_\\theta[(\\hat{\\theta}-\\theta)^2]=...=E_\\theta[(\\hat{\\theta}-E_\\theta[\\hat{\\theta}])^2] + (E_\\theta[\\hat{\\theta}]-\\theta)^2=\\mathrm{Var}_\\theta(\\hat{\\theta})+\\mathrm{Bias}(\\hat{\\theta},\\theta)^2$\n", | |
"\n", | |
"**In a regression problem**, in general form for the model $y=f(x)+\\epsilon$:\n", | |
"\n", | |
"$E_{D,\\epsilon}[(y-\\hat{f}(x;D))^2]=\\mathrm{Var}_D(\\hat{f}(x;D))+\\mathrm{Bias}_D[\\hat{f}(x;D)]^2+\\sigma^2$\n", | |
"\n", | |
"where\n", | |
"\n", | |
"$\\mathrm{Bias}_D[\\hat{f}(x;D)]=E_D[\\hat{f}(x;D)]-f(x)$\n", | |
"\n", | |
"$\\mathrm{Var}_D(\\hat{f}(x;D))=E_D[(E_D[\\hat{f}(x;D)]-\\hat{f}(x;D))^2]$\n", | |
"\n", | |
"$\\epsilon$ is the noise term, with $E[\\epsilon]=0$ and $\\mathrm{Var}(\\epsilon)=\\sigma^2$\n", | |
"\n", | |
"$D=\\{\\{x_1,y_1\\},\\{x_2,y_2\\},...,\\{x_n,y_n\\}\\}$ is a sample from the joint distribution $f_{X,Y}(x,y)$\n", | |
"\n", | |
"**On a dataset:** $\\mathrm{MSE} = \\frac{1}{n}\\sum_{i=1}^n(y_i-\\hat{y}_i)^2$" | |
], | |
"metadata": { | |
"id": "-rM1uNGG-Yb2" | |
} | |
}, | |
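{ | |
"cell_type": "markdown", | |
"source": [ | |
"A sketch of the steps elided by the dots in the estimator decomposition above, under the usual assumption that the expectations exist. The shorthand $m=E_\\theta[\\hat{\\theta}]$ is introduced here for brevity and is not part of the original notation.\n", | |
"\n", | |
"$$(\\hat{\\theta}-\\theta)^2=(\\hat{\\theta}-m)^2+2(\\hat{\\theta}-m)(m-\\theta)+(m-\\theta)^2$$\n", | |
"\n", | |
"Taking $E_\\theta$ of both sides, the cross term vanishes because $m-\\theta$ is a constant and $E_\\theta[\\hat{\\theta}-m]=0$:\n", | |
"\n", | |
"$$E_\\theta[(\\hat{\\theta}-\\theta)^2]=E_\\theta[(\\hat{\\theta}-m)^2]+(m-\\theta)^2$$\n", | |
"\n", | |
"The first term is the variance of $\\hat{\\theta}$ and the second is the square of its bias $m-\\theta$." | |
], | |
"metadata": {} | |
}, | |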
{ | |
"cell_type": "markdown", | |
"source": [ | |
"Figure: bias and variance contributing to total error. Source: https://en.wikipedia.org/wiki/Bias–variance_tradeoff#/media/File:Bias_and_variance_contributing_to_total_error.svg" | |
], | |
"metadata": { | |
"id": "4i1q1jOOEybZ" | |
} | |
}, | |
{ | |
"cell_type": "markdown", | |
"source": [ | |
"# Exploratory data analysis\n", | |
"\n", | |
"[Dataset](https://github.com/nguyen-toan/ISLR/blob/master/dataset/Advertising.csv)\n", | |
"\n", | |
"[Book](http://www-bcf.usc.edu/~gareth/ISL/)\n", | |
"\n", | |
"[Simple to Multiple and Polynomial Regression in R](https://www.kaggle.com/code/pranjalpandey12/simple-to-multiple-and-polynomial-regression-in-r/data)\n", | |
"\n" | |
], | |
"metadata": { | |
"id": "tW_KgZOLHSbX" | |
} | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"!wget https://raw.githubusercontent.com/nguyen-toan/ISLR/master/dataset/Advertising.csv\n", | |
"!head Advertising.csv" | |
], | |
"metadata": { | |
"id": "1Nl1rZOFtjWm", | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"outputId": "6debe33e-5a41-4b89-9888-46be89c93c81" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"--2022-03-23 05:32:20-- https://raw.githubusercontent.com/nguyen-toan/ISLR/master/dataset/Advertising.csv\n", | |
"Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.108.133, ...\n", | |
"Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.\n", | |
"HTTP request sent, awaiting response... 200 OK\n", | |
"Length: 5166 (5.0K) [text/plain]\n", | |
"Saving to: ‘Advertising.csv.1’\n", | |
"\n", | |
"\rAdvertising.csv.1 0%[ ] 0 --.-KB/s \rAdvertising.csv.1 100%[===================>] 5.04K --.-KB/s in 0s \n", | |
"\n", | |
"2022-03-23 05:32:20 (51.3 MB/s) - ‘Advertising.csv.1’ saved [5166/5166]\n", | |
"\n", | |
"\"\",\"TV\",\"Radio\",\"Newspaper\",\"Sales\"\n", | |
"\"1\",230.1,37.8,69.2,22.1\n", | |
"\"2\",44.5,39.3,45.1,10.4\n", | |
"\"3\",17.2,45.9,69.3,9.3\n", | |
"\"4\",151.5,41.3,58.5,18.5\n", | |
"\"5\",180.8,10.8,58.4,12.9\n", | |
"\"6\",8.7,48.9,75,7.2\n", | |
"\"7\",57.5,32.8,23.5,11.8\n", | |
"\"8\",120.2,19.6,11.6,13.2\n", | |
"\"9\",8.6,2.1,1,4.8\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"DATA = pd.read_csv('Advertising.csv')\n", | |
"DATA" | |
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 423 | |
}, | |
"id": "n2b4Zci3IshW", | |
"outputId": "17daf468-ce37-4e94-e6dc-f1c9b6a47d07" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
" Unnamed: 0 TV Radio Newspaper Sales\n", | |
"0 1 230.1 37.8 69.2 22.1\n", | |
"1 2 44.5 39.3 45.1 10.4\n", | |
"2 3 17.2 45.9 69.3 9.3\n", | |
"3 4 151.5 41.3 58.5 18.5\n", | |
"4 5 180.8 10.8 58.4 12.9\n", | |
".. ... ... ... ... ...\n", | |
"195 196 38.2 3.7 13.8 7.6\n", | |
"196 197 94.2 4.9 8.1 9.7\n", | |
"197 198 177.0 9.3 6.4 12.8\n", | |
"198 199 283.6 42.0 66.2 25.5\n", | |
"199 200 232.1 8.6 8.7 13.4\n", | |
"\n", | |
"[200 rows x 5 columns]" | |
], | |
"text/html": [ | |
"\n", | |
" <div id=\"df-9ac026aa-816f-48c3-b216-94322ee2d6aa\">\n", | |
" <div class=\"colab-df-container\">\n", | |
" <div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Unnamed: 0</th>\n", | |
" <th>TV</th>\n", | |
" <th>Radio</th>\n", | |
" <th>Newspaper</th>\n", | |
" <th>Sales</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>1</td>\n", | |
" <td>230.1</td>\n", | |
" <td>37.8</td>\n", | |
" <td>69.2</td>\n", | |
" <td>22.1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>2</td>\n", | |
" <td>44.5</td>\n", | |
" <td>39.3</td>\n", | |
" <td>45.1</td>\n", | |
" <td>10.4</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>3</td>\n", | |
" <td>17.2</td>\n", | |
" <td>45.9</td>\n", | |
" <td>69.3</td>\n", | |
" <td>9.3</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>4</td>\n", | |
" <td>151.5</td>\n", | |
" <td>41.3</td>\n", | |
" <td>58.5</td>\n", | |
" <td>18.5</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>5</td>\n", | |
" <td>180.8</td>\n", | |
" <td>10.8</td>\n", | |
" <td>58.4</td>\n", | |
" <td>12.9</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>...</th>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>195</th>\n", | |
" <td>196</td>\n", | |
" <td>38.2</td>\n", | |
" <td>3.7</td>\n", | |
" <td>13.8</td>\n", | |
" <td>7.6</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>196</th>\n", | |
" <td>197</td>\n", | |
" <td>94.2</td>\n", | |
" <td>4.9</td>\n", | |
" <td>8.1</td>\n", | |
" <td>9.7</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>197</th>\n", | |
" <td>198</td>\n", | |
" <td>177.0</td>\n", | |
" <td>9.3</td>\n", | |
" <td>6.4</td>\n", | |
" <td>12.8</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>198</th>\n", | |
" <td>199</td>\n", | |
" <td>283.6</td>\n", | |
" <td>42.0</td>\n", | |
" <td>66.2</td>\n", | |
" <td>25.5</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>199</th>\n", | |
" <td>200</td>\n", | |
" <td>232.1</td>\n", | |
" <td>8.6</td>\n", | |
" <td>8.7</td>\n", | |
" <td>13.4</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"<p>200 rows × 5 columns</p>\n", | |
"</div>\n", | |
" <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-9ac026aa-816f-48c3-b216-94322ee2d6aa')\"\n", | |
" title=\"Convert this dataframe to an interactive table.\"\n", | |
" style=\"display:none;\">\n", | |
" \n", | |
" <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n", | |
" width=\"24px\">\n", | |
" <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n", | |
" <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n", | |
" </svg>\n", | |
" </button>\n", | |
" \n", | |
" <style>\n", | |
" .colab-df-container {\n", | |
" display:flex;\n", | |
" flex-wrap:wrap;\n", | |
" gap: 12px;\n", | |
" }\n", | |
"\n", | |
" .colab-df-convert {\n", | |
" background-color: #E8F0FE;\n", | |
" border: none;\n", | |
" border-radius: 50%;\n", | |
" cursor: pointer;\n", | |
" display: none;\n", | |
" fill: #1967D2;\n", | |
" height: 32px;\n", | |
" padding: 0 0 0 0;\n", | |
" width: 32px;\n", | |
" }\n", | |
"\n", | |
" .colab-df-convert:hover {\n", | |
" background-color: #E2EBFA;\n", | |
" box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n", | |
" fill: #174EA6;\n", | |
" }\n", | |
"\n", | |
" [theme=dark] .colab-df-convert {\n", | |
" background-color: #3B4455;\n", | |
" fill: #D2E3FC;\n", | |
" }\n", | |
"\n", | |
" [theme=dark] .colab-df-convert:hover {\n", | |
" background-color: #434B5C;\n", | |
" box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n", | |
" filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n", | |
" fill: #FFFFFF;\n", | |
" }\n", | |
" </style>\n", | |
"\n", | |
" <script>\n", | |
" const buttonEl =\n", | |
" document.querySelector('#df-9ac026aa-816f-48c3-b216-94322ee2d6aa button.colab-df-convert');\n", | |
" buttonEl.style.display =\n", | |
" google.colab.kernel.accessAllowed ? 'block' : 'none';\n", | |
"\n", | |
" async function convertToInteractive(key) {\n", | |
" const element = document.querySelector('#df-9ac026aa-816f-48c3-b216-94322ee2d6aa');\n", | |
" const dataTable =\n", | |
" await google.colab.kernel.invokeFunction('convertToInteractive',\n", | |
" [key], {});\n", | |
" if (!dataTable) return;\n", | |
"\n", | |
" const docLinkHtml = 'Like what you see? Visit the ' +\n", | |
" '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n", | |
" + ' to learn more about interactive tables.';\n", | |
" element.innerHTML = '';\n", | |
" dataTable['output_type'] = 'display_data';\n", | |
" await google.colab.output.renderOutput(dataTable, element);\n", | |
" const docLink = document.createElement('div');\n", | |
" docLink.innerHTML = docLinkHtml;\n", | |
" element.appendChild(docLink);\n", | |
" }\n", | |
" </script>\n", | |
" </div>\n", | |
" </div>\n", | |
" " | |
] | |
}, | |
"metadata": {}, | |
"execution_count": 54 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"DATA = DATA.drop(columns=['Unnamed: 0'])" | |
], | |
"metadata": { | |
"id": "Rnd4OuIEJTll" | |
}, | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"DATA.describe()" | |
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 300 | |
}, | |
"id": "EM8lElX_I6hc", | |
"outputId": "f8145350-dab5-4d20-8485-2d0d94f5f4b2" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
" TV Radio Newspaper Sales\n", | |
"count 200.000000 200.000000 200.000000 200.000000\n", | |
"mean 147.042500 23.264000 30.554000 14.022500\n", | |
"std 85.854236 14.846809 21.778621 5.217457\n", | |
"min 0.700000 0.000000 0.300000 1.600000\n", | |
"25% 74.375000 9.975000 12.750000 10.375000\n", | |
"50% 149.750000 22.900000 25.750000 12.900000\n", | |
"75% 218.825000 36.525000 45.100000 17.400000\n", | |
"max 296.400000 49.600000 114.000000 27.000000" | |
], | |
"text/html": [ | |
"\n", | |
" <div id=\"df-b0b1d2ee-60de-4453-a678-95e989e6f0d7\">\n", | |
" <div class=\"colab-df-container\">\n", | |
" <div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>TV</th>\n", | |
" <th>Radio</th>\n", | |
" <th>Newspaper</th>\n", | |
" <th>Sales</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>count</th>\n", | |
" <td>200.000000</td>\n", | |
" <td>200.000000</td>\n", | |
" <td>200.000000</td>\n", | |
" <td>200.000000</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>mean</th>\n", | |
" <td>147.042500</td>\n", | |
" <td>23.264000</td>\n", | |
" <td>30.554000</td>\n", | |
" <td>14.022500</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>std</th>\n", | |
" <td>85.854236</td>\n", | |
" <td>14.846809</td>\n", | |
" <td>21.778621</td>\n", | |
" <td>5.217457</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>min</th>\n", | |
" <td>0.700000</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>0.300000</td>\n", | |
" <td>1.600000</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>25%</th>\n", | |
" <td>74.375000</td>\n", | |
" <td>9.975000</td>\n", | |
" <td>12.750000</td>\n", | |
" <td>10.375000</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>50%</th>\n", | |
" <td>149.750000</td>\n", | |
" <td>22.900000</td>\n", | |
" <td>25.750000</td>\n", | |
" <td>12.900000</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>75%</th>\n", | |
" <td>218.825000</td>\n", | |
" <td>36.525000</td>\n", | |
" <td>45.100000</td>\n", | |
" <td>17.400000</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>max</th>\n", | |
" <td>296.400000</td>\n", | |
" <td>49.600000</td>\n", | |
" <td>114.000000</td>\n", | |
" <td>27.000000</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>\n", | |
" <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-b0b1d2ee-60de-4453-a678-95e989e6f0d7')\"\n", | |
" title=\"Convert this dataframe to an interactive table.\"\n", | |
" style=\"display:none;\">\n", | |
" \n", | |
" <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n", | |
" width=\"24px\">\n", | |
" <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n", | |
" <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n", | |
" </svg>\n", | |
" </button>\n", | |
" \n", | |
" <style>\n", | |
" .colab-df-container {\n", | |
" display:flex;\n", | |
" flex-wrap:wrap;\n", | |
" gap: 12px;\n", | |
" }\n", | |
"\n", | |
" .colab-df-convert {\n", | |
" background-color: #E8F0FE;\n", | |
" border: none;\n", | |
" border-radius: 50%;\n", | |
" cursor: pointer;\n", | |
" display: none;\n", | |
" fill: #1967D2;\n", | |
" height: 32px;\n", | |
" padding: 0 0 0 0;\n", | |
" width: 32px;\n", | |
" }\n", | |
"\n", | |
" .colab-df-convert:hover {\n", | |
" background-color: #E2EBFA;\n", | |
" box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n", | |
" fill: #174EA6;\n", | |
" }\n", | |
"\n", | |
" [theme=dark] .colab-df-convert {\n", | |
" background-color: #3B4455;\n", | |
" fill: #D2E3FC;\n", | |
" }\n", | |
"\n", | |
" [theme=dark] .colab-df-convert:hover {\n", | |
" background-color: #434B5C;\n", | |
" box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n", | |
" filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n", | |
" fill: #FFFFFF;\n", | |
" }\n", | |
" </style>\n", | |
"\n", | |
" <script>\n", | |
" const buttonEl =\n", | |
" document.querySelector('#df-b0b1d2ee-60de-4453-a678-95e989e6f0d7 button.colab-df-convert');\n", | |
" buttonEl.style.display =\n", | |
" google.colab.kernel.accessAllowed ? 'block' : 'none';\n", | |
"\n", | |
" async function convertToInteractive(key) {\n", | |
" const element = document.querySelector('#df-b0b1d2ee-60de-4453-a678-95e989e6f0d7');\n", | |
" const dataTable =\n", | |
" await google.colab.kernel.invokeFunction('convertToInteractive',\n", | |
" [key], {});\n", | |
" if (!dataTable) return;\n", | |
"\n", | |
" const docLinkHtml = 'Like what you see? Visit the ' +\n", | |
" '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n", | |
" + ' to learn more about interactive tables.';\n", | |
" element.innerHTML = '';\n", | |
" dataTable['output_type'] = 'display_data';\n", | |
" await google.colab.output.renderOutput(dataTable, element);\n", | |
" const docLink = document.createElement('div');\n", | |
" docLink.innerHTML = docLinkHtml;\n", | |
" element.appendChild(docLink);\n", | |
" }\n", | |
" </script>\n", | |
" </div>\n", | |
" </div>\n", | |
" " | |
] | |
}, | |
"metadata": {}, | |
"execution_count": 56 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"source": [ | |
"# Simple linear regression" | |
], | |
"metadata": { | |
"id": "uC0zpZOTKawW" | |
} | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"X = DATA.loc[:, DATA.columns != 'Sales'].to_numpy()\n", | |
"y = DATA['Sales'].to_numpy()\n", | |
"print(X.shape)\n", | |
"print(y.shape)" | |
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "KRa_VszDKaYY", | |
"outputId": "950fad99-2c6f-4fe0-afea-1165d116c60b" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"(200, 3)\n", | |
"(200,)\n" | |
] | |
} | |
] | |
}, | |
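{ | |
"cell_type": "code", | |
"source": [ | |
"lin_reg_1 = LinearRegression()\n", | |
"# with no scoring argument, cross_val_score uses the estimator's own score, which is R^2 for LinearRegression\n", | |
"scores = cross_val_score(lin_reg_1, X, y, cv=5)\n", | |
"print(\"%0.2f R^2 with a standard deviation of %0.2f\" % (scores.mean(), scores.std()))\n", | |
"\n", | |
"# scoring='neg_mean_squared_error' returns the negated MSE, so the mean printed below is negative\n", | |
"scores = cross_val_score(lin_reg_1, X, y, cv=5, scoring='neg_mean_squared_error')\n", | |
"print(\"%0.2f MSE with a standard deviation of %0.2f\" % (scores.mean(), scores.std()))" | |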
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "dqCkpqGlI-cF", | |
"outputId": "969ec98d-0751-4f83-b585-b6fae2534cdf" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"0.89 R^2 with a standard deviation of 0.04\n", | |
"-3.07 MSE with a standard deviation of 1.28\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"source": [ | |
"# Polynomial regression" | |
], | |
"metadata": { | |
"id": "COMxvBd_Lkxo" | |
} | |
}, | |
{ | |
"cell_type": "markdown", | |
"source": [ | |
"Suppose we have a set of observations $X$:\n", | |
"\n", | |
"$$\\mathbf{X}=\\left[ \\begin{matrix} 1 & x_{11} & x_{12} \\\\ 1 & x_{21} & x_{22} \\\\ ... & ... & ... \\\\ 1 & x_{n1} & x_{n2} \\end{matrix} \\right]$$\n", | |
"\n", | |
"The dependent variable is ${Y}=\\left[ \\begin{matrix} y_1 \\\\ y_2 \\\\ ... \\\\ y_n \\end{matrix} \\right]$\n", | |
"\n", | |
"Example of a simple linear model with two features: $y_i=\\beta_0 + \\beta_1x_{i1} + \\beta_2x_{i2}$\n", | |
"\n", | |
"Example of a polynomial model of degree 2: $y_i=\\beta_0 + \\beta_1x_{i1} + \\beta_2x_{i2} + \\beta_3x_{i1}^2 + \\beta_4x_{i1}x_{i2} + \\beta_5x_{i2}^2$\n", | |
"\n", | |
"That is, $X$ takes the form (_feature engineering_):\n", | |
"\n", | |
"$$\\mathbf{X}=\\left[ \\begin{matrix} 1 & x_{11} & x_{12} & x_{11}^2 & x_{11}x_{12} & x_{12}^2 \\\\ 1 & x_{21} & x_{22} & x_{21}^2 & x_{21}x_{22} & x_{22}^2 \\\\ ... & ... & ... & ... & ... & ... \\\\ 1 & x_{n1} & x_{n2} & x_{n1}^2 & x_{n1}x_{n2} & x_{n2}^2 \\end{matrix} \\right]$$" | |
], | |
"metadata": { | |
"id": "EqESRagwzhK2" | |
} | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"# https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PolynomialFeatures.html\n", | |
"\n", | |
"poly = PolynomialFeatures(2, include_bias=False)\n", | |
"X_2 = poly.fit_transform(X)\n", | |
"\n", | |
"X.shape, X_2.shape" | |
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "JaW--o2_LHtm", | |
"outputId": "d1c6e321-708a-4c8a-9374-1282a736104a" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"((200, 3), (200, 9))" | |
] | |
}, | |
"metadata": {}, | |
"execution_count": 59 | |
} | |
] | |
}, | |
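{ | |
"cell_type": "markdown", | |
"source": [ | |
"A quick check of the feature count reported above, as a sketch: for $p$ input features and degree $d$, the number of monomials of total degree at most $d$ is $\\binom{p+d}{d}$, and `include_bias=False` drops the constant term. The symbols $p$ and $d$ are introduced here for illustration.\n", | |
"\n", | |
"$$\\binom{p+d}{d}-1=\\binom{3+2}{2}-1=10-1=9$$\n", | |
"\n", | |
"With $d=3$ the same formula gives $\\binom{6}{3}-1=19$ columns." | |
], | |
"metadata": {} | |
}, | |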
{ | |
"cell_type": "code", | |
"source": [ | |
"lin_reg_2 = LinearRegression()\n", | |
"scores = cross_val_score(lin_reg_2, X_2, y, cv=5)\n", | |
"print(\"%0.2f R^2 with a standard deviation of %0.2f\" % (scores.mean(), scores.std()))\n", | |
"\n", | |
"scores = cross_val_score(lin_reg_2, X_2, y, cv=5, scoring='neg_mean_squared_error')\n", | |
"print(\"%0.2f MSE with a standard deviation of %0.2f\" % (scores.mean(), scores.std()))" | |
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "MU3knNuZMAOJ", | |
"outputId": "1641fa22-1dd9-418a-befc-1e59c64ed570" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"0.98 R^2 with a standard deviation of 0.01\n", | |
"-0.44 MSE with a standard deviation of 0.39\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"poly_3 = PolynomialFeatures(3, include_bias=False)\n", | |
"X_3 = poly_3.fit_transform(X)\n", | |
"\n", | |
"print(X.shape, X_3.shape)\n", | |
"\n", | |
"lin_reg_3 = LinearRegression()\n", | |
"scores = cross_val_score(lin_reg_3, X_3, y, cv=5)\n", | |
"print(\"%0.2f R^2 with a standard deviation of %0.2f\" % (scores.mean(), scores.std()))\n", | |
"\n", | |
"scores = cross_val_score(lin_reg_3, X_3, y, cv=5, scoring='neg_mean_squared_error')\n", | |
"print(\"%0.2f MSE with a standard deviation of %0.2f\" % (scores.mean(), scores.std()))" | |
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "tkSUyoJMMOoo", | |
"outputId": "7ac71a71-030a-42b0-e337-e155199c2aa0" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"(200, 3) (200, 19)\n", | |
"0.99 R^2 with a standard deviation of 0.01\n", | |
"-0.31 MSE with a standard deviation of 0.24\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"source": [ | |
"# A simple neural network" | |
], | |
"metadata": { | |
"id": "gGE1QhiCNDKk" | |
} | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"def simple_model(input_features):\n", | |
"    # three hidden layers of 8 ReLU units, then one linear output unit for regression\n", | |
"    inputs = keras.Input(shape=(input_features,))\n", | |
"    x = layers.Dense(8, activation='relu')(inputs)\n", | |
"    x = layers.Dense(8, activation='relu')(x)\n", | |
"    x = layers.Dense(8, activation='relu')(x)\n", | |
"    outputs = layers.Dense(1)(x)\n", | |
"    model = keras.Model(inputs, outputs)\n", | |
"    return model" | |
], | |
"metadata": { | |
"id": "qEc_99FAbGR7" | |
}, | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"nn_1 = simple_model(3)\n", | |
"nn_1.summary()" | |
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "M4he8n1vMx1w", | |
"outputId": "8b5c7caf-9712-4ca3-8a38-49fe9f4f6e9b" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Model: \"model_30\"\n", | |
"_________________________________________________________________\n", | |
" Layer (type) Output Shape Param # \n", | |
"=================================================================\n", | |
" input_31 (InputLayer) [(None, 3)] 0 \n", | |
" \n", | |
" dense_120 (Dense) (None, 8) 32 \n", | |
" \n", | |
" dense_121 (Dense) (None, 8) 72 \n", | |
" \n", | |
" dense_122 (Dense) (None, 8) 72 \n", | |
" \n", | |
" dense_123 (Dense) (None, 1) 9 \n", | |
" \n", | |
"=================================================================\n", | |
"Total params: 185\n", | |
"Trainable params: 185\n", | |
"Non-trainable params: 0\n", | |
"_________________________________________________________________\n" | |
] | |
} | |
] | |
}, | |
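{ | |
"cell_type": "markdown", | |
"source": [ | |
"The parameter counts in the summary above can be checked by hand, assuming the standard fully connected layer with a bias vector: a `Dense` layer with $n_{in}$ inputs and $n_{out}$ units has $n_{in}n_{out}+n_{out}$ trainable parameters ($n_{in}$ and $n_{out}$ are notation introduced here).\n", | |
"\n", | |
"$$(3\\cdot 8+8)+(8\\cdot 8+8)+(8\\cdot 8+8)+(8\\cdot 1+1)=32+72+72+9=185$$" | |
], | |
"metadata": {} | |
}, | |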
{ | |
"cell_type": "code", | |
"source": [ | |
"# original features\n", | |
"input_features = 3\n", | |
"MSE_metric = []\n", | |
"r2_metric = []\n", | |
"LR = 0.01\n", | |
"batch_size = 1\n", | |
"epochs = 10\n", | |
"\n", | |
"# 5-fold cross-validation; folds are shuffled without a fixed seed, so results vary between runs\n", | |
"kfold = KFold(n_splits=5, shuffle=True)\n", | |
"\n", | |
"step = 1\n", | |
"for train, test in kfold.split(X, y):\n", | |
"    # build and compile a fresh model for every fold\n", | |
"    model = simple_model(input_features)\n", | |
"    model.compile(\n", | |
"        optimizer=keras.optimizers.Adam(learning_rate=LR),\n", | |
"        loss=[tf.keras.losses.MeanSquaredError()],\n", | |
"        metrics=[tfa.metrics.RSquare(dtype=tf.float32, y_shape=(1,))]\n", | |
"    )\n", | |
"\n", | |
"    print(\"Training on Fold # {}\".format(step))\n", | |
"    history = model.fit(X[train], y[train],\n", | |
"                        batch_size=batch_size,\n", | |
"                        epochs=epochs)\n", | |
"\n", | |
"    # evaluate returns [loss, metric], i.e. [MSE, R^2] on the held-out fold\n", | |
"    scores = model.evaluate(X[test], y[test], verbose=0)\n", | |
"\n", | |
"    MSE_metric.append(scores[0])\n", | |
"    r2_metric.append(scores[1])\n", | |
"\n", | |
"    step += 1\n", | |
"\n", | |
"\n", | |
"print(\"%0.2f R^2 with a standard deviation of %0.2f\" % (np.mean(r2_metric), np.std(r2_metric)))\n", | |
"print(\"%0.2f MSE with a standard deviation of %0.2f\" % (np.mean(MSE_metric), np.std(MSE_metric)))\n", | |
"\n" | |
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "O2jAiISyamFK", | |
"outputId": "82bd138a-0370-4a84-f895-b161a9587a0f" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Traint on Fold # 1\n", | |
"Epoch 1/10\n", | |
"160/160 [==============================] - 1s 2ms/step - loss: 16.2024 - r_square: 0.4167\n", | |
"Epoch 2/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 3.9761 - r_square: 0.8569\n", | |
"Epoch 3/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 3.7821 - r_square: 0.8638\n", | |
"Epoch 4/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 2.3843 - r_square: 0.9142\n", | |
"Epoch 5/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 3.0153 - r_square: 0.8914\n", | |
"Epoch 6/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 2.8528 - r_square: 0.8973\n", | |
"Epoch 7/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 2.9074 - r_square: 0.8953\n", | |
"Epoch 8/10\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 3.3223 - r_square: 0.8804\n", | |
"Epoch 9/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 2.0669 - r_square: 0.9256\n", | |
"Epoch 10/10\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 3.1468 - r_square: 0.8867\n", | |
"Traint on Fold # 2\n", | |
"Epoch 1/10\n", | |
"160/160 [==============================] - 1s 1ms/step - loss: 17.1019 - r_square: 0.3959\n", | |
"Epoch 2/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 7.1286 - r_square: 0.7482\n", | |
"Epoch 3/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 2.3780 - r_square: 0.9160\n", | |
"Epoch 4/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 2.2266 - r_square: 0.9213\n", | |
"Epoch 5/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 2.2002 - r_square: 0.9223\n", | |
"Epoch 6/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 1.7749 - r_square: 0.9373\n", | |
"Epoch 7/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 1.8101 - r_square: 0.9361\n", | |
"Epoch 8/10\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 1.9340 - r_square: 0.9317\n", | |
"Epoch 9/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 2.3880 - r_square: 0.9156\n", | |
"Epoch 10/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 2.9022 - r_square: 0.8975\n", | |
"Traint on Fold # 3\n", | |
"Epoch 1/10\n", | |
"160/160 [==============================] - 1s 2ms/step - loss: 15.5137 - r_square: 0.4153\n", | |
"Epoch 2/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 3.7498 - r_square: 0.8587\n", | |
"Epoch 3/10\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 3.0436 - r_square: 0.8853\n", | |
"Epoch 4/10\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 2.4885 - r_square: 0.9062\n", | |
"Epoch 5/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 2.6690 - r_square: 0.8994\n", | |
"Epoch 6/10\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 1.2477 - r_square: 0.9530\n", | |
"Epoch 7/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 1.2213 - r_square: 0.9540\n", | |
"Epoch 8/10\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 1.9037 - r_square: 0.9283\n", | |
"Epoch 9/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 1.0913 - r_square: 0.9589\n", | |
"Epoch 10/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 1.1647 - r_square: 0.9561\n", | |
"Traint on Fold # 4\n", | |
"Epoch 1/10\n", | |
"160/160 [==============================] - 1s 1ms/step - loss: 15.6965 - r_square: 0.3908\n", | |
"Epoch 2/10\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 4.8530 - r_square: 0.8117\n", | |
"Epoch 3/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 3.0453 - r_square: 0.8818\n", | |
"Epoch 4/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 2.5156 - r_square: 0.9024\n", | |
"Epoch 5/10\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 1.6183 - r_square: 0.9372\n", | |
"Epoch 6/10\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 1.6698 - r_square: 0.9352\n", | |
"Epoch 7/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 1.2195 - r_square: 0.9527\n", | |
"Epoch 8/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 1.4367 - r_square: 0.9442\n", | |
"Epoch 9/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 1.5627 - r_square: 0.9393\n", | |
"Epoch 10/10\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 1.0964 - r_square: 0.9574\n", | |
"Traint on Fold # 5\n", | |
"Epoch 1/10\n", | |
"160/160 [==============================] - 1s 1ms/step - loss: 38.1272 - r_square: -0.4204\n", | |
"Epoch 2/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 3.5857 - r_square: 0.8664\n", | |
"Epoch 3/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 4.1824 - r_square: 0.8442\n", | |
"Epoch 4/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 2.6618 - r_square: 0.9008\n", | |
"Epoch 5/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 2.9917 - r_square: 0.8885\n", | |
"Epoch 6/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 3.2090 - r_square: 0.8805\n", | |
"Epoch 7/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 2.1400 - r_square: 0.9203\n", | |
"Epoch 8/10\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 2.1150 - r_square: 0.9212\n", | |
"Epoch 9/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 1.4849 - r_square: 0.9447\n", | |
"Epoch 10/10\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 1.5077 - r_square: 0.9438\n", | |
"0.91 R^2 with a standard deviation of 0.06\n", | |
"2.40 MSE with a standard deviation of 1.61\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"X_2.shape" | |
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "fYzGmId2j2vn", | |
"outputId": "dcc87048-fb27-4a5c-f76b-06249304aacd" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"(200, 9)" | |
] | |
}, | |
"metadata": {}, | |
"execution_count": 65 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"nn_1 = simple_model(9)\n", | |
"nn_1.summary()" | |
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "nimpInU0j1TL", | |
"outputId": "763f6300-2e82-4341-90cd-32bee069f9be" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Model: \"model_36\"\n", | |
"_________________________________________________________________\n", | |
" Layer (type) Output Shape Param # \n", | |
"=================================================================\n", | |
" input_37 (InputLayer) [(None, 9)] 0 \n", | |
" \n", | |
" dense_144 (Dense) (None, 8) 80 \n", | |
" \n", | |
" dense_145 (Dense) (None, 8) 72 \n", | |
" \n", | |
" dense_146 (Dense) (None, 8) 72 \n", | |
" \n", | |
" dense_147 (Dense) (None, 1) 9 \n", | |
" \n", | |
"=================================================================\n", | |
"Total params: 233\n", | |
"Trainable params: 233\n", | |
"Non-trainable params: 0\n", | |
"_________________________________________________________________\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"# X^2 polynomial features\n", | |
"\n", | |
"input_features=9\n", | |
"MSE_metric = []\n", | |
"r2_metric = []\n", | |
"LR = 0.5\n", | |
"batch_size = 1\n", | |
"epochs = 15\n", | |
"\n", | |
"kfold = KFold(n_splits=5, shuffle=True)\n", | |
"\n", | |
"step = 1\n", | |
"for train, test in kfold.split(X_2, y):\n", | |
" model = simple_model(input_features)\n", | |
" model.compile(\n", | |
" optimizer=keras.optimizers.Adam(learning_rate=LR),\n", | |
" loss=[tf.keras.losses.MeanSquaredError()],\n", | |
" metrics=[tfa.metrics.RSquare(dtype=tf.float32, y_shape=(1,))]\n", | |
" )\n", | |
" \n", | |
" print(\"Traint on Fold # {}\".format(step))\n", | |
" history = model.fit(X_2[train], y[train],\n", | |
" batch_size=batch_size,\n", | |
" epochs=epochs)\n", | |
" \n", | |
" scores = model.evaluate(X_2[test], y[test], verbose=0)\n", | |
" \n", | |
" MSE_metric.append(scores[0])\n", | |
" r2_metric.append(scores[1])\n", | |
"\n", | |
" step += 1\n", | |
"\n", | |
"\n", | |
"print(\"%0.2f R^2 with a standard deviation of %0.2f\" % (np.mean(r2_metric), np.std(r2_metric)))\n", | |
"print(\"%0.2f MSE with a standard deviation of %0.2f\" % (np.mean(MSE_metric), np.std(MSE_metric)))\n" | |
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "ufWWjw0ZgHiv", | |
"outputId": "c6837406-6e13-4548-e373-b2e91e6b639b" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Traint on Fold # 1\n", | |
"Epoch 1/15\n", | |
"160/160 [==============================] - 1s 2ms/step - loss: 239035.0000 - r_square: -9199.8721\n", | |
"Epoch 2/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 37.3801 - r_square: -0.4388\n", | |
"Epoch 3/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 36.6234 - r_square: -0.4097\n", | |
"Epoch 4/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 32.8304 - r_square: -0.2637\n", | |
"Epoch 5/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 35.4500 - r_square: -0.3645\n", | |
"Epoch 6/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 34.0948 - r_square: -0.3124\n", | |
"Epoch 7/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 36.7745 - r_square: -0.4155\n", | |
"Epoch 8/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 41.5983 - r_square: -0.6012\n", | |
"Epoch 9/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 37.4649 - r_square: -0.4421\n", | |
"Epoch 10/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 36.5226 - r_square: -0.4058\n", | |
"Epoch 11/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 39.3880 - r_square: -0.5161\n", | |
"Epoch 12/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 42.9531 - r_square: -0.6533\n", | |
"Epoch 13/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 34.7907 - r_square: -0.3392\n", | |
"Epoch 14/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 52.1614 - r_square: -1.0078\n", | |
"Epoch 15/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 36.0439 - r_square: -0.3874\n", | |
"Traint on Fold # 2\n", | |
"Epoch 1/15\n", | |
"160/160 [==============================] - 1s 2ms/step - loss: 180739664.0000 - r_square: -6380801.5000\n", | |
"Epoch 2/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 162.5291 - r_square: -4.7379\n", | |
"Epoch 3/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 160.1068 - r_square: -4.6524\n", | |
"Epoch 4/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 157.0634 - r_square: -4.5449\n", | |
"Epoch 5/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 153.4884 - r_square: -4.4187\n", | |
"Epoch 6/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 149.4353 - r_square: -4.2756\n", | |
"Epoch 7/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 144.9490 - r_square: -4.1173\n", | |
"Epoch 8/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 140.0574 - r_square: -3.9446\n", | |
"Epoch 9/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 134.8342 - r_square: -3.7602\n", | |
"Epoch 10/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 129.2957 - r_square: -3.5646\n", | |
"Epoch 11/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 123.4846 - r_square: -3.3595\n", | |
"Epoch 12/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 117.4536 - r_square: -3.1466\n", | |
"Epoch 13/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 111.2620 - r_square: -2.9280\n", | |
"Epoch 14/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 104.9761 - r_square: -2.7061\n", | |
"Epoch 15/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 98.6283 - r_square: -2.4820\n", | |
"Traint on Fold # 3\n", | |
"Epoch 1/15\n", | |
"160/160 [==============================] - 1s 2ms/step - loss: 218001.1562 - r_square: -8377.3857\n", | |
"Epoch 2/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 27.8599 - r_square: -0.0707\n", | |
"Epoch 3/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 27.7370 - r_square: -0.0660\n", | |
"Epoch 4/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 25.1877 - r_square: 0.0320\n", | |
"Epoch 5/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 28.3859 - r_square: -0.0910\n", | |
"Epoch 6/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 29.5888 - r_square: -0.1372\n", | |
"Epoch 7/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 31.9030 - r_square: -0.2261\n", | |
"Epoch 8/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 29.6330 - r_square: -0.1389\n", | |
"Epoch 9/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 31.4090 - r_square: -0.2071\n", | |
"Epoch 10/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 28.9161 - r_square: -0.1113\n", | |
"Epoch 11/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 28.8925 - r_square: -0.1104\n", | |
"Epoch 12/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 31.6793 - r_square: -0.2175\n", | |
"Epoch 13/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 31.9550 - r_square: -0.2281\n", | |
"Epoch 14/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 33.1925 - r_square: -0.2757\n", | |
"Epoch 15/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 29.1770 - r_square: -0.1214\n", | |
"Traint on Fold # 4\n", | |
"Epoch 1/15\n", | |
"160/160 [==============================] - 1s 1ms/step - loss: 54.1740 - r_square: -1.0094\n", | |
"Epoch 2/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 31.0335 - r_square: -0.1511\n", | |
"Epoch 3/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 29.1616 - r_square: -0.0816\n", | |
"Epoch 4/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 29.8393 - r_square: -0.1068\n", | |
"Epoch 5/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 29.4575 - r_square: -0.0926\n", | |
"Epoch 6/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 30.0795 - r_square: -0.1157\n", | |
"Epoch 7/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 27.9198 - r_square: -0.0356\n", | |
"Epoch 8/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 27.3414 - r_square: -0.0141\n", | |
"Epoch 9/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 28.1615 - r_square: -0.0445\n", | |
"Epoch 10/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 27.6828 - r_square: -0.0268\n", | |
"Epoch 11/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 28.0983 - r_square: -0.0422\n", | |
"Epoch 12/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 28.5057 - r_square: -0.0573\n", | |
"Epoch 13/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 28.6568 - r_square: -0.0629\n", | |
"Epoch 14/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 28.7601 - r_square: -0.0667\n", | |
"Epoch 15/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 28.3267 - r_square: -0.0507\n", | |
"Traint on Fold # 5\n", | |
"Epoch 1/15\n", | |
"160/160 [==============================] - 1s 1ms/step - loss: 4140985856.0000 - r_square: -147298672.0000\n", | |
"Epoch 2/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 166.7630 - r_square: -4.9319\n", | |
"Epoch 3/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 166.2355 - r_square: -4.9131\n", | |
"Epoch 4/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 165.5636 - r_square: -4.8892\n", | |
"Epoch 5/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 164.7577 - r_square: -4.8606\n", | |
"Epoch 6/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 163.8211 - r_square: -4.8273\n", | |
"Epoch 7/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 162.7552 - r_square: -4.7893\n", | |
"Epoch 8/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 161.5582 - r_square: -4.7468\n", | |
"Epoch 9/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 160.2272 - r_square: -4.6994\n", | |
"Epoch 10/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 158.7604 - r_square: -4.6472\n", | |
"Epoch 11/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 157.1565 - r_square: -4.5902\n", | |
"Epoch 12/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 155.4084 - r_square: -4.5280\n", | |
"Epoch 13/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 153.5127 - r_square: -4.4606\n", | |
"Epoch 14/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 151.4646 - r_square: -4.3877\n", | |
"Epoch 15/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 149.2648 - r_square: -4.3095\n", | |
"-1.92 R^2 with a standard deviation of 2.02\n", | |
"71.91 MSE with a standard deviation of 41.74\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"X_3.shape" | |
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "S44LAAull35c", | |
"outputId": "5c0971ee-122d-499f-a6cb-c7ac1bc2de32" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"(200, 19)" | |
] | |
}, | |
"metadata": {}, | |
"execution_count": 68 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"nn_1 = simple_model(19)\n", | |
"nn_1.summary()" | |
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "2vO67DlDl8V5", | |
"outputId": "6792981c-ce4d-465f-feed-a667a8390ea4" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Model: \"model_42\"\n", | |
"_________________________________________________________________\n", | |
" Layer (type) Output Shape Param # \n", | |
"=================================================================\n", | |
" input_43 (InputLayer) [(None, 19)] 0 \n", | |
" \n", | |
" dense_168 (Dense) (None, 8) 160 \n", | |
" \n", | |
" dense_169 (Dense) (None, 8) 72 \n", | |
" \n", | |
" dense_170 (Dense) (None, 8) 72 \n", | |
" \n", | |
" dense_171 (Dense) (None, 1) 9 \n", | |
" \n", | |
"=================================================================\n", | |
"Total params: 313\n", | |
"Trainable params: 313\n", | |
"Non-trainable params: 0\n", | |
"_________________________________________________________________\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"# X^3 polynomial features\n", | |
"\n", | |
"input_features=19\n", | |
"MSE_metric = []\n", | |
"r2_metric = []\n", | |
"LR = 0.1\n", | |
"batch_size = 1\n", | |
"epochs = 15\n", | |
"\n", | |
"kfold = KFold(n_splits=5, shuffle=True)\n", | |
"\n", | |
"step = 1\n", | |
"for train, test in kfold.split(X_3, y):\n", | |
" model = simple_model(input_features)\n", | |
" model.compile(\n", | |
" optimizer=keras.optimizers.Adam(learning_rate=LR),\n", | |
" loss=[tf.keras.losses.MeanSquaredError()],\n", | |
" metrics=[tfa.metrics.RSquare(dtype=tf.float32, y_shape=(1,))]\n", | |
" )\n", | |
" \n", | |
" print(\"Traint on Fold # {}\".format(step))\n", | |
" history = model.fit(X_3[train], y[train],\n", | |
" batch_size=batch_size,\n", | |
" epochs=epochs)\n", | |
" \n", | |
" scores = model.evaluate(X_3[test], y[test], verbose=0)\n", | |
" \n", | |
" MSE_metric.append(scores[0])\n", | |
" r2_metric.append(scores[1])\n", | |
"\n", | |
" step += 1\n", | |
"\n", | |
"\n", | |
"print(\"%0.2f R^2 with a standard deviation of %0.2f\" % (np.mean(r2_metric), np.std(r2_metric)))\n", | |
"print(\"%0.2f MSE with a standard deviation of %0.2f\" % (np.mean(MSE_metric), np.std(MSE_metric)))" | |
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "ReXwB3j9OowI", | |
"outputId": "ef51d07f-51df-4f78-f331-61f9980562db" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Traint on Fold # 1\n", | |
"Epoch 1/15\n", | |
"160/160 [==============================] - 1s 2ms/step - loss: 208534814720.0000 - r_square: -7539420160.0000\n", | |
"Epoch 2/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 29.6321 - r_square: -0.0713\n", | |
"Epoch 3/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 28.3144 - r_square: -0.0237\n", | |
"Epoch 4/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 29.8664 - r_square: -0.0798\n", | |
"Epoch 5/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 29.5055 - r_square: -0.0667\n", | |
"Epoch 6/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 29.4847 - r_square: -0.0660\n", | |
"Epoch 7/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 30.8022 - r_square: -0.1136\n", | |
"Epoch 8/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 29.9225 - r_square: -0.0818\n", | |
"Epoch 9/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 28.9794 - r_square: -0.0477\n", | |
"Epoch 10/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 29.5053 - r_square: -0.0667\n", | |
"Epoch 11/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 30.8465 - r_square: -0.1152\n", | |
"Epoch 12/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 30.0251 - r_square: -0.0855\n", | |
"Epoch 13/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 30.3983 - r_square: -0.0990\n", | |
"Epoch 14/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 28.5822 - r_square: -0.0334\n", | |
"Epoch 15/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 29.2593 - r_square: -0.0578\n", | |
"Traint on Fold # 2\n", | |
"Epoch 1/15\n", | |
"160/160 [==============================] - 1s 1ms/step - loss: 39032479744.0000 - r_square: -1439124608.0000\n", | |
"Epoch 2/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 210.6791 - r_square: -6.7678\n", | |
"Epoch 3/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 209.8824 - r_square: -6.7384\n", | |
"Epoch 4/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 208.8626 - r_square: -6.7008\n", | |
"Epoch 5/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 207.6375 - r_square: -6.6556\n", | |
"Epoch 6/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 206.2179 - r_square: -6.6033\n", | |
"Epoch 7/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 204.6006 - r_square: -6.5436\n", | |
"Epoch 8/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 202.7906 - r_square: -6.4769\n", | |
"Epoch 9/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 200.7783 - r_square: -6.4027\n", | |
"Epoch 10/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 198.5644 - r_square: -6.3211\n", | |
"Epoch 11/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 196.1439 - r_square: -6.2318\n", | |
"Epoch 12/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 193.5140 - r_square: -6.1349\n", | |
"Epoch 13/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 190.6665 - r_square: -6.0299\n", | |
"Epoch 14/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 187.5995 - r_square: -5.9168\n", | |
"Epoch 15/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 184.3087 - r_square: -5.7955\n", | |
"Traint on Fold # 3\n", | |
"Epoch 1/15\n", | |
"160/160 [==============================] - 1s 2ms/step - loss: 28592672768.0000 - r_square: -1088961536.0000\n", | |
"Epoch 2/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 233.1044 - r_square: -7.8779\n", | |
"Epoch 3/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 233.0444 - r_square: -7.8756\n", | |
"Epoch 4/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 232.9677 - r_square: -7.8727\n", | |
"Epoch 5/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 232.8752 - r_square: -7.8692\n", | |
"Epoch 6/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 232.7672 - r_square: -7.8650\n", | |
"Epoch 7/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 232.6434 - r_square: -7.8603\n", | |
"Epoch 8/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 232.5038 - r_square: -7.8550\n", | |
"Epoch 9/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 232.3466 - r_square: -7.8490\n", | |
"Epoch 10/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 232.1722 - r_square: -7.8423\n", | |
"Epoch 11/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 231.9788 - r_square: -7.8350\n", | |
"Epoch 12/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 231.7654 - r_square: -7.8269\n", | |
"Epoch 13/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 231.5313 - r_square: -7.8179\n", | |
"Epoch 14/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 231.2749 - r_square: -7.8081\n", | |
"Epoch 15/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 230.9945 - r_square: -7.7975\n", | |
"Traint on Fold # 4\n", | |
"Epoch 1/15\n", | |
"160/160 [==============================] - 1s 2ms/step - loss: 10843359232.0000 - r_square: -412928064.0000\n", | |
"Epoch 2/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 229.7369 - r_square: -7.7487\n", | |
"Epoch 3/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 229.6414 - r_square: -7.7450\n", | |
"Epoch 4/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 229.5188 - r_square: -7.7404\n", | |
"Epoch 5/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 229.3708 - r_square: -7.7347\n", | |
"Epoch 6/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 229.1986 - r_square: -7.7282\n", | |
"Epoch 7/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 229.0008 - r_square: -7.7206\n", | |
"Epoch 8/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 228.7769 - r_square: -7.7122\n", | |
"Epoch 9/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 228.5263 - r_square: -7.7026\n", | |
"Epoch 10/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 228.2477 - r_square: -7.6920\n", | |
"Epoch 11/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 227.9395 - r_square: -7.6802\n", | |
"Epoch 12/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 227.5996 - r_square: -7.6673\n", | |
"Epoch 13/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 227.2270 - r_square: -7.6531\n", | |
"Epoch 14/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 226.8187 - r_square: -7.6376\n", | |
"Epoch 15/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 226.3732 - r_square: -7.6206\n", | |
"Traint on Fold # 5\n", | |
"Epoch 1/15\n", | |
"160/160 [==============================] - 1s 1ms/step - loss: 327125762048.0000 - r_square: -11685727232.0000\n", | |
"Epoch 2/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 192.7558 - r_square: -5.8857\n", | |
"Epoch 3/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 192.6152 - r_square: -5.8807\n", | |
"Epoch 4/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 192.4306 - r_square: -5.8741\n", | |
"Epoch 5/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 192.2160 - r_square: -5.8664\n", | |
"Epoch 6/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 191.9613 - r_square: -5.8574\n", | |
"Epoch 7/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 191.6872 - r_square: -5.8476\n", | |
"Epoch 8/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 191.3779 - r_square: -5.8365\n", | |
"Epoch 9/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 191.0527 - r_square: -5.8249\n", | |
"Epoch 10/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 190.6981 - r_square: -5.8122\n", | |
"Epoch 11/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 190.3278 - r_square: -5.7990\n", | |
"Epoch 12/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 189.9332 - r_square: -5.7849\n", | |
"Epoch 13/15\n", | |
"160/160 [==============================] - 0s 1ms/step - loss: 189.5108 - r_square: -5.7698\n", | |
"Epoch 14/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 189.0756 - r_square: -5.7543\n", | |
"Epoch 15/15\n", | |
"160/160 [==============================] - 0s 2ms/step - loss: 188.6184 - r_square: -5.7379\n", | |
"-5.54 R^2 with a standard deviation of 2.90\n", | |
"177.37 MSE with a standard deviation of 81.22\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"source": [ | |
"# Выбор модели" | |
], | |
"metadata": { | |
"id": "iBlV6TPotLhH" | |
} | |
}, | |
{ | |
"cell_type": "markdown", | |
"source": [ | |
"Model Name | parameters | $r^2$ | Mean Squared Error|\n", | |
"----------------|------------|--------------|-------------------|\n", | |
"LR | $\\bf4$ |$0.89\\pm0.04$ |$3.07\\pm1.28$ |\n", | |
"LR poly 2 | $10$ |$0.98\\pm0.01$ |$0.44\\pm0.39$ |\n", | |
"LR poly 3 | $20$ |$\\bf0.99\\pm0.01$ |$\\bf0.31\\pm0.24$ |\n", | |
"NN | $185$ |$0.91\\pm1.61$ |$1.86\\pm1.49$ |\n" | |
], | |
"metadata": { | |
"id": "ugPHpCGYtPIp" | |
} | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"" | |
], | |
"metadata": { | |
"id": "B60pcxi-mEXO" | |
}, | |
"execution_count": null, | |
"outputs": [] | |
} | |
] | |
} |