Last active
October 26, 2024 06:30
Math behind the Linear Regression with Gradient Descent from scratch
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"id": "c7b727ce", | |
"metadata": {}, | |
"source": [ | |
"\n", | |
"# Linear Regression With Gradient Descent\n", | |
"\n", | |
"\n", | |
"![reg](./images/regression.svg)\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "e929f281", | |
"metadata": {}, | |
"source": [ | |
"X_data = Matrix of Independent features of size mxn , or X ∈ ℝᵐˣⁿ\n", | |
"\n", | |
"xⁱ = ith observation of X_data or ith row of matrix X_data , or xⁱ ∈ ℝⁿ\n", | |
"\n", | |
"Y = Target vector of size mx1, or Y ∈ ℝᵐ\n", | |
"\n", | |
"yⁱ = ith target value or ith data point of vector Y , or yⁱ ∈ ℝ\n", | |
"\n", | |
"Ŷ = Predicted target vector of size mx1, or Ŷ ∈ ℝᵐ\n", | |
"\n", | |
"ŷⁱ = ith predicted target value or ith data point of vector Ŷ, or ŷⁱ ∈ ℝ\n", | |
"\n", | |
"W = Weights vector of size (n+1)x1, or W ∈ ℝⁿ⁺¹\n", | |
"\n", | |
"$$\\large Y =\n", | |
"\\begin{bmatrix}\n", | |
"y^{1}\\\\\n", | |
"y^{2}\\\\\n", | |
"y^{3}\\\\\n", | |
"\\vdots\\\\\n", | |
"y^{m}\\\\\n", | |
"\\end{bmatrix}, \\large X_{data} = \\begin{bmatrix} \n", | |
"x^{1}_{1} & x^{1}_{2} & x^{1}_{3} & \\dots & x^{1}_{n} \\\\\n", | |
"x^{2}_{1} & x^{2}_{2} & x^{2}_{3} & \\dots & x^{2}_{n} \\\\\n", | |
"x^{3}_{1} & x^{3}_{2} & x^{3}_{3} & \\dots & x^{3}_{n} \\\\\n", | |
"\\vdots & & \\vdots\\\\\n", | |
"x^{m}_{1} & x^{m}_{2} & x^{m}_{3} & \\dots & x^{m}_{n} \\\\\n", | |
"\\end{bmatrix}, \\large \\hat{Y} = \\begin{bmatrix}\n", | |
"\\hat{y}^{1}\\\\\n", | |
"\\hat{y}^{2}\\\\\n", | |
"\\hat{y}^{3}\\\\\n", | |
"\\vdots\\\\\n", | |
"\\hat{y}^{m}\\\\\n", | |
"\\end{bmatrix}, \\large W =\n", | |
"\\begin{bmatrix}\n", | |
"w_{1}\\\\\n", | |
"w_{2}\\\\\n", | |
"w_{3}\\\\\n", | |
"\\vdots\\\\\n", | |
"w_{n}\\\\\n", | |
"w_{n+1}\\\\\n", | |
"\\end{bmatrix}$$" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "e34ae574", | |
"metadata": {}, | |
"source": [ | |
"$$\\large \\hat{y}^{i} = \\sum \\limits _{k=1} ^{n} w_{k}\\times x_{k}^{i} + b = \\sum \\limits _{k=1} ^{n+1} w_{k}\\times x_{k}^{i} \\quad\\quad\\quad\\quad w_{n+1} = b \\quad \\text{and} \\quad x_{n+1} = 1\n", | |
"$$\n", | |
"\n", | |
"$$\\large \\hat{y}^{i} = \\begin{bmatrix}x^{i}_{1} & x^{i}_{2} & x^{i}_{3} & \\dots & x^{1}_{n} & 1\\end{bmatrix} \\begin{bmatrix} w_{1}\\\\\n", | |
"w_{2}\\\\\n", | |
"w_{3}\\\\\n", | |
"\\vdots\\\\\n", | |
"w_{n}\\\\\n", | |
"w_{n+1}\\\\\n", | |
"\\end{bmatrix}\n", | |
"$$\n", | |
"\n", | |
"\n", | |
"$$\\large \\hat{Y} = \\begin{bmatrix}\n", | |
"\\hat{y}^{1}\\\\\n", | |
"\\hat{y}^{2}\\\\\n", | |
"\\hat{y}^{3}\\\\\n", | |
"\\vdots\\\\\n", | |
"\\hat{y}^{m}\\\\\n", | |
"\\end{bmatrix} = \\small \\begin{bmatrix} \n", | |
"x^{1}_{1} & x^{1}_{2} & x^{1}_{3} & \\dots & x^{1}_{n} & 1\\\\\n", | |
"x^{2}_{1} & x^{2}_{2} & x^{2}_{3} & \\dots & x^{2}_{n} & 1\\\\\n", | |
"x^{3}_{1} & x^{3}_{2} & x^{3}_{3} & \\dots & x^{3}_{n} & 1\\\\\n", | |
"\\vdots & \\vdots\\\\\n", | |
"x^{i}_{1} & x^{i}_{2} & x^{i}_{3} & \\dots & x^{i}_{n} & 1\\\\\n", | |
"\\vdots & & \\dots & & \\vdots\\\\\n", | |
"x^{m}_{1} & x^{m}_{2} & x^{m}_{3} & \\dots & x^{m}_{n} & 1\\\\\n", | |
"\\end{bmatrix}\n", | |
"\\begin{bmatrix}\n", | |
"w_{1}\\\\\n", | |
"w_{2}\\\\\n", | |
"w_{3}\\\\\n", | |
"\\vdots\\\\\n", | |
"w_{n}\\\\\n", | |
"w_{n+1}\\\\\n", | |
"\\end{bmatrix}\n", | |
"$$\n", | |
"\n", | |
"\n", | |
"$$\\hat{Y} = X.W $$" | |
] | |
}, | |
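The matrix form above can be sketched in NumPy. This is a minimal illustration, not part of the original notebook: the toy values and the names `X_data`, `W`, `Y_hat` and `Y_hat_sum` are assumptions chosen for the example. It appends a column of ones to `X_data` so that the last weight plays the role of the bias, and checks that the matrix product agrees with the explicit sum plus bias.

```python
import numpy as np

# Hypothetical toy data: m = 3 observations, n = 2 features.
X_data = np.array([[1.0, 2.0],
                   [3.0, 4.0],
                   [5.0, 6.0]])

# Hypothetical weights: w_1, w_2, and the bias b stored as w_{n+1}.
W = np.array([0.5, -1.0, 2.0])

# Append a column of ones so that the bias is absorbed into W.
m = X_data.shape[0]
X = np.column_stack((X_data, np.ones(m)))

# Matrix form of the prediction: Y_hat = X W.
Y_hat = X @ W

# The same prediction written as the weighted sum plus the bias.
Y_hat_sum = X_data @ W[:-1] + W[-1]

print(X.shape)
print(Y_hat)
```

Both forms should give the same vector, which is the reason the bias can be treated as one more weight.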
{ | |
"cell_type": "markdown", | |
"id": "d43c4a73", | |
"metadata": {}, | |
"source": [ | |
"Lest Call that Big fat Matrix X , \n", | |
"\n", | |
"So Our final expression for $\\hat{Y} = XW$ , which is a vector of size mx1\n", | |
"\n", | |
"Residuals $e = (Y − \\hat{Y})$, which is a vector of size mx1\n", | |
"\n", | |
"Square Sum of residuals $e^{2} = (Y − \\hat{Y})^{2}$ , which is a vector of size mx1" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "e6aab87b", | |
"metadata": {}, | |
"source": [ | |
"$$\\large Y =\n", | |
"\\begin{bmatrix}\n", | |
"y^{1}\\\\\n", | |
"y^{2}\\\\\n", | |
"y^{3}\\\\\n", | |
"\\vdots\\\\\n", | |
"y^{m}\\\\\n", | |
"\\end{bmatrix}, \\large \\hat{Y} = \\begin{bmatrix}\n", | |
"\\hat{y}^{1}\\\\\n", | |
"\\hat{y}^{2}\\\\\n", | |
"\\hat{y}^{3}\\\\\n", | |
"\\vdots\\\\\n", | |
"\\hat{y}^{m}\\\\\n", | |
"\\end{bmatrix}, \\large e =\n", | |
"\\begin{bmatrix}\n", | |
"y^{1}-\\hat{y}^{1}\\\\\n", | |
"y^{2}-\\hat{y}^{2}\\\\\n", | |
"y^{3}-\\hat{y}^{3}\\\\\n", | |
"\\vdots\\\\\n", | |
"y^{m}-\\hat{y}^{m}\\\\\n", | |
"\\end{bmatrix} = \\begin{bmatrix}\n", | |
"e^{1}\\\\\n", | |
"e^{2}\\\\\n", | |
"e^{3}\\\\\n", | |
"\\vdots\\\\\n", | |
"e^{m}\\\\\n", | |
"\\end{bmatrix}$$ " | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "76b09cd3", | |
"metadata": {}, | |
"source": [ | |
"### Cost Function\n", | |
"\n", | |
"$$\\large L_{C}(Y,\\hat{Y}) = \\frac{1}{m} \\sum_{k=1}^{m} \\left (y^{k}-\\hat{y}^{k} \\right )^{2} $$\n", | |
"\n", | |
"$$\\large L_{C}(Y,\\hat{Y}) = \\frac{1}{m} \\sum_{k=1}^{m} ( e^{k} )^{2} = \\frac{1}{m} \\left [ (e^{1})^{2} + (e^{2})^{2}+ (e^{2})^{3}+ \\dots + (e^{2})^{m} \\right ]$$\n", | |
"\n", | |
"$$\\large L_{C}(Y,\\hat{Y}) = \\frac{1}{m} \\begin{bmatrix}e^{1} & e^{2} & e^{3} & \\dots & e^{m} &\\end{bmatrix} \\begin{bmatrix} e^{1} \\\\\n", | |
"e^{2} \\\\\n", | |
"e^{3} \\\\\n", | |
"\\vdots\\\\\n", | |
"e^{m} \\\\\n", | |
"\\end{bmatrix}$$\n", | |
"\n", | |
"$$\\large L_{C}(Y,\\hat{Y}) = \\frac{1}{m} e^{T}e$$\n" | |
] | |
}, | |
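As a small numeric check of the cost function above, the sketch below computes the loss both as the averaged sum of squared residuals and as $\frac{1}{m}e^{T}e$. The arrays `Y` and `Y_hat` are made-up values used only for illustration.

```python
import numpy as np

# Hypothetical targets and predictions (m = 4).
Y = np.array([3.0, -1.0, 2.0, 0.5])
Y_hat = np.array([2.5, -0.5, 2.0, 1.5])

e = Y - Y_hat                    # residual vector
m = e.shape[0]

loss_sum = np.sum(e ** 2) / m    # (1/m) * sum of squared residuals
loss_dot = e.T @ e / m           # (1/m) * e^T e

print(loss_sum, loss_dot)
```

The two expressions are the same quantity; the dot-product form is simply the vectorized way of writing the sum.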
{ | |
"cell_type": "markdown", | |
"id": "f9b33b8f", | |
"metadata": {}, | |
"source": [ | |
"#### Cost Function simplification\n", | |
"\n", | |
"$$\\large L_{C}(Y,\\hat{Y}) = \\frac{1}{m} e^{T}e = \\frac{1}{m}(Y-\\hat{Y})^{T}(Y-\\hat{Y}) = \\frac{1}{m}(Y^{T}-\\hat{Y}^{T})(Y-\\hat{Y})$$\n", | |
"\n", | |
"$$\\large L_{C}(Y,\\hat{Y}) = \\frac{1}{m}(Y^{T}Y-Y^{T}\\hat{Y} - \\hat{Y}^{T}Y +\\hat{Y}^{T}\\hat{Y})$$\n", | |
"\n", | |
"from equation 4 $\\hat{Y} = X.W $\n", | |
"\n", | |
"$$\\large L_{C}(Y,\\hat{Y}) = \\frac{1}{m}(Y^{T}Y-Y^{T}XW - (XW)^{T}Y +(XW)^{T}(XW))$$\n", | |
"\n", | |
"$$\\large L_{C}(Y,\\hat{Y}) = \\frac{1}{m}(Y^{T}Y-Y^{T}XW - W^{T}X^{T}Y + W^{T}X^{T}XW ) \\label{eq:eq8.1} \\tag{8.1}$$\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "e4fcfcb3", | |
"metadata": {}, | |
"source": [ | |
"Our learning problem reduces to that of finding a good set of weights/parameters for our model which minimizes the cost function.\n", | |
" \n", | |
"$$\\hat{W} \\in \\mathbb{R}^{n+1}$$ \n", | |
"\n", | |
"$$\\large \\hat{W} = \\text{argmin}_{W}L_{c}(Y,\\hat{Y}) \\quad\\quad \\text{where,}\\quad W \\in \\mathbb{R}^{n+1} $$" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "12c913dd", | |
"metadata": {}, | |
"source": [ | |
"\n", | |
"\n", | |
"#### Gradient Descent \n", | |
"\n", | |
"Inorder to find the minimum value of Cost function we will be using Gradient Descent method. \n", | |
"Given a function defined by a set of parameters, gradient descent starts with an initial set of parameter W₀ and iteratively moves toward a set of parameter values that minimize the function. This iterative minimization is achieved using calculus, taking steps η in the negative direction of the function gradient.\n", | |
"\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "e8d39784", | |
"metadata": {}, | |
"source": [ | |
"\n", | |
"$\\large W_{t+1} = W_{t} - \\eta \\times \\nabla_{w} L_{C}(Y,\\hat{Y}) \\label{eq:eq10} \\tag{10}$" | |
] | |
}, | |
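To make the update rule concrete, here is a minimal one-dimensional sketch. It does not use the regression cost; the function $f(w) = (w - 3)^{2}$, its gradient `grad`, the starting point and the step size `eta` are all assumptions picked so the behaviour is easy to follow by hand.

```python
def grad(w):
    # Gradient of the illustrative function f(w) = (w - 3)^2.
    return 2.0 * (w - 3.0)

w = 0.0      # initial parameter W_0 (arbitrary choice)
eta = 0.1    # step size (arbitrary choice)

for _ in range(50):
    # Update rule: W_{t+1} = W_t - eta * gradient
    w = w - eta * grad(w)

print(w)
```

Each step shrinks the distance to the minimizer at 3 by a factor of 0.8, so after 50 steps `w` is very close to 3. A step size that is too large would make the iterates diverge instead.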
{ | |
"cell_type": "markdown", | |
"id": "adaf8a68", | |
"metadata": {}, | |
"source": [ | |
"\n", | |
"#### Claculations of Gredient \n", | |
"\n", | |
"$$\\large \\nabla_{w} L_{C}(\\hat{Y},Y) = \\frac{dL_{C}(\\hat{Y},Y)}{dW}= \\begin{bmatrix}\n", | |
"\\frac{\\partial L_{C}(Y,\\hat{Y}) }{\\partial w_{1}} \\\\\n", | |
"\\frac{\\partial L_{C}(Y,\\hat{Y}) }{\\partial w_{2}} \\\\\n", | |
"\\frac{\\partial L_{C}(Y,\\hat{Y}) }{\\partial w_{2}} \\\\\n", | |
"\\vdots\\\\\n", | |
"\\frac{\\partial L_{C}(Y,\\hat{Y}) }{\\partial w_{n}} \\\\\n", | |
"\\frac{\\partial L_{C}(Y,\\hat{Y}) }{\\partial w_{n+1}} \\\\\n", | |
"\\end{bmatrix} \n", | |
"\\label{eq:eq11} \\tag{11}\n", | |
"$$\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "7bfe91cd", | |
"metadata": {}, | |
"source": [ | |
"#### Partial Diff of Cost Function with respect to Weights\n", | |
"\n", | |
"$\\large \\frac{\\partial L_{C}(\\hat{Y},Y) }{\\partial w_{i}} = \\frac{\\partial }{\\partial w_{i}} \\left[\\frac{1}{m} \\sum_{k=1}^{m} \\left ( y^{k} -\\hat{y}^{k}) \\right )^{2} \\right ]$\n", | |
"\n", | |
"$\\large = \\frac{1}{m} \\sum_{k=1}^{m} \\frac{\\partial}{\\partial w_{i}} \\left ( y^{k} -\\hat{y}^{k}\\right)^{2}$\n", | |
"\n", | |
"$\\large = \\frac{1}{m} \\sum_{k=1}^{m} \\frac{\\partial}{\\partial \\hat{y}^{k}} \\left ( y^{k} -\\hat{y}^{k}\\right)^{2} \\frac{\\partial \\hat{y}^{k}}{\\partial w_{i}}$\n", | |
"\n", | |
"$\\large = -\\frac{2}{m} \\sum_{k=1}^{m} \\left ( y^{k} -\\hat{y}^{k}\\right) \\frac{\\partial \\hat{y}^{k}}{\\partial w_{i}}$\n", | |
"\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "b2c3b592", | |
"metadata": {}, | |
"source": [ | |
"Partial Differentiation of $\\hat{y}^{k}$ with respect to ith weights $w_{i}$\n", | |
"\n", | |
"$\\large \\frac{\\partial \\hat{y}^{k} }{\\partial w_{i}} = \\frac{\\partial }{\\partial w_{i}} \\sum_{i=1}^{n+1} (w_{i}x^{k}_{i}) = x^{k}_{i}$\n", | |
"\n", | |
"Replaceing equation above\n", | |
"\n", | |
"$\\large \\frac{\\partial L_{C}(\\hat{Y},Y) }{\\partial w_{i}} = -\\frac{2}{m} \\sum_{k=1}^{m} \\left ( y^{k} -\\hat{y}^{k}\\right) \\frac{\\partial \\hat{y}^{k}}{\\partial w_{i}} = -\\frac{2}{m} \\sum_{k=1}^{m} (y^{k} -\\hat{y}^{k})x^{k}_{i}$" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "c8519b59", | |
"metadata": {}, | |
"source": [ | |
"#### Expressing gredient interms of matix\n", | |
"\n", | |
"$\\large \\frac{\\partial L_{C}(\\hat{Y},Y) }{\\partial w_{i}} = -\\frac{2}{m} \\sum_{k=1}^{m} \\left(y^{k}-\\hat{y}^{k}\\right)x^{k}_{i} = -\\frac{2}{m} \\sum_{k=1}^{m} e^{k} x^{k}_{i}$\n", | |
"\n", | |
"$\\large \\frac{\\partial L_{C}(\\hat{Y},Y) }{\\partial w_{i}} = -\\frac{2}{m} \\left[e^{1}x^{1}_{i} + e^{2}x^{2}_{i} + e^{2}x^{3}_{i} + \\dots + e^{m}x^{m}_{i} \\right ] $\n", | |
"\n", | |
"$\\large \\frac{\\partial L_{C}(\\hat{Y},Y) }{\\partial w_{i}} = -\\frac{2}{m} \\begin{bmatrix} e^{1}& e^{2} & e^{3} &\\dots &e^{m}\\end{bmatrix} \\begin{bmatrix}\n", | |
"x^{1}_{i}\\\\\n", | |
"x^{2}_{i}\\\\\n", | |
"x^{3}_{i}\\\\\n", | |
"\\vdots\\\\\n", | |
"x^{m}_{i}\\\\\n", | |
"\\end{bmatrix}= -\\frac{2}{m} e^{T}X_{i^{th} Column} $\n", | |
"\n", | |
"$\\nabla_{\\theta} L_{C}(\\hat{Y},Y) = \\begin{bmatrix}\n", | |
"\\frac{\\partial L_{C}(\\hat{Y},Y) }{\\partial w_{1}} \\\\\n", | |
"\\frac{\\partial L_{C}(\\hat{Y},Y) }{\\partial w_{2}} \\\\\n", | |
"\\frac{\\partial L_{C}(\\hat{Y},Y) }{\\partial w_{2}} \\\\\n", | |
"\\vdots\\\\\n", | |
"\\frac{\\partial L_{C}(\\hat{Y},Y) }{\\partial w_{n}} \\\\\n", | |
"\\frac{\\partial L_{C}(\\hat{Y},Y) }{\\partial w_{n+1}} \\\\\n", | |
"\\end{bmatrix} \n", | |
"= -\\frac{2}{m} e^{T}X $" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "3a7cf134", | |
"metadata": {}, | |
"source": [ | |
"\n", | |
"#### Claculations of Gredient Using Matrix \n", | |
"\n", | |
"from eyqution 8.1 we know \n", | |
"\n", | |
"$\\large L_{C}(Y,\\hat{Y}) = \\frac{1}{m}(Y^{T}Y-Y^{T}XW - W^{T}X^{T}Y + W^{T}X^{T}XW )$\n", | |
"\n", | |
"$\\large \\nabla_{w} L_{C}(\\hat{Y},Y) = \\frac{dL_{C}(\\hat{Y},Y)}{dW}$\n", | |
"\n", | |
"$\\large \\nabla_{w} L_{C}(\\hat{Y},Y) = \\frac{1}{m} \\frac{d(Y^{T}Y-Y^{T}XW - W^{T}X^{T}Y + W^{T}X^{T}XW)}{dW}$\n", | |
"\n", | |
"$\\large \\nabla_{w} L_{C}(\\hat{Y},Y) = \\frac{1}{m} \\left[ -\\frac{d(Y^{T}XW)}{dW} - \\frac{d(W^{T}X^{T}Y)}{dW} + \\frac{d(W^{T}X^{T}XW)}{dW} \\right]$\n", | |
"\n", | |
"$\\large \\nabla_{w} L_{C}(\\hat{Y},Y) = \\frac{1}{m} \\left[ -Y^{T}X - (X^{T}Y)^{T} + 2W^{T}X^{T}X \\right]$\n", | |
"\n", | |
"from equation 4 $\\hat{Y} = X.W $\n", | |
"\n", | |
"$\\large \\nabla_{w} L_{C}(\\hat{Y},Y) = \\frac{1}{m} \\left[ -Y^{T}X - Y^{T}X + 2(XW)^{T}X \\right]$\n", | |
"\n", | |
"$\\large \\nabla_{w} L_{C}(\\hat{Y},Y) = \\frac{1}{m} \\left[ -2Y^{T}X + 2\\hat{Y}^{T}X\\right]$\n", | |
"\n", | |
"$\\large \\nabla_{w} L_{C}(\\hat{Y},Y) = -\\frac{2}{m} \\left[ (Y^{T}- \\hat{Y}^{T})X \\right]$\n", | |
"\n", | |
"$\\large \\nabla_{w} L_{C}(\\hat{Y},Y) = -\\frac{2}{m} e^{T}X $\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "a27909c1", | |
"metadata": {}, | |
"source": [ | |
"#### Finally Gredient in Matrix format\n", | |
"\n", | |
"$$ \\nabla_{\\theta} L_{C}(\\hat{Y},Y) = \\begin{bmatrix}\n", | |
"\\frac{\\partial L_{C}(\\hat{Y},Y) }{\\partial w_{1}} \\\\\n", | |
"\\frac{\\partial L_{C}(\\hat{Y},Y) }{\\partial w_{2}} \\\\\n", | |
"\\frac{\\partial L_{C}(\\hat{Y},Y) }{\\partial w_{2}} \\\\\n", | |
"\\vdots\\\\\n", | |
"\\frac{\\partial L_{C}(\\hat{Y},Y) }{\\partial w_{n}} \\\\\n", | |
"\\frac{\\partial L_{C}(\\hat{Y},Y) }{\\partial w_{n+1}} \\\\\n", | |
"\\end{bmatrix} \n", | |
"= -\\frac{2}{m} e^{T}X \\label{eq:eq12} \\tag{12}$$" | |
] | |
}, | |
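One way to gain confidence in the derived gradient is to compare it with a finite-difference approximation of the cost. The sketch below is an illustration under assumptions: the random data, the seed, the step `h` and the helper `loss` are hypothetical and not part of the original notebook. The analytic gradient $-\frac{2}{m}e^{T}X$ is computed as the equivalent 1-D array `-(2/m) * X.T @ e`.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, arbitrary choice
m, n = 20, 3

# Random design matrix with a column of ones appended for the bias.
X = np.column_stack((rng.normal(size=(m, n)), np.ones(m)))
Y = rng.normal(size=m)
W = rng.normal(size=n + 1)

def loss(W):
    # Mean squared error cost: (1/m) * e^T e
    e = Y - X @ W
    return e @ e / m

# Analytic gradient from the derivation above.
e = Y - X @ W
grad_analytic = -(2.0 / m) * (X.T @ e)

# Central finite differences, perturbing one weight at a time.
h = 1e-6
grad_numeric = np.zeros_like(W)
for i in range(W.size):
    step = np.zeros_like(W)
    step[i] = h
    grad_numeric[i] = (loss(W + step) - loss(W - step)) / (2 * h)

print(np.max(np.abs(grad_analytic - grad_numeric)))
```

Because the cost is quadratic in `W`, the central difference is exact up to floating-point rounding, so the printed difference should be tiny.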
{ | |
"cell_type": "markdown", | |
"id": "5de80ead", | |
"metadata": {}, | |
"source": [ | |
"##### Steps for numpy Implementation\n", | |
"1. Capture No of datapoints(m) and number of indepemdent features(n) from X_data matrix size.\n", | |
"2. Create the Matrix X by appending a column of 1 on X_data\n", | |
"3. Create a Weights vector (weights )with 0 values of size (n+1)x1\n", | |
"4. set iteration_counter = 0 and loss_history = []\n", | |
"4. If iteration_counter less then max_iter then go to step 5 else step 11\n", | |
"5. Compute Y_predict by multiplying X and weights\n", | |
"6. Compute Error vector (e) subtracting Y_predict from Y\n", | |
"7. Compute loss using equation 8, and save it in loss_history list\n", | |
"8. Update weights using equation 10\n", | |
"9. Update iteration_counter and repeat step 4 to 9\n", | |
"10. Return Y_predict, loss_history" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"id": "4d721e56", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import numpy as np\n", | |
"from sklearn.datasets import make_regression\n", | |
"import matplotlib.pyplot as plt" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "5a1e857d", | |
"metadata": {}, | |
"source": [ | |
"### Generating data of shape 1000X5 \n", | |
"\n", | |
"\n", | |
"$\\large m = 1000 \\quad \\text {and } \\quad n = 5 $\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"id": "ca58bc2d", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"x_data.shape (1000, 5),y.shape (1000,)\n", | |
"coeffecient [65.30099317 62.37184401 91.55702227 34.77747286 55.15562951], bias 5\n" | |
] | |
} | |
], | |
"source": [ | |
"bias_ = 5 \n", | |
"x_data,y,coef = make_regression(n_samples=1000,n_features=5,n_informative=5,bias=bias_,coef=True)\n", | |
"print(f'x_data.shape {x_data.shape},y.shape {y.shape}')\n", | |
"print(f'coeffecient {coef}, bias {bias_}' )" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "abeb251f", | |
"metadata": {}, | |
"source": [ | |
"#### Creating Model Class default max_iteration_count is 100 and learnng rate is 0.001\n", | |
"\n", | |
"##### max_iteration= 1000 \n", | |
"##### eta = 0.001\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"id": "f130e289", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"class LinearRegression:\n", | |
" \n", | |
" def __init__(self, iteration = 1000,lr = 0.001):\n", | |
" self.weights = None\n", | |
" self.max_iteration = iteration\n", | |
" self.lr = lr\n", | |
" \n", | |
" def fit(self,x,y):\n", | |
" no_of_obs,no_of_fea = x.shape #step - 1\n", | |
" x = np.column_stack((x,np.ones(no_of_obs))) #step - 2\n", | |
" self.weights = np.zeros(no_of_fea+1) #step - 3\n", | |
" i , history = 0, [] #step - 4\n", | |
" while i < self.max_iteration: #step - 5\n", | |
" y_pred = self.forward_pass(x) #step - 6\n", | |
" error_vec = y - y_pred #step - 7\n", | |
" loss = self.compute_loss(error_vec) #step - 8\n", | |
" history.append(loss)\n", | |
" grediant = self.compute_grediant(x,error_vec) #step - 9\n", | |
" self.weights = self.weights + self.lr*grediant #step - 9\n", | |
" i += 1 #step - 10\n", | |
" print(f'Loss : {loss}') \n", | |
" return self.forward_pass(x),history \n", | |
" \n", | |
" def forward_pass(self,x ):\n", | |
" return x.dot(self.weights)\n", | |
" \n", | |
" def compute_grediant(self,x,error_vec):\n", | |
" no_of_obs,_ = x.shape\n", | |
" return (2/no_of_obs)*error_vec.T.dot(x)\n", | |
" \n", | |
" def compute_loss(self,error_vec):\n", | |
" ssd = error_vec*error_vec\n", | |
" return np.mean(np.sqrt(ssd))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "d1072e84", | |
"metadata": {}, | |
"source": [ | |
"#### Running the model for 3000 iteration " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"id": "6cf14141", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Loss : 0.5078941397041853\n", | |
"Final iteration loss 0.5078941397041853\n", | |
"Model Parameter [64.96124197 62.00281208 91.16393802 34.77883773 54.91523277 4.92118004]\n", | |
"Expected Parameter [65.30099317 62.37184401 91.55702227 34.77747286 55.15562951], bias 5\n" | |
] | |
} | |
], | |
"source": [ | |
"regessor = LinearRegression(iteration=3000)\n", | |
"y_pred,loss_history = regessor.fit(x_data,y)\n", | |
"\n", | |
"print(f'Final iteration loss {loss_history[-1]}')\n", | |
"print(f'Model Parameter {regessor.weights}')\n", | |
"print(f'Expected Parameter {coef}, bias {bias_}')\n" | |
] | |
}, | |
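As an optional sanity check on results like the one above, the weights found by gradient descent can be compared with a direct least-squares solution. The sketch below is self-contained and uses assumptions throughout: the synthetic noise-free data, the names `true_w` and `true_b`, the seed, and a larger step size (`eta = 0.05`) chosen so that the loop converges tightly. `np.linalg.lstsq` is a swapped-in reference solver, not the method used in the notebook.

```python
import numpy as np

rng = np.random.default_rng(42)          # fixed seed, arbitrary choice
m, n = 200, 3
X_data = rng.normal(size=(m, n))
true_w = np.array([2.0, -1.0, 0.5])      # hypothetical coefficients
true_b = 5.0                             # hypothetical bias
y = X_data @ true_w + true_b             # noise-free targets

# Append the column of ones so the bias is the last weight.
X = np.column_stack((X_data, np.ones(m)))

# Batch gradient descent using the update W <- W - eta * gradient.
W = np.zeros(n + 1)
eta = 0.05
for _ in range(2000):
    e = y - X @ W
    gradient = -(2.0 / m) * (X.T @ e)
    W = W - eta * gradient

# Reference solution from NumPy's least-squares solver.
W_ref, *_ = np.linalg.lstsq(X, y, rcond=None)

print(W)
print(W_ref)
```

If the two weight vectors agree closely, the gradient descent loop has effectively reached the minimizer of the cost; a visible gap usually means the learning rate is too small for the number of iterations used.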
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"id": "d0a1c5ce", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "\n", | |
"text/plain": [ | |
"<Figure size 1296x360 with 2 Axes>" | |
] | |
}, | |
"metadata": { | |
"needs_background": "light" | |
}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"fig, (ax1,ax2) = plt.subplots(nrows=1, ncols=2,figsize=(18,5))\n", | |
"\n", | |
"\n", | |
"iterations = list(range(1,len(loss_history)+1))\n", | |
"\n", | |
"_=ax1.plot(y,y_pred)\n", | |
"ax1.set_xlabel('Y True values')\n", | |
"ax1.set_ylabel('Y Predicted values')\n", | |
"ax1.grid()\n", | |
"_=ax2.plot(iterations,loss_history)\n", | |
"ax2.set_xlabel('iterations')\n", | |
"ax2.set_ylabel('loss')\n", | |
"ax2.grid()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"id": "4f7e7eac", | |
"metadata": {}, | |
"outputs": [], | |
"source": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "164a171c-64da-4cda-80ae-9e6a4c679f2b", | |
"metadata": {}, | |
"source": [ | |
"\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"id": "c50fdbae-18ab-4f80-8411-e27d31c79526", | |
"metadata": {}, | |
"outputs": [], | |
"source": [] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3 (ipykernel)", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.11.9" | |
}, | |
"toc": { | |
"base_numbering": 1, | |
"nav_menu": {}, | |
"number_sections": true, | |
"sideBar": true, | |
"skip_h1_title": false, | |
"title_cell": "Table of Contents", | |
"title_sidebar": "Contents", | |
"toc_cell": false, | |
"toc_position": {}, | |
"toc_section_display": true, | |
"toc_window_display": false | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 5 | |
} |