Last active
October 29, 2021 11:55
-
-
Save ashnair1/433ffbc1e747f80067f8a0439e346279 to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Derivation of 1D Gaussian decision boundary threshold" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Consider a two class classification scenario. At the decision boundary, the posterior probability of classifying a data point into two classes will be equal i.e. $p(y=1|x) = p(y=2|x)$ \n", | |
"\n", | |
"Posterior definition\n", | |
"\n", | |
"$p(y=1|x) = \\large\\frac{p(x|y=1) * P(y=1)}{p(x)}$\n", | |
"\n", | |
"$p(y=2|x) = \\large\\frac{p(x|y=2) * P(y=2)}{p(x)}$\n", | |
"\n", | |
"where likelihoods are Gaussian i.e.\n", | |
"\n", | |
"$p(x|y=1) = \\mathcal{N}(x|\\mu_1, \\sigma_1)$ \n", | |
"\n", | |
"$p(x|y=2) = \\mathcal{N}(x|\\mu_2, \\sigma_2)$ \n", | |
"\n", | |
"So at the decision boundary, $p(y=1|x^*) = p(y=2|x^*)$ where $x^*$ is the threshold" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"$\n", | |
"\\begin{align}\n", | |
"&p(y=1|x^*) = p(y=2|x^*) \\\\\n", | |
"&\\Longrightarrow\\frac{1}{\\sqrt{2\\pi\\sigma_1^2}}\\exp(-\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2}) * P(y=1) = \\frac{1}{\\sqrt{2\\pi\\sigma_2^2}}\\exp(-\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2})* P(y=2)\\\\\n", | |
"\\end{align}\n", | |
"$\n", | |
"\n", | |
"Taking log on both sides,\n", | |
"\n", | |
"$\n", | |
"\\begin{align}\n", | |
"\\Rightarrow & \\small-\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\log\\sqrt{2\\pi}\\sigma_1 + \\log P(y=1)= -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2} -\\log\\sqrt{2\\pi}\\sigma_2 + \\log P(y=2)\\\\\n", | |
"\\end{align}\n", | |
"$" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"\n", | |
"\n", | |
"Rearranging the above equation a bit we get,$$ \n", | |
"\\begin{align} \n", | |
"\\small\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2}+\\log\\sqrt{2\\pi}\\sigma_1 -\\log\\sqrt{2\\pi}\\sigma_2 + \\log P(y=2) -\\log P(y=1)=0\\\\ \n", | |
"\\end{align} \n", | |
"$$\n", | |
"\n", | |
"Group the log terms,\n", | |
"\n", | |
"$$ \n", | |
"\\begin{align} \n", | |
"\\rightarrow \\small\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2} + \\log \\frac{\\sigma_1}{\\sigma_2} + \\log \\frac{P(y=2)}{P(y=1)} = 0\\\\ \n", | |
"\\rightarrow \\small\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2} + \\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)} = 0 \n", | |
"\\end{align} \n", | |
"$$\n", | |
"\n", | |
"Let's now only consider the first two terms and for convenience set $\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)} = k$ a constant.\n", | |
"Expanding the squares we get,\n", | |
"\n", | |
"$$ \n", | |
"\\begin{align} \n", | |
"\\frac{2(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -4(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + 2(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2)}{4\\sigma_1^2\\sigma_2^2} +k=0\\\\ \n", | |
"\\frac{0.5(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + 0.5(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2)}{\\sigma_1^2\\sigma_2^2} +k=0\\\\ \n", | |
"0.5(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + 0.5(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2) + k\\sigma_1^2\\sigma_2^2=0\\\\\n", | |
"\\end{align} \n", | |
"$$\n", | |
"\n", | |
"Multiply by 2,\n", | |
"$$ \n", | |
"\\begin{align}\n", | |
"(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -2(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + \\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2 + 2k\\sigma_1^2\\sigma_2^2=0\n", | |
"\\end{align}\n", | |
"$$\n", | |
"\n", | |
"For a quadratic equation of the form $ax^2 + bx + c = 0$, the discriminant ($\\Delta$) $= b^2 - 4ac$\n", | |
"\n", | |
"Calculate the individual terms from above equation, \n", | |
"$$ \n", | |
"\\begin{align} \n", | |
"b^2 =\\left(-2(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)\\right) ^2= 4\\left(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2\\right) ^2\\\\ \n", | |
"4ac =4 (\\sigma_2^2 - \\sigma_1^2 ) \\left(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2+ 2k\\sigma_1^2\\sigma_2^2 \\right)\\\\ \n", | |
"\\end{align} \n", | |
"$$\n", | |
"\n", | |
"Calculate discriminant \n", | |
"$$ \n", | |
"\\begin{align} \n", | |
"b^2 - 4ac = 4\\left(\\left(\\sigma_1^2\\mu_2 - \\sigma_2^2\\mu_1\\right) ^2 - (\\sigma_1^2 - \\sigma_2^2 ) \\left(\\sigma_1^2\\mu_2^2-\\sigma_2^2\\mu_1^2 - 2k\\sigma_1^2\\sigma_2^2 \\right)\\right)\\\\ \n", | |
"= 4\\left(\\left(\\sigma_1^2\\mu_2 - \\sigma_2^2\\mu_1\\right) ^2 - \\left(\\sigma_1^4\\mu_2^2 - \\sigma_1^2\\sigma_2^2\\mu_1^2-2k\\sigma_1^4\\sigma_2^2 - \\sigma_1^2\\sigma_2^2\\mu_2^2 + \\sigma_2^4\\mu_1^2 + 2k\\sigma_1^2\\sigma_2^4\\right)\\right)\\\\\n", | |
"= 4\\left(\\left(\\sigma_1^4\\mu_2^2 -2\\sigma_1^2\\sigma_2^2\\mu_1\\mu_2 +\\sigma_2^4\\mu_1^2\\right) - \\left(\\sigma_1^4\\mu_2^2 - \\sigma_1^2\\sigma_2^2\\mu_1^2-2k\\sigma_1^4\\sigma_2^2 - \\sigma_1^2\\sigma_2^2\\mu_2^2 + \\sigma_2^4\\mu_1^2 + 2k\\sigma_1^2\\sigma_2^4\\right)\\right)\\\\\n", | |
"= 4\\sigma_1^2\\sigma_2^2\\left(\\mu_1^2 +\\mu_2^2 - 2\\mu_1\\mu_2 + 2k(\\sigma_1^2 - \\sigma_2^2)\\right)\\\\ \n", | |
"= 4\\sigma_1^2\\sigma_2^2\\left( \\left(\\mu_1 - \\mu_2\\right)^2 + 2k(\\sigma_1^2 - \\sigma_2^2)\\right)\\\\ \n", | |
"\\end{align} \n", | |
"$$\n", | |
"\n", | |
"So the discriminant $\\Delta =4\\sigma_1^2\\sigma_2^2\\left(\\left(\\mu_1-\\mu_2\\right)^2 + 2\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)}(\\sigma_1^2 - \\sigma_2^2)\\right)$. \n", | |
"\n", | |
"\n", | |
"Recall formula for roots of quadratic equation $ax^2 + bx + c = 0$,\n", | |
"$$\n", | |
"\\begin{align}\n", | |
"x = \\frac{-b \\pm \\sqrt\\Delta}{2a}\n", | |
"\\end{align}\n", | |
"$$\n", | |
"\n", | |
"So threshold ($x^*$) can be computed as follows:\n", | |
"$$\n", | |
"\\begin{align}\n", | |
"x^* = \\frac{2(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2) \\pm \\sqrt{4\\sigma_1^2\\sigma_2^2\\left(\\left(\\mu_1-\\mu_2\\right)^2 + 2\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)}(\\sigma_1^2 - \\sigma_2^2)\\right)}}{2(\\sigma_2^2 - \\sigma_1^2 )}\\\\\n", | |
"x^*= \\frac{(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2) \\pm \\sqrt{\\sigma_1^2\\sigma_2^2\\left(\\left(\\mu_1-\\mu_2\\right)^2 + 0.5\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)}(\\sigma_1^2 - \\sigma_2^2)\\right)}}{(\\sigma_2^2 - \\sigma_1^2 )}\n", | |
"\\end{align}\n", | |
"$$\n" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.7.3" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 4 | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment