Skip to content

Instantly share code, notes, and snippets.

@ashnair1
Last active October 29, 2021 11:55
Show Gist options
  • Save ashnair1/433ffbc1e747f80067f8a0439e346279 to your computer and use it in GitHub Desktop.
Save ashnair1/433ffbc1e747f80067f8a0439e346279 to your computer and use it in GitHub Desktop.
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Derivation of 1D Gaussian decision boundary threshold"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Consider a two class classification scenario. At the decision boundary, the posterior probability of classifying a data point into two classes will be equal i.e. $p(y=1|x) = p(y=2|x)$ \n",
"\n",
"Posterior definition\n",
"\n",
"$p(y=1|x) = \\large\\frac{p(x|y=1) * P(y=1)}{p(x)}$\n",
"\n",
"$p(y=2|x) = \\large\\frac{p(x|y=2) * P(y=2)}{p(x)}$\n",
"\n",
"where likelihoods are Gaussian i.e.\n",
"\n",
"$p(x|y=1) = \\mathcal{N}(x|\\mu_1, \\sigma_1)$ \n",
"\n",
"$p(x|y=2) = \\mathcal{N}(x|\\mu_2, \\sigma_2)$ \n",
"\n",
"So at the decision boundary, $p(y=1|x^*) = p(y=2|x^*)$ where $x^*$ is the threshold"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"$\n",
"\\begin{align}\n",
"&p(y=1|x^*) = p(y=2|x^*) \\\\\n",
"&\\Longrightarrow\\frac{1}{\\sqrt{2\\pi\\sigma_1^2}}\\exp(-\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2}) * P(y=1) = \\frac{1}{\\sqrt{2\\pi\\sigma_2^2}}\\exp(-\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2})* P(y=2)\\\\\n",
"\\end{align}\n",
"$\n",
"\n",
"Taking log on both sides,\n",
"\n",
"$\n",
"\\begin{align}\n",
"\\Rightarrow & \\small-\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\log\\sqrt{2\\pi}\\sigma_1 + \\log P(y=1)= -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2} -\\log\\sqrt{2\\pi}\\sigma_2 + \\log P(y=2)\\\\\n",
"\\end{align}\n",
"$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"Rearranging the above equation a bit we get,$$ \n",
"\\begin{align} \n",
"\\small\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2}+\\log\\sqrt{2\\pi}\\sigma_1 -\\log\\sqrt{2\\pi}\\sigma_2 + \\log P(y=2) -\\log P(y=1)=0\\\\ \n",
"\\end{align} \n",
"$$\n",
"\n",
"Group the log terms,\n",
"\n",
"$$ \n",
"\\begin{align} \n",
"\\rightarrow \\small\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2} + \\log \\frac{\\sigma_1}{\\sigma_2} + \\log \\frac{P(y=2)}{P(y=1)} = 0\\\\ \n",
"\\rightarrow \\small\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2} + \\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)} = 0 \n",
"\\end{align} \n",
"$$\n",
"\n",
"Let's now only consider the first two terms and for convenience set $\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)} = k$ a constant.\n",
"Expanding the squares we get,\n",
"\n",
"$$ \n",
"\\begin{align} \n",
"\\frac{2(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -4(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + 2(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2)}{4\\sigma_1^2\\sigma_2^2} +k=0\\\\ \n",
"\\frac{0.5(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + 0.5(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2)}{\\sigma_1^2\\sigma_2^2} +k=0\\\\ \n",
"0.5(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + 0.5(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2) + k\\sigma_1^2\\sigma_2^2=0\\\\\n",
"\\end{align} \n",
"$$\n",
"\n",
"Multiply by 2,\n",
"$$ \n",
"\\begin{align}\n",
"(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -2(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + \\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2 + 2k\\sigma_1^2\\sigma_2^2=0\n",
"\\end{align}\n",
"$$\n",
"\n",
"For a quadratic equation of the form $ax^2 + bx + c = 0$, the discriminant ($\\Delta$) $= b^2 - 4ac$\n",
"\n",
"Calculate the individual terms from above equation, \n",
"$$ \n",
"\\begin{align} \n",
"b^2 =\\left(-2(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)\\right) ^2= 4\\left(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2\\right) ^2\\\\ \n",
"4ac =4 (\\sigma_2^2 - \\sigma_1^2 ) \\left(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2+ 2k\\sigma_1^2\\sigma_2^2 \\right)\\\\ \n",
"\\end{align} \n",
"$$\n",
"\n",
"Calculate discriminant \n",
"$$ \n",
"\\begin{align} \n",
"b^2 - 4ac = 4\\left(\\left(\\sigma_1^2\\mu_2 - \\sigma_2^2\\mu_1\\right) ^2 - (\\sigma_1^2 - \\sigma_2^2 ) \\left(\\sigma_1^2\\mu_2^2-\\sigma_2^2\\mu_1^2 - 2k\\sigma_1^2\\sigma_2^2 \\right)\\right)\\\\ \n",
"= 4\\left(\\left(\\sigma_1^2\\mu_2 - \\sigma_2^2\\mu_1\\right) ^2 - \\left(\\sigma_1^4\\mu_2^2 - \\sigma_1^2\\sigma_2^2\\mu_1^2-2k\\sigma_1^4\\sigma_2^2 - \\sigma_1^2\\sigma_2^2\\mu_2^2 + \\sigma_2^4\\mu_1^2 + 2k\\sigma_1^2\\sigma_2^4\\right)\\right)\\\\\n",
"= 4\\left(\\left(\\sigma_1^4\\mu_2^2 -2\\sigma_1^2\\sigma_2^2\\mu_1\\mu_2 +\\sigma_2^4\\mu_1^2\\right) - \\left(\\sigma_1^4\\mu_2^2 - \\sigma_1^2\\sigma_2^2\\mu_1^2-2k\\sigma_1^4\\sigma_2^2 - \\sigma_1^2\\sigma_2^2\\mu_2^2 + \\sigma_2^4\\mu_1^2 + 2k\\sigma_1^2\\sigma_2^4\\right)\\right)\\\\\n",
"= 4\\sigma_1^2\\sigma_2^2\\left(\\mu_1^2 +\\mu_2^2 - 2\\mu_1\\mu_2 + 2k(\\sigma_1^2 - \\sigma_2^2)\\right)\\\\ \n",
"= 4\\sigma_1^2\\sigma_2^2\\left( \\left(\\mu_1 - \\mu_2\\right)^2 + 2k(\\sigma_1^2 - \\sigma_2^2)\\right)\\\\ \n",
"\\end{align} \n",
"$$\n",
"\n",
"So the discriminant $\\Delta =4\\sigma_1^2\\sigma_2^2\\left(\\left(\\mu_1-\\mu_2\\right)^2 + 2\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)}(\\sigma_1^2 - \\sigma_2^2)\\right)$. \n",
"\n",
"\n",
"Recall formula for roots of quadratic equation $ax^2 + bx + c = 0$,\n",
"$$\n",
"\\begin{align}\n",
"x = \\frac{-b \\pm \\sqrt\\Delta}{2a}\n",
"\\end{align}\n",
"$$\n",
"\n",
"So threshold ($x^*$) can be computed as follows:\n",
"$$\n",
"\\begin{align}\n",
"x^* = \\frac{2(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2) \\pm \\sqrt{4\\sigma_1^2\\sigma_2^2\\left(\\left(\\mu_1-\\mu_2\\right)^2 + 2\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)}(\\sigma_1^2 - \\sigma_2^2)\\right)}}{2(\\sigma_2^2 - \\sigma_1^2 )}\\\\\n",
"x^*= \\frac{(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2) \\pm \\sqrt{\\sigma_1^2\\sigma_2^2\\left(\\left(\\mu_1-\\mu_2\\right)^2 + 0.5\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)}(\\sigma_1^2 - \\sigma_2^2)\\right)}}{(\\sigma_2^2 - \\sigma_1^2 )}\n",
"\\end{align}\n",
"$$\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Derivation of 1D Gaussian decision boundary threshold"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Consider a two class classification scenario. At the decision boundary, the posterior probability of classifying a data point into two classes will be equal i.e. $p(y=1|x) = p(y=2|x)$ \n",
"\n",
"Posterior definition\n",
"\n",
"$p(y=1|x) = \\large\\frac{p(x|y=1) * P(y=1)}{p(x)}$\n",
"\n",
"$p(y=2|x) = \\large\\frac{p(x|y=2) * P(y=2)}{p(x)}$\n",
"\n",
"where likelihoods are Gaussian i.e.\n",
"\n",
"$p(x|y=1) = \\mathcal{N}(x|\\mu_1, \\sigma_1)$ \n",
"\n",
"$p(x|y=2) = \\mathcal{N}(x|\\mu_2, \\sigma_2)$ \n",
"\n",
"So at the decision boundary, $p(y=1|x^*) = p(y=2|x^*)$ where $x^*$ is the threshold"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"$\n",
"\\begin{align}\n",
"&p(y=1|x^*) = p(y=2|x^*) \\\\\n",
"&\\Longrightarrow\\frac{1}{\\sqrt{2\\pi\\sigma_1^2}}\\exp(-\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2}) * P(y=1) = \\frac{1}{\\sqrt{2\\pi\\sigma_2^2}}\\exp(-\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2})* P(y=2)\\\\\n",
"\\end{align}\n",
"$\n",
"\n",
"Taking log on both sides,\n",
"\n",
"$\n",
"\\begin{align}\n",
"\\Rightarrow & \\small-\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\log\\sqrt{2\\pi}\\sigma_1 + \\log P(y=1)= -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2} -\\log\\sqrt{2\\pi}\\sigma_2 + \\log P(y=2)\\\\\n",
"\\end{align}\n",
"$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"Rearranging the above equation a bit we get,$$ \n",
"\\begin{align} \n",
"\\small\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2}+\\log\\sqrt{2\\pi}\\sigma_1 -\\log\\sqrt{2\\pi}\\sigma_2 + \\log P(y=2) -\\log P(y=1)=0\\\\ \n",
"\\end{align} \n",
"$$\n",
"\n",
"Group the log terms,\n",
"\n",
"$$ \n",
"\\begin{align} \n",
"\\rightarrow \\small\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2} + \\log \\frac{\\sigma_1}{\\sigma_2} + \\log \\frac{P(y=2)}{P(y=1)} = 0\\\\ \n",
"\\rightarrow \\small\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2} + \\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)} = 0 \n",
"\\end{align} \n",
"$$\n",
"\n",
"Let's now only consider the first two terms and for convenience set $\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)} = k$ a constant.\n",
"Expanding the squares we get,\n",
"\n",
"$$ \n",
"\\begin{align} \n",
"\\frac{2(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -4(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + 2(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2)}{4\\sigma_1^2\\sigma_2^2} +k=0\\\\ \n",
"\\frac{0.5(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + 0.5(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2)}{\\sigma_1^2\\sigma_2^2} +k=0\\\\ \n",
"0.5(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + 0.5(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2) + k\\sigma_1^2\\sigma_2^2=0\\\\\n",
"\\end{align} \n",
"$$\n",
"\n",
"Multiply by 2,\n",
"$$ \n",
"\\begin{align}\n",
"(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -2(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + \\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2 + 2k\\sigma_1^2\\sigma_2^2=0\n",
"\\end{align}\n",
"$$\n",
"\n",
"For a quadratic equation of the form $ax^2 + bx + c = 0$, the discriminant ($\\Delta$) $= b^2 - 4ac$\n",
"\n",
"Calculate the individual terms from above equation, \n",
"$$ \n",
"\\begin{align} \n",
"b^2 =\\left(-2(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)\\right) ^2= 4\\left(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2\\right) ^2\\\\ \n",
"4ac =4 (\\sigma_2^2 - \\sigma_1^2 ) \\left(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2+ 2k\\sigma_1^2\\sigma_2^2 \\right)\\\\ \n",
"\\end{align} \n",
"$$\n",
"\n",
"Calculate discriminant \n",
"$$ \n",
"\\begin{align} \n",
"b^2 - 4ac = 4\\left(\\left(\\sigma_1^2\\mu_2 - \\sigma_2^2\\mu_1\\right) ^2 - (\\sigma_1^2 - \\sigma_2^2 ) \\left(\\sigma_1^2\\mu_2^2-\\sigma_2^2\\mu_1^2 - 2k\\sigma_1^2\\sigma_2^2 \\right)\\right)\\\\ \n",
"= 4\\left(\\left(\\sigma_1^2\\mu_2 - \\sigma_2^2\\mu_1\\right) ^2 - \\left(\\sigma_1^4\\mu_2^2 - \\sigma_1^2\\sigma_2^2\\mu_1^2-2k\\sigma_1^4\\sigma_2^2 - \\sigma_1^2\\sigma_2^2\\mu_2^2 + \\sigma_2^4\\mu_1^2 + 2k\\sigma_1^2\\sigma_2^4\\right)\\right)\\\\\n",
"= 4\\left(\\left(\\sigma_1^4\\mu_2^2 -2\\sigma_1^2\\sigma_2^2\\mu_1\\mu_2 +\\sigma_2^4\\mu_1^2\\right) - \\left(\\sigma_1^4\\mu_2^2 - \\sigma_1^2\\sigma_2^2\\mu_1^2-2k\\sigma_1^4\\sigma_2^2 - \\sigma_1^2\\sigma_2^2\\mu_2^2 + \\sigma_2^4\\mu_1^2 + 2k\\sigma_1^2\\sigma_2^4\\right)\\right)\\\\\n",
"= 4\\sigma_1^2\\sigma_2^2\\left(\\mu_1^2 +\\mu_2^2 - 2\\mu_1\\mu_2 + 2k(\\sigma_1^2 - \\sigma_2^2)\\right)\\\\ \n",
"= 4\\sigma_1^2\\sigma_2^2\\left( \\left(\\mu_1 - \\mu_2\\right)^2 + 2k(\\sigma_1^2 - \\sigma_2^2)\\right)\\\\ \n",
"\\end{align} \n",
"$$\n",
"\n",
"So the discriminant $\\Delta =4\\sigma_1^2\\sigma_2^2\\left(\\left(\\mu_1-\\mu_2\\right)^2 + 2\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)}(\\sigma_1^2 - \\sigma_2^2)\\right)$. \n",
"\n",
"\n",
"Recall formula for roots of quadratic equation $ax^2 + bx + c = 0$,\n",
"$$\n",
"\\begin{align}\n",
"x = \\frac{-b \\pm \\sqrt\\Delta}{2a}\n",
"\\end{align}\n",
"$$\n",
"\n",
"So threshold ($x^*$) can be computed as follows:\n",
"$$\n",
"\\begin{align}\n",
"x^* = \\frac{2(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2) \\pm \\sqrt{4\\sigma_1^2\\sigma_2^2\\left(\\left(\\mu_1-\\mu_2\\right)^2 + 2\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)}(\\sigma_1^2 - \\sigma_2^2)\\right)}}{2(\\sigma_2^2 - \\sigma_1^2 )}\\\\\n",
"x^*= \\frac{(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2) \\pm \\sqrt{\\sigma_1^2\\sigma_2^2\\left(\\left(\\mu_1-\\mu_2\\right)^2 + 0.5\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)}(\\sigma_1^2 - \\sigma_2^2)\\right)}}{(\\sigma_2^2 - \\sigma_1^2 )}\n",
"\\end{align}\n",
"$$\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment