ashnair1 · October 29, 2021 11:55
diff --git a/gauss1d_threshold.ipynb b/gauss1d_threshold.ipynb
diff --git a/gistfile1.txt b/gistfile1.txt
 {
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Derivation of 1D Gaussian decision boundary threshold"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Consider a two class classification scenario. At the decision boundary, the posterior probability of classifying a data point into two classes will be equal i.e. $p(y=1|x) = p(y=2|x)$ \n",
    "\n",
    "Posterior definition\n",
    "\n",
    "$p(y=1|x) = \\large\\frac{p(x|y=1) * P(y=1)}{p(x)}$\n",
    "\n",
    "$p(y=2|x) = \\large\\frac{p(x|y=2) * P(y=2)}{p(x)}$\n",
    "\n",
    "where likelihoods are Gaussian i.e.\n",
    "\n",
    "$p(x|y=1) = \\mathcal{N}(x|\\mu_1, \\sigma_1)$ \n",
    "\n",
    "$p(x|y=2) = \\mathcal{N}(x|\\mu_2, \\sigma_2)$ \n",
    "\n",
    "So at the decision boundary, $p(y=1|x^*) = p(y=2|x^*)$ where $x^*$ is the threshold"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$\n",
    "\\begin{align}\n",
    "&p(y=1|x^*) = p(y=2|x^*) \\\\\n",
    "&\\Longrightarrow\\frac{1}{\\sqrt{2\\pi\\sigma_1^2}}\\exp(-\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2}) * P(y=1) = \\frac{1}{\\sqrt{2\\pi\\sigma_2^2}}\\exp(-\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2})* P(y=2)\\\\\n",
    "\\end{align}\n",
    "$\n",
    "\n",
    "Taking log on both sides,\n",
    "\n",
    "$\n",
    "\\begin{align}\n",
    "\\Rightarrow & \\small-\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\log\\sqrt{2\\pi}\\sigma_1 + \\log P(y=1)= -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2} -\\log\\sqrt{2\\pi}\\sigma_2 + \\log P(y=2)\\\\\n",
    "\\end{align}\n",
    "$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "\n",
    "Rearranging the above equation a bit we get,$$  \n",
    "\\begin{align}  \n",
    "\\small\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2}+\\log\\sqrt{2\\pi}\\sigma_1 -\\log\\sqrt{2\\pi}\\sigma_2    + \\log P(y=2) -\\log P(y=1)=0\\\\  \n",
    "\\end{align}  \n",
    "$$\n",
    "\n",
    "Group the log terms,\n",
    "\n",
    "$$  \n",
    "\\begin{align}  \n",
    "\\rightarrow \\small\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2} + \\log \\frac{\\sigma_1}{\\sigma_2} + \\log \\frac{P(y=2)}{P(y=1)} = 0\\\\  \n",
    "\\rightarrow \\small\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2} + \\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)} = 0  \n",
    "\\end{align}  \n",
    "$$\n",
    "\n",
    "Let's now only consider the first two terms and for convenience set  $\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)} = k$ a constant.\n",
    "Expanding the squares we get,\n",
    "\n",
    "$$  \n",
    "\\begin{align}  \n",
    "\\frac{2(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -4(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + 2(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2)}{4\\sigma_1^2\\sigma_2^2} +k=0\\\\  \n",
    "\\frac{0.5(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + 0.5(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2)}{\\sigma_1^2\\sigma_2^2} +k=0\\\\  \n",
    "0.5(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + 0.5(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2) + k\\sigma_1^2\\sigma_2^2=0\\\\\n",
    "\\end{align}  \n",
    "$$\n",
    "\n",
    "Multiply by 2,\n",
    "$$  \n",
    "\\begin{align}\n",
    "(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -2(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + \\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2 + 2k\\sigma_1^2\\sigma_2^2=0\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "For a quadratic equation of the form $ax^2 + bx + c = 0$, the discriminant ($\\Delta$) $= b^2 - 4ac$\n",
    "\n",
    "Calculate the individual terms from above equation,  \n",
    "$$  \n",
    "\\begin{align}  \n",
    "b^2 =\\left(-2(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)\\right) ^2= 4\\left(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2\\right) ^2\\\\  \n",
    "4ac =4 (\\sigma_2^2 - \\sigma_1^2 ) \\left(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2+ 2k\\sigma_1^2\\sigma_2^2 \\right)\\\\  \n",
    "\\end{align}  \n",
    "$$\n",
    "\n",
    "Calculate discriminant \n",
    "$$  \n",
    "\\begin{align}  \n",
    "b^2 - 4ac = 4\\left(\\left(\\sigma_1^2\\mu_2 - \\sigma_2^2\\mu_1\\right) ^2 - (\\sigma_1^2 - \\sigma_2^2 ) \\left(\\sigma_1^2\\mu_2^2-\\sigma_2^2\\mu_1^2 - 2k\\sigma_1^2\\sigma_2^2 \\right)\\right)\\\\  \n",
    "=   4\\left(\\left(\\sigma_1^2\\mu_2 - \\sigma_2^2\\mu_1\\right) ^2 - \\left(\\sigma_1^4\\mu_2^2 - \\sigma_1^2\\sigma_2^2\\mu_1^2-2k\\sigma_1^4\\sigma_2^2 - \\sigma_1^2\\sigma_2^2\\mu_2^2 + \\sigma_2^4\\mu_1^2 + 2k\\sigma_1^2\\sigma_2^4\\right)\\right)\\\\\n",
    "=   4\\left(\\left(\\sigma_1^4\\mu_2^2 -2\\sigma_1^2\\sigma_2^2\\mu_1\\mu_2 +\\sigma_2^4\\mu_1^2\\right) - \\left(\\sigma_1^4\\mu_2^2 - \\sigma_1^2\\sigma_2^2\\mu_1^2-2k\\sigma_1^4\\sigma_2^2 - \\sigma_1^2\\sigma_2^2\\mu_2^2 + \\sigma_2^4\\mu_1^2 + 2k\\sigma_1^2\\sigma_2^4\\right)\\right)\\\\\n",
    "= 4\\sigma_1^2\\sigma_2^2\\left(\\mu_1^2 +\\mu_2^2 - 2\\mu_1\\mu_2  + 2k(\\sigma_1^2 - \\sigma_2^2)\\right)\\\\ \n",
    "= 4\\sigma_1^2\\sigma_2^2\\left( \\left(\\mu_1 - \\mu_2\\right)^2  + 2k(\\sigma_1^2 - \\sigma_2^2)\\right)\\\\   \n",
    "\\end{align}  \n",
    "$$\n",
    "\n",
    "So the discriminant $\\Delta =4\\sigma_1^2\\sigma_2^2\\left(\\left(\\mu_1-\\mu_2\\right)^2 + 2\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)}(\\sigma_1^2 - \\sigma_2^2)\\right)$. \n",
    "\n",
    "\n",
    "Recall formula for roots of quadratic equation $ax^2 + bx + c = 0$,\n",
    "$$\n",
    "\\begin{align}\n",
    "x = \\frac{-b \\pm \\sqrt\\Delta}{2a}\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "So threshold ($x^*$) can be computed as follows:\n",
    "$$\n",
    "\\begin{align}\n",
    "x^* = \\frac{2(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2) \\pm \\sqrt{4\\sigma_1^2\\sigma_2^2\\left(\\left(\\mu_1-\\mu_2\\right)^2 + 2\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)}(\\sigma_1^2 - \\sigma_2^2)\\right)}}{2(\\sigma_2^2 - \\sigma_1^2 )}\\\\\n",
    "x^*= \\frac{(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2) \\pm \\sqrt{\\sigma_1^2\\sigma_2^2\\left(\\left(\\mu_1-\\mu_2\\right)^2 + 0.5\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)}(\\sigma_1^2 - \\sigma_2^2)\\right)}}{(\\sigma_2^2 - \\sigma_1^2 )}\n",
    "\\end{align}\n",
    "$$\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
 }
	{
	"cells": [
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## Derivation of 1D Gaussian decision boundary threshold"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Consider a two class classification scenario. At the decision boundary, the posterior probability of classifying a data point into two classes will be equal i.e. $p(y=1\|x) = p(y=2\|x)$ \n",
	"\n",
	"Posterior definition\n",
	"\n",
	"$p(y=1\|x) = \\large\\frac{p(x\|y=1) * P(y=1)}{p(x)}$\n",
	"\n",
	"$p(y=2\|x) = \\large\\frac{p(x\|y=2) * P(y=2)}{p(x)}$\n",
	"\n",
	"where likelihoods are Gaussian i.e.\n",
	"\n",
	"$p(x\|y=1) = \\mathcal{N}(x\|\\mu_1, \\sigma_1)$ \n",
	"\n",
	"$p(x\|y=2) = \\mathcal{N}(x\|\\mu_2, \\sigma_2)$ \n",
	"\n",
	"So at the decision boundary, $p(y=1\|x^) = p(y=2\|x^)$ where $x^*$ is the threshold"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"$\n",
	"\\begin{align}\n",
	"&p(y=1\|x^) = p(y=2\|x^) \\\\\n",
	"&\\Longrightarrow\\frac{1}{\\sqrt{2\\pi\\sigma_1^2}}\\exp(-\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2}) * P(y=1) = \\frac{1}{\\sqrt{2\\pi\\sigma_2^2}}\\exp(-\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2})* P(y=2)\\\\\n",
	"\\end{align}\n",
	"$\n",
	"\n",
	"Taking log on both sides,\n",
	"\n",
	"$\n",
	"\\begin{align}\n",
	"\\Rightarrow & \\small-\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\log\\sqrt{2\\pi}\\sigma_1 + \\log P(y=1)= -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2} -\\log\\sqrt{2\\pi}\\sigma_2 + \\log P(y=2)\\\\\n",
	"\\end{align}\n",
	"$"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"\n",
	"\n",
	"Rearranging the above equation a bit we get,$$ \n",
	"\\begin{align} \n",
	"\\small\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2}+\\log\\sqrt{2\\pi}\\sigma_1 -\\log\\sqrt{2\\pi}\\sigma_2 + \\log P(y=2) -\\log P(y=1)=0\\\\ \n",
	"\\end{align} \n",
	"$$\n",
	"\n",
	"Group the log terms,\n",
	"\n",
	"$$ \n",
	"\\begin{align} \n",
	"\\rightarrow \\small\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2} + \\log \\frac{\\sigma_1}{\\sigma_2} + \\log \\frac{P(y=2)}{P(y=1)} = 0\\\\ \n",
	"\\rightarrow \\small\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2} + \\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)} = 0 \n",
	"\\end{align} \n",
	"$$\n",
	"\n",
	"Let's now only consider the first two terms and for convenience set $\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)} = k$ a constant.\n",
	"Expanding the squares we get,\n",
	"\n",
	"$$ \n",
	"\\begin{align} \n",
	"\\frac{2(\\sigma_2^2 - \\sigma_1^2 )x^{^2} -4(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^ + 2(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2)}{4\\sigma_1^2\\sigma_2^2} +k=0\\\\ \n",
	"\\frac{0.5(\\sigma_2^2 - \\sigma_1^2 )x^{^2} -(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^ + 0.5(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2)}{\\sigma_1^2\\sigma_2^2} +k=0\\\\ \n",
	"0.5(\\sigma_2^2 - \\sigma_1^2 )x^{^2} -(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^ + 0.5(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2) + k\\sigma_1^2\\sigma_2^2=0\\\\\n",
	"\\end{align} \n",
	"$$\n",
	"\n",
	"Multiply by 2,\n",
	"$$ \n",
	"\\begin{align}\n",
	"(\\sigma_2^2 - \\sigma_1^2 )x^{^2} -2(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^ + \\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2 + 2k\\sigma_1^2\\sigma_2^2=0\n",
	"\\end{align}\n",
	"$$\n",
	"\n",
	"For a quadratic equation of the form $ax^2 + bx + c = 0$, the discriminant ($\\Delta$) $= b^2 - 4ac$\n",
	"\n",
	"Calculate the individual terms from above equation, \n",
	"$$ \n",
	"\\begin{align} \n",
	"b^2 =\\left(-2(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)\\right) ^2= 4\\left(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2\\right) ^2\\\\ \n",
	"4ac =4 (\\sigma_2^2 - \\sigma_1^2 ) \\left(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2+ 2k\\sigma_1^2\\sigma_2^2 \\right)\\\\ \n",
	"\\end{align} \n",
	"$$\n",
	"\n",
	"Calculate discriminant \n",
	"$$ \n",
	"\\begin{align} \n",
	"b^2 - 4ac = 4\\left(\\left(\\sigma_1^2\\mu_2 - \\sigma_2^2\\mu_1\\right) ^2 - (\\sigma_1^2 - \\sigma_2^2 ) \\left(\\sigma_1^2\\mu_2^2-\\sigma_2^2\\mu_1^2 - 2k\\sigma_1^2\\sigma_2^2 \\right)\\right)\\\\ \n",
	"= 4\\left(\\left(\\sigma_1^2\\mu_2 - \\sigma_2^2\\mu_1\\right) ^2 - \\left(\\sigma_1^4\\mu_2^2 - \\sigma_1^2\\sigma_2^2\\mu_1^2-2k\\sigma_1^4\\sigma_2^2 - \\sigma_1^2\\sigma_2^2\\mu_2^2 + \\sigma_2^4\\mu_1^2 + 2k\\sigma_1^2\\sigma_2^4\\right)\\right)\\\\\n",
	"= 4\\left(\\left(\\sigma_1^4\\mu_2^2 -2\\sigma_1^2\\sigma_2^2\\mu_1\\mu_2 +\\sigma_2^4\\mu_1^2\\right) - \\left(\\sigma_1^4\\mu_2^2 - \\sigma_1^2\\sigma_2^2\\mu_1^2-2k\\sigma_1^4\\sigma_2^2 - \\sigma_1^2\\sigma_2^2\\mu_2^2 + \\sigma_2^4\\mu_1^2 + 2k\\sigma_1^2\\sigma_2^4\\right)\\right)\\\\\n",
	"= 4\\sigma_1^2\\sigma_2^2\\left(\\mu_1^2 +\\mu_2^2 - 2\\mu_1\\mu_2 + 2k(\\sigma_1^2 - \\sigma_2^2)\\right)\\\\ \n",
	"= 4\\sigma_1^2\\sigma_2^2\\left( \\left(\\mu_1 - \\mu_2\\right)^2 + 2k(\\sigma_1^2 - \\sigma_2^2)\\right)\\\\ \n",
	"\\end{align} \n",
	"$$\n",
	"\n",
	"So the discriminant $\\Delta =4\\sigma_1^2\\sigma_2^2\\left(\\left(\\mu_1-\\mu_2\\right)^2 + 2\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)}(\\sigma_1^2 - \\sigma_2^2)\\right)$. \n",
	"\n",
	"\n",
	"Recall formula for roots of quadratic equation $ax^2 + bx + c = 0$,\n",
	"$$\n",
	"\\begin{align}\n",
	"x = \\frac{-b \\pm \\sqrt\\Delta}{2a}\n",
	"\\end{align}\n",
	"$$\n",
	"\n",
	"So threshold ($x^*$) can be computed as follows:\n",
	"$$\n",
	"\\begin{align}\n",
	"x^* = \\frac{2(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2) \\pm \\sqrt{4\\sigma_1^2\\sigma_2^2\\left(\\left(\\mu_1-\\mu_2\\right)^2 + 2\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)}(\\sigma_1^2 - \\sigma_2^2)\\right)}}{2(\\sigma_2^2 - \\sigma_1^2 )}\\\\\n",
	"x^*= \\frac{(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2) \\pm \\sqrt{\\sigma_1^2\\sigma_2^2\\left(\\left(\\mu_1-\\mu_2\\right)^2 + 0.5\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)}(\\sigma_1^2 - \\sigma_2^2)\\right)}}{(\\sigma_2^2 - \\sigma_1^2 )}\n",
	"\\end{align}\n",
	"$$\n"
	]
	}
	],
	"metadata": {
	"kernelspec": {
	"display_name": "Python 3",
	"language": "python",
	"name": "python3"
	},
	"language_info": {
	"codemirror_mode": {
	"name": "ipython",
	"version": 3
	},
	"file_extension": ".py",
	"mimetype": "text/x-python",
	"name": "python",
	"nbconvert_exporter": "python",
	"pygments_lexer": "ipython3",
	"version": "3.7.3"
	}
	},
	"nbformat": 4,
	"nbformat_minor": 4
	}