@jnturton
Last active March 19, 2018 17:48
Information changes in the Monty Hall Problem
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"_Note: GitHub does not render LaTeX embedded in Jupyter Notebooks at the time of writing. Please paste the Github URL of this document into http://nbviewer.jupyter.org/ if your view is missing mathematics._"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Information changes in the Monty Hall Problem\n",
"\n",
"What routinely fools Nobel-prize-winning physicists but not pigeons? [The Monty Hall Problem](https://en.m.wikipedia.org/wiki/Monty_Hall_problem), a delightfully counterintuitive problem in probability. It's nicely explained on the Wikipedia page and in numerous other public places, so we needn't go through all of that again. What I want to do is to zoom in on the frequently stated (and correct) idea that Monty \"leaks information\" when he reveals a goat.\n",
"\n",
"Now it's immediately obvious that Monty leaks _some_ information when he reveals a goat - one of the doors is completely eliminated and we are left to assign probabilities over two doors. But we know that in some sense we have learned _more_ from Monty's revelation than our intuition would have us believe. Is this something we can put a number to? To make this well-defined enough to do calculations, I'm going to analyse information changes in \n",
"- the Monty Hall Problem itself and \n",
"- a superficially similar problem that leads to exactly the wrong answer that most of us give."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Definition:** The _Shannon entropy_ of a system $X$ - the expected _self-information_, or _information content_, of its outcomes - is\n",
"\n",
"$$\n",
"H(X) = -\\sum_x P(x) \\log_2 P(x)\n",
"$$\n",
"\n",
"where $x$ ranges over the possible states of $X$."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### A superficially similar problem\n",
"\n",
"1. You, the contestant, are presented by Monty Hall with a choice of three doors. One conceals a car, the other two conceal goats.\n",
"2. _Right away_ Monty Hall opens a door chosen to reveal a goat.\n",
"3. You are left with a choice of two doors, one concealing a car and the other a goat.\n",
"\n",
"This problem has no trick to it. Steps 1 and 2 are superfluous and it reduces directly to a two-door guessing game in which your probability of winning is $0.5$ on each door. It is my belief that many of us incorrectly reduce the Monty Hall Problem to this problem when our intuitions lead us astray. Let's look at what happens in terms of information theory."
]
},
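{
"cell_type": "markdown",
"metadata": {},
"source": [
"_A quick sanity check (my addition, not part of the original argument): simulating the reduced two-door guessing game. The helper name `simulate_two_doors` is made up for illustration; the observed win rate should land near $0.5$._"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import random\n",
"\n",
"def simulate_two_doors(trials=100000):\n",
"    # One of two doors hides the car; the contestant guesses uniformly.\n",
"    wins = sum(random.randrange(2) == random.randrange(2) for _ in range(trials))\n",
"    return wins / trials\n",
"\n",
"simulate_two_doors()"
]
},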
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Before:\t\t1.584962500721156 bits\n",
"After:\t\t1.0 bits\n",
"Change:\t\t-0.5849625007211561 bits\n"
]
}
],
"source": [
"import math\n",
"\n",
"def shannon_entropy(probabilities):\n",
"    # H(X) = -sum_x P(x) log2 P(x), in bits.\n",
"    return -sum(p * math.log2(p) for p in probabilities)\n",
"\n",
"s_before = shannon_entropy((1/3,1/3,1/3))\n",
"s_after = shannon_entropy((1/2,1/2))\n",
"s_change = s_after - s_before\n",
"\n",
"print(\"Before:\\t\\t{} bits\\nAfter:\\t\\t{} bits\\nChange:\\t\\t{} bits\".format(s_before, s_after, s_change))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Monty has leaked about $0.585$ bits of information by eliminating one of the doors."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### The Monty Hall Problem\n",
"\n",
"I'm going to assume that you've been to Wikipedia, know the sequence of the Monty Hall Problem, and have accepted that the probability distribution over the remaining two doors has shifted to $\\{1/3, 2/3\\}$, so that I can move directly to the information change calculation :-)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Before:\t\t1.584962500721156 bits\n",
"After:\t\t0.9182958340544896 bits\n",
"Change:\t\t-0.6666666666666665 bits\n"
]
}
],
"source": [
"m_before = shannon_entropy((1/3,1/3,1/3))\n",
"m_after = shannon_entropy((1/3,2/3))\n",
"m_change = m_after - m_before\n",
"\n",
"print(\"Before:\\t\\t{} bits\\nAfter:\\t\\t{} bits\\nChange:\\t\\t{} bits\".format(m_before, m_after, m_change))"
]
},
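{
"cell_type": "markdown",
"metadata": {},
"source": [
"_Another sanity check (my addition): a Monte Carlo simulation of the full game, in which Monty always opens an unpicked door concealing a goat. The switch-and-win rate should land near $2/3$, confirming the $\\{1/3, 2/3\\}$ distribution assumed above. The helper name `simulate_monty_switch` is made up for illustration._"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import random\n",
"\n",
"def simulate_monty_switch(trials=100000):\n",
"    switch_wins = 0\n",
"    for _ in range(trials):\n",
"        car = random.randrange(3)\n",
"        pick = random.randrange(3)\n",
"        # Monty opens a door that is neither the pick nor the car.\n",
"        opened = next(d for d in range(3) if d != pick and d != car)\n",
"        # The contestant switches to the one remaining closed door.\n",
"        switched = next(d for d in range(3) if d != pick and d != opened)\n",
"        switch_wins += switched == car\n",
"    return switch_wins / trials\n",
"\n",
"simulate_monty_switch()"
]
},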
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this case Monty has leaked $0.\\dot{6}$ bits of information by eliminating one of the doors, and we can proceed to calculate how much _more_ information Monty actually leaks than so many of us first believe."
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.08170416594551044 bits\n"
]
}
],
"source": [
"print(\"{} bits\".format(abs(m_change - s_change)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That may not sound like much information, but these particular bits are worth one sixth of a car (assuming a goat is worthless)."
]
},
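{
"cell_type": "markdown",
"metadata": {},
"source": [
"_To spell out the one-sixth-of-a-car figure (my arithmetic): a switching contestant wins the car with probability $2/3$, while the fifty-fifty misreading promises only $1/2$, and $2/3 - 1/2 = 1/6$ of a car._"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from fractions import Fraction\n",
"\n",
"# Win probability when switching minus the naive 50/50 estimate.\n",
"print(Fraction(2, 3) - Fraction(1, 2))"
]
},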
{
"cell_type": "markdown",
"metadata": {},
"source": [
"© James Turton 2016"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2+"
}
},
"nbformat": 4,
"nbformat_minor": 1
}