Skip to content

Instantly share code, notes, and snippets.

@aflaxman
Created September 18, 2013 19:12
Show Gist options
  • Select an option

  • Save aflaxman/6614061 to your computer and use it in GitHub Desktop.

Select an option

Save aflaxman/6614061 to your computer and use it in GitHub Desktop.
Display the source blob
Display the rendered blob
Raw
{
"metadata": {
"name": "2013_09_18_pymc3_bfgs_va-post-dx-adj"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "code",
"collapsed": false,
"input": "!date",
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": "Wed Sep 18 11:15:08 PDT 2013\r\n"
}
],
"prompt_number": 1
},
{
"cell_type": "markdown",
"metadata": {},
"source": "The goal in this notebook is to quickly test a sophisticated approach to post-diagnosis adjustment of VA CSMFs. This is a generalization of the standard backcalculation of epidemiology to the setting where there are many causes.\n\nIn the standard setting, the sensitivity and specificity of a test are known a priori, and the test results are adjusted to yield population level estimates. For general application, the test has a 2x2 confusion matrix $M$, with $M_{ij}$ equal to the probability the test returns $i$ when the truth is $j$.\n\nThe basic challenge then is to find $p^{true}$ such that $M\\cdot p^{true} = p^{obs}$."
},
{
"cell_type": "code",
"collapsed": false,
"input": "# example\nM = [[.9, .1],\n [.2, .8]]\n\np_obs = [.2, .8]\n\np_true = np.linalg.inv(M).dot(p_obs)",
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 2
},
{
"cell_type": "code",
"collapsed": false,
"input": "p_true",
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 3,
"text": "array([ 0.11428571, 0.97142857])"
}
],
"prompt_number": 3
},
{
"cell_type": "markdown",
"metadata": {},
"source": "The problem is that p_true must be constrained to be on the simplex (i.e. sum to one, and have all entries in [0,1].\n\nSo maybe the solution is to find $p^{true}$ such that $d(M\\cdot p^{true}, p^{obs})$ is minimized, subject to the constraints."
},
{
"cell_type": "code",
"collapsed": false,
"input": "def d(p_true):\n p_true /= p_true.sum()\n return np.sqrt(np.mean(np.square(M.dot(p_true) - p_obs)))\n\nM = array(M)\np_obs = array(p_obs)\n\np_true = array([.2, .8])\nd(p_true)",
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 6,
"text": "0.094868329805051305"
}
],
"prompt_number": 6
},
{
"cell_type": "code",
"collapsed": false,
"input": "x = linspace(0,.2,100)\ny = zeros(100)\n\nfor i, p0 in enumerate(x):\n p_true = array([p0, 1-p0])\n y[i] = d(p_true)\n \nplot(x,y)\nxlabel('$p^{true}_0$', fontsize=20,)\nylabel('$d(M\\cdot p^{true}, p^{obs})$', fontsize=20, rotation=0,)\naxis(ymin=0);",
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"png": "iVBORw0KGgoAAAANSUhEUgAAAgoAAAEdCAYAAABt+k/rAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XtYVNX+BvB3BlRMFBUFERDUlLiMioCKiuIlTQrMwrCU\nMo+llkd/eqJ7qZ2yLD1lWaZHO5pEJaipoSSKgFe8i6hI3gEvJIaaOtxm/f5YxzmOOIjOMHuA9/M8\n8+jMbPb+znaceVlr7bVUQggBIiIiortQK10AERERWS8GBSIiIjKKQYGIiIiMYlAgIiIioxgUiIiI\nyCgGBSIiIjKKQYHMIikpCRqNBj4+Ppg1a1aF59PT09G1a1fUq1cPK1asMHhu6dKl8PX1ha+vL77/\n/ntLlUxERFWg4jwKZKri4mI88sgj2Lp1K5ydnREcHIyFCxfC399fv82ZM2dw9epVzJ49GxEREXj6\n6acBAOfPn0dISAgOHDgAAOjSpQu2bdsGZ2dnRV4LEREZYosCmSwjIwO+vr5wdXWFra0toqKikJiY\naLCNh4cHNBoN1GrDt1xycjKGDBkCe3t72Nvb47HHHkNycrIlyyciokowKJDJ8vLy4O7urr/v5uaG\nvLy8Kv1sfn4+3NzcHuhniYio+tkqXQDVfCqVqlYcg4ioNjJ1hAFbFMhkbm5uyM3N1d/Pzc01aGG4\n0+1f+vfzs0II3sx0mzZtmuI11JYbzyXPpzXfzIFBgUwWFBSErKws5Ofno7S0FMuXL8eQIUPuuu2d\nb96BAwciKSkJ165dw7Vr15CUlISBAwdaqnQiIroHBgUymZ2dHebPn4/Bgwejc+fOeOqpp9C1a1dM\nmzYNa9euBQDs3r0b7u7uSEhIwLhx46DRaAAALi4ueOedd9C9e3d0794d77//Pq94ICKyIrw8kmoE\nlUpltmY0AlJTUxEaGqp0GbUCz6V58Xyalzk+OxkUqEZgUCAiun/m+Oxk1wMREREZxaBARERERjEo\nEBERkVEMCkRERGQUgwIREVEtdOqUefbDoEBERFTLLFkCdO9unn0xKBAREdUSOh3w5pvAP/8JpKWZ\nZ59cFIqIiKgWuH4dGDUKuHwZyMgAWrQwz37ZokBERFTD5eYCvXsDTZsCycnmCwkAgwIREVGNtmsX\n0KMH8NxzwHffAfXrm3f/7HogIiKqoZYvB159FVi0CBg6tHqOwaBARERUwwgBfPCBbEFITga6dKm+\nYzEoEBER1SA3bwIvvgicPi0HLbZqVb3H4xgFIiKiGuLcOaBvX8DWFkhNrf6QADAoEBER1Qh798pJ\nlIYOBZYtA+zsLHNcdj0QERFZufh44JVXgIULgWHDLHtsBgUiIiIrJYScZXHxYmDDBsDf3/I1MCgQ\nERFZoRs35KDFs2ctM2jRGI5RICIisjJ5eUBICNCgAbB5s3IhAWBQICIisioZGXKmxWeeAZYutdyg\nRWPY9UBERGQlfvgBmDJFjkkID1e6GolBgYiISGE6HfDOO3JK5pQUwM9P6Yr+h0GBiIhIQVevyuWh\nr1417/LQ5sIxCkRERAo5cQIIDgZcXc2/PLS5MCgQEREpICUF6NULmDgRmD8fqFdP6Yrujl0PRERE\nFiQE8PXXwIcfAj/+CPTrp3RFlWNQICIispCSEtmCsGMHsH070K6d0hXdG4MCERGRBRQUAJGRgKOj\nDAmNGytdUdVwjAIREVE127cPCAqSS0SvWFFzQgLAFgUiIqJq9dNPwN//LgcsRkYqXc39Y1AgIiKq\nBuXlwLvvyqCwcSPQubPSFT0YBgUiIiIzu3IFeO454OZNYPdu65wfoao4RoGIiMiMjh0DuncH2rcH\nfvutZocEgEGBiIjIbH79VS4PHRMDfPml9U6idD/Y9UBERGQinQ6YORP49ltgzRq5THRtwRYFMouk\npCRoNBr4+Phg1qxZFZ4vLi5GVFQUNBoNevXqhTNnzgAArl+/jsjISHh7e6NDhw6YPn26hSsnIjLN\ntWvA8OFAYiKwa1ftCgkAgwKZQXFxMSZMmICkpCRkZmYiISEB+/fvN9hm3rx5cHFxwaFDhxATE4NJ\nkyYBAH788UfUq1cPR48eRWZmJpYuXYqTJ08q8TKIiO7b8eMyGDRrBqSmAq1bK12R+TEokMkyMjLg\n6+sLV1dX2NraIioqComJiQbbrFu3DtHR0QCAiIgIbN++HUIIuLu74/r16ygvL8f169dRv359NG/e\nXImXQUR0X9avl4s6/f3vwL//DTRooHRF1YNBgUyWl5cHd3d3/X03Nzfk5eUZ3UatVsPR0REFBQUY\nPHgwmjRpAhcXF3h6eiImJgZNmza1aP1ERPdDCDkeYexYYOVKYPx4QKVSuqrqw8GMZDKVCf9DYmNj\ncfPmTZw/fx6XL19GSEgIBgwYgLZt21bY9vbxC6GhoQgNDX3g4xIRPYhr14DRo4H8fDkewdVV6YoM\npaamIjU11az7ZFAgk7m5uSE3N1d/Pzc316CF4dY2Z8+ehZOTE3Q6HQoLC9GiRQts3boVw4YNg42N\nDVq2bIlevXph165d9wwKRESW9vvvwJNPyu6GuDjr7Gq485eoGTNmmLxPdj2QyYKCgpCVlYX8/HyU\nlpZi+fLlGDJkiME2YWFhiI2NBQCsXr0awcHBsLGxQfv27ZGSkgJAXgGxY8cOPPzwwxZ/DURElfn1\nVxkQJk8GFi60zpBQXVRCCKF0EVTzrV+/HjExMdDpdIiOjsZbb72FadOmITAwEOHh4SguLkZ0dDSO\nHj2Kxo0bIy4uDp6enrhx4wZGjx6NgwcPQqfT4YUXXsC7775bYf8qlQp8qxKRpel0wAcfAIsXA/Hx\nNe/SR3N8djIoUI3AoEBEllZUBERHy3Ubli8HWrVSuqL7Z47PTnY9EBER3SEzEwgMBNq2BTZtqpkh\nwVwYFIiIiG4TFwcMGABMn1571mswBa96ICIiAlBaKhdzWrsW2LgR6NxZ6YqsA4MCERHVeefPy/Ua\nHByAPXvklMwkseuBiIjqtC1b5HiEwYNlawJDgiG2KBARUZ0kBPDFF8AnnwBLlwKPPaZ0RdaJQYGI\niOqca9eAv/0NOHkSyMgAPD2Vrsh6seuBiIjqlCNHgG7dZBfD1q0MCffCoEBERHXGjz8CffsCb7wB\nLFgA2NkpXZH1Y9cDERHVesXFwNSpwIYNvPTxfjEoEBFRrXbmjLz00dVVXvro4KB0RTULux6IiKjW\nWrdOjkd45hlg5UqGhAfBFgUiIqp1ysqAadOA778HVqwAevdWuqKai0GBiIhqlfPngWeflWs07N0L\nODkpXVHNxq4HIiKqNVJSgIAAoH9/ICmJIcEc2KJANcbvvwMdOihdBRFZo/Jy4KOPgG+/BZYtk6s/\nknmwRYFqjJ49gZ9/VroKIrI2Fy4AgwbJ1oQ9exgSzI1BgWqMDRuAd94Bxo8Hbt5UuhoisgYpKUDX\nrkCvXnJ+hNatla6o9mFQoBrD3x/Ytw/480+gRw/g2DGlKyIipZSXy6saRo6UCzp98AFgy870asGg\nQDVKkybATz8Br7wiL3eKjVW6IiKytHPnZPfC1q3A/v3Ao48qXVHtxqBANY5KBYwbB2zaBHz4ITBm\nDHD9utJVEZElJCXJqxoGDpTdka1aKV1R7cegQDVWp05y4JJOBwQGApmZSldERNWlpASIiQFeflkO\nan73XcDGRumq6gYGBarR7O2BJUuAt9+WTZHffAMIoXRVRGROJ0/KrsZjx2RXQ58+SldUtzAoUK0Q\nHQ1s2wYsWgQ89RRw+bLSFRGROcTFAd27y0GLq1cDjo5KV1T3MChQrdGxI7BjB9C2LdClC5CernRF\nRPSg/voLGD0amDFDjkWYPFmOTyLLY1CgWqVBA+Bf/5Kzs0VFAe+9JxeHIaKaY+9eOTeCjY38u7+/\n0hXVbSoh2KNL1k+lUuF+36rnz8vfSK5dA374QbY0EJH10umA2bPl7auvZNgn0zzIZ+ed2KJAtZaL\nC7B+PRAZKdejj4tTuiIiMubcOTkN85o1wO7dDAnWhEGBajW1Gpg6VfZx/vOfwKhRwJUrSldFRLdb\ntUp2NYSEAKmpgIeH0hXR7RgUqE7w95d9nU2ayIGOW7cqXRER/fUXMHYs8NprwC+/yCmZOQ2z9WFQ\noDrjoYfkPAtffgkMHy4XmCopUboqoropI0MG+PJy4MABuX4LWScOZqQawRwDcm538SLwt7/J5Wlj\nY4FHHjHbromoEqWlwEcfAfPny+D+9NNKV1S7cTAj0QNydgbWrpXNnr17A/PmyRHXRFR9cnLk/7cd\nO+QMiwwJNQODAtVZKhUwfjywfTuwbBnw2GNAfr7SVRHVPkLI1oOePeWA4vXrgdatla6KqopBgeq8\njh3l9M8hIbLP9McfuV4Ekbnk58sQvmSJHET897/Lq5Go5uA/F5lFUlISNBoNfHx8MGvWrArPFxcX\nIyoqChqNBr169cKZM2f0z2VmZiIkJAT+/v7QaDQoLi62ZOkA5Ejr994D1q2TS1c/8wzwxx8WL4Oo\n1hBCTnTm7y9bErZt41igmopBgUxWXFyMCRMmICkpCZmZmUhISMD+/fsNtpk3bx5cXFxw6NAhxMTE\nYNKkSQAArVaL5557DosXL8b+/fuxdetW1KtXT4mXAUAuV713L+DpCXTuLBehIaL7U1AgJzqbOVN2\nM0ybBij435pMxKBAJsvIyICvry9cXV1ha2uLqKgoJCYmGmyzbt06REdHAwAiIiKwfft26HQ6JCUl\noVu3bujYsSMAwMHBAWqF2yXt7IDPPpNr3v/jH8DzzwN//qloSUQ1xooVMmQ//LAM3QEBSldEpmJQ\nIJPl5eXB3d1df9/NzQ15eXlGt1Gr1XB0dERBQQGOHTuGkpIShIaGQqPR4MMPP7Ro7ZUJCQEOHgSa\nNgU0GuCO7ENEt7l0CRgxAnj7bRkWZs2SoZtqPs6BRSZTmbD2a3l5ObZv3449e/agYcOGGDBgAAIC\nAjBkyJAK206fPl3/99DQUISGhj7wcauqUSM5QdNTTwFjxgDx8cDnnwPNmlX7oYlqjJUrgYkTgWef\nBf7zH6BhQ6UrqrtSU1ORmppq1n0yKJDJ3NzckJubq7+fm5tr0MJwa5uzZ8/CyckJOp0OhYWFcHJy\nQps2bdCnTx80b94cABAWFoYDBw7cMyhYWmgokJkJvPWWbF345hsgIkKxcoisQkGBDAgHD8oQ3auX\n0hXRnb9EzZgxw+R9suuBTBYUFISsrCzk5+ejtLQUy5cvr/BFHxYWhtjYWADA6tWrERwcDLVajQED\nBuDgwYO4efMmysrKkJaWBm9vbyVexj3Z28ulb+Pi5NiF557jlRFUNwkh/x9oNHL59gMHGBJqMwYF\nMpmdnR3mz5+PwYMHo3PnznjqqafQtWtXTJs2DWvXrgUATJw4EefOnYNGo8Fnn32GL7/8EgDg4uKC\n1157DUFBQfD19YWfnx+efPJJJV/OPfXpI3+DcnWVH5RxcZx3geqO3FwgPBz4+GM5u+msWexqqO24\n1gPVCOZe68Fcdu+Wa0a4u8u569u0Uboiouqh0wELFgDvvw9MmgS88QZQv77SVdG9cK0HIoUFBQF7\n9gDBwUDXrsDcuXI1PKLa5PBheRXQsmVAWpqcnIwhoe5gUCAyUf36wLvvypnnVq2SoeHgQaWrIjKd\nVitbEEJD5RoNW7cCPj5KV0WWxqBAZCZeXkBKCvDyy8CjjwIxMcD160pXRfRgNm0COnUCsrLkYMUJ\nE7hGQ13Ff3YiM1Kr5dLVWVnAhQuAr68c8EVUU1y8KFsP/vY3YM4cOUeCq6vSVZGSGBSIqoGTk+zP\nXbwYeO01YOhQ4PRppasiMq68XM4PotHIJaAPH5ZXNxAxKPxXdnY2tmzZonQZZjFv3jw4ODhg586d\nSpdS5w0YICdq6tZNLjg1cyagwOKYRJXatQvo3h346SfZffbpp3JWUiKAQUFv9uzZyMnJUboMs3jh\nhRdgZ2eH7t27K10KAWjQAHjnHXkpZUaG/I0tKUnpqojkhGEvvyxbvCZPllc0+PkpXRVZm2oPCmVl\nZdV9CLPYuHEjBg0apHQZZrF582b06dPngddgqCn/ZjVN27Zy2eovvgD+/nf54XzypNJVUV1UVgbM\nmyfH0DRqBBw9CkRHAyYs20K1WLWu9bBy5UpcuXIFL774YnUexiRr165FUlISysvL8f3336N3795o\n27YtVq5ciYSEBDg7O8PDwwNff/01ioqKYGtri2+//RYHDx7E+PHjERgYiIsXLyI8PBy7du0CAOh0\nOnz++edQqVRo0aIFioqKMGnSJLPXLoTA7NmzYWtri4ceegiHDx/G559/jg0bNkCn02Hp0qXIyMjA\n6NGj0a1bNwDA8uXLcfHiRTRv3hw6nU6/9PPtPvvsM/Tt2xc9e/Y0e80EhIXJLok5c2SXxLhxcg0J\ne3ulK6O6ICUF+L//Axwd5d/ZgkD3JEwwZcoU4e7uLlQqldi8ebPBc6mpqWLy5MlGf/b5558XgYGB\nQqVSCbVaLc6ePVvpsbZu3Srs7OxEvXr1RHBwsBgzZowppRtYsmSJGD9+vP7++vXrhRBCBAQEiJUr\nVwqdTidycnKEEEL88MMP4urVqyI6Olr8/PPPQggh4uLixMiRI/U/P3bsWDF37lwhhBA3btwQb7zx\nhtlqvd348ePFrFmz9PdHjx4tli1bJry8vMTu3buFEEL89ttvYtCgQUIIIa5evSq6d+8uhBDiwoUL\nYuDAgUb3HRkZKY4cOVItdT8IE9+qVisvT4joaCFcXYVYulSI8nKlK6La6sQJIYYNE8LTU4iEBCF0\nOqUrIkswx2enyXuYN2+esLOzE1qtVv/YlStXRLdu3cTNmzcr/dnjx4+L3r17C5VKJVJSUoxuV1JS\nIl577TVhY2NTafh4UKNHjxbLly83eKywsFA4OTlV2LaoqEgUFRUJJycncePGDSGEDAaLFy8WQghx\n6NAh0bBhQ7Fs2TIRFxcnli5dKoqKisxe886dO0XLli1FSUmJ/rHIyEjxwQcfiA4dOugfW7RokejR\no4cQQoibN28KNzc34e/vL6ZMmSLy8/ON7j8vL0/06NFDlJaWmr32B1Fbg8ItO3YI0a2bEAEBQqSl\nKV0N1SZ//ilETIwQzZsL8dFHQtzjY5lqGXN8dpo8RmHLli0ICgpCgwYN9I/NnDkTI0eOhJ2d3T1/\n9la3xMlKOmsXL14Mb29v6HQ6DBgwwNSS71pHv379IIRAYWEhACAlJQV9+/atsK2DgwMSExPRt29f\nNPzvSiipqano378/ioqKkJmZiYCAAIwaNQrPPvssnn/+eTg4OJi95s2bN6Nfv36oV68eAKCkpASp\nqamws7NDSEiIfrsNGzbox140aNAAR44cwbRp05CVlYXPP//c6P5dXV0RFBSEhQsXmr12qqhHD2DH\nDrkqZXQ08PTTwO+/K10V1WSlpXIcgpcXcPmynNvj7beBe3wsE1VglqBw+xfq9evXsWjRorv2fd9p\n27ZtiIyMhJ2dndGgcOrUKTRs2BDHjh2DSqUy+BI0h0uXLqF+/fpo0aIFVq1ahZs3bwKQgxv79+9/\n15/Jz89Hx44dAQA5OTnQarXw8PBAfHw8fO6Y31QIgUWLFpm1ZgBo06YNHnroIf39uXPnIjIyEl27\ndoWTkxMA4PTp0zhw4ACmTp2K33//Hc2bN4eNjQ2GDh2Kl19+GZ6enpUeY+zYsZgzZ47Za6e7U6uB\nZ58FsrPlpZTBwXLQY0GB0pVRTSIEsGKFHHuwZg2QnAwsWgS4uChdGdVUJg1mPHHiBM6fP28QFBIT\nE+Hj44NmzZrd8+evXLmCJk2awNPT02hQWLZsGd577z307NkTfn5+aNq0qSklV9CsWTN06tQJS5cu\nhaurK9zc3AAAR44cwRtvvHHXn4mMjMTbb7+N2NhYAECnTp3wzTffIDIyEs7OzhgyZAjmzp2LFi1a\n4MaNG4iIiDBrzQAwYsQI7N69GwsXLsSNGzdgY2OD+fPnQwiBtWvX4rvvvsPevXuRlJQEBwcHqFQq\nvP7661i7di20Wi3+/PNPTJ48udJjdOrUCSUlJcjIyOCllhbUsKEc3PjSS8CHH8q59SdPBqZM4YBH\nqlx6ulzVUasFvvoKqCUXcpHC7muZ6XXr1uG7777Dww8/jKKiInTo0AFvvfUWrly5om+Gf+mll9Cs\nWTN8+umnle4rLy8PX3zxBWbPno3HH38cf/zxh/6qgVtWrVqFdu3aoUOHDmjatCnGjRuHr776qsov\nLj4+Hjt27MCBAwcQGxuLTZs2ITs7G5cuXYJarcaXX36pb7q3JtZUd2RkJAIDA/Hmm29a5HjGWOsy\n05Zw4oRcmCclRQaIcePk3AxEt+zbJ+fqyM4G/vlP4LnnuC4DSWb57KzqYIZFixaJNm3aiAsXLggh\nhDh79qyws7MTPXv2NNguMDBQP7CvMrGxsWLNmjVCCCFeffVV4ejoaPD8lStXxOzZs4UQQmzcuFGo\nVKoKAw4rU1xcLCZOnCiEEKJnz57Cx8dHJCcn65/v3bu3mDFjRpX3ZynWVvebb74pRowYYbHjGXMf\nb9Va68ABIR5/XAgPDyH+/W8hbhvHSnXU4cNCDB8uhIuLEF9/LURxsdIVkbUxx2dnlTLnwYMHMWHC\nBPzrX/+Cs7MzAMDd3R329vYVBvydPn1av01ltm3bph9v0K5dO1y+fBnXrl3TP79gwQKMHz8eAJCe\nng4A6NOnT6X7vHjxon6yoPT0dP0AxZycHAwbNgwDBw7Ub9umTRv8/PPP96zT0qyt7latWuHUqVMW\nOx4Z17kz8OuvQFycnGr3kUeApUvl5DlUt2Rny1aD0FAgIEAOfH3lFbnkOZG5VSkovP7663BwcMCw\nYcP0jx05cgSFhYUVgsKVK1cMBtkZc/nyZf14g7Zt2wL435UPe/bsgZeXFxr9d7Lx9PR0dOjQodIA\nsm3bNri6uurHA/j6+iIsLExf552zLp48eRJXr169Z52WZm11N2rUCFeuXLHY8ejeevYENm4EvvtO\n3ry9gSVL5Ch3qt2OHpUrO4aEyMGKJ07IMQlcl4Gq0z2DQmFhITZu3IhHH30U6ts6vVJTU2FjY4Pe\nvXsbbF+V/pCLFy8afOm3a9cOgPwSLC8vR2Jiov4L/9Zgunu1JjRr1gzNmzeHh4cHAMDFxQV2dnZI\nS0tDgwYN0KNHD/22Wq0W+/fvr3CFgjWwtrqFnGvDYsejquvbV87Nv2iRXKnSywtYuJCLTtVGBw8C\nw4fLf3MfH+D4cXmpY+PGSldGdcE9g8KxY8cghKgwne/mzZsREBCARo0aGVyx0LRpU9y4caPSfaan\npxu0RNxqUThx4gSWLFmCMWPG6J/bvXs3tFrtXec0uJ2Pjw8KCgowf/58g8fT0tIQHByM+re1yW3e\nvBklJSUYMWJEpftUkrXUff36dbNfaULm1bcvsGmTDAu//CLXlPjsM8AKG8zoPgghg2BYGDBkiJxr\n4+RJGRCqYWoWIqPuGRQa/zeytmnTRv+YVqtFWlqa/sv79ol72rZti/z8/Er3uXXrVoMWgiZNmsDR\n0RFbtmxBWVkZ3N3d9c9VdXyCMenp6RVaPebPnw9/f3+88MILD7RPc8rOztbP3XA7a6n7/Pnz+hYf\nsm69egHr1gHr18tR8O3aAW++CdzjvyNZmfJyYOVK2cU0diwwbJgMCP/4By+PJWXcMyhoNBp4eXnh\nxIkTAOTKgq+//jpKSkrQsWNHXLp0yaAbISAgANnZ2Ub3V1BQgE2bNsHR0dHgcU9PT5w+fRovvfSS\nweObN2+Gm5ubQVCpqpycHFy8eBGHDh3SP7Zs2TIcOHAA8fHxBl0pSti8eTN8fHwwcuRIg8etqe6j\nR48iICDAYscj03XuDPz4I7BrF3DjhlzW+vnngf37la6MKnPtGvDll0CHDsDs2TIYZGfL+TQ4myIp\nqUoTLsXHx2Pq1Kk4d+4cdDodJk+ejICAAMTFxWHHjh2YNWuWfttHH33U4P4tly5dQmRkJPbu3Ysb\nN27Ay8sLU6ZMwYQJEwAAfn5+ePnll6FWq1FWVoahQ4eioKAAe/fuRcOGDdGvXz88+eST95wk6Ha3\n+vnHjBmD8ePHQwgBrVaLnTt3onXr1lXeT3Vp1aoVWrRogYMHDxo8bi11CyGwe/dufPjhhxY7JplP\nu3byi2fGDGDBAiAiAvDwACZNkr+lWuEUInVSTg7wzTey66h/f+CHH+SsnETW4r4mXKoKrVYLd3d3\nZGVlVekyyeo0atQonD17Vt99Ya2mT5+O6dOn6+9bS90ZGRkYM2YMDh8+rGgdQN2ecMlcysrkGIav\nvpKD4caOlbfbevrIQkpLgcRE4NtvZTfR2LHA+PHAAzScElXKHJ+dZm/DtrOzw6uvvorFixebe9f3\nLS0tzexrQ1SH4juGqVtL3f/+97/x2muvKV0GmYmtLRAZKQfIJSUBhYWymyIiQq4JwMsrq9+pU8B7\n7wGensCcOcDIkcDZs8DMmQwJZL2qpbP79ddfx4oVK1BUVFQdu6+S33//Hfn5+RWu1rA2W7Zsgb+/\nv/6+tdR9/PhxHDlyBKNHj1a0DqoeGo1cWTA3Fxg6FJg1S7YsxMQAVtCAVKv89ZecGKtfP6BbNzkW\nITkZ2LJFrhTK8Qdk9Uye29GIXbt2idGjR1fX7iv1xRdfCE9PT6FWq4W3t7f4+OOPFanjXkpLS8WU\nKVP0962l7rKyMhERESGOHz+uyPHvphrfqvRf2dlCvPmmEG5uQnTuLMSnnwqRm6t0VTVTcbEQa9cK\n8eyzQjg4CPHEE0KsWCGEVqt0ZVTXmOOz0+xjFG6XnJyM48eP6wcsUs3wwQcfICwsDIGBgUqXoscx\nCpaj08lVCH/4QS5X7OMjuyyefprjGSpTUiLns0hIAFavllNsP/ecnCipZUulq6O6yhyfndUaFIjM\nhUFBGcXF8ssvPl6OY2jfHggPl+MaOnUCVCqlK1TWn38Cv/0GrF0r56/w8ZGB6qmn5BUmREpjUKA6\ng0FBeaWlsl99zRp5KysDBg+WtwEDgGbNlK6w+pWXy6sUkpOBDRvk3/v2BZ54QgYoK7jqmsgAgwLV\nGQwK1kX1otJzAAASuklEQVQIORnQhg3yN+qtW4GOHeWXZt++QO/eQPPmSldpuvJyuc5Cevr/bs7O\nwKOPylu/fkAV1sAjUgyDAtUZDArWrbgY2L0bSE2Vt4wMwNVVrk/QvTvg7y+7Kqz5S1UI4MwZOYNl\nRoac2XLPHsDNDejTR9769pWvi6imYFCgOoNBoWYpKwOOHAF27pRfuvv3yxYIT0/A11cuje3tLVe8\nbNvWst0WpaVAXp6cETE7Gzh2DMjKAjIz5XLNXbrIyxi7dweCgoA7ZpsnqlEYFKjOYFCo+UpKgKNH\nZYA4elTecnLkJERqtQwMrVsDLi7y1rKlDBDNmsnVEhs2/N/NxkYOpFSpZPdAcTGg1crblSv/u/3x\nB3DxInDhAnDunGwxuHBBdh907CiDyiOPyEGInTsDLVoofZaIzItBgeoMBoXaSwjg8mXg9Gn5ZX7+\nvLxduiSvKvjzT6CoCLh5UwaBmzdlOLj1dlCr5aRFDRrIPx0c/ndr2VKGglatZPjw9JRdB1znguoK\nBgWqMxgUiIjun1Wu9UB1U1JSEjQaDXx8fO66emhxcTGioqKg0WjQq1cvnDlzxuD5s2fPwt7eHnPm\nzLFUyUREVAUMCmSy4uJiTJgwAUlJScjMzERCQgL2799vsM28efPg4uKCQ4cOISYmBpMmTTJ4furU\nqXj88cctWTYREVUBgwKZLCMjA76+vnB1dYWtrS2ioqKQmJhosM26desQHR0NAIiIiMD27dv1zWG/\n/PIL2rVrBx8fH4vXTkRElWNQIJPl5eXB/bZFANzc3JCXl2d0G7VaDUdHRxQUFOCvv/7Cp59+iunT\np1uyZCIiqiJbpQugmk/1gBP+CyEwffp0TJkyBQ899NA9B9zcHiZCQ0MRGhr6QMclIqqtUlNTkZqa\natZ9MiiQydzc3JCbm6u/n5uba9DCcGubs2fPwsnJCTqdDoWFhWjZsiV27dqFFStW4PXXX0dRURHU\najUaNmyIV155pcJx2OpARFS5O3+JmjFjhsn7ZFAgkwUFBSErKwv5+flwcnLC8uXLsWDBAoNtwsLC\nEBsbi8DAQKxevRrBwcGwsbFBenq6fpsZM2agcePGdw0JRESkDAYFMpmdnR3mz5+PwYMHQ6fTITo6\nGl27dsW0adMQGBiI8PBwTJw4EdHR0dBoNGjcuDHi4uKULpuIiKqAEy5RjcAJl4iI7h8nXCIiIqJq\nxaBARERERjEoEBERkVEMCkRERGQUgwIREREZxaBARERERjEoEBERkVEMCkRERGQUgwIREREZxaBA\nRERERjEoEBERkVEMCkRERGQUgwIREREZxaBARERERjEoEBERkVEMCkRERGQUgwIREREZxaBARERE\nRjEoEBERkVEMCkRERGQUgwIREREZxaBARERERjEoEBERkVEMCkRERGQUgwIREREZxaBARERERjEo\nEBERkVEMCkRERGQUgwIREREZxaBARERERjEoEBERkVEMCkRERGQUgwIREREZxaBARERERjEoEBER\nkVEMCkRERGQUgwKZRVJSEjQaDXx8fDBr1qwKzxcXFyMqKgoajQa9evXCmTNnAAAbNmxA165d0alT\nJ2g0Gvz222+WLp2IiCqhEkIIpYugmq24uBiPPPIItm7dCmdnZwQHB2PhwoXw9/fXbzNnzhzk5ubi\niy++wC+//IL//Oc/WL16NTIzM+Hi4oKWLVvi8OHDGDBgAM6fPw+VSmVwDJVKBb5ViYjujzk+O9mi\nQCbLyMiAr68vXF1dYWtri6ioKCQmJhpss27dOkRHRwMAIiIisH37dggh0KlTJ7Rs2RIA4OvrC51O\nB61Wa/HXQEREd2erdAFU8+Xl5cHd3V1/383NDampqUa3UavVcHR0REFBAZydnfXbJCQkoHPnzmjY\nsOFdjzN9+nT930NDQxEaGmq210BEVBukpqZW+Pw1FYMCmezOboIHceTIEbz55ptITk42us3tQYGI\niCq685eoGTNmmLxPdj2Qydzc3JCbm6u/n5uba9DCcGubs2fPAgB0Oh0KCwv1XQ55eXkYNmwYli1b\nhrZt21qucCIiuicGBTJZUFAQsrKykJ+fj9LSUixfvhxDhgwx2CYsLAyxsbEAgNWrVyM4OBhqtRpF\nRUV4/PHH8cknnyA4OFiJ8omIqBK86oHMYv369YiJiYFOp0N0dDTeeustTJs2DYGBgQgPD0dxcTGi\no6Nx9OhRNG7cGHFxcfD09MSHH36ITz75BB06dNDvKzk5GS1atDDYP696ICK6f+b47GRQoBqBQYGI\n6P7x8kgiIiKqVgwKREREZBSDAhERERnFoEBERERGMSgQERGRUQwKREREZBSDAhERERnFoEBERERG\nMSgQERGRUQwKREREZBSDAhERERnFoEBERERGMSgQERGRUQwKREREZBSDAhERERnFoEBERERGMSgQ\nERGRUQwKREREZBSDAhERERnFoEBERERGMSgQERGRUQwKREREZBSDAhERERnFoEBERERGMSgQERGR\nUQwKREREZBSDAhERERnFoEBERERGMSgQERGRUQwKREREZBSDAhERERnFoEBERERGMSgQERGRUQwK\nREREZBSDAhERERnFoEBmkZSUBI1GAx8fH8yaNavC88XFxYiKioJGo0GvXr1w5swZ/XMff/wxfHx8\noNFosGHDBkuWXWelpqYqXUKtwXNpXjyf1odBgUxWXFyMCRMmICkpCZmZmUhISMD+/fsNtpk3bx5c\nXFxw6NAhxMTEYNKkSQCAvXv3YuXKlTh06BCSkpIwbtw4lJSUKPEy6hR+GJsPz6V58XxaHwYFMllG\nRgZ8fX3h6uoKW1tbREVFITEx0WCbdevWITo6GgAQERGB7du3Q6fTITExESNGjICNjQ1cXV3h6+uL\nXbt2KfEyiIjoLhgUyGR5eXlwd3fX33dzc0NeXp7RbdRqNRwdHVFQUID8/Hy4ublV+rNERKQcW6UL\noJpPpVLVquPUFTNmzFC6hFqD59K8eD6tC4MCmczNzQ25ubn6+7m5uQYtDLe2OXv2LJycnKDT6VBY\nWIiWLVtW+Nk7WyduEUJU3wsgIiKj2PVAJgsKCkJWVhby8/NRWlqK5cuXY8iQIQbbhIWFITY2FgCw\nevVqBAcHw8bGBmFhYfj5559RVlaGvLw8ZGVloVu3bkq8DCIiugu2KJDJ7OzsMH/+fAwePBg6nQ7R\n0dHo2rUrpk2bhsDAQISHh2PixImIjo6GRqNB48aNERcXBwAICAjAsGHD0KlTJ6jVaixYsAD16tVT\n+BUREZGeIFLQ+vXrhZ+fn/D29haffPJJhee1Wq145plnhJ+fn+jZs6c4ffq0/rmZM2cKb29v4efn\nJ3777TdLlm21HvR8njp1StjZ2YkuXbqILl26iAkTJli6dKt0r/OZlpYm/P39ha2trUhISDB4bsmS\nJcLHx0f4+PiIpUuXWqpkq2bK+VSr1fr359ChQy1VstW617n89NNPhY+Pj/D19RUhISHi5MmT+ufu\n973JoECK0Wq1wtPTU+Tl5YnS0lIRGBgo9u3bZ7DN7NmzxeTJk4UQQqxatUpEREQIIYTYs2ePCAwM\nFGVlZSIvL094enqK4uJii78Ga2LK+Tx16pTw8/OzeM3WrCrn8/Tp0yIzM1M8//zzBl9s586dE+3b\ntxfXrl0T165dE+3btxcXLlyw9EuwKqacTyGEsLe3t2S5Vq0q5zI9PV1otVohhBDz588XTz75pBDi\nwd6bHKNAiuH8C+b1oOdTcKDoXVXlfHp4eECj0UCtNvwoTU5OxpAhQ2Bvbw97e3s89thjSE5OtmT5\nVseU80mGqnIuQ0JC0KBBAwBAr169kJ+fD+DB3pv81yDFcP4F8zLlfALA6dOn0aVLF/Ts2RMpKSmW\nK9xKVeV8GsP3Z0WmnE8A0Gq1CAwMRNeuXbF8+fLqKLHGuN9zuWDBAgwdOhTAg703OZiRFMN5EczL\nlPPZunVr5Ofno0mTJti/fz+eeOIJHD58GE2bNjVjhTUL35/mZer5zM/Ph5OTE06dOoX+/fujc+fO\n8PLyMlN1Ncv9nMsffvgB+/btQ1pa2gMfjy0KpJj7mX8BwAPNv1CXmHI+69evjyZNmgAA/P394efn\nh+zsbMsVb4Wqcj5vd/uH9/3+bF1gyvkEACcnJwBA27ZtMWjQIOzbt696Cq0BqnouN27ciI8++ghr\n1qzRX032QO/NahlpQVQFN2/eFB4eHiIvL0+UlJSIwMBAsXfvXoNtbh98t3LlShEeHi6E+N9gxtLS\nUpGbmys8PDxESUmJxV+DNTHlfBYWFory8nIhhBzY2Lp1a/HHH39Y9gVYmaqcz1teeOGFuw5mvHr1\nqrh69apo165dnR/MaMr5LCoq0v//vnTpkvDy8hIHDx60SN3WqCrnct++faJ9+/bi+PHjBo8/yHuT\nQYEUtW7dOuHr6yu8vb3FzJkzhRBCvP/++2LNmjVCCDm6d/jw4cLPz08EBweLU6dO6X/2o48+Et7e\n3sLX11ckJSUpUb7VedDzmZCQIHx9fYVGoxF+fn4VRpzXVfc6n7t27RJubm6iUaNGwtHR0eDKke++\n+054e3sLb29vsWTJEkXqtzYPej63bdsm/Pz8RKdOnYSXl5f4+uuvFXsN1sLYuVy7dq0QQoiBAweK\nVq1a3fWS0vt9b6qE4JBnIiIiujuOUSAiIiKjGBSIiIjIKAYFIiIiMopBgYiIiIxiUCAiIiKjGBSI\niIjIKAYFIqL7kJ2djS1btihdBpHFMCgQEd2H2bNnIycnR+kyiCyGEy4REd0HT09PbNmypc6v3UB1\nB1ePJCKqgrVr1yIpKQnl5eX4/vvv0bt3b7Rt2xYrV65EQkICnJ2d4eHhga+//hp//PEHlixZgszM\nTIwfPx6BgYG4ePEiwsPDsWvXLgByUa7PP/8cKpUKLVq0QFFRESZNmqTwqySqiF0PRERVEB4ejm7d\nuuGJJ57AO++8g759++LIkSP4v//7P2i1WowaNQpz5szBoUOH8Ouvv+LFF19ESUkJTp48CQBISUlB\nx44d9fsbN24c6tWrh6lTp2L48OE4d+6cUi+NqFLseiAiqqIXX3wRYWFhGD58uP6xy5cvw9vbGxcv\nXtQ/duXKFQBAx44dcfr0aTRs2BAvvfQSgoODMWbMGGRlZaFbt25YuHAhbGxsUFpaiqFDh8LBwcHi\nr4noXtj1QERURVu2bMFnn30GIQQuX74MR0dHpKSkoG/fvgbbOTg4IC4uDn379kXDhg0BAKmpqXjn\nnXdQVFSEzMxMBAQEYNSoUUq8DKL7wq4HIqIquHTpEurXr48WLVpg1apVuHnzJgBg48aN6N+/f4Xt\n8/Pz9V0NOTk50Gq18PDwQHx8PHx8fAy2FUJg0aJF1f8iiB4Aux6IqE6Kj4/Hjh07cODAAcTGxmLT\npk3Izs7GpUuXoFar8eWXX6JevXr67cvLyzFy5EgMGTIErq6uGDhwIACgT58+WLp0Kdq2bWuw/1On\nTuHtt9/G448/DgD48ccfERYWhsjISDg7O2PmzJlo1KgRWrRogRs3biAiIgLOzs6WOwFEVcSgQER1\nTklJCf7xj3/gq6++Qq9evVBUVIS5c+fqv/xDQkLw6KOP4v3331e4UiLlseuBiOqc9PR09OvXD0II\n5OTkYNiwYfqQAABt2rTBzz//rGCFRNaDgxmJqM7x9fVFs2bNcOTIERQWFmLQoEEGz588eRJXr15V\nqDoi68IWBSKqc1xcXGBnZ4e0tDQ0aNAAPXr00D+n1Wqxf//+CgMOieoqBgUiqrPS0tIQHByM+vXr\n6x/bvHkzSkpKMGLECADAzp07MWnSJPz000+YOHEiTpw4oVS5RIrgYEYiqrNcXFzw0ksv4YMPPtA/\nFhERgfz8fOzevRulpaXw8vJCRkYGnJ2dsWfPHowfPx579uxRsGoiy2KLAhHVSTk5Obh48SIOHTqk\nf2zZsmU4cOAA4uPjoVarsXnzZtjb2+svWwwICEBOTg6ys7OVKpvI4jiYkYjqpFvjE8aMGYPx48dD\nCAGtVoudO3eidevWAIDs7GyDuQ1UKhVatmyJ7OxsPPLII0qVTmRRDApEVCelpaUhKCgI4eHhCA8P\nv+s2165dMxi/AAC2tra4fv26JUoksgrseiCiOiktLQ0hISGVbtO4ceMKj2m1Wtjb21dXWURWh0GB\niOqc33//Hfn5+ejZs2el2/n4+KCwsFB/v6ysDAUFBfDy8qruEomsBoMCEdUpc+fOxaBBg6BSqRAT\nE4NPPvnE6LZ9+vTBuXPn8McffwCQrRA+Pj4cn0B1Ci+PJCKqREpKCuLj49GzZ08kJyfjvffeQ4cO\nHZQui8hiGBSIiIjIKHY9EBERkVEMCkRERGQUgwIREREZxaBARERERjEoEBERkVEMCkRERGQUgwIR\nEREZxaBARERERjEoEBERkVH/DyvZfSRTEw7+AAAAAElFTkSuQmCC\n",
"text": "<matplotlib.figure.Figure at 0x4138b10>"
}
],
"prompt_number": 23
},
{
"cell_type": "markdown",
"metadata": {},
"source": "The plot above shows that while the observed value of p_0 was 20%, the corrected p_0 value is around .08. The distance from zero shows that something funny is going on... if the confusion matrix is really M, then the observed values of p should not come up."
},
{
"cell_type": "code",
"collapsed": false,
"input": "M.dot([.07, .93])",
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 24,
"text": "array([ 0.156, 0.758])"
}
],
"prompt_number": 24
},
{
"cell_type": "markdown",
"metadata": {},
"source": "Now let's try it automatically."
},
{
"cell_type": "code",
"collapsed": false,
"input": "from scipy.optimize import fmin_l_bfgs_b",
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 27
},
{
"cell_type": "code",
"collapsed": false,
"input": "fmin_l_bfgs_b(d, p_obs,\n approx_grad=True,\n bounds=zip(0.*np.ones_like(p_obs),\n 1.*np.ones_like(p_obs)))",
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 47,
"text": "(array([ 0.08000043, 0.91999957]),\n 0.042426406872284869,\n {'funcalls': 7,\n 'grad': array([ 4.73093786e-06, -3.93435284e-07]),\n 'nit': 4,\n 'task': 'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL',\n 'warnflag': 0})"
}
],
"prompt_number": 47
},
{
"cell_type": "markdown",
"metadata": {},
"source": "Pretty easy... wrap the whole thing in a function for some more exploration."
},
{
"cell_type": "code",
"collapsed": false,
"input": "def adjust_csmf(confusion_matrix, csmf_predicted):\n \"\"\" Use generalized backcalculation to adjust the predicted csmf\n \n Parameters\n ----------\n confusion_matrix : array_like, shape (J,J)\n csmf_predicted : array_like, shape J\n \n Returns\n -------\n csmf_adjusted : array_like, shape J\n \n Notes\n -----\n Uses constrained optimization (L-BFGS-B) to find point on P-simplex\n that minimizes L_2 distance from csmf_predicted after operated on by\n confusion_matrix.\n \"\"\"\n \n def sq_dist(x, M, y):\n return np.sum(np.square(M.dot(x/x.sum()) - y))\n\n csmf_adjusted, min_dist, results_dict = \\\n fmin_l_bfgs_b(sq_dist, csmf_predicted,\n args=(confusion_matrix, csmf_predicted),\n approx_grad=True,\n bounds=zip(0.*np.ones_like(csmf_predicted),\n 1.*np.ones_like(csmf_predicted)))\n \n return csmf_adjusted / csmf_adjusted.sum()\n\nadjust_csmf(M, p_obs)",
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 50,
"text": "array([ 0.08000251, 0.91999749])"
}
],
"prompt_number": 50
},
{
"cell_type": "code",
"collapsed": false,
"input": "adjust_csmf(array([[.9 , .05, .05],\n [.06, .9 , .04],\n [.1 , .1 , .8]]),\n array([.3, .3, .4]))",
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 51,
"text": "array([ 0.28890458, 0.29039373, 0.4207017 ])"
}
],
"prompt_number": 51
},
{
"cell_type": "markdown",
"metadata": {},
"source": "Looks good. Off the David for use."
},
{
"cell_type": "code",
"collapsed": false,
"input": "",
"language": "python",
"metadata": {},
"outputs": []
}
],
"metadata": {}
}
]
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment