jonathan-taylor · May 27, 2016 21:24
diff --git a/knockoff_data_generating_with_sqrt_lasso.ipynb b/knockoff_data_generating_with_sqrt_lasso.ipynb
diff --git a/lasso_cv.ipynb b/lasso_cv.ipynb
diff --git a/other_multiples.ipynb b/other_multiples.ipynb
 {
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "from __future__ import division, print_function\n",
    "from IPython.display import HTML\n",
    "\n",
    "import numpy as np\n",
    "import statsmodels.api as sm\n",
    "%matplotlib inline\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "# this code below is available at https://github.com/jonathan-taylor/selective-inference\n",
    "from selection.algorithms.lasso import lasso\n",
    "from selection.algorithms.sqrt_lasso import choose_lambda\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data generating mechanism\n",
    "\n",
    "We use the same data generating mechanism as in [Barber and Candes (2016)](http://arxiv.org/abs/1602.03574)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3.765534496582512e-15"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def cov(p, rho=0.25):\n",
    "    idx = np.arange(p)\n",
    "    return rho**np.fabs(np.subtract.outer(idx, idx))\n",
    "\n",
    "def sqrt_cov(p, rho=0.25):\n",
    "    idx = np.arange(p)\n",
    "    C = rho**np.fabs(np.subtract.outer(idx, idx))\n",
    "    return np.linalg.cholesky(C)\n",
    "\n",
    "# Testing we have the square-root correct\n",
    "p = 2500\n",
    "A = cov(p, rho=0.3)\n",
    "B = sqrt_cov(p, rho=0.3)\n",
    "np.linalg.norm(B.dot(B.T) - A)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "cholesky_factors = {}\n",
    "for rho in [0, 0.25, 0.5, 0.75]:\n",
    "    cholesky_factors[(rho, 2500)] = sqrt_cov(2500, rho=rho)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "def instance(n=2000, k=30, p=2500, signal=4.5, rho=0.25, sigma=1.234): # sigma there just to convince you \n",
    "                                                                       # we don't need to know noise level\n",
    "    if (rho, p) in cholesky_factors.keys():\n",
    "        _sqrt_cov = cholesky_factors[(rho, p)]\n",
    "    else:\n",
    "        cholesky_factors[(rho, p)] = sqrt_cov(p, rho=rho)\n",
    "        _sqrt_cov = cholesky_factors[(rho, p)]\n",
    "    X = np.random.standard_normal((n, p)).dot(_sqrt_cov.T)\n",
    "\n",
    "    X /= (np.sqrt((X**2).sum(0))) # like normc\n",
    "    beta = np.zeros(p)\n",
    "    beta[:k] = signal * (2 * np.random.binomial(1, 0.5, size=(k,)) - 1) \n",
    "    np.random.shuffle(beta)\n",
    "    Y = (X.dot(beta) + np.random.standard_normal(n)) * sigma\n",
    "    true_active = np.nonzero(beta != 0)[0]\n",
    "    return X, Y, true_active, beta\n",
    "\n",
    "X, Y, true_active, beta = instance()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Selective inference with square-root LASSO\n",
    "\n",
    "We will use the approach of [Tian, Loftus, Taylor (2015)](http://arxiv.org/abs/1504.08031) that does not\n",
    "require noise level to choose $\\lambda$. The p-values used are the ones given by the Gaussian approximation suggested in that paper. There are exact p-values but we have not computed them below. So, our null ones will not be \n",
    "exactly uniform, but the size of the 5% test is pretty close to 5%."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We use a choice of $\\lambda$ as guided by theory. The function `choose_lambda` computes a Monte\n",
    "Carlo estimate of \n",
    "$$\n",
    "\\lambda = E\\left(\\frac{\\|X^T\\epsilon\\|_{\\infty}}{\\|\\epsilon\\|_2}\\right)\n",
    "$$\n",
    "(Actually it chooses the 95% quantile of the random variable by default, rather than the expectation.)\n",
    "\n",
    "In the paper on selective inference with the square-root LASSO, we see that for a given $\\lambda$ parameter, \n",
    "solving the square-root LASSO is equivalent to solving a LASSO with a rough estimate of $\\sigma$ (the estimate used by square-root LASSO) as well as an inflation \n",
    "factor depending on the design matrix. So using $\\kappa \\cdot \\lambda$ results in solving a LASSO\n",
    "with some other effective $\\lambda$.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's let $\\alpha$ be the multiple of \n",
    "`choose_lambda(X)` we use. Where, roughly speaking\n",
    "`choose_lambda(X)` is\n",
    "$$\n",
    "\\lambda = E\\left(\\frac{\\|X^T\\epsilon\\|_{\\infty}}{\\|\\epsilon\\|_2}\\right)\n",
    "$$\n",
    "\n",
    "One reasonable way to choose this might be so that the expected size of the active set in a null model is some size. This is computationally expensive. We can easily get an idea of how many \"strong rules\" violators we have as a function of $\\kappa$ under a null model, by looking at \n",
    "$$\n",
    "\\# \\left\\{j: |X_j^T\\epsilon| / \\|\\epsilon\\|_2 > \\kappa \\lambda \\right\\}\n",
    "$$\n",
    "\n",
    "\n",
    "Or, we might just take some fixed multiple. In the square-root LASSO paper, I have been generally taking $\\kappa=0.7$ for various scenarios. It seems reasonable. No real justification for it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def simulate(rho, q=0.2, nsim=100, mult=0.5):\n",
    "    P0, PA, active_size = [], [], []\n",
    "    full_model_FDP, full_model_power, directional_FDP = {}, {}, {}\n",
    "\n",
    "    for _ in range(nsim):\n",
    "        X, Y, true_active, beta = instance(rho=rho)\n",
    "        lam = choose_lambda(X)\n",
    "        L = lasso.sqrt_lasso(X, Y, mult * lam)\n",
    "        L.fit()\n",
    "        S = L.summary('onesided')\n",
    "        active_size.append(len(L.active))\n",
    "        # keep p-values when screening \n",
    "        \n",
    "        if set(L.active).issuperset(true_active):\n",
    "            P0.extend([p for p, v in zip(S['pval'], S['variable']) if v not in true_active])\n",
    "            PA.extend([p for p, v in zip(S['pval'], S['variable']) if v in true_active])\n",
    "    \n",
    "        # what to do with the p-values\n",
    "        \n",
    "        methods = {'BH': lambda p: sm.stats.multipletests(S['pval'], q, 'fdr_bh')[0],\n",
    "                   'BY': lambda p: sm.stats.multipletests(S['pval'], q, 'fdr_by')[0],\n",
    "                   'thresh': lambda p: p < q}\n",
    "        \n",
    "        for key, method in methods.items():\n",
    "        \n",
    "            selected = methods[key](S['pval'])\n",
    "            type1_errors = selected * np.array([v not in true_active for v in S['variable']])\n",
    "            sign_errors = selected * np.array([(s != np.sign(beta[v])) * (v in true_active) for v, s in \n",
    "                                               zip(S['variable'], L.active_signs)])\n",
    "\n",
    "            full_model_FDP.setdefault(key, []).append(np.sum(type1_errors) / max(np.sum(selected), 1))\n",
    "            directional_FDP.setdefault(key, []).append(np.sum(type1_errors + sign_errors) / max(np.sum(selected), 1))\n",
    "            full_model_power.setdefault(key, []).append(np.sum(selected * np.array([v in true_active for v in S['variable']])) / len(true_active))\n",
    "    \n",
    "    return P0, PA, full_model_FDP, directional_FDP, full_model_power, active_size"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Size of active sets"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[(0.4, 242.33333333333334), (0.7, 35.666666666666664), (1.0, 12.333333333333334)]\n"
     ]
    }
   ],
   "source": [
    "print([(m, np.mean(simulate(0.25, nsim=3, mult=m)[-1])) for m in [0.4,0.7,1.]])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "def make_table(nsim, q, mult):\n",
    "\n",
    "    rows = r\"\"\"\n",
    "    <tr><td>$\\rho$</td><td>Model FDP</td><td>Directional FDP</td><td>Model power</td><td>Active set</td></tr>\n",
    "    \"\"\"\n",
    "\n",
    "    for rho in [0., 0.25, 0.5, 0.75]:\n",
    "        P0, PA, full_model_FDP, directional_FDP, full_model_power, active_size = simulate(rho=rho, q=q, nsim=nsim, mult=mult)\n",
    "        rows += \"\"\"\n",
    "        <tr><td>%0.2f</td><td>%0.3f</td><td>%0.3f</td><td>%0.3f</td><td>%0.1f</td></tr>\n",
    "        \"\"\" % (rho, np.mean(full_model_FDP['BH']), np.mean(directional_FDP['BH']), np.mean(full_model_power['BH']), np.mean(active_size))\n",
    "\n",
    "    table = \"\"\"\n",
    "    <table>\n",
    "    %s\n",
    "    </table>\"\"\" % rows\n",
    "\n",
    "    return table"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## $\\kappa=0.7$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "    <table>\n",
       "    \n",
       "    <tr><td>$\\rho$</td><td>Model FDP</td><td>Directional FDP</td><td>Model power</td></tr>\n",
       "    \n",
       "        <tr><td>0.00</td><td>0.042</td><td>0.042</td><td>0.666</td></tr>\n",
       "        \n",
       "        <tr><td>0.25</td><td>0.041</td><td>0.041</td><td>0.684</td></tr>\n",
       "        \n",
       "        <tr><td>0.50</td><td>0.050</td><td>0.050</td><td>0.619</td></tr>\n",
       "        \n",
       "        <tr><td>0.75</td><td>0.108</td><td>0.108</td><td>0.449</td></tr>\n",
       "        \n",
       "    </table>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(make_table(100, 0.2, 0.7))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## $\\kappa=0.4$\n",
    "\n",
    "You can of course choose $\\alpha$ so small so that power goes down. A multiple of 0.4 here seems pretty low and one can imagine that there are many more coefficients at this point to test."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "    <table>\n",
       "    \n",
       "    <tr><td>$\\rho$</td><td>Model FDP</td><td>Directional FDP</td><td>Model power</td></tr>\n",
       "    \n",
       "        <tr><td>0.00</td><td>0.083</td><td>0.083</td><td>0.038</td></tr>\n",
       "        \n",
       "        <tr><td>0.25</td><td>0.046</td><td>0.046</td><td>0.021</td></tr>\n",
       "        \n",
       "        <tr><td>0.50</td><td>0.087</td><td>0.087</td><td>0.045</td></tr>\n",
       "        \n",
       "        <tr><td>0.75</td><td>0.112</td><td>0.112</td><td>0.051</td></tr>\n",
       "        \n",
       "    </table>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(make_table(100, 0.2, 0.4))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## $\\kappa=1.$\n",
    "\n",
    "If $\\kappa$ is too large, then too few variables will be selected. Power would be expected to go down because we have\n",
    "not included enough. Though FDP seems under control.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "    <table>\n",
       "    \n",
       "    <tr><td>$\\rho$</td><td>Model FDP</td><td>Directional FDP</td><td>Model power</td></tr>\n",
       "    \n",
       "        <tr><td>0.00</td><td>0.001</td><td>0.001</td><td>0.240</td></tr>\n",
       "        \n",
       "        <tr><td>0.25</td><td>0.000</td><td>0.000</td><td>0.248</td></tr>\n",
       "        \n",
       "        <tr><td>0.50</td><td>0.013</td><td>0.013</td><td>0.254</td></tr>\n",
       "        \n",
       "        <tr><td>0.75</td><td>0.081</td><td>0.081</td><td>0.206</td></tr>\n",
       "        \n",
       "    </table>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(make_table(100, 0.2, 1.))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
 }
	{
	"cells": [
	{
	"cell_type": "code",
	"execution_count": 1,
	"metadata": {
	"collapsed": false
	},
	"outputs": [],
	"source": [
	"from __future__ import division, print_function\n",
	"from IPython.display import HTML\n",
	"\n",
	"import numpy as np\n",
	"import statsmodels.api as sm\n",
	"%matplotlib inline\n",
	"import matplotlib.pyplot as plt\n",
	"\n",
	"# this code below is available at https://github.com/jonathan-taylor/selective-inference\n",
	"from selection.algorithms.lasso import lasso\n",
	"from selection.algorithms.sqrt_lasso import choose_lambda\n",
	"\n"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## Data generating mechanism\n",
	"\n",
	"We use the same data generating mechanism as in [Barber and Candes (2016)](http://arxiv.org/abs/1602.03574)."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 2,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"3.765534496582512e-15"
	]
	},
	"execution_count": 2,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"def cov(p, rho=0.25):\n",
	" idx = np.arange(p)\n",
	" return rho**np.fabs(np.subtract.outer(idx, idx))\n",
	"\n",
	"def sqrt_cov(p, rho=0.25):\n",
	" idx = np.arange(p)\n",
	" C = rho**np.fabs(np.subtract.outer(idx, idx))\n",
	" return np.linalg.cholesky(C)\n",
	"\n",
	"# Testing we have the square-root correct\n",
	"p = 2500\n",
	"A = cov(p, rho=0.3)\n",
	"B = sqrt_cov(p, rho=0.3)\n",
	"np.linalg.norm(B.dot(B.T) - A)"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 3,
	"metadata": {
	"collapsed": false
	},
	"outputs": [],
	"source": [
	"cholesky_factors = {}\n",
	"for rho in [0, 0.25, 0.5, 0.75]:\n",
	" cholesky_factors[(rho, 2500)] = sqrt_cov(2500, rho=rho)"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 4,
	"metadata": {
	"collapsed": false
	},
	"outputs": [],
	"source": [
	"def instance(n=2000, k=30, p=2500, signal=4.5, rho=0.25, sigma=1.234): # sigma there just to convince you \n",
	" # we don't need to know noise level\n",
	" if (rho, p) in cholesky_factors.keys():\n",
	" _sqrt_cov = cholesky_factors[(rho, p)]\n",
	" else:\n",
	" cholesky_factors[(rho, p)] = sqrt_cov(p, rho=rho)\n",
	" _sqrt_cov = cholesky_factors[(rho, p)]\n",
	" X = np.random.standard_normal((n, p)).dot(_sqrt_cov.T)\n",
	"\n",
	" X /= (np.sqrt((X**2).sum(0))) # like normc\n",
	" beta = np.zeros(p)\n",
	" beta[:k] = signal * (2 * np.random.binomial(1, 0.5, size=(k,)) - 1) \n",
	" np.random.shuffle(beta)\n",
	" Y = (X.dot(beta) + np.random.standard_normal(n)) * sigma\n",
	" true_active = np.nonzero(beta != 0)[0]\n",
	" return X, Y, true_active, beta\n",
	"\n",
	"X, Y, true_active, beta = instance()"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## Selective inference with square-root LASSO\n",
	"\n",
	"We will use the approach of [Tian, Loftus, Taylor (2015)](http://arxiv.org/abs/1504.08031) that does not\n",
	"require noise level to choose $\\lambda$. The p-values used are the ones given by the Gaussian approximation suggested in that paper. There are exact p-values but we have not computed them below. So, our null ones will not be \n",
	"exactly uniform, but the size of the 5% test is pretty close to 5%."
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"We use a choice of $\\lambda$ as guided by theory. The function `choose_lambda` computes a Monte\n",
	"Carlo estimate of \n",
	"$$\n",
	"\\lambda = E\\left(\\frac{\\\|X^T\\epsilon\\\|_{\\infty}}{\\\|\\epsilon\\\|_2}\\right)\n",
	"$$\n",
	"(Actually it chooses the 95% quantile of the random variable by default, rather than the expectation.)\n",
	"\n",
	"In the paper on selective inference with the square-root LASSO, we see that for a given $\\lambda$ parameter, \n",
	"solving the square-root LASSO is equivalent to solving a LASSO with a rough estimate of $\\sigma$ (the estimate used by square-root LASSO) as well as an inflation \n",
	"factor depending on the design matrix. So using $\\kappa \\cdot \\lambda$ results in solving a LASSO\n",
	"with some other effective $\\lambda$.\n"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Let's let $\\alpha$ be the multiple of \n",
	"`choose_lambda(X)` we use. Where, roughly speaking\n",
	"`choose_lambda(X)` is\n",
	"$$\n",
	"\\lambda = E\\left(\\frac{\\\|X^T\\epsilon\\\|_{\\infty}}{\\\|\\epsilon\\\|_2}\\right)\n",
	"$$\n",
	"\n",
	"One reasonable way to choose this might be so that the expected size of the active set in a null model is some size. This is computationally expensive. We can easily get an idea of how many \"strong rules\" violators we have as a function of $\\kappa$ under a null model, by looking at \n",
	"$$\n",
	"\\# \\left\\{j: \|X_j^T\\epsilon\| / \\\|\\epsilon\\\|_2 > \\kappa \\lambda \\right\\}\n",
	"$$\n",
	"\n",
	"\n",
	"Or, we might just take some fixed multiple. In the square-root LASSO paper, I have been generally taking $\\kappa=0.7$ for various scenarios. It seems reasonable. No real justification for it."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 12,
	"metadata": {
	"collapsed": true
	},
	"outputs": [],
	"source": [
	"def simulate(rho, q=0.2, nsim=100, mult=0.5):\n",
	" P0, PA, active_size = [], [], []\n",
	" full_model_FDP, full_model_power, directional_FDP = {}, {}, {}\n",
	"\n",
	" for _ in range(nsim):\n",
	" X, Y, true_active, beta = instance(rho=rho)\n",
	" lam = choose_lambda(X)\n",
	" L = lasso.sqrt_lasso(X, Y, mult * lam)\n",
	" L.fit()\n",
	" S = L.summary('onesided')\n",
	" active_size.append(len(L.active))\n",
	" # keep p-values when screening \n",
	" \n",
	" if set(L.active).issuperset(true_active):\n",
	" P0.extend([p for p, v in zip(S['pval'], S['variable']) if v not in true_active])\n",
	" PA.extend([p for p, v in zip(S['pval'], S['variable']) if v in true_active])\n",
	" \n",
	" # what to do with the p-values\n",
	" \n",
	" methods = {'BH': lambda p: sm.stats.multipletests(S['pval'], q, 'fdr_bh')[0],\n",
	" 'BY': lambda p: sm.stats.multipletests(S['pval'], q, 'fdr_by')[0],\n",
	" 'thresh': lambda p: p < q}\n",
	" \n",
	" for key, method in methods.items():\n",
	" \n",
	" selected = methods[key](S['pval'])\n",
	" type1_errors = selected * np.array([v not in true_active for v in S['variable']])\n",
	" sign_errors = selected * np.array([(s != np.sign(beta[v])) * (v in true_active) for v, s in \n",
	" zip(S['variable'], L.active_signs)])\n",
	"\n",
	" full_model_FDP.setdefault(key, []).append(np.sum(type1_errors) / max(np.sum(selected), 1))\n",
	" directional_FDP.setdefault(key, []).append(np.sum(type1_errors + sign_errors) / max(np.sum(selected), 1))\n",
	" full_model_power.setdefault(key, []).append(np.sum(selected * np.array([v in true_active for v in S['variable']])) / len(true_active))\n",
	" \n",
	" return P0, PA, full_model_FDP, directional_FDP, full_model_power, active_size"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## Size of active sets"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 17,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"[(0.4, 242.33333333333334), (0.7, 35.666666666666664), (1.0, 12.333333333333334)]\n"
	]
	}
	],
	"source": [
	"print([(m, np.mean(simulate(0.25, nsim=3, mult=m)[-1])) for m in [0.4,0.7,1.]])"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 15,
	"metadata": {
	"collapsed": false
	},
	"outputs": [],
	"source": [
	"def make_table(nsim, q, mult):\n",
	"\n",
	" rows = r\"\"\"\n",
	" <tr><td>$\\rho$</td><td>Model FDP</td><td>Directional FDP</td><td>Model power</td><td>Active set</td></tr>\n",
	" \"\"\"\n",
	"\n",
	" for rho in [0., 0.25, 0.5, 0.75]:\n",
	" P0, PA, full_model_FDP, directional_FDP, full_model_power, active_size = simulate(rho=rho, q=q, nsim=nsim, mult=mult)\n",
	" rows += \"\"\"\n",
	" <tr><td>%0.2f</td><td>%0.3f</td><td>%0.3f</td><td>%0.3f</td><td>%0.1f</td></tr>\n",
	" \"\"\" % (rho, np.mean(full_model_FDP['BH']), np.mean(directional_FDP['BH']), np.mean(full_model_power['BH']), np.mean(active_size))\n",
	"\n",
	" table = \"\"\"\n",
	" <table>\n",
	" %s\n",
	" </table>\"\"\" % rows\n",
	"\n",
	" return table"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## $\\kappa=0.7$"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 7,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"data": {
	"text/html": [
	"\n",
	" <table>\n",
	" \n",
	" <tr><td>$\\rho$</td><td>Model FDP</td><td>Directional FDP</td><td>Model power</td></tr>\n",
	" \n",
	" <tr><td>0.00</td><td>0.042</td><td>0.042</td><td>0.666</td></tr>\n",
	" \n",
	" <tr><td>0.25</td><td>0.041</td><td>0.041</td><td>0.684</td></tr>\n",
	" \n",
	" <tr><td>0.50</td><td>0.050</td><td>0.050</td><td>0.619</td></tr>\n",
	" \n",
	" <tr><td>0.75</td><td>0.108</td><td>0.108</td><td>0.449</td></tr>\n",
	" \n",
	" </table>"
	],
	"text/plain": [
	"<IPython.core.display.HTML object>"
	]
	},
	"execution_count": 7,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"HTML(make_table(100, 0.2, 0.7))"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## $\\kappa=0.4$\n",
	"\n",
	"You can of course choose $\\alpha$ so small so that power goes down. A multiple of 0.4 here seems pretty low and one can imagine that there are many more coefficients at this point to test."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 8,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"data": {
	"text/html": [
	"\n",
	" <table>\n",
	" \n",
	" <tr><td>$\\rho$</td><td>Model FDP</td><td>Directional FDP</td><td>Model power</td></tr>\n",
	" \n",
	" <tr><td>0.00</td><td>0.083</td><td>0.083</td><td>0.038</td></tr>\n",
	" \n",
	" <tr><td>0.25</td><td>0.046</td><td>0.046</td><td>0.021</td></tr>\n",
	" \n",
	" <tr><td>0.50</td><td>0.087</td><td>0.087</td><td>0.045</td></tr>\n",
	" \n",
	" <tr><td>0.75</td><td>0.112</td><td>0.112</td><td>0.051</td></tr>\n",
	" \n",
	" </table>"
	],
	"text/plain": [
	"<IPython.core.display.HTML object>"
	]
	},
	"execution_count": 8,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"HTML(make_table(100, 0.2, 0.4))"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## $\\kappa=1.$\n",
	"\n",
	"If $\\kappa$ is too large, then too few variables will be selected. Power would be expected to go down because we have\n",
	"not included enough. Though FDP seems under control.\n"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 9,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"data": {
	"text/html": [
	"\n",
	" <table>\n",
	" \n",
	" <tr><td>$\\rho$</td><td>Model FDP</td><td>Directional FDP</td><td>Model power</td></tr>\n",
	" \n",
	" <tr><td>0.00</td><td>0.001</td><td>0.001</td><td>0.240</td></tr>\n",
	" \n",
	" <tr><td>0.25</td><td>0.000</td><td>0.000</td><td>0.248</td></tr>\n",
	" \n",
	" <tr><td>0.50</td><td>0.013</td><td>0.013</td><td>0.254</td></tr>\n",
	" \n",
	" <tr><td>0.75</td><td>0.081</td><td>0.081</td><td>0.206</td></tr>\n",
	" \n",
	" </table>"
	],
	"text/plain": [
	"<IPython.core.display.HTML object>"
	]
	},
	"execution_count": 9,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"HTML(make_table(100, 0.2, 1.))"
	]
	},
	{
	"cell_type": "code",
	"execution_count": null,
	"metadata": {
	"collapsed": true
	},
	"outputs": [],
	"source": []
	},
	{
	"cell_type": "code",
	"execution_count": null,
	"metadata": {
	"collapsed": true
	},
	"outputs": [],
	"source": []
	}
	],
	"metadata": {
	"kernelspec": {
	"display_name": "Python 2",
	"language": "python",
	"name": "python2"
	},
	"language_info": {
	"codemirror_mode": {
	"name": "ipython",
	"version": 2
	},
	"file_extension": ".py",
	"mimetype": "text/x-python",
	"name": "python",
	"nbconvert_exporter": "python",
	"pygments_lexer": "ipython2",
	"version": "2.7.11"
	}
	},
	"nbformat": 4,
	"nbformat_minor": 0
	}
No results found