Probability Cheat Sheet
=======================
Counting
--------
**Permutations**
$n! = n(n-1)(n-2) \cdots 1$
**k-Permutations**
$\frac{n!}{(n-k)!}$
**Combinations**
$\binom{n}{k} = \frac{n!}{k!(n-k)!}$
**Partitions**
$\frac{n!}{\prod_{i=1}^{r}n_i!}$
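A minimal numeric check of the four counting formulas above, using only the Python standard library (`math.comb` and `math.prod` need Python 3.8+); the values are arbitrary:

```python
from math import factorial, comb, prod

n, k = 5, 2
permutations = factorial(n)                        # n! = 120
k_permutations = factorial(n) // factorial(n - k)  # n!/(n-k)! = 20
combinations = comb(n, k)                          # n!/(k!(n-k)!) = 10

# partition n items into groups of sizes n_1, ..., n_r (sizes sum to n)
sizes = [2, 2, 1]
partitions = factorial(n) // prod(factorial(s) for s in sizes)  # 5!/(2!2!1!) = 30
```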
Events
------
$A, B$ are events.
**Conditional Probability**
if $P(B) > 0$, $P(A | B) = \frac{P(A \cap B)}{P(B)}$
**Additivity**
if $A_1, A_2$ are disjoint, $P(A_1 \cup A_2 | B) = P(A_1 | B) + P(A_2 | B)$
**Multiplication Rule**
$P(A_1 \cap A_2 \cap A_3) = P(A_1)\,P(A_2 | A_1)\,P(A_3 | A_1 \cap A_2)$
**Total Probability**
if $A_i$ are disjoint and partition the sample space, $P(B) = \sum_i P(A_i \cap B) = \sum_i P(A_i)\,P(B | A_i)$
**Bayes' Rule**
$P(A_i | B) = \frac{P(A_i)\,P(B | A_i)}{P(B)} = \frac{P(A_i)\,P(B | A_i)}{\sum_j P(A_j)\,P(B | A_j)}$
**Independence**
$
\begin{array}{ll}
P(A \cap B) = P(A)\,P(B)\\
P(A | B) = P(A) & P(B) > 0\\
P(A \cap B | C) = P(A | C)\,P(B | C)\\
P(A | B \cap C) = P(A | C) & P(B \cap C) > 0\\
P(\bigcap_{i \in S} A_i) = \prod_{i \in S}P(A_i) & \forall S \subseteq \{1, 2, ..., n\}
\end{array}$
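A tiny worked example of total probability and Bayes' rule; the prior and likelihood numbers below are made up for illustration:

```python
# hypothetical two-hypothesis setup: A1 and its complement A2 partition
# the sample space; B is an observed event
p_a = [0.01, 0.99]          # priors P(A_1), P(A_2)
p_b_given_a = [0.95, 0.05]  # likelihoods P(B | A_i)

# total probability: P(B) = sum_i P(A_i) P(B | A_i)
p_b = sum(pa * pb for pa, pb in zip(p_a, p_b_given_a))

# Bayes' rule: P(A_1 | B) = P(A_1) P(B | A_1) / P(B)
p_a1_given_b = p_a[0] * p_b_given_a[0] / p_b
print(p_b, p_a1_given_b)    # 0.059, ~0.161
```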
Discrete Random Variables
-------------------------
$X, Y$ are discrete RVs, $x, y$ are values.
**Mass Function (PMF) and Cumulative Distribution (CDF)**
$
\begin{array}{ll}
p_X(x) = P(X = x) & (\sum_x p_X(x) = 1)\\
F_X(x) = P(X \le x) = \sum_{k \le x} p_X(k)
\end{array}$
**Joint PMF**
$p_{X,Y}(x,y) = P(X = x, Y = y)$
**Marginals**
$p_X(x) = \sum_y p_{X,Y}(x,y)\\
p_Y(y) = \sum_x p_{X,Y}(x,y)$
**Conditional PMF**
$
\begin{array}{ll}
p_{X|A}(x) = P(X = x | A) & (\sum_x p_{X|A}(x) = 1)
\end{array}
$
**Total Probability**
$p_X(x) = \sum_i P(A_i)\,p_{X|A_i}(x)\\
p_{X|B}(x) = \sum_i P(A_i|B)\,p_{X|A_i \cap B}(x)\\
p_{X,Y}(x,y) = p_Y(y)\,p_{X|Y}(x|y)\\
p_X(x) = \sum_y p_Y(y)\,p_{X|Y}(x|y)$
**Bayes' Rule**
$p_{Y|X}(y|x) = \frac{p_Y(y)\,p_{X|Y}(x|y)}{p_X(x)} = \frac{p_Y(y)\,p_{X|Y}(x|y)}{\sum_{y'} p_Y(y')\,p_{X|Y}(x|y')}
$
**Expectation**
$E[X] = \sum_x x\,p_X(x)\\
E[X|A] = \sum_x x\,p_{X|A}(x)\\
E[X|Y=y] = \sum_x x\,p_{X|Y}(x|y)\\
E[g(X)] = \sum_x g(x)\,p_X(x)\\
E[g(X) | A] = \sum_x g(x)\,p_{X|A}(x)\\
E[g(X)h(Y)] = \sum_x \sum_y g(x)\,h(y)\,p_{X,Y}(x,y)$
**Total Expectation**
$E[X] = \sum_i P(A_i)\,E[X | A_i]\\
E[X | B] = \sum_i P(A_i | B)\,E[X | A_i \cap B]\\
E[X] = \sum_y p_Y(y)\,E[X | Y=y]$
**Independence**
If $X, Y$ are independent,
$p_{X,Y}(x,y) = p_X(x)\,p_Y(y)\\
p_{X|Y}(x|y) = p_X(x)$
**Bernoulli**
$
p_X(k) =
\begin{cases}
p & k = 1\\
1-p & k = 0
\end{cases}\\
E[X] = p\\
var(X) = p(1-p)
$
**Binomial**
$p_X(k) = P(X = k) = \binom{n}{k}\,p^k\,(1-p)^{n-k}\\
E[X] = np\\
var(X) = np(1-p)$
**Poisson**
$p_X(k) = e^{-\lambda}\frac{\lambda^k}{k!}\\
E[X] = \lambda\\
var(X) = \lambda\\
p_{N_\tau}(k) = P(k, \tau) = e^{-\lambda\tau}\frac{(\lambda\tau)^k}{k!}\\
E[N_\tau] = \lambda\tau\\
var(N_\tau) = \lambda\tau
$
**Geometric**
$p_X(k) = (1-p)^{k-1}p\\
E[X] = \frac{1}{p}\\
var(X) = \frac{1-p}{p^2}\\
F_X(n) = \sum_{k=1}^n p\,(1-p)^{k-1} = 1 - (1-p)^n
$
**Uniform**
$p_X(x) =
\begin{cases}
\frac{1}{b-a+1} & x \in \{a, ..., b\}\\
0 & o.w.
\end{cases}\\
E[X] = \frac{a+b}{2}\\
var(X) = \frac{(b-a+1)^2-1}{12}$
**Pascal**
$p_{Y_k}(t) = \binom{t-1}{k-1}p^k(1-p)^{t-k}\\
E[Y_k] = k/p\\
var(Y_k) = \frac{k(1-p)}{p^2}
$
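If SciPy is available, the means and variances above can be spot-checked against `scipy.stats`; note that SciPy's `nbinom` counts failures before the $k^{th}$ success, hence the shift by $k$ for the Pascal:

```python
from scipy import stats

n, p, lam = 10, 0.3, 2.0     # arbitrary parameters
a, b, k = 1, 6, 3

assert abs(stats.binom(n, p).mean() - n * p) < 1e-9
assert abs(stats.binom(n, p).var() - n * p * (1 - p)) < 1e-9
assert abs(stats.poisson(lam).var() - lam) < 1e-9
assert abs(stats.geom(p).var() - (1 - p) / p**2) < 1e-9
# discrete uniform on {a, ..., b}: scipy's randint excludes the upper bound
assert abs(stats.randint(a, b + 1).var() - ((b - a + 1)**2 - 1) / 12) < 1e-9
# Pascal (time of the k-th success): Y_k = k + nbinom(k, p)
assert abs(k + stats.nbinom(k, p).mean() - k / p) < 1e-9
print("all discrete moment checks passed")
```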
Continuous Random Variables
---------------------------
**Density Function (PDF) and Cumulative Distribution (CDF)**
$
\begin{array}{ll}
P(X \in B) = \int_B f_X(x)\,dx & (\int_{-\infty}^{\infty}f_X(x)\,dx = 1)\\
F_X(x) = P(X \le x) = \int_{-\infty}^x f_X(t)\,dt
\end{array}$
**Joint PDF and CDF**
$
\begin{array}{ll}
P(a \le X \le b, c \le Y \le d) = \int_c^d \int_a^b f_{X,Y}(x,y)\,dx\,dy & (\int_{-\infty}^\infty \int_{-\infty}^\infty f_{X,Y}(x,y)\,dx\,dy = 1)\\
F_{X,Y}(x,y) = P(X \le x, Y \le y) = \int_{-\infty}^y \int_{-\infty}^x f_{X,Y}(s,t)\,ds\,dt\\
f_{X,Y}(x,y) = \frac{\partial^2 F_{X,Y}(x,y)}{\partial x\,\partial y}
\end{array}$
**Marginals**
$f_X(x) = \int_{-\infty}^\infty f_{X,Y}(x,y)\,dy\\
f_Y(y) = \int_{-\infty}^\infty f_{X,Y}(x,y)\,dx
$
**Conditional PDF**
$f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}$
**Total Probability**
$f_X(x) = \int_{-\infty}^\infty f_Y(y)\,f_{X|Y}(x|y)\,dy$
**Bayes' Rule**
$f_{Y|X}(y|x) = \frac{f_Y(y)\,f_{X|Y}(x|y)}{f_X(x)} = \frac{f_Y(y)\,f_{X|Y}(x|y)}{\int_{-\infty}^\infty f_Y(t)\,f_{X|Y}(x|t)\,dt}$
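A sketch of the continuous Bayes' rule evaluated numerically on a grid, under an assumed model (uniform prior on $[0,1]$, normal likelihood with known $\sigma = 0.1$); the normalizing integral is approximated by a Riemann sum:

```python
import numpy as np

# assumed model for illustration: Y ~ Uniform(0,1) prior and
# X | Y = y ~ Normal(y, 0.1^2)
y = np.linspace(0.0, 1.0, 1001)
prior = np.ones_like(y)                                 # f_Y on [0,1]
x_obs = 0.4
likelihood = np.exp(-((x_obs - y) ** 2) / (2 * 0.1**2)) # up to a constant

post = prior * likelihood
post /= post.sum() * (y[1] - y[0])    # approximate the normalizing integral
print(y[np.argmax(post)])             # posterior mode sits near x_obs = 0.4
```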
**Expectation**
$
E[X] = \int_{-\infty}^\infty x\,f_X(x)\,dx\\
E[g(X)] = \int_{-\infty}^\infty g(x)\,f_X(x)\,dx\\
E[X | A] = \int_{-\infty}^\infty x\,f_{X|A}(x)\,dx\\
E[g(X) | A] = \int_{-\infty}^\infty g(x)\,f_{X|A}(x)\,dx\\
E[X | Y=y] = \int_{-\infty}^\infty x\,f_{X|Y}(x|y)\,dx\\
E[g(X) | Y=y] = \int_{-\infty}^\infty g(x)\,f_{X|Y}(x|y)\,dx\\
E[g(X,Y) | Y=y] = \int_{-\infty}^\infty g(x,y)\,f_{X|Y}(x|y)\,dx\\
E[g(X,Y)] = \int E[g(X,Y) | Y=y]\,f_Y(y)\,dy\\
E[g(X,Y)] = \int_{-\infty}^\infty \int_{-\infty}^\infty g(x,y)\,f_{X,Y}(x,y)\,dx\,dy\\
var(X) = \int_{-\infty}^\infty (x - E[X])^2\,f_X(x)\,dx
$
**Total Expectation**
$E[X] = \int_{-\infty}^\infty E[X|Y=y]\,f_Y(y)\,dy$
**Independence**
$
\begin{array}{ll}
f_{X,Y}(x,y) = f_X(x)\,f_Y(y) & \forall x,y
\end{array}$
**Uniform**
$f_X(x) =
\begin{cases}
\frac{1}{b-a} & a \le x \le b\\
0 & o.w.
\end{cases}\\
E[X] = \frac{a+b}{2}\\
var(X) = \frac{(b-a)^2}{12}$
**Exponential**
$f_X(x) =
\begin{cases}
\lambda\,e^{-\lambda\,x} & x \ge 0\\
0 & o.w.
\end{cases}\\
E[X] = \frac{1}{\lambda}\\
var(X) = \frac{1}{\lambda^2}\\
F_X(x) = \int_0^x \lambda\,e^{-\lambda t}\,dt = 1 - e^{-\lambda x}$
**Normal**
$X \sim \mathcal{N}(\mu, \sigma^2): f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(x-\mu)^2}{2\,\sigma^2}}\\
E[X] = \mu\\
var(X) = \sigma^2\\
E[aX+b] = a\mu+b\\
var(aX+b) = a^2\sigma^2\\
\phi(x) = e^{-x^2/2}/\sqrt{2\pi} \quad (\text{standard normal } \mathcal{N}(0, 1))\\
\Phi(x) = \int_{-\infty}^x\,\phi(t)\,dt
$
**Beta**
$f_\Theta(\theta) =
\begin{cases}
\frac{1}{B(\alpha, \beta)}\theta^{\alpha-1}(1-\theta)^{\beta-1} & 0 \lt \theta \lt 1\\
0 & o.w.
\end{cases}\\
E[\Theta^m] = \frac{B(\alpha+m,\beta)}{B(\alpha,\beta)}\\
B(\alpha,\beta) = \frac{(\alpha-1)!\,(\beta-1)!}{(\alpha+\beta-1)!} \quad (\text{integer } \alpha, \beta)
$
**Erlang**
$f_{Y_k}(y) = \frac{\lambda^k\,y^{k-1}\,e^{-{\lambda}y}}{(k-1)!}, \quad y \ge 0\\
E[Y_k] = k/\lambda\\
var(Y_k) = k/\lambda^2
$
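A matching spot-check for the continuous families, again assuming SciPy is available; the Erlang is the Gamma distribution with integer shape $k$ and rate $\lambda$ (SciPy parameterizes by `scale` $= 1/\lambda$), and $\Phi$ can be written with the error function:

```python
from math import erf, sqrt
from scipy import stats

lam, alpha, beta, k = 2.0, 2.0, 5.0, 4   # arbitrary parameters

assert abs(stats.uniform(0, 1).var() - 1 / 12) < 1e-9          # U(0,1)
assert abs(stats.expon(scale=1 / lam).mean() - 1 / lam) < 1e-9

# standard normal CDF Phi expressed via the error function
Phi = lambda x: 0.5 * (1 + erf(x / sqrt(2)))
assert abs(Phi(1.96) - stats.norm.cdf(1.96)) < 1e-12           # ~0.975

# E[Theta] = B(alpha+1, beta)/B(alpha, beta) = alpha/(alpha+beta)
assert abs(stats.beta(alpha, beta).mean() - alpha / (alpha + beta)) < 1e-9

# Erlang as Gamma(k, scale=1/lambda)
assert abs(stats.gamma(k, scale=1 / lam).var() - k / lam**2) < 1e-9
print("all continuous moment checks passed")
```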
Properties of Random Variables
------------------------------
**Linearity of Expectation**
$E[aX + b] = aE[X] + b\\
E[aX + bY + c] = aE[X] + bE[Y] + c$
**Iterated Expectation**
$E[X] = E[E[X|Y]]$
**Variance**
$var(X) = E[(X - E[X])^2]\\
var(X) = E[X^2] - (E[X])^2\\
var(aX+b) = a^2\,var(X)\\
\sigma_X = \sqrt{var(X)}$
**Total Variance**
$var(X) = E[var(X|Y)] + var(E[X|Y])
$
**Covariance**
$cov(X,Y) = E[(X-E[X])(Y-E[Y])]\\
cov(X,Y) = E[XY] - E[X]E[Y]\\
var(X+Y) = var(X) + var(Y) + 2\,cov(X,Y)\\
\rho(X,Y) = \frac{cov(X,Y)}{\sqrt{var(X)\,var(Y)}}
$
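The covariance identities are exact on any sample (with population-style variances), which makes them easy to verify numerically; the correlated pair below is simulated for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)       # deliberately correlated with x

cov = np.mean(x * y) - x.mean() * y.mean()   # cov(X,Y) = E[XY] - E[X]E[Y]
rho = cov / np.sqrt(x.var() * y.var())       # correlation coefficient

# var(X+Y) = var(X) + var(Y) + 2 cov(X,Y) holds exactly on the samples
assert abs(np.var(x + y) - (x.var() + y.var() + 2 * cov)) < 1e-9
print(cov, rho)   # roughly 0.5 and 0.5/sqrt(1.25) ~ 0.447
```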
**Independence**
If $X, Y$ are independent,
$E[XY] = E[X]\,E[Y]\\
E[g(X)h(Y)] = E[g(X)]\,E[h(Y)]\\
var(X+Y) = var(X) + var(Y)$
If $X, Y$ are independent and normal,
$Z = X + Y \implies Z \sim \mathcal{N}(\mu_X + \mu_Y, \sigma_X^2 + \sigma_Y^2)
$
Inference
---------
**Maximum a Posteriori (MAP)**
$
\hat\theta = \arg\max_\theta
\begin{cases}
p_\Theta(\theta)\,p_{X|\Theta}(x|\theta) & \Theta,\,X\ \text{discrete}\\
p_\Theta(\theta)\,f_{X|\Theta}(x|\theta) & \Theta\ \text{discrete},\ X\ \text{continuous}\\
f_\Theta(\theta)\,p_{X|\Theta}(x|\theta) & \Theta\ \text{continuous},\ X\ \text{discrete}\\
f_\Theta(\theta)\,f_{X|\Theta}(x|\theta) & \Theta,\ X\ \text{continuous}
\end{cases}$
**Conditional Expectation (LMS)**
$
\Theta = \hat\Theta + \tilde\Theta\\
\hat\Theta = E[\Theta|X] =
\int_\theta \theta\,f_{\Theta|X}(\theta|x)\,d\theta\\
\hat\Theta = E[\Theta|X] =
\sum_\theta \theta\,p_{\Theta|X}(\theta|x)\\
E[\tilde\Theta] = E[\tilde\Theta|X=x] = 0\\
cov(\tilde\Theta,\hat\Theta) = 0\\
var(\Theta) = var(\tilde\Theta) + var(\hat\Theta)
$
**Linear Least Mean Squares (LLMS)**
$
\hat\Theta = E[\Theta] + \frac{cov(\Theta,X)}{var(X)}(X-E[X]) = E[\Theta] + \rho\frac{\sigma_\Theta}{\sigma_X}(X-E[X])\\
\rho = \frac{cov(\Theta,X)}{\sigma_\Theta\sigma_X}\\
E[(\hat\Theta - \Theta)^2] = (1-\rho^2)\sigma_\Theta^2
$
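A minimal simulation sketch of the LLMS estimator, under an assumed linear-observation model ($\Theta \sim \mathcal{N}(2,1)$, $X = \Theta + $ noise); the coefficients come from sample moments, and the empirical MSE matches $(1-\rho^2)\sigma_\Theta^2$:

```python
import numpy as np

rng = np.random.default_rng(1)
theta = rng.normal(loc=2.0, scale=1.0, size=200_000)   # hidden Theta
x = theta + rng.normal(scale=0.5, size=200_000)        # noisy observation X

a = np.cov(theta, x, ddof=0)[0, 1] / np.var(x)         # cov(Theta,X)/var(X)
theta_hat = theta.mean() + a * (x - x.mean())          # LLMS estimate

rho = np.corrcoef(theta, x)[0, 1]
mse = np.mean((theta_hat - theta) ** 2)
print(mse, (1 - rho**2) * np.var(theta))   # both near 0.2 = (1 - 0.8) * 1
```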
Limit Theorems
--------------
**Markov Inequality**
If $X$ is non-negative,
$\begin{array}{ll}
P(X \ge a) \le \frac{E[X]}{a} & \forall a \gt 0
\end{array}$
**Chebyshev Inequality**
If $X$ has mean $\mu$ and variance $\sigma^2$,
$\begin{array}{ll}
P(|X - \mu| \ge c) \le \frac{\sigma^2}{c^2} & \forall c \gt 0\\
P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2} & \forall k \gt 0
\end{array}$
**Convergence in Probability**
A sequence $Y_1, Y_2, ..., Y_n$ converges *in probability* to a real value $a$ if
$\begin{array}{ll}
\lim\limits_{n \rightarrow \infty} P(|Y_n - a| \ge \epsilon) = 0 & \forall \epsilon \gt 0\\
P(|Y_n - a| \ge \epsilon) \le \delta & \forall n \gt n_0
\end{array}$
$\epsilon$ is the *accuracy* and $\delta$ the *confidence*
**Weak Law of Large Numbers**
If $X_n$ are i.i.d. with mean $\mu$ and sample mean $M_n$, then $M_n$ converges to $\mu$ *in probability*:
$\begin{array}{ll}
\lim\limits_{n \rightarrow \infty} P(|M_n - \mu| \ge \epsilon) = 0 & \forall \epsilon \gt 0
\end{array}$
**Central Limit Theorem**
If $X_i$ are i.i.d. with mean $\mu$ and variance $\sigma^2$, $S_n = X_1+X_2+...+X_n$ and $Z_n = \frac{S_n - n\mu}{\sigma\sqrt{n}}$, then
$\begin{array}{ll}
\lim\limits_{n \rightarrow \infty} P(Z_n \le z) = \Phi(z) & \forall z
\end{array}
$
**Normal Approximation**
$P(S_n \le c) \approx \Phi(\frac{c - n\mu}{\sigma\sqrt{n}})
$
**De Moivre-Laplace Approximation to the Binomial**
$P(S_n \le c) \approx \Phi(\frac{c+\frac{1}{2} - np}{\sqrt{np(1-p)}})$
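A quick comparison of the exact binomial CDF with the plain normal approximation and with the $\frac{1}{2}$ continuity correction (parameters chosen arbitrarily, SciPy assumed available):

```python
from math import sqrt
from scipy import stats

n, p, c = 100, 0.5, 55
mu, sigma = p, sqrt(p * (1 - p))   # per-trial mean and standard deviation

exact = stats.binom(n, p).cdf(c)
plain = stats.norm.cdf((c - n * mu) / (sigma * sqrt(n)))
corrected = stats.norm.cdf((c + 0.5 - n * mu) / (sigma * sqrt(n)))
print(exact, plain, corrected)   # ~0.864, ~0.841, ~0.864
```

The half-correction is noticeably closer because it accounts for the binomial's discreteness.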
Classical Statistics
--------------------
**Sample Mean**
If $X_1, X_2, ..., X_n$ are i.i.d. with mean $\theta$ and variance $\sigma^2$,
$\hat\Theta_n = M_n = \frac{1}{n}\sum_i X_i$
**Unbiased**
$\begin{array}{ll}
E[\hat\Theta_n] = \theta & \forall\,\theta
\end{array}
$
**Consistent**
$\hat\Theta_n \rightarrow \theta$ *in probability*, $\forall\,\theta$
**Mean Squared Error**
$E[(\hat\Theta_n - \theta)^2] = var(\hat\Theta_n) = \sigma^2/n\\
E[(\hat\Theta_n - \theta)^2] = var(\hat\Theta_n - \theta) + (E[\hat\Theta_n - \theta])^2\\
E[(\hat\Theta_n - \theta)^2] = var(\hat\Theta_n) + (\text{bias})^2
$
**Standard Error**
$\sqrt{var(\hat\Theta_n)} = \sigma/\sqrt{n}$
**Confidence Interval**
Typically $\alpha = 0.05, 0.025, 0.01$
$\begin{array}{ll}
P(\hat\Theta^- \le \theta \le \hat\Theta^+) \ge 1 - \alpha & \forall\,\theta
\end{array}
$
**Confidence Interval for Estimation of Mean**
$P(\hat\Theta_n - \frac{1.96\sigma}{\sqrt{n}} \le \theta \le \hat\Theta_n + \frac{1.96\sigma}{\sqrt{n}}) \approx 0.95
$
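A minimal sketch of the 95% interval from simulated data; the true mean and $\sigma$ below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
sigma = 2.0
x = rng.normal(loc=5.0, scale=sigma, size=400)   # n = 400 samples

n, m = len(x), x.mean()
half = 1.96 * sigma / np.sqrt(n)                 # known-sigma half-width
print(f"[{m - half:.3f}, {m + half:.3f}]")

# if sigma is unknown, plug in the sample estimate (divides by n-1)
s = x.std(ddof=1)
half_hat = 1.96 * s / np.sqrt(n)
print(f"[{m - half_hat:.3f}, {m + half_hat:.3f}]")
```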
**Sample Mean Estimate of Variance**
$\hat\sigma^2 = \frac{1}{n-1}\sum_i (X_i - \hat\Theta_n)^2 \rightarrow \sigma^2$
**Maximum Likelihood Estimation**
$\hat\theta_{ML} = \arg\max_\theta\, p_X(x; \theta)$
Bernoulli and Poisson Processes
-------------------------------
| Aspect | Poisson | Bernoulli |
|--------|---------|-----------|
| Arrival Times | Continuous | Discrete |
| PMF of # Arrivals | Poisson | Binomial |
| Interarrival Time Distribution | Exponential | Geometric |
| Time to $k^{th}$ arrival | Erlang | Pascal |
| Arrival Rate | $\lambda$/unit time | $p$/trial |
**Poisson Approximation to the Binomial**
If $\lambda = np$ is held constant as $n$ grows,
$\lim\limits_{n \rightarrow \infty} \frac{n!}{(n-k)!\,k!}p^k(1-p)^{n-k} = e^{-\lambda}\frac{\lambda^k}{k!}
$
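The convergence is easy to see numerically: hold $\lambda = np$ fixed, let $n$ grow, and watch the binomial PMF approach the Poisson PMF ($k$ and $\lambda$ below are arbitrary, SciPy assumed available):

```python
from scipy import stats

lam, k = 3.0, 2
for n in (10, 100, 1000):
    p = lam / n                   # keep lambda = n p fixed
    print(n, stats.binom(n, p).pmf(k), stats.poisson(lam).pmf(k))
```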
Markov Chains
-------------
**n-Step Transition Probabilities**
$r_{ij}(n) = P(X_n = j | X_0 = i)\\
r_{ij}(n) = \sum_{k=1}^m r_{ik}(n-1)\,p_{kj}\\
r_{ij}(1) = p_{ij}
$
**Steady-State (Stationary Distribution)**
For a chain with a single recurrent class that is aperiodic,
$\lim\limits_{n \rightarrow \infty} r_{ij}(n) = \pi_j\\
\pi_j = \sum_{k=1}^m \pi_k\,p_{kj}\\
\sum_{k=1}^m \pi_k = 1
$
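The balance equations plus normalization form a linear system, so $\pi$ can be found with one solve; the 3-state transition matrix below is hypothetical:

```python
import numpy as np

# hypothetical 3-state transition matrix (rows sum to 1)
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])

# solve pi P = pi with sum(pi) = 1: drop one (redundant) balance
# equation and replace it with the normalization constraint
A = np.vstack([(P.T - np.eye(3))[:-1], np.ones(3)])
pi = np.linalg.solve(A, np.array([0.0, 0.0, 1.0]))
print(pi, pi @ P)   # pi @ P reproduces pi up to rounding
```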
**Birth-Death**
$b_i = P(X_{n+1} = i+1 | X_n = i)\\
d_i = P(X_{n+1} = i-1 | X_n = i)\\
\pi_i b_i = \pi_{i+1}d_{i+1}\\
\pi_i = \pi_0\frac{b_0b_1...b_{i-1}}{d_1d_2...d_i}
$
**Absorption**
$s$ is an absorbing state; $a_i$ is the probability of eventually reaching $s$ from state $i$:
$\begin{array}{ll}
a_s = 1\\
a_i = 0 & \forall\ \text{absorbing}\ i \neq s\\
a_i = \sum_{j=1}^m p_{ij}a_j & \forall\ \text{transient}\ i
\end{array}$
**Expected Time to Absorption**
$\begin{array}{ll}
\mu_i = 0 & \forall\ \text{recurrent}\ i\\
\mu_i = 1 + \sum_{j=1}^m p_{ij}\mu_j & \forall\ \text{transient}\ i
\end{array}$
**Mean First Passage**
$\begin{array}{ll}
t_s = 0\\
t_i = 1 + \sum_{j=1}^m p_{ij}t_j & \forall i \neq s
\end{array}$
**Mean Recurrence Time**
$t_s^* = 1 + \sum_{j=1}^m p_{sj}t_j
$
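These first-passage equations are also a linear system: with $t_s = 0$ they reduce to $(I - Q)\,t = \mathbf{1}$, where $Q$ restricts $P$ to the non-target states. A sketch, reusing the hypothetical matrix from the steady-state example:

```python
import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])
s = 0                                        # target state
others = [i for i in range(3) if i != s]

# t_i = 1 + sum_j p_ij t_j with t_s = 0  =>  (I - Q) t = 1
Q = P[np.ix_(others, others)]
t = np.linalg.solve(np.eye(len(others)) - Q, np.ones(len(others)))

t_full = np.zeros(3)
t_full[others] = t
print(t_full)              # t_s = 0, plus mean first passage from 1 and 2
print(1 + P[s] @ t_full)   # mean recurrence time t*_s
```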
> Written with [StackEdit](https://stackedit.io/).