Bayes' Theorem is a fundamental concept in probability and statistics. It allows us to update our initial beliefs about an event based on new evidence. Here's a step-by-step breakdown of the theorem, its components, and a couple of examples to solidify understanding.
- Independent Events: If events A and B are independent (they don't influence each other), then the probability that both occur is simply the product of their individual probabilities.
[ P(A \text{ and } B) = P(A) \times P(B) ]
- Dependent Events: When events are related or influence each other, the probability that both events occur depends on the probability of one event given that the other has occurred.
[ P(A \text{ and } B) = P(A) \times P(B|A) ]
Here, P(B|A) is the probability that B occurs given that A has already happened.
- Bayes' Theorem: This theorem reverses conditional probabilities: it lets us find P(A|B) — the probability that A occurs given that B is true — from probabilities we already know.
[ P(A|B) = \frac{P(A) \times P(B|A)}{P(B)} ]
Where:
- P(A|B): Posterior probability — probability of A given B.
- P(A): Prior probability — initial probability of A.
- P(B|A): Likelihood — probability of B given A.
- P(B): Normalizing constant — the total probability of B across all hypotheses, which ensures the posterior probabilities sum to 1 (a short code sketch of these rules follows this list).
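To make these three rules concrete, here is a minimal Python sketch. The probabilities used (0.4, 0.5, 0.7) are illustrative numbers chosen for this example, not values from any dataset:

```python
def bayes_posterior(prior_a, likelihood_b_given_a, prob_b):
    """P(A|B) = P(A) * P(B|A) / P(B)."""
    return prior_a * likelihood_b_given_a / prob_b

# Illustrative probabilities (assumed for this sketch).
p_a, p_b = 0.4, 0.5
p_b_given_a = 0.7

# Independent events: P(A and B) = P(A) * P(B)
print(round(p_a * p_b, 2))                       # 0.2

# Dependent events: P(A and B) = P(A) * P(B|A)
print(round(p_a * p_b_given_a, 2))               # 0.28

# Bayes' Theorem reverses the conditional: P(A|B)
print(round(bayes_posterior(p_a, p_b_given_a, p_b), 2))   # 0.4 * 0.7 / 0.5 = 0.56
```

With the pieces defined, let's work through two examples.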
Suppose you have two jars of cookies:
- Jar 1 contains 10 chocolate chip cookies and 30 sugar cookies. So, 3/4 of the cookies in Jar 1 are sugar cookies.
- Jar 2 contains 20 chocolate chip cookies and 20 sugar cookies. In Jar 2, only 1/2 are sugar cookies.
Now, a friend picks a sugar cookie from one of the jars. You want to know from which jar the cookie most likely came.
- Hypotheses:
- Hypothesis 1 (H1): The friend picked the cookie from Jar 1.
- Hypothesis 2 (H2): The friend picked the cookie from Jar 2.
- Priors:
- Since the friend could pick either jar randomly, both hypotheses are equally likely.
- P(H1) = P(H2) = 0.5
- Likelihoods:
- P(\text{Sugar} | H1): Probability of picking a sugar cookie from Jar 1 = 0.75 (3/4).
- P(\text{Sugar} | H2): Probability of picking a sugar cookie from Jar 2 = 0.5 (1/2).
- Normalizing Constant (P(E)): The probability of the evidence E, here picking a sugar cookie, regardless of which jar it came from.
[ P(E) = (P(H1) \times P(\text{Sugar}|H1)) + (P(H2) \times P(\text{Sugar}|H2)) = (0.5 \times 0.75) + (0.5 \times 0.5) = 0.625 ]
Now, we can find the posterior probability that the friend picked the cookie from Jar 1 given they picked a sugar cookie:
[ P(H1|\text{Sugar}) = \frac{P(H1) \times P(\text{Sugar}|H1)}{P(E)} = \frac{0.5 \times 0.75}{0.625} = 0.6 ]
So, there’s a 60% chance that the cookie came from Jar 1.
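The same calculation takes only a few lines of Python; this sketch simply mirrors the numbers above:

```python
# Cookie jar example: posterior for each jar given that a sugar cookie was drawn.
priors = {"Jar 1": 0.5, "Jar 2": 0.5}
likelihoods = {"Jar 1": 0.75, "Jar 2": 0.5}    # P(Sugar | jar)

# Normalizing constant: overall probability of drawing a sugar cookie.
p_sugar = sum(priors[jar] * likelihoods[jar] for jar in priors)   # 0.625

for jar in priors:
    posterior = priors[jar] * likelihoods[jar] / p_sugar
    print(f"P({jar} | Sugar) = {posterior:.3f}")
# P(Jar 1 | Sugar) = 0.600
# P(Jar 2 | Sugar) = 0.400
```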
In another scenario, imagine your friend mentions they spoke to someone with long hair. You want to judge how likely it is that the person is female based on this clue alone.
- Hypotheses:
- Hypothesis W: The person is a woman.
- Hypothesis M: The person is a man.
- Priors:
- If the population is assumed to be 50% male and 50% female:
- P(W) = 0.5
- P(M) = 0.5
- Likelihoods:
- P(\text{Long Hair} | W): Probability of having long hair if the person is female = 0.75 (or 75%).
- P(\text{Long Hair} | M): Probability of having long hair if the person is male = 0.15 (or 15%).
- Normalizing Constant (P(Long)): The overall probability of the person having long hair.
[ P(\text{Long}) = (P(W) \times P(\text{Long Hair}|W)) + (P(M) \times P(\text{Long Hair}|M)) ]
[ P(\text{Long}) = (0.5 \times 0.75) + (0.5 \times 0.15) = 0.375 + 0.075 = 0.45 ]
We want to calculate the probability that the person is female given that they have long hair.
[ P(W|\text{Long Hair}) = \frac{P(W) \times P(\text{Long Hair}|W)}{P(\text{Long})} ]
[ P(W|\text{Long Hair}) = \frac{0.5 \times 0.75}{0.45} = \frac{0.375}{0.45} \approx 0.833 ]
So, there’s about an 83.3% chance that the long-haired person is female.
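Again, a short sketch confirming the arithmetic:

```python
# Long hair example: posterior probability that the person is a woman.
p_w, p_m = 0.5, 0.5                              # priors
p_long_given_w, p_long_given_m = 0.75, 0.15      # likelihoods P(Long Hair | W), P(Long Hair | M)

# Normalizing constant: overall probability of long hair.
p_long = p_w * p_long_given_w + p_m * p_long_given_m     # 0.45

p_w_given_long = p_w * p_long_given_w / p_long
print(round(p_w_given_long, 3))                  # 0.833
```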
To recap, applying Bayes' Theorem comes down to five steps:
- Identify Hypotheses: Define the possible scenarios (e.g., Jar 1 vs. Jar 2, Woman vs. Man).
- Determine Priors: Assign initial probabilities to each hypothesis.
- Calculate Likelihoods: Find the probability of the evidence given each hypothesis.
- Find the Normalizing Constant: Calculate the total probability of the evidence occurring across all hypotheses.
- Apply Bayes’ Formula: Compute the posterior probability for each hypothesis by multiplying its prior by its likelihood and dividing by the normalizing constant (see the sketch below).
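These five steps generalize to any number of hypotheses. Here is a minimal sketch of the recipe as a reusable function (the name `bayes_update` is our own label for illustration, not a standard API):

```python
def bayes_update(priors, likelihoods):
    """Posterior P(H|E) for each hypothesis H, given priors P(H)
    and likelihoods P(E|H) for the observed evidence E."""
    # Step 4: normalizing constant P(E), summed over all hypotheses.
    p_evidence = sum(priors[h] * likelihoods[h] for h in priors)
    # Step 5: Bayes' formula, hypothesis by hypothesis.
    return {h: priors[h] * likelihoods[h] / p_evidence for h in priors}

# Both worked examples, one call each:
print(bayes_update({"Jar 1": 0.5, "Jar 2": 0.5}, {"Jar 1": 0.75, "Jar 2": 0.5}))
print(bayes_update({"Woman": 0.5, "Man": 0.5}, {"Woman": 0.75, "Man": 0.15}))
# {'Jar 1': 0.6, 'Jar 2': 0.4}
# {'Woman': 0.833..., 'Man': 0.166...}
```

Because the function normalizes over whatever hypotheses it is given, the same few lines handle two jars, two genders, or any larger set of mutually exclusive hypotheses.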