Last active
March 1, 2017 17:53
-
-
Save drdozer/c9a21a8950fd519ca8c48c7c4cccdfc3 to your computer and use it in GitHub Desktop.
Working with simple multiple trial probabilities
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The question we're answering here is how many times we need to try something | |
to have some level of certainty that it will happen at least once. | |
For example, if a parent has children, each with independent mutations, | |
how many children are needed until the chances of at least one of those children carrying a particular mutation exceeds 90%. | |
First some notation. | |
p(e) :: probability of some event e happening in a single trial | |
p(¬e) :: probability that some event e does not happen | |
And as either something happens or it does not, | |
p(e) + p(¬e) = 1 | |
=> | |
p(¬e) = 1-p(e) | |
Now let's say we do something lots of times: | |
p(n of e) :: probability of n independent events e | |
As the events are independent, the chances of them all happening is their product. | |
As we are taking the product of n things, all alike, we can rewrite this as a power. | |
p(n of e) = p(e)^n | |
We want to know how many events we need to look at, to expect to have a 50% chance (or some other chance) of observing at least one example of e happening. | |
The problem with this is that it may have happened once, or twice, or more times in those trials. | |
This sum gets quite complicated quite quickly. | |
Luckily though, is the same as asking how many events we need to look at, to have a (100% - 50% = 50%) chance of not observing any examples of e happening. | |
We can calculate this easily. | |
It's the chance that it didn't happen the first time, and didn't happen the second, and so on. | |
Given p(e), and a threshold c: | |
c = p(at least one e from n trials), | |
¬c is p(no e from n trials) | |
¬c = p(n of ¬e) | |
from independence of events: | |
= p(¬e)^n | |
We want to solve for n. | |
log both sides | |
ln(p(¬e)) * n = ln(¬c) | |
divide through | |
n = ln(¬c) / ln(p(¬e)) | |
removing ¬ by complementary probabilities | |
n = ln(1 - c) / ln(1 - p(e)) | |
Let's try it with some simple examples. | |
Say we have a fair coin. | |
So we can ask how many throws of that coin we need to get to expect a 50% chane that it will come up heads. | |
Now, we know directly that the answer is 1 throw, as it is a fair coin with just two outcomes one of which we are selecting (1/2=50%). | |
However, we can also show it with this formula: | |
n = ln(1 - c) / ln(1 - p(e)) | |
c = 50%, p(e) = 50% | |
= ln(1-50%) / ln(1-50%) | |
x/x = 1 | |
= 1 | |
So, in a long-winded way, we've validated that a fair dice needs to be thrown once to have a 50% chance of coming up heads. | |
How many times must you roll a dice to have a 90% chance of a 6? | |
n = ln(1 - c) / ln(1 - p(e)) | |
c = 90%, p(e) = 1/6 | |
= ln(1 - 90%) / ln(1-1/6) | |
arithmetic | |
= ln(10%) / ln(5/6) | |
and using a calculator | |
~ 12.6 | |
So if you roll a dice 12-13 times, you have a 90% chance of getting at least one 6. | |
Now, let's say we have a mutation that has a 1/16 chance of happening in any child, and we want to ask how many children a parent must have | |
before they have a 90% chance of at last one child carrying this mutation, we work it out in exactly the same way: | |
n = ln(1 - c) / ln(1 - p(e)) | |
c = 90%, p(e) = 1/16 | |
~35.7 | |
So with 36 children, there's a greater than 90% chance that at least one child will carry that mutation. | |
In case you have scala installed, here's the scala code for calculating this: | |
// c_threshold - chance threshold | |
// p_e - probability of event e in a single trial | |
// returns the number of trials needed for the combined probability of e happening at least once | |
// to reach c_threshold | |
def estimateN(c_threshold: Double, p_e: Double) = Math.log(1.0 - c_threshold) / Math.log(1.0-p_e) |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment