Created
August 9, 2021 21:37
-
-
Save XMPPwocky/b23b53ba5574fcc1fdf7aaa7c6e485a7 to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
```python | |
# ye olde imports | |
import math, random | |
``` | |
Imagine a virus which is, somehow, a single protein with only 3 "amino acids" (actually just uppercase letters here, oh well) | |
Currently, only one type is circulating, which is "XYZ". | |
```python | |
EXISTING_TYPE = "XYZ" | |
``` | |
But there's a possible variant, the Super Nasty Variant, "BAD", which escapes antibodies and is hard to even create good antibodies against (meaning even a vaccine targeted specifically at it has trouble): | |
```python | |
SUPER_NASTY_TYPE = "BAD" | |
``` | |
Now we'll run a comically oversimplified simulation of evolution: starting from a population of 100 copies of the existing virus (XYZ)... | |
```python | |
POPULATION_SIZE = 100 | |
def make_starting_population(): return [EXISTING_TYPE] * POPULATION_SIZE | |
``` | |
Each generation, every virus makes a copy of itself (to form a population twice as big), possibly mutating in the process. | |
Then, each virus may die (in this case, be neutralized by antibodies or whatever); the odds of survival each generation is the virus's fitness. | |
Note that the steady-state here happens for viruses with fitness of 0.5 (they make a copy of themselves each generation, but on average the original or the copy dies). | |
```python | |
def evolve(starting_population, fitness): | |
population = starting_population | |
new_population = [virus for virus in population] | |
# each virus makes a copy of itself, maybe mutating... | |
for virus in population: | |
new_population.append(maybe_mutate(virus)) | |
# but then some maybe die, based on their fitness: | |
population = [] | |
for virus in new_population: | |
if fitness(virus) > random.random(): | |
# virus survived | |
population.append(virus) | |
if not population: | |
population = [EXISTING_TYPE] # goofy hack so that the virus never completely dies out | |
return population | |
def geometric_mean(x): | |
if not x: return float("nan") | |
return math.prod(x) / len(x) | |
MAX_GENERATIONS = 2500 # timeout to keep things sensible | |
def run_experiment(fitness_fn, quiet=True): | |
population = make_starting_population() | |
if not quiet: print("Generation\tPop. size\tAvg. fitness\tMax. fitness") | |
for generation in range(MAX_GENERATIONS): | |
if not population: | |
if not quiet: print("Virus died out...") | |
break | |
if generation % 100 == 0: | |
fitnesses = [fitness_fn(virus) for virus in population] | |
avg_fitness = geometric_mean(fitnesses) | |
max_fitness = max(fitnesses) | |
if not quiet: print("{:5}\t\t{:8}\t{:5.4f}\t\t{:5.4f}".format(generation, len(population), avg_fitness, max_fitness)) | |
population = evolve(population, fitness_fn) | |
if SUPER_NASTY_TYPE in population: | |
if not quiet: print("Super Nasty Variant evolved in {} generations!".format(generation)) | |
return generation | |
``` | |
In this model, "mutations" are just random amino acid substitutions: | |
```python | |
ERROR_RATE = 1e-3 # 1/1000 chance per amino acid to replace it with a random one | |
def random_letter(): | |
return chr(ord("A")+random.randint(0, 26)) | |
def maybe_mutate(virus): | |
# random substitutions | |
newvirus = virus | |
for i in range(len(newvirus)): | |
if random.random() < ERROR_RATE: | |
# mutate this position | |
newvirus = newvirus[:i] + random_letter() + newvirus[i+1:] | |
return newvirus | |
``` | |
If each of the individual 3 substitutions (X->B, Y->A, Z->D) required to get from the existing type to the Super Nasty Type confers a small advantage on its own... | |
```python | |
SINGLE_SUBSTITUTION_FITNESS_GAIN = 1.05 # 5% advantage per substitution | |
BASE_FITNESS = 0.5 # steady state... | |
def fitness_normal(virus): | |
num_mutations = 0 | |
if virus[0] == "B": num_mutations += 1 | |
if virus[1] == "A": num_mutations += 1 | |
if virus[2] == "D": num_mutations += 1 | |
return BASE_FITNESS * pow(SINGLE_SUBSTITUTION_FITNESS_GAIN, num_mutations) | |
``` | |
then within a few thousand generations, the Super Nasty Variant usually shows up: | |
```python | |
for experiment in range(20): | |
generations_to_nasty = run_experiment(fitness_normal, quiet=True) | |
if generations_to_nasty is not None: | |
print("Experiment #{}: Super Nasty Variant evolved in {} generations".format(experiment, generations_to_nasty)) | |
else: | |
print("Experiment #{}: Super Nasty Variant never evolved".format(experiment)) | |
``` | |
Experiment #0: Super Nasty Variant never evolved | |
Experiment #1: Super Nasty Variant evolved in 2445 generations | |
Experiment #2: Super Nasty Variant evolved in 701 generations | |
Experiment #3: Super Nasty Variant never evolved | |
Experiment #4: Super Nasty Variant evolved in 331 generations | |
Experiment #5: Super Nasty Variant evolved in 722 generations | |
Experiment #6: Super Nasty Variant evolved in 472 generations | |
Experiment #7: Super Nasty Variant evolved in 2227 generations | |
Experiment #8: Super Nasty Variant never evolved | |
Experiment #9: Super Nasty Variant evolved in 627 generations | |
Experiment #10: Super Nasty Variant evolved in 516 generations | |
Experiment #11: Super Nasty Variant evolved in 195 generations | |
Experiment #12: Super Nasty Variant evolved in 693 generations | |
Experiment #13: Super Nasty Variant evolved in 300 generations | |
Experiment #14: Super Nasty Variant never evolved | |
Experiment #15: Super Nasty Variant evolved in 1506 generations | |
Experiment #16: Super Nasty Variant evolved in 837 generations | |
Experiment #17: Super Nasty Variant evolved in 643 generations | |
Experiment #18: Super Nasty Variant evolved in 302 generations | |
Experiment #19: Super Nasty Variant never evolved | |
But if all 3 mutations must occur together to get a fitness gain... | |
```python | |
def fitness_new(virus): | |
num_mutations = 0 | |
if virus[0] == "B": num_mutations += 1 | |
if virus[1] == "A": num_mutations += 1 | |
if virus[2] == "D": num_mutations += 1 | |
# now, without all 3 mutations, it's no better than no mutations: | |
if num_mutations < 2: num_mutations = 0 | |
return BASE_FITNESS * pow(SINGLE_SUBSTITUTION_FITNESS_GAIN, num_mutations) | |
``` | |
... it's much rarer (and slower) for the variant to evolve in the first place, even though it's just as bad when it does show up. | |
```python | |
for experiment in range(20): | |
generations_to_nasty = run_experiment(fitness_new, quiet=True) | |
if generations_to_nasty is not None: | |
print("Experiment #{}: Super Nasty Variant evolved in {} generations".format(experiment, generations_to_nasty)) | |
else: | |
print("Experiment #{}: Super Nasty Variant never evolved".format(experiment)) | |
``` | |
Experiment #0: Super Nasty Variant never evolved | |
Experiment #1: Super Nasty Variant never evolved | |
Experiment #2: Super Nasty Variant never evolved | |
Experiment #3: Super Nasty Variant never evolved | |
Experiment #4: Super Nasty Variant never evolved | |
Experiment #5: Super Nasty Variant never evolved | |
Experiment #6: Super Nasty Variant never evolved | |
Experiment #7: Super Nasty Variant never evolved | |
Experiment #8: Super Nasty Variant never evolved | |
Experiment #9: Super Nasty Variant never evolved | |
Experiment #10: Super Nasty Variant never evolved | |
Experiment #11: Super Nasty Variant never evolved | |
Experiment #12: Super Nasty Variant never evolved | |
Experiment #13: Super Nasty Variant never evolved | |
Experiment #14: Super Nasty Variant never evolved | |
Experiment #15: Super Nasty Variant never evolved | |
Experiment #16: Super Nasty Variant never evolved | |
Experiment #17: Super Nasty Variant never evolved | |
Experiment #18: Super Nasty Variant evolved in 943 generations | |
Experiment #19: Super Nasty Variant never evolved | |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment