Created
January 24, 2016 20:50
-
-
Save cjbayesian/468725f4bd7d6f3f027d to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
################################################################################################# | |
#### This is a simulation to demonstrate how real populations reach Hardy Weinburg equilibrium | |
#### under random mating. | |
#### Author: Corey Chivers, 2011 | |
################################################################################################# | |
cross<-function(parents) | |
{ | |
offspring<-c('d','d') #initiate a child object | |
offspring[1]<-sample(parents[1,],1) | |
offspring[2]<-sample(parents[2,],1) | |
return(offspring) | |
} | |
random_mating<-function() | |
{ | |
tmp_pop<-pop | |
for(n in 1:N) | |
{ | |
parents<-sample(1:N,2) | |
tmp_pop[n,]<-cross(pop[parents,]) | |
} | |
pop<-tmp_pop | |
} | |
genotypes<-c('A','a') | |
p<-0.5 | |
q<-1-p | |
N=200 | |
a_freq<-c(p,q) | |
pop<-array(sample(genotypes,2*N,p=a_freq,replace=T),dim=c(N,2)) | |
I<-200 | |
num_generations=1 | |
g_freq<-array(dim=c(I,3)) | |
p_vec<-array(dim=I) | |
for(i in 1:I) | |
{ | |
p<-runif(1,0,1) | |
q<-1-p | |
a_freq<-c(p,q) | |
pop[,1]<-sample(genotypes,N,p=a_freq,replace=T) | |
pop[,2]<-sample(genotypes,N,p=a_freq,replace=T) | |
for(g in 1:num_generations) | |
random_mating() | |
f_aa<-0 | |
f_Aa<-0 | |
f_AA<-0 | |
for(n in 1:N) | |
{ | |
if(identical(pop[n,],c('A','A'))) | |
f_AA=f_AA+1 | |
if(identical(pop[n,],c('A','a')) || identical(pop[n,],c('a','A') )) | |
f_Aa=f_Aa+1 | |
if(identical(pop[n,],c('a','a'))) | |
f_aa=f_aa+1 | |
} | |
f_aa<-f_aa/N | |
f_Aa<-f_Aa/N | |
f_AA<-f_AA/N | |
g_freq[i,]<-c(f_AA,f_Aa,f_aa) | |
p_vec[i]<-p | |
} | |
pdf('HW.pdf') | |
## Plot the sims | |
plot(p_vec,g_freq[,1],col='green',xlab='p, or 1-q',ylab='') | |
points(p_vec,g_freq[,2],col='red') | |
points(p_vec,g_freq[,3],col='blue') | |
## Theoretical Curves | |
curve(x^2,col='green',add=T) | |
curve((1-x)^2,col='blue',add=T) | |
curve(2*x*(1-x),col='red',add=T) | |
dev.off() | |
I
represents the number of populations to simulate, each with a random fraction p
. N
is the size of each population. So, if you increase I
, you'll simulate more populations. If you increase N
, you'll increase the number of individuals in each population. The bigger the population, the closer the genotype proportions will be to those predicted by HW.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Thank you for your code and effort. I am adapting your code for a Shiny app. I want to clarify why you used
I
to dimension your arrays instead ofN
. I assume you wanted to ensure 200 points are plotted every time. I also assume that it is coincidence thatI
andN
are the same values inititally. (I read your blog post stating that your students changed the value ofN
.)If I set
I = N
, and setN
to some value greater than 200, then the number of plotted points increases, but if I leaveI = 200
, then I get what appears to be 200 points, regardless ofN
. Would you mind clarifying your use ofI
in this context? Thanks again!