Skip to content

Instantly share code, notes, and snippets.

@cjbayesian
Created January 24, 2016 20:50
Show Gist options
  • Save cjbayesian/468725f4bd7d6f3f027d to your computer and use it in GitHub Desktop.
Save cjbayesian/468725f4bd7d6f3f027d to your computer and use it in GitHub Desktop.
#################################################################################################
#### This is a simulation to demonstrate how real populations reach Hardy Weinburg equilibrium
#### under random mating.
#### Author: Corey Chivers, 2011
#################################################################################################
cross<-function(parents)
{
offspring<-c('d','d') #initiate a child object
offspring[1]<-sample(parents[1,],1)
offspring[2]<-sample(parents[2,],1)
return(offspring)
}
random_mating<-function()
{
tmp_pop<-pop
for(n in 1:N)
{
parents<-sample(1:N,2)
tmp_pop[n,]<-cross(pop[parents,])
}
pop<-tmp_pop
}
genotypes<-c('A','a')
p<-0.5
q<-1-p
N=200
a_freq<-c(p,q)
pop<-array(sample(genotypes,2*N,p=a_freq,replace=T),dim=c(N,2))
I<-200
num_generations=1
g_freq<-array(dim=c(I,3))
p_vec<-array(dim=I)
for(i in 1:I)
{
p<-runif(1,0,1)
q<-1-p
a_freq<-c(p,q)
pop[,1]<-sample(genotypes,N,p=a_freq,replace=T)
pop[,2]<-sample(genotypes,N,p=a_freq,replace=T)
for(g in 1:num_generations)
random_mating()
f_aa<-0
f_Aa<-0
f_AA<-0
for(n in 1:N)
{
if(identical(pop[n,],c('A','A')))
f_AA=f_AA+1
if(identical(pop[n,],c('A','a')) || identical(pop[n,],c('a','A') ))
f_Aa=f_Aa+1
if(identical(pop[n,],c('a','a')))
f_aa=f_aa+1
}
f_aa<-f_aa/N
f_Aa<-f_Aa/N
f_AA<-f_AA/N
g_freq[i,]<-c(f_AA,f_Aa,f_aa)
p_vec[i]<-p
}
pdf('HW.pdf')
## Plot the sims
plot(p_vec,g_freq[,1],col='green',xlab='p, or 1-q',ylab='')
points(p_vec,g_freq[,2],col='red')
points(p_vec,g_freq[,3],col='blue')
## Theoretical Curves
curve(x^2,col='green',add=T)
curve((1-x)^2,col='blue',add=T)
curve(2*x*(1-x),col='red',add=T)
dev.off()
@mtaylor-semo
Copy link

Thank you for your code and effort. I am adapting your code for a Shiny app. I want to clarify why you used I to dimension your arrays instead of N. I assume you wanted to ensure 200 points are plotted every time. I also assume that it is coincidence that I and N are the same values inititally. (I read your blog post stating that your students changed the value of N.)

If I set I = N, and set N to some value greater than 200, then the number of plotted points increases, but if I leave I = 200, then I get what appears to be 200 points, regardless of N. Would you mind clarifying your use of I in this context? Thanks again!

@cjbayesian
Copy link
Author

I represents the number of populations to simulate, each with a random fraction p. N is the size of each population. So, if you increase I, you'll simulate more populations. If you increase N, you'll increase the number of individuals in each population. The bigger the population, the closer the genotype proportions will be to those predicted by HW.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment