Skip to content

Instantly share code, notes, and snippets.

@dwinter
Created June 13, 2016 21:35
Show Gist options
  • Save dwinter/579bbb7a80c3e5b15a9d9db1a86d2ab2 to your computer and use it in GitHub Desktop.
Save dwinter/579bbb7a80c3e5b15a9d9db1a86d2ab2 to your computer and use it in GitHub Desktop.
Parrelizable codeml?
#for seq in alignments/ENSG*.best.fas
#do
# There is some evidence that forcing trees to fit the spp tree will inflate
# dN/dS estimates. Better to estimate the tree every time. At the same time,
# label the human branch and delet branch lengths which aren't useful
gene=`basename ${seq%%.*}`
raxml -s ${seq} -T 2 -p 1231 -m GTRGAMMA -n ${gene}_delete_me
sed -e "s/:0.[0-9]*//g" -e "s/sapiens/sapiens#1/g" RAxML_bestTree.${gene}_delete_me >${gene}.tr
rm *.${gene}_delete_me
#now run codeml using the template control file update with the current seq
echo -e 'seqfile = '${seq}'\n treefile = '${gene}'.tr\n outfile = '${gene}'_branch.txt' | cat - codeml_branch_template.ctl > ${gene}_branch.ctl
echo -e 'seqfile = '${seq}'\n treefile = '${gene}'.tr\n outfile = '${gene}'_null.txt' | cat - codeml_null_template.ctl > ${gene}_null.ctl
codeml ${gene}_branch.ctl
codeml ${gene}_null.ctl
#get the good stuff from the results files
scripts/extract_codeml.py ${gene} >> results/codeml_res.csv
#including the number of non-synonymous changes in the human branch
#grep "(homo_sapiens)" rst | sed "s/.*n=\([ 0-9]*.[0-9]*\).*/${gene}\t\1/" >> results/num_non_syn.tsv
#and clean up the god-awful mess it makes
#rm 2NG.*
#rm 4fold.nuc
#rm lnf
#rm rst*
#rm rub
# ...any my mess too
rm ${gene}.tr
rm ${gene}*.ctl
rm ${gene}_branch.txt
rm ${gene}_null.txt
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment