Avril Coghlan avrilcoghlan
avrilcoghlan / list_treefam_genes3.pl
Created March 1, 2013 15:17
Perl script that connects to the TreeFam MySQL database and prints a list of the Caenorhabditis elegans and Caenorhabditis briggsae genes in TreeFam families.
#!/usr/local/bin/perl
#
# Perl script list_treefam_genes3.pl
# Written by Avril Coghlan ([email protected]).
# 18-JAN-06.
# Updated 6-Dec-07.
#
# For the TreeFam project.
#
avrilcoghlan / find_schisto_paralogs.pl
Created March 1, 2013 15:23
Perl script that, given an NHX-format tree file for a TreeFam family, finds Schistosoma mansoni/S. japonicum/Nematostella vectensis paralog pairs, and gives the ancestral taxon in which the duplication giving rise to the paralogs occurred.
#!/usr/local/bin/perl
#
# Perl script find_schisto_paralogs.pl
# Written by Avril Coghlan ([email protected])
# 22-Dec-08.
#
# This perl script reads in a tree and finds S. mansoni/
# S. japonicum/Nematostella vectensis paralog pairs, and gives the
# ancestral taxon in which the duplication occurred.
avrilcoghlan / get_trees2.pl
Created March 1, 2013 15:27
Perl script that gets the TreeFam clean tree for a family.
#!/usr/local/bin/perl
#
# Perl script get_trees2.pl
# Written by Avril Coghlan ([email protected])
# 6-Mar-07.
# Updated 6-Dec-07.
#
# For the TreeFam project.
#
avrilcoghlan / badgenes_in_alns2.pl
Created March 1, 2013 15:37
Perl script that reads in a fasta-format alignment, and finds sequences that align to <x% of the alignment length.
#!/usr/local/bin/perl
#
# Perl script badgenes_in_alns2.pl
# Written by Avril Coghlan ([email protected]).
# 6-Oct-05.
#
# For the TreeFam project.
#
# This reads in an alignment, and finds sequences that align to
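
The check described for badgenes_in_alns2.pl can be illustrated with a minimal sketch. This is not the original script: the helper name `poorly_aligned`, the in-memory hash of sequences, and the choice to treat both '-' and '.' as gap characters are assumptions made for illustration. It also assumes that "align to <x% of the alignment length" means that a sequence's non-gap residues cover less than x% of the alignment columns.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): return the IDs of
# sequences whose non-gap residues cover less than $min_pc percent of the
# alignment length. Gap characters are assumed to be '-' and '.'.
sub poorly_aligned {
    my ($aln, $min_pc) = @_;    # $aln: hash ref of id => aligned sequence
    my @bad;
    for my $id (sort keys %$aln) {
        my $seq     = $aln->{$id};
        my $aln_len = length $seq;
        next unless $aln_len;                  # skip empty sequences
        my $residues = ($seq =~ tr/-.//c);     # count non-gap characters
        push @bad, $id if 100 * $residues / $aln_len < $min_pc;
    }
    return @bad;
}

# Toy alignment of three sequences, each 10 columns long.
my %aln = (
    geneA => 'MKTAYIAKQR',
    geneB => 'MK------QR',
    geneC => '--------QR',
);
print join(',', poorly_aligned(\%aln, 50)), "\n";    # prints geneB,geneC
```

With a 50% threshold, geneB (4 of 10 columns) and geneC (2 of 10 columns) are reported, while geneA is kept.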
avrilcoghlan / check_if_adjacent_genes_are_paralogs.pl
Created March 1, 2013 15:40
Perl script that, given a gff file of Caenorhabditis elegans genes, uses TreeFam to check whether adjacent genes are paralogs.
#!/usr/local/bin/perl
#
# Perl script check_if_adjacent_genes_are_paralogs.pl
# Written by Avril Coghlan ([email protected])
# 10-Dec-07.
#
# This Perl script checks whether adjacent genes are paralogs for
# each gene in the ngasp benchmark gene set.
#
avrilcoghlan / check_if_have_treefam_ortholog.pl
Created March 1, 2013 15:42
Perl script that, given a gff file of Caenorhabditis elegans genes, finds their C. briggsae, human and yeast orthologs from TreeFam.
#!/usr/local/bin/perl
#
# Perl script check_if_have_treefam_ortholog.pl
# Written by Avril Coghlan ([email protected])
# 4-Dec-07.
#
# This Perl script finds the TreeFam ortholog in C. briggsae, human and yeast
# for each gene in the ngasp benchmark gene set.
#
avrilcoghlan / find_human_paralogs.pl
Created March 1, 2013 15:56
Perl script that retrieves trees from the TreeFam database, and infers human within-species paralogs from the trees.
#!/usr/local/bin/perl
#
# Perl script find_human_paralogs.pl
# Written by Avril Coghlan ([email protected])
# 31-Jul-07.
#
# For Christine Bird.
# This perl script reads in all trees in TreeFam-4
# and finds human within-species paralogs.
avrilcoghlan / find_human_paralog_alns.pl
Created March 1, 2013 16:00
Perl script that, for a particular TreeFam family of interest (that has human paralogs), prints out the DNA alignment for the family, with the position of introns shown with respect to the DNA alignment.
#!/usr/local/bin/perl
#
# Perl script find_human_paralog_alns.pl
# Written by Avril Coghlan ([email protected])
# 29-Aug-07.
#
# For Christine Bird.
# This gets the DNA alignments for TreeFam families
# that have within-species paralogs.
avrilcoghlan / find_intron_cons_treefam_ortholog.pl
Created March 1, 2013 16:04
Perl script that, given a gff file for Caenorhabditis elegans, finds the fraction of introns in each Caenorhabditis elegans gene that are shared in position in the ortholog (in TreeFam) in C. briggsae, human or yeast.
#!/usr/local/bin/perl
#
# Perl script find_intron_cons_treefam_ortholog.pl
# Written by Avril Coghlan ([email protected])
# 7-Dec-07.
#
# This Perl script finds the fraction of introns in a C. elegans
# gene that are shared in position in the TreeFam ortholog in C. briggsae,
# human and yeast, for each C. elegans gene in the ngasp benchmark gene set.
avrilcoghlan / find_pc_id_to_treefam_ortholog.pl
Created March 1, 2013 16:07
Perl script that, given a gff file of Caenorhabditis elegans genes, retrieves their orthologs in C. briggsae, human and yeast from the TreeFam database, and finds the percent identity between each C. elegans gene and each of its orthologs.
#!/usr/local/bin/perl
#
# Perl script find_pc_id_to_treefam_ortholog.pl
# Written by Avril Coghlan ([email protected])
# 4-Dec-07.
#
# This Perl script finds the %id to the TreeFam ortholog in C. briggsae, human and yeast
# for each gene in the ngasp benchmark gene set.
#
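
The percent-identity step described for find_pc_id_to_treefam_ortholog.pl can be sketched as follows. This is an illustration only, not the original script: the helper name `percent_identity` is hypothetical, and the definition used here (identical columns divided by the columns in which neither sequence has a gap) is an assumption, since the listing does not say how the script defines %id.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): percent identity
# between two aligned sequences of equal length. Columns in which either
# sequence has a gap ('-') are skipped; this definition is an assumption.
sub percent_identity {
    my ($seq1, $seq2) = @_;
    die "aligned sequences must have equal length\n"
        unless length($seq1) == length($seq2);
    my ($identical, $compared) = (0, 0);
    for my $i (0 .. length($seq1) - 1) {
        my $x = uc substr($seq1, $i, 1);
        my $y = uc substr($seq2, $i, 1);
        next if $x eq '-' || $y eq '-';    # ignore gapped columns
        $compared++;
        $identical++ if $x eq $y;
    }
    # Return 0 when no column could be compared, to avoid dividing by zero.
    return $compared ? 100 * $identical / $compared : 0;
}

# Six columns are compared (one is gapped) and four are identical.
printf "%.1f\n", percent_identity('MKT-LLV', 'MKSALLI');    # prints 66.7
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so values from this sketch may differ from those the original script reports.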