Avril Coghlan avrilcoghlan

View GitHub Profile
@avrilcoghlan
avrilcoghlan / exercise4_compara.pl
Created December 17, 2013 11:07
Use the Ensembl Compara Perl API to get the families predicted for the human gene ENSG00000139618.
#!/usr/bin/env perl
# Get the families predicted for the human gene ENSG00000139618. What do you notice?
# Note: Families include UniProt proteins and Ensembl genes/proteins.
use strict;
use warnings;
use Bio::EnsEMBL::Registry;
my $registry = 'Bio::EnsEMBL::Registry';
avrilcoghlan / exercise3_compara.pl
Created December 17, 2013 11:02
Use the Ensembl Compara Perl API to get the multiple alignment corresponding to the family with the stable id ENSFM00250000006121
#!/usr/bin/env perl
# Get the multiple alignment corresponding to the family with the stable id ENSFM00250000006121
# Note: this prints some warnings about uninitialised values in the Compara API
use strict;
use warnings;
use Bio::EnsEMBL::Registry;
use Bio::AlignIO;
avrilcoghlan / exercise2a_compara.pl
Created December 17, 2013 10:33
Use the Ensembl Compara Perl API to find and print the sequence of all the peptide Members corresponding to the human protein-coding gene(s) FRAS1.
#!/usr/bin/env perl
# Find and print the sequence of all the peptide Members corresponding to the human protein-coding gene(s) FRAS1.
# Print its attributes using the print_member() method.
# Get all the peptide members and print them as well.
# Print the sequence of these members.
use strict;
use warnings;
use Bio::EnsEMBL::Registry;
avrilcoghlan / exercise1a_compara.pl
Created December 17, 2013 10:04
Perl script to use the Ensembl Compara Perl API to print the sequence of the Member corresponding to SwissProt protein O93279
#!/usr/bin/env perl
# Print the sequence of the Member corresponding to SwissProt protein O93279
use strict;
use warnings;
use Bio::EnsEMBL::Registry;
my $registry = 'Bio::EnsEMBL::Registry';
avrilcoghlan / dijkstra_example2.py
Created November 14, 2013 14:39
Python script to implement Dijkstra's algorithm with a directed graph
# define function 'xrange' (not built in to Python 3)
def xrange(x):
    return iter(range(x))
genes = ['g0', 'g1', 'g2', 'g3', 'g4', 'g5', 'g6', 'g7', 'g8', 'g9', 'g10', 'g11']
# create a 12 x 12 matrix (a list of 12 lists), with every entry initialised to 0
Matrix = [[0 for _ in xrange(12)] for _ in xrange(12)]
avrilcoghlan / dijkstra_example.py
Last active October 8, 2018 18:53
Python script to implement Dijkstra's algorithm for an undirected graph
# define function 'xrange' (not built in to Python 3)
def xrange(x):
    return iter(range(x))
genes = ['g1', 'g2', 'g3', 'g4', 'g5', 'g6', 'g7']
Matrix = [[0 for x in xrange(7)] for x in xrange(7)]
Matrix[1-1][4-1] = 12
Matrix[4-1][1-1] = 12
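The preview above stops after the first matrix entries, so the gist's own search routine is not shown here. As a minimal sketch of how Dijkstra's algorithm can run over an adjacency matrix of this shape, assuming that 0 means "no edge" and that all weights are positive: the function name dijkstra, the helper add_edge, and every edge weight except the g1-g4 weight of 12 are illustrative assumptions, not taken from the gist.

```python
import heapq

def dijkstra(matrix, source):
    """Return the shortest distance from source to every node.

    matrix[i][j] > 0 is the weight of the edge i -> j; 0 means no edge.
    """
    n = len(matrix)
    dist = [float("inf")] * n
    dist[source] = 0
    queue = [(0, source)]  # (distance so far, node index)
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist[u]:
            continue  # stale queue entry, a shorter path was already found
        for v in range(n):
            w = matrix[u][v]
            if w > 0 and d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(queue, (dist[v], v))
    return dist

def add_edge(matrix, i, j, weight):
    """Add an undirected edge by setting both symmetric entries."""
    matrix[i][j] = weight
    matrix[j][i] = weight

# Small undirected example. Only the g1-g4 weight (12) comes from the gist;
# the other two weights are made up for illustration.
genes = ['g1', 'g2', 'g3', 'g4']
matrix = [[0] * len(genes) for _ in genes]
add_edge(matrix, 0, 3, 12)  # g1 - g4
add_edge(matrix, 0, 1, 3)   # g1 - g2 (hypothetical)
add_edge(matrix, 1, 3, 4)   # g2 - g4 (hypothetical)

distances = dijkstra(matrix, 0)
for gene, d in zip(genes, distances):
    print(gene, d)
```

For a directed graph, as in dijkstra_example2.py, the same function should apply if only one of the two symmetric entries is set per edge.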
avrilcoghlan / AvrilGenefindingUtils.pm
Created October 21, 2013 08:22
Perl module with functions to process gene-finding data.
package HelminthGenomeAnalysis::AvrilGenefindingUtils;
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqIO;
use Moose;
use Math::Round; # HAS THE nearest() FUNCTION
use Carp::Assert; # HAS THE assert() FUNCTION
use Scalar::Util qw(looks_like_number);
avrilcoghlan / parse_sff_for_454_linkers.py
Created October 10, 2013 15:37
Python script to parse an SFF file and print how many of the first 100,000 reads have FLX linkers and how many have Titanium linkers
import sys
from Bio import SeqIO
#====================================================================#
# check the command-line arguments:
if len(sys.argv) != 2:
    print("Usage: %s sff_file" % sys.argv[0])
    sys.exit(1)
#====================================================================#
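The counting step described for this script is cut off in the preview. One way it might look, as a sketch only: the function count_linker_reads and the linker sequences below are hypothetical placeholders (they are not the real 454 FLX or Titanium linkers), and reads are passed in as plain strings. With Biopython, the strings could presumably be taken from the records returned by SeqIO.parse(sff_file, "sff").

```python
from itertools import islice

def count_linker_reads(reads, linkers, max_reads=100000):
    """Count, per linker type, how many of the first max_reads reads
    contain at least one of that type's linker sequences."""
    counts = {name: 0 for name in linkers}
    for read in islice(reads, max_reads):
        read = read.upper()
        for name, seqs in linkers.items():
            if any(seq in read for seq in seqs):
                counts[name] += 1
    return counts

# HYPOTHETICAL placeholder sequences, for illustration only.
linkers = {
    "FLX": ["AAAACCCC"],
    "Titanium": ["GGGGTTTT", "TTTTGGGG"],
}
reads = ["ACGTAAAACCCCACGT", "ggggttttAC", "ACGTACGT", "TTTTGGGGAAAACCCC"]

counts = count_linker_reads(reads, linkers)
for name, n in counts.items():
    print("%s linkers: %d reads" % (name, n))
```

A read that contains both kinds of linker is counted once for each type, which may or may not match what the original script does.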
avrilcoghlan / gist:6729918
Created September 27, 2013 14:58
Perl script that splits up an input fasta file into smaller files with n sequences each.
#!/usr/bin/env perl
=head1 NAME
split_up_fasta.pl
=head1 SYNOPSIS
split_up_fasta.pl input_fasta num_seqs_per_file prefix outputdir
where input_fasta is the input fasta file,
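The Perl source is truncated here. As a rough Python sketch of the same idea (writing every n sequences to a new file), using the four arguments named in the synopsis: the function names read_fasta and split_fasta and the output naming scheme (prefix followed by a running number) are assumptions for illustration and may differ from what split_up_fasta.pl does.

```python
import os

def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def split_fasta(input_fasta, num_seqs_per_file, prefix, outputdir):
    """Write num_seqs_per_file records per output file; return the paths written."""
    os.makedirs(outputdir, exist_ok=True)
    paths = []
    out = None
    for i, (header, seq) in enumerate(read_fasta(input_fasta)):
        if i % num_seqs_per_file == 0:
            # Start a new output file every num_seqs_per_file records.
            if out is not None:
                out.close()
            # Hypothetical naming scheme: prefix1, prefix2, ...
            path = os.path.join(outputdir, "%s%d" % (prefix, len(paths) + 1))
            out = open(path, "w")
            paths.append(path)
        out.write(">%s\n%s\n" % (header, seq))
    if out is not None:
        out.close()
    return paths
```

Under these assumptions, split_fasta("input.fa", 100, "chunk", "out") would write out/chunk1, out/chunk2, and so on, with the last file holding whatever records remain.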
avrilcoghlan / merge_overlapping_exons.pl
Created August 27, 2013 14:26
Perl script that, given an input gff file, identifies exons that have a 0 bp intron between them, or are overlapping exons, and merges them.
#!/usr/bin/env perl
=head1 NAME
merge_overlapping_exons.pl
=head1 SYNOPSIS
merge_overlapping_exons.pl input_gff output_gff outputdir input_fasta
where input_gff is the input gff file,
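The merging logic itself is not visible in this preview. A minimal sketch of the core step, assuming exons are given as 1-based inclusive (start, end) pairs that all belong to the same transcript and strand: two exons are merged when they overlap or when the intron between them is 0 bp long (the next exon starts at the previous end + 1). The function name merge_exons is hypothetical, and GFF parsing and writing are left out.

```python
def merge_exons(exons):
    """Merge exons that overlap or are separated by a 0 bp intron.

    exons: iterable of (start, end) pairs, 1-based inclusive coordinates.
    Returns a sorted list of merged (start, end) pairs.
    """
    merged = []
    for start, end in sorted(exons):
        # A 0 bp intron means start == previous end + 1; overlap means start <= previous end.
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(exon) for exon in merged]

exons = [(100, 200), (201, 300), (250, 400), (500, 600)]
merged = merge_exons(exons)
print(merged)  # [(100, 400), (500, 600)]
```

Here (100, 200) and (201, 300) are joined because no bases lie between them, (250, 400) is joined because it overlaps the growing exon, and (500, 600) stays separate.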