Andrew Yates andrewyatz

  • EMBL-EBI
  • Cambridge, UK
@andrewyatz
andrewyatz / ucsc_binary_downloader.pl
Created April 13, 2011 15:50
A script that downloads all x86_64 binaries from UCSC's download server. You can invoke it as ./ucsc_binary_downloader.pl or ./ucsc_binary_downloader.pl blat. The first form downloads everything in the root dir of linux.x86_64 and the second will…
#!/usr/bin/env perl
use strict;
use warnings;
use WWW::Mechanize;
use LWP::Simple;
my $href = 'http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/';
if($ARGV[0]) {
andrewyatz / synteny_to_psl.pl
Created May 3, 2011 08:38
A script for extracting synteny regions from Ensembl & Ensembl Genomes databases into PSL (UCSC/Blat alignment report format).
#!/usr/bin/env perl
use strict;
use warnings;
use Getopt::Long;
use Pod::Usage;
use Bio::EnsEMBL::Registry;
use Bio::EnsEMBL::Utils::Scalar qw(wrap_array);
use IO::String;
andrewyatz / funky_db_work_with_ensembl.pl
Created May 24, 2011 20:30
Example of connecting to an Ensembl database using a registry, getting the given species, and running SQL four ways
use strict;
use warnings;
use Bio::EnsEMBL::Registry;
use Bio::EnsEMBL::Utils::SqlHelper;
#Set to the registry location
my $registry_location = $ARGV[0];
#Set this to the target species e.g. e_coli_k12
my $species = $ARGV[1];
andrewyatz / dump_every_genetree.pl
Created June 7, 2011 15:52
Two ways of getting alignments of clusters from a compara database and writing these out to a file.
use strict;
use warnings;
use Bio::EnsEMBL::Registry;
use Bio::AlignIO;
Bio::EnsEMBL::Registry->load_registry_from_db(
-HOST => 'mysql.ebi.ac.uk',
-PORT => 4157,
-USER => 'anonymous',
-DB_VERSION => 62
andrewyatz / phyloxml_to_fasta.pl
Created June 7, 2011 16:41
Converting a PhyloXML file to a FASTA alignment
#Only supports a PhyloXML file with only one phylogeny element. This is intended as an extension point
use strict;
use warnings;
use XML::Twig;
my $file = '/Users/ayates/Desktop/tmp.xml';
my $id;
my $seqs = [];
andrewyatz / phyloxml_to_msa.pl
Created June 8, 2011 09:44
Convert a PhyloXML file to any BioPerl supported alignment format
#!/usr/bin/env perl
use strict;
use warnings;
use Bio::AlignIO;
use Bio::LocatableSeq;
use Bio::SimpleAlign;
use IO::Handle;
use IO::File;
andrewyatz / dump_homology_tsv.pl
Created June 8, 2011 16:14
Script used to dump homology information into a tab-separated values (TSV) file
#!/usr/bin/env perl
use strict;
use warnings;
use Bio::EnsEMBL::ApiVersion;
use Bio::EnsEMBL::Utils::SqlHelper;
use Bio::EnsEMBL::Registry;
use File::Spec;
use Getopt::Long;
andrewyatz / dump_xrefs_tsv.pl
Created June 9, 2011 12:01
Retrieve an Xref mapping file from any Ensembl-based database
#!/usr/bin/env perl
use strict;
use warnings;
use Bio::EnsEMBL::ApiVersion;
use Bio::EnsEMBL::Utils::SqlHelper;
use Bio::EnsEMBL::Registry;
use File::Spec;
use Getopt::Long;

(a gist based on the old toolmantim article on setting up remote repos)

To collaborate in a distributed development process you’ll need to push code to remotely accessible repositories.

This is something of a follow-up to the previous article, setting up a new Rails app with Git.

For the impatient

Set up the new bare repo on the server:
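The commands for this step are not shown in this listing. As a rough sketch of the idea only, and not the gist's own commands, the following creates a bare repository and pushes a first commit to it. A temporary local directory stands in for the server, and the names `myapp`, `server_repo` and `local_repo` are made up for illustration. Against a real server the remote URL would be an SSH URL instead of a local path.

```shell
#!/bin/sh
set -eu

# Hypothetical layout: a temp directory stands in for the remote server.
work=$(mktemp -d)
server_repo="$work/server/myapp.git"
local_repo="$work/local/myapp"

# "Server" side: create an empty bare repository (no working tree).
mkdir -p "$server_repo"
git init --quiet --bare "$server_repo"

# Local side: create a repository with a single commit.
mkdir -p "$local_repo"
cd "$local_repo"
git init --quiet
echo "hello" > README
git add README
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "Initial commit"

# Register the bare repo as a remote and push the current branch to it.
# Over SSH the URL would look like ssh://myserver.example/path/to/myapp.git
git remote add origin "$server_repo"
git push --quiet origin HEAD

# List the commits the bare repository now holds.
git --git-dir="$server_repo" log --oneline --all
```

Keeping the server copy bare avoids having a checked-out working tree there that could drift out of step with pushed commits.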

andrewyatz / find_upi.groovy
Created July 20, 2011 11:05
Uses the UniProtJAPI and BioJava3 to process a FASTA file and then outputs the appropriate UPI.
#!/usr/bin/env groovy
@GrabResolver(name='ebi', root='http://www.ebi.ac.uk/~maven/m2repo')
@Grab(group='uk.ac.ebi.uniprot.kraken', module='uniprotjapi', version='2011.07')
@Grab(group='org.biojava', module='biojava3-core', version='3.0')
@Grab(group='com.google.code.gson', module='gson', version='1.7.1')
import org.biojava3.core.sequence.io.FastaReaderHelper
import org.biojava3.core.sequence.template.SequenceMixin