Brant Faircloth brantfaircloth

@brantfaircloth
brantfaircloth / list-comprehension.py
Last active December 21, 2015 15:59
casting a numpy array of strings to int (examples)
import numpy
s = '40 40 40 40 40'
sl = s.rstrip().split(' ')
si = [int(elem) for elem in sl]
sa = numpy.array(si)
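
The snippet above converts the strings in a list comprehension before building the array. As an alternative sketch (an assumption about what the gist's other "examples" might cover, not taken from the gist itself), the array of strings can be built first and then cast with `astype`:

```python
import numpy

s = '40 40 40 40 40'

# split() with no argument also tolerates repeated spaces and a trailing newline
strings = numpy.array(s.split())

# astype(int) parses each string element and returns a new integer array
ints = strings.astype(int)

print(ints)
```

Either route yields an integer array; `astype` simply avoids the intermediate Python list of ints.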
brantfaircloth / faircloth-et-al-2013.rst
Last active December 16, 2015 21:09
Steps used for data analysis in Faircloth et al. 2013 "A phylogenomic perspective on the radiation of ray-finned fishes based upon targeted sequencing of ultraconserved elements (UCEs)"
brantfaircloth / 1-tbl2asn.rst
Created December 18, 2012 06:34
Preparing sequences for Genbank using tbl2asn, blast, and a bit of custom vector screening

This is primarily directed towards preparing large amounts of UCE data for Genbank. However, parts of the following should work with most NGS data sets and other types of sequence data. Programs within phyluce are available from:

https://github.com/faircloth-lab/phyluce

Sequin will trim vector contamination, but it cannot handle huge files (nor would you want it to try). The vector screening steps below therefore attempt to be equivalent to Sequin's process.

brantfaircloth / non-model-snps.rst
Last active May 16, 2018 19:11
Calling SNPs with GATK in non-model taxa
brantfaircloth / convert-bcl-to-fastq.rst
Created July 16, 2012 23:44
Convert BCL files to fastq

Install dependencies and Casava

The following assumes you are converting BCL files containing PE100 reads with a 10 nt index read. You can let Casava demultiplex for you or do it yourself later. You can adjust the values below if you are doing something different (e.g. shorter reads, longer indexes), but be careful.

  • You need a pretty beefy machine. Illumina recommends something with multiple cores and 48 GB RAM, running CentOS 5. CentOS 6 also works just fine. See their recommendations here:
brantfaircloth / ephemeral.rst
Created July 9, 2012 22:47
Mount ephemeral storage on AWS

# start the instance:

ec2-run-instances --key /path/to/my/ec2-keypair ami-74f0061d --instance-type=c1.xlarge --block-device-mapping '/dev/sda2=ephemeral0' --block-device-mapping '/dev/sda3=ephemeral1'

# mount the ephemeral storage:

sudo su
mkdir /mnt/data
mount /dev/sda2 /mnt/data
brantfaircloth / split_concat_nexus.py
Created June 26, 2012 22:01
Splitting a concatenated nexus file with biopython
from Bio.Nexus import Nexus
aln = Nexus.Nexus()
aln.read('my-properly-formatted-nexus-file.nex')
# assuming your partitions are defined in a charset block like:
#
# begin sets;
# charset bag2 = 1-186;
# charset bag3 = 187-483;
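
The preview above stops inside the charset comment. To illustrate how such a block maps onto per-partition alignments, here is a minimal pure-Python sketch. It does not use Biopython and is not the gist's code: `parse_charsets`, `split_alignment`, and the two toy taxa are hypothetical names and data introduced only for illustration, and it assumes simple contiguous `start-end` ranges.

```python
import re


def parse_charsets(sets_block):
    """Return {name: (start, end)} from 'charset name = start-end;' lines.

    Positions are 1-based and inclusive, as in a nexus sets block.
    """
    pattern = re.compile(
        r'charset\s+(\S+)\s*=\s*(\d+)\s*-\s*(\d+)\s*;', re.IGNORECASE
    )
    return {
        name: (int(start), int(end))
        for name, start, end in pattern.findall(sets_block)
    }


def split_alignment(sequences, charsets):
    """Slice every taxon's sequence into one sub-alignment per charset."""
    parts = {}
    for name, (start, end) in charsets.items():
        # convert 1-based inclusive coordinates to a Python slice
        parts[name] = {
            taxon: seq[start - 1:end] for taxon, seq in sequences.items()
        }
    return parts


sets_block = """
begin sets;
charset bag2 = 1-186;
charset bag3 = 187-483;
end;
"""

# toy concatenated alignment: 186 + 297 = 483 columns per taxon
sequences = {
    'taxon_a': 'A' * 186 + 'C' * 297,
    'taxon_b': 'G' * 186 + 'T' * 297,
}

charsets = parse_charsets(sets_block)
parts = split_alignment(sequences, charsets)
```

Each entry in `parts` is a dictionary of taxon to sequence for one partition, which could then be written out as its own file.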
brantfaircloth / mpi_sate.py
Created May 6, 2012 01:53
Run SATe over MPI using mpi4py_map
#!/usr/bin/env python
# encoding: utf-8
"""
File: mpi_sate.py
Author: Brant Faircloth
Created by Brant Faircloth on 04 May 2012 15:05 PDT (-0700)
Copyright (c) 2012 Brant C. Faircloth. All rights reserved.
Description:
brantfaircloth / mpi_file_passing.py
Created April 13, 2012 19:43
mpi4py "file-passing" scatter/gather example
import os
import tempfile
from mpi4py import MPI
comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()
mode = MPI.MODE_RDONLY
  • start up ARDAgent (on remote machine via ssh):

    sudo /System/Library/CoreServices/RemoteManagement/ARDAgent.app/Contents/Resources/kickstart -activate \
        -configure -users bcf -access -on -restart -agent -privs -all -allowAccessFor -specifiedUsers
    
  • start tunnel (from local to remote):

    ssh -i keyfile -NfL 1202:127.0.0.1:5900 [email protected]
    
  • connect w/ (on local machine):