Aaron Quinlan arq5x
@arq5x
arq5x / go.sh
Created April 9, 2014 18:15
Track GitHub repo release download info, etc.
curl -i https://api.github.com/repos/arq5x/bedtools2/releases
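
The JSON body returned by that endpoint is a list of releases, each with a `tag_name` and a list of `assets` carrying a `download_count`. A minimal sketch of totalling downloads per release follows; it assumes the body has been saved without the `-i` response headers, and the sample data is hand-written for illustration, not real bedtools2 numbers.

```python
import json


def download_counts(releases):
    """Map each release tag to the summed download_count of its assets."""
    return {
        rel["tag_name"]: sum(a["download_count"] for a in rel.get("assets", []))
        for rel in releases
    }


# Hypothetical sample shaped like the API response (tags and counts are made up).
sample = json.loads("""
[
  {"tag_name": "v0.0.1-example",
   "assets": [{"name": "a.tar.gz", "download_count": 3},
              {"name": "b.tar.gz", "download_count": 4}]},
  {"tag_name": "v0.0.2-example", "assets": []}
]
""")

for tag, total in download_counts(sample).items():
    print(tag, total)
```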
@arq5x
arq5x / summary.md
Last active August 29, 2015 13:57
Discrepancies in PolyPhen2 predictions.

Example

Nonsynonymous change in question (human, build 37; 1-based coordinate)
chrom: chr22 
pos:   24379402
ref:   T
alt:   G
@arq5x
arq5x / inheritance_scenarios.md
Last active March 28, 2022 21:45
mendelian violations
| dad | mom | kid | Inheritance description |
| --- | --- | --- | --- |
| HOM_REF | HOM_REF | HOM_REF | Expected |
| HOM_REF | HOM_REF | HET | Mendelian violation (plausible de novo) |
| HOM_REF | HOM_REF | HOM_ALT | Mendelian violation (implausible de novo) |
| HOM_REF | HOM_ALT | HOM_REF | Mendelian violation (uniparental disomy) |
| HOM_REF | HOM_ALT | HET | Expected |
| HOM_REF | HOM_ALT | HOM_ALT | Mendelian violation (uniparental disomy) |
| HOM_REF | HET | HOM_REF | Expected |
| HOM_REF | HET | HET | Expected |
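
The Expected / violation split in the rows above follows from one rule: the kid must receive exactly one allele from each parent. A minimal sketch of that check is below; the function names are hypothetical, it covers only biallelic autosomal sites, and it does not attempt the finer labels (de novo, uniparental disomy) used in the table.

```python
# Number of alternate alleles carried by each genotype class.
ALT_COUNT = {"HOM_REF": 0, "HET": 1, "HOM_ALT": 2}


def transmissible(genotype):
    """Alt-allele counts (0 or 1) that a parent with this genotype can pass on."""
    n = ALT_COUNT[genotype]
    if n == 0:
        return {0}
    if n == 2:
        return {1}
    return {0, 1}


def is_expected(dad, mom, kid):
    """True if the kid's genotype is consistent with Mendelian inheritance."""
    possible = {d + m for d in transmissible(dad) for m in transmissible(mom)}
    return ALT_COUNT[kid] in possible


print(is_expected("HOM_REF", "HOM_ALT", "HET"))      # consistent trio
print(is_expected("HOM_REF", "HOM_REF", "HOM_ALT"))  # violation
```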
@arq5x
arq5x / cadd_compress.sh
Last active March 30, 2017 09:09
Various attempts at compressing the raw CADD datasets for use with GEMINI (and by others).
# Download the raw CADD TSV and Tabix index (no annotations, just scores)
wget http://krishna.gs.washington.edu/download/CADD/v1.0/whole_genome_SNVs.tsv.gz
wget http://krishna.gs.washington.edu/download/CADD/v1.0/whole_genome_SNVs.tsv.gz.tbi
# It is big: 79 GB
ls -ltrh whole_genome_SNVs.tsv.gz
-rw-r--r-- 1 arq5x users 79G Sep 26 01:44 whole_genome_SNVs.tsv.gz
# for testing, let's play with the chr22 intervals
tabix whole_genome_SNVs.tsv.gz 22 | bgzip > whole_genome_SNVs.tsv.22.gz
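
For readers without tabix at hand, the same chromosome subset can be pulled by streaming the file, since bgzip output is readable as ordinary gzip. This is only a rough sketch: it scans the whole file rather than using the index, so it is far slower than the `tabix` call above, and it assumes the chromosome name is the first tab-separated column and that header lines start with `#`.

```python
import gzip


def rows_for_chrom(path, chrom):
    """Yield tab-split rows whose first column equals chrom, skipping '#' headers."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue
            fields = line.rstrip("\n").split("\t")
            if fields[0] == chrom:
                yield fields
```

Usage would look like `for row in rows_for_chrom("whole_genome_SNVs.tsv.gz", "22"): ...`, with the file name taken from the commands above.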
@arq5x
arq5x / macs.sh
Last active August 29, 2015 13:56
Examples for generating haplotypes with Macs
DL: https://code.google.com/p/macs/
# simulate:
# 100 individuals (200 haplotypes)
# "genome" is 1Mb (1e6)
# mutation and recombination rate at 0.001
macs 200 1e6 -T -t .001 -r .001 > 200.macs
# peek at file:
grep SITE: 200.macs | head
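
A possible next step is to summarise the `SITE:` lines as derived-allele frequencies. The sketch below assumes that the last whitespace-separated field of each `SITE:` line is a string of 0/1 characters, one per haplotype; that layout is an assumption about the macs output, so check it against the `head` output above before relying on it.

```python
def alt_allele_frequencies(lines):
    """Return the fraction of '1' haplotypes for every SITE: line."""
    freqs = []
    for line in lines:
        if not line.startswith("SITE:"):
            continue
        # Assumed layout: the final field is the 0/1 haplotype string.
        haplotypes = line.split()[-1]
        freqs.append(haplotypes.count("1") / len(haplotypes))
    return freqs


# Hypothetical lines for illustration only (4 haplotypes, 2 sites).
example = [
    "COMMAND:\tmacs 4 1e6",
    "SITE:\t0\t0.1\t0.01\t0011",
    "SITE:\t1\t0.2\t0.02\t0001",
]
print(alt_allele_frequencies(example))
```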
@arq5x
arq5x / AFS_analysis_for_311_samples.ipynb
Last active January 2, 2016 01:39
T1D-RA targeted regulatory sequencing
@arq5x
arq5x / cell-line-workflow.sh
Last active December 31, 2015 21:39
Ovarian cancer chemoresistance.
export SAMPLES="2484-AJ-0001 2484-AJ-0002 2484-AJ-0003"
######################################
# Make FASTQ
######################################
export OVHOME=/home/arq5x/cphg-home/cphg-quinlan/projects/ov-cell-lines
export STEPNAME=ovc-fastq
for sample in $SAMPLES
do
export QSUB="qsub -W group_list=cphg_arq5x -q arq5xlab -V -l select=1:mem=8000m:ncpus=1 -N $STEPNAME -m bea -M [email protected]";
@arq5x
arq5x / chromsweep-scalability.sh
Last active December 31, 2015 01:19
Bedtools protocols.
# /home/arq5x/cphg-home/projects/bedtools-curr-prot-bx
#######################################################
# 1. Create subsamples of BAM file and convert to BED.
#######################################################
bedtools bamtobed -i ~/cphg-quinlan/projects/rs-exome/bam/1478PC0009B.conc.on.pos.bam | \
cut -f 1-3 \
> datasets/sample.100M.bam.bed &
samtools view -us 0.10 ~/cphg-quinlan/projects/rs-exome/bam/1478PC0009B.conc.on.pos.bam | \
@arq5x
arq5x / flatten_transcripts.py
Created December 4, 2013 15:52
Flattened CCDS
import pybedtools as pbt
import sys
def merge_gene(lines):
    tmp = pbt.BedTool(lines, from_string=True).merge(nms=True)
    print(tmp)
gene_lines = ''
curr_gene = None
prev_gene = None
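
The merge step is what does the flattening: overlapping or book-ended intervals are collapsed into one. A dependency-free sketch of that idea is shown here for a single gene; `merge_intervals` is a hypothetical helper, not part of pybedtools, and it ignores the name aggregation that `nms=True` requests.

```python
def merge_intervals(intervals):
    """Collapse overlapping or adjacent (start, end) intervals into a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


# Hypothetical exon coordinates from several transcripts of one gene.
print(merge_intervals([(10, 20), (15, 30), (40, 50), (30, 35)]))
```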
@arq5x
arq5x / workflow.sh
Last active December 23, 2015 05:39
irradiated clones
############################################################
# Novoalign
############################################################
export GENOME=/home/arq5x/cphg-home/shared/genomes/hg19/bwa/gatk/hg19_gatk.fa.novo.k14.s1.idx
export IRCHOME=/net/midtier18/vol79/cphg-quinlan2/projects/irradiated-clones
export STEPNAME=ircnovo
export QSUB="qsub -W group_list=cphg_arq5x -q arq5xlab -V -l select=1:mem=32000m:ncpus=16 -N $STEPNAME -m bea -M [email protected]";
echo "cd $IRCHOME; novoalign -d $GENOME -o SAM $'@RG\tID:parental\tSM:parental' -r Random \
-f fastq/CgmW_AGTCAA_L001_R1.fastq.gz fastq/CgmW_AGTCAA_L001_R2.fastq.gz \