# 1. Download BED files of 349 DHS experiments from Science, 337, no. 6099, pp. 1190-1195, 7 Sep. 2012
# http://www.uwencode.org/proj/Science_Maurano_Humbert_et_al/
wget http://www.uwencode.org/proj/Science_Maurano_Humbert_et_al/data/all_fdr0.05_hot.tgz
# 2. Unpack.
tar -zxvf all_fdr0.05_hot.tgz
# 3. Make sure all of the files are sorted lexicographically by chrom, then numerically by start.
# This is required for the sweep algorithm.
# Hint: they are sorted correctly, this is just a sanity check.
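The sanity check described in step 3 can be sketched in Python. This is a minimal sketch under stated assumptions, not the command the original script goes on to run: the function name `is_sorted_bed` and the sample records are hypothetical. It compares chromosome names as strings and start coordinates as integers.

```python
def is_sorted_bed(lines):
    """Return True if BED records are ordered by chrom (string order), then start (numeric)."""
    prev = None
    for line in lines:
        # Skip blank lines and common BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        key = (fields[0], int(fields[1]))
        if prev is not None and key < prev:
            return False
        prev = key
    return True


# Hypothetical records, for illustration only.
sorted_records = ["chr1\t100\t200", "chr1\t150\t300", "chr10\t5\t50", "chr2\t10\t20"]
unsorted_records = ["chr1\t150\t300", "chr1\t100\t200"]

print(is_sorted_bed(sorted_records))
print(is_sorted_bed(unsorted_records))
```

Note that string order places `chr10` before `chr2`, which is what "lexicographically by chrom" means here.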
Cancer_type Lifetime_cancer_incidence Total_cells_tissue Total_Stem_Cells Stem_cell_divisions_per_year Stem_cell_divisions_per_lifetime LCSD
ALL 0.0041 3000000000000 135000000 12 960 129900000000
BCC 0.3 180000000000 5820000000 7.6 608 3550000000000
CLL 0.0052 3000000000000 135000000 12 960 129900000000
Colorectal 0.048 30000000000 200000000 73 5840 1168000000000
Colorectal_FAP 1 30000000000 200000000 73 5840 1168000000000
Colorectal_Lynch 0.5 30000000000 200000000 73 5840 1168000000000
Duodenum_adenocarcinoma 0.0003 680000000 4000000 24 1947 7796000000
Duodenum_adenocarcinoma_with_FAP 0.035 680000000 4000000 24 1947 7796000000
Esophageal_squamous_cell_carcinoma 0.001938 3240000000 846000 17.4 1390 1203000000
import sys
from itertools import *
"""
compute the complexity of each kmer passed in
given the format of the output of `jellyfish dump -ct`
complexity is measured as the number of runs divided
by the total length of the sequence.
e.g., "AAAAA" would be 1/5
and "ACTGC" would be 5/5
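The complexity measure described above (number of runs divided by sequence length) can be sketched with `itertools.groupby`, which groups consecutive identical characters. This is an illustrative sketch, not necessarily how the original script computes it; the function name `complexity` is hypothetical, and parsing of the `jellyfish dump -ct` output is not shown.

```python
from itertools import groupby


def complexity(kmer):
    """Number of runs of identical consecutive bases divided by the k-mer length."""
    # groupby yields one group per run of equal adjacent characters.
    runs = sum(1 for _ in groupby(kmer))
    return runs / len(kmer)


print(complexity("AAAAA"))  # one run over five bases: 0.2
print(complexity("ACTGC"))  # five runs over five bases: 1.0
```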
import sys
import numpy as np
"""
Simulate chutes and ladders.
Reports the number of moves for 1-player to reach the end,
followed by the list of rolls that player had.
Run as follows for 100000 games with 1 player. Report the total
number of moves made by the winning player:
cat ivl.bed
chr1 10 30
cat data.bed
chr1 9 20 d1
chr1 12 18 d2
chr1 12 20 d3
chr1 15 16 d4
chr1 25 40 d5
chr1 26 30 d6
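The preview does not show what is done with these two files. One plausible operation, sketched here purely as an assumption, is to report how many bases of each `data.bed` record overlap the single interval in `ivl.bed`. The helper name `overlap` is hypothetical, and coordinates are treated as half-open, as in BED.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Number of bases shared by two half-open intervals (0 if they do not overlap)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


# Records copied from ivl.bed and data.bed above.
ivl = ("chr1", 10, 30)
data = [
    ("chr1", 9, 20, "d1"),
    ("chr1", 12, 18, "d2"),
    ("chr1", 12, 20, "d3"),
    ("chr1", 15, 16, "d4"),
    ("chr1", 25, 40, "d5"),
    ("chr1", 26, 30, "d6"),
]

# Collect (name, overlapping bases) for every record that overlaps the interval.
hits = []
for chrom, start, end, name in data:
    if chrom == ivl[0]:
        n = overlap(ivl[1], ivl[2], start, end)
        if n > 0:
            hits.append((name, n))

for name, n in hits:
    print(name, n)
```

All six records overlap the interval, with d1 contributing the most shared bases (10) and d4 the fewest (1).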
sudo pip install awscli
aws configure
aws s3 ls
aws s3 ls s3://gqt-data
chmod a+x vcfsort.sh
vcfsort.sh trio.trim.vep.vcf.gz
# bedtools --version
# bedtools v2.24.0-14-gaa11ef9
########################################################
# Create a BED file of 5kb windows with 2.5kb overlap
# tiling build 37 (hg19) of the human genome
########################################################
bedtools makewindows -g hg19.txt -w 5000 -s 2500 > hg19.w5k.s2.5k.bedg
########################################################
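To make the window layout concrete, here is a small Python sketch of 5 kb windows stepped by 2.5 kb along one chromosome. It illustrates the tiling idea only: the function name `make_windows` and the 12,000 bp chromosome `chrT` are hypothetical, and the sketch clips the last windows at the chromosome end, which has not been verified here against actual `bedtools makewindows` output.

```python
def make_windows(chrom, size, width=5000, step=2500):
    """Yield (chrom, start, end) half-open windows of `width` bases every `step` bases."""
    start = 0
    while start < size:
        # Clip the window so it never runs past the end of the chromosome.
        yield (chrom, start, min(start + width, size))
        start += step


# Hypothetical 12,000 bp chromosome, for illustration only.
windows = list(make_windows("chrT", 12000))
for chrom, start, end in windows:
    print(chrom, start, end, sep="\t")
```

Each interior base is covered by two windows, which is what the 2.5 kb overlap in the comment above refers to.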
tosses <- 200
experiments <- 1000
hist((rbinom(experiments, tosses, 0.5) / tosses),
     breaks=20, xlim=c(0,1),
     main=paste("Distribution of the fraction of heads from", experiments,
                "experiments with", tosses, "tosses each"),
     xlab = "Fraction of tosses that were heads")
# get HumVar
wget ftp://genetics.bwh.harvard.edu/pph2/training/humvar-2011_12.predictions.tar.gz
tar -zxvf humvar-2011_12.predictions.tar.gz
# get dbSNP
wget http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/snp138.txt.gz
gunzip snp138.txt.gz
# get the deleterious SNPs
grep snp138.txt -wFf <(grep rs humvar-2011_12.deleterious.pph.output | cut -f 5) \