Skip to content

Instantly share code, notes, and snippets.

#!/home/fng514/bin/ruby
#$ -S /home/fng514/bin/ruby
#$ -cwd
require 'bio-samtools'
require 'zlib'
rsnumbers = {}
puts ARGV.inspect
throw "invalid numbmer of args" unless ARGV.size == 2
require 'bio-samtools'
@cth
cth / vcf2bed.pl
Last active November 11, 2015 12:21 — forked from mlawson/vcf2bed.pl
Converts a vcf file (typically of indels) into a bed file
#!/usr/bin/perl
use strict;
if (scalar @ARGV != 1) {
print "vcf2bed.pl <vcf_file>\n";
exit 1;
}
my $vcfFile = $ARGV[0];
@cth
cth / ABCA6.sh
Last active November 10, 2015 08:29
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
# Check if CNVs are LD with SNPs.
for file in Samples*txt
do
gene=`echo $file|sed 's/.txt//'|sed 's/Samples//'`
# Extract chr and (one) position from file
@cth
cth / README
Last active October 26, 2015 10:00
For imputed genotypes used the batch of 2015-04-27 which include DCH an other smaller cohorts.
This data is imputed using impute2 version 2.3.0 and phased using shapeit v2.r727.
Variants with p(HWE) < 0.000001 or callrate less than 95%, MAF less than %1 were removed prior
to impution. The panel used was 1000 Genomes phase 1.
The dosages all 48 SNPs were extracted from the VCF created from the imputed data
and merged with the 13 directly genotyped SNPs. Whenever, directly genotyped variants
are available they are used in place of the imputed dosages.
#!/usr/bin/perl
#Joseph Glessner
=head
Inputs:
PennCNV .log -> LRR_SD, GCWF
PennCNV .rawcnv -> CountCNV
Plink --missing .imiss / Illumina LIMs Project Detail Report .csv / Illumina Genome Studio Samples Table / General format SampleID CallRate -> CallRate
Eigenstrat smartPCA .pca.evec / Plink .mds -> PCA
Plink --genome .genome -> PI_HAT relatedness
make sure to include extremevalues_2.2.tar.gz http://cran.r-project.org/web/packages/extremevalues/index.html
# Approximate way of creating a GC model from a different gcmodel
# This takes a while to run ...
library("data.table")
# We take Human 1M-Duev3 since it has the most SNPs
gcmodel.other <- data.table(read.table("run_ind_ex/lib/GC_hg19/Human1M-Duov3-0_hg19.gcmodel",h=T,sep="\t"))
setkey(gcmodel.other, "Chr", "Position")
# Read the map file
#!/bin/sh
#$ -S /bin/bash
#$ -cwd
# Christian Theil HAve, 2015
# Modify this to be the final report directory of interest
# Note that final reports must contain LRR and BAF
FINAL_REPORTS_DIR=target_final_reports
# This is where
require 'rinruby'
require 'bio-samtools'
require 'rake'
plink_stems = {
:lucamp => "/eva/brutus/J8/data/deepExom/burden5/clean/lrt24qibin7R2BeagleV2",
# :metabochip => "/eva/brutus/J8/data/metaboChip/clean/metaboChipAll",
:exomchip => "/eva/brutus/J8/data/exomChip/clean/exomChipAll"
}
def mean_std_dev(arr,n)
sum=0
mean = arr.inject{|sum,x| sum + x } / n.to_f
total=0
[mean, Math.sqrt(arr.inject{|total,x| total + ((mean-x)*(mean-x)) } / n.to_f) ]
end
def percentile_depths(arr,n,pct)
samples_in_percentile = (n * pct).floor
(arr.sort.reverse.slice(0,samples_in_percentile)).min