Christian Theil Have cth

cth / create_gc_model.r

Created August 31, 2015 10:50

	# Approximate way of creating a GC model from a different gcmodel
	# This takes a while to run ...

	library("data.table")

	# We take Human 1M-Duev3 since it has the most SNPs
	gcmodel.other <- data.table(read.table("run_ind_ex/lib/GC_hg19/Human1M-Duov3-0_hg19.gcmodel",h=T,sep="\t"))
	setkey(gcmodel.other, "Chr", "Position")

	# Read the map file

cth / ParseCNV_QC.pl

Created September 10, 2015 10:06

	#!/usr/bin/perl
	#Joseph Glessner
	=head
	Inputs:
	PennCNV .log -> LRR_SD, GCWF
	PennCNV .rawcnv -> CountCNV
	Plink --missing .imiss / Illumina LIMs Project Detail Report .csv / Illumina Genome Studio Samples Table / General format SampleID CallRate -> CallRate
	Eigenstrat smartPCA .pca.evec / Plink .mds -> PCA
	Plink --genome .genome -> PI_HAT relatedness
	make sure to include extremevalues_2.2.tar.gz http://cran.r-project.org/web/packages/extremevalues/index.html

cth / README

Last active October 26, 2015 10:00



	For imputed genotypes used the batch of 2015-04-27 which include DCH an other smaller cohorts.
	This data is imputed using impute2 version 2.3.0 and phased using shapeit v2.r727.
	Variants with p(HWE) < 0.000001 or callrate less than 95%, MAF less than %1 were removed prior
	to impution. The panel used was 1000 Genomes phase 1.

	The dosages all 48 SNPs were extracted from the VCF created from the imputed data
	and merged with the 13 directly genotyped SNPs. Whenever, directly genotyped variants
	are available they are used in place of the imputed dosages.

cth / ABCA6.sh

Last active November 10, 2015 08:29

	#!/bin/bash
	#$ -S /bin/bash
	#$ -cwd

	# Check if CNVs are LD with SNPs.

	for file in Samples*txt
	do
	gene=`echo $file\|sed 's/.txt//'\|sed 's/Samples//'`
	# Extract chr and (one) position from file

cth / vcf2bed.pl

Last active November 11, 2015 12:21 — forked from mlawson/vcf2bed.pl

Converts a vcf file (typically of indels) into a bed file

cth / make_dosages.rb

Created November 19, 2015 23:46

cth / extract_variants.rb

Created November 20, 2015 13:05

	#!/home/fng514/bin/ruby
	#$ -S /home/fng514/bin/ruby
	#$ -cwd
	require 'bio-samtools'
	require 'zlib'
	rsnumbers = {}

	puts ARGV.inspect

	throw "invalid numbmer of args" unless ARGV.size == 2

cth / power.R

Created November 26, 2015 23:42

	## Very simple power calculation
	## assuming a true odds ratio of 3.1

	pick_eater_obese <- function(n) table(runif(n) < 109/(330+109))
	pick_eater_control <- function(n) table(runif(n) < 37/(37+38))

	#controls <- data.frame(pheno = rep(1000,"case"), runif(

	p.values <- replicate(1000, fisher.test( matrix(c(pick_eater_obese(50),pick_eater_control(50)),2) )$p )
	significance <- c(5e-1, 5e-2, 5e-3, 5e-4, 5e-5)

cth / power.R

Created November 26, 2015 23:42

	## Very simple power calculation
	## assuming a true odds ratio of 3.1

	pick_eater_obese <- function(n) table(runif(n) < 109/(330+109))
	pick_eater_control <- function(n) table(runif(n) < 37/(37+38))
	p.values <- replicate(1000, fisher.test( matrix(c(pick_eater_obese(50),pick_eater_control(50)),2) )$p )
	significance <- c(5e-1, 5e-2, 5e-3, 5e-4, 5e-5)
	powers <- data.frame(significance, power = sapply(significance,function(x) mean(p.values < x)))

cth / README

Last active May 12, 2016 10:36

	strandflip.R version 0.2
	By Christian Theil Have
	with inspiration from SAS script by Aldi Kraja

	DISCLAIMER: NO WARRANTIES WHATSOEVER ARE PROVIDED WITH THIS
	PROGRAM. YOU RUN IT AT YOUR OWN RISK.
	YOU CAN DISTRIBUTE this utility pgm in other collaborations

	purpose: flip dosage if the alleles in original data are
	different when compared to 1000G set of nucmtDNA