Skip to content

Instantly share code, notes, and snippets.

View gatoravi's full-sized avatar

Avinash R gatoravi

View GitHub Profile
@gatoravi
gatoravi / import-model-define.sh
Last active August 29, 2015 14:16
define models for imported instrument data.
#List all the instrument-data that is imported
genome instrument-data list imported "sample.patient.common_name='TST1ds' and sample.extraction_type = 'genomic dna'"
genome instrument-data list imported "sample.patient.common_name='TST1ds' and sample.extraction_type = 'rna'"
#Define the models based on imported instrument-data, rename the ID's
genome model define rna-seq --reference-sequence-build='106942997' --annotation-build='124434505' --cancer-annotation-db='tgi/cancer-annotation/human/build37-20131010.1' --subject='H_NJ-HCC1395ds-HCC1395_RNA' --processing-profile='2762841' --instrument-data='0296e0aeeb3f4daab1e8604cddb2cb70' --model-name='hcc1395ds-tumor-rnaseq'
genome model define rna-seq --reference-sequence-build='106942997' --annotation-build='124434505' --cancer-annotation-db='tgi/cancer-annotation/human/build37-20131010.1' --subject='H_NJ-HCC1395ds-HCC1395_BL_RNA' --processing-profile='2762841' --instrument-data='675561c466744c8498f903b77ee9acbe' --model-name='hcc1395ds-normal-rnaseq'
ge
@gatoravi
gatoravi / run_em.R
Created April 20, 2015 05:30
EM algorithm - solution to problem 6.6.7 "Intro to Mathematical Statistics - Hogg, 7th ed"
#! /bin/Rscript
run_em <- function(data, cutoff = 1.50, theta_init = 0.5, iter = 100, tolerance = 1e-4) {
missing = data[data == cutoff]
observed = data[data != cutoff]
n1 = length(observed)
n2 = length(missing)
n = n1 + n2
theta_current = theta_init
theta_new = 0
@gatoravi
gatoravi / bamreadcount_parse1.sh
Created April 22, 2015 20:42
Parse read-counts from Bam-Readcount output.
ls *tsv | xargs -i echo 'cut -f 1-4,6-9 {} |tr ":" "\t" | cut -f 1-4,6,20,34,48 | perl -ne '\''chomp$_; @array=split("\t"); if($array[3] == $array[4] || $array[3] == $array[5] || $array[3] == $array[6] || $array[3] == $array[7]){next;} else {print "$_\n";}'\'' > {}.formatted '
@gatoravi
gatoravi / runGemini.sh
Last active January 11, 2016 23:40
How to run a Gemini analysis on clia1
#VEP is on clia1 here - /opt/gms/vep/variant_effect_predictor
#An example that I used to run it was,
#VEP is installed here /opt/gms/vep/variant_effect_predictor
#PERL5LIB=$PERL5LIB:/opt/gms/vep/ perl variant_effect_predictor.pl -i example_GRCh37.vcf --cache --dir_cache /opt/gms/vep/
#another example with required VEP annotations for GEMINI
PERL5LIB=$PERL5LIB:/opt/gms/vep/ perl /opt/gms/vep/variant_effect_predictor/variant_effect_predictor.pl -i test.vcf \
--cache --sift b --polyphen b --symbol --numbers --biotype --total_length \
-o snvs.detailed.vep.vcf --vcf --fields Consequence,Codons,Amino_acids,Gene,SYMBOL,Feature,EXON,PolyPhen,SIFT,Protein_position,BIOTYPE \
@gatoravi
gatoravi / gist:9a9277cc0cdabf5c1dcf
Last active August 29, 2015 14:24
create_igv_batch_from_bed.sh
#! /bin/bash
#This script takes 2 arguments
#The first script is the BAM file
#The second is the BED file
bam=$1
bed=$2
awk -v bam=$bam 'BEGIN { print "new\ngenome hg19\nload "bam"\nsnapshotDirectory ./snapshots/" } { gsub("chr", "", $1); if(NR > 1) print "goto "$1"\t"$2-10"\t"$2+10"\nsort base\ncollapse\nsnapshot"; }' $bed
@gatoravi
gatoravi / .gitignore
Last active August 29, 2015 14:24 — forked from octocat/.gitignore
# Compiled source #
###################
*.com
*.class
*.dll
*.exe
*.o
*.so
# Packages #
@gatoravi
gatoravi / cs_programming_resources.md
Last active August 29, 2015 14:26 — forked from anonymous/cs_programming_resources.md
Programming/Computer Science books.
  1. The Practice of programming - Brian W. Kernighan
  2. The Unix Programming Environment - Brian W. Kernighan and Rob Pike
  3. Numerical Recipes in C: The Art of Scientific Computing
  4. Structure and Interpretation of Computer Programs.
  5. Higher-Order Perl: Transforming Programs with Programs - Mark Jason Dominus
  6. Reverse Engineering for Beginners - Dennis Yurichev
  7. The Pragmatic Programmer: From Journeyman to Master
  8. The Mythical Man-Month
@gatoravi
gatoravi / test_read_write.R
Last active August 29, 2015 14:26
Test reading and writing to the same file in R
#! /usr/bin/Rscript
test_read_write <- function() {
iris_copy <- read.table("iris_copy.tsv", head = T, sep = "\t")
iris_copy$Sepal.Length <- iris$Sepal.Length + 1
write.table(iris_copy, "iris_copy.tsv", row.names = F, quote = F,
sep = "\t")
}
create_original_modified_iris <- function() {
@gatoravi
gatoravi / travis_cpp_notes1.md
Last active August 29, 2015 14:26
Notes from Travis on general c++ coding

##Exiting from programs Exceptions aren't the only way. There are times when cerr/exit are acceptable, but personally, I almost never do that. The answer to your question is kind of long.

I tend to think like a library developer (i.e., code reuse is always on my mind). If your function calls exit(), that makes it impossible to use in other contexts that might not care about whatever error condition is causing you to want to exit. "But what about contexts where the error is truly unrecoverable?" In those cases, exit() inhibits debugging, use abort() instead. This should be reserved for truly egregious errors--those which you'll want to get a core dump

@gatoravi
gatoravi / create_annotate_files.sh
Created September 26, 2015 17:51
Create the test files for 'regtools junctions annotate'
#Generate the FASTA
rm -f test.fa; samtools faidx ../../regtools-test/chr22.fa 22:41475000-41585000 > test.fa
sed "1s/.*/>22/" test.fa > new_test.fa; rm -f test.fa; mv new_test.fa test.fa #fix the header
samtools faidx test.fa
#Generate the GTF
rm -f test.gtf; awk -F"\t" '$4> 41480000 && $5 < 41580000 { OFS = "\t"; $4 = $4 - 41475000 + 1; $5 = $5 - 41475000 + 1; print }' ../../regtools-test/genes_chr22.gtf > test.gtf
#Create the junctions.bed
rm -f test_junctions.bed; awk '$1==22 && $2> 41480000 && $3 < 41580000 { OFS = "\t"; $2=$2 - 41475000 + 1; $3 = $3 - 41475000 + 1; $7=$2; $8=$3; print }' hcc1395_junctions.bed > test_junctions.bed