Skip to content

Instantly share code, notes, and snippets.

View gatoravi's full-sized avatar

Avinash R gatoravi

View GitHub Profile
@gatoravi
gatoravi / Parse_VCF_BCFs_with_htslib_C++.md
Last active November 16, 2020 10:28
Parse VCF/BCFs with htslib (C++)

#Gist of the gist This is a simple example showing how htslib can be used to parse VCF/BCF files in C++. Compiling options are shown in compile_and_run.sh, this assumes that you have compiled htslib into a static library - libhts.a, post a comment if you'd like help doing that.

The program takes a VCF file and spits out chromosome, position and number_of_alleles for each record in the VCF.

This test in the Samtools repo - [https://github.com/samtools/htslib/blob/1.2.1/test/test-vcf-api.c] also has some useful examples, note you might have to switch to the master branch version of that file to keep up with the currently rapidly evolving htslib API.

#Acknowledgements

@gatoravi
gatoravi / go_notes.md
Last active September 26, 2015 17:58
Notes from "Building Large Systems with Go" Unsession by Mark Mandel at StrangeLoop 2015

#Formatting Go code

  • go fmt
  • go vet
  • golint

#Package manager options

  • gb
  • go get
  • git clone
@gatoravi
gatoravi / create_annotate_files.sh
Created September 26, 2015 17:51
Create the test files for 'regtools junctions annotate'
#Generate the FASTA
rm -f test.fa; samtools faidx ../../regtools-test/chr22.fa 22:41475000-41585000 > test.fa
sed "1s/.*/>22/" test.fa > new_test.fa; rm -f test.fa; mv new_test.fa test.fa #fix the header
samtools faidx test.fa
#Generate the GTF
rm -f test.gtf; awk -F"\t" '$4> 41480000 && $5 < 41580000 { OFS = "\t"; $4 = $4 - 41475000 + 1; $5 = $5 - 41475000 + 1; print }' ../../regtools-test/genes_chr22.gtf > test.gtf
#Create the junctions.bed
rm -f test_junctions.bed; awk '$1==22 && $2> 41480000 && $3 < 41580000 { OFS = "\t"; $2=$2 - 41475000 + 1; $3 = $3 - 41475000 + 1; $7=$2; $8=$3; print }' hcc1395_junctions.bed > test_junctions.bed
@gatoravi
gatoravi / travis_cpp_notes1.md
Last active August 29, 2015 14:26
Notes from Travis on general c++ coding

##Exiting from programs Exceptions aren't the only way. There are times when cerr/exit are acceptable, but personally, I almost never do that. The answer to your question is kind of long.

I tend to think like a library developer (i.e., code reuse is always on my mind). If your function calls exit(), that makes it impossible to use in other contexts that might not care about whatever error condition is causing you to want to exit. "But what about contexts where the error is truly unrecoverable?" In those cases, exit() inhibits debugging, use abort() instead. This should be reserved for truly egregious errors--those which you'll want to get a core dump

@gatoravi
gatoravi / test_read_write.R
Last active August 29, 2015 14:26
Test reading and writing to the same file in R
#! /usr/bin/Rscript
test_read_write <- function() {
iris_copy <- read.table("iris_copy.tsv", head = T, sep = "\t")
iris_copy$Sepal.Length <- iris$Sepal.Length + 1
write.table(iris_copy, "iris_copy.tsv", row.names = F, quote = F,
sep = "\t")
}
create_original_modified_iris <- function() {
@gatoravi
gatoravi / cs_programming_resources.md
Last active August 29, 2015 14:26 — forked from anonymous/cs_programming_resources.md
Programming/Computer Science books.
  1. The Practice of programming - Brian W. Kernighan
  2. The Unix Programming Environment - Brian W. Kernighan and Rob Pike
  3. Numerical Recipes in C: The Art of Scientific Computing
  4. Structure and Interpretation of Computer Programs.
  5. Higher-Order Perl: Transforming Programs with Programs - Mark Jason Dominus
  6. Reverse Engineering for Beginners - Dennis Yurichev
  7. The Pragmatic Programmer: From Journeyman to Master
  8. The Mythical Man-Month
@gatoravi
gatoravi / .gitignore
Last active August 29, 2015 14:24 — forked from octocat/.gitignore
# Compiled source #
###################
*.com
*.class
*.dll
*.exe
*.o
*.so
# Packages #
@gatoravi
gatoravi / gist:9a9277cc0cdabf5c1dcf
Last active August 29, 2015 14:24
create_igv_batch_from_bed.sh
#! /bin/bash
#This script takes 2 arguments
#The first script is the BAM file
#The second is the BED file
bam=$1
bed=$2
awk -v bam=$bam 'BEGIN { print "new\ngenome hg19\nload "bam"\nsnapshotDirectory ./snapshots/" } { gsub("chr", "", $1); if(NR > 1) print "goto "$1"\t"$2-10"\t"$2+10"\nsort base\ncollapse\nsnapshot"; }' $bed
@gatoravi
gatoravi / runGemini.sh
Last active January 11, 2016 23:40
How to run a Gemini analysis on clia1
#VEP is on clia1 here - /opt/gms/vep/variant_effect_predictor
#An example that I used to run it was,
#VEP is installed here /opt/gms/vep/variant_effect_predictor
#PERL5LIB=$PERL5LIB:/opt/gms/vep/ perl variant_effect_predictor.pl -i example_GRCh37.vcf --cache --dir_cache /opt/gms/vep/
#another example with required VEP annotations for GEMINI
PERL5LIB=$PERL5LIB:/opt/gms/vep/ perl /opt/gms/vep/variant_effect_predictor/variant_effect_predictor.pl -i test.vcf \
--cache --sift b --polyphen b --symbol --numbers --biotype --total_length \
-o snvs.detailed.vep.vcf --vcf --fields Consequence,Codons,Amino_acids,Gene,SYMBOL,Feature,EXON,PolyPhen,SIFT,Protein_position,BIOTYPE \
@gatoravi
gatoravi / bamreadcount_parse1.sh
Created April 22, 2015 20:42
Parse read-counts from Bam-Readcount output.
ls *tsv | xargs -i echo 'cut -f 1-4,6-9 {} |tr ":" "\t" | cut -f 1-4,6,20,34,48 | perl -ne '\''chomp$_; @array=split("\t"); if($array[3] == $array[4] || $array[3] == $array[5] || $array[3] == $array[6] || $array[3] == $array[7]){next;} else {print "$_\n";}'\'' > {}.formatted '