@arq5x
Last active December 18, 2015 10:29
Example of using GEMINI genotype columns to identify confident somatic mutations in a cancer experiment
# DOCS:
# https://gemini.readthedocs.org/en/latest/content/database_schema.html#genotype-information
# GEMINI SOURCE:
# https://github.com/arq5x/gemini
#########################################################################
# load a VCF for a tumor / normal pair into gemini.
# - use 4 cores
# - assume VCF has been annotated with snpEff
#########################################################################
$ gemini load -v tumor-normal.vcf -t snpEff --cores 4 tumor-normal.vcf.db
#########################################################################
# Identify novel somatic mutations in the tumor that are likely to
# impact gene function.
#
# SOMATIC:
# 1. the tumor has an alternate allele (HET)
# 2. the normal does not (HOM_REF)
# 3. Also, the normal has NO evidence for the alternate allele. To
# enforce this, we require that there were 0 alignments for the normal
# with the alternate allele.
#########################################################################
$ gemini query -q "select chrom, start, end, ref, alt, type, gene \
                   from variants \
                   where impact_severity != 'LOW' \
                   and in_dbsnp = 0" \
               --gt-filter "gt_types.TUMOR == HET and \
                            gt_types.NORMAL == HOM_REF and \
                            gt_alt_depths.NORMAL == 0" \
               tumor-normal.vcf.db
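#########################################################################
# The filter above assumes the two samples in the VCF are literally
# named TUMOR and NORMAL. As a minimal sketch (not part of GEMINI), the
# same filter can be built for other sample names. build_gt_filter is a
# hypothetical helper introduced here for illustration only.
#########################################################################
```shell
# Hypothetical helper: print the --gt-filter string for a tumor/normal pair.
# $1 = tumor sample name, $2 = normal sample name (as they appear in the VCF).
build_gt_filter() {
    tumor="$1"
    normal="$2"
    printf '%s' "gt_types.${tumor} == HET and gt_types.${normal} == HOM_REF and gt_alt_depths.${normal} == 0"
}

# Build the filter for the sample names used in this example.
GT_FILTER=$(build_gt_filter TUMOR NORMAL)
echo "$GT_FILTER"

# The result could then be passed to the query shown earlier, e.g.:
#   gemini query -q "select chrom, start, end, ref, alt, type, gene from variants" \
#                --gt-filter "$GT_FILTER" tumor-normal.vcf.db
```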