Created June 7, 2018 17:56
Module 3 - Annotation. Tutorial
Module 3. Annotation
Check where you are in the file system
Copy the wrapper scripts from GitHub
Set up environment variables
. /home/stud40/module3_annotation/env_setup.sh
Check:
You should see:
(Use bed file in your home folder)
Result
Check out validation: nist_NA12878-validate.png
Check out the MultiQC report
Check out the commands (bcbio-nextgen-commands.log), tool versions (programs.txt), and data versions (data_versions.csv).
Create the Excel report
If you are running it for the second time (no cleanup)
Explore the Excel report: nist.csv
Check in IGV whether the variants are real. Which of the two is of better quality?
Explore the annotated VCF: Ashkenazim.vcf.gz
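For a first look at a gzipped VCF without extra tools, a minimal Python sketch using only the standard library can help. The two records below are made up for illustration and are not taken from Ashkenazim.vcf.gz; the helper name read_vcf_records is hypothetical. The sketch relies only on the VCF layout: meta lines start with ##, the column header starts with #CHROM, and fields are tab-separated.

```python
import gzip
import os
import tempfile

# Made-up two-record VCF, for illustration only; the real file is Ashkenazim.vcf.gz.
toy_vcf = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t100\t.\tA\tG\t50\tPASS\tDP=30\n"
    "chr2\t300\t.\tG\tA\t10\tLowQual\tDP=5\n"
)


def read_vcf_records(path):
    """Return one dict per variant line, keyed by the #CHROM header columns."""
    records = []
    columns = None
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith("##"):
                continue  # meta-information lines
            if line.startswith("#"):
                columns = line.lstrip("#").split("\t")  # column header line
                continue
            records.append(dict(zip(columns, line.split("\t"))))
    return records


# Write the toy VCF to a temporary gzipped file so the sketch is self-contained.
tmp_dir = tempfile.mkdtemp()
toy_path = os.path.join(tmp_dir, "toy.vcf.gz")
with gzip.open(toy_path, "wt") as out:
    out.write(toy_vcf)

records = read_vcf_records(toy_path)
passing = [r for r in records if r["FILTER"] == "PASS"]
for r in passing:
    print(r["CHROM"], r["POS"], r["REF"], r["ALT"], r["INFO"])
```

To try it on the real file, pass the path to Ashkenazim.vcf.gz to read_vcf_records instead of the temporary toy file. Note that all values, including POS, are kept as strings.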
#which genes have variants?
select distinct gene from variants
#select all variants in PKN2 gene
select * from variants where gene='PKN2'
#select samples
select * from samples
#select only rare coding variants (MAF<1%)
select * from variants where max_aaf_all<0.01
#select impacts of rare coding variants (start is 0-based in GEMINI, so start+1 gives the 1-based position)
select chrom,start+1,ref,alt,impact from variants where max_aaf_all<0.01
#run the same query in unix command line
gemini query -q "select chrom,start+1,ref,alt,impact from variants where max_aaf_all<0.01" --header tiny.vepeffects.db
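A GEMINI database is an SQLite file, so the same query can also be run from Python with the standard sqlite3 module. The sketch below builds a toy in-memory variants table with only the columns used in the queries above; the rows and the gene name GENE_X are made up for illustration and do not come from tiny.vepeffects.db.

```python
import sqlite3

# Toy stand-in for a GEMINI database: only the columns used in the queries above.
# The rows are hypothetical, not taken from tiny.vepeffects.db.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table variants "
    "(chrom text, start integer, ref text, alt text, "
    "impact text, gene text, max_aaf_all real)"
)
toy_rows = [
    ("chr1", 99, "A", "G", "missense_variant", "PKN2", 0.001),
    ("chr1", 199, "C", "T", "synonymous_variant", "PKN2", 0.25),
    ("chr2", 299, "G", "A", "stop_gained", "GENE_X", 0.0),
]
conn.executemany("insert into variants values (?,?,?,?,?,?,?)", toy_rows)

# Same query as in the tutorial: rare variants, with start shifted to 1-based.
query = (
    "select chrom,start+1,ref,alt,impact "
    "from variants where max_aaf_all<0.01"
)
rare = conn.execute(query).fetchall()
for row in rare:
    print("\t".join(str(value) for value in row))
```

To query a real database instead of the toy table, replace ":memory:" with the path to the .db file and skip the create and insert steps.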
#generate Excel reports:
cre.gemini2txt.sh tiny.vepeffects.db 10
cat tiny.vepeffects.db.txt
cre.gemini_variant_impacts.sh tiny.vepeffects.db 10 ALL
cat tiny.vepeffects.db.impacts.txt
#Example report: Ashkenazim.no_c4r.xls