- State from the beginning that you are focusing on exome
- Talk through the pLI figure in more detail
- Any slide needs to talked through in detail if you present it
- Make it clear what missense variants are what you are seeing
- Should the observed CpG density be modulated by the codon content of the region
- Model ideas?
- Exclude singletons???
- use just synonymous density as the independent variable
- Molly Przezorski, CpG
- Simplify the explanation of each of the essential gene descriptions
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# download simple repeats from UCSC and convert to BED | |
curl -s http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/simpleRepeat.txt.gz \ | |
| gzcat \ | |
| cut -f 2-5 \ | |
> simrep.hg19.bed | |
# download microsatellites from UCSC and convert to bed | |
curl -s http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/microsat.txt.gz \ | |
| gzcat \ | |
| cut -f 2-5 \ |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# mnake sure all rows have 99 fields | |
$ awk 'BEGIN{FS="\t"} {print NF}' /uufs/chpc.utah.edu/common/home/u1072926/gemini_queries/all.txt | uniq | |
99 | |
# how many HET (1) and HOM_ALT (3) genotypes were there? | |
$ awk 'BEGIN{FS="\t"} {print $99}' /uufs/chpc.utah.edu/common/home/u1072926/gemini_queries/all.txt | sort | uniq -c | |
# get rid of headers except for the first one | |
(head -n 1 /uufs/chpc.utah.edu/common/home/u1072926/gemini_queries/all.txt; grep -v gt_types /uufs/chpc.utah.edu/common/home/u1072926/gemini_queries/all.txt) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# STEP 1: for all fams, print each SGS region (chrom, start, end) | |
import sys | |
import subprocess as sub | |
def run_jobs(commands): | |
""" | |
This function takes a set of max_work commands and executes | |
them with rj | |
""" | |
f = open('rungemini.sh', 'w') |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# GeneDx: https://www.genedx.com/test-catalog/disorders/early-onset-epileptic-encephalopathy-andor-infantile-spasms/ | |
# Invitae: https://www.invitae.com/en/physician/tests/03402/ | |
ADSL | |
ALDH7A1 | |
ALG13 | |
ARHGEF9 | |
ARID1B | |
ARX | |
ATP1A2 | |
ATP6AP2 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#!/usr/bin/env python | |
""" | |
given a callable file from goleft depth on stdin, merge adjacent LOW_COVERAGE and CALLABLE regions. | |
also split regions that are larger than max_region. | |
this is to even out parallelism for sending regions to freebayes. | |
""" | |
from __future__ import print_function, division | |
import sys | |
from itertools import groupby | |
from operator import itemgetter |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
cat a.bed | |
chr1 10 50 10 | |
cat b.bed | |
chr1 20 40 20 | |
cat c.bed | |
chr1 30 33 30 | |
# Find the sub-intervals shared and unique to each file. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
chr1 10 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
GTTGTACTTCGTTCAATCGGTAGGTGTTTAACCGGATGGTCACGCCTACC | |
CAAGCATACTTCATTCAGTCAGGCGAAATTATTGCCAGGTCGCCGCCTAC | |
GTTGTACTTCGTTCAGTCGGTGGTGTTTAACTGGGTCATCGCCTACCGTG | |
AGTAATACTTCGTTCAGTTTGTGGAAGGTAGTGTTTAACCGGTTGCTCGC | |
GGTATGCGCTGGTCAAATCGGAGAGTGGGTGTTTATCGGATGGATCGCTG | |
CGGTGCGTGCTGTGATTCATCTTTGACTGGTGTTTATGGTCGGTCTTTAC | |
CATTGTACTTCCCGTTCAGTTTATCAAATTTGGTGTTTATATGAACCATA | |
GATGGTGGTGGTGGCGGTGGCGCTGGTGTTGCATGGTGTTGCGCATTATT | |
GTATCGCGTTCAGTTTCGGAAGGTGGTGTTTAACCAGTTCGCCGCCTACC | |
GTTTTTTCGTTCATTTGGTACGGTGTTGCGCCGGTCGCCGCCTACCGTGA |
Here's the deal.
- We have sequenced the exomes of an affected child, and their unaffected parents. The child has a rare skin condition.
- We aligned their exome data to the human reference genome.
- We called variants using GATK.
- The resulting VCF file is called
trio.trim.vep.vcf.gz
.
OlderNewer