Skip to content

Instantly share code, notes, and snippets.

@BrunoGrandePhD
Last active June 22, 2016 18:13
Show Gist options
  • Save BrunoGrandePhD/f21c37deb9de4d0af70bb25877df1a35 to your computer and use it in GitHub Desktop.
Save BrunoGrandePhD/f21c37deb9de4d0af70bb25877df1a35 to your computer and use it in GitHub Desktop.
Command for calculating average coverage for genes in a targeted sequencing experiment
# Input:
# - exons.bed: Exons BED file with the fourth column containing the gene name for gene coverage summarizing
# - sample.bam: BAM file using the same genome build (aka chromosome names and coordinate system) as exons.bed
# Output:
# - Tab-delimited file with two columns: gene name and average coverage
samtools bedcov exons.bed sample.bam \
| awk ' \
BEGIN { \
FS=OFS="\t"} \
{ \
exonbases=$5; \
exonlen=$3-$2+1; \
genebases[$4] += exonbases; \
genelen[$4] += exonlen} \
END { \
for (gene in genebases) {
genecov = genebases[gene]/genelen[gene]; \
print gene,genecov}}'
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment