Last active
June 22, 2016 18:13
-
-
Save BrunoGrandePhD/f21c37deb9de4d0af70bb25877df1a35 to your computer and use it in GitHub Desktop.
Command for calculating average coverage for genes in a targeted sequencing experiment
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# Input: | |
# - exons.bed: Exons BED file with the fourth column containing the gene name for gene coverage summarizing | |
# - sample.bam: BAM file using the same genome build (aka chromosome names and coordinate system) as exons.bed | |
# Output: | |
# - Tab-delimited file with two columns: gene name and average coverage | |
samtools bedcov exons.bed sample.bam \ | |
| awk ' \ | |
BEGIN { \ | |
FS=OFS="\t"} \ | |
{ \ | |
exonbases=$5; \ | |
exonlen=$3-$2+1; \ | |
genebases[$4] += exonbases; \ | |
genelen[$4] += exonlen} \ | |
END { \ | |
for (gene in genebases) { | |
genecov = genebases[gene]/genelen[gene]; \ | |
print gene,genecov}}' |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment