Created
June 7, 2021 21:58
Shell script to sub-sample a BAM file
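# Shell function to subsample a BAM file to a fixed number of alignments.
# Requires the samtools and sambamba suites to be available on the PATH.
# See https://www.biostars.org/p/76791/
function SubSample {
  local bam=$1 count=$2 factor
  # Column 3 of `samtools idxstats` is the mapped-read count per reference.
  # Sum it and express the requested count as a fraction of the total.
  # The bounds check is done numerically in awk: [[ $FACTOR > 1 ]] compares
  # strings, so a small fraction printed as 1e-05 would be rejected by mistake.
  factor=$(samtools idxstats "$bam" | cut -f3 | \
    awk -v COUNT="$count" '{total += $1}
      END {if (total == 0 || COUNT > total) exit 1; printf "%.8f\n", COUNT / total}')
  if [[ -z $factor ]]
  then
    # Report on stderr so the message does not end up in the redirected BAM output;
    # return (not exit) so a sourced function does not close the calling shell.
    echo "[ERROR]: Requested number of reads exceeds total read count in $bam" >&2
    return 1
  fi
  # -s: fraction of reads to keep, -t: threads, -f: output format, -l: compression level
  sambamba view -s "$factor" -t 2 -f bam -l 5 "$bam"
}
# example:
SubSample original.bam 10000 > subsampled.bam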
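
The fraction passed to `sambamba view -s` is the requested read count divided by the sum of the mapped-read column of `samtools idxstats`. The sketch below isolates that arithmetic so it can be tried without a BAM file. Both `subsample_fraction` and `mock_idxstats` are hypothetical helpers introduced here for illustration, and the mock counts are made up; the check is done numerically in awk and the result is printed in fixed notation.

```shell
# Hypothetical helper: reads idxstats-formatted text on stdin and prints the
# sampling fraction for the requested number of reads (first argument).
# Exits non-zero when the request exceeds the total or the total is zero.
subsample_fraction() {
  awk -v COUNT="$1" -F'\t' '{total += $3}
    END {if (total == 0 || COUNT > total) exit 1; printf "%.8f\n", COUNT / total}'
}

# Made-up idxstats output: reference name, length, mapped reads, unmapped reads.
mock_idxstats() {
  printf 'chr1\t1000\t600\t0\n'
  printf 'chr2\t2000\t400\t0\n'
  printf '*\t0\t0\t50\n'
}

# 250 requested reads out of 600 + 400 mapped reads
mock_idxstats | subsample_fraction 250   # prints 0.25000000
```

Because sampling by fraction is probabilistic, the number of reads in the output will generally be close to the requested count rather than exactly equal to it.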