Last active
December 10, 2019 03:33
a collection of handy scripts
# handy scripts for bioinformatics
# A collection of scripts that I find useful.
## convert bam to cram format
# define reference genome (required for cram format)
# GENOME="/data/publicData/genomes/human/GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna" # hg38
GENOME="/data/publicData/genomes/human/GRCh37/hs37d5.fa" # GRCh37 / hg19 (hs37d5)
# find all bam files under the current dir (including subdirs) and convert to cram
# note: -P 36 runs 36 jobs at once, each with -@ 4 extra threads; lower -P on smaller machines
find . -name "*.bam" | sed "s/\.bam$//" | xargs -I {} -P 36 samtools view -@ 4 -T "$GENOME" -C -o {}.cram {}.bam
# index the resulting cram files
find . -name "*.cram" | xargs -I {} -P 36 samtools index -@ 4 {}
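## per-file variant with a dry-run switch (sketch)
# The one-liners above convert everything in one pass. The sketch below wraps the
# same samtools calls in two small functions, so a single file can be converted,
# existing cram files are skipped, and the command can be previewed first.
# The function names (cram_path_for, bam_to_cram) and the DRY_RUN variable are
# made up for illustration and are not part of the original script.

```shell
# Hypothetical helper: map a .bam path to the matching .cram path.
cram_path_for() {
  printf '%s\n' "${1%.bam}.cram"
}

# Hypothetical helper: convert one bam to cram and index it.
# Skips the file if the cram already exists.
# With DRY_RUN=1 it only prints the samtools command it would run.
bam_to_cram() {
  local bam="$1"
  local genome="$2"
  local cram
  cram="$(cram_path_for "$bam")"

  if [ -e "$cram" ]; then
    echo "skip: $cram already exists" >&2
    return 0
  fi

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "samtools view -@ 4 -T $genome -C -o $cram $bam"
  else
    samtools view -@ 4 -T "$genome" -C -o "$cram" "$bam" \
      && samtools index -@ 4 "$cram"
  fi
}
```

# usage: preview first, then run for real
#   DRY_RUN=1 bam_to_cram sample.bam "$GENOME"
#   bam_to_cram sample.bam "$GENOME"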