@ryanlayer
Last active December 18, 2020 16:05
LUMPY and CNVnator
WIN=100
SAMPLE="NA12878"
SAMPLE_BAM="NA12878_S1.bam"
cnvnator -root $SAMPLE.$WIN.root -genome GRCh37 -tree $SAMPLE_BAM
cnvnator -genome GRCh37 -root $SAMPLE.$WIN.root -his $WIN -d /shared/genomes/b37/full/chroms
cnvnator -root $SAMPLE.$WIN.root -stat $WIN
cnvnator -root $SAMPLE.$WIN.root -partition $WIN
cnvnator -root $SAMPLE.$WIN.root -call $WIN > $SAMPLE.$WIN.cnvcalls.txt
~/src/lumpy-sv/scripts/cnvanator_to_bedpes.py \
-c $SAMPLE.$WIN.cnvcalls.txt \
-b 600 \
--del_o del.$WIN.bedpe \
--dup_o dup.$WIN.bedpe
#### DEPENDING ON WHETHER YOUR BAMS USE chr1 OR 1, YOU MAY NOT NEED THIS STEP
sed -e "s/chr//g" del.$WIN.bedpe > del.$WIN.nochr.bedpe
sed -e "s/chr//g" dup.$WIN.bedpe > dup.$WIN.nochr.bedpe
~/src/lumpy-sv/scripts/bedpe_sort.py \
-b del.$WIN.nochr.bedpe \
-g ~/scratch/cnvnator/genome.txt \
> del.$WIN.nochr.posSorted.bedpe
~/src/lumpy-sv/scripts/bedpe_sort.py \
-b dup.$WIN.nochr.bedpe \
-g ~/scratch/cnvnator/genome.txt \
> dup.$WIN.nochr.posSorted.bedpe
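# PEBAM, SRBAM, HISTO, MEAN, STD, and Z are not defined in this gist and must be set before this step
~/src/lumpy-sv/bin/lumpy \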
-mw 4 \
-tt 1.0 \
-pe bam_file:$PEBAM,histo_file:$HISTO,mean:$MEAN,stdev:$STD,read_length:100,min_non_overlap:100,discordant_z:$Z,back_distance:20,weight:1,id:PE,min_mapping_threshold:10 \
-sr bam_file:$SRBAM,back_distance:20,min_mapping_threshold:10,weight:1,id:SR,min_clip:20 \
-bedpe bedpe_file:dup.$WIN.nochr.posSorted.bedpe,weight:3,id:DUP \
-bedpe bedpe_file:del.$WIN.nochr.posSorted.bedpe,weight:3,id:DEL
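
Whether the `sed` step that strips `chr` is needed depends on the contig names in the BAM header. A minimal sketch for checking this, assuming `samtools` is on the PATH; the helper name `has_chr_prefix` is hypothetical and is not part of LUMPY or CNVnator:

```shell
# Hypothetical helper: reads a SAM header on stdin and succeeds (exit 0)
# if any @SQ line names a contig that starts with "chr".
has_chr_prefix() {
    grep '^@SQ' | grep -q 'SN:chr'
}

# Usage (assumes samtools is installed and SAMPLE_BAM is set as in the script above):
# if samtools view -H "$SAMPLE_BAM" | has_chr_prefix; then
#     echo "BAM contigs are chr-prefixed (e.g. chr1)"
# else
#     echo "BAM contigs are unprefixed (e.g. 1)"
# fi
```

The contig names in the BEDPE files passed to `-bedpe` should follow the same convention as the BAM header.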
@flashton2003

Thank you for this! Life saver :-)

@rbatorsky

How would one incorporate tumor-normal pairs into this workflow? I've tried the following:

$LUMPY -mw 4 \
-tt 1.0 \
-pe id:normal,bam_file:$NORMAL_PEBAM,histo_file:$NORMAL_HISTO,mean:$NORMAL_MEAN,stdev:$NORMAL_STD,read_length:100,min_non_overlap:100,discordant_z:$Z,back_distance:20,weight:1,id:PE,min_mapping_threshold:10 \
-sr id:normal,bam_file:$NORMAL_SRBAM,back_distance:20,min_mapping_threshold:10,weight:1,id:SR,min_clip:20 \
-pe id:tumor,bam_file:$TUMOR_PEBAM,histo_file:$TUMOR_HISTO,mean:$TUMOR_MEAN,stdev:$TUMOR_STD,read_length:100,min_non_overlap:100,discordant_z:$Z,back_distance:20,weight:1,id:PE,min_mapping_threshold:10 \
-sr id:tumor,bam_file:$TUMOR_SRBAM,back_distance:20,min_mapping_threshold:10,weight:1,id:SR,min_clip:20 \
-bedpe id:tumor_dup,bedpe_file:${setdir}/dup.$WIN.tumor.bedpe,weight:4 \
-bedpe id:tumor_del,bedpe_file:${setdir}/del.$WIN.tumor.bedpe,weight:4 > tumornormal.vcf

But my vcf header contains fields:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT PE SR tumor_dup tumor_del

I expected the tumor and normal samples to have separate fields in the header, as they did when I ran lumpy_express. Does the VCF contain information for both samples?

Thanks for your help.
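
Each `-pe` and `-sr` option string in the command above contains two `id:` fields (for example `id:normal` and, later, `id:PE`). The `PE` and `SR` columns in the header are consistent with the later value taking effect, although that behavior is not confirmed anywhere in this thread. A small sketch for spotting repeated keys in such an option string before running LUMPY; `dup_keys` is a hypothetical helper name, and the option string in the example is shortened for illustration:

```shell
# Hypothetical helper: print every key that occurs more than once in a
# comma-separated key:value option string such as those passed to -pe / -sr.
dup_keys() {
    printf '%s\n' "$1" | tr ',' '\n' | cut -d: -f1 | sort | uniq -d
}

dup_keys "id:normal,bam_file:normal.bam,weight:1,id:PE,min_mapping_threshold:10"
# prints: id
```

The helper splits only on commas, so it would misreport keys if a file path itself contained a comma.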

@nitinra

nitinra commented Dec 18, 2020

What format should the genome file be in for the bedpe_sort.py script? I tried using the reference genome in fasta format and it returned a blank file.
./bedpe_sort.py -b del.sample.nochr.bedpe -g reference.fasta > del.sample.nochr.posSorted.bedpe
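
The reply to this question is not preserved in the thread. One plausible reading, given the `genome.txt` name used in the gist, is that the script expects a tab-delimited text file of chromosome names and lengths (the style used by bedtools genome files) rather than a FASTA; treat that as an assumption to check against `bedpe_sort.py` itself. Under that assumption, such a file can be derived from the FASTA index written by `samtools faidx`; `fai_to_genome` is a hypothetical helper name:

```shell
# Hypothetical helper: turn a FASTA index (.fai) into "name<TAB>length" lines.
# The first two columns of a .fai file are the sequence name and its length.
fai_to_genome() {
    cut -f1,2 "$1"
}

# Usage (assumes samtools is installed; file names are placeholders):
# samtools faidx reference.fasta          # writes reference.fasta.fai
# fai_to_genome reference.fasta.fai > genome.txt
```

The order of lines in the resulting file follows the order of sequences in the FASTA.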

@ryanlayer
Author

ryanlayer commented Dec 18, 2020 via email
