@lindenb
Created July 14, 2025 11:47
using nextflow multiMap to split a stream so the input can fit the design of an existing module.
/**
You want to call variants on some BAM files, over several chromosomes.
So you would expect a process CALL_VARIANT to have the following input:
```
process CALL_VARIANT {
input:
tuple val(meta),path(bam),path(bai),val(contig)
```
Unfortunately, the available nf-core module was designed with two separate inputs:
```
process CALL_VARIANT {
input:
tuple val(meta1),val(contig)
tuple val(meta ),path(bam),path(bai)
```
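The operator `multiMap` is here to help: it splits one channel into several named channels and emits the items in the same order on each of them, so the two inputs of the module stay paired.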
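For instance, a minimal sketch of `multiMap` on a toy channel (the names `nums`, `small` and `big` are made up for this illustration and are not part of the workflow below): each incoming item is emitted once on every named output, so the N-th item of `small` matches the N-th item of `big`.
```
nums = Channel.of(1,2,3)
.multiMap{v->
small: v
big: v*10
}
nums.small.view{"small: ${it}"}
nums.big.view{"big: ${it}"}
```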
Expected output of the workflow below (the order of the lines and of the VCFs within a line may vary):
> invoke bcftools merge with S3.bam.chr1.vcf + S1.bam.chr1.vcf + S2.bam.chr1.vcf
> invoke bcftools merge with S1.bam.chr2.vcf + S3.bam.chr2.vcf + S2.bam.chr2.vcf
> invoke bcftools merge with S3.bam.chrY.vcf
*/
workflow {
/** the list of bams : meta, bam, bai */
bams = Channel.of(
[ [id:"S1",sex:"XX"], "S1.bam", "S1.bam.bai"],
[ [id:"S2",sex:"XX"], "S2.bam", "S2.bam.bai"],
[ [id:"S3",sex:"XY"], "S3.bam", "S3.bam.bai"]
)
.map{[it[0],file(it[1]),file(it[2])]}
/* the regions I want to call */
contigs = Channel.of("chr1","chr2","chrY")
ch = bams.combine(contigs)
/* do not call females on chrY */
.filter{!(it[0].sex.equals("XX") && it[3].equals("chrY"))}
/* add the contig to meta so we can later group the VCFs by contig; build a new tuple instead of modifying the incoming one */
.map{meta,bam,bai,contig->[meta.plus(contig:contig),bam,bai,contig]}
/* split the input with multiMap */
.multiMap{
contig: [[id:it[3]],it[3]]
bam: it[0..2]
}
/* call the variants */
CALL_VARIANT_IN_CONTIG(ch.contig, ch.bam)
/* each item is [meta, list of VCFs for one contig] */
CALL_VARIANT_IN_CONTIG.out.vcf.view{row->"invoke bcftools merge with ${row[1].collect{vcf->vcf.name}.join(" + ")}"}
}
/* just testing if it works with a sub-workflow */
workflow CALL_VARIANT_IN_CONTIG {
take:
contig
bam
main:
CALL_VARIANT(contig,bam)
/* group by contig and invoke a tool to merge by contig */
vcf = CALL_VARIANT.out.vcf
.map{[ [id:it[0].contig],it[1]]}
.groupTuple()
emit:
vcf
}
process CALL_VARIANT {
input:
tuple val(meta1),val(contig)
tuple val(meta ),path(bam),path(bai)
output:
tuple val(meta),path("*.vcf"),emit:vcf
script:
"""
echo "call ${bam} at ${contig}" > ${bam.name}.${contig}.vcf
"""
}