Created July 4, 2018 13:57
Nextflow: Proof of concept example for repeated function call with expected static output
#!/usr/bin/env nextflow

/* This is a minimal working example for a bug we recently ran into.
 * In a nutshell: we call a function (gen_sample_map_str) in every
 * invocation of process GenomicsDBImport. The function has static input
 * and should produce identical output each time, but it doesn't. The
 * question is why.
 *
 * Yes, we could do this differently, i.e. construct the string once, and
 * yes, some things don't make sense in this minimalistic form. But the
 * code should in theory work nevertheless, yet it doesn't.
 */

params.publishdir = 'out'

// a sample map akin to what's used in GATK's GenomicsDB
params.sample_name_map = [
    'sample1': 's1.g.vcf.gz',
    'sample2': 's2.g.vcf.gz',
    'sample3': 's3.g.vcf.gz',
    'sample4': 's4.g.vcf.gz',
    'sample5': 's5.g.vcf.gz',
    'sample6': 's6.g.vcf.gz',
    'sample7': 's7.g.vcf.gz',
    'sample8': 's8.g.vcf.gz',
    'sample9': 's9.g.vcf.gz',
    'sample10': 's10.g.vcf.gz'
]

// fake regions
region_ch = Channel.from([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

// Constructs a sample map string from params.sample_name_map. The
// question is: why do repeated calls give different results?
def gen_sample_map_str() {
    str = ""
    params.sample_name_map.each{ k, v ->
        str += "${k}\t${v}\n"
    }
    return str;
}

// Just prints the output of gen_sample_map_str() to a file. In theory
// all files should at least have an identical number of lines, but they
// don't (see the output generated in Validate).
process GenomicsDBImport {
    publishDir params.publishdir

    input:
    val reg from region_ch

    output:
    file("${reg}.txt") into final_ch

    script:
    sample_name_map_str = gen_sample_map_str()
    """
    echo "${sample_name_map_str}" > ${reg}.txt;
    """
}

process Validate {
    input:
    file(all) from final_ch.collect()

    output:
    stdout wc_ch

    script:
    """
    wc -l ${all}
    """
}

wc_ch.subscribe { print "Line numbers should be identical but are not (forget about order):\n $it" }
Adding -process.cpus=1 or -qs 1 as arguments doesn't fix the behaviour.
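
One possible explanation, offered as an assumption rather than a confirmed diagnosis: str in gen_sample_map_str() is assigned without def, so Groovy stores it in the script binding rather than in a local variable. If Nextflow evaluates the script: blocks of several tasks concurrently, those calls would all read and overwrite the same str, which could produce strings of varying length. The sketch below is plain Groovy (no Nextflow involved); the name genSampleMapStr and the two-entry map are hypothetical stand-ins. It builds the same tab-separated string without any variable that lives outside the method.

```groovy
// Hypothetical stand-in for params.sample_name_map (two entries only).
def sampleNameMap = [sample1: 's1.g.vcf.gz', sample2: 's2.g.vcf.gz']

// Builds the tab-separated sample map without touching any variable
// outside the method, so concurrent callers have nothing to share.
String genSampleMapStr(Map sampleMap) {
    // collect() yields one "name<TAB>path" string per map entry
    def lines = sampleMap.collect { name, path -> "${name}\t${path}" }
    // join the entries and terminate the last line, as the original does
    return lines.join('\n') + '\n'
}

print genSampleMapStr(sampleNameMap)
```

If the assumption holds, the smaller change to the script above would be to declare the variable locally, i.e. def str = "" inside gen_sample_map_str(); the same reasoning would apply to sample_name_map_str in the script: block of GenomicsDBImport.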