task name {
String in
command {
echo '${in}'
}
output {
String out = read_string(stdout())
}
}
workflow w {
call name
}
Use 2-space indentation with opening braces on the same line, like this:
task test {
  String prefix
  command {
    ./my_script > ${prefix}.first
    python other_script.py > ${prefix}.second
  }
  output {
    File first = "${prefix}.first"
    File second = "${prefix}.second"
  }
}
Cromwell gives you a writeable directory as your CWD on every backend. Write to relative paths inside it, not to absolute paths:
task test {
command {
./do_work > do_this
./do_work2 > subdir/do_this
./do_work3 > /etc/do_not_do_this
}
}
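Relative paths also work in the output section, since they resolve against the same working directory. A minimal sketch, assuming the draft-2 glob() function; the task and file names are made up for illustration, and the subdirectory is created explicitly because it will not exist beforehand:

```wdl
task relative_outputs {
  command {
    # Create the subdirectory inside the task's working directory first
    mkdir -p subdir
    echo 'top' > result.txt
    echo 'nested' > subdir/nested.txt
  }
  output {
    # Both paths are relative to the CWD that Cromwell provides
    File top = "result.txt"
    Array[File] nested = glob("subdir/*.txt")
  }
}
```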
task bad {
File f
command {
java -jar /usr/lib/library.jar -Dinput=${f}
}
}
Instead:
task good {
File f
File jar
command {
java -jar ${jar} -Dinput=${f}
}
}
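One way to supply the jar is to declare it at the workflow level and pass it into the call, so the path lives in the inputs JSON rather than in the task. This is only a sketch: the workflow name run_good and the declarations tool_jar and sample are hypothetical.

```wdl
task good {
  File f
  File jar
  command {
    java -jar ${jar} -Dinput=${f}
  }
}

workflow run_good {
  File tool_jar  # hypothetical: set in the inputs JSON as "run_good.tool_jar"
  File sample    # hypothetical input file

  call good {
    input: f = sample, jar = tool_jar
  }
}
```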
If the task is only ever expected to run in the context of a Docker container, then it is okay to reference absolute paths:
task okay {
File f
command {
java -jar /usr/lib/library.jar -Dinput=${f}
}
runtime {
docker: "broadinstitute:some-image"
}
}
If you find yourself wanting to do a small data transformation on one of the inputs, or wanting an if statement, a random number, or any other piece of logic, encapsulate it in the task's command:
task example {
String s
File f
Boolean b
command <<<
if [ "${b}" != 'true' ]; then
var="first"
else
var="second"
fi
echo $var ${f} ${s}
#java -jar picard.jar ...
>>>
output {
String o = read_string(stdout())
}
}
If you're more familiar with Python:
task example {
File f
String s
command {
# Probably not a good idea in practice...
pip install my_module
python <<CODE
import my_module
if 'xyz' in '${s}'.split(','):
    floating_point_result = my_module.my_method('${f}')
else:
    floating_point_result = my_module.my_method2('${f}')
print(floating_point_result)
CODE
}
}
I find it easier to debug the inputs, outputs, and commands of a WDL file if they're not run within Docker containers, even if that means temporarily using absolute paths:
task example {
String s
File f
Boolean b
command <<<
if [ "${b}" != 'true' ]; then
var="first"
else
var="second"
fi
echo $var ${f} ${s}
#java -jar picard.jar ...
>>>
output {
String o = read_string(stdout())
}
}
workflow w {
call example
}
Inputs (template generated with java -jar cromwell.jar inputs example.wdl):
{
"w.example.s": "foobar",
"w.example.f": "example.inputs",
"w.example.b": true
}
With these inputs, the command is instantiated as:
if [ "true" != 'true' ]; then
var="first"
else
var="second"
fi
echo $var /Users/sfrazer/projects/cromwell/cromwell-executions/w/4ede86f2-52ea-4db5-a3dc-68467f264eb7/call-example/Users/sfrazer/projects/cromwell/example.inputs foobar
#java -jar picard.jar ...
The output includes the command, which you can copy and paste to run it manually:
"/bin/bash" "-c" "cat cromwell-executions/w/4ede86f2-52ea-4db5-a3dc-68467f264eb7/call-example/script | /bin/bash <&0"
WDL allows compound types like Array[String] or Map[String, Int] or Array[Array[String]]. There are two ways to get these data types into a form that the command can use:
- Serialization by concatenation (only for Array)
- Serialization by write-to-file
task example {
Array[String] array
Map[String, File] map
Array[Array[Int]] matrix
command {
echo ${sep=',' array}
cat ${write_lines(array)}
python script.py --map=${write_map(map)}
python process.py ${write_tsv(matrix)}
}
}
workflow test {
call example
}
{
"test.example.array": ["a", "b", "c"],
"test.example.map": {
"key0": "/path/to/file0",
"key1": "/path/to/file1",
"key2": "/path/to/file2"
},
"test.example.matrix": [
[0, 1, 2],
[3, 4, 5],
[6, 7, 8]
]
}
Produces this command:
echo a,b,c
cat /tmp/array.txt
python script.py --map=/tmp/map.txt
python process.py /tmp/matrix.txt
/tmp/array.txt would contain:
a
b
c
/tmp/map.txt would contain (tab-separated):
key0 /path/to/file0
key1 /path/to/file1
key2 /path/to/file2
/tmp/matrix.txt would contain (tab-separated):
0 1 2
3 4 5
6 7 8
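Serialization by concatenation is also handy for repeating a flag once per array element. A minimal sketch, assuming a hypothetical tool that accepts one -I flag per input; the task name, the jar, and the flag are illustrative and not taken from a real tool:

```wdl
task repeat_flag {
  Array[File] inputs_to_merge  # hypothetical input name
  File jar                     # hypothetical tool jar

  command {
    # sep=' -I ' joins the elements, so two files a and b become: -I a -I b
    java -jar ${jar} -I ${sep=' -I ' inputs_to_merge}
  }
}
```

Note that an empty array would leave a dangling -I on the command line, so this form only suits arrays that are known to be non-empty.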
Use read_* functions to go from files output by your command to WDL values:
task example {
command {
echo 'first' > file
echo 'second' >> file
echo 'third' >> file
}
output {
Array[String] out = read_lines("file")
}
}
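The same pattern applies to other output types. A sketch, assuming the draft-2 standard library functions read_int, read_map, and read_tsv; the task name and file names are arbitrary:

```wdl
task read_examples {
  command {
    # A single integer on one line
    echo 42 > count.txt
    # Two tab-separated columns: key, value
    printf 'key0\tvalue0\nkey1\tvalue1\n' > map.tsv
    # A tab-separated table
    printf '0\t1\n2\t3\n' > matrix.tsv
  }
  output {
    Int count = read_int("count.txt")
    Map[String, String] map = read_map("map.tsv")
    Array[Array[String]] matrix = read_tsv("matrix.tsv")
  }
}
```

read_tsv returns strings, so numeric columns still need converting downstream if the workflow requires Int values.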