At the moment, the best solution for running `deepbinner` (GPU) on the EBI cluster is to use a Singularity container and request a whole GPU to yourself. The reason for needing to request a whole GPU is a horrible mess of TensorFlow, CUDA, memory sharing, and my lack of a proper understanding of GPU architecture.
The image I have been using to run `deepbinner` (v0.2.0) can be pulled with

```bash
hash="4b9b1b107c50b155e6a807c181982973"
uri="shub://mbhall88/Singularity_recipes:deepbinnergpu@${hash}"
name="deepbinner.v0.2.0.gpu.simg"

singularity pull --name "$name" "$uri"
```
Obviously you can change `name` to whatever you want. Please do not execute directly from the `uri`.
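If you want to check that the pulled image works, a minimal sketch (assuming you kept the `name` above) is to exec from the local file rather than the URI:

```bash
# run deepbinner from the local image file, not the shub:// URI
singularity exec deepbinner.v0.2.0.gpu.simg deepbinner --help
```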
You only need to run the `classify` step of `deepbinner` on the GPU, as the `bin` step is all CPU. An example script to run the `classify` step with the above container might look like
```bash
#!/usr/bin/env bash
if [[ $# -ne 4 ]]; then
    echo "Error: Illegal number of parameters"
    echo "Usage: $0 <container> <fast5_dir> <output> <barcode_type>"
    exit 2
fi

container="$1"
fast5_dir="$2"
output="$3"
barcode_type="$4"  # must be 'rapid' or 'native'

# map the barcode type onto the corresponding deepbinner flag
if [[ "$barcode_type" == "rapid" ]]; then
    barcode_type="--rapid"
elif [[ "$barcode_type" == "native" ]]; then
    barcode_type="--native"
else
    echo "Error: Barcode type must be 'rapid' or 'native'"
    exit 2
fi

# --nv makes the host GPU and driver visible inside the container
singularity --silent exec --nv "$container" \
    deepbinner classify "$barcode_type" "$fast5_dir" > "$output"
```
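For reference, the four positional arguments would be supplied like this (the file and directory names here are hypothetical):

```bash
./my_deepbinner_script.sh deepbinner.v0.2.0.gpu.simg /path/to/fast5s classifications rapid
```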
And to submit this script, reserving a whole GPU, you would use something like
```bash
script="my_deepbinner_script.sh"
job="deepbinner_sample1"
log_dir="path/to/my/logs/"
memory=4000  # in my experience this is usually more than enough

# j_exclusive=yes reserves the whole GPU for this job.
# Replace <script_args> with the four arguments the script expects.
bsub -gpu "num=1:j_exclusive=yes" \
    -R "select[mem>${memory}] rusage[mem=${memory}]" \
    -M "$memory" \
    -P gpu \
    -o "${log_dir}/${job}.out" \
    -e "${log_dir}/${job}.err" \
    -J "$job" \
    "$script" <script_args>
```