At the moment, the best solution for running `deepbinner` (GPU) on the EBI cluster is to use a Singularity container and request a whole GPU to yourself. The reason for needing to request a whole GPU is a horrible mess of TensorFlow, CUDA, memory sharing, and my lack of a proper understanding of GPU architecture.
The image I have been using to run `deepbinner` (v0.2.0) can be pulled with

```bash
hash="4b9b1b107c50b155e6a807c181982973"
uri="shub://mbhall88/Singularity_recipes:deepbinnergpu@${hash}"
name="deepbinner.v0.2.0.gpu.simg"

singularity pull --name "$name" "$uri"
```
Obviously you can change `name` to whatever you want. Please do not execute directly from the `uri`.
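If you want to check that the pulled image works, a minimal sketch (assuming you kept the `name` above) is to exec from the local file rather than the URI:

```bash
# run deepbinner from the local image file, not the shub:// URI
singularity exec deepbinner.v0.2.0.gpu.simg deepbinner --help
```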
You only need to run the `classify` step of `deepbinner` on the GPU, as the `bin` step is all CPU. An example script to run the `classify` step with the above container might look like
```bash
#!/usr/bin/env bash
if [[ $# -ne 4 ]]; then
    echo "Error: Illegal number of parameters"
    echo "Usage: $0 <container> <fast5_dir> <output> <barcode_type>"
    exit 2
fi

container="$1"
fast5_dir="$2"
output="$3"
barcode_type="$4"  # must be 'rapid' or 'native'

# map the barcode type onto the corresponding deepbinner flag
if [[ "$barcode_type" == "rapid" ]]; then
    barcode_type="--rapid"
elif [[ "$barcode_type" == "native" ]]; then
    barcode_type="--native"
else
    echo "Error: Barcode type must be 'rapid' or 'native'"
    exit 2
fi

# --nv makes the host GPU and driver visible inside the container
singularity --silent exec --nv "$container" \
    deepbinner classify "$barcode_type" "$fast5_dir" > "$output"
```
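For reference, the four positional arguments would be supplied like this (the file and directory names here are hypothetical):

```bash
./my_deepbinner_script.sh deepbinner.v0.2.0.gpu.simg /path/to/fast5s classifications rapid
```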
And to submit this script, reserving a whole GPU, you would use something like
```bash
script="my_deepbinner_script.sh"
job="deepbinner_sample1"
log_dir="path/to/my/logs/"
memory=4000  # in my experience this is usually more than enough

# j_exclusive=yes reserves the whole GPU for this job.
# Replace <script_args> with the four arguments the script expects.
bsub -gpu "num=1:j_exclusive=yes" \
    -R "select[mem>${memory}] rusage[mem=${memory}]" \
    -M "$memory" \
    -P gpu \
    -o "${log_dir}/${job}.out" \
    -e "${log_dir}/${job}.err" \
    -J "$job" \
    "$script" <script_args>
```