@iracooke
Last active September 20, 2019 09:29
JCU HPC FAQ

The first port of call for info on the JCU HPC system is the official wiki. This gist supplements that wiki with quick answers to common questions, and links back to it as well as to other useful resources.

This gist assumes that your local machine (ie your personal computer, not the HPC) is running a unix-like OS (macOS or Linux). Windows users should consider setting up Windows Subsystem for Linux so that they also have a unix-like operating system to work with.

What is the JCU HPC system

It is a fairly substantial collection of high performance computers. At the time of writing it consisted of 15 nodes, each with 80 CPUs and just under 400GB of memory. All the nodes are networked together so that large jobs can be distributed across multiple nodes. A range of high capacity data storage is also networked to HPC accounts, as detailed here.

How do I get an account on the JCU HPC System?

As documented in the official wiki the best way to obtain an account is to log a service-now job. When you log your job you should explain why you need access to the HPC for your research.

I have an account. How do I login?

Open a terminal window on your local computer (eg the Terminal program on macOS). Then connect using a command like this

Here jcXXXX is your jc number. If you are connecting from outside the university you might need to use port 8822 instead of the default (22). You can do this with:

ssh -p 8822 [email protected]
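If you connect often from both on and off campus, the choice of port can be wrapped in a small helper. This is only a sketch: the function name hpc_ssh_command and the host hpc.example.edu are placeholders (assumptions, not the real JCU address), so substitute the host given in the official wiki.

```shell
#!/bin/bash
# Print the ssh command for on-campus (default port 22) or
# off-campus (port 8822) use.
# HPC_HOST is a placeholder, not the real JCU address.
HPC_HOST="${HPC_HOST:-hpc.example.edu}"

hpc_ssh_command() {
    local user="$1"            # your jc number, eg jcXXXX
    local location="${2:-on}"  # "on" campus (default) or "off" campus
    if [ "$location" = "off" ]; then
        echo "ssh -p 8822 ${user}@${HPC_HOST}"
    else
        echo "ssh ${user}@${HPC_HOST}"
    fi
}

# Show the command to use from outside the university
hpc_ssh_command jcXXXX off
```

The helper only prints the command so that you can check it before running it.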

When I login I don't know what to do next?

The JCU HPC systems run Linux and you may need to familiarise yourself with some basic Linux commands to navigate the system. There are plenty of resources on the internet for learning Linux. For example, this website has an interactive beginner's guide.

For a much more comprehensive introduction to command-line bioinformatics (of which using linux is a part) you should consider enrolling in BC5203/BC3203.

How do I know if the program I want is installed on HPC?

Most software on JCU HPC is available through the module load command. To see a list of everything available you can type;

module avail
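The full listing is long, so it usually helps to filter it. The following is a minimal sketch, assuming module is the usual Environment Modules or Lmod tool (both typically print the listing to stderr, hence the 2>&1). The name blast is just an example and filter_modules is a hypothetical helper.

```shell
# Keep only the lines of a module listing (read from stdin) that match
# a name, ignoring case.
filter_modules() {
    grep -i -- "$1"
}

# Only meaningful on the HPC, where the module command exists.
if command -v module >/dev/null 2>&1; then
    module avail 2>&1 | filter_modules blast || echo "no matching module found"
fi
```

Once you have found the name you want, module load followed by that name makes the software available in your session, and module list shows what is currently loaded.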

The software I need isn't available. How do I get it installed?

First of all have a read of the official word from the HPC team on this topic. I believe they would prefer that all software is installed centrally and so the best option is to log a service now request. Describe your software in a way that will make it as easy as possible for the HPC team to install it (eg give a link to the software website).

There are other ways to install software (eg in your home directory, or with conda/miniconda). These aren't recommended by the HPC team so they should be avoided unless absolutely necessary.

I need to run my software for a long time or on many CPUs

When you type commands directly into your terminal you are running those commands interactively. In general this means that when your login session ends your commands will be terminated. A better way to run large jobs on the HPC is to formulate those commands as scripts to be submitted to the PBS Pro job scheduling system. The HPC website provides detailed instructions on how to write these job scripts.

A simple example job script is shown below

#!/bin/bash
#PBS -j oe
#PBS -N test
#PBS -l walltime=160:00:00
#PBS -l select=1:ncpus=24:mem=160gb

cd $PBS_O_WORKDIR
shopt -s expand_aliases
source /etc/profile.d/modules.sh
echo "Job identifier is $PBS_JOBID"
echo "Working directory is $PBS_O_WORKDIR"

# Write your command task here, eg blastp etc

In this example the lines at the top starting with #PBS are special PBS directives. They merge error messages into the main output file (-j oe), provide the name of the job (-N test), set the maximum time the job may run (walltime, here 160 hours) and specify the number of nodes, CPUs and amount of RAM the job will require (here one node with 24 CPUs and 160GB of memory). Be careful with these options. If you ask for resources that are not available your job might sit in the queue forever. In general if you have a large task try to break it into smaller jobs and submit each separately. This will get you greater overall usage of the machine and your jobs will move more quickly through the queue.
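One way to break a large task into smaller jobs is to generate a separate job script for each input file. The sketch below assumes the input has already been split into files named chunk_*.fasta (a hypothetical naming scheme). The resource requests and the echo standing in for the real command are illustrative only, and write_job_script is a made-up helper.

```shell
#!/bin/bash
# Write one small PBS job script per input file instead of one huge job.
write_job_script() {
    local input="$1"
    local name
    name="$(basename "$input")"
    name="${name%.*}"            # strip the file extension
    cat > "job_${name}.sh" <<EOF
#!/bin/bash
#PBS -j oe
#PBS -N ${name}
#PBS -l walltime=10:00:00
#PBS -l select=1:ncpus=4:mem=16gb

cd \$PBS_O_WORKDIR
# Replace the next line with the real command for ${input}
echo "processing ${input}"
EOF
    echo "job_${name}.sh"        # report the name of the script written
}

for f in chunk_*.fasta; do
    [ -e "$f" ] || continue      # skip if no files match the pattern
    script="$(write_job_script "$f")"
    echo "wrote $script"         # submit it with: qsub "$script"
done
```

Each generated script asks for modest resources, so the scheduler can fit the jobs into gaps as they appear rather than waiting for one large block to become free.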

To submit this to PBS, save it to a file, for example test.sh. Then run the following command to submit your job.

qsub test.sh

After the job is finished the output will be found in a file called test.o****, where **** is the job number.

You can query the current job queue with qstat.
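qsub prints the identifier of the new job, and the number at the start of that identifier is what ends up in the output file name. This is a small sketch, assuming the identifier has the usual PBS form of a number followed by a server name (eg 12345.servername, a made-up value). pbs_output_file is a hypothetical helper.

```shell
# Derive the output file name from a job name and a PBS job identifier.
pbs_output_file() {
    local name="$1"
    local jobid="$2"
    echo "${name}.o${jobid%%.*}"   # keep only the number before the first dot
}

# Only meaningful on the HPC, where qsub and qstat exist.
if command -v qsub >/dev/null 2>&1; then
    jobid="$(qsub test.sh)"        # qsub prints the job identifier
    qstat -u "$USER"               # list only your own jobs
    echo "Output will be written to $(pbs_output_file test "$jobid")"
fi
```

Limiting qstat to your own jobs with -u makes the listing much easier to read on a busy system.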

How do I copy data from my laptop to the HPC?

There are lots of ways but a simple one is the scp command. To use this command first open a terminal window on your laptop/desktop (ie local machine) and navigate to a folder that contains the data you wish to copy. Let's say you want to copy a file called test.txt from your local machine to your home directory on the HPC. You do it like this;

scp test.txt [email protected]:~/

Now if you login to the HPC and type ls you should see the file test.txt in your home directory.

How do I copy data from the HPC to my local machine?

This is the reverse of the above command. Imagine you want to copy a file called test.txt which is in your home directory on the HPC. To copy this file from the HPC to your local machine you should type the following into a terminal window (on your local machine).

scp [email protected]:~/test.txt .

Note the . at the end. This means copy to the current directory.
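Two variations come up often: copying a whole directory, and copying from outside the university. The sketch below relies on two standard scp options, -r to copy a directory recursively and -P (capital, unlike the lowercase -p used by ssh) to set the port. It assumes that port 8822 applies to scp in the same way as to ssh. The host hpc.example.edu, the directory name results and the helper hpc_download_command are placeholders.

```shell
# HPC_HOST is a placeholder, not the real JCU address.
HPC_HOST="${HPC_HOST:-hpc.example.edu}"

# Print an scp command that downloads a file or directory from your HPC
# home directory into the current directory.
# Pass "off" as the third argument to use port 8822.
hpc_download_command() {
    local user="$1"            # your jc number, eg jcXXXX
    local path="$2"            # path relative to your HPC home directory
    local location="${3:-on}"  # "on" campus (default) or "off" campus
    local port_opt=""
    if [ "$location" = "off" ]; then
        port_opt="-P 8822 "
    fi
    echo "scp -r ${port_opt}${user}@${HPC_HOST}:~/${path} ."
}

# Show the command for fetching a results directory from off campus
hpc_download_command jcXXXX results off
```

As with the single file case, the . at the end means the directory is copied into whatever folder your local terminal is currently in.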
