@donkirkby
Last active January 7, 2016 00:09
Report processes with top CPU across a Beowulf cluster.
{ bpsh -pa ps -eo pcpu,pid,user,args ; bpsh -p -1 ps -eo pcpu,pid,user,args | bpstat -P -1 | sed -e 's/^ */ /' ; } | sort -g -k2 -r | head -30
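#!/usr/bin/env python
# A related tool to check load levels across the cluster.
# Output is: CPU count, 1-minute, 5-minute, and 15-minute load averages.
# Usage: ./display_load.py ; bpsh -sap ./display_load.py
import re
from subprocess import check_output


def main():
    # Count the "processor" entries to get the number of CPUs on this node.
    with open('/proc/cpuinfo') as f:
        processor_count = sum(1 for line in f if line.startswith('processor'))
    # universal_newlines=True returns text rather than bytes on Python 3.
    uptime = check_output(['uptime'], universal_newlines=True)
    # The last three fields of the uptime output are the load averages.
    loads = re.split(',? ', uptime.strip())[-3:]
    loads.insert(0, str(processor_count))
    print(', '.join(loads))


if __name__ == '__main__':
    main()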
@donkirkby
Author

This reports the top 30 processes across the cluster, sorted by CPU load.
For a full explanation of all the commands, see explainshell.com, but here's a summary:

  • bpsh -pa ps ... - gets a list of all the processes from each compute node in the cluster, with the node number in the first column
  • bpsh -p -1 ps ... | bpstat -P -1 | sed ... - gets a list of all the processes in the cluster, filters it down to the ones on the head node, then trims the leading spaces. bpsh -pa doesn't include the head node, so we have to call bpsh with -1 explicitly. ps on the head node also lists the processes running on the compute nodes, so we use bpstat to exclude them. bpstat adds a blank column, so we use sed to collapse the leading spaces to a single space.
  • { bpsh ... ; bpsh ... | bpstat ... | sed ... ; } - The braces and semicolons combine the output from the two commands into one, so we get a list of all the processes on the compute nodes and the head node. You might wonder why we don't just use ps ... | bpstat -P, but that doesn't include the full command line. Running ps on each compute node includes the full command line.
  • sort ... - Sort numerically by the second column, the CPU usage of each process, in descending order (-r).
  • head -30 - Return the first 30 lines, which are the 30 processes using the most CPU across the cluster.
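
To make the last two steps concrete, here is a minimal Python sketch of what sort ... | head -30 does to the combined listing. It is only an illustration, not part of the gist: the function name top_cpu and the sample lines are made up, and the sample assumes a simplified layout with the node number in column 1 and %CPU in column 2 rather than the exact output of bpsh.

```python
def top_cpu(lines, limit=30):
    """Return the `limit` lines with the highest numeric value in column 2."""
    def cpu_key(line):
        fields = line.split()
        try:
            return float(fields[1])
        except (IndexError, ValueError):
            # Lines without a number in column 2 (for example a ps header)
            # are pushed to the end of the descending sort.
            return float('-inf')
    return sorted(lines, key=cpu_key, reverse=True)[:limit]


# Hypothetical sample rows: node, %CPU, PID, user, command line.
sample = [
    "0  12.5  4321 alice ./simulate --steps 1000",
    "1  98.0  8765 bob python fit_model.py",
    "-1  3.1   222 root sshd",
    "1   0.0  8800 bob bash",
]

for line in top_cpu(sample, limit=2):
    print(line)
```

With limit=2 this prints the bob row followed by the alice row, the two highest CPU values in the sample.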
