Last active
January 7, 2016 00:09
donkirkby/a4e76a50662c0fa92a42
Report processes with top CPU across a Beowulf cluster.
{ bpsh -pa ps -eo pcpu,pid,user,args ; bpsh -p -1 ps -eo pcpu,pid,user,args | bpstat -P -1 | sed -e 's/^ */ /' ; } | sort -g -k2 -r | head -30
#!/usr/bin/env python
# A related tool to check load levels across the cluster.
# Output is cpu count, 1 min avg, 5 min avg, 15 min avg.
# ./display_load.py ; bpsh -sap ./display_load.py
import re
from subprocess import check_output


def main():
    # Count logical CPUs: /proc/cpuinfo has one 'processor' line per CPU.
    with open('/proc/cpuinfo') as f:
        processor_count = sum(1 for line in f if line.startswith('processor'))
    # universal_newlines=True returns text rather than bytes on Python 3.
    uptime = check_output(['uptime'], universal_newlines=True)
    # The last three fields of uptime are the 1, 5 and 15 minute load averages.
    loads = re.split(',? ', uptime.strip())[-3:]
    loads.insert(0, str(processor_count))
    print(', '.join(loads))


main()
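
To see what the regular expression in the script extracts, here is a minimal sketch that applies the same split to a sample `uptime` line. The sample text, the processor count of 8, and the helper name `parse_loads` are assumptions for illustration and are not part of the original script.

```python
import re


def parse_loads(uptime_text):
    """Return the 1, 5 and 15 minute load averages as strings.

    Hypothetical helper: it applies the same split as display_load.py.
    """
    # Split on a space, optionally preceded by a comma, and keep the last
    # three fields, which is where uptime prints the load averages.
    return re.split(',? ', uptime_text.strip())[-3:]


# Assumed sample output; the exact wording of uptime varies between systems.
sample = ' 00:09:12 up 10 days,  3:42,  2 users,  load average: 0.15, 0.10, 0.05\n'

# Prepend an assumed processor count, as the script does with the real one.
print(', '.join(['8'] + parse_loads(sample)))
```

For this sample the sketch prints `8, 0.15, 0.10, 0.05`, the same layout the script reports for each node.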
This reports the top 30 processes across the cluster, sorted by CPU load.
For a full explanation of all the commands, see explainshell.com, but here's a summary:
- `bpsh -pa ps ...` - gets a list of all the processes from each compute node in the cluster, with the node number in the first column.
- `bpsh -p -1 ps ... | bpstat -P -1 | sed ...` - gets a list of all the processes in the cluster, then filters to only the ones on the head node, then trims the leading spaces. `bpsh -pa` doesn't include the head node, so we have to call it with -1 explicitly. `ps aux` on the head node includes entries for all the compute nodes, so we have to use `bpstat` to exclude everything from the compute nodes. `bpstat` adds a blank column, so we use `sed` to remove the leading spaces.
- `{ bpsh ... ; bpsh ... | bpstat ... | sed ... ; }` - The braces and semicolons combine the output from the two commands into one, so we get a list of all the processes on the compute nodes and the head node. You might wonder why we don't just use `ps ... | bpstat -P`, but that doesn't include the full command line. Running `ps` on each compute node includes the full command line.
- `sort ...` - Sort by the second column, the CPU usage of each process.
- `head -30` - Return the top 30 lines, or the 30 processes that are using the most CPU across the cluster.
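
The sort and head steps can be sketched in Python, assuming the lines have already been collected in the form `node pcpu pid user args`. The function name `top_cpu` and the sample lines are hypothetical. This only illustrates the ordering the pipeline is meant to produce and does not call `bpsh` or `bpstat`.

```python
def top_cpu(lines, limit=30):
    """Return the `limit` lines with the highest CPU usage.

    Assumes the second whitespace-separated column is the CPU percentage,
    as in the combined output described above.
    """
    def cpu(line):
        fields = line.split()
        try:
            return float(fields[1])
        except (IndexError, ValueError):
            # Header lines such as '%CPU' have no numeric value; sort them last.
            return float('-inf')

    # Highest CPU first, then keep only the first `limit` entries.
    return sorted(lines, key=cpu, reverse=True)[:limit]


# Made-up sample lines in the assumed format: node, pcpu, pid, user, args.
sample = [
    '0 12.5 4021 alice python fit.py',
    '-1 3.0 811 root sshd',
    '1 99.0 5120 bob ./simulate --steps 1000',
    '0 %CPU PID USER COMMAND',
]
for line in top_cpu(sample, limit=2):
    print(line)
```

Sorting on a parsed number rather than on the raw text plays the role of `sort -g`, and slicing the sorted list stands in for `head -30`.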