Last active
February 23, 2020 20:52
Jormungandr Node Monitor
#!/bin/bash
#
# Author: Michael Fazio (sandstone.io)
#
# This script monitors a Jormungandr node for "liveness" and executes a shutdown if the node is determined
# to be "stuck". A node is "stuck" if the time elapsed since the last block exceeds the sync tolerance
# threshold. The script does NOT perform a restart on the Jormungandr node. Instead we rely on process
# managers such as systemd to perform restarts.

POLLING_INTERVAL_SECONDS=30
SYNC_TOLERANCE_SECONDS=240
REST_API="http://127.0.0.1:8443/api"

while true; do
    # Fetch node stats as JSON; errors are discarded, so a failed call leaves LAST_BLOCK empty.
    LAST_BLOCK=$(jcli rest v0 node stats get --output-format json --host "$REST_API" 2> /dev/null)
    LAST_BLOCK_HEIGHT=$(echo "$LAST_BLOCK" | jq -r .lastBlockHeight)
    LAST_BLOCK_DATE=$(echo "$LAST_BLOCK" | jq -r .lastBlockTime)
    # Convert the block date to epoch seconds; stays empty if the date is missing or cannot be parsed.
    LAST_BLOCK_TIME=$(date -d$LAST_BLOCK_DATE +%s 2> /dev/null)
    CURRENT_TIME=$(date +%s)
    DIFF_SECONDS=$((CURRENT_TIME - LAST_BLOCK_TIME))
    # An empty LAST_BLOCK_TIME evaluates to 0 in arithmetic context and falls through to the else branch.
    if ((LAST_BLOCK_TIME > 0)); then
        if ((DIFF_SECONDS > SYNC_TOLERANCE_SECONDS)); then
            echo "Jormungandr out-of-sync. Time difference of $DIFF_SECONDS seconds. Shutting down node..."
            jcli rest v0 shutdown get --host "$REST_API"
        else
            echo "Jormungandr synchronized. Time difference of $DIFF_SECONDS seconds. Last block height $LAST_BLOCK_HEIGHT."
        fi
    else
        echo "Jormungandr node is offline or bootstrapping..."
    fi
    sleep $POLLING_INTERVAL_SECONDS
done
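The liveness decision made inside the loop can also be pulled out into a small function, which makes it possible to check the logic without a running node. This is only a sketch: the function name `node_state` and its three output words are assumptions made for illustration and are not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): classify the node from
# epoch-second values, mirroring the checks made inside the monitoring loop.
#   $1 = time of the last block (empty or 0 if unknown)
#   $2 = current time
#   $3 = sync tolerance in seconds
node_state() {
    local last_block_time=${1:-0}
    local now=$2
    local tolerance=$3
    if [ "$last_block_time" -le 0 ]; then
        echo "offline"      # node offline or still bootstrapping
    elif [ $((now - last_block_time)) -gt "$tolerance" ]; then
        echo "stuck"        # last block is older than the tolerance
    else
        echo "synced"
    fi
}
```

Inside the loop it could be called as `node_state "$LAST_BLOCK_TIME" "$(date +%s)" "$SYNC_TOLERANCE_SECONDS"`, with the shutdown issued only when the result is `stuck`.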
Hey. There really isn’t anything special required to run this as a service. If you can run it manually then it should run just fine as a service also.
Hi, I can now run the script as a Linux service as well. I solved it by adding the path to jcli in the script (my changes are highlighted in yellow):
You'd be better off with this:
JCLI="$(which jcli)"
[ -z "${JCLI}" ] && [ -f jcli ] && JCLI="./jcli"
It checks the binaries in your $PATH and, if that fails, looks in the current directory.
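A variant of the same lookup, sketched as a function. It uses the shell built-in `command -v` instead of `which`, requires the fallback `./jcli` to be executable (`-x` rather than `-f`), and returns non-zero when nothing is found. The function name `find_jcli` is an assumption made for illustration. An explicit failure is useful here because the monitor script sends jcli's errors to /dev/null, so a missing binary otherwise looks the same as a node that is offline, and a service manager may start the script with a different PATH than your login shell.

```shell
#!/bin/bash
# Hypothetical helper: print the path of a usable jcli binary.
# Looks in $PATH first, then falls back to ./jcli in the current directory.
find_jcli() {
    if command -v jcli > /dev/null 2>&1; then
        command -v jcli
    elif [ -x ./jcli ]; then
        echo "./jcli"
    else
        return 1    # nothing found; let the caller decide how to fail
    fi
}
```

In the monitor script this might be used as `JCLI=$(find_jcli) || { echo "jcli not found" >&2; exit 1; }`, with later calls written as `"$JCLI" rest v0 ...`.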
Hi Michael,
Great job on the script. I tested it on my node and it works fine when I run the script.
Because I connect to my server over SSH and can't keep the SSH connection open all the time, I created a service that starts the script. But when I check the status of my service, I always see this message: "Jormungandr node is offline or bootstrapping..." What should I change in order to run your script as a Linux service?
Thank you.