Last active
January 12, 2019 19:15
btcd watchdog
#!/bin/bash

# Seconds to wait after first starting btcd before polling it.
POST_INIT_SYNC_DELAY=60
# Seconds between the two block-height samples.
POLL_DELAY=60
# Evictions tolerated since the last restart before btcd itself is restarted.
STALL_THRESHOLD=5

# Start btcd if it is not already running.
if [ -z "$(pidof btcd)" ]; then
    echo "Starting btcd"
    nohup btcd &
    sleep "$POST_INIT_SYNC_DELAY"
fi

stalls=0
while true; do
    # Sample the block height twice, POLL_DELAY seconds apart.
    start=$(btcctl --notls getinfo | jq -r .blocks)
    sleep "$POLL_DELAY"
    end=$(btcctl --notls getinfo | jq -r .blocks)
    echo "Processed $((end - start)) blocks in the last $POLL_DELAY seconds"

    # No progress between the two samples counts as a stall.
    if [[ "$start" == "$end" ]]; then
        if (( stalls > STALL_THRESHOLD )); then
            echo "Too many stalls detected. Restarting btcd..."
            # Left unquoted on purpose: pidof may print several PIDs.
            kill $(pidof btcd)
            sleep 10
            nohup btcd &
            stalls=0
        else
            # Address of the current sync peer, with the ":port" suffix cut off.
            syncnode=$(btcctl --notls getpeerinfo | jq -r '.[] | select(.syncnode == true) | .addr' | cut -f1 -d:)
            if [ -z "$syncnode" ]; then
                echo "Stall detected, but no syncnode found. Restarting btcd..."
                kill $(pidof btcd)
                sleep 10
                nohup btcd &
                stalls=0
            else
                echo "Stall detected! Evicting potentially bad node $syncnode"
                btcctl --notls node disconnect "$syncnode"
                stalls=$(( stalls + 1 ))
            fi
        fi
    fi
done
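The branching inside the loop can be tried out without a running btcd by isolating it in a small function. The following is only a sketch: `next_action` is a hypothetical name that does not appear in the original script, and it prints the action the watchdog would take instead of performing it.

```shell
#!/bin/bash

# Hypothetical helper (not part of the original script): given two block-height
# samples, the current stall count, the threshold, and the sync peer address,
# print which branch of the watchdog loop would run.
next_action() {
    local start=$1 end=$2 stalls=$3 threshold=$4 syncnode=$5
    if [[ "$start" != "$end" ]]; then
        echo "ok"        # height moved: nothing to do
    elif (( stalls > threshold )); then
        echo "restart"   # too many evictions already: restart btcd
    elif [[ -z "$syncnode" ]]; then
        echo "restart"   # stalled with no sync peer to blame: restart btcd
    else
        echo "evict"     # stalled: disconnect the current sync peer
    fi
}

next_action 384672 384676 0 5 '217.23.8.80'   # prints "ok"
next_action 384672 384672 1 5 '217.23.8.80'   # prints "evict"
next_action 384672 384672 6 5 '217.23.8.80'   # prints "restart"
next_action 384672 384672 0 5 ''              # prints "restart"
```

Note that, as in the script, the stall counter is compared with `>` and is not reset when syncing resumes, so with `STALL_THRESHOLD=5` a restart happens on the first stall seen after six evictions.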
Works like a charm, thank you. In my case I only had to remove --notls.
./watchdog_btcd.sh
+ POST_INIT_SYNC_DELAY=60
+ POLL_DELAY=60
+ STALL_THRESHOLD=5
++ pidof btcd
+ '[' -z 5465 ']'
+ stalls=0
+ true
++ jq -r .blocks
++ btcctl getinfo
+ start=384672
+ sleep 60
++ btcctl getinfo
++ jq -r .blocks
+ end=384672
+ echo 'Processed 0 blocks in the last 60 seconds'
Processed 0 blocks in the last 60 seconds
+ [[ 384672 == \3\8\4\6\7\2 ]]
+ (( stalls > STALL_THRESHOLD ))
++ btcctl getpeerinfo
++ jq -r '.[] | select(.syncnode == true) | .addr'
++ cut -f1 -d:
+ syncnode=217.23.8.80
+ '[' -z 217.23.8.80 ']'
+ echo 'Stall detected! Evicting potentially bad node 217.23.8.80'
Stall detected! Evicting potentially bad node 217.23.8.80
+ btcctl node disconnect 217.23.8.80
2018-04-20 09:28:00.697 [INF] SYNC: Lost peer 217.23.8.80:8333 (outbound)
2018-04-20 09:28:00.697 [INF] SYNC: Syncing to block height 519094 from peer 83.248.113.248:8333
+ stalls=1
+ true
++ jq -r .blocks
++ btcctl getinfo
+ start=384672
+ sleep 60
2018-04-20 09:28:00.977 [INF] SYNC: New valid peer 5.15.98.67:8333 (outbound) (/Satoshi:0.16.0/)
2018-04-20 09:28:01.391 [INF] SYNC: Processed 1 block in the last 7m29.19s (2 transactions, height 384673, 2015-11-21 19:38:21 +0000 UTC)
2018-04-20 09:28:11.851 [INF] SYNC: Processed 3 blocks in the last 10.46s (1207 transactions, height 384676, 2015-11-21 19:47:05 +0000 UTC)
2018-04-20 09:28:24.364 [INF] SYNC: Processed 6 blocks in the last 12.51s (3072 transactions, height 384682, 2015-11-21 20:19:26 +0000 UTC)
2018-04-20 09:28:36.536 [INF] SYNC: Processed 2 blocks in the last 12.17s (3743 transactions, height 384684, 2015-11-21 20:55:52 +0000 UTC)
2018-04-20 09:28:52.387 [INF] SYNC: Processed 4 blocks in the last 15.85s (2171 transactions, height 384688, 2015-11-21 21:24:00 +0000 UTC)
I was having issues with the script not being able to evict stalled IPv6 hosts. It is easier to disconnect a peer by node id than by IP:
syncnode=`btcctl --notls getpeerinfo | jq -r '.[] | select(.syncnode == true) | .id'`
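A likely reason the address-based version fails for IPv6 peers is that `cut -f1 -d:` splits on the first colon, and an IPv6 address contains colons itself. The sketch below illustrates this with a made-up address, assuming peers are reported in the usual bracketed `[host]:port` form; `peer_host` is a hypothetical helper, not part of the script. Whether `btcctl node disconnect` accepts the resulting bare IPv6 host has not been verified here, which is one more reason selecting `.id` is the simpler route.

```shell
#!/bin/bash

# Hypothetical helper: strip the trailing ":port" from a peer address.
# Assumes IPv6 peers appear as "[host]:port" (an assumption, not taken
# from the gist).
peer_host() {
    local addr=$1
    local host=${addr%:*}   # drop the shortest trailing ":..." (the port)
    host=${host#\[}         # drop a leading "[" if present
    host=${host%\]}         # drop a trailing "]" if present
    printf '%s\n' "$host"
}

peer_host '217.23.8.80:8333'      # prints 217.23.8.80
peer_host '[2001:db8::1]:8333'    # prints 2001:db8::1

# What the original pipeline does to the same IPv6 address:
printf '%s\n' '[2001:db8::1]:8333' | cut -f1 -d:   # prints [2001
```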
This helped a lot
Thank you, very helpful
@Sjors, I have both btcd mainnet and testnet running. By "first one", do you mean the service that was started first of the two? It seems to be working fine for me with mainnet so far. I had testnet already synced at 100%, shut btcd down, restarted on mainnet, then resumed testnet.