#!/bin/bash
# Watchdog for btcd: evicts the sync peer, or restarts btcd, when block sync stalls.

POST_INIT_SYNC_DELAY=60   # seconds to wait after starting btcd
POLL_DELAY=60             # seconds between block-height samples
STALL_THRESHOLD=5         # peer evictions allowed before btcd is restarted instead

# Start btcd if it is not already running.
if [ -z "$(pidof btcd)" ]; then
    echo "Starting btcd"
    nohup btcd &
    sleep $POST_INIT_SYNC_DELAY
fi

stalls=0
while true; do
    # Sample the block height twice, POLL_DELAY seconds apart.
    start=$(btcctl --notls getinfo | jq -r .blocks)
    sleep $POLL_DELAY
    end=$(btcctl --notls getinfo | jq -r .blocks)
    echo "Processed $((end - start)) blocks in the last $POLL_DELAY seconds"
    if [[ "$start" == "$end" ]]; then
        if (( stalls > STALL_THRESHOLD )); then
            echo "Too many stalls detected. Restarting btcd..."
            kill $(pidof btcd)
            sleep 10
            nohup btcd &
            stalls=0
        else
            # Address (without the port) of the peer btcd is currently syncing from.
            syncnode=$(btcctl --notls getpeerinfo | jq -r '.[] | select(.syncnode == true) | .addr' | cut -f1 -d:)
            if [ -z "$syncnode" ]; then
                echo "Stall detected, but no syncnode found. Restarting btcd..."
                kill $(pidof btcd)
                sleep 10
                nohup btcd &
                stalls=0
            else
                echo "Stall detected! Evicting potentially bad node $syncnode"
                btcctl --notls node disconnect "$syncnode"
                stalls=$(( stalls + 1 ))
            fi
        fi
    fi
done
thanks, very useful! I've been trying to sync my btcd for three days now. Hopefully with the watchdog it will now work without interruptions.
This is working very well for me; thanks for posting
@Sjors. I have both btcd mainnet and testnet running. By first one do you mean the service started first of the two? It seems to be working fine for me with mainnet so far. I had testnet already synced at 100%, shut btcd down, restarted on mainnet then resumed testnet
Works like a charm, thank you. In my case I only had to remove --notls.
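For setups like the one above, where --notls has to go, the flags could be kept in one place instead of being edited on every btcctl line. This is only a minimal sketch: BTCCTL_FLAGS and btcctl_cmd are hypothetical names that do not appear in the original script, and it assumes the flags contain no embedded spaces.

```shell
#!/bin/bash
# BTCCTL_FLAGS is a hypothetical environment variable (not in the original script).
# It defaults to --notls; set it to an empty string to run btcctl with no extra flags.
BTCCTL_FLAGS=${BTCCTL_FLAGS-"--notls"}

# Split the flag string on whitespace into an array.
read -r -a btcctl_flags <<< "$BTCCTL_FLAGS"

# Wrapper (hypothetical helper) that prepends the configured flags to every call.
btcctl_cmd() {
    btcctl "${btcctl_flags[@]}" "$@"
}
```

With this in place the watchdog would call `btcctl_cmd getinfo` rather than `btcctl --notls getinfo`, and a TLS setup would start it as `BTCCTL_FLAGS= ./watchdog_btcd.sh`.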
./watchdog_btcd.sh
+ POST_INIT_SYNC_DELAY=60
+ POLL_DELAY=60
+ STALL_THRESHOLD=5
++ pidof btcd
+ '[' -z 5465 ']'
+ stalls=0
+ true
++ jq -r .blocks
++ btcctl getinfo
+ start=384672
+ sleep 60
++ btcctl getinfo
++ jq -r .blocks
+ end=384672
+ echo 'Processed 0 blocks in the last 60 seconds'
Processed 0 blocks in the last 60 seconds
+ [[ 384672 == \3\8\4\6\7\2 ]]
+ (( stalls > STALL_THRESHOLD ))
++ btcctl getpeerinfo
++ jq -r '.[] | select(.syncnode == true) | .addr'
++ cut -f1 -d:
+ syncnode=217.23.8.80
+ '[' -z 217.23.8.80 ']'
+ echo 'Stall detected! Evicting potentially bad node 217.23.8.80'
Stall detected! Evicting potentially bad node 217.23.8.80
+ btcctl node disconnect 217.23.8.80
2018-04-20 09:28:00.697 [INF] SYNC: Lost peer 217.23.8.80:8333 (outbound)
2018-04-20 09:28:00.697 [INF] SYNC: Syncing to block height 519094 from peer 83.248.113.248:8333
+ stalls=1
+ true
++ jq -r .blocks
++ btcctl getinfo
+ start=384672
+ sleep 60
2018-04-20 09:28:00.977 [INF] SYNC: New valid peer 5.15.98.67:8333 (outbound) (/Satoshi:0.16.0/)
2018-04-20 09:28:01.391 [INF] SYNC: Processed 1 block in the last 7m29.19s (2 transactions, height 384673, 2015-11-21 19:38:21 +0000 UTC)
2018-04-20 09:28:11.851 [INF] SYNC: Processed 3 blocks in the last 10.46s (1207 transactions, height 384676, 2015-11-21 19:47:05 +0000 UTC)
2018-04-20 09:28:24.364 [INF] SYNC: Processed 6 blocks in the last 12.51s (3072 transactions, height 384682, 2015-11-21 20:19:26 +0000 UTC)
2018-04-20 09:28:36.536 [INF] SYNC: Processed 2 blocks in the last 12.17s (3743 transactions, height 384684, 2015-11-21 20:55:52 +0000 UTC)
2018-04-20 09:28:52.387 [INF] SYNC: Processed 4 blocks in the last 15.85s (2171 transactions, height 384688, 2015-11-21 21:24:00 +0000 UTC)
I was having issues with the script failing to ban stalled IPv6 hosts. It is easier to ban by node ID than by IP.
syncnode=`btcctl --notls getpeerinfo | jq -r '.[] | select(.syncnode == true) | .id'`
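A likely reason for the IPv6 trouble, offered as an assumption rather than something confirmed in this thread: the script strips the port with `cut -f1 -d:`, which cuts at the first colon, and an IPv6 address is full of colons. The sketch below shows the difference using a made-up documentation address and a hypothetical helper, peer_host, that removes only the trailing port. Whether btcctl accepts the bracketed form has not been checked here, so disconnecting by ID as suggested above remains the simpler route.

```shell
#!/bin/bash
# peer_host is a hypothetical helper: it drops only the final ":port" from an
# address in the form used by the "addr" field, leaving the host part intact.
peer_host() {
    local addr=$1
    printf '%s\n' "${addr%:*}"
}

# What the original pipeline does to an IPv6 peer (example address, not a real peer):
printf '%s\n' "[2001:db8::1]:8333" | cut -f1 -d:   # prints [2001

# Removing only the trailing port keeps the host whole:
peer_host "[2001:db8::1]:8333"                     # prints [2001:db8::1]
peer_host "217.23.8.80:8333"                       # prints 217.23.8.80
```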
This helped a lot
Thank you, very helpful
This also won't work if you have multiple instances of btcd running, e.g. one for testnet and one for mainnet, because pidof btcd will just pick the first one.
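One way around this, sketched here under assumptions rather than tested against btcd: pidof looks processes up by name only, so it cannot single out one instance, but the watchdog can remember the PID of the process it started itself. PIDFILE and the three function names are hypothetical, and the sketch discards the daemon's output, which the original script does not do.

```shell
#!/bin/bash
# PIDFILE is a hypothetical location; use a different file per watched instance.
PIDFILE=${PIDFILE:-/tmp/watchdog_btcd.pid}

# Start the given command in the background and record its PID.
start_daemon() {
    nohup "$@" >/dev/null 2>&1 &
    echo $! > "$PIDFILE"
}

# Succeed only if the recorded PID still belongs to a live process.
daemon_running() {
    [ -s "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

# Stop only the recorded process, leaving any other instance untouched.
stop_daemon() {
    local pid
    if daemon_running; then
        pid=$(cat "$PIDFILE")
        kill "$pid"
        # Reap the process if it is a child of this shell; harmless otherwise.
        wait "$pid" 2>/dev/null
    fi
    rm -f "$PIDFILE"
}
```

The testnet watchdog would then call `start_daemon btcd --testnet` in place of `nohup btcd &`, and `stop_daemon` in place of `kill $(pidof btcd)`. The btcctl calls would also need pointing at the matching instance, which this sketch does not cover.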