@Fraser999
Last active April 16, 2021 15:33
nctl upgrade scripts

install-upgrade.sh
#!/usr/bin/bash
set -o errexit
set -o pipefail
ACTIVATION_POINT=1
NODE_V2="/home/fraser/Rust/casper-node/target/release/casper-node"
V2=1_1_0
V2_SEMVER=1.1.0
for INDEX in 1 2 3 4 5 6 7 8 9 10; do
mkdir -p "$NCTL_CASPER_HOME/utils/nctl/assets/net-1/nodes/node-$INDEX/bin/$V2"
cp "$NODE_V2" "$NCTL_CASPER_HOME/utils/nctl/assets/net-1/nodes/node-$INDEX/bin/$V2"
TEMP_DIR="$NCTL_CASPER_HOME/utils/nctl/assets/net-1/nodes/node-$INDEX/config/$V2-pending"
CONFIG="$TEMP_DIR/config.toml"
UPDATE_CONFIG=$(cat <<EOF
import toml
cfg = toml.load('$CONFIG')
cfg['logging']['format'] = 'text'
del cfg['consensus']['unit_hashes_folder']
del cfg['consensus']['max_execution_delay']
del cfg['consensus']['pending_vertex_timeout']
del cfg['event_stream_server']['broadcast_channel_size']
cfg['consensus']['highway'] = {
    'unit_hashes_folder': '../../storage-consensus',
    'pending_vertex_timeout': '1min',
    'request_latest_state_timeout': '30sec',
    'standstill_timeout': '5min',
    'log_participation_interval': '1min',
    'max_execution_delay': 3,
    'round_success_meter': {'num_rounds_to_consider': 40, 'num_rounds_slowdown': 10,
                            'num_rounds_speedup': 32, 'acceleration_parameter': 40,
                            'acceleration_ftt': [1, 100]},
}
toml.dump(cfg, open('$CONFIG', 'w'))
EOF
)
python3 -c "$UPDATE_CONFIG"
CHAINSPEC="$TEMP_DIR/chainspec.toml"
UPDATE_CHAINSPEC=$(cat <<EOF
import toml
cfg = toml.load('$CHAINSPEC')
cfg['protocol']['version'] = '$V2_SEMVER'
cfg['protocol']['hard_reset'] = True
cfg['protocol']['activation_point'] = $ACTIVATION_POINT
del cfg['deploys']['max_deploy_size']
cfg['network']['maximum_net_message_size'] = 23068672
toml.dump(cfg, open('$CHAINSPEC', 'w'))
EOF
)
python3 -c "$UPDATE_CHAINSPEC"
mv "$TEMP_DIR" "$NCTL_CASPER_HOME/utils/nctl/assets/net-1/nodes/node-$INDEX/config/$V2"
done
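Both heredocs above follow the same load-mutate-dump round trip through the third-party toml package. A minimal, stdlib-only sketch of that pattern for trying config edits without a node checkout — json stands in for toml here (toml.load/toml.dump have the same shape), and the helper names are mine:

```python
import json
import os
import tempfile

def edit_in_place(path, mutate):
    # Load the file into a dict, apply the mutation, and write it back in
    # place - the same round trip the scripts pipe to `python3 -c`.
    with open(path) as f:
        cfg = json.load(f)
    mutate(cfg)
    with open(path, 'w') as f:
        json.dump(cfg, f)

def bump_protocol(cfg):
    # Mirrors the chainspec edits install-upgrade.sh makes.
    cfg['protocol']['version'] = '1.1.0'
    cfg['protocol']['hard_reset'] = True

# Example against a throwaway file.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, 'w') as f:
    json.dump({'protocol': {'version': '1.0.0', 'hard_reset': False}}, f)
edit_in_place(path, bump_protocol)
with open(path) as f:
    result = json.load(f)
os.remove(path)
```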

prep-last-5-nodes.sh
#!/usr/bin/bash
set -o errexit
set -o pipefail
HASH=436408e928ce0b4c736dad4db7fd6134ad35f0c1be82b377e52b5459094cfacb
V1=1_0_0
V2=1_1_0
for INDEX in 6 7 8 9 10; do
# cp $NCTL_CASPER_HOME/target/release/0.7.7-flexible-handshake/casper-node $NCTL_CASPER_HOME/utils/nctl/assets/net-1/nodes/node-$INDEX/bin/$V1
CONFIG="$NCTL_CASPER_HOME/utils/nctl/assets/net-1/nodes/node-$INDEX/config/$V1/config.toml"
UPDATE_CONFIG=$(cat <<EOF
import toml
cfg = toml.load('$CONFIG')
cfg['node']['trusted_hash'] = '$HASH'
toml.dump(cfg, open('$CONFIG', 'w'))
EOF
)
python3 -c "$UPDATE_CONFIG"
done
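HASH is pasted in by hand from the event stream, so a quick sanity check can catch a truncated copy/paste before the last 5 nodes are restarted. A sketch, assuming Casper block hashes are 32 bytes hex-encoded (the function name is mine):

```python
import re

def looks_like_block_hash(s):
    # 32 bytes, hex-encoded -> exactly 64 hex characters.
    return re.fullmatch(r'[0-9a-fA-F]{64}', s) is not None

ok = looks_like_block_hash(
    '436408e928ce0b4c736dad4db7fd6134ad35f0c1be82b377e52b5459094cfacb')
```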

setup-upgrade.sh
#!/usr/bin/bash
set -o errexit
set -o pipefail
LAUNCHER="$NCTL_CASPER_HOME/../casper-node-launcher/target/release/casper-node-launcher"
NODE_V1="$NCTL_CASPER_HOME/target/release/casper-node"
V1=1_0_0
V2=1_1_0
GET_TIMESTAMP=$(cat <<EOF
from datetime import datetime, timedelta
print((datetime.utcnow() + timedelta(seconds=60)).isoformat('T') + 'Z')
EOF
)
TIMESTAMP=$(python3 -c "$GET_TIMESTAMP")
for INDEX in 1 2 3 4 5 6 7 8 9 10; do
cp "$LAUNCHER" "$NCTL_CASPER_HOME/utils/nctl/assets/net-1/nodes/node-$INDEX/bin"
cp "$NODE_V1" "$NCTL_CASPER_HOME/utils/nctl/assets/net-1/nodes/node-$INDEX/bin/$V1"
TEMP_DIR="$NCTL_CASPER_HOME/utils/nctl/assets/net-1/nodes/node-$INDEX/config/$V2-pending"
mkdir "$TEMP_DIR"
CHAINSPEC="$NCTL_CASPER_HOME/utils/nctl/assets/net-1/nodes/node-$INDEX/config/$V1/chainspec.toml"
CONFIG="$NCTL_CASPER_HOME/utils/nctl/assets/net-1/nodes/node-$INDEX/config/$V1/config.toml"
cp "$CHAINSPEC" "$TEMP_DIR/chainspec.toml"
cp "$CONFIG" "$TEMP_DIR/config.toml"
# cp /home/fraser/Rust/temp/casper-node/resources/local/chainspec.toml $CHAINSPEC
# UPDATE_CONFIG=`cat <<EOF
# import toml
# cfg=toml.load('$CONFIG')
# cfg['logging']['format']='text'
# if 'max_execution_delay' in cfg['consensus']:
# del cfg['consensus']['max_execution_delay']
# toml.dump(cfg, open('$CONFIG', 'w'))
# EOF`
UPDATE_CONFIG=$(cat <<EOF
import toml
cfg = toml.load('$CONFIG')
cfg['logging']['format'] = 'text'
toml.dump(cfg, open('$CONFIG', 'w'))
EOF
)
python3 -c "$UPDATE_CONFIG"
UPDATE_CHAINSPEC=$(cat <<EOF
import toml
cfg = toml.load('$CHAINSPEC')
cfg['protocol']['activation_point'] = '$TIMESTAMP'
toml.dump(cfg, open('$CHAINSPEC', 'w'))
EOF
)
python3 -c "$UPDATE_CHAINSPEC"
done
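The GET_TIMESTAMP snippet above schedules the first upgrade 60 seconds ahead by wall clock. The same logic as a standalone function, with a round-trip check that the emitted string parses and sits in the future — the 'Z'-suffixed ISO-8601 form is what the script writes into protocol.activation_point; the helper name is mine:

```python
from datetime import datetime, timedelta

def activation_timestamp(delay_seconds=60):
    # UTC ISO-8601 with a trailing 'Z', delay_seconds in the future -
    # the value setup-upgrade.sh writes into the v1 chainspec.
    return (datetime.utcnow() + timedelta(seconds=delay_seconds)).isoformat('T') + 'Z'

ts = activation_timestamp()
# Strip the 'Z' and confirm the string round-trips as a datetime.
parsed = datetime.fromisoformat(ts[:-1])
```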

start-last-5-nodes.sh
#!/usr/bin/bash
set -o errexit
set -o pipefail
for INDEX in 6 7 8 9 10; do
NODE_DIR="$NCTL_CASPER_HOME/utils/nctl/assets/net-1/nodes/node-$INDEX"
RUST_LOG=casper=debug \
    CASPER_CONFIG_DIR="$NODE_DIR/config/" \
    CASPER_BIN_DIR="$NODE_DIR/bin" \
    "$NODE_DIR/bin/casper-node-launcher" &
done
Fraser999 commented Apr 16, 2021

I have 2 separate clones of casper-node:

  • The one set up to be used by nctl should be checked out to the version of the node we want to upgrade from (e.g. git checkout release-1.0.0), and nctl-compile should be run.
  • My second repo is at /home/fraser/Rust/casper-node, and this is hard-coded in NODE_V2 of install-upgrade.sh. That repo should be checked out to the version of the node we want to upgrade to, and cargo run --release -- -V should be run.

I normally run the test as follows:

  1. in console 1: nctl-assets-setup && ./setup-upgrade.sh && nctl-start
  2. in console 2: curl localhost:60101/events and wait for the network to start creating blocks (this step isn't strictly required - we don't have to wait)
  3. in console 1: ./install-upgrade.sh. Ensure the script's ACTIVATION_POINT value is in the future.
  4. watch for the event stream connection in console 2 to die, meaning the upgrade point has been hit
  5. in console 2, re-run curl localhost:60101/events to check the network has restarted OK. If not, look in the logs for the crash reason. Often it's a mismatch in the config or chainspec TOMLs - the scripts copy the v1 config/chainspec files and modify them as needed, so there's a high chance of mistakes there. If that happens, I generally just fix the script(s) and start fresh from step 1.
  6. once the upgraded network is running, run ./prep-last-5-nodes.sh. Ensure the script has the correct value for HASH (pick a block hash from the stream printed in console 2).
  7. run ./start-last-5-nodes.sh
  8. in console 3: curl localhost:60106/events (the SSE stream for node-6) and wait for that stream to catch up with node-1's. Note that the last 5 nodes can all run for a while without actually upgrading successfully, so it's not enough to check that they don't crash straight away - wait for their event streams or status endpoints to show they've caught up with the first 5 nodes. Note also that the streams for the last 5 will stop and need to be restarted as they go through the upgrade.
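The curl-and-watch steps can be scripted if you'd rather automate the waiting. Below is a minimal SSE line parser following the standard text/event-stream framing (data: lines accumulate, a blank line terminates the event); the hookup to a node's /events endpoint is sketched in comments and untested, and the sample payloads are illustrative only:

```python
import json

def iter_sse_data(lines):
    """Yield the concatenated data payload of each SSE event."""
    buf = []
    for line in lines:
        line = line.rstrip('\n')
        if line.startswith('data:'):
            buf.append(line[5:].lstrip())
        elif line == '' and buf:
            yield '\n'.join(buf)
            buf = []

# Hypothetical hookup to a node (untested sketch):
#   import urllib.request
#   with urllib.request.urlopen('http://localhost:60101/events') as resp:
#       for payload in iter_sse_data(l.decode() for l in resp):
#           print(json.loads(payload))

sample = ['data: {"ApiVersion": "1.0.0"}', '', 'data: {"BlockAdded": {}}', '']
events = [json.loads(p) for p in iter_sse_data(sample)]
```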
