@notmatt
Last active August 29, 2015 14:14

Manatee Brain Transplant

The goal of this process is to upgrade a manatee of any vintage to Manatee v2. It relies on ZFS send/recv to replicate the data, but is limited to a migration between nodes in ONWM (one-node write mode).

step 1. upgrade moray

Upgrading moray to a forward/backward compatible version is a prerequisite of the upgrade. The usual process is to disable one moray node, double-check the stack reconnects correctly, reprovision that node, and then repeat for other moray nodes.

If there is only one moray node deployed, deploying a second using the new image allows you to upgrade the original node as above.

Pre work - update the moray image in SAPI:

SDC=$(sdc-sapi /applications?name=sdc | json -Ha uuid)
MORAY_SVC=$(sdc-sapi "/services?name=moray&application_uuid=$SDC" | json -Ha uuid)
sapiadm update $MORAY_SVC params.image_uuid=$MORAY_IMAGE

In the target moray zone:

svcadm disable registrar
svcadm disable *moray-202*

Wait for moray to fall out of DNS. On older stacks, running sdc-healthcheck is advised, because a number of SDC services often fail to reconnect when moray goes down and may need to be restarted.
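Waiting for a condition like this is easier to script than to eyeball. A minimal polling sketch follows; wait_for and the moray_gone check in the usage comment are hypothetical names, and how you test DNS membership is left to you, since it depends on the install:

```shell
# wait_for TIMEOUT INTERVAL COMMAND [ARGS...]
# Re-run COMMAND every INTERVAL seconds until it succeeds or roughly
# TIMEOUT seconds have passed. Returns 0 on success, 1 on timeout.
# Note: plain sh has no local variables, so the _wf_* names are global.
wait_for() {
    _wf_timeout=$1
    _wf_interval=$2
    shift 2
    _wf_elapsed=0
    while ! "$@"; do
        if [ "$_wf_elapsed" -ge "$_wf_timeout" ]; then
            return 1
        fi
        sleep "$_wf_interval"
        _wf_elapsed=$((_wf_elapsed + _wf_interval))
    done
    return 0
}

# Hypothetical usage: moray_gone is a check you write yourself, e.g. a DNS
# lookup that succeeds only once the disabled zone's IP is no longer listed.
# wait_for 300 5 moray_gone || echo "moray still in DNS after 5 minutes"
```

The same helper could be reused later in the process, for example while waiting for manatee-stat to show a primary.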

In the headnode GZ:

sapiadm reprovision $TARGET_MORAY $MORAY_IMAGE

step 2. reduce existing manatee to ONWM

It can help to be logged on to the CNs of each manatee node, but most operations will be done in the HN/GZ via sdc-oneachnode. First we'll set up some environment variables:

SDC=$(sdc-sapi /applications?name=sdc | json -Ha uuid)
MANATEE_SVC=$(sdc-sapi "/services?name=manatee&application_uuid=$SDC" | json -Ha uuid)
# NB: sdc-manatee-stat isn't present in the GZ on all installs;
# manatee-stat inside any manatee zone will be. Adjust if required.
MANATEE_STAT=$(sdc-manatee-stat | json | tee initial_manatee_stat.json)

PRIMARY=$(echo $MANATEE_STAT | json sdc.primary.zoneId)
CN_PRIMARY=$(sdc-vmapi /vms/$PRIMARY | json -H server_uuid)
SYNC=$(echo $MANATEE_STAT | json sdc.sync.zoneId)
CN_SYNC=$(sdc-vmapi /vms/$SYNC | json -H server_uuid)
ASYNC=$(echo $MANATEE_STAT | json sdc.async.zoneId)
CN_ASYNC=$(sdc-vmapi /vms/$ASYNC | json -H server_uuid)
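A failed lookup leaves its variable empty, and the later commands would then run with a blank argument. A small guard along these lines can catch that early; require_vars is a hypothetical helper, not part of SDC:

```shell
# require_vars NAME...
# Report every named variable that is unset or empty on stderr.
# Returns 1 if any are missing, 0 otherwise.
# The names are passed to eval, so only call this with trusted literals.
require_vars() {
    _rv_missing=0
    for _rv_name in "$@"; do
        eval "_rv_value=\${$_rv_name:-}"
        if [ -z "$_rv_value" ]; then
            echo "missing or empty: $_rv_name" >&2
            _rv_missing=1
        fi
    done
    return "$_rv_missing"
}

# Usage, after the assignments above:
# require_vars SDC MANATEE_SVC PRIMARY CN_PRIMARY SYNC CN_SYNC ASYNC CN_ASYNC || exit 1
```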

Now we will proceed to disable the async and sync nodes:

sdc-oneachnode -n $CN_ASYNC "svcadm -z $ASYNC disable manatee-sitter"
sdc-oneachnode -n $CN_SYNC "svcadm -z $SYNC disable manatee-sitter"

Set ONWM on the primary node:

svcadm disable config-agent
vim /opt/smartdc/manatee/etc/sitter.json
# search for oneNodeWriteMode and set it to true.
svcadm restart manatee-sitter
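Since the edit is made by hand, it can be worth confirming it took before moving on. A minimal sketch, assuming the setting appears in sitter.json as a quoted key with its value on the same line (the exact layout of the file is an assumption); onwm_enabled is a hypothetical helper name:

```shell
# onwm_enabled FILE
# Succeeds only if FILE contains "oneNodeWriteMode": true on a single line,
# allowing any amount of whitespace around the colon.
onwm_enabled() {
    grep -Eq '"oneNodeWriteMode"[[:space:]]*:[[:space:]]*true' "$1"
}

# Usage inside the primary manatee zone:
# onwm_enabled /opt/smartdc/manatee/etc/sitter.json && echo "ONWM is set"
```

Negating the result gives the matching check for the rollback case, where the value should be false again.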

step 3. backup manatee.

In the GZ of the CN of the primary manatee:

MANATEE_UUID=$(vmadm lookup alias=~manatee)
zlogin $MANATEE_UUID "svcadm disable manatee-sitter" < /dev/null
zfs snapshot -r zones/$MANATEE_UUID/data/manatee@backup
zfs send zones/$MANATEE_UUID/data/manatee@backup >./manatee-backup.zfs
zfs destroy zones/$MANATEE_UUID/data/manatee@backup
zlogin $MANATEE_UUID "svcadm enable manatee-sitter" < /dev/null
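Because this file is the last-resort rollback, a quick sanity check on it is cheap insurance. The sketch below only confirms the file is non-empty and prints a checksum that can be compared after copying the backup elsewhere; it does not validate the ZFS stream itself. backup_sum is a hypothetical helper name:

```shell
# backup_sum FILE
# Fail if FILE is missing or empty; otherwise print "<crc> <bytes>"
# as computed by the POSIX cksum utility.
backup_sum() {
    if [ ! -s "$1" ]; then
        echo "backup missing or empty: $1" >&2
        return 1
    fi
    cksum < "$1"
}

# Usage:
# backup_sum ./manatee-backup.zfs > ./manatee-backup.cksum || exit 1
# ...after copying the backup to another host, run backup_sum there and
# compare the output with the saved value.
```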

step 4. Create the first manatee v2 node

In this step, we will use one of two options to create the first manatee v2 node; it will be configured to not start manatee-sitter when the zones come up. This will allow us to destroy its delegated dataset, and set it up for zfs recv in the following steps.

The first option is to create a new manatee on the same CN as the existing primary; this can speed up the zfs send/recv step, as it will operate on the same filesystem.

Option 1: create a new manatee

sapiadm update $MANATEE_SVC params.image_uuid=$MANATEE_IMAGE
cat > manatee3.json <<EOF
{
  "service_uuid": "$MANATEE_SVC",
  "params": {
    "alias": "manatee3",
    "server_uuid": "$CN_PRIMARY"
  },
  "metadata": {
    "DISABLE_SITTER": true,
    "ONE_NODE_WRITE_MODE": true
  }
}
EOF
sdc-sapi /instances -X POST -d @manatee3.json

Option 2: reprovision the $ASYNC

This reprovisions the $ASYNC manatee to the new manatee image. zfs send/recv will take place over the admin network, which can take some time for large installations (e.g., JPC can take 30+ minutes), but has the advantage of less cleanup, and no changes to manatee zone UUIDs, IPs, etc. (The SDC stack itself should not depend on these values, but operators may have come to rely on them inappropriately.)

sapiadm update $MANATEE_SVC params.image_uuid=$MANATEE_IMAGE
sapiadm update $ASYNC metadata.DISABLE_SITTER=true
sapiadm update $ASYNC metadata.ONE_NODE_WRITE_MODE=true
sapiadm reprovision $ASYNC $MANATEE_IMAGE

step 5. prepping the new manatee node

Wait for the [re]provision from step 4 to complete, log into the new node, then:

# ensure that manatee-sitter is indeed disabled
svcs manatee-sitter

# take note of IP address
ifconfig | grep inet

# destroy the freshly provisioned dataset, then listen on port 1337
# and receive the incoming stream in its place
ZONE_UUID=$(sysinfo | json UUID)
zfs destroy -r zones/$ZONE_UUID/data/manatee
nc -l 1337 | zfs recv zones/$ZONE_UUID/data/manatee

At this point, we can safely disable the moray services, which prevents any writes to the system and somewhat simplifies rollback. Log into each moray and svcadm disable *moray-202*.

step 6. migrate

On the old primary (now in ONWM), we will disable manatee, take a snapshot, and send it to the v2 manatee node. Depending on your choice in step 4, there are two ways this is possible. If you created a new manatee on the same CN as the v1 primary, you can use option 1. If you reprovisioned a v1 node, you can use option 2 (option 2 also works on new manatee nodes, but is slower than it could be).

Option 1: donor/recipient on the same CN

Option 2: donor/recipient on different CNs

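ZONE_UUID=$(sysinfo | json UUID)
IP=XXX # IP address of the new v2 node, noted in step 5
# stop manatee before taking the snapshot
svcadm disable manatee-sitter
zfs snapshot zones/$ZONE_UUID/data/manatee@migrate
# stream the snapshot to the nc listener started in step 5
zfs send -v zones/$ZONE_UUID/data/manatee@migrate | nc $IP 1337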

step 7. enable the new manatee!

NB: rolling back after this point is slightly more complicated, see below.

When the zfs send has completed, you can svcadm enable manatee-sitter in the new zone. manatee-stat in the new zone will indicate when the v2 primary comes online, and the stack can be checked via the usual sdc-healthcheck, provisioning tests, and so on.

step 8. move manatee v2 out of ONWM

This is a slight modification of the steps outlined here. It assumes $CN_V2_PRIMARY is the CN on which you [re]provisioned manatee in step 4, and $V2_PRIMARY is the manatee v2 primary.

  • select a node to reprovision (we'll assume the v1 sync node, which is typical)
  • reprovision it: sapiadm reprovision $SYNC $MANATEE_IMAGE
  • re-enable moray: log into moray zones and svcadm enable *moray-202*
  • remove metadata from sapi:
    • sapiadm update $V2_PRIMARY metadata.ONE_NODE_WRITE_MODE=false
    • sapiadm update $V2_PRIMARY metadata.DISABLE_SITTER=false
  • halt primary: sdc-oneachnode -n $CN_V2_PRIMARY "vmadm stop $V2_PRIMARY"
  • run manatee-adm unfreeze in the reprovisioned sync: sdc-oneachnode -n $CN_SYNC "zlogin $SYNC 'manatee-adm unfreeze' < /dev/null"
  • start v2 primary: sdc-oneachnode -n $CN_V2_PRIMARY "vmadm start $V2_PRIMARY"

The modification takes advantage of the fact that manatee-sitter does not immediately pick up new config, so we can use SAPI to write the new config file, but it will not take effect until after manatee is stopped and restarted.
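Because the order of these steps matters, it can help to rehearse them before touching the cluster. A minimal dry-run wrapper, sketched under the assumption that each step is a single command; run_step and DRY_RUN are hypothetical names:

```shell
# run_step DESCRIPTION COMMAND [ARGS...]
# Print the step and its command line, then run it. With DRY_RUN=1 the
# command is only printed and the step is reported as successful.
run_step() {
    _rs_desc=$1
    shift
    echo "==> $_rs_desc: $*"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        return 0
    fi
    "$@"
}

# Usage, stopping at the first failure:
# DRY_RUN=1
# run_step "halt primary" sdc-oneachnode -n "$CN_V2_PRIMARY" "vmadm stop $V2_PRIMARY" || exit 1
# run_step "start primary" sdc-oneachnode -n "$CN_V2_PRIMARY" "vmadm start $V2_PRIMARY" || exit 1
```

Run once with DRY_RUN=1 to check that every variable expands as expected, then again with it unset.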

We should now have a 2-machine manatee v2 cluster, and can check the stack in the usual way. If that's successful, the final step is to reprovision the old v1 manatee; no other special steps are required for that.

Rolling back

At any time before enabling the v2 manatee (step 7), rollback is straightforward:

  • svcadm enable config-agent on the v1 primary
  • wait & double-check that oneNodeWriteMode: false is in /opt/smartdc/manatee/etc/sitter.json
  • svcadm restart manatee-sitter in the v1 primary, wait for manatee-stat to stabilize
  • svcadm enable manatee-sitter in the sync
  • sapiadm reprovision $ASYNC $OLD_MANATEE_IMAGE to revive the async (it may also require a rebuild)

After enabling the v2 manatee, rollback requires a little more consideration, as we may have accepted writes that we are not prepared to discard. In this case, the "reverse brain transplant" could be attempted; sending a snapshot of the v2 manatee back to the v1 primary (or perhaps the v1 sync).

If all else fails, we can restore the v1 manatee from its pre-migration backup.

@qdzlug
qdzlug commented Feb 6, 2015

Hey Matt,

Two questions - the netcat to 1337, is that the port that manatee is listening on for snapshots to be sent/received?

Also, last command in step #5 - is that supposed to be nc (it reads as cn).

Jay

@notmatt

notmatt commented Feb 6, 2015

Re: the port - check the steps for the v2 manatee node, we set the port to listen on there. Yunong picked '1337' because it's l33t.

And thanks for the typo catch. :)
