This Choria Playbook automates the steps of a simple version upgrade of a Vault cluster:
- Upgrades follower servers first
- Upgrades the leader last (to reduce risk of a failover from newer version to older one)
- Sleeps between upgrades to allow `vault operator unseal` to be run, and for Vault to re-register itself in service discovery
- Bulk updates all clients
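The ordering rule above can be sketched in a few lines of shell. This is a minimal illustration rather than the playbook's own code; the function name `upgrade_order` and the node names in the usage note are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the upgrade order for a cluster.
# Usage: upgrade_order LEADER NODE...
# Followers are printed in the order given; the leader is printed last.
upgrade_order() {
  local leader="$1" node
  shift
  for node in "$@"; do
    # Skip the leader here so it can be emitted at the end.
    [ "$node" = "$leader" ] || printf '%s\n' "$node"
  done
  printf '%s\n' "$leader"
}
```

For example, `upgrade_order vault2 vault1 vault2 vault3` prints `vault1`, `vault3`, `vault2`, one per line.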
If you have manual unseal, you must watch the playbook run, and unseal the nodes during the sleep period before the next node is upgraded. If you do not unseal at least one node before the final node is upgraded, there will be an outage.
You might also want to manually run `vault operator step-down` between the penultimate and final node upgrades, to minimise the impact.
A future version of this playbook might verify that nodes have been unsealed before continuing with the upgrade of other nodes. This is not currently implemented.
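Until such a check exists, something like the following could be run by hand to confirm a node is unsealed before the next upgrade starts. It is only a sketch: it relies on `vault status` exiting 0 when the node is unsealed and 2 when it is sealed, and the function name `wait_for_unseal`, the attempt count, and the delay are assumptions rather than part of the playbook.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a Vault node until it reports unsealed.
# Usage: wait_for_unseal ADDR [ATTEMPTS] [DELAY_SECONDS]
wait_for_unseal() {
  local addr="$1" attempts="${2:-30}" delay="${3:-10}" i rc
  for ((i = 1; i <= attempts; i++)); do
    rc=0
    # vault status: exit 0 = unsealed, 2 = sealed, 1 = error.
    VAULT_ADDR="$addr" vault status >/dev/null 2>&1 || rc=$?
    if [ "$rc" -eq 0 ]; then
      echo "unsealed: $addr"
      return 0
    fi
    echo "waiting ($i/$attempts): $addr (vault status exit code $rc)"
    sleep "$delay"
  done
  echo "gave up waiting for $addr" >&2
  return 1
}
```

A call such as `wait_for_unseal https://vault1.example.com:8200 60 5` (hostname hypothetical) would poll for up to five minutes.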
- On each server:
  - Disables puppet, to avoid interference
  - Waits for any in-progress puppet runs to complete
  - Triggers a `yum clean all` to ensure repos contain the latest packages
  - Ensures the `vault` package version is the desired version
  - Re-enables puppet
  - Runs puppet (which is assumed to make a change and trigger a restart of the service)
    - For me, puppet will always be setting a file capability on the newly installed vault binary, so this is a safe assumption
  - Waits for the puppet run to complete
- On each client (in batches of 100):
  - Triggers a `yum clean all`
  - Ensures the `vault` package version is the desired version
- Triggers a
- Fetches the versions of all installed vault packages for review
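For reference, the per-server steps above roughly correspond to the following commands when run by hand on a single node. The playbook itself drives these steps through Choria, so this is only an approximation. The names `run`, `wait_for_puppet_idle`, `upgrade_server`, and `DRY_RUN` are hypothetical, and the `vault-<version>` package spec assumes the package is simply called `vault`.

```shell
#!/usr/bin/env bash
# Hypothetical single-node equivalent of the per-server steps.
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Block until no puppet run holds the agent lock file.
wait_for_puppet_idle() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    return 0
  fi
  local lock
  lock="$(puppet config print agent_catalog_run_lockfile --section agent)"
  while [ -e "$lock" ]; do
    sleep 5
  done
}

# Usage: upgrade_server VERSION
upgrade_server() {
  local version="$1" rc=0
  run puppet agent --disable "vault upgrade to ${version}"
  wait_for_puppet_idle
  run yum clean all
  run yum install -y "vault-${version}"
  run puppet agent --enable
  # --test runs in the foreground and implies --detailed-exitcodes:
  # 0 = no changes, 2 = changes applied, anything else = failure.
  run puppet agent --test || rc=$?
  case "$rc" in
    0|2) return 0 ;;
    *) echo "puppet run failed with exit code $rc" >&2; return 1 ;;
  esac
}
```

Running `DRY_RUN=1 upgrade_server '1.10.0+ent'` prints the five commands without touching the system, which is a cheap way to check the sequence.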
- Assumes this playbook is present in a `site_vault` puppet module
- Consul DNS for locating the leader instance
- yum/dnf based distro (because the playbook issues a yum clean)
- Custom facts (site specific):
  - `group`: the cluster identifier to upgrade (optional)
  - `hostname_environment`: a site-specific identifier for the environment, obtained by parsing the hostname
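As an illustration of the Consul DNS requirement, the active Vault node can usually be resolved by querying the `active` tag of the Vault service. The sketch below assumes Vault registers under the default service name `vault` and that the Consul DNS domain is `consul`; both may differ on your site, and the function name `find_vault_leader` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: resolve the current Vault leader via Consul DNS.
# Usage: find_vault_leader [SERVICE] [DOMAIN]
find_vault_leader() {
  local service="${1:-vault}" domain="${2:-consul}"
  # The active node is registered under the "active" tag of the service.
  dig +short "active.${service}.service.${domain}" | head -n 1
}
```

Calling `find_vault_leader` with no arguments queries `active.vault.service.consul` and prints the first address returned.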
mco playbook run site_vault::upgrade_cluster --modulepath modules:thirdparty --environment p --new_version '1.10.0+ent'
The `--noop` flag will not take any action; it just prints the hostnames that would be affected.