@dcanadillas
Created July 17, 2020 14:07

DR Test case re-plied

Preparation of environment

Steps executed:

git clone https://github.com/dcanadillas/tf-vault-azure
  • Create two different backend configs for TFC from the repo's example backend_template.hcl (one for Primary and one for DR) and deploy the infrastructure from the root path of the tf-vault-azure repo:
$ terraform init -backend-config=backend.hcl
<configure your variables in TFC with 3 nodes>
$ terraform apply
...
...
Apply complete! Resources: 39 added, 0 changed, 0 destroyed.

Outputs:

TLS = true
load-balancer = 20.50.181.202
nodes = [
  "20.50.181.187",
  "20.50.181.181",
  "20.50.181.163",
]
...
$ terraform init -backend-config=backend-dr.hcl
<configure your variables in TFC with 1 node>
$ terraform apply
...
...
Apply complete! Resources: 25 added, 0 changed, 0 destroyed.

Outputs:

TLS = true
load-balancer = 20.50.181.237
nodes = [
  "20.50.182.24",
]
...
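
As a rough illustration of the two backend files used above, the sketch below writes one partial backend configuration per cluster. The organization and workspace names are placeholders, and the key layout follows Terraform's documented partial configuration for the "remote" backend. The repo's backend_template.hcl may differ, so treat this as an assumption rather than the repo's actual contents.

```shell
#!/bin/sh
# Hypothetical value: replace with your own TFC organization.
TFC_ORG="my-org"

# Write a partial backend config for the Terraform "remote" backend.
# Usage: write_backend FILE WORKSPACE
write_backend() {
  cat > "$1" <<EOF
hostname     = "app.terraform.io"
organization = "${TFC_ORG}"

workspaces {
  name = "$2"
}
EOF
}

# Hypothetical workspace names, one per cluster.
write_backend backend.hcl    vault-primary   # 3-node Primary
write_backend backend-dr.hcl vault-dr        # 1-node DR cluster
```

When the second `terraform init` runs in the same working directory as the first, Terraform will likely detect the backend change and ask for `-reconfigure`; using a separate checkout per cluster avoids that.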
  • Start the system and apply the license. On the Primary cluster, initialize Vault:
dcanadillas@vault-server-0:~$ vault operator init --recovery-shares=1 --recovery-threshold=1 >> vault-init.log

dcanadillas@vault-server-0:~$ cat vault-init.log
Recovery Key 1: 3Om04iXiyNyk22Nae9Ev07CYKh+YniK2PU3hr84aK+I=

Initial Root Token: s.7c4L7rLOFMkccoKXhEOfq4iu

Success! Vault is initialized

Recovery key initialized with 1 key shares and a key threshold of 1. Please
securely distribute the key shares printed above.

dcanadillas@vault-server-0:~$ export VAULT_TOKEN=s.7c4L7rLOFMkccoKXhEOfq4iu

dcanadillas@vault-server-0:~$ vault login $VAULT_TOKEN
WARNING! The VAULT_TOKEN environment variable is set! This takes precedence
over the value set by this command. To use the value set by this command,
unset the VAULT_TOKEN environment variable or set it to the token displayed
below.

Success! You are now authenticated. The token information displayed below
is already stored in the token helper. You do NOT need to run "vault login"
again. Future Vault requests will automatically use this token.

Key                  Value
---                  -----
token                s.7c4L7rLOFMkccoKXhEOfq4iu
token_accessor       k3Gthoj8IFz2nYPhjcLQH2Bs
token_duration       ∞
token_renewable      false
token_policies       ["root"]
identity_policies    []
policies             ["root"]

dcanadillas@vault-server-0:~$ vault write sys/license text="02MV4UU43BK5H...GKK3PG..."
Success! Data written to: sys/license

dcanadillas@vault-server-0:~$ vault read sys/license
WARNING! The following warnings were returned from Vault:

  * time left on license is 64h28m26s

Key                          Value
---                          -----
expiration_time              2020-07-10T00:00:00Z
features                     [HSM Performance Replication DR Replication MFA Sentinel Seal Wrapping Control Groups Performance Standby Namespaces KMIP Entropy Augmentation Transform Secrets Engine]
license_id                   c9d9102e-c4d9-d365-9bda-3055ddd10d99
performance_standby_count    9999
start_time                   2020-06-26T00:00:00Z
  • Repeat previous steps in the Secondary cluster (the one that is going to be DR) deployed by Terraform (initialize Vault and apply license)
  • Edit config.hcl on the DR Vault cluster to comment out the replication stanza:
...
    # replication {
    #    resolver_discover_servers = false
    # }

    api_addr = "https://20.50.186.249:8200"
...
  • Restarting Vault:
dcanadillas@vault-server-0:~$ sudo systemctl stop vault

dcanadillas@vault-server-0:~$ sudo systemctl start vault

dcanadillas@vault-server-0:~$ sudo systemctl status vault
● vault.service - Vault
   Loaded: loaded (/etc/systemd/system/vault.service; disabled; vendor preset: enabled)
   Active: active (running) since Tue 2020-07-07 07:40:10 UTC; 5s ago
     Docs: https://www.vaultproject.io/docs/
 Main PID: 14569 (vault)
    Tasks: 13 (limit: 19141)
   CGroup: /system.slice/vault.service
           └─14569 /usr/local/bin/vault server -config=/etc/vault.d/config.hcl
  • Add the Primary cluster's CA certificate on the Secondary, in /usr/local/share/ca-certificates/:
dcanadillas@vault-server-0:~$ echo "-----BEGIN CERTIFICATE-----
MIIDIjCCAgqgAwIBAgIQDmJVr0TYkdey7KwHqmJTeDANBgkqhkiG9w0BAQsFADAr
MRAwDgYDVQQKEwdIYXNoaUNBMRcwFQYDVQQDEw52YXVsdC1jYS5sb2NhbDAeFw0y
MDA2MjkxMDI1MDFaFw0yMTA2MjkxMDI1MDFaMCsxEDAOBgNVBAoTB0hhc2hpQ0Ex
FzAVBgNVBAMTDnZhdWx0LWNhLmxvY2FsMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8A
MIIBCgKCAQEAvtLmTIKEO38qeuHZqNej+3ZHjQYoLuk7O/gjmZPktECd6KULke/b
JR8/WtfVZJHbzghG7/15z9s/OFG2zcgAQTH7RX0oxmO5NY2AC5nAS/50PXidyeqs
xIxFsfaE7i3siled4N66YPS/i2DT32/+yBuseNWmBZ30wy49XkBO17Y192n9+Nec
JeMoFEElFXXlcowtR4Sc3a/ALyePAEf2gWCQlEKfzJ906dF4zLPu0PLk5tFWXSOr
+qhxzpPKt2/n0f5zyfK9b8M2fIFw0T967A8axfRA8XUxsuq5nBKg7jjeL/myUDfS
lNHbR7iuxz9M7P0+gKHujiDNn/X5HGlFdwIDAQABo0IwQDAOBgNVHQ8BAf8EBAMC
AYYwDwYDVR0TAQH/BAUwAwEB/zAdBgNVHQ4EFgQUOdBEtS8YdkxJKVHm/vTUldXD
0LEwDQYJKoZIhvcNAQELBQADggEBADlbQnlSn2Ed3a9jWRcqZ6QgkBGJaprBGvXA
mM0+HVtqT6N97oQujVFb5CB9jgzyjQGOAB8VWVnsEqxUkWt+EP4z09k0Y9nBmS+N
zUKbi01PbV7P21u47BFYeZoYG527CXpVgHj1lOrr2lji64xl8GJqk+AXOYogEXmu
XznrEl2kuSFqeZOvcSxEHuppTyBJxgXX1agarJmMAQX4hKHOQZAp5OV2j/NBjxRL
5jfTyskR6odU1i3tpDDWuxZgdoGAMw0jRbSp886uRVDpSa+M+D1KfJRXqe4/oo09
rlSxyM6HR6RRLPb3hYw4cp+3IHiCjnSB3gkyhE25DhMOAq/6Pi8=
-----END CERTIFICATE-----" | sudo tee /usr/local/share/ca-certificates/01-me_primary_ca.crt
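
Pasting a long PEM block through echo is easy to truncate, so a quick sanity check on the written file can help. The helper below is a minimal sketch (the name pem_cert_count is hypothetical) that only counts BEGIN/END markers; it does not validate the certificate itself.

```shell
#!/bin/sh
# Count complete PEM certificate blocks in a file; fail if the BEGIN/END
# markers are missing or unbalanced (for example after a truncated paste).
# Usage: pem_cert_count FILE
pem_cert_count() {
  if [ ! -r "$1" ]; then
    echo "cannot read $1" >&2
    return 1
  fi
  # grep -c prints 0 (and exits non-zero) when nothing matches.
  _begin=$(grep -c -- '-----BEGIN CERTIFICATE-----' "$1" || true)
  _end=$(grep -c -- '-----END CERTIFICATE-----' "$1" || true)
  if [ "$_begin" -gt 0 ] && [ "$_begin" -eq "$_end" ]; then
    echo "$_begin"
  else
    echo "unbalanced or missing PEM markers in $1" >&2
    return 1
  fi
}
```

For example, `pem_cert_count /usr/local/share/ca-certificates/01-me_primary_ca.crt` should print 1 for the file written above. If openssl is installed, `openssl x509 -in FILE -noout -subject -enddate` is a fuller check because it actually parses the certificate.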

Replication configuration

Executed on Primary.

Let's enable DR replication on the Primary cluster:

dcanadillas@vault-server-0:~$ vault write sys/replication/dr/primary/enable primary_cluster_addr="https://vault-demo.dcanadillas.com:8201"
WARNING! The following warnings were returned from Vault:

  * This cluster is being enabled as a primary for replication. Vault will be
  unavailable for a brief period and will resume service shortly.

dcanadillas@vault-server-0:~$ vault write sys/replication/dr/primary/secondary-token id="vault-sec"
Key                              Value
---                              -----
wrapping_token:                  eyJhbGciOiJFUzUxMiIsInR5cCI6IkpXVCJ9.eyJhY2Nlc3NvciI6IiIsImFkZHIiOiJodHRwczovLzIwLjUwLjE4NC4yNTQ6ODIwMCIsImV4cCI6MTU5NDEwNTQ1MCwiaWF0IjoxNTk0MTAzNjUwLCJqdGkiOiJzLk1IcG1PNDI0MVNRRW9ISDhjeXJtaUZQVyIsIm5iZiI6MTU5NDEwMzY0NiwidHlwZSI6IndyYXBwaW5nIn0.AVMpDBCv3XgxnXSaYsXah10ZWP7RB9xp_IGueZoN7SWqDK8myLqeK2SK4bzH13FTBQm9vIvYHojlnvD73uiNawK4AVXRKGSLMv1a7CYm_kuCq561uHdiC2NWPImDRybkpEgmVr68E5LX3qlVcG2MpwpHrCi4Vjr_kLQRLCxTU3m-5Mxa
wrapping_accessor:               PjXfJsy65zR5YCZq3f2wzmDN
wrapping_token_ttl:              30m
wrapping_token_creation_time:    2020-07-07 06:34:10.682811719 +0000 UTC
wrapping_token_creation_path:    sys/replication/dr/primary/secondary-token

Executed on Secondary:

dcanadillas@vault-server-0:~$ cat /etc/vault.d/config.hcl
    cluster_name = "vault-poc-dr-demo"
    storage "raft" {
        path = "/vault/data"
        node_id = "vault-server-0"
    }

    listener "tcp" {
        address       = "0.0.0.0:8200"
        cluster_address = "0.0.0.0:8201"
        tls_disable = 0
        tls_cert_file = "/etc/vault.d/tls/vault.crt"
        tls_key_file  = "/etc/ssl/certs/me.key"
    }


    seal "azurekeyvault" {
        tenant_id      = "0e3e2e88-8caf-41ca-b4da-e3b33b6c52ec"
        client_id      = "-------------"
        client_secret  = "-------------"
        vault_name     = "demo-dc-9d43ecff"
        key_name       = "demo-dc-cf7ad67e"
        environment    = "AzurePublicCloud"
    }

    # replication {
    #    resolver_discover_servers = false
    # }

    api_addr = "https://20.50.14.28:8200"
    cluster_addr = "https://10.0.1.4:8201"
    disable_mlock = true
    ui = true
    
    

Let's enable DR on the Secondary cluster and check that there is a transient_failure connection error:

dcanadillas@vault-server-0:~$ vault write sys/replication/dr/secondary/enable primary_api_addr="https://vault-demo.dcanadillas.com:8200" ca_path="/usr/local/share/ca-certificates/" token="eyJhbGciOiJFUzUxMiIsInR5cCI6IkpXVCJ9.eyJhY2Nlc3NvciI6IiIsImFkZHIiOiJodHRwczovLzIwLjUwLjE4NC4yNTQ6ODIwMCIsImV4cCI6MTU5NDEwNTQ1MCwiaWF0IjoxNTk0MTAzNjUwLCJqdGkiOiJzLk1IcG1PNDI0MVNRRW9ISDhjeXJtaUZQVyIsIm5iZiI6MTU5NDEwMzY0NiwidHlwZSI6IndyYXBwaW5nIn0.AVMpDBCv3XgxnXSaYsXah10ZWP7RB9xp_IGueZoN7SWqDK8myLqeK2SK4bzH13FTBQm9vIvYHojlnvD73uiNawK4AVXRKGSLMv1a7CYm_kuCq561uHdiC2NWPImDRybkpEgmVr68E5LX3qlVcG2MpwpHrCi4Vjr_kLQRLCxTU3m-5Mxa"
WARNING! The following warnings were returned from Vault:

  * Vault has successfully found secondary information; it may take a while to
  perform setup tasks. Vault will be unavailable until these tasks and initial
  sync complete.
  
dcanadillas@vault-server-0:~$ unset VAULT_ADDR

dcanadillas@vault-server-0:~$ vault read sys/replication/dr/status
Key                            Value
---                            -----
cluster_id                     24ae72ac-e779-cc4d-5f19-754e181587dd
connection_state               connecting
known_primary_cluster_addrs    [https://vault-demo.dcanadillas.com:8201]
last_reindex_epoch             1594992090
last_remote_wal                0
merkle_root                    0b23c889458d1dc00e538b9d9539452d1b66cc61
mode                           secondary
primary_cluster_addr           https://vault-demo.dcanadillas.com:8201
secondary_id                   secondary
state                          stream-wals
dcanadillas@vault-server-0:~$ vault read sys/replication/dr/status
Key                            Value
---                            -----
cluster_id                     24ae72ac-e779-cc4d-5f19-754e181587dd
connection_state               transient_failure
known_primary_cluster_addrs    [https://vault-demo.dcanadillas.com:8201]
last_reindex_epoch             1594992090
last_remote_wal                0
merkle_root                    0b23c889458d1dc00e538b9d9539452d1b66cc61
mode                           secondary
primary_cluster_addr           https://vault-demo.dcanadillas.com:8201
secondary_id                   secondary
state                          stream-wals
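
The two status reads above differ only in connection_state, so instead of re-running the command by hand it can be polled. This is a minimal sketch, assuming the table output format shown above and a vault CLI that is already authenticated; status_field and wait_for_connection_state are hypothetical helper names, not Vault commands.

```shell
#!/bin/sh
# Print the value column for one key of `vault read` table output (stdin).
status_field() {
  awk -v key="$1" '$1 == key { $1 = ""; sub(/^[ \t]+/, ""); print; exit }'
}

# Poll the DR replication status until connection_state matches.
# Usage: wait_for_connection_state WANTED [ATTEMPTS] [DELAY_SECONDS]
wait_for_connection_state() {
  _wanted="$1"
  _attempts="${2:-10}"
  _delay="${3:-5}"
  _i=1
  while [ "$_i" -le "$_attempts" ]; do
    _state=$(vault read sys/replication/dr/status | status_field connection_state)
    if [ "$_state" = "$_wanted" ]; then
      echo "connection_state=$_state (attempt $_i)"
      return 0
    fi
    sleep "$_delay"
    _i=$((_i + 1))
  done
  echo "connection_state is still '$_state' after $_attempts attempts" >&2
  return 1
}
```

A call such as `wait_for_connection_state ready 12 5` would give the Secondary about a minute to connect before reporting the last state it saw.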

There is a connection error, so the cluster is not replicating. If we restart Vault on the DR Secondary, the log complains about a connection to an internal IP, which is the cluster_addr of the active node in the Primary cluster:

dcanadillas@vault-server-0:~$ sudo systemctl stop vault
dcanadillas@vault-server-0:~$ sudo systemctl start vault
dcanadillas@vault-server-0:~$ tail -f /var/log/syslog
...
...
Jul 17 13:22:52 vault-server-0 vault[25267]: 2020-07-17T13:22:52.004Z [ERROR] replication: encountered error, applying backoff: backoff=16s error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 10.0.1.6:8201: connect: no route to host""
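
To pull the failing address out of a noisy syslog, something like the following can be used. It is a sketch that assumes the "dial tcp HOST:PORT:" wording seen in the log line above; dial_target and is_private_ipv4 are hypothetical helper names.

```shell
#!/bin/sh
# Print the host:port that Vault failed to dial, one per matching log line (stdin).
dial_target() {
  sed -n 's/.*dial tcp \([^: ]*:[0-9][0-9]*\):.*/\1/p'
}

# Succeed if an IPv4 address is in an RFC 1918 private range.
is_private_ipv4() {
  case "$1" in
    10.*|192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For instance, `grep replication /var/log/syslog | dial_target | sort -u` lists the distinct targets, and a private address there confirms that the Secondary is trying to reach a Primary node directly instead of going through the load balancer.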

Now let's stop Vault and enable the replication stanza by editing /etc/vault.d/config.hcl on the Secondary cluster:

...

    replication {
       resolver_discover_servers = false
    }

...

If we restart Vault again on the Secondary cluster, the replication connection should be established and working:

dcanadillas@vault-server-0:~$ sudo systemctl stop vault

dcanadillas@vault-server-0:~$ sudo systemctl start vault

dcanadillas@vault-server-0:~$ vault read sys/replication/dr/status
Key                            Value
---                            -----
cluster_id                     24ae72ac-e779-cc4d-5f19-754e181587dd
connection_state               ready
known_primary_cluster_addrs    [https://vault-demo.dcanadillas.com:8201]
last_reindex_epoch             1594992090
last_remote_wal                0
merkle_root                    07da5de90db9495a25e8cf7fdcdae81b091f964c
mode                           secondary
primary_cluster_addr           https://vault-demo.dcanadillas.com:8201
secondary_id                   secondary
state                          stream-wals

dcanadillas@vault-server-0:~$ vault read sys/replication/dr/status
Key                            Value
---                            -----
cluster_id                     24ae72ac-e779-cc4d-5f19-754e181587dd
connection_state               ready
known_primary_cluster_addrs    [https://10.0.1.6:8201 https://10.0.1.4:8201 https://10.0.1.5:8201]
last_reindex_epoch             1594992090
last_remote_wal                0
merkle_root                    07da5de90db9495a25e8cf7fdcdae81b091f964c
mode                           secondary
primary_cluster_addr           https://vault-demo.dcanadillas.com:8201
secondary_id                   secondary
state                          stream-wals

Comparing the two status outputs above (you may need to run the command more than once before it refreshes), known_primary_cluster_addrs has changed from the load balancer address to the internal addresses of the Primary nodes (active and standby).

Promoting secondary DR into primary

Now, let's promote the secondary cluster:

dcanadillas@vault-server-0:~$ vault operator generate-root -dr-token -init
A One-Time-Password has been generated for you and is shown in the OTP field.
You will need this value to decode the resulting root token, so keep it safe.
Nonce         0fc26d50-7c2a-884a-b3a1-dec8eafd2cd3
Started       true
Progress      0/3
Complete      false
OTP           v7bhrhhESYXnCK5ICvTuSQiT7p
OTP Length    26

dcanadillas@vault-server-0:~$ vault operator generate-root -dr-token
Operation nonce: 0fc26d50-7c2a-884a-b3a1-dec8eafd2cd3
Unseal Key (will be hidden):
Nonce       0fc26d50-7c2a-884a-b3a1-dec8eafd2cd3
Started     true
Progress    1/3
Complete    false

dcanadillas@vault-server-0:~$ vault operator generate-root -dr-token 1Qk1zISwCjua3uIOoI283T5Ibecm7z0DBUjrYzdKsWyc
Missing nonce value: specify it via the -nonce flag
dcanadillas@vault-server-0:~$ vault operator generate-root -dr-token
Operation nonce: 0fc26d50-7c2a-884a-b3a1-dec8eafd2cd3
Unseal Key (will be hidden):
Nonce       0fc26d50-7c2a-884a-b3a1-dec8eafd2cd3
Started     true
Progress    2/3
Complete    false

dcanadillas@vault-server-0:~$ vault operator generate-root -dr-token
Operation nonce: 0fc26d50-7c2a-884a-b3a1-dec8eafd2cd3
Unseal Key (will be hidden):
Nonce            0fc26d50-7c2a-884a-b3a1-dec8eafd2cd3
Started          true
Progress         3/3
Complete         true
Encoded Token    BRkBDzouMR0Raio2BztDInMvOQIAMyIaegE

dcanadillas@vault-server-0:~$ vault operator generate-root -dr-token -decode=BRkBDzouMR0Raio2BztDInMvOQIAMyIaegE -otp=v7bhrhhESYXnCK5ICvTuSQiT7p
s.cgHFYXB3rXDpvk0YmwSbKNMq

dcanadillas@vault-server-0:~$ vault write sys/replication/dr/secondary/promote dr_operation_token=s.cgHFYXB3rXDpvk0YmwSbKNMq
WARNING! The following warnings were returned from Vault:

  * This cluster is being promoted to a replication primary. Vault will be
  unavailable for a brief period and will resume service shortly.
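
The "Missing nonce value" error in the session above appears when a key share is passed as an argument without the nonce. A minimal sketch of a non-interactive variant follows; submit_dr_key_shares is a hypothetical helper, and it assumes the nonce printed by the -init step and the shares are already at hand.

```shell
#!/bin/sh
# Submit recovery key shares to an in-progress DR operation token generation.
# Usage: submit_dr_key_shares NONCE SHARE [SHARE...]
submit_dr_key_shares() {
  _nonce="$1"
  shift
  for _share in "$@"; do
    # A share passed as an argument skips the interactive prompt, so the
    # nonce from the -init step has to be supplied with -nonce.
    vault operator generate-root -dr-token -nonce="$_nonce" "$_share" || return 1
  done
}
```

Keep in mind that shares passed as arguments can end up in shell history and process listings, so the interactive prompt used in the session above is the safer choice outside of a throwaway test environment.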

Once the Secondary is promoted, we can log in with tokens issued by the Primary (using the dcanadillas user as an example) and see that all the Primary secrets are there.
