@rjhowe
Last active September 21, 2016 15:55
Adding old etcd member with large DB back to cluster

CONTEXT: 3 Masters:

master1.openshift.com 172.17.28.10
master2.openshift.com 172.17.28.12
master3.openshift.com 172.17.28.18

In this example we will be adding "master2.openshift.com" back into the cluster.

STEPS:

  1. Ensure etcd has been updated to version 2.3.x on all etcd hosts.

     ┌─[root@master1]─[~]
     └──> etcd --version
     etcd Version: 2.3.7
     Git SHA: fd17c91
     Go Version: go1.6.2
     Go OS/Arch: linux/amd64
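
The version check in step 1 can also be scripted when several hosts need checking. This is a minimal sketch, assuming the `etcd --version` output has the format shown above; the function name `is_etcd_2_3` is hypothetical.

```shell
# Hypothetical helper: succeeds if the `etcd --version` text on stdin
# reports a 2.3.x release, fails otherwise.
is_etcd_2_3() {
    grep -Eq '^etcd Version: 2\.3\.[0-9]+'
}

# Usage on a live host (not executed here):
#   etcd --version | is_etcd_2_3 || echo "etcd must be updated to 2.3.x first"
```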
    
  2. Do not start the etcd service on "master2.openshift.com" yet.

  3. Add the member "master2.openshift.com" to the cluster.

    • The command prints configuration values that must be added to the etcd.conf file on "master2.openshift.com".
    • Keep the ID of the new member (NODE_ID), as it is needed when taking the backup of etcd.
    ┌─[root@master1]─[~]
    └──> etcdctl -C https://master1.openshift.com:2379,https://master2.openshift.com:2379,https://master3.openshift.com:2379 \
           --ca-file=/etc/origin/master/master.etcd-ca.crt \
           --cert-file=/etc/etcd/peer.crt \
           --key-file=/etc/etcd/peer.key \
           member add master2.openshift.com https://172.17.28.12:2380
    Added member named master2.openshift.com with ID 5c3309891545cfa2 to cluster
    
    ETCD_NAME="master2.openshift.com"
    ETCD_INITIAL_CLUSTER="master1.openshift.com=https://172.17.28.10:2380,master2.openshift.com=https://172.17.28.12:2380,master3.openshift.com=https://172.17.28.18:2380"
    ETCD_INITIAL_CLUSTER_STATE="existing"
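
The member ID printed in step 3 is needed again for the backup, so it can help to capture it rather than copy it by hand. The sketch below assumes the "Added member named ... with ID ... to cluster" line keeps the format shown above; `extract_member_id` is a hypothetical helper name.

```shell
# Hypothetical helper: reads the `member add` output on stdin and prints
# only the hexadecimal member ID from the "Added member named" line.
extract_member_id() {
    sed -n 's/^Added member named .* with ID \([0-9a-f]\{1,\}\) to cluster.*$/\1/p'
}

# Usage (not executed here): save the `member add` output to a file first,
# since it also contains the ETCD_* values needed for etcd.conf.
#   NODE_ID=$(extract_member_id < member-add.out)
```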
    
  4. Create a backup of the database using the two options that are new in etcd 2.3.x, --keep-cluster-id and --node-id.

    • Replace NODE_ID with the ID from the previous step.
    ┌─[root@master1]─[~]
    └──> etcdctl backup --keep-cluster-id --node-id NODE_ID --data-dir /var/lib/etcd --backup-dir /var/lib/etcd/etcdbackup 
    ┌─[root@master1]─[~]
    └──> tar -czvf etcdbackup.tar.gz -C /var/lib/etcd/etcdbackup/ member/
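
Before transferring a large archive, it may be worth confirming that it has the layout the later extraction step expects, namely a top-level member/ directory. A minimal sketch; `archive_has_member_dir` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if the archive given as $1 contains a
# top-level member/ directory. `tar -tf` only lists entries, it does not extract.
archive_has_member_dir() {
    tar -tf "$1" | grep '^member/' > /dev/null
}

# Usage (not executed here):
#   archive_has_member_dir etcdbackup.tar.gz || echo "unexpected archive layout"
```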
    
  5. Transfer backup to master2.openshift.com.

    ┌─[root@master1]─[~]
    └──> scp etcdbackup.tar.gz master2.openshift.com:/tmp/
    
  6. Log in to the new etcd member. In our case this is master2.openshift.com, which already has its previous configuration from the initial install.

    • Remove the old member directory so that the data-dir contains no stale data.
    ┌──[root@master2]─[~]
    └──> rm -rf /var/lib/etcd/member
    
    • Extract the backup into /var/lib/etcd/ so that it lands in /var/lib/etcd/member, then restore ownership to the etcd user.
    ┌──[root@master2]─[~]
    └──> tar -xf /tmp/etcdbackup.tar.gz -C /var/lib/etcd/
    ┌──[root@master2]─[~]
    └──> chown -R etcd:etcd /var/lib/etcd/
    
    • Set the following values in etcd.conf so that they match the output from adding the member. Most of them will already be present from the initial install; often the only change needed is ETCD_INITIAL_CLUSTER_STATE, from "new" to "existing".
    ┌──[root@master2]─[~]
    └──> vi /etc/etcd/etcd.conf
    ...
    ... 
    ETCD_NAME="master2.openshift.com"
    ETCD_INITIAL_CLUSTER="master1.openshift.com=https://172.17.28.10:2380,master2.openshift.com=https://172.17.28.12:2380,master3.openshift.com=https://172.17.28.18:2380"
    ETCD_INITIAL_CLUSTER_STATE="existing"
    ...
    ...
    
    • Start the etcd service.
    ┌──[root@master2]─[~]
    └──> systemctl start etcd
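
The etcd.conf change in step 6 (made before starting the service) can be scripted instead of edited in vi. This is a minimal sketch, assuming the file uses plain KEY="value" lines as shown above; `set_etcd_conf_var` is a hypothetical helper, and the value is assumed not to contain `|` or `&`.

```shell
# Hypothetical helper: set KEY="value" in an etcd.conf-style file.
# $1: conf file, $2: variable name, $3: new value (no '|' or '&' assumed).
set_etcd_conf_var() {
    conf=$1 key=$2 value=$3
    if grep -q "^${key}=" "$conf"; then
        # Rewrite the existing line; write back with cat so the file keeps
        # its original owner and permissions.
        sed "s|^${key}=.*|${key}=\"${value}\"|" "$conf" > "$conf.tmp" &&
            cat "$conf.tmp" > "$conf" &&
            rm -f "$conf.tmp"
    else
        # Variable not present (or only commented out): append it.
        printf '%s="%s"\n' "$key" "$value" >> "$conf"
    fi
}

# Usage (not executed here):
#   set_etcd_conf_var /etc/etcd/etcd.conf ETCD_INITIAL_CLUSTER_STATE existing
```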
    
  7. Check the health of the cluster.

    ┌─[root@master1]─[~]
    └──> etcdctl -C https://master1.openshift.com:2379 \
           --ca-file=/etc/origin/master/master.etcd-ca.crt \
           --cert-file=/etc/origin/master/master.etcd-client.crt \
           --key-file=/etc/origin/master/master.etcd-client.key \
           cluster-health
    
    member 1e49ac32bec28074 is healthy: got healthy result from https://172.17.28.10:2379
    member 30ae3b88f5f6f8aa is healthy: got healthy result from https://172.17.28.12:2379
    member 5c3309891545cfa2 is healthy: got healthy result from https://172.17.28.18:2379
    cluster is healthy
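
The health output in step 7 can be checked by a script, for example to wait until the re-added member has caught up. The sketch below only assumes the "member ... is healthy:" and "cluster is healthy" line formats shown above; `cluster_is_healthy` is a hypothetical helper name.

```shell
# Hypothetical helper: reads health output on stdin and succeeds only if
# every "member" line reports healthy and the final summary line is present.
cluster_is_healthy() {
    awk '
        /^member / && !/ is healthy:/ { bad = 1 }   # any member not healthy
        /^cluster is healthy$/        { ok = 1 }    # overall summary line
        END { exit !(ok && !bad) }
    '
}

# Usage (not executed here; TLS flags as in step 7 omitted for brevity):
#   etcdctl -C https://master1.openshift.com:2379 cluster-health | cluster_is_healthy
```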
    
  8. Add the member back to the etcdClientInfo urls in master-config.yaml on all the masters, then restart the master services.

    ┌─[root@master1]─[~]
    └──> vi /etc/origin/master/master-config.yaml
    
     etcdClientInfo:
       ca: master.etcd-ca.crt
       certFile: master.etcd-client.crt
       keyFile: master.etcd-client.key
       urls:
         - https://master1.openshift.com:2379
         - https://master2.openshift.com:2379
         - https://master3.openshift.com:2379
    
    ┌─[root@master1]─[~]
    └──> systemctl restart 'atomic-openshift-master-*'
    