@rjhowe
Last active January 13, 2022 19:01
Downgrading from etcd 3.3 to 3.2 and restoring etcd on OpenShift 3.10 and 3.11

Check the current state of the etcd cluster:

# ETCD_ALL_ENDPOINTS=` etcdctl3 --write-out=fields   member list | awk '/ClientURL/{printf "%s%s",sep,$3; sep=","}'`
# etcdctl3 --endpoints=$ETCD_ALL_ENDPOINTS  endpoint status  --write-out=table

Capture a snapshot:

# etcdctl3 snapshot save /tmp/snapshot.db 

Back up existing db:

# cp /var/lib/etcd/member/snap/db /tmp/db

Copy the snapshot to all etcd hosts:

# scp /tmp/snapshot.db etcd_host:/tmp/snapshot.db

On all etcd members:

  • Stop the docker and atomic-openshift-node services.
  • Change the etcd image so that the one built with etcd-3.2.22 is used.
  • Remove /var/lib/etcd.
# systemctl stop docker atomic-openshift-node
# sed -i '/image/ s/etcd.*$/etcd:3.2.22-18/' /etc/origin/node/pods/etcd.yaml 
# rm -rf /var/lib/etcd
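
Because sed -i rewrites the static pod manifest in place, it can help to preview the substitution first. The following is a minimal sketch, not part of the original procedure: preview_etcd_image is a hypothetical helper name, and the sample image line (registry path and 3.3.11 tag) is an assumption about what the manifest contains.

```shell
# Hypothetical helper: apply the same substitution as the sed -i command,
# but read from stdin and write to stdout so nothing on disk changes.
preview_etcd_image() {
  sed '/image/ s/etcd.*$/etcd:3.2.22-18/'
}

# Sample manifest line (assumed; check the real line in
# /etc/origin/node/pods/etcd.yaml before running sed -i).
printf '    image: registry.example.com/rhel7/etcd:3.3.11\n' | preview_etcd_image
# prints "    image: registry.example.com/rhel7/etcd:3.2.22-18"
```

Against the real file, `preview_etcd_image < /etc/origin/node/pods/etcd.yaml | grep image` shows the line that sed -i would write. The pattern replaces everything from the first occurrence of "etcd" on the image line, so a registry hostname that itself contains "etcd" would be truncated.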

On all hosts, etcd needs to be installed before restoring from a snapshot so that the etcd CLI can be used.

- Make sure the etcd rpm version is 3.2.22
# rpm -qa etcd

- If it is not installed, install etcd-3.2.22
# yum install etcd-3.2.22

- If it is installed but the version is etcd-3.3.x, downgrade
# yum downgrade etcd-3.2.22
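
The three cases above can be sketched as a small decision helper. This is an illustration only: etcd_rpm_action is a hypothetical name, and it assumes the package string has the form shown by rpm -qa etcd (empty output when the package is not installed).

```shell
# Hypothetical helper: given the output of `rpm -qa etcd` (empty when the
# package is not installed), print which yum action the steps above call for.
etcd_rpm_action() {
  case "$1" in
    "")             echo "install" ;;    # nothing installed
    etcd-3.2.22-*)  echo "none" ;;       # already at the target version
    etcd-3.3.*)     echo "downgrade" ;;  # 3.3.x must be downgraded
    *)              echo "check" ;;      # any other version: inspect by hand
  esac
}

etcd_rpm_action "etcd-3.3.11-2.el7.x86_64"   # sample package string; prints "downgrade"
```

On a real host it would be called as `etcd_rpm_action "$(rpm -qa etcd)"`.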

With etcd stopped and /var/lib/etcd removed, we can now restore from our snapshot.

  • It is very important that, after each restore, the cluster ID is the same on every restored etcd host.
  • Do not start etcd until the restore has happened on each etcd host.
  • The values of the --initial-cluster-token and --initial-cluster options need to be the same on all restored hosts.
  • If restoring from the copied backup /var/lib/etcd/member/snap/db, the option --skip-hash-check=true is needed. It is not needed if a snapshot was taken and is being used for the restore.
# source /etc/etcd/etcd.conf
# export ETCDCTL_API=3

- Confirm that ETCD_INITIAL_CLUSTER lists all etcd hosts in the form hostname=https://IP:2380
# echo -e "$ETCD_INITIAL_CLUSTER \n$ETCD_INITIAL_CLUSTER_TOKEN"
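
The format check can also be done mechanically rather than by eye. A minimal sketch, assuming every entry should match name=https://host:2380 as described above; check_initial_cluster is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if every comma-separated entry in the
# given string has the form name=https://host:2380.
check_initial_cluster() {
  local entry
  local IFS=','
  [ -n "$1" ] || return 1            # an empty value is never valid
  for entry in $1; do
    case "$entry" in
      ?*=https://?*:2380) ;;         # well-formed entry
      *) echo "unexpected entry: $entry" >&2; return 1 ;;
    esac
  done
}

# Sample value with two hosts; on a real host pass "$ETCD_INITIAL_CLUSTER".
check_initial_cluster "master1.etcd.com=https://10.0.88.11:2380,master2.etcd.com=https://10.0.88.22:2380" \
  && echo "initial cluster looks well-formed"
```

This only checks the shape of each entry. It does not verify that all hosts are present or that the IPs are correct.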

- If restoring from snapshot.db, run the following:
# etcdctl snapshot restore /tmp/snapshot.db \
  --name $ETCD_NAME \
  --initial-cluster $ETCD_INITIAL_CLUSTER \
  --initial-cluster-token $ETCD_INITIAL_CLUSTER_TOKEN \
  --initial-advertise-peer-urls $ETCD_INITIAL_ADVERTISE_PEER_URLS \
  --data-dir /var/lib/etcd
  
- If restoring from the copied backup /var/lib/etcd/member/snap/db, run the following:
# etcdctl snapshot restore /tmp/db \
  --name $ETCD_NAME \
  --initial-cluster $ETCD_INITIAL_CLUSTER \
  --initial-cluster-token $ETCD_INITIAL_CLUSTER_TOKEN \
  --initial-advertise-peer-urls $ETCD_INITIAL_ADVERTISE_PEER_URLS \
  --data-dir /var/lib/etcd \
  --skip-hash-check=true
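
Since the cluster ID must match on every restored host, it can be useful to pull it out of the restore output and compare it across hosts. A minimal sketch, assuming the restore prints "added member ... to cluster <id>" lines like those in the example run further down; restore_cluster_ids is a hypothetical helper name.

```shell
# Hypothetical helper: read restore output on stdin and print each distinct
# cluster ID found in the "added member ... to cluster <id>" lines.
restore_cluster_ids() {
  awk '/added member/ && /to cluster/ {print $NF}' | sort -u
}

# Sample log line taken from the example run in this document.
printf '%s\n' \
  '2019-02-05 12:49:04.135995 I | etcdserver/membership: added member d35cfd2fedc078f [https://10.0.88.33:2380] to cluster 1a196dd3442fbe59' \
  | restore_cluster_ids
# prints 1a196dd3442fbe59
```

One way to use it is to append `2>&1 | tee /tmp/restore.log` to the restore command on each host (the log path is an arbitrary choice), run `restore_cluster_ids < /tmp/restore.log`, and confirm that every host prints the same single ID.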

Restore the SELinux context of /var/lib/etcd:

# restorecon -Rv /var/lib/etcd

Once every host has been restored, start docker and atomic-openshift-node:

# systemctl start docker atomic-openshift-node

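Confirm health of etcd:

# ETCD_ALL_ENDPOINTS=` etcdctl3 --write-out=fields   member list | awk '/ClientURL/{printf "%s%s",sep,$3; sep=","}'`
# etcdctl3 --endpoints=$ETCD_ALL_ENDPOINTS  endpoint status  --write-out=table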
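
After a downgrade, the main thing to confirm in the endpoint status table is that every member reports a 3.2.x version. A minimal sketch, assuming the table layout shown in the example run (VERSION as the third column); all_endpoints_on_3_2 is a hypothetical helper name.

```shell
# Hypothetical helper: read `endpoint status --write-out=table` output on
# stdin and fail if any endpoint row reports a version other than 3.2.x,
# or if no endpoint rows are found at all.
all_endpoints_on_3_2() {
  awk -F'|' '
    $2 ~ /:\/\// {                    # data rows only (skips borders and header)
      rows++
      v = $4; gsub(/ /, "", v)        # VERSION is the third table column
      if (v !~ /^3\.2\./) bad++
    }
    END { exit (rows == 0 || bad > 0) }
  '
}

# Sample row in the same layout as the example run.
printf '%s\n' \
  '| https://10.0.88.11:2379 | d91b1c20df818655 |  3.2.22 |   17 MB |      true |         6 |         42 |' \
  | all_endpoints_on_3_2 && echo "all endpoints report 3.2.x"
```

On a real cluster the table would be piped in directly: `etcdctl3 --endpoints=$ETCD_ALL_ENDPOINTS endpoint status --write-out=table | all_endpoints_on_3_2`. The etcdctl `endpoint health` subcommand, run with the same --endpoints value, gives a separate per-endpoint health check.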

Example run-through with 3 etcd hosts

ETCD Hosts:

  • master1.etcd.com
  • master2.etcd.com
  • master3.etcd.com
# ssh master1.etcd.com
# ETCD_ALL_ENDPOINTS=` etcdctl3 --write-out=fields   member list | awk '/ClientURL/{printf "%s%s",sep,$3; sep=","}'`
# etcdctl3 --endpoints=$ETCD_ALL_ENDPOINTS  endpoint status  --write-out=table 
+-----------------------------------+------------------+---------+---------+-----------+-----------+------------+
|           ENDPOINT                |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+-----------------------------------+------------------+---------+---------+-----------+-----------+------------+
|     https://master1.etcd.com:2379 | d91b1c20df818655 |  3.3.11 |   17 MB |      true |         6 |       7863 |
|           https://10.0.88.33:2379 |  d35cfd2fedc078f |  3.3.11 |   17 MB |     false |         6 |       7863 |
|           https://10.0.88.22:2379 | c9624828ed10ae36 |  3.3.11 |   17 MB |     false |         6 |       7863 |
|           https://10.0.88.11:2379 | d91b1c20df818655 |  3.3.11 |   17 MB |      true |         6 |       7863 |
+-----------------------------------+------------------+---------+---------+-----------+-----------+------------+


# etcdctl3 snapshot save /tmp/snapshot.db
# cp /var/lib/etcd/member/snap/db /tmp/db

# scp /tmp/snapshot.db master2.etcd.com:/tmp/snapshot.db
# scp /tmp/snapshot.db master3.etcd.com:/tmp/snapshot.db

# systemctl stop docker atomic-openshift-node
# sed -i '/image/ s/etcd.*$/etcd:3.2.22-18/' /etc/origin/node/pods/etcd.yaml  
# rm -rf /var/lib/etcd

# ssh master2.etcd.com
# systemctl stop docker atomic-openshift-node
# sed -i '/image/ s/etcd.*$/etcd:3.2.22-18/' /etc/origin/node/pods/etcd.yaml 
# rm -rf /var/lib/etcd


# ssh master3.etcd.com
# systemctl stop docker atomic-openshift-node
# sed -i '/image/ s/etcd.*$/etcd:3.2.22-18/' /etc/origin/node/pods/etcd.yaml 
# rm -rf /var/lib/etcd


# ssh master1.etcd.com 
# rpm -qa etcd 
etcd-3.2.22-1.el7.x86_64
# source /etc/etcd/etcd.conf
# export ETCDCTL_API=3
# echo -e  "$ETCD_INITIAL_CLUSTER \n$ETCD_INITIAL_CLUSTER_TOKEN"
  master1.etcd.com=https://10.0.88.11:2380,master2.etcd.com=https://10.0.88.22:2380,master3.etcd.com=https://10.0.88.33:2380  
  etcd-cluster-1

# ETCDCTL_API=3 etcdctl snapshot restore /tmp/snapshot.db \
  --name master1.etcd.com \
  --initial-cluster master1.etcd.com=https://10.0.88.11:2380,master2.etcd.com=https://10.0.88.22:2380,master3.etcd.com=https://10.0.88.33:2380 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-advertise-peer-urls https://10.0.88.11:2380 \
  --data-dir /var/lib/etcd 
2019-02-05 12:49:04.103233 I | mvcc: restore compact to 2361744
2019-02-05 12:49:04.135995 I | etcdserver/membership: added member d35cfd2fedc078f [https://10.0.88.33:2380] to cluster 1a196dd3442fbe59
2019-02-05 12:49:04.136161 I | etcdserver/membership: added member c9624828ed10ae36 [https://10.0.88.22:2380] to cluster 1a196dd3442fbe59
2019-02-05 12:49:04.136267 I | etcdserver/membership: added member d91b1c20df818655 [https://10.0.88.11:2380] to cluster 1a196dd3442fbe59

# restorecon -Rv /var/lib/etcd

# ssh master2.etcd.com
# ETCDCTL_API=3 etcdctl snapshot restore /tmp/snapshot.db \
  --name master2.etcd.com \
  --initial-cluster master1.etcd.com=https://10.0.88.11:2380,master2.etcd.com=https://10.0.88.22:2380,master3.etcd.com=https://10.0.88.33:2380 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-advertise-peer-urls https://10.0.88.22:2380 \
  --data-dir /var/lib/etcd 
2019-02-05 12:51:25.179801 I | mvcc: restore compact to 2356950
2019-02-05 12:51:25.193709 I | etcdserver/membership: added member d35cfd2fedc078f [https://10.0.88.33:2380] to cluster 1a196dd3442fbe59
2019-02-05 12:51:25.193745 I | etcdserver/membership: added member c9624828ed10ae36 [https://10.0.88.22:2380] to cluster 1a196dd3442fbe59
2019-02-05 12:51:25.193759 I | etcdserver/membership: added member d91b1c20df818655 [https://10.0.88.11:2380] to cluster 1a196dd3442fbe59

# restorecon -Rv /var/lib/etcd


# ssh master3.etcd.com
# ETCDCTL_API=3 etcdctl snapshot restore /tmp/snapshot.db \
  --name master3.etcd.com \
  --initial-cluster master1.etcd.com=https://10.0.88.11:2380,master2.etcd.com=https://10.0.88.22:2380,master3.etcd.com=https://10.0.88.33:2380 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-advertise-peer-urls https://10.0.88.33:2380 \
  --data-dir /var/lib/etcd 
2019-02-05 12:53:06.612149 I | mvcc: restore compact to 2356950
2019-02-05 12:53:06.634761 I | etcdserver/membership: added member d35cfd2fedc078f [https://10.0.88.33:2380] to cluster 1a196dd3442fbe59
2019-02-05 12:53:06.634905 I | etcdserver/membership: added member c9624828ed10ae36 [https://10.0.88.22:2380] to cluster 1a196dd3442fbe59
2019-02-05 12:53:06.635001 I | etcdserver/membership: added member d91b1c20df818655 [https://10.0.88.11:2380] to cluster 1a196dd3442fbe59

# restorecon -Rv /var/lib/etcd

# ssh master1.etcd.com
# systemctl start docker atomic-openshift-node

# ssh master2.etcd.com
# systemctl start docker atomic-openshift-node

# ssh master3.etcd.com
# systemctl start docker atomic-openshift-node

# ssh master1.etcd.com
# ETCD_ALL_ENDPOINTS=` etcdctl3 --write-out=fields   member list | awk '/ClientURL/{printf "%s%s",sep,$3; sep=","}'`
# etcdctl3 --endpoints=$ETCD_ALL_ENDPOINTS  endpoint status  --write-out=table 
+-----------------------------------+------------------+---------+---------+-----------+-----------+------------+
|           ENDPOINT                |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+-----------------------------------+------------------+---------+---------+-----------+-----------+------------+
|     https://master1.etcd.com:2379 | d91b1c20df818655 |  3.2.22 |   17 MB |      true |         6 |         42 |
|           https://10.0.88.33:2379 |  d35cfd2fedc078f |  3.2.22 |   17 MB |     false |         6 |         42 |
|           https://10.0.88.22:2379 | c9624828ed10ae36 |  3.2.22 |   17 MB |     false |         6 |         42 |
|           https://10.0.88.11:2379 | d91b1c20df818655 |  3.2.22 |   17 MB |      true |         6 |         42 |
+-----------------------------------+------------------+---------+---------+-----------+-----------+------------+