Etcd Backup and Restore
In Kubernetes, etcd is like the brain of the cluster — it’s where all the configuration and resource data lives. It runs as a static Pod started by kubelet, and if you lose it, you basically lose your whole cluster configuration. So yes, backing it up regularly is a must.
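If you want to see that for yourself, on a kubeadm-built cluster the etcd manifest normally sits in /etc/kubernetes/manifests/ and the Pod shows up in kube-system. A quick check (the label selector assumes kubeadm defaults):

````bash
# the static Pod manifest kubelet starts etcd from (kubeadm default location)
ls -l /etc/kubernetes/manifests/etcd.yaml

# the resulting Pods, one per control plane node
kubectl -n kube-system get pods -l component=etcd
````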
Etcd Backup
I use the etcdctl tool (the official etcd client) for backups. I grabbed the binary from the official GitHub releases page and put it on my first control node:
````bash
wget https://github.com/etcd-io/etcd/releases/download/v3.5.19/etcd-v3.5.19-linux-amd64.tar.gz
````
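Roughly what the unpacking looks like (same version as the download above; adjust paths if you keep the binary somewhere else):

````bash
# unpack the release tarball and switch into the extracted directory
tar xzvf etcd-v3.5.19-linux-amd64.tar.gz
cd etcd-v3.5.19-linux-amd64

# sanity check: the client should report v3.5.19
./etcdctl version
````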
After extracting the tar file, I first checked if etcdctl could talk to my etcd server:
````bash
./etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/server.crt \
  --key /etc/kubernetes/pki/etcd/server.key \
  member list
````
That gave me a list of the active etcd members in my cluster.
````bash result
10d8c2fc2b2b053c, started, kube-master3.home.lab, https://192.168.11.73:2380, https://192.168.11.73:2379, false
373f7e005da9deca, started, kube-master2.home.lab, https://192.168.11.72:2380, https://192.168.11.72:2379, false
aa70f81cf7c7bc59, started, kube-master1.home.lab, https://192.168.11.71:2380, https://192.168.11.71:2379, false
````
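Typing those three TLS flags every time gets old quickly. etcdctl also reads them from environment variables, so you can export them once per shell and drop the flags from the commands that follow (same certificate paths as above):

````bash
# etcdctl maps each connection flag to an ETCDCTL_* environment variable
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key

# now the short form works
./etcdctl member list
````

I keep the full flags in the examples below so they also work in a fresh shell.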
If you’re wondering where those certificate paths come from, you can find them by running the following on a control plane node:
````bash
ps aux | grep etcd
````
Usually, they’re stored in the /etc/kubernetes/pki/etcd/ directory.
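Alternatively, the same paths can be read straight out of the etcd static Pod manifest, assuming the default kubeadm location:

````bash
# the TLS paths are passed to etcd as flags inside the static Pod manifest
grep -E 'cert-file|key-file|trusted-ca-file' /etc/kubernetes/manifests/etcd.yaml
````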
The following command lists all the keys stored in etcd; the --keys-only flag prints the keys without their values:
````bash
./etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/server.crt \
  --key /etc/kubernetes/pki/etcd/server.key \
  get / --prefix --keys-only
````
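Kubernetes stores its objects under the /registry prefix, so you can narrow the listing down to a single resource type. For example, to list only the Deployment keys (same flags as above, just a different prefix):

````bash
# Deployments live under /registry/deployments/<namespace>/<name>
./etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/server.crt \
  --key /etc/kubernetes/pki/etcd/server.key \
  get /registry/deployments --prefix --keys-only
````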
Creating a snapshot
The actual backup is just one command:
````bash
[root@kube-master1 etcd-v3.5.19-linux-amd64]# ./etcdctl --endpoints=https://127.0.0.1:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key snapshot save /opt/etcd-backups/etcd-backup1
````
After a few seconds, I had a nice 72 MB snapshot file sitting in /opt/etcd-backups/.
{"level":"info","ts":"2025-03-16T18:02:37.760533+0100","caller":"snapshot/v3_snapshot.go:65","msg":"created temporary db file","path":"/opt/etcd-backups/etcd-backup1.part"}
{"level":"info","ts":"2025-03-16T18:02:37.772177+0100","logger":"client","caller":"v3@v3.5.19/maintenance.go:212","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":"2025-03-16T18:02:37.772292+0100","caller":"snapshot/v3_snapshot.go:73","msg":"fetching snapshot","endpoint":"https://127.0.0.1:2379"}
{"level":"info","ts":"2025-03-16T18:02:39.263730+0100","logger":"client","caller":"v3@v3.5.19/maintenance.go:220","msg":"completed snapshot read; closing"}
{"level":"info","ts":"2025-03-16T18:02:39.412698+0100","caller":"snapshot/v3_snapshot.go:88","msg":"fetched snapshot","endpoint":"https://127.0.0.1:2379","size":"75 MB","took":"1 second ago"}
{"level":"info","ts":"2025-03-16T18:02:39.412999+0100","caller":"snapshot/v3_snapshot.go:97","msg":"saved","path":"/opt/etcd-backups/etcd-backup1"}
Snapshot saved at /opt/etcd-backups/etcd-backup1
[root@kube-master1 etcd-v3.5.19-linux-amd64]# ll -h /opt/etcd-backups/etcd-backup1
-rw-------. 1 root root 72M Mar 16 18:02 /opt/etcd-backups/etcd-backup1
[root@kube-master1 etcd-v3.5.19-linux-amd64]#
To check the snapshot details:
````bash
[root@kube-master1 etcd-v3.5.19-linux-amd64]# ./etcdctl --write-out=table snapshot status /opt/etcd-backups/etcd-backup1
````
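Since the whole point is taking backups regularly, I find it handy to wrap the save command in a small script and let cron run it. A minimal sketch; the script path, binary location, schedule, and 7-day retention are just values I picked, not anything the cluster requires:

````bash
#!/usr/bin/env bash
# /usr/local/bin/etcd-backup.sh (hypothetical path): nightly etcd snapshot with simple retention
set -euo pipefail

BACKUP_DIR=/opt/etcd-backups
ETCDCTL=/opt/etcd-v3.5.19-linux-amd64/etcdctl   # wherever you extracted the tarball

mkdir -p "$BACKUP_DIR"

# timestamped snapshot so older backups are not overwritten
"$ETCDCTL" --endpoints=https://127.0.0.1:2379 \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/server.crt \
  --key /etc/kubernetes/pki/etcd/server.key \
  snapshot save "$BACKUP_DIR/etcd-backup-$(date +%F-%H%M)"

# drop snapshots older than 7 days
find "$BACKUP_DIR" -name 'etcd-backup-*' -mtime +7 -delete
````

A crontab entry like `0 2 * * * /usr/local/bin/etcd-backup.sh` then takes a snapshot every night at 02:00. Copying the files off the control plane node is a good idea as well.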
Restoring etcd from a snapshot
If things go wrong and you need to restore:
- Stop the core Kubernetes control plane components. Move the static Pod manifests out of the way (don’t delete them!); kubelet notices and stops the corresponding containers within a few seconds, which you can confirm with crictl:

````bash
mkdir /etc/kubernetes/manifests_bckp
mv /etc/kubernetes/manifests/* /etc/kubernetes/manifests_bckp/
crictl ps
````

- Rename or remove /var/lib/etcd. This clears the old etcd data.

- Run the restore command (for a multi-master cluster, see the note after these steps):

````bash
./etcdctl snapshot restore /opt/etcd-backups/etcd-backup1 --data-dir /var/lib/etcd
````

- Bring the static Pods back:

````bash
mv /etc/kubernetes/manifests_bckp/* /etc/kubernetes/manifests/
````

- Check that everything is up:

````bash
crictl ps
kubectl get all
````
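One caveat: the restore above rebuilds the data directory for a single member. For a multi-master cluster like the three-node one in the member list earlier, each control plane node runs its own restore with member-specific flags before the manifests go back, roughly like this (names and IPs taken from my member list output; adjust them to your topology, and copy the snapshot file to each node first):

````bash
# on kube-master1; repeat on each control plane node with its own --name and peer URL
./etcdctl snapshot restore /opt/etcd-backups/etcd-backup1 \
  --name kube-master1.home.lab \
  --initial-cluster kube-master1.home.lab=https://192.168.11.71:2380,kube-master2.home.lab=https://192.168.11.72:2380,kube-master3.home.lab=https://192.168.11.73:2380 \
  --initial-advertise-peer-urls https://192.168.11.71:2380 \
  --data-dir /var/lib/etcd
````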