LAB18: Etcd Backup & Restore

1. Listing Kubernetes objects from etcd (pods, deployments, secrets…)

2. Taking a FULL etcd snapshot (correct way for kubeadm)

3. Restoring the cluster from the snapshot (disaster recovery)

PART 1 — LIST OBJECTS IN ETCD (Kubernetes registry)

List ALL keys (huge output)

sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  get /registry/ --prefix --keys-only

List all PODS stored in etcd

sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  get /registry/pods/ --prefix --keys-only

Example output:
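
(Keys follow the pattern /registry/pods/<namespace>/<pod-name>; the exact names depend on your cluster.)

/registry/pods/kube-system/etcd-<node-name>
/registry/pods/kube-system/kube-apiserver-<node-name>
/registry/pods/default/<your-pod-name>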


List deployments
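
Same flags as the pods command above, just a different registry prefix:

sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  get /registry/deployments/ --prefix --keys-only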


List secrets
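
Again the same flags, with the /registry/secrets/ prefix:

sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  get /registry/secrets/ --prefix --keys-only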


List configmaps
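
Same again, with the /registry/configmaps/ prefix:

sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  get /registry/pods/ --prefix --keys-only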


View a single object (raw protobuf)

(Remember: RAW output is binary protobuf and unreadable.)
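
Pick any key from the listings above; the namespace and pod name below are placeholders:

sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  get /registry/pods/<namespace>/<pod-name>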


PART 2 — FULL ETCD BACKUP (Kubeadm Production-Safe)

Create a snapshot
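
A typical invocation on a kubeadm control-plane node (the output path /opt/etcd-backup.db is just the example used in this lab):

sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  snapshot save /opt/etcd-backup.db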

Confirm:
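
(Newer etcd releases may print a note pointing to etcdutl for this; both report the same status.)

sudo ETCDCTL_API=3 etcdctl snapshot status /opt/etcd-backup.db --write-out=table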

You will see:
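
A small table; the hash, revision, key count, and size will differ on your cluster:

+----------+----------+------------+------------+
|   HASH   | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| fe01cf57 |     2543 |       1332 |     2.6 MB |
+----------+----------+------------+------------+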


PART 3 — RESTORE ETCD (DISASTER RECOVERY)

⚠️ Must stop kube-apiserver and etcd first.
⚠️ Restore must be done on ONE control-plane node at a time.


Stop the static etcd pod
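
One common approach on a kubeadm node is to move the static pod manifest out of the directory kubelet watches (moving it to /etc/kubernetes/ is just the choice used in this sketch):

sudo mv /etc/kubernetes/manifests/etcd.yaml /etc/kubernetes/
# Per the warning above, you can stop kube-apiserver the same way:
sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /etc/kubernetes/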

Wait about 20 seconds → kubelet notices the manifest is gone and stops etcd.
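
You can confirm from the container runtime (no output means etcd has stopped):

sudo crictl ps | grep etcd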


Restore snapshot to a new data dir
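
A sketch using the example paths from this lab (/opt/etcd-backup.db and /var/lib/etcd-restored; newer etcd releases also ship etcdutl snapshot restore for the same job):

sudo ETCDCTL_API=3 etcdctl snapshot restore /opt/etcd-backup.db \
  --data-dir /var/lib/etcd-restored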

This creates:
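
/var/lib/etcd-restored/          (the --data-dir chosen above)
  member/
    snap/
    wal/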


Update etcd.yaml to point to restored data-dir

Edit:
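
In this sketch the manifest was moved to /etc/kubernetes/ in the stop step, so edit it there (any editor works):

sudo vi /etc/kubernetes/etcd.yaml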

Find:
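
In the default kubeadm manifest the host directory is set by the etcd-data hostPath volume (your file may differ slightly):

  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data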

Replace with:
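
Point the hostPath at the restored directory; the container still mounts it at /var/lib/etcd, so --data-dir does not need to change:

  - hostPath:
      path: /var/lib/etcd-restored
      type: DirectoryOrCreate
    name: etcd-data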

Save the file, then move it back into /etc/kubernetes/manifests → kubelet will automatically restart etcd:
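
sudo mv /etc/kubernetes/etcd.yaml /etc/kubernetes/manifests/   # location as chosen in the stop step of this sketch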


Wait until etcd is running again

Check:
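
From the container runtime (works even while kube-apiserver is still down):

sudo crictl ps | grep etcd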

Or:
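
Ask etcd directly over its client endpoint:

sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  endpoint health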


Bring kube-apiserver back

If needed:
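
If you moved the kube-apiserver manifest out during the stop step (as in the sketch above), move it back and kubelet will restart it:

sudo mv /etc/kubernetes/kube-apiserver.yaml /etc/kubernetes/manifests/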


PART 4 — Verify cluster health after restore
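
Typical checks (compare with what existed before the backup: pods, deployments, secrets, configmaps):

kubectl get nodes
kubectl get pods -A
kubectl get deployments,secrets,configmaps -A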

If everything works → recovery successful.


PART 5 — Optional: Automate backups

Create cron job:
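
On the control-plane node, as root (the backup directory /opt/etcd-backups is just the example used here):

sudo mkdir -p /opt/etcd-backups
sudo crontab -e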

Add:
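
A daily 02:00 snapshot; the schedule, paths, and etcdctl location are examples, so adjust them for your node (note the escaped % required inside crontab):

0 2 * * * ETCDCTL_API=3 /usr/bin/etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key snapshot save /opt/etcd-backups/etcd-$(date +\%F).db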
