BLOG07: Node Maintenance - Cordon/Drain

1. What is kubectl cordon?

Cordon = mark a node unschedulable

When a node is cordoned:

  • Kubernetes stops scheduling new pods on that node

  • Existing pods keep running

  • The node is marked SchedulingDisabled (a kind of safe mode for scheduling)

Command:

kubectl cordon <node-name>

Common use cases:

  • Before performing maintenance

  • Temporary pause on scheduling

  • Testing cluster behavior without disrupting running workloads

Visual Summary:

Action      New Pods         Existing Pods
cordon      Not scheduled    Keep running


2. What is kubectl drain?

Drain = evict all pods safely + mark node unschedulable

When a node is drained:

  1. Node becomes unschedulable (same as cordon)

  2. Kubernetes evicts all pods safely (graceful termination)

  3. Control plane reschedules pods to other nodes

Command:
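kubectl drain <node-name>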

What happens under the hood:

  • Pod Disruption Budgets (PDB) are checked

  • Graceful termination is respected

  • Replicas are recreated on other nodes

Visual:

Action      New Pods         Existing Pods
drain       Not scheduled    Evicted + rescheduled

Drain is used for:

  • Node upgrade (kubelet, OS, kernel patch)

  • Node replacement

  • Planned maintenance


3. What is the Node Release Process?

The Node Release Process is the operational workflow for safely taking a node out of service and bringing it back into service afterwards.

The process consists of 6 stages:


Stage 1: Cordon the node

Prevent new pods from being scheduled:
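kubectl cordon <node-name>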


Stage 2: Drain the node

Move running workloads away:
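kubectl drain <node-name> --ignore-daemonsets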

This safely evicts pods.


Stage 3: Perform Maintenance

Examples:

  • OS patching

  • Kubelet upgrade

  • Hardware change

  • Reboot

  • Cloud provider draining (AWS, GCP)


Stage 4: Bring node back online

After maintenance, the kubelet reconnects to the control plane.

Node status becomes: Ready
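A quick way to confirm (any kubectl context with access to the cluster works):

kubectl get node <node-name>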


Stage 5: Uncordon the node

Allow scheduling to resume:
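kubectl uncordon <node-name>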

Scheduling of new pods resumes.


Stage 6: Verify scheduling
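
One way to check, for example, is to confirm the node shows Ready without SchedulingDisabled and that new pods land on it:

kubectl get nodes
kubectl get pods -o wide -A | grep <node-name>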


Cordon vs Drain – Comparison

CORDON

  • Do not place new pods

  • Do not remove running pods

DRAIN

  • Do not place new pods

  • Evict all running pods


Additional Information: What happens if a node is deleted from cluster provider (AWS/GCP/VMware)?

  • Kubernetes marks it NotReady / Unreachable

  • After ~5 minutes (the default eviction timeout), its pods are evicted

  • kube-controller-manager (or the cloud controller manager) deletes the Node object once the instance is gone

  • Pods are recreated on healthy nodes automatically

This is called Node Self-Healing.


The time required to evict pods during a kubectl drain is not fixed — it depends on several factors. The following sections provide a detailed breakdown to help predict the timing accurately.


1. Pod Eviction Time Depends On the Pod's Termination Grace Period

Each pod has:
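  • a terminationGracePeriodSeconds value in its spec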

Default = 30 seconds

Therefore, each pod may take up to 30 seconds to shut down cleanly unless overridden.

When executing:
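kubectl drain <node-name>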

Kubernetes waits for the pod to:

  1. Run its preStop lifecycle hook (if one is defined)

  2. Receive SIGTERM

  3. Gracefully stop

  4. Be killed with SIGKILL if it exceeds the grace period

Example:

  • Pod A: grace period = 30s

  • Pod B: grace period = 10s

  • Pod C: grace period = 60s

Evictions run in parallel, so total drain time ≈ 60 seconds (the slowest pod dominates).


2. DaemonSet pods do NOT block drain

Drain automatically skips DaemonSet pods when using:
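kubectl drain <node-name> --ignore-daemonsets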

If this flag is not included, the drain refuses to proceed and exits with an error listing the DaemonSet-managed pods.


3. Pods using local storage (emptyDir) require a flag
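
kubectl drain <node-name> --delete-emptydir-data

(Older kubectl versions call this flag --delete-local-data.)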

If this flag is not provided, the drain fails immediately rather than hanging.


4. Pod Disruption Budgets (PDB) can block drain

Suppose a PDB requires that at least 1 replica remain available at all times.

Example PDB:
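A minimal sketch (the app name and labels are placeholders):

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb          # illustrative name
spec:
  minAvailable: 1           # at least one replica must stay up
  selector:
    matchLabels:
      app: my-app           # illustrative label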

If attempting to drain the last node hosting that pod, the drain will wait indefinitely, retrying the eviction until the PDB can be satisfied, or until eviction is bypassed with:
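
kubectl drain <node-name> --disable-eviction

(--disable-eviction deletes pods directly instead of going through the eviction API, bypassing PDB checks; use with caution.)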


5. Typical cluster drain timing

Drain time grows with the workload profile, from tens of seconds on simple clusters to several minutes per pod when grace periods are very long:

  • Small cluster, simple workloads

  • Medium workloads (web apps, backing services)

  • Heavy workloads (Java apps, large caches, ML processes)

  • Pods with very large grace periods

Example:
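A pod configured with a 5-minute grace period (illustrative value):

spec:
  terminationGracePeriodSeconds: 300   # 5 minutes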

Drain may take 5 minutes per pod.

This can be overridden with:
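kubectl drain <node-name> --grace-period=30

(The value 30 is illustrative; --grace-period imposes a shorter termination window on every pod being evicted.)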

But pods that need longer than the imposed period will be force-killed before they can shut down cleanly.


6. Monitoring eviction behavior

To monitor eviction behavior, execute:
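For example, watching cluster events and pod placement side by side works well:

kubectl get events --watch
kubectl get pods -o wide --watch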

The output will show:

  • Eviction request sent

  • SIGTERM delivered

  • Container shutting down

  • Pod deleted

  • New pod scheduled elsewhere


Summary

Factor                      Impact
Termination grace period    Biggest time factor (default 30s)
PDB                         May block indefinitely
DaemonSets                  Must be ignored or the drain errors out
Local volumes               Need the --delete-emptydir-data flag
Number of pods              Evictions run in parallel but still wait for slow pods

Typical drain time: 30–90 seconds for most clusters.
