BLOG20: Why We Need StatefulSet

What is a StatefulSet?

A StatefulSet is a Kubernetes workload controller used to manage stateful applications, that is, applications that need stable identity, persistent storage, and ordered deployment.

Key Features

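Stable network identity → Pod names follow a predictable format: <statefulset-name>-0, <statefulset-name>-1, <statefulset-name>-2
Stable storage → Each Pod gets its own PersistentVolumeClaim (created from volumeClaimTemplates). The volume is not deleted when the Pod restarts or is rescheduled; it is reattached to the replacement Pod.
Ordered deployment & scaling → By default, Pod-0 starts first, then Pod-1, etc. Each Pod must be Running and Ready before the next one is created.
Ordered updates & termination → By default, rolling updates and scale-down proceed in reverse order, from the highest ordinal down.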
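To make these features concrete, here is a minimal sketch that builds a StatefulSet manifest as a plain Python dict and prints it as JSON (which kubectl also accepts). The name mysql, the image tag, the mount path, and the 10Gi size are illustrative assumptions, and the result is not a complete MySQL deployment (it omits configuration such as credentials).

```python
import json


def statefulset_manifest(name: str, replicas: int, image: str,
                         mount_path: str, storage: str) -> dict:
    """Build a minimal apps/v1 StatefulSet manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            # Name of the governing headless Service (created separately).
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "volumeMounts": [{"name": "data", "mountPath": mount_path}],
                    }]
                },
            },
            # One PersistentVolumeClaim is created per Pod from this template.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


def pvc_names(manifest: dict) -> list[str]:
    """Claim names follow <template-name>-<statefulset-name>-<ordinal>."""
    name = manifest["metadata"]["name"]
    replicas = manifest["spec"]["replicas"]
    return [
        f'{template["metadata"]["name"]}-{name}-{ordinal}'
        for template in manifest["spec"]["volumeClaimTemplates"]
        for ordinal in range(replicas)
    ]


# Illustrative values only (assumptions, not taken from a real cluster).
manifest = statefulset_manifest(
    "mysql", replicas=3, image="mysql:8.0",
    mount_path="/var/lib/mysql", storage="10Gi",
)
print(json.dumps(manifest, indent=2))
print(pvc_names(manifest))
```

The per-Pod claim names are what tie each ordinal to its own data: the Pod with ordinal 2 always mounts the claim ending in -2.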

When to Use a StatefulSet? (Real Use Cases)

StatefulSet is ideal when each pod must keep its own data or identity.

Real Use Cases

Databases
- MySQL Cluster
- PostgreSQL with streaming replication
- MongoDB Replica Set
- Cassandra, Redis, etcd
Distributed systems that need stable identity
- Kafka brokers (kafka-0, kafka-1…)
- ZooKeeper quorum
- Elasticsearch nodes
Systems that maintain local state
- Storage systems
- Caches that cannot lose local data
- Message Queue clusters

Example: In Kafka, each broker must have a fixed ID and storage volume. A StatefulSet ensures that if kafka-2 is restarted, it still comes back as kafka-2 with its old data.

What is a DaemonSet?

A DaemonSet ensures that one copy of a Pod runs on every node (or on selected nodes) in the cluster.

Key Features

Schedules exactly 1 pod on each eligible node (node selectors, affinity, and taints can exclude nodes)
Automatically adds/removes pods as nodes join/leave
Ideal for node-level agents

When to Use a DaemonSet? (Real Use Cases)

DaemonSets are for workloads that must run on every node.

Real Use Cases

Log Collection Agents
- Fluentd / Fluent Bit
- Logstash
- Filebeat
Monitoring and Metrics
- Prometheus Node Exporter
- Datadog Agent
- New Relic Infra Agent
Networking Components
- CNI plugins (Calico, Weave, Cilium)
- Kube-proxy
Security Agents
- Falco
- Anti-virus / node scanners
Storage Drivers
- Ceph Agent
- CSI node plugin

Example: If you use Fluent Bit to collect logs from /var/log on every node, a DaemonSet ensures each node has a collector pod automatically.
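The Fluent Bit example can be sketched as a DaemonSet manifest built in Python. This is a minimal illustration, not a production configuration: the name fluent-bit, the image reference, and the volume name varlog are assumptions, and a real setup would also need Fluent Bit configuration and usually tolerations.

```python
import json


def log_collector_daemonset(name: str, image: str, host_log_dir: str) -> dict:
    """Build a minimal apps/v1 DaemonSet that mounts a host log directory."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name},
        "spec": {
            # Note: a DaemonSet has no "replicas" field; the Pod count
            # follows the number of eligible nodes.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "volumeMounts": [{
                            "name": "varlog",
                            "mountPath": host_log_dir,
                            "readOnly": True,
                        }],
                    }],
                    # hostPath exposes the node's own directory to the Pod.
                    "volumes": [{
                        "name": "varlog",
                        "hostPath": {"path": host_log_dir},
                    }],
                },
            },
        },
    }


# Illustrative values only (assumptions).
daemonset = log_collector_daemonset("fluent-bit", "fluent/fluent-bit:latest", "/var/log")
print(json.dumps(daemonset, indent=2))
```

Unlike the StatefulSet case there is no volumeClaimTemplates section: the agent reads the node's files directly instead of owning persistent data.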

StatefulSet vs DaemonSet (Simple Comparison)

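| Feature   | StatefulSet                            | DaemonSet                                      |
|-----------|----------------------------------------|------------------------------------------------|
| Purpose   | Stateful apps, maintain identity       | Node-level agent per node                      |
| Pod Names | Fixed (app-0, app-1)                   | Generated suffix (agent-xxxxx), one per node   |
| Storage   | Persistent per pod                     | Usually no persistent storage                  |
| Scaling   | Set by replicas                        | Automatic, follows the number of nodes         |
| Examples  | DBs, Kafka, ZooKeeper                  | Logging, monitoring, networking                |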

Quick, Practical Mnemonic

StatefulSet = Stable Identity
DaemonSet = One Pod Per Node

What is “Stable Network Identity”?

It means a Pod gets a permanent hostname and DNS name that does not change, even if:

✔ The pod is deleted
✔ The pod is rescheduled to another node
✔ The cluster restarts

This is critical for apps that need to identify each peer in a cluster.

Example

If you deploy a StatefulSet named mysql with 3 replicas, Kubernetes creates:

mysql-0
mysql-1
mysql-2

Their DNS hostnames will be:

mysql-0.mysql.default.svc.cluster.local
mysql-1.mysql.default.svc.cluster.local
mysql-2.mysql.default.svc.cluster.local

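These DNS names stay the same as long as the StatefulSet and its headless Service (here also named mysql, in the default namespace) exist. The Pod IP behind a name can change after a restart; the name itself does not.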
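The naming pattern is mechanical enough to compute. The helper below is a small sketch (the function name and parameters are hypothetical) that derives the per-Pod DNS names from the StatefulSet name, the headless Service name, the namespace, and the replica count, assuming the default cluster domain cluster.local.

```python
def pod_dns_names(statefulset: str, service: str, namespace: str,
                  replicas: int, cluster_domain: str = "cluster.local") -> list[str]:
    """Return the stable DNS name of every Pod in a StatefulSet.

    Pattern: <statefulset>-<ordinal>.<service>.<namespace>.svc.<cluster-domain>
    """
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


for dns_name in pod_dns_names("mysql", "mysql", "default", replicas=3):
    print(dns_name)
```

Because the names depend only on these inputs, a client or peer can be configured with them before the Pods exist.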

Why Deployment/ReplicaSet CAN’T provide stable identity

A Deployment/ReplicaSet manages pods like cattle:

Pod Names Change

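If a pod is deleted or evicted, the Deployment's ReplicaSet creates a replacement pod with a different name, for example:

nginx-85c9dcb6d-abcde
nginx-85c9dcb6d-f3kd2

The middle part is a hash of the pod template and the last part is a random suffix, so the full name is not predictable.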

No Fixed DNS Entry

Each new pod gets a new IP. Because:

Pod IPs are ephemeral
Pod names change randomly

you cannot rely on any individual pod to have a consistent identity. Only the Service name in front of the pods is stable.

ReplicaSet purpose

ReplicaSet only ensures N running replicas. It doesn’t care which pod is “pod-0” or “pod-1”.
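The difference in naming can be illustrated with a small simulation. This is only a sketch: the suffix alphabet and the helper names are assumptions for illustration, not the exact generator Kubernetes uses.

```python
import random
import string

# Illustrative alphabet for the random suffix (an assumption).
SUFFIX_ALPHABET = string.ascii_lowercase + string.digits


def deployment_pod_name(deployment: str, template_hash: str, rng: random.Random) -> str:
    """Deployment-style name: <deployment>-<template-hash>-<random suffix>."""
    suffix = "".join(rng.choice(SUFFIX_ALPHABET) for _ in range(5))
    return f"{deployment}-{template_hash}-{suffix}"


def statefulset_pod_name(statefulset: str, ordinal: int) -> str:
    """StatefulSet-style name: <statefulset>-<ordinal>."""
    return f"{statefulset}-{ordinal}"


rng = random.Random()

# Replacing a Deployment pod yields a new, unpredictable name each time.
print(deployment_pod_name("nginx", "85c9dcb6d", rng))
print(deployment_pod_name("nginx", "85c9dcb6d", rng))

# Replacing a StatefulSet pod with ordinal 1 always yields the same name.
print(statefulset_pod_name("mysql", 1))
print(statefulset_pod_name("mysql", 1))
```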

How StatefulSet Achieves Stable Network Identity

StatefulSet has two mechanisms:

1. Fixed Pod Names (Ordinal Indexing)

Pods are created sequentially:

pod-0 → pod-1 → pod-2

If pod-1 is restarted or rescheduled, it comes back with exactly the same name:

pod-1

2. Stable DNS via Headless Service

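StatefulSets require a Headless Service (clusterIP: None), which the StatefulSet references through its serviceName field.

Together they give each pod its own DNS entry:

pod-0.service-name.namespace.svc.cluster.local

Even after restarts or node failures, this DNS name remains valid and resolves to the pod's current IP.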
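A headless Service is an ordinary Service with clusterIP set to None. The sketch below builds one as a Python dict; the name mysql, the app label convention, and port 3306 are illustrative assumptions.

```python
import json


def headless_service_manifest(name: str, port: int) -> dict:
    """Build a minimal headless Service that selects Pods labeled app=<name>."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            # "None" (a string) is what makes the Service headless: no
            # virtual IP is allocated, and DNS returns the Pod addresses.
            "clusterIP": "None",
            "selector": {"app": name},
            "ports": [{"name": "tcp", "port": port}],
        },
    }


service = headless_service_manifest("mysql", 3306)
print(json.dumps(service, indent=2))
```

The Service name used here must match the serviceName field of the StatefulSet for the per-Pod DNS records to be created under it.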

Why do stateful apps need this?

Example 1: Cassandra

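Nodes form a ring, and each node is responsible for a specific token range (simplified here):

node-0 → token 0–100
node-1 → token 101–200
node-2 → token 201–300

If a node came back under a different identity, its peers could no longer tell which token range it owns, and the ring would break.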
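The simplified ring above can be expressed as a lookup from token to owning node. This is a toy sketch of the idea only: real Cassandra uses hashed partition tokens over a much larger range, virtual nodes, and wrap-around, none of which is modeled here. The names RING and owner_of are hypothetical.

```python
from bisect import bisect_left

# (upper bound of token range, node name), taken from the simplified example.
RING = [(100, "node-0"), (200, "node-1"), (300, "node-2")]


def owner_of(token: int) -> str:
    """Return the node responsible for a token in the toy ring."""
    highest_token = RING[-1][0]
    if not 0 <= token <= highest_token:
        raise ValueError(f"token {token} is outside the toy ring 0..{highest_token}")
    upper_bounds = [upper for upper, _ in RING]
    # The first range whose upper bound is >= token owns it.
    return RING[bisect_left(upper_bounds, token)][1]


for token in (0, 100, 101, 250):
    print(token, "->", owner_of(token))
```

The lookup only works because the node names on the right-hand side are fixed; if node-1 reappeared under another name, the mapping for tokens 101 to 200 would point at nothing.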

Example 2: Kafka

Each broker has a permanent ID:

broker.id=0
broker.id=1
broker.id=2

If Kafka pods kept getting new names (like in a Deployment), a broker could not reliably come back with the same ID, address, and data, and consumers and producers would lose their connections to the brokers.
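One common way to keep the broker ID fixed (an assumption here, not something prescribed above) is to derive it from the ordinal at the end of the pod hostname. The helper names below are hypothetical.

```python
import re


def ordinal_from_hostname(hostname: str) -> int:
    """Extract the StatefulSet ordinal from a hostname such as 'kafka-2'.

    Also accepts a full DNS name; only the first label is inspected.
    """
    first_label = hostname.split(".")[0]
    match = re.search(r"-(\d+)$", first_label)
    if match is None:
        raise ValueError(f"{hostname!r} does not end in an ordinal")
    return int(match.group(1))


def broker_id_line(hostname: str) -> str:
    """Render the broker.id setting for the broker running on this host."""
    return f"broker.id={ordinal_from_hostname(hostname)}"


for host in ("kafka-0", "kafka-1", "kafka-2"):
    print(host, "->", broker_id_line(host))
```

With a Deployment-style name such as nginx-85c9dcb6d-abcde the function raises ValueError, which is the point: there is no ordinal to anchor an ID to.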

Example 3: MongoDB Replica Set

Replica set members are registered by hostname (this is MongoDB's replica set, not the Kubernetes ReplicaSet):

mongo-0:27017
mongo-1:27017
mongo-2:27017

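If a pod's identity changed, the hostnames stored in the replica set configuration would no longer point at its members, and the configuration would become invalid.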
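The replica set configuration is a document that lists each member by hostname. The sketch below builds such a document for the three pods above; the set name rs0 and the helper name are assumptions, and in a real cluster the hosts would usually be the full per-pod DNS names rather than the short names.

```python
import json


def replica_set_config(set_name: str, statefulset: str,
                       replicas: int, port: int = 27017) -> dict:
    """Build a MongoDB replica set configuration document.

    Each member's host is derived from the StatefulSet pod name, so the
    document stays valid only while those names stay stable.
    """
    return {
        "_id": set_name,
        "members": [
            {"_id": ordinal, "host": f"{statefulset}-{ordinal}:{port}"}
            for ordinal in range(replicas)
        ],
    }


config = replica_set_config("rs0", "mongo", replicas=3)
print(json.dumps(config, indent=2))
```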

Summary Table

| Feature      | StatefulSet                         | Deployment/ReplicaSet |
|--------------|-------------------------------------|-----------------------|
| Pod Identity | Stable, predictable (app-0, app-1)  | Random (app-xxxxx)    |
| DNS          | Per-pod DNS                         | Only service DNS      |
| Storage      | Per-pod persistent volume           | Shared/ephemeral      |
| Order        | Ordered create/delete               | No order              |
| Use Case     | Databases, message queues           | Web apps, APIs        |

One-Line Answer

Stable network identity means pods get fixed names and DNS records. StatefulSet maintains this using ordered creation, fixed naming, and a headless service, which is something Deployment/ReplicaSet cannot do because they treat pods as interchangeable.
