BLOG16: What is the Kubernetes Operator Pattern?
In Kubernetes, the operator and controller patterns automate application and infrastructure lifecycle management. They are often confused, but each has a specific meaning.
This document provides a clear explanation of these concepts.
What is the Kubernetes Controller?
A controller is a control-loop program inside Kubernetes that continuously monitors the cluster state and tries to move it toward the desired state.
How a Controller Works
Every controller follows the reconcile loop pattern:
Observe: Read the desired state from etcd (via API server).
Compare: Check the actual state in the cluster.
Act: If they differ, make changes to move actual → desired.
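The observe, compare, act cycle above can be sketched in a few lines of Python. This is a minimal illustration, not how Kubernetes is implemented: the `reconcile` function and the plain dictionaries standing in for cluster state are assumptions made for the example.

```python
def reconcile(desired: dict, actual: dict) -> list:
    """Compare desired and actual state and return the actions needed."""
    actions = []
    # Anything desired but missing or different must be created or updated.
    for name, spec in desired.items():
        if name not in actual:
            actions.append(f"create {name}")
        elif actual[name] != spec:
            actions.append(f"update {name}")
    # Anything running that is no longer desired must be removed.
    for name in actual:
        if name not in desired:
            actions.append(f"delete {name}")
    return actions


# Hypothetical state: in a real cluster both sides come from the API server.
desired = {"web": {"image": "nginx:1.27"}, "cache": {"image": "redis:7"}}
actual = {"web": {"image": "nginx:1.25"}, "old-job": {"image": "busybox"}}
actions = reconcile(desired, actual)
print(actions)
```

When desired and actual already match, the function returns an empty list, which is the steady state every controller aims for.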
Built-in controllers in Kubernetes
Examples include the Deployment, ReplicaSet, StatefulSet, DaemonSet, Job, and Node controllers.
All of these continuously ensure Kubernetes resources work as expected.
For example:
The ReplicaSet controller ensures the number of running pods matches the declared replica count, say 3. If 1 pod dies, the controller creates a new one.
What is the Kubernetes Operator Pattern?
An Operator is a special kind of controller that manages application logic beyond what built-in controllers can do.
It extends Kubernetes using Custom Resource Definitions (CRDs).
An Operator = CRD + Custom Controller
CRD: defines a new Kubernetes resource type (e.g., MySQLCluster)
Custom Controller: contains logic to reconcile that custom resource
The Operator Pattern automates complex lifecycle tasks like:
Installing and configuring an application
Upgrading application versions
Failover & replication management
Health checking & self-healing
Basically, an operator codifies human operational knowledge.
Real Examples of Operators
Prometheus Operator: deploy, upgrade, manage Prometheus & Alertmanager
Cluster lifecycle & scaling
Vault operator: unseal, manage Vault lifecycle
MongoDB operator: Mongo sharding, replication, backups
Key Difference: Controller vs Operator
Feature | Controller | Operator
Manages | Kubernetes built-in resources | Custom application-specific resources
Implementation | Built into Kubernetes | Custom code (written by you or vendors)
Complexity | Simple (replicas, scheduling) | High (backups, upgrades, failovers, etc.)
Purpose | Maintain basic K8s objects | Automate application lifecycle
Example: MySQL Operator
Desired State (defined by user):
Actual State (managed by Operator):
Creates StatefulSets internally
Sets up PersistentVolumes
Performs rolling upgrades
This automation would normally require a human DBA.
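To make the desired-versus-actual split concrete, here is a rough Python sketch of how an operator might expand one MySQLCluster resource into child objects. The field names (`replicas`, `version`, `storage`), the `example.com/v1` API group, and the `desired_children` helper are all hypothetical; a real MySQL operator defines its own schema.

```python
def desired_children(cluster: dict) -> list:
    """Derive the child objects an operator would want for one cluster."""
    name = cluster["metadata"]["name"]
    spec = cluster["spec"]
    statefulset = {
        "kind": "StatefulSet",
        "name": name,
        "replicas": spec["replicas"],
        "image": f"mysql:{spec['version']}",
    }
    # Listed explicitly for illustration. In a real cluster these claims are
    # created from the StatefulSet's volumeClaimTemplates, which name them
    # <template>-<statefulset>-<ordinal>.
    pvcs = [
        {
            "kind": "PersistentVolumeClaim",
            "name": f"data-{name}-{i}",
            "storage": spec["storage"],
        }
        for i in range(spec["replicas"])
    ]
    return [statefulset, *pvcs]


# Hypothetical desired state, as a user might declare it.
cluster = {
    "apiVersion": "example.com/v1",
    "kind": "MySQLCluster",
    "metadata": {"name": "orders-db"},
    "spec": {"replicas": 3, "version": "8.0", "storage": "10Gi"},
}
children = desired_children(cluster)
for child in children:
    print(child["kind"], child["name"])
```

The user only writes the short resource at the bottom; everything else is derived by the operator.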
In summary, a controller manages built-in Kubernetes objects, while an Operator:
Extends Kubernetes with CRDs
Includes a custom controller with domain-specific knowledge
Automates the full lifecycle of complex applications (DBs, Observability, Security tools, etc.)
It’s called the Operator Pattern because it models the human operators who traditionally run production systems and encodes their operational knowledge into software.
The following sections explain the origin and reasoning:
Why the name "Operator"?
In traditional infrastructure and application management, systems are run by people in operations roles, such as system administrators, DBAs, and SREs.
These human operators perform actions like:
installing the application
monitoring & healing failures
Kubernetes needed a way to automate these same tasks in software.
The result was a design pattern that mimics a human operator’s behaviour, named the Operator Pattern.
What makes it a "pattern"?
A design pattern is a reusable way of solving a class of problems.
The Operator Pattern provides the reusable idea:
“Model applications as custom resources and write controllers that reconcile them to desired state.”
This combination (CRD + Controller + Application Logic) repeats across many use-cases → therefore, it's a pattern like Singleton, Observer, etc.
What did Operators originally solve? (History)
Before Operators:
Kubernetes could manage pods, replica sets, deployments
But it could NOT manage application logic, like backups, failover, and version upgrades.
These required a human operator.
Engineers at CoreOS (later acquired by Red Hat) introduced the idea in 2016:
"Let’s embed human operator knowledge in a controller"
Hence the term Operator.
The core meaning
It replaces/augments human operators
Encodes operational intelligence into the cluster
Automates lifecycle tasks that normally require expertise
Uses Kubernetes-native APIs (CRDs)
Example to show the naming logic
Human Operator Task
A MySQL DBA would:
Kubernetes Operator
A MySQL Operator:
It operates the application → therefore, Operator.
Think of a Kubernetes Operator as a robot version of a human operator.
Operator Pattern vs Controller Pattern (Deep but Simple Explanation)
Operators and controllers are related, but not the same. Every Operator includes a controller, but not every controller is an Operator.
The following sections provide a detailed comparison.
1. What is the Controller Pattern?
A controller is a Kubernetes control-loop that:
Watches a Kubernetes resource
Compares desired state vs actual state
Reconciles until they match
This pattern is built into Kubernetes itself.
The Deployment controller:
ensures the declared number of Pods (for example, 3) always exists, by managing ReplicaSets
Controllers manage Kubernetes-native resources, like Deployments, ReplicaSets, and Pods.
2. What is the Operator Pattern?
The Operator Pattern extends the controller pattern.
It adds:
CRDs (Custom Resource Definitions)
Domain-specific operational logic
Out of the box, Kubernetes only knows generic operations (deploy, scale). The Operator Pattern teaches it complex, application-specific logic.
A PostgreSQL Operator can:
Initialize a primary/replica cluster
Handle automatic failover
This is far beyond what native controllers can do.
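As a small illustration of the kind of domain knowledge involved, the sketch below picks a failover candidate. It assumes one plausible rule (promote the healthy replica that has replayed the most data); the `pick_new_primary` helper and the `replay_lsn` field are hypothetical, and real PostgreSQL operators apply more checks than this.

```python
from typing import Optional


def pick_new_primary(replicas: list) -> Optional[str]:
    """Return the name of the healthy replica with the highest replay position."""
    healthy = [r for r in replicas if r["healthy"]]
    if not healthy:
        return None  # nothing safe to promote; a human must step in
    return max(healthy, key=lambda r: r["replay_lsn"])["name"]


# Hypothetical replica status as an operator might collect it.
replicas = [
    {"name": "pg-1", "healthy": True, "replay_lsn": 1200},
    {"name": "pg-2", "healthy": True, "replay_lsn": 1500},
    {"name": "pg-3", "healthy": False, "replay_lsn": 1900},
]
chosen = pick_new_primary(replicas)
print(chosen)
```

A built-in controller has no notion of replay positions or primaries, which is why this logic has to live in an operator.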
3. Why Operator Pattern Exists (The Real Reason)
Before Operators, only human operators could:
Fix cluster-level failures
Manage clusters (DBs, queues, caches)
Developers wanted:
“A Kubernetes-native way to automate what human operators do.”
This is why the pattern is called Operator Pattern.
It automates operational knowledge.
4. Side-by-Side Comparison
Feature | Controller Pattern | Operator Pattern
Defines new resource type | No | Yes (via CRDs)
Scope of automation | Only basic (replicas, scheduling) | Full lifecycle (install, backup, upgrade)
Written by | Kubernetes maintainers | Platform engineers / vendors
5. Visual Architecture
Controllers operate Kubernetes. Operators operate applications.
6. How to Write an Operator (Simple Steps)
Using Operator SDK (Go-based):
Step 1 — Create a CRD
Example:
Step 2 — Write a controller (in Go/Python)
Your code:
Watches MySQLCluster resources
Creates StatefulSets, PVCs, Services
Ensures the DB cluster is healthy
Takes backups automatically
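For orientation, the CRD from Step 1 is just a Kubernetes object with a fixed shape. The Python sketch below builds such a manifest as a dictionary; the group `example.com`, the schema fields, and the `build_crd` helper are assumptions for illustration (the Operator SDK normally generates the manifest from your type definitions).

```python
import json


def build_crd(group: str, kind: str, plural: str) -> dict:
    """Build a minimal apiextensions.k8s.io/v1 CustomResourceDefinition."""
    return {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "CustomResourceDefinition",
        # The CRD name must be <plural>.<group>.
        "metadata": {"name": f"{plural}.{group}"},
        "spec": {
            "group": group,
            "scope": "Namespaced",
            "names": {"kind": kind, "plural": plural, "singular": kind.lower()},
            "versions": [{
                "name": "v1",
                "served": True,
                "storage": True,
                "schema": {"openAPIV3Schema": {
                    "type": "object",
                    "properties": {"spec": {
                        "type": "object",
                        # Hypothetical fields for a MySQLCluster.
                        "properties": {
                            "replicas": {"type": "integer", "minimum": 1},
                            "version": {"type": "string"},
                        },
                    }},
                }},
            }],
        },
    }


crd = build_crd("example.com", "MySQLCluster", "mysqlclusters")
print(json.dumps(crd, indent=2))
```

Once a manifest like this is registered, the API server accepts MySQLCluster objects, and the controller from Step 2 can watch them.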
Step 3 — Build & deploy into K8s
7. Operator vs Helm vs GitOps (important differentiation)
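Helm: installs apps using templates (static)
GitOps: manages desired state from Git
Operator: automates the application lifecycle (dynamic, intelligent)
Operators act on an application's runtime state. Helm and GitOps tools apply declared manifests but carry no application-specific operational logic.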
Example: Helm installs MongoDB, but it cannot:
MongoDB Operator can.
Summary – Easy to Remember
Controller Pattern
Control loop → makes real state = desired state
Works on native resources
Kubernetes uses it internally
Operator Pattern
Automates application lifecycle
Encodes human operator knowledge
Designed for complex apps like DBs, queues, storage
1. How the Reconciliation Loop Actually Works (Internals)
The reconciliation loop is the brain behind every controller and operator.
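Reconciliation = move Actual State → Desired State
A controller continuously compares the desired state (the spec of the resource) with the actual state (what is running in the cluster) and takes action to bring them together.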
Step-by-Step: Internal Mechanics
Step 1 — Watch Resources
The operator registers informers to watch specific resources:
The custom resource itself (e.g., MySQLCluster)
Resources it owns (Pods, PVCs, Services)
Whenever something changes, a reconcile event is triggered.
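The watch-then-enqueue behaviour can be sketched as a small deduplicating work queue. This is a simplified stand-in, not the real informer machinery: the `WorkQueue` class, the `on_event` handler, and the event dictionaries are hypothetical.

```python
from collections import deque


class WorkQueue:
    """FIFO queue that holds each key at most once while it is pending."""

    def __init__(self):
        self._queue = deque()
        self._pending = set()

    def add(self, key: str) -> None:
        if key not in self._pending:
            self._pending.add(key)
            self._queue.append(key)

    def get(self) -> str:
        key = self._queue.popleft()
        self._pending.discard(key)
        return key

    def __len__(self) -> int:
        return len(self._queue)


def on_event(queue: WorkQueue, event: dict) -> None:
    # Events on owned resources are mapped back to the owning custom resource.
    queue.add(event.get("owner") or event["name"])


queue = WorkQueue()
events = [
    {"kind": "MySQLCluster", "name": "db1"},
    {"kind": "Pod", "name": "db1-0", "owner": "db1"},
    {"kind": "Pod", "name": "db1-1", "owner": "db1"},
]
for event in events:
    on_event(queue, event)
print(len(queue))
```

Three events collapse into one pending reconcile for `db1`, because the reconcile function always looks at the whole current state rather than at individual events.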
Step 2 — Fetch Current State
Inside Reconcile() you fetch:
the custom resource being reconciled (its spec is the desired state)
the actual StatefulSets, Pods, PVCs, Secrets, etc.
Step 3 — Compare Desired vs Actual
Example:
spec.replicas = 3 but actual pods = 2 → mismatch
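The replica mismatch above can be turned into a concrete plan. The sketch below assumes StatefulSet-style ordinal pod names (`db-0`, `db-1`, ...); the `plan_pods` helper is hypothetical.

```python
def plan_pods(spec_replicas: int, running: list, name: str) -> dict:
    """Compare the desired replica count with running pods and plan the fix."""
    wanted = [f"{name}-{i}" for i in range(spec_replicas)]
    return {
        "create": [p for p in wanted if p not in running],
        "delete": [p for p in running if p not in wanted],
    }


# spec.replicas = 3, but only two pods are running.
plan = plan_pods(3, ["db-0", "db-2"], "db")
print(plan)
```

The same function also covers scale-down: if more pods are running than desired, they show up under `delete`.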
Step 4 — Take Action
Operator creates/patches/deletes resources.
Examples:
Replace failed primary DB node
Restart a pod for upgrade
Step 5 — Requeue
The operator may requeue the reconciliation, for example by returning a result with RequeueAfter set in controller-runtime,
so that it checks again after X seconds.
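Requeueing can be illustrated with a simulated clock. In this sketch the `reconcile` function returns a delay in seconds when it wants another pass, or `None` when it is done; the `run` loop, the 10-second delay, and the "ready at t=20" condition are assumptions chosen for the example.

```python
import heapq


def run(reconcile, keys, max_steps=10):
    """Process keys in due-time order, rescheduling when asked to requeue."""
    pending = [(0.0, key) for key in keys]  # (due time, key)
    heapq.heapify(pending)
    log = []
    steps = 0
    while pending and steps < max_steps:
        now, key = heapq.heappop(pending)
        log.append((now, key))
        requeue_after = reconcile(key, now)
        if requeue_after is not None:
            heapq.heappush(pending, (now + requeue_after, key))
        steps += 1
    return log


def reconcile(key, now):
    # Hypothetical condition: the cluster only becomes ready at t=20s.
    ready = now >= 20.0
    return None if ready else 10.0


log = run(reconcile, ["db1"])
print(log)
```

The key is revisited every 10 simulated seconds until the condition holds, after which nothing is left in the queue.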
Reconciliation is idempotent
Running it 100 times against the same desired state must leave the cluster in the same end state as running it once.
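One common way to get idempotency is to write every step as "ensure X exists in this form" rather than "create X". A minimal sketch, with a dictionary standing in for the cluster and a hypothetical `ensure_service` helper:

```python
def ensure_service(cluster_state: dict, name: str, port: int) -> bool:
    """Create or update a service; return True only if something changed."""
    wanted = {"port": port}
    if cluster_state.get(name) == wanted:
        return False  # already correct, nothing to do
    cluster_state[name] = wanted
    return True


state = {}
changes = [ensure_service(state, "mysql", 3306) for _ in range(100)]
print(changes.count(True), state)
```

Only the first call changes anything; the other 99 are no-ops, so the loop is safe to run as often as events arrive.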
2. How Operators Handle Upgrades & Backups
Operators encode domain knowledge.
Below is how real-world operators handle complex tasks.
Upgrades (Rolling Upgrade Logic)
Example: Upgrading a PostgreSQL cluster from 13 → 14
Operator performs:
Mark cluster as Upgrading
Validate version compatibility
Drain traffic from replica
Wait for it to become healthy
Promote replica to primary
Update .status.version on the custom resource
All without downtime (if HA setup exists).
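The upgrade sequence can be sketched as a small function over an in-memory cluster model. The node layout, the phase names, and the validation rule (the target must be newer than the current version) are assumptions for illustration; a real operator also waits for health checks between steps.

```python
def rolling_upgrade(cluster: dict, target: int) -> list:
    """Upgrade replicas first, switch the primary, then upgrade the old primary."""
    current = cluster["status"]["version"]
    if target <= current:
        raise ValueError(f"cannot upgrade from {current} to {target}")
    replicas = [n for n in cluster["nodes"] if n["role"] == "replica"]
    if not replicas:
        raise ValueError("need at least one replica for a rolling upgrade")

    log = []
    cluster["status"]["phase"] = "Upgrading"
    for node in replicas:
        node["version"] = target
        log.append(f"upgraded {node['name']}")

    # Promote an already-upgraded replica so the old primary can be upgraded.
    old_primary = next(n for n in cluster["nodes"] if n["role"] == "primary")
    new_primary = replicas[0]
    new_primary["role"], old_primary["role"] = "primary", "replica"
    log.append(f"promoted {new_primary['name']}")

    old_primary["version"] = target
    log.append(f"upgraded {old_primary['name']}")
    cluster["status"].update(version=target, phase="Ready")
    return log


cluster = {
    "nodes": [
        {"name": "pg-0", "role": "primary", "version": 13},
        {"name": "pg-1", "role": "replica", "version": 13},
        {"name": "pg-2", "role": "replica", "version": 13},
    ],
    "status": {"version": 13, "phase": "Ready"},
}
upgrade_log = rolling_upgrade(cluster, 14)
print(upgrade_log)
```

At every point in the sequence one node is serving as primary, which is what makes the upgrade possible without downtime in an HA setup.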
Backups
Most operators follow this pattern:
automatic backups on a schedule defined in the custom resource
or manual backups triggered by creating a dedicated backup custom resource
Operator:
Mounts PVC or connects to DB
Uploads backup to S3, GCS, Minio, NFS, etc.
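The scheduling decision behind automatic backups can be sketched as follows. This example assumes a fixed interval rather than a cron expression, and the `backup_due` and `backup_object_key` helpers and the object key layout are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def backup_due(last_backup, interval: timedelta, now: datetime) -> bool:
    """A backup is due if none exists yet or the interval has elapsed."""
    return last_backup is None or now - last_backup >= interval


def backup_object_key(cluster: str, now: datetime) -> str:
    """Build a timestamped key under which the backup would be uploaded."""
    return f"{cluster}/{now:%Y%m%dT%H%M%SZ}.sql.gz"


now = datetime(2024, 1, 2, 3, 0, 0, tzinfo=timezone.utc)
daily = timedelta(hours=24)
print(backup_due(None, daily, now))
print(backup_due(now - timedelta(hours=23), daily, now))
print(backup_object_key("db1", now))
```

An operator would evaluate a check like this on each reconcile and record the time of the last successful backup in the resource status.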
Restore Process
Operator:
Restores PVC from stored backup
Rebuilds replica topology
3. How to Write an Operator (Step-by-Step Using Operator SDK)
Step 1 — Create Operator Project
Step 2 — Create API (CRD) + Controller
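This generates api/v1/mysqlcluster_types.go (the API types behind the CRD) and controllers/mysqlcluster_controller.go (the reconciler).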
Step 3 — Define the CRD Schema
api/v1/mysqlcluster_types.go:
Step 4 — Implement Reconcile Logic
controllers/mysqlcluster_controller.go:
This is simplified but captures the idea:
Step 5 — Build & Deploy Operator
This installs:
Deployment for your operator
Step 6 — Apply CRD example
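Apply a sample MySQLCluster custom resource (an instance of the CRD, not the CRD itself) with kubectl apply.
The operator will then reconcile it and build a MySQL cluster.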
4. Top 20 Production-Grade Operators in the World Today
The most widely adopted, enterprise-grade operators:
Database Operators
MongoDB Community Operator
Percona Operators (MySQL, PostgreSQL, MongoDB)
CrunchyData PostgreSQL Operator
Observability & Logging Operators
Fluent Operator (Fluent Bit/Fluentd)
Security Operators
Messaging & Streaming
RabbitMQ Cluster Operator
Rook-Ceph Operator (Storage)
Extra commonly used:
Operator Pattern = CRD + Controller + Operational Knowledge
It enables:
Everything a human operator used to do.