BLOG07: What is Kubernetes Operator

BLOG16: What is Kubernetes Operator Pattern?

In Kubernetes, operator and controller are core patterns used to automate application and infrastructure lifecycle management. They are often confused, but each has a specific meaning.

This document provides a clear explanation of these concepts.

What is the Kubernetes Controller?

A controller is a control-loop program inside Kubernetes that continuously monitors the cluster state and tries to move it toward the desired state.

How a Controller Works

Every controller follows the reconcile loop pattern:

Observe: Read the desired state (the object's spec) through the API server, which stores it in etcd.
Compare: Check it against the actual state in the cluster.
Act: If they differ, make changes to move actual → desired.

Built-in controllers in Kubernetes

Examples:

Deployment controller
ReplicaSet controller
StatefulSet controller
Job controller
Node controller
PV/PVC controller

All of these continuously ensure Kubernetes resources work as expected.

Simple Example

Consider a manifest that asks for three replicas:

spec:
  replicas: 3

The ReplicaSet controller ensures there are always 3 pods running. If 1 pod dies → controller creates a new one.
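
The compare-and-act step for this replica example can be sketched in a few lines of Go. This is a minimal illustration, not the real ReplicaSet controller: the `Action` type and `reconcileReplicas` function are hypothetical names, and the real controller works with Pod objects rather than plain counts.

```go
package main

import "fmt"

// Action describes what a controller would do to close the gap.
type Action struct {
	Verb  string // "create", "delete", or "none"
	Count int
}

// reconcileReplicas compares desired and actual pod counts and
// returns the action needed to make them match.
func reconcileReplicas(desired, actual int) Action {
	switch {
	case actual < desired:
		return Action{Verb: "create", Count: desired - actual}
	case actual > desired:
		return Action{Verb: "delete", Count: actual - desired}
	default:
		return Action{Verb: "none", Count: 0}
	}
}

func main() {
	// One of three pods has died, so the controller should create one.
	fmt.Println(reconcileReplicas(3, 2))
}
```

The function only decides what to do; it has no side effects, which keeps the decision easy to test separately from the API calls that carry it out.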

What is the Kubernetes Operator Pattern?

An Operator is a special kind of controller that manages application logic beyond what built-in controllers can do.

It extends Kubernetes using Custom Resource Definitions (CRDs).

An Operator = CRD + Custom Controller

| Component | Purpose |
|---|---|
| CRD | Defines a new Kubernetes resource type (e.g., MySQLCluster) |
| Controller | Contains logic to reconcile that custom resource |

The Operator Pattern automates complex lifecycle tasks like:

Installing and configuring an application
Upgrading application versions
Backups & restores
Failover & replication management
Scaling decisions
Health checking & self-healing

Basically, an operator codifies human operational knowledge.

Real Examples of Operators

| Operator | What It Manages |
|---|---|
| Prometheus Operator | Deploy, upgrade, manage Prometheus & Alertmanager |
| Elasticsearch Operator | Cluster lifecycle & scaling |
| Cert-Manager Operator | Manage TLS certificates |
| Vault Operator | Unseal, manage Vault lifecycle |
| MongoDB Operator | Mongo sharding, replication, backups |

Key Difference: Controller vs Operator

| Feature | Controller | Operator |
|---|---|---|
| Scope | Kubernetes built-in resources | Custom application-specific resources |
| Managed by | Kubernetes core | Custom code (written by you or vendors) |
| Uses CRD | No | Yes |
| Complexity | Simple (replicas, scheduling) | High (backups, upgrades, failovers, etc.) |
| Purpose | Maintain basic K8s objects | Automate application lifecycle |

Example: MySQL Operator

Desired State (defined by user):

apiVersion: db.example.com/v1
kind: MySQLCluster
metadata:
  name: mydb
spec:
  replicas: 3
  version: "8.0.25"
  backup: S3

Actual State (managed by Operator):

Creates StatefulSets internally
Configures replication
Sets up PersistentVolumes
Manages pod restarts
Performs rolling upgrades
Takes backups

This automation would normally require a human DBA.
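
One reason database operators tend to build on StatefulSets is that each pod gets a stable name of the form `<statefulset-name>-<ordinal>`, which replication configuration can refer to. The sketch below lists those names; `statefulSetPodNames` is a hypothetical helper and the cluster name `mydb` is only an example.

```go
package main

import "fmt"

// statefulSetPodNames returns the pod names a StatefulSet called name
// produces for the given replica count: <name>-0 ... <name>-(n-1).
func statefulSetPodNames(name string, replicas int) []string {
	if replicas < 0 {
		replicas = 0
	}
	names := make([]string, 0, replicas)
	for i := 0; i < replicas; i++ {
		names = append(names, fmt.Sprintf("%s-%d", name, i))
	}
	return names
}

func main() {
	// Names an operator could use when wiring up replication.
	fmt.Println(statefulSetPodNames("mydb", 3))
}
```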

Summary

Controller

Reconciliation loop
Manages built-in Kubernetes objects
Core part of Kubernetes

Operator

Extends Kubernetes with CRDs
Includes a custom controller with domain-specific knowledge
Automates full lifecycle of complex applications (DBs, Observability, Security tools, etc.)

It’s called the Operator Pattern because it models human operators who traditionally run production systems — and encodes their operational knowledge into software.

The following sections explain the origin and reasoning:

Why the name "Operator"?

In traditional infrastructure and application management, there are always people called:

system operators
database operators
ops engineers
SREs

These human operators perform actions like:

installing the application
upgrading it
backing it up
tuning configuration
monitoring & healing failures

The goal of the pattern is to automate these same tasks in software.

The design pattern mimics a human operator’s behaviour → hence the name Operator Pattern.

What makes it a "pattern"?

A design pattern is a reusable way of solving a class of problems.

The Operator Pattern provides the reusable idea:

“Model applications as custom resources and write controllers that reconcile them to desired state.”

This combination (CRD + Controller + Application Logic) repeats across many use-cases → therefore, it's a pattern like Singleton, Observer, etc.

What did Operators originally solve? (History)

Before Operators:

Kubernetes could manage pods, replica sets, deployments
But it could NOT manage application logic, like:
- bootstrap DB cluster
- restore backup
- rotate certificates
- shard a database
- failover a primary node

These required a human operator.

Engineers at CoreOS (a company later acquired by Red Hat) introduced the idea in 2016:

"Let’s embed human operator knowledge in a controller"

Hence the term Operator.

The core meaning

It replaces/augments human operators
Encodes operational intelligence into the cluster
Automates lifecycle tasks that normally require expertise
Uses Kubernetes-native APIs (CRDs)

Example to show the naming logic

Human Operator Task

A MySQL DBA would:

install MySQL
configure replication
manage failover
take backups

Kubernetes Operator

A MySQL Operator:

watches MySQLCluster custom resources
configures replication
performs rolling updates
automates failover
manages backups

It operates the application → therefore, Operator.

Simple analogy

Think of a Kubernetes Operator as a robot version of a human operator.

Operator Pattern vs Controller Pattern (Deep but Simple Explanation)

Operators and controllers are related, but not the same. Every Operator includes a controller, but not every controller is an Operator.

The following sections provide a detailed comparison.

1. What is the Controller Pattern?

A controller is a Kubernetes control-loop that:

Watches a Kubernetes resource
Compares desired state vs actual state
Reconciles until they match

This pattern is built into Kubernetes itself.

Example

The Deployment controller:

sees .spec.replicas = 3
ensures 3 Pods always exist (through the ReplicaSet it manages)

Controllers manage Kubernetes-native resources, like:

Pods
ReplicaSets
Deployments
Nodes
Services

2. What is the Operator Pattern?

The Operator Pattern extends the controller pattern.

It adds:

CRDs (Custom Resource Definitions)
Custom controllers
Domain-specific operational logic

Out of the box, Kubernetes only knows generic operations such as deploying and scaling. The Operator Pattern teaches it application-specific logic.

Example

A PostgreSQL Operator can:

Initialize a primary/replica cluster
Perform rolling upgrades
Handle automatic failover
Create periodic backups
Integrate with S3

This is far beyond what native controllers can do.

3. Why Operator Pattern Exists (The Real Reason)

Before Operators, only human operators could:

Deploy complex apps
Fix cluster-level failures
Run backups
Manage clusters (DBs, queues, caches)

Developers wanted:

“A Kubernetes-native way to automate what human operators do.”

This is why the pattern is called Operator Pattern.

It automates operational knowledge.

4. Side-by-Side Comparison

| Feature | Controller Pattern | Operator Pattern |
|---|---|---|
| Defines new resource type | No | Yes (via CRD) |
| Built into Kubernetes | Yes | No (custom) |
| Complexity | Basic | Advanced |
| Automates lifecycle | Only basic (replicas, scheduling) | Full lifecycle (install, backup, upgrade) |
| Designed for | Kubernetes primitives | Complex applications |
| Who uses | Kubernetes itself | Platform engineers / vendors |

5. Visual Architecture

┌────────────────────────────────────────────────────┐
│                     Kubernetes                     │
│                                                    │
│  ┌─────────────┐                                   │
│  │ Controller  │ ← Built-in                        │
│  └─────────────┘                                   │
│  manages Pods, ReplicaSets, etc.                   │
│                                                    │
│  ┌──────────────────────────────────────────────┐  │
│  │                   Operator                   │  │
│  │  ┌───────────────┐                           │  │
│  │  │ CRD           │ ← New resource type       │  │
│  │  └───────────────┘                           │  │
│  │  ┌───────────────┐                           │  │
│  │  │ Custom Ctrlr  │ ← Reconcile custom logic  │  │
│  │  └───────────────┘                           │  │
│  └──────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────┘

Controllers operate Kubernetes. Operators operate applications.

6. How to Write an Operator (Simple Steps)

Using Operator SDK (Go-based):

Step 1 — Create a CRD

Example custom resource of the kind the CRD defines:

apiVersion: db.example.com/v1
kind: MySQLCluster
metadata:
  name: mydb
spec:
  replicas: 3
  version: "8.0"

Step 2 — Write a controller (in Go/Python)

Your code:

Watches MySQLCluster resources
Creates StatefulSets, PVCs, Services
Ensures the DB cluster is healthy
Handles updates
Takes backups automatically

Step 3 — Build & deploy into K8s

7. Operator vs Helm vs GitOps (important differentiation)

| Tool | Purpose |
|---|---|
| Helm | Installs apps using templates (static) |
| GitOps | Manages desired state from Git |
| Operator | Automates the application lifecycle (dynamic, intelligent) |

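Operators keep acting after installation: they watch the application and react to failures. Helm renders templates and applies them when you run an install or upgrade. GitOps tools keep the cluster in sync with manifests in Git, but they apply what is declared and carry no application-specific knowledge.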

Example: Helm installs MongoDB, but it cannot:

heal replica set
trigger failover
ensure data replication
rotate certs

MongoDB Operator can.

Summary – Easy to Remember

Controller Pattern

Control loop → makes real state = desired state
Works on native resources
Kubernetes uses it internally

Operator Pattern

Extends Kubernetes API
Automates application lifecycle
CRD + Custom Controller
Encodes human operator knowledge
Designed for complex apps like DBs, queues, storage

1. How the Reconciliation Loop Actually Works (Internals)

The reconciliation loop is the brain behind every controller and operator.

Reconciliation = Actual State → Desired State

A controller continuously compares:

desired state (from the custom resource's spec)
vs
actual state (from cluster)

And takes action to bring them together.

Step-by-Step: Internal Mechanics

Step 1 — Watch

The operator registers informers to watch specific resources:

Your custom resource (MySQLCluster)
Resources it owns (Pods, PVCs, Services)

Whenever something changes, a reconcile event is triggered.
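
Events do not carry the change itself into the reconcile function; what gets queued is a key identifying the object, and repeated events for the same object collapse into one pending request. The sketch below imitates that behaviour with a tiny in-memory queue. `keyQueue` is a hypothetical stand-in written for illustration, not the work queue that client libraries ship.

```go
package main

import "fmt"

// keyQueue is a tiny stand-in for a controller work queue: it stores
// "namespace/name" keys and drops duplicates that are already waiting.
type keyQueue struct {
	pending map[string]bool
	order   []string
}

func newKeyQueue() *keyQueue {
	return &keyQueue{pending: make(map[string]bool)}
}

// Add enqueues key unless it is already pending.
func (q *keyQueue) Add(key string) {
	if q.pending[key] {
		return
	}
	q.pending[key] = true
	q.order = append(q.order, key)
}

// Get pops the oldest key; ok is false when the queue is empty.
func (q *keyQueue) Get() (key string, ok bool) {
	if len(q.order) == 0 {
		return "", false
	}
	key = q.order[0]
	q.order = q.order[1:]
	delete(q.pending, key)
	return key, true
}

func main() {
	q := newKeyQueue()
	// Three events for the same object collapse into one reconcile request.
	q.Add("default/mydb")
	q.Add("default/mydb")
	q.Add("default/mydb")
	q.Add("default/otherdb")
	for key, ok := q.Get(); ok; key, ok = q.Get() {
		fmt.Println("reconcile", key)
	}
}
```

Because only the key is queued, the reconcile function has to look up the current state itself, which is what the next step describes.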

Step 2 — Fetch Current State

Inside Reconcile() you fetch:

the custom resource object
the actual StatefulSets, Pods, PVCs, Secrets, etc.
their status

Step 3 — Compare Desired vs Actual

Example:

spec.replicas = 3 but actual pods = 2 → mismatch

Step 4 — Take Action

Operator creates/patches/deletes resources.

Examples:

Create missing pods
Replace failed primary DB node
Restart a pod for upgrade
Create backup job
Create/rotate secrets
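
Steps 3 and 4 together amount to turning a difference between two states into a list of actions. The sketch below does that for two fields only. The `state` type, the `planActions` function, and the action strings are assumptions made for illustration; a real operator would issue API calls instead of returning text.

```go
package main

import "fmt"

// state holds the fields this sketch compares.
type state struct {
	Replicas int
	Version  string
}

// planActions lists the steps needed to move actual toward desired.
// An empty result means the two already match.
func planActions(desired, actual state) []string {
	var actions []string
	if actual.Replicas < desired.Replicas {
		actions = append(actions, fmt.Sprintf("create %d pod(s)", desired.Replicas-actual.Replicas))
	} else if actual.Replicas > desired.Replicas {
		actions = append(actions, fmt.Sprintf("delete %d pod(s)", actual.Replicas-desired.Replicas))
	}
	if actual.Version != desired.Version {
		actions = append(actions, "roll pods to version "+desired.Version)
	}
	return actions
}

func main() {
	desired := state{Replicas: 3, Version: "8.0.25"}
	actual := state{Replicas: 2, Version: "8.0.24"}
	for _, a := range planActions(desired, actual) {
		fmt.Println(a)
	}
}
```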

Step 5 — Requeue

Operator may requeue reconciliation:

return ctrl.Result{RequeueAfter: time.Minute}, nil

So it checks again after the given interval (one minute in this line).
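
A common use of requeueing is to poll while the cluster is still converging and to stop once it has settled. The sketch below shows that decision in isolation. The `Result` type only imitates the shape of a reconcile result, and both `nextCheck` and the 30-second delay are assumptions chosen for the example.

```go
package main

import (
	"fmt"
	"time"
)

// Result mirrors the idea of a reconcile result: a zero RequeueAfter
// means "wait for the next event", a positive one means "check again later".
type Result struct {
	RequeueAfter time.Duration
}

// nextCheck asks for a re-check while the cluster is still converging.
func nextCheck(desiredReplicas, readyReplicas int) Result {
	if readyReplicas < desiredReplicas {
		return Result{RequeueAfter: 30 * time.Second}
	}
	return Result{}
}

func main() {
	fmt.Println(nextCheck(3, 2).RequeueAfter) // still converging
	fmt.Println(nextCheck(3, 3).RequeueAfter) // settled
}
```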

Key Idea:

Reconciliation is idempotent

Running it 100 times against the same desired state must leave the cluster in the same end state as running it once.
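
In code, idempotency usually comes from checking before acting: create what is missing and leave what already exists. The sketch below uses an in-memory map in place of the cluster. The `cluster` type and `ensurePods` function are hypothetical names for illustration.

```go
package main

import "fmt"

// cluster is an in-memory stand-in for the objects an operator manages.
type cluster struct {
	pods map[string]bool
}

// ensurePods creates only the pods that are missing, so calling it again
// with the same desired names changes nothing. It returns how many it created.
func ensurePods(c *cluster, desired []string) int {
	created := 0
	for _, name := range desired {
		if !c.pods[name] {
			c.pods[name] = true
			created++
		}
	}
	return created
}

func main() {
	c := &cluster{pods: map[string]bool{"mydb-0": true}}
	want := []string{"mydb-0", "mydb-1", "mydb-2"}
	fmt.Println(ensurePods(c, want)) // first pass creates the missing pods
	fmt.Println(ensurePods(c, want)) // second pass finds nothing to do
}
```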

2. How Operators Handle Upgrades & Backups

Operators encode domain knowledge.

Below is how real-world operators handle complex tasks.

Upgrades (Rolling Upgrade Logic)

Example: Upgrading a PostgreSQL cluster from 13 → 14

Operator performs:

Mark cluster as Upgrading
Validate version compatibility
Drain traffic from replica
Upgrade replica node 1
Wait for it to become healthy
Upgrade replica node 2
Wait again
Promote replica to primary
Upgrade old primary last
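Update the custom resource's .status.version

With an HA setup, this sequence can avoid downtime for minor-version upgrades. A major-version jump such as 13 → 14 usually also needs a data migration step (for example with pg_upgrade), because physical streaming replicas cannot follow a primary on a different major version, so operators tend to treat it as a separate, more disruptive procedure.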
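
The replicas-first, primary-last ordering in the list above is simple to express in code. The sketch below only computes the order; the `member` type and `upgradeOrder` function are hypothetical, and a real operator would also wait for health checks between steps.

```go
package main

import "fmt"

// member is one database instance in the cluster.
type member struct {
	Name    string
	Primary bool
}

// upgradeOrder returns member names with replicas first and the
// primary last, preserving the input order within each group.
func upgradeOrder(members []member) []string {
	var replicas, primaries []string
	for _, m := range members {
		if m.Primary {
			primaries = append(primaries, m.Name)
		} else {
			replicas = append(replicas, m.Name)
		}
	}
	return append(replicas, primaries...)
}

func main() {
	members := []member{
		{Name: "mydb-0", Primary: true},
		{Name: "mydb-1"},
		{Name: "mydb-2"},
	}
	fmt.Println(upgradeOrder(members))
}
```

Upgrading the primary last keeps writes available for as long as possible and means only one switchover is needed.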

Backups

Most operators follow this pattern:

Backup Trigger

automatically, based on a schedule set in the custom resource
or manually, by creating a custom resource like:

apiVersion: db.example.com/v1
kind: Backup
spec:
  cluster: mydb

Backup Process

Operator:

Creates a Kubernetes Job
Mounts PVC or connects to DB
Executes backup commands
Uploads backup to S3, GCS, Minio, NFS, etc.
Updates Backup.status

Restore Process

Operator:

Stops cluster
Restores PVC from stored backup
Recreates StatefulSets
Rebuilds replica topology

3. How to Write an Operator (Step-by-Step Using Operator SDK)

Prerequisites:

Go 1.22+
Docker/Podman
Kubernetes cluster
Operator SDK installed

Step 1 — Create Operator Project

operator-sdk init --domain example.com --owner "Deepak"

Step 2 — Create API (CRD) + Controller

operator-sdk create api --group db --version v1 --kind MySQLCluster --resource --controller

This generates:

api/v1/mysqlcluster_types.go
controllers/mysqlcluster_controller.go (newer Operator SDK project layouts place this under internal/controller/)

Step 3 — Define the CRD Schema

api/v1/mysqlcluster_types.go:

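// MySQLClusterSpec is the desired state, written by the user.
type MySQLClusterSpec struct {
    Replicas int    `json:"replicas"` // number of MySQL instances
    Version  string `json:"version"`  // MySQL version, e.g. "8.0"
}

// MySQLClusterStatus is the observed state, written by the operator.
type MySQLClusterStatus struct {
    ReadyReplicas int `json:"readyReplicas"`
}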

Step 4 — Implement Reconcile Logic

controllers/mysqlcluster_controller.go:

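import (
    "context"
    "time"

    appsv1 "k8s.io/api/apps/v1"
    apierrors "k8s.io/apimachinery/pkg/api/errors"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"

    dbv1 "example.com/mysql-operator/api/v1" // assumed module path; use your own
)

func (r *MySQLClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    // Get CR; a NotFound error means it was deleted, so there is nothing to do
    var cluster dbv1.MySQLCluster
    if err := r.Get(ctx, req.NamespacedName, &cluster); err != nil {
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    // Ensure StatefulSet exists (same name and namespace as the CR)
    ss := &appsv1.StatefulSet{}
    err := r.Get(ctx, req.NamespacedName, ss)
    if apierrors.IsNotFound(err) {
        // buildStatefulSet is a helper you write yourself; it is not generated
        ss = buildStatefulSet(cluster)
        if err := r.Create(ctx, ss); err != nil {
            return ctrl.Result{}, err
        }
        // Check back later to record status once the pods have started
        return ctrl.Result{RequeueAfter: time.Minute}, nil
    } else if err != nil {
        return ctrl.Result{}, err
    }

    // Check for version upgrades
    image := "mysql:" + cluster.Spec.Version
    containers := ss.Spec.Template.Spec.Containers
    if len(containers) > 0 && containers[0].Image != image {
        containers[0].Image = image
        if err := r.Update(ctx, ss); err != nil {
            return ctrl.Result{}, err
        }
    }

    // Update status from what the StatefulSet reports
    cluster.Status.ReadyReplicas = int(ss.Status.ReadyReplicas)
    if err := r.Status().Update(ctx, &cluster); err != nil {
        return ctrl.Result{}, err
    }

    return ctrl.Result{}, nil
}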

This is simplified but captures the idea:

fetch CR
fetch resources
create/patch/delete
update status

Step 5 — Build & Deploy Operator

make docker-build docker-push IMG=<registry>/<image>:<tag>
make deploy IMG=<registry>/<image>:<tag>

This installs:

CRD
RBAC roles
Deployment for your operator

Step 6 — Apply a custom resource example

apiVersion: db.example.com/v1
kind: MySQLCluster
metadata:
  name: mydb
spec:
  replicas: 3
  version: "8.0"

Apply:

kubectl apply -f example.yaml

The operator will now build a MySQL cluster.

4. Widely Used Production-Grade Operators

Widely adopted operators, grouped by area:

Database Operators

MongoDB Community Operator
Percona Operators (MySQL, PostgreSQL, MongoDB)
CrunchyData PostgreSQL Operator
Vitess Operator
MariaDB Operator
CockroachDB Operator
Redis Operator
Cassandra Operator

Observability & Logging Operators

Prometheus Operator
Loki Operator
Grafana Operator
Fluent Operator (Fluent Bit/Fluentd)

Security Operators

Cert-Manager Operator
Vault Operator
Kyverno Policy Engine
OPA Gatekeeper Operator

Messaging & Streaming

Strimzi Kafka Operator
RabbitMQ Cluster Operator
NATS Operator

Infrastructure

Rook-Ceph Operator (Storage)

Extra commonly used:

ArgoCD Operator
Istio Operator
etcd Operator
MinIO Tenant Operator
Elasticsearch Operator

Final Summary

Operator Pattern = CRD + Controller + Operational Knowledge

It enables:

backups
upgrades
autoscaling
failover
lifecycle management

Everything a human operator used to do.
