BLOG06: Containerization Concepts

Can we simulate a container environment manually?

1. Building a final image by combining multiple image layers (Overlaying graphics)

If you want to take a folder of images (PNG/JPG, etc.) and overlay them to generate a final composite image (e.g., NFT layers, logos on top of base images, watermarks), then yes, it is absolutely possible.

Example with Python (Pillow library)

from PIL import Image

# Load each layer as RGBA so alpha compositing works
base = Image.open("layers/base/background.png").convert("RGBA")
layer1 = Image.open("layers/body/body1.png").convert("RGBA")
layer2 = Image.open("layers/head/head3.png").convert("RGBA")
layer3 = Image.open("layers/accessories/hat2.png").convert("RGBA")

# Composite each layer on top of the previous result
final = Image.alpha_composite(base, layer1)
final = Image.alpha_composite(final, layer2)
final = Image.alpha_composite(final, layer3)

final.save("output/final.png")

This overlays each layer on top of the previous one.


2. Using Docker to build an image from folders (Docker layering)

If your question means building a Docker image from folders using overlay filesystems, then yes: Docker already uses an overlay filesystem.

Example:

Docker puts these into layers during build.
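Though the build context isn't shown here, a Dockerfile along these lines illustrates the idea; the folder names and the alpine base image are illustrative assumptions:

```dockerfile
# Each instruction below produces its own image layer.
FROM alpine:3.19
COPY rootfs/   /            # layer: base files
COPY apps/     /opt/apps/   # layer: application files
COPY configs/  /etc/myapp/  # layer: configuration
RUN adduser -D appuser      # layer: filesystem changes made by the RUN step
```

Running docker history on the built image shows one entry per layer.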


3. Building a filesystem image (ISO, squashfs, qcow2) by overlaying folders

If you mean building OS images, like overlaying:

  • rootfs/

  • apps/

  • configs/

You can use:

OverlayFS (Linux)

Creates a merged filesystem.

mksquashfs
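A sketch of both tools together; the directory names are illustrative, the overlay mount requires root, and mksquashfs comes from squashfs-tools:

```shell
# Merge rootfs/, apps/ and configs/ into one view, then pack it.
mkdir -p rootfs apps configs upper work merged
command -v mksquashfs >/dev/null || { echo "squashfs-tools not installed"; exit 0; }
[ "$(id -u)" -eq 0 ] || { echo "the overlay mount requires root"; exit 0; }
mount -t overlay overlay \
  -o lowerdir=configs:apps:rootfs,upperdir=upper,workdir=work merged
mksquashfs merged image.squashfs -noappend   # pack the merged view
umount merged
```
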


To understand how Docker images and layers function, this document provides a step-by-step demonstration of manually assembling a Docker image using core Linux tools.


Part 1: Understand How Docker Images Work

A Docker image = a stack of tarred filesystem layers + a config.json.

Internally:

Each layer.tar captures the filesystem changes made relative to the previous layer (a diff), not a full copy of the image.


Part 2: Manually Build a Root Filesystem

1. Create working directories

2. Populate a base filesystem

For example, use a static BusyBox binary (it is tiny):

Create essential dirs:

Link busybox applets:

3. Create required dev nodes (minimal)

You now have a working tiny Linux root filesystem.
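The three steps above might look like this, assuming a static busybox binary is installed on the host (the mknod calls require root):

```shell
# 1. Working directories for the rootfs
mkdir -p rootfs/bin rootfs/sbin rootfs/etc rootfs/proc rootfs/sys rootfs/dev rootfs/tmp

# 2. Copy in BusyBox and link a few applets to it
BB="$(command -v busybox || true)"
[ -n "$BB" ] || { echo "busybox not found; install busybox-static"; exit 0; }
cp "$BB" rootfs/bin/busybox
for app in sh ls cat echo ps mount; do
  ln -sf busybox "rootfs/bin/$app"
done

# 3. Minimal device nodes (root only)
[ "$(id -u)" -eq 0 ] || { echo "skipping mknod (needs root)"; exit 0; }
mknod -m 666 rootfs/dev/null    c 1 3
mknod -m 666 rootfs/dev/zero    c 1 5
mknod -m 666 rootfs/dev/random  c 1 8
mknod -m 666 rootfs/dev/urandom c 1 9
```
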


Part 3: Verify it works (optional)

Use chroot:

sudo chroot rootfs /bin/sh

You should get a BusyBox shell prompt inside the new root. Exit using exit.


Part 4: Create a Docker Layer (layer.tar)

Docker expects a tar file representing the layer.

This is layer 1.
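A sketch of the tar step; the mkdir stub only keeps the snippet self-contained if Part 2 was skipped:

```shell
[ -d rootfs ] || mkdir -p rootfs/bin rootfs/etc   # stub rootfs if Part 2 was skipped
# Pack the *contents* of rootfs (note the trailing dot), not the folder itself.
# --numeric-owner keeps numeric uid/gid instead of host user names.
tar --numeric-owner -C rootfs -cf layer.tar .
sha256sum layer.tar
```
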


Part 5: Create the Docker metadata

Docker requires two metadata files: manifest.json and config.json.

Minimal config:

Compute the diff_id (the sha256 hash of the uncompressed layer.tar):

Put that value into config.json.
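A sketch of both files. The field set below is a minimal assumption (what docker load requires can vary slightly by version), and mybusybox:latest is an illustrative tag:

```shell
# Stub layer.tar so the snippet stands alone if Part 4 was skipped
[ -f layer.tar ] || { mkdir -p rootfs/bin; tar -C rootfs -cf layer.tar .; }

# diff_id = sha256 of the *uncompressed* layer tar
DIFFID="$(sha256sum layer.tar | awk '{print $1}')"

cat > config.json <<EOF
{
  "architecture": "amd64",
  "os": "linux",
  "config": { "Cmd": ["/bin/sh"] },
  "rootfs": { "type": "layers", "diff_ids": ["sha256:${DIFFID}"] }
}
EOF

cat > manifest.json <<EOF
[
  {
    "Config": "config.json",
    "RepoTags": ["mybusybox:latest"],
    "Layers": ["layer.tar"]
  }
]
EOF
```
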


Part 6: Import the Image into Docker

Bundle the metadata and layer into a single tar, then load it:

tar -cf image.tar manifest.json config.json layer.tar
docker load -i image.tar

Check with docker images; the new image should be listed. Running it with docker run --rm -it <repo-tag> /bin/sh should drop you into a BusyBox shell.


Part 7: Build More Layers (Optional)

To add files:

Then add to:

  • manifest.json

  • config.json → add new diffid


Summary

  • Build rootfs: created a minimal filesystem (like Docker does)

  • Tar it: the tarball becomes a Docker layer

  • Add metadata: Docker reads manifest.json + config.json

  • docker load: loads your manually built image

This section covers how Docker physically stores images and containers on disk: where image layers and container filesystems live, how Docker assembles everything so containers can run, and how OverlayFS works under the hood.

This section covers:

  1. How Docker overlays layers

  2. How to manually overlay folders

  3. How to get the "top merged view"

  4. How this matches Docker container storage


1. Quick Overview: How Docker Uses OverlayFS

When Docker launches a container, it assembles the image's read-only layers (lowerdir), a writable layer (upperdir), and a work directory into a single merged mount, stored under /var/lib/docker/overlay2/<id>/.

You can inspect a running container with docker inspect <container> and look for the GraphDriver section, which lists the LowerDir, UpperDir, WorkDir, and MergedDir paths.


2. Manually Overlay Two Folders (Like Docker Does)

We simulate Docker’s overlay using your own folders.

Step 1: Create lower (base image layer)

Step 2: Create upper (container's writable layer)

Step 3: Create work + merged dirs

Step 4: Mount Overlay
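Steps 1 through 4 might look like this; the file names and contents are illustrative, and the mount itself requires root:

```shell
# Steps 1-3: layer directories (lower = image, upper = container, plus work/merged)
mkdir -p lower upper work merged
echo "from the image layer"        > lower/base.txt
echo "image version of shared"     > lower/shared.txt
echo "container wrote this"        > upper/new.txt
echo "container version of shared" > upper/shared.txt   # shadows the lower copy

# Step 4: mount the overlay (root only)
[ "$(id -u)" -eq 0 ] || { echo "the overlay mount requires root"; exit 0; }
mount -t overlay overlay \
  -o lowerdir=lower,upperdir=upper,workdir=work merged
ls merged               # files from both layers in one view
cat merged/shared.txt   # the upper (container) version wins
```
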


3. Check the Top (Merged) View

Listing the merged directory shows files from both layers in a single view. Where the same file exists in both layers, opening it through merged returns the upper layer's copy: the writable layer shadows the read-only one.

This is copy-on-write!


4. OverlayFS Delete Example (Docker Whiteout)

Simulate deleting a file that exists only in the read-only lower layer:

Now list the merged directory: the file base.txt will not be visible.

The upperdir will contain a whiteout marker, a character device with 0/0 device numbers named after the deleted file.

This hides the file from the lower layer; Docker does this internally.
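A self-contained sketch of the whiteout experiment (the mount and delete require root):

```shell
mkdir -p lower upper work merged
echo "hello" > lower/base.txt
[ "$(id -u)" -eq 0 ] || { echo "requires root"; exit 0; }
mount -t overlay overlay -o lowerdir=lower,upperdir=upper,workdir=work merged
rm merged/base.txt   # delete through the merged view
ls merged            # base.txt is gone
ls -l upper          # a 0/0 character device named base.txt: the whiteout
umount merged
```
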


5. Compare With How Docker Does It

To inspect a running container, run docker inspect <container> and look at the GraphDriver section: it lists LowerDir, UpperDir, WorkDir, and MergedDir paths under /var/lib/docker/overlay2/. If you enter the MergedDir on the host, you see the container's live root filesystem.

This is exactly the overlay mount we created manually.


6. Multi-layer Overlay Example (like Docker Image Layers)

Docker images often have several layers. OverlayFS supports stacking them by passing multiple colon-separated directories to lowerdir (the left-most entry is the top-most layer).
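A sketch with three read-only layers; in lowerdir the left-most directory is the top-most layer:

```shell
mkdir -p layer1 layer2 layer3 upper work merged
touch layer1/a layer2/b layer3/c
[ "$(id -u)" -eq 0 ] || { echo "requires root"; exit 0; }
mount -t overlay overlay \
  -o lowerdir=layer3:layer2:layer1,upperdir=upper,workdir=work merged
ls merged   # a, b and c all appear in one view
umount merged
```
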


7. Why merged view feels like "final Docker image"

Because Docker literally uses OverlayFS as the live root filesystem.


Summary

  • lowerdir: read-only layers from the image

  • upperdir: writable layer (the container)

  • merged: what the container sees (all layers combined)

  • whiteout: hides deleted files from lower layers

  • workdir: OverlayFS internal use


This section explains Docker networking using only Linux primitives (network namespaces, veth pairs, bridges, routing, and iptables), the same way Docker builds networking internally (via containerd + netns + bridge).

Before we dive in, consider these questions to help guide your understanding:

  • What does a "network namespace" actually isolate in Linux?

  • What is a veth pair, and how does it simulate a network interface inside a container?

  • Why does Docker use a software bridge (like docker0) rather than connecting containers directly to the host network?

  • How does Docker make it possible for containers to access the internet without being directly exposed?

  • What roles do iptables and NAT play in container networking?

  • How are all these pieces (namespaces, veths, bridges, routing) connected together in practice?

These questions will be answered as we step through building Docker-style networking from scratch.

This mirrors the earlier walkthrough of the overlay filesystem, but now for Docker networking.


How Docker Networking Works Internally

When executing docker run (for example, docker run -it busybox), Docker creates:

  1. Network Namespace → isolated network stack for the container

  2. veth pair → one end inside container, one end on host

  3. Bridge on host (docker0) → virtual switch

  4. IP assignment → via Docker’s built-in IPAM driver (not DHCP)

  5. NAT → iptables MASQUERADE for internet access

  6. Routing → default route via bridge

The following steps demonstrate how to build all this manually.


STEP 1: Create a Network Namespace (Container Equivalent)

Create the namespace with ip netns add ns1, then verify with ip netns list.


STEP 2: Create a veth Pair (Like Docker Does)

A veth pair is like a virtual ethernet cable.

  • veth0 → on the host

  • veth1 → inside the container namespace


STEP 3: Move One End Into the Namespace

Now:

  • Host has: veth0

  • Namespace has: veth1


STEP 4: Create docker0 Bridge (Docker Default Network)

Docker normally creates a bridge named docker0. We simulate it with a bridge named br0.


STEP 5: Attach veth0 to the Bridge


STEP 6: Set IP Inside Container Namespace

Assign IP to veth1:


STEP 7: Bring Up Loopback in Namespace


STEP 8: Set Default Route (Like Docker Does)


TEST: Ping Host from Container Namespace

This should work successfully.


TEST: Ping Between Containers

If you create a second netns with IP 10.0.0.3/24, both can communicate via br0.


STEP 9: Enable NAT (Docker Does This)

Docker installs a MASQUERADE rule for its subnet. The equivalent for our 10.0.0.0/24 network is an iptables -t nat -A POSTROUTING rule with -s 10.0.0.0/24.

Enable IP forwarding:

Now outgoing packets from namespace → NAT → internet.
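Steps 1 through 9 collapsed into one script. The names (ns1, veth0/veth1, br0) and the 10.0.0.0/24 range are illustrative assumptions; everything requires root and the iproute2 tools:

```shell
command -v ip >/dev/null       || { echo "iproute2 not installed"; exit 0; }
command -v iptables >/dev/null || { echo "iptables not installed"; exit 0; }
[ "$(id -u)" -eq 0 ]           || { echo "requires root"; exit 0; }

ip netns add ns1                                # Step 1: "container" netns
ip link add veth0 type veth peer name veth1     # Step 2: virtual cable
ip link set veth1 netns ns1                     # Step 3: one end into the netns
ip link add name br0 type bridge                # Step 4: docker0 equivalent
ip addr add 10.0.0.1/24 dev br0
ip link set br0 up
ip link set veth0 master br0                    # Step 5: plug host end into bridge
ip link set veth0 up
ip netns exec ns1 ip addr add 10.0.0.2/24 dev veth1   # Step 6: container IP
ip netns exec ns1 ip link set veth1 up
ip netns exec ns1 ip link set lo up                   # Step 7: loopback
ip netns exec ns1 ip route add default via 10.0.0.1   # Step 8: default route

# Step 9: NAT + forwarding so the namespace can reach the internet
iptables -t nat -A POSTROUTING -s 10.0.0.0/24 ! -o br0 -j MASQUERADE
sysctl -w net.ipv4.ip_forward=1

ip netns exec ns1 ping -c 1 10.0.0.1   # the host side should answer
```
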


TEST: Ping Google DNS (8.8.8.8)


Summary

The full Docker networking model has been manually recreated:

  • Container namespace: ip netns add

  • veth pair: ip link add veth0 type veth peer name veth1

  • docker0 bridge: ip link add name br0 type bridge

  • IP assignment: ip addr add

  • Routing: ip route add default

  • NAT: iptables -t nat -A POSTROUTING

This is exactly what Docker’s libnetwork / containerd do behind the scenes.


This section covers the third core concept of Docker, after:

  • Storage (OverlayFS)

  • Networking (Namespace + veth + bridge)

Now we look at:

Docker User Process Isolation using Linux Namespaces + cgroups

Docker is NOT a VM. It simply launches your process in a set of isolated namespaces (PID, mount, network, UTS, IPC, user).

The following steps demonstrate how to manually build a mini-container by:

  • creating namespaces

  • starting a process inside them

  • applying isolation

  • mounting a filesystem

This is the closest raw version of what docker run actually does.


1. Launch a Process in New Namespaces (container-like)

Use the unshare command (part of util-linux).

Smallest "container" ever:
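A sketch, assuming util-linux's unshare and root privileges; the echo makes the demo non-interactive (drop the -c part to get an interactive shell instead):

```shell
[ "$(id -u)" -eq 0 ] || { echo "requires root"; exit 0; }
# New PID, mount and UTS namespaces with a fresh /proc; $$ is expanded by the
# inner shell, which runs as PID 1 of the new namespace.
unshare -p -f -m -u --mount-proc /bin/sh -c 'echo "my PID in here: $$"' \
  || echo "unshare is blocked in this environment"
```
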

Flags:

  • -p: new PID namespace

  • -f: fork, so the child process enters the new PID namespace

  • -m: new mount namespace

  • -u: new UTS (hostname) namespace

  • --mount-proc: mount a fresh /proc

Inside this “container” shell, run ps -ef: only your shell and ps itself appear, and the shell runs as PID 1.

You are PID 1 inside a container (like Docker does).


2. Set a Hostname (UTS Namespace)

Inside the unshared shell, set a new hostname with hostname mycontainer; running hostname again confirms the change.

On the host, your hostname remains unchanged.

This is exactly how Docker lets each container have its own hostname.


3. Create an Isolated Root Filesystem (mount namespace)

In another terminal, prepare a minimal rootfs (as in the BusyBox steps earlier). Then launch new namespaces and change root into it with unshare plus chroot. Inside this chroot container, ls / shows only your rootfs contents.

You now have:

  • own filesystem

  • own PID tree

  • own mount table

  • own hostname

This is basically Docker without networking.


4. Combine Namespaces + Chroot → Full Minimal Container

A full example combines unshare (with PID, mount, network, UTS, IPC, and user namespace flags) with chroot into the BusyBox rootfs. Inside, you are PID 1 with your own mount table and hostname.

Your mini-container now has:

  • PID namespace

  • Mount namespace

  • Network namespace

  • UTS (hostname)

  • IPC namespace

  • USER namespace

Exactly what Docker uses.


5. Add cgroups → Resource limits (like docker run --cpus 1 --memory=256m)

Create a cgroup:

Limit CPU:

Limit memory:

Now add your bash process to the cgroup:

Now that process and its children cannot exceed that resource limit.
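On a cgroup v2 system the steps might look like this; the demo group name and limits are illustrative, and the writes only succeed if the cpu/memory controllers are enabled in the parent's cgroup.subtree_control:

```shell
CG=/sys/fs/cgroup
[ "$(id -u)" -eq 0 ] && [ -f "$CG/cgroup.controllers" ] \
  || { echo "needs root and cgroup v2"; exit 0; }

mkdir -p "$CG/demo"                       # create the cgroup
echo "100000 100000" > "$CG/demo/cpu.max" 2>/dev/null \
  || echo "cpu controller not enabled"    # quota/period: roughly 1 CPU
echo $((256 * 1024 * 1024)) > "$CG/demo/memory.max" 2>/dev/null \
  || echo "memory controller not enabled" # 256 MiB

sleep 30 &                                # a stand-in for your bash process
echo $! > "$CG/demo/cgroup.procs" 2>/dev/null || echo "could not attach process"
```
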

Docker does exactly the same.


6. Understanding Docker in 1 Sentence

Docker = Linux namespaces + cgroups + overlay filesystem + networking tools + container runtime wrapper

All Docker does is:

1. prepare filesystems

2. prepare networking

3. isolate namespaces

4. apply cgroups

5. run process inside isolation


Final Full Example: Starting a Process in "Container Mode"
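A sketch combining the namespaces with a chroot into the BusyBox rootfs built earlier; the flag set mirrors the table above and the echoed message is illustrative:

```shell
[ "$(id -u)" -eq 0 ] && [ -x rootfs/bin/sh ] \
  || { echo "needs root and the rootfs/ from earlier"; exit 0; }
# New PID, mount, UTS, IPC and network namespaces, then swap the root
unshare --pid --fork --mount --uts --ipc --net \
  chroot rootfs /bin/sh -c 'echo "hello from PID $$ in the mini-container"'
```
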

You have now launched a real container-like environment without Docker.


Next Steps

The following topics can be covered:

How Docker uses containerd + runc + OCI runtime spec

How docker run translates into runc commands

Build your own container runtime (40 lines of Go!)

How Docker namespaces map to docker inspect output

How Docker really launches a container under the hood using containerd + runc + OCI bundle

You will understand exactly what happens when you run docker run ubuntu bash.

This is the most important piece of all the Docker internals.


1. Docker Architecture in Reality

Docker is not the “runtime”. It's a wrapper.

Here’s the true pipeline:

docker CLI → dockerd → containerd → containerd-shim → runc → container process

So the real container engine is runc, NOT Docker.


2. runc Needs an OCI Bundle (RootFS + Config)

Every container runc runs needs an OCI bundle: a rootfs/ directory plus a config.json.

We will build this manually.


3. Create a Minimal Root Filesystem

Same as before:

Populate with busybox:

Now your rootfs contains a working BusyBox environment.
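A sketch of the bundle layout; bundle/ is an arbitrary directory name and a busybox binary is assumed to exist on the host:

```shell
# An OCI bundle is just a directory holding rootfs/ and (later) config.json
mkdir -p bundle/rootfs/bin
BB="$(command -v busybox || true)"
[ -n "$BB" ] || { echo "busybox not found; install busybox-static"; exit 0; }
cp "$BB" bundle/rootfs/bin/busybox
for app in sh ls echo ps; do
  ln -sf busybox "bundle/rootfs/bin/$app"
done
```
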


4. Generate a Default OCI Config

If you have runc installed, run runc spec from inside the bundle directory. This creates a default config.json, the same file Docker would generate automatically.

This JSON describes:

  • namespaces

  • mounts

  • process command

  • user id mapping

  • capabilities

  • cgroups

  • hostname

Everything.


5. Modify config.json to run a command

Inside config.json, change the process "args" (for example, to ["/bin/sh"]) to choose the command that runs as PID 1. You can also set the "hostname" field.


6. Run the Container Using ONLY runc

From inside the folder:
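A sketch; mycontainer is an arbitrary container ID, and the guards only make the snippet safe to run outside a prepared bundle:

```shell
command -v runc >/dev/null || { echo "runc not installed"; exit 0; }
[ "$(id -u)" -eq 0 ]       || { echo "requires root"; exit 0; }
[ -f config.json ] && [ -d rootfs ] \
  || { echo "run this from inside the bundle directory"; exit 0; }
runc run mycontainer   # starts the process from config.json as PID 1
```
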

You now have a real Linux container without Docker.

Inside you will see a BusyBox shell running as PID 1 in its own namespaces.

This is the exact same thing Docker does.


7. Summary of what has been built:

  • Extract image layers: Docker does this automatically; you created the rootfs yourself

  • Create OCI config.json: Docker generates it; you ran runc spec

  • Create namespaces: containerd + runc; runc created them for you

  • Set cgroups: dockerd; runc applied its default cgroup rules

  • Set hostname: docker; you set it in config.json

  • Mount /proc: docker; runc mounts it automatically

  • Launch PID 1: bash/sleep/node; you ran sh


8. Compare With What Docker Would Run

When you run docker run -it busybox sh, Docker (via containerd) builds an OCI bundle on the fly, typically under /run/containerd/io.containerd.runtime.v2.task/moby/<container-id>/ on current installs. Inside you will find a config.json and a rootfs, and containerd then calls runc against that bundle through the containerd shim.

Exactly the same thing you just did manually.


This section covers Docker volumes, how they work under the hood, and how files are mounted inside containers. This is the storage counterpart to what we already explored with OverlayFS and namespaces.

Before we start, consider these questions:

  • Why do containers need volumes if OverlayFS already provides a layered filesystem?

  • How do Docker volumes differ from bind mounts or tmpfs?

  • Where does Docker store the actual data for a volume on the host?

  • What happens to the data in a Docker volume when the container is deleted?

  • How does Docker "mount" a volume inside a running container?

  • How can you inspect which volumes are attached to a container?

Keep these in mind as we explore Docker volumes and storage!


1. Docker Volumes Overview

A Docker volume is a persistent storage mechanism that lives outside of the container’s OverlayFS (union filesystem) layers.

Key points:

  • Persistence: survives container deletion

  • Location: managed by Docker on the host

  • Isolation: mounted into the container via a mount point

  • Types: named volumes, anonymous volumes, bind mounts, tmpfs


2. Where Docker Stores Volumes on the Host

By default, Docker stores named volumes under /var/lib/docker/volumes/<name>/_data.

For example, docker volume create mydata followed by docker volume inspect mydata prints the volume’s Mountpoint.

  • Mountpoint → actual host path storing files.

  • _data → the folder that will be mounted into containers.


3. Mounting Volume into a Container

Suppose we run docker run -v mydata:/app/data busybox.

  • Docker mounts /var/lib/docker/volumes/mydata/_data onto /app/data inside the container.

  • Any read/write to /app/data in the container is reflected directly on the host.


4. How This Works Internally

Docker uses mount namespaces + bind mounts.

  • A bind mount is essentially mount --bind /host/path /container/path.

  • Container sees /app/data but the kernel maps it to host storage.

  • OverlayFS is not involved here — Docker volumes bypass container filesystem layers for performance and persistence.


5. Example: Manual Volume Mount

Suppose you want to simulate a Docker volume without Docker:
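A sketch under /tmp; the paths are illustrative and the bind mount itself requires root:

```shell
mkdir -p /tmp/host-volume /tmp/container-root/app/data
echo "persistent data" > /tmp/host-volume/file.txt
[ "$(id -u)" -eq 0 ] || { echo "the bind mount requires root"; exit 0; }
mount --bind /tmp/host-volume /tmp/container-root/app/data
cat /tmp/container-root/app/data/file.txt   # same file, seen via the "container" path
umount /tmp/container-root/app/data
```
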

Bind-mounting a host folder onto another directory makes the same files visible at both paths, since both names refer to the same underlying inodes. This is exactly how Docker mounts volumes.


6. Named vs Bind Mounts

  1. Named volume: Docker manages /var/lib/docker/volumes/<name>/_data

  2. Bind mount: you mount any host folder, e.g. docker run -v /home/user/config:/app/config ...; the host path /home/user/config then appears directly in the container at /app/config.


7. Key Takeaways

  • Volume storage: host path /var/lib/docker/volumes/<name>/_data, mounted at a path inside the container (a bind mount behind the scenes)

  • Persistence: yes, on both sides; independent of the container lifecycle

  • OverlayFS involvement: none; volumes bypass the image layers

  • Bind mount: any host folder mounted into the container; useful for dev/test


8. How Docker actually does it

When you run docker run -v mydata:/app/data ..., Docker (via containerd + runc) performs:

  1. Prepares the rootfs for the container (OverlayFS)

  2. Prepares the mount namespace for the container

  3. Bind-mounts the volume directory from the host into the container namespace

  4. Launches the container process (PID 1) in the namespace

Everything inside /app/data maps directly to the host path.


1. Containers and Namespaces

When Docker launches a container, it creates isolated namespaces for:

  • PID → process tree

  • MNT → filesystem mounts

  • NET → networking

  • UTS → hostname

  • IPC → inter-process communication

Important: Each namespace is isolated logically, but they all share the same kernel.

  • Namespaces are not separate machines — they are kernel-level isolation.

  • They can see the kernel and the host’s storage, but only what is mounted into their mount namespace.


2. How a Container Can See Host Files

Docker uses bind mounts to share host paths with the container.

Example:

  • Host path: /var/lib/docker/volumes/mydata/_data

  • Container mount point: /app/data

Internally, Docker does the equivalent of mount --bind /var/lib/docker/volumes/mydata/_data <container-rootfs>/app/data.

Even though the container has a different mount namespace, the kernel allows this bind mount, so the container can see and write to the host directory.


3. Why it Works Across Namespaces

  1. Mount namespace isolation only isolates mount points, not the underlying filesystem objects.

  2. Bind mount creates a reference from host FS into container FS.

  3. The kernel manages access; processes in different mount namespaces can see the same inode if it’s bind-mounted.

So even in a different MNT namespace:

  • Both containers see the same underlying data.

  • This is why volumes are shared storage.


4. Manual Example Without Docker

The same experiment works without Docker: open two terminals, give each its own mount namespace, and bind-mount the same host directory in both.

Both namespaces see the same data because it's bind-mounted from host.
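The two-terminal experiment collapsed into one script (requires root); the paths are illustrative:

```shell
mkdir -p /tmp/shared-data
echo "written from the host" > /tmp/shared-data/note.txt
[ "$(id -u)" -eq 0 ] || { echo "requires root"; exit 0; }
# "Terminal 1": its own mount namespace, bind-mounting the host dir
unshare -m /bin/sh -c '
  mkdir -p /tmp/ns1-view
  mount --bind /tmp/shared-data /tmp/ns1-view
  cat /tmp/ns1-view/note.txt
'
# "Terminal 2": a second namespace sees the same underlying data
unshare -m /bin/sh -c '
  mkdir -p /tmp/ns2-view
  mount --bind /tmp/shared-data /tmp/ns2-view
  cat /tmp/ns2-view/note.txt
'
```
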


5. Key Takeaways

  • Mount namespaces: isolate mount points, not the underlying filesystem

  • Bind mount: exposes a host directory inside the container namespace

  • Same kernel: all containers share the kernel, so the underlying storage is accessible

  • Docker volume: just a managed host path, bind-mounted into containers

  • Isolation: containers cannot see the host FS outside of bind mounts


So the container sees the host volume because:

  1. The host path exists on the kernel.

  2. Docker uses a bind mount into the container’s mount namespace.

  3. Kernel enforces access; namespace isolation doesn’t block it.
