Chapter 9: Container Platforms and Orchestration

Learning Objectives

After completing this chapter, you will be able to:

  • Understand container fundamentals and their benefits for infrastructure
  • Design Kubernetes cluster architectures for production workloads
  • Implement container platform best practices
  • Configure container networking, storage, and security
  • Select appropriate container platforms for different use cases
  • Operate and troubleshoot containerized environments

Introduction

The container revolution has fundamentally changed how organizations build, deploy, and operate software. What began as an interesting approach to application packaging has become the default deployment model for modern applications. Containers have made it possible to achieve levels of consistency, portability, and resource efficiency that were previously impractical or impossible.

The appeal of containers lies in their elegant solution to a persistent problem: how do you ensure that software works the same way everywhere it runs? Developers had long struggled with applications that worked perfectly on their laptops but failed mysteriously in production. Operations teams spent countless hours tracking down differences between environments. The famous “works on my machine” excuse became a symbol of the friction between development and operations.

Containers solve this problem by packaging applications with everything they need to run—not just code, but runtime, system tools, libraries, and settings. This complete package can run identically whether it’s on a developer’s laptop, a test server, or a production cluster. The container is the same everywhere; only the host changes.

But containers alone are not enough for production use at scale. A single container is useful, but organizations typically run hundreds or thousands of containers, and they need these containers to work together as a system. They need to scale containers up and down, distribute them across machines, handle failures, manage networking and storage, and coordinate deployments. This is where container orchestration becomes essential.

Kubernetes has emerged as the dominant platform for container orchestration. Originally developed by Google based on their internal container management systems, Kubernetes provides a comprehensive framework for running containerized applications at scale. It has become so prevalent that understanding Kubernetes is now a fundamental skill for infrastructure professionals.

This chapter explores container technology from the ground up: what containers are and how they work, how Kubernetes orchestrates containers at scale, and how to operate container platforms in production. We will examine networking, storage, security, and the operational practices that make container platforms successful.


Understanding Container Technology

The Container Concept

To understand containers, it helps to first understand what came before them. Traditional application deployment involved installing applications directly on servers. Each server might run one application, or several applications might share a server and compete for resources. Configuration was manual, often inconsistent, and difficult to reproduce.

Virtual machines improved this situation by allowing multiple isolated operating systems to run on a single physical server. Each VM has its own kernel, its own filesystem, its own network stack—essentially its own complete computer. VMs provide strong isolation and have enabled cloud computing as we know it. But VMs are heavyweight: each one requires its own operating system, consuming gigabytes of memory and storage, and taking minutes to start.

Containers provide a middle ground. They offer isolation that is lighter than VMs but sufficient for most applications. A container has its own filesystem, its own network interface, and its own process space, but it shares the host operating system’s kernel. This sharing makes containers dramatically more efficient than VMs: they start in seconds rather than minutes, consume megabytes rather than gigabytes, and hundreds can run on a single host.

The technology behind containers is not new. Linux containers rely on kernel features that have existed for years: namespaces provide isolation between processes, cgroups (control groups) limit and account for resource usage, and union filesystems enable efficient image layering. What Docker and subsequent container tools provided was a way to package and distribute these technologies in a developer-friendly format.

How Containers Work

When you run a container, the container runtime creates isolated spaces for the container’s processes. The namespace isolation means the container sees its own view of the system: its own process IDs, its own network interfaces, its own filesystem. Processes inside the container cannot see processes outside it. The container appears to be its own self-contained system.

Cgroups complement namespaces by limiting what resources the container can use. You can specify that a container can use at most two CPUs and one gigabyte of memory. If the container tries to exceed these limits, the kernel enforces them: processes may be throttled or killed. This resource limiting is essential for multi-tenant environments where containers must share host resources predictably.

The container filesystem is typically built from layers using a union filesystem. The base layer might be a minimal operating system (Ubuntu, Alpine, or a distro-less image). Additional layers add application runtimes, libraries, and finally the application itself. Each layer is immutable and shared between containers that use it. If ten containers use the same base image, that base exists only once on disk, with each container adding only its unique layers.

This layered approach makes images efficient to store and transfer. When you update an application, only the changed layers need to be transferred. When you run multiple containers from similar images, they share the common layers. The efficiency compounds at scale—a host running dozens of containers might store far less data than if each had its own complete filesystem.

Container Images and Registries

A container image is a packaged filesystem containing everything needed to run an application. Images are typically built using a Dockerfile (or equivalent) that specifies the base image and the steps to add application content. Each instruction in the Dockerfile creates a new layer in the image.

Building good container images is both an art and a discipline. The best images are small (reducing transfer time and attack surface), secure (using minimal base images and running as non-root users), and efficient (placing frequently-changing content in later layers to maximize cache reuse).

# Example of a well-structured Dockerfile

# Use a specific, minimal base image
FROM python:3.11-slim-bookworm AS base

# Create non-root user
RUN useradd -m -s /bin/bash appuser

# Set working directory
WORKDIR /app

# Install dependencies in a separate layer (changes less frequently)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code (changes more frequently)
COPY --chown=appuser:appuser . .

# Switch to non-root user
USER appuser

# Define the entry point
EXPOSE 8080
CMD ["python", "app.py"]

Container registries store and distribute images. Public registries like Docker Hub host millions of images for common software. Organizations typically use private registries for their own images—services like Amazon ECR, Google Container Registry, Azure Container Registry, or self-hosted options like Harbor. Registries provide versioning through tags, and increasingly provide security scanning to identify vulnerabilities in stored images.

Containers vs. Virtual Machines

The choice between containers and virtual machines is not always straightforward. Each technology has strengths that make it suitable for different situations.

Containers excel when you need density and speed. Starting a container takes seconds; starting a VM takes minutes. You can run hundreds of containers on hardware that would support only tens of VMs. For microservices architectures where each service is a separate deployment unit, containers are the natural choice. For CI/CD pipelines that spin up and tear down environments constantly, containers’ fast startup is essential.

Virtual machines provide stronger isolation. Because each VM has its own kernel, a kernel vulnerability in one VM cannot directly affect another. For truly multi-tenant environments where you cannot trust workloads, VMs provide an additional security boundary. Some compliance requirements may mandate VM-level isolation.

In practice, many organizations use both. VMs provide the infrastructure layer—the nodes that run Kubernetes or other container platforms. Containers run within those VMs, providing density and efficiency for application workloads. This layered approach combines the security boundaries of VMs with the operational benefits of containers.


Kubernetes Fundamentals

What Kubernetes Does

Kubernetes is a platform for running containers at scale. At its core, Kubernetes does something simple: you tell it what you want (your desired state), and it works continuously to make reality match that desire. If you say you want three copies of an application running, Kubernetes starts three containers. If one fails, Kubernetes starts a replacement. If you change your desire to five copies, Kubernetes starts two more.

This declarative model is powerful because it separates intent from implementation. You don’t write scripts that check if containers are running and start new ones if needed. You declare that three containers should be running, and Kubernetes handles the details. It tracks the current state, computes the difference from the desired state, and takes actions to close that gap.

Kubernetes extends this declarative model across many aspects of running applications. You declare how traffic should be routed, and Kubernetes configures load balancing and service discovery. You declare what storage you need, and Kubernetes provisions it. You declare what configuration your application needs, and Kubernetes makes it available. The complexity is real, but it represents the complexity inherent in running distributed applications—Kubernetes provides a consistent way to manage that complexity.

The Kubernetes Architecture

A Kubernetes cluster consists of two types of machines: control plane nodes that manage the cluster, and worker nodes that run application workloads. In production, you typically have multiple control plane nodes for high availability and many worker nodes depending on your scale.

The control plane runs several components that together manage the cluster. The API server is the front door to the cluster—all communication, whether from users, tools, or cluster components, goes through the API. When you run kubectl commands, you’re talking to the API server. When the scheduler decides where to place a pod, it reads from and writes to the API server.

Behind the API server sits etcd, a distributed key-value store that holds all cluster state. Every piece of information Kubernetes needs—what pods exist, where they’re running, what their configuration is—lives in etcd. This makes etcd critical: lose etcd and you lose your cluster’s memory. Production clusters back up etcd regularly and run it with redundancy.

The scheduler decides where pods run. When you create a pod, it initially has no assigned node. The scheduler watches for unscheduled pods, evaluates which nodes could run each pod (based on resource requirements, affinity rules, and other constraints), scores the viable options, and assigns the pod to the best node. This assignment is then stored in the API server, and the node’s kubelet takes over.

The controller manager runs controllers—control loops that watch the cluster state and take actions to move toward the desired state. The deployment controller watches Deployments and creates or deletes ReplicaSets. The ReplicaSet controller watches ReplicaSets and creates or deletes Pods. Each controller has a specific responsibility, and together they implement Kubernetes’ self-healing behavior.

On each worker node, the kubelet is the agent that actually runs pods. The kubelet watches the API server for pods assigned to its node, then works with the container runtime to start containers, monitor their health, and report status. If a container fails, the kubelet restarts it. If a pod should no longer run on the node, the kubelet stops it.

Kubernetes Objects and Resources

Kubernetes represents everything as objects—data structures that you can create, read, update, and delete through the API. Understanding the key objects is essential to using Kubernetes effectively.

A Pod is the smallest deployable unit in Kubernetes. A pod contains one or more containers that share network and storage. Containers in a pod can communicate via localhost and can share files. In most cases, a pod runs a single container, but multi-container pods are useful for patterns like sidecars (helper containers that augment the main container).

You rarely create pods directly. Instead, you create higher-level objects that manage pods for you. A Deployment manages pods for stateless applications. It creates ReplicaSets that maintain the desired number of pod replicas. When you update a Deployment, it creates a new ReplicaSet with the updated specification and gradually scales it up while scaling the old ReplicaSet down, enabling rolling updates with zero downtime.

# Example Deployment manifest with detailed explanations

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-application
  labels:
    app: web
    tier: frontend
spec:
  # Maintain three replicas for high availability
  replicas: 3

  # Selector determines which pods belong to this deployment
  selector:
    matchLabels:
      app: web
      tier: frontend

  # Rolling update strategy for zero-downtime deployments
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Allow one extra pod during update
      maxUnavailable: 0  # Never reduce below desired count

  template:
    metadata:
      labels:
        app: web
        tier: frontend
    spec:
      containers:
      - name: web
        image: myapp:1.2.3
        ports:
        - containerPort: 8080

        # Resource requests and limits
        resources:
          requests:      # Minimum resources guaranteed
            cpu: 100m
            memory: 256Mi
          limits:        # Maximum resources allowed
            cpu: 500m
            memory: 512Mi

        # Health checks
        livenessProbe:   # Is the container healthy?
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10

        readinessProbe:  # Is the container ready for traffic?
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5

        # Security context
        securityContext:
          runAsNonRoot: true
          readOnlyRootFilesystem: true
          allowPrivilegeEscalation: false

StatefulSets manage pods for stateful applications—databases, message queues, and other workloads that need stable identities and persistent storage. Unlike Deployments, StatefulSets give each pod a stable name (web-0, web-1, web-2) and create pods in order. Each pod can have its own persistent storage that follows it across restarts and rescheduling.

DaemonSets ensure a pod runs on every node (or a subset of nodes matching a selector). This pattern is useful for infrastructure workloads like log collectors, monitoring agents, or network plugins that need to run everywhere.

Services provide stable network endpoints for pods. Because pods are ephemeral—they come and go as the system scales and heals—you cannot rely on pod IP addresses for communication. A Service defines a logical set of pods (using a selector) and a policy for accessing them. The Service gets a stable IP address and DNS name that other pods can use. Traffic to the Service is load-balanced across the healthy pods.


Kubernetes Networking

The Networking Model

Kubernetes networking follows a few fundamental principles that shape how containers communicate. Understanding these principles helps you design and troubleshoot network configurations.

The first principle is that every pod gets its own IP address. This is different from Docker’s default networking where containers share port spaces. In Kubernetes, pods can communicate with each other using their IP addresses directly, without NAT (network address translation). This flat network model simplifies application configuration—services don’t need to worry about port conflicts or address translation.

The second principle is that pods can reach each other across nodes. When a pod on Node A sends traffic to a pod on Node B, the network infrastructure routes that traffic correctly. How this happens depends on the CNI (Container Network Interface) plugin you use, but the result is a cluster-wide network where any pod can reach any other pod.

The third principle is that Services abstract pod IPs. Since pod IPs change as pods come and go, applications use Service IPs and DNS names instead. Kubernetes maintains the mapping from Services to pods and load-balances traffic accordingly.

Container Network Interface

The CNI (Container Network Interface) is a specification for network plugins that configure container networking. When Kubernetes creates a pod, it calls the CNI plugin to set up networking for that pod. Different CNI plugins implement networking differently, with trade-offs in features, performance, and complexity.

Calico is one of the most popular CNI plugins. It uses BGP (Border Gateway Protocol) to route traffic between nodes without encapsulation, providing good performance. Calico also implements Kubernetes NetworkPolicies, providing pod-level firewall capabilities. It’s well-suited for large clusters and security-conscious environments.

Cilium represents a newer generation of CNI plugins using eBPF (extended Berkeley Packet Filter), a technology for running programs in the Linux kernel. eBPF enables Cilium to provide advanced features like transparent encryption, deep packet inspection, and sophisticated observability without the overhead of traditional approaches. Cilium is increasingly popular for its performance and security capabilities.

Cloud providers offer CNI plugins that integrate with their native networking. AWS VPC CNI assigns pods IP addresses from VPC subnets, enabling pods to communicate directly with other VPC resources and appear as native VPC endpoints. Azure CNI provides similar integration for Azure. These plugins simplify networking when running in their respective clouds but reduce portability.

Services and Ingress

Services are how you expose applications within and outside the cluster. The Service type determines the exposure level.

ClusterIP Services (the default) create a virtual IP that is only reachable from within the cluster. Other pods can reach the Service using its IP or DNS name (typically service-name.namespace.svc.cluster.local). ClusterIP is appropriate for internal services that only other cluster workloads need to access.

NodePort Services expose the service on a port on every node in the cluster. External clients can reach the service by connecting to any node’s IP address at the designated port. NodePort is useful for development or when you have external load balancers that can be configured to target node ports, but it’s not ideal for production because it exposes ports on all nodes.

LoadBalancer Services integrate with cloud providers to provision external load balancers. When you create a LoadBalancer Service on AWS, Kubernetes creates an ELB (Elastic Load Balancer) that routes traffic to your pods. This is the standard way to expose services to the internet in cloud environments, but it creates a separate load balancer for each service, which can become costly at scale.

Ingress provides a more sophisticated approach to external access. An Ingress defines rules for routing HTTP traffic based on hostnames and paths. A single Ingress controller (like NGINX Ingress or Traefik) can handle traffic for many services, consolidating external load balancers. Ingress also supports TLS termination and other HTTP-specific features.

# Example Ingress configuration

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: application-ingress
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - app.example.com
    secretName: app-tls-secret
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend-service
            port:
              number: 80

Service Mesh

For complex microservices architectures, a service mesh provides additional networking capabilities beyond what basic Kubernetes networking offers. A service mesh works by injecting a proxy sidecar into each pod. All traffic flows through these proxies, enabling the mesh to observe, route, and secure communication.

Istio is the most feature-rich service mesh, providing traffic management (routing, load balancing, circuit breaking), security (mutual TLS, authorization policies), and observability (metrics, traces, access logs). Istio’s capabilities come with significant complexity; operating Istio well requires substantial expertise.

Linkerd offers a simpler alternative with strong reliability features. It provides mTLS (mutual TLS) encryption between services with minimal configuration, along with observability and traffic management. Linkerd aims for simplicity and low resource overhead, making it a good choice for teams that want service mesh benefits without Istio’s complexity.

The decision to adopt a service mesh should be driven by actual needs. Service meshes add complexity and resource overhead. For smaller deployments or simpler architectures, basic Kubernetes networking and explicit HTTP calls may suffice. Service meshes become valuable when you need features like automatic mTLS, sophisticated traffic routing, or deep observability across many services.


Kubernetes Storage

The Storage Challenge

Containers are ephemeral by design. When a container restarts, it starts fresh with the filesystem from its image. Any data written during the container’s life is lost. This works well for stateless applications but presents challenges for anything that needs to persist data—databases, message queues, or applications that process files.

Kubernetes addresses this with volumes—storage that can be attached to pods and persisted beyond the container lifecycle. The simplest volumes, like emptyDir, provide temporary storage that survives container restarts within a pod but is deleted when the pod terminates. More persistent options survive pod termination and even node failures.

PersistentVolumes (PVs) represent pieces of storage in the cluster. A PV might be backed by cloud provider block storage (EBS, Azure Disk, Persistent Disk), network filesystem storage (NFS, EFS), or local disks. PVs exist independently of pods; they’re cluster resources that pods can claim and use.

PersistentVolumeClaims (PVCs) are how pods request storage. A PVC specifies what the pod needs: how much storage, what access mode (read-write-once, read-only-many, read-write-many), and what storage class. Kubernetes matches PVCs with available PVs or dynamically provisions new storage to fulfill the claim.

Storage Classes and Dynamic Provisioning

Storage classes define different types of storage available in the cluster. A cluster might have storage classes for fast SSD storage, standard disk storage, and network-attached storage. Each class specifies a provisioner that knows how to create the backing storage.

Dynamic provisioning creates storage on demand. When a PVC requests storage with a particular storage class, the provisioner creates the underlying storage (an EBS volume, for example), creates a PV to represent it, and binds the PVC to that PV. This automation eliminates the need to pre-provision storage and enables self-service storage for applications.

# Storage class for high-performance SSD storage on AWS

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "4000"
  throughput: "250"
  fsType: ext4
  encrypted: "true"
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
allowVolumeExpansion: true

---
# PVC requesting storage from that class

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-storage
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 100Gi

StatefulSet Storage Patterns

StatefulSets have a special relationship with storage. When you define a StatefulSet with a volumeClaimTemplate, Kubernetes creates a separate PVC for each pod. These PVCs persist even if the pod is deleted, and the new pod gets the same PVC when it’s recreated. This behavior enables stateful applications like databases to retain their data across restarts and rescheduling.

The challenge with StatefulSet storage is that the data is tied to a specific pod identity. If you need to move data between pods or scale down, you need to handle the data explicitly. Kubernetes won’t automatically rebalance data; that’s the application’s responsibility (or an operator that understands the application).

For critical stateful workloads, consider using operators—specialized controllers that understand the application and can handle operational tasks like backup, restore, scaling, and failover. Operators exist for many databases (PostgreSQL, MySQL, MongoDB) and message queues (Kafka, RabbitMQ). These operators encode operational knowledge that would otherwise require human intervention.


Container Security

Security in Depth

Container security requires attention at multiple layers. Each layer provides defense, and weaknesses in one layer may be mitigated by strength in another. A comprehensive security approach addresses all layers.

At the code layer, secure coding practices prevent vulnerabilities from entering applications. Static analysis tools identify issues in source code. Dependency scanning identifies vulnerable libraries before they’re packaged into images.

At the image layer, security starts with the base image. Minimal base images (like distroless images or Alpine) reduce the attack surface by including only what’s necessary. Image scanning identifies known vulnerabilities in the image’s packages. Image signing ensures that images haven’t been tampered with.

At the container layer, runtime configuration affects security. Running containers as non-root users limits what a compromised container can do. Read-only root filesystems prevent modifications. Dropping unnecessary Linux capabilities reduces privilege. Seccomp profiles limit available system calls.

At the pod layer, Kubernetes Pod Security Standards define profiles for pod security. The Restricted profile enforces best practices like non-root users, read-only filesystems, and dropped capabilities. Applying these standards across namespaces ensures consistent security.

At the cluster layer, RBAC (Role-Based Access Control) limits what users and service accounts can do. Network policies control traffic between pods. Audit logging records cluster activities for investigation and compliance.

Pod Security Standards

Kubernetes provides Pod Security Standards—predefined profiles that apply security restrictions to pods. These standards replace the older PodSecurityPolicy resource with a simpler, more maintainable approach.

The Privileged profile has no restrictions. It’s for pods that require elevated privileges, like system-level infrastructure workloads. Use it sparingly and only where truly necessary.

The Baseline profile prevents known privilege escalations. It blocks privileged containers, host networking and ports, and hostPath mounts. It’s a reasonable default for most workloads that don’t need special privileges.

The Restricted profile enforces best practices. It requires non-root users, read-only root filesystems, and dropped capabilities. It blocks volume types beyond the safe ones. This profile is appropriate for security-sensitive workloads.

# Example of security context enforcing Restricted profile requirements

spec:
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: myapp:1.0
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      runAsNonRoot: true
      capabilities:
        drop:
          - ALL

Network Policies

Network policies are the Kubernetes equivalent of firewalls for pod-to-pod traffic. By default, all pods can communicate with all other pods. Network policies restrict this, specifying which pods can communicate with which.

A network policy selects pods using labels and specifies ingress (incoming) and egress (outgoing) rules. Pods not selected by any policy continue to accept all traffic. Once a pod is selected by any policy, it’s restricted to only the traffic that policies explicitly allow.

This default-deny approach requires careful planning. A common pattern is to create a default-deny policy for a namespace, then explicitly allow the traffic that should be permitted. This ensures that new pods don’t accidentally get more network access than intended.

# Default deny all ingress traffic in a namespace

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}  # Selects all pods in namespace
  policyTypes:
  - Ingress
  # No ingress rules = deny all ingress

---
# Allow traffic from frontend to backend

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080

Operating Kubernetes in Production

Cluster Design Considerations

Designing a Kubernetes cluster for production requires decisions about availability, scalability, and isolation. These decisions shape how resilient the cluster is to failures and how it can grow over time.

Control plane availability requires multiple control plane nodes. A three-node control plane can survive the loss of one node while maintaining quorum in etcd. For critical environments, five nodes provide tolerance for two failures. Control plane nodes should be spread across availability zones so that a zone failure doesn’t take down the cluster.

Worker node capacity depends on your workloads. Larger nodes can run more pods but represent bigger failure domains. Smaller nodes provide more granular scaling but have more overhead. Many organizations use a mix: larger nodes for most workloads, specialized node pools for specific needs (high-memory nodes for databases, GPU nodes for machine learning).

Node pools enable heterogeneous clusters. You might have a default pool of general-purpose nodes, a high-memory pool for memory-intensive workloads, and a spot/preemptible pool for cost-sensitive, fault-tolerant workloads. Taints and tolerations ensure pods land on appropriate nodes.

Upgrades and Maintenance

Kubernetes releases new versions approximately every four months, and each version is supported for about a year. Staying current with upgrades ensures you receive security patches and bug fixes. Falling too far behind makes upgrades more difficult and risky.

Control plane upgrades should be done one node at a time in highly available setups. Upgrade one control plane node, verify it’s healthy, then proceed to the next. Most managed Kubernetes services handle control plane upgrades automatically with appropriate safeguards.

Worker node upgrades can follow various strategies. In-place upgrades update nodes without replacing them—faster but riskier. Rolling replacements create new nodes with the new version, drain workloads from old nodes, and then delete them—safer but slower. The choice depends on your tolerance for risk and the effort you want to invest in automation.

Pod Disruption Budgets (PDBs) protect workloads during maintenance. A PDB specifies how many pods of a certain type must remain available during voluntary disruptions like node drains. Kubernetes honors these budgets, refusing to drain nodes if doing so would violate the budget. PDBs ensure your applications maintain capacity during upgrades.

Managed Kubernetes Services

For many organizations, managed Kubernetes services from cloud providers offer a compelling alternative to running Kubernetes themselves. These services handle control plane operations, upgrades, and integration with cloud services.

Amazon EKS (Elastic Kubernetes Service) provides managed Kubernetes on AWS. The control plane is fully managed with high availability across multiple availability zones. EKS integrates deeply with AWS services—IAM for authentication, VPC for networking, EBS and EFS for storage. Fargate integration enables running pods without managing nodes at all.

Azure AKS (Azure Kubernetes Service) is Microsoft’s managed offering. It provides free control plane management (you pay only for nodes), Azure AD integration for authentication, and Azure CNI for networking. AKS has been a leader in Windows container support for organizations with Windows workloads.

Google GKE (Google Kubernetes Engine) benefits from Google’s experience as the original developers of Kubernetes. GKE offers Autopilot mode, which manages nodes automatically and bills per-pod rather than per-node. GKE is often first to support new Kubernetes features and provides strong multi-cluster and multi-region capabilities.

The choice between managed and self-managed Kubernetes depends on your organization. Managed services reduce operational burden but cost more and offer less flexibility. Self-managed clusters provide more control but require significant expertise to operate well. Many organizations start with managed services and only move to self-managed if they hit specific limitations.


Platform Engineering and Developer Experience

The Platform Engineering Approach

Running Kubernetes well is complex, but most developers shouldn’t need to understand that complexity. Platform engineering is the practice of building internal platforms that abstract infrastructure complexity and enable developers to be productive.

A good platform provides developers with what they need while hiding what they don’t. Developers should be able to deploy applications, access logs and metrics, and configure routing without understanding the intricacies of Kubernetes objects. The platform team handles networking, security, upgrades, and compliance; application teams focus on their applications.

This abstraction takes many forms. GitOps tools like ArgoCD or Flux let developers deploy by pushing to Git repositories—they don’t need to run kubectl commands or understand manifests in detail. Templates and CRDs (Custom Resource Definitions) provide simplified interfaces for common patterns. Internal developer portals aggregate documentation, services, and self-service capabilities.

Building the Platform

Effective platforms start with understanding developer needs. What are the common tasks developers perform? What’s currently painful? What would help them move faster? Platform teams that build in isolation from their users often build the wrong things.

Core platform capabilities typically include:

Deployment automation through GitOps ensures that what’s in Git matches what’s running in the cluster. Developers push changes to Git; the GitOps controller applies them to the cluster. This provides auditability, enables easy rollbacks, and eliminates direct cluster access for deployments.

Observability stacks provide the visibility developers need. Prometheus and Grafana for metrics, Loki or Elasticsearch for logs, Jaeger or Tempo for traces. These tools should be pre-configured so developers get useful dashboards and alerts without configuration effort.

Secrets management integrates external secret stores (like AWS Secrets Manager, HashiCorp Vault, or Azure Key Vault) with Kubernetes. Tools like External Secrets Operator or Sealed Secrets ensure sensitive data is handled correctly without burdening developers with the details.

Service mesh capabilities (if needed) provide encryption, traffic management, and observability for service-to-service communication. The platform team configures and operates the mesh; developers benefit without managing it themselves.

Balancing Control and Freedom

Platform teams must balance standardization with flexibility. Too much control frustrates developers and slows innovation. Too much freedom creates inconsistency and security risks. The right balance depends on your organization’s culture and needs.

Golden paths are well-supported, recommended approaches for common tasks. Want to deploy a web service? Here’s the golden path: use this template, deploy through this pipeline, and get these features automatically. Developers can follow the path and benefit from the platform’s work. But they can also diverge when they have good reasons—they just take on more responsibility when they do.

Platform teams should measure their success by developer outcomes, not by technology metrics. Are developers able to deploy faster? Are incidents resolved more quickly? Do developers find the platform helpful? These outcomes matter more than which tools the platform uses.


Review Questions

  1. Container Fundamentals: Explain how containers provide isolation without requiring a separate kernel. What are the advantages and limitations of this approach compared to virtual machines?

  2. Kubernetes Architecture: Describe the role of each control plane component. What happens when you run kubectl apply -f deployment.yaml?

  3. Networking Design: Your application consists of a frontend, an API backend, and a database. Design the Services and NetworkPolicies to expose the frontend externally while restricting traffic appropriately.

  4. Storage Strategy: A team wants to run a PostgreSQL database on Kubernetes. What storage configuration would you recommend? What are the risks and how would you mitigate them?

  5. Security Implementation: Design a security approach for a Kubernetes cluster including image security, pod security, and network security. What tools and policies would you use?

  6. Managed vs. Self-Managed: What factors would influence your decision between managed Kubernetes (like EKS or GKE) and self-managed Kubernetes? What are the trade-offs?


Key Takeaways

Containers have become the standard packaging format for modern applications. They provide consistency across environments, efficient resource utilization, and fast deployment—benefits that accumulate at scale.

Kubernetes provides the orchestration layer that makes containers practical for production use. Its declarative model, self-healing capabilities, and rich ecosystem solve the hard problems of running distributed applications.

Networking in Kubernetes follows a flat model where pods communicate directly. Services abstract pod IPs, and Ingress handles external HTTP traffic. Service meshes add capabilities for complex architectures but also add complexity.

Storage in Kubernetes uses the PersistentVolume abstraction. Dynamic provisioning with storage classes enables self-service storage. StatefulSets provide special support for stateful applications.

Security requires defense at every layer—from secure code through hardened images to locked-down pods and clusters. Pod Security Standards provide baseline policies. Network policies implement micro-segmentation.

Platform engineering applies these technologies to create productive developer experiences. Good platforms hide complexity while providing the capabilities developers need.


Summary

Container platforms have fundamentally changed how organizations deploy and operate software. Containers provide the packaging—consistent, portable, efficient units that run the same everywhere. Kubernetes provides the orchestration—the automated management that makes running containers at scale practical.

Understanding containers starts with understanding their relationship to virtual machines: lighter isolation, shared kernels, faster startup, higher density. These characteristics make containers ideal for microservices, CI/CD, and cloud-native architectures. But containers alone aren’t enough; you need something to orchestrate them at scale.

Kubernetes has become that something for most organizations. Its declarative model, where you specify desired state and the system works to achieve it, simplifies operations while handling the complexity of distributed systems. Understanding Kubernetes architecture—the control plane, worker nodes, and the objects that represent workloads—is essential for anyone operating modern infrastructure.

The operational aspects—networking, storage, security, upgrades—require careful attention. Each area has depth and nuance that this chapter can only introduce. As you work with container platforms, you’ll develop deeper expertise in the areas that matter most for your workloads.

Platform engineering represents the maturation of container operations. Rather than every team wrestling with Kubernetes complexity, platform teams build abstractions that enable developers to be productive. This division of labor—platform teams handling infrastructure complexity, application teams focusing on applications—is how organizations scale their container operations effectively.


Chapter Navigation


Back to top

Infrastructure and Platform Management Handbook - MIT License