Module 15 — End-to-End Verification
You have built a Kubernetes cluster from scratch — certificates, etcd, control plane, worker nodes, networking, DNS, load balancing, and a deployed application. This final module runs comprehensive verification tests to confirm everything works together.
Each section tests a different cluster capability. Run all tests from your Mac using the kubectl context configured in Module 13.
1. Cluster Component Health
1.1 Node status
kubectl get nodes -o wide
Expected:
NAME STATUS ROLES AGE VERSION INTERNAL-IP ...
worker1 Ready <none> 1h v1.31.0 192.168.56.23 ...
worker2 Ready <none> 1h v1.31.0 192.168.56.24 ...
Both nodes are Ready with the correct IPs and Kubernetes version.
1.2 Component status
Note: the componentstatuses API has been deprecated since Kubernetes v1.19, but it still works and gives a quick overview for a cluster like this one:
kubectl get componentstatuses
Expected:
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy ok
etcd-1 Healthy ok
1.3 System pods
kubectl get pods -n kube-system
Expected: CoreDNS pods are Running (2 replicas).
1.4 etcd health
From a control plane node:
ssh cp1 "sudo ETCDCTL_API=3 etcdctl endpoint health \
--endpoints=https://192.168.56.21:2379,https://192.168.56.22:2379 \
--cacert=/etc/etcd/ca.pem \
--cert=/etc/etcd/etcd.pem \
--key=/etc/etcd/etcd-key.pem"
Expected: Both endpoints show is healthy: successfully committed proposal.
Checkpoint: All cluster components are healthy — nodes Ready, control plane Healthy, etcd healthy, CoreDNS running.
2. Application Verification
2.1 Pod status
kubectl get pods -n customerapp -o wide
Expected: All pods (postgres, backend x2, nginx) are Running and distributed across workers.
2.2 Health endpoint
curl -s http://192.168.56.23:30080/health
Expected: Health response from the backend (e.g., {"status":"ok"}).
2.3 Test through both workers
The NodePort Service is accessible on every worker node:
curl -s http://192.168.56.23:30080/health
curl -s http://192.168.56.24:30080/health
Both should return the same response.
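If you want to script this comparison instead of eyeballing two curl outputs, a small helper can do it. This is a minimal sketch: the worker IPs and NodePort are the ones assumed throughout this module, and responses_match is a hypothetical helper name.

```shell
# Compare two health responses; an empty response counts as a mismatch.
responses_match() {
  if [ -n "$1" ] && [ "$1" = "$2" ]; then
    echo "MATCH"
  else
    echo "MISMATCH"
  fi
}

# Usage against the cluster:
# responses_match "$(curl -s http://192.168.56.23:30080/health)" \
#                 "$(curl -s http://192.168.56.24:30080/health)"
```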
2.4 Login test
curl -s -X POST http://192.168.56.23:30080/login \
-H "Content-Type: application/json" \
-d '{"username":"admin","password":"admin123"}'
Expected: A response with a session token or success message.
2.5 CRUD test
Create a customer:
curl -s -X POST http://192.168.56.23:30080/customers \
-H "Content-Type: application/json" \
-d '{"name":"Verification Test","email":"verify@test.com"}'
List customers:
curl -s http://192.168.56.23:30080/customers
The created customer should appear in the list.
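To make this check scriptable rather than visual, you can grep the list response for the name you just created. A sketch, assuming the JSON response shape from the Module 14 app; customer_listed is a hypothetical helper.

```shell
# Report whether a customer name appears in the /customers response on stdin.
customer_listed() {
  if grep -q "$1"; then
    echo "found: $1"
  else
    echo "missing: $1"
  fi
}

# Usage:
# curl -s http://192.168.56.23:30080/customers | customer_listed "Verification Test"
```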
Checkpoint: The application is fully functional — health checks pass, login works, CRUD operations succeed through both worker node IPs.
3. DNS Verification
3.1 Cluster service resolution
kubectl run dns-verify --image=busybox:1.36 --restart=Never --rm -it \
-- nslookup kubernetes.default
Expected: Resolves to 10.32.0.1 (the ClusterIP of the kubernetes Service, which fronts the API server).
3.2 Application service resolution
kubectl run dns-verify --image=busybox:1.36 --restart=Never --rm -it \
-n customerapp \
-- nslookup postgres.customerapp.svc.cluster.local
Expected: Resolves to the ClusterIP of the postgres Service.
3.3 Cross-namespace resolution
kubectl run dns-verify --image=busybox:1.36 --restart=Never --rm -it \
-- nslookup kube-dns.kube-system.svc.cluster.local
Expected: Resolves to 10.32.0.10 (the CoreDNS Service IP).
3.4 External DNS resolution
kubectl run dns-verify --image=busybox:1.36 --restart=Never --rm -it \
-- nslookup google.com
Expected: Resolves to a public IP address.
Checkpoint: DNS works for cluster services, cross-namespace lookups, and external domains.
4. Scaling
4.1 Scale the backend to 3 replicas
kubectl scale deployment backend -n customerapp --replicas=3
4.2 Watch pods being created
kubectl get pods -n customerapp -l app=backend -o wide -w
Press Ctrl+C after all 3 pods are Running.
Expected: The scheduler distributes pods across worker1 and worker2. You should see pods on both nodes.
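To see the distribution at a glance instead of scanning the -o wide output, you can count pods per node. A small sketch, assuming the app=backend label from Module 14; pods_per_node is a hypothetical helper.

```shell
# Count pods per node by piping node names (one per line) through sort | uniq.
pods_per_node() {
  sort | uniq -c | awk '{print $2 ": " $1}'
}

# Usage against the cluster:
# kubectl get pods -n customerapp -l app=backend \
#   -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}' | pods_per_node
```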
4.3 Verify all replicas serve traffic
for i in 1 2 3 4 5; do
curl -s http://192.168.56.23:30080/health
echo
done
All requests should succeed. The Service load-balances across all 3 backend pods.
4.4 Scale back to 2 replicas
kubectl scale deployment backend -n customerapp --replicas=2
Verify one pod is terminated:
kubectl get pods -n customerapp -l app=backend
Expected: 2 pods in Running status.
Checkpoint: Scaling up creates new pods across nodes. Scaling down terminates excess pods gracefully.
5. Rolling Update
5.1 Build and push a v2 image
On the app-server (192.168.56.12) where the registry and source code live:
ssh app-server
Make a small change to the application (e.g., update the health endpoint response or version string), rebuild, and push:
cd ~/customerapp
# Make a small visible change (add a version to the health endpoint, for example)
docker build -t 192.168.56.12:5000/customerapp:v2 .
docker push 192.168.56.12:5000/customerapp:v2
Tip: If you do not want to modify the app code, you can simply retag and push the same image as v2. The rollout will still replace all pods:
docker tag 192.168.56.12:5000/customerapp:v1 192.168.56.12:5000/customerapp:v2
docker push 192.168.56.12:5000/customerapp:v2
5.2 Trigger the rolling update
From your Mac:
kubectl set image deployment/backend -n customerapp \
backend=192.168.56.12:5000/customerapp:v2
5.3 Watch the rollout
kubectl rollout status deployment/backend -n customerapp
Expected:
Waiting for deployment "backend" rollout to finish: 1 out of 2 new replicas have been updated...
Waiting for deployment "backend" rollout to finish: 1 old replicas are pending termination...
deployment "backend" successfully rolled out
During the rollout, Kubernetes creates new pods with v2 and terminates old v1 pods one at a time — ensuring zero downtime.
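If you want evidence of the zero-downtime claim rather than taking it on faith, run a polling loop in a second terminal while the rollout proceeds. A minimal sketch: the IP and port are the ones assumed throughout this module, and poll_health is a hypothetical helper.

```shell
# Poll an endpoint n times, one second apart, and report how many requests failed.
poll_health() {
  url=$1; n=$2; fails=0; i=0
  while [ "$i" -lt "$n" ]; do
    # -f makes curl exit non-zero on HTTP errors as well as connection failures
    curl -sf --max-time 2 "$url" >/dev/null 2>&1 || fails=$((fails + 1))
    i=$((i + 1))
    sleep 1
  done
  echo "failed: ${fails}/${n}"
}

# Start this just before triggering the update in 5.2:
# poll_health http://192.168.56.23:30080/health 30
```

A clean rolling update should report zero failed requests.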
5.4 Verify the new version
kubectl get pods -n customerapp -l app=backend -o jsonpath='{.items[*].spec.containers[0].image}'
echo
Expected: All pods run 192.168.56.12:5000/customerapp:v2.
5.5 Test the app still works
curl -s http://192.168.56.23:30080/health
5.6 View rollout history
kubectl rollout history deployment/backend -n customerapp
5.7 Rollback (optional)
If the new version has issues, roll back to the previous version:
kubectl rollout undo deployment/backend -n customerapp
Verify pods are back to v1:
kubectl get pods -n customerapp -l app=backend -o jsonpath='{.items[*].spec.containers[0].image}'
echo
Checkpoint: Rolling update replaces pods without downtime. Rollback restores the previous version.
6. Node Failure Simulation
6.1 Check current pod distribution
kubectl get pods -n customerapp -o wide
Note which pods are on worker1.
6.2 Drain worker1
Draining a node evicts all pods and marks the node as unschedulable:
kubectl drain worker1 --ignore-daemonsets --delete-emptydir-data --force
- --ignore-daemonsets — do not evict DaemonSet pods (there are none in this setup, but it is good practice)
- --delete-emptydir-data — allow eviction of pods with emptyDir volumes
- --force — evict pods not managed by a controller (standalone pods)
6.3 Watch pods reschedule
kubectl get pods -n customerapp -o wide
Expected: Pods that were on worker1 are now rescheduled to worker2. The exception is PostgreSQL — it is pinned to worker1 via nodeName and will be in Pending state.
Note: The PostgreSQL pod uses nodeName: worker1, so it cannot be rescheduled to worker2. In production you would use StatefulSets with distributed storage to handle this. For this training, the brief PostgreSQL downtime demonstrates why stateful workloads need special consideration.
6.4 Verify the app still works (partially)
curl -s http://192.168.56.24:30080/health
The backend and nginx should still work (they are on worker2). Database-dependent operations may fail until PostgreSQL is back.
6.5 Uncordon worker1
kubectl uncordon worker1
This marks worker1 as schedulable again. Existing pods do NOT automatically move back — only new pods will consider worker1 for scheduling.
6.6 Verify PostgreSQL recovers
The PostgreSQL pod should start on worker1 once it is uncordoned:
kubectl get pods -n customerapp -l app=postgres -o wide -w
Wait until it shows Running.
6.7 Verify full app functionality
curl -s http://192.168.56.23:30080/health
curl -s http://192.168.56.23:30080/customers
Everything should be working again.
Checkpoint: Draining a node evicts pods to the remaining node. Uncordoning restores the node for scheduling.
7. Secret Encryption at Rest
In Module 08 you created an encryption config for encrypting Secrets in etcd. Verify it works.
7.1 Create a test secret
kubectl create secret generic test-encryption \
-n customerapp \
--from-literal=secret-data="this-should-be-encrypted"
7.2 Read the secret from etcd directly
From a control plane node (cp1):
ssh cp1 "sudo ETCDCTL_API=3 etcdctl get /registry/secrets/customerapp/test-encryption \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/etcd/ca.pem \
--cert=/etc/etcd/etcd.pem \
--key=/etc/etcd/etcd-key.pem \
--hex"
7.3 Verify encryption
The output should contain hex data. Look for the k8s:enc:aescbc:v1:key1 prefix in the raw value. This confirms the Secret is encrypted using AES-CBC with the key you generated in Module 08.
If the data were unencrypted, you would see the raw this-should-be-encrypted string in plain text. The encrypted data looks like random bytes.
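To make this check scriptable instead of scanning hex output by eye, grep the raw value for the provider prefix. A sketch: is_encrypted is a hypothetical helper, and the etcdctl invocation is the one from 7.2 run without --hex.

```shell
# Report whether the raw etcd value on stdin carries the API server's
# encryption-provider prefix. -a forces grep to treat binary input as text.
is_encrypted() {
  if grep -aq 'k8s:enc:aescbc:v1:'; then
    echo "encrypted"
  else
    echo "PLAINTEXT"
  fi
}

# Usage:
# ssh cp1 "sudo ETCDCTL_API=3 etcdctl get /registry/secrets/customerapp/test-encryption \
#   --endpoints=https://127.0.0.1:2379 --cacert=/etc/etcd/ca.pem \
#   --cert=/etc/etcd/etcd.pem --key=/etc/etcd/etcd-key.pem" | is_encrypted
```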
7.4 Verify Kubernetes can still read it
kubectl get secret test-encryption -n customerapp -o jsonpath='{.data.secret-data}' | base64 -d
echo
Expected: this-should-be-encrypted
Kubernetes transparently decrypts the data when reading through the API server.
7.5 Clean up
kubectl delete secret test-encryption -n customerapp
Checkpoint: Secrets are encrypted at rest in etcd. The API server transparently encrypts and decrypts data.
8. Cluster Health Report
Create a script that runs all verification checks and prints a summary. This is useful for quick cluster validation at any time.
8.1 Create the script
On your Mac:
cat > ~/k8s-cluster/cluster-health.sh <<'SCRIPT'
#!/bin/bash
set -euo pipefail
echo "============================================"
echo " Kubernetes Cluster Health Report"
echo " $(date)"
echo "============================================"
echo
# Nodes
echo "--- Nodes ---"
kubectl get nodes -o wide
echo
# Component status
echo "--- Component Status ---"
kubectl get componentstatuses 2>/dev/null || echo "(componentstatuses API deprecated)"
echo
# System pods
echo "--- System Pods ---"
kubectl get pods -n kube-system -o wide
echo
# Application pods
echo "--- Application Pods (customerapp) ---"
kubectl get pods -n customerapp -o wide
echo
# Services
echo "--- Services (customerapp) ---"
kubectl get svc -n customerapp
echo
# App health check
echo "--- Application Health ---"
HEALTH=$(curl -s --max-time 5 http://192.168.56.23:30080/health 2>/dev/null || echo "UNREACHABLE")
echo " Worker1 (192.168.56.23:30080): ${HEALTH}"
HEALTH=$(curl -s --max-time 5 http://192.168.56.24:30080/health 2>/dev/null || echo "UNREACHABLE")
echo " Worker2 (192.168.56.24:30080): ${HEALTH}"
echo
# DNS check
echo "--- DNS Check ---"
kubectl run dns-check --image=busybox:1.36 --restart=Never --rm -it --quiet \
-- nslookup kubernetes.default 2>/dev/null || echo " DNS check failed"
echo
# Cluster info
echo "--- Cluster Info ---"
kubectl cluster-info
echo
echo "============================================"
echo " Health report complete."
echo "============================================"
SCRIPT
chmod +x ~/k8s-cluster/cluster-health.sh
8.2 Run the report
~/k8s-cluster/cluster-health.sh
Review the output. All sections should show healthy components, running pods, and successful health checks.
Checkpoint: The health report script runs and shows all-green status.
9. What You Have Now — Full Cluster Summary
Congratulations. You have built a complete Kubernetes cluster from scratch. Here is everything you created across Modules 06–15:
| Module | What you built |
|---|---|
| 06 — Cluster VMs | 5 VMs with static IPs and SSH access |
| 07 — Certificate Authority & TLS | CA + 10 certificate pairs for all components |
| 08 — Kubeconfig Files | 6 kubeconfigs + encryption config distributed to nodes |
| 09 — etcd Cluster | 2-node etcd cluster with peer/client TLS |
| 10 — Control Plane | kube-apiserver, controller-manager, scheduler on cp1/cp2 |
| 11 — Worker Nodes | containerd, kubelet, kube-proxy on worker1/worker2 |
| 12 — CNI Networking | Bridge plugin + static routes for cross-node pod networking |
| 13 — CoreDNS & HAProxy | Cluster DNS + API server load balancing |
| 14 — Deploy App | PostgreSQL + Go backend + Nginx on Kubernetes |
| 15 — Verification | Scaling, rolling updates, failover, encryption at rest |
Cluster capabilities verified
| Capability | Status |
|---|---|
| Multi-node control plane with leader election | Verified |
| Worker node registration and scheduling | Verified |
| Cross-node pod networking | Verified |
| Service discovery via CoreDNS | Verified |
| API server high availability via HAProxy | Verified |
| Application deployment with Deployments and Services | Verified |
| Horizontal scaling | Verified |
| Rolling updates with zero downtime | Verified |
| Node failure resilience (drain/uncordon) | Verified |
| Secret encryption at rest | Verified |
| Persistent storage with PersistentVolumes | Verified |
| Private registry with authentication | Verified |
10. What's Next
You have completed the Kubernetes The Hard Way track. Here are areas to explore next:
Cluster management:
- Helm — package manager for Kubernetes. Deploy complex applications with a single helm install command.
- Ingress controllers — replace NodePort with proper HTTP routing (Nginx Ingress, Traefik, or Envoy-based controllers).
- cert-manager — automate TLS certificate management with Let's Encrypt.
Observability:
- Prometheus + Grafana — metrics collection and dashboarding for cluster and application monitoring.
- Loki — log aggregation. Centralize logs from all pods into a single queryable store.
- OpenTelemetry — distributed tracing for understanding request flows across services.
Networking:
- Calico or Cilium — replace the basic bridge CNI with a production-grade CNI that supports network policies, BGP, and eBPF.
- Network Policies — restrict pod-to-pod traffic based on labels (zero-trust networking).
Security:
- Pod Security Standards — enforce security baselines (no privileged containers, no host networking, read-only root filesystem).
- OPA/Gatekeeper — policy engine for validating Kubernetes resources before they are created.
- Falco — runtime threat detection for containers.
Storage:
- Longhorn or Rook-Ceph — distributed storage that works across nodes (replaces single-node hostPath).
- CSI drivers — integrate with cloud storage providers.
GitOps:
- Flux or ArgoCD — continuous deployment from Git. Push a change to your repo, and the cluster automatically updates to match.
Each of these tools builds on the fundamentals you now understand. Because you built the cluster by hand, you know exactly what each tool is abstracting away.