Module 13 — CoreDNS & HAProxy Load Balancer
Two critical infrastructure pieces are still missing:
- CoreDNS — provides DNS-based service discovery inside the cluster. Without it, pods cannot resolve service names like postgres.customerapp.svc.cluster.local.
- HAProxy — load balances API server traffic across cp1 and cp2. Every kubeconfig for workers and external clients points to 192.168.56.20:6443 (the lb VM). Until HAProxy is running, those connections fail.
1. Deploy CoreDNS
CoreDNS runs as a Kubernetes Deployment with a Service at the well-known IP 10.32.0.10 (the clusterDNS address configured in kubelet). Every pod's /etc/resolv.conf points to this IP.
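Inside a pod, the effect is visible in /etc/resolv.conf. The sketch below prints what a typical pod resolver config looks like once clusterDNS is set — the search domains assume a pod in the default namespace, and the ndots value is the common kubelet default; exact contents can vary:

```shell
# Illustrative pod /etc/resolv.conf with clusterDNS set to 10.32.0.10.
# Shown as a heredoc here; inside a real pod you would just `cat /etc/resolv.conf`.
cat <<'EOF'
nameserver 10.32.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
EOF
```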
1.1 Create the CoreDNS manifests
Run from a control plane node (cp1 or cp2). Create all resources in a single apply:
kubectl apply --kubeconfig /var/lib/kubernetes/admin.kubeconfig -f - <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: coredns
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:coredns
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
rules:
- apiGroups: [""]
  resources:
  - endpoints
  - services
  - pods
  - namespaces
  verbs:
  - list
  - watch
- apiGroups: ["discovery.k8s.io"]
  resources:
  - endpointslices
  verbs:
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:coredns
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:coredns
subjects:
- kind: ServiceAccount
  name: coredns
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . 8.8.8.8 8.8.4.4
        cache 30
        loop
        reload
        loadbalance
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      serviceAccountName: coredns
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      containers:
      - name: coredns
        image: coredns/coredns:1.11.3
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        args: ["-conf", "/etc/coredns/Corefile"]
        volumeMounts:
        - name: config-volume
          mountPath: /etc/coredns
          readOnly: true
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /ready
            port: 8181
            scheme: HTTP
      dnsPolicy: Default
      volumes:
      - name: config-volume
        configMap:
          name: coredns
          items:
          - key: Corefile
            path: Corefile
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: kube-dns
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.32.0.10
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
EOF
Understanding the Corefile
| Directive | Purpose |
|---|---|
| errors | Log errors to stdout |
| health | Serve health check on :8080/health |
| ready | Serve readiness check on :8181/ready |
| kubernetes cluster.local | Answer DNS queries for Kubernetes services and pods |
| forward . 8.8.8.8 8.8.4.4 | Forward non-cluster queries to Google DNS |
| cache 30 | Cache DNS responses for 30 seconds |
| loop | Detect and break forwarding loops |
| reload | Reload the Corefile on changes without a restart |
| loadbalance | Randomize the order of records in responses |
1.2 Verify CoreDNS pods are running
kubectl get pods -n kube-system -l k8s-app=kube-dns \
--kubeconfig /var/lib/kubernetes/admin.kubeconfig
Expected:
NAME READY STATUS RESTARTS AGE
coredns-xxxxxxxxxx-xxxxx 1/1 Running 0 30s
coredns-xxxxxxxxxx-yyyyy 1/1 Running 0 30s
Both replicas should be Running.
1.3 Verify the kube-dns service
kubectl get svc kube-dns -n kube-system \
--kubeconfig /var/lib/kubernetes/admin.kubeconfig
Expected:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.32.0.10 <none> 53/UDP,53/TCP 45s
The ClusterIP is 10.32.0.10 — the same address kubelet is configured to use as clusterDNS.
Checkpoint: Two CoreDNS pods are running. The kube-dns service has ClusterIP 10.32.0.10.
2. Verify DNS Resolution
2.1 Test from a pod
kubectl run dns-test --image=busybox:1.36 --restart=Never --rm -it \
--kubeconfig /var/lib/kubernetes/admin.kubeconfig \
-- nslookup kubernetes.default
Expected:
Server: 10.32.0.10
Address 1: 10.32.0.10 kube-dns.kube-system.svc.cluster.local
Name: kubernetes.default
Address 1: 10.32.0.1 kubernetes.default.svc.cluster.local
This confirms:
- The pod's DNS resolver points to 10.32.0.10 (CoreDNS)
- CoreDNS resolves kubernetes.default to 10.32.0.1 (the API server's ClusterIP)
2.2 Test external DNS resolution
kubectl run dns-test2 --image=busybox:1.36 --restart=Never --rm -it \
--kubeconfig /var/lib/kubernetes/admin.kubeconfig \
-- nslookup google.com
Expected: Returns an IP address for google.com. This proves the forward directive in CoreDNS is working — non-cluster queries are forwarded to 8.8.8.8.
Checkpoint: DNS resolution works for both cluster services and external domains from inside pods.
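Short names like kubernetes.default work because the resolver expands them through the search path into fully qualified names. The expansion is plain string composition, sketched below (the service and namespace names are just examples):

```shell
# A service short name <svc>.<ns> expands to <svc>.<ns>.svc.cluster.local.
svc=postgres
ns=customerapp
fqdn="${svc}.${ns}.svc.cluster.local"
echo "$fqdn"   # postgres.customerapp.svc.cluster.local
```

This is why the name postgres.customerapp.svc.cluster.local from the module introduction resolves once CoreDNS is up.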
3. Install HAProxy on the Load Balancer
The lb VM (192.168.56.20) has been idle since Module 06. Now it gets its purpose — load balancing API server traffic across cp1 and cp2.
3.1 SSH into the lb VM
ssh lb
3.2 Install HAProxy
sudo apt-get update
sudo apt-get install -y haproxy
3.3 Configure HAProxy
cat <<EOF | sudo tee /etc/haproxy/haproxy.cfg
global
    log /dev/log local0
    log /dev/log local1 notice
    daemon

defaults
    mode tcp
    log global
    option tcplog
    option dontlognull
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend kubernetes-api
    bind *:6443
    default_backend kubernetes-api-backend

backend kubernetes-api-backend
    balance roundrobin
    option tcp-check
    server cp1 192.168.56.21:6443 check fall 3 rise 2
    server cp2 192.168.56.22:6443 check fall 3 rise 2
EOF
What each section does
| Section | Purpose |
|---|---|
| mode tcp | Layer 4 (TCP) proxying — HAProxy passes raw TLS without terminating it |
| frontend kubernetes-api | Listens on *:6443 for incoming connections |
| backend kubernetes-api-backend | Distributes connections across cp1 and cp2 |
| balance roundrobin | Alternate between backends |
| option tcp-check | Health check that verifies TCP connectivity |
| check fall 3 rise 2 | Mark a backend down after 3 failed checks, up after 2 successful ones |
Note: HAProxy runs in TCP mode (Layer 4), not HTTP mode. The TLS handshake happens directly between the client and the API server. HAProxy only forwards raw TCP bytes — it never sees the decrypted traffic.
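The effect of balance roundrobin can be shown in miniature: connections walk the backend list in order and wrap around. The loop below is purely an illustration of that scheduling order, not how HAProxy is implemented:

```shell
# Round-robin in miniature: connection i goes to backend (i mod N).
# With two backends, traffic simply alternates cp1, cp2, cp1, cp2, ...
for i in 0 1 2 3; do
  if [ $(( i % 2 )) -eq 0 ]; then target=cp1; else target=cp2; fi
  echo "connection $i -> $target"
done
```

In practice the health checks modify this: a backend marked down is skipped until it passes `rise` consecutive checks.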
3.4 Start HAProxy
sudo systemctl enable haproxy
sudo systemctl restart haproxy
sudo systemctl status haproxy
Expected: Active: active (running).
Checkpoint: HAProxy is running on the lb VM and listening on port 6443.
4. Verify HAProxy
4.1 Test from your Mac
Now that HAProxy is running, the load balancer IP (192.168.56.20:6443) works:
cd ~/k8s-cluster/certs
kubectl --kubeconfig=admin.kubeconfig get nodes
Expected:
NAME STATUS ROLES AGE VERSION
worker1 Ready <none> 30m v1.31.0
worker2 Ready <none> 30m v1.31.0
The admin kubeconfig points to https://192.168.56.20:6443. HAProxy forwards the request to one of the API servers, and you get the node list back.
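You can confirm which endpoint any kubeconfig targets by grepping its server: field. The sketch below runs the grep against a trimmed inline sample (a real kubeconfig also carries certificate and user entries); run the same grep against your actual admin.kubeconfig:

```shell
# Trimmed kubeconfig fragment; the server: line is the address kubectl dials.
sample='clusters:
- cluster:
    server: https://192.168.56.20:6443'
printf '%s\n' "$sample" | grep 'server:'
# On your Mac: grep 'server:' ~/k8s-cluster/certs/admin.kubeconfig
```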
4.2 Verify kube-proxy reconnects
kube-proxy on the workers has been trying to connect to 192.168.56.20:6443 since Module 11. Now that HAProxy is running, it should have reconnected automatically. Check from a worker:
ssh worker1 "sudo systemctl status kube-proxy"
The service should be active (running) without recent restart errors.
4.3 Check HAProxy stats
ssh lb "echo 'show stat' | sudo socat stdio /run/haproxy/admin.sock 2>/dev/null || echo 'Stats socket not configured (that is OK)'"
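If the stats socket is available, show stat emits CSV. A sketch of extracting per-server fields from it, demonstrated against a trimmed, hypothetical sample — real output has many more columns, so adjust the field numbers when parsing actual socket output:

```shell
# Trimmed, hypothetical 'show stat' CSV: proxy name, server name, status.
sample='# pxname,svname,status
kubernetes-api-backend,cp1,UP
kubernetes-api-backend,cp2,UP'
# Skip the header line, then print proxy/server and its status.
printf '%s\n' "$sample" | awk -F, 'NR > 1 { printf "%s/%s: %s\n", $1, $2, $3 }'
```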
Checkpoint: kubectl get nodes works from your Mac through the load balancer. Both workers show Ready.
5. Test API Server Failover
5.1 Stop API server on cp1
ssh cp1 "sudo systemctl stop kube-apiserver"
5.2 Verify kubectl still works
kubectl --kubeconfig=admin.kubeconfig get nodes
Expected: Still returns the node list. HAProxy detected that cp1 is down and routes all traffic to cp2.
5.3 Check HAProxy backend status
ssh lb "echo 'show servers state' | sudo socat stdio /run/haproxy/admin.sock 2>/dev/null || echo 'Check haproxy logs: sudo journalctl -u haproxy --no-pager | tail -20'"
5.4 Restart API server on cp1
ssh cp1 "sudo systemctl start kube-apiserver"
Verify cp1 is back:
kubectl --kubeconfig=admin.kubeconfig --server=https://192.168.56.21:6443 get nodes
Checkpoint: kubectl continues to work when one API server is stopped. Restarting the API server brings it back into the load balancer pool.
6. Set Up kubectl Context on Your Mac
Now that the load balancer is running, configure your Mac's default kubectl context:
cd ~/k8s-cluster/certs
mkdir -p ~/.kube
cp admin.kubeconfig ~/.kube/config
Test:
kubectl get nodes
kubectl get pods -n kube-system
Both commands should work without specifying --kubeconfig or --server.
Tip: From now on, all kubectl commands in subsequent modules assume you have this context configured. You no longer need the --kubeconfig flag.
7. Troubleshooting
CoreDNS pods in CrashLoopBackOff
Check CoreDNS logs:
kubectl logs -n kube-system -l k8s-app=kube-dns --kubeconfig /var/lib/kubernetes/admin.kubeconfig
Common causes:
- Corefile syntax error — validate the ConfigMap
- Cannot reach 8.8.8.8 — the worker VM needs a NAT adapter for internet access
- RBAC permission error — verify the ClusterRole and ClusterRoleBinding exist
DNS pod cannot resolve kubernetes.default
kubectl run dns-debug --image=busybox:1.36 --restart=Never --rm -it -- cat /etc/resolv.conf
The nameserver should be 10.32.0.10. If not, kubelet's clusterDNS configuration is wrong — check /var/lib/kubelet/kubelet-config.yaml on the worker.
HAProxy "connection refused" on port 6443
- HAProxy is not running: ssh lb "sudo systemctl status haproxy"
- Port is not listening: ssh lb "ss -tlnp | grep 6443"
- Configuration error: ssh lb "sudo haproxy -c -f /etc/haproxy/haproxy.cfg" (validates config syntax)
Failover does not work — kubectl fails when one CP is down
HAProxy might not have detected the failure yet. Our config does not override the check interval, and HAProxy's default inter is 2 seconds, so with fall 3 detection takes roughly 6 seconds — wait a few seconds and retry. If it still fails, verify both backends are listed in the config:
ssh lb "grep server /etc/haproxy/haproxy.cfg"
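The detection delay is simple arithmetic: roughly the health-check interval times the fall count. A sketch, assuming HAProxy's default inter of 2000 ms since our config does not set it:

```shell
# Detection time before a backend is marked down ≈ check interval × fall count.
inter_ms=2000   # HAProxy default 'inter'; not overridden in our haproxy.cfg
fall=3
echo "marked down after ~$(( inter_ms * fall / 1000 ))s"
```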
Workers show connection errors to 192.168.56.20:6443 in logs
Before HAProxy was running, kubelet and kube-proxy logged connection errors to the LB IP. These errors should stop now. If they persist, verify network connectivity:
ssh worker1 "curl -k https://192.168.56.20:6443/healthz"
Should return ok.
8. What You Have Now
| Capability | Verification Command |
|---|---|
| CoreDNS running (2 replicas) | kubectl get pods -n kube-system -l k8s-app=kube-dns |
| Cluster DNS at 10.32.0.10 | kubectl get svc kube-dns -n kube-system |
| Service name resolution works | kubectl run test --image=busybox --rm -it -- nslookup kubernetes.default |
| External DNS resolution works | kubectl run test --image=busybox --rm -it -- nslookup google.com |
| HAProxy load balancing API | kubectl get nodes from Mac (via 192.168.56.20:6443) |
| API server failover | Stop one CP, kubectl still works |
| kubectl configured on Mac | kubectl get nodes without extra flags |
The cluster is now feature-complete from an infrastructure perspective. DNS works for service discovery, the API server is load-balanced for high availability, and kubectl works from your Mac. The cluster is ready for application deployments.
Next up: Module 14 — Deploy the Customer Information App — deploy PostgreSQL, the Go backend, and the Nginx reverse proxy as Kubernetes workloads.