
Module 13 — CoreDNS & HAProxy Load Balancer

Two critical infrastructure pieces are still missing:

  • CoreDNS — provides DNS-based service discovery inside the cluster. Without it, pods cannot resolve service names like postgres.customerapp.svc.cluster.local.
  • HAProxy — load balances API server traffic across cp1 and cp2. Every kubeconfig for workers and external clients points to 192.168.56.20:6443 (the lb VM). Until HAProxy is running, those connections fail.

1. Deploy CoreDNS

CoreDNS runs as a Kubernetes Deployment with a Service at the well-known IP 10.32.0.10 (the clusterDNS address configured in kubelet). Every pod's /etc/resolv.conf points to this IP.
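For reference, the resolv.conf that kubelet generates for a pod typically looks like this (a sketch — the search line shown is for a pod in the default namespace, and exact options can vary):

```
nameserver 10.32.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
```

The search suffixes are what let a pod use short names like postgres.customerapp instead of the full postgres.customerapp.svc.cluster.local.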

1.1 Create the CoreDNS manifests

Run from a control plane node (cp1 or cp2). Create all resources in a single apply:

kubectl apply --kubeconfig /var/lib/kubernetes/admin.kubeconfig -f - <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: coredns
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:coredns
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
rules:
- apiGroups: [""]
  resources:
  - endpoints
  - services
  - pods
  - namespaces
  verbs:
  - list
  - watch
- apiGroups: ["discovery.k8s.io"]
  resources:
  - endpointslices
  verbs:
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:coredns
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:coredns
subjects:
- kind: ServiceAccount
  name: coredns
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . 8.8.8.8 8.8.4.4
        cache 30
        loop
        reload
        loadbalance
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      serviceAccountName: coredns
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      containers:
      - name: coredns
        image: coredns/coredns:1.11.3
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        args: ["-conf", "/etc/coredns/Corefile"]
        volumeMounts:
        - name: config-volume
          mountPath: /etc/coredns
          readOnly: true
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /ready
            port: 8181
            scheme: HTTP
      dnsPolicy: Default
      volumes:
      - name: config-volume
        configMap:
          name: coredns
          items:
          - key: Corefile
            path: Corefile
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: kube-dns
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.32.0.10
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
EOF

Understanding the Corefile

Directive                    Purpose
errors                       Log errors to stdout
health                       Serve health check on :8080/health
ready                        Serve readiness check on :8181/ready
kubernetes cluster.local     Answer DNS queries for Kubernetes services and pods
forward . 8.8.8.8 8.8.4.4    Forward non-cluster queries to Google DNS
cache 30                     Cache DNS responses for 30 seconds
loop                         Detect and break forwarding loops
reload                       Reload the Corefile on changes without a restart
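Note that the kubernetes directive only ever sees fully qualified names; it is the pod's resolver search list (from /etc/resolv.conf) that turns a short name like postgres.customerapp into one. A rough sketch of that expansion, assuming a pod in the default namespace with the usual ndots:5 option:

```shell
#!/bin/sh
# Sketch of resolver behavior, not CoreDNS code: a name with fewer than
# `ndots` (5) dots is tried with each search suffix appended before it is
# tried as-is. The second candidate below is the one CoreDNS answers.
name="postgres.customerapp"
queries=""
for suffix in default.svc.cluster.local svc.cluster.local cluster.local; do
  queries="$queries ${name}.${suffix}"
done
queries="$queries ${name}"
for q in $queries; do
  echo "query: $q"
done
```

This is also why fully qualified lookups are slightly cheaper: a name ending in a dot (or with five or more dots) skips the search-suffix attempts entirely.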

1.2 Verify CoreDNS pods are running

kubectl get pods -n kube-system -l k8s-app=kube-dns \
--kubeconfig /var/lib/kubernetes/admin.kubeconfig

Expected:

NAME                       READY   STATUS    RESTARTS   AGE
coredns-xxxxxxxxxx-xxxxx   1/1     Running   0          30s
coredns-xxxxxxxxxx-yyyyy   1/1     Running   0          30s

Both replicas should be Running.

1.3 Verify the kube-dns service

kubectl get svc kube-dns -n kube-system \
--kubeconfig /var/lib/kubernetes/admin.kubeconfig

Expected:

NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
kube-dns   ClusterIP   10.32.0.10   <none>        53/UDP,53/TCP   45s

The ClusterIP is 10.32.0.10 — the same address kubelet is configured to use as clusterDNS.

Checkpoint: Two CoreDNS pods are running. The kube-dns service has ClusterIP 10.32.0.10.


2. Verify DNS Resolution

2.1 Test from a pod

kubectl run dns-test --image=busybox:1.36 --restart=Never --rm -it \
--kubeconfig /var/lib/kubernetes/admin.kubeconfig \
-- nslookup kubernetes.default

Expected:

Server:    10.32.0.10
Address 1: 10.32.0.10 kube-dns.kube-system.svc.cluster.local

Name: kubernetes.default
Address 1: 10.32.0.1 kubernetes.default.svc.cluster.local

This confirms:

  • The pod's DNS resolver points to 10.32.0.10 (CoreDNS)
  • CoreDNS resolves kubernetes.default to 10.32.0.1 (the API server's ClusterIP)

2.2 Test external DNS resolution

kubectl run dns-test2 --image=busybox:1.36 --restart=Never --rm -it \
--kubeconfig /var/lib/kubernetes/admin.kubeconfig \
-- nslookup google.com

Expected: Returns an IP address for google.com. This proves the forward directive in CoreDNS is working — non-cluster queries are forwarded to 8.8.8.8.

Checkpoint: DNS resolution works for both cluster services and external domains from inside pods.


3. Install HAProxy on the Load Balancer

The lb VM (192.168.56.20) has been idle since Module 06. Now it gets its purpose — load balancing API server traffic across cp1 and cp2.

3.1 SSH into the lb VM

ssh lb

3.2 Install HAProxy

sudo apt-get update
sudo apt-get install -y haproxy

3.3 Configure HAProxy

cat <<EOF | sudo tee /etc/haproxy/haproxy.cfg
global
    log /dev/log local0
    log /dev/log local1 notice
    daemon

defaults
    mode tcp
    log global
    option tcplog
    option dontlognull
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend kubernetes-api
    bind *:6443
    default_backend kubernetes-api-backend

backend kubernetes-api-backend
    balance roundrobin
    option tcp-check
    server cp1 192.168.56.21:6443 check fall 3 rise 2
    server cp2 192.168.56.22:6443 check fall 3 rise 2
EOF

What each section does

Section / Directive              Purpose
mode tcp                         Layer 4 (TCP) proxying — HAProxy passes raw TLS without terminating it
frontend kubernetes-api          Listens on *:6443 for incoming connections
backend kubernetes-api-backend   Distributes connections across cp1 and cp2
balance roundrobin               Alternate between backends
option tcp-check                 Health check that verifies TCP connectivity
check fall 3 rise 2              Mark a backend down after 3 failed checks, up after 2 successful ones

Note: HAProxy runs in TCP mode (Layer 4), not HTTP mode. The TLS handshake happens directly between the client and the API server. HAProxy only forwards raw TCP bytes — it never sees the decrypted traffic.
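The fall/rise counters behave like a small state machine. This is an illustrative sketch of the bookkeeping, not HAProxy's actual implementation:

```shell
#!/bin/sh
# Illustrative sketch (not HAProxy internals): `fall 3 rise 2` means a
# server goes DOWN after 3 consecutive failed checks and comes back UP
# after 2 consecutive successful ones.
state=UP
fails=0
oks=0
for result in ok fail fail ok fail fail fail ok ok; do
  if [ "$result" = "fail" ]; then
    oks=0
    fails=$((fails + 1))
    if [ "$state" = "UP" ] && [ "$fails" -ge 3 ]; then state=DOWN; fi
  else
    fails=0
    oks=$((oks + 1))
    if [ "$state" = "DOWN" ] && [ "$oks" -ge 2 ]; then state=UP; fi
  fi
  echo "check=$result state=$state"
done
```

Notice that the two non-consecutive failures early in the sequence never take the server down; requiring consecutive failures (and consecutive successes to recover) is what keeps transient glitches from causing flapping.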

3.4 Start HAProxy

sudo systemctl enable haproxy
sudo systemctl restart haproxy
sudo systemctl status haproxy

Expected: Active: active (running).

Checkpoint: HAProxy is running on the lb VM and listening on port 6443.


4. Verify HAProxy

4.1 Test from your Mac

Now that HAProxy is running, the load balancer IP (192.168.56.20:6443) works:

cd ~/k8s-cluster/certs

kubectl --kubeconfig=admin.kubeconfig get nodes

Expected:

NAME      STATUS   ROLES    AGE   VERSION
worker1   Ready    <none>   30m   v1.31.0
worker2   Ready    <none>   30m   v1.31.0

The admin kubeconfig points to https://192.168.56.20:6443. HAProxy forwards the request to one of the API servers, and you get the node list back.
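The relevant part of that kubeconfig is its cluster entry (field layout per the standard kubeconfig format; the cluster name is illustrative and the CA data is elided):

```yaml
clusters:
- cluster:
    certificate-authority-data: ...   # base64-encoded CA certificate, elided
    server: https://192.168.56.20:6443
  name: kubernetes
```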

4.2 Verify kube-proxy reconnects

kube-proxy on the workers has been trying to connect to 192.168.56.20:6443 since Module 11. Now that HAProxy is running, it should have reconnected automatically. Check from a worker:

ssh worker1 "sudo systemctl status kube-proxy"

The service should be active (running) without recent restart errors.

4.3 Check HAProxy stats

ssh lb "echo 'show stat' | sudo socat stdio /run/haproxy/admin.sock 2>/dev/null || echo 'Stats socket not configured (that is OK)'"

Checkpoint: kubectl get nodes works from your Mac through the load balancer. Both workers show Ready.


5. Test API Server Failover

5.1 Stop API server on cp1

ssh cp1 "sudo systemctl stop kube-apiserver"

5.2 Verify kubectl still works

kubectl --kubeconfig=admin.kubeconfig get nodes

Expected: Still returns the node list. HAProxy detected that cp1 is down and routes all traffic to cp2.

5.3 Check HAProxy backend status

ssh lb "echo 'show servers state' | sudo socat stdio /run/haproxy/admin.sock 2>/dev/null || echo 'Check haproxy logs: sudo journalctl -u haproxy --no-pager | tail -20'"

5.4 Restart API server on cp1

ssh cp1 "sudo systemctl start kube-apiserver"

Verify cp1 is back:

kubectl --kubeconfig=admin.kubeconfig --server=https://192.168.56.21:6443 get nodes

Checkpoint: kubectl continues to work when one API server is stopped. Restarting the API server brings it back into the load balancer pool.


6. Set Up kubectl Context on Your Mac

Now that the load balancer is running, configure your Mac's default kubectl context:

cd ~/k8s-cluster/certs
mkdir -p ~/.kube
cp admin.kubeconfig ~/.kube/config

Test:

kubectl get nodes
kubectl get pods -n kube-system

Both commands should work without specifying --kubeconfig or --server.

Tip: From now on, all kubectl commands in subsequent modules assume you have this context configured. You no longer need the --kubeconfig flag.


7. Troubleshooting

CoreDNS pods in CrashLoopBackOff

Check CoreDNS logs:

kubectl logs -n kube-system -l k8s-app=kube-dns --kubeconfig /var/lib/kubernetes/admin.kubeconfig

Common causes:

  • Corefile syntax error — validate the ConfigMap
  • Cannot reach 8.8.8.8 — the worker VM needs a NAT adapter for internet access
  • RBAC permission error — verify the ClusterRole and ClusterRoleBinding exist

DNS pod cannot resolve kubernetes.default

kubectl run dns-debug --image=busybox:1.36 --restart=Never --rm -it -- cat /etc/resolv.conf

The nameserver should be 10.32.0.10. If not, kubelet's clusterDNS configuration is wrong — check /var/lib/kubelet/kubelet-config.yaml on the worker.
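For reference, the relevant fields in that file use the KubeletConfiguration schema (kubelet.config.k8s.io/v1beta1); an excerpt should look roughly like this:

```yaml
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
clusterDomain: "cluster.local"
clusterDNS:
  - "10.32.0.10"
```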

HAProxy "connection refused" on port 6443

  1. HAProxy is not running: ssh lb "sudo systemctl status haproxy"
  2. Port is not listening: ssh lb "ss -tlnp | grep 6443"
  3. Configuration error: ssh lb "sudo haproxy -c -f /etc/haproxy/haproxy.cfg" (validates config syntax)

Failover does not work — kubectl fails when one CP is down

HAProxy might not have detected the failure yet. With fall 3 and HAProxy's default 2-second check interval, detection typically takes about 6 seconds; wait 15 seconds to be safe and retry. If it still fails, verify both backends are listed in the config:

ssh lb "grep server /etc/haproxy/haproxy.cfg"

Workers show connection errors to 192.168.56.20:6443 in logs

Before HAProxy was running, kubelet and kube-proxy logged connection errors to the LB IP. These errors should stop now. If they persist, verify network connectivity:

ssh worker1 "curl -k https://192.168.56.20:6443/healthz"

Should return ok.


8. What You Have Now

Capability                       Verification Command
CoreDNS running (2 replicas)     kubectl get pods -n kube-system -l k8s-app=kube-dns
Cluster DNS at 10.32.0.10        kubectl get svc kube-dns -n kube-system
Service name resolution works    kubectl run test --image=busybox --rm -it -- nslookup kubernetes.default
External DNS resolution works    kubectl run test --image=busybox --rm -it -- nslookup google.com
HAProxy load balancing API       kubectl get nodes from your Mac (via 192.168.56.20:6443)
API server failover              Stop one CP; kubectl still works
kubectl configured on Mac        kubectl get nodes without extra flags

The cluster is now feature-complete from an infrastructure perspective. DNS works for service discovery, the API server is load-balanced for high availability, and kubectl works from your Mac. The cluster is ready for application deployments.


Next up: Module 14 — Deploy the Customer Information App — deploy PostgreSQL, the Go backend, and the Nginx reverse proxy as Kubernetes workloads.