Running Pi-hole in a Highly Available Setup on Kubernetes (v6 Update)
With the recent release of Pi-hole v6, I decided to revisit the challenge of setting up high availability (HA) for Pi-hole in a Kubernetes environment. While Pi-hole has always been a great self-hosted DNS sinkhole, running it in a highly available way has been tricky, mainly because it relies on SQLite databases, which are not designed for concurrent writes from multiple replicas.
In previous versions, a common issue was that scaling Pi-hole with a Persistent Volume Claim (PVC) wasn’t straightforward. Since SQLite locks its database files, simply spinning up multiple replicas would lead to corruption or conflicts. This meant that traditional Kubernetes scaling with a single PVC wasn’t a viable solution.
With Pi-hole v6, there are some new possibilities and improvements that make tackling HA more practical. In this post, I’ll go through my approach to running Pi-hole in a highly available setup on Kubernetes, ensuring redundancy, fault tolerance, and seamless DNS resolution across multiple nodes.
Let’s dive in! 🚀
Upgrading Pi-hole from v5 to v6
With the release of Pi-hole v6, I decided to fully transition to environment-based configuration instead of relying on the UI for settings. This approach allows for better automation, version control, and consistency, especially when running Pi-hole in Kubernetes.
Configuration via Environment Variables
To achieve this, I defined all necessary configuration values in an env file (plain KEY=VALUE pairs) that is loaded into a Kubernetes ConfigMap using Kustomize. This eliminates manual setup through the web UI and ensures that settings are applied consistently across deployments.
Here are the configuration values used in the ConfigMap (passwords redacted):
TZ=Europe/Berlin
FTLCONF_webserver_api_password=<REDACTED>
FTLCONF_dns_upstreams=10.2.0.1
FTLCONF_dns_revServers=true,10.1.0.0/16,10.1.0.1#53,local
FTLCONF_webserver_interface_theme=default-dark
FTLCONF_dns_listeningMode=all
FTLCONF_dns_cache_size=10000
FTLCONF_files_database=/etc/pihole/db/pihole-FTL.db
FTLCONF_files_gravity=/etc/pihole/db/gravity.db
FTLCONF_files_macvendor=/etc/pihole/db/macvendor.db
FTLCONF_dns_rateLimit_count=1000
FTLCONF_dns_rateLimit_interval=60
FTLCONF_dns_ignoreLocalhost=false
FTLCONF_dns_domain=local
FTLCONF_misc_dnsmasq_lines=no-0x20-encode
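For reference, here is a minimal Kustomize sketch that generates the pi-hole-env ConfigMap consumed by the StatefulSets below. The file name pi-hole.env and the networking namespace are assumptions based on my setup:
# kustomization.yaml (sketch): generate the pi-hole-env ConfigMap from the env file
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: networking
configMapGenerator:
  - name: pi-hole-env
    envs:
      - pi-hole.env
generatorOptions:
  # keep the generated name stable so envFrom can reference it directly
  disableNameSuffixHash: true
Disabling the name-suffix hash keeps the generated name fixed at pi-hole-env, matching the envFrom reference used later in the StatefulSets.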
Persistent Storage for Stateful Data
Since Pi-hole relies on databases (e.g., FTL, Gravity), a PersistentVolumeClaim (PVC) is required to store them. The PVC is mounted to three key locations:
- /etc/dnsmasq.d - configuration files for dnsmasq
- /etc/pihole - Pi-hole core configuration
- /etc/pihole/db - custom database location (set explicitly in the environment variables above)
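The claim itself (referenced later as claimName: pi-hole) can be very simple. A minimal sketch follows; the access mode, size, and default storage class are assumptions and depend on your cluster:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pi-hole
  labels:
    app.kubernetes.io/name: pi-hole
spec:
  accessModes:
    - ReadWriteOnce   # only the single read-write instance mounts it
  resources:
    requests:
      storage: 2Gi    # placeholder size, adjust to your query/gravity data volume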
Deploying Pi-hole in Kubernetes
I used a StatefulSet to ensure that Pi-hole has a stable identity and persistent storage:
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: pi-hole-rw
labels:
app.kubernetes.io/name: pi-hole
pi.hole/role: read-write
spec:
revisionHistoryLimit: 3
selector:
matchLabels:
app.kubernetes.io/name: pi-hole
pi.hole/role: read-write
serviceName: "pi-hole"
replicas: 1
template:
metadata:
labels:
app.kubernetes.io/name: pi-hole
pi.hole/role: read-write
spec:
topologySpreadConstraints:
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
app.kubernetes.io/name: pi-hole
containers:
- name: pi-hole
image: pihole/pihole
envFrom:
- configMapRef:
name: pi-hole-env
resources:
requests:
memory: 128Mi
cpu: 100m
volumeMounts:
- name: pihole-data
mountPath: /etc/pihole/db
subPath: databases
- name: pihole-data
mountPath: /etc/pihole
subPath: config
- name: pihole-data
mountPath: /etc/dnsmasq.d
subPath: dnsmasq
livenessProbe:
failureThreshold: 3
timeoutSeconds: 5
httpGet:
path: /admin
port: http
readinessProbe:
failureThreshold: 3
timeoutSeconds: 5
httpGet:
path: /admin
port: http
ports:
- containerPort: 80
name: http
protocol: TCP
- containerPort: 53
name: dns-tcp
protocol: TCP
- containerPort: 53
name: dns-udp
protocol: UDP
- name: exporter
image: ekofr/pihole-exporter
env:
- name: PIHOLE_HOSTNAME
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: PIHOLE_PORT
value: "80"
- name: PIHOLE_PASSWORD
value: "<REDACTED>"
resources:
limits:
memory: 128Mi
requests:
cpu: 50m
memory: 128Mi
ports:
- containerPort: 9617
name: prometheus
protocol: TCP
securityContext:
fsGroup: 1000
fsGroupChangePolicy: "OnRootMismatch"
volumes:
- name: pihole-data
persistentVolumeClaim:
claimName: pi-hole
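After applying the manifests, a quick sanity check (assuming everything lives in the networking namespace) confirms that the primary pod is running and the PVC is bound:
# Check the primary StatefulSet, the bound PVC, and recent container logs
kubectl -n networking get statefulset pi-hole-rw
kubectl -n networking get pvc pi-hole
kubectl -n networking logs statefulset/pi-hole-rw -c pi-hole --tail=20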
Exposing Pi-hole Services
A Service is used to expose Pi-hole within the cluster:
apiVersion: v1
kind: Service
metadata:
name: pi-hole
labels:
app.kubernetes.io/name: pi-hole
pi.hole/role: read-write
spec:
sessionAffinity: ClientIP
sessionAffinityConfig:
clientIP:
timeoutSeconds: 300
selector:
app.kubernetes.io/name: pi-hole
pi.hole/role: read-write
ports:
- port: 80
targetPort: http
name: http
For DNS resolution, a LoadBalancer Service with a static IP is used:
apiVersion: v1
kind: Service
metadata:
name: pi-hole-dns
labels:
app.kubernetes.io/name: pi-hole
annotations:
io.cilium/lb-ipam-sharing-key: "pi-hole-svc"
io.cilium/lb-ipam-ips: "10.2.0.6"
spec:
type: LoadBalancer
externalTrafficPolicy: Local
ports:
- port: 53
targetPort: 53
protocol: TCP
name: dns-tcp
- port: 53
targetPort: 53
protocol: UDP
name: dns-udp
selector:
app.kubernetes.io/name: pi-hole
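Once Cilium assigns the address from the io.cilium/lb-ipam-ips annotation, resolution can be verified directly against it over both UDP and TCP (example.com is just a test domain):
# Query the Pi-hole LoadBalancer IP over UDP (default) and TCP
dig @10.2.0.6 example.com +short
dig @10.2.0.6 example.com +tcp +short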
Finally, an Ingress is used to expose the Pi-hole UI over HTTPS:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: pi-hole
annotations:
kubernetes.io/ingress.allow-http: "false"
cert-manager.io/cluster-issuer: letsencrypt
gethomepage.dev/description: Secondary AdBlocking DNS
gethomepage.dev/enabled: "true"
gethomepage.dev/group: Cluster Management
gethomepage.dev/icon: pi-hole.png
gethomepage.dev/name: Pi Hole
external-dns.alpha.kubernetes.io/target: ingress.somedomain.com
labels:
use-cloudflare-solver: "true"
app.kubernetes.io/name: pi-hole
spec:
ingressClassName: nginx
tls:
- hosts:
- pihole.somedomain.com
secretName: pi-hole-tls
rules:
- host: pihole.somedomain.com
http:
paths:
- path: /
pathType: ImplementationSpecific
backend:
service:
name: pi-hole
port:
name: http
Summary
By upgrading to Pi-hole v6 and moving entirely to environment-based configuration, I achieved:
- Better automation with Kubernetes-native ConfigMaps.
- Persistent storage for database files using a PVC.
- Scalability improvements, allowing for a better HA strategy in the next steps.
Next, I will tackle high availability (HA) by ensuring that multiple Pi-hole replicas can work together while keeping the data consistent. Stay tuned! 🚀
High Availability Setup
Ensuring DNS availability during updates and reconfiguration is critical. To achieve this, I implemented a High Availability (HA) setup with read-only replicas that continue to serve DNS queries while the primary instance undergoes maintenance.
Read-Only Replicas
I created a second StatefulSet for read-only replicas that do not expose the web UI. These replicas mount their data into an emptyDir instead of using a persistent volume claim (PVC), as their configuration is synced from the primary instance.
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: pi-hole-ro
labels:
app.kubernetes.io/name: pi-hole
pi.hole/role: read-only
spec:
serviceName: "pi-hole-ro"
revisionHistoryLimit: 3
selector:
matchLabels:
app.kubernetes.io/name: pi-hole
pi.hole/role: read-only
replicas: 2
template:
metadata:
labels:
app.kubernetes.io/name: pi-hole
pi.hole/role: read-only
spec:
topologySpreadConstraints:
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
app.kubernetes.io/name: pi-hole
containers:
- name: pi-hole
image: pihole/pihole
envFrom:
- configMapRef:
name: pi-hole-env
resources:
requests:
memory: 128Mi
cpu: 100m
volumeMounts:
- name: pihole-data
mountPath: /etc/pihole/db
subPath: databases
- name: pihole-data
mountPath: /etc/pihole
subPath: config
- name: pihole-data
mountPath: /etc/dnsmasq.d
subPath: dnsmasq
livenessProbe:
failureThreshold: 3
timeoutSeconds: 5
httpGet:
path: /admin
port: http
readinessProbe:
failureThreshold: 3
timeoutSeconds: 5
httpGet:
path: /admin
port: http
ports:
- containerPort: 80
name: http
protocol: TCP
- containerPort: 53
name: dns-tcp
protocol: TCP
- containerPort: 53
name: dns-udp
protocol: UDP
securityContext:
fsGroup: 1000
fsGroupChangePolicy: "OnRootMismatch"
volumes:
- name: pihole-data
emptyDir: {}
A headless service is used to provide stable DNS names for the read-only replicas, allowing them to be targeted for synchronization.
apiVersion: v1
kind: Service
metadata:
name: pi-hole-ro
labels:
app.kubernetes.io/name: pi-hole
pi.hole/role: read-only
spec:
clusterIP: None
selector:
app.kubernetes.io/name: pi-hole
pi.hole/role: read-only
ports:
- port: 80
targetPort: http
name: http
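Because the service is headless (clusterIP: None), each replica is reachable under a stable per-pod name of the form <pod>.<service>.<namespace>.svc.cluster.local, which the sync jobs below rely on. From inside the cluster (namespace networking assumed), that looks like:
# Stable per-pod names provided by the headless service
nslookup pi-hole-ro-0.pi-hole-ro.networking.svc.cluster.local
nslookup pi-hole-ro-1.pi-hole-ro.networking.svc.cluster.local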
Configuration Synchronization
To synchronize the primary and read-only replicas, I used Nebula Sync (ghcr.io/lovelaze/nebula-sync). This tool uses the Teleporter API to propagate configuration changes made through the primary's UI to the replicas.
However, adlists need to be manually synchronized, as they are managed via the UI. Since Pi-hole v6 now fully exposes an API, I created a script to fetch lists from the primary and sync them to the replicas.
Adlist Synchronization Script
#!/bin/bash
# Ensure required environment variables are set
if [[ -z "$PRIMARY_PIHOLE" || -z "$PRIMARY_PIHOLE_PASS" || -z "$REPLICA_PIHOLE" || -z "$REPLICA_PIHOLE_PASS" ]]; then
echo "Error: Required environment variables are missing."
exit 1
fi
# Authenticate with the primary Pi-hole
auth_payload=$(jq -n --arg password "$PRIMARY_PIHOLE_PASS" '{password: $password}')
auth_response=$(curl -s -X POST -H "Content-Type: application/json" -d "$auth_payload" "$PRIMARY_PIHOLE/api/auth")
SID=$(echo "$auth_response" | jq -r '.session.sid')
if [[ -z "$SID" || "$SID" == "null" ]]; then
echo "Error: Failed to retrieve session token from $PRIMARY_PIHOLE"
exit 1
fi
# Fetch enabled adlists from the primary Pi-hole
response=$(curl -s -H "X-FTL-SID: $SID" "$PRIMARY_PIHOLE/api/lists")
addresses=$(echo "$response" | jq -c '[.lists[] | select(.enabled == true) | .address]')
if [[ "$addresses" == "[]" ]]; then
echo "No enabled lists found. Exiting."
exit 0
fi
# Define payload
payload=$(jq -n --argjson address "$addresses" '{address: $address, type: "block", comment: "Synced list", groups: [0], enabled: true}')
# Sync adlists with replicas
IFS='|' read -ra REPLICAS <<< "$REPLICA_PIHOLE"
IFS='|' read -ra PASSWORDS <<< "$REPLICA_PIHOLE_PASS"
for index in "${!REPLICAS[@]}"; do
replica=${REPLICAS[$index]}
pass=${PASSWORDS[$index]}
# Authenticate with replica
replica_auth_payload=$(jq -n --arg password "$pass" '{password: $password}')
replica_auth_response=$(curl -s -X POST -H "Content-Type: application/json" -d "$replica_auth_payload" "$replica/api/auth")
replica_SID=$(echo "$replica_auth_response" | jq -r '.session.sid')
if [[ -z "$replica_SID" || "$replica_SID" == "null" ]]; then
echo "Error: Failed to retrieve session token from $replica"
continue
fi
# Send adlist data to replica
curl -s -X POST -H "Content-Type: application/json" -H "X-FTL-SID: $replica_SID" -d "$payload" "$replica/api/lists"
# Trigger gravity update on replica
curl -s -X POST -H "Content-Type: application/json" -H "X-FTL-SID: $replica_SID" "$replica/api/action/gravity"
echo "Sync completed for $replica."
done
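The script reads its targets from environment variables, with multiple replicas and passwords separated by |. A local test run could look like this (all values are placeholders; in the cluster they come from the CronJob below):
# Example invocation with placeholder values
PRIMARY_PIHOLE="https://pihole.somedomain.com" \
PRIMARY_PIHOLE_PASS="<primary-password>" \
REPLICA_PIHOLE="http://pi-hole-ro-0|http://pi-hole-ro-1" \
REPLICA_PIHOLE_PASS="<password-0>|<password-1>" \
bash sync.sh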
Automated Execution
A Kubernetes CronJob runs every hour to sync configurations between the primary and read-only replicas.
apiVersion: batch/v1
kind: CronJob
metadata:
name: pi-hole-sync
spec:
schedule: "0 * * * *"
successfulJobsHistoryLimit: 3
failedJobsHistoryLimit: 1
jobTemplate:
spec:
template:
spec:
containers:
- name: nebula-sync
image: ghcr.io/lovelaze/nebula-sync:latest
env:
- name: PRIMARY
value: "https://pihole.somedomain.com|<masked-password>"
- name: REPLICAS
value: "http://pi-hole-ro-0.pi-hole-ro.networking.svc.cluster.local|<masked-password>,http://pi-hole-ro-1.pi-hole-ro.networking.svc.cluster.local|<masked-password>"
- name: FULL_SYNC
value: "true"
- name: adlists-sync
image: alpine:latest
command: ["/bin/sh", "-c", "apk add --no-cache jq bash curl && /script/sync.sh"]
volumeMounts:
- name: script
mountPath: /script
env:
- name: PRIMARY_PIHOLE
value: https://pihole.somedomain.com
- name: PRIMARY_PIHOLE_PASS
value: <masked-password>
- name: REPLICA_PIHOLE
value: "http://pi-hole-ro-0|http://pi-hole-ro-1"
- name: REPLICA_PIHOLE_PASS
value: "<masked-password>|<masked-password>"
volumes:
- name: script
configMap:
name: sync-script
defaultMode: 0777
restartPolicy: OnFailure
This ensures continuous availability of Pi-hole DNS services, even during maintenance or updates. 🚀
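One detail not shown above: the adlists-sync container mounts the script from a ConfigMap named sync-script. Assuming the script is saved as sync.sh, it can be created with kubectl (or, equivalently, via a configMapGenerator files: entry in the same Kustomization):
# Package the adlist sync script as the ConfigMap mounted at /script
kubectl -n networking create configmap sync-script --from-file=sync.sh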
Conclusion
Implementing a high-availability Pi-hole setup within Kubernetes ensures continuous DNS resolution, even during updates and reconfigurations. By deploying a read-only replica StatefulSet, leveraging a headless service for internal discovery, and utilizing nebula-sync along with a custom adlist synchronization script, we maintain consistency across all instances.
This approach ensures that UI-based configuration changes on the primary Pi-hole are automatically propagated to the replicas, reducing manual intervention and preventing inconsistencies. The automated hourly sync mechanism further reinforces reliability by keeping blocklists and DNS configurations in sync.
However, it's important to note that this setup does not provide instant high availability. Changes made via the UI will only sync once the scheduled CronJob runs, meaning there may be a delay before replicas receive the latest configuration updates. Despite this, the system remains resilient and significantly reduces downtime while maintaining a consistent DNS filtering experience.