Set Up a Three-Node PostgreSQL Cluster with Patroni and HAProxy on Ubuntu 24.04

PostgreSQL is a feature-rich, reliable, and scalable open-source database management system. For mission-critical environments, high availability is essential, and it is achieved here through replication and load balancing in a clustered setup. A Patroni-based cluster should run an odd number of nodes so that its coordination layer (etcd in this guide) can maintain quorum: a majority of nodes must agree before a leader can be elected or configuration changes can take effect. With three nodes, even if one fails, the remaining two still form a majority, so the cluster keeps operating safely.

This guide explains how to deploy a three-node PostgreSQL cluster on Ubuntu 24.04 using Patroni for automatic replication and failover, and HAProxy for distributing client connections.

Prerequisites

Before you start, make sure you meet the following requirements:

  • Three Ubuntu 24.04 servers, each with a minimum of 2 CPU cores and 4 GB of RAM, configured with a non-root sudo user.
  • PostgreSQL (version 18, as used throughout this guide) installed on all three servers.
  • A registered domain name with A records configured for three subdomains pointing to each server’s IP address:
    • node1.example.com
    • node2.example.com
    • node3.example.com

Note: Replace all example domains used in this guide with your actual domain values. For instance, if your domain is mydb.com, replace node1.example.com with node1.mydb.com.
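
Before continuing, you can verify that each record resolves to the correct server IP address, for example with dig (from the dnsutils package):

$ dig +short node1.example.com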

Install Dependencies

Begin by installing the required packages, opening necessary firewall ports, and configuring SSL certificates to enable secure communication between cluster nodes.

Install Packages and Configure Firewall

Update the system package repository index on each node:

$ sudo apt update

Install HAProxy, Certbot, and the necessary Python packages on every node:

$ sudo apt install haproxy certbot python3-pip -y

Use pip3 to install Patroni and its dependencies:

$ sudo pip3 install --break-system-packages 'patroni[etcd3]' psycopg2-binary psycopg
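
Optionally, confirm that the Patroni command-line tools are now available on the PATH:

$ patroni --version
$ patronictl version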

Download and install the etcd binaries on all nodes:

$ wget https://github.com/etcd-io/etcd/releases/download/v3.6.4/etcd-v3.6.4-linux-amd64.tar.gz
$ tar -xvf etcd-v3.6.4-linux-amd64.tar.gz
$ sudo mv etcd-v3.6.4-linux-amd64/etcd etcd-v3.6.4-linux-amd64/etcdctl /usr/local/bin/
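
You can verify that the binaries are installed and report the expected version:

$ etcd --version
$ etcdctl version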

Open the required firewall ports on every node:

$ sudo ufw allow 80,2379,2380,5432,5433,8008,8009/tcp

The open ports serve the following purposes:

  • 80: Certbot SSL certificate verification
  • 2379, 2380: etcd client and peer communication
  • 5432, 5433: PostgreSQL and Patroni-managed PostgreSQL
  • 8008, 8009: Patroni REST API
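
If UFW is not yet active on your servers, allow SSH before enabling it so you are not locked out of the machines, for example:

$ sudo ufw allow OpenSSH
$ sudo ufw enable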

Reload the firewall configuration:

$ sudo ufw reload

Check and verify that the firewall rules are correctly applied:

$ sudo ufw status

Configure SSL Certificates

To secure communication between nodes, use Let’s Encrypt SSL certificates.

Request SSL certificates for each node, starting with Node1:

$ sudo certbot certonly --standalone -d node1.example.com -m admin@example.com --agree-tos --no-eff-email

Repeat the same process for Node2 and Node3, changing the subdomain for each one.

Now, create a script on every node to prepare the SSL certificates. Update the HOSTNAME variable according to each node’s domain:

$ sudo nano /usr/local/bin/prepare-ssl-certs.sh

Add this content, making sure to modify HOSTNAME for each node:

#!/bin/bash
# SSL Certificate Setup Script
HOSTNAME="node1.example.com"  # Update for each node
CERT_DIR="/etc/letsencrypt/live/$HOSTNAME"
ARCHIVE_DIR="/etc/letsencrypt/archive/$HOSTNAME"

# Create ssl-users group
getent group ssl-users >/dev/null || sudo groupadd ssl-users

# Create users and add to ssl-users group
for user in etcd patroni haproxy postgres; do
    if ! id "$user" >/dev/null 2>&1 && [[ "$user" == "etcd" || "$user" == "patroni" ]]; then
        getent group "$user" >/dev/null && sudo useradd -r -m -s /bin/bash -g "$user" "$user" || sudo useradd -r -m -s /bin/bash "$user"
    fi
    groups "$user" 2>/dev/null | grep -q ssl-users || sudo usermod -aG ssl-users "$user"
done

# Create combined certificate and set permissions
cat "$CERT_DIR/fullchain.pem" "$CERT_DIR/privkey.pem" > "$CERT_DIR/combined.pem"
sudo chmod 755 /etc/letsencrypt /etc/letsencrypt/live /etc/letsencrypt/archive
sudo chgrp ssl-users "$CERT_DIR" "$ARCHIVE_DIR"
sudo chmod 755 "$CERT_DIR" "$ARCHIVE_DIR"
sudo chown root:ssl-users "$ARCHIVE_DIR"/*.pem "$CERT_DIR/combined.pem"
sudo chmod 644 "$ARCHIVE_DIR"/cert*.pem "$ARCHIVE_DIR"/chain*.pem "$ARCHIVE_DIR"/fullchain*.pem
sudo chmod 640 "$ARCHIVE_DIR"/privkey*.pem "$CERT_DIR/combined.pem"

# Add your non-root sudo user to the ssl-users group (replace "linuxuser" with your username)
id "linuxuser" >/dev/null 2>&1 && ! groups linuxuser | grep -q ssl-users && sudo usermod -aG ssl-users linuxuser

echo "SSL certificate setup completed for $HOSTNAME"

Make the script executable:

$ sudo chmod +x /usr/local/bin/prepare-ssl-certs.sh

Execute the script to apply the changes:

$ sudo /usr/local/bin/prepare-ssl-certs.sh
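
You can spot-check the result; the combined certificate should exist and be group-readable by ssl-users:

$ sudo ls -l /etc/letsencrypt/live/node1.example.com/combined.pem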

Create SSL Renewal Hook Script

Generate an SSL renewal hook script on each node:

$ sudo nano /usr/local/bin/prepare-renewed-ssl-certs.sh

Add this code, updating HOSTNAME accordingly:

#!/bin/bash
# SSL Certificate Renewal Hook Script
HOSTNAME="node1.example.com"  # Update for each node
CERT_DIR="/etc/letsencrypt/live/$HOSTNAME"
ARCHIVE_DIR="/etc/letsencrypt/archive/$HOSTNAME"

# Recreate combined certificate for HAProxy
cat "$CERT_DIR/fullchain.pem" "$CERT_DIR/privkey.pem" > "$CERT_DIR/combined.pem" 2>/dev/null

# Reset permissions after renewal
chmod 755 /etc/letsencrypt /etc/letsencrypt/live /etc/letsencrypt/archive 2>/dev/null
chgrp ssl-users "$CERT_DIR" "$ARCHIVE_DIR" 2>/dev/null
chmod 755 "$CERT_DIR" "$ARCHIVE_DIR" 2>/dev/null
chown root:ssl-users "$ARCHIVE_DIR"/*.pem "$CERT_DIR/combined.pem" 2>/dev/null
chmod 644 "$ARCHIVE_DIR"/cert*.pem "$ARCHIVE_DIR"/chain*.pem "$ARCHIVE_DIR"/fullchain*.pem 2>/dev/null
chmod 640 "$ARCHIVE_DIR"/privkey*.pem "$CERT_DIR/combined.pem" 2>/dev/null

# Reload services after certificate renewal
systemctl is-active --quiet haproxy && systemctl reload haproxy 2>/dev/null
systemctl is-active --quiet postgresql && systemctl reload postgresql 2>/dev/null
systemctl is-active --quiet patroni && systemctl restart patroni 2>/dev/null
systemctl is-active --quiet etcd && { sleep $((RANDOM % 30)); systemctl restart etcd 2>/dev/null; }

Make the renewal script executable:

$ sudo chmod +x /usr/local/bin/prepare-renewed-ssl-certs.sh

Configure automatic certificate renewal hooks (on Node1 as an example):

$ echo 'renew_hook = /usr/local/bin/prepare-renewed-ssl-certs.sh' | sudo tee -a /etc/letsencrypt/renewal/node1.example.com.conf

Repeat this setup for Node2 and Node3, modifying the subdomain each time.
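
To confirm that the hook is picked up without issuing new certificates, you can run a renewal dry run on each node:

$ sudo certbot renew --dry-run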

Configure etcd

Patroni relies on etcd as a distributed key-value store to handle cluster coordination. Set up etcd with HTTPS on all nodes for secure communication.

Create a systemd service unit for etcd:

$ sudo nano /etc/systemd/system/etcd.service

Insert the following configuration for Node1, substituting your domain name for example.com:

[Unit]
Description=etcd key-value store
Documentation=https://github.com/etcd-io/etcd
After=network.target

[Service]
User=etcd
Type=notify
ExecStart=/usr/local/bin/etcd \
  --name node1 \
  --data-dir /var/lib/etcd \
  --initial-cluster node1=https://node1.example.com:2380,node2=https://node2.example.com:2380,node3=https://node3.example.com:2380 \
  --initial-cluster-state new \
  --listen-peer-urls https://0.0.0.0:2380 \
  --initial-advertise-peer-urls https://node1.example.com:2380 \
  --listen-client-urls https://0.0.0.0:2379 \
  --advertise-client-urls https://node1.example.com:2379 \
  --cert-file /etc/letsencrypt/live/node1.example.com/fullchain.pem \
  --key-file /etc/letsencrypt/live/node1.example.com/privkey.pem \
  --peer-cert-file /etc/letsencrypt/live/node1.example.com/fullchain.pem \
  --peer-key-file /etc/letsencrypt/live/node1.example.com/privkey.pem \
  --client-cert-auth=false \
  --peer-client-cert-auth=false \
  --auto-tls=false \
  --peer-auto-tls=false
Restart=always
RestartSec=5
LimitNOFILE=40000

[Install]
WantedBy=multi-user.target

For Node2 and Node3, update the --name parameter, adjust node-specific URLs, and modify SSL certificate paths accordingly.

Create the necessary data and configuration directories:

$ sudo mkdir -p /var/lib/etcd /etc/etcd

Set ownership and restrict permissions for the etcd data directory:

$ sudo chown -R etcd:etcd /var/lib/etcd /etc/etcd
$ sudo chmod 700 /var/lib/etcd

Reload systemd to apply the new configuration:

$ sudo systemctl daemon-reload

Enable the etcd service to start on boot:

$ sudo systemctl enable etcd

Start the service:

$ sudo systemctl start etcd

Check the etcd service status:

$ sudo systemctl status etcd

Verify the cluster membership:

$ etcdctl --endpoints=https://node1.example.com:2379 member list

Example Output:

693741ed87a9350d, started, node1, https://node1.example.com:2380, https://node1.example.com:2379, false
d9554ae165b5de0b, started, node3, https://node3.example.com:2380, https://node3.example.com:2379, false
fecdfaa2d1b8f35c, started, node2, https://node2.example.com:2380, https://node2.example.com:2379, false

Lastly, verify the cluster’s health:

$ etcdctl --endpoints=https://node1.example.com:2379,https://node2.example.com:2379,https://node3.example.com:2379 endpoint health
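
As an optional sanity check, write and read back a test key (the key name here is arbitrary) to confirm the cluster accepts writes:

$ etcdctl --endpoints=https://node1.example.com:2379 put test-key test-value
$ etcdctl --endpoints=https://node1.example.com:2379 get test-key
$ etcdctl --endpoints=https://node1.example.com:2379 del test-key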

Configure PostgreSQL

Adjust PostgreSQL to use SSL certificates and tune performance parameters for optimal operation.

Edit the PostgreSQL Configuration

Open the PostgreSQL configuration file on each node:

$ sudo nano /etc/postgresql/18/main/postgresql.conf

Modify the following settings for Node1:

# SSL Configuration
ssl_cert_file = '/etc/letsencrypt/live/node1.example.com/fullchain.pem'
ssl_key_file = '/etc/letsencrypt/live/node1.example.com/privkey.pem'

# Memory Settings
shared_buffers = 512MB
work_mem = 8MB
effective_cache_size = 1536MB

Make sure to update certificate paths for Node2 and Node3 appropriately.

Force SSL Connections

To enforce SSL, edit the pg_hba.conf file:

$ sudo nano /etc/postgresql/18/main/pg_hba.conf

Add this rule at the end:

hostssl all             all             0.0.0.0/0               scram-sha-256

Restart PostgreSQL to apply the new configuration:

$ sudo systemctl restart postgresql

Check that SSL is active:

$ sudo -u postgres psql -c "SHOW ssl;"

Expected Output:

 ssl 
-----
 on
(1 row)

Configure Patroni

Patroni is responsible for managing PostgreSQL replication and failover automatically. You’ll configure Patroni on each node with HTTPS support enabled.

Create the Patroni Configuration File

Create and open the configuration file:

$ sudo nano /etc/patroni.yaml

Insert the following content for Node1:

scope: postgres
name: node1

restapi:
  listen: 0.0.0.0:8008
  connect_address: node1.example.com:8008
  certfile: /etc/letsencrypt/live/node1.example.com/fullchain.pem
  keyfile: /etc/letsencrypt/live/node1.example.com/privkey.pem

etcd3:
  hosts:
    - node1.example.com:2379
    - node2.example.com:2379
    - node3.example.com:2379
  protocol: https
  cacert: /etc/ssl/certs/ca-certificates.crt

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 5
    retry_timeout: 5
    maximum_lag_on_failover: 1048576
  initdb:
    - encoding: UTF8
    - data-checksums
  pg_hba:
    - host replication replicator 0.0.0.0/0 scram-sha-256
    - host all all 0.0.0.0/0 scram-sha-256

postgresql:
  listen: 0.0.0.0:5433
  connect_address: node1.example.com:5433
  data_dir: /var/lib/postgresql/18/main
  bin_dir: /usr/lib/postgresql/18/bin
  parameters:
    ssl: 'on'
    ssl_cert_file: '/etc/letsencrypt/live/node1.example.com/fullchain.pem'
    ssl_key_file: '/etc/letsencrypt/live/node1.example.com/privkey.pem'
  authentication:
    replication:
      username: replicator
      password: StrongPassword123!
    superuser:
      username: postgres
      password: StrongPassword123!

For Node2 and Node3, update the name, connect_address, and SSL certificate paths. Use the same passwords across all nodes.
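
The passwords above are placeholders; you may prefer to generate your own random values, for example with openssl:

$ openssl rand -base64 24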

Prepare the PostgreSQL Runtime Directory

Create the runtime directory for PostgreSQL:

$ sudo mkdir -p /var/run/postgresql

Adjust permissions and ownership for the patroni user:

$ sudo chown patroni:patroni /var/run/postgresql
$ sudo chmod 755 /var/run/postgresql

Ensure the directory is recreated at reboot:

$ echo 'd /var/run/postgresql 0755 patroni patroni -' | sudo tee /etc/tmpfiles.d/postgresql.conf

Prepare PostgreSQL Data Directory for Patroni

Backup any existing PostgreSQL data directory:

$ sudo mv /var/lib/postgresql/18 /var/lib/postgresql/18.bak-$(date +%s) || true

Create a fresh PostgreSQL data directory for Patroni:

$ sudo mkdir -p /var/lib/postgresql/18/main

Grant ownership to the patroni user:

$ sudo chown -R patroni:patroni /var/lib/postgresql/18

Secure directory permissions:

$ sudo chmod -R 700 /var/lib/postgresql/18

Ensure the PostgreSQL binaries are executable by all users, so the patroni user can invoke initdb, pg_ctl, and the other utilities in this directory:

$ sudo chmod o+rx /usr/lib/postgresql/18/bin/*

Stop PostgreSQL so Patroni can take control:

$ sudo systemctl stop postgresql
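
You may also want to disable the stock postgresql service so it does not start again at boot and compete with the Patroni-managed instance:

$ sudo systemctl disable postgresql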

Create Patroni Systemd Service

Create a new systemd service file for Patroni:

$ sudo nano /etc/systemd/system/patroni.service

Insert the following configuration:

[Unit]
Description=Patroni PostgreSQL cluster
After=network.target

[Service]
Type=simple
User=patroni
Group=patroni
ExecStart=/usr/local/bin/patroni /etc/patroni.yaml
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
TimeoutSec=30
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

Reload systemd to register the new service:

$ sudo systemctl daemon-reload

Enable Patroni to start automatically on boot:

$ sudo systemctl enable patroni

Start Patroni and confirm it’s running correctly:

$ sudo systemctl start patroni
$ sudo systemctl status patroni

Check the cluster’s operational state:

$ patronictl -c /etc/patroni.yaml list

Expected Output:

+ Cluster: postgres (7561190282296399779) --+-----------+----+-------------+-----+------------+-----+
| Member | Host                   | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+--------+------------------------+---------+-----------+----+-------------+-----+------------+-----+
| node1  | node1.example.com:5433 | Leader  | running   |  1 |             |     |            |     |
| node2  | node2.example.com:5433 | Replica | streaming |  1 |   0/4000000 |   0 |  0/4000000 |   0 |
| node3  | node3.example.com:5433 | Replica | streaming |  1 |   0/4000000 |   0 |  0/4000000 |   0 |
+--------+------------------------+---------+-----------+----+-------------+-----+------------+-----+

Configure HAProxy

Set up HAProxy on every node to evenly distribute PostgreSQL client connections.

Remove the Default Configuration

Delete the default HAProxy configuration file:

$ sudo rm /etc/haproxy/haproxy.cfg

Create a New HAProxy Configuration

Open a new configuration file:

$ sudo nano /etc/haproxy/haproxy.cfg

Insert the following configuration, which will be the same for all nodes:

global
    daemon
    maxconn 4096
    log stdout local0
    stats socket /var/lib/haproxy/stats mode 660 level admin

defaults
    mode tcp
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
    log global

frontend postgres_ssl
    bind *:5432
    default_backend postgres_backend

backend postgres_backend
    balance roundrobin
    server node1 node1.example.com:5433 check inter 5000ms rise 2 fall 3
    server node2 node2.example.com:5433 check inter 5000ms rise 2 fall 3
    server node3 node3.example.com:5433 check inter 5000ms rise 2 fall 3

listen stats
    bind *:8404
    mode http
    stats enable
    stats uri /stats
    stats refresh 30s
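
Before restarting the service, you can ask HAProxy to validate the configuration file:

$ sudo haproxy -c -f /etc/haproxy/haproxy.cfg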

Restart HAProxy

Restart the HAProxy service to load the new configuration:

$ sudo systemctl restart haproxy

Check the HAProxy service status to confirm it’s running properly:

$ sudo systemctl status haproxy
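
If you want to reach the statistics page from outside the servers, also open port 8404 in the firewall; the page is then available at http://node1.example.com:8404/stats (adjust the subdomain per node):

$ sudo ufw allow 8404/tcp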

Test the Cluster

Now test high availability and automatic failover to verify that the cluster operates correctly.

Connect to the PostgreSQL cluster from a client machine, replacing the domain and password values as needed. The target_session_attrs=read-write option makes the client try the listed hosts until it reaches the node that is currently the primary:

$ psql 'postgresql://postgres:StrongPassword123!@node1.example.com:5433,node2.example.com:5433,node3.example.com:5433/postgres?sslmode=require&target_session_attrs=read-write'

Check which node currently acts as the primary:

postgres=# SELECT inet_server_addr() as ip, 
       CASE WHEN pg_is_in_recovery() THEN 'Replica' ELSE 'Primary' END as role;
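
Optionally, confirm that writes reach the primary by creating a throwaway test table (the ha_test name is arbitrary):

postgres=# CREATE TABLE ha_test (id serial PRIMARY KEY, note text);
postgres=# INSERT INTO ha_test (note) VALUES ('hello from the cluster');
postgres=# SELECT * FROM ha_test;
postgres=# DROP TABLE ha_test;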

Exit the PostgreSQL console:

postgres=# \q

Simulate a Failover

Test the failover process by stopping Patroni on the current leader (for example, Node1):

$ sudo systemctl stop patroni

Check the new cluster state from another node:

$ patronictl -c /etc/patroni.yaml list

The output confirms that a new leader is elected automatically.

Verify that the cluster remains accessible via the same connection string:

$ psql 'postgresql://postgres:StrongPassword123!@node1.example.com:5433,node2.example.com:5433,node3.example.com:5433/postgres?sslmode=require&target_session_attrs=read-write'

Restart the stopped node to rejoin it to the cluster:

$ sudo systemctl start patroni

Conclusion

You have successfully deployed a fault-tolerant PostgreSQL cluster on Ubuntu 24.04, managed by Patroni for replication and automatic failover, and balanced by HAProxy for connection distribution. The setup ensures continuous service availability, even when a node fails, and all traffic is protected by SSL/TLS encryption. For production monitoring and performance tracking, consider integrating Prometheus and Grafana.

Source: vultr.com
