Home

PostgreSQL HA Cluster with Patroni — Ubuntu 20.04

PostgreSQL is a great open-source database, but it doesn't ship any HA story out of the box. The community answer is Patroni — a Python cluster manager that handles deploying and operating highly-available PostgreSQL clusters, backed by a distributed config store like etcd, Consul or ZooKeeper. It also takes care of replication, backup, and restore configs. In this post I'm walking through a 4-node setup, but you can scale the PostgreSQL nodes however you like.

What you'll need:

ServerWhat runs on itIP
psql01PostgreSQL, Patroni172.16.16.101
psql02PostgreSQL, Patroni172.16.16.102
etcdetcd172.16.16.103
haproxyHAProxy172.16.16.104

If you need a PostgreSQL install walkthrough first, I covered that in a separate post.

Installing etcd

Note: only run this on the etcd node.

etcd holds the state of the PostgreSQL cluster. Whenever a node's state changes, Patroni writes the new state into etcd's key/value store. etcd is what Patroni uses to elect the leader and keep the cluster healthy.

bash
apt install etcd

Once etcd is installed, drop in the config:

bash
vim /etc/default/etcd
bash
ETCD_LISTEN_PEER_URLS="http://172.16.16.103:2380"ETCD_LISTEN_CLIENT_URLS="http://localhost:2379,http://172.16.16.103:2379"ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.16.16.103:2380"ETCD_INITIAL_CLUSTER="default=http://172.16.16.103:2380,"ETCD_ADVERTISE_CLIENT_URLS="http://172.16.16.103:2379"ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"ETCD_INITIAL_CLUSTER_STATE="new"

Swap the IP for your own etcd host. After saving, restart the service:

bash
systemctl restart etcdsystemctl status etcd

Installing HAProxy

Note: only run this on the HAProxy node.

HAProxy watches the master/slave nodes and routes client connections to the current leader.

bash
apt install haproxy

We'll come back to the HAProxy config at the end, after PostgreSQL and Patroni are running.

Installing Patroni and PostgreSQL

Note: only run this on the PostgreSQL nodes.

Use the PostgreSQL install link above if you need it. Once Postgres is in place, install Patroni:

bash
apt install python3-pip python3-dev libpq-dev -ypip3 install --upgrade pippip install patronipip install python-etcdpip install psycopg2

Infrastructure is in place — now we wire up the cluster.

Configuring Patroni

Note: do this on every node that runs PostgreSQL + Patroni.

Patroni's config is a single YAML that describes the whole cluster. Patroni needs a couple of PostgreSQL users, so set those up on both nodes first — set a password for the default postgres user and create a new replicator user that Patroni will use for replication:

sql
su - postgrespsqlALTER USER postgres PASSWORD 'sifre123';CREATE USER replicator WITH ENCRYPTED PASSWORD 'sifre321';

Patroni also needs to reach a few PostgreSQL binaries, so symlink them. Run this on both Postgres nodes:

bash
ln -s /usr/lib/postgresql/12/bin/* /usr/sbin/

Last bit of prep: create the data dir Patroni will use and lock down the perms.

bash
mkdir -p /data/patronichown -R postgres:postgres /data/chmod -R 700 /data/

Now we can write patroni.yml.

SSH into psql01 first:

bash
vim /etc/patroni.yml
yaml
scope: prodcluster # cluster namenamespace: /db/name: prodpsql01 # node namerestapi:    listen: 172.16.16.101:8008 # node IP    connect_address: 172.16.16.101:8008 # node IPetcd:    host: 172.16.16.103:2379 # etcd server IPbootstrap:    dcs:        ttl: 30        loop_wait: 10        retry_timeout: 10        maximum_lag_on_failover: 1048576        postgresql:            use_pg_rewind: true    initdb:    - encoding: UTF8    - data-checksums    pg_hba:    - host replication replicator 127.0.0.1/32 md5    - host replication replicator 172.16.16.101/0 md5 # psql01 IP    - host replication replicator 172.16.16.102/0 md5 # psql02 IP    - host all all 0.0.0.0/0 md5    users:        admin:            password: admin            options:                - createrole                - createdbpostgresql:    listen: 10.90.214.183:5432    connect_address: 10.90.214.183:5432    data_dir: /data/patroni # Patroni data dir    pgpass: /tmp/pgpass    authentication:        replication:            username: replicator            password: sifre123        superuser:            username: postgres            password: sifre321    parameters:        unix_socket_directories: '.'tags:    nofailover: false    noloadbalance: false    clonefrom: false    nosync: false

Now psql02 — same file, just swap the IP:

bash
vim /etc/patroni.yml
yaml
scope: prodcluster # cluster namenamespace: /db/name: prodpsql01 # node namerestapi:    listen: 172.16.16.102:8008 # node IP    connect_address: 172.16.16.102:8008 # node IPetcd:    host: 172.16.16.103:2379 # etcd server IPbootstrap:    dcs:        ttl: 30        loop_wait: 10        retry_timeout: 10        maximum_lag_on_failover: 1048576        postgresql:            use_pg_rewind: true    initdb:    - encoding: UTF8    - data-checksums    pg_hba:    - host replication replicator 127.0.0.1/32 md5    - host replication replicator 172.16.16.102/0 md5 # psql01 IP    - host replication replicator 172.16.16.101/0 md5 # psql02 IP    - host all all 0.0.0.0/0 md5    users:        admin:            password: admin            options:                - createrole                - createdbpostgresql:    listen: 10.90.214.183:5432    connect_address: 10.90.214.183:5432    data_dir: /data/patroni # Patroni data dir    pgpass: /tmp/pgpass    authentication:        replication:            username: replicator            password: sifre123        superuser:            username: postgres            password: sifre321    parameters:        unix_socket_directories: '.'tags:    nofailover: false    noloadbalance: false    clonefrom: false    nosync: false

With the YAML in place, turn Patroni into a systemd service. Do this on both PostgreSQL nodes:

bash
vim /etc/systemd/system/patroni.service
ini
[Unit]Description=Runners to orchestrate a high-availability PostgreSQLAfter=syslog.target network.target[Service]Type=simpleUser=postgresGroup=postgresExecStart=/usr/local/bin/patroni /etc/patroni.ymlKillMode=processTimeoutSec=30Restart=no[Install]WantedBy=multi-user.target

That's everything — start the cluster:

bash
systemctl daemon-reloadsystemctl enable patroni # so Patroni starts on bootsystemctl enable postgresqlsystemctl start patronisystemctl start postgresql

PostgreSQL cluster's up. Last step: HAProxy config so we have a single endpoint for the database.

Configuring HAProxy

On the HAProxy node, edit the config. I'm using port 5000 for the database endpoint — pick whatever you want.

bash
vim /etc/haproxy/haproxy.cfg
bash
global    maxconn 100defaults    log global    mode tcp    retries 2    timeout client 30m    timeout connect 4s    timeout server 30m    timeout check 5slisten stats    mode http    bind *:7000    stats enable    stats uri /listen postgres    bind *:5000    option httpchk    http-check expect status 200    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions    server PSQL01 172.16.16.101:5432 maxconn 100 check port 8008    server PSQL02 172.16.16.102:5432 maxconn 100 check port 8008

Save it and restart HAProxy:

bash
systemctl restart haproxy

Open http://haproxy_ip:7000 in a browser to see the HAProxy stats page and the live cluster.

Both cluster nodes are listed. PSQL02 showing as down on the stats page is expected — that's how HAProxy reports replicas in this master/slave layout, not an actual failure.

That's it. haproxy_ip:5000 is now your database endpoint. You can also check the cluster from the CLI with patronictl:

bash
patronictl -c /etc/patroni.yml list

Output:

bash
+------------+---------------+---------+---------+----+-----------+|   Member   |      Host     |   Role  |  State  | TL | Lag in MB |+ Cluster: prodcluster (7138377709122875960) ----+----+-----------+| prodpsql01 | 172.16.16.101 | Leader  | running | 20 |           || prodpsql02 | 172.16.16.102 | Replica | running | 20 |         0 |+------------+---------------+---------+---------+----+-----------+

And that's a 4-node PostgreSQL HA cluster, ready to use. 🙂