Setting Up Replication in PostgreSQL

As databases grow and applications demand higher availability, replication becomes a key feature for maintaining performance, reliability, and scalability. PostgreSQL offers powerful built-in replication capabilities that allow you to create standby servers, improve read performance, and prepare for disaster recovery.

This article explains how PostgreSQL replication works, describes the available replication types, and walks step by step through setting up replication in a production-ready environment.

What Is PostgreSQL Replication?

PostgreSQL replication is the process of copying and maintaining data across multiple PostgreSQL servers. One server acts as the primary, while one or more replica (standby) servers receive data changes from the primary.

Replication is commonly used for:

  • High availability
  • Read scaling
  • Backup and disaster recovery
  • Load balancing
  • Maintenance with minimal downtime

Types of Replication in PostgreSQL

PostgreSQL supports several replication methods, each suited for different use cases.

Physical (Streaming) Replication

This is the most common replication method. It works at the storage level: the primary continuously streams Write-Ahead Log (WAL) records to the replicas, which replay them to stay identical to the primary.

Key characteristics:

  • Exact copy of the entire primary cluster (all databases)
  • Fast and reliable
  • Read-only replicas
  • Ideal for high availability and failover

Logical Replication

Logical replication decodes the WAL into row-level changes (inserts, updates, deletes) and replicates them for individual tables rather than for the whole cluster.

Key characteristics:

  • Selective table replication
  • Works between different PostgreSQL major versions
  • Allows write operations on subscribers
  • Useful for data distribution and migrations

Synchronous vs Asynchronous Replication

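  • Asynchronous replication (default): The primary acknowledges a commit without waiting for replicas to confirm WAL receipt, so the most recent transactions can be lost if the primary fails.
  • Synchronous replication: The primary waits for at least one replica to confirm before acknowledging a commit, so a committed transaction survives the loss of the primary, at the cost of added commit latency.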

Prerequisites for Replication Setup

Before setting up replication, ensure:

  • Same major PostgreSQL version (for physical replication)
  • Network connectivity between servers
  • Sufficient disk space on replicas
  • wal_level set to replica or logical on the primary
  • A role with the REPLICATION privilege on the primary

Configuring the Primary Server

Edit the postgresql.conf file on the primary server.

Enable WAL and replication settings (wal_keep_size requires PostgreSQL 13 or later; earlier versions use wal_keep_segments):

wal_level = replica
max_wal_senders = 10
wal_keep_size = 64MB

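Optional (for read replicas; this setting takes effect on the replica, where it defaults to on since PostgreSQL 10):

hot_standby = on

Restart PostgreSQL after changing wal_level or max_wal_senders; a reload is enough for wal_keep_size.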
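As a quick sanity check before restarting, the settings above can be validated with a small script. The following is a minimal sketch, not a full postgresql.conf parser: the function names parse_conf and check_primary are hypothetical, it assumes simple "name = value" lines, and it assumes the PostgreSQL 10+ defaults (wal_level = replica, max_wal_senders = 10) for settings that are absent.

```python
def parse_conf(text):
    """Parse simple `name = value` lines; comments and blank lines are ignored.

    Limitation (assumption of this sketch): a '#' inside a quoted value is
    treated as the start of a comment, and include directives are not followed.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip().lower()] = value.strip().strip("'")
    return settings


def check_primary(settings):
    """Return a list of problems that would prevent streaming replication."""
    problems = []
    # Defaults below are the PostgreSQL 10+ defaults, assumed when a setting is absent.
    if settings.get("wal_level", "replica") not in ("replica", "logical"):
        problems.append("wal_level must be replica or logical")
    if int(settings.get("max_wal_senders", "10")) < 1:
        problems.append("max_wal_senders must be at least 1")
    return problems


if __name__ == "__main__":
    sample = "wal_level = replica\nmax_wal_senders = 10\nwal_keep_size = 64MB\n"
    print(check_primary(parse_conf(sample)))
```

An empty list means the parsed settings allow WAL senders to start; the script does not verify that the server has actually been restarted with them.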

Creating a Replication User

Create a dedicated user for replication:

CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'strongpassword';

Using a dedicated replication user improves security and auditability.

Updating pg_hba.conf

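Allow replica servers to connect to the primary. In the database column, the keyword replication matches physical replication connections rather than a database of that name: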

host replication replicator 192.168.1.0/24 md5

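Reload the PostgreSQL configuration after editing pg_hba.conf. On current releases, scram-sha-256 is a stronger authentication method than md5, provided the role's password is stored in SCRAM format.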
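A common cause of rejected replica connections is an address range that does not actually cover the replica. The sketch below checks a single pg_hba.conf line against a client IP and role. The function name hba_allows is hypothetical, and the check is deliberately narrow: it assumes a CIDR address in the fourth column and ignores the first-match ordering PostgreSQL applies across the whole file.

```python
import ipaddress


def hba_allows(hba_line, client_ip, user):
    """Return True if one pg_hba.conf line admits a replication connection.

    Only `host`/`hostssl` lines with a CIDR address are handled; hostnames,
    the `all`/`samenet` keywords, and separate netmask columns are not.
    """
    fields = hba_line.split()
    if len(fields) < 5 or fields[0] not in ("host", "hostssl"):
        return False
    database, hba_user, address = fields[1], fields[2], fields[3]
    if database != "replication":
        return False
    if hba_user not in (user, "all"):
        return False
    try:
        network = ipaddress.ip_network(address, strict=False)
    except ValueError:
        # Not a CIDR range (for example a hostname); out of scope for this sketch.
        return False
    return ipaddress.ip_address(client_ip) in network


if __name__ == "__main__":
    line = "host replication replicator 192.168.1.0/24 md5"
    print(hba_allows(line, "192.168.1.25", "replicator"))
```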

Preparing the Replica Server

Stop PostgreSQL on the replica server and clear the data directory; pg_basebackup requires the target directory to be empty or absent.

Use pg_basebackup to copy data from the primary:

pg_basebackup -h primary_ip -D /var/lib/postgresql/data \
-U replicator -Fp -Xs -P -R

Explanation:

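  • -Fp: plain format (files are written as a regular data directory)
  • -Xs: stream WAL during backup
  • -P: show progress
  • -R: automatically creates the replication config (on PostgreSQL 12 and later, a standby.signal file plus primary_conninfo in postgresql.auto.conf)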
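When this step is scripted, it helps to build the command as an argument list and to confirm the target directory is usable first. The following is a sketch under assumptions: the helper names basebackup_command and target_is_usable are hypothetical, the flags are exactly those shown above, and the script only prints the command rather than running it.

```python
import shlex
from pathlib import Path


def target_is_usable(data_dir):
    """pg_basebackup needs the target directory to be absent or empty."""
    path = Path(data_dir)
    return not path.exists() or not any(path.iterdir())


def basebackup_command(host, data_dir, user="replicator"):
    """Build the pg_basebackup argument list used in this article."""
    return [
        "pg_basebackup",
        "-h", host,       # primary server address
        "-D", data_dir,   # target data directory on the replica
        "-U", user,       # replication role
        "-Fp",            # plain format
        "-Xs",            # stream WAL during the backup
        "-P",             # show progress
        "-R",             # write replication configuration
    ]


if __name__ == "__main__":
    data_dir = "/var/lib/postgresql/data"
    if target_is_usable(data_dir):
        # Printed for review; pass the list to subprocess.run() to execute it.
        print(shlex.join(basebackup_command("primary_ip", data_dir)))
    else:
        print(f"{data_dir} is not empty; clear it before running pg_basebackup")
```

Passing the list form to subprocess.run avoids shell quoting problems if the data directory path contains spaces.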

Starting the Replica Server

Start PostgreSQL on the replica server:

systemctl start postgresql

The replica will connect to the primary and begin streaming WAL records.

Verify replication status on the primary. Each connected replica appears as one row, and its state column should read streaming:

SELECT * FROM pg_stat_replication;

Monitoring Replication Health

PostgreSQL provides built-in views for monitoring replication.

Useful views:

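  • pg_stat_replication (on the primary): one row per connected replica
  • pg_stat_wal_receiver (on the replica): state of the WAL receiver process
  • pg_replication_slots: replication slots and the WAL position each one retains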

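Monitoring replication lag is critical for production systems: a lagging replica serves stale reads and, when replication slots are in use, forces the primary to retain WAL.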
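Lag in bytes is the distance between two WAL positions (LSNs), which PostgreSQL prints as two hexadecimal numbers separated by a slash. The sketch below converts that text form to an integer and subtracts. The function names are hypothetical; the inputs are assumed to come from values such as pg_current_wal_lsn() on the primary and the replay_lsn column of pg_stat_replication (PostgreSQL 10 and later).

```python
def lsn_to_int(lsn):
    """Convert an LSN such as '0/3000060' to its 64-bit byte position.

    The part before the slash is the high 32 bits, the part after is the low 32 bits.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(primary_lsn, replica_lsn):
    """Bytes of WAL the replica still has to process."""
    return lsn_to_int(primary_lsn) - lsn_to_int(replica_lsn)


if __name__ == "__main__":
    # Example values chosen for illustration, not taken from a real server.
    print(lag_bytes("0/3000160", "0/3000060"))
```

Inside the database, the same subtraction is available through the built-in pg_wal_lsn_diff function; the Python version is useful when LSNs are collected by an external monitoring job.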

Setting Up Synchronous Replication (Optional)

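Enable synchronous replication on the primary by naming the standby. The name must match the application_name the standby sets in its primary_conninfo: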

synchronous_standby_names = 'standby1'

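With the default synchronous_commit = on, a commit is acknowledged only after the standby has flushed the corresponding WAL to disk. If no synchronous standby is available, commits on the primary wait until one reconnects.

Use synchronous replication when avoiding data loss matters more than commit latency.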
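With more than one standby, synchronous_standby_names also accepts the forms FIRST n (...) for priority-based selection and ANY n (...) for quorum-based selection. The following sketch builds such a value. The helper name sync_standby_names is hypothetical, and it assumes every standby name is a plain identifier that needs no double-quoting.

```python
def sync_standby_names(names, num_sync=1, method="FIRST"):
    """Build a value for synchronous_standby_names.

    FIRST picks the first `num_sync` listed standbys that are connected;
    ANY treats the list as a quorum and waits for any `num_sync` of them.
    """
    if method not in ("FIRST", "ANY"):
        raise ValueError("method must be FIRST or ANY")
    if not 1 <= num_sync <= len(names):
        raise ValueError("num_sync must be between 1 and the number of standbys")
    return f"{method} {num_sync} ({', '.join(names)})"


if __name__ == "__main__":
    value = sync_standby_names(["standby1", "standby2"])
    # Printed in postgresql.conf form for copy and paste.
    print(f"synchronous_standby_names = '{value}'")
```

Listing more standbys than num_sync is what keeps commits flowing when one synchronous standby goes down.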

Logical Replication Overview

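Logical replication uses publications and subscriptions. The publisher must run with wal_level = logical, and the subscriber must already have a table with a matching definition, because schema changes are not replicated.

Example, on the publisher: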

CREATE PUBLICATION my_pub FOR TABLE orders;

On the subscriber:

CREATE SUBSCRIPTION my_sub
CONNECTION 'host=primary_ip dbname=mydb user=replicator password=secret'
PUBLICATION my_pub;

Logical replication is ideal for partial data replication and major-version upgrades.
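When the subscription statement is generated by a deployment script, the connection string needs two layers of quoting: libpq's own rules for values containing spaces, quotes, or backslashes, and SQL string-literal quoting around the whole string. The sketch below shows one way to do this. The helper names are hypothetical, the subscription and publication names are assumed to be plain identifiers, and standard_conforming_strings is assumed to be on (the default).

```python
def conninfo(**params):
    """Build a libpq keyword/value connection string.

    Values that are empty or contain spaces, single quotes, or backslashes are
    wrapped in single quotes, with quotes and backslashes backslash-escaped.
    """
    parts = []
    for key, value in params.items():
        text = str(value)
        if text == "" or any(ch in text for ch in " '\\"):
            text = "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"
        parts.append(f"{key}={text}")
    return " ".join(parts)


def sql_literal(text):
    """Quote a string as an SQL literal by doubling single quotes."""
    return "'" + text.replace("'", "''") + "'"


def create_subscription_sql(sub_name, pub_name, **conn_params):
    """Render a CREATE SUBSCRIPTION statement (names assumed to be plain identifiers)."""
    return (
        f"CREATE SUBSCRIPTION {sub_name} "
        f"CONNECTION {sql_literal(conninfo(**conn_params))} "
        f"PUBLICATION {pub_name};"
    )


if __name__ == "__main__":
    print(create_subscription_sql(
        "my_sub", "my_pub",
        host="primary_ip", dbname="mydb", user="replicator", password="secret",
    ))
```

In practice, keeping the password out of the statement (for example in a password file on the subscriber) avoids storing it in the subscription's connection string.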

Failover and High Availability

Replication alone does not provide automatic failover: a standby has to be promoted explicitly, for example with pg_ctl promote. For production environments, consider tools such as:

  • Patroni
  • pg_auto_failover
  • repmgr

These tools automate failover, leader election, and replica promotion.

Common Replication Issues and Solutions

Common problems include:

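  • WAL buildup on the primary when a replica that uses a replication slot lags or disconnects
  • Network latency that increases lag (or commit time, with synchronous replication)
  • Insufficient disk space on the primary or the replica
  • Incorrect permissions, such as a missing pg_hba.conf entry or a role without REPLICATION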

Regular monitoring and alerting can prevent most replication failures.
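As one example of such an alert, the rows of pg_replication_slots can be checked for slots that are inactive or that hold back too much WAL. This is a sketch under assumptions: the function names and the threshold are hypothetical, and the rows are assumed to be dictionaries carrying the view's slot_name, active, and restart_lsn columns, fetched by whatever database client is in use.

```python
def lsn_to_int(lsn):
    """Convert an LSN such as '0/5000100' to its 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def slots_at_risk(slots, current_lsn, max_retained_bytes):
    """Return (slot_name, retained_bytes) for slots that deserve an alert.

    A slot is flagged when it is inactive or when the WAL between its
    restart_lsn and the primary's current LSN exceeds the threshold.
    """
    current = lsn_to_int(current_lsn)
    flagged = []
    for slot in slots:
        restart = slot.get("restart_lsn")
        # restart_lsn can be NULL for a slot that has never been used.
        retained = current - lsn_to_int(restart) if restart else 0
        if not slot["active"] or retained > max_retained_bytes:
            flagged.append((slot["slot_name"], retained))
    return flagged


if __name__ == "__main__":
    # Illustrative rows, not output from a real server.
    rows = [
        {"slot_name": "replica1", "active": True, "restart_lsn": "0/5000000"},
        {"slot_name": "old_replica", "active": False, "restart_lsn": "0/1000000"},
    ]
    print(slots_at_risk(rows, "0/5000100", max_retained_bytes=1 << 20))
```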

Best Practices for PostgreSQL Replication

  1. Use SSL for replication connections
  2. Monitor replication lag continuously
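  3. Use replication slots carefully: drop unused slots, and on PostgreSQL 13 or later consider max_slot_wal_keep_size to cap retained WAL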
  4. Keep replicas on separate hardware
  5. Test failover procedures regularly
  6. Document recovery steps

Conclusion

Setting up replication in PostgreSQL is a foundational step toward building highly available and scalable database systems. Whether you choose physical replication for high availability or logical replication for flexibility, PostgreSQL provides robust tools to meet modern infrastructure needs.

With proper configuration, monitoring, and operational discipline, PostgreSQL replication can significantly improve reliability and performance in production environments.
