Databases

Database Replication Explained: Primary-Replica, Sync vs Async

A complete guide to PostgreSQL streaming replication. Covers WAL shipping, synchronous vs asynchronous modes, replica lag monitoring, failover with Patroni, and physical vs logical replication trade-offs.

A
Abhishek Patel10 min read

Infrastructure engineer with 10+ years building production systems on AWS, GCP,…

Database Replication Explained: Primary-Replica, Sync vs Async
Database Replication Explained: Primary-Replica, Sync vs Async

Replication Is How You Stop a Single Server From Being a Single Point of Failure

Database replication copies data from one PostgreSQL server to one or more others in real time. It's the foundation of high availability, disaster recovery, and read scaling. Without replication, your database is one disk failure away from an outage -- and one traffic spike away from saturating your primary server's read capacity.

I've set up replication at companies ranging from 10-person startups to Fortune 500 banks. The concepts are straightforward, but the operational details are where teams get burned. This guide covers PostgreSQL streaming replication end to end: how WAL shipping works, the real trade-offs between synchronous and asynchronous modes, and the failover scenarios that catch people off guard.

What Is Database Replication?

Definition: Database replication is the continuous process of copying data changes from a primary (leader) server to one or more replica (follower) servers, so that each replica maintains an up-to-date copy of the primary's data. Replicas can serve read queries, act as failover targets, or both.

PostgreSQL implements replication through its Write-Ahead Log (WAL). Every change to the database -- inserts, updates, deletes, schema changes -- is first written to the WAL before being applied to the actual data files. Replication works by streaming these WAL records to replicas, which replay them in order.

Physical vs Logical Replication

PostgreSQL offers two fundamentally different replication mechanisms. Understanding the distinction saves you from picking the wrong one.

FeaturePhysical (Streaming)Logical
What's replicatedByte-level WAL changesRow-level change events
Replica is writable?No (read-only)Yes
Selective replicationEntire cluster onlyPer-table, per-database
Cross-version supportSame major version onlyDifferent major versions OK
DDL replicationYes (automatic)No (manual)
Performance overheadLowerHigher (decoding step)
Use caseHA failover, read replicasSelective sync, migrations, CDC

Physical replication is what you want for high availability. It's simpler, faster, and the replica is an exact binary copy of the primary. Logical replication is for specific scenarios: replicating a subset of tables, migrating between PostgreSQL major versions with zero downtime, or feeding changes into external systems (change data capture).

How to Set Up PostgreSQL Streaming Replication

Here's the step-by-step process for configuring physical streaming replication between a primary and a replica.

  1. Configure the primary server -- set wal_level = replica, max_wal_senders = 10, and wal_keep_size = 1GB in postgresql.conf
  2. Create a replication user -- CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secure_password';
  3. Update pg_hba.conf -- allow the replica's IP to connect with the replication role
  4. Take a base backup -- use pg_basebackup -h primary_host -U replicator -D /var/lib/postgresql/data -Fp -Xs -P
  5. Configure the replica -- create standby.signal and set primary_conninfo in postgresql.conf
  6. Start the replica -- it connects to the primary and begins streaming WAL
  7. Verify replication -- check pg_stat_replication on the primary and pg_stat_wal_receiver on the replica
-- On the primary: check replication status
SELECT client_addr, state, sent_lsn, write_lsn, flush_lsn, replay_lsn,
       (sent_lsn - replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;

-- On the replica: confirm it's receiving WAL
SELECT status, received_lsn, latest_end_lsn,
       latest_end_time
FROM pg_stat_wal_receiver;
# postgresql.conf on the primary
wal_level = replica
max_wal_senders = 10
wal_keep_size = 1GB
hot_standby = on

# postgresql.conf on the replica
primary_conninfo = 'host=primary_host port=5432 user=replicator password=secure_password'
hot_standby = on

Synchronous vs Asynchronous Replication

This is where the real engineering trade-off lives. Async replication is the default. Sync replication is available but comes with a significant cost.

Asynchronous Replication (Default)

The primary writes WAL to disk, sends it to replicas, and commits the transaction immediately -- without waiting for replicas to acknowledge. This means the primary is never slowed down by replica performance, but there's a window where the replica is behind. If the primary crashes during that window, those transactions are lost on the replica.

Synchronous Replication

The primary waits for at least one replica to confirm it has received (and optionally flushed to disk) the WAL before committing. This guarantees zero data loss on failover, but every write is now limited by network latency to the replica.

-- Enable synchronous replication on the primary
-- postgresql.conf
synchronous_standby_names = 'FIRST 1 (replica1, replica2)'

-- Control the durability guarantee per-transaction
SET synchronous_commit = 'remote_apply';  -- Strongest: wait for replay
SET synchronous_commit = 'remote_write';  -- Wait for OS write (not fsync)
SET synchronous_commit = 'on';           -- Wait for local fsync only (async)
SET synchronous_commit = 'off';          -- Don't wait for any fsync
synchronous_commitData Loss RiskWrite Latency ImpactUse Case
remote_applyZeroHighest (replica replay time)Financial transactions
remote_writeNear-zero (OS crash only)Moderate (network round-trip)Most HA setups
onReplica may lagNone (async to replica)Default, most workloads
offLocal crash loses dataLowestBulk imports, non-critical writes

Pro tip: You can set synchronous_commit per transaction. Use remote_apply for payment processing and off for logging or analytics writes. This gives you strong guarantees where they matter without penalizing every write on the system.

Replica Lag: The Silent Problem

Replica lag is the delay between a change being committed on the primary and that change being visible on the replica. With async replication, some lag is expected. The question is how much and how to detect it.

-- Monitor lag on the replica itself
SELECT now() - pg_last_xact_replay_timestamp() AS replay_lag;

-- Monitor from the primary (more accurate)
SELECT client_addr,
       pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag_bytes,
       replay_lag
FROM pg_stat_replication;

-- Set up alerting: lag over 30 seconds is a problem
-- lag over 5 minutes means your replica might be useless for reads

Common causes of replica lag: the replica has slower disks than the primary, long-running queries on the replica block WAL replay (check max_standby_streaming_delay), or a burst of writes on the primary overwhelms the replica's replay capacity. I've seen teams route read traffic to replicas without monitoring lag, then spend days debugging why users see stale data.

Watch out: If your application reads from a replica immediately after writing to the primary, you'll get stale reads. Either read from the primary after writes, use synchronous replication, or add application-level logic to handle eventual consistency.

Failover: The Hard Part

Setting up replication is straightforward. Handling failover correctly is where teams stumble. When the primary goes down, you need to:

  1. Detect the failure -- is the primary actually down, or is there a network partition?
  2. Promote a replica -- run pg_promote() or pg_ctl promote on the chosen replica
  3. Redirect application traffic -- update DNS, load balancer, or connection string
  4. Handle the old primary -- when it comes back, it can't rejoin as primary (split-brain risk)
  5. Rebuild replication -- use pg_rewind to re-sync the old primary as a new replica
# Promote a replica to primary
pg_ctl promote -D /var/lib/postgresql/data

# Or from SQL (PostgreSQL 12+)
SELECT pg_promote();

# Rewind the old primary to follow the new primary
pg_rewind --target-pgdata=/var/lib/postgresql/data           --source-server="host=new_primary port=5432"

Automated failover tools like Patroni, repmgr, and pg_auto_failover handle steps 1 through 4. Patroni with etcd is the most battle-tested option and the one I'd recommend for production. It handles leader election, fencing, and replica promotion with minimal manual intervention.

Managed PostgreSQL Replication: Provider Comparison

If you're running managed PostgreSQL, replication setup and failover are handled for you -- but the details differ significantly.

ProviderFailover TimeRead ReplicasCross-RegionStarting Price/mo
AWS RDS60-120 secondsUp to 15Yes~$190 (Multi-AZ)
Google Cloud SQL~60 secondsUp to 10Yes~$175 (HA)
Azure Database~60 secondsUp to 5Yes~$200 (HA)
SupabaseAutomaticUp to 2No$25 (Pro)
NeonInstant (compute)Via branchingLimited$19 (Launch)

Managed services handle the operational complexity, but you still need to understand replication to make informed decisions about consistency, read routing, and failover behavior. Don't assume "managed" means "no replication knowledge needed."

When to Use Each Replication Strategy

Here's my decision framework after setting up replication at dozens of companies:

  1. Start with a single primary and async streaming replicas -- this covers 80% of use cases
  2. Add synchronous replication only for zero-data-loss requirements -- financial systems, compliance-driven workloads
  3. Use logical replication for selective sync -- when you need specific tables replicated to a different system or PostgreSQL version
  4. Deploy Patroni for automated failover -- manual failover is a liability at 3 AM
  5. Monitor replica lag from day one -- set alerts at 10 seconds and 60 seconds

Frequently Asked Questions

What is the difference between physical and logical replication in PostgreSQL?

Physical replication ships raw WAL bytes and creates an exact binary copy of the entire cluster. The replica is read-only. Logical replication decodes WAL into row-level changes and can replicate specific tables to a writable target. Use physical for HA failover and read scaling. Use logical for selective sync, cross-version migrations, or change data capture.

How much replica lag is acceptable?

It depends on your application's tolerance for stale reads. For most web applications, under 1 second of lag is fine. For financial dashboards or inventory systems, even 100 milliseconds matters. Monitor with pg_stat_replication and alert at thresholds that match your business requirements. If zero lag is required, use synchronous replication.

Can a PostgreSQL replica handle write queries?

Physical replicas are strictly read-only. Any write attempt returns an error. Logical replication targets can accept writes, but you must handle conflicts manually -- if the same row is modified on both sides, PostgreSQL won't resolve it automatically. Bidirectional logical replication is possible but complex and error-prone.

What happens when the primary server crashes?

With async replication, transactions that were committed on the primary but not yet shipped to replicas are lost. With synchronous replication (remote_apply), no committed transactions are lost. In both cases, you need to promote a replica to primary and redirect traffic. Tools like Patroni automate this process.

How does Patroni handle PostgreSQL failover?

Patroni uses a distributed consensus store (etcd, ZooKeeper, or Consul) to elect a leader. When the primary becomes unreachable, Patroni nodes hold an election, the winning replica promotes itself, and other replicas reconfigure to follow the new primary. The entire process typically completes in 10-30 seconds with properly tuned timeouts.

Should I use replication or backups for disaster recovery?

Both. Replication provides fast failover (seconds to minutes) but protects against hardware failure, not human error. If someone runs DROP TABLE, that statement replicates immediately. Backups (pg_dump or continuous WAL archiving with pg_basebackup) let you restore to a specific point in time. Use replication for availability and backups for recovery.

What is WAL shipping and how does it relate to streaming replication?

WAL shipping is the older method where completed WAL files (16 MB each by default) are copied to a replica via archive_command. Streaming replication is the modern approach where WAL records are sent to replicas in real time over a TCP connection. Streaming has much lower lag. WAL archiving is still useful as a fallback if the streaming connection drops.

A

Written by

Abhishek Patel

Infrastructure engineer with 10+ years building production systems on AWS, GCP, and bare metal. Writes practical guides on cloud architecture, containers, networking, and Linux for developers who want to understand how things actually work under the hood.

Related Articles

Enjoyed this article?

Get more like this in your inbox. No spam, unsubscribe anytime.

Comments

Loading comments...

Leave a comment

Stay in the loop

New articles delivered to your inbox. No spam.