Practical Guide for Database Scaling: how to use Postgres Logical Replication

Check the youtube tutorial video

https://www.youtube.com/@DoCodeDo

https://youtu.be/IHG6iqkWui4

Why use logical replication?

Today, most of us rely on using paid managed databases. However, many basic plans offer limited memory and complex slow queries can queue up, leading to slower app response times.

Sometimes, if our application is read intensive, it could be worth spinning up another database (a cheaper self managed one) just to serve as a read replica for our app.

This can also be very useful to implement analytics and data warehouses in a manner that doesn’t impact the performance of the main database that is serving our customers.

Postgres logical replication offers an efficient and easy-to-implement solution in these cases.

Postgres logical replication

Postgres database replication

Postgres has two kinds of replication:

  • Physical Replication: This method replicates the entire database server instance, meaning all databases, schemas, and tables are copied.
  • Logical replication: Allows for more granular replication. You create publications for specific databases and tables. Another Postgres instance can subscribe to these publications and receive updates for these specific tables.

Practical example

Let’s use docker compose to set up a logical replica.

Clone the repository from:
https://github.com/viniciusoyama/postgres-logical-replication-example

It has a docker-compose.yml file and some SQL examples that we will use here. The setup is:

  • 1 postgres instance to serve as main database publisher
  • 1 postgres instance to serve as read subscriber
  • 1 PGAdmin web interfaceI won’t really use it here in this article but it’s configured so you can explore more if you want.

Start the project in background mode with docker compose up -d

Create a movies database

In our example we will be creating a movies database.

Let’s create the database on each instance: main and replica.


docker compose exec postgres-main psql -U user -c "CREATE DATABASE movies_app_main;"

docker compose exec postgres-replica psql -U user -c "CREATE DATABASE movies_app_replica;"

Create tables

We will use the provided create-tables.sql to create two tables: authors and movies

-- Create authors table

CREATE TABLE authors (

id SERIAL PRIMARY KEY,

name VARCHAR(255) NOT NULL

);

-- Create movies table

CREATE TABLE movies (

id SERIAL PRIMARY KEY,

name VARCHAR(255) NOT NULL,

author_id INT,

FOREIGN KEY (author_id) REFERENCES authors(id) ON DELETE SET NULL

);

Execute the following command to create the tables in both databases:


docker compose exec -T postgres-main psql -U user movies_app_main < create-tables.sql

docker compose exec -T postgres-replica psql -U user movies_app_replica < create-tables.sql

Populating tables with data

Before setting up replication, let’s populate the main database with data.


docker compose exec -T postgres-main psql -U user movies_app_main < populate-data.sql

Checking main database with a SELECT


docker compose exec postgres-main psql -U user -d movies_app_main -c "SELECT * FROM authors;SELECT * FROM MOVIES;"

 id | name  

----+-------

  1 | John

  2 | Maria

  3 | Lisa

(3 rows)

 id |  name   | author_id 

----+---------+-----------

  1 | Movie 1 |         1

  2 | Movie 2 |         1

  3 | Movie 3 |         2

(3 rows)

The replica database tables should be empty:


docker compose exec postgres-replica psql -U user -d movies_app_replica -c "SELECT * FROM authors;SELECT * FROM movies;"
 id | name 

----+------

(0 rows)

 id | name | author_id 

----+------+-----------

(0 rows)

Configuring the logical replication

Creating the publication

Let’s create a publication on the main database:


docker compose exec postgres-main psql -U user -d movies_app_main -c "CREATE PUBLICATION all_changes_pub FOR TABLES IN SCHEMA public;"

Oops. It gives a warning: WARNING:  wal_level is insufficient to publish logical changes.

As good programmers that we are, warnings are not errors right? Life goes on… just kidding.

Write-Ahead Log (WAL)

The Write-Ahead Log (WAL) records database changes before they are applied to the main data files (hence the term ‘write-ahead’). This ensures that even if a failure occurs before the data is fully written to disk, the changes can still be recovered from the WAL during crash recovery.

How WAL Works
  • When a change (insert, update, delete) is made to a PostgreSQL database, the change is first written to the WAL log before it’s applied to the actual data files on disk.
  • The WAL entries are stored in WAL segments, which are typically 16MB files.
  • Only after the WAL entry is safely written to disk does PostgreSQL actually apply the changes to the database itself (this is called a checkpoint).

You can read more about WAL here on the postgres documentation: https://www.postgresql.org/docs/current/runtime-config-wal.htm

There you will find more information on this warning on wal_level:

wal_level determines how much information is written to the WAL. The default value is replica, which writes enough data to support WAL archiving and replication, including running read-only queries on a standby server. minimal removes all logging except the information required to recover from a crash or immediate shutdown. Finally, logical adds information necessary to support logical decoding. Each level includes the information logged at all lower levels. This parameter can only be set at server start.

Changing wal_level

  • Stop the project with docker compose down
  • Uncomment the command section on docker-compose.ymlfrom postgres-main service. It will set wal_level to logical.
  • Start the project again with docker compose up -d.

The publication is already created and now it’s working.

Just for fun, try to create a new one: it should not give you any error.

docker compose exec postgres-main psql -U user -d movies_app_main -c "CREATE PUBLICATION all_changes_pub_two FOR ALL TABLES;"

Subscribing the replica to the main publication

Now we just need to configure the replica database to subscribe to the main publication:


docker compose exec postgres-replica psql -U user -d movies_app_replica -c "CREATE SUBSCRIPTION replica_sub

CONNECTION 'host=postgres-main

dbname=movies_app_main

user=user

password=secret'

PUBLICATION all_changes_pub;

;"

After subscribing, all previous data should be copied. You can check it running a SELECT on the replica database:


docker compose exec postgres-replica psql -U user -d movies_app_replica -c "SELECT * FROM authors;SELECT * FROM MOVIES;"

Testing the replication with new data

Lets insert two new authors and one new movie on the main database and check if it appears on the replica database:


docker compose exec postgres-main psql -U user -d movies_app_main -c "

INSERT INTO authors (name) VALUES ('Pedro');

INSERT INTO authors (name) VALUES ('Jose');

INSERT INTO movies (name, author_id) VALUES ('Movie 4', (SELECT id FROM authors WHERE name = 'Jose'));

"

Checking replica with SELECT


docker compose exec postgres-replica psql -U user -d movies_app_replica -c "SELECT * FROM authors;SELECT * FROM MOVIES;"

give us:


 id | name  

----+-------

  1 | John

  2 | Maria

  3 | Lisa

  4 | Pedro

  5 | Jose

(5 rows)

 id |  name   | author_id 

----+---------+-----------

  1 | Movie 1 |         1

  2 | Movie 2 |         1

  3 | Movie 3 |         2

  4 | Movie 4 |         5

(4 rows)

Everything is working and we can use our replica database as a read replica on our main app or for other purposes.

Limitations

When using logical replicas, we should be aware of some limitations:

Bonus tip 1: PGAdmin setup

Start pgadmin

Start the project with:

docker compose up

Access the PGAdmin web interface at http://localhost:8888 .

Login with:

Add database instances to pgadmin

We need to add our 2 databases to PGAdmin.
You can import the configuration from the json config included on the project code or manually add each server:

Importing json config file

  • Go to header menu -> tools -> import export server
  • Select import option
  • Search for serverconfig.json file and click next
  • Select all servers and click next
  • Finish

On the object explorer section on the left you can expand and list the servers. If you’re asked, the password is secret .

Manual configuration

Right click on the "Servers" item on the "object explorer" menu on the left. Go to Register -> Server

Main

# General tab
name: main

# Connection tab
hostname: postgres-main
Port: 5432
username: user
password: secret

Replica

# General tab
name: replica

# Connection tab
hostname: postgres-replica
Port: 5432
username: user
password: secret

Bonus tip 2: how to manage schema changes

Postgres Logical Replication requires that all published tables to be created on subscribed databases and have the same structure.

If you want to use this architecture in your app and want to learn how to manage table schema changes I have another post explaining how to do it using Ruby on Rails. Check it here:

https://vinioyama.com/blog/scaling-rails-setup-multiple-databases-with-postgres-logical-replication/

I encourage you to take a look as the technique in other languages/frameworks should be very similar.

Conclusion

Postgres Logical Replication is a powerful feature that enables us to cherry pick what we wanto to replicate and control access permissions.

It’s a quick and easy way to have another database serving as a read replica but also can be used in other complex scenarios with multiple services and different applications.



Subscribe to my newsletter!

I share content about Software Development & Architecture, Entrepreneurship and Lifelong Learning

Scroll to Top