PostgreSQL 9.4 streaming replication over SSL with Replication Slots
I never really liked the concept of keeping wal files in case the replica goes down proactively, setting wal_keep_segments was always a tricky number and most of the time we ended up setting this for the worst case scenario, something that means space consumption for no (under normal circumstances) good reason.
That’s why i really like replication slots.
Replication slots are, and i am quoting Robert Haas :
"a crash-safe data structure which can be created on either a master or a standby to prevent premature removal of write-ahead log segments needed by a standby, as well as (with hot_standby_feedback=on) pruning of tuples whose removal would cause replication conflicts. Slots have some advantages over existing techniques, as explained in the documentation."
While there are a lot of posts about how to setup replication slots, and to be honest its pretty easy, in this post i will combine this with replication over SSL and i will also show a useful trick for pg_basebackup (props to my friend Denish Patel).
The setup that will serve this post is the following
2 Debian VMS, with minimal install with PostgreSQL 9.4.5 installed on both.
we will call them :
master01 the ip will be 192.168.1.15 and it will act as Master with $PGDATA at /data/.
slave01 the ip will be 192.168.1.16 and it will act as Slave with $PGDATA at /data/.
on Master:
We set SSL and replication settings to :
ssl = on
ssl_cert_file = '/data/ssl/server.crt'
ssl_key_file = '/data/ssl/server.key'
wal_level = hot_standby
max_wal_senders = 5
wal_keep_segments = 0
max_replication_slots = 5
hot_standby = on
hot_standby is ignored on master but we plan getting a pg_basebackup so it will be used on slave.
Before we restart the Master, we have to create the certificate files :
mkdir $PGDATA/ssl ; chown postgres /data/ssl
cd /data/ssl/
openssl req -new -text -out server.req
openssl rsa -in privkey.pem -out server.key
rm privkey.pem
openssl req -x509 -in server.req -text -key server.key -out server.crt
chmod og-rwx server.key
More info about the certifications can be found here.
Now we need a proper replication user a replication slot and a replication entry on pg_hba.conf.
for the user :
psql -c "create role repuser with login replication password 'password';" postgres
for the slot :
psql -c "SELECT * FROM pg_create_physical_replication_slot('slave01_slot');" postgres
and pg_hba.conf should loot like this :
# TYPE DATABASE USER ADDRESS METHOD
local all all trust
host all all 127.0.0.1/32 trust
host all vasilis 192.168.1.0/16 md5
hostssl replication repuser 192.168.1.16/32 md5
host all all 0.0.0.0/0 reject
Two things that need attenction in pg_hba.conf :
1. hostssl for the replication user
2. keep your "reject everyone else" as your last entry.
With all these set on Master , time to restart and start working on slave.
On slave:
Connect to the replication slot BEFORE you initiate pg_basebackup. This way Master will keep all wal files needed for the duration of the backup. Connection to the replication slot can be done by using pg_receivexlog eg:
> pg_receivexlog -D . -S slave01_slot -v -h 192.168.1.15 -U repuser -W
Password:
pg_receivexlog: starting log streaming at 0/3E000000 (timeline 1)
pg_receivexlog: finished segment at 0/3F000000 (timeline 1)
^Cpg_receivexlog: received interrupt signal, exiting
pg_receivexlog: received interrupt signal, exiting
notice that i pressed ^C, i stopped pg_receivexlog and with this way the wal files will remain to the Master until my slave is up.
now run pg_basebackup :
pg_basebackup -h 192.168.1.15 -D $PGDATA -P -X stream -U repuser
When pg_basebackup is complete, add a recovery.conf with the following entries :
standby_mode = 'on'
primary_conninfo = 'host=192.168.1.15 port=5432 user=repuser password=password sslmode=require'
trigger_file = '/data/failover.trigger'
recovery_target_timeline = 'latest'
primary_slot_name = 'slave01_slot'
and start the slave.
you should see:
database system is ready to accept read only connections
Now, in order to monitor your replication there are two useful sql statements :
I run the first one while my slave is down so as you can see it shows how much data (wal data) have to be transferred to my slave.
postgres=# SELECT slot_name, database,
active, pg_xlog_location_diff(pg_current_xlog_insert_location(), restart_lsn) AS retained_bytes FROM pg_replication_slots;
slot_name | database | active | retained_bytes
--------------+----------+--------+----------------
slave01_slot | | f | 201326688
(1 row)
and after i started the slave :
postgres=# SELECT slot_name, database, active,pg_xlog_location_diff(pg_current_xlog_insert_location(), restart_lsn) AS retained_bytes FROM pg_replication_slots;
slot_name | database | active | retained_bytes
--------------+----------+--------+----------------
slave01_slot | | t | 0
(1 row)
the second monitoring statement is :
postgres=# SELECT pg_xlog_location_diff(pg_current_xlog_insert_location(), flush_location) AS lag_bytes, pid, application_name
FROM pg_stat_replication;
lag_bytes | pid | application_name
-----------+-----+------------------
0 | 855 | walreceiver
This will show you how many bytes your slave is behind.
(both statements are to be run on Master unless cascading replicas are in place)
Thanks for reading,
Vasilis Ventirozos
That’s why i really like replication slots.
Replication slots are, and i am quoting Robert Haas :
"a crash-safe data structure which can be created on either a master or a standby to prevent premature removal of write-ahead log segments needed by a standby, as well as (with hot_standby_feedback=on) pruning of tuples whose removal would cause replication conflicts. Slots have some advantages over existing techniques, as explained in the documentation."
While there are a lot of posts about how to setup replication slots, and to be honest its pretty easy, in this post i will combine this with replication over SSL and i will also show a useful trick for pg_basebackup (props to my friend Denish Patel).
The setup that will serve this post is the following
2 Debian VMS, with minimal install with PostgreSQL 9.4.5 installed on both.
we will call them :
master01 the ip will be 192.168.1.15 and it will act as Master with $PGDATA at /data/.
slave01 the ip will be 192.168.1.16 and it will act as Slave with $PGDATA at /data/.
on Master:
We set SSL and replication settings to :
ssl = on
ssl_cert_file = '/data/ssl/server.crt'
ssl_key_file = '/data/ssl/server.key'
wal_level = hot_standby
max_wal_senders = 5
wal_keep_segments = 0
max_replication_slots = 5
hot_standby = on
hot_standby is ignored on master but we plan getting a pg_basebackup so it will be used on slave.
Before we restart the Master, we have to create the certificate files :
mkdir $PGDATA/ssl ; chown postgres /data/ssl
cd /data/ssl/
openssl req -new -text -out server.req
openssl rsa -in privkey.pem -out server.key
rm privkey.pem
openssl req -x509 -in server.req -text -key server.key -out server.crt
chmod og-rwx server.key
More info about the certifications can be found here.
Now we need a proper replication user a replication slot and a replication entry on pg_hba.conf.
for the user :
psql -c "create role repuser with login replication password 'password';" postgres
for the slot :
psql -c "SELECT * FROM pg_create_physical_replication_slot('slave01_slot');" postgres
and pg_hba.conf should loot like this :
# TYPE DATABASE USER ADDRESS METHOD
local all all trust
host all all 127.0.0.1/32 trust
host all vasilis 192.168.1.0/16 md5
hostssl replication repuser 192.168.1.16/32 md5
host all all 0.0.0.0/0 reject
Two things that need attenction in pg_hba.conf :
1. hostssl for the replication user
2. keep your "reject everyone else" as your last entry.
With all these set on Master , time to restart and start working on slave.
On slave:
Connect to the replication slot BEFORE you initiate pg_basebackup. This way Master will keep all wal files needed for the duration of the backup. Connection to the replication slot can be done by using pg_receivexlog eg:
> pg_receivexlog -D . -S slave01_slot -v -h 192.168.1.15 -U repuser -W
Password:
pg_receivexlog: starting log streaming at 0/3E000000 (timeline 1)
pg_receivexlog: finished segment at 0/3F000000 (timeline 1)
^Cpg_receivexlog: received interrupt signal, exiting
pg_receivexlog: received interrupt signal, exiting
notice that i pressed ^C, i stopped pg_receivexlog and with this way the wal files will remain to the Master until my slave is up.
now run pg_basebackup :
pg_basebackup -h 192.168.1.15 -D $PGDATA -P -X stream -U repuser
When pg_basebackup is complete, add a recovery.conf with the following entries :
standby_mode = 'on'
primary_conninfo = 'host=192.168.1.15 port=5432 user=repuser password=password sslmode=require'
trigger_file = '/data/failover.trigger'
recovery_target_timeline = 'latest'
primary_slot_name = 'slave01_slot'
and start the slave.
you should see:
database system is ready to accept read only connections
Now, in order to monitor your replication there are two useful sql statements :
I run the first one while my slave is down so as you can see it shows how much data (wal data) have to be transferred to my slave.
postgres=# SELECT slot_name, database,
active, pg_xlog_location_diff(pg_current_xlog_insert_location(), restart_lsn) AS retained_bytes FROM pg_replication_slots;
slot_name | database | active | retained_bytes
--------------+----------+--------+----------------
slave01_slot | | f | 201326688
(1 row)
and after i started the slave :
postgres=# SELECT slot_name, database, active,pg_xlog_location_diff(pg_current_xlog_insert_location(), restart_lsn) AS retained_bytes FROM pg_replication_slots;
slot_name | database | active | retained_bytes
--------------+----------+--------+----------------
slave01_slot | | t | 0
(1 row)
the second monitoring statement is :
postgres=# SELECT pg_xlog_location_diff(pg_current_xlog_insert_location(), flush_location) AS lag_bytes, pid, application_name
FROM pg_stat_replication;
lag_bytes | pid | application_name
-----------+-----+------------------
0 | 855 | walreceiver
This will show you how many bytes your slave is behind.
(both statements are to be run on Master unless cascading replicas are in place)
Thanks for reading,
Vasilis Ventirozos
 
This comment has been removed by a blog administrator.
ReplyDeleteHow come your pg_basebackup does not complain that the slave $PGDATA is not empty? Mine does... Could it be because of me running 11.4?
ReplyDelete