Posts

Showing posts from 2016

Traveling in time(lines) with pg_rewind

At 9.5 pg_rewind was introduced, with this feature it was possible to make a server that is no longer master to follow a promoted standby that has a new timeline. 
There are at least a few use cases that could benefit from this feature, to name a few :

Testing Failovers : Its pretty common for one of customers to ask for a failover simulation just to make sure that everything works, by using pg_rewind this is much easier and much faster because we don't have to re-sync the whole database that sometimes is multi-TB sized. 

Minor version upgrades : during an upgrade you have to put the new binaries in place and restart the database. By using pg_rewind if the application is flexible enough , you could upgrade the slave , switchover , upgrade the master and pg_rewind the old master to follow the new slave, possibly minimising even more the downtime.

Requirements.

pg_rewind has a couple of requirements :


1. data page checksums enabled (initdb -k)
or
2. parameter wal_log_hints has to be enabled…

Adding columns with default value fast

I recently came across a discussion in #postgresql channel about how to add a new column that will represent record insert timestamps to a table with minimal locking.

Normally you would do something like this :

monkey=# alter table alter_test add column insert_timestamp timestamp without time zone default now();
ALTER TABLE
Time: 13643.983 ms
monkey=# create index concurrently insert_timestamp_idx on alter_test (insert_timestamp) where insert_timestamp is not null;
CREATE INDEX
Time: 1616.108 ms

This though , would require an AccessExclusiveLock on the table and the application would have to wait for the duration of alter.
What you could do instead is :

monkey=# alter table alter_test add column insert_timestamp timestamp without time zone;
ALTER TABLE
Time: 1.135 ms
monkey=# alter table alter_test alter column insert_timestamp set default now();
ALTER TABLE
Time: 1.238 ms
monkey=# create index concurrently insert_timestamp_idx on alter_test (insert_timestamp) where insert_timestamp i…

Custom Postgres installation with Docker

Image
Lately, i wanted to make a new test / lab environment that would be dispensable, fast to deploy and easily customisable. VM's are all nice for lab use and they served me well over the years, but they tend to be slow(er) to deploy, heavy on resources and lets face it,



So i've decided to play around with Docker. I've played with Docker in the past but i haven't done anything too serious, i just took the official postgres Dockerfile and run it to see how it works. This is how i started my whole Docker exploration, i took the official postgres docker file that can be found here and made an image. Immediately i saw that it wasn't exactly what i wanted, i wanted something less generic. The things that i would like are :

latest postgres version, compiled from sourceeasily customisable pgdata and prefixcustom postgresql.conf settings (mostly having it replication ready)custom shell environmentsome extensionsdata page checksumseasy replica deployment So i started creating m…

Repairing clog corruptions

Yesterday, i got a page from a client about a possible corruption issue to one of his clients, so i decided to share how i dealt with it. Before starting with how i approached the problem, i want to say that corruptions don't just happen in postgres, in OmniTI, we manage A LOT of databases, all with different major versions and in different operating systems and in my time, I haven't seen (many) cases that corruption happened because of a postgres bug (i've seen indexes getting corrupted but i don't remember ever seeing data being corrupted). What i want to say is that corruptions don't just happen, hardware failures, power outages, disks getting full are common reasons for corruptions. 
A replica and backups should always be in place and the server should be properly monitored. Unfortunately this server was not managed by us so none of the above was in place..

At first I saw in the logs entries like :


From the logs: 2016-05-18 15:50:06 EDT::@:[10191]: ERROR:  could no…