Monday, 12 November 2018

Zero downtime upgrade PostgreSQL 10 to 11.

PostgreSQL 11 has been officially released and it's packed with exciting changes.Native declarative partitioning is more robust, parallelism is now allowing moreoperations to work in parallel, transactions are now supported in stored procedures and just in time compilation for expressions are just some of the new features.

With every major version, Postgres is getting better, providing more and more tools to developers and better quality of life to the DBAs, making it harder to resist an upgrade.That said, while new major versions come with all these exciting improvements, they also come with internal system changes, making the database clusters incompatible across major versions. This means that upgrades up to Postgres 10, where logical replication was introduced, were impossible without downtime or 3rd party replication tools.

Story time,

Before we get into how to upgrade using logical replication, let's see how upgrading and replication evolved over the years. pg_dump and pg_restore is probably the most traditional way to do an upgrade. This method requires all database writes to be suspended, making it impractical for any reasonably sized production database. Pg_upgrade was introduced in Postgres 8.4, it’s still the most common way to do upgrades. It works under the assumption that the internal storage format rarely changes allowing it to create new system catalog tables and simply reuse the old data files. This means that upgrades are safe and fast. It still requires the database to be down and by default it will copy the datafiles to a new data directory, This can take significant amount of time but it can easily bypassed by using the hard link option provided by pg_upgrade itself. Hard links are only valid in the same filesystem, and with that in mind, this method not only massively reduces downtime, but also eliminates the need of having a second copy of the database cluster. In rare occasions like for example, an upgrade changing storage options, like floating point to int64 date/times, pg_upgrade won’t help, it will exit saying that clusters are incompatible. For this reason, or in case a database downtime -even a brief one- is inhibitory, there are third party tools that can perform upgrades with zero downtime. pg_logical and Slony are the most popular ones, with Slony doing trigger based replication and pg_logical using logical decoding.

 Just to give you a sense of how different the downtime requirements can be for each upgrade method below there are timings for a 13GB database using each of the three methods we discussed above:

  • pg_dump : 10m52.363s
  • pg_upgrade : 2m41.983s
  • pg_upgrade using hard links : 0m4.545s


Introduced in 8.3, warm standbys were the first Postgres native replication method. Based on log shipping where WAL changes, packed into WAL files were shipped directly from one server to another in order to be replayed sequentially one file at the time. Streaming replication was introduced at 9.0, this gave us the ability to transmit WAL records instead of WAL files over a separate and dedicated replication communication channel. It is however replicating changes byte-per-byte so it doesn’t allow any writes in standbys. Also partial replication of individual databases or objects isn’t possible. In Postgres 10 Logical replication was introduced and while it uses the same communication channel and the same data as streaming, it uses Logical Decoding feature to make logical changes to the target database based on each objects replica identity, usually a primary key. Logical replication uses a publisher-subscriber model where one or more subscribers subscribe to one or more publications on a publisher node. Replication of an object usually starts by transferring a snapshot of the object's data to the subscriber. Once this is done the changes are being transmitted as they happen through the native replication channel over a replication slot that is automatically created. Because logical replication is sending incremental changes of individual objects it gives flexibility that doesn't exist in Streaming replication :

  • It replicates between different major versions
  • It replicates to a fully functional and writable target database
  • It works on different platforms (Linux to Windows)

 Upgrading by replicating.

With cloud installations getting very popular and with AWS RDS supporting Logical replication since Jul 25, 2018 one could easily make a transition from Postgres 10 to 11 with absolutely zero database downtime. The only downtime needed would be the application switchover to the new database. The process in theory is pretty straight forward, you need to create a publisher on your postgres 10, adding all your tables to one or more publications, transfer an empty schema to postgres 11 and create a subscriber that will subscribe to the publication(s). We won’t get into the details of each individual command, there are numerous guides that explain the process. Here’s a video of Robert Treat, demonstrating Logical replication live during Postgres Open SV 2017. Instead we are going to talk about the possibilities that this method could offer now or in the near future. Over the last 15 years or so, we moved from dump-restore offline upgrades, to pg_upgrade (rapid, offline upgrades) to logical replication upgrades, pg_upgrade served us really well, but it has its cons. You have to generate new database statistics after an upgrade and this, when sizes are big or statistics targets are tuned higher than the defaults can be time consuming. Yes, this can be done online but until it's done, database performance is compromised. Also, the upgraded database will start with cold cache and sometimes it'll take hours or even days to get to the state it was. Upgrading with logical replication gives us the ability to update statistics and pre-warm cache on the replica prior to the application switch over, something that isn't possible with pg_upgrade method. Up until 11 emergency downgrades were not a real option, now with logical replication they are. Lastly, as we said earlier, pg_upgrade works under the assumption that internal storage format doesn't change and even if the community will try its best not to let that happen, pluggable storage engines are a possible future feature, and if this ever happens, i'm not sure how pg_upgrade could work. All these restrictions are in theory lifted with logical replication, and assuming that tools around logical replication will be developed I can see a future where we no longer upgrade databases, we just make transitions to between versions.

Caveats - word of advice

Logical replication offers a lot of flexibility but this doesn’t come without a cost, it’s very fragile to changes and it can be quite tricky to replicate a database with a complicated schema. It also requires every single replicated table to have a primary key and it has much more maintenance work than Streaming replication, which remains the goto solution for high availability uses. Another thing is that logical replication lifts the restriction of having a single point of writes and this, if not handled well from the application can end up causing split brain situations, where you have parts of the application writing to two different masters at the same time. This can be easily solved with DNS , virtual ips or by using connection pooling but it’s still more prone to mistakes than the other upgrade methods. Logical replication is a relatively new postgres feature, and as most of the new features it lacks the proper tooling, with postgres 11 being the first version that can be upgraded using logical replication, this method is still in its infancy. As mentioned before, there are certain use cases where logical replication can help, but I also need to mention that pg_upgrade is extremely mature, it's safe and it should be the recommended way of upgrading your postgres cluster.

Thanks for reading.

Vasilis Ventirozos
OmniTI Computer Consulting

No comments:

Post a Comment

Zero downtime upgrade PostgreSQL 10 to 11.

PostgreSQL 11 has been officially released and it's packed with exciting changes.Native declarative partitioning is more robust, parall...