Tuning checkpoints

We recently had the chance to help a customer with some I/O-related issues that turned out to be caused by untuned checkpoint settings. This may not always be obvious, but it is actually fairly common. Let's start with how things roughly work. Postgres's smallest I/O unit is a disk block, which is 8 kB by default. Each time Postgres needs a new block, it fetches it from disk and loads it into an area of RAM called shared_buffers. When Postgres needs to write, it works in the same manner:

- It fetches the block(s) from disk if they are not already in shared_buffers.
- It changes the page in shared buffers.
- It marks the page as changed (dirty) in shared buffers.
- It records the change in a sequential ledger of changes called the WAL to ensure durability.

This basically means that the writes are not yet in the data files on disk. Flushing dirty pages to the data files is handled by a Postgres process called the checkpointer. Checkpoints are how Postgres guarantees that data files and index files will be updated with all the changes written before the checkpoint.
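The knobs that control this behaviour live in postgresql.conf. As an illustrative sketch only (starting points to tune against your workload, not universal recommendations), a write-heavy machine might spread checkpoints out like this:

```
# illustrative starting points -- tune for your workload
checkpoint_timeout = 15min            # trigger a time-based checkpoint at most every 15 minutes
max_wal_size = 4GB                    # allow more WAL to accumulate between checkpoints
checkpoint_completion_target = 0.9    # spread checkpoint writes over 90% of the interval
log_checkpoints = on                  # log each checkpoint so you can see what triggers it
```

Watching the log after enabling log_checkpoints tells you whether checkpoints are being triggered by time (good) or by running out of WAL space (a sign max_wal_size is too small).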

Comparing pg 9.4 with pg 12, CTE edition

Postgres 12 is around the corner and, as always, it is packed with new features, one of them being the option to let the user control the behaviour of CTE materialization. You can find the commit from Tom Lane here, which explains everything about this change, but the TL;DR is that, until now, CTEs were always fully materialized, so restrictions from the query that uses a CTE were not pushed down into it. That is the right behaviour when you are using CTEs to INSERT/UPDATE/DELETE or when they are recursive, but when a CTE is side-effect-free and non-recursive, it is safe to push the restrictions from the outer query down into it. So, from Postgres 12, when it is safe and the CTE is referenced only once, Postgres will inline the CTE into the outer query, removing the optimization fence. Users can override this behaviour with the MATERIALIZED / NOT MATERIALIZED keywords. Here's an example:

    drop table if exists test_cte;
    create table test_cte as
        select generate_series
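The override syntax in Postgres 12 looks like the sketch below; the table and filter are illustrative:

```sql
-- v12 default: a safe, single-reference CTE is inlined into the
-- outer query, so the filter can be pushed down into the CTE
WITH cte AS (SELECT * FROM test_cte)
SELECT * FROM cte WHERE id = 42;

-- force the pre-12 behaviour: materialize the CTE first
WITH cte AS MATERIALIZED (SELECT * FROM test_cte)
SELECT * FROM cte WHERE id = 42;

-- force inlining even when the CTE is referenced more than once
WITH cte AS NOT MATERIALIZED (SELECT * FROM test_cte)
SELECT * FROM cte a JOIN cte b USING (id) WHERE a.id = 42;
```

Comparing `EXPLAIN` output for the first two variants is the quickest way to see the fence appear and disappear.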

Accessing PostgreSQL data from AWS Lambda

All databases need some maintenance that we usually perform using scheduled tasks. For example, if you have an RDS instance and you want to get a bloat report once a month, you’ll probably need a small EC2 instance just for these kinds of things. In this post, we will talk about accessing RDS, getting a result from a SQL statement, and reporting it to a Slack channel using a scheduled Lambda function; let’s call this the poor man’s crontab :) Before we start: psycopg2 isn’t included in the Lambda runtime, so it needs to be packaged into the Lambda deployment package along with any other dependencies, and since psycopg2 requires libpq, it needs to be compiled with libpq statically linked. There are many prebuilt psycopg2 binaries that you can download, but I would suggest you compile your own, using the latest PostgreSQL source code. We’ll talk about all these steps throughout this post. We’ll cover the following:

- Creating a new AWS PostgreSQL RDS instance
- How to compile psycopg2 (we’ll use Docker for
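To give a feel for the end result, here is a minimal sketch of such a Lambda handler. The hostname, credentials, webhook URL, and query are all placeholders; psycopg2 is imported inside the handler to emphasize that it must ship inside the deployment package:

```python
"""Hedged sketch of a scheduled 'poor man's crontab' Lambda.
Hostnames, credentials, and the webhook URL are hypothetical."""
import json
from urllib import request

SLACK_WEBHOOK = "https://hooks.slack.com/services/XXX/YYY/ZZZ"  # placeholder

def format_slack_message(rows):
    # turn (label, value) rows into a simple Slack text payload
    lines = ["{}: {}".format(label, value) for label, value in rows]
    return {"text": "\n".join(lines)}

def post_to_slack(payload):
    # POST the JSON payload to the incoming-webhook URL
    req = request.Request(SLACK_WEBHOOK,
                          data=json.dumps(payload).encode(),
                          headers={"Content-Type": "application/json"})
    request.urlopen(req)

def handler(event, context):
    # psycopg2 (with libpq statically linked) must be bundled in the
    # deployment package -- it is not part of the Lambda runtime
    import psycopg2
    conn = psycopg2.connect(host="myrds.example.com", dbname="postgres",
                            user="monitoring", password="secret")
    with conn, conn.cursor() as cur:
        cur.execute("SELECT now()::text")  # placeholder for a real report query
        rows = [("db time", cur.fetchone()[0])]
    post_to_slack(format_slack_message(rows))
```

A CloudWatch Events (EventBridge) schedule rule then invokes `handler` on whatever cron expression you need.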

CVE-2019-10164: Who's affected and how to protect your systems.

Yesterday, the 20th of June, the Postgres community released minor version updates for all supported versions (11.4, 10.9, 9.6.14, 9.5.18, 9.4.23, 12beta2). As with any minor version, it is recommended to upgrade and keep your database on the latest minor version. But this release is a bit more important than others because it includes a fix for a recently discovered CVE, CVE-2019-10164. From the release notes: "This release is made outside of the normal update release schedule as the security vulnerability was determined to be critical enough to distribute the fix as quickly as possible. Users who are running PostgreSQL 10, PostgreSQL 11, or the PostgreSQL 12 beta should upgrade as soon as possible." In this post I'll talk about this CVE: what it is all about, who is affected, and how to protect your systems against it. What is this all about? An authenticated user could change their own password to a specially crafted value that could crash the server.
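Only versions 10 and later are listed as affected, which matches the fact that the vulnerable code is in SCRAM authentication, introduced in Postgres 10. A quick way to check what a given cluster is running and how it hashes passwords:

```sql
-- affected releases are fixed in 10.9, 11.4, and 12beta2
SELECT version();

-- shows whether scram-sha-256 (the SCRAM code path) is in use
SHOW password_encryption;
```

Either way, the real fix is applying the minor version update; checking the version just tells you how urgent that is.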

Managing xid wraparound without looking like a (mail) chimp

My colleague Payal came across an outage that happened to Mailchimp's Mandrill app yesterday; the link can be found HERE. Since this was PostgreSQL related, I wanted to post about the technical aspect of it. According to the company: “Mandrill uses a sharded Postgres setup as one of our main datastores,” the email explains. “On Sunday, February 3, at 10:30pm EST, 1 of our 5 physical Postgres instances saw a significant spike in writes.” The email continues: “The spike in writes triggered a Transaction ID Wraparound issue. When this occurs, database activity is completely halted. The database sets itself in read-only mode until offline maintenance (known as vacuuming) can occur.” So, let's see what that "transaction ID wraparound issue" is and how someone could prevent similar outages from ever happening. PostgreSQL uses MVCC to control transaction visibility, basically by comparing transaction IDs (XIDs). A row with an insert XID greater than the current transaction
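A first line of defence against this class of outage is simply monitoring how far each database is from wraparound. The standard check uses `age()` on `pg_database.datfrozenxid`:

```sql
-- age() shows how many transactions old each database's frozen XID is;
-- it must stay well below ~2 billion (2^31), and autovacuum's
-- anti-wraparound freezing should kick in long before that
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;
```

Alerting when `xid_age` crosses a threshold (say a few hundred million) gives you plenty of time to vacuum before the database ever considers shutting itself down.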

PostgreSQL 12: WHERE in COPY FROM

Five days ago a commit made it into 12devel that implements a WHERE clause in COPY FROM. Today we're going to see how it works, and how someone could achieve the same thing using file_fdw. To begin with, let's create a table and put some data in it:

    create table test (
        id int,
        date timestamp without time zone
    );
    insert into test (id, date) select generate_series (1, 1000000),       '2015-01-01';
    insert into test (id, date) select generate_series (1000001, 2000000), '2016-01-01';
    insert into test (id, date) select generate_series (2000001, 3000000), '2017-01-01';
    insert into test (id, date) select generate_series (3000001, 4000000), '2018-01-01';
    insert into test (id, date) select generate_series (4000001, 5000000), '2019-01-01';

Now, let's make this a bit larger than 170MB, dump the data to CSV, and truncate the table:

    monkey=# insert into test select * from test;
    INSERT 0 5000000
    monkey=# insert into test select * from test;
    INSE
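For reference, the new syntax looks like this (the file path is illustrative):

```sql
-- pg 12+: load only the rows that match the predicate,
-- skipping the rest of the CSV entirely
COPY test FROM '/home/postgres/test.csv' WITH (FORMAT csv)
WHERE date >= '2018-01-01';
```

Before 12, the closest equivalent was pointing file_fdw at the CSV and running `INSERT INTO test SELECT ... WHERE ...` against the foreign table, which is what we'll compare against.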

Zero-downtime upgrade from PostgreSQL 10 to 11.

PostgreSQL 11 has been officially released and it's packed with exciting changes. Native declarative partitioning is more robust, parallelism now allows more operations to run in parallel, transactions are now supported in stored procedures, and just-in-time compilation of expressions is available; these are just some of the new features. With every major version, Postgres is getting better, providing more and more tools to developers and a better quality of life to DBAs, making it harder to resist an upgrade. That said, while new major versions come with all these exciting improvements, they also come with internal system changes that make database clusters incompatible across major versions. This means that, until Postgres 10 introduced logical replication, major-version upgrades were impossible without downtime or third-party replication tools. Story time! Before we get into how to upgrade using logical replication, let's see how upgrading and replication evolved over the years. pg_
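The heart of the logical-replication approach is a publication on the old 10 primary and a subscription on the new 11 cluster. As a minimal sketch (connection details are placeholders, and the schema must already exist on the 11 side, e.g. restored with `pg_dump --schema-only`):

```sql
-- on the PostgreSQL 10 primary
CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

-- on the freshly initialized PostgreSQL 11 cluster
CREATE SUBSCRIPTION upgrade_sub
    CONNECTION 'host=old-primary dbname=mydb user=repuser'
    PUBLICATION upgrade_pub;
```

Once the subscription has copied the initial data and caught up streaming changes, the application can be switched over to the 11 cluster with essentially no downtime.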