Posts

Comparing pg 9.4 with pg 12, CTE edition

Postgres 12 is around the corner, and as always is packed with new features. One of them being the option to allow the user to control the behaviour of CTE materialization. You can find the commit from Tom Lane here, which explains everything about this new change but TLDR is that so far, CTE's where fully materialized, so restrictions from the query that uses it won't apply to the CTE. Which is the right way doing this when you are using CTEs to INSERT/UPDATE/DELETE or when they are recursive. This means that when a CTE is side-effect-free and non-recursive it's safe to push the restrictions from the outer query.
So, from postgres 12, when it's safe or when the CTE is called only once, postgres will inline the CTE to the outer query, removing the optimization fence. User will be able to override this behaviour by using  MATERIALIZED / NOT MATERIALIZED keywords.

Here's an example :
drop table if exists test_cte; create table test_cte as     select generate_series(1…

Accessing PostgreSQL data from AWS Lambda

Image
All databases need some maintenance that we usually perform using scheduled tasks. For example, if you have an RDS instance and you want to get a bloat report once a month, you’ll probably need a small EC2 instance just to do these kinds of things. In this post, we will talk about accessing RDS, getting a result from a SQL statement and reporting it to a slack channel using a scheduled Lambda function; let’s call this poor man’s crontab :)
Before we start, psycopg2 isn’t supported by lambda so it needs to be packaged into the lambda deployment package along with any other dependencies, but since psycopg2 requires libpq it needs to be compiled with libpq statically linked. There are many binaries of psycopg2 that you can download, but I would suggest you compile your own, using the latest PostgreSQL source code. We’ll talk about all these steps throughout this post.
We’ll cover the following: Creating a new AWS postgres RDS instanceHow to compile psycopg (we’ll use docker for that)Code fo…

CVE-2019-10164 Who's affected and how to protect your systems.

Yesterday, 20th of June Postgres community released minor version updates for all supported versions. (11.4, 10.9, 9.6.14, 9.5.18, 9.4.23, 12Beta2)
As with any minor version, it is recommended to upgrade and keep your database to the latest minor version. But this release is a bit more important 
than others because it includes a fix about a recently discovered CVE (CVE-2019-10164) 
From the release notes:

"This release is made outside of the normal update release schedule as the security
vulnerability was determined to be critical enough to distribute the fix as quickly as possible.
Users who are running PostgreSQL 10, PostgreSQL 11, or the PostgreSQL 12 beta should upgrade as soon as possible."

In this post i'll talk about this CVE. What this all about, who is affected and how to protect your systems against it.


What is this all about.A system authenticated user could change their own password with a "special crafted password" that could crash your database server or…

Managing xid wraparound without looking like a (mail) chimp

My colleague Payal came across an outage that happened to mailchimp's mandrill app yesterday, link can be found HERE.
Since this was PostgreSQL related i wanted to post about the technical aspect of it.
According to the company :

“Mandrill uses a sharded Postgres setup as one of our main datastores,”
the email explains.
“On Sunday, February 3, at 10:30pm EST, 1 of our 5 physical Postgres instances saw a significant spike in writes.” 
The email continues:
The spike in writes triggered a Transaction ID Wraparound issue. When this occurs, database activity is completely halted. The database sets itself in read-only mode until offline maintenance (known as vacuuming) can occur.”

So, lets see what that "transaction id wraparound issue" is and how someone could prevent similar outages from ever happening.

PostgreSQL uses MVCC to control transaction visibility, basically by comparing transaction IDs (XIDs). A row with an insert XID greater than the current transaction  XID shouldn'…

PostgreSQL 12 : Where in copy from

5 days ago a commit made it to 12devel that implements WHERE clause in COPY FROM.
Today we're gonna see how it works and see how someone could achieve the same by using file_fdw.

To begin with, lets create a table and put some data in it.

create table test (
id int,
date timestamp without time zone,
);

insert into test (id,date) select generate_series (1,1000000)            ,'2015-01-01';
insert into test (id,date) select generate_series (1000001,2000000),'2016-01-01';
insert into test (id,date) select generate_series (2000001,3000000),'2017-01-01'; insert into test (id,date) select generate_series (3000001,4000000),'2018-01-01'; insert into test (id,date) select generate_series (4000001,5000000),'2019-01-01';
now, lets make this a bit larger than 170MB, dump the data in csv and truncate the table :
monkey=# insert into test select * from test; INSERT 0 5000000 monkey=# insert into test select * from test; INSERT 0 10000000 monkey=# select c…

Zero downtime upgrade PostgreSQL 10 to 11.

PostgreSQL 11 has been officially released and it's packed with exciting changes.Native declarative partitioning is more robust, parallelism is now allowing moreoperations to work in parallel, transactions are now supported in stored procedures and just in time compilation for expressions are just some of the new features.With every major version, Postgres is getting better, providing more and more tools to developers and better quality of life to the DBAs, making it harder to resist an upgrade.That said, while new major versions come with all these exciting improvements, they also come with internal system changes, making the database clusters incompatible across major versions. This means that upgrades up to Postgres 10, where logical replication was introduced, were impossible without downtime or 3rd party replication tools.
Story time, Before we get into how to upgrade using logical replication, let's see how upgrading and replication evolved over the years. pg_dump and pg_r…

Logical Replication in aws RDS

Recently Amazon annnounced that postgres 10 supports logical replication syntax and that it can work as either a publisher or a subscriber. The announcement can be found here.

In order to test it out i made 2 10.4 RDS instances. Set them up to be in the same security,
subnet and parameter groups so i ended up with something like this :

user : vasilis password : Ap4ssw0rd db instance : lrnode1 & 2 dbname : lrtest port : 5432
only thing i had to change in order to enable logical replication is to change rds.enable_logical_replication to 1 in my parameter group.
After verifying connectivity
psql -h lrnode1.random.us-east-1.rds.amazonaws.com -U vasilis -d lrtest -W psql -h lrnode2.random.us-east-1.rds.amazonaws.com -U vasilis -d lrtest -W
i created a dummy table on my publisher and subscriber (lrnode1 and 2)
create table lr_test_tbl (     id int primary key,     name text );
on lrnode1 (publisher) i created a publication CREATE PUBLICATION lr_pub FOR TABLE lr_test_tbl ;

on lrnode2 and as vasilis i creat…