Posts

Showing posts from 2014

Offsite replication problems and how to solve them.

 Those of us who use (and abuse) replication in daily basis know how cool and flexible it is. I've seen a lot of guides on how to setup streaming replication in 5 minutes, how to setup basic archiving and/or wal shipping replication but i haven't seen many guides combining these or implementing an offsite setup simulating latency, packet corruption, and basically what happens under network degradation. In this post i will describe a resilient replication setup of 2 nodes and i will put it to the test. For this post i will use 2 debian VMs, PostgreSQL 9.4 beta2, OmniPITR 1.3.2 and netem . Netem can be found on all current (2.6+) distributions and it can emulate variable delay, loss, duplication and re-ordering. The Basics Streaming replication is awesome, its fast , easy to setup, lightweight and near to realtime, but how it performs over the internet ? I setup a simple streaming replica, set wal_segments and wal_keep_segments low (10 and 5). Now i wanna emulate how i

PostgreSQL and ElasticSearch

Recently i had to evaluate ElasticSearch for a possible installation, elasticsearch is basically a search server that provides a distributed full-text search engine using a restful web interface, stores documents in json, it is written in java it is fast and really works out of the box with almost minimum effort. After the installation, it's just reading the documentation and adding / searching documents, I didn't really experiment much with searching but the API looks really good. One interesting question that i had to answer was about connectivity with postgres, and how to maintain a table in both datastores and that's what this post is all about. The first (fast and easy) answer here was rivers , it creates a jdbc connection with another datastore and based on a query it can pump data from any database table. It is available for postgres, twitter, mongo etc.. Because its jdbc its relatively slow and elasticsearch will (re)pump the data once every restart so pay ext

PostgreSQL Bi-Directional Replication

A while ago I wanted to explore the options i had for multi master replication in postgres, I have previously worked with Oracle MMR and to be honest i don't like the idea of mmr, mostly because of the all conflicts that you sometime get, most of the times these conflicts are easy to resolve though and mmr, or in our case BDR can be a very nice way to scale up your database writes. So, while searching i came across a project called postgresql BDR (bi-directional replication) developed by 2ndQuadrant , it allows users to create a geographically distributed asynchronous multi-master database using Logical Log Streaming Replication based on the changeset extraction feature introduced in PostgreSQL 9.4. This post is about showing how to set this up. For this setup i used 2 debian VMs (debian-7.5.0-amd64-netinst.iso) the extra packages i installed were : git, mc, joe,sudo, curl, libreadline-dev, zlib1g-dev, bison, flex (by the way, comments about joe wont be tolerated !!) With

Setting up Postgres Automated Failover

Nowadays, failover is one of the simplest procedures in postgres. Still, there are not many ways to automate this procedure and perform it without any human interference. This blog post will guide you through a way to implement automated failover by using two open source projects, repmgr2 and pgbouncer . PGbouncer will act like a connection pool and as an entry point to the database cluster. Repmgr now, (since version 2) supports automatic failover, the slave will try to connect to the master for a predefined time and number of attempts, if it fails, it will run a script that can be anything. We are going to use this functionality to perform auto-failover, redirect our application connections to the new master database, and notify the dba that a failover has been performed. Each server will have its own copy of pgbouncer configuration file(s) and when a server is being promoted, the new master also has the responsibility to overwrite the configuration file on the server that