Offsite replication problems and how to solve them.
Those of us who use (and abuse) replication on a daily basis know how cool and flexible it is. I've seen a lot of guides on how to set up streaming replication in 5 minutes, or how to set up basic archiving and/or WAL shipping replication, but I haven't seen many guides that combine these, or that implement an offsite setup while simulating latency, packet corruption, and basically what happens under network degradation. In this post I will describe a resilient replication setup of 2 nodes and put it to the test.

For this post I will use 2 Debian VMs, PostgreSQL 9.4 beta2, OmniPITR 1.3.2 and netem. Netem is available on all current distributions (kernel 2.6+) and can emulate variable delay, loss, duplication and re-ordering.

The Basics

Streaming replication is awesome: it's fast, easy to set up, lightweight and near real-time. But how does it perform over the internet? I set up a simple streaming replica and set wal_segments and wal_keep_segments low (10 and 5). Now I want to emulate how i