Managing xid wraparound without looking like a (mail) chimp
My colleague Payal came across an outage that happened to mailchimp's mandrill app yesterday, link can be found HERE . Since this was PostgreSQL related i wanted to post about the technical aspect of it. According to the company : “Mandrill uses a sharded Postgres setup as one of our main datastores,” the email explains. “On Sunday, February 3, at 10:30pm EST, 1 of our 5 physical Postgres instances saw a significant spike in writes.” The email continues: “ The spike in writes triggered a Transaction ID Wraparound issue. When this occurs, database activity is completely halted. The database sets itself in read-only mode until offline maintenance (known as vacuuming) can occur.” So, lets see what that "transaction id wraparound issue" is and how someone could prevent similar outages from ever happening. PostgreSQL uses MVCC to control transaction visibility, basically by comparing transaction IDs (XIDs). A row with an insert XID greater than the current transac...