Tuning checkpoints

We recently had the chance to help a customer with some IO related issues that ended up being unconfigured checkpoints. Something that may not always be obvious but can actually be somewhat common.

Let's start with how things roughly work.
Postgres smallest IO unit is a disk block that is 8kb (by default). Each time postgres needs a new block it will fetch it from the disks and load it to an area in RAM called shared_buffers.
When postgres needs to write, it does it in the same manner:
  • Fetches the block(s) from the disks if the block is not in shared_buffers
  • Changes the page in shared buffers.
  • Marks the page as changed (dirty) in shared buffers.
  • It writes the change in  a "sequential ledger of changes" called WAL to ensure durability.

This basically means that the writes are not yet "on disk". This operation is taken care of by a postgres process called checkpointer. Checkpoints are how postgres guarantees that data files and index files will be updated with all the changes that happened before that checkpoint. In case of a crash, postgres will go back to the latest checkpoint record and it will start a REDO operation from WAL. Checkpoints are triggered every checkpoint_timeout (default at 5min) or when changes reach max_wal_size (default at 1GB). This is an IO intensive operation and postgres tries to spread this IO
with checkpoint_completion_target (default at 0.5).

checkpoint_timeout* is the maximum time between checkpoints in seconds.
min_wal_size minimum size of wals that will be recycled rather than removed
max_wal_size** maximum size allowed for wals between checkpoints
checkpoint_completion_target allows data changes to spread over a longer period of time, making the  final fsync() much cheaper.
* Affects recovery time, change only after reviewing the documentation
** This is a softmax, it can exceed this value in special cases.

Best way to start is to set checkpoint_timeout value to something reasonable and set max_wal_size high enough so you won't reach the timeout. To make sense of what is reasonable, you can do the following: schedule (cronjob will do) something like this to run in short periods of time, say every minute:

psql -XqtA -c "copy (select now()::timestamp(0), pg_current_wal_insert_lsn()) to stdout with csv;" monkey >> current_lcn

Leave it running for as long as you see fit, from the result you can extract the difference of 2 locations in bytes like this:

monkey=# SELECT pg_size_pretty(pg_wal_lsn_diff('0/B1277248','0/59CEA2F8'));
 pg_size_pretty
----------------
 1398 MB
(1 row)

(protip: file_fdw + window function)

This function will calculate the difference of later location - earlier location in bytes, so having the location per minute can help calculate (or graph) the rate of changes over time.
With that number, your storage capabilities and your recovery needs in mind you should be able to come up with a good starting point.
max_wal_size should be set high enough so it won't be reached before the timeout. The rate of changes figure we calculated earlier should be a good indication of where to start. min_wal_size has to follow common sense and leave a small portion of wal files to be recycled for the next checkpoint.
checkpoint_completion_target often is just set to 0.9 but if you want to be more "precise" it follows the following rule:
(checkpoint_timeout - 2min) / checkpoint_timeout

To watch how checkpointer is working after the changes, you can query pg_stat_bgwriter. This is a very interesting view that probably deserves it's own blogpost - because of the data you can extract from it - but today we are going to concentrate on 2 columns, checkpoint_timed and checkpoint_req.

checkpoint_timed counts checkpoints that happened triggered by checkpoint_timeout and checkpoint_req is the count of the checkpoint that happened because max_wal_size was hit.
What you want to see there is the majority of the checkpoints to be timed.

And because graphs are cool, you can see an example of a production system before and after checkpoint configuration. The graph is write IOPS over time on an AWS RDS database.
In RDS specifically, i find it weird that checkpoint defaults are so low considering the high shared_buffers default it has.






Thanks for reading
-- Vasilis Ventirozos
-- credativ LLC



Comments

  1. I have a big buffers_clean and small buffers_checkpoint any idea?
    buffers_checkpoint | buffers_clean | buffers_backend
    2939561 | 19872289 | 6015787

    ReplyDelete
  2. There are checkpoints_timed and checkpoints_req columns in the view, not checkpoint_timed and checkpoint_req.

    ReplyDelete
  3. I think you did an awesome job explaining it. Sure beats having to research it on my own. Thanks
    BCOM 1st, 2nd & 3rd Year Time Table 2020

    ReplyDelete
  4. The main motive of the Big data engineering automation is to spread the knowledge so that they can give more big data engineers to the world.

    ReplyDelete
  5. Feeling good to read such a informative blog, mostly i eagerly search for this kind of blog. I really found your blog informative and unique, waiting for your new blog to read.
    Digital marketing Service in Delhi
    SMM Services
    PPC Services in Delhi
    Website Design & Development Packages
    SEO Services Packages
    Local SEO services
    E-mail marketing services
    YouTube plans

    ReplyDelete
  6. Hi there colleagues, good paragraph and good urging commented here, I am genuinely enjoying by these. Result BA Third Part

    ReplyDelete
  7. Interesting article, thank you so much for sharing this amazing blog with us. Visit Ogen Infosystem for creative Website Designing and SEO Services in Delhi, India.
    SEO Service in Delhi

    ReplyDelete
  8. The actual demand for id verification service is much greater these days. Trust Swiftly is the sole platform that has many verification tactics, and a corporation can bring much better safety to their company by implementing all the verification approaches. In case you explore this particular TRUST SWIFTLY web site, you will definitely get a growing number of information regarding id verification service.

    ReplyDelete
  9. The Pros and Cons of Group Health Insurance By Timothy Hebert | Customs When Completing Individual Health Insurance Application By Dennis Alexander.50 Free Backlink
    2000 Backlink at cheapest
    5000 Backlink at cheapest
    Boost DA upto 15+ at cheapest
    Boost DA upto 25+ at cheapest . Suggest Article Comments Print ArticleShare this article on FacebookShare this article on TwitterShare

    ReplyDelete
  10. Kim Ravida is a lifestyle and business coach who helps women in business take powerful money actions and make solid, productiveIamLinkfeeder IamLinkfeeder IamLinkfeeder IamLinkfeeder IamLinkfeeder IamLinkfeeder IamLinkfeeder IamLinkfeeder IamLinkfeeder IamLinkfeeder

    ReplyDelete
  11. All type services start in Bihar Govt go to see this services and help for filling form RTPS Bihar

    ReplyDelete
  12. very good i like this
    http://google.co.uk/url?q=https://firstgyan.com/

    ReplyDelete
  13. Thank you so much for sharing this amazing information, I really appreciate your hard work for writing this article thanks once again

    Please leave you comment on how to get short in height

    ReplyDelete
  14. I'm usually to blogging and i actually appreciate your content. The article has genuinely peaks my interest. I'm going to bookmark your web page

    ReplyDelete
  15. I did go through the article and would appreciate the efforts of the writer to share such an insightful and well-researched article.
    Best School Management Software In India

    ReplyDelete
  16. Hi I read your blog and found that your blog is full of informative content. So keep posting thanks for share this article. Your article is very amazing. I like your article. thanks you ones again.

    Madurai Kamaraj University Time Talbe 2022
    Shekhawati University Time Table 2022

    ReplyDelete
  17. Cloud Telephony Software
    Excellent information you have shared, thanks for taking the time to share with us such a great article. I really appreciate your work.

    ReplyDelete
  18. Medfemme Fertility is the best Fertility Specialist, Infertility Specialist, High Risk Pregnancy Specialist in Delhi. For more information in details visit our website.
    High Risk Pregnancy Specialist in Delhi

    ReplyDelete
  19. Sarswatienterprises is a trusted Die Set Manufacturers, Power Press Manufacturer, and Flip off Seals Machinery in Delhi, India. For more information visit our website.
    Air Blower Machine Manufacturer in Delhi

    ReplyDelete
  20. Get the best deals on honda acura engines for sale, toyota transmission for sale and jdm lexus engines online at JDM of Washington. For more information in detail visit our website.
    Honda Acura Engines for Sale

    ReplyDelete
  21. Visit Orange Pill Clinic for leading Nursing and Doctor Consultation Services at Home, Home Nursing Services in Delhi NCR, and Nursing Services for Suturing Removal at Home. For more information visit our website.
    Nursing Services for Suturing Removal at Home

    ReplyDelete

Post a Comment

Popular posts from this blog

Accessing PostgreSQL data from AWS Lambda

AWS Aurora Postgres, not a great first impression