Adding columns with default value fast
I recently came across a discussion in #postgresql channel about how to add a new column that will represent record insert timestamps to a table with minimal locking.
Normally you would do something like this :
monkey=# alter table alter_test add column insert_timestamp timestamp without time zone default now();
ALTER TABLE
Time: 13643.983 ms
monkey=# create index concurrently insert_timestamp_idx on alter_test (insert_timestamp) where insert_timestamp is not null;
CREATE INDEX
Time: 1616.108 ms
This though , would require an AccessExclusiveLock on the table and the application would have to wait for the duration of alter.
What you could do instead is :
monkey=# alter table alter_test add column insert_timestamp timestamp without time zone;
ALTER TABLE
Time: 1.135 ms
monkey=# alter table alter_test alter column insert_timestamp set default now();
ALTER TABLE
Time: 1.238 ms
monkey=# create index concurrently insert_timestamp_idx on alter_test (insert_timestamp) where insert_timestamp is not null;
CREATE INDEX
Time: 1618.788 ms
As you can see the timings are completely different, in the first case the database has to change data in all pre-existing rows, which isn't really needed since the value will be dummy (it will default to the transaction time).
The second way will also require an AccessExclusiveLock but it wont have to change any data ignoring the old rows so the lock should be almost instant (assuming no blockers exist),
next step you alter the table adding the default value and from now on the new column will have the default value. If you don't want to have nulls, or you want to add a not null constraint to the column, you can update the old (null) rows in a more progressive and non intrusive way later on.
Thanks for reading
- Vasilis Ventirozos
Normally you would do something like this :
monkey=# alter table alter_test add column insert_timestamp timestamp without time zone default now();
ALTER TABLE
Time: 13643.983 ms
monkey=# create index concurrently insert_timestamp_idx on alter_test (insert_timestamp) where insert_timestamp is not null;
CREATE INDEX
Time: 1616.108 ms
This though , would require an AccessExclusiveLock on the table and the application would have to wait for the duration of alter.
What you could do instead is :
monkey=# alter table alter_test add column insert_timestamp timestamp without time zone;
ALTER TABLE
Time: 1.135 ms
monkey=# alter table alter_test alter column insert_timestamp set default now();
ALTER TABLE
Time: 1.238 ms
monkey=# create index concurrently insert_timestamp_idx on alter_test (insert_timestamp) where insert_timestamp is not null;
CREATE INDEX
Time: 1618.788 ms
As you can see the timings are completely different, in the first case the database has to change data in all pre-existing rows, which isn't really needed since the value will be dummy (it will default to the transaction time).
The second way will also require an AccessExclusiveLock but it wont have to change any data ignoring the old rows so the lock should be almost instant (assuming no blockers exist),
next step you alter the table adding the default value and from now on the new column will have the default value. If you don't want to have nulls, or you want to add a not null constraint to the column, you can update the old (null) rows in a more progressive and non intrusive way later on.
Thanks for reading
- Vasilis Ventirozos
And if you want add a not null column:
ReplyDeletemonkey=# alter table alter_test add column insert_timestamp timestamp without time zone;
monkey=# update alter_test SET insert_timestamp = now();
monkey=# alter table alter_test alter column insert_timestamp set default now();
monkey=# alter table alter_test alter column insert_timestamp set not null;
(Inside a transaction obviously)
Deletethanks for valuable info
ReplyDelete< a herf="https://gcpmasters.in/gcp-data-engineer-training-in-hyderabad/">gcp data engineer trainin g in hyderabad