Monday, October 31, 2016

Module Monday: spi/timetravel

For pg_sentinel I used a SPI function for convenience. After that I decided to refresh my SPI knowledge and discovered the contrib/spi module that comes with PostgreSQL, especially timetravel, which I will introduce today.

tl;dr

Timetravel transparently morphs UPDATEs to INSERTs and DELETEs to UPDATEs together with timestamps to preserve DML changes made to a table. Thus, the state of the data in the table can be reproduced for any given point in time since the table was put under timetravel control.

First, timetravel must be installed in the database:

create extension timetravel;

Now, a test table is created

CREATE TABLE baz (
id INTEGER,
val INTEGER,
PRIMARY KEY (id));

Let's INSERT and UPDATE a single row.

INSERT INTO baz(id,val) VALUES(1,0);

UPDATE baz SET val=1 WHERE id=1;
UPDATE baz SET val=2 WHERE id=1;
UPDATE baz SET val=3 WHERE id=1;

A SELECT reveals that only the last UPDATE survived, which is the normal behaviour.

SELECT * FROM baz;

Now, in order to enable timetravel capabilities on table baz, it has to be changed a bit.

TRUNCATE TABLE baz;

ALTER TABLE baz ADD COLUMN v_start ABSTIME;
ALTER TABLE baz ADD COLUMN v_end ABSTIME;
ALTER TABLE baz DROP CONSTRAINT baz_pkey;
ALTER TABLE baz ADD PRIMARY KEY (id, v_end);

Now, for every row, two timestamp columns define the timespan for which this row was valid. These columns can be arbitrarily named, but have to be of type ABSTIME. And the primary key must be extended with the 'end' timestamp. Otherwise primary key duplication violations will occur.

Now, baz can be put under timetravel control:

CREATE TRIGGER baz_timetravel
BEFORE INSERT OR DELETE OR UPDATE ON baz
FOR EACH ROW
EXECUTE PROCEDURE
timetravel (v_start, v_end);

Again, we INSERT and UPDATE a single row.

INSERT INTO baz(id,val) VALUES(1,0);

UPDATE baz SET val=1 WHERE id=1;
UPDATE baz SET val=2 WHERE id=1;
UPDATE baz SET val=3 WHERE id=1;

Now, a SELECT reveals what timetravel does - for every UPDATE a new row was INSERTed and timestamped.

SELECT * FROM baz;

See, all values were persisted. The row of the last UPDATE has a 'end' timestamp of infinity, indicating that this is the current value of the row.

Likewise, a DELETE becomes an UPDATE, setting the 'end' timestamp to the time when the DELETE happened.

DELETE FROM baz WHERE id=1 AND val=3;

SELECT * FROM baz;

If you try to UPDATE or DELETE a non active row, timetravel intercepts it with a no-op. Only active rows with a 'end' of infinity are mutable.

Timetravel can be switched on and off per table with SELECT set_timetravel('table_name', 1) and SELECT set_timetravel('table_name', 0) respectively, while SELECT get_timetravel('table_name') shows the current status.

And that's basically it. However, timetravel has a few shortcomings:
  • Tables and primary keys have to be changed
  • Does not protect against TRUNCATE TABLE
  • Needs additional disk space
  • The timestamps have to be ABSTIME, which is deprecated according to the PostgreSQL documentation
  • You can only travel backwards, not into the future ;-)
To be continued every Monday whenever my spare time permits...

No comments:

Post a Comment