This morning I was working on getting some statistics on average transaction open time on a relatively underutilized host when I noticed some oddities. Periodically, I would see that my average duration of open transactions would dip below zero seconds. As I looked into it, I came across the following result from a simple query to pg_stat_activity:
postgres=# SELECT xact_start, NOW() AS now FROM pg_stat_activity WHERE xact_start IS NOT NULL; -[ RECORD 1 ]---------------------------- xact_start | 2012-09-11 19:33:41.253139+00 <-- Note this happens AFTER NOW() (i.e. in the future) now | 2012-09-11 19:33:41.252795+00 ...
Naturally, PostgreSQL isn’t predicting future queries, but it is obvious that there are some issues with recorded vs. reported transaction start times. Digging a bit deeper into src/backend/access/transam/xact.c and src/backend/postmaster/pgstat.c, it became clear to me why this behavior manifests:
1743 /* 1744 * set transaction_timestamp() (a/k/a now()). We want this to be the same 1745 * as the first command's statement_timestamp(), so don't do a fresh 1746 * GetCurrentTimestamp() call (which'd be expensive anyway). Also, mark 1747 * xactStopTimestamp as unset. 1748 */ 1749 xactStartTimestamp = stmtStartTimestamp; 1750 xactStopTimestamp = 0; 1751 pgstat_report_xact_timestamp(xactStartTimestamp);
Of course, this appears simply to be an implementation detail that probably won’t affect most users, but it’s certainly something to be aware of! For me, the solution to this is to simply normalize all negative intervals to zero and use that to calculate average current transaction duration.