Update: 2023/11/30 Bruce Momjian's take on things: https://momjian.us/main/blogs/pgblog/2023.html#November_22_2023
Given a table...
CREATE TABLE foo (
id SERIAL PRIMARY KEY,
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
...
);
...you realize that you have to perform complex queries with the following type of condition:
WHERE ...
AND DATE(created_at) = DATE('2013-01-01') ...
The table is big and other indexes are not helping, so you decide to create an index on DATE(created_at)
:
postgres=> CREATE INDEX ON foo (DATE(created_at));
ERROR: functions in index expression must be marked IMMUTABLE
It turns out that the function DATE(TIMESTAMP WITH TIME ZONE)
is mutable, due to the time zone.
On the other hand DATE(TIMESTAMP WITHOUT TIME ZONE)
is immutable, as shown in the following example:
postgres=> CREATE TABLE test (created_at TIMESTAMP WITHOUT TIME ZONE DEFAULT NOW());
CREATE TABLE
postgres=> CREATE INDEX ON test (date(created_at));
CREATE INDEX
So how do we handle the case where we have a time zone? Well, since the DATE()
function may give different results based on the time zone, we have to decide which time zone is applicable. In this example we use UTC:
postgres=> CREATE INDEX ON foo (DATE(created_at AT TIME ZONE 'UTC'));
CREATE INDEX
That worked. Now how do get the query planner to use it? I was under the impression that I simply had to use the same function I used to create the index in the WHERE
condition. Let's see what the query planner says:
postgres=> EXPLAIN SELECT * FROM foo WHERE DATE(created_at AT TIME ZONE 'UTC') = DATE('2013-01-01');
QUERY PLAN
----------------------------------------------------------------------------------
Seq Scan on foo (cost...)
Filter: (date(timezone('UTC'::text, created_at)) = '2013-01-01'::date)
The index is not used. Let's go look at the index definition:
postgres=> \d foo_date_idx
Index "public.foo_date_idx"
Column | Type | Definition
--------+------+-----------------------------------------
date | date | date(timezone('UTC'::text, created_at))
btree, for table "public.foo"
Let's try to use the definition as stored by the database. Note that this is (and should be) functionally equivalent to DATE(created_at AT TIME ZONE 'UTC')
, since the documentation says:
The function timezone(zone, timestamp) is equivalent to the SQL-conforming construct timestamp AT TIME ZONE zone.
postgres=> EXPLAIN SELECT * FROM foo WHERE DATE(TIMEZONE('UTC'::text, created_at)) = DATE('2013-01-01');
QUERY PLAN
--------------------------------------------------------------------------------------------
Bitmap Heap Scan on foo (cost...)
Recheck Cond: (date(timezone('UTC'::text, created_at)) = '2013-01-01'::date)
-> Bitmap Index Scan on foo_date_idx (cost...)
Index Cond: (date(timezone('UTC'::text, created_at)) = '2013-01-01'::date)
This seems to have done the trick. It is strange though that PostgreSQL rewrites the function used to create the index to a canonical form, but does not seem to do the same when the function is used in the WHERE
clause (in order to match the index function).
Footnote: PostgreSQL 9.1.9 was used.
wow excellent post