If you're not familiar with BRIN, check these easy-to-follow articles: https://www.crunchydata.com/blog/postgres-indexing-when-does-brin-win https://medium.com/geekculture/postgres-brin-index-large-data-performance-with-minimal-storage-4db6b...
Start a postgres instance and connect to it with psql
or your favorite tool. And then follow the steps below.
-
Create a table and fill in with dummy date:
create table sensor (id int, value int, received_time timestamp); CREATE OR REPLACE FUNCTION random_between(low INT ,high INT) RETURNS INT AS $$ BEGIN RETURN floor(random()* (high-low + 1) + low); END; $$ language 'plpgsql' STRICT;
-
Generate dummy data:
insert into sensor select random_between (1,500), /* generating random sensor ID */ random_between (0,200), /* generating random value reported by a sensor */ generate_series('2022-10-24'::timestamp,'2023-10-24'::timestamp, '1 minute') /* generate time when the measurement is reported */; /* refreshing statistics */ analyze sensor;
-
Querying data for a specific range. Postgres does a full scan:
explain analyze select * from sensor where received_time between '2022-12-12'::timestamp and '2023-2-15'::timestamp; Seq Scan on sensor (cost=0.00..10726.01 rows=94053 width=16) (actual time=4.273..37.238 rows=93601 loops=1) Filter: ((received_time >= '2022-12-12 00:00:00'::timestamp without time zone) AND (received_time <= '2023-02-15 00:00:00'::timestamp without time zone)) Rows Removed by Filter: 432000 Planning Time: 9.145 ms Execution Time: 40.373 ms
-
Create a Btree index and re-run the query (two times faster):
create index time_btree_idx on sensor(received_time); explain analyze select * from sensor where received_time between '2022-12-12'::timestamp and '2023-2-15'::timestamp; ----------------------------------------------------------------------------------------------------------------------------------------------------------------- Index Scan using time_btree_idx on sensor (cost=0.42..3333.73 rows=91365 width=16) (actual time=0.053..13.992 rows=93601 loops=1) Index Cond: ((received_time >= '2022-12-12 00:00:00'::timestamp without time zone) AND (received_time <= '2023-02-15 00:00:00'::timestamp without time zone)) Planning Time: 0.185 ms Execution Time: 18.116 ms
-
Get the size of the index:
select * from pg_size_pretty(pg_table_size('time_btree_idx')); pg_size_pretty ---------------- 11 MB
-
Now, create a BRIN index and execute the same query:
drop index time_btree_idx; create index time_brin_idx on sensor using brin(received_time); explain analyze select * from sensor where received_time between '2022-12-12'::timestamp and '2023-2-15'::timestamp; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on sensor (cost=35.13..4591.04 rows=91871 width=16) (actual time=2.545..12.784 rows=93601 loops=1) Recheck Cond: ((received_time >= '2022-12-12 00:00:00'::timestamp without time zone) AND (received_time <= '2023-02-15 00:00:00'::timestamp without time zone)) Rows Removed by Index Recheck: 24799 Heap Blocks: lossy=640 -> Bitmap Index Scan on time_brin_idx (cost=0.00..12.16 rows=114261 width=0) (actual time=0.064..0.064 rows=6400 loops=1) Index Cond: ((received_time >= '2022-12-12 00:00:00'::timestamp without time zone) AND (received_time <= '2023-02-15 00:00:00'::timestamp without time zone)) Planning Time: 3.110 ms Execution Time: 16.506 ms
-
And check the BRIN size:
select * from pg_size_pretty(pg_table_size('time_brin_idx')); pg_size_pretty ---------------- 48 kB (1 row)