- Start a Postgres instance, for example
docker run -p 5432:5432 -e POSTGRES_PASSWORD=password -d docker
- Open a psql client shell
psql -h localhost -p 5432 -U postgres -d postgres
Password for user postgres: <enter password>
CREATE TABLE test (object_id TEXT, PRIMARY KEY (object_id));
postgres=# \d+ test;
Table "public.test"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
-----------+------+-----------+----------+---------+----------+-------------+--------------+-------------
object_id | text | | not null | | extended | | |
Indexes:
"test_pkey" PRIMARY KEY, btree (object_id)
Access method: heap
Description: With 1k object_ids, select the middle most object in the table
go run main.go -num_ids=1000 | pbcopy
go run main.go -num_ids=1000 -query | pbcopy
postgres=# EXPLAIN ANALYZE SELECT object_id FROM test WHERE object_id IN ('500');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Index Only Scan using test_pkey on test (cost=0.28..4.29 rows=1 width=3) (actual time=0.039..0.040 rows=1 loops=1)
Index Cond: (object_id = '500'::text)
Heap Fetches: 0
Planning Time: 0.454 ms
Execution Time: 0.076 ms
(5 rows)
Description: With 5k object_ids, select the middle most object in the table
go run main.go -num_ids=5000 | pbcopy
go run main.go -num_ids=5000 -query | pbcopy
postgres=# EXPLAIN ANALYZE SELECT object_id FROM test WHERE object_id IN ('2500');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Index Only Scan using test_pkey on test (cost=0.28..4.30 rows=1 width=4) (actual time=0.067..0.069 rows=1 loops=1)
Index Cond: (object_id = '2500'::text)
Heap Fetches: 0
Planning Time: 0.634 ms
Execution Time: 0.543 ms
(5 rows)
Description: With 10k object_ids, select the middle most object in the table
go run main.go -num_ids=10000 | pbcopy
go run main.go -num_ids=10000 -query | pbcopy
postgres=# EXPLAIN ANALYZE SELECT object_id FROM test WHERE object_id IN ('5000');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Index Only Scan using test_pkey on test (cost=0.29..8.30 rows=1 width=4) (actual time=0.100..0.103 rows=1 loops=1)
Index Cond: (object_id = '5000'::text)
Heap Fetches: 1
Planning Time: 0.218 ms
Execution Time: 0.165 ms
(5 rows)
Description: With 20k object_ids, select the middle most object in the table
go run main.go -num_ids=20000 | pbcopy
go run main.go -num_ids=20000 -query | pbcopy
postgres=# EXPLAIN ANALYZE SELECT object_id FROM test WHERE object_id IN ('10000');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Index Only Scan using test_pkey on test (cost=0.29..8.30 rows=1 width=4) (actual time=0.093..0.095 rows=1 loops=1)
Index Cond: (object_id = '10000'::text)
Heap Fetches: 1
Planning Time: 0.141 ms
Execution Time: 0.128 ms
(5 rows)
Description: With 100k object_ids, select the middle most object in the table
go run main.go -num_ids=100000 | pbcopy
go run main.go -num_ids=100000 -query | pbcopy
EXPLAIN ANALYZE SELECT object_id FROM test WHERE object_id IN ('50000');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Index Only Scan using test_pkey on test (cost=0.42..8.44 rows=1 width=5) (actual time=0.041..0.041 rows=1 loops=1)
Index Cond: (object_id = '50000'::text)
Heap Fetches: 1
Planning Time: 0.130 ms
Execution Time: 0.063 ms
(5 rows)
Summary: Scenarios 1-5 demonstrate that as the input table size grows, selecting from a row in the middle of the b-tree remains relatively amortized in cost due to the logarithmic complexity of the b-tree traversal. This is exactly what we'd expect from a b-tree.
Description: 100k rows, 1 row in the filter matching in the middle of the table, 10 object_ids provided in the predicate filter
go run main.go -num_ids=100000 | pbcopy
go run main.go -num_ids=100000 -list_size=10 -query | pbcopy
EXPLAIN ANALYZE SELECT object_id FROM test WHERE object_id IN ('50000','100000', '100001', '100002', '100003', '100004', '100005', '100006', '100007', '100008', '100009');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Index Only Scan using test_pkey on test (cost=0.42..48.79 rows=11 width=5) (actual time=0.360..0.361 rows=1 loops=1)
Index Cond: (object_id = ANY ('{50000,100000,100001,100002,100003,100004,100005,100006,100007,100008,100009}'::text[]))
Heap Fetches: 0
Planning Time: 0.247 ms
Execution Time: 0.401 ms
(5 rows)
Description: 100k rows, 1 row in the filter matching in the middle of the table, 100 object_ids provided in the predicate filter
go run main.go -num_ids=100000 | pbcopy
go run main.go -num_ids=100000 -list_size=100 -query | pbcopy
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using test_pkey on test (cost=0.42..415.94 rows=101 width=5) (actual time=0.507..0.508 rows=1 loops=1)
Index Cond: (object_id = ANY ('{50000,100000,100001,100002,100003,100004,100005,100006,100007,100008,100009,100010,100011,100012,100013,100014,100015,100016,100017,100018,100019,100020,100021,100022,100023,100024,100025,100026,100027,100028,100029,100030,100031,100032,100033,100034,100035,100036,100037,100038,100039,100040,100041,100042,100043,100044,100045,100046,100047,100048,100049,100050,100051,100052,100053,100054,100055,100056,100057,100058,100059,100060,100061,100062,100063,100064,100065,100066,100067,100068,100069,100070,100071,100072,100073,100074,100075,100076,100077,100078,100079,100080,100081,100082,100083,100084,100085,100086,100087,100088,100089,100090,100091,100092,100093,100094,100095,100096,100097,100098,100099}'::text[]))
Heap Fetches: 0
Planning Time: 0.166 ms
Execution Time: 0.529 ms
(5 rows)
Description: 100k rows, 1 row in the filter matching in the middle of the table, 1000 object_ids provided in the predicate filter
go run main.go -num_ids=100000 | pbcopy
go run main.go -num_ids=100000 -list_size=1000 -query | pbcopy
Seq Scan on test (cost=2.50..1945.50 rows=1001 width=5) (actual time=3.322..9.139 rows=1 loops=1)
Filter: (object_id = ANY ('{50000,100000,100001,100002,100003,100004,100005,100006,100007,100008,100009,100010,100011,100012,100013,100014,100015,100016,100017,100018,100019,100020,100021,100022,100023,100024,100025,100026,100027,100028,100029,100030,100031,100032,100033,100034,100035,100036,100037,100038,100039,100040,100041,100042,100043,100044,100045,100046,100047,100048,100049,100050,100051,100052,100053,100054,100055,100056,100057,100058,100059,100060,100061,100062,100063,100064,100065,100066,100067,100068,100069,100070,100071,100072,100073,100074,100075,100076,100077,100078,100079,100080,100081,100082,100083,100084,100085,100086,100087,100088,100089,100090,100091,100092,100093,100094,100095,100096,100097,100098,100099,100100,100101,100102,100103,100104,100105,100106,100107,100108,100109,100110,100111,100112,100113,100114,100115,100116,100117,100118,100119,100120,100121,100122,100123,100124,100125,100126,100127,100128,100129,100130,100131,100132,100133,100134,100135,100136,100137,100138,100139,100140,100141,100142,100143,100144,100145,100146,100147,100148,100149,100150,100151,100152,100153,100154,100155,100156,100157,100158,100159,100160,100161,100162,100163,100164,100165,100166,100167,100168,100169,100170,100171,100172,100173,100174,100175,100176,100177,100178,100179,100180,100181,100182,100183,100184,100185,100186,100187,100188,100189,100190,100191,100192,100193,100194,100195,100196,100197,100198,100199,100200,100201,100202,100203,100204,100205,100206,100207,100208,100209,100210,100211,100212,100213,100214,100215,100216,100217,100218,100219,100220,100221,100222,100223,100224,100225,100226,100227,100228,100229,100230,100231,100232,100233,100234,100235,100236,100237,100238,100239,100240,100241,100242,100243,100244,100245,100246,100247,100248,100249,100250,100251,100252,100253,100254,100255,100256,100257,100258,100259,100260,100261,100262,100263,100264,100265,100266,100267,100268,100269,100270,100271,100272,100273,100274,100275,100276,100277,100278,100279,100280,100281,100282,100283,100284,100285,100286,100287,100288,100289,100290,100291,100292,100293,100294,100295,100296,100297,100298,100299,100300,100301,100302,100303,100304,100305,100306,100307,100308,100309,100310,100311,100312,100313,100314,100315,100316,100317,100318,100319,100320,100321,100322,100323,100324,100325,100326,100327,100328,100329,100330,100331,100332,100333,100334,100335,100336,100337,100338,100339,100340,100341,100342,100343,100344,100345,100346,100347,100348,100349,100350,100351,100352,100353,100354,100355,100356,100357,100358,100359,100360,100361,100362,100363,100364,100365,100366,100367,100368,100369,100370,100371,100372,100373,100374,100375,100376,100377,100378,100379,100380,100381,100382,100383,100384,100385,100386,100387,100388,100389,100390,100391,100392,100393,100394,100395,100396,100397,100398,100399,100400,100401,100402,100403,100404,100405,100406,100407,100408,100409,100410,100411,100412,100413,100414,100415,100416,100417,100418,100419,100420,100421,100422,100423,100424,100425,100426,100427,100428,100429,100430,100431,100432,100433,100434,100435,100436,100437,100438,100439,100440,100441,100442,100443,100444,100445,100446,100447,100448,100449,100450,100451,100452,100453,100454,100455,100456,100457,100458,100459,100460,100461,100462,100463,100464,100465,100466,100467,100468,100469,100470,100471,100472,100473,100474,100475,100476,100477,100478,100479,100480,100481,100482,100483,100484,100485,100486,100487,100488,100489,100490,100491,100492,100493,100494,100495,100496,100497,100498,100499,100500,100501,100502,100503,100504,100505,100506,100507,100508,100509,100510,100511,100512,100513,100514,100515,100516,100517,100518,100519,100520,100521,100522,100523,100524,100525,100526,100527,100528,100529,100530,100531,100532,100533,100534,100535,100536,100537,100538,100539,100540,100541,100542,100543,100544,100545,100546,100547,100548,100549,100550,100551,100552,100553,10055
4,100995,100996,100997,100998,100999}'::text[]))
Rows Removed by Filter: 99999
Planning Time: 0.538 ms
Execution Time: 9.365 ms
If roughly 5-10% of all rows in the table have to be evaluated, then a sequential scan is much faster than an index scan because of the sequential access pattern to disk. This isn't always true, but in general it is if you assume a higher cache ratio for sequential access. See https://www.postgresql.org/docs/current/runtime-config-query.html for more info on 'random_page_cost'. The query planner here determines that a sequential scan of the table is actually lower cost than the alternative index scan as we saw previously.
Summary: Scenario 6-8 demonstrate that, for a given sparse query (that is one with a single matching predicate in the middle of the b-tree and all others non-matching predicates), the query planner and cost of the query degrades with respect to the range query and the b-tree on the primary table. Care should be taken to ensure we're serving index scans instead of sequence scans in these read query patterns, which practically means that the length of these predicate lists in these range queries should be relatively low ratio of the number of tuples in that FGA store. So long as that is the case we'll see solid logarithmic query performance out of the query engine.