@ariejan
Last active December 12, 2015 10:09

Consider the following query:

SELECT "orders".* FROM "orders" 
WHERE "orders"."bid" = 't' 
AND "orders"."region_id" = 10000002 
AND "orders"."type_id" = 30259 
ORDER BY price DESC
LIMIT 1

From a table with about 2.5 million rows, it returns this single result in about 1075ms.

Running the same query with LIMIT 2 brings the run time down to about 12ms.

Here are the EXPLAIN ANALYZE outputs for both queries, LIMIT 1 first:

Limit  (cost=0.00..4187.72 rows=1 width=140) (actual time=1062.737..1062.737 rows=1 loops=1)
  ->  Index Scan Backward using index_orders_on_price on orders  (cost=0.00..6662655.21 rows=1591 width=140) (actual time=1062.736..1062.736 rows=1 loops=1)
        Filter: (bid AND (region_id = 10000002) AND (type_id = 30259))
        Rows Removed by Filter: 598168

And for LIMIT 2:

Limit  (cost=5288.02..5288.02 rows=2 width=140) (actual time=0.922..0.922 rows=2 loops=1)
  ->  Sort  (cost=5288.02..5291.99 rows=1591 width=140) (actual time=0.922..0.922 rows=2 loops=1)
        Sort Key: price
        Sort Method: top-N heapsort  Memory: 25kB
        ->  Index Scan using index_orders_on_bid_and_region_id_and_type_id on orders  (cost=0.00..5272.11 rows=1591 width=140) (actual time=0.044..0.673 rows=1429 loops=1)
              Index Cond: ((bid = true) AND (region_id = 10000002) AND (type_id = 30259))
              Filter: bid

Clearly, using LIMIT 2 makes Postgres decide to sort (a top-N heapsort) over the results of the index_orders_on_bid_and_region_id_and_type_id index. LIMIT 1 does no such thing: it walks index_orders_on_price backward instead.

So, what's happening here? Why does Postgres make this bad query plan decision, and how do I beat Postgres back into submission so that the LIMIT 1 query runs fast as well?
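One workaround that often helps with this shape of query (a sketch, not something tested against this table): give the planner an index that satisfies both the equality filters and the ORDER BY, so the LIMIT 1 plan can read the answer from a single index entry instead of scanning index_orders_on_price backward. The index name below is made up for illustration.

```sql
-- Hypothetical fix: leading columns match the equality filters
-- (bid, region_id, type_id), trailing column matches the sort (price).
-- With this index the filter + ORDER BY + LIMIT can be answered by
-- reading index entries in order, regardless of the LIMIT value.
CREATE INDEX index_orders_on_bid_region_type_price
  ON orders
  USING btree
  (bid, region_id, type_id, price DESC);
```

Another trick sometimes used is to disguise the sort expression (e.g. ORDER BY price + 0 DESC) so the planner cannot use index_orders_on_price for the ordering, but the composite index is the cleaner fix.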

Some more table info:

CREATE TABLE orders
(
  id serial NOT NULL,
  result_type character varying(8),
  price numeric(19,2),
  range integer,
  vol_entered bigint,
  vol_remaining bigint,
  min_volume bigint,
  bid boolean,
  issued_at timestamp without time zone,
  duration integer,
  type_id integer,
  order_id bigint,
  station_id integer,
  solar_system_id integer,
  region_id integer,
  emdr_hash character varying(40),
  generated_at timestamp without time zone,
  created_at timestamp without time zone,
  CONSTRAINT orders_pkey PRIMARY KEY (id)
)
WITH (
  OIDS=FALSE
);

Indices:

CREATE INDEX index_orders_on_bid
  ON orders
  USING btree
  (bid);

CREATE INDEX index_orders_on_bid_and_region_id_and_type_id
  ON orders
  USING btree
  (bid, region_id, type_id);

CREATE INDEX index_orders_on_generated_at
  ON orders
  USING btree
  (generated_at);

CREATE INDEX index_orders_on_price
  ON orders
  USING btree
  (price);

CREATE INDEX index_orders_on_solar_system_id
  ON orders
  USING btree
  (solar_system_id);

CREATE INDEX index_orders_on_type_id
  ON orders
  USING btree
  (type_id);

I also ran VACUUM FULL ANALYZE on the table.

Noteworthy, perhaps: if I change bid to false in the query, I get slow queries up to LIMIT 4, while LIMIT 5 uses the composite index again - and is thus fast.
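To dig into why the planner's choice flips at a particular LIMIT, it can help to compare its row estimates against the statistics it works from. A sketch (the queries below only read plans and catalog views, nothing is modified):

```sql
-- Re-run the slow variant and compare "rows=" in the estimate
-- against the actual row counts; a large gap is what makes the
-- backward scan on index_orders_on_price look cheap to the planner.
EXPLAIN ANALYZE
SELECT "orders".* FROM "orders"
WHERE "orders"."bid" = 'f'
  AND "orders"."region_id" = 10000002
  AND "orders"."type_id" = 30259
ORDER BY price DESC
LIMIT 4;

-- The per-column statistics behind those estimates live in pg_stats:
SELECT attname, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'orders'
  AND attname IN ('bid', 'region_id', 'type_id');
```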
