Postgres EXPLAIN Lunch & Learn @ BenchPrep

EXPLAIN Explained video on YouTube

What EXPLAIN doesn't do

Tell you why a particular index isn't used
Explain how to rewrite your queries
Show what other factors make the DB slow
Tell you how much time the request took outside the DB

The query planner

Break the query down into atomic "nodes"
Figure out every possible way to execute each "node" and its cost
Chain the combinations together into plans
Calculate the total "cost" of each plan
Pick the plan with the lowest "cost"
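
The steps above can be sketched as a toy model in Ruby. Everything here is an assumption made for illustration: the node names, the candidate strategies, and the cost numbers are invented, and the real planner prunes its search rather than enumerating every combination.

```ruby
# Toy model of plan selection. Strategies and costs are invented for illustration.
CANDIDATES = {
  scan_users: { "Seq Scan" => 458.0, "Index Scan" => 8.3 },
  scan_posts: { "Seq Scan" => 1200.0, "Bitmap Heap Scan" => 310.5 },
  join:       { "Nested Loop" => 95.0, "Hash Join" => 40.0 }
}.freeze

# Chain one strategy per node into every possible plan and total its cost.
def enumerate_plans(candidates)
  names = candidates.keys
  first, *rest = names.map { |name| candidates[name].to_a }
  first.product(*rest).map do |choices|
    {
      steps: names.zip(choices.map(&:first)).to_h,
      cost: choices.sum { |(_strategy, cost)| cost }
    }
  end
end

# Pick the plan with the lowest total cost.
def cheapest_plan(candidates)
  enumerate_plans(candidates).min_by { |plan| plan[:cost] }
end

best = cheapest_plan(CANDIDATES)
puts best[:steps].inspect
puts best[:cost]
```

With these made-up numbers the cheapest combination is the index scan, the bitmap heap scan, and the hash join.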

Nodes

(Rowset + Parameters) =(Node Operation)=> Rowset

Sort node

(Scan rows + Sort columns) =(Sort)=> Sorted rows

$cost$

A floating-point figure in completely arbitrary cost units, intended to represent the estimated time and system resources required to execute the query, based on dubious theory and proven practice.

(some people adjust seq_page_cost)
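
As a rough illustration of where a cost figure comes from, here is a minimal sketch of the sequential-scan estimate: pages read times seq_page_cost, plus a per-row CPU charge. The parameter values are the PostgreSQL defaults; the table size (358 pages, 10000 rows) and the method name `seq_scan_cost` are assumptions chosen for the example.

```ruby
# Planner cost parameters at their PostgreSQL default values.
SEQ_PAGE_COST     = 1.0
CPU_TUPLE_COST    = 0.01
CPU_OPERATOR_COST = 0.0025

# Estimated total cost of a sequential scan.
# filter_ops is the number of operators evaluated per row in the WHERE clause.
def seq_scan_cost(pages:, rows:, filter_ops: 0, seq_page_cost: SEQ_PAGE_COST)
  pages * seq_page_cost +
    rows * CPU_TUPLE_COST +
    rows * filter_ops * CPU_OPERATOR_COST
end

# Hypothetical table: 358 pages holding 10000 rows.
puts seq_scan_cost(pages: 358, rows: 10_000)
puts seq_scan_cost(pages: 358, rows: 10_000, filter_ops: 1)

# Raising seq_page_cost makes the same scan look more expensive to the planner.
puts seq_scan_cost(pages: 358, rows: 10_000, seq_page_cost: 2.0)
```

The units only matter relative to each other, which is why changing seq_page_cost shifts the planner's choice between plans rather than changing how fast anything actually runs.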

`EXPLAIN` vs `EXPLAIN ANALYZE`

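EXPLAIN shows what the planner plans to do, without running the query.
EXPLAIN ANALYZE actually executes the query, then shows the plan alongside what really happened (actual times and row counts).
Because EXPLAIN ANALYZE really runs the statement, wrap an INSERT, UPDATE, or DELETE in a transaction and roll it back.
Most of the time you want to run EXPLAIN ANALYZE unless:
- The query never finishes.
- You're just checking index usage.
- You're just checking for partition scans.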

Other `EXPLAIN` options

buffers
costs & timing
format (text, yaml, xml, json)
verbose

explain (analyze on, verbose on, format yaml)
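
When the options are toggled from application code, a small helper can assemble the option list. This is a sketch only: `explain_sql` is a hypothetical helper, not part of any library, the `users` table is made up, and the SQL string is assumed to be trusted.

```ruby
# Hypothetical helper: prefix a trusted SQL string with an EXPLAIN option list.
BOOLEAN_OPTIONS = %i[analyze verbose costs buffers timing].freeze
FORMATS = %i[text xml json yaml].freeze

def explain_sql(sql, format: :text, **flags)
  unknown = flags.keys - BOOLEAN_OPTIONS
  raise ArgumentError, "unknown option(s): #{unknown.join(', ')}" unless unknown.empty?
  raise ArgumentError, "unknown format: #{format}" unless FORMATS.include?(format)

  # Each boolean flag becomes "NAME ON" or "NAME OFF", in the order given.
  options = flags.map { |name, on| "#{name.to_s.upcase} #{on ? 'ON' : 'OFF'}" }
  options << "FORMAT #{format.to_s.upcase}"
  "EXPLAIN (#{options.join(', ')}) #{sql}"
end

puts explain_sql("SELECT * FROM users WHERE id = 1",
                 analyze: true, verbose: true, format: :yaml)
```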

NOTE

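The EXPLAIN output is a tree printed with the root at the top. Read it from the inside out: the most deeply indented nodes produce rows first and feed them to the node above.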
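
One way to practice the inside-out reading order is to walk a FORMAT JSON plan children-first. The plan below is hand-written in the shape of that output (the "Plan", "Node Type", "Plans", and "Relation Name" keys are real, the tree itself is invented), and `reading_order` is a hypothetical helper. The real executor interleaves nodes, so treat this as a reading aid rather than an execution trace.

```ruby
require "json"

# Hand-written plan in the shape of EXPLAIN (FORMAT JSON) output; not real output.
PLAN_JSON = <<~JSON
  [{"Plan": {
    "Node Type": "Sort",
    "Plans": [{
      "Node Type": "Hash Join",
      "Plans": [
        {"Node Type": "Seq Scan", "Relation Name": "posts"},
        {"Node Type": "Hash",
         "Plans": [{"Node Type": "Index Scan", "Relation Name": "users"}]}
      ]
    }]
  }}]
JSON

# Visit children before their parent, so the innermost nodes come first.
def reading_order(node, depth = 0, out = [])
  node.fetch("Plans", []).each { |child| reading_order(child, depth + 1, out) }
  label = [node["Node Type"], node["Relation Name"]].compact.join(" on ")
  out << [depth, label]
end

root = JSON.parse(PLAN_JSON).first.fetch("Plan")
reading_order(root).each { |depth, label| puts "#{'  ' * depth}#{label}" }
```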

Node Types

Scans

Most will contain a filter condition

Seq Scan

Full-table scan

Index Scan

Scan index
Look up tuples in table

Index-Only Scan

Scan index
Skip the heap when the visibility map marks the page all-visible

Bitmap Scan

Bitmap Heap Scan, Bitmap Index Scan

Scan index
Build bitmap of matching rows
Fetch those rows from the heap, one page at a time
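
The two bitmap steps can be sketched with a toy heap and index. The data, the `[page, slot]` row locations, and the method name `bitmap_scan` are all invented for illustration; a real bitmap is a compact bit structure, not a Ruby hash.

```ruby
# Toy heap: page number => rows stored on that page (invented data).
HEAP = {
  0 => [{ id: 1, city: "Oslo" }, { id: 2, city: "Lima" }],
  1 => [{ id: 3, city: "Oslo" }, { id: 4, city: "Rome" }],
  2 => [{ id: 5, city: "Lima" }, { id: 6, city: "Oslo" }]
}.freeze

# Toy index on city: value => list of [page, slot] row locations.
INDEX = Hash.new { |hash, key| hash[key] = [] }
HEAP.each do |page, rows|
  rows.each_with_index { |row, slot| INDEX[row[:city]] << [page, slot] }
end

def bitmap_scan(cities)
  # Bitmap Index Scan: collect matching row locations, grouped by page.
  bitmap = Hash.new { |hash, key| hash[key] = [] }
  cities.each do |city|
    INDEX.fetch(city, []).each { |page, slot| bitmap[page] << slot }
  end

  # Bitmap Heap Scan: visit each page once, in physical order.
  bitmap.keys.sort.flat_map do |page|
    bitmap[page].sort.map { |slot| HEAP[page][slot] }
  end
end

puts bitmap_scan(%w[Oslo Lima]).map { |row| row[:id] }.inspect
```

Grouping by page is the point of the design: each heap page is read once, however many matching rows it holds.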

Joins

NestLoop

For each row in Set A:
- Look up rows in Set B

# Essentially...
for each row in input_a
  for each row in input_b
    do something
  next
next
# Ref: https://wiki.postgresql.org/wiki/Introduction_to_VACUUM,_ANALYZE,_EXPLAIN,_and_COUNT
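
A runnable Ruby version of the same loop, as a minimal sketch over arrays of hashes. The `users` and `posts` rows and the method name `nested_loop_join` are invented for illustration.

```ruby
# Nested loop join: for every row of the outer input, scan the whole inner input.
def nested_loop_join(outer, inner, outer_key, inner_key)
  outer.each_with_object([]) do |outer_row, out|
    inner.each do |inner_row|
      out << outer_row.merge(inner_row) if outer_row[outer_key] == inner_row[inner_key]
    end
  end
end

users = [{ user_id: 1, name: "ann" }, { user_id: 2, name: "bob" }]
posts = [
  { post_id: 10, author_id: 1 },
  { post_id: 11, author_id: 1 },
  { post_id: 12, author_id: 3 }
]

nested_loop_join(users, posts, :user_id, :author_id).each { |row| puts row.inspect }
```

The work grows with the product of the two input sizes, which is why the loop count matters so much when this node shows up in a plan.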

Hash Join

Build a hash table from Set A (usually the smaller input):
- Probe it with each row of Set B to find matches
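
A minimal Ruby sketch of the build and probe phases, over arrays of hashes. The sample rows and the method name `hash_join` are invented for illustration.

```ruby
# Hash join: build a hash table on one input, then probe it with the other.
def hash_join(build, probe, build_key, probe_key)
  # Build phase: group the build-side rows by their join key.
  table = build.group_by { |row| row[build_key] }

  # Probe phase: one hash lookup per probe-side row.
  probe.flat_map do |probe_row|
    table.fetch(probe_row[probe_key], []).map { |build_row| build_row.merge(probe_row) }
  end
end

users = [{ user_id: 1, name: "ann" }, { user_id: 2, name: "bob" }]
posts = [
  { post_id: 10, author_id: 1 },
  { post_id: 11, author_id: 2 },
  { post_id: 12, author_id: 3 }
]

hash_join(users, posts, :user_id, :author_id).each { |row| puts row.inspect }
```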

Merge Join

Sort both Set A and Set B (unless they already arrive sorted)
- Then "zip" them together
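
A minimal Ruby sketch of the sort-then-zip idea, over arrays of hashes. The sample rows and the method name `merge_join` are invented for illustration; the sketch always sorts, where a real plan can skip the sort if the input is already ordered.

```ruby
# Merge join: sort both inputs on the join key, then advance two cursors in step.
def merge_join(left, right, left_key, right_key)
  l = left.sort_by { |row| row[left_key] }
  r = right.sort_by { |row| row[right_key] }
  out = []
  i = j = 0

  while i < l.size && j < r.size
    lk = l[i][left_key]
    rk = r[j][right_key]
    if lk < rk
      i += 1
    elsif lk > rk
      j += 1
    else
      # Find the run of equal keys on each side and pair them all up.
      i_end = i
      i_end += 1 while i_end < l.size && l[i_end][left_key] == lk
      j_end = j
      j_end += 1 while j_end < r.size && r[j_end][right_key] == rk
      l[i...i_end].each { |a| r[j...j_end].each { |b| out << a.merge(b) } }
      i = i_end
      j = j_end
    end
  end
  out
end

users = [{ user_id: 2, name: "bob" }, { user_id: 1, name: "ann" }]
posts = [
  { post_id: 10, author_id: 1 },
  { post_id: 12, author_id: 3 },
  { post_id: 11, author_id: 1 },
  { post_id: 13, author_id: 2 }
]

merge_join(users, posts, :user_id, :author_id).each { |row| puts row.inspect }
```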

Lateral Join

Join against table expressions

Semi-Join

Partial join, such as for EXISTS or IN: return each row of Set A that has at least one match in Set B

Anti-Join

Exclusion, such as for NOT EXISTS: return each row of Set A that has no match in Set B
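
Semi-joins and anti-joins differ from ordinary joins in that each row of Set A comes out at most once and no columns from Set B are added. A minimal Ruby sketch, with invented sample rows and hypothetical method names:

```ruby
require "set"

# Semi-join: keep rows of A that have at least one match in B.
def semi_join(a, b, a_key, b_key)
  keys = b.map { |row| row[b_key] }.to_set
  a.select { |row| keys.include?(row[a_key]) }
end

# Anti-join: keep rows of A that have no match in B.
def anti_join(a, b, a_key, b_key)
  keys = b.map { |row| row[b_key] }.to_set
  a.reject { |row| keys.include?(row[a_key]) }
end

users = [{ user_id: 1, name: "ann" }, { user_id: 2, name: "bob" }]
posts = [{ post_id: 10, author_id: 1 }, { post_id: 11, author_id: 1 }]

puts semi_join(users, posts, :user_id, :author_id).inspect
puts anti_join(users, posts, :user_id, :author_id).inspect
```

Note that "ann" appears once in the semi-join result even though she has two posts.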

Aggregates

GroupAggregate

Regular aggregation over input sorted by the grouping columns

HashAggregate

In-memory aggregate using hash tables
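
The difference between the two aggregate strategies can be sketched with a SUM per group. The sample rows and method names are invented for illustration.

```ruby
# GroupAggregate: input must arrive sorted by the group key. A group is finished
# as soon as the key changes, so only the current running total is open.
def group_aggregate(sorted_rows, key, value)
  out = []
  sorted_rows.each do |row|
    out << [row[key], 0] if out.empty? || out.last[0] != row[key]
    out.last[1] += row[value]
  end
  out
end

# HashAggregate: input order does not matter. One running total per key
# is kept in a hash table for the whole scan.
def hash_aggregate(rows, key, value)
  rows.each_with_object(Hash.new(0)) { |row, totals| totals[row[key]] += row[value] }
end

rows = [
  { city: "Lima", amount: 5 },
  { city: "Oslo", amount: 2 },
  { city: "Lima", amount: 1 }
]

puts group_aggregate(rows.sort_by { |row| row[:city] }, :city, :amount).inspect
puts hash_aggregate(rows, :city, :amount).inspect
```

This is why a GroupAggregate often sits on top of a Sort or an ordered Index Scan, while a HashAggregate can read straight from an unordered scan.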

WindowAgg

Used for window functions (OVER clause expressions)

Other operations

Unique

De-duplicate rows, like for DISTINCT or UNION

Sort

Sort rows, like for ORDER BY

Limit

Supports LIMIT/OFFSET

CTEScan

Joins a CTE (WITH clause) to the main query

SubqueryScan

Same as CTEScan but for subqueries

Materialize

Create a recordset in-memory from a query fragment

Append

Merge rowsets for UNION ALL and partitions

Hints when reading `EXPLAIN`

Seq Scan: often means an index may be needed on the scanned columns
Index Scan: when unexpectedly slow, can mean there was a bad row estimate (?)
Nested Loop: look at time consumption, rows, and loop count
Index Scan: look at time consumption, rows, and loop count
Actual time and rows are reported per loop, so multiply by the loop count to get totals
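
The loop count matters because EXPLAIN ANALYZE reports actual time and rows as per-loop averages. A minimal sketch of the arithmetic, using a hand-written node in the shape of EXPLAIN (ANALYZE, FORMAT JSON) output: the key names are real, the numbers are invented, and the helper names are hypothetical.

```ruby
# Hand-written node in the shape of EXPLAIN (ANALYZE, FORMAT JSON) output.
# The numbers are invented for illustration.
node = {
  "Node Type" => "Index Scan",
  "Plan Rows" => 1,
  "Actual Rows" => 40,
  "Actual Total Time" => 0.25,
  "Actual Loops" => 2000
}

# Time and rows are per-loop averages, so scale them by the loop count.
def total_time_ms(node)
  node["Actual Total Time"] * node["Actual Loops"]
end

def total_rows(node)
  node["Actual Rows"] * node["Actual Loops"]
end

# How far off the planner's per-loop row estimate was (1.0 means spot on).
def estimate_ratio(node)
  node["Actual Rows"].to_f / [node["Plan Rows"], 1].max
end

puts total_time_ms(node)
puts total_rows(node)
puts estimate_ratio(node)
```

A node that looks cheap at 0.25 ms per loop accounts for half a second here once the 2000 loops are included, and the row estimate is off by a factor of 40.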

Tweaking

1. Run EXPLAIN
2. Toggle options based on initial output
3. GOTO 1

Automatically check

auto_explain

F.3. auto_explain

The auto_explain module provides a means for logging execution plans of slow statements automatically, without having to run EXPLAIN by hand. This is especially helpful for tracking down un-optimized queries in large applications.

The module provides no SQL-accessible functions. To use it, simply load it into the server. You can load it into an individual session:

LOAD 'auto_explain';

(You must be superuser to do that.) More typical usage is to preload it into some or all sessions by including auto_explain in session_preload_libraries or shared_preload_libraries in postgresql.conf. Then you can track unexpectedly slow queries no matter when they happen. Of course there is a price in overhead for that.

Reading

Other tools

User.where(:id => 1).joins(:posts).explain

# Rails 3.2 only; this setting was removed in Rails 4.0
config.active_record.auto_explain_threshold_in_seconds = 0.5

Bullet

Help to kill N+1 queries and unused eager loading.

pghero

A performance dashboard for Postgres

sqlint

pg_query

Parses SQL queries using a copy of the PostgreSQL server query parser

just3ws/explain_explained_notes.md