EXPLAIN Explained video on YouTube
- Tell you why a particular index isn't used
- Explain how to rewrite your queries
- Show what other factors make the DB slow
- Tell you how much time the request took outside the DB
- Break the query down into atomic "nodes"
- Figure out every possible way to execute each "node" and it's cost
- Chain the combinations together into plans
- Calculate the total "cost" of each plan
- Pick the plan with the lowest "cost"
(Rowset + Parameters) =(Node Operation)=> Rowset
(Scan rows + Sort columns) =(Sort)=> Sorted rows
A figure in completely arbitrary cost units (float) which is intended to represent estimated time and system resources required to execute the query based on dubious theor and proven practice.
(some people adjust seq_page_cost
)
EXPLAIN
shows what the planner planned to do.EXPLAIN ANALYZE
shows the plan, executes the query, then shows how it did.- Most of the time you want to run
EXPLAIN ANALYZE
unless:- The query never finishes.
- You're just checking index usage.
- Your just checking for partition scans.
- buffers
- costs & timing
- format (yaml, xml, json)
- verbose
explain (analyze on, verbose on, format yaml)
The explain output is an inverted tree. Read from inside to outside.
Most will contain a filter condition
Full-table scan
- Scan index
- Look up tuples in table
- Scan index
- Ignore Heap
Bitmap Heap Scan
, Bitmap Index Scan
- Scan index
- Build bitmap of compliant rows
- For each row in Set A:
- Look up rows in Set B
# Essentially...
for each row in input_a
for each row in input_b
do something
next
next
# Ref: https://wiki.postgresql.org/wiki/Introduction_to_VACUUM,_ANALYZE,_EXPLAIN,_and_COUNT
- Make a Hash Table of Set A:
- Match groups of rows in Set B
- Sort both Set A and Set B
- The "zip" them together
- Join against table expressions
- Partial join, for outer join
- Exclusion, such as for
NOT IN()
- Regular aggregation
- In-memory aggregate using hash tables
- Used for windowing clause expressions
- De-deduplicate rows, like for
DISTINCT
orUNION
- Sort
- Supports
LIMIT
/OFFSET
- Joins a CTE (
WITH
clause) to the main query
- Same as CTEScan but for subqueries
- Create a recordset in-memory from a query fragment
- Merge rowsets for
UNION ALL
and partitions
Seq Scan
usually can mean an index is needed on the scanned columnsIndex Scan
usually can mean there was a bad row estimate (?)Nested Loops
look at time consumption, rows, loop countIndex Scan
look at time consumption, rows, loop count
- Run
EXPLAIN
- Toggle options based on initial output
- GOTO 1
F.3. auto_explain
The auto_explain module provides a means for logging execution plans of slow statements automatically, without having to run EXPLAIN by hand. This is especially helpful for tracking down un-optimized queries in large applications.
The module provides no SQL-accessible functions. To use it, simply load it into the server. You can load it into an individual session:
LOAD 'auto_explain'; (You must be superuser to do that.) More typical usage is to preload it into some or all sessions by including auto_explain in session_preload_libraries or shared_preload_libraries in postgresql.conf. Then you can track unexpectedly slow queries no matter when they happen. Of course there is a price in overhead for that.
- Using EXPLAIN
- EXPLAIN docs
- Reading PgAdmin Graphical EXPLAIN Plans
- Introduction to VACUUM, ANALYZE, EXPLAIN, and COUNT
- Reading a Postgres EXPLAIN ANALYZE Query Plan
- Advanced Postgres Performance Tips
- Explaining Your PostgreSQL Data
- Understanding Postgres Performance
- More on Postgres Performance
- Speed up with Materialized Views on PostgreSQL and Rails
User.where(:id => 1).joins(:posts).explain
config.active_record.auto_explain_threshold_in_seconds = 0.5
Help to kill N+1 queries and unused eager loading.
A performance dashboard for Postgres
Parses SQL queries using a copy of the PostgreSQL server query parser