Skip to content

Instantly share code, notes, and snippets.

@cabecada
Created October 5, 2024 18:47
Show Gist options
  • Save cabecada/f877ef4d9afcbaffad5de73ea1495e73 to your computer and use it in GitHub Desktop.
Save cabecada/f877ef4d9afcbaffad5de73ea1495e73 to your computer and use it in GitHub Desktop.
diagnose slow deletes which have foreign keys
many times there are tables with 10s of references in other tables. when we try to delete data cascade, sometimes deletes are very slow.
we have an index on the column of the primary table and some of the foreign tables, but we dont know why the delete is slow.
here we try to make use of auto_explain with gucs to track plans of foreign tables to figure out slow deletes and how we fix them
postgres@ubuntu:/tmp$ cat db1/postgresql.auto.conf | grep -v '^#'
port=5432
session_preload_libraries = auto_explain
auto_explain.log_min_duration = 0
auto_explain.log_analyze = true
auto_explain.log_buffers = true
auto_explain.log_timing = true
auto_explain.log_verbose = on
auto_explain.log_triggers = on
auto_explain.log_nested_statements = on
postgres@ubuntu:/tmp$ psql -p 5432 demo
psql (16.4 (Ubuntu 16.4-0ubuntu0.24.04.1))
Type "help" for help.
demo=# create table t(col1 int primary key, col2 int); CREATE TABLE
demo=# create table r(col3 int primary key, col1 int references t(col1) on update cascade on delete cascade); -- foreign table with col1 having no index
CREATE TABLE
demo=# insert into t select x, x from generate_series(1, 1000000) x; INSERT 0 1000000
demo=# insert into r select x, (x % 3) + 1 from generate_series(1, 100000) x; -- only 1,2,3 for col1
INSERT 0 100000
demo=# begin;
BEGIN
demo=*# explain (analyze,verbose,buffers) delete from t where col1 < 5; -- when we delete from t, it will show it touches r but will not show what plan was used to delete data from r. we need to check the actual plan to figure out why the delete is slow and how to fix it
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Delete on public.t (cost=0.42..8.50 rows=0 width=0) (actual time=0.025..0.026 rows=0 loops=1)
Buffers: shared hit=13
-> Index Scan using t_pkey on public.t (cost=0.42..8.50 rows=4 width=6) (actual time=0.006..0.008 rows=4 loops=1)
Output: ctid
Index Cond: (t.col1 < 5)
Buffers: shared hit=4
Planning:
Buffers: shared hit=7
Planning Time: 0.083 ms
Trigger RI_ConstraintTrigger_a_16475 for constraint r_col1_fkey: time=46.274 calls=4
Execution Time: 46.460 ms
(11 rows)
demo=*# -- note Trigger RI_ConstraintTrigger_a_16475 for constraint r_col1_fkey: time=46.274 calls=4 , it made 4 calls to r
demo=*# rollback;
ROLLBACK
demo=#
\q
postgres@ubuntu:/tmp$ less db1.log
postgres@ubuntu:/tmp$ tail -20 db1.log
2024-10-05 18:40:26.429 UTC [2260] LOG: duration: 1.087 ms plan:
Query Text: DELETE FROM ONLY "public"."r" WHERE $1 OPERATOR(pg_catalog.=) "col1"
Query Parameters: $1 = '4'
Delete on public.r (cost=0.00..1693.00 rows=0 width=0) (actual time=1.086..1.087 rows=0 loops=1)
Buffers: shared hit=443
-> Seq Scan on public.r (cost=0.00..1693.00 rows=1 width=6) (actual time=1.086..1.086 rows=0 loops=1)
Output: ctid
Filter: (4 = r.col1)
Buffers: shared hit=443
2024-10-05 18:40:26.429 UTC [2260] CONTEXT: SQL statement "DELETE FROM ONLY "public"."r" WHERE $1 OPERATOR(pg_catalog.=) "col1""
2024-10-05 18:40:26.429 UTC [2260] LOG: duration: 46.303 ms plan:
Query Text: explain (analyze,verbose,buffers) delete from t where col1 < 5;
Delete on public.t (cost=0.42..8.50 rows=0 width=0) (actual time=0.025..0.026 rows=0 loops=1)
Buffers: shared hit=13
-> Index Scan using t_pkey on public.t (cost=0.42..8.50 rows=4 width=6) (actual time=0.006..0.008 rows=4 loops=1)
Output: ctid
Index Cond: (t.col1 < 5)
Buffers: shared hit=4
Trigger RI_ConstraintTrigger_a_16475 for constraint r_col1_fkey: time=46.274 calls=4
2024-10-05 18:41:01.062 UTC [2232] LOG: checkpoint complete: wrote 139 buffers (0.8%); 1 WAL file(s) added, 0 removed, 0 recycled; write=269.932 s, sync=0.010 s, total=269.979 s; sync files=11, longest=0.007 s, average=0.001 s; distance=540667 kB, estimate=540758 kB; lsn=6/5CF4BF40, redo lsn=6/483FFF00
postgres@ubuntu:/tmp$ #note with auto_explain nested statement and triggers, we can see the plan used when data deleted from foreign tables. in this case it is a seq scan for each row in table t matching, hence it is super slow
postgres@ubuntu:/tmp$ #to fix this we will create an index on col2 in table r
postgres@ubuntu:/tmp$ psql -p 5432 demo
psql (16.4 (Ubuntu 16.4-0ubuntu0.24.04.1))
Type "help" for help.
postgres@ubuntu:/tmp$ psql -p 5432 demo
psql (16.4 (Ubuntu 16.4-0ubuntu0.24.04.1))
Type "help" for help.
demo=# create index concurrently on r(col1);
CREATE INDEX
demo=# -- now we run the same delete and check plan in logs
demo=# begin;
BEGIN
demo=*# explain (analyze,verbose,buffers) delete from t where col1 < 5;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Delete on public.t (cost=0.42..8.50 rows=0 width=0) (actual time=0.046..0.046 rows=0 loops=1)
Buffers: shared hit=15
-> Index Scan using t_pkey on public.t (cost=0.42..8.50 rows=4 width=6) (actual time=0.005..0.006 rows=4 loops=1)
Output: ctid
Index Cond: (t.col1 < 5)
Buffers: shared hit=4
Planning:
Buffers: shared hit=38
Planning Time: 0.323 ms
Trigger RI_ConstraintTrigger_a_16475 for constraint r_col1_fkey: time=46.178 calls=4
Execution Time: 46.304 ms
(11 rows)
demo=*# rollback;
ROLLBACK
demo=#
\q
###### note the use of index on table r to scan the relevant rows, earlier it was seq scan for each matching row
postgres@ubuntu:/tmp$ tail -20 db1.log
2024-10-05 18:43:34.040 UTC [2289] CONTEXT: SQL statement "DELETE FROM ONLY "public"."r" WHERE $1 OPERATOR(pg_catalog.=) "col1""
2024-10-05 18:43:34.040 UTC [2289] LOG: duration: 0.004 ms plan:
Query Text: DELETE FROM ONLY "public"."r" WHERE $1 OPERATOR(pg_catalog.=) "col1"
Query Parameters: $1 = '4'
Delete on public.r (cost=0.29..7.86 rows=0 width=0) (actual time=0.004..0.004 rows=0 loops=1)
Buffers: shared hit=2
-> Index Scan using r_col1_idx1 on public.r (cost=0.29..7.86 rows=1 width=6) (actual time=0.004..0.004 rows=0 loops=1)
Output: ctid
Index Cond: (r.col1 = 4)
Buffers: shared hit=2
2024-10-05 18:43:34.040 UTC [2289] CONTEXT: SQL statement "DELETE FROM ONLY "public"."r" WHERE $1 OPERATOR(pg_catalog.=) "col1""
2024-10-05 18:43:34.040 UTC [2289] LOG: duration: 46.226 ms plan:
Query Text: explain (analyze,verbose,buffers) delete from t where col1 < 5;
Delete on public.t (cost=0.42..8.50 rows=0 width=0) (actual time=0.046..0.046 rows=0 loops=1)
Buffers: shared hit=15
-> Index Scan using t_pkey on public.t (cost=0.42..8.50 rows=4 width=6) (actual time=0.005..0.006 rows=4 loops=1)
Output: ctid
Index Cond: (t.col1 < 5)
Buffers: shared hit=4
Trigger RI_ConstraintTrigger_a_16475 for constraint r_col1_fkey: time=46.178 calls=4
@cabecada
Copy link
Author

cabecada commented Oct 5, 2024

note:
auto_explain.log_min_duration = 0 this GUC is only for demo purposes, when dealing with slow queries, this will be set appropriately to a higer value to only filter slow queries. else all queries plan will be logged, which will flood the logs.
this param does not require a db restart, a simple reload would do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment