Skip to content

Instantly share code, notes, and snippets.

@possebon
Created April 21, 2026 19:55
Show Gist options
  • Select an option

  • Save possebon/481f8148a3ea4fa6cfbace09bb12ec68 to your computer and use it in GitHub Desktop.

Select an option

Save possebon/481f8148a3ea4fa6cfbace09bb12ec68 to your computer and use it in GitHub Desktop.
PostgreSQL: find what's blocking autovacuum — idle-in-transaction backends + replication slots pinning xmin
-- Find what's blocking autovacuum across a PostgreSQL cluster.
--
-- Autovacuum can only reclaim dead tuples older than the oldest running
-- transaction *anywhere in the cluster*. One old idle-in-transaction session,
-- or one inactive replication slot holding back xmin, can silently stall
-- reclaim across dozens of tables.
--
-- This query returns both culprit classes in one result set:
-- kind = 'backend' → a client-backend transaction older than 10 minutes
-- kind = 'replication_slot' → a slot pinning xmin or catalog_xmin
--
-- Tested on PostgreSQL 12+. Column names stable back to 11.
-- Blog post: https://www.linkedin.com/in/fernando-possebon/
SELECT 'backend' AS kind,
pid::text AS identifier,
state,
usename,
datname,
EXTRACT(EPOCH FROM (now() - xact_start))::int AS xact_age_seconds,
query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
AND (now() - xact_start) > interval '10 minutes'
AND backend_type = 'client backend'
UNION ALL
SELECT 'replication_slot' AS kind,
slot_name,
CASE WHEN active THEN 'active' ELSE 'inactive' END,
NULL,
database,
NULL,
slot_type
FROM pg_replication_slots
WHERE xmin IS NOT NULL OR catalog_xmin IS NOT NULL
ORDER BY xact_age_seconds DESC NULLS LAST
LIMIT 50;
-- Three things worth knowing:
--
-- backend_type = 'client backend'
-- Filters out walsender, autovacuum workers, and background workers.
-- You want the application sessions, not Postgres's own plumbing.
--
-- pg_replication_slots.xmin = transactional slot (physical replication)
-- pg_replication_slots.catalog_xmin = logical slot (logical decoding)
-- Both can hold back the horizon. A logical slot whose consumer died
-- weeks ago is the #1 silent cause of stalled vacuum I have seen.
--
-- pg_prepared_xacts can also block (2PC transactions left uncommitted).
-- Rare but devastating. Add a third UNION if your cluster uses 2PC.
-- How to act on the results:
--
-- Idle client backend:
-- SELECT pg_terminate_backend(<pid>);
-- Then teach the application to close transactions. Shortest-lived
-- connection is almost never the problem; idle-in-transaction is.
--
-- Forgotten replication slot (consumer truly gone):
-- SELECT pg_drop_replication_slot('<slot_name>');
-- Check twice. Dropping an in-use slot loses replication state.
--
-- Prepared xact:
-- ROLLBACK PREPARED '<gid>';
-- Then find out why your app is using 2PC and whether it should be.
--
-- Preventive alert (put this in your monitoring, before you need it):
-- Page when ANY client-backend xact_age > 15 minutes. Full stop.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment