Skip to content

Instantly share code, notes, and snippets.

@pdlug
Created May 21, 2026 18:59
Show Gist options
  • Select an option

  • Save pdlug/eb9e765b68fd3af7be09c23c76d80645 to your computer and use it in GitHub Desktop.

Select an option

Save pdlug/eb9e765b68fd3af7be09c23c76d80645 to your computer and use it in GitHub Desktop.

Tested two potential optimizations based on a comment on the TypeGraph post in /r/postgresql from u/Deep_Ad1959:

  1. Native CYCLE clause vs the manual ARRAY[] + != ALL(path) cycle detection that TypeGraph currently emits (packages/typegraph/src/query/dialect/postgres.ts:178-192).
  2. Prepared-statement plan caching can pick a bad generic plan after the 5th execution under plan_cache_mode = auto when start-node selectivity varies widely.

Setup

  • Postgres: PG18 in a dedicated container (pgvector/pgvector:pg18).
  • Graph: 10,000 nodes (Person), ~21,500 knows edges with a heavy-tailed degree distribution:
    • 5 hub nodes with 500 outgoing edges each
    • 100 mid nodes with 30 outgoing edges each
    • ~9,895 leaves with 0-3 outgoing edges
    • +500 reciprocal pairs to seed cycles
  • Schema: matches what createPostgresTables produces — same typegraph_nodes, typegraph_edges, and typegraph_edges_from_idx shape the dialect emitter relies on.
  • Traversal depth: 5 hops, cycle prevention on (the default).
  • Iterations: 2 warmup + 10 samples per (start-node, variant) pair.
  • Bench file: packages/benchmarks/src/recursive-pg-bench.ts.

Experiment 1 — manual ARRAY vs native CYCLE clause

Variant A — what TypeGraph emits today

WITH RECURSIVE reachable AS (
  SELECT n.id, n.kind, 0 AS depth, ARRAY[n.id]::text[] AS path
  FROM typegraph_nodes n
  WHERE n.graph_id = $1 AND n.id = $2 AND n.deleted_at IS NULL
  UNION ALL
  SELECT n.id, n.kind, r.depth + 1 AS depth, (r.path || n.id)::text[]
  FROM reachable r
  JOIN typegraph_edges e
    ON e.from_id = r.id
    AND e.graph_id = $1
    AND e.kind = 'knows'
    AND e.deleted_at IS NULL
  JOIN typegraph_nodes n
    ON n.graph_id = e.graph_id
    AND n.id = e.to_id
    AND n.kind = e.to_kind
    AND n.deleted_at IS NULL
  WHERE r.depth < $3 AND n.id != ALL(r.path)
)
SELECT id, depth FROM reachable;

Variant B — PG14+ native CYCLE clause

WITH RECURSIVE reachable AS (
  SELECT n.id, n.kind, 0 AS depth
  FROM typegraph_nodes n
  WHERE n.graph_id = $1 AND n.id = $2 AND n.deleted_at IS NULL
  UNION ALL
  SELECT n.id, n.kind, r.depth + 1 AS depth
  FROM reachable r
  JOIN typegraph_edges e
    ON e.from_id = r.id
    AND e.graph_id = $1
    AND e.kind = 'knows'
    AND e.deleted_at IS NULL
  JOIN typegraph_nodes n
    ON n.graph_id = e.graph_id
    AND n.id = e.to_id
    AND n.kind = e.to_kind
    AND n.deleted_at IS NULL
  WHERE r.depth < $3
) CYCLE id SET is_cycle USING path
SELECT id, depth FROM reachable WHERE NOT is_cycle;

Results

Start Rows reached manual-array p50 native-cycle p50 Δ p50
hub 0 14,677 61.8 ms 65.3 ms +5.7%
hub 1 16,952 69.4 ms 73.9 ms +6.5%
hub 2 17,355 73.0 ms 77.0 ms +5.4%
mid 0 1,591 7.2 ms 7.8 ms +7.6%
mid 1 758 3.7 ms 3.9 ms +4.3%
mid 2 522 3.2 ms 2.9 ms -7.5%
leaf 5-77 0.4-0.8 ms 0.4-0.8 ms within noise

Interpretation

Native CYCLE is consistently slower (~5-8%) on the queries large enough to matter. Within noise on small queries. There is no scenario in this bench where it beats the manual array pattern.

Why: PG's CYCLE is syntactic sugar over the same text[] tracking. It does not switch detection to a hash or bitmap. The extra work appears to come from the implicit cycle-marker column maintenance plus the NOT is_cycle filter on output. The path-elision win the commenter posited never materializes because the recursion still needs id and depth columns in the output projection.

Experiment 2 — prepared-statement plan cache

Query under test

Same SQL as Variant A above. Each variant runs 2 warmup + 20 alternating executions (hub, leaf, hub, leaf, …) on a dedicated checked-out client.

Variants

  • unnamed: no name field passed to pg.Client#query — driver issues unnamed parse-execute each time.
  • named-auto: named prepared statement, default plan_cache_mode (auto). PG decides custom-vs-generic after 5 executions.
  • named-force-custom: named prepared statement, SET plan_cache_mode = 'force_custom_plan'.
  • named-force-generic: named prepared statement, SET plan_cache_mode = 'force_generic_plan'.

Results

Variant hub p50 hub p95 leaf p50 leaf p95 pg_prepared_statements
unnamed 68.8 ms 79.1 ms 0.96 ms 1.78 ms (n/a)
named-auto 68.0 ms 72.3 ms 0.77 ms 1.12 ms generic=17, custom=5
named-force-custom 68.9 ms 87.8 ms 0.81 ms 1.31 ms generic=0, custom=22
named-force-generic 69.2 ms 73.8 ms 0.77 ms 0.98 ms generic=22, custom=0

Interpretation

The plan-mode flip is real. pg_prepared_statements confirms that under default auto mode, PG used a custom plan for the first 5 executions (including warmups) and then switched to generic for the remaining 17. This matches the documented behavior the Reddit commenter cited.

But across all four variants, latency is statistically identical for the same workload class. Forcing custom doesn't help; forcing generic doesn't hurt; using the default auto (where the flip happens) is indistinguishable from unnamed parse-execute.

Why this benchmark didn't expose pain even though the planner flipped:

  • The query's cost is dominated by the recursive worktable join, not by the initial start-node lookup.
  • Both typegraph_edges_from_idx and the typegraph_nodes primary key serve any start-node selectivity well enough that there's no hash-vs-nested-loop knife-edge to fall off when the planner uses averaged statistics.

This is a meaningful negative result. It does not disprove the risk in general — a different query shape with a planner knife-edge at the start-node lookup could still get burned by auto. But for the recursive traversals TypeGraph emits today, prepared statements with auto plan cache are fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment