typegraph-cycle-testing.md

Tested two potential optimizations based on a comment on the TypeGraph post in /r/postgresql from u/Deep_Ad1959:

Native CYCLE clause vs the manual ARRAY[] + != ALL(path) cycle detection that TypeGraph currently emits (packages/typegraph/src/query/dialect/postgres.ts:178-192).
Prepared-statement plan caching can pick a bad generic plan after the 5th execution under plan_cache_mode = auto when start-node selectivity varies widely.

Setup

Postgres: PG18 in a dedicated container (pgvector/pgvector:pg18).
Graph: 10,000 nodes (Person), ~21,500 knows edges with a heavy-tailed degree distribution:
- 5 hub nodes with 500 outgoing edges each
- 100 mid nodes with 30 outgoing edges each
- ~9,895 leaves with 0-3 outgoing edges
- +500 reciprocal pairs to seed cycles
Schema: matches what createPostgresTables produces — same typegraph_nodes, typegraph_edges, and typegraph_edges_from_idx shape the dialect emitter relies on.
Traversal depth: 5 hops, cycle prevention on (the default).
Iterations: 2 warmup + 10 samples per (start-node, variant) pair.
Bench file: packages/benchmarks/src/recursive-pg-bench.ts.

Experiment 1 — manual ARRAY vs native CYCLE clause

Variant A — what TypeGraph emits today

WITH RECURSIVE reachable AS (
  SELECT n.id, n.kind, 0 AS depth, ARRAY[n.id]::text[] AS path
  FROM typegraph_nodes n
  WHERE n.graph_id = $1 AND n.id = $2 AND n.deleted_at IS NULL
  UNION ALL
  SELECT n.id, n.kind, r.depth + 1 AS depth, (r.path || n.id)::text[]
  FROM reachable r
  JOIN typegraph_edges e
    ON e.from_id = r.id
    AND e.graph_id = $1
    AND e.kind = 'knows'
    AND e.deleted_at IS NULL
  JOIN typegraph_nodes n
    ON n.graph_id = e.graph_id
    AND n.id = e.to_id
    AND n.kind = e.to_kind
    AND n.deleted_at IS NULL
  WHERE r.depth < $3 AND n.id != ALL(r.path)
)
SELECT id, depth FROM reachable;

Variant B — PG14+ native CYCLE clause

WITH RECURSIVE reachable AS (
  SELECT n.id, n.kind, 0 AS depth
  FROM typegraph_nodes n
  WHERE n.graph_id = $1 AND n.id = $2 AND n.deleted_at IS NULL
  UNION ALL
  SELECT n.id, n.kind, r.depth + 1 AS depth
  FROM reachable r
  JOIN typegraph_edges e
    ON e.from_id = r.id
    AND e.graph_id = $1
    AND e.kind = 'knows'
    AND e.deleted_at IS NULL
  JOIN typegraph_nodes n
    ON n.graph_id = e.graph_id
    AND n.id = e.to_id
    AND n.kind = e.to_kind
    AND n.deleted_at IS NULL
  WHERE r.depth < $3
) CYCLE id SET is_cycle USING path
SELECT id, depth FROM reachable WHERE NOT is_cycle;

Results

Start	Rows reached	manual-array p50	native-cycle p50	Δ p50
hub 0	14,677	61.8 ms	65.3 ms	+5.7%
hub 1	16,952	69.4 ms	73.9 ms	+6.5%
hub 2	17,355	73.0 ms	77.0 ms	+5.4%
mid 0	1,591	7.2 ms	7.8 ms	+7.6%
mid 1	758	3.7 ms	3.9 ms	+4.3%
mid 2	522	3.2 ms	2.9 ms	-7.5%
leaf	5-77	0.4-0.8 ms	0.4-0.8 ms	within noise

Interpretation

Native CYCLE is consistently slower (~5-8%) on the queries large enough to matter. Within noise on small queries. There is no scenario in this bench where it beats the manual array pattern.

Why: PG's CYCLE is syntactic sugar over the same text[] tracking. It does not switch detection to a hash or bitmap. The extra work appears to come from the implicit cycle-marker column maintenance plus the NOT is_cycle filter on output. The path-elision win the commenter posited never materializes because the recursion still needs id and depth columns in the output projection.

Experiment 2 — prepared-statement plan cache

Query under test

Same SQL as Variant A above. Each variant runs 2 warmup + 20 alternating executions (hub, leaf, hub, leaf, …) on a dedicated checked-out client.

Variants

unnamed: no name field passed to pg.Client#query — driver issues unnamed parse-execute each time.
named-auto: named prepared statement, default plan_cache_mode (auto). PG decides custom-vs-generic after 5 executions.
named-force-custom: named prepared statement, SET plan_cache_mode = 'force_custom_plan'.
named-force-generic: named prepared statement, SET plan_cache_mode = 'force_generic_plan'.

Results

Variant	hub p50	hub p95	leaf p50	leaf p95	`pg_prepared_statements`
unnamed	68.8 ms	79.1 ms	0.96 ms	1.78 ms	(n/a)
named-auto	68.0 ms	72.3 ms	0.77 ms	1.12 ms	generic=17, custom=5
named-force-custom	68.9 ms	87.8 ms	0.81 ms	1.31 ms	generic=0, custom=22
named-force-generic	69.2 ms	73.8 ms	0.77 ms	0.98 ms	generic=22, custom=0

Interpretation

The plan-mode flip is real. pg_prepared_statements confirms that under default auto mode, PG used a custom plan for the first 5 executions (including warmups) and then switched to generic for the remaining 17. This matches the documented behavior the Reddit commenter cited.

But across all four variants, latency is statistically identical for the same workload class. Forcing custom doesn't help; forcing generic doesn't hurt; using the default auto (where the flip happens) is indistinguishable from unnamed parse-execute.

Why this benchmark didn't expose pain even though the planner flipped:

The query's cost is dominated by the recursive worktable join, not by the initial start-node lookup.
Both typegraph_edges_from_idx and the typegraph_nodes primary key serve any start-node selectivity well enough that there's no hash-vs-nested-loop knife-edge to fall off when the planner uses averaged statistics.

This is a meaningful negative result. It does not disprove the risk in general — a different query shape with a planner knife-edge at the start-node lookup could still get burned by auto. But for the recursive traversals TypeGraph emits today, prepared statements with auto plan cache are fine.

pdlug/typegraph-cycle-testing.md

Select an option

No results found

Select an option

No results found

Setup

Experiment 1 — manual ARRAY vs native CYCLE clause

Variant A — what TypeGraph emits today

Variant B — PG14+ native CYCLE clause

Results

Interpretation

Experiment 2 — prepared-statement plan cache

Query under test

Variants

Results

Interpretation