func.array_agg(column, order_by=column) in SQLAlchemy silently drops the order_by keyword argument, generating array_agg(path_sig) instead of array_agg(path_sig ORDER BY path_sig). This means the HDCA signature arrays used for job search equivalence checking are unordered, causing non-deterministic comparison results on PostgreSQL.
In lib/galaxy/managers/jobs.py, the agg_expression method:
def agg_expression(self, column):
if self.dialect_name == "sqlite":
return func.group_concat(column)
else:
return func.array_agg(column, order_by=column)The order_by=column kwarg is silently ignored by SQLAlchemy's func.array_agg. Verified:
>>> from sqlalchemy import func, String, column
>>> col = column('path_sig', String)
>>> expr = func.array_agg(col, order_by=col)
>>> print(expr.compile(compile_kwargs={'literal_binds': True}))
array_agg(path_sig)No ORDER BY is generated. Both the reference and candidate signature arrays are returned in whatever scan order PostgreSQL happens to use, which varies depending on query plan, table statistics, and prior test state.
Added logging after each stage of the job search pipeline to identify where candidates are lost:
Job search: 0 jobs match input datasets
Job search: 0 jobs remain after _filter_jobs
No equivalent jobs found (query returned 0 candidate rows)
Result: The SQL query itself returns 0 rows. The bug is in the HDCA signature matching (_build_stmt_for_hdca), not in parameter matching or output filtering.
Added logging after each CTE to trace the signature comparison:
HDCA search: v=75, collection_type=list, depth=0
HDCA search: reference dataset_ids=[251, 252, 253]
HDCA search: reference signature elements=['data0;251', 'data1;252', 'data2;253']
HDCA search: reference full signature=['data0;251', 'data1;252', 'data2;253']
HDCA search: candidate pre-filter HDCA ids=[75]
HDCA search: candidate signature elements=[(75, 'data2;253'), (75, 'data1;252'), (75, 'data0;251')]
HDCA search: candidate full signatures=[(75, ['data2;253', 'data1;252', 'data0;251'])]
HDCA search: equivalent HDCA ids=[]
Key finding: The reference and candidate are the same HDCA (id=75) with the same elements, but the aggregated arrays differ:
- Reference:
['data0;251', 'data1;252', 'data2;253'](happens to be alphabetical = insertion order) - Candidate:
['data2;253', 'data1;252', 'data0;251'](reverse order — different scan path)
The == comparison fails because the arrays have different element ordering.
>>> from sqlalchemy import func, String, column
>>> col = column('path_sig', String)
>>> print(func.array_agg(col, order_by=col).compile(compile_kwargs={'literal_binds': True}))
array_agg(path_sig) # NO ORDER BY!The correct approach uses aggregate_order_by from the PostgreSQL dialect:
>>> from sqlalchemy.dialects.postgresql import aggregate_order_by
>>> print(func.array_agg(aggregate_order_by(col, col.asc())).compile(
... dialect=postgresql.dialect(), compile_kwargs={'literal_binds': True}))
array_agg(path_sig ORDER BY path_sig ASC) # Correct!In lib/galaxy/managers/jobs.py, change agg_expression:
from sqlalchemy.dialects.postgresql import aggregate_order_by
def agg_expression(self, column):
if self.dialect_name == "sqlite":
return func.group_concat(column)
else:
return func.array_agg(aggregate_order_by(column, column.asc()))Without ORDER BY, array_agg returns elements in whatever order PostgreSQL's query executor processes them. This depends on:
- The query plan chosen (affected by table statistics, which change as tests run)
- The physical order of rows (affected by prior inserts/deletes)
- Join ordering decisions by the optimizer
When the reference and candidate CTEs happen to use the same scan order, the arrays match and the test passes. When they use different scan orders (different join paths, different CTE structures), the arrays differ and the test fails.
Failed run with full diagnostic output: https://github.com/mvdbeek/galaxy/actions/runs/23689091785
The debug log lines showing the signature mismatch are in the "Run tests" step output for the Test (3.10, 1) job.