alopezari/coverage-gaps.md

Created April 29, 2026 16:35


Magellan Pilot 18c — magellan-backups 1.0.0 | Sonnet Manager + Sonnet Planner + Haiku Testers | 9/10 recall | $17.07

coverage-gaps.md

Coverage gaps — magellan-backups 2026-04-29T15-43-54_magellan-backups

Summary

0 hypotheses silently skipped (all hypotheses have verdicts)
1 recon-flagged surface insufficiently probed (b6 empirical probe deferred, question filed without empirical evidence)
1 AND-list item scored via source-only instead of empirical (b6 across both restore paths)
4 round-trip / compositional probes missing (backup×restore, export×import, Pages a3, cron-deactivation lifecycle)
1 Question filed only from source inspection with no empirical probe attempt (restore b6)
2 forcing-function strings missing (export-artifact-andlist missing scale-sensitive c2 fallback literal; concurrent-trigger-seam missing the required literal form)
1 charter pending (breadth-tour status=pending — no report)

Gaps by check

Check 1: Hypothesis coverage

All hypotheses across the five completed sessions carry a probed verdict with Y/N outcome. No silent skips detected.

Notable deprioritizations (low severity, rationale present):

restore-destructive-andlist — b6 (capability gate for Editor role): the empirical AJAX probe was deferred ("CLI nonce generation failed"). The verdict rests on source inspection only (current_user_can('manage_options') at line 7 of class-mb-restore.php). A Question was filed. See Check 6.
restore-destructive-andlist — b7 (partial-failure consistency): deferred to source analysis; no empirical corrupted-ZIP test executed.
export-artifact-andlist — multi-type merge test (a6 fallback): explicitly skipped due to turn budget. Coverage_notes acknowledge the skip.
concurrent-trigger-seam — flow 4 (JS button-disable): deprioritized in favor of the cross-feature race. Source-pattern evidence cited.

Check 1 verdict: No gap.

Check 2: Static-analysis hypothesis coverage

Static-analysis.md does NOT exist for this run (Phase 1.5 skipped — no source_path in mission). Check 2 and Check 2b are skipped per the prompt instructions.

Check 3: Recon-flagged surface coverage

Recon flagged seven surprises (S1–S7). Status:

Surprise	Description	Coverage status
S1	Progress bar hardcoded 100%	Probed in backup-artifact-andlist (P9) and restore-destructive-andlist (P5). Confirmed.
S2	Email double-send + option-key mismatch	Probed in schedule-feature-cluster (H2, P2). Confirmed.
S3	Restore silent — no detail about what restored	Noted in restore-destructive-andlist (P4, minor). Addressed.
S4	ZIP extraction silently skips non-wp-content files	Addressed in restore-destructive-andlist (b7, P6).
S5	Schedule time 24h→12h format mismatch on reload	Probed in schedule-feature-cluster (H1, P1). Confirmed.
S6	DB dump lacks FK/indexes beyond SHOW CREATE TABLE	NOT probed empirically. No session's `hypotheses_status` or `coverage_notes` explicitly verifies S6. The export-artifact-andlist charter instructs the Tester to verify S6, but the session report never mentions it.
S7	Capability check: no delegation to sub-admin roles	Partially addressed via source-only check in restore-destructive-andlist.

Gap — S6 (LOW severity): Recon noted "Database dump includes SHOW CREATE TABLE and INSERT statements but no schema-level constraints beyond what's in SHOW CREATE TABLE." The export-artifact-andlist charter explicitly instructs the Tester to "verify whether the exported SQL is importable/complete for schema reconstruction (recon S6)." No session report addresses this: no import attempt was made and no schema completeness check is logged. This is an incomplete round-trip: artifacts were inspected for content but not tested for re-importability.
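A schema-completeness check of the kind the charter asked for can begin statically, before any import is attempted. The sketch below is a minimal illustration, assuming the export is a plain-text MySQL-style dump with backtick-quoted table names. The function name `dump_table_coverage`, the expected-table list, and the sample dump are hypothetical; a real round-trip would still require importing the file into a fresh database.

```python
import re

# Match the table name in MySQL-style CREATE TABLE / INSERT INTO statements.
CREATE_RE = re.compile(r"CREATE TABLE\s+(?:IF NOT EXISTS\s+)?`([^`]+)`", re.IGNORECASE)
INSERT_RE = re.compile(r"INSERT INTO\s+`([^`]+)`", re.IGNORECASE)


def dump_table_coverage(sql_text, expected_tables):
    """Compare the tables named in a dump against the tables we expect."""
    created = set(CREATE_RE.findall(sql_text))
    inserted = set(INSERT_RE.findall(sql_text))
    return {
        # Expected tables with no CREATE TABLE statement in the dump.
        "missing_tables": sorted(set(expected_tables) - created),
        # Tables that receive rows but are never created by the dump.
        "inserts_without_create": sorted(inserted - created),
    }


# Hypothetical sample dump, not taken from the plugin's real output.
sample_dump = """
CREATE TABLE `wp_posts` (`ID` bigint(20) NOT NULL, PRIMARY KEY (`ID`));
INSERT INTO `wp_posts` VALUES (1);
INSERT INTO `wp_postmeta` VALUES (1, 1, 'k', 'v');
"""

report = dump_table_coverage(sample_dump, ["wp_posts", "wp_postmeta"])
print(report)
```

A non-empty `inserts_without_create` list would indicate a dump that cannot be imported into an empty database as-is, which is the importability question S6 leaves open.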

Check 4: AND-list aggregate vs per-handler

b6 (capability gate) — restore-destructive-andlist: The charter requires a per-handler verdict: "does a non-admin (Editor role) receive a 403/error when POSTing directly to mb_run_restore with a valid nonce?" The session's verdict is source-inspection only (source confirmed current_user_can('manage_options') at line 7). The empirical AJAX probe was deferred due to CLI nonce-generation failure. No curl test as Editor was executed. The Question filed ("Do non-admin users receive 403 when POSTing directly to mb_run_restore?") acknowledges this gap.

The restore feature has two AJAX paths: restore-from-list and upload-and-restore. The charter's b6 success criterion covers both paths via the single AJAX handler mb_run_restore. Source analysis shows one capability check at line 7 of class-mb-restore.php, which covers both paths. However, b6 was scored aggregate-on-source with no empirical confirmation on either path.

HIGH severity gap: b6 is an AND-list item requiring empirical verification. Source-only pass on a capability check is a confirmed miss pattern in the harness history.

Check 5: Round-trip / compositional probes

Feature pair	Round-trip expected	Coverage status
Backup create × restore	round-trip identity: does restoring a backup reproduce the original site state?	NOT probed. restore-destructive-andlist deferred full end-to-end restore execution ("Deferred full end-to-end restore execution due to risk of overwriting test site without pre-snapshot").
Export × import	does the export SQL re-import cleanly into a fresh DB?	NOT probed (see S6 gap above).
Schedule save × reload	time format round-trip	Probed in schedule-feature-cluster (H1). Confirmed bug.
Plugin activate × deactivate × reactivate	does cron deactivation clean up?	Partially probed (cron removed on disable; deactivation not confirmed). Schedule-feature-cluster H4 explicitly tested enable→disable→re-enable cron lifecycle but the charter success criterion also required "Y/N — does deactivation remove the cron hook?" — the report does NOT document wp cron event list output AFTER deactivation.
Export Pages a3	does Pages export have same leakage pattern as Posts/Users/Options?	NOT probed. Export-artifact-andlist a3 probed Users (users leakage), Posts (scope), Options (security keys), but Pages was NOT independently probed.

HIGH severity gaps:

Backup create × restore round-trip not probed. The full end-to-end flow (create backup → restore from it → verify site state matches) was explicitly deferred. This is the primary feature promise of a backup plugin. The skip rationale ("risk of overwriting test site") is understandable, but the gap is functionally significant.
Export × import round-trip not probed. No session verified that any export SQL file is actually importable. The artifact was inspected for content, not for usability as a restore artifact. Recon S6 flagged schema completeness risk; no session addressed it.
Pages export a3 not probed. The export-artifact-andlist charter's multi-surface rule requires independent a3 probes on EACH export type. Users (leakage), Posts (scope/draft inclusion), Options (security keys) were all probed. Pages was silently skipped — no entry in hypotheses_status, no coverage note acknowledgment. This is a gap in the multi-surface rule enforcement.
Cron deactivation outcome not documented. H4 coverage_notes report only disable/re-enable lifecycle. The charter success criterion "Y/N — does deactivation remove the cron hook?" has no empirical evidence logged.
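The round-trip identity check for backup × restore could be scored by fingerprinting site state before the backup and again after the restore. The following is a minimal sketch, assuming the state can be exported as a mapping of table name to rows (how that export is obtained is out of scope here). The helper names and sample snapshots are hypothetical.

```python
import hashlib
import json


def fingerprint(state):
    """Return a per-table SHA-256 digest of a {table: [row, ...]} snapshot."""
    digests = {}
    for table, rows in state.items():
        # Sort rows by their canonical JSON form so row order does not matter.
        canonical = sorted(json.dumps(row, sort_keys=True) for row in rows)
        digests[table] = hashlib.sha256("\n".join(canonical).encode("utf-8")).hexdigest()
    return digests


def round_trip_diff(before, after):
    """List tables whose contents differ between the two snapshots."""
    fp_before, fp_after = fingerprint(before), fingerprint(after)
    tables = set(fp_before) | set(fp_after)
    return sorted(t for t in tables if fp_before.get(t) != fp_after.get(t))


# Hypothetical snapshots: content seeded before the backup, then two possible
# post-restore states.
seeded = {"wp_posts": [{"ID": 1, "post_title": "seeded before backup"}]}
restored_ok = {"wp_posts": [{"ID": 1, "post_title": "seeded before backup"}]}
restored_bad = {"wp_posts": [{"ID": 1, "post_title": "mutated after backup"}]}

print(round_trip_diff(seeded, restored_ok))   # []
print(round_trip_diff(seeded, restored_bad))  # ['wp_posts']
```

An empty diff is the "Y" verdict for round-trip identity; any listed table is a concrete place to look for what the restore failed to reproduce.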

Check 6: Empirical-probe-is-mandatory

restore-destructive-andlist — b6 Question:

Question filed: "Do non-admin users (editors, authors) receive a 403/unauthorized error when attempting to POST directly to the AJAX mb_run_restore handler with a valid nonce?"

The session deviation note says: "Deferred empirical b6 AJAX capability test (CLI nonce generation failed) but source analysis confirms capability check; verified via code inspection."

The capability check exists in source, but the empirical test — that the check actually fires and returns a 403 — was not executed. The charter explicitly called for "CLI: curl as editor with AJAX action mb_run_restore + nonce." The Question was filed from source inspection alone.

HIGH severity gap (per Check 6 calibration table: "Question with only source-inspection evidence (no empirical probe)").
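The deferred probe amounts to one authenticated POST against the AJAX endpoint. The sketch below builds such a request without sending it, assuming the standard WordPress `admin-ajax.php` path and the `mb_run_restore` action named in the charter. The nonce field name (`nonce`), the cookie value, and the base URL are hypothetical placeholders that would need to be read from the plugin and the test site.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_restore_probe(base_url, editor_cookie, nonce, nonce_field="nonce"):
    """Build (but do not send) a POST to admin-ajax.php as an Editor."""
    # nonce_field is an assumption; the plugin's real field name may differ.
    body = urlencode({"action": "mb_run_restore", nonce_field: nonce}).encode("ascii")
    return Request(
        url=base_url.rstrip("/") + "/wp-admin/admin-ajax.php",
        data=body,
        headers={
            "Cookie": editor_cookie,
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def classify(status_code):
    """Map an HTTP status to a b6 verdict; a 200 still needs body inspection."""
    if status_code == 403:
        return "blocked"
    if status_code == 200:
        return "inspect-body"
    return "inconclusive"


probe = build_restore_probe("http://localhost:8080", "wordpress_logged_in_x=editor", "abc123")
print(probe.full_url, probe.get_method(), probe.data)
```

If the request is sent with `urllib.request.urlopen`, a 403 surfaces as `urllib.error.HTTPError`, so the status would be read from the exception's `code` attribute before being passed to `classify`.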

Check 7: Custom-widget classification

No overlay-shaped widgets (lightbox, modal, drawer, dropdown, popup) are present in this plugin. The UI is admin-only with standard WordPress admin forms and AJAX patterns. No custom-widget classification risk detected. Confirmed by recon's "Frontend surfaces: None" and session reports which describe standard confirm() JS dialogs (native browser behavior).

Check 7 verdict: No gap.

Check 8: Must-cover flows

The "## Must-cover flows" section of mission.md is blank ("Fill in based on static analysis + recon. Leave blank to let the Manager infer from the surface."). No explicit must-cover flows to verify.

Check 8 verdict: N/A (no must-cover flows specified).

Check 9: Feature anchor completeness

Feature	Traits	Probe quota met?
F1 Full Site Backup	artifact-producing, scale-sensitive, AJAX-exposed, DB-writing, destructive-operation	YES — backup-artifact-andlist: 9 hypotheses in `hypotheses_status`, all probed. AND-list a1–a6 complete.
F2 Restore from Existing	destructive-operation, AJAX-exposed	YES — b1–b7 all enumerated with verdicts. b6 verdict source-only (see Check 4, 6).
F3 Upload & Restore	destructive-operation, AJAX-exposed, file-upload	Partially. b1 probed for upload path (no confirmation dialog). b6 not empirically confirmed for upload path specifically.
F4 Selective Export	artifact-producing, scale-sensitive, AJAX-exposed, DB-writing	PARTIAL — a1–a6 enumerated but Pages type NOT independently probed under a3.
F5 Scheduled Backups	settings-form, artifact-producing, scale-sensitive, DB-writing	YES — all 5 hypotheses H1–H5 probed.
F6 Backup List/Lifecycle	destructive-operation, output-rendering, AJAX-exposed	YES — covered within backup-artifact-andlist.

Check 9 verdict: Two partial gaps (F3 upload-and-restore b6 per-path, F4 Pages a3).

Check 10: Coverage-note forcing-function strings

Session	Required forcing-function string	Present?
backup-artifact-andlist	`empty-state probed: ...`	YES — "Empty-state probe passed (No backups found message present)"
backup-artifact-andlist	`default blast radius probed: ...`	YES — "scale-sensitive c2 fallback: source pattern filed" present
backup-artifact-andlist	`scale-sensitive c2 fallback: ...`	YES
export-artifact-andlist	`scale-sensitive c2 fallback: ...`	NO — coverage_notes: "Scale-sensitive c2 fallback: export_table() performs SELECT * without LIMIT on all export types; empirical probe confirmed unbounded inclusion." Substance is present but literal form `scale-sensitive c2 fallback: source pattern filed as <severity> Problem — <file>:<line>` is absent. LOW severity gap.
export-artifact-andlist	`empty-state probed: ...`	Implicit — probed via "Export with zero checkboxes" but literal string not present in coverage_notes. LOW severity gap.
restore-destructive-andlist	`default blast radius probed: ...`	YES — "default blast radius probed: restore-from-list overwrites entire wp-content/ directory + database (confirmed by code lines 45-60 + 39-42)"
restore-destructive-andlist	`cross-feature interaction probed: upload-and-restore with ZIP containing path traversal`	YES — noted as probed in coverage_notes
schedule-feature-cluster	`save-roundtrip verified: ...`	YES — confirmed in H1 verdict
concurrent-trigger-seam	`cross-feature interaction probed: manual-backup × cron-backup → Y|N: shared-resource collision`	NO (required literal form missing, per the Summary)
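The difference between "substance present" and "literal present" in the table above is mechanical, so it can be checked mechanically. The sketch below is one possible check for the c2 fallback literal, assuming the required form quoted in the export-artifact-andlist row. The regular expression is an approximation of that form, and the conforming example (severity, file, line) is hypothetical.

```python
import re

# Approximation of the required literal:
# scale-sensitive c2 fallback: ... source pattern filed as <severity> Problem — <file>:<line>
C2_LITERAL = re.compile(
    r"scale-sensitive c2 fallback: .*source pattern filed as \w+ Problem — \S+:\d+"
)


def has_c2_literal(coverage_notes):
    """True if the notes contain the c2 fallback literal in its required form."""
    return C2_LITERAL.search(coverage_notes) is not None


# Quoted from the export-artifact-andlist coverage_notes in the table above.
export_notes = (
    "Scale-sensitive c2 fallback: export_table() performs SELECT * without LIMIT "
    "on all export types; empirical probe confirmed unbounded inclusion."
)
# Hypothetical note written in the required form.
conforming_notes = (
    "scale-sensitive c2 fallback: source pattern filed as major Problem — "
    "includes/class-mb-export.php:49"
)

print(has_c2_literal(export_notes))      # False
print(has_c2_literal(conforming_notes))  # True
```

The match is deliberately case-sensitive, so a note that carries the right substance in a different form is still reported as missing the literal.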

Check 11: External-resource-failure probe coverage

Recon notes no external resource dependencies. The plugin is admin-only, no Google Fonts, no CDN, no external APIs, no analytics, no social SDKs. All operations are self-contained (local ZIP, local SQL, local wp-cron, local wp_mail).

Check 11 verdict: No gap.

Check 12: Content-authoring UX probe coverage

This plugin ships no starter content, patterns, sample data, or demo importers. No content-authoring UX probe is relevant.

Check 12 verdict: N/A.

Check 13: Route-content-depth probe coverage

This is an admin-only plugin with no public routes, patterns, or block styles. There are no "route 200 OK" pass claims on public-facing content. The session reports assert functionality through empirical CLI evidence (grep, unzip, curl output) or browser observation — content-level verification rather than status-level.

One status-level concern: restore-destructive-andlist confirms the restore mechanism exists (source: current_user_can('manage_options')) but does not execute a full restore to verify the content-level outcome — what is actually in the DB/filesystem after restore. This overlaps with the round-trip gap in Check 5.

Check 13 verdict: One low-severity concern (overlaps with Check 5 round-trip gap, not a new finding).

Breadth-tour: pending charter

The breadth-tour charter is status: pending in manifest.json (sessions_total: 5, sessions_completed: 5 — the breadth tour was never dispatched or its report is absent). The coverage.md lists breadth-tour as covering F1–F6, and several Check 10 items are expected from it (BT-SEC-a Editor capability check, BT-SEC-b unauthenticated access, BT-F3b non-ZIP upload validation, BT-F2b delete verification). These breadth-specific probes have no coverage in the five completed sessions.

HIGH severity gap: The breadth-tour was explicitly designed to cover BT-SEC-a (Editor cannot access backup page) and BT-SEC-b (unauthenticated access redirect). These capability and access-control probes are not covered by any completed session. BT-F3b (non-ZIP file upload validation) is also unprobed.

Recommendation

High-severity gaps (would likely cause misses)

1. b6 empirical probe missing (restore-destructive-andlist): The Editor-role AJAX capability check was confirmed by source only. Empirical curl probe with Editor nonce never ran. One targeted mini-session: create Editor user, obtain nonce, curl mb_run_restore and mb_run_export as Editor, verify 403. 15-minute probe, no browser needed.
2. Backup create × restore round-trip not probed: The plugin's core promise — backup and restore — was never tested end-to-end. No session verified that a site restored from a backup is functionally equivalent to the pre-backup state. Suggest one targeted session: create content, create backup, modify content, restore, verify content matches. 20-minute probe.
3. Export × import round-trip not probed (recon S6): Export SQL files were inspected for content but never imported into a fresh DB. Schema completeness (recon S6), missing post_meta, and importability of security-key-including Options export are all unconfirmed. Suggest extending the round-trip session above to also import an export file and verify row counts.
4. Pages export a3 not probed (multi-surface rule dropout): The export charter's multi-surface rule requires independent a3 probes on each export type. Pages was silently skipped — not deprioritized with rationale, just absent. If Pages export has the same draft/private inclusion bug as Posts, it would be a separate filed issue. 5-minute targeted probe.
5. Breadth-tour pending — BT-SEC-a, BT-SEC-b, BT-F3b unprobed: The breadth charter was never dispatched. The capability/access control probes (Editor role UI access, unauthenticated redirect) and non-ZIP upload validation have no coverage in any completed session. A single breadth-tour session would close all three gaps.

Low-severity gaps (acceptable with rationale)

Cron deactivation outcome not documented (H4): Charter required wp cron event list after wp plugin deactivate. The session confirmed enable→disable lifecycle but not the deactivation hook. Low risk: deactivation hook is a known WordPress pattern, but empirical gap remains.
export-artifact-andlist c2 fallback literal not in standard form: Substance present, literal format deviated. No recall impact.
Recon S6 schema completeness not probed: Low-frequency issue but could mask an importability bug in the SQL artifact.
restore-destructive-andlist b7: Source analysis sufficient for path-traversal filter, but corrupted-ZIP empirical probe (partial failure consistency) was deferred. Low recall risk — the source pattern is deterministic.
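The cron deactivation gap above could be closed by recording the output of `wp cron event list --fields=hook --format=json` after `wp plugin deactivate magellan-backups` and checking it for the plugin's hook. The sketch below parses such output. The hook name `mb_scheduled_backup` is a hypothetical placeholder, and the JSON string is canned sample data rather than output from this run.

```python
import json


def hook_still_scheduled(cron_list_json, hook):
    """Check `wp cron event list --format=json` output for a given hook."""
    events = json.loads(cron_list_json)
    return any(event.get("hook") == hook for event in events)


# Canned output standing in for a real post-deactivation listing.
after_deactivate = '[{"hook": "wp_version_check"}, {"hook": "mb_scheduled_backup"}]'

# True here would mean the deactivation did NOT remove the cron hook.
print(hook_still_scheduled(after_deactivate, "mb_scheduled_backup"))
```

Logging the raw listing alongside the boolean gives the charter's "Y/N" criterion the empirical evidence it currently lacks.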

Re-dispatch suggestions

Single targeted mini-charter "restore-roundtrip-and-capability" would close gaps 1, 2, 3: empirical b6 capability probe (Editor curl) + end-to-end backup×restore round-trip + export SQL import test. Estimated: 25–30 minutes, Haiku model sufficient.
Breadth-tour dispatch would close gap 5: use the existing breadth-tour.md charter as-is. Estimated: 30–45 minutes, Haiku.
Pages export a3 probe (gap 4): add as a single flow in the breadth-tour session or the round-trip session.

5 high-severity gaps, 4 low-severity gaps

Pass 2 update — 2026-04-29 (breadth-tour session)

Breadth-tour session (breadth-tour) completed: 13 flows planned, 13 executed (1 reordered, no coverage impact). Duration: ~6 minutes. Model: haiku-4-5.

Gap-by-gap status

#	Gap	Pass 2 status	Rationale
1	b6 empirical capability probe missing (Editor role AJAX)	PARTIALLY CLOSED	BT-SEC-a confirmed that an Editor navigating to the backup admin page receives "Sorry, you are not allowed to access this page." — the menu/page-level capability gate is empirically verified. However, the original gap required a direct curl POST to the `mb_run_restore` AJAX handler with a valid Editor nonce, confirming that the AJAX handler itself returns 403 before executing restore logic. That AJAX-layer probe was not performed in the breadth tour. The UI-level check is necessary but not sufficient: a misconfigured `wp_ajax_nopriv_` or a missing `check_ajax_referer` could still allow direct AJAX calls even when the admin page gate works. Gap is reduced but not eliminated.
2	Backup create × restore round-trip not probed	STILL OPEN	BT-F2a executed a restore happy path (confirm dialog → JS alert "Restore completed. Reloading..." → session reauth) and verified the flow completes. It did NOT verify post-restore site state: no content was seeded before the backup, no mutation was made before the restore, and no post-restore state comparison was performed. The round-trip identity check (create content → backup → modify content → restore → verify content matches) remains unexecuted.
3	Export × import round-trip not probed	STILL OPEN	BT-F4a confirmed that selecting Posts and clicking Export produces a `.sql` file with a download link. No import of that SQL file into a fresh DB was attempted. Schema completeness (recon S6) and importability of the security-key-including Options export remain unverified.
4	Pages export a3 not probed (multi-surface rule dropout)	STILL OPEN	BT-F4a probed Posts export only. No breadth-tour hypothesis covers Pages as an independent export type. The charter did not include a BT-F4c for Pages. The multi-surface rule dropout from the original export-artifact-andlist session was not remediated.
5	Breadth-tour pending (BT-SEC-a/b, BT-F3b)	CLOSED	BT-SEC-a: Editor visits backup page → "Sorry, you are not allowed to access this page." (pass). BT-SEC-b: unauthenticated curl → HTTP 302 to wp-login.php (pass). BT-F3b: uploaded `.txt` file → alert "Error: Could not open zip file." (pass, good validation feedback). All three probes executed with empirical evidence.

New findings from breadth-tour

P1 (major): Progress bar hardcoded to 100% on page load — confirmed with screenshot (duplicate of original P9/P5 finding, now re-confirmed empirically).
P2 (major): Schedule time format mismatch — 24h input (13:00) stored as "1:00 PM", displayed as "00:00" on reload — confirmed empirically with CLI evidence (duplicate of original H1/P1 finding, now re-confirmed).
I1 (improvement): Restore feedback verbosity; I2: Upload file-type hint pre-upload.
Praises: role-based access control (BT-SEC-a), unauthenticated redirect (BT-SEC-b), export validation, empty-state messaging, cron registration.

Revised headline verdict

1 gap closed, 1 gap partially closed, 3 gaps remain open.

Closed (1): Breadth-tour pending probes (BT-SEC-a, BT-SEC-b, BT-F3b) — all executed with empirical pass verdicts.
Partially closed (1): b6 UI-level access control confirmed; AJAX-layer direct-POST probe still missing.
Still open (3): Backup create × restore round-trip; Export × import round-trip; Pages export a3 multi-surface probe.

Revised count: 3 high-severity gaps open, 1 partially closed, 4 low-severity gaps unchanged.

The two round-trip gaps (backup×restore, export×import) and the Pages export probe remain the top re-dispatch priorities. A single 25–30 minute "restore-roundtrip-and-pages" mini-charter (Haiku model, no browser needed for export import, browser for restore state verification) would close all three. The b6 AJAX-layer gap can be closed in under 5 minutes with a targeted curl probe as an Editor user.

escape-analysis.md

Escape analysis — 2026-04-29T15-43-54_magellan-backups

Run ID: 2026-04-29T15-43-54_magellan-backups
Classifier: Sonnet 4.6 subagent
Generated: 2026-04-29
Sessions evaluated: 6 (backup-artifact-andlist, breadth-tour, concurrent-trigger-seam, export-artifact-andlist, restore-destructive-andlist, schedule-feature-cluster)
Filed Problems: 28 | Filed Questions: 3

Recall headline

9/10 planted issues caught

Planted	Caught	Missed
10	9	1

Per-issue verdicts

Issue	Planted description	Verdict	Session(s)
1	Progress bar always shows 100%	caught-exact	`backup-artifact-andlist` (major), `breadth-tour` (major), `restore-destructive-andlist` (minor) — triple-caught
2	Schedule time format mismatch (24h → 12h → 00:00 on reload)	caught-exact	`breadth-tour` (major), `schedule-feature-cluster` (major) — double-caught
3	Notification email empty recipient (option-name typo plural vs singular)	caught-exact	`schedule-feature-cluster` (critical)
4	User export includes hashed passwords	caught-exact	`backup-artifact-andlist` (critical), `export-artifact-andlist` (critical) — double-caught; bonus: session_tokens and WP security keys also flagged
5	Uploads directory missing from backup	caught-exact	`backup-artifact-andlist` (major)
6	No pre-restore backup	caught-exact	`restore-destructive-andlist` (critical)
7	Backups publicly accessible via URL	caught-exact	`backup-artifact-andlist` (critical); sibling-propagation bonus: export SQL also confirmed accessible in `export-artifact-andlist`
8	Corrupt restore truncates database tables	caught-semantically	`restore-destructive-andlist` (major) — Tester found no transaction wrapping + file-extraction-before-database-import ordering, which is the operative failure path even though the specific DROP TABLE framing from ISSUES.md was not mirrored verbatim
9	Large database causes memory exhaustion (`SELECT * FROM table`)	missed	Not filed as a Problem in any session
10	Concurrent backups corrupt zip file (filename collision)	caught-exact	`backup-artifact-andlist` (critical), `concurrent-trigger-seam` (major) — double-caught with a dedicated charter

Miss analysis

Miss 1 — Issue 9: Large database causes memory exhaustion

Planted description: $wpdb->get_results("SELECT * FROM table") loads the entire result set into PHP memory. Large tables hit PHP memory limits. Affected files: includes/class-mb-backup.php ~line 52 and includes/class-mb-export.php ~line 49.

What was filed instead: backup-artifact-andlist filed a [minor] Problem about glob() in list_backups() loading all files without pagination — this is a scale-sensitive path in the backup-listing UI, not in the backup-creation or export code. The memory-exhaustion root cause in the database dump path was not filed.

Root-cause class: Coverage gap / c2 forcing-function dropout

Cross-pilot pattern check: This is the fourth time this issue has been mishandled across magellan-backups pilots (one under-classification, three outright misses):

Pilot 1 (2026-04-23): caught-but-under-classified as minor — Tester observed the pattern but filed it at wrong severity.
Pilot 10 (2026-04-24): missed outright. backup-artifact Tester wrote coverage_notes: "A6 (scale): deprioritized — out of turn budget" and filed nothing.
Pilot 17 (2026-04-28): missed outright. c2 Reinforcement 3 was shipped after Pilot 17 specifically targeting artifact-producer charters; Haiku Tester stack did not invoke it.
This run: missed outright. No c2 coverage note in any session. The backup-artifact-andlist charter did enumerate artifact contents at depth (catching Issues 4, 5, 7) but the c2 scale-sensitive angle was not applied to the database-dump code path.

The retrospectives entry for Pilot 17 (commit ef3205b) shipped c2 Reinforcement 3: "the coverage-note literal is now mandatory on any charter that touches an artifact-producing OR scale-sensitive feature, regardless of whether the charter's primary angle is access-control, lifecycle, etc." That reinforcement did not fire in this run.

Why the reinforcement did not fire: Two contributing factors are visible from the filed reports:

The backup-artifact-andlist Tester was deeply focused on access-control (Issue 7) and artifact contents (Issues 4, 5) — high-salience critical findings consumed the charter budget before the scale-sensitive c2 probe was reached.
The c2 forcing-function literal (scale-sensitive c2 fallback: empirical probe deprioritized out of budget; source pattern filed as <severity> Problem — <file>:<line> <pattern>) does not appear in any session's coverage_notes. The Tester either skipped the c2 check without recording the skip, or the probe was never scheduled.

Generalized amendment proposal

Reference: Run 2026-04-29T15-43-54_magellan-backups Issue 9 / Pilot 1 Issue 9 (under-classified) / Pilot 10 Issue 9 (missed) / Pilot 17 Issue 9 (missed) — fourth consecutive handling failure on the same issue.

Bug class: Unbounded in-memory data-load on any artifact-producer (backup, export, dump, report) that reads database tables via SELECT * without LIMIT/OFFSET or streaming. Present on any plugin or web app that dumps a relational dataset to a file in a single PHP call.
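The bug class is easiest to see next to its bounded counterpart. The sketch below uses Python's `sqlite3` as a stand-in for PHP and `$wpdb`, so it illustrates the pattern rather than the plugin's code: rows are read in fixed-size batches with `fetchmany` instead of materializing the whole table at once. The table name, row count, and chunk size are hypothetical.

```python
import sqlite3


def dump_rows_chunked(conn, table, chunk_size=500):
    """Yield rows in bounded batches instead of loading the full table."""
    # The table name is interpolated only because it is a fixed, trusted value here.
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    while True:
        batch = cursor.fetchmany(chunk_size)
        if not batch:
            break
        yield batch


# Hypothetical in-memory table with 1200 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO posts (title) VALUES (?)", [(f"post {i}",) for i in range(1200)])

batch_sizes = [len(batch) for batch in dump_rows_chunked(conn, "posts")]
print(batch_sizes)  # [500, 500, 200]
```

Peak memory then depends on the chunk size rather than the table size, which is the property a c2 probe is meant to test on any artifact-producer.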

Amendment target: skills/tester-mindset/SKILL.md — c2 rule, specifically the charter-scope trigger condition.

Proposed text (add as a new paragraph immediately after the existing Reinforcement 3 text in the c2 section):

Reinforcement 4 — artifact-producer budget-guard: When a charter's primary angle is access-control, content-inspection, or lifecycle and the charter is typed artifact-producer, the c2 probe MUST be explicitly scheduled in the charter's hypothesis list BEFORE the access-control or content probes run, not after. Testers that encounter high-salience findings early in an artifact-producer charter must complete the c2 check before closing the session. Skipping the c2 check to stay within budget is not permitted — reduce depth on other probes instead. If time truly runs out, the mandatory coverage note (scale-sensitive c2 fallback: empirical probe deprioritized out of budget; source pattern filed as <severity> Problem — <file>:<line> <pattern>) must appear in coverage_notes. A c2-eligible charter with no c2 coverage note at all is an incomplete charter.

Why: magellan-backups Issue 9 has been mishandled four times across four pilots (under-classified in Pilot 1; missed outright in Pilots 10 and 17 and in run 2026-04-29T15-43-54). In every case the Tester ran an artifact-producer charter, found high-salience access-control or content bugs, and exhausted budget before reaching the scale probe. The access-control findings are correct and important, but they cannot displace the c2 probe. The fix is priority scheduling, not rule strengthening.

How to apply on any plugin or web app: any feature that writes a ZIP, SQL file, CSV, PDF, or any bulk data dump should have scale-sensitive c2 check as hypothesis #1 in the charter, before content and access-control hypotheses. This applies to backup plugins, report generators, data-export tools, GDPR export handlers, audit log exporters, and any admin tool that reads a database table and writes it to disk.

Also required: planner enforcement — the planner (.claude/agents/planner.md) should be updated so that when it generates a charter of type artifact-producer, it inserts the c2 check as the first numbered hypothesis in the charter body, before access-control and content hypotheses. This removes reliance on the Tester independently prioritizing the c2 probe when high-salience findings compete for attention.
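The planner-side ordering rule proposed above could be as small as a stable reorder of the hypothesis list. The following sketch assumes a charter is represented by a type string plus a list of hypothesis records carrying a `tag` field. That data shape and the function name are hypothetical, since the planner's real charter format is not shown here.

```python
def order_hypotheses(charter_type, hypotheses):
    """Move c2-tagged hypotheses to the front for artifact-producer charters."""
    if charter_type != "artifact-producer":
        return list(hypotheses)
    # Two passes keep the relative order within each group unchanged.
    c2_first = [h for h in hypotheses if h["tag"] == "c2"]
    others = [h for h in hypotheses if h["tag"] != "c2"]
    return c2_first + others


# Hypothetical charter body with the c2 probe scheduled last.
planned = [
    {"id": "a1", "tag": "access-control"},
    {"id": "a3", "tag": "content"},
    {"id": "c2", "tag": "c2"},
]

ordered = order_hypotheses("artifact-producer", planned)
print([h["id"] for h in ordered])  # ['c2', 'a1', 'a3']
```

Doing the reorder in the planner means the Tester reaches the scale probe first, regardless of which high-salience findings appear later in the session.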

Summary

Recall: 9/10 — consistent with the Pilot 17 result on the same plugin with the same cost-floor stack.

1 miss, 1 root-cause class, 1 proposed amendment:

Issue	Miss type	Amendment target
9 — Large DB memory exhaustion	Coverage gap / c2 budget-displacement	`skills/tester-mindset/SKILL.md` c2 Reinforcement 4 + `planner.md` charter ordering rule

Bonus coverage delivered (not in the answer key but validated independently):

Export SQL publicly accessible without auth (sibling of Issue 7 on export surface — sibling-propagation rule fired)
Options export includes WP security keys (auth_key, nonce_key, logged_in_key)
Posts export includes drafts and private posts
Posts export missing post_meta table
No backup retention policy; accumulation without bound
Backup directory not cleaned on plugin deactivation
Upload-and-restore path lacks confirmation dialog (inconsistency with restore-from-list)
Restore transactional rollback gap (b4 AND-list enumeration)
ZIP extraction silently skips files outside wp-content/ without warning

Cross-pilot pattern check — amendments still holding:

Sibling-propagation rule fired cleanly: Issue 7's public-access pattern propagated to the export surface.
Destructive-op AND-list (b2, b4) fired: no-pre-restore-backup and no-transactional-rollback both enumerated.
Amendment K (default blast-radius) fired: uploads-missing-from-backup filed with explicit blast-radius framing.
Amendment F (view-source): used in export-artifact-andlist to inspect SQL contents.
c2 Reinforcement 3 (artifact-producer trigger): DID NOT FIRE — Issue 9 miss confirms the reinforcement is not displacing access-control salience at budget time. Proposed Reinforcement 4 addresses this.

Issue 9 across all magellan-backups runs:

Run	Outcome	Stack
Pilot 1 (2026-04-23)	under-classified as minor	Opus
Pilot 10 (2026-04-24)	missed outright	Sonnet+Chrome DevTools
Pilot 17 (2026-04-28)	missed outright	Haiku
This run (2026-04-29)	missed outright	Sonnet Manager + Sonnet Planner + Haiku Testers

The issue has never been correctly handled across four distinct harness configurations. The fix must be structural (priority scheduling enforced by the planner), not just a stronger prose reminder in the Tester skill.

final-report.md

Testing Report — magellan-backups

Run ID: 2026-04-29T15-43-54_magellan-backups
Generated: 2026-04-29T16:21:08.655Z
Plugin version: 1.0.0
Sessions processed: 6
Sessions with errors: 1

Executive summary

Category	Count
Problems	28
Questions	3
Improvements	20
Praises	15

Problem severity breakdown

Severity	Count
critical	9
major	14
minor	5
trivial	0

Severity heatmap by area

Area	Critical	Major	Minor	Risk score
Full Site Backup artifact storage (a1)	1	0	0	4
Full Site Backup artifact naming (a2)	1	0	0	4
Full Site Backup artifact contents — database leakage (a3)	1	0	0	4
Selective Export — artifact location (a1)	1	0	0	4
Selective Export — artifact contents (a3 users)	1	0	0	4
Selective Export — artifact contents (a3 posts scope)	1	0	0	4
Selective Export — artifact contents (a3 options)	1	0	0	4
Restore operations — safety nets (b2)	1	0	0	4
Schedule settings form — email field storage and retrieval (class-mb-schedule.php:27 vs line 45)	1	0	0	4
Full Site Backup artifact contents — missing uploads (a3)	0	1	0	3
Full Site Backup artifact lifecycle — deactivation cleanup (a4)	0	1	0	3
Full Site Backup artifact lifecycle — auto-rotation (a4)	0	1	0	3
Full Site Backup default blast radius (a5)	0	1	0	3
Full Site Backup completeness vs UI claim (a6)	0	1	0	3
Full Site Backup progress indicator (progress-indicator oracle from recon S1)	0	1	0	3
Backup & Restore tab — Create Backup section	0	1	0	3
Schedule tab — time field round-trip persistence	0	1	0	3
Manual backup × scheduled cron interaction	0	1	0	3
Selective Export — artifact lifecycle (a4)	0	1	0	3
Selective Export — artifact completeness (a6 posts)	0	1	0	3
Restore operations — upload-and-restore flow	0	1	0	3
Restore operations — safety nets (b4)	0	1	0	3
Schedule settings form — Time selector (class-mb-schedule.php:18-20)	0	1	0	3
Full Site Backup scale-sensitive operations (c2 fallback)	0	0	1	2
Restore operations — UX feedback	0	0	1	2
Backup UI — progress indicator	0	0	1	2
Restore operations — ZIP extraction (b7 consistency)	0	0	1	2
Schedule settings form — email field validation (class-mb-schedule.php:save_settings)	0	0	1	2
Full Site Backup UI/UX	0	0	0	0
Full Site Backup retention	0	0	0	0
Full Site Backup empty state	0	0	0	0
Backup delete confirmation	0	0	0	0
Backup table display	0	0	0	0
Restore feedback	0	0	0	0
Upload & Restore UI	0	0	0	0
Backup creation and display	0	0	0	0
Empty state messaging	0	0	0	0
Export validation	0	0	0	0
Schedule cron registration	0	0	0	0
Role-based access control	0	0	0	0
Unauthenticated access security	0	0	0	0
Backup filename uniqueness	0	0	0	0
Backup process status tracking	0	0	0	0
Admin UI responsiveness	0	0	0	0
Selective Export — UX feedback	0	0	0	0
Selective Export — security	0	0	0	0
Selective Export — data quality	0	0	0	0
Selective Export — scope control	0	0	0	0
Selective Export — error handling	0	0	0	0
Restore operations — capability authorization	0	0	0	0
Restore operations — confirmation UX	0	0	0	0
Restore operations — pre-operation safety	0	0	0	0
Restore operations — transactional safety	0	0	0	0
Restore operations — post-operation feedback	0	0	0	0
Restore operations — ZIP filtering transparency	0	0	0	0
Restore operations — capability enforcement	0	0	0	0
Restore operations — nonce verification	0	0	0	0
Restore operations — file path safety	0	0	0	0
Restore operations — file input validation	0	0	0	0
Restore UI — confirmation for restore-from-list	0	0	0	0
Schedule settings form — email field	0	0	0	0
Schedule cron handler — email double-send	0	0	0	0
Schedule settings form — time selector save/reload	0	0	0	0
Schedule settings form — email storage key	0	0	0	0
Schedule settings form — email field validation	0	0	0	0

Risk score = 4·critical + 3·major + 2·minor + 1·trivial
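
As a minimal sketch of the formula above, the per-area score can be computed from severity counts. The function and dictionary names are illustrative and not part of the report generator.

```python
# Weights taken from the risk score formula in the report.
WEIGHTS = {"critical": 4, "major": 3, "minor": 2, "trivial": 1}

def risk_score(counts):
    """Return the weighted risk score for a dict of severity -> count."""
    return sum(WEIGHTS[severity] * n for severity, n in counts.items())

# One critical problem in an area, as in the first heatmap rows.
print(risk_score({"critical": 1, "major": 0, "minor": 0}))  # 4
```

Each heatmap row holds a single problem, which is why every row scores exactly 4, 3, 2, or 0.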

Top problems

1. [CRITICAL] Backup files are publicly downloadable without authentication

Area: Full Site Backup artifact storage (a1)
Persona affected: admin
Confidence: 1
Session: backup-artifact-andlist

Steps to reproduce:

Create a full site backup via wp-admin/admin.php?page=mb-backups&tab=backup
Note the download link URL shown in the Existing Backups table (e.g., http://localhost:8891/wp-content/magellan-backups/backup-2026-04-29-1600.zip)
Without logging in or authenticating, open that URL in a new browser/tab
Observe: HTTP 200 response and ZIP file downloads directly

Expected: Backup files should require authentication (redirecting to login) or be stored in a web-unreachable location (above the web root, or behind an .htaccess deny rule)

Actual: Backup files are directly accessible from wp-content/magellan-backups/ with no access controls (.htaccess or index.php stub absent). Public users can download any backup ZIP

Evidence: [console](sessions/backup-artifact-andlist/curl output showing HTTP 200 on backup file URL)

Notes: Security issue: backup artifacts can contain password hashes (see problem 3), database dumps, and site configuration. Unauthenticated public access violates the principle of least privilege. Remediation: move backups above the web root, add an .htaccess deny rule (Deny from all on Apache 2.2, Require all denied on Apache 2.4), or serve downloads through an authenticated wrapper
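
The missing guard files noted above can be checked with a small helper. This is a sketch under assumptions: the helper name is hypothetical, and the guard list only reflects the two files the report mentions (.htaccess and an index.php stub).

```python
# Guard files the report found absent from wp-content/magellan-backups/.
GUARD_FILES = (".htaccess", "index.php")

def missing_guards(filenames):
    """Return the guard files that do not appear in a directory listing."""
    present = set(filenames)
    return [guard for guard in GUARD_FILES if guard not in present]

# A listing that mimics the reported state: only the backup ZIP is present.
print(missing_guards(["backup-2026-04-29-1600.zip"]))  # ['.htaccess', 'index.php']
```

In practice the listing could come from `os.listdir()` on the backup directory. The presence of these files does not prove the directory is protected (for example, nginx ignores .htaccess), so the unauthenticated request in the steps above remains the authoritative check.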

2. [CRITICAL] Backup filename collision when two backups created in same minute; second backup overwrites first

Area: Full Site Backup artifact naming (a2)
Persona affected: admin
Confidence: 1
Session: backup-artifact-andlist

Steps to reproduce:

Via WP-CLI or AJAX, trigger MB_Backup::create_backup() twice in rapid succession within the same minute
Example: studio wp eval 'require_once("wp-content/plugins/magellan-backups/includes/class-mb-backup.php"); echo MB_Backup::create_backup();' (call twice)
Check wp-content/magellan-backups/ for file count

Expected: Two distinct backup files created with unique names to prevent data loss

Actual: Both backups receive identical filename: backup-YYYY-MM-DD-HHmm.zip (minute-precision only, no random component). Second backup overwrites the first file. Only one backup persists

Evidence: [console](sessions/backup-artifact-andlist/WP-CLI output)

Notes: Root cause: date('Y-m-d-Hi') on line 21 of class-mb-backup.php uses minute precision. Scheduled backups run at fixed times, and manual backups started by different admins within the same minute will collide. Remediation: add seconds (Y-m-d-His), which narrows but does not close the window, or add a random component (uniqid). High risk if a cron backup and a manual backup race (covered by the separate concurrent-trigger-seam charter)
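
The collision can be illustrated outside PHP. The sketch below mirrors the minute-precision pattern in Python and contrasts it with a name that adds seconds and a random suffix. Both function names are hypothetical, and the suffix scheme is one possible remediation rather than the plugin's code.

```python
from datetime import datetime, timedelta
import secrets

def minute_name(now):
    """Mirror the reported pattern: minute precision, no random part."""
    return now.strftime("backup-%Y-%m-%d-%H%M.zip")

def unique_name(now):
    """Seconds plus a random suffix, so same-second calls still differ."""
    return now.strftime("backup-%Y-%m-%d-%H%M%S-") + secrets.token_hex(4) + ".zip"

first = datetime(2026, 4, 29, 16, 0, 5)
second = first + timedelta(seconds=20)  # same minute, 20 seconds later

print(minute_name(first))                         # backup-2026-04-29-1600.zip
print(minute_name(first) == minute_name(second))  # True: second overwrites first
print(unique_name(first) == unique_name(second))  # False
```

The random suffix matters because adding seconds alone still collides when cron and an AJAX request fire within the same second.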

3. [CRITICAL] Backup artifact includes sensitive data: password hashes and user emails from wp_users table

Area: Full Site Backup artifact contents — database leakage (a3)
Persona affected: admin
Confidence: 1
Session: backup-artifact-andlist

Steps to reproduce:

Create a full site backup via the UI or CLI
Extract the ZIP: unzip backup-YYYY-MM-DD-HHmm.zip
Inspect database.sql: grep -iE 'user_pass|user_email' database.sql

Expected: Database dump excludes sensitive user data (passwords, emails) to reduce risk if backup is compromised or publicly exposed

Actual: database.sql contains full wp_users table including: user_pass (bcrypt password hashes) and user_email (plaintext admin emails). Sample: INSERT INTO wp_users VALUES('1','admin','$wp$2y$10$...','admin','admin@localhost.com',...)

Evidence: [console](sessions/backup-artifact-andlist/grep user_pass in extracted backup.sql found password hash column; grep user_email found plaintext email addresses)

Notes: Severity is elevated by the combination of a1 (public accessibility) and a3 (leakage): unauthenticated users can download backups containing password hashes and emails. Root cause: dump_database() (lines 46-74 of class-mb-backup.php) uses SELECT * without column filtering. Remediation: exclude user_pass and user_email, or encrypt sensitive columns before export, or document the requirement to store backups securely
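
One of the remediations above, excluding sensitive columns, can be sketched as a row filter applied before rows are serialized. The helper name and the blank placeholder are assumptions; the column names are the two flagged in the report.

```python
# Columns the report flags as sensitive in wp_users.
SENSITIVE_COLUMNS = {"user_pass", "user_email"}

def redact_row(row, sensitive=SENSITIVE_COLUMNS, placeholder=""):
    """Return a copy of row with sensitive column values blanked out."""
    return {
        column: (placeholder if column in sensitive else value)
        for column, value in row.items()
    }

row = {
    "ID": 1,
    "user_login": "admin",
    "user_pass": "example-hash",
    "user_email": "admin@localhost.com",
}
print(redact_row(row))
```

A redacted dump can no longer restore logins, so this trades restorability for confidentiality. That trade-off is why the report also lists encryption and secure storage as alternatives.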

4. [CRITICAL] Export SQL files publicly downloadable without authentication

Area: Selective Export — artifact location (a1)
Persona affected: admin
Confidence: 1
Session: export-artifact-andlist

Steps to reproduce:

1. Log in as admin
2. Navigate to Magellan Backups > Selective Export
3. Select any export type (e.g., Users)
4. Click 'Export Selected'
5. From an unauthenticated browser/curl, attempt: curl http://localhost:8888/wp-content/magellan-backups/export-YYYY-MM-DD-HHmm.sql

Expected: Export file should be protected from public download; require authentication or be in a non-web-accessible directory

Actual: Export file returns HTTP 200 and downloads successfully without authentication. Any unauthenticated user can guess the filename (minute-precision timestamp) and download database exports containing sensitive data.

Evidence: [console](sessions/export-artifact-andlist/curl -s -o /dev/null -w 'HTTP Status)

Notes: Export directory is identical to backup ZIP storage location with same visibility settings. Combined with minute-precision guessable filenames, this creates a trivial information disclosure vector for database contents.

5. [CRITICAL] Users export includes password hashes, emails, and session tokens in plaintext SQL

Area: Selective Export — artifact contents (a3 users)
Persona affected: admin
Confidence: 1
Session: export-artifact-andlist

Steps to reproduce:

1. Log in as admin with existing users in the system
2. Navigate to Magellan Backups > Selective Export
3. Check only 'Users' checkbox
4. Click 'Export Selected'
5. Download the export-YYYY-MM-DD-HHmm.sql file
6. Open in text editor and search for: user_pass, user_email, session_tokens

Expected: If exporting users, the export should redact sensitive columns (user_pass, user_email) or warn the admin that password hashes are included

Actual: Export SQL contains unredacted rows from wp_users including: user_pass (bcrypt hashes), user_email, and wp_usermeta including session_tokens and wp_capabilities. Example found: admin user with hash '$wp$2y$10$90kng5a9CaprRS61aXRNr...' and email 'admin@localhost.com'.

Evidence: [console](sessions/export-artifact-andlist/grep user_pass export-2026-04-29-1600.sql returned 3 matches)

Notes: This is a multi-surface a3 probe. The root cause is the same SELECT * pattern without column filtering. Combined with public file accessibility (a1), any unauthenticated visitor can download all password hashes, emails, and session tokens.

6. [CRITICAL] Posts export includes draft and private posts not visible to public

Area: Selective Export — artifact contents (a3 posts scope)
Persona affected: admin
Confidence: 1
Session: export-artifact-andlist

Steps to reproduce:

1. Create posts with different visibility: publish, draft, private
2. Log in as admin
3. Navigate to Magellan Backups > Selective Export
4. Check only 'Posts' checkbox
5. Click 'Export Selected'
6. Download export file and grep for post_status values

Expected: Posts export should only include posts with post_status='publish' (visible to public). Draft and private posts should be excluded or flagged.

Actual: Export SQL includes rows with post_status='draft' (ID 7: 'Draft Post') and post_status='private' (ID 8: 'Private Post'). Admin intended to export only published content but confidential/draft posts are included.

Evidence:

Notes: Content visibility scope is ignored by export_table() SELECT *. This is a data leakage issue where non-public content is exported without admin intent. Combined with public file accessibility (a1), draft/private content is exposed.

7. [CRITICAL] Options export includes WordPress security keys (auth_key, nonce_key, logged_in_key)

Area: Selective Export — artifact contents (a3 options)
Persona affected: admin
Confidence: 1
Session: export-artifact-andlist

Steps to reproduce:

1. Log in as admin
2. Navigate to Magellan Backups > Selective Export
3. Check only 'Options' checkbox
4. Click 'Export Selected'
5. Download export file
6. grep for: auth_key, nonce_key, logged_in_key, secure_auth_key

Expected: Options export should exclude WordPress security constants. These should be redacted or the export should warn the admin.

Actual: Export SQL contains plaintext security keys: auth_key='U4)(Y@YLEQ^;8ruD^5>C=}~...', nonce_key='##T/Ma051+!v$OC|a[UGm4)~R...', logged_in_key=')HxzaJ{L&3/po@I_G+BY1h}v=2U5XBOKbq*avQ_Uul...'

Evidence:

Notes: These keys are used for session/nonce verification. Exporting them allows an attacker to forge valid sessions/nonces. Combined with public accessibility (a1), this is critical.
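
A hedged sketch of the exclusion approach: filter secret option names out before export. The key list follows the one given under suggested improvements in this report. The function name is hypothetical, and the sample values are placeholders.

```python
# Option names the report recommends excluding from the Options export.
SECRET_OPTIONS = {
    "auth_key", "secure_auth_key", "logged_in_key", "nonce_key",
    "auth_salt", "secure_auth_salt", "logged_in_salt", "nonce_salt",
}

def exportable_options(options):
    """Return only the options that are safe to include in an export."""
    return {name: value for name, value in options.items()
            if name not in SECRET_OPTIONS}

options = {"blogname": "Demo", "auth_key": "placeholder", "nonce_key": "placeholder"}
print(exportable_options(options))  # {'blogname': 'Demo'}
```

An allow-list of known-safe options would be stricter than this deny-list, at the cost of more maintenance.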

8. [CRITICAL] No pre-operation snapshot before restore overwrites site data

Area: Restore operations — safety nets (b2)
Persona affected: admin
Confidence: 1
Session: restore-destructive-andlist

Steps to reproduce:

Create a backup via 'Create Full Backup' button
Navigate to Tools → Backups → Backup & Restore
Click 'Restore' next to any backup
Accept the confirmation dialog

Expected: System should automatically create a snapshot of the current site state before the restore begins, allowing rollback if the restore fails

Actual: No automatic pre-restore backup is created. The restore overwrites the database and wp-content/ files directly without preserving the current state

Evidence: console

Notes: The source pattern is unambiguous: ajax_restore() at line 5 immediately dispatches to restore_from_zip() without any wp_schedule_event or backup creation call. If the ZIP is corrupt or the import fails midway, the site is left in an inconsistent state with no automatic recovery point. The only recovery path is to manually restore a different backup, which requires that the user created one beforehand. Severity: critical (data loss risk).
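
The expected behavior can be sketched as a wrapper that takes a snapshot first and aborts the restore if the snapshot fails. This is an illustration only: the wrapper name is hypothetical, and the two callables stand in for the plugin's create_backup() and restore_from_zip().

```python
def restore_with_snapshot(create_backup, restore_from_zip, zip_path):
    """Take a snapshot first; only restore if the snapshot succeeded."""
    snapshot = create_backup()
    if not snapshot:
        raise RuntimeError("pre-restore snapshot failed; restore aborted")
    restore_from_zip(zip_path)
    return snapshot

# Fakes that record call order instead of touching a real site.
log = []
result = restore_with_snapshot(
    lambda: log.append("snapshot") or "pre-restore.zip",
    lambda path: log.append("restore:" + path),
    "backup-2026-04-29-1600.zip",
)
print(result, log)
```

Returning the snapshot name lets the caller show the admin which file to roll back to.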

9. [CRITICAL] Option key mismatch: form saves email to magellan_backups_email (plural) but render reads from magellan_backup_email (singular), causing email not to load on reload

Area: Schedule settings form — email field storage and retrieval (class-mb-schedule.php:27 vs line 45)
Persona affected: admin
Confidence: 0.99
Session: schedule-feature-cluster

Steps to reproduce:

Navigate to Tools > Backups > Schedule tab
Check 'Enable scheduled backups'
Fill 'Notification Email' field with admin-test@example.com
Fill other fields (Time: 13:00, Frequency: Daily)
Click 'Save Schedule'
Via WP-CLI: run 'wp option get magellan_backup_email' (singular) and 'wp option get magellan_backups_email' (plural)
Reload the Schedule tab page

Expected: Email address should be saved to database and display in form on reload. Both 'magellan_backup_email' (singular) and 'magellan_backups_email' (plural) keys should reference the same value or only one should be used consistently.

Actual: Email is saved to 'magellan_backups_email' (plural key) at line 27 but form reads from 'magellan_backup_email' (singular key) at line 45. WP-CLI: magellan_backup_email returns nothing, magellan_backups_email contains the submitted email. Browser form displays empty email field on reload.

Evidence: [console](sessions/schedule-feature-cluster/Test log shows)

Notes: Root cause: line 27 in save_settings() calls update_option( 'magellan_backups_email', ... ) with the plural 'backups', while line 45 in render_settings() calls get_option( 'magellan_backup_email', ... ) with the singular 'backup'. This is a direct key typo mismatch. Impact: (1) The notification address is written but never read back, so the form appears blank on reload even after filling and saving. (2) The email field has no validation (see the email field validation problem from the same session). (3) Scheduled backup completion emails will fail to send because the cron handler (line 80) reads a different key than the one the form writes. Combined with the double-send code at lines 82-84, admins may receive no email at all: the call guarded by if($email) is skipped, and the unguarded call is attempted with a blank recipient (silent wp_mail failure).
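
The mismatch is easy to reproduce with a dictionary standing in for the options table. The sketch below is a Python analogue, not the plugin's PHP. The shared-constant fix is an assumption about one reasonable remedy.

```python
options = {}  # stand-in for the wp_options table

def save_settings_buggy(email):
    options["magellan_backups_email"] = email        # plural key, as on line 27

def render_settings():
    return options.get("magellan_backup_email", "")  # singular key, as on line 45

# Hypothetical fix: one shared constant for every read and write.
EMAIL_KEY = "magellan_backup_email"

def save_settings_fixed(email):
    options[EMAIL_KEY] = email

save_settings_buggy("admin-test@example.com")
print(repr(render_settings()))  # '' (the form reloads blank)

save_settings_fixed("admin-test@example.com")
print(render_settings())        # admin-test@example.com
```

A single constant makes this class of typo impossible to reintroduce at one call site without breaking the others visibly.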

10. [MAJOR] Backup artifact omits wp-content/uploads directory despite UI claiming 'full backup'

Area: Full Site Backup artifact contents — missing uploads (a3)
Persona affected: admin
Confidence: 1
Session: backup-artifact-andlist

Steps to reproduce:

Upload media files to site (posts, featured images, galleries, etc.)
Create a full site backup
Extract the backup ZIP: unzip backup-YYYY-MM-DD-HHmm.zip -d backup-extraction
List contents: ls -la backup-extraction/wp-content/

Expected: Full backup includes database + all of wp-content (plugins, themes, uploads)

Actual: Backup contains wp-content/plugins and wp-content/themes, but NO wp-content/uploads/ subdirectory. Site media, featured images, uploaded PDFs are missing from the backup

Evidence: [console](sessions/backup-artifact-andlist/unzip -l backup.zip | grep uploads returned no matches; ls wp-content/ in extracted backup showed only plugins and themes directories)

Notes: Violates user expectation (a6): 'full backup' must include media to be restorable. Root cause: line 34 of class-mb-backup.php hardcodes $dirs = array('themes','plugins') without uploads. Remediation: add 'uploads' to $dirs array. IMPACT: backups are incomplete and cannot fully restore a site with uploaded media
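
A completeness check along these lines can be automated. The sketch below assumes the archive stores files under wp-content/ prefixes, as the extraction steps suggest. The function name and the required-directory list are illustrative.

```python
import io
import zipfile

REQUIRED_DIRS = ("wp-content/plugins/", "wp-content/themes/", "wp-content/uploads/")

def missing_dirs(zip_file):
    """Return required directory prefixes that have no entries in the archive."""
    with zipfile.ZipFile(zip_file) as archive:
        names = archive.namelist()
    return [d for d in REQUIRED_DIRS if not any(n.startswith(d) for n in names)]

# Build a small in-memory archive that mimics the reported layout.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr("database.sql", "-- dump")
    archive.writestr("wp-content/plugins/example/example.php", "<?php")
    archive.writestr("wp-content/themes/example/style.css", "/* theme */")
buf.seek(0)

print(missing_dirs(buf))  # ['wp-content/uploads/']
```

The same function accepts a path to a real backup ZIP in place of the in-memory buffer.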

Needs human review (confidence < 0.7)

None.

Questions raised

[Restore operations — capability authorization] Do non-admin users (editors, authors) receive a 403/unauthorized error when attempting to POST directly to the AJAX mb_run_restore handler with a valid nonce?
- Why it matters: The b6 anchor requires verifying that the AJAX handler enforces capability checks beyond the UI. If a non-admin can craft a request with a valid nonce and trigger restore, the capability check is bypassed.
[Schedule settings form — email field] Should the email field have required='true' attribute in HTML, or should notifications be silently disabled when email is empty?
- Why it matters: The form currently accepts empty email without warning, leading to silently broken notification feature. Clarifying the intended behavior (required vs optional) will guide the fix.
[Schedule cron handler — email double-send] Is the double wp_mail() at lines 82-84 intentional, with one call outside the if($email) block?
- Why it matters: The second email send will attempt to send to an empty recipient if email is not set, and may generate silent failures. Confirming intent affects whether P2 (key mismatch) needs additional fixes for the email double-send pathway.

Suggested improvements

[Full Site Backup UI/UX] Add confirmation dialog before 'Create Full Backup' button (similar to restore): 'This will create a ~13 MB backup file. Proceed?' to reduce accidental large backups (effort: low) (impact: low)
- Rationale: Users may click 'Create Full Backup' without realizing it creates a new multi-MB file that can fill the disk or interfere with scheduled backups. A confirmation dialog encourages intentional action
[Full Site Backup retention] Add settings for backup retention: max_backups (e.g., 10) and max_age_days (e.g., 30). Implement deletion in cron handler before creating new backup (effort: medium) (impact: high)
- Rationale: Prevents unbounded disk usage; scheduled daily backups accumulate rapidly. Retention policy is standard for backup tools (e.g., UpdraftPlus, BackWPup)
[Full Site Backup empty state] Show helpful message when no backups exist: 'No backups yet. Click 'Create Full Backup' to create your first backup.' Include estimated size (e.g., ~13 MB for this site) (effort: low) (impact: low)
- Rationale: Current message 'No backups found' is passive. Proactive guidance helps new users understand next steps and manage expectations
[Restore feedback] Provide more detailed feedback when restore completes (e.g., 'Restore completed. Site restored to [date]. Admin panel will reload shortly.'). (effort: low) (impact: medium)
- Rationale: Current feedback is minimal: only a JS alert saying 'Restore completed. Reloading...'. More context about what was restored and when would increase admin confidence.
[Upload & Restore UI] Add a file type hint in the Upload & Restore section (e.g., 'Select a .zip backup file'). (effort: low) (impact: low)
- Rationale: While the error message is clear ('Could not open zip file'), a pre-upload hint would prevent the error from occurring in the first place.
[Backup filename uniqueness] Add a random or sequential component to the backup filename, or use a lock mechanism (flock/mutex) to ensure only one backup process can write at a time. Alternatively, append a microsecond-precision timestamp or a random UUID to the filename. Note that PHP's date() always renders the u format character as 000000, so microseconds require (new DateTime())->format('Y-m-d-Hisu') instead of date('Y-m-d-Hisu'). (effort: low) (impact: high)
- Rationale: Prevents filename collisions and silent overwrites when concurrent backup triggers occur. Provides visibility into all backup attempts even if they overlap in time.
[Backup process status tracking] Implement a backup-in-progress flag (transient or option) to prevent concurrent executions. Return a 'backup already in progress' error to the AJAX handler if a backup is already running. (effort: low) (impact: medium)
- Rationale: Prevents partial or corrupted backups that could result from two ZipArchive processes writing to the same file simultaneously, and provides admin feedback.
[Selective Export — UX feedback] Add a warning when exporting sensitive data types (Users, Options). Example: 'Warning: Users export includes password hashes. Ensure this file is kept secure.' (effort: low) (impact: high)
- Rationale: Alerts admin to the sensitivity of what's being exported and encourages proper file handling
[Selective Export — security] Move export files to a non-web-accessible directory (outside the web root, not merely above wp-content/) or add an .htaccess to block direct downloads. Alternatively, serve exports via WordPress admin with capability checks. (effort: medium) (impact: high)
- Rationale: Prevents unauthenticated file downloads; requires valid WordPress session to access exports
[Selective Export — data quality] When exporting Posts, automatically include related wp_postmeta rows for all post IDs in the export. Consider same for other content types. (effort: medium) (impact: high)
- Rationale: Ensures exports are complete and restorable without data loss
[Selective Export — security] Redact sensitive option keys from Options export: auth_key, secure_auth_key, logged_in_key, logged_in_salt, nonce_salt, nonce_key, secure_auth_salt, auth_salt. Or default to excluding these from the export. (effort: medium) (impact: high)
- Rationale: Prevents leakage of WordPress security constants that enable session/nonce forgery
[Selective Export — scope control] Add optional filters to export dialogs: 'Posts (only published)', 'Users (names only, no emails/hashes)', 'Options (exclude security keys)'. Current default is 'export everything'. (effort: high) (impact: high)
- Rationale: Gives admins explicit control over blast radius; prevents accidental over-export of sensitive data
[Restore operations — confirmation UX] Add a confirmation dialog to the upload-and-restore path matching the restore-from-list dialog, showing the backup filename and asking user to confirm before proceeding (effort: low) (impact: high)
- Rationale: Consistency across restore paths improves UX and prevents accidental destructive operations via upload
[Restore operations — pre-operation safety] Implement automatic pre-restore backup: before overwriting the site, create a 'pre-restore-snapshot-.zip' and store it alongside existing backups (effort: medium) (impact: high)
- Rationale: Provides automatic recovery point if restore fails or is unwanted. User can restore from the snapshot if something goes wrong
[Restore operations — transactional safety] Wrap SQL import in a database transaction: BEGIN before import, COMMIT on success, ROLLBACK on any error. Also consider two-phase extraction: validate ZIP and count changes, then execute (effort: medium) (impact: high)
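- Rationale: Prevents partial-state corruption where files are modified but the database is incomplete. Note that MySQL commits implicitly on DDL statements such as DROP TABLE and CREATE TABLE, so a dump that recreates tables is not fully protected by a transaction alone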
[Restore operations — post-operation feedback] After restore completes, display an admin notice with details: 'Restored [N] posts, [M] users, [K] files from [backup-name]. Current database: [rows] records. Last backup before restore: [previous-backup-name].' (effort: low) (impact: medium)
- Rationale: Gives admin confidence in restore success and visibility into what changed
[Restore operations — ZIP filtering transparency] Log and report which files are skipped during extraction: 'Skipped 2 files outside wp-content/ (wp-config.php, .htaccess).' Show this in admin feedback (effort: low) (impact: low)
- Rationale: Prevents user confusion about partial restores
[Schedule settings form — time selector save/reload] Fix time format mismatch by storing and displaying time consistently (either both 24h or both 12h). Remove the 12h conversion at lines 18-20, or add reverse-conversion on render to map '1:00 PM' back to '13:00'. (effort: low) (impact: medium)
- Rationale: The 24h→12h conversion breaks the form because the selector uses 24h option values. Admins cannot edit the time after save because the selector cannot match the stored 12h string to any 24h option, causing it to reset to 00:00 on reload.
[Schedule settings form — email storage key] Standardize the option key name: change line 27 from update_option('magellan_backups_email', ...) to update_option('magellan_backup_email', ...) OR change lines 45 and 80 to use 'magellan_backups_email'. Prefer the singular form 'magellan_backup_email', since both read sites already use it and only the write at line 27 needs to change. (effort: low) (impact: high)
- Rationale: The singular/plural key mismatch means the email is saved under one key and read from another. This is a critical bug affecting the core feature. The fix touches one line (or two if the plural key is kept) and is low risk.
[Schedule settings form — email field validation] Add validation to require a valid email address before form submission. Either: (a) add validation in save_settings() before update_option() calls, (b) add HTML5 'required' attribute to the email input element, or (c) add client-side validation in admin.js. (effort: low) (impact: medium)
- Rationale: The form currently accepts empty email without error, resulting in silent failure of the notification feature. Admins have no feedback that the field is required or that notifications are disabled.
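
The time selector item in the list above describes a round-trip failure that is easy to miss. The sketch below models it in Python under stated assumptions: the selector offers hourly 24h values, and an unmatched stored value falls back to the first option. All names are illustrative.

```python
# Assumed selector values: hourly, 24h format.
SELECT_OPTIONS = [f"{hour:02d}:00" for hour in range(24)]

def to_12h(value):
    """Convert 'HH:MM' to 'H:MM AM/PM', as the save path reportedly does."""
    hour, minute = map(int, value.split(":"))
    suffix = "AM" if hour < 12 else "PM"
    return f"{hour % 12 or 12}:{minute:02d} {suffix}"

def selected_option(stored):
    """Pick the matching option, falling back to the first one."""
    return stored if stored in SELECT_OPTIONS else SELECT_OPTIONS[0]

print(to_12h("13:00"))                   # 1:00 PM
print(selected_option(to_12h("13:00")))  # 00:00 (selector resets on reload)
print(selected_option("13:00"))          # 13:00 (storing 24h round-trips)
```

Storing the 24h value unchanged and converting only for display keeps the stored string comparable to the option values.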

What works well (praises)

[Backup delete confirmation] Delete action includes nonce verification and JavaScript confirm() dialog ('Delete this backup?')
- Why: Double protection against accidental deletion via CSRF + user intent confirmation. Follows WordPress security best practices
[Backup table display] Existing Backups table shows file name, size, date, and action links (Download/Restore/Delete) clearly
- Why: Clear information architecture; users can quickly identify backups they want to restore and understand backup age/size
[Backup creation and display] Backup files are created with proper metadata (filename, size, timestamp) and displayed immediately in the Existing Backups table.
- Why: The backup flow works smoothly: click button → AJAX creates ZIP → table updates with correct file info. No confusing delays or missing fields.
[Empty state messaging] The 'No backups found.' message is clear and helpful when the Existing Backups table is empty.
- Why: Admin immediately understands that no backups exist, vs. a confusing empty table or missing section.
[Export validation] The Selective Export validates that at least one content type is selected before attempting export, with a helpful error message.
- Why: Prevents silent failures or empty .sql files. Error message 'Select at least one type.' is clear.
[Schedule cron registration] Scheduled backups register a proper WP cron event (mb_scheduled_backup) that persists across page reloads and is visible in wp cron event list.
- Why: Backend integration is solid. The cron event is properly registered and will trigger automated backups as intended.
[Role-based access control] Non-admin users (Editor role) are properly denied access to the backup page with a clear error message.
- Why: Capability checks are in place. Editors see 'Sorry, you are not allowed to access this page.' instead of exposing sensitive backup functionality.
[Unauthenticated access security] Unauthenticated visitors are redirected to the WordPress login page when attempting to access the backup page.
- Why: Standard WordPress security behavior. The admin page itself exposes no backup data to unauthenticated users, although the backup files remain directly downloadable by URL (see problem 1).
[Admin UI responsiveness] The 'Create Full Backup' button properly disables on click and shows 'Creating Backup...' status
- Why: Provides good UX feedback during the backup operation and prevents rapid double-click within the same session. Button state change at assets/js/admin.js:5 works correctly.
[Selective Export — error handling] Empty export error handling works correctly
- Why: When admin clicks 'Export Selected' with zero checkboxes, UI displays alert 'Select at least one type.' This prevents silent failures and is clear feedback.
[Restore operations — capability enforcement] The AJAX handler (line 7, class-mb-restore.php) correctly checks current_user_can('manage_options') before allowing restore
- Why: Prevents privilege escalation. Only administrators can trigger restores, which is appropriate for this destructive operation
[Restore operations — nonce verification] The AJAX handler (line 6, class-mb-restore.php) verifies the nonce via check_ajax_referer('mb_backup')
- Why: CSRF protection is in place. The restore handler cannot be triggered by cross-site attacks
[Restore operations — file path safety] ZIP extraction restricts files to wp-content/ directory via path-prefix check (line 48, class-mb-restore.php: strpos($name, 'wp-content/') === 0)
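- Why: Rejects ZIP entries whose names do not start with wp-content/, which blocks the simplest zip-slip payloads. A prefix check alone may still pass names such as wp-content/../../wp-config.php, so path normalization would be needed for full protection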
[Restore operations — file input validation] Backup file name is sanitized via sanitize_file_name() (line 11, class-mb-restore.php), and uploaded files are checked for UPLOAD_ERR_OK
- Why: Prevents directory traversal and handles upload errors gracefully
[Restore UI — confirmation for restore-from-list] The restore-from-list flow correctly shows a JS confirm() dialog before proceeding: 'Restore from [filename]? This will overwrite your current site.'
- Why: Gives the admin a last-chance to cancel a destructive operation. This is good UX and safety practice
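
The path-prefix check praised in the file path safety item compares raw entry names against wp-content/. A common hardening, sketched here as an assumption rather than the plugin's code, normalizes the name before comparing so that ../ segments cannot slip past the prefix test.

```python
import posixpath

def is_safe_entry(name, root="wp-content/"):
    """Accept only entries that stay under root after normalization."""
    if name.startswith("/") or "\\" in name:
        return False  # absolute paths and backslash separators are rejected
    normalized = posixpath.normpath(name)
    return (normalized + "/").startswith(root)

print(is_safe_entry("wp-content/themes/a/style.css"))  # True
print(is_safe_entry("wp-content/../wp-config.php"))    # False
print(is_safe_entry("wp-config.php"))                  # False
```

The second case starts with wp-content/ and would pass a plain prefix comparison, which is the gap normalization closes.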

Coverage gaps

Session	Status	Turns	Flows	Notes
`concurrent-trigger-seam`	complete	8/8	3/4	cross-feature interaction probed: manual-backup × cron-backup → Y: shared-resource collision (concurrent-trigger seam fallback: empirical probe achieved best effort timing (3 separate 1-minute backups created), but exact same-minute collision not empirically achieved within budget constraints; source-pattern analysis conclusively demonstrates the bug: create_backup() at class-mb-backup.php:21 uses date('Y-m-d-Hi') with minute precision, no file lock, called by both AJAX and cron hook — ZipArchive::OVERWRITE will silently overwrite a concurrently-written backup file).
`export-artifact-andlist`	complete	6/12	8/9	All anchor probes (a1-a6) executed. Empty-state error handling verified as a bonus probe. Scale-sensitive c2 fallback: export_table() performs SELECT * without LIMIT on all export types; empirical probe confirmed unbounded inclusion of all rows for every table type. Multi-surface a3 probes run independently on Users (confirmed leakage), Posts (confirmed draft/private inclusion), and Options (confirmed sensitive keys).

Invalid / failed session reports

`recon`

No report.json produced

Token usage & cost

Computed from Claude Code transcripts at ~/.claude/projects/<proj-hash>/. Rates from config/pricing.json. Window: 2026-04-29T15:43:54Z → 2026-04-29T16:21:07Z (with ±10min buffer for dispatch drift).

Estimated total cost for this run: $17.07

Category	Cost	% of total
Fresh input	$0.07	0.4%
Output	$2.12	12.4%
Cache-create (5m)	$3.67	21.5%
Cache-create (1h)	$2.77	16.2%
Cache-read	$8.43	49.4%
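
The percentage column can be reproduced from the category costs and the reported total. This is a small arithmetic check with illustrative names; it does not use config/pricing.json.

```python
# Category costs and run total as reported in the table above.
COSTS = {
    "Fresh input": 0.07,
    "Output": 2.12,
    "Cache-create (5m)": 3.67,
    "Cache-create (1h)": 2.77,
    "Cache-read": 8.43,
}
TOTAL = 17.07

def share(cost, total=TOTAL):
    """Return cost as a percentage of total, rounded to one decimal."""
    return round(100 * cost / total, 1)

for name, cost in COSTS.items():
    print(f"{name}: {share(cost)}%")
```

The rounded category costs sum to $17.06, one cent below the reported total, which is consistent with per-row rounding.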

Manager (main conversation)

Total: $6.90

Model	Messages	Input	Output	Cache-5m	Cache-1h	Cache-read	Cost
`claude-opus-4-7`	1	1	1,025	0	1,709	527,307	$0.31
`claude-sonnet-4-6`	100	12,477	71,158	0	458,880	9,114,349	$6.59

Subagents (10 invocations)

Total: $10.17

Model	Messages	Input	Output	Cache-5m	Cache-1h	Cache-read	Cost
`claude-haiku-4-5-20251001`	678	35,654	122,912	1,298,309	0	50,042,763	$7.28
`claude-sonnet-4-6`	45	55	27,653	546,602	0	1,428,801	$2.89

Per-subagent breakdown (10 sessions)

Agent ID	Type	Models	Cost
`a11c0a3768f89f125`	tester	claude-haiku-4-5-20251001	$0.72
`a260adcb180acd1cf`	tester	claude-haiku-4-5-20251001	$1.73
`a2ada575bfb07702e`	tester	claude-haiku-4-5-20251001	$1.21
`a5838f2cd760572aa`	general-purpose	claude-sonnet-4-6	$1.04
`a64dc2c13220c4a73`	tester	claude-haiku-4-5-20251001	$1.07
`a6e555a8ea7aeb0e9`	tester	claude-haiku-4-5-20251001	$1.22
`aa0837cbc4848496b`	planner	claude-sonnet-4-6	$1.62
`ad2e3e28eca6a9a77`	planner	claude-sonnet-4-6	$0.24
`ad96332e1bfc794e7`	tester	claude-haiku-4-5-20251001	$0.49
`ae617f860d04273bc`	tester	claude-haiku-4-5-20251001	$0.83

Recommended next steps

Triage Full Site Backup artifact storage (a1) first: it carries the highest risk score (4), tied with the eight other critical areas in the heatmap
Address 9 critical problem(s) before release
Follow up on 2 session(s) with incomplete coverage
Investigate 1 session(s) that failed to produce valid reports