- 0 hypotheses silently skipped (all hypotheses have verdicts)
- 1 recon-flagged surface insufficiently probed (b6 empirical probe deferred, question filed without empirical evidence)
- 1 AND-list item scored via source-only instead of empirical (b6 across both restore paths)
- 3 round-trip / compositional probes missing (export×import, Pages a3, cron-deactivation lifecycle)
- 1 Question filed only from source inspection with no empirical probe attempt (restore b6)
- 2 forcing-function strings missing (export-artifact-andlist missing scale-sensitive c2 fallback literal; concurrent-trigger-seam missing the required literal form)
- 1 charter pending (breadth-tour status=pending — no report)
All hypotheses across the five completed sessions carry a probed verdict with Y/N outcome. No silent skips detected.
Notable deprioritizations (low severity, rationale present):
- restore-destructive-andlist — b6 (capability gate for Editor role): empirical AJAX probe deferred ("CLI nonce generation failed"). Verdict was confirmed via source inspection only (current_user_can('manage_options') at line 7). A Question was filed. See Check 6.
- restore-destructive-andlist — b7 (partial-failure consistency): deferred to source analysis; no empirical corrupted-ZIP test executed.
- export-artifact-andlist — multi-type merge test (a6 fallback): explicitly skipped due to turn budget. Coverage_notes acknowledge the skip.
- concurrent-trigger-seam — flow 4 (JS button-disable): deprioritized in favor of the cross-feature race. Source-pattern evidence cited.
Check 1 verdict: No gap.
Static-analysis.md does NOT exist for this run (Phase 1.5 skipped — no source_path in mission). Check 2 and Check 2b are skipped per the prompt instructions.
Recon flagged seven surprises (S1–S7). Status:
| Surprise | Description | Coverage status |
|---|---|---|
| S1 | Progress bar hardcoded 100% | Probed in backup-artifact-andlist (P9) and restore-destructive-andlist (P5). Confirmed. |
| S2 | Email double-send + option-key mismatch | Probed in schedule-feature-cluster (H2, P2). Confirmed. |
| S3 | Restore silent — no detail about what restored | Noted in restore-destructive-andlist (P4, minor). Addressed. |
| S4 | ZIP extraction silently skips non-wp-content files | Addressed in restore-destructive-andlist (b7, P6). |
| S5 | Schedule time 24h→12h format mismatch on reload | Probed in schedule-feature-cluster (H1, P1). Confirmed. |
| S6 | DB dump lacks FK/indexes beyond SHOW CREATE TABLE | NOT probed empirically. No session's hypotheses_status or coverage_notes explicitly verifies S6. The export-artifact-andlist charter instruction references S6, but the session report does not. |
| S7 | Capability check: no delegation to sub-admin roles | Partially addressed via source-only check in restore-destructive-andlist. |
Gap — S6 (LOW severity): Recon noted "Database dump includes SHOW CREATE TABLE and INSERT statements but no schema-level constraints beyond what's in SHOW CREATE TABLE." The export-artifact-andlist charter explicitly instructs the session to "verify whether the exported SQL is importable/complete for schema reconstruction (recon S6)." No session report addresses this: no import attempt was made and no schema completeness check is logged. This is an incomplete round-trip, because artifacts were inspected for content but not tested for re-importability.
b6 (capability gate) — restore-destructive-andlist: The charter requires a per-handler verdict: "does a non-admin (Editor role) receive a 403/error when POSTing directly to mb_run_restore with a valid nonce?" The session's verdict is source-inspection only (source confirmed current_user_can('manage_options') at line 7). The empirical AJAX probe was deferred due to CLI nonce-generation failure. No curl test as Editor was executed. The Question filed ("Do non-admin users receive 403 when POSTing directly to mb_run_restore?") acknowledges this gap.
The restore feature has two AJAX paths: restore-from-list and upload-and-restore. The charter's b6 success criterion covers both paths via the single AJAX handler mb_run_restore. Source analysis shows one capability check at line 7 of class-mb-restore.php, which covers both paths. However, b6 was scored aggregate-on-source with no empirical confirmation on either path.
HIGH severity gap: b6 is an AND-list item requiring empirical verification. Source-only pass on a capability check is a confirmed miss pattern in the harness history.
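The deferred empirical probe could look roughly like the Python sketch below. It only builds the POST request and maps a status code to a b6 verdict; it does not send anything. The `/wp-admin/admin-ajax.php` endpoint and the `action` parameter are standard WordPress AJAX conventions, but the nonce field name (`nonce`), the cookie value, and the helper names `build_restore_probe` and `b6_verdict` are assumptions for illustration, not taken from the plugin source.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_restore_probe(site_url, editor_cookie, nonce, nonce_field="nonce"):
    """Build (but do not send) a direct POST to the mb_run_restore AJAX action.

    nonce_field is an assumption: the plugin may expect a different field name.
    """
    body = urlencode({"action": "mb_run_restore", nonce_field: nonce}).encode("ascii")
    return Request(
        site_url.rstrip("/") + "/wp-admin/admin-ajax.php",
        data=body,
        headers={
            "Cookie": editor_cookie,  # session cookie of a logged-in Editor
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def b6_verdict(status):
    """Map an HTTP status to the charter's Y/N question for b6."""
    if status == 403:
        return "Y"  # handler rejected the Editor
    if 200 <= status < 300:
        return "N"  # not rejected by status; the response body still needs review
    return "inconclusive"  # redirects, 400s, 5xx: investigate manually
```

To run the probe, the request would be passed to `urllib.request.urlopen`, which raises `urllib.error.HTTPError` (with the status in `.code`) on a 403. This should only be pointed at a disposable site, since a missing gate would execute a real restore.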
| Feature pair | Round-trip expected | Coverage status |
|---|---|---|
| Backup create × restore | round-trip identity: does restoring a backup reproduce the original site state? | NOT probed. restore-destructive-andlist deferred full end-to-end restore execution ("Deferred full end-to-end restore execution due to risk of overwriting test site without pre-snapshot"). |
| Export × import | does the export SQL re-import cleanly into a fresh DB? | NOT probed (see S6 gap above). |
| Schedule save × reload | time format round-trip | Probed in schedule-feature-cluster (H1). Confirmed bug. |
| Plugin activate × deactivate × reactivate | does cron deactivation clean up? | Partially probed (cron removed on disable; deactivation not confirmed). Schedule-feature-cluster H4 explicitly tested enable→disable→re-enable cron lifecycle but the charter success criterion also required "Y/N — does deactivation remove the cron hook?" — the report does NOT document wp cron event list output AFTER deactivation. |
| Export Pages a3 | does Pages export have same leakage pattern as Posts/Users/Options? | NOT probed. Export-artifact-andlist a3 probed Users (users leakage), Posts (scope), Options (security keys), but Pages was NOT independently probed. |
HIGH severity gaps:
- Backup create × restore round-trip not probed. The full end-to-end flow (create backup → restore from it → verify site state matches) was explicitly deferred. This is the primary feature promise of a backup plugin. The skip rationale ("risk of overwriting test site") is understandable but the gap is functionally significant.
- Export × import round-trip not probed. No session verified that any export SQL file is actually importable. The artifact was inspected for content, not for usability as a restore artifact. Recon S6 flagged schema completeness risk; no session addressed it.
- Pages export a3 not probed. The export-artifact-andlist charter's multi-surface rule requires independent a3 probes on EACH export type. Users (leakage), Posts (scope/draft inclusion), Options (security keys) were all probed. Pages was silently skipped — no entry in hypotheses_status, no coverage note acknowledgment. This is a gap in the multi-surface rule enforcement.
- Cron deactivation outcome not documented. H4 coverage_notes report only disable/re-enable lifecycle. The charter success criterion "Y/N — does deactivation remove the cron hook?" has no empirical evidence logged.
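The round-trip identity check described in the gaps above comes down to comparing a snapshot of site state taken before the backup with one taken after the restore. A minimal Python sketch follows, assuming each snapshot is a plain mapping from a row identifier to its content; the key format (`post:1`) and the function names are hypothetical, and how the snapshots get collected is left open.

```python
import hashlib


def fingerprint(rows):
    """Map each row id to a SHA-256 digest of its content."""
    return {
        row_id: hashlib.sha256(content.encode("utf-8")).hexdigest()
        for row_id, content in rows.items()
    }


def roundtrip_diff(before, after):
    """Report rows that are missing, extra, or changed after a restore."""
    b, a = fingerprint(before), fingerprint(after)
    return {
        "missing": sorted(set(b) - set(a)),
        "extra": sorted(set(a) - set(b)),
        "changed": sorted(k for k in set(b) & set(a) if b[k] != a[k]),
    }


def roundtrip_ok(before, after):
    """True only when the restored state matches the pre-backup state exactly."""
    return not any(roundtrip_diff(before, after).values())
```

The same comparison would apply to the export × import case by snapshotting the source tables and the freshly imported database.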
restore-destructive-andlist — b6 Question:
Question filed: "Do non-admin users (editors, authors) receive a 403/unauthorized error when attempting to POST directly to the AJAX mb_run_restore handler with a valid nonce?"
The session deviation note says: "Deferred empirical b6 AJAX capability test (CLI nonce generation failed) but source analysis confirms capability check; verified via code inspection."
The capability check exists in source, but the empirical test — that the check actually fires and returns a 403 — was not executed. The charter explicitly called for "CLI: curl as editor with AJAX action mb_run_restore + nonce." The Question was filed from source inspection alone.
HIGH severity gap (per Check 6 calibration table: "Question with only source-inspection evidence (no empirical probe)").
No overlay-shaped widgets (lightbox, modal, drawer, dropdown, popup) are present in this plugin. The UI is admin-only with standard WordPress admin forms and AJAX patterns. No custom-widget classification risk detected. Confirmed by recon's "Frontend surfaces: None" and session reports which describe standard confirm() JS dialogs (native browser behavior).
Check 7 verdict: No gap.
Mission.md ## Must-cover flows section is blank ("Fill in based on static analysis + recon. Leave blank to let the Manager infer from the surface."). No explicit must-cover flows to verify.
Check 8 verdict: N/A (no must-cover flows specified).
| Feature | Traits | Probe quota met? |
|---|---|---|
| F1 Full Site Backup | artifact-producing, scale-sensitive, AJAX-exposed, DB-writing, destructive-operation | YES — backup-artifact-andlist: 9 hypotheses in hypotheses_status, all probed. AND-list a1–a6 complete. |
| F2 Restore from Existing | destructive-operation, AJAX-exposed | YES — b1–b7 all enumerated with verdicts. b6 verdict source-only (see Check 4, 6). |
| F3 Upload & Restore | destructive-operation, AJAX-exposed, file-upload | Partially. b1 probed for upload path (no confirmation dialog). b6 not empirically confirmed for upload path specifically. |
| F4 Selective Export | artifact-producing, scale-sensitive, AJAX-exposed, DB-writing | PARTIAL — a1–a6 enumerated but Pages type NOT independently probed under a3. |
| F5 Scheduled Backups | settings-form, artifact-producing, scale-sensitive, DB-writing | YES — all 5 hypotheses H1–H5 probed. |
| F6 Backup List/Lifecycle | destructive-operation, output-rendering, AJAX-exposed | YES — covered within backup-artifact-andlist. |
Check 9 verdict: Two partial gaps (F3 upload-and-restore b6 per-path, F4 Pages a3).
| Session | Required forcing-function string | Present? |
|---|---|---|
| backup-artifact-andlist | empty-state probed: ... | YES — "Empty-state probe passed (No backups found message present)" |
| backup-artifact-andlist | default blast radius probed: ... | YES — "scale-sensitive c2 fallback: source pattern filed" present |
| backup-artifact-andlist | scale-sensitive c2 fallback: ... | YES |
| export-artifact-andlist | scale-sensitive c2 fallback: ... | NO — coverage_notes: "Scale-sensitive c2 fallback: export_table() performs SELECT * without LIMIT on all export types; empirical probe confirmed unbounded inclusion." Substance is present but literal form `scale-sensitive c2 fallback: source pattern filed as <severity> Problem — <file>:<line>` is absent. LOW severity gap. |
| export-artifact-andlist | empty-state probed: ... | Implicit — probed via "Export with zero checkboxes" but literal string not present in coverage_notes. LOW severity gap. |
| restore-destructive-andlist | default blast radius probed: ... | YES — "default blast radius probed: restore-from-list overwrites entire wp-content/ directory + database (confirmed by code lines 45-60 + 39-42)" |
| restore-destructive-andlist | cross-feature interaction probed: upload-and-restore with ZIP containing path traversal | YES — noted as probed in coverage_notes |
| schedule-feature-cluster | save-roundtrip verified: ... | YES — confirmed in H1 verdict |
| concurrent-trigger-seam | `cross-feature interaction probed: manual-backup × cron-backup → Y \| N: shared-resource collision` | NO (required literal form missing) |
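A strict, case-sensitive check for the forcing-function literals can be sketched in Python as below. It is stricter than the judgement applied in the table, which accepts some paraphrased notes. The prefixes are copied from the table; the `REQUIRED_PREFIXES` mapping and the `missing_literals` helper are hypothetical names, and only the export-artifact-andlist entries are filled in.

```python
# Literal prefixes each session's coverage_notes are expected to contain.
# Only one session is listed here; the rest would follow the same shape.
REQUIRED_PREFIXES = {
    "export-artifact-andlist": [
        "scale-sensitive c2 fallback: source pattern filed as",
        "empty-state probed:",
    ],
}


def missing_literals(coverage_notes, required):
    """Return the required prefixes that do not appear verbatim in the notes."""
    return [prefix for prefix in required if prefix not in coverage_notes]
```

Run against the export-artifact-andlist note quoted in the table, both prefixes are reported missing, because the note capitalizes "Scale-sensitive" and never uses the "source pattern filed as" form.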
Recon notes no external resource dependencies. The plugin is admin-only, no Google Fonts, no CDN, no external APIs, no analytics, no social SDKs. All operations are self-contained (local ZIP, local SQL, local wp-cron, local wp_mail).
Check 11 verdict: No gap.
This plugin ships no starter content, patterns, sample data, or demo importers. No content-authoring UX probe is relevant.
Check 12 verdict: N/A.
This is an admin-only plugin with no public routes, patterns, or block styles. There are no "route 200 OK" pass claims on public-facing content. The session reports assert functionality through empirical CLI evidence (grep, unzip, curl output) or browser observation — content-level verification rather than status-level.
One status-level concern: restore-destructive-andlist confirms the restore mechanism exists (source: current_user_can('manage_options')) but does not execute a full restore to verify the content-level outcome — what is actually in the DB/filesystem after restore. This overlaps with the round-trip gap in Check 5.
Check 13 verdict: One low-severity concern (overlaps with Check 5 round-trip gap, not a new finding).
The breadth-tour charter is status: pending in manifest.json (sessions_total: 5, sessions_completed: 5 — the breadth tour was never dispatched or its report is absent). The coverage.md lists breadth-tour as covering F1–F6, and several Check 10 items are expected from it (BT-SEC-a Editor capability check, BT-SEC-b unauthenticated access, BT-F3b non-ZIP upload validation, BT-F2b delete verification). These breadth-specific probes have no coverage in the five completed sessions.
HIGH severity gap: The breadth-tour was explicitly designed to cover BT-SEC-a (Editor cannot access backup page) and BT-SEC-b (unauthenticated access redirect). These capability and access-control probes are not covered by any completed session. BT-F3b (non-ZIP file upload validation) is also unprobed.
1. b6 empirical probe missing (restore-destructive-andlist): The Editor-role AJAX capability check was confirmed by source only. Empirical curl probe with Editor nonce never ran. One targeted mini-session: create Editor user, obtain nonce, curl mb_run_restore and mb_run_export as Editor, verify 403. 15-minute probe, no browser needed.
2. Backup create × restore round-trip not probed: The plugin's core promise — backup and restore — was never tested end-to-end. No session verified that a site restored from a backup is functionally equivalent to the pre-backup state. Suggest one targeted session: create content, create backup, modify content, restore, verify content matches. 20-minute probe.
3. Export × import round-trip not probed (recon S6): Export SQL files were inspected for content but never imported into a fresh DB. Schema completeness (recon S6), missing post_meta, and importability of security-key-including Options export are all unconfirmed. Suggest extending the round-trip session above to also import an export file and verify row counts.
4. Pages export a3 not probed (multi-surface rule dropout): The export charter's multi-surface rule requires independent a3 probes on each export type. Pages was silently skipped — not deprioritized with rationale, just absent. If Pages export has the same draft/private inclusion bug as Posts, it would be a separate filed issue. 5-minute targeted probe.
5. Breadth-tour pending — BT-SEC-a, BT-SEC-b, BT-F3b unprobed: The breadth charter was never dispatched. The capability/access control probes (Editor role UI access, unauthenticated redirect) and non-ZIP upload validation have no coverage in any completed session. A single breadth-tour session would close all three gaps.
- Cron deactivation outcome not documented (H4): Charter required `wp cron event list` after `wp plugin deactivate`. The session confirmed enable→disable lifecycle but not the deactivation hook. Low risk: deactivation hook is a known WordPress pattern, but empirical gap remains.
- export-artifact-andlist c2 fallback literal not in standard form: Substance present, literal format deviated. No recall impact.
- Recon S6 schema completeness not probed: Low-frequency issue but could mask an importability bug in the SQL artifact.
- restore-destructive-andlist b7: Source analysis sufficient for path-traversal filter, but corrupted-ZIP empirical probe (partial failure consistency) was deferred. Low recall risk — the source pattern is deterministic.
- Single targeted mini-charter "restore-roundtrip-and-capability" would close gaps 1, 2, 3: empirical b6 capability probe (Editor curl) + end-to-end backup×restore round-trip + export SQL import test. Estimated: 25–30 minutes, Haiku model sufficient.
- Breadth-tour dispatch would close gap 5: use the existing breadth-tour.md charter as-is. Estimated: 30–45 minutes, Haiku.
- Pages export a3 probe (gap 4): add as a single flow in the breadth-tour session or the round-trip session.
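For the cron-deactivation gap (H4) listed above, the missing evidence is a check of the cron event list after the plugin is deactivated. A small Python sketch, assuming the list is captured with `wp cron event list --format=json`; the hook name `mb_scheduled_backup` used in the usage line is hypothetical, since the report does not name the plugin's cron hook.

```python
import json


def hook_scheduled(cron_list_json, hook):
    """Return True if the given hook appears in a WP-CLI cron event list (JSON)."""
    events = json.loads(cron_list_json)
    return any(event.get("hook") == hook for event in events)


def deactivation_verdict(cron_list_json, hook):
    """Answer the charter's Y/N: does deactivation remove the cron hook?"""
    return "N" if hook_scheduled(cron_list_json, hook) else "Y"
```

Usage would be `deactivation_verdict(output, "mb_scheduled_backup")`, where `output` is the JSON captured after `wp plugin deactivate`.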
5 high-severity gaps, 4 low-severity gaps
Breadth-tour session (breadth-tour) completed: 13 flows planned, 13 executed (1 reordered, no coverage impact). Duration: ~6 minutes. Model: haiku-4-5.
| # | Gap | Pass 2 status | Rationale |
|---|---|---|---|
| 1 | b6 empirical capability probe missing (Editor role AJAX) | PARTIALLY CLOSED | BT-SEC-a confirmed that an Editor navigating to the backup admin page receives "Sorry, you are not allowed to access this page." — the menu/page-level capability gate is empirically verified. However, the original gap required a direct curl POST to the mb_run_restore AJAX handler with a valid Editor nonce, confirming that the AJAX handler itself returns 403 before executing restore logic. That AJAX-layer probe was not performed in the breadth tour. The UI-level check is necessary but not sufficient: a misconfigured wp_ajax_nopriv_ or a missing check_ajax_referer could still allow direct AJAX calls even when the admin page gate works. Gap is reduced but not eliminated. |
| 2 | Backup create × restore round-trip not probed | STILL OPEN | BT-F2a executed a restore happy path (confirm dialog → JS alert "Restore completed. Reloading..." → session reauth) and verified the flow completes. It did NOT verify post-restore site state: no content was seeded before the backup, no mutation was made before the restore, and no post-restore state comparison was performed. The round-trip identity check (create content → backup → modify content → restore → verify content matches) remains unexecuted. |
| 3 | Export × import round-trip not probed | STILL OPEN | BT-F4a confirmed that selecting Posts and clicking Export produces a .sql file with a download link. No import of that SQL file into a fresh DB was attempted. Schema completeness (recon S6) and importability of the security-key-including Options export remain unverified. |
| 4 | Pages export a3 not probed (multi-surface rule dropout) | STILL OPEN | BT-F4a probed Posts export only. No breadth-tour hypothesis covers Pages as an independent export type. The charter did not include a BT-F4c for Pages. The multi-surface rule dropout from the original export-artifact-andlist session was not remediated. |
| 5 | Breadth-tour pending (BT-SEC-a/b, BT-F3b) | CLOSED | BT-SEC-a: Editor visits backup page → "Sorry, you are not allowed to access this page." (pass). BT-SEC-b: unauthenticated curl → HTTP 302 to wp-login.php (pass). BT-F3b: uploaded .txt file → alert "Error: Could not open zip file." (pass, good validation feedback). All three probes executed with empirical evidence. |
- P1 (major): Progress bar hardcoded to 100% on page load — confirmed with screenshot (duplicate of original P9/P5 finding, now re-confirmed empirically).
- P2 (major): Schedule time format mismatch — 24h input (13:00) stored as "1:00 PM", displayed as "00:00" on reload — confirmed empirically with CLI evidence (duplicate of original H1/P1 finding, now re-confirmed).
- I1 (improvement): Restore feedback verbosity; I2: Upload file-type hint pre-upload.
- Praises: role-based access control (BT-SEC-a), unauthenticated redirect (BT-SEC-b), export validation, empty-state messaging, cron registration.
1 gap closed, 1 gap partially closed, 3 gaps remain open.
- Closed (1): Breadth-tour pending probes (BT-SEC-a, BT-SEC-b, BT-F3b) — all executed with empirical pass verdicts.
- Partially closed (1): b6 UI-level access control confirmed; AJAX-layer direct-POST probe still missing.
- Still open (3): Backup create × restore round-trip; Export × import round-trip; Pages export a3 multi-surface probe.
Revised count: 3 high-severity gaps open, 1 partially closed, 4 low-severity gaps unchanged.
The two round-trip gaps (backup×restore, export×import) and the Pages export probe remain the top re-dispatch priorities. A single 25–30 minute "restore-roundtrip-and-pages" mini-charter (Haiku model, no browser needed for export import, browser for restore state verification) would close all three. The b6 AJAX-layer gap can be closed in under 5 minutes with a targeted curl probe as an Editor user.


