- 3 hypotheses silently skipped (CT-2, CT-3, SE-4 never empirically probed)
- 6 surfaces from recon/coverage not addressed (F6 plugin lifecycle — breadth-tour skipped entirely)
- 0 AND-list items scored on aggregate when per-path was needed
- 1 round-trip probe missing (export × re-import — SE-4 deprioritized without empirical discharge)
- 2 Questions that look like Amendment I drift (b4/b7 rollback from source; SCH-5 email from source)
- Forcing-function strings missing from 3 sessions
backup-artifact-andlist — all 8 hypotheses (a1–a6, a3-selective-export, a-progress) recorded in hypotheses_status with probed verdicts. Scale-sensitive c2 fallback correctly filed in coverage_notes. No silent skips.
restore-destructive-andlist — b1–b7 and b-default-scope all recorded. b4 and b7 verdicts lean on source inspection (see Check 6 below). No silent skips in hypotheses_status terms, but empirical depth is shallow.
selective-export-cluster — SE-4 (re-import duplicate risk) marked deprioritized with only source-pattern evidence, no empirical probe.
- [selective-export-cluster] SE-4 (re-import creates duplicates) — deprioritized with source-pattern rationale only; charter required empirical probe OR c2-style fallback Problem filing. Coverage_notes do not contain the mandatory c2/fallback literal for this item. Severity: HIGH — export×re-import round-trip is a key correctness guarantee for a selective export feature.
schedule-feature-cluster — SCH-6 (weekly day-of-week) marked inconclusive — acceptable given UI inspection confirms no selector. All other hypotheses probed.
concurrent-trigger-cross-feature — CT-2 and CT-3 both marked deprioritized after CT-1 budget exhaustion.
- [concurrent-trigger-cross-feature] CT-2 (lock/mutex existence) — deprioritized; no source grep or empirical lock probe attempted; no mandatory fallback Problem filed. Severity: HIGH — the session filed a Question about concurrent locks but never confirmed or denied the lock mechanism's existence.
- [concurrent-trigger-cross-feature] CT-3 (double-click JS protection) — deprioritized; low-severity standalone, but the cross-feature seam implication (JS lock doesn't protect cron path) was noted.
Check 2a (hypothesis coverage): static-analysis.md does not exist (source_path not set, Phase 1.5 skipped). Per invocation instructions, Check 2 surface-map parity is skipped. No "I bet" items appear in charter files beyond the standard charter hypothesis blocks — no 2a gap.
Check 2b: N/A (no static-analysis.md).
Recon surfaces:
- S1 (Delete has no confirmation) — probed and refuted by restore-destructive-andlist (b1). Correctly handled.
- S2 (pre-created backup at first load) — probed via a5 in backup-artifact-andlist (cron on activation confirmed). Covered.
- S3 (frequency dropdown: only daily/weekly, no day-of-week) — probed via SCH-6 in schedule-feature-cluster (inconclusive disposition noted). Covered.
- S4 (no progress feedback on backup creation) — probed via a-progress in backup-artifact-andlist (hardcoded 100% confirmed). Also a breadth-tour BT-F1-3 obligation — but breadth-tour was skipped.
- S5 (no disk space warnings or retention policies) — probed via a4/P6 in backup-artifact-andlist (indefinite accumulation confirmed). breadth-tour BT-F5-3 also targeted this but was skipped.
Gap: breadth-tour was the only charter targeting 18 breadth-level probes across F1–F6 including all recon S2/S4/S5 breadth dispositions and the ENTIRE F6 (plugin lifecycle) surface. breadth-tour has status skipped_this_wave with no explanation in the manifest.
- [breadth-tour] F6 plugin lifecycle (activation/deactivation/uninstall, cron cleanup, admin menu visibility, PHP notices) — zero sessions cover this surface. BT-F6-1, BT-F6-2, BT-F6-3 never executed. Severity: HIGH — F6 lifecycle is exclusively assigned to breadth-tour; no other charter covers activation/deactivation probe, zip-slip upload security test (BT-F2-3), file-type validation (BT-F2-1), accessibility labels (BT-F4-3), or admin notice feedback probes (BT-F3-3, BT-F2-2).
- [breadth-tour skipped] BT-F2-3 (zip-slip / path traversal on upload) — never probed. No charter other than breadth-tour covers this. Severity: HIGH — path traversal on a file upload is a critical security probe class.
- [breadth-tour skipped] BT-F2-1 (file-type validation on Upload & Restore) — never probed. Non-ZIP upload acceptance is untested. Severity: MEDIUM.
- [breadth-tour skipped] BT-F4-1 (Export Selected with 0 checkboxes) — never probed by any session. Severity: MEDIUM.
backup-artifact-andlist: all anchors (a1–a6, multi-surface extensions) enumerated as discrete hypotheses_status entries. No aggregate scoring detected.
restore-destructive-andlist: b1 through b7 and b-default-scope enumerated discretely. b1 correctly split into two sub-verdicts (Delete and Restore). Per-handler scoring applied correctly.
No AND-list aggregate-scoring gaps detected. Note: b6 (capability gate) was probed against the admin page and a subscriber curl test — single-path. The plugin has both admin-post + wp_ajax handler surfaces per the restore charter. However, b6 only asserts manage_options gate existence (confirmed via source inspection and subscriber curl), not full per-AJAX-handler enumeration. Low-severity given source-verified gate.
Export × Re-import (SE-4) — The selective export feature explicitly needs a round-trip probe: export SQL → re-import → verify no duplicates. SE-4 was deprioritized without empirical execution. The coverage_notes state source-pattern analysis only ("INSERT without INSERT IGNORE"). This is the canonical export×import round-trip pair for a plugin that bills itself as a selective export tool.
- Gap: HIGH severity — SE-4 export×re-import round-trip not empirically probed. Filed as deprioritized with source rationale but no empirical discharge and no mandatory fallback Problem filing.
Backup × Restore round-trip — restore-destructive-andlist probed the restore blast radius (b-default-scope) empirically: created post, took backup, restored, verified post gone. Round-trip identity semantically probed. Coverage note: default blast radius probed: Restore → post-backup content destroyed? → Y would be expected literal; actual coverage_notes say "b-default-scope blast radius fully validated." Partially satisfies the round-trip requirement.
Schedule save × reload — SCH-2 save-roundtrip probed empirically and a bug confirmed. Round-trip coverage adequate for this pair.
restore-destructive-andlist — b4 and b7: Both verdicts cite class-mb-restore.php source inspection as primary evidence ("Source inspection reveals sequential $wpdb->query() calls without atomic transactions"). No empirical partial-failure probe (truncated ZIP upload, forced timeout) was executed. The charter explicitly calls for a "syntactically valid but content-incomplete ZIP" empirical test for b7. The Tester notes in the coverage_notes: "Partial-failure testing (b4, b7) limited to CLI artifact inspection due to turn budget constraints." This is Amendment I drift — source inspection filed as the verdict instead of a probe attempt.
- [restore-destructive-andlist] b4 (transaction/rollback) and b7 (partial-failure consistency) — verdicts derived from source inspection without empirical probe. Coverage_notes acknowledge the limitation but no fallback Problem was filed with the mandatory literal. Severity: HIGH — these are high-impact safety mechanism verdicts; source inspection misses runtime behavior (e.g., WP's $wpdb wrapper could have its own rollback semantics).
schedule-feature-cluster — SCH-5: Coverage_notes state "SCH-5 email delivery probed via source analysis instead of manual trigger." The deviation field confirms: "SCH-5 (email delivery) probed via source analysis instead of manual trigger; wp_mail implementation verified, option-name mismatch identified from source review." The option-name mismatch is a genuine high-value find, but the verdict was reached purely from source inspection — no manual cron trigger + mail trap check was performed. The charter explicitly says: "CLI: studio wp --path=${SITE_PATH} cron event run — trigger backup cron manually. Check mail log or studio mail trap for sent email."
- [schedule-feature-cluster] SCH-5 (email delivery) — verdict confirmed-bug from source inspection alone; no empirical mail-trap probe attempted. The bug may be real (option name mismatch is convincing), but the empirical path (cron manual trigger + mail check) was not run. Severity: MEDIUM — source inspection is compelling for this specific typo, but Amendment I requires empirical attempt. Filing as medium because the source evidence is high-confidence.
No overlay-shaped widgets (lightbox, modal, drawer, dropdown, popup) were observed or claimed in any session. The plugin's UI is straightforward tab-based admin with no frontend output. No Amendment H classification miss identified.
Mission.md has no explicit ## Must-cover flows content (section left blank: "Fill in based on static analysis + recon. Leave blank to let the Manager infer from the surface."). No must-cover flow violations possible. Check 8: N/A.
Coverage matrix flags these anchor types for this plugin:
- F1: artifact-producing, DB-writing, scale-sensitive, destructive-operation → backup-artifact-andlist covered a1–a6 + multi-surface; scale-sensitive c2 filed as fallback (acceptable). Probe quota met.
- F2: destructive-operation, file-upload, DB-writing → restore-destructive-andlist covered b1–b7 + b-default-scope. File-upload ZIP-slip NOT covered (breadth-tour skipped). Gap: BT-F2-3 zip-slip is unprobed.
- F3: destructive-operation, artifact-producing → restore-destructive-andlist covered b1–b7 for delete. Coverage adequate.
- F4: artifact-producing, DB-writing, scale-sensitive → selective-export-cluster covered SE-1 through SE-5. SE-4 deprioritized (see Check 5). c2 scale-sensitive fallback noted in coverage_notes. Probe quota marginally met (4/5 probed empirically).
- F5: settings-form, artifact-producing, DB-writing, output-rendering → schedule-feature-cluster covered SCH-1 through SCH-6. 5/6 empirically probed. SCH-5 partial Amendment I drift.
- F6: DB-writing → zero probes (breadth-tour skipped). Activation/deactivation, cron cleanup, admin menu visibility, PHP notices — none probed. Severity: HIGH.
Required strings and their presence:
| Session | Required literal | Present? |
|---|---|---|
| backup-artifact-andlist | default blast radius probed: ... |
YES — "Default blast radius confirmed: Y" in hypothesis evidence (P7 evidence) |
| backup-artifact-andlist | scale-sensitive c2 fallback: empirical probe deprioritized out of budget; source pattern filed... |
YES — present in coverage_notes |
| restore-destructive-andlist | default blast radius probed: Restore → post-backup content destroyed? → [Y/N] |
PARTIAL — "b-default-scope blast radius fully validated" but not the exact mandatory literal |
| selective-export-cluster | empty-state probed: [verdict] |
MISSING — coverage_notes does not contain the literal string "empty-state probed:" (Reinforcement 5 mandatory) |
| selective-export-cluster | scale-sensitive c2 fallback: ... |
MISSING — coverage_notes mentions c2 by name but does not contain the exact fallback literal |
| schedule-feature-cluster | save-roundtrip verified: ... |
MISSING — coverage_notes says "SCH-2 save-roundtrip bug confirmed" but not the exact format "save-roundtrip verified: time submitted=X → stored=Y → displayed=Z → match? [yes |
| concurrent-trigger-cross-feature | cross-feature interaction probed: manual backup × cron backup → [Y/N: shared-resource collision] |
YES — present verbatim in coverage_notes |
Gaps flagged (low severity — underlying probes ran but literals missing):
- [selective-export-cluster] missing
empty-state probed:literal despite SE-5 being probed (graceful empty state confirmed). Severity: LOW. - [selective-export-cluster] missing exact c2 fallback literal. Severity: LOW.
- [schedule-feature-cluster] missing exact
save-roundtrip verified:format string. Severity: LOW. - [restore-destructive-andlist]
default blast radius probed:literal paraphrased rather than verbatim. Severity: LOW.
Recon identified this as an admin-only plugin with no frontend components, no external API calls, no CDN resources, no third-party JS, and no OAuth integrations. No external URLs detected in session reports or recon.md. No external-resource-failure probes required. Check 11: N/A.
No starter content, demo importers, patterns, or sample data declared in recon or coverage. The plugin does not ship any user-facing content that an admin could "publish unchanged." Check 12: N/A.
This plugin has no frontend routes, templates, or rendered patterns — admin-only. Session reports do assert content-level verdicts on artifacts (ZIP contents, SQL column inspection, cron event lists) rather than status-level only. No route-content-depth violations for the artifact probes executed. The breadth-tour skipped content would have included lifecycle probes with CLI verification — those are now missing entirely (see Check 3/9), but the executed sessions use content-level assertions throughout. Check 13: No additional gaps beyond the breadth-tour skip.
-
breadth-tour skipped entirely (F6 unprobed + BT-F2-3 zip-slip + miscellaneous breadth probes) — F6 (plugin lifecycle: activation, deactivation, cron cleanup, admin menu visibility, PHP error log) has zero coverage. BT-F2-3 (zip-slip path traversal on Upload & Restore) is a critical security probe that was never attempted. No other charter covers these surfaces.
- Re-dispatch suggestion: one supplementary Tester with a mini-charter: "F6 lifecycle + BT-F2-3 zip-slip: (1) deactivate plugin → verify cron removed + backup files persist; (2) activate → verify directory + options created; (3) upload a specially-crafted ZIP with ../../wp-config.php entry and verify no path traversal; (4) access admin.php?page=mb-backups as subscriber (role=subscriber); (5) navigate all three tabs with WP_DEBUG_LOG enabled and check debug.log." max_turns: 8.
-
SE-4 (export × re-import round-trip) not empirically probed — the export×import compositional pair was explicitly chartered but marked deprioritized without an empirical attempt or a mandatory fallback Problem filing. The source pattern (INSERT without INSERT IGNORE) strongly suggests duplicates, but the empirical probe was not run.
- Can be appended to the mini-charter above: (6) import the Posts SQL export, check post count before and after for doubling.
-
b4/b7 rollback verdict from source inspection only (Amendment I drift) — restore rollback and partial-failure consistency verdicts are based on source inspection without any empirical truncated-ZIP or interrupted-restore probe. These are high-impact safety mechanism verdicts.
- Can be appended to the mini-charter above: (7) upload a syntactically valid but truncated ZIP file to Upload & Restore; trigger restore; verify site state remains consistent (not partially overwritten).
-
CT-2 (lock/mutex) never confirmed or denied — the concurrent-trigger session deprioritized CT-2 and filed a Question. Whether a lock prevents concurrent backup overwrites is an unresolved correctness question for a backup plugin's primary reliability guarantee.
- Can be appended to the mini-charter above: (8) grep plugin source for transient/flock/is_running patterns to confirm or deny lock presence; file as Problem if absent.
- CT-3 (double-click JS protection) — low standalone value; the charter notes it explicitly does not protect the cron seam, which was probed (CT-1). Acceptable to leave as deprioritized.
- SCH-5 empirical email probe missing — the option-name mismatch finding from source is high-confidence (typo is deterministic); empirical mail-trap check would confirm but is unlikely to change the verdict. Medium-severity Amendment I drift but finding is strong.
- Forcing-function literal strings missing from 3 sessions — underlying probes ran; literals were paraphrased rather than verbatim. Acceptable for this run; note for future amendment tightening.
- Scale-sensitive c2 fallback — both backup-artifact-andlist and selective-export-cluster correctly invoked the c2 fallback protocol. Acceptable given Haiku budget.
4 high-severity gaps, 5 low-severity gaps


