@alopezari
Created April 29, 2026 13:20
Magellan Pilot — magellan-backups 1.0.0 | Opus Manager (1M ctx) + Sonnet Planner + Haiku Testers + playwright-cli-headed | 19 Problems (5 crit, 14 major) / 0 Q / 19 I / 9 ! | /usage cost: $22 (token-optimization comparison vs baseline gist c69c35a)

Coverage gaps — magellan-backups 2026-04-29T12-33-03 (pass 2)

Pass 2 reassesses the run after the supplementary breadth-tour Tester completed. Six valid session reports now exist. Pass 1 flagged 2 high-severity gaps; pass 2 verdicts: Gap 1 partially closed, Gap 2 still open, plus newly-visible gaps from the breadth-tour report.

Summary

  • Gap 1 (breadth-tour unowned probes): PARTIALLY CLOSED. 5 of 8 BT hypotheses now have empirical/source evidence on file. 3 remain unprobed: BT4 (true zero-content export — admitted by Tester), BT5 (cron next-run timestamp), BT6 (deactivation cron cleanup). BT5+BT6 explicitly deprioritized for budget — turns_used 30/30.
  • Gap 2 (Backup × Restore round-trip): STILL OPEN. No charter, including the supplementary breadth-tour, composed create→restore→verify-data-integrity. The marquee feature loop remains empirically unverified.
  • NEW: BT3 Amendment I drift. Breadth-tour filed the upload double-submit-protection finding as confirmed-bug from source inspection only (admin.js lines 41–62), with no empirical double-click probe. The probe was bounded (one-line UI test), so the source-only verdict is drift, not a documented fallback.
  • NEW: BT4 verdict misclassified. Tester wrote result: pass for empty export-state but notes admit "Fresh site contains default Privacy Policy page, so true zero-content export not tested." A pass on a probe the Tester admits did not actually run is a classification error — should be deprioritized or blocked.
  • NEW: forcing-function literal missing. Breadth-tour's coverage_notes does NOT carry the charter-mandated literal empty-state probed: F1 → ... | F4 → ... (charter notes_for_tester required this verbatim).

Gaps by check

Check 1: Hypothesis coverage

Breadth-tour (supplementary) hypothesis status:

| ID | Charter probe | Verdict | Empirical? | Notes |
|---|---|---|---|---|
| BT1 | Empty backup list (F1 empty-state) | probed/pass | YES | "No backups found." observed after delete |
| BT2 | Stale Restore button on missing file | probed/pass | YES | Alert "Restore failed: Backup file not found." |
| BT3 | Upload double-submit protection | probed/confirmed-bug | NO (source only) | Amendment I drift; see Check 6 |
| BT4 | Export zero-content empty-state (F4 empty-state) | probed/pass | NO (Tester admits) | Default Privacy Policy page present; verdict misclassified |
| BT5 | Cron next-run timestamp matches configured hour | deprioritized | NO | "turn budget exhausted before testing schedule modification" |
| BT6 | Deactivation cron cleanup (wp_clear_scheduled_hook) | deprioritized | NO | "additional turns to test empirically" |
| BT7 | Delete backup × stale Restore button | probed/pass | YES | Row disappears after reload |
| BT8 | Progress bar 100% on page load | probed/confirmed-bug | YES | Snapshot evidence (generic [ref=e169]: 100%) |

5/8 closed cleanly. 3/8 still open (BT4 misclassified pass, BT5 + BT6 deprioritized to budget).
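
The 5/8 versus 3/8 split can be re-derived from the table. The following is a minimal sketch, not the reviewer's actual tooling: the `evidence` encoding and the rule "closed means probed with empirical or source evidence on file" are assumptions made here to match the pass-2 verdicts.

```python
# Breadth-tour rows from the table above: id -> (verdict, evidence kind).
# The "evidence" values are this sketch's own encoding of the "Empirical?" column.
BT_ROWS = {
    "BT1": ("probed/pass", "empirical"),
    "BT2": ("probed/pass", "empirical"),
    "BT3": ("probed/confirmed-bug", "source"),
    "BT4": ("probed/pass", "none"),  # Tester admits the probe did not actually run
    "BT5": ("deprioritized", "none"),
    "BT6": ("deprioritized", "none"),
    "BT7": ("probed/pass", "empirical"),
    "BT8": ("probed/confirmed-bug", "empirical"),
}


def split_closed_open(rows):
    """Closed = probed AND some evidence on file; everything else stays open."""
    closed = sorted(
        bt_id
        for bt_id, (verdict, evidence) in rows.items()
        if verdict.startswith("probed/") and evidence != "none"
    )
    still_open = sorted(bt_id for bt_id in rows if bt_id not in closed)
    return closed, still_open


closed, still_open = split_closed_open(BT_ROWS)
print(f"{len(closed)}/{len(BT_ROWS)} closed: {closed}")
print(f"{len(still_open)}/{len(BT_ROWS)} open: {still_open}")
```

Under this rule BT3 counts as closed despite the Amendment I drift, because source evidence is on file; BT4 does not, because its pass verdict has no probe behind it.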

Other charters' hypothesis enumeration was already verified clean in pass 1; no regressions.

Check 2: Static-analysis hypothesis coverage AND surface-map parity

SKIPPED. Phase 1.5 did not run (mission has no source_path); no static-analysis.md exists.

Check 3: Recon-flagged surface coverage

Re-mapping S1–S8 with breadth-tour evidence applied:

| Recon | Surface | Empirical? | Source |
|---|---|---|---|
| S1 | Time field 24h↔12h round-trip | source-only | schedule SC2 (drift, see Check 6) |
| S2 | Upload validation gap | YES | restore b-upload + breadth BT3 (BT3 drift, S2 covered by b-upload) |
| S3 | RecursiveDirectoryIterator special-cases | NO | source-pattern fallback only |
| S4 | DB dump escape round-trip on special chars/binary | NO | Gap 2; see Check 5 |
| S5 | wp-content/magellan-backups web-gated? | YES | a1, E4: confirmed-bug |
| S6 | INSERT-only additive semantics | YES | E2 confirmed-bug |
| S7 | Email key mismatch | YES | SC1 confirmed-bug |
| S8 | Double email send | source-only | SC8 (drift, see Check 6) |

Net change since pass 1: S2 strengthened (now covered on two paths). Others unchanged.

Check 4: AND-list aggregate vs per-handler

No regression. backup a1–a6 and restore b1–b7 + b-upload all enumerated independently. b1 correctly identified per-path asymmetry (existing-backup confirm dialog YES; upload path confirm dialog NO). b6 capability check scored on both paths. Clean.

Check 5: Round-trip / compositional probes

  • Backup × Restore (marquee round-trip): STILL OPEN. No charter — including the supplementary breadth-tour — empirically composed create-special-content → backup → restore → assert content survives. Recon S4 explicitly warned about DB-dump escape on special chars/binary; the round-trip is unverified end-to-end. HIGH.
  • Export × restore-from-export: still out-of-scope-bounded; no end-to-end SQL import. MEDIUM (unchanged from pass 1).
  • Save × reload (schedule SC2): still source-only, browser path was charter-mandated. MEDIUM drift.
  • Activate × deactivate × activate (BT6): deprioritized for budget. Now bounded but still unprobed. MEDIUM (downgraded from HIGH because the charter generated it; the gap is now budget-shaped, not coverage-shaped).

Check 6: Empirical-probe-is-mandatory (Amendment I)

  • BT3 (NEW) — upload double-submit protection: filed as confirmed-bug with source citation (admin.js lines 41–62). No empirical rapid-click test was attempted. The probe shape is trivially bounded (5 seconds: select ZIP → click → click again → observe AJAX request count). Source-only verdict on an empirically cheap probe is Amendment I drift. MEDIUM. Recommend reclassifying to Question or running the empirical probe.
  • SC8 double-email: still filed as both confirmed-bug AND Question (Amendment I overclaim, unchanged from pass 1). MEDIUM.
  • SC2 time round-trip: still source-only despite charter mandating browser path. MEDIUM. Unchanged.
  • b4/b7 transaction wrapper: charter explicitly authorized source-pattern fallback. Clean.
  • CT1 concurrent collision: empirical attempt ran (both fired in 14:53); confidence appropriately reduced to 0.85. Clean.

Check 7: Amendment H classification

N/A — admin-only plugin, no overlay UI. Clean.

Check 8: Must-cover flows

N/A — mission.md ## Must-cover flows is empty.

Check 9: Feature anchor completeness

  • F1: a1–a6 + BT1 empty-state empirical. Clean.
  • F2: b1–b7 + BT2 stale-restore + BT7 delete×Restore-button. Clean.
  • F3: b-upload + BT3 (BT3 source-only — see Check 6). Mostly clean.
  • F4 Selective Export (artifact-producing, scale-sensitive): E1–E4 + BT4. GAP: empty-state for F4 (BT4) admitted "true zero-content export not tested" — the only assigned empty-state probe for F4 did not run. MEDIUM.
  • F5: SC1–SC5 + BT5. BT5 cron-next-run-timestamp deprioritized for budget. MEDIUM.
  • F6: SC8 + BT6 deactivation lifecycle. BT6 deprioritized for budget. MEDIUM (Check 11 also affects F6).

Check 10: Coverage-note forcing-function strings

  • backup-andlist: default blast radius probed: ..., multi-surface a3 probes: ... literals present. Clean.
  • restore-andlist: default blast radius probed: ... literal present. Clean.
  • export-hypothesis-cluster: multi-surface a3 probes: ... literal present. Clean.
  • concurrent-trigger-seam: cross-feature interaction probed: ... literal present (in hypotheses_status text). Clean.
  • schedule-feature-cluster: empty-required and toggle-state probes ran but exact literals not strictly verbatim in coverage_notes. LOW. (unchanged)
  • breadth-tour (NEW): charter required empty-state probed: <feature> → <verdict> for F1 and F4. coverage_notes reads "Breadth tour probed 7 of 8 planned hypotheses across all 6 features. BT1, BT2, BT7, and BT8 fully empirically tested..." — the literal empty-state probed: string is absent. The probes ran (BT1 empirically, BT4 partially), but the forcing-function literal is missing. LOW (probe-ran, literal-missed).

Check 11: External-resource-failure probe coverage

  • wp_mail in run_scheduled_backup() (F6): still no charter probes the wp_mail failure path (SMTP unreachable, mail returns false, malformed email). MEDIUM. Unchanged from pass 1.
  • No CDN/font/external-image deps. Clean.

Check 12: Content-authoring UX probe coverage

N/A — plugin ships no patterns, demos, or starter content. Clean.

Check 13: Route-content-depth probe coverage

  • E3 content-type filtering: content-level evidence ('Types: posts' header). Clean.
  • BT1 empty-state pass claim: content-level (text "No backups found." observed). Clean.
  • BT7 delete×Restore: content-level (table row disappearance). Clean.
  • BT4 export empty-state pass claim: status-level only ("export succeeds gracefully") — content-level not asserted because true zero-content scenario was not actually constructed. Already counted under Check 9 / Check 1. MEDIUM.
  • SC5 cron register/unregister: still source-only, no empirical CLI verification of wp cron event list after browser save round-trip. MEDIUM. Unchanged.

Recommendation

High-severity gaps (would block "complete")

  1. Backup × Restore round-trip never empirically composed — recon S4 (DB-dump escape on special chars/binary) was the explicit warning; no charter created-special-content → backed-up → restored → verified. The marquee feature loop is unverified end-to-end. Closure requires a focused supplementary charter (single Tester, ~10 turns).

Medium-severity gaps (acceptable with rationale, but worth noting)

  • BT4 misclassified pass — verdict should be deprioritized or blocked; F4 true empty-state never empirically probed.
  • BT5 cron next-run timestamp: deprioritized for budget; it was the only planned empirical test of "save-time-13 → cron next-run aligns to 13:00".
  • BT6 deactivation cron cleanup: deprioritized for budget; no empirical wp cron event list was run after plugin deactivation.
  • BT3 Amendment I drift — confirmed-bug filed from source-only on a trivially bounded empirical probe.
  • S3 RecursiveDirectoryIterator filesystem edge-cases never empirically exercised (cheap to probe).
  • SC2 time-field save→reload done from source rather than browser (charter explicitly mandated browser path).
  • SC5 cron register/unregister done from source rather than empirical CLI round-trip (charter explicitly mandated CLI verification).
  • SC8 double-email filed as both confirmed-bug AND Question (Amendment I overclaim).
  • F6 wp_mail failure-path never probed.
  • Breadth-tour empty-state probed: forcing-function literal missing from coverage_notes.

Re-dispatch suggestion

A second supplementary Tester would close the only remaining HIGH gap:

Mini-charter: roundtrip-verifier — seed special-character + emoji + escape-sensitive content into a post and an option (e.g., wp post create --post_content="O'Brien said \"hi\"; SQL: \\n\\\\backslash 漢字 🔥"), Create Full Backup, restore from a known older backup to wipe state, then restore the special-content backup, assert via wp post get and wp option get that content survived byte-for-byte. Budget 8 turns, 15 min wallclock. Closes recon S4 and the marquee compositional probe in one pass.

A third supplementary on BT4/BT5/BT6 would only close MEDIUM gaps and is below the re-dispatch threshold per meta-review.md Phase 3 (one supplementary per pass).
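
The "byte-for-byte" assertion in the mini-charter can be made concrete. This is a minimal sketch of one way a Tester might compare the seeded content with the value read back after restore (for example from wp post get). The helper names and the seed literal are illustrative assumptions, not part of the charter.

```python
import hashlib


def first_mismatch(expected: bytes, actual: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(expected, actual)):
        if a != b:
            return offset
    if len(expected) != len(actual):
        # One value is a strict prefix of the other.
        return min(len(expected), len(actual))
    return None


def check_roundtrip(seeded: str, restored: str) -> dict:
    """Compare seeded vs restored content on their UTF-8 bytes."""
    exp, act = seeded.encode("utf-8"), restored.encode("utf-8")
    return {
        "ok": exp == act,
        "expected_sha256": hashlib.sha256(exp).hexdigest(),
        "actual_sha256": hashlib.sha256(act).hexdigest(),
        "first_mismatch": first_mismatch(exp, act),
    }


# Illustrative seed modelled on the mini-charter's example content.
SEED = 'O\'Brien said "hi"; SQL: \\n\\\\backslash 漢字 🔥'

# Simulated lossy restore: a dump/import cycle that collapses doubled backslashes.
lossy = SEED.replace("\\\\", "\\")

print(check_roundtrip(SEED, SEED)["ok"])
print(check_roundtrip(SEED, lossy)["ok"], check_roundtrip(SEED, lossy)["first_mismatch"])
```

Comparing bytes rather than rendered text matters here because the failure recon S4 warns about (escape handling in the DB dump) can leave content that looks similar on screen while differing in backslashes or multi-byte characters.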


1 high-severity gap, 10 medium-severity gaps

Testing Report — magellan-backups

  • Run ID: 2026-04-29T12-33-03_magellan-backups
  • Generated: 2026-04-29T13:13:46.730Z
  • Plugin version: 1.0.0
  • Sessions processed: 6


Executive summary

| Category | Count |
|---|---|
| Problems | 19 |
| Questions | 3 |
| Improvements | 19 |
| Praises | 9 |

Problem severity breakdown

| Severity | Count |
|---|---|
| critical | 5 |
| major | 14 |
| minor | 0 |
| trivial | 0 |

Severity heatmap by area

| Area | Critical | Major | Minor | Trivial | Risk score |
|---|---|---|---|---|---|
| Schedule settings form | 0 | 2 | 0 | 0 | 6 |
| Backup & Restore — artifact security | 1 | 0 | 0 | 0 | 4 |
| Backup & Restore — artifact contents | 1 | 0 | 0 | 0 | 4 |
| Backup & Restore tab — visual feedback | 1 | 0 | 0 | 0 | 4 |
| Selective Export — artifact storage and access control | 1 | 0 | 0 | 0 | 4 |
| Restore flow - Upload path | 1 | 0 | 0 | 0 | 4 |
| Backup & Restore — filename generation | 0 | 1 | 0 | 0 | 3 |
| Backup & Restore — feature completeness | 0 | 1 | 0 | 0 | 3 |
| Backup & Restore — lifecycle management | 0 | 1 | 0 | 0 | 3 |
| Backup & Restore — default behavior | 0 | 1 | 0 | 0 | 3 |
| Upload & Restore form submission (F3) | 0 | 1 | 0 | 0 | 3 |
| Backup creation — concurrent triggers (F1 manual × F5 cron) | 0 | 1 | 0 | 0 | 3 |
| Selective Export — Users export | 0 | 1 | 0 | 0 | 3 |
| Selective Export — import semantics | 0 | 1 | 0 | 0 | 3 |
| Restore flow - Safety net (b2) | 0 | 1 | 0 | 0 | 3 |
| Restore flow - SQL execution (b4, b7) | 0 | 1 | 0 | 0 | 3 |
| Upload & Restore form (b-upload) | 0 | 1 | 0 | 0 | 3 |
| Scheduled backup email notification | 0 | 1 | 0 | 0 | 3 |
| Backup & Restore — security | 0 | 0 | 0 | 0 | 0 |
| Backup & Restore — encryption | 0 | 0 | 0 | 0 | 0 |
| Backup & Restore — lifecycle | 0 | 0 | 0 | 0 | 0 |
| Backup & Restore — backup creation | 0 | 0 | 0 | 0 | 0 |
| Backup & Restore — UI clarity | 0 | 0 | 0 | 0 | 0 |
| Schedule settings form — email field visibility | 0 | 0 | 0 | 0 | 0 |
| Form UX consistency — upload form submission | 0 | 0 | 0 | 0 | 0 |
| Visual feedback — progress indicator initialization | 0 | 0 | 0 | 0 | 0 |
| Error handling — missing backup file | 0 | 0 | 0 | 0 | 0 |
| Empty state messaging | 0 | 0 | 0 | 0 | 0 |
| Export feature — success feedback | 0 | 0 | 0 | 0 | 0 |
| Backup creation — concurrent protection | 0 | 0 | 0 | 0 | 0 |
| Backup creation — unique file naming | 0 | 0 | 0 | 0 | 0 |
| Backup file integrity | 0 | 0 | 0 | 0 | 0 |
| Selective Export — UX clarity | 0 | 0 | 0 | 0 | 0 |
| Selective Export — security UX | 0 | 0 | 0 | 0 | 0 |
| Selective Export — checkbox UX | 0 | 0 | 0 | 0 | 0 |
| Selective Export — download link | 0 | 0 | 0 | 0 | 0 |
| Restore flow - Partial failure recovery | 0 | 0 | 0 | 0 | 0 |
| Upload validation | 0 | 0 | 0 | 0 | 0 |
| Restore safety net | 0 | 0 | 0 | 0 | 0 |
| Upload & Restore UX | 0 | 0 | 0 | 0 | 0 |
| Restore - SQL execution | 0 | 0 | 0 | 0 | 0 |
| Security | 0 | 0 | 0 | 0 | 0 |
| Email notification logic | 0 | 0 | 0 | 0 | 0 |
| Schedule settings form UX | 0 | 0 | 0 | 0 | 0 |

Risk score = 4·critical + 3·major + 2·minor + 1·trivial
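
As a worked instance of the formula, here is a small sketch that recomputes the score for three heatmap rows. The weights come from the formula above; the function and variable names are this sketch's own.

```python
# Severity weights as stated in the report's formula.
WEIGHTS = {"critical": 4, "major": 3, "minor": 2, "trivial": 1}


def risk_score(critical=0, major=0, minor=0, trivial=0):
    """Weighted sum of problem counts for one area."""
    counts = {"critical": critical, "major": major, "minor": minor, "trivial": trivial}
    return sum(WEIGHTS[name] * n for name, n in counts.items())


# Three rows copied from the heatmap: (area, critical, major, minor, trivial).
rows = [
    ("Schedule settings form", 0, 2, 0, 0),
    ("Restore flow - Upload path", 1, 0, 0, 0),
    ("Upload & Restore form (b-upload)", 0, 1, 0, 0),
]
for area, *counts in rows:
    print(f"{area}: {risk_score(*counts)}")
```

With these weights, two major problems in one area (score 6) outrank a single critical problem (score 4). That is why "Schedule settings form" tops the heatmap even though it has no critical finding.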

Top problems

1. [CRITICAL] Backup files are publicly downloadable without authentication

  • Area: Backup & Restore — artifact security
  • Persona affected: admin
  • Confidence: 1
  • Session: backup-andlist

Steps to reproduce:

  1. Create a full backup via /wp-admin/admin.php?page=mb-backups → 'Create Full Backup'
  2. Note the backup filename from the table (e.g., backup-2026-04-29-1255.zip)
  3. curl -I http://localhost:8891/wp-content/magellan-backups/

Expected: HTTP 403 Forbidden or redirect to login (/wp-login.php)

Actual: HTTP 200 OK — backup ZIP is served without authentication

Evidence: console

Notes: The wp-content/magellan-backups/ directory has no .htaccess or index.php gate. Anyone who can guess or discover the filename can download the entire site backup. This is a critical security issue.

2. [CRITICAL] Backup ZIP contains unencrypted database with user password hashes

  • Area: Backup & Restore — artifact contents
  • Persona affected: admin
  • Confidence: 1
  • Session: backup-andlist

Steps to reproduce:

  1. Create a full backup
  2. unzip -p backup-2026-04-29-1255.zip database.sql | grep 'INSERT INTO.*wp_users'
  3. Observe password hash in the output

Expected: User credentials should not be exposed in backups, or backups should be encrypted

Actual: database.sql contains INSERT INTO wp_users with actual password hashes: INSERT INTO wp_users VALUES('1','admin','$wp$2y$10$XVjA...',...) Password hashes are exposed in the downloadable backup.

Evidence: [console](sessions/backup-andlist/unzip output)

Notes: This is a critical information leakage. Combined with a1 (public download), attackers can obtain password hashes of all users. While hashes are bcrypt-protected, this is still sensitive information that should not be exposed in downloadable archives without encryption.

3. [CRITICAL] Export SQL files are publicly downloadable from wp-content/magellan-backups/ without authentication (a3 artifact-location security issue)

  • Area: Selective Export — artifact storage and access control
  • Persona affected: admin
  • Confidence: 1
  • Session: export-hypothesis-cluster

Steps to reproduce:

  1. Generate a Selective Export (any content type)
  2. Obtain the filename from the export page (e.g., export-2026-04-29-1254.sql)
  3. From an unauthenticated browser or curl, request: http://localhost:8893/wp-content/magellan-backups/export-2026-04-29-1254.sql
  4. Observe HTTP 200 response; file downloads successfully

Expected: Export SQL files should be stored in a protected location (e.g., wp-content/uploads/private/ with nonce-verified download endpoint) or the directory should have .htaccess blocking direct downloads. At minimum, export files should NOT be publicly accessible.

Actual: Export SQL files land in wp-content/magellan-backups/ which is web-accessible without authentication. Any unauthenticated user who knows or guesses the filename can download the SQL dump containing site data and user hashes.

Evidence: console

Notes: Critical: the export file shares the same storage location and lack of protection as Full Backups (see backup-andlist charter's a3 findings). This is the multi-surface a3 discharge for Selective Export — both surfaces (Full Backup ZIP, Selective Export SQL) are equally exposed. The export file contains sensitive data: user_pass hashes (P1) + all site options + post/page content. Combined with P1 (password leakage), an attacker with HTTP access can enumerate user accounts and attempt offline password attacks. Recommend: immediate fix — store exports in protected directory or require authentication + nonce for downloads.

4. [CRITICAL] Progress bar shows 100% on initial page load before any backup operation

  • Area: Backup & Restore tab — visual feedback
  • Persona affected: admin
  • Confidence: 0.95
  • Session: breadth-tour

Steps to reproduce:

  1. Navigate to Tools → Backups
  2. Observe the progress bar element on initial page load

Expected: Progress bar should show 0% width or be hidden until a backup operation starts

Actual: Progress bar displays 100% width on page load before any backup has been triggered, implying the backup is complete when no operation has run

Evidence: snapshot from turn 1 ('generic [ref=e169]: 100%' in the DOM tree before any user action) · console (no console errors observed)

Notes: The progress bar's initial state appears to be hardcoded at 100% width in CSS or JavaScript initialization, rather than defaulting to 0% or hidden state. This is a usability/feedback bug per FEW HICCUPPS Feedback dimension.

5. [CRITICAL] Upload & Restore form missing confirmation dialog — destructive action can be triggered accidentally

  • Area: Restore flow - Upload path
  • Persona affected: admin
  • Confidence: 0.95
  • Session: restore-andlist

Steps to reproduce:

  1. Navigate to /wp-admin/admin.php?page=mb-backups
  2. Scroll to 'Upload & Restore' section
  3. Choose a valid ZIP backup file
  4. Click 'Upload & Restore' button

Expected: A confirmation dialog should appear warning 'Restore from [filename]? This will overwrite your current site.' before the AJAX request is sent

Actual: The Upload & Restore button immediately submits the form via AJAX (admin.js lines 41–62) with no JavaScript confirm() dialog. The existing-backup restore path has a confirmation (lines 22–24), but the upload path does not.

Evidence: [console](sessions/restore-andlist/Browser console logs show no confirmation dialog fired before AJAX, only 'Restore completed' alert after)

Notes: This is a b1 hypothesis (confirmation dialog). The existing-backup restore path correctly implements confirm(), but the upload path bypasses this safety net. Code asymmetry: lines 22-24 vs lines 41-62 in admin.js.

6. [MAJOR] Backup filenames are guessable with minute-level precision, collision-prone

  • Area: Backup & Restore — filename generation
  • Persona affected: admin
  • Confidence: 1
  • Session: backup-andlist

Steps to reproduce:

  1. Create a full backup via UI and note the timestamp (e.g., 12:55 on 2026-04-29)
  2. Filename is generated as backup-2026-04-29-1255.zip (YYYY-MM-DD-HHMM)
  3. Any attacker knowing the approximate backup time can guess the filename

Expected: Filenames should include a random token (e.g., backup-2026-04-29-1255-a1b2c3d4e5f6.zip) to prevent guessing

Actual: Filename pattern is backup-YYYY-MM-DD-HHMM.zip with minute-level precision. No random suffix. Guessable and collision-prone if two backups created within the same minute.

Evidence: console

Notes: Combined with a1 (public download), this makes targeted attacks trivial. An attacker can enumerate filenames across multiple minutes and download backups at will.
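
To show how small the search space is, the sketch below enumerates every filename the observed pattern can produce for one day, then builds a hardened name with a random suffix as the Expected line suggests. It is illustrative only: the function names are hypothetical and this is not the plugin's code.

```python
import secrets
from datetime import datetime, timedelta


def candidate_filenames(day: datetime, prefix: str = "backup", ext: str = "zip"):
    """Yield every name the backup-YYYY-MM-DD-HHMM pattern allows for one day."""
    start = day.replace(hour=0, minute=0, second=0, microsecond=0)
    for minute in range(24 * 60):
        ts = start + timedelta(minutes=minute)
        yield f"{prefix}-{ts:%Y-%m-%d-%H%M}.{ext}"


def hardened_filename(now: datetime, token_bytes: int = 6) -> str:
    """Same timestamp prefix plus an unguessable hex token (12 chars for 6 bytes)."""
    return f"backup-{now:%Y-%m-%d-%H%M}-{secrets.token_hex(token_bytes)}.zip"


names = list(candidate_filenames(datetime(2026, 4, 29)))
print(len(names))                               # candidates for a whole day
print("backup-2026-04-29-1255.zip" in names)    # the observed filename is among them
print(hardened_filename(datetime(2026, 4, 29, 12, 55)))
```

A full day is only 1,440 candidates, so an attacker does not even need the approximate backup time. The random token does not fix the missing access control on the directory; it only removes the enumeration shortcut.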

7. [MAJOR] Full Backup claim is misleading — wp-content/uploads/ directory is omitted

  • Area: Backup & Restore — feature completeness
  • Persona affected: admin
  • Confidence: 1
  • Session: backup-andlist

Steps to reproduce:

  1. Create a full backup
  2. unzip -l backup-2026-04-29-1255.zip | grep 'wp-content/uploads'
  3. Observe no results (0 occurrences)

Expected: A backup labeled 'Full Backup' should include all site content: database AND wp-content (including uploads/)

Actual: The backup ZIP contains database.sql, plugins/, and themes/ but NOT uploads/. User expectation: 'Full Backup' means all site files.

Evidence: [console](sessions/backup-andlist/unzip -l shows wp-content/themes/ and wp-content/plugins/ but NO wp-content/uploads/)

Notes: In a disaster recovery scenario, users restoring from a 'Full Backup' will be surprised to discover that all media files are missing. This is a critical UX/expectation violation. Multi-surface a3 note: Full Backup ZIP (this charter: uploads/ omitted) + Selective Export SQL (export-hypothesis-cluster: to be tested separately).

8. [MAJOR] Backup files are not cleaned up when plugin is deactivated

  • Area: Backup & Restore — lifecycle management
  • Persona affected: admin
  • Confidence: 1
  • Session: backup-andlist

Steps to reproduce:

  1. Create full backups (backup-2026-04-29-1254.zip, backup-2026-04-29-1255.zip created)
  2. Navigate to Plugins page or use: studio wp plugin deactivate magellan-backups
  3. Verify backup directory: ls wp-content/magellan-backups/
  4. Observe: files still exist (not deleted)

Expected: Plugin should clean up its backup files on deactivation via register_deactivation_hook()

Actual: Deactivating the plugin leaves all backup ZIP files on disk. No cleanup logic in deactivation hook. Files accumulate indefinitely.

Evidence: [console](sessions/backup-andlist/Before deactivation)

Notes: Combined with no rotation/retention limit, this causes indefinite disk space accumulation. If an admin deactivates the plugin to uninstall it, the backups remain and consume storage.

9. [MAJOR] Backup cron event is registered on activation without explicit schedule configuration

  • Area: Backup & Restore — default behavior
  • Persona affected: admin
  • Confidence: 1
  • Session: backup-andlist

Steps to reproduce:

  1. Fresh plugin activation: studio wp plugin activate magellan-backups
  2. Check WP cron events: studio wp cron event list
  3. Observe: mb_scheduled_backup event is present with next execution tomorrow
  4. Verify no schedule was configured: studio wp option get magellan_backups_schedule
  5. Observe: option returns empty (not set)

Expected: Cron event should not be registered until admin explicitly enables a backup schedule

Actual: mb_scheduled_backup cron event is registered immediately on plugin activation, scheduled to fire every 1 day. No schedule configuration option is set. Default blast radius: automatic backups fire before admin configures anything.

Evidence: [console](sessions/backup-andlist/WP cron list)

Notes: This is a 'default blast radius' violation. Users expect features to be opt-in, not opt-out. An admin who activates the plugin without reading documentation will discover unexpected automatic backups running on their site, consuming resources and disk space.
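
The cron check in the steps above can be scripted. This is a minimal sketch, assuming the listing is captured with wp cron event list --format=json and that each entry carries a hook key. The sample payload is illustrative, not captured output from this run.

```python
import json


def cron_hooks(event_list_json: str) -> set:
    """Collect hook names from a JSON cron event listing."""
    return {event["hook"] for event in json.loads(event_list_json)}


def hook_registered(event_list_json: str, hook: str = "mb_scheduled_backup") -> bool:
    """True if the given hook appears in the listing."""
    return hook in cron_hooks(event_list_json)


# Illustrative payloads (assumed shape): one taken after activation, one after deactivation.
AFTER_ACTIVATE = """[
  {"hook": "wp_version_check", "next_run_gmt": "2026-04-30 00:12:00", "recurrence": "12 hours"},
  {"hook": "mb_scheduled_backup", "next_run_gmt": "2026-04-30 12:54:00", "recurrence": "1 day"}
]"""
AFTER_DEACTIVATE = """[
  {"hook": "wp_version_check", "next_run_gmt": "2026-04-30 00:12:00", "recurrence": "12 hours"}
]"""

print(hook_registered(AFTER_ACTIVATE))    # event present right after activation
print(hook_registered(AFTER_DEACTIVATE))  # what a clean deactivation would look like
```

The same check, run after plugin deactivation, would provide the empirical evidence that the deprioritized BT6 probe never collected.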

10. [MAJOR] Users export SQL includes user_pass column with hashed passwords (a3 sensitive-leakage surface)

  • Area: Selective Export — Users export
  • Persona affected: admin
  • Confidence: 1
  • Session: export-hypothesis-cluster

Steps to reproduce:

  1. Navigate to /wp-admin/admin.php?page=mb-backups&tab=export
  2. Check 'Users' checkbox only
  3. Click 'Export Selected'
  4. Inspect the generated .sql file in wp-content/magellan-backups/
  5. Search for 'INSERT INTO wp_users' and observe user_pass column containing bcrypt hash

Expected: Users export should exclude the user_pass column from the SQL dump — only export non-sensitive user metadata (email, display name, user_login, usermeta fields like capabilities)

Actual: Users export includes full wp_users row with user_pass column containing bcrypt hashes: INSERT INTO wp_users VALUES('1','admin','$wp$2y$10$kqJBF2P.ihC/MPqzWU4EmuVo0QUT2dKu19XzGyBWwV.URnph/3DcK',...)

Evidence: console

Notes: This is the multi-surface a3 (artifact contents) probe for the Selective Export feature — independent code path from the Full Backup leakage (class-mb-export.php vs class-mb-backup.php). Same leakage risk (password hashes in downloadable SQL), different surface. The export file is stored in the same unprotected directory as backups (see P3), so any admin or unauthenticated user who downloads the export has access to bcrypt hashes for offline attack. Reference: recon S6 flagged this as critical semantics issue.

Needs human review (confidence < 0.7)

None.

Questions raised

  • [Restore flow - Partial failure recovery] What is the intended behavior if a restore is interrupted mid-way?
    • Why it matters: Given no transaction wrapper and no checkpoint mechanism, it's unclear what admin should do if a browser crash or network timeout occurs during a large restore. Should they manually repair the database or attempt restore again?
  • [Upload validation] Should empty ZIP files be treated as valid backups?
    • Why it matters: The 0-byte ZIP was accepted and reported as 'Restore completed' with no user warning. A truly empty ZIP contains no database.sql or files, so it overwrites the site with nothing, destroying all content.
  • [Email notification logic] Double email send — which subject is intended? (SC8 detail)
    • Why it matters: The run_scheduled_backup() function sends two emails with subjects 'Backup Complete' and 'Magellan Backup Complete'. Is one of these a leftover from refactoring, or is the double-send intentional? If intentional, why does only the first email check if $email is set, while the second always sends? This impacts email notification clarity and may flood the admin inbox with duplicate messages.

Suggested improvements

  • [Backup & Restore — security] Implement .htaccess gating or serve backups through admin.php with nonce verification to prevent direct download access (effort: low) (impact: high)
    • Rationale: Current backup files are publicly accessible. Adding authentication would prevent unauthorized downloads.
  • [Backup & Restore — filename generation] Append a random token (8-12 chars) to backup filenames to prevent guessing attacks (effort: low) (impact: high)
    • Rationale: Minute-level precision + public access = trivial enumeration. A random suffix raises the attack barrier significantly.
  • [Backup & Restore — encryption] Encrypt backup ZIPs with an admin-provided passphrase or site-specific key to protect user credentials (effort: medium) (impact: high)
    • Rationale: Database dumps expose password hashes. Encryption at rest would protect against direct file disclosure.
  • [Backup & Restore — feature completeness] Include wp-content/uploads/ directory in 'Full Backup' or change label to 'Database and Code Backup' to match actual contents (effort: medium) (impact: high)
    • Rationale: User expectation: 'Full' means all site content. Media files are critical for disaster recovery.
  • [Backup & Restore — lifecycle] Implement backup retention policy (e.g., keep last N backups, delete older than X days) and enforce cleanup on plugin deactivation (effort: medium) (impact: medium)
    • Rationale: Without retention limits, disk usage grows unbounded. Deactivation cleanup prevents orphaned files.
  • [Backup & Restore — default behavior] Register backup cron event only after admin explicitly enables a schedule. Use hook like: schedule_page_loaded → check if enabled → register cron (effort: low) (impact: medium)
    • Rationale: Default automatic backups violate user expectations of opt-in behavior. Admins should explicitly enable scheduled backups.
  • [Schedule settings form — email field visibility] Verify and fix potential option key mismatch: Schedule form appears to save 'magellan_backups_email' but may read 'magellan_backup_email', causing email field to always appear empty after reload.
    • Rationale: Recon briefing (S7) identified this as a likely typo/key mismatch. If confirmed, users cannot see their saved email address, breaking the settings round-trip.
  • [Form UX consistency — upload form submission] Add double-click protection to Upload & Restore form button to match the Create Full Backup pattern: disable button on submit and show loading state.
    • Rationale: The Create Full Backup button correctly prevents double-submission with $btn.prop('disabled', true).text('Creating Backup...'). The Upload & Restore form lacks this, allowing rapid clicks to trigger duplicate restore attempts.
  • [Visual feedback — progress indicator initialization] Initialize progress bar to 0% or hidden state on page load, not 100%. Progress should advance as backup operation proceeds.
    • Rationale: Currently shows 100% on page load before any operation, creating false user expectations that the backup is complete when nothing has happened yet.
  • [Backup creation — concurrent protection] Implement mutex-based concurrency control using WordPress transients. At the start of create_backup(), check get_transient('mb_backup_in_progress'); if it is not set, call set_transient('mb_backup_in_progress', 1, 300) before proceeding. Return a user-friendly error if a backup is already in progress. (effort: low) (impact: high)
    • Rationale: Prevents the OVERWRITE race condition and ensures backups are serialized. Users get clear feedback if they attempt to trigger a backup while one is already running.
  • [Backup creation — unique file naming] Add a second-precision or random suffix to the backup filename to avoid collisions even without a lock. For example: backup-YYYY-MM-DD-HHMM-{random-hash}.zip or backup-YYYY-MM-DD-HHMMSS.zip (second precision) (effort: low) (impact: medium)
    • Rationale: Provides defense-in-depth: even if a mutex is missed, unique filenames prevent overwrites. Second-precision timestamps eliminate the minute-precision collision window entirely.
  • [Selective Export — UX clarity] Add a prominent notice on the export page: 'Imports will ADD data to your existing posts/users/options. They will NOT replace existing content. Use only for merging data from another site.' (effort: low) (impact: high)
    • Rationale: Current behavior (INSERT-only) is counterintuitive to users accustomed to 'export/restore' workflows. A warning prevents accidental data duplication.
  • [Selective Export — security UX] Implement nonce-protected download links (e.g., wp_create_nonce() and wp_verify_nonce()) for export SQL files instead of direct web-accessible storage. Example: force downloads through admin.php?action=mb_download_export with nonce verification. (effort: medium) (impact: high)
    • Rationale: Removes public discoverability of export files; requires admin session to download. Aligns with security best practices for sensitive site-data exports.
  • [Restore safety net] Add pre-restore snapshot creation (effort: medium) (impact: high)
    • Rationale: Automatically create a backup of the current site state before restore executes. This is the primary safety net for destructive operations.
  • [Upload & Restore UX] Add confirmation dialog to Upload & Restore form (effort: low) (impact: high)
    • Rationale: Add a JavaScript confirm() check (like the existing-backup path) before submitting the upload form. Currently only the existing-backup path has confirmation; the upload path does not.
  • [Restore - SQL execution] Wrap SQL restore in transaction (effort: low) (impact: high)
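    • Rationale: Wrap the import_sql() loop in START TRANSACTION / COMMIT / ROLLBACK to ensure all-or-nothing semantics. On error, roll back to the pre-restore DB state. Note that on MySQL, DDL statements such as DROP TABLE and CREATE TABLE commit implicitly, so a dump that contains them cannot be fully rolled back by a transaction alone.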
  • [Upload validation] Add pre-validation to upload form (effort: medium) (impact: medium)
    • Rationale: Validate file size, MIME type, and ZIP integrity BEFORE sending to ZipArchive::open(). Provide user-friendly error messages for invalid uploads.
  • [Schedule settings form UX] The Notification Email field should be marked 'required' or include a validation warning if the schedule is enabled but no email is configured. (effort: low) (impact: medium)
    • Rationale: Currently, users can enable scheduled backups without configuring a notification email, so they never learn whether a scheduled backup completed.
  • [Schedule settings form UX] When the 'Enable scheduled backups' checkbox is unchecked, the Frequency, Time, and Notification Email fields should be visually disabled (grayed out) or hidden. (effort: low) (impact: low)
    • Rationale: Currently they remain active, which may confuse users about whether changes affect anything when the schedule is turned off.
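
The two backup-creation recommendations above (an expiring lock plus collision-resistant filenames) can be sketched together. This is a Python stand-in for the logic, not the plugin's PHP: ExpiringLock, BackupInProgress, backup_filename, and the write_archive callback are hypothetical names, and the in-memory lock only imitates the expiry behavior of a transient.

```python
import secrets
import time
from datetime import datetime, timezone

LOCK_TTL_SECONDS = 300  # same expiry as the transient suggested above


class BackupInProgress(Exception):
    """Raised when a backup is requested while another one holds the lock."""


class ExpiringLock:
    """In-memory stand-in for a transient: a flag that expires after a TTL."""

    def __init__(self, ttl, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._expires_at = None

    def acquire(self):
        now = self._clock()
        if self._expires_at is not None and now < self._expires_at:
            return False  # held by another run and not yet expired
        self._expires_at = now + self._ttl
        return True

    def release(self):
        self._expires_at = None


def backup_filename(now=None):
    """Second-precision timestamp plus a random suffix (hypothetical format)."""
    now = now or datetime.now(timezone.utc)
    return f"backup-{now:%Y-%m-%d-%H%M%S}-{secrets.token_hex(4)}.zip"


def create_backup(lock, write_archive):
    """Refuse to start while the lock is held; always release it afterwards."""
    if not lock.acquire():
        raise BackupInProgress("A backup is already in progress. Please try again shortly.")
    try:
        name = backup_filename()
        write_archive(name)
        return name
    finally:
        lock.release()
```

One caveat for the real plugin: checking a transient and then setting it are two separate steps, so a narrow race window can remain. The unique filename is what keeps a missed lock from turning into an overwrite.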
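
The all-or-nothing restore recommended above can be illustrated with a small, self-contained sketch. It uses Python's built-in sqlite3 module purely for convenience; the import_sql name mirrors the function mentioned in the report, but the body here is an assumption, not the plugin's code. Transaction behavior for schema changes differs between database engines, so this would need checking against the production database.

```python
import sqlite3


def import_sql(conn, statements):
    """Run every statement in one transaction; undo all of them on any error."""
    try:
        conn.execute("BEGIN")
        for statement in statements:
            conn.execute(statement)
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        return False


# isolation_level=None leaves transaction control to the explicit statements above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO posts VALUES (1, 'existing post')")

# The second statement violates the primary key, so the first one is undone too.
ok = import_sql(conn, [
    "INSERT INTO posts VALUES (2, 'restored post')",
    "INSERT INTO posts VALUES (1, 'duplicate key')",
])
rows = conn.execute("SELECT id, title FROM posts ORDER BY id").fetchall()
print(ok, rows)  # False [(1, 'existing post')]
```

The failed import leaves the table exactly as it was before the restore started, which is the property the recommendation asks for.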
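
The upload pre-validation recommended above could follow roughly this order of checks: size first, then archive format, then member integrity. The sketch below is a Python illustration using the standard zipfile module; validate_backup_upload, MAX_UPLOAD_BYTES, and the error strings are hypothetical, and the MIME-type check is left out because it depends on how the server receives the upload.

```python
import io
import zipfile

MAX_UPLOAD_BYTES = 512 * 1024 * 1024  # hypothetical limit, not taken from the plugin


def validate_backup_upload(data, max_bytes=MAX_UPLOAD_BYTES):
    """Return a user-facing error message, or None if the upload is a readable ZIP."""
    if len(data) == 0:
        return "The uploaded file is empty."
    if len(data) > max_bytes:
        return "The uploaded file is larger than the allowed maximum."
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        return "The uploaded file is not a ZIP archive."
    try:
        with zipfile.ZipFile(buffer) as archive:
            bad_member = archive.testzip()  # CRC-checks every member
    except zipfile.BadZipFile:
        return "The ZIP archive is damaged and cannot be read."
    if bad_member is not None:
        return f"The ZIP archive is corrupted (first bad entry: {bad_member})."
    return None
```

Running the cheap checks first means an oversized or non-ZIP upload is rejected with a specific message before any archive parsing is attempted.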

What works well (praises)

  • [Backup & Restore — backup creation] Clicking 'Create Full Backup' button creates a ZIP file within seconds with appropriate file size (~13 MB for test site). No errors or confusing dialogs. Good UX for the core operation.
    • Why: Backup creation via UI is straightforward and responsive. Single-click 'Create Full Backup' works reliably.
  • [Backup & Restore — UI clarity] Admin can quickly see all backups, their sizes, timestamps, and available actions. No missing information in the display.
    • Why: Backup table shows filename, size, date, and actions (Download, Restore, Delete) in a clear, scannable format.
  • [Error handling — missing backup file] Restore function shows user-friendly error message for missing backup files
    • Why: When a backup file is manually deleted from disk, clicking Restore shows a clear alert 'Restore failed: Backup file not found' instead of raw PHP errors. This is correct error handling.
  • [Empty state messaging] Empty backup list displays clear 'No backups found' message
    • Why: When no backups exist, the UI shows a helpful message instead of a confusing empty table or pagination. Good UX.
  • [Export feature — success feedback] Export completion shows clear success message with download link
    • Why: After export completes, users immediately see 'Export ready: ' with a clickable link. Efficient and user-friendly.
  • [Backup file integrity] ZIP files created are valid and not corrupted in normal (sequential) execution
    • Why: Good foundational backup functionality; the concurrent-trigger issue is an edge case but the core backup mechanism works reliably
  • [Selective Export — checkbox UX] Content-type selection via checkboxes is clear and responsive; no UI lag or confusion when toggling selections
    • Why: Users can easily select exactly what to export; checkboxes immediately reflect state without page reload
  • [Selective Export — download link] Once export is ready, the page immediately displays a download link with a clear filename including the generation date
    • Why: Reduces friction; users know instantly when the export is available and can locate it easily (no email delay, no search required)
  • [Security] Nonce and capability check present on restore endpoints
    • Why: Both restore paths (existing-backup and upload) implement check_ajax_referer() and current_user_can('manage_options'), preventing unauthorized restore attempts.

Coverage gaps

| Session | Status | Turns | Flows | Notes |
| --- | --- | --- | --- | --- |
| breadth-tour | complete | 30/30 | 7/8 | Breadth tour probed 7 of 8 planned hypotheses across all 6 features. BT1, BT2, BT7, and BT8 fully empirically tested. BT3 source-inspected for double-submit protection (found missing). BT4 tested with export containing existing content. BT5 and BT6 partially addressed via cron CLI observation and source code review. One critical bug found (BT8: progress bar at 100% on page load). One major bug found (BT3: upload form lacks double-submit protection). Secondary evidence from recon briefing notes and admin.js inspection informed analysis. |
| concurrent-trigger-seam | complete | 10/10 | 3/3 | All three hypotheses (CT1, CT2, CT3) probed. CT1 and CT2 confirmed via source code inspection. CT3 confirmed via UI observation (button state) and server-side analysis. Empirical concurrent test executed: both manual and cron triggered within same minute; only one file created at target timestamp, but file integrity checks passed (valid ZIP). Architecture does not prevent filename collision, though execution did not produce observable corruption in this test run. Race condition window exists but may be intermittently observable depending on timing of ZipArchive operations. |
| export-hypothesis-cluster | complete | 8/8 | 5/5 | Full hypothesis-cluster coverage achieved: E1 (user_pass leakage), E2 (additive semantics), E3 (content-type filtering), E4 (public export accessibility) all probed. Multi-surface a3 discharge note: multi-surface a3 probes completed: Full Backup ZIP (backup-andlist charter) + Selective Export SQL (this charter). |
| schedule-feature-cluster | complete | 8/8 | 4/4 | All five core hypotheses (SC1-SC5) probed. SC8 (double email) identified in source but cannot be empirically triggered without running cron in test environment; filed as Question. Charter success criteria met: email key mismatch confirmed, time conversion mismatch identified, empty-field behavior observed, toggle-state issue identified, cron logic validated. |

Environment warnings

These are signals observed during the run that point at test-environment quirks (Studio + SQLite shim, WP-CLI Phar, WC stack interactions), NOT plugin defects. Apply extra scrutiny to findings in affected areas — some Problems may be false positives caused by the environment, and some real bugs may be masked.

| Session | Warning |
| --- | --- |
| backup-andlist | WP-CLI Phar signature error during studio-provision.sh execution (Site + Phar + macOS interaction); resolved by manual metadata creation, not a plugin defect. |

Token usage & cost

Computed from Claude Code transcripts at ~/.claude/projects/<proj-hash>/. Rates from config/pricing.json. Window: 2026-04-29T12:33:03Z to 2026-04-29T13:13:46Z (with ±10min buffer for dispatch drift).

Estimated total cost for this run: $33.94

| Category | Cost | % of total |
| --- | ---: | ---: |
| Fresh input | $0.00 | 0.0% |
| Output | $4.64 | 13.7% |
| Cache-create (5m) | $4.20 | 12.4% |
| Cache-create (1h) | $9.34 | 27.5% |
| Cache-read | $15.75 | 46.4% |
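
The percentage column above can be reproduced from the category costs. The following is a minimal check, assuming each share is the category cost divided by the $33.94 headline total and rounded to one decimal place; the variable names are illustrative only.

```python
# Category costs from the table above, in cents to avoid float rounding.
category_cents = {
    "Fresh input": 0,
    "Output": 464,
    "Cache-create (5m)": 420,
    "Cache-create (1h)": 934,
    "Cache-read": 1575,
}
reported_total_cents = 3394  # the $33.94 headline figure

category_sum_cents = sum(category_cents.values())
shares = {
    name: round(100 * cents / reported_total_cents, 1)
    for name, cents in category_cents.items()
}

print(category_sum_cents)  # 3393
print(shares["Cache-read"])  # 46.4
```

The rounded category costs sum to $33.93, one cent below the headline total, which is consistent with each category having been rounded separately.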

Manager (main conversation)

Total: $23.37

| Model | Messages | Input | Output | Cache-5m | Cache-1h | Cache-read | Cost |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| claude-opus-4-7 | 176 | 396 | 134,751 | 0 | 934,349 | 21,320,735 | $23.37 |

Subagents (10 invocations)

Total: $10.56

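| Model | Messages | Input | Output | Cache-5m | Cache-1h | Cache-read | Cost |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| claude-sonnet-4-6 | 20 | 28 | 20,605 | 250,718 | 0 | 643,730 | $1.44 |
| claude-haiku-4-5-20251001 | 602 | 1,463 | 112,138 | 1,254,793 | 0 | 37,986,092 | $5.93 |
| claude-opus-4-7 | 40 | 60 | 15,885 | 270,755 | 0 | 2,205,785 | $3.19 |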
Per-subagent breakdown (10 sessions)

| Agent ID | Type | Models | Cost |
| --- | --- | --- | ---: |
| a266703439023adea | planner-sonnet | claude-sonnet-4-6 | $1.44 |
| a4483e9a1b5881156 | tester | claude-haiku-4-5-20251001 | $1.40 |
| aa2a1a7498f25855e | tester | claude-haiku-4-5-20251001 | $0.61 |
| aabc813caa58189a1 | tester | claude-haiku-4-5-20251001 | $1.04 |
| aad2f093601bb8d32 | tester | claude-haiku-4-5-20251001 | $0.85 |
| ab2170d61a16f5e07 | tester | claude-haiku-4-5-20251001 | $0.65 |
| ab2a2961b36f538c0 | general-purpose | claude-opus-4-7 | $1.53 |
| ac71ffc39c2cdc6b8 | tester | claude-haiku-4-5-20251001 | $0.76 |
| acfe3fa8ff65b5e3e | general-purpose | claude-opus-4-7 | $1.66 |
| aeaf509ca011b6f94 | tester | claude-haiku-4-5-20251001 | $0.62 |

Recommended next steps

  1. Triage Schedule settings form first — highest risk score (6)
  2. Address the 5 critical problems before release
  3. Follow up on the 4 sessions with incomplete coverage