@alopezari
Created May 6, 2026 12:21
WooCommerce Products — Coverage gaps / meta-review (2026-05-06T10-51-02_woocommerce)

Coverage Gaps — 2026-05-06T10-51-02_woocommerce

Meta-reviewer: Claude (Sonnet 4.6) | Date: 2026-05-06 | No answer-key files were consulted.


Check 3 — Flow execution rate (quick pass)

| Charter | Planned | Executed | Rate |
|---|---|---|---|
| simple-product-merchant | 8 | 6 | 75% |
| variable-product-merchant | 6 | 3 | 50% |
| grouped-external-merchant | 6 | 6 | 100% |
| virtual-downloadable-merchant | 5 | 4 | 80% |
| product-catalog-admin | 7 | 4 | 57% |
| shopper-browse-search | 12 | 12 | 100% |
| shopper-product-detail | 6 | 4 | 67% |
| golden-path-product | 4 | 4 | 100% |

Three charters fell below 70%: variable-product-merchant (50%), product-catalog-admin (57%), and shopper-product-detail (67%). The first two exhausted or nearly exhausted their turn budgets (24/25 and 22/22 turns respectively).
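The rate column is executed flows divided by planned flows, and 70% is the cut-off this check applies. A minimal sketch of that arithmetic, using the counts from the Check 3 table; the function and variable names are illustrative and not part of the harness:

```python
# Planned and executed flow counts per charter, copied from the Check 3 table.
FLOWS = {
    "simple-product-merchant": (8, 6),
    "variable-product-merchant": (6, 3),
    "grouped-external-merchant": (6, 6),
    "virtual-downloadable-merchant": (5, 4),
    "product-catalog-admin": (7, 4),
    "shopper-browse-search": (12, 12),
    "shopper-product-detail": (6, 4),
    "golden-path-product": (4, 4),
}

THRESHOLD = 0.70  # cut-off applied in Check 3


def execution_rate(planned: int, executed: int) -> float:
    """Fraction of planned flows that were actually executed."""
    return executed / planned


def below_threshold(flows: dict, threshold: float = THRESHOLD) -> list:
    """Charters whose execution rate is strictly below the threshold."""
    return [
        name
        for name, (planned, executed) in flows.items()
        if execution_rate(planned, executed) < threshold
    ]


if __name__ == "__main__":
    for name, (planned, executed) in FLOWS.items():
        rate = execution_rate(planned, executed)
        print(f"{name}: {executed}/{planned} = {rate:.0%}")
    print("Below threshold:", below_threshold(FLOWS))
```

Running the check against the raw counts rather than the rounded percentages avoids borderline cases such as 4/6, which rounds to 67% but is compared as 0.666.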


Check 4 — Must-cover flows vs. coverage matrix parity

Mission must-cover lists 16 merchant flows and 8 shopper flows. Cross-referencing against coverage.md:

Missing from the charter set, or assigned but left unprobed:

  • "Edit an existing product (change price, update description, modify stock)" — F1/F6 cover creation only; no charter explicitly probes a post-publish edit cycle with reload verification. simple-product-merchant deprioritized the edit-then-verify round-trip flow (it verified create + reload but not a subsequent edit of an already-published product).
  • "Upsells and cross-sells product linking" — Listed as F10 and assigned to simple-product-merchant, but the tester explicitly deprioritized H7 citing it as "out-of-scope per mission" — which is incorrect (it IS in must-cover). This is a missed coverage obligation.
  • "Shipping (weight, dimensions, shipping class)" — Named in mission must-cover under Merchant flows. No hypothesis in any charter addresses whether Shipping tab fields (weight, dimensions, shipping class) persist. F5 assigns this to virtual-downloadable-merchant only via the "Shipping tab disappears for Virtual" check — the actual shipping-field persistence for non-virtual products has zero probes.
  • "Duplicate a product and verify the copy is independent" — F11 assigned to simple-product-merchant (H4), but H4 was deprioritized and the tester observed the duplicate action may have silently failed. This is unresolved and not filed as a Problem — only as a Q item, understating severity.

Severity: HIGH — Three must-cover flows (upsells/cross-sells, shipping field persistence, edit-existing-product cycle) have zero empirical probes across all 8 sessions.
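The parity check in this section amounts to filtering the mission's must-cover list down to the flows that never received an empirical probe. A minimal sketch, assuming probe counts per flow are available as a mapping; the flow labels and counts below are illustrative placeholders paraphrased from this review, not the actual contents of the mission file or coverage.md:

```python
def unprobed_flows(must_cover: list, probe_counts: dict) -> list:
    """Return must-cover flows with no empirical probe, in mission order.

    A flow that is absent from probe_counts is treated as having zero probes.
    """
    return [flow for flow in must_cover if probe_counts.get(flow, 0) == 0]


# Hypothetical subset of the must-cover list (labels invented for illustration).
MUST_COVER = [
    "create-simple-product",
    "edit-existing-product",
    "upsells-cross-sells",
    "shipping-field-persistence",
]

# Hypothetical probe counts; a flow can be assigned to a charter and still be 0.
PROBE_COUNTS = {
    "create-simple-product": 1,
    "upsells-cross-sells": 0,
}

if __name__ == "__main__":
    print(unprobed_flows(MUST_COVER, PROBE_COUNTS))
```

Keying on probe counts rather than on charter assignment is deliberate: upsells/cross-sells was assigned (F10) yet never probed, so an assignment-only comparison would have reported it as covered.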


Check 5 — Hypothesis depth vs. mission risk register

Mission identifies six hot spots. Mapping to hypothesis coverage:

| Hot spot | Hypotheses generated | Probed? |
|---|---|---|
| Variable product data integrity | H1–H6 in variable-product-merchant | H3–H6 deprioritized (4 of 6 hypotheses unprobed; 50% flow execution) |
| Stock management edge cases | H1 in simple-product-merchant | Probed |
| Sale price scheduling | H2 in simple-product-merchant | Probed |
| Download security | Listed under virtual-downloadable; no dedicated H | No probe |
| Bulk edit: partial application / field clearing | H3–H4 in product-catalog-admin | Both deprioritized |
| Product duplication: deep vs. shallow copy | H4 in simple-product-merchant | Deprioritized + unresolved |
| Frontend variation selector | H2, H4 in shopper-product-detail | Both deprioritized (no variations) |
| SEO / slug uniqueness on duplicate | Not in any charter | Zero coverage |

Severity: HIGH. Four hot spots have zero empirical probes: download security, bulk-edit partial-application, frontend variation selector, and slug uniqueness on duplicate.

The variation-selector gap is particularly significant: variable-product-merchant (critical priority) produced only 3/6 flows; shopper-product-detail failed to instantiate a variable product with actual variations. F2 and F18 appear in the coverage matrix but have essentially zero end-to-end shopper-side validation.


Check 6 — Recon surprises propagated to charters

Recon identified 9 surprises (S1–S9). Checking whether charters probed the high-value surprises:

  • S2 — Brands taxonomy (new WC 10.x feature): Correctly captured in F9 and probed in product-catalog-admin (H1 PASS). Good.
  • S6 — Variations tab JS-hide behavior: Probed in variable-product-merchant (H1); tester reported the tab is visible for all product types, filed as a major Problem P1 at confidence 0.75. However, the question log also notes the change may have been triggered via eval() rather than a real click, introducing method bias. The finding is filed as a Problem but may be an automation artifact — no re-probe via real click was dispatched.
  • S7 — Point of Sale feature: Noted in recon but not addressed in any charter or hypothesis. POS is a new WC 10.x surface — the session has zero POS coverage. (Out of scope? Mission says "Products feature only" — however, POS is a product management surface. This is a gray area worth flagging.)
  • S8 / S9 — Coming Soon mode: Documented in recon and testers were briefed; not a gap per se, but shopper-browse-search used only 10/22 turns — suggesting the setup overhead from Coming Soon and wizard-dismissal commands was low in practice.

Severity: LOW — S6 finding confidence is reduced by method-bias (eval vs click); no re-probe was filed. S7 (POS) is likely out of scope but goes unacknowledged.


Check 7 — Model/budget pressure artifacts

Several patterns suggest budget pressure caused systematic under-coverage:

  1. simple-product-merchant hit turn 24/25 with 5 out of 8 hypotheses deprioritized. This charter carried F1, F6, F7, F8, F10, F11, F13, F20 — 8 feature areas. That is an overloaded charter. The planner concentrated too many must-cover flows into one critical charter, guaranteeing budget exhaustion.

  2. variable-product-merchant hit turn 24/25 with 4 out of 6 hypotheses deprioritized. The single most complex product type got the fewest completed probes: H3 (Generate variations Cartesian product), H4 (per-variation price independence), H5 (out-of-stock variation frontend), and H6 (Any attribute option) were all left unprobed.

  3. product-catalog-admin hit its turn cap (22/22) with bulk edit (H3, H4) and all CSV flows (H5, H6, empty-state) deprioritized. Bulk edit is a mission hot spot; CSV export/import is a must-cover flow. Both ended with zero probes.

  4. shopper-product-detail had 9 turns remaining (13/22) but failed to obtain a working variable product with variations. The failure was environmental: WP-CLI variation creation failed, and the tester neither attempted UI-based variation creation nor flagged it as a blocker requiring a supplemental charter.

Severity: HIGH — The planner over-packed simple-product-merchant and variable-product-merchant, and budget exhaustion was structurally predictable. The result is that the two highest-risk product types (Simple and Variable) each had their deepest hypotheses cut.
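One rough way to see that the overload was predictable is to extrapolate from the turns actually spent per probed hypothesis. The sketch below is a back-of-envelope heuristic, not something the planner is known to implement: it assumes turn cost scales linearly with hypothesis count and folds setup overhead into the per-hypothesis cost. The inputs come from items 1 and 2 above; `projected_turns` is a hypothetical name.

```python
def projected_turns(turns_used: int, hypotheses: int, deprioritized: int) -> float:
    """Estimate turns needed to probe every hypothesis in a charter.

    Assumption: the observed turns-per-probed-hypothesis carries over
    to the hypotheses that were deprioritized.
    """
    probed = hypotheses - deprioritized
    if probed <= 0:
        raise ValueError("no probed hypotheses to extrapolate from")
    return turns_used / probed * hypotheses


# Figures reported in Check 7 for the two over-packed charters.
CHARTERS = {
    "simple-product-merchant": dict(turns_used=24, hypotheses=8, deprioritized=5),
    "variable-product-merchant": dict(turns_used=24, hypotheses=6, deprioritized=4),
}
TURN_CAP = 25

if __name__ == "__main__":
    for name, stats in CHARTERS.items():
        needed = projected_turns(**stats)
        print(f"{name}: ~{needed:.0f} turns projected vs. cap of {TURN_CAP}")
```

Under this assumption both charters would have needed well over twice their cap, which is consistent with the conclusion that splitting them was the only way to reach the deeper hypotheses.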


Check 8 — Deviation patterns (quick pass)

Three separate charters (simple-product-merchant, virtual-downloadable-merchant, golden-path-product) each independently reported that "WooCommerce was not pre-installed by the provision script despite being the target plugin." This is a consistent provisioning defect — the studio-provision.sh or equivalent mechanism did not install the SUT automatically. Testers compensated via manual WP-CLI install, which obscures any provisioning-related bugs. This should be flagged as a harness defect.
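A preflight guard would surface this defect before any charter starts, instead of leaving each tester to work around it. The sketch below assumes WP-CLI is on the PATH and is invoked from the site root, and relies on the `wp plugin is-installed` and `wp plugin is-active` subcommands, which report their result through the exit code. Where such a check would be wired into the harness is an assumption, and `plugin_ready` is a hypothetical helper name.

```python
import subprocess
from typing import Callable


def plugin_ready(slug: str, run: Callable = subprocess.run) -> bool:
    """Return True only if the plugin is both installed and active.

    `run` is injectable so the check can be exercised without WP-CLI.
    """
    for subcommand in ("is-installed", "is-active"):
        result = run(["wp", "plugin", subcommand, slug], capture_output=True)
        if result.returncode != 0:
            return False
    return True


# Example use at the start of a session (assumed call site):
#     if not plugin_ready("woocommerce"):
#         raise SystemExit("Preflight failed: woocommerce is not installed and active")
```

Failing fast here keeps the provisioning defect visible as a harness issue rather than letting a manual install mask it.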


Check 11 — Untested feature × risk combinations

Comparing mission risk register + coverage matrix against what was actually probed:

| Feature × Risk pairing | Coverage gap type |
|---|---|
| Variable product: per-variation price/SKU independence | Zero probe (H4 deprioritized) |
| Variable product: Cartesian variation generation | Zero probe (H3 deprioritized) |
| Variable product: frontend selector + price update | Zero probe (shopper-product-detail H2 environment failure) |
| Variable product: submit without selecting attribute (validation) | Zero probe (H4 in shopper-product-detail) |
| Downloadable product: download URL exposure / security | Zero probe (no hypothesis ever written) |
| Bulk edit: silent field clearing on non-edited fields | Zero probe (product-catalog-admin H3 deprioritized) |
| CSV import: duplicate prevention (Update mode) | Zero probe (H6 deprioritized) |
| Product duplicate: slug uniqueness | Zero probe (no hypothesis) |
| Shipping fields persistence (weight/dimensions/class) | Zero probe across all charters |
| Upsells/cross-sells persistence + frontend rendering | Zero probe (incorrectly excluded from simple-product-merchant) |

Ten feature × risk pairings have zero empirical probes. Six of these are explicitly called out in the mission's risk register or must-cover list.

Severity: HIGH


Check 12 — Question items that should be Problems

Two filed items need attention: a Question that appears to warrant Problem status, and a Problem whose validity was never settled.

  1. simple-product-merchant Q1 — "Why does the Duplicate row action not produce a visible new product?" The tester observed the action was taken and no new product appeared in the list. The charter hypothesis (H4) specifically tests "duplication creates an independent copy." The observed behavior (click → no visible product) is consistent with a silent duplication failure. Filing this as a Question rather than a severity:major Problem understates the risk to the must-cover flow "Duplicate a product and verify the copy is independent." The tester noted "env_warnings" about this too, further suggesting a real functional issue.

  2. variable-product-merchant P1 was filed at confidence 0.75 with a note that the observation may be due to method bias (eval vs real click). The accompanying question "Does changing product type via real click trigger different behavior?" was not pursued with a follow-up probe. If the observation is an automation artifact, that should be established by a re-probe and the Problem withdrawn, not left open.

Severity: MEDIUM


Check 13 — Persona and role coverage

Mission scope is "merchant (admin) and shopper." Checking coverage:

  • Admin persona: All admin-side charters used admin credentials. Covered.
  • Guest shopper: shopper-browse-search explicitly tested guest and logged-in customer. golden-path-product tested cart add (shopper perspective) but did not test as an explicitly authenticated customer for the shopper legs.
  • No role-escalation or capability probing: The mission places this out of scope, so its absence is consistent.
  • One notable gap: No charter tested the shopper experience for a downloadable product from the frontend. virtual-downloadable-merchant focused entirely on admin creation. The shopper-side view of a downloadable product (Is the "Add to cart" button present? What does the product detail page look like?) has zero probes. Mission must-cover includes "View a Downloadable product detail page (if purchasable without checkout in scope)."

Severity: LOW — The parenthetical "if purchasable without checkout in scope" makes this partially in-scope, but the frontend view of the product detail page for a downloadable product (price, button, any download-specific UI) was never tested.


Summary table

| Check | Finding | Severity |
|---|---|---|
| 3 | 3 charters below 70% flow rate; 2 hit turn cap | MEDIUM |
| 4 | 3 must-cover flows with zero probes (upsells, shipping persistence, edit-existing cycle) | HIGH |
| 5 | 4 hot spots with zero probes (download security, bulk-edit, variation selector, slug on duplicate) | HIGH |
| 6 | S6 finding confidence reduced by method bias; no re-probe dispatched | LOW |
| 7 | Planner over-packed 2 critical charters; budget exhaustion was structurally predictable | HIGH |
| 8 | Provisioning defect (WC not auto-installed) across 3 sessions; harness gap | MEDIUM |
| 11 | 10 feature × risk pairings with zero empirical probes; 6 are mission-explicit | HIGH |
| 12 | Product duplication silent-failure filed as Q instead of Problem; variation-tab finding unresolved | MEDIUM |
| 13 | Downloadable product frontend view not tested despite being in must-cover | LOW |
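The severity counts can be cross-checked by tallying the severity column of the summary table. A minimal sketch, with the check-to-severity mapping transcribed from the table and an illustrative function name:

```python
from collections import Counter

# Severity per check number, transcribed from the summary table.
SEVERITY_BY_CHECK = {
    3: "MEDIUM",
    4: "HIGH",
    5: "HIGH",
    6: "LOW",
    7: "HIGH",
    8: "MEDIUM",
    11: "HIGH",
    12: "MEDIUM",
    13: "LOW",
}


def tally(severity_by_check: dict) -> Counter:
    """Count how many checks landed at each severity level."""
    return Counter(severity_by_check.values())


if __name__ == "__main__":
    counts = tally(SEVERITY_BY_CHECK)
    for level in ("HIGH", "MEDIUM", "LOW"):
        print(f"{level}: {counts[level]}")
```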

Headline verdict

4 high-severity gaps, 3 medium-severity gaps, 2 low-severity gaps.

The most consequential gaps are structural: the planner overloaded the two riskiest charters (simple-product-merchant, variable-product-merchant) with too many hypotheses for a 20–25 turn budget, guaranteeing that the deepest probes (variable product variation mechanics, bulk edit field-clearing, product duplication independence) were cut. Ten feature × risk pairings, six of them named explicitly in the mission's must-cover list or risk register, have zero empirical evidence. The run produced useful findings on simpler surfaces (shopper browse, simple product create, grouped/external types) but missed the hardest problems in the variable product and catalog management areas.
