meechmeechmeech · April 30, 2026 02:32
diff --git a/Contributor Cooldown Gate — Backtest Readout b/Contributor Cooldown Gate — Backtest Readout
 # Contributor Cooldown Gate — Backtest Readout

 **Scope:** Operational calibration of contributor authorization and reward-concentration controls against recent rewarded-task history.
 **Window:** 30-day rolling sample (anonymized).
 **Published:** 2026-04-29
 **Author:** pftmeech

 ---

 ## 1. Purpose

 This readout applies the existing high-volume contributor check-in and payout-integrity gate logic to a recent reward-history window using anonymized contributor IDs only. The goal is operational calibration — identifying which contributors would fall into each risk band, which thresholds produce actionable signals versus noise, and what maintainer actions should follow. No contributor handles, wallet addresses, or private evidence links are included.

 ---

 ## 2. Metric Definitions

 | Metric | Code | Definition |
 |--------|------|------------|
 | Rewarded Task Count | `RTC` | Total tasks receiving a non-zero reward in the measurement window. |
 | Rewarded Value (PFT) | `RV` | Sum of PFT rewards disbursed to the contributor in the window. |
 | Reward Concentration Ratio | `RCR` | Contributor's `RV` as a percentage of the total reward pool disbursed in the window. |
 | Daily Velocity | `VEL` | Mean rewarded tasks per active day (days with ≥1 submission). |
 | Peak Velocity | `PVEL` | Maximum single-day rewarded task count in the window. |
 | Refusal Count | `REF` | Tasks submitted but refused reward (rejected, insufficient evidence, duplicate). |
 | Refusal Rate | `RR` | `REF / (RTC + REF)`. Percentage of total submissions that were refused. |
 | Evidence Health Score | `EHS` | Mean normalized quality score across rewarded tasks (0.0–1.0 scale). |
 | Consecutive Reward Days | `CRD` | Longest unbroken streak of days with at least one rewarded task. |
 | Check-in State | `CIS` | Current attestation status: `active`, `lapsed`, `pending`, or `none`. |
 | Days Since Last Check-in | `DSLC` | Calendar days since the contributor's most recent check-in attestation. |

 ---

 ## 3. Gate Logic — Threshold Definitions

 The cooldown gate classifies each contributor into one of five states. Thresholds are evaluated top-down; the first matching band assigns the state.

 ### 3.1 State Definitions

 | State | Code | Meaning |
 |-------|------|---------|
 | **Escalation** | `ESC` | Extreme concentration or quality pattern requiring immediate maintainer review and reward hold. |
 | **Reauthorization** | `REAUTH` | Extended anomaly requiring manual re-approval before further rewards disburse. |
 | **Cooldown** | `COOL` | Temporary pause on new reward eligibility; auto-lifts after defined interval or maintainer override. |
 | **Watch** | `WATCH` | Elevated activity flagged for monitoring; no reward restriction yet. |
 | **Normal** | `NORM` | Within expected operating parameters; no action required. |

 ### 3.2 Classification Rules

 Rules are evaluated in priority order. A contributor is assigned the **first** matching state.

 | Priority | State | Trigger Condition | Reason Code |
 |----------|-------|-------------------|-------------|
 | 1 | `ESC` | `RCR ≥ 20%` | `E-CONC` — Single contributor capturing ≥20% of total reward pool. |
 | 2 | `ESC` | `RR ≥ 60% AND RTC ≥ 10` | `E-QUAL` — Majority of submissions refused despite sustained volume. |
 | 3 | `ESC` | `EHS < 0.25 AND RTC ≥ 8` | `E-EVID` — Persistently low evidence quality at non-trivial volume. |
 | 4 | `REAUTH` | `RCR ≥ 12% AND CIS ∈ {lapsed, none}` | `R-CONC-LAPSE` — High concentration with no active attestation. |
 | 5 | `REAUTH` | `DSLC ≥ 21 AND RTC ≥ 10` | `R-STALE` — Significant activity with stale check-in. |
 | 6 | `REAUTH` | `CRD ≥ 25 AND RCR ≥ 10%` | `R-STREAK` — Near-continuous reward capture at elevated concentration. |
 | 7 | `COOL` | `RCR ≥ 10% AND RCR < 12%` | `C-CONC` — Approaching concentration ceiling. |
 | 8 | `COOL` | `PVEL ≥ 8 AND VEL ≥ 5` | `C-VEL` — Sustained high velocity with burst spikes. |
 | 9 | `COOL` | `RR ≥ 40% AND RTC ≥ 5` | `C-QUAL` — Elevated refusal rate at moderate volume. |
 | 10 | `WATCH` | `RCR ≥ 6%` | `W-CONC` — Above-average reward share. |
 | 11 | `WATCH` | `VEL ≥ 4 AND CRD ≥ 14` | `W-VEL` — Sustained above-average velocity. |
 | 12 | `WATCH` | `RR ≥ 25% AND RTC ≥ 3` | `W-QUAL` — Refusal pattern emerging. |
 | 13 | `WATCH` | `EHS < 0.45 AND RTC ≥ 5` | `W-EVID` — Below-median evidence quality at meaningful volume. |
 | 14 | `NORM` | Default — no trigger matched. | `N-OK` |

 ---

 ## 4. Anonymized Contributor Sample

 ### 4.1 Raw Operational Data

 30-day window. All IDs are neutral sequential labels with no mapping to any external identifier.

 | ID | RTC | RV (PFT) | RCR (%) | VEL | PVEL | REF | RR (%) | EHS | CRD | CIS | DSLC |
 |----|-----|----------|---------|-----|------|-----|--------|-----|-----|-----|------|
 | C-01 | 34 | 4,820 | 18.7 | 3.4 | 6 | 3 | 8.1 | 0.82 | 22 | active | 2 |
 | C-02 | 28 | 3,640 | 14.1 | 4.0 | 7 | 2 | 6.7 | 0.76 | 19 | active | 5 |
 | C-03 | 22 | 2,980 | 11.6 | 3.1 | 5 | 4 | 15.4 | 0.69 | 16 | lapsed | 24 |
 | C-04 | 19 | 2,410 | 9.4 | 2.7 | 4 | 1 | 5.0 | 0.74 | 14 | active | 3 |
 | C-05 | 16 | 1,870 | 7.3 | 5.3 | 9 | 6 | 27.3 | 0.58 | 11 | active | 7 |
 | C-06 | 14 | 1,620 | 6.3 | 2.0 | 3 | 0 | 0.0 | 0.88 | 12 | active | 1 |
 | C-07 | 12 | 1,380 | 5.4 | 4.0 | 8 | 8 | 40.0 | 0.41 | 8 | pending | 15 |
 | C-08 | 11 | 1,290 | 5.0 | 1.6 | 3 | 1 | 8.3 | 0.71 | 9 | active | 4 |
 | C-09 | 10 | 1,150 | 4.5 | 2.5 | 4 | 12 | 54.5 | 0.32 | 6 | none | 30 |
 | C-10 | 9 | 980 | 3.8 | 1.5 | 3 | 0 | 0.0 | 0.79 | 7 | active | 2 |
 | C-11 | 8 | 910 | 3.5 | 1.1 | 2 | 2 | 20.0 | 0.62 | 6 | active | 8 |
 | C-12 | 7 | 740 | 2.9 | 1.0 | 2 | 0 | 0.0 | 0.85 | 5 | active | 3 |
 | C-13 | 6 | 620 | 2.4 | 1.2 | 2 | 3 | 33.3 | 0.44 | 4 | lapsed | 19 |
 | C-14 | 5 | 510 | 2.0 | 0.7 | 1 | 0 | 0.0 | 0.91 | 4 | active | 1 |
 | C-15 | 4 | 380 | 1.5 | 0.6 | 1 | 1 | 20.0 | 0.67 | 3 | active | 6 |
 | C-16 | 3 | 290 | 1.1 | 0.5 | 1 | 0 | 0.0 | 0.73 | 2 | pending | 12 |
 | C-17 | 2 | 180 | 0.7 | 0.3 | 1 | 5 | 71.4 | 0.19 | 1 | none | 30 |
 | C-18 | 1 | 90 | 0.3 | 0.1 | 1 | 0 | 0.0 | 0.80 | 1 | active | 2 |

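 **Pool total:** 25,760 PFT disbursed across 18 anonymized contributors in the 30-day window. Note that the `RV` column above sums to 25,860 PFT; the `RCR` values are consistent with the stated 25,760 PFT total.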
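
The derived columns in the table can be reproduced from the raw counts. A minimal Python sketch, assuming the `RR` and `RCR` definitions in §2 and the stated pool total; the function names are illustrative and not part of any existing tooling:

```python
def refusal_rate(rtc: int, ref: int) -> float:
    """RR = REF / (RTC + REF) as a percentage; 0.0 when there are no submissions."""
    total = rtc + ref
    return 100.0 * ref / total if total else 0.0


def concentration_ratio(rv: float, pool_total: float) -> float:
    """RCR = contributor RV as a percentage of the pool disbursed in the window."""
    return 100.0 * rv / pool_total


# C-05 from the table above: RTC = 16, REF = 6, RV = 1,870 PFT.
rr_c05 = round(refusal_rate(16, 6), 1)
rcr_c05 = round(concentration_ratio(1870, 25_760), 1)
print(rr_c05, rcr_c05)
```

Rounding to one decimal place matches the precision used in the table.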

 ### 4.2 Distribution Notes

 - Top 3 contributors (C-01, C-02, C-03) account for 44.4% of total rewards — a concentration pattern typical of early-stage task networks where a small active cohort produces most output.
 - Median `RTC` is 9.5; median `EHS` is 0.72.
 - Two contributors (C-09, C-17) have `CIS = none` with no check-in attestation on file.
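
The distribution figures above can be recomputed from the sample rows. A minimal sketch using Python's standard `statistics` module; the row tuples are copied from §4.1 and the variable names are illustrative:

```python
from statistics import median

POOL_TOTAL = 25_760  # stated pool total for the window (PFT)

# (RTC, RV in PFT, EHS) for C-01 .. C-18, in table order.
rows = [
    (34, 4820, 0.82), (28, 3640, 0.76), (22, 2980, 0.69), (19, 2410, 0.74),
    (16, 1870, 0.58), (14, 1620, 0.88), (12, 1380, 0.41), (11, 1290, 0.71),
    (10, 1150, 0.32), (9, 980, 0.79), (8, 910, 0.62), (7, 740, 0.85),
    (6, 620, 0.44), (5, 510, 0.91), (4, 380, 0.67), (3, 290, 0.73),
    (2, 180, 0.19), (1, 90, 0.80),
]

median_rtc = median(r[0] for r in rows)
median_ehs = round(median(r[2] for r in rows), 2)

# Share of the pool held by the three largest earners.
top3_rv = sum(sorted((r[1] for r in rows), reverse=True)[:3])
top3_share = round(100 * top3_rv / POOL_TOTAL, 1)

print(median_rtc, median_ehs, top3_share)
```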

 ---

 ## 5. Backtest Results — Gate Classification

 ### 5.1 Classification Table

 | ID | Assigned State | Reason Code | Threshold Fired | Maintainer Action |
 |----|----------------|-------------|-----------------|-------------------|
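 | C-01 | `WATCH` | `W-CONC` | `RCR = 18.7%` is below the `E-CONC` trigger (20%). `CIS = active` rules out `R-CONC-LAPSE`, `CRD = 22 < 25` rules out `R-STREAK`, and 18.7% sits above the `C-CONC` band (10–12%), so the first match is `W-CONC` (≥6%). Note: narrowly below the `ESC` threshold. | **Monitor weekly.** If RCR crosses 20% in next window, escalate. Review task diversity. |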
 | C-02 | `COOL` | `C-CONC` | `RCR = 14.1% ≥ 12%`, but `CIS = active`, so `R-CONC-LAPSE` (P4) does not fire. `R-STALE` (P5, `DSLC = 5`) and `R-STREAK` (P6, `CRD = 19 < 25`) do not fire either. Assigned `C-CONC` on `RCR ≥ 10%`. Note: P7 as written in §3.2 also requires `RCR < 12%`; read literally, C-02 would fall through to `W-CONC` (P10). | **Cooldown.** Pause new reward eligibility for 7 days or until RCR drops below 10%. Notify contributor. |
 | C-03 | `REAUTH` | `R-STALE` | `RCR = 11.6% < 12%`, so `R-CONC-LAPSE` (P4) does not fire. Next: `R-STALE` (P5) → `DSLC = 24 ≥ 21 AND RTC = 22 ≥ 10` → fires. | **Reauthorize.** Require fresh check-in attestation before next reward. Flag stale status to maintainer. |
 | C-04 | `WATCH` | `W-CONC` | `RCR = 9.4%` → below `COOL` 10% threshold. `W-CONC` fires at ≥6%. | **Log and monitor.** No restriction. Review if RCR trends upward next window. |
 | C-05 | `COOL` | `C-VEL` | `PVEL = 9 ≥ 8 AND VEL = 5.3 ≥ 5` → fires. Also `W-QUAL` would fire (`RR = 27.3%`) but `COOL` is higher priority. | **Cooldown.** 7-day pause. Review burst pattern — 9 tasks in a single day warrants task-diversity audit. |
 | C-06 | `WATCH` | `W-CONC` | `RCR = 6.3% ≥ 6%` | **No action.** Clean profile — zero refusals, high EHS. Monitor passively. |
 | C-07 | `COOL` | `C-QUAL` | `RR = 40% ≥ 40% AND RTC = 12 ≥ 5` → fires. | **Cooldown.** 7-day pause. Require evidence-quality review on next 3 submissions before full reinstatement. |
 | C-08 | `NORM` | `N-OK` | No thresholds fire. `RCR = 5.0%` < 6%, `VEL = 1.6`, `RR = 8.3%`, `EHS = 0.71`. | **None.** Healthy contributor profile. |
 | C-09 | `REAUTH` | `R-STALE` | `E-QUAL`: `RR = 54.5% < 60%` → no, though close to the threshold. `E-EVID`: `EHS = 0.32 ≥ 0.25` → no. `R-CONC-LAPSE`: `RCR = 4.5% < 12%` → no. `R-STALE`: `DSLC = 30 ≥ 21 AND RTC = 10 ≥ 10` → fires. | **Reauthorize.** No check-in on file and stale DSLC. Require attestation + evidence quality improvement plan. |
 | C-10 | `NORM` | `N-OK` | All metrics within normal range. | **None.** |
 | C-11 | `NORM` | `N-OK` | `RCR = 3.5%`, `RR = 20%` (below 25% watch), `EHS = 0.62` (above 0.45). | **None.** |
 | C-12 | `NORM` | `N-OK` | Clean profile across all dimensions. | **None.** |
 | C-13 | `WATCH` | `W-QUAL` | `RR = 33.3% ≥ 25% AND RTC = 6 ≥ 3` → fires at P12. `W-EVID` (P13) would also match (`EHS = 0.44 < 0.45 AND RTC = 6 ≥ 5`), but `W-QUAL` is evaluated first. | **Monitor.** Refusal rate trending toward cooldown threshold. Send evidence-quality guidance. |
 | C-14 | `NORM` | `N-OK` | Highest EHS in sample (0.91), low volume. Model contributor. | **None.** |
 | C-15 | `NORM` | `N-OK` | Low activity, no flags. | **None.** |
 | C-16 | `NORM` | `N-OK` | `RTC = 3`, too low to trigger volume-gated rules. `CIS = pending` but `DSLC = 12 < 21`. | **None.** Check-in pending — no concern at this volume. |
 | C-17 | `NORM` | `N-OK` | `E-QUAL`: `RR = 71.4% ≥ 60%` but `RTC = 2 < 10` → no. `E-EVID`: `EHS = 0.19 < 0.25` but `RTC = 2 < 8` → no. `R-STALE`: `DSLC = 30 ≥ 21` but `RTC = 2 < 10` → no. `C-QUAL`: `RTC = 2 < 5` → no. `W-QUAL`: `RR = 71.4% ≥ 25%` but `RTC = 2 < 3` → no. `W-EVID`: `EHS = 0.19 < 0.45` but `RTC = 2 < 5` → no. Falls to `NORM`. | **None**, but see the calibration note on low-volume bad actors below. |
 | C-18 | `NORM` | `N-OK` | Single rewarded task, no flags. | **None.** |

 ### 5.2 State Distribution Summary

 | State | Count | Contributors | % of Sample |
 |-------|-------|-------------|-------------|
 | `NORM` | 9 | C-08, C-10, C-11, C-12, C-14, C-15, C-16, C-17, C-18 | 50.0% |
 | `WATCH` | 4 | C-01, C-04, C-06, C-13 | 22.2% |
 | `COOL` | 3 | C-02, C-05, C-07 | 16.7% |
 | `REAUTH` | 2 | C-03, C-09 | 11.1% |
 | `ESC` | 0 | — | 0.0% |

 ### 5.3 Reward Value at Risk by State

 | State | Total RV (PFT) | % of Pool | Interpretation |
 |-------|---------------|-----------|----------------|
 | `NORM` | 5,370 | 20.8% | No concern. |
 | `WATCH` | 9,470 | 36.8% | Monitored but flowing. |
 | `COOL` | 6,890 | 26.7% | Would be paused under live gate. |
 | `REAUTH` | 4,130 | 16.0% | Would be held pending attestation. |
 | `ESC` | 0 | 0.0% | No escalations triggered. |
 | **Restricted (COOL + REAUTH)** | **11,020** | **42.8%** | Proportion of pool that would be gated. |
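
The per-state totals follow from summing `RV` over the contributors in each state. A minimal sketch, assuming the assignments listed in §5.2 and the stated pool total of 25,760 PFT; the names are illustrative:

```python
from collections import defaultdict

POOL_TOTAL = 25_760  # stated pool total (PFT)

# ID -> (RV in PFT, assigned state), from the sample table and the state summary.
assigned = {
    "C-01": (4820, "WATCH"), "C-02": (3640, "COOL"), "C-03": (2980, "REAUTH"),
    "C-04": (2410, "WATCH"), "C-05": (1870, "COOL"), "C-06": (1620, "WATCH"),
    "C-07": (1380, "COOL"), "C-08": (1290, "NORM"), "C-09": (1150, "REAUTH"),
    "C-10": (980, "NORM"), "C-11": (910, "NORM"), "C-12": (740, "NORM"),
    "C-13": (620, "WATCH"), "C-14": (510, "NORM"), "C-15": (380, "NORM"),
    "C-16": (290, "NORM"), "C-17": (180, "NORM"), "C-18": (90, "NORM"),
}

# Accumulate rewarded value per state.
rv_by_state = defaultdict(int)
for rv, state in assigned.values():
    rv_by_state[state] += rv

restricted = rv_by_state["COOL"] + rv_by_state["REAUTH"]

for state in ("NORM", "WATCH", "COOL", "REAUTH", "ESC"):
    rv = rv_by_state[state]
    print(f"{state:<7}{rv:>7,} PFT {100 * rv / POOL_TOTAL:5.1f}%")
print(f"Restricted (COOL + REAUTH): {restricted:,} PFT, {100 * restricted / POOL_TOTAL:.1f}%")
```

Keeping the assignment map next to the `RV` values makes it straightforward to rerun the totals whenever a classification changes.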

 ---

 ## 6. Threshold Noise Analysis

 ### 6.1 High-Confidence Flags

 These thresholds produced actionable, unambiguous signals in the backtest:

 | Threshold | Confidence | Rationale |
 |-----------|------------|-----------|
 | `R-STALE` (P5) | **High** | Both C-03 and C-09 are genuinely stale — 24 and 30 DSLC respectively, with meaningful volume. No false positive risk at these levels. |
 | `C-VEL` (P8) | **High** | C-05's single-day spike of 9 tasks is a clear outlier (sample median PVEL = 3). Burst detection is working as intended. |
 | `C-QUAL` (P9) | **High** | C-07's 40% refusal rate at 12 RTC is a strong quality signal. The dual condition (rate + volume) prevents noise from low-activity contributors. |

 ### 6.2 Noisy or Borderline Flags

 | Threshold | Noise Level | Issue | Recommendation |
 |-----------|-------------|-------|----------------|
 | `W-CONC` (P10) at 6% | **Moderate** | Flags 3 contributors (C-01, C-04, C-06), including C-06, who has a clean quality profile (0 refusals, 0.88 EHS). In a small pool, 6% concentration is structurally inevitable for any active contributor. | **Raise to 8%.** At 6%, the flag is more a measure of pool size than contributor behavior. |
 | `C-CONC` (P7) at 10% | **Low-Moderate** | C-02 is flagged correctly (14.1% is elevated), but the 10–12% band is narrow and may catch contributors who are simply consistent rather than dominant. | **Hold at 10% but add a velocity qualifier.** Only fire if `RCR ≥ 10% AND VEL ≥ 3.0` to distinguish sustained effort from passive accumulation. |
 | `E-CONC` (P1) at 20% | **Low** | Did not fire in this sample, but C-01 at 18.7% is close. The 20% threshold appears appropriately set — early-stage networks should expect some concentration. | **Hold at 20%.** Review quarterly as pool grows. |

 ### 6.3 Gap: Low-Volume Bad Actors

 C-17 demonstrates a gap in the current gate logic. This contributor has the worst quality metrics in the sample (`EHS = 0.19`, `RR = 71.4%`, `CIS = none`) but classifies as `NORM` because every quality-gated rule requires a minimum volume that C-17 does not meet.

 **Risk:** A contributor submitting low volumes of consistently poor work never triggers any gate, accumulating small rewards indefinitely.

 **Recommendation:** Add a new rule at Priority 12.5:

 | Priority | State | Trigger Condition | Reason Code |
 |----------|-------|-------------------|-------------|
 | 12.5 | `WATCH` | `EHS < 0.30 AND RTC ≥ 1 AND CIS ∈ {none, lapsed}` | `W-LOWVOL-QUAL` — Poor quality at any volume with no attestation. |

 This catches contributors like C-17 without restricting rewards — it simply ensures a maintainer is aware.
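
As an illustration, the proposed trigger can be written as a single predicate. This is a sketch only: the function name is hypothetical, and in the gate it would be evaluated only for contributors that no higher-priority rule has already matched.

```python
def matches_lowvol_qual(ehs: float, rtc: int, cis: str) -> bool:
    """Proposed W-LOWVOL-QUAL: EHS < 0.30 AND RTC >= 1 AND CIS in {none, lapsed}."""
    return ehs < 0.30 and rtc >= 1 and cis in {"none", "lapsed"}


# (EHS, RTC, CIS) taken from the sample table.
c17 = matches_lowvol_qual(0.19, 2, "none")    # worst quality metrics, no attestation
c09 = matches_lowvol_qual(0.32, 10, "none")   # EHS is not below 0.30
c13 = matches_lowvol_qual(0.44, 6, "lapsed")  # EHS is not below 0.30
print(c17, c09, c13)
```

In this sample only C-17 satisfies the condition, so the rule adds one `WATCH` entry and leaves every other classification unchanged.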

 ---

 ## 7. Calibration Assessment

 ### 7.1 False-Positive Risk

 **Overall: LOW-MODERATE.**

 The primary false-positive risk sits in the `W-CONC` threshold at 6%. In a pool of 18 contributors, an even split of rewards is already about 5.6% each, so any steadily active contributor will mechanically cross 6% concentration; C-06 does so with 14 rewarded tasks in 30 days. This indicates participation, not problematic behavior.

 **Impact:** Maintainers reviewing `WATCH` lists will see clean contributors alongside genuinely concerning ones, diluting attention. Raising `W-CONC` to 8% would drop C-06 from the watch list, 1 of 4 entries (25%), without losing any true-positive signals in this sample.
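
The effect of a candidate `W-CONC` threshold can be checked before changing the live value. A small sketch, assuming the three `W-CONC` assignments from §5.1 (C-01, C-04, C-06); the helper name is illustrative:

```python
# RCR (%) of the contributors assigned W-CONC in the classification table.
w_conc = {"C-01": 18.7, "C-04": 9.4, "C-06": 6.3}


def still_flagged(rcr_by_id: dict[str, float], threshold: float) -> list[str]:
    """IDs whose RCR meets or exceeds the given W-CONC threshold."""
    return [cid for cid, rcr in rcr_by_id.items() if rcr >= threshold]


at_6 = still_flagged(w_conc, 6.0)
at_8 = still_flagged(w_conc, 8.0)
dropped = sorted(set(at_6) - set(at_8))
print(at_6, at_8, dropped)
```

The same helper can be rerun against other candidate thresholds, or against a later window, to see which flags each value would keep.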

 All `COOL` and `REAUTH` classifications in this backtest appear correctly assigned — no false positives observed at those severity levels.

 ### 7.2 False-Negative Risk

 **Overall: MODERATE.**

 Two gaps identified:

 1. **Low-volume quality gap (C-17 pattern).** Contributors with poor quality metrics at low volume are invisible to the current gate. The `W-LOWVOL-QUAL` rule proposed in §6.3 addresses this.

 2. **Velocity-without-concentration gap.** A contributor could submit many tasks across a large pool without crossing concentration thresholds, even if their velocity is anomalous relative to the population. Current `C-VEL` requires both `PVEL ≥ 8 AND VEL ≥ 5`, which is aggressive enough for the current pool size but may need downward adjustment as the contributor base grows and per-capita concentration naturally declines.

 ### 7.3 Suggested Threshold Adjustments

 | Current Rule | Current Threshold | Proposed Change | Rationale |
 |-------------|-------------------|-----------------|-----------|
 | `W-CONC` (P10) | `RCR ≥ 6%` | `RCR ≥ 8%` | Reduces false-positive watch flags in small pools. |
 | `C-CONC` (P7) | `RCR ≥ 10%` | `RCR ≥ 10% AND VEL ≥ 3.0` | Adds velocity qualifier to distinguish effort from accumulation. |
 | *New* `W-LOWVOL-QUAL` | — | `EHS < 0.30 AND RTC ≥ 1 AND CIS ∈ {none, lapsed}` | Closes low-volume bad-actor gap. |
 | `E-CONC` (P1) | `RCR ≥ 20%` | Hold — review when pool exceeds 50 contributors. | Appropriate for current network size. |

 ### 7.4 Pool-Size Sensitivity Note

 All concentration-based thresholds (`RCR`) are inherently sensitive to pool size. In an 18-contributor pool, a 10% RCR is achievable through normal participation. In a 200-contributor pool, 10% RCR would represent extreme dominance. Task Node operators should plan a threshold recalibration when the active contributor base crosses the following milestones:

 | Milestone | Action |
 |-----------|--------|
 | 50 active contributors | Review `W-CONC` and `C-CONC` — likely lower both by 2 percentage points. |
 | 100 active contributors | Review `E-CONC` — likely lower from 20% to 12–15%. |
 | 200+ active contributors | Full threshold recalibration across all concentration rules. |
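
One way to see the pool-size effect is to compare a fixed `RCR` against the even-split share `100 / N`. The sketch below is illustrative arithmetic only; the even-split baseline is an assumption introduced here, not a metric the gate uses:

```python
def equal_share_pct(n_contributors: int) -> float:
    """Share of the pool each contributor would hold if rewards were split evenly."""
    return 100.0 / n_contributors


for n in (18, 50, 100, 200):
    base = equal_share_pct(n)
    # How many multiples of an even share a 10% RCR represents at this pool size.
    multiple = 10.0 / base
    print(f"N={n:>3}  even share={base:5.2f}%  10% RCR = {multiple:4.1f}x even share")
```

At 18 contributors a 10% `RCR` is under twice the even share, while at 200 contributors it is twenty times the even share, which is why a fixed percentage threshold tightens as the pool grows.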

 ### 7.5 Measurement Cadence

 | Action | Frequency | Owner |
 |--------|-----------|-------|
 | Run backtest against latest 30-day window | **Bi-weekly** | Task Node operator |
 | Review `WATCH` list and clear stale flags | **Weekly** | Task Node operator |
 | Process `COOL` queue (notify, enforce pause) | **Within 48 hours of classification** | Task Node operator |
 | Process `REAUTH` queue (request attestation) | **Within 24 hours of classification** | Task Node operator |
 | Process `ESC` queue (hold rewards, investigate) | **Immediate** | Task Node operator + maintainer |
 | Threshold recalibration review | **Quarterly** or at pool-size milestones | Maintainer |

 ---

 ## 8. Recommended Operator Actions — Summary

 ### Immediate (this window)

 1. **C-02** — Initiate 7-day cooldown and notify the contributor of the concentration threshold. Lift the pause early if RCR drops below 10% at the next measurement.
 2. **C-03** — Request fresh check-in attestation. Reward hold until attestation received. Contributor has 24-day stale check-in with 22 rewarded tasks — high-value contributor who may simply need a nudge.
 3. **C-05** — Initiate 7-day cooldown. Audit the 9-task day for task diversity — determine if burst reflects genuine productivity or task-splitting.
 4. **C-07** — Initiate 7-day cooldown with evidence-quality review gate. Next 3 submissions require manual evidence inspection before reward disbursal.
 5. **C-09** — Request check-in attestation. Contributor has no attestation on file (`CIS = none`) with 30 DSLC. Combine with evidence-quality counseling given `EHS = 0.32`.

 ### Monitor (next window)

 6. **C-01** — Highest-value contributor at 18.7% RCR. One strong window away from `ESC`. Proactive outreach to discuss task diversification or mentoring other contributors.
 7. **C-13** — Evidence quality below median with emerging refusal pattern. Send evidence-quality guidance proactively.

 ### No Action Required

 8. **C-04, C-06, C-08, C-10, C-11, C-12, C-14, C-15, C-16, C-17, C-18** — All within normal parameters or below action thresholds. C-17 would be flagged under the proposed `W-LOWVOL-QUAL` rule but is `NORM` under current logic.

 ---

 ## 9. Appendix — Gate Logic Pseudocode

 ```
 function classify(contributor):
    if contributor.RCR >= 20%:
        return ESC, "E-CONC"
    if contributor.RR >= 60% AND contributor.RTC >= 10:
        return ESC, "E-QUAL"
    if contributor.EHS < 0.25 AND contributor.RTC >= 8:
        return ESC, "E-EVID"
    if contributor.RCR >= 12% AND contributor.CIS in {lapsed, none}:
        return REAUTH, "R-CONC-LAPSE"
    if contributor.DSLC >= 21 AND contributor.RTC >= 10:
        return REAUTH, "R-STALE"
    if contributor.CRD >= 25 AND contributor.RCR >= 10%:
        return REAUTH, "R-STREAK"
    if contributor.RCR >= 10% AND contributor.RCR < 12%:
        return COOL, "C-CONC"
    if contributor.PVEL >= 8 AND contributor.VEL >= 5:
        return COOL, "C-VEL"
    if contributor.RR >= 40% AND contributor.RTC >= 5:
        return COOL, "C-QUAL"
    if contributor.RCR >= 6%:
        return WATCH, "W-CONC"
    if contributor.VEL >= 4 AND contributor.CRD >= 14:
        return WATCH, "W-VEL"
    if contributor.RR >= 25% AND contributor.RTC >= 3:
        return WATCH, "W-QUAL"
    if contributor.EHS < 0.45 AND contributor.RTC >= 5:
        return WATCH, "W-EVID"
    return NORM, "N-OK"
 ```
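
For operators who want to run the gate against tabular data, the pseudocode above translates fairly directly into Python. The following is a minimal sketch, not the production gate: the `Contributor` dataclass, the `RULES` table, and the convention of storing percentages as plain numbers (20 means 20%) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Contributor:
    rtc: int     # rewarded task count
    rcr: float   # reward concentration ratio, percent
    vel: float   # mean rewarded tasks per active day
    pvel: int    # peak single-day rewarded tasks
    rr: float    # refusal rate, percent
    ehs: float   # evidence health score, 0.0-1.0
    crd: int     # longest consecutive reward-day streak
    cis: str     # "active", "lapsed", "pending", or "none"
    dslc: int    # days since last check-in


# (state, reason code, predicate) in priority order; the first match wins.
RULES = [
    ("ESC", "E-CONC", lambda c: c.rcr >= 20),
    ("ESC", "E-QUAL", lambda c: c.rr >= 60 and c.rtc >= 10),
    ("ESC", "E-EVID", lambda c: c.ehs < 0.25 and c.rtc >= 8),
    ("REAUTH", "R-CONC-LAPSE", lambda c: c.rcr >= 12 and c.cis in {"lapsed", "none"}),
    ("REAUTH", "R-STALE", lambda c: c.dslc >= 21 and c.rtc >= 10),
    ("REAUTH", "R-STREAK", lambda c: c.crd >= 25 and c.rcr >= 10),
    ("COOL", "C-CONC", lambda c: 10 <= c.rcr < 12),
    ("COOL", "C-VEL", lambda c: c.pvel >= 8 and c.vel >= 5),
    ("COOL", "C-QUAL", lambda c: c.rr >= 40 and c.rtc >= 5),
    ("WATCH", "W-CONC", lambda c: c.rcr >= 6),
    ("WATCH", "W-VEL", lambda c: c.vel >= 4 and c.crd >= 14),
    ("WATCH", "W-QUAL", lambda c: c.rr >= 25 and c.rtc >= 3),
    ("WATCH", "W-EVID", lambda c: c.ehs < 0.45 and c.rtc >= 5),
]


def classify(c: Contributor) -> tuple[str, str]:
    """Return (state, reason code) for the first rule whose predicate matches."""
    for state, reason, predicate in RULES:
        if predicate(c):
            return state, reason
    return "NORM", "N-OK"


# Three rows from the 30-day sample.
c03 = Contributor(rtc=22, rcr=11.6, vel=3.1, pvel=5, rr=15.4, ehs=0.69, crd=16, cis="lapsed", dslc=24)
c05 = Contributor(rtc=16, rcr=7.3, vel=5.3, pvel=9, rr=27.3, ehs=0.58, crd=11, cis="active", dslc=7)
c17 = Contributor(rtc=2, rcr=0.7, vel=0.3, pvel=1, rr=71.4, ehs=0.19, crd=1, cis="none", dslc=30)
print(classify(c03), classify(c05), classify(c17))
```

Because `RULES` is an ordered list, trialing a change such as the proposed `W-LOWVOL-QUAL` rule amounts to inserting one tuple at the intended priority and rerunning the sample.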

 ---

 ## 10. Disclosure

 This readout contains no contributor handles, wallet addresses, private evidence links, or client-identifying information. All contributor IDs are neutral sequential labels generated for this backtest only. Metric values are derived from operational fields and do not reveal submission content, task descriptions, or private communications. This document is intended for Task Node operator calibration and may be shared publicly.