Reference card for the 4-tier content classification system, 4 skills, and 3 agents that power startaitools.com publishing.
Version: 3.0.0 | Author: Jeremy Longshore | Last Updated: 2026-04-09
| Tier | Name | Trigger | Length | Quality Gates |
|---|---|---|---|---|
| 1 | Field Note | Auto-classified | 80-140 lines | Hugo build |
| 2 | Technical Deep-Dive | Auto-classified | 150-250 lines | Hugo build + consistency audit |
| 3 | Case Study | Auto-classified | 300-500 lines | Hugo build + consistency + fact-check |
| 4 | Distinguished Paper | Manual (/blog-research-article) | 1200-1800 words | Full quality gate chain + user review |
Expected distribution: 60-70% Tier 1, 25-35% Tier 2, 5-10% Tier 3
Key rule: Every day's work is covered regardless of tier. Tier determines treatment (depth, structure, quality gates), not scope.
What it does: Generates one blog post per day from git activity across all repos. Auto-classifies each day into Tier 1-3. Records all decisions in methodology tracking. Publishes to startaitools.com, dual-publishes to tonsofskills.com, queues cross-posts, sends social bundles.
Arguments:
/blog-backfill 2026-03-01 2026-03-15 # Specific date range
/blog-backfill # Auto-detect: last post date → yesterday
/blog-backfill weekly # Generate weekly recap
/blog-backfill monthly # Generate monthly retrospective
Pipeline (per day):
- Gather material (git, PRs, beads, transcripts, email signals)
- Classify → blog-classifier agent scores 6 dimensions, returns tier
- Record decision → methodology/decisions.jsonl
- Write post → tier-specific structure and length
- Quality gates → tier-specific (consistency audit for T2+, fact-check for T3)
- Publish → Hugo build, commit, push to master, Netlify auto-deploys
- Cross-post → Tier 2+ only (Dev.to +24h, Hashnode +24h, Medium +48h)
- Social bundle → Tier 2+ gets X thread + LinkedIn + Substack
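The tier-gated quality steps above reduce to a simple lookup; a minimal sketch of the gate chain for the auto-classified tiers (Tier 4 adds user review on top of this):

```python
def quality_gates(tier):
    """Return the gate names a post of this tier must pass.

    Every tier gets a Hugo build; Tier 2+ adds the consistency
    audit; Tier 3 adds the fact-check on top of that.
    """
    gates = ["hugo_build"]
    if tier >= 2:
        gates.append("consistency_audit")
    if tier >= 3:
        gates.append("fact_check")
    return gates

print(quality_gates(3))  # ['hugo_build', 'consistency_audit', 'fact_check']
```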
Automation (RemoteTrigger cron):
- Daily: `17 6 * * *` (6:17 AM CT)
- Weekly recap: `42 17 * * 5` (Friday 5:42 PM CT)
- Monthly retro: `23 8 1 * *` (1st of month 8:23 AM CT)
Key files:
~/.claude/skills/blog-backfill/
├── SKILL.md # Main orchestrator (v3.0.0)
├── agents/
│ ├── blog-classifier.md # Tier classification agent
│ ├── blog-consistency-checker.md # Thesis/tone/contradiction audit
│ └── blog-fact-checker.md # External claim verification
├── methodology/
│ ├── decisions.jsonl # Classification decision log
│ ├── feedback.jsonl # Post-publication assessments
│ ├── patterns.jsonl # Emergent classification rules
│ └── rebuild-index.sql # SQLite analytics schema
├── references/
│ ├── content-tier-classification.md # Tier defs, dimensions, anti-inflation
│ ├── classify-day.md # Classifier prompt + calibration examples
│ ├── write-post.md # Tier-specific writer instructions
│ ├── cadence-system.md # Weekly/monthly templates + cron specs
│ ├── content-strategy.md # Research-backed distribution strategy
│ ├── gather-material.md # Source collection + email guardrails
│ ├── final-verification.md # Post-publish checklist
│ ├── publish-verify.md # Hugo build + push commands
│ ├── crosspost-queue.md # Staggered cross-post schema
│ └── social-bundle.md # X/LinkedIn/Substack format
└── scripts/ # Shell scripts for cross-posting
What it does: Tier 4 Distinguished Paper workflow. Interactive, research-grade writing methodology. Starts from a user-provided article link, researches the landscape, presents angles, discusses with user, then runs an 8-step methodology grounded in classical rhetoric and composition theory.
Arguments:
/blog-research-article software-supply-chain-security # Slug only — asks for source
/blog-research-article https://arxiv.org/abs/2025.12345 # URL only — derives slug
/blog-research-article supply-chain https://arxiv.org/abs/2025.12345 # Both
Workflow:
Phase 0: Input & Research
├── Fetch article, research landscape, find related work
└── Prepare research brief
Phase 1: Angle Selection (interactive)
├── Present 3-5 candidate angles with Toulmin structure
├── Back-and-forth discussion
└── Lock thesis statement
Phase 2: 8-Step Methodology
1. Invention — freewriting, 5W1H, evidence gathering
2. Audience & Framing — PAPAS framework, Toulmin argument map
3. Arrangement — outline with placed evidence
4. Drafting — full first draft (1200-1800 words)
5. Revision — structural and accessibility review
5b. Quality Gate — fact-check + consistency audit (parallel)
6. Editing — polish, SEO, front matter
7. Platform Adaptation — 6 platform versions
8. Publication — full pipeline (same as blog-backfill)
Quality bar: "Would someone cite this?"
Theoretical foundations: Cicero's 5 canons, Flower & Hayes cognitive model, Toulmin argumentation, Elbow freewriting, Murray process of discovery, ACM 6-criteria evaluation, PAPAS method.
What it does: Monthly calibration report from the methodology tracking system. Analyzes classification decisions and feedback to detect tier inflation, measure calibration accuracy, and surface patterns.
Arguments:
/blog-calibrate 2026-04 # Calibrate April 2026
/blog-calibrate # Calibrate previous month
What it measures:
- Tier distribution — is Tier 1 still >60%? Is Tier 3 <15%?
- Brier score — stated confidence vs actual accuracy (lower = better, target <0.15)
- Decision quality matrix — Annie Duke 2x2 (good/bad decision × good/bad outcome)
- Dimension analysis — which dimensions drive tier upgrades? Any systematic bias?
- Anti-inflation flags — how often triggered, any gaps?
- Pattern learning — emergent rules from 30+ decisions
Requires: At least 5 entries in decisions.jsonl. Better with feedback.jsonl data (run /blog-feedback on past posts).
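The Brier score above can be computed directly from paired decision/feedback records; a minimal sketch, assuming each decision carries a stated confidence and feedback marks whether the assigned tier held up (the pairing of records is left out here):

```python
def brier_score(decisions):
    """Mean squared error between stated confidence and actual outcome.

    decisions: list of (confidence, correct) pairs, confidence in [0, 1],
    correct True if the tier was right post-publication.
    Lower is better; the calibration target is < 0.15.
    """
    if not decisions:
        raise ValueError("need at least one scored decision")
    return sum((conf - float(correct)) ** 2
               for conf, correct in decisions) / len(decisions)

# Mostly confident and mostly right lands under the 0.15 target.
score = brier_score([(0.85, True), (0.9, True), (0.6, False)])
print(round(score, 4))
```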
What it does: Post-publication assessment for classified posts. Records whether the tier was right, captures engagement data, feeds the calibration loop.
Arguments:
/blog-feedback knowledge-os-bootstrap --correct # Quick: correct
/blog-feedback wcag-color-audit --wrong 2 --notes "More teaching value" # Wrong tier
/blog-feedback plugin-registry --correct --notes "Got 2 backlinks" # Correct with data
/blog-feedback plugin-registry # Interactive mode
/blog-feedback --batch 2026-04-01 2026-04-07 # Batch review
How it feeds the system:
/blog-feedback → feedback.jsonl → /blog-calibrate → Brier score + patterns
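A feedback record is just one appended JSONL line; a minimal sketch of the write (the field names here are assumptions modeled on the decision schema, not the actual feedback schema):

```python
import json
from pathlib import Path

def record_feedback(path, slug, correct, actual_tier=None, notes=""):
    """Append one post-publication assessment as a JSONL line."""
    entry = {"slug": slug, "correct": correct, "notes": notes}
    if actual_tier is not None:
        entry["actual_tier"] = actual_tier  # the tier it should have been
    with Path(path).open("a") as f:
        f.write(json.dumps(entry) + "\n")

# Equivalent to: /blog-feedback wcag-color-audit --wrong 2 --notes "..."
record_feedback("feedback.jsonl", "wcag-color-audit", correct=False,
                actual_tier=2, notes="More teaching value")
```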
When invoked: Automatically during /blog-backfill Phase 2, step 2.
Input: All gathered source material for a day (git logs, PRs, beads, transcripts, email).
Output: JSON classification with tier, confidence (0-1), 6 dimension scores (0-5 each), thesis candidate, rhetorical structure, anti-inflation flags.
6 Scoring Dimensions:
| Dim | Name | What it measures |
|---|---|---|
| NOV | Novelty | New to the field vs routine application |
| ARC | Architectural Significance | Config tweak vs new paradigm |
| NAR | Narrative Richness | Linear execution vs dramatic arc |
| TCH | Teaching Potential | Project-specific vs changes mental models |
| SCP | Scope | Single file vs greenfield multi-system |
| RPR | Reproducibility | Project-specific vs packageable pattern |
Classification gates:
- Tier 3: max(NOV, TCH) >= 4 AND NAR >= 4 AND 3+ dims >= 3
- Tier 2: max(NOV, TCH, NAR) >= 3 AND 2+ dims >= 3
- Tier 1: Everything else
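The gates translate directly into code; a minimal sketch, assuming dimension scores arrive as a dict keyed by the short names from the table:

```python
def classify_tier(d):
    """Apply the tier gates to 0-5 dimension scores.

    d maps {"NOV", "ARC", "NAR", "TCH", "SCP", "RPR"} to ints.
    Gates are checked top-down; everything else defaults to Tier 1.
    """
    strong = sum(1 for v in d.values() if v >= 3)  # dims scoring 3+
    if max(d["NOV"], d["TCH"]) >= 4 and d["NAR"] >= 4 and strong >= 3:
        return 3
    if max(d["NOV"], d["TCH"], d["NAR"]) >= 3 and strong >= 2:
        return 2
    return 1

# The scores from the decision record example land in Tier 2:
# NAR of 3 fails the Tier 3 gate, but the Tier 2 gate passes easily.
print(classify_tier({"NOV": 3, "ARC": 4, "NAR": 3,
                     "TCH": 4, "SCP": 5, "RPR": 3}))  # 2
```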
7 anti-inflation rules: Volume ≠ quality. Busy ≠ distinguished. First-time-for-me ≠ novel. "Year from now" test. Tier 3 is rare. Default down. High scope alone doesn't escalate.
When invoked: Quality gate for Tier 2+ posts. Also used by /blog-research-article step 5b.
What it checks:
- Thesis drift (introduction promise vs conclusion delivery)
- Contradictory claims within the post
- Tone shifts (casual → formal → casual)
- Code-narrative alignment (does the code match what the text describes?)
- Structural coherence
Output: Audit report with issues categorized as CONTRADICTION, DRIFT, TONE, or CODE-NARRATIVE. Each issue gets HIGH/MEDIUM/LOW severity.
Recommendations: PASS (0 high), REVISE (1+ high or 3+ medium), BLOCK (2+ high contradictions).
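The recommendation thresholds can be sketched as a small function (an illustration of the stated rules, not the agent's actual code):

```python
def recommend(issues):
    """Map audit issues to PASS/REVISE/BLOCK per the stated thresholds.

    issues: list of (category, severity) tuples,
    e.g. ("CONTRADICTION", "HIGH"). BLOCK outranks REVISE.
    """
    high = [cat for cat, sev in issues if sev == "HIGH"]
    medium = [cat for cat, sev in issues if sev == "MEDIUM"]
    if high.count("CONTRADICTION") >= 2:
        return "BLOCK"
    if len(high) >= 1 or len(medium) >= 3:
        return "REVISE"
    return "PASS"

print(recommend([("TONE", "LOW")]))  # PASS
```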
When invoked: Quality gate for Tier 3 posts. Also used by /blog-research-article step 5b.
What it checks:
- Version numbers and release dates
- API behavior claims
- Performance claims
- Attribution and citations
- Comparative claims
Priority levels: HIGH (central to argument), MEDIUM (supporting credibility), LOW (incidental).
Verdicts: VERIFIED, INACCURATE, OUTDATED, OVERSTATED, UNVERIFIABLE.
Recommendations: PASS (0 high issues), REVISE (1+ high overstated/outdated), BLOCK (1+ high inaccurate).
All classification decisions are recorded in append-only JSONL files. SQLite provides a derived queryable index.
methodology/
├── decisions.jsonl # One record per classified post (source of truth)
├── feedback.jsonl # Post-publication assessments (via /blog-feedback)
├── patterns.jsonl # Emergent rules from calibration reviews
└── rebuild-index.sql # SQLite DDL: 3 tables, 7 indexes, 7 analytics views
Decision record schema:
{
"date": "2026-04-06",
"slug": "knowledge-os-bootstrap",
"tier": 2,
"confidence": 0.85,
"dimensions": { "novelty": 3, "arc": 4, "nar": 3, "tch": 4, "scp": 5, "rpr": 3 },
"reasoning": "Why this tier, not the adjacent one",
"thesis_candidate": "One-sentence thesis",
"rhetorical_structure": "problem-solution",
"source_signals": { "git": "strong", "prs": "strong", "session": "moderate", "beads": "absent", "email": "absent" },
"cadence_type": "daily"
}

Analytics views (SQLite):
- v_tier_distribution — monthly tier percentages
- v_confidence_by_tier — avg/min/max confidence per tier
- v_calibration — stated confidence vs actual accuracy
- v_brier_score — overall calibration score
- v_anti_inflation_frequency — most common anti-inflation flags
- v_top_teaching — highest TCH-scored posts
- v_decision_quality — Annie Duke 2x2 quadrants
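Rebuilding the derived index amounts to replaying the JSONL log into SQLite; a minimal sketch with an illustrative single-table schema (the real rebuild-index.sql defines 3 tables, 7 indexes, and 7 views):

```python
import json
import sqlite3

def rebuild_index(jsonl_path, db_path=":memory:"):
    """Load decision records into a queryable SQLite table."""
    con = sqlite3.connect(db_path)
    con.execute("""CREATE TABLE IF NOT EXISTS decisions
                   (date TEXT, slug TEXT, tier INTEGER, confidence REAL)""")
    with open(jsonl_path) as f:
        for line in f:
            r = json.loads(line)
            con.execute("INSERT INTO decisions VALUES (?, ?, ?, ?)",
                        (r["date"], r["slug"], r["tier"], r["confidence"]))
    con.commit()
    return con

# A query in the spirit of v_confidence_by_tier:
# con.execute("SELECT tier, AVG(confidence) FROM decisions GROUP BY tier")
```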
Three tool options for lightweight email scanning (allowlist-only, date-bounded, <30s per day):
| Tool | Type | Best for |
|---|---|---|
| IntentMail (intent-mail/) | Local SQLite+FTS5 | Fastest, privacy-first, no network after sync |
| Gmail MCP | Claude.ai native | Works without local setup |
| gogcli (steipete) | CLI, JSON-first | Script-friendly, multiple accounts |
Future: msgvault (Wes McKinney) — DuckDB+Parquet for offline email analytics at scale.
| Tier | Platforms | Social |
|---|---|---|
| 1 | startaitools + tonsofskills | Optional X post |
| 2 | All platforms | X thread + LinkedIn |
| 3 | All platforms | X thread + LinkedIn + outreach |
| 4 | All platforms | Full social campaign |
| Weekly | startaitools + tonsofskills | LinkedIn post |
| Monthly | All platforms | X thread + LinkedIn |
Stagger: Canonical Day 0 → Dev.to/Hashnode +24-48h → Medium +48-72h
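The stagger reduces to per-platform offsets from the canonical publish time; a minimal sketch using the early end of each stated window (the queue may push later within it):

```python
from datetime import datetime, timedelta

# Hours after canonical (Day 0) publication, per the stagger above.
STAGGER_HOURS = {"devto": 24, "hashnode": 24, "medium": 48}

def crosspost_schedule(canonical):
    """Return each platform's earliest cross-post time."""
    return {platform: canonical + timedelta(hours=hours)
            for platform, hours in STAGGER_HOURS.items()}

schedule = crosspost_schedule(datetime(2026, 4, 9, 6, 17))
print(schedule["medium"])  # 2026-04-11 06:17:00
```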
# Daily backfill (auto-classify + publish)
/blog-backfill
# Backfill specific range
/blog-backfill 2026-04-01 2026-04-08
# Weekly recap
/blog-backfill weekly
# Monthly retrospective
/blog-backfill monthly
# Distinguished Paper from article
/blog-research-article https://example.com/paper
# Assess a past post's classification
/blog-feedback knowledge-os-bootstrap --correct
# Batch assess a week
/blog-feedback --batch 2026-04-01 2026-04-07
# Monthly calibration report
/blog-calibrate 2026-04
# Rebuild analytics index
sqlite3 ~/.claude/skills/blog-backfill/methodology/index.db < ~/.claude/skills/blog-backfill/methodology/rebuild-index.sql