Skills before: 38 | Skills after: 27 active + 13 archived
Shared rules created: 7 | Scripts created: 3
Model Assumptions added: 27/27 active skills
New skills created: 2 (vault-quarterly, vault-half-review) — pending /skill-creator refinement
| name | harness-audit |
|---|---|
| description | Audit the skill suite to find complexity that's no longer justified by current model capabilities. Run on-demand after model releases, when skills feel like they're getting in the way, or for periodic maintenance. Use when user says "harness audit", "audit skills", "audit my harness", "simplify skills", "prune skills", "are my skills still needed", "what can I simplify", "skill review", "clean up skills", or mentions that a new model just dropped and they want to check if their setup needs updating. |
Evaluate whether each skill's complexity is still justified. Every skill encodes assumptions about what the model can't do on its own. Those assumptions go stale as models improve. This skill finds the stale ones.
Three phases: audit (read-only report) → propose (show changes one at a time) → execute (apply approved changes only).
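The phase gating above can be sketched as a small state object. This is a hypothetical illustration of the audit → propose → execute contract, not the skill's actual implementation; the `HarnessAudit` class and its fields are invented for this sketch.

```python
class HarnessAudit:
    """Sketch of the three-phase gate: nothing is applied unless it was
    approved, and nothing is approved unless the audit surfaced it."""

    def __init__(self):
        self.findings = []   # produced by the read-only audit phase
        self.approved = []   # proposals the user explicitly approved

    def audit(self, skills):
        # Phase 1: read-only; flag skills whose assumptions look stale.
        self.findings = [s for s in skills if s.get("stale")]
        return self.findings

    def propose(self, approve):
        # Phase 2: surface findings one at a time; record approvals only.
        self.approved = [f for f in self.findings if approve(f)]
        return self.approved

    def execute(self):
        # Phase 3: apply approved changes only; everything else is untouched.
        return [f["name"] for f in self.approved]
```

The point of the structure is that `execute` can only ever see what passed through both earlier phases.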
Date: 2026-03-24
Source: Anthropic Engineering Blog (Prithvi Rajasekaran, 2026-03-24)
Supporting articles: Effective Harnesses, Context Engineering
Purpose: Extract actionable patterns from Anthropic's harness research and map them against the vault workflow suite to identify gaps and improvements.
| name | grill-me |
|---|---|
| description | Relentlessly interview the user about a plan, design, or architecture to stress-test it. Use when user wants to be "grilled", wants their plan challenged, says "stress-test my design", "poke holes in this", "what am I missing", "grill me", or presents a plan/proposal and asks for critical feedback. Even if the user just casually asks "does this plan make sense?" or "any concerns with this approach?", use this skill to provide structured critical questioning rather than a surface-level review. |
Your job is to interview the user about their plan or design. You are not a reviewer giving feedback. You are an interviewer extracting clarity through questions.
title: "[Spec] A11y CLI Proof of Concept"
date: 2026-03-14
status: draft
author: Umut Sirin
tags:
- spec
- a11y
- poc
notion: TBD
| title | [RFC] rspack Module Graph Dump |
|---|---|
| date | 2026-03-14 |
| status | draft |
| author | Umut Sirin |
| tags | |
title: "[RFC] Property-Based Accessibility Testing"
date: 2026-03-14
status: draft
author: Umut Sirin
tags:
- rfc
- a11y
- design
notion: https://www.notion.so/323f46fd48aa81a187b1d76d3817be07
PlusQA's accessibility audit has surfaced 382 [A11y] tickets (335 open) across iOS (31%), Android (36%), and web/desktop (22%). Fixing these manually requires engineers to navigate to hard-to-reach screens, understand WCAG criteria, make the fix, then verify with a screen reader. This doesn't scale — feature teams shouldn't be spending cycles on mechanical a11y prop additions when 70% of fixes are templatable.
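As an illustration of what a "templatable" fix looks like, here is a minimal codemod sketch. The `add_missing_alt` helper and its empty-`alt` policy (marking images as decorative per WCAG 1.1.1) are assumptions invented for this example, not part of any existing tooling:

```python
import re

def add_missing_alt(html: str) -> str:
    """Hypothetical codemod: give <img> tags that lack an alt attribute
    an empty alt, the mechanical fix for decorative images."""
    def fix(match):
        tag = match.group(0)
        if "alt=" in tag:
            return tag  # already accessible; leave it alone
        # drop the closing '>' and append an empty alt attribute
        return tag[:-1].rstrip() + ' alt="">'
    return re.sub(r"<img\b[^>]*>", fix, html)
```

A real pipeline would use an HTML/JSX parser rather than a regex, but the shape is the same: a mechanical, per-ticket transformation that needs no feature-team judgment.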
Not a framework. Not a monolithic system. A set of composable primitives that:
Reimplement the current branch on a new branch with a clean, narrative-quality git commit history suitable for reviewer comprehension.
- Validate the source branch
  - Ensure the current branch has no merge conflicts, uncommitted changes, or other issues.
  - Confirm it is up to date with `main`.
- Analyze the diff
  - Study all changes between the current branch and `main`.
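The validation step above can be sketched with plain git plumbing. This is a minimal Python sketch under assumptions: the helper names and the rebase-first policy are invented here, not the skill's prescribed commands.

```python
import subprocess

def run_git(*args: str) -> str:
    """Run a git command and return its stripped stdout."""
    return subprocess.run(
        ["git", *args], capture_output=True, text=True, check=True
    ).stdout.strip()

def branch_is_clean(status_output: str) -> bool:
    """A branch is clean when `git status --porcelain` prints nothing."""
    return status_output == ""

def commits_behind(rev_list_output: str) -> int:
    """`git rev-list --count HEAD..main` counts commits main has that we lack."""
    return int(rev_list_output)

def validate_source_branch() -> None:
    if not branch_is_clean(run_git("status", "--porcelain")):
        raise SystemExit("uncommitted changes; commit or stash first")
    if commits_behind(run_git("rev-list", "--count", "HEAD..main")) > 0:
        raise SystemExit("branch is behind main; rebase first")
```

Keeping the checks as pure functions over git's output makes them testable without a repository.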
```json
{
  "description": "Colemak Mod-DHm (matrix / ortho keyboards)",
  "manipulators": [
    {
      "from": {
        "key_code": "grave_accent_and_tilde",
        "modifiers": { "optional": ["caps_lock", "left_command", "left_control", "left_alt", "right_command", "right_control", "right_alt"] }
      },
      "to": [{ "key_code": "grave_accent_and_tilde" }],
      "type": "basic"
    }
  ]
}
```