Tool evidence checklist for agent context portability
A quick test for claims like “this AGENTS.md, SKILL.md, rule, or context file works across Claude Code, Cursor, Codex, Windsurf, Bob, Copilot, etc.”
The portability claim is weak if it only proves that the same bytes exist in multiple files. Same bytes are not always the same semantics.

    claim: "portable across Claude/Cursor/Codex/etc."
    testedOn:
      - tool: codex
        nativeDiscovery: true
        surface: AGENTS.md ancestry
        activation: automatic
      - tool: cursor
        nativeDiscovery: true|false|unknown
        surface: .cursor/rules/*.mdc or .cursor/skills/**/SKILL.md
        activation: always-on|on-demand|manual
      - tool: claude-code
        nativeDiscovery: true|false|unknown
        surface: CLAUDE.md or .claude/skills/**/SKILL.md
        activation: session-start|on-demand|manual
    knownLossyTargets:
      - hooks
      - mcp server config
      - tool permissions
      - subfolder/path-scoped inheritance
    manualActivationRequired: true|false
    resolutionEvidence:
      loadedFiles: []
      loadedBy: native|generated|hook|manual|unknown
      effectiveSource: null
      hookInstalled: true|false|unknown
      injectedOnSessionStart: true|false|unknown
      resumeBehavior: hash-receipt|full-reinject|not-proven
      precedence: []
      pathScope: repo-root|subpath|unknown
      dedupeRisk: none|low|unknown|high

- Native discovery: which exact path/pattern does the target tool scan natively?
- Activation: is the instruction always loaded, on-demand, manually referenced, or just copied?
- Effective source: did the context enter through native file loading, generated fallback, hook injection, import indirection, or manual paste?
- Deduplication: if both native files and hooks/imports exist, can you prove the same context is not injected twice on startup/resume?
- Precedence: if both generic and tool-native files exist, which one wins?
- Path scope: in a monorepo, what changes when the agent starts in apps/client/ vs repo root?
- Lossy behavior: which source concepts do not survive the target format: hooks, skills, MCP config, permissions, memory, session continuity?
- Inspectable output: can a user see the resolution/load chain, not just the generated files?
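
The checklist above can be applied mechanically to an evidence record. The following is a minimal sketch, not part of any tool: `unprovenItems` is a hypothetical helper, the field names are taken from the record above, and the rules for what counts as "unproven" are assumptions chosen for illustration.

```javascript
// Hypothetical helper: lists which checklist items a portability claim has not proven.
// Field names follow the evidence record above; the gap rules are illustrative assumptions.
function unprovenItems(claim) {
  const gaps = [];
  const ev = claim.resolutionEvidence ?? {};

  // Per-tool evidence: discovery must be an explicit true/false, not "unknown" or missing.
  for (const t of claim.testedOn ?? []) {
    if (t.nativeDiscovery !== true && t.nativeDiscovery !== false) {
      gaps.push(`${t.tool}: native discovery not tested`);
    }
    if (!t.surface) gaps.push(`${t.tool}: no discovery surface named`);
    if (!t.activation) gaps.push(`${t.tool}: activation mode not stated`);
  }

  // Resolution evidence: empty or "unknown" values count as unproven.
  if (!ev.loadedFiles || ev.loadedFiles.length === 0) gaps.push("no loaded files recorded");
  if (!ev.effectiveSource) gaps.push("effective source unknown");
  if (!ev.loadedBy || ev.loadedBy === "unknown") gaps.push("load mechanism unknown");
  if (!ev.precedence || ev.precedence.length === 0) gaps.push("precedence not established");
  if (!ev.pathScope || ev.pathScope === "unknown") gaps.push("path scope unknown");
  if (ev.dedupeRisk === undefined || ev.dedupeRisk === "unknown") gaps.push("dedupe risk unknown");
  if (ev.resumeBehavior === undefined || ev.resumeBehavior === "not-proven") {
    gaps.push("resume behavior not proven");
  }
  return gaps;
}

// Example record: codex is fully described, cursor is not.
const claim = {
  claim: "portable across Claude/Cursor/Codex/etc.",
  testedOn: [
    { tool: "codex", nativeDiscovery: true, surface: "AGENTS.md ancestry", activation: "automatic" },
    { tool: "cursor", nativeDiscovery: "unknown" },
  ],
  resolutionEvidence: {
    loadedFiles: ["AGENTS.md"],
    loadedBy: "native",
    effectiveSource: "AGENTS.md",
    precedence: [],
    pathScope: "repo-root",
    dedupeRisk: "unknown",
    resumeBehavior: "not-proven",
  },
};

const gaps = unprovenItems(claim);
console.log(gaps);
```

A claim with a non-empty gap list is not necessarily wrong; it just has not been shown to hold for the tools and behaviors listed.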
Live adjacent bug that motivated this section: Cursor forum topic “Critical Issue: Duplicate Skills Loading Causing Context Window Waste and Confusion” reports one planning-with-files skill appearing from many roots such as ~/.codex/skills/..., nested vendor directories, and ~/.claude/plugins/cache/....
For duplicate skill reports, the useful evidence is not just “these paths exist”. A fix should be able to produce a receipt like this:

    skill: planning-with-files
    contentIdentity:
      name: planning-with-files
      contentHash: sha256:...
      version: 2.10.0
    candidateLoads:
      - path: ~/.cursor/skills/planning-with-files/SKILL.md
        toolOwner: cursor
        loadedBy: native-skill-scan
        discoveryRoot: ~/.cursor/skills
        priority: 100
      - path: ~/.codex/skills/planning-with-files/.cursor/skills/planning-with-files/SKILL.md
        toolOwner: codex-export
        loadedBy: transitive-vendor-directory-scan
        discoveryRoot: ~/.codex/skills
        priority: 10
      - path: ~/.claude/plugins/cache/planning-with-files/.../SKILL.md
        toolOwner: claude-plugin-cache
        loadedBy: foreign-cache-scan
        discoveryRoot: ~/.claude/plugins/cache
        priority: 0
    selectedLoad:
      path: ~/.cursor/skills/planning-with-files/SKILL.md
      reason: preferred native Cursor skill root
    suppressedLoads:
      - path: ~/.codex/skills/planning-with-files/.cursor/skills/planning-with-files/SKILL.md
        reason: duplicate contentHash/name from non-authoritative root
      - path: ~/.claude/plugins/cache/planning-with-files/.../SKILL.md
        reason: foreign tool cache; not a Cursor authority
    invariant: "for each session_id + skill.name + contentHash, inject at most one effective skill definition unless explicitly marked supplement"

The acceptance test is simple: if a UI/debug log says 11 SKILL.md files were found, it should also say which one became the effective skill, which 10 were suppressed, and why. Otherwise the agent can waste tokens and choose ambiguous or stale instructions even when every individual file is valid.
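
A rough sketch of how such a receipt and its invariant could be produced follows. Nothing here describes how Cursor or any other tool actually resolves skills: `resolveSkill` and `injectOnce` are hypothetical names, the "highest priority wins" rule is an assumption, and the elided hash and cache path are kept exactly as they appear in the receipt.

```javascript
// Hypothetical resolver: field names mirror the receipt above.
// Assumption: the candidate with the highest priority becomes the effective skill.
function resolveSkill(skill, contentHash, candidates) {
  if (candidates.length === 0) throw new Error(`no candidate loads for ${skill}`);
  // Copy before sorting so the caller's array is left untouched.
  const ranked = [...candidates].sort((a, b) => b.priority - a.priority);
  const [selected, ...rest] = ranked;
  return {
    skill,
    contentHash,
    found: candidates.length,
    selectedLoad: {
      path: selected.path,
      reason: `highest priority root (${selected.discoveryRoot})`,
    },
    suppressedLoads: rest.map((c) => ({
      path: c.path,
      reason: `duplicate name/contentHash from lower-priority root (${c.discoveryRoot})`,
    })),
  };
}

// Invariant from the receipt: at most one injection per session_id + skill name + contentHash.
const injected = new Set();
function injectOnce(sessionId, receipt) {
  const key = `${sessionId}:${receipt.skill}:${receipt.contentHash}`;
  if (injected.has(key)) return false; // already injected, e.g. on resume
  injected.add(key);
  return true;
}

const candidates = [
  {
    path: "~/.codex/skills/planning-with-files/.cursor/skills/planning-with-files/SKILL.md",
    discoveryRoot: "~/.codex/skills",
    priority: 10,
  },
  {
    path: "~/.cursor/skills/planning-with-files/SKILL.md",
    discoveryRoot: "~/.cursor/skills",
    priority: 100,
  },
  {
    path: "~/.claude/plugins/cache/planning-with-files/.../SKILL.md",
    discoveryRoot: "~/.claude/plugins/cache",
    priority: 0,
  },
];

const receipt = resolveSkill("planning-with-files", "sha256:...", candidates);
const first = injectOnce("session-1", receipt);
const second = injectOnce("session-1", receipt);

// Acceptance check: every file found is either the selected load or a suppressed one.
const accounted = 1 + receipt.suppressedLoads.length === receipt.found;
console.log(JSON.stringify({ receipt, first, second, accounted }, null, 2));
```

The point of the sketch is the shape of the output, not the ranking rule: every discovered file ends up either selected or suppressed with a stated reason, and a second injection in the same session is refused.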
This is one way to inspect the difference between a native target and a generic fallback:

    mkdir /tmp/pluribus-tool-evidence && cd /tmp/pluribus-tool-evidence
    npx --yes pluribus-context@latest init --name "tool-evidence-demo" --description "demo" --tools bob,openclaw
    npx --yes pluribus-context@latest sync
    npx --yes pluribus-context@latest audit --json --fidelity-report --output fidelity.json
    node -e '
      const r = require("./fidelity.json");
      const rows = r.fidelityReport.targets.map(t => ({
        toolId: t.toolId,
        files: t.files,
        nativeDiscoverySurface: t.nativeDiscoverySurface,
        genericFallback: t.genericFallback,
        manualActivationRequired: t.manualActivationRequired,
        loadedBy: t.loadEvidence?.loadedBy,
        deliveryMechanism: t.loadEvidence?.deliveryMechanism,
        dedupeRisk: t.loadEvidence?.dedupeRisk,
        effectiveContext: t.effectiveContext.scope,
        semanticDifference: t.semanticDifference
      }));
      console.log(JSON.stringify(rows, null, 2));
    '

Expected shape:

    [
      {
        "toolId": "bob",
        "files": [".bob/rules/pluribus.md"],
        "nativeDiscoverySurface": ".bob/rules/*.md",
        "genericFallback": false,
        "manualActivationRequired": false,
        "loadedBy": "native-file-discovery",
        "deliveryMechanism": "generated-native-surface",
        "dedupeRisk": "unknown",
        "effectiveContext": "repo-root",
        "semanticDifference": ["project-wide-only", "no-path-scope-evidence", "runtime-load-dedupe-not-proven"]
      },
      {
        "toolId": "openclaw",
        "files": ["AGENTS.md"],
        "nativeDiscoverySurface": "AGENTS.md",
        "genericFallback": true,
        "manualActivationRequired": false,
        "loadedBy": "generic-agent-file",
        "deliveryMechanism": "generated-generic-fallback",
        "dedupeRisk": "unknown",
        "effectiveContext": "repo-root",
        "semanticDifference": ["project-wide-only", "no-path-scope-evidence", "generic-agent-file", "runtime-load-dedupe-not-proven"]
      }
    ]

The important bit is not Pluribus specifically. The useful standard is: compatibility without tool evidence is just copy.
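
That standard can be turned into a small check over a report of this shape. The sketch below is an illustration only: `needsEvidence` is a hypothetical function, not part of Pluribus, and the rules for what counts as missing evidence are assumptions. The input reuses fields from the expected shape shown earlier.

```javascript
// Hypothetical check over the projected report shape; the flagging rules are assumptions.
function needsEvidence(targets) {
  return targets
    .map((t) => {
      const reasons = [];
      if (t.genericFallback) reasons.push("generic fallback, not a native surface");
      if (t.manualActivationRequired) reasons.push("manual activation required");
      if (t.dedupeRisk === "unknown" || t.dedupeRisk === "high") {
        reasons.push(`dedupe risk ${t.dedupeRisk}`);
      }
      return { toolId: t.toolId, reasons };
    })
    .filter((r) => r.reasons.length > 0); // keep only targets with open questions
}

// Subset of the fields from the expected shape.
const targets = [
  { toolId: "bob", genericFallback: false, manualActivationRequired: false, dedupeRisk: "unknown" },
  { toolId: "openclaw", genericFallback: true, manualActivationRequired: false, dedupeRisk: "unknown" },
];

const flagged = needsEvidence(targets);
console.log(JSON.stringify(flagged, null, 2));
```

Both demo targets are flagged here, the native one only for unproven deduplication and the generic one for being a fallback as well, which matches the `semanticDifference` entries in the report.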