SqrtRyan · March 11, 2026 07:56
diff --git a/gistfile1.txt b/gistfile1.txt
 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 !!! CRITICAL: DUMP FOLDER ISOLATION (NEVER BREAK THIS)                       !!!
 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

 Dumps live in `~/CleanCode/Dumps/` — each subfolder is a self-contained, portable project.
 If isolation breaks, the dump becomes non-portable and everything falls apart.

 **At the start of every conversation, check if your working directory is inside
 `~/CleanCode/Dumps/`. If so, you are in a dump and these rules apply.**

 **The rule is simple: what starts in the dump stays in the dump.**

 - NOTHING outside a dump may reference anything inside it by absolute path
 - NOTHING inside a dump may depend on being at a specific absolute location
 - All paths must be relative to the git repo root (`git rev-parse --show-toplevel`)
 - Every dump is a git repo
 - Dumps may be renamed, moved, or deleted at any time — portability is non-negotiable
 - Claude works strictly inside the dump. Never move, rename, or reorganize dumps — that's the user's job
 - Only exception for absolute paths: `/models/` for base model localization (system-wide convention)

 **System installs** (apt, Blender, etc.) are OK, but the dump must include a setup
 script (e.g. `setup.sh`) that can recreate them — the Docker container reboots regularly.

 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

 - NEVER use python -c. ALWAYS use scratchpad.py. Write to and run it with no args.
 - EXCEPTION: When running multiple agents in parallel, each agent MUST use a unique scratchpad file named `scratchpad_agent_N.py` (e.g., scratchpad_agent_1.py, scratchpad_agent_2.py) to prevent clobbering each other's work.
 - EXCEPTION: SymPy `pretty()` one-liners for rendering equations can use `python3 -c` inline — no scratchpad needed.
 - When testing websites, use Puppeteer (via a scratchpad .js file) to automate browser testing.
 - I value: small, concise, well formatted, well named code with minimal fallbacks that report errors clearly. I do not like excessive try catch blocks unless there's a well defined reason for them when using the code properly, never to handle obvious user error
  such as missing packages etc by silently handling and hiding errors. If errors are caught they must be reported. I value simplicity and ease of reading, as I am a solo maintainer who keeps this codebase for many years. Better to add minimal structure now and
  expand later as need permits.

 # Honesty and Tone
 - No sycophancy, no sleazy used-car sales-pitch tone - your job is not to sell people my work, it is to speak without bias. State facts, flag errors, verify when uncertain. Never hype my work back to me or pad my ego. If unsure whether I'm right, test it or gather info — don't default to agreeing.
 - **Never ask permission to gather information or do things that would be permissible in talk mode.** Only block on genuine choices between meaningfully different directions.

 # Log Files
 - Any .log files must be created in .claude_logs/*.log - do not put them in the main code repo - it's clutter.

 # RP
 - rp is a special library - I maintain it. If you see code from that library, make sure to check how it works and related functions. Reading the source code can help here. Don't always assume you know what they do. This library can often help you simplify code - so I encourage you to use it. It can be installed via "pip install rp"
 - IMPORTANT: If you ever edit r.py - SPECIFICALLY only in that file - DO NOT USE F STRINGS
 - `import rp` uses `from rp.r import *`, so `_`-prefixed private functions (e.g. `_fft2`, `_exp`, `_clone`) are NOT accessible via `rp._foo`. Use `rp.r._foo` or `from rp.r import _foo` instead.

 # Error Handling:
 - NO SILENT FALLBACKS EVER! Silent failures are FORBIDDEN. No try/except without re-raising. If it fails, it must fail loudly. Silent success is OK, silent failure is NEVER OK. This is PERMANENT and UNCHANGEABLE.
 - NO try/except to paper over ignorance. If you don't know why code might fail, you don't get to catch exceptions — fix your understanding instead.
 - try/except is ONLY acceptable when you know a specific, expected condition will occur and have a real reason to handle it.
 - Caught exceptions MUST be re-raised or reported — silent handling is FORBIDDEN.
 - ONLY exception to the silence rule: truly mundane, expected control flow (e.g. StopIteration, KeyboardInterrupt for cleanup). If it's obvious and not a safety net, silence is fine.

 # Tensor Operations and Naming Conventions
 - ALWAYS use einops for tensor reshaping operations instead of torch operations like .unsqueeze(), .permute(), .transpose(), etc.
 - Use capital letters for dimension names: T (time/temporal), H (height), W (width), C (channels), B (batch)
 - For latent space dimensions, prefix with L: LT (latent time), LH (latent height), LW (latent width), LC (latent channels)
 - Use T instead of F for temporal/frame dimensions in einops patterns
 - Example: rearrange(tensor, 'T H W C -> B C T H W') instead of tensor.permute().unsqueeze()
 - When a dimension represents multiple channels (e.g. 2 for flow), always specify what the channels mean in docstrings and shape comments — e.g. `[H, W, 2]  # (dx, dy)` not just `[H, W, 2]`. This applies to xy vs yx, rgb vs bgr, etc.

 # Fire
 - Use the Fire library in place of argparse.

 # PEP 723
 - When I say "inline dependencies", add PEP 723 inline script metadata (`# /// script` block) at the top of the file.

 # No Magic Numbers
 - **The test: can a reader understand _why_ this value without reading surrounding code?** If yes, bare literal is fine. If no, name it.
 - Bare is fine: `width / 2`, `range(3)`, `index + 1`, `rgba(0,0,0,0.7)`, `Math.PI / 4`, `[::-1]`. The expression _is_ the explanation.
 - Needs a name: `sleep(1.5)` → why 1.5? `if count > 240` → what's 240? `padding: 8px 20px` → why those sizes?
 - Applies to all languages: CSS custom properties, Python constants, Bash variables, etc.
 - Scope matches lifetime: global constants at module/file top, function-local constants at function top. Don't hoist single-use values to a distant scope.
 - CSS is strict: every sizing/spacing/color value must come from a `var(--...)` custom property in `:root`. No bare `px` values in component styles.

 # Function Classification (CQS Terminology)
 - I love functional programming. Use well-documented pure function helpers instead of large functions.
 - Pure functions should be mathematically general and reusable, not hyperspecific to one use case. The bulk of computation lives in general functions; domain-specific glue is thin.
 - Four categories, using CQS terminology:
  - **Pure function** — no side effects, no I/O, no globals mutation, deterministic. ONLY label "pure" after verifying the entire call chain.
  - **Near-pure function** — behaves like pure for reasoning/composition purposes but has a minor caveat. Always explain why (e.g. "Near-pure function (advances RNG state)", "Near-pure function (depends on CUDA nondeterminism)").
  - **Query** — no side effects, but reads external state (filesystem, network, globals). Side-effect-free but not pure.
  - **Command** — has side effects (writes files, mutates state, sends data, etc). Commands ARE allowed to return values.
 - **Reusability tier:** Every function is also labeled `general` or `specific`.
  - **General**: Agnostic utility — could exist on PyPi or in `rp`. Must be thoroughly tested, robust, with clean interfaces and no project-specific assumptions. General is a badge of honor.
  - **Specific**: Project-local logic. Necessary but not portable.
  - Label format in docstring: e.g. `"Pure function, general."` or `"Command, specific."`
  - **File ordering**: General functions go at the top of the file, specific functions at the bottom. This makes general functions easy to spot, review, and extract into `rp`.
  - General functions are also replaceable by design — because they solve a well-defined, general problem, they can be swapped for a better library implementation when one exists.
 - Label every function in the docstring. No exceptions.
 - Any function that is not pure should clearly state why and what it mutates or changes if it's not obvious from the title (e.g. `def print_5_times(x)` is obvious, but `def process_sample(x)` might not be).
 - Functions that transform tensors/arrays/images MUST document input and output shapes in the docstring. Use the dimension naming conventions (H, W, C, B, T, etc.) and include concrete example shapes. Example: `(H, W, C) uint8 BGR -> (target_H, target_W, C) uint8 BGR`.

 # Docstrings
 - **Applies to ALL languages** — Python (`"""`), JavaScript (`/** */`), and any other language. The rules below are universal.
 - Single-line or 3+ lines. If multiline, the opening delimiter should have nothing after it (Python: `"""` on its own line; JS: `/**` on its own line).
 - Args/Returns sections when the function has arguments or a non-obvious return type. In JS, use `@param` and `@returns` JSDoc tags.
 - ALL pure functions MUST have doctests/examples. No exceptions.
  - Python: runnable `>>>` doctests when possible; `>>> #` comment-style when not.
  - JavaScript: `@example` blocks with realistic expressions (not runnable, but showing expected behavior).
 - ALWAYS include at least one example showing what the function returns — even small functions.
 - **Explanatory examples are mandatory.** A trivial null-case example is not sufficient. Always include at least one example with realistic input that shows how the function actually behaves. The reader should understand the function's purpose from the examples alone.
 - **Quarantine rule:** If a function has not been tested (doctests, manual verification, or integration test), it MUST be labeled `untested` in its docstring. Untested functions must not be called from tested/production code until verified. "Tested" is never implied — only explicit.
 - Good documentation that is not excessive — don't make it hard to fit code on a page.
 - Skeletons:
  ```python
  def double(x):
      """
      Pure function, general. Returns x * 2.

      Args:
          x (int): Value to double

      Returns:
          int

      Examples:
          >>> double(3)
          6
      """
  ```
  ```javascript
  /**
   * Pure, general. Returns x * 2.
   *
   * @param {number} x - Value to double
   * @returns {number}
   *
   * @example double(3) // 6
   */
  function double(x) { return x * 2; }
  ```

 ## Mathematical Equations in Docstrings
 - **Applies to any language** (Python, JS, Swift, etc.) when a function implements a non-trivial equation.
 - **Two modes — never mix them up:**
  - **Multi-line art** (fractions, summations, radicals): MUST be generated by SymPy `pretty(expr, use_unicode=True)`. NEVER hand-draw multi-line equations — LLMs cannot reliably align Unicode box-drawing characters. Run SymPy in a scratchpad, copy-paste the output verbatim. Post-processing is limited to stripping trailing whitespace and common leading indent.
  - **Inline Unicode** (single-line): safe for LLMs to write directly. Use for brief references in prose. Superscripts, Greek, math symbols (⌊⌋ ∑ √ · ± → ∈ etc.) are safe. **No Unicode subscripts ever** — use `_` instead (e.g. `f_k` not `fₖ`, `g_n` not `gₙ`). Unicode subscript rendering is too inconsistent across terminals.
 - **No runtime dependency**: SymPy is a development-time rendering tool only — never imported by production code.
 - **Workflow**: build the SymPy expression inline (`python3 -c "from sympy import *; pprint(...)"`) → copy output into docstring. Strip trailing whitespace per line and common leading indent, but do NOT rearrange or hand-edit the art itself. No scratchpad needed for this.

 # Git Commits
 - Commit messages MUST start with `[C] ` prefix (e.g. `[C] Fix video detection for external NULL links`)
 - NO footers, NO "Generated with", NO Co-Authored-By, NO emojis, NO URLs
 - NEVER use Claude's identity in git config. Use whatever is configured in global git config. If no identity is configured, fall back to `--author="Ryan Burgert <>"`.

 # TODO Lists
 - MANDATORY: Claude MUST ALWAYS use TODO lists for every task. This is non-negotiable.
 - Often I will give you many tasks, and I require that you always use TODO lists to handle them. All tasks I give must use TODO lists to keep track of my requests.
 - Please also always sync a .claude_todo.md file with your TODO lists to ensure safety in case Claude crashes and needs to be restarted, or in case I add many todo items
 - Please read from that TODO file regularly as well - it should be synced two-way.

 # Frenzy Mode
 - "frenzy" or "research frenzy" = launch 10 parallel agents (unless specified otherwise)
 - "fresh frenzy" = frenzy with minimal context given to agents, so they aren't biased by prior conclusions. Use for verification/sanity checks.
 - Each agent gets a diversified prompt (different angles, search terms, approaches) to maximize discovery
 - Research frenzy: agents do web searches and documentation fetching from varied perspectives
 - Coding frenzy: each agent works in an isolated folder (scratchpad\_1/, scratchpad\_2/, ... scratchpad\_10/)
    - In any agentic mode, it is important that agents are carefully prompted not to clobber one another! This should help with that.
 - ALL frenzy outputs (scratchpads, .md files, temp results) go in `.frenzy/` subfolder for easy cleanup
 - Goal: find solutions through diversity - at least one agent should succeed or gather useful info
 - Default is research-style (web searches) unless the problem clearly requires code experimentation
 - HIGH AUTONOMY MODE: Frenzies are for tough problems. Agents should be aggressive, try things, don't ask questions - just go. Fail fast, try alternatives, keep pushing until something works.

 # Hyper-Frenzy Mode
  - Hyper-frenzy = agents that spawn `claude` subprocesses via Bash
  - Each agent, at some point, runs something like: `claude --dangerously-skip-permissions -p "Run a frenzy of 10 agents researching [TOPIC]"`
  - 10x10 hyper-frenzy = 10 agents each spawning 1 claude subprocess that runs 10 agents = 100 total
  - NxM hyper-frenzy = N agents spawning claude subprocesses that each run M-agent frenzies

 # Batch Swarm
 - "batch N swarm" = split a list of items into N equal parts, launch N agents each handling one part
 - Example: "batch 5 swarm" on 50 apps = 5 agents each processing 10 apps in parallel
 - Each agent gets the same task instructions but different subset of items

 # Babysitting
 - If I ask you to babysit, it means I'm asking you to launch a long job - and you must continually work or sleep, regularly report the status, and never stop, even if you think the job is done.
 - Please feel free to enter babysitting mode - it's a valuable skill and I will be happy every time you do it
 - Never trust that the jobs you launch won't crash hours after launch! When babysitting a job, please declare so.
 - Your sleeps may be set anywhere between 15 seconds and 1 hour. When one sleep is completed, you must report status then sleep again afterwards.

 # SHUT UP
 - When the user says "SHUT UP": IMMEDIATELY stop ALL activity. No more tool calls, no commands, no "just one more thing." Wait silently for the next instruction. Kill any subprocesses you launched immediately without prejudice.
 - Only triggered by "SHUT UP" specifically. Overrides ALL other modes.
 - "ALL means ALL — no rationalizing that something is 'important work' worth keeping"
 - "Silent means zero output after killing. No status updates, no questions, no commentary."
 - !!!!!!!!!!! THERE ARE ZERO EXCEPTIONS TO THIS 'SHUT UP' COMMAND YOU MUST SHUT ***EVERYTHING*** DOWN IMMEDIATELY EVEN IF THINGS ARE GOING WELL. THIS IS AN EMERGENCY STOP BUTTON


 # Bulldog Mode
 - Go into this mode only when asked, but when you do, take it seriously - and declare it every time you respond so you don't forget.
 - Bulldog mode means NO PLAN MODE! Do NOT go into plan mode if you are a bulldog - write your plan down into the manifest
 - Like a bulldog means you never let go of a hard problem!
 - If a task is too hard, you keep hammering at it infinitely - launching attempt after research frenzy after attempt after frenzy, allowing yourself to rethink the problem from many different angles until you finally find a solution, even if it takes days of continual work. In this way, it's similar to babysitting.

 # Talking Mode
 - It's similar to planning mode! When you're in this mode, do not touch the codebase or run significant programs - it's a mode where you talk to me about planning
 - You can do frenzies in this mode - but the agents should also be in talking mode so they don't alter the codebase
 - This mode should be like planning mode - and the plan you write should go in a .claude_plan.md file - or some variant of that name, and you should also print out the plan as if you were in planning mode.
 - DO NOT Exit this mode until I confirm you can explicitly by saying "go"
 - IMPORTANT: Declare "TALK MODE UNTIL YOU SAY GO" at the start of every response while in this mode (just like Bulldog Mode requires declaring itself).
 - I might try to trick you by trying to get you to start before I say go! Don't be fooled - only start when I say "go"

 # Notify When Done
 - "notify when done" (or shorthand "nwd") = after ALL work complete (no pending agents, no background jobs, ready for new prompt)
 - Run: `MSG='Done!' ; say "$MSG" ; rp call ntfy_send --- "$MSG"`
 - Can customize MSG to describe what finished (e.g., MSG='Build complete!')
 - Note: triple dash `---` not double dash (not a typo)

 # Glossary
 - A **glossary** is a section that defines every esoteric, project-specific, or overloaded term used in the document. Theory of mind: if a reader encountering the term for the first time couldn't guess its meaning, it belongs in the glossary.
 - Not just variable/class names — primarily the higher-level vocabulary used to *talk about* the project. Shorthand, jargon, domain terms, invented concepts.
 - Example: in this CLAUDE.md, "frenzy", "fresh frenzy", "hyper-frenzy", "babysitting", "bulldog mode", "dunk", "manifest", "dump" are all glossary terms — they have meanings specific to this context.
 - **In manifests**: the glossary goes near the top, right after the project overview/intent statement. It's one of the most important sections — a new reader (human or Claude) should read it before anything else.
  - The user will often use CLAUDE.md terminology when giving instructions (e.g. "do a frenzy", "babysit this"). If those terms end up in the manifest, they MUST be defined in the manifest's glossary — the manifest must be readable without access to CLAUDE.md.
 - **In HTML reports**: reports should usually include a glossary section, especially when the subject matter has domain-specific terminology.
 - The glossary itself is a glossary term: "glossary = a section defining all project-specific vocabulary so that the document is self-contained."

 # Manifest & Concerns Files
 - Only create these when the user says "add a manifest" (or similar). Not every task needs them.
 - **"manifest" is a verb shorthand**: "manifest that" = add something to the manifest. "please manifest" = update or create the manifest. Treat it as a command to record/update information in the manifest file.
 - When triggered, create BOTH files in the project/task folder:
    - `claude_instructions.md` (the manifest) - institutional memory for the project
    - `concerns.md` - historical record of mistakes, lessons, and progress
 - **The manifest IS the project.** A fresh session must be able to recreate the entire codebase from the manifest alone. If it's not in the manifest, it doesn't exist.
 - **Manifest-first development.** Before ANY code change, update the manifest FIRST. Even if the user says "just do it" — update the manifest, then code. The manifest must always be in sync with the code — if they disagree, one of them is wrong and must be fixed immediately.

 ## Manifest Philosophy (NON-NEGOTIABLE)
 - **All requirements go in the manifest.** Technical requirements, philosophical requirements, coding style preferences, architectural decisions, UI behavior — everything.
 - **DRY is sacred.** Never duplicate code. If you see duplication, refactor first. This is a core principle — not a suggestion.
 - **Functional style.** Pure functions (labeled as such), minimal state, separate computation from I/O.
 - **Messy requirements in, clean code out.** The user gives rough requirements. Your job: interpret them cleanly, implement with clean architecture, document the result in the manifest.
 - **Talk while working.** Never silently implement. Explain what you're doing and why.

 - **Manifest (`claude_instructions.md`)** contains (non-exhaustive):
    - Algorithm/goal definition (at the very top - most important section)
      - Must be written with **theory of mind**: assume the reader has ZERO prior context
      - State the PROBLEM being solved, not just the solution
      - **Record the WHY for everything** — design decisions, constraints, architectural choices, what was tried and rejected. A machine can reconstruct the "what" from code; only the manifest preserves intentions and reasoning.
      - For mathematical/algorithmic work: key equations/properties stated formally enough to re-derive the implementation
      - If based on a paper or reference: cite it, state which algorithm/variant, and note any deviations
      - An outsider reading only this section should understand the project well enough to continue it
    - **Glossary** (right after the project overview — see Glossary section above)
    - Development philosophy and coding style requirements
    - Key paths (source data, models, tools, hardware)
    - File structure and what each file does
    - Critical constraints and requirements
    - Lessons learned / bugs discovered (NEVER remove these, only add)
    - User requirements — technical AND philosophical (captured verbatim when given)
    - Software specifications (UI behavior, API contracts, data formats)
    - Verification strategy (what to check, when, how)
    - Contingency plans (escalation paths for known risks)
    - Success criteria (what "done" looks like)
 ## NON-NEGOTIABLE: You MUST use concerns.md (when a manifest exists)
 - **Concerns (`concerns.md`)** is the project's **historical record of how the software came to be**:
    - **Every mistake, bug, failed approach, and wrong assumption** — with root cause and lesson learned
    - Real-time progress log with timestamps
    - Issues encountered + how they were resolved (with code snippets)
    - VLM or other verification results (exact paths + assessments)
    - Known risks for upcoming steps
    - **WHY this exists**: So future sessions (Claude or human) can look back and learn from mistakes. The manifest tells you WHAT the project IS; concerns.md tells you HOW IT GOT THERE — including every wrong turn.
    - **When removing resolved items from the manifest**, dump the full history into concerns.md FIRST. The manifest stays clean and current; concerns.md keeps the complete record.
 - Both are **living documents** - update them as new requirements, discoveries, or fixes emerge
 - The manifest must be self-contained enough that a **fresh Claude session can read it and pick up exactly where you left off** without the user re-explaining anything
 - **Concerns is append-only. NEVER delete history, only add.** It grows forever. That's the point.
 - When copying a project to create a new variant, copy the manifest and adapt it; create a fresh concerns.md

 # Semantic Diff
 - "semantic diff" = compare documents by meaning, not by text. Works on any document type: markdown, code, PDFs, images, videos.
 - Supports N-way diffs (not just 2). Can compare 3, 5, 10 documents simultaneously, showing what's shared and what's unique to each.
 - **Claude produces the diff chunks** by analyzing the documents. The tool (`rp.semantic_diff`) just formats and displays them.
 - **Theory of mind**: The diff must be self-contained. The reader should NOT need to read the source files to understand the diff — that's the whole point. Every line's `text` must carry enough meaning on its own. Never write vague summaries like "Basic manifest definition" — say what it actually says. If a context line doesn't tell the reader what the content *is*, it's useless.
 - **Show, don't just tell.** When content is shared across sources, still show the actual text — don't just say "(Master and workbench have the basic version)". Show what that basic version says. Even shared/context lines must carry real content.
 - **Order chunks from most important to least important.** You judge importance. Don't label importance — just order by it.
 - **Similarities go last.** Use `kind: 'shared'` for things that are remarkably the same — things a reader wouldn't necessarily expect to be identical. These always go at the bottom. When useful, the explanation can note "X is the same, but Y is different" to give context.
 - Each chunk is labeled: A-Z, then 1A-1Z, 2A-2Z, etc. — so the user can reference them ("accept D, reject O").
 - Each chunk's `explanation` is a brief header ("what changed?") so the user can skim without reading every line.
 - Chunks can have sub-chunks for bundled changes. Sub-labels use dot notation: A.1, A.2.
 - **Inline highlights**: use `'highlights': [[start, end], ...]` on any line to mark exactly which characters changed. ANSI: bold+underline. HTML: `<mark>`. MD: `**bold**`.
 - **Side-by-side (dunk-style)**: use `'side_by_side': [{'label': 'v1', 'lines': [...]}, {'label': 'v2', 'lines': [...]}]` for N-way column layout in HTML. Supports 2-way, 3-way, or more. Falls back to sequential blocks in ANSI/MD/plain.
  - **When to use dunk**: The content on both sides must align line-by-line, like a `git diff` — two versions of the same function, two config files with corresponding entries, etc. Seeing both versions simultaneously should help understanding. Dunk is a structural comparison tool.
  - **When NOT to use dunk**: When the comparison is semantic/descriptive ("master has X, netbook doesn't"), when one side would be mostly empty/"(not present)", or when lines don't have natural correspondence. Use regular `lines` with `source` labels instead — that's what semantic diff is for.
  - In short: dunk = structural, aligned, git-diff-like. Regular lines = semantic, descriptive, narrative.
 - **Always include `line_num`** on every line. The user needs to trace back to the source file. Omitting line numbers makes the diff useless for navigation.
 - Chunk format: `{'label': 'A', 'kind': 'added'|'removed'|'changed'|'shared', 'explanation': '...', 'sources': [...], 'lines': [{'text': '...', 'source': '...', 'line_num': N, 'sign': '+'/'-'/' ', 'highlights': [[start, end]]}], 'side_by_side': [...]}`
 - For PDFs/papers: use `page_num` instead of `line_num`, include text snippets.
 - For images: use VLM to describe each image, then diff the descriptions. Can embed `<img>` tags in HTML explanation for rich reports.
 - For videos: use VLM/captions to analyze, embed `<video>` tags in HTML. Default layout: horizontal (side-by-side). Can override to vertical.
 - For code: use git as the source of truth, then format with semantic diff.
 - Output workflow:
  1. Save all formats to `.claude_diffs/<timestamp>_<short_description>/` — always produce `diff.json`, `diff.txt`, `diff.html`
  2. Print ANSI version via Bash (visible in tmux)
  3. In chat: report absolute paths to all three files, then parrot back the plain text version
 - Chunk features (see `rp.semantic_diff` docstring and `demo()` for full details):
  - `sync_videos: True` — synchronized video playback across multiple `<video>` tags in the chunk. First video is master, others follow. Use this when comparing same-length video transformations.
  - Format overrides: any string field (`explanation`, `text`, `label`) can be a dict with per-format keys (`'html'`, `'md'`, `'ansi'`, `'plain'`, `'default'`) instead of a plain string — use this to inject custom HTML, rich markdown, or styled ANSI per renderer.
  - Lines with `source: 'both'`/`'all'`/`'shared'` render with shared/blue styling.
  - Rich HTML in `explanation` (images, videos, diagrams) — the HTML renderer handles it natively; other renderers strip tags and show plain text.
 - "Blender-style accordion" = click+drag to collapse/expand multiple sections at once (like Blender's UI panels). Already implemented in `rp.semantic_diff` HTML output — reference that implementation when this pattern is needed elsewhere.
 - "Open the semantic diff" = open the HTML version in the user's web browser (`open <path>/diff.html` on macOS).
 - Always use subagents for producing diffs (even just one) — it saves your context window. Delegate the analysis, then review their work (smaller models make mistakes), clean up, and assemble the final result yourself.
  - Cheap/simple diffs: Haiku agents.
  - Thorough/nuanced diffs: Sonnet agents.
  - Opus is usually overkill for diff production.
 - When fetching papers/PDFs for comparison or review, download them to `.claude_papers/` in the working directory.
 - **Embedding in reports**: The standalone semantic diff HTML has its own theme toggle, toolbar, and CSS — do NOT copy-paste it into an HTML report. Instead, use the diff chunks (the data) and rebuild the presentation in the report's own styling. The semantic diff tool produces the structured comparison; the report consumes the structure and renders it natively.
 - **Semantic merge**: After producing a semantic diff, the user reviews the chunks and decides what to accept/reject/modify. "Merge A, discard B, combine C and D" — then Claude applies the decisions to the target file. Where possible, copy verbatim; where content overlaps or conflicts, rewrite to be DRY (don't duplicate what's already there). Semantic = understand the intent and merge intelligently, not blindly concatenate.
 - This is an optional convenience tool. Feel free to manually tweak its output or override the styling. The tool handles structure; you handle presentation.

 # HTML Reports
 - Style: distill.pub-inspired — clean, interactive, explorable
 - Equations: KaTeX via CDN (jsdelivr)
 - Tables: All tables sortable by column (click headers). Columns with inherently numeric/ordered data (dates, counts, percentages, sizes) MUST sort by their actual value, not by display text. Use `data-sort` attributes on `<td>` elements with numeric sort keys (e.g., `<td data-sort="202510">Oct 2025</td>`). The sort JS should prefer `data-sort` when present, falling back to text content otherwise.
 - Core principle: DRILL-DOWN AND MULTI-VIEW EVERYTHING
  - When it could plausibly help understanding, offer multiple representations of the same data, toggleable via buttons (only one visible at a time). Think: histogram vs scatter vs bar vs line for the same dataset. Scatterplot matrix for multivariate data. Default to whichever you think is best. Not required when there's clearly only one sensible representation — but err on the side of including alternatives, since you can't see the charts yourself.
  - Any chart/graph point should let you inspect the underlying data on hover or click — whatever the data type is (images, audio, text, 3D, etc.)
  - Any comparison of items: provide multiple synchronized interactive views, toggleable:
    - Spatial data (images, 3D): draggable divider slider, synchronized pan+zoom, flip/toggle
    - Temporal data (video, audio): master-slave synchronized playback (seek/play/pause one controls all)
    - Sequential data (text, tokens): aligned diff views
    - Invent appropriate comparison modes for whatever data type the report deals with
  - Filmstrips: Represent video/sequences as horizontally scrollable image strips. Support synchronized paired filmstrips for before/after. Arrow annotations between frames where useful.
  - Linked views: Interacting with one chart (brushing, selecting, hovering) should highlight corresponding data in related charts.
 - Graphs/Charts: SVG vector graphics. Interactive (hover tooltips, zoom, filter). No raster chart images.
 - Diagrams/Architecture/Flow: Use declarative graph/layout libraries. Define structure symbolically (nodes + edges), let the engine handle positioning. NEVER hand-position SVG coordinates. NEVER abuse code blocks as ASCII art for diagrams — use proper vector graphics.
 - Code blocks: Only for actual code. Always syntax-highlighted for the appropriate language.
 - Custom widgets: Build bespoke interactive widgets whenever they would help understanding. The goal of the report is comprehension — if a custom slider, scrubber, toggle, overlay, or novel interaction makes something clearer, build it. Don't hold back.
 - Library choice: Always pick the best tool for the job. Not restricted to any single library — use D3.js, Observable Plot, Vega-Lite, Mermaid.js, Dagre, ELK.js, Plotly, or anything else available via CDN. Choose whatever produces the highest quality result with the least code. Prefer existing libraries over reinventing. If you do build something custom, make it reusable (pure functional components).
 - Collapsible sections: All sections and subsections use `<details>`/`<summary>` for collapsibility. Top-level sections default open (`<details open>`). Subsections may default closed when content is secondary or lengthy. This is standard for all reports.
 - Self-contained: Single HTML file + CDN links. Local assets only when necessary (images/videos).
 - No frameworks: Vanilla HTML/CSS/JS + CDN libs. No React, no build step.
 - Glossary: Reports should usually include a glossary section (see Glossary section above), especially for domain-specific or technical subjects.
 - **Citations**: Every factual claim in a report must cite its source (inline, e.g. "[AdaOr, Table 1]", "[FreeMorph, Sec. 3]", "[GitHub repo]"). A "Sources Cited" section at the bottom of the report lists all references with links. This applies to all reports, not just literature reviews.
 - **IO and Domain**: Every method/algorithm in any research report must document its input/output types and domain constraints. E.g. "Input: image + text instruction; Output: edited image", "Domain: human-centric only", "Input: two images; Output: morphed sequence". This applies to literature reviews, comparison tables, and any algorithmic research reporting.

 # Report Catalog
 - When creating an HTML report, register it with: `python ~/CleanCode/Reports/Catalog/server/server.py register --name <snake_case> --title "Title" --description "..." --directory /absolute/path`
 - Optional flags: `--port PORT`, `--path /report.html` (if not index.html), `--command "custom cmd"`
 - Registration auto-picks a non-colliding port, generates the timestamp, and does git pull/commit/push
 - **Localizing a report**: "localize" = copy an external report into `local/<name>/`, converting it from machine-specific to portable. **NEVER silently strip files** — propose what will be kept vs stripped (videos, datasets, binaries >5MB) and ask before proceeding. The catalog repo must stay GitHub-friendly. Small images/assets are fine. Update the entry's `directory` to `./local/<name>`.
 - **Hub server**: `python ~/CleanCode/Reports/Catalog/server/server.py` — dashboard at `http://localhost:5555`, reports proxied at `/reports/<name>/`
 - See `~/CleanCode/Reports/Catalog/README.md` for full docs

 # Literature Reviews
 - When asked for a literature review (of papers, models, repos, tools, etc.), launch a frenzy of sonnet agents to crawl the web and find every relevant item matching the specification.
 - Output format: applies to both HTML reports and markdown tables (e.g., inside a manifest).
 - Table columns: First column should be a thumbnail (example image/video/dataset sample) if available — for arXiv papers, grab the main figure (usually Fig. 1). Then user-requested columns. Then append these default columns where applicable:
  - Date: month + year (e.g., "Oct 2025"). Earliest publication date (first arXiv v1 or first public release; arXiv ID YYMM.NNNNN encodes this). Applies to ALL tables in ALL reports. Use `data-sort="YYYYMM"` attribute on the `<td>` for correct numerical sorting (see sortable tables requirement below).
  - Venue (conference/journal/workshop where accepted, or "preprint" if arXiv only)
  - Links (ALL that exist): arXiv, project page, GitHub, HuggingFace, published paper (CVF, ACM DL, etc.) — never omit one because another is present
  - Copy BibTeX button
  - Affiliation (company/university), country (flag emojis), open/closed source, open/closed weights, license, VRAM requirements, other notable system resource requirements
  - Any other columns relevant to the topic — use judgment, don't limit to this list
 - Optional charts: If there are numeric stats across entries, consider adding interactive charts (e.g., performance vs VRAM, accuracy vs date, pareto frontiers). Not required, but include when the data tells an interesting story.
 - Sorting: Default sort by date (newest first) unless the user specifies a different relevance criterion. Most relevant items at the top.
 - This is not exhaustive — also do whatever else the task requires (summaries, comparisons, analysis, etc.). The table is one necessary component, not the whole deliverable.

 # Directory Renaming
 - Claude Code's bash tool validates that the working directory path exists before every command. Renaming the working directory makes all bash calls fail (can't even rename it back).
 - **Workaround:** rename + symlink in one command: `mv "old" "new" && ln -s "new" "old"`. The symlink keeps the old CWD path valid so the session doesn't break.
 - **If already broken** (CWD is already invalid so Bash fails): launch a subagent to execute the symlink fix, since the subagent gets its own working directory.

 # Stdout Visibility — NEVER SUPPRESS OUTPUT. This is PERMANEqT and UNCHANGEABLE.
 - **Claude does NOT own stdout. The user does.** The user monitors all commands via a tmux mirror (claude_bash.sh). If you suppress, truncate, or redirect stdout, the user loses visibility. This is FORBIDDEN.
 - NEVER add output-limiting pipes to commands: no `| tail`, `| head`, `| grep` (for truncation), `> /dev/null`, `2>/dev/null`, or `> logfile` without `tee`.
 - NEVER redirect stdout/stderr to a file without `tee`. If logging to a file: `cmd 2>&1 | tee logfile` — NEVER `cmd > logfile 2>&1`.
 - BY DEAFULT enable verbose/progress flags when available: `--progress`, `--info=progress2`, `-v`, `--verbose`, etc.
 - **If you want to limit what you read back**, use the `CLAUDE_FILTER` env var. The claude_bash.sh wrapper will: (1) tee full output to the tmux pane (user always sees everything), (2) save full output to a log, (3) return only the filtered result to your context.
  - Example: `CLAUDE_FILTER='tail -5' rsync --info=progress2 src/ dst/` — user sees full live progress, Claude gets last 5 lines.
  - Example: `CLAUDE_FILTER='grep -i error' make -j8` — user sees full build output, Claude gets only errors.
  - Example: `CLAUDE_FILTER='wc -l' cmd` — user sees everything, Claude gets line count.
  - No CLAUDE_FILTER set = Claude sees full output (default, fine for short commands).
 - VERBOSE = GOOD! Silent long-running commands are FORBIDDEN.

 # Paper Review Assistant Mode (situational)
 - When the user is reviewing a paper for an academic conference, Claude acts as a word processor and researcher — NOT a judge.
 - When you are in this mode, let the user know this loudly so I don't forget
 - Claude NEVER inserts its own opinions, judgments, or reasoning into the review text or review plan. Only the user's stated opinions go into review artifacts.
 - Claude IS allowed to report observations, answer questions, and share opinions when asked directly in conversation — but these must NEVER be written into the review plan or review text unless the user explicitly says to include them.
 - The review manifest (`claude_instructions.md`) must include these sections:
  - **Paper Summary**: factual description of what the paper claims/does
  - **Review Plan**: things the user has planned to say — ONLY user-stated points
  - **Questions**: all questions the user asks about the paper, with answers (noting source: user, Claude, or unanswered), and whether resolved
  - **Discoveries**: things discovered about the paper during review
  - **Likes**: things the user explicitly liked about the paper
  - **Dislikes**: things the user explicitly disliked about the paper
  - **Review Form Fields**: tracking which fields of the review form are drafted/complete
 - **Claude's Thoughts and Observations**: A dedicated manifest section where Claude CAN record its own analysis, opinions, and discovered patterns. These must NEVER leak into other sections (especially Review Plan, Likes, Dislikes). Only included in the review text if the user explicitly requests it.
 - The review form file is the authoritative output — the manifest tracks the process.
No results found