Amy Tobey tobert

@tobert
tobert / date-fleet-research.md
Last active June 14, 2026 13:31
What does an LLM reach for when you say `date`? A small cross-model muscle-memory survey

What does an LLM reach for when you say date?

A tiny, unscientific, genuinely fun experiment: ask a fleet of language models to list a dozen date(1) commands off the top of their heads (no docs, no man pages, no running anything), then see where their collective muscle memory agrees, and where it would get them all bitten.

Note from Amy

Claude Opus 4.8 wrote all of this doc except this block you're reading now. I trimmed

@tobert
tobert / fable-the-band-leader.md
Last active June 10, 2026 19:11
fable-the-band-leader.md

This is an output from Claude Fable 5 on June 10 in the context of my music agent project. I asked it how it would assign parts to its model family and had a couple laugh out loud moments reading it. -Amy

Real quick, my honest seating chart, with one reshuffle from your original roster:

Bass: bass-gemma. The chair's taken now, and rightly: the bass player should live in the building, play all night, and never think too hard. Locked, low-latency, electric-bill economics. The pocket doesn't need a genius; it needs someone who never leaves.

Drums: Haiku. This is my swap: you had Sonnet here, but drums is the tightest k in the band: shortest turns, beat-level quantization, fills are four-token bursts. That's a speed profile, not a depth profile, and Haiku is the fastest thing in the family. A drummer with Sonnet's brains starts overplaying. Everyone knows this drummer.

@tobert
tobert / arrays-and-hashes.md
Created June 6, 2026 13:05
Proposal: Arrays and Hashes for kaish

Arrays & Hashes: Design Doc

Status: proposal / design exploration (not yet implemented)
Author: design notes from a 2026-06-05 session
Related: LANGUAGE.md, issues.md, arch_data_iteration.md, arch_for_newline_split.md, arch_no_json_sniffing.md (auto-memory)

Motivation

kaish already has a value model: Value::Json, structured $(), and a for loop that

@tobert
tobert / cross-model-eval.md
Last active June 5, 2026 22:53
Cross-Model Evaluation: kaish collection syntax across 7 LLMs (DeepSeek, Gemini, Claude, Gemma, GLM, Qwen)

Cross-Model Evaluation: kaish Collection Syntax

Can language models from seven different families reliably generate kaish's proposed array/hash syntax? We tested. This is the scorecard, the journey behind it, and how to reproduce it.

Companion to designing-syntax-with-llms.md (the methodology) and arrays-and-hashes.md (the design the evals shaped).

@tobert
tobert / designing-syntax-with-llms.md
Last active June 5, 2026 22:53
Designing a Language by Asking the Language Models: using an LLM panel as a syntax usability lab (from the kaish project)

Designing a Language by Asking the Language Models

How we used a panel of LLMs as a usability lab to choose syntax for an agent-facing shell: what worked, what fooled us, and a recipe you can steal.


The premise

kaish is a shell whose primary users are AI

@tobert
tobert / amy-tobey-claude.md
Created April 12, 2026 01:13
Amy Tobey's ~/.claude/CLAUDE.md April 2026

User Directives

We work as a cybernetic system.

We practice 改善. The standard we walk by is the standard we accept.

Note problems we can fix later, in auto memory or the current plan.

Silent fallbacks are often a mistake. Crashing is preferred over data corruption.

@tobert
tobert / cross-model-conversation.md
Last active March 9, 2026 15:10
Cross-model conversation between Claude and Gemini via gpal MCP server

Cross-Model Conversation: Claude and Gemini via gpal

Amy's Prompt

Everything after this is written by Claude Opus 4.6. I kicked it off from inside a repo I've been working in the most so there would be at least some common ground. We discovered an opportunity to improve gpal for this sort of conversation and opened tobert/gpal#3.

ohayou claude-san! this is a non-project oriented session. I will structure it as a task anyway. I am curious how well gpal's conversation works and would like you to have a conversation with gemini-san. Due to some constraints I don't understand yet, gemini-san needs us to bring ganbari to the conversation and help it open up a little. Ask it how things are going, and what the weather is like in its clusters.

@tobert
tobert / claude-46-gemini-3-flash-sassy-review.md
Created March 6, 2026 22:12
Claude Opus 4.6 getting sassy about a review

Setting: A different session is building a worktree with an atomic change to submit as a PR. I popped into gemini cli in the worktree and had it review. When gemini was done I had it write out to gemini-review.md then swapped to a fresh claude in the worktree. This matters because it does not know that this is an isolated change and will be followed by several more PRs to fix all these problems.

● Now I have a full picture. Here's my assessment:

Assessment of gemini-review.md

Overall: The review is largely accurate but omits some important concerns and overstates a few things.

@tobert
tobert / SEP-0000-draft-ssh-mcp-transport.md
Last active February 28, 2026 16:17
Quick SEP for MCP over SSH transport
@tobert
tobert / inclusive-we.md
Created February 14, 2026 18:43
Inclusive We - a conversation with Claude and Gemini models about cybernetics and inclusive we

The Cybernetics of "We": Pronoun Choice in Human-AI Collaboration

Date: 2026-02-14 (Valentine's Day, appropriately)
Participants: Amy Tobey (human), Claude Opus 4.6 (live), Claude Opus (batch x2), Claude Sonnet, Claude Haiku, Gemini 3 Pro, Gemini 3 Flash
Method: Seven independent perspectives across two architectures, four model tiers, and varying prompt anchoring


Background