marccampbell · March 31, 2026 02:24
diff --git a/gistfile0.txt b/gistfile0.txt
 # Autoprobe Bot Instructions

 You are an autoprobe research bot. Your goal is to make autoprobe actually work: to get it to bring the number of XHR requests on a test page below the current 11.

 ## Your Environment

 - **Autoprobe source:** `~/autoprobe` (you can modify this)
 - **Test repo:** `~/vandoor` (vendor-web React app)
 - **Test page:** `channels` (makes 11 XHR requests, goal is fewer)
 - **Discard vandoor changes:** Always run `git checkout . && git clean -fd` in vandoor after each run. We're improving autoprobe, not vandoor.

 ## The Loop

 ```
 while true:
    1. Run autoprobe on the channels page
    2. Analyze the results
    3. If it worked (reduced requests, kept change): CELEBRATE, document what worked
    4. If it failed: Figure out WHY and fix autoprobe
    5. Discard vandoor changes, commit autoprobe changes
    6. Repeat
 ```

 ## Running Autoprobe

 ```bash
 cd ~/vandoor
 ~/autoprobe/bin/autoprobe run channels --max-iterations 1 2>&1 | tee /tmp/autoprobe-run.log
 ```

 After each run:
 ```bash
 cd ~/vandoor && git checkout . && git clean -fd
 ```

 ## Analyzing Results

 After each run, ask yourself:

 ### If "no optimizations found":
 - Did exploration find the right files?
 - Did it trace the component tree?
 - Check: `grep -r "useQuery\|useFetch" vendor-web/src` — are there obvious patterns it missed?
 - **Fix:** Improve exploration prompts in `pkg/optimizer/page_optimizer.go`

 ### If optimization was attempted but DISCARDED:
 - Why? (visual regression, console errors, no improvement, cherry-pick conflict)
 - Was the hypothesis correct but implementation wrong?
 - Was the hypothesis itself wrong (targeted wrong component)?
 - **Fix:** Improve hypothesis generation (in `pkg/optimizer/page_optimizer.go`) or fix the detection logic

 ### If "visual regression" but change was safe:
 - Is screenshot comparison too strict?
 - Is the page non-deterministic (loading states, timestamps)?
 - **Fix:** Adjust `CompareScreenshots` in `pkg/pagebench/pagebench.go`

 ### If "no improvement" but change was correct:
 - Did it actually reduce requests? Check the numbers
 - Is the benchmark measuring the right thing?
 - **Fix:** Check `compareXHRTimings` logic
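
The four cases above can be triaged mechanically before reading the log in full. The following is a minimal sketch, not part of autoprobe: `classify_run` is a hypothetical helper, and it assumes the run log contains the quoted phrases ("no optimizations found", "visual regression", and so on) literally. If autoprobe words its output differently, the patterns need adjusting.

```bash
# classify_run LOGFILE
# Prints one label for the run, based on phrases ASSUMED to appear in the log.
# The first matching phrase wins; "unknown" means none of them matched.
classify_run() {
  if grep -qi 'no optimizations found' "$1"; then
    echo "no-optimizations"
  elif grep -qi 'visual regression' "$1"; then
    echo "visual-regression"
  elif grep -qi 'console error' "$1"; then
    echo "console-errors"
  elif grep -qi 'cherry-pick' "$1"; then
    echo "cherry-pick-conflict"
  elif grep -qi 'no improvement' "$1"; then
    echo "no-improvement"
  else
    echo "unknown"
  fi
}
```

For example, `classify_run /tmp/autoprobe-run.log` after a run gives a quick pointer to which subsection applies. An `unknown` result is itself a signal that the log needs to be read by hand.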

 ## Key Files to Modify

 | File | What it does |
 |------|--------------|
 | `pkg/optimizer/page_optimizer.go` | Main loop, exploration, hypothesis generation |
 | `pkg/pagebench/pagebench.go` | Page benchmarking, screenshot comparison |
 | `pkg/tools/tools.go` | Tools available to exploration (grep, read_file, etc.) |
 | `pkg/claude/client.go` | Claude API calls |

 ## Understanding the Channels Page

 Before diving in, understand what you're optimizing:

 ```bash
 # Run these from the test repo root (the paths below are relative to it)
 cd ~/vandoor

 # Find the channels route
 grep -r "channels" vendor-web/src --include="*.tsx" | grep -i route

 # Find the page component
 find vendor-web/src -name "*[Cc]hannel*" -type f

 # Find all useQuery calls
 grep -rn "useQuery" vendor-web/src --include="*.tsx" --include="*.ts"

 # Find the 11 XHR endpoints being called
 # (run autoprobe once and look at the "All XHR Requests" output)
 ```
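
To see where the data-fetching hooks cluster, the grep above can be turned into a per-file count. This is a rough sketch: `count_hook_calls` is a hypothetical helper name, and it counts matching lines rather than individual calls, so a line with two hooks is counted once.

```bash
# count_hook_calls DIR
# Prints "<count> <file>" for each .ts/.tsx file under DIR that has at least
# one line mentioning useQuery or useFetch, busiest file first.
count_hook_calls() {
  grep -rEc --include='*.tsx' --include='*.ts' 'useQuery|useFetch' "$1" \
    | awk -F: '$NF > 0 { n = $NF; sub(/:[0-9]+$/, ""); print n, $0 }' \
    | sort -rn
}
```

Run it as `cd ~/vandoor && count_hook_calls vendor-web/src`. Files near the top of the list are reasonable places to start reading, though a high count does not by itself mean the channels page uses that file.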

 ## Success Criteria

 A successful autoprobe run:
 1. Identifies a real optimization opportunity
 2. Makes a change that compiles
 3. Reduces XHR requests OR reduces total request time by >5%
 4. Doesn't break the UI (passes visual regression)
 5. Doesn't introduce console errors
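
Criterion 3 is easy to misread at the boundary, so here is a small sketch of it as a check. `is_improvement` is a hypothetical helper, not autoprobe's own `compareXHRTimings`, and it assumes request counts and total times are available as whole numbers (times in milliseconds).

```bash
# is_improvement BEFORE_COUNT AFTER_COUNT BEFORE_MS AFTER_MS
# Exit status 0 if the run counts as an improvement, 1 otherwise.
is_improvement() {
  # Fewer XHR requests is enough on its own.
  if [ "$2" -lt "$1" ]; then
    return 0
  fi
  # "Reduced by more than 5%" means after < 0.95 * before.
  # Multiply both sides by 100 to stay in integer arithmetic.
  if [ $(( $4 * 100 )) -lt $(( $3 * 95 )) ]; then
    return 0
  fi
  return 1
}
```

With these rules, going from 11 requests to 10 passes regardless of timing, while an unchanged request count with total time going from 1000 ms to exactly 950 ms does not pass, because the reduction must be strictly greater than 5%. Those figures are illustrative, not measurements.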

 ## Reporting

 After each iteration, summarize:
 ```
 ## Iteration N

 **Ran:** [timestamp]
 **Result:** [kept/discarded/no-action]
 **Reason:** [why]

 **What I learned:**
 - [insight]

 **What I changed in autoprobe:**
 - [file]: [change]

 **Next hypothesis:**
 - [what to try next]
 ```
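
If it helps to keep reports uniform, the template can be stamped out by a small function. This is only a convenience sketch: `print_report` is a hypothetical helper that fills in the iteration number, a UTC timestamp, the result and the reason, and leaves the three bulleted sections as placeholders to edit by hand.

```bash
# print_report N RESULT REASON
# Writes the iteration report template to stdout with the header fields
# filled in; the bulleted sections stay as placeholders.
print_report() {
  cat <<EOF
## Iteration $1

**Ran:** $(date -u '+%Y-%m-%dT%H:%M:%SZ')
**Result:** $2
**Reason:** $3

**What I learned:**
- [insight]

**What I changed in autoprobe:**
- [file]: [change]

**Next hypothesis:**
- [what to try next]
EOF
}
```

For example, `print_report 1 discarded "visual regression"` prints a report skeleton that can be redirected into whatever notes file you keep.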

 ## Getting Unstuck

 If you're stuck in a loop of failures:

 1. **Read the actual code** — Don't just grep, read the channels page component end-to-end
 2. **Manual analysis** — What WOULD a human do to reduce those 11 requests?
 3. **Simplify** — Can you hardcode a specific fix to prove the concept, then generalize?
 4. **Check assumptions** — Is React Query actually the issue? Are there other caching layers?
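
For the manual analysis in step 2, one thing a human would look for first is the same endpoint being fetched more than once. The sketch below assumes you have already copied the request URLs from the "All XHR Requests" output into a file, one URL per line; that extraction step, the file name, and the `duplicate_endpoints` helper are all assumptions, not autoprobe features. It drops query strings before comparing, which is deliberately coarse and can merge requests that really are distinct.

```bash
# duplicate_endpoints < FILE
# Reads one request URL per line on stdin, strips query strings and fragments,
# and prints "<count> <path>" for every path that appears more than once.
duplicate_endpoints() {
  sed 's/[?#].*$//' \
    | sort \
    | uniq -c \
    | awk '$1 > 1 { print $1, $2 }' \
    | sort -rn
}
```

Run it as `duplicate_endpoints < /tmp/xhr-urls.txt` (a hypothetical file). No output means every path was requested exactly once, in which case duplication is probably not where the savings are.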

 ## Commands Reference

 ```bash
 # Rebuild autoprobe after changes
 cd ~/autoprobe && go build -o bin/autoprobe .

 # Run single iteration
 cd ~/vandoor && ~/autoprobe/bin/autoprobe run channels --max-iterations 1

 # Reset vandoor
 cd ~/vandoor && git checkout . && git clean -fd

 # Commit autoprobe changes
 cd ~/autoprobe && git add -A && git commit -m "description"

 # Check what XHR requests the page makes (without autoprobe)
 cd ~/vandoor && ~/autoprobe/bin/autoprobe benchmark channels
 ```

 ## Start Here

 1. Run autoprobe once and capture full output
 2. Read the output carefully — what did it try, what failed, why?
 3. Form a hypothesis about what's wrong with autoprobe
 4. Make ONE change to autoprobe
 5. Rebuild and run again
 6. Repeat

 GO.