Date: 2026-04-03
Branch under test: autoresearch/apr03 (skill+bash) vs main (native tools)
Eval harness: os-qa automated scenario runner (84 scenarios)
Side-by-side comparison of the native tool architecture (main) vs the skill+bash architecture across 85+ QA scenarios. Both runs use the same Qwen 3.5 397B model on the same hardware, same NATS infrastructure, same QA evaluator. Only the os-assistant binary and scenario acceptance criteria differ.
Skills scored 48 GREAT against main's 44, with an identical BAD count (23 each). Token usage was effectively equal (10.3M each), while skills made fewer LLM calls (4,869 vs 5,649, roughly 14% fewer). Median time to first audio was 1.5s (skills) vs 1.7s (main).
| Main | Skills |
|---|---|
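The relative differences behind the headline figures can be recomputed from the numbers quoted in the summary. The following is a minimal sketch only: `pct_change` is a hypothetical helper written for this illustration and is not part of the os-qa harness.

```shell
#!/usr/bin/env bash
set -euo pipefail

# pct_change BASE NEW
# Hypothetical helper: prints the signed percentage change from BASE to NEW,
# rounded to one decimal place. LC_ALL=C keeps the decimal separator a dot.
pct_change() {
  LC_ALL=C awk -v base="$1" -v new="$2" \
    'BEGIN { printf "%.1f", (new - base) / base * 100 }'
}

# Figures taken from the summary (main first, skills second).
echo "LLM calls:                  $(pct_change 5649 4869)%"
echo "GREAT verdicts:             $(pct_change 44 48)%"
echo "Median time to first audio: $(pct_change 1.7 1.5)%"
```

A negative value means the skills run used less of that quantity than main. Token usage is left out because both runs report the same rounded value.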
#!/usr/bin/env bash
# install-xrdp.sh — set up remote desktop (xrdp + Xfce) on a remote Ubuntu box
#
# Run this on your Mac. It will:
#   1. Check for / install the Mac RDP client (Windows App) via Homebrew or App Store
#   2. Prompt for the SSH host and Linux user to configure
#   3. scp itself over, run via sudo, clean up after
#   4. Install xrdp + xorgxrdp + xfce4 on the remote
#   5. Configure ~/.xsession to launch Xfce with an isolated dbus bus
#      (so it coexists with any GNOME session on the physical monitor)
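Step 5 in the header above is the least obvious one, so here is a hedged sketch of what such an `~/.xsession` could look like. This is an assumption about one way to do it, not the script's actual body, which is not shown here. `write_xsession` is a hypothetical helper, and using `dbus-run-session` to get a private session bus is an illustrative choice.

```shell
#!/usr/bin/env bash
set -euo pipefail

# write_xsession TARGET
# Hypothetical helper: writes a session script to TARGET that starts Xfce on
# its own D-Bus session bus instead of reusing one from another login.
write_xsession() {
  local target="$1"
  cat > "$target" <<'EOF'
#!/bin/sh
# Ignore any session bus address inherited from the environment.
unset DBUS_SESSION_BUS_ADDRESS
# dbus-run-session starts a fresh session bus, runs the given command on it,
# and shuts the bus down when the command exits.
exec dbus-run-session -- startxfce4
EOF
  chmod 755 "$target"
}
```

On the remote box this would be called as `write_xsession "$HOME/.xsession"`. Writing to a path passed as an argument lets the helper be tried against a scratch file first.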
#!/usr/bin/env bash
# Setup an Ubuntu machine to be SSH-able from dakdevs' Mac.
# Idempotent: safe to re-run.
#
# Usage (on the Ubuntu machine):
#   curl -fsSL <gist-raw-url> | bash
set -euo pipefail
PUBKEY="ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIEqJZkbO069wsfHTuRDMPLNe8SWozP5Gf/cpTgxT3XI1 dakdevs-mac"
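The rest of the script is not shown here. As an illustration of how the "safe to re-run" property could be achieved for the key itself, the sketch below appends a public key only when that exact line is missing. `install_pubkey` is a hypothetical helper and not necessarily what the script does.

```shell
#!/usr/bin/env bash
set -euo pipefail

# install_pubkey KEY AUTH_FILE
# Hypothetical helper: appends KEY to AUTH_FILE unless that exact line is
# already present, so repeated runs never create duplicate entries.
install_pubkey() {
  local key="$1" auth_file="$2"
  local ssh_dir
  ssh_dir="$(dirname "$auth_file")"

  # sshd's default StrictModes rejects group- or world-writable paths.
  mkdir -p "$ssh_dir"
  chmod 700 "$ssh_dir"
  touch "$auth_file"
  chmod 600 "$auth_file"

  # -q quiet, -x match the whole line, -F treat the key as a fixed string.
  grep -qxF -- "$key" "$auth_file" || printf '%s\n' "$key" >> "$auth_file"
}
```

With the variable defined above, the call would be `install_pubkey "$PUBKEY" "$HOME/.ssh/authorized_keys"`. The whole-line fixed-string match means a key that differs only in its trailing comment is treated as a new entry.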