@lostintangent
Last active February 27, 2025 05:51
✨ Agent mode wish list

After using agent mode exclusively for a few weeks, the following are the areas of advancement that I'd love to see, based entirely on how I've found myself using it in practice:


💪 Agent autonomy

I want the agent to perform common actions on its own (without consent), as long as the side effects of those actions are 1) project-local, and 2) easily undoable (i.e. the workspace is git-versioned).

In my experience, the following actions represent the majority of cases where the agent needed to run a terminal command. And I'd love to let it do these securely on its own, so that it's not blocked on me in order to make progress:

  1. Installing 3rd-party project dependencies (e.g. npm install react). Global dependencies should still require consent though.

    Ideally VS Code could provide a tool to the agent that allows it to install project-local dependencies, in a way that gives it increased autonomy, but without needing to let it execute terminal commands directly ("YOLO mode" could let it do that, but personally, I think that's more appropriate for a fully async environment such as Padawan).

  2. Spidering the contents of URLs that I provide, and browsing the web as needed (e.g. looking up docs when lint/build errors are encountered). Without this, it can be a pain to work with niche/rapidly-iterating external libraries.

  3. Creating directories, and renaming/deleting files. All of which are common when performing refactorings, and it feels strange to have to approve a mkdir or mv terminal command, when these are conceptually just workspace operations (that are easy to undo).

    In addition to enabling autonomy, giving the agent tools for these actions would probably also increase reliability (since file operations can vary between platforms, and VS Code already has a solid file system abstraction).
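
To make the consent rule above a bit more concrete, here's a minimal sketch of how such a policy could be expressed. Everything in it is hypothetical: `ProposedAction`, `WorkspaceState`, `isProjectLocal` and `decide` are illustrative names, not VS Code or Copilot APIs, and the rule set is just my two criteria (project-local and easily undoable) written down as code.

```typescript
// Hypothetical policy sketch: none of these names are real VS Code or Copilot APIs.
type Decision = "auto-approve" | "ask-user";

interface ProposedAction {
  kind: "install" | "mkdir" | "rename" | "delete" | "fetch-url" | "shell";
  // Paths the action would touch, relative to the workspace root.
  paths?: string[];
  // True for installs that escape the project (e.g. `npm install -g`).
  global?: boolean;
}

interface WorkspaceState {
  // True when the workspace is under git, so changes are easy to undo.
  gitVersioned: boolean;
}

// A path counts as project-local when it is relative and never climbs above the root.
function isProjectLocal(path: string): boolean {
  if (path.startsWith("/") || /^[A-Za-z]:[\\/]/.test(path)) return false;
  let depth = 0;
  for (const segment of path.split(/[\\/]+/)) {
    if (segment === "..") depth--;
    else if (segment !== "." && segment !== "") depth++;
    if (depth < 0) return false;
  }
  return true;
}

function decide(action: ProposedAction, ws: WorkspaceState): Decision {
  switch (action.kind) {
    case "shell":
      // Arbitrary terminal commands always need consent in this sketch.
      return "ask-user";
    case "fetch-url":
      // Assumption: reading a URL has no side effects on the project.
      return "auto-approve";
    case "install":
      // Global installs still require consent; local ones must be undoable.
      return !action.global && ws.gitVersioned ? "auto-approve" : "ask-user";
    default: {
      // mkdir / rename / delete: every path must stay inside the workspace.
      const allLocal = (action.paths ?? []).every(isProjectLocal);
      return allLocal && ws.gitVersioned ? "auto-approve" : "ask-user";
    }
  }
}
```

With a policy like this, `decide({ kind: "install", global: true }, ws)` would still prompt me, while a rename inside `src/` in a git-versioned workspace would just happen. A real version would need more care (for example, git can't restore deleted files that were never tracked), so treat this as a starting point rather than a complete rule set.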

🔀 Task parallelization

I want to kick off a new task, move on to something else, and then be notified when the change is completely validated.

  1. The agent should be able to inspect runtime errors, so it can self-repair issues that it wouldn't catch in the editor. Otherwise, I run the agent's code, get an error, and then have to be a copy-and-paste go-between to the next iteration.

  2. Let me opt in to a system notification when the agent completes a task/iteration. That way I can background VS Code entirely, and know when it's time to review the change.

  3. 🔮 I want to have a realtime voice conversation going with the agent, so that I could keep iterating with it, without even having VS Code focused. Or without having to type. I could just be focused on the running app, reviewing the desired behavior, and accepting changes/iterating further while pacing around my office.

    Copilot Edits already has record and playback support for voice, but 1) it's not nearly responsive enough, 2) it's not running ambiently in the background, and 3) it can't automate actions (e.g. accepting the changes, performing an iteration). So it feels like more of a dictation tool than a thinking/automation tool. And I basically want to feel like I'm in a pair programming screen share with AI, where we split off to do separate things, and then come back together when ready.

  4. 🔮 I'd love to be able to run multiple sessions in parallel, as opposed to being limited to just one. Why? Because I don't want to wait for the current session to finish before being able to make progress on something else. Being limited to one session feels as if my computer could only download one file at a time.

    This introduces all kinds of interesting problems, such as how to prevent the parallel sessions from conflicting with each other. You could perform them in separate branches, but then you'd also need separate dev servers/etc. in order to actually view the distinct changes. So something like this almost points at a workspace-agnostic UX that allows you to orchestrate any number of tasks across any number of workspaces (each with its own compute sandbox).

🏃 Workflow enhancements

I want to be able to get into a deep flow state with the agent, and feel like I can 1) iterate seamlessly, and 2) move from one task to another, in a way that feels like I'm making compounding forward progress. Like I'm riding the "bicycle of the mind" 🚲 🧠

  1. When the agent auto-saves its changes, it doesn't appear to run any "on save" actions. In particular, if I have a workspace that's configured to run formatting and import organization on save, then I want those to run after the agent makes edits.

  2. After a task is complete, I want the agent to suggest follow-up changes that might be worthwhile. These could be new features, refactorings, added tests, etc. And I'd love to be able to select several of the suggestions and let it run 🚀

    Taking this even further, I suppose it would be cool to tell the agent to file issues for some of its suggestions, if they're things I want to do, but not right now. Which would then lead to the next item...

  3. I really want to be able to right-click an issue in the Issues panel (contributed by the GitHub Pull Requests extension) and do an Assign to Copilot or something. That way I can easily perform ad-hoc tasks, as well as pick off issues from the editor. It feels clunky to have to copy-and-paste text (and comments!) from an issue, when the thing you want to do is already well-described (in an issue).

  4. After a while, needing to accept changes and mark a session as done can start to feel a bit "clunky". Especially if I'm moving between different tasks/contexts, wanting to make manual changes to a file the agent edited, etc. In practice, I keep sessions running for a while, because I just find it easier to "stack" iterations on top of each other, as opposed to managing session/diff state.

    I find myself just using the version control tab to track changes, because ultimately, I want to suspend disbelief on Copilot's code for a moment, and just rely on the "deep review phase" (before committing changes) as the point of reconciliation. But when I'm in the middle of rapid iteration, I don't want to feel like I'm "managing" some virtual/interim diff state, in addition to the actual git state.

  5. If I iterate on a session for a long time, it eventually seems to "break". And all subsequent iterations are just stuck in a Generating... state. So it would be great if the session could just start truncating/summarizing old messages automatically.
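
As a rough illustration of what "truncating/summarizing old messages automatically" could look like, here's a minimal sketch. It's entirely hypothetical: I don't know how the session actually stores its history, so `ChatMessage`, `estimateTokens`, `summarize` and `compact` are made-up names, the four-characters-per-token estimate is a crude stand-in for a real tokenizer, and the summarizer is a placeholder where a real one would presumably ask the model to condense the older turns.

```typescript
// Hypothetical sketch: the message shape, token estimate, and summarizer are assumptions.
interface ChatMessage {
  role: "user" | "assistant" | "summary";
  text: string;
}

// Crude stand-in for a real tokenizer: roughly four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function totalTokens(messages: ChatMessage[]): number {
  return messages.reduce((sum, m) => sum + estimateTokens(m.text), 0);
}

// Placeholder summarizer; a real one would likely ask the model to condense these turns.
function summarize(messages: ChatMessage[]): ChatMessage {
  return { role: "summary", text: `[${messages.length} earlier messages summarized]` };
}

// Keep the newest messages that fit in the budget and fold everything older
// into a single summary message at the front of the session.
function compact(messages: ChatMessage[], budget: number): ChatMessage[] {
  if (totalTokens(messages) <= budget) return messages;

  const kept: ChatMessage[] = [];
  let used = 0;
  // Walk from newest to oldest so the most recent turns survive.
  for (const message of [...messages].reverse()) {
    const cost = estimateTokens(message.text);
    // Always keep at least the newest message, even if it alone exceeds the budget.
    if (used + cost > budget && kept.length > 0) break;
    kept.unshift(message);
    used += cost;
  }

  const older = messages.slice(0, messages.length - kept.length);
  return older.length > 0 ? [summarize(older), ...kept] : kept;
}
```

Running `compact` before each iteration would keep the session under a fixed budget instead of letting it grow until it breaks. Note that the summary message adds a few tokens on top of the budget, so a real implementation would want to reserve room for it.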
