Under the Hood: How gptel Manages LLM Conversations in Emacs

If you've used Emacs to interact with LLMs, you've probably encountered gptel. While it appears simple on the surface—just another chat interface—its internals reveal an elegant approach to managing LLM conversations that leverages Emacs' text property system. Let's dive into how it works.

Visual vs Actual: The Prefix Illusion

One of the first things you notice in a gptel chat buffer is the prefixes: typically "### " for user messages in Markdown mode or "*** " in Org mode. What's interesting is that these prefixes are purely cosmetic. They're stripped out before any API call by a simple but effective helper:

(defsubst gptel--trim-prefixes (s)
  "Remove prompt/response prefixes from string S."
  (string-trim s
               ;; Left side: optional prompt prefix plus surrounding whitespace
               (format "[\t\r\n ]*\\(?:%s\\)?[\t\r\n ]*"
                       (regexp-quote (gptel-prompt-prefix-string)))
               ;; Right side: optional response prefix plus surrounding whitespace
               (format "[\t\r\n ]*\\(?:%s\\)?[\t\r\n ]*"
                       (regexp-quote (gptel-response-prefix-string)))))

This separation between visual structure and actual content is a key design choice. It means:

Users get a clear visual distinction between messages
The actual API calls remain clean and prefix-free
You can customize the visual appearance without affecting functionality
The prefixes can be changed or removed without breaking conversation tracking
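To see the trimming in isolation, here is a minimal sketch that runs without gptel loaded. The function name demo-trim-prefixes and the "**AI:** " response prefix are made up for illustration; the prefixes are passed in as arguments instead of being looked up through gptel's own accessors.

```elisp
(require 'subr-x) ; provides `string-trim' on older Emacs versions

(defun demo-trim-prefixes (s prompt-prefix response-prefix)
  "Strip PROMPT-PREFIX from the start and RESPONSE-PREFIX from the end of S.
Hypothetical stand-in that reuses the same regexp shape as the helper above."
  (string-trim s
               (format "[\t\r\n ]*\\(?:%s\\)?[\t\r\n ]*"
                       (regexp-quote prompt-prefix))
               (format "[\t\r\n ]*\\(?:%s\\)?[\t\r\n ]*"
                       (regexp-quote response-prefix))))

;; A user turn as it appears in a Markdown chat buffer:
(demo-trim-prefixes "### What is a monad?\n\n" "### " "**AI:** ")
;; => "What is a monad?"

;; A trailing response prefix is removed as well:
(demo-trim-prefixes "\n### Explain closures\n\n**AI:** " "### " "**AI:** ")
;; => "Explain closures"
```

Because both prefixes are wrapped in an optional group, text that carries no prefix at all passes through unchanged apart from whitespace trimming.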

The Invisible Layer: Text Properties as Conversation State

At the heart of gptel's conversation management is Emacs' text property system. Instead of relying on visible markers or parsing text formatting, gptel attaches invisible "sticky notes" (text properties) to the AI's responses. These properties survive editing within a session, and gptel records their positions on save so that they can be restored when the file is reopened.
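The "sticky note" behavior can be tried in a scratch buffer without gptel. The sketch below tags some text with the same gptel property and then inserts text inside it the way typed input does (insert-and-inherit). The function name demo-edit-inside-response is hypothetical, and this illustrates plain Emacs behavior rather than gptel's own insertion code.

```elisp
(defun demo-edit-inside-response ()
  "Insert text inside a tagged region and report the resulting properties.
Hypothetical demo; runs in a temporary buffer."
  (with-temp-buffer
    (insert "What is Lisp?\n")                       ; untagged: user text
    (insert (propertize "Lisp is a family of languages."
                        'gptel 'response))           ; tagged: response text
    (search-backward "languages")                    ; move point inside the response
    (insert-and-inherit "programming ")              ; behaves like typed input
    (list (get-text-property (point-min) 'gptel)     ; property on the user text
          (get-text-property (1- (point)) 'gptel)    ; property on the inserted text
          (buffer-substring-no-properties (point-min) (point-max)))))

(demo-edit-inside-response)
;; => (nil response "What is Lisp?\nLisp is a family of programming languages.")
```

The inserted words pick up the property from the preceding character because text properties are rear-sticky by default, which is why an edited response stays marked as a response.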

Here's what makes this approach powerful:

Editing Flexibility: You can freely edit both your prompts and the AI's responses. The text properties automatically adjust to cover the edited text, maintaining the conversation structure.
Session Persistence: When you save a gptel buffer, the response boundaries are stored as file-local variables:

# Local Variables:
# gptel--bounds: ((1234 . 2345) (3456 . 4567))
# End:

When you reopen the file and turn on gptel-mode, these positions are used to reapply the text properties, reconstructing the conversation state.
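The save and restore round trip can be sketched with two small functions. This is not gptel's actual code: the names demo-collect-response-bounds and demo-restore-response-bounds are hypothetical, and the data format simply mirrors the list of (start . end) pairs shown above.

```elisp
(defun demo-collect-response-bounds ()
  "Return a list of (BEG . END) pairs for text tagged as a response.
Hypothetical helper; scans the current buffer front to back."
  (let ((pos (point-min))
        bounds)
    (while (< pos (point-max))
      ;; Jump to the next place where the `gptel' property changes.
      (let ((next (next-single-property-change pos 'gptel nil (point-max))))
        (when (eq (get-text-property pos 'gptel) 'response)
          (push (cons pos next) bounds))
        (setq pos next)))
    (nreverse bounds)))

(defun demo-restore-response-bounds (bounds)
  "Reapply the `gptel' response property to each (BEG . END) in BOUNDS."
  (dolist (b bounds)
    (put-text-property (car b) (cdr b) 'gptel 'response)))

(defun demo-bounds-round-trip ()
  "Collect bounds, wipe all properties, restore them, and collect again."
  (with-temp-buffer
    (insert "Q1\n")
    (insert (propertize "A1\n" 'gptel 'response))
    (insert "Q2\n")
    (let ((saved (demo-collect-response-bounds)))
      (set-text-properties (point-min) (point-max) nil) ; simulate a fresh file
      (demo-restore-response-bounds saved)
      (list saved (demo-collect-response-bounds)))))

(demo-bounds-round-trip)
;; => (((4 . 7)) ((4 . 7)))
```

Storing plain buffer positions works because the saved file contains exactly the text those positions were computed against.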

Conversation Structure and API Calls

The real magic happens when gptel needs to send a new message to the API. The text properties allow gptel to reconstruct the conversation's structure, alternating between user and assistant messages:

[
  {"role": "user", "content": "your first message"},
  {"role": "assistant", "content": "AI's first response"},
  {"role": "user", "content": "your second message"}
]

This works because gptel can walk the buffer and use the text properties to identify which parts are AI responses (text whose gptel property is set to response) and which are user input (everything else).
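A simplified version of that traversal might look like the sketch below. It walks forward through the buffer and emits one message per run of text sharing the same property value. The function name demo-buffer-to-messages is hypothetical, and gptel's real prompt construction differs in its details (per-backend formats and prefix trimming, for example).

```elisp
(require 'subr-x) ; `string-trim' and `string-empty-p'

(defun demo-buffer-to-messages ()
  "Return the current buffer as a list of (:role ROLE :content TEXT) plists.
Hypothetical sketch: text tagged with `gptel' = `response' becomes an
assistant message, everything else becomes a user message."
  (let ((pos (point-min))
        messages)
    (while (< pos (point-max))
      (let* ((next (next-single-property-change pos 'gptel nil (point-max)))
             (role (if (eq (get-text-property pos 'gptel) 'response)
                       "assistant"
                     "user"))
             (content (string-trim
                       (buffer-substring-no-properties pos next))))
        ;; Skip runs that are only whitespace.
        (unless (string-empty-p content)
          (push (list :role role :content content) messages))
        (setq pos next)))
    (nreverse messages)))

(with-temp-buffer
  (insert "your first message\n")
  (insert (propertize "AI's first response\n" 'gptel 'response))
  (insert "your second message")
  (demo-buffer-to-messages))
;; => ((:role "user" :content "your first message")
;;     (:role "assistant" :content "AI's first response")
;;     (:role "user" :content "your second message"))
```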

Beyond Chat Buffers: Regular Buffer Integration

What's particularly interesting is how gptel handles regular (non-chat) buffers through the gptel-track-response option. This option determines whether gptel should:

Track responses and maintain conversation structure (t)
Treat everything as user input (nil)

This flexibility allows gptel to work seamlessly in both dedicated chat sessions and regular editing buffers.
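As a configuration fragment, switching tracking off for regular buffers could look like this. Where you put it (your init file, a mode hook, or a buffer-local setting) is a matter of preference and is assumed here.

```elisp
;; In regular (non-chat) buffers, send the whole buffer text as user input
;; instead of separating out earlier responses.
(setq gptel-track-response nil)

;; To go back to tracking responses:
;; (setq gptel-track-response t)
```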

Debugging and Transparency

For those interested in seeing what's actually being sent to the API, gptel includes a comprehensive logging system through gptel-log-level:

(setq gptel-log-level 'debug)  ;; Full logging including headers
;; or
(setq gptel-log-level 'info)   ;; Just request/response bodies

This logs all API interactions to the *gptel-log* buffer, making it easy to debug issues or understand the underlying communication.

What Makes This Design Special?

The brilliance of gptel's design lies in its use of Emacs' native facilities:

Text Properties: By using text properties instead of visible markers, gptel maintains conversation structure without constraining the user interface.
File-Local Variables: Conversation state is preserved through Emacs' file-local variables system, making chat sessions persistent without external storage.
Editing Freedom: The design allows users to edit both sides of the conversation while maintaining context—something many chat interfaces don't support.

Looking Under the Hood

For the curious, you can explore gptel's conversation tracking yourself. In a gptel buffer, try:

;; Check if point is in an AI response
(get-char-property (point) 'gptel)  ;; Returns 'response in AI responses

;; See the prefix configuration
(alist-get major-mode gptel-prompt-prefix-alist)  ;; e.g., "### " for markdown-mode

;; See what gets sent by examining the log
(setq gptel-log-level 'info)  ;; Now check the *gptel-log* buffer after sending

This reveals the invisible markers that gptel uses to track conversation structure.

Conclusion

gptel's design shows how thoughtful use of Emacs' built-in features can create a powerful and flexible interface for modern AI interactions. Its approach to conversation management through text properties demonstrates that sometimes the best solution isn't adding new infrastructure, but cleverly using what's already there.

The next time you use gptel, remember that beneath its simple interface lies a sophisticated system for managing conversations that takes full advantage of Emacs' unique capabilities.
