The first thing is to go through the buffer by line, and have a little state machine to drive the conversion.
I want `#+begin_ai` blocks to start blockquoting the individual interlocutors, with an empty line between each. We can experiment with adding separators or not.
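To make the target output concrete, here is a minimal sketch of the blockquoting idea. The function name `sketch-blockquote-turns`, the `(SPEAKER . TEXT)` input shape, and the bold speaker label are all assumptions made for illustration; none of them come from org-ai.

```emacs-lisp
;; Hypothetical helper: the name, input shape and output style are assumptions.
(defun sketch-blockquote-turns (turns)
  "Render TURNS, a list of (SPEAKER . TEXT) pairs, as Markdown blockquotes.
Each turn becomes one blockquote; turns are separated by an empty line."
  (mapconcat
   (lambda (turn)
     (concat "> **" (car turn) "**: "
             ;; Prefix every continuation line so it stays inside the quote.
             (replace-regexp-in-string "\n" "\n> " (cdr turn))))
   turns
   "\n\n"))

;; Usage: two interlocutors, the second one with a two-line answer.
(sketch-blockquote-turns '(("ME" . "hi") ("AI" . "hello\nthere")))
```

Swapping the `"\n\n"` separator for something like a horizontal rule would be one way to experiment with separators.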
We want to recognize if we are in:
- ai mode
- the ME section of an ai block
- the SYS section of an ai block
- the AI section of an ai block
- source mode
- normal text mode
So, we need an enum for the states. That enum could be something like:
- :normal (just no special context)
- :in-dialogue (a `#+begin_ai` was recognized. We keep the interlocutor in a separate state variable, no need to overcomplicate things)
- :in-dialogue-src (we are inside a quote block inside an AI response)
- :in-src (we are in a `#+begin_src` block)
- :in-results (we are in a results block, if need be)
We need the following methods:
- is-begin-ai-block-p : begin_ai, returns the header line
- is-end-ai-block-p : end_ai
- is-begin-src-block-p : begin_src, returns the language and header line
- is-end-src-block-p : end_src
- is-begin-quote-p : when triple-backquote inside an AI block, returns the language
- is-end-quote-p : ending triple-backquote
- is-ai-dialogue-change : we have a [..] block, returns the interlocutor
- is-header-p
- is-title-p
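A few of these predicates could be sketched as follows. The function names are the ones listed above, but the regexps are assumptions about what the lines look like (in particular that speaker tags are `[ME]:`, `[AI]:` or `[SYS]:` at the start of a line), so treat this as a starting point rather than a finished parser.

```emacs-lisp
(require 'subr-x) ; for `string-trim'

;; Sketch only: every regexp below is an assumption about the input format.
(defun is-begin-ai-block-p (line)
  "Return the header text of LINE if it opens a #+begin_ai block, else nil.
The header may be the empty string, which is still non-nil in Emacs Lisp."
  (let ((case-fold-search t))
    (when (string-match
           "\\`[ \t]*#\\+begin_ai\\(?:[ \t]+\\(.*\\)\\)?[ \t]*\\'" line)
      (string-trim (or (match-string 1 line) "")))))

(defun is-end-ai-block-p (line)
  "Return non-nil if LINE closes a #+begin_ai block."
  (let ((case-fold-search t))
    (string-match-p "\\`[ \t]*#\\+end_ai[ \t]*\\'" line)))

(defun is-ai-dialogue-change (line)
  "Return the interlocutor named at the start of LINE, or nil."
  (let ((case-fold-search nil))
    (when (string-match "\\`\\[\\(ME\\|AI\\|SYS\\)\\]:" line)
      (match-string 1 line))))

(defun is-begin-quote-p (line)
  "Return the language after a triple backquote on LINE, or nil.
A bare closing fence also matches (with language \"\"), so the caller's
state has to decide whether the line opens or closes a quote."
  (when (string-match "\\`[ \t]*```\\([^ \t`]*\\)[ \t]*\\'" line)
    (match-string 1 line)))
```

Returning the captured text instead of a plain `t` lets the caller both test for a match and pick up the header, speaker, or language in one call.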
The main method of our state machine is going to be handle-state, which takes an input line, a current state, an output buffer, and returns a new state. I suspect an alist is the idiomatic way to keep structured data in emacs lisp.
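As a rough illustration of the state transitions only, the sketch below maps a state keyword and a line to the next state. The name `sketch-next-state` and the inline regexps are assumptions; it leaves out the output buffer, the interlocutor, and the `:in-results` state.

```emacs-lisp
;; Hypothetical transition function: it shows the shape of the state
;; machine, not the converter itself.
(defun sketch-next-state (state line)
  "Return the state that follows STATE after reading LINE."
  (let ((case-fold-search t))
    (pcase state
      (:normal
       (cond ((string-match-p "\\`[ \t]*#\\+begin_ai" line) :in-dialogue)
             ((string-match-p "\\`[ \t]*#\\+begin_src" line) :in-src)
             (t :normal)))
      (:in-dialogue
       (cond ((string-match-p "\\`[ \t]*#\\+end_ai" line) :normal)
             ((string-match-p "\\`[ \t]*```" line) :in-dialogue-src)
             (t :in-dialogue)))
      (:in-dialogue-src
       ;; Only a bare triple backquote closes the quoted code.
       (if (string-match-p "\\`[ \t]*```[ \t]*\\'" line)
           :in-dialogue
         :in-dialogue-src))
      (:in-src
       (if (string-match-p "\\`[ \t]*#\\+end_src" line) :normal :in-src))
      ;; Unknown states are passed through unchanged.
      (_ state))))

;; Usage: fold a small dialogue through the machine.
(let ((state :normal))
  (dolist (line '("#+begin_ai" "[ME]: hi" "```python" "print(1)" "```" "#+end_ai")
                state)
    (setq state (sketch-next-state state line))))
;; => :normal
```

Keeping the transition logic free of side effects makes it easy to test on its own; writing to the output buffer can then be layered on top.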
I don’t really know how to do structures in emacs-lisp, so let’s ask the LLM.
Let’s switch to a more terse interlocutor. I suspect the information about defstruct is wrong, and the LLM started writing “common lisp”.
After looking at the definition of `cl-defstruct`, it does indeed support documentation and type annotations for the slots. The terse interlocutor seems to have been wrong. Let’s let it be a bit more verbose.
We can now put this into an org-mode block and evaluate it.

    (require 'cl-lib)

    ;; Defining structures
    (cl-defstruct my-struct
      "Documentation for my-struct."
      (field1 nil :documentation "Doc for field1")
      (field2 0 :type integer :documentation "Doc for field2"))

    (cl-defstruct (my-struct2
                   (:constructor create-my-struct2)
                   (:predicate my-struct2-p)
                   (:copier copy-my-struct2))
      "Documentation for my-struct2."
      (field1 nil :documentation "Doc for field1")
      (field2 "" :type string :read-only t :documentation "Doc for field2"))

    (cl-defstruct (my-struct3 (:include my-struct))
      "Documentation for my-struct3, inheriting from my-struct."
      (field3 '() :type list :documentation "Doc for field3"))

    ;; Instantiating structures
    (setq instance1 (make-my-struct :field1 '(:a :b :c) :field2 42))
    (setq instance2 (create-my-struct2 :field1 'hello :field2 "world"))
    (setq instance3 (make-my-struct3 :field1 1 :field2 2 :field3 '("a" "b" "c")))

    ;; Accessing documentation and docstrings programmatically
    (documentation 'my-struct)
    (documentation 'my-struct-field1 'function)

The `documentation` function doesn’t seem to work for `'my-struct 'type`.
No need to use the AI to write our state structure. We are going to use org-mode tangle to write it. This is the first time I have used tangle and I don’t really know how it works, so I decided to start from the actual documentation.

    (require 'cl-lib)

    (cl-defstruct org-ai-to-md-state
      "Represents the internal state of the state machine used to convert org-ai flavoured org mode files to markdown."
      (state :normal :type symbol :documentation "The state variable.
    Can be:
    - :normal (no special context)
    - :in-dialogue (within a begin_ai block)
    - :in-dialogue-src (within a code block inside a dialogue)
    - :in-src (within a begin_src code block)
    - :in-results (within the results of evaluating a code block)")
      ;; Both slots default to nil, so the declared type allows nil as well.
      (current-speaker nil :type (or null string) :documentation "The current speaker if dialogue is active and a [..] tag was recognized.")
      (current-src-language nil :type (or null string) :documentation "When inside a code block, the current language (used for syntax highlighting)"))

Let’s use the ai to write the main driver. This is more out of curiosity, because it is probably just easier to write it out.

    (defun org-ai-to-md-handle-lines (state output-buffer lines)
      "The main driver of the converter. Iterates over the lines and accumulates them by calling HANDLE-LINE."
      (dolist (line lines state)
        (setq state (handle-line state output-buffer line))))

We can now try to have the LLM do some real lifting. We have a description of our task, and we have thought about the parsing functions we wanted, as well as documented them.
This was quite a mouthful, so we definitely want to write some unit tests. But before we do that, I would actually like to see if there is an idiomatic way to describe arguments and return values and their types.
We already saw that the super terse agent is often wrong, so let’s try again with a slightly more verbose one.
Not entirely sure what to make of all that, but it was interesting to see what is out there.
I’m intrigued by the autocomplete part, however. Not necessarily because I want it to be used in this project, but because I have been thinking about it for a few other package ideas.
Interesting, I’m not sure this is really useful per se, but I think there is a decent chance that the custom completion backend for company-mode is a real thing.
While I’m at it, I will ask it about completion in source blocks in org-mode, which is something that has been bugging me.