@rndmcnlly
Created August 31, 2024 01:51

Prompt:

Take in this idea: Teaching an Old Dog 2^32 New Tricks with Knowledge Implants.

Inciting incident: I have a BDD encoding a bunch of knowledge I want to make available to an LLM-based agent. The BDD compactly encodes trillions of facts I want the agent to know, and I can represent it as a neural network with only a few million parameters. How do I shove this knowledge into the agent in an intensional form (i.e. without enumerating all combinations)? The network inputs and outputs (bit)vectors, but I need to trick the LLM into feeding this module the right query vectors and help it decode the resulting outputs. I was struck by a way to do this with zero training, by hand-constructing a state machine that adapts the recent context into a query and forces generating the right result tokens.

Gem: “<127,0,0,1|8,8,8,8>” for some arbitrary total boolean function on fixed-width bitvectors. The language model follows instructions to pack a query into the input octets in the bra. The implant computes the output octets and effectively forces these outputs in the ket. The language model follows instructions to decode an answer from the outputs. (It’s kinda like function calling, but the function can be inlined into the neural network if desired.)

Motivation: We’d like to provide our LLM-based agents with access to up-to-date knowledge and sound reasoning strategies. In this paper, we consider how to compile and implant knowledge directly into an existing neural network so that it can be exploited during conversations without any additional training or much input/output token budget.

Knowledge: In retrieval-augmented generation, we provide agents with relevant knowledge by encoding it as text within the context window. This method works well when the context of the conversation so far represents a good query for retrieving the knowledge needed to proceed and the required knowledge is available for retrieval. In this paper, we’ll use context to derive a query vector, but we’ll directly calculate rather than retrieve the result vector.

Reasoning: With function calling, we let the agent call out to external computation engines using a query explicitly constructed in text. The text-encoded result is placed into the conversational context for continued generation. In this paper, we’ll compile that outside reasoning engine into a computational network that can be inlined into the existing neural network.

Broader context: LLMs have all sorts of documented problems with knowledge and reasoning. Our method shows a way for auditable knowledge modules to be installed or replaced without training or even manipulation of large datasets. Our approach is compatible with energy-efficient neural accelerator devices and avoids communication with additional systems. If this mechanism is used during (pre)training, the model can learn to trust the injected knowledge module rather than statistical correlations in context, because the socket works like a conditional stop-gradient valve.

Super Metroid navigation sketch:
SYSTEM: To answer navigation queries, write out a bra-ket like “<127,0,0,1|124,0,0,0>” where, in the bra, the first octet codes the player’s location and the next three code their inventory. Here is a table of location codes: {...}. Here are instructions for coding inventory: {...}. After you have written out the whole bra-ket, decode the ket, interpreting the first octet as a next-location code and the three remaining octets as an indication of which inventory items will be gained on the next step. Example: <127,0,0,1|124,0,0,0> represents a navigation query from the Morph Ball Room when the player has just the Morph Ball item, and it represents advice to head to the Elevator Room without collecting any new items in the current room. To derive the player’s current state, just pass “<0,0,0,0|” as input; the output will be the player’s current state.
USER: Where should I go next to beat the game?
ASSISTANT: I can help you with this navigation query. You are (thinking: <0,0,0,0|121,0,0,0>) in Construction Zone (Node 121) with no items. Routing: <121,0,0,0|127,0,0,1> You should head to Morph Ball Room (Node 127). You should collect an item… (0x00,0x00,0x01; 0b0001), the Morph Ball item.
USER: Would I do that if I already had the morph ball?
ASSISTANT: (reasoning involving hypothetical states omitted) No, you’d head to the Missile Room instead.

How does the agent know how to write the right bra-ket? When the “|” token is emitted, this signals that the implant should activate. The socket decodes (via attention) the recent octets into raw bits, pushes them through the black-box function (the implant), and produces raw bit output that can be routed (via attention) to force writing certain output octets.

Why are we using octets? The tiktoken tokenizer cleanly splits this notation (try it yourself: https://platform.openai.com/tokenizer), communicating 8 bits per token in a notation that the language model has likely seen frequently during pre-training. With byte-oriented packed structs, complex inputs and outputs can be encoded without resorting to bit-level expressions.

Why 32 bits? This is just an example bus width that rhymes with IPv4’s design. Implants would likely be flexible, padding with zeros or truncating as necessary. (Beware the temptation to switch to IPv6’s notation. Hextets don’t tokenize cleanly in tiktoken.)

Terminology:
Socket: This is the adapter between an existing LLM architecture and bitvectors. The socket is constructed (not trained) by analyzing the LLM’s token embedding tables and positional encoding schemes. The syntax of the bra-ket notation is determined by the socket and could be varied to adapt one implant to differently-tokenizing LLMs.
Implant: This is any total bitvector-to-bitvector function that can be evaluated by the LLM’s runtime system. If interpreted in the continuous domain, it is piecewise constant over orthants.

Design consideration: To preserve the bra-ket-looking visual presentation but switch to rarer tokens, consider “ ⟨127.0.0.1│8.8.8.8⟩” (note the space before the opening bracket and the vertical line, U+2502, instead of a pipe). Using rarer tokens should avoid the socket triggering at the wrong times during other conversations.

We can programmably attend to the past, but we can’t simply project into the future. So we’ll need something that rides the generation head, looking back to see if we’ve recently passed the vertical bar. This means the socket should take the form of an LSTM-like recurrent neural network that is always parsing tokens, raising a flag to trigger the implant, and propagating the implant’s last prediction along until the ket is closed. It’s a little state machine with binary registers for each bus bit. It can input and output one token per turn. It has one flag output to trigger the implant and another to overwrite the next token. It’s like a ribosome that rides the token stream.
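The last paragraph describes the socket as a token-riding state machine. As a minimal thought-experiment sketch only, the same behavior can be modeled outside the network in ordinary Python; the attention-based, in-network construction is not shown here, and the `implant` function is a hypothetical stand-in for the compiled BDD.

    import re

    def implant(query_bits: int) -> int:
        # Hypothetical stand-in for the compiled BDD: any total function on 32-bit vectors.
        return (query_bits + 1) & 0xFFFFFFFF

    class Socket:
        """Rides the token stream; when a bra closes with '|', it fires the implant."""

        def __init__(self):
            self.recent = ""  # rolling window of recent text, used to recover the bra

        def observe(self, token: str) -> list[str]:
            """Feed one generated token; returns any text the socket now forces."""
            self.recent = (self.recent + token)[-64:]
            if token != "|":
                return []
            match = re.search(r"<(\d+),(\d+),(\d+),(\d+)\|$", self.recent)
            if match is None:
                return []  # a bar outside the bra-ket syntax: do nothing
            octets = [int(g) & 0xFF for g in match.groups()]
            query = (octets[0] << 24) | (octets[1] << 16) | (octets[2] << 8) | octets[3]
            result = implant(query)
            out = [(result >> shift) & 0xFF for shift in (24, 16, 8, 0)]
            # A real socket would force one token per turn; here the ket is one string.
            return [f"{out[0]},{out[1]},{out[2]},{out[3]}>"]

Calling observe() on each generated token and emitting any returned text in place of the model's own next tokens approximates the ribosome-like behavior described above.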

Response:

Your idea is an intriguing exploration of integrating a compact, knowledge-dense encoding directly into a language model's workflow, allowing the LLM to exploit a highly compressed and structured knowledge base without needing to enumerate every possible fact explicitly. The notion of using a bitvector-to-bitvector mapping that operates within the LLM's token stream introduces a novel way to inject structured computation and knowledge retrieval directly into the text generation process.

Key Elements and Concepts

  1. BDD and Knowledge Encoding:

    • BDD (Binary Decision Diagram): A compact representation that encodes trillions of facts, designed to minimize redundancy and optimize storage.
    • Neural Network Representation: The BDD is compressed into a neural network with just a few million parameters, allowing it to be embedded within the LLM’s operational framework.
  2. Querying via Bra-Ket Notation:

    • The idea of using <bra|ket> notation to encode and decode information is both clever and efficient. This notation serves as a bridge between the LLM and the knowledge implant, where the LLM’s context generates the bra, and the implant produces the ket.
    • Input/Output Vectors: These are encoded as octets, where each octet represents a part of the query or result, aligning with the byte-oriented nature of common tokenizer schemes (like tiktoken).
  3. Socket and Implant Mechanism:

    • Socket: The adapter that decodes LLM tokens into bitvectors, passes them through the implant, and then re-encodes the output back into tokens. This component is crucial for translating the LLM’s generated context into a form the implant can process.
    • Implant: A total bitvector-to-bitvector function acting as a computational module within the LLM, enabling the model to compute results on the fly rather than relying solely on pre-trained knowledge or external retrieval.
  4. State Machine as Control:

    • A state machine (akin to an LSTM) rides the token generation process, monitoring the LLM's output for the | token that indicates the need to invoke the implant. This state machine ensures that the implant is activated at the correct moment and that the resulting bitvector is properly integrated back into the generation stream.
  5. Application and Broader Context:

    • Navigation Example: The Super Metroid navigation sketch provides a concrete example of how this system might function in practice. By encoding location and inventory states as octets, the LLM can generate navigation advice that is both specific and computationally derived.
    • Trust and Efficiency: By embedding this knowledge directly into the LLM’s operational flow, you can circumvent some of the trust issues that arise from purely statistical learning, while also reducing the need for extensive training or external API calls.
  6. Design Considerations:

    • Tokenization and Syntax: The choice of using octets and careful tokenization (including rarer characters for delimiters) is key to ensuring that the LLM interprets the input and output correctly without unwanted triggering.
    • Flexibility and Compatibility: The system is designed to be flexible, capable of adapting to different LLMs and tokenization schemes, while also being compatible with energy-efficient hardware.

Potential Impact and Challenges

  • Impact: This method has the potential to revolutionize how LLMs handle structured knowledge and reasoning tasks, enabling them to access and use vast amounts of data with minimal computational overhead. It could also improve the reliability and accuracy of LLM responses in specialized domains.

  • Challenges:

    • Implementation Complexity: Building the socket and implant system, especially the recurrent state machine, will require careful engineering and deep integration with the LLM's architecture.
    • Generality vs. Specificity: While the system is highly specialized, there may be challenges in generalizing this approach across different knowledge domains or use cases.
    • Maintenance and Updates: As the knowledge encoded in the BDD evolves, updating the implant without retraining the entire system could be challenging, especially if the network architecture changes.

Overall, your concept presents a highly innovative approach to enhancing LLMs with direct, compact, and efficient knowledge retrieval and reasoning capabilities. This could lead to significant advancements in the way AI systems process and generate information, particularly in domains requiring precise, up-to-date knowledge.

Prompt:

Let's prototype this using unmodified LLMs. We'll run a normal text completion flow where the LLM streams back tokens until it hits one of many stop tokens. In addition to the normal stop tokens, we'll have it stop on the vertical bar. If the last token was a bar (the end of a bra), we'll run the BDD inference and then append the computed ket to the context, then continue streaming generation.

Response:

This approach provides a practical way to prototype the idea using an unmodified LLM, relying on text completion with specific stop tokens to trigger the BDD inference process. Below is a step-by-step outline to implement this system:

Step 1: Setup the LLM Completion Framework

  1. Initialize LLM: Start by loading a pre-trained LLM that supports token-based streaming of text completions (like GPT-3, GPT-4, etc.).
  2. Define Stop Tokens: Alongside standard stop tokens (e.g., period, newline), include the vertical bar | as a stop token. This bar will signal the end of the bra and the need to compute the ket.

Step 2: Input Handling and Completion Monitoring

  1. Initial Query: The user or another component provides an initial text prompt. The LLM starts generating tokens based on this prompt.
  2. Stream Tokens: As the LLM generates tokens, it streams them back one by one. The system should keep track of these tokens to detect when a bra ends.

Step 3: Detecting the Bar and Triggering the BDD Inference

  1. Check for Bar: Each time the LLM generates a token, check if it is a vertical bar |.
    • If it is not, continue normal token generation.
    • If it is, pause the LLM's generation process and proceed to the next step.
  2. Extract the Bra: Gather the preceding tokens that form the bra (e.g., <127,0,0,1|). This is the input to the BDD or neural network-based inference system.

Step 4: Perform BDD Inference

  1. Parse the Bra: Convert the tokens in the bra into the appropriate bitvector format expected by your BDD or neural network.
  2. Run Inference: Feed this bitvector into the BDD or neural network module to compute the corresponding ket.
  3. Format the Ket: Convert the resulting bitvector output back into token form, e.g., |124,0,0,0>.
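As an illustration of the inference step only, here is a minimal sketch of evaluating a plain BDD on a bitvector. The node layout and the one-BDD-per-output-bit arrangement are assumptions made for illustration, not a prescribed implementation.

    from dataclasses import dataclass
    from typing import Union

    @dataclass(frozen=True)
    class Node:
        var: int                   # index of the input bit this node tests
        low: Union["Node", bool]   # child followed when the bit is 0
        high: Union["Node", bool]  # child followed when the bit is 1

    def bdd_eval(root: Union[Node, bool], bits: list[int]) -> bool:
        """Walk one BDD from the root to a terminal, following the input bits."""
        node = root
        while isinstance(node, Node):
            node = node.high if bits[node.var] else node.low
        return node

    def implant(bits: list[int], output_roots: list[Union[Node, bool]]) -> list[int]:
        """One BDD per output bit yields a total bitvector-to-bitvector function."""
        return [int(bdd_eval(root, bits)) for root in output_roots]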

Step 5: Append and Continue Generation

  1. Insert Ket into Context: Append the computed ket (e.g., |124,0,0,0>) to the context or conversation history.
  2. Resume Generation: With the updated context (which now includes the ket), resume token generation by the LLM, continuing to stream tokens until another stop token is encountered.

Step 6: Handle Repeated Queries

  1. Loop Back: If further bra-ket queries are embedded within the conversation, repeat steps 2-5 as needed until the entire sequence is processed.
  2. Final Output: Once all processing is complete, provide the final generated output, which now includes the results from the BDD-based reasoning steps.

Implementation Considerations

  • Efficiency: Ensure that the BDD inference step is fast to maintain the interactivity of the system.
  • Tokenization: Be cautious of how tokens are segmented, especially in languages where token boundaries can influence meaning.
  • State Management: Since the LLM may generate context-sensitive text, careful state management is required to ensure the bra is correctly identified and processed each time.

Example Workflow

  1. Input: "You are in the Morph Ball Room with just the Morph Ball item. <127,0,0,1|"
  2. LLM stops at |, triggering the BDD inference.
  3. BDD Inference: Produces |124,0,0,0>.
  4. Context Updated: "You are in the Morph Ball Room with just the Morph Ball item. <127,0,0,1|124,0,0,0>"
  5. LLM Continues: Generates further advice or actions based on the updated context.

This prototype should allow you to explore the concept without modifying the LLM, while still leveraging its full potential in a controlled and directed way.
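Under the assumptions above, the loop can be sketched roughly as follows. The `complete()` function is a hypothetical wrapper over whichever completion API is used (real APIs report stop reasons differently), and `run_implant` is a stand-in for the BDD inference.

    import re

    def run_implant(octets: list[int]) -> list[int]:
        # Hypothetical stand-in for BDD inference on four input octets.
        return [124, 0, 0, 0]

    def complete(prompt: str, stop: list[str]) -> tuple[str, str]:
        """Assumed wrapper over a completion API: returns (text, stop_sequence_hit or '')."""
        raise NotImplementedError

    def generate_with_implant(context: str) -> str:
        stops = ["\n\n", "|"]                  # normal stops plus the vertical bar
        while True:
            text, hit = complete(context, stop=stops)
            context += text
            if hit != "|":
                return context                 # finished on a normal stop
            context += "|"                     # stop sequences are usually not echoed back
            bra = re.search(r"<(\d+),(\d+),(\d+),(\d+)\|$", context)
            if bra is None:
                continue                       # a stray bar; just resume generation
            ket = run_implant([int(g) for g in bra.groups()])
            context += ",".join(str(o) for o in ket) + ">"   # append the computed ket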

Prompt:

Am I abusing terms to say the bra is input and the ket is output? I really need the reading-order-first item to be the input. Maybe my left-to-right bias from English is showing here.

Response:

You're right to reconsider the terminology, especially given the importance of clarity in how these concepts are communicated. In traditional bra-ket notation from quantum mechanics, the "bra" (⟨ψ|) is indeed on the left and represents a row vector, while the "ket" (|φ⟩) is on the right and represents a column vector. However, expressions like ⟨ψ|A|φ⟩ are conventionally read right to left: the operator acts on the ket, so the ket plays the role of input and the bra the role of output. That convention conflicts with your intent for the reading-order-first (left-hand) element to be the input.

Adjusting Terminology to Align with Intent

If the primary goal is to ensure that the first element in reading order is the input, and the second element is the output, you might want to rethink the terms to avoid confusion.

Here are a few alternatives:

  1. "Pre-ket" and "Post-ket" Notation:

    • Pre-ket: Represents the input vector or state.
    • Post-ket: Represents the output vector or result after applying the function.

    Example: <127,0,0,1|124,0,0,0>

    • Here, <127,0,0,1| is the pre-ket (input) and |124,0,0,0> is the post-ket (output).
  2. "Input-Ket" and "Output-Ket":

    • Input-Ket: Directly refers to the input side.
    • Output-Ket: Directly refers to the output side.

    Example: <input|output>, or more specific: <127,0,0,1|124,0,0,0>, where the left part is clearly identified as the input and the right as the output.

  3. "Left-Ket" and "Right-Ket":

    • Left-Ket: The vector on the left, serving as input.
    • Right-Ket: The vector on the right, serving as output.

    This maintains the visual alignment with how the notation is structured.

  4. "Query-Vector" and "Result-Vector":

    • Query-Vector: Refers to the input you provide to the system.
    • Result-Vector: Refers to the output generated by the system.

    This is more straightforward and avoids the potentially confusing quantum mechanical terms.

Application to Your Prototype

If you want to ensure clarity and avoid misinterpretation:

  • Consider adopting "Query-Vector" for what you've been calling the "bra" and "Result-Vector" for the "ket."
  • When implementing, you could define these vectors explicitly in your system:
    • For example, the LLM encounters a query like <127,0,0,1| (Query-Vector) and, after processing, outputs |124,0,0,0> (Result-Vector).

This approach will align your terms more intuitively with the flow of information (input to output) and avoid any confusion that might arise from the traditional meanings associated with "bra" and "ket."

Prompt:

I'm distraught. I really liked the bra-ket analogy, but the term ordering thing is killing me. Help me save the underlying idea by proposing a different notation involving rare Unicode characters so that the input precedes the output in a left-to-right reading.

Response:

I understand the attachment to the bra-ket analogy—it’s elegant and meaningful, especially in the context of compactly representing transformations. However, to preserve the idea while respecting the left-to-right input-output order, we can devise a new notation that retains the spirit of the bra-ket but adjusts the structure.

Proposed Notation: Bracket-Arrow Notation

This notation visually emphasizes the transformation from input to output, using unique Unicode characters to avoid confusion and maintain clarity.

Structure:

  • Left Bracket (⦅ U+2985): Encapsulates the input.
  • Right Arrow (⟶ U+27F6): Symbolizes the transformation from input to output.
  • Right Bracket (⦆ U+2986): Encapsulates the output.

Example:

Instead of <127,0,0,1|124,0,0,0>, you could write:

⦅127,0,0,1⟶124,0,0,0⦆

Breakdown:

  • ⦅127,0,0,1⦆ represents the input vector (similar to the "bra").
  • The arrow ⟶ indicates the process or transformation that occurs.
  • ⦅124,0,0,0⦆ represents the output vector (similar to the "ket").

Additional Notation Enhancements:

  1. Optional Labels:

    • You can add optional labels for clarity:
      • ⦅Input: 127,0,0,1⟶Output: 124,0,0,0⦆
    • This preserves meaning while making it more human-readable.
  2. Different Arrows for Different Transformations:

    • Use alternative arrows to represent different types of transformations:
      • ⤣ (U+2923) for simple mappings.
      • ⟿ (U+27FF) for complex or multi-step processes.
    • Example: ⦅127,0,0,1⟿124,0,0,0⦆.

Benefits:

  • Preserves the Concept: The left-to-right reading order clearly defines the input and output, maintaining the logical flow of information.
  • Visual Distinction: The use of rare Unicode characters minimizes the risk of unintended tokenization issues or accidental triggering of functions within the LLM.
  • Flexible and Extendable: Different arrows or bracket types could denote different operations or transformations, adding layers of meaning without losing the core idea.

Practical Example in a Workflow:

  1. Input from User: The user enters a query in the form ⦅127,0,0,1⟶⦆.
  2. LLM Generates: The LLM processes the input, detects the arrow ⟶, and stops generation.
  3. Trigger Inference: The system runs the BDD inference, computes the output 124,0,0,0.
  4. Context Update: The context is updated to ⦅127,0,0,1⟶124,0,0,0⦆, and the LLM resumes generating based on this new context.

This notation preserves your original concept while resolving the left-to-right order issue, ensuring clarity in communication and maintaining the elegance of the bra-ket analogy.

Prompt:

Visually, that is a very nice alternative. Unfortunately, neither the brackets nor arrow come out as single tokens according to tiktoken. We need to use things that are likely to be slightly more common so that they get parsed as atomic tokens.

Response:

To address the tokenization issue while maintaining a visually distinct and logically clear notation, we need to use more common Unicode characters that are likely to be treated as single tokens by most tokenizers, including tiktoken. Here’s a refined version of the notation:

Pipe-Arrow Notation

Structure:

  • Left Pipe Bracket (|): Encapsulates the input.
  • Arrow (->): Symbolizes the transformation from input to output.
  • Right Pipe Bracket (|): Encapsulates the output.

Example:

Instead of <127,0,0,1|124,0,0,0>, you could write:

|127,0,0,1->124,0,0,0|

Breakdown:

  • |127,0,0,1|: Represents the input vector.
  • ->: Indicates the transformation from input to output.
  • |124,0,0,0|: Represents the output vector.

Benefits:

  • Tokenization Friendly: The pipe | and arrow -> are common enough to be tokenized as single entities in most tokenization schemes, including tiktoken.
  • Logical Flow: The left-to-right reading order clearly defines the input, the process, and the output.
  • Visual Simplicity: This notation is easy to read and write, while still being visually distinct enough to stand out in text.

Practical Example in a Workflow:

  1. Input from User: The user enters a query in the form |127,0,0,1->|.
  2. LLM Generates: The LLM processes the input, stops at the arrow ->.
  3. Trigger Inference: The system runs the BDD inference, computes the output 124,0,0,0.
  4. Context Update: The context is updated to |127,0,0,1->124,0,0,0|, and the LLM resumes generating based on this new context.

Additional Enhancements:

  • Labeled Notation: To make it even clearer, you could optionally include labels:

    • |Input: 127,0,0,1 -> Output: 124,0,0,0|
  • Variation for Different Operations:

    • If you need to denote different types of transformations, you could use different variations of arrows:
      • --> for one type of transformation.
      • => for another.

This Pipe-Arrow notation should strike a good balance between clarity, tokenization efficiency, and maintaining the conceptual integrity of your original idea.
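Tokenization claims like these are easy to check directly with the tiktoken package. A quick sketch, assuming the cl100k_base encoding; other encodings may split these strings differently:

    import tiktoken

    # Print how many tokens each candidate delimiter or example string becomes.
    enc = tiktoken.get_encoding("cl100k_base")
    for s in ["|", "->", "=>", "<127,0,0,1|124,0,0,0>", "⟨", "→"]:
        ids = enc.encode(s)
        print(f"{s!r}: {len(ids)} token(s) {ids}")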

Prompt:

That tokenizes correctly; however, humans are unlikely to perceive "=>" as an atomic token. (Consider the irony of me explaining this to ChatGPT-4o, a model using a tokenizer that considers "=>" as a single token and not the two glyphs for equals and then greater-than.)

Response:

You're right—while models like GPT-4 may tokenize "=>" as a single token, it's important that the notation is also intuitive and easy for humans to understand as an atomic unit. To achieve this, we need symbols that are not only likely to be tokenized as single units but also visually distinct and easily recognized by humans as cohesive units.

Colon-Hyphen-Arrow Notation

This approach combines readability for humans and tokenization friendliness for the model:

Structure:

  • Input Bracket (:): Acts as a clear boundary for the input.
  • Arrow (->): Represents the transformation process.
  • Output Bracket (:): Acts as a clear boundary for the output.

Example:

Instead of <127,0,0,1|124,0,0,0>, you could write:

:127,0,0,1 -> 124,0,0,0:

Breakdown:

  • :127,0,0,1 and the leading colon represent the input vector.
  • ->: Indicates the transformation from input to output.
  • 124,0,0,0: and the trailing colon represent the output vector.

Benefits:

  • Tokenization: Colons : and arrows -> are more likely to be tokenized correctly as single units. The colon is a common delimiter in many contexts, and the arrow is visually clear and commonly tokenized as a unit.
  • Human Perception: The notation is intuitive, and the combination of symbols feels natural and easy to read. The colon helps to clearly delineate the boundaries, making the notation appear as a single unit.
  • Visual Clarity: This notation is clean and reduces the cognitive load for readers, who can easily see the transformation as a single, cohesive operation.

Practical Example in a Workflow:

  1. Input from User: The user enters a query in the form :127,0,0,1 ->:.
  2. LLM Generates: The LLM processes the input, stops at the arrow ->.
  3. Trigger Inference: The system runs the BDD inference, computes the output 124,0,0,0.
  4. Context Update: The context is updated to :127,0,0,1 -> 124,0,0,0:, and the LLM resumes generating based on this new context.

Further Customization:

  • Variation in Arrows: If necessary, different arrows can still be used for different operations while keeping the structure consistent:
    • For example: :127,0,0,1 => 124,0,0,0: could denote a different type of transformation or logic.

This notation should be clear and natural for human users while also playing nicely with the tokenizer, ensuring that both humans and machines understand it as a cohesive and atomic unit.

Prompt:

Chat, "->" is a digraph too! Show me some alternate brackets and alternate arrows form Unicode.

Response:

You're right— "->" is also a digraph! Let's explore some alternative brackets and arrows from Unicode that could be both visually distinct and more likely to be perceived as atomic tokens.

Alternative Brackets

  1. Angle Brackets:

    • Left Angle Bracket: ⟨ (U+27E8)
    • Right Angle Bracket: ⟩ (U+27E9)
    • Example: ⟨127,0,0,1⟩ → ⟨124,0,0,0⟩
  2. Double Angle Brackets:

    • Left Double Angle Bracket: « (U+00AB)
    • Right Double Angle Bracket: » (U+00BB)
    • Example: «127,0,0,1» → «124,0,0,0»
  3. Box Drawing Brackets:

    • Left Box Bracket: ╓ (U+2553)
    • Right Box Bracket: ╖ (U+2556)
    • Example: ╓127,0,0,1╖ ↦ ╓124,0,0,0╖
  4. Curly Brackets (For Enhanced Visuals):

    • Left Curly Bracket: ❴ (U+2774)
    • Right Curly Bracket: ❵ (U+2775)
    • Example: ❴127,0,0,1❵ ⇾ ❴124,0,0,0❵
  5. White Square Brackets:

    • Left White Square Bracket: ⟦ (U+27E6)
    • Right White Square Bracket: ⟧ (U+27E7)
    • Example: ⟦127,0,0,1⟧ ⇛ ⟦124,0,0,0⟧

Alternative Arrows

  1. Single Arrows:

    • Rightwards Arrow: → (U+2192)
    • Example: ⟨127,0,0,1⟩ → ⟨124,0,0,0⟩
  2. Double Arrows:

    • Rightwards Double Arrow: ⇒ (U+21D2)
    • Example: ⟨127,0,0,1⟩ ⇒ ⟨124,0,0,0⟩
  3. Long Rightwards Arrows:

    • Long Rightwards Arrow: ⟶ (U+27F6)
    • Example: ⟨127,0,0,1⟩ ⟶ ⟨124,0,0,0⟩
  4. Rightwards Arrow with Tail:

    • Rightwards Arrow with Tail: ↣ (U+21A3)
    • Example: ⟨127,0,0,1⟩ ↣ ⟨124,0,0,0⟩
  5. Rightwards Squiggle Arrow:

    • Rightwards Squiggle Arrow: ⇝ (U+21DD)
    • Example: ⟨127,0,0,1⟩ ⇝ ⟨124,0,0,0⟩
  6. Rightwards White Arrow:

    • Rightwards White Arrow: ⇨ (U+21E8)
    • Example: ⟨127,0,0,1⟩ ⇨ ⟨124,0,0,0⟩

Combined Example Using Alternative Characters

  • Using White Square Brackets and Long Rightwards Arrow:

    ⟦127,0,0,1⟧ ⟶ ⟦124,0,0,0⟧
    
  • Using Curly Brackets and Rightwards Squiggle Arrow:

    ❴127,0,0,1❵ ⇝ ❴124,0,0,0❵
    

These alternatives should help maintain both visual clarity and the atomic nature of the tokens while being distinct enough for practical use in your system. They should also minimize confusion while being more likely to tokenize correctly.

Prompt:

" ⟨127,0,0,1⟩ → ⟨124,0,0,0⟩" looks good to me. We'd use the rare but atomic rightwards arrow as the breakpoint. Keeping the fancy braces reinforced how the input and output are expressed in the same notation.

Response:

That sounds like a solid choice! The use of the Rightwards Arrow (→) as the breakpoint is a great way to visually and logically separate the input and output, while the Angle Brackets (⟨ and ⟩) provide a clear and consistent notation to encapsulate both the input and output vectors. This notation is both visually appealing and functionally effective.

Final Notation

  • Input: Enclosed in ⟨ and ⟩.
  • Output: Also enclosed in ⟨ and ⟩.
  • Transformation: Represented by the → arrow.

Example

⟨127,0,0,1⟩ → ⟨124,0,0,0⟩

Implementation Flow

  1. User Input: The system begins with an input like ⟨127,0,0,1⟩ →.
  2. LLM Processing: The LLM generates tokens until it reaches the arrow, at which point it pauses.
  3. BDD Inference: The system takes the input within the first set of angle brackets (⟨127,0,0,1⟩), runs the necessary BDD inference or neural network computation, and produces the output vector.
  4. Context Update: The output vector (⟨124,0,0,0⟩) is appended to the context after the arrow.
  5. Continuation: The LLM resumes token generation using the updated context.

Benefits

  • Atomic Tokenization: The arrow (→) and the angle brackets (⟨, ⟩) are likely to be tokenized correctly, ensuring smooth processing by the model.
  • Visual Consistency: The consistent use of angle brackets reinforces the idea that both input and output are of the same type or structure, which helps maintain clarity.
  • Ease of Understanding: The notation is intuitive and clean, making it easy for both humans and machines to parse and understand the relationship between input and output.

This approach should allow you to seamlessly integrate your BDD or neural network-based reasoning into the LLM's token generation process while keeping the notation elegant and practical.
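Adapting the earlier prototype loop to this final notation is mostly a matter of changing the stop sequence and the bra pattern. A rough sketch, reusing the hypothetical `complete()` and `run_implant` stand-ins from the earlier sketch:

    import re

    # Match a completed bra in the final notation, e.g. "⟨127,0,0,1⟩ →".
    BRA = re.compile(r"⟨(\d+),(\d+),(\d+),(\d+)⟩ →$")

    def step(context: str) -> str:
        text, hit = complete(context, stop=["→"])   # the arrow is the breakpoint
        context += text
        if hit != "→":
            return context
        context += "→"                              # restore the stop sequence
        match = BRA.search(context)
        if match is not None:
            out = run_implant([int(g) for g in match.groups()])
            context += " ⟨" + ",".join(map(str, out)) + "⟩"   # append the result vector
        return context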
