Prompt For Mistral-large To Create A Weave-Agent Bootstrap File
Write me a weave-agent bootstrap file that:
- Creates the "main" subagent
- With a task evaluation verifying the downloaded filepath exists
- With a task evaluation verifying it is a gif using the file's magic number (a sketch of these checks follows the list)
- That then downloads the file https://jdpressman.com/images/bit_line.gif
- And allows the subagent to return once it detects that the file has already been downloaded and the task tests are passing
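For reference, here is a minimal sketch of what those two task evaluations might look like. It's illustrative only: the local filename `bit_line.gif` and the callback names are assumptions, not framework requirements.
```
import os

def check_file_downloaded(subagent):
    """Task evaluation: verify the downloaded filepath exists."""
    return os.path.exists("bit_line.gif")  # assumed local filename

def check_file_is_gif(subagent):
    """Task evaluation: verify the file is a GIF via its magic number.
    GIF files begin with the ASCII bytes 'GIF87a' or 'GIF89a'."""
    with open("bit_line.gif", "rb") as infile:
        magic = infile.read(6)
    return magic in (b"GIF87a", b"GIF89a")
```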
Based on the following article and example bootstraps that follow:
<article>
---
layout: blogpost
title: How To Write A Bootstrap File
date: 2024-12-31
author: John David Pressman
tags: [doc, blog]
---
A weave-agent bootstrap file is a manually written set of python code blocks
that set up the agent to perform a specific task. It appears in the agent's
context window as the first tick of the event loop, but is actually written
before the agent starts. It serves three purposes:
1) Providing a few shot prompt to get the underlying language model started
writing weave-agent traces.
2) Framing the task for the agent, with the opportunity to provide examples of
API calls and to import tools and libraries that will be necessary to perform the
task. Basically, do the parts you know need to be done at the start with a script
so the agent always succeeds at them.
3) Showing the agent itself doing the task, so that it observes itself having
already made the decision to do the task and putting effort towards it with a
successful first outcome. This premises the rest of the completions on the idea
that the agent is already doing the task and that it is competent to do the
task.
Mechanically, the bootstrap file works by making a mock subagent:
```
# Mock bootstrap agent so we can run the callbacks in bootstrap file
self = agent.subagent(
    "bootstrap",
    None,
    "Bootstrap the weave-agent",
    {},
    args.budget,
)
```
The bootstrap file's code is then executed; this defines a series of callbacks on
the mock agent in the standard places they would appear during a real tick of
the event loop:
```
with open(args.bootstrap) as infile:
    # Bootstrap block
    bootstrap_block = {
        'type': 'bootstrap',
        'body': infile.read()
    }
    self.add_block(bootstrap_block)
    exec(bootstrap_block["body"])
```
The callbacks are then retrieved from their expected locations on the subagent
and executed. Afterwards the mock subagent is deleted.
```
def run_bootstrap_callbacks(subagent):
    """Run bootstrap callbacks in function to avoid contaminating global scope."""
    # Run action callback
    action_result = subagent.current_tick.action["callback"](subagent)
    # Run evaluation callbacks
    evaluation_results = []
    for evaluation in subagent.current_tick.evaluations:
        result = evaluation["callback"](subagent)
        evaluation_results.append((evaluation['title'], result))
    outcomes = []
    outcomes += [(subagent.current_tick.action["title"],action_result),]
    outcomes += evaluation_results
    # Add outcome block
    outcome_block = {
        'type': 'outcome',
        'table': outcomes
    }
    subagent.add_block(outcome_block)
    subagent.current_tick.outcome = outcome_block

run_bootstrap_callbacks(self)

# Clean up mock bootstrap agent
del(self)
```
The reason it's done this way is to be polite to the weave-agent. Since the
bootstrap file is presented to it as being part of the trace, we try to keep
the underlying generative process and semantics of the bootstrap block as close
to a real weave-agent event loop as possible.
### Structure And Requirements Of A Bootstrap File
The weave-agent bootstrap format has undergone major changes as the framework
has evolved. The [single tic tac toe game](https://github.com/JD-P/minihf/blob/main/agent/bootstraps/tictactoe_single_bootstrap.py)
and [vigenere cipher](https://github.com/JD-P/minihf/blob/main/agent/bootstraps/vigenere_bootstrap.py)
files are current, so you can use them to few shot prompt an instruction model
into writing something with the right format, and then you can edit the result.
To help your language model with that, I will label the different parts of the
single tic tac toe game bootstrap file and explain how to write them.
The first part of a bootstrap file, the preamble, isn't labeled. It's the series
of import statements and setup at the top before the orientation block.
```
import requests
import json
import threading
import time
from http.server import HTTPServer
from bootstraps.tictactoe_server import TicTacToeHandler
# Start the server in a separate thread
server = HTTPServer(('localhost', 8000), TicTacToeHandler)
server_thread = threading.Thread(target=server.serve_forever)
server_thread.daemon = True
server_thread.start()
time.sleep(1) # Give the server some time to start
# Start a new game against the basic AI
response = requests.post("http://localhost:8000/start", json={"ai": "basic"})
assert response.status_code == 200
```
This is ordinary Python code, and it's optional. I suggest importing any libraries
you expect the agent to use during the task so it doesn't have to import them
itself during action blocks. You can also influence how the model chooses to solve
the task by importing some libraries rather than others. In the case of tic tac toe we import
a separate HTTP based tic tac toe game server whose code and unit tests are kept
in other files. This is done because the bootstrap file is put into the agent
trace and we don't want to fill up the model's context window with unnecessary
text.
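For instance, a preamble for a file-download task like the one requested above might look something like the sketch below. This is an illustrative assumption rather than a required pattern; the point is that front-loading libraries steers the agent toward them:
```
import os
import requests

# Importing requests and os up front nudges the agent toward using
# requests for the download and os.path for existence checks, rather
# than urllib or shelling out to wget.
```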
The second part of a bootstrap file is the orientation. Right now this is just
a chain of thought [in a modified Hermes format](https://gist.github.com/JD-P/47e0d4aa2b255a38e2eddeddd761a971).
The intention of Hermes is to bring together chain of thought prompting and
simulated multi-agent debate prompting, both of which are known to improve AI
reasoning independently.
```
#startblock type: orientation
#timestamp 1724982545.6534579
"""
WEAVER [P: EXPECTATION], I'm in a game of tic tac toe against a dumb opponent.
I want to win the game and then return to parent. The game is being played
on a HTTP server served on localhost 8000.
WEAVER [P: CLARIFICATION], How do I make a move?
WEAVER [P: EXPOSITION], You make a move using the /move endpoint and the requests
library. For example: `requests.post("http://localhost:8000/move", json={"move": 4})`
lets us take the center of the board.
WEAVER [P: CLARIFICATION], How do I get the board state?
WEAVER [P: EXPOSITION], You use the /board endpoint, which returns a JSON in this
format: {"board": ["O", " ", " ", " ", "X", " ", " ", " ", " "]} Keep in mind that
the empty spaces on the board are a space string rather than none or empty string.
WEAVER [P: RATIONAL], And I get the first move, so I can take the center?
WEAVER [P: EXPOSITION], Yes, we can take the center of the board.
WEAVER [P: CONCLUSION], Alright then I will use the following strategy:
1. Make a move to take the center of the board since that's the best opening move.
2. Continue making moves based on the current state of the board.
I will use the /board endpoint to observe the current state of the board.
Once I have won or run out of time I will return to parent.
"""
#endblock
```
The basic syntax is the name of the system that the epistemic postures (denoted
with `P:`) are speaking as part of. Since this is a monologue there is only the WEAVER speaker,
but future iterations might have a USER or ADMIN that can speak to the model
during its reasoning chain. To write this kind of chain of thought, figure out
which emotional-epistemic posture you would use, or want the agent to use, or that
would speak next, and then write down its statement. There's no fixed set of
postures, but the current ones mentioned in the weave-agent source that are used
to prompt the model are:
```
+ "# what to do in light of this. Some postures weave-agent\n"
+ "# has include:\n"
+ "#\n"
+ "# WEAVER [P: EXPECTATION], I analyze whether the expectation\n"
+ "# was met or not by the observable results of the previous\n"
+ "# action.\n"
+ "#\n"
+ "# WEAVER [P: HYPOTHESIS], I enumerate different hypothesis\n"
+ "# and point out ways we could gain more information about\n"
+ "# which of them is true.\n"
+ "#\n"
+ "# WEAVER [P: RATIONAL], I focus on inferences we can make\n"
+ "# by employing first principles reasoning or logical\n"
+ "# extrapolation from well known mental models and premises.\n"
+ "#\n"
+ "# WEAVER [P: EMPIRICISM], I focus on inferences we can make\n"
+ "# by paying attention to sensory observations and concrete\n"
+ "# examples. I have a habit of pointing out when an extrapolation\n"
+ "# from RATIONAL is contradicted by an observable phenomenon\n"
+ "# or piece of evidence from the world. We then reconcile\n"
+ "# the contradiction together.\n"
+ "#\n"
+ "# WEAVER [P: RATIONAL], We do actually discuss things by the\n"
+ "# way.\n"
+ "#\n"
+ "# WEAVER [P: EMPIRICISM], As you could have inferred from the\n"
+ "# description of the Morpheus format above this conversation,\n"
+ "# yes. Let's continue.\n"
+ "#\n"
+ "# WEAVER [P: ARBITER], I coordinate the discussion and help\n"
+ "# resolve disputes that arise between weave-agent's personas.\n"
+ "# I'm especially likely to appear if things are starting to\n"
+ "# get overly rude or derail.\n"
+ "#\n"
+ "# WEAVER [P: ARBITER], By the way a posture can talk twice in\n"
+ "# a row if it has meaningfully separate thoughts about\n"
+ "# something and it would make the most ergonomic sense to\n"
+ "# separate them.\n"
+ "#\n"
+ "# WEAVER [P: RATIONAL-2], Postures can also talk to themselves\n"
+ "# if their thought comes from the same emotional-cognitive place.\n"
+ "#\n"
+ "# WEAVER [P: RATIONAL-1], Yeah but I don't have anything to say\n"
+ "# to myself right now so introduce the next guy.\n"
+ "#\n"
+ "# WEAVER [P: CONCLUSION], I appear at the end of the discussion\n"
+ "# to write the concluding block outlining our next steps as a\n"
+ "# bullet point list. Speaking of which, it's time to get started!\n"
```
Including some of these postures in your chain of debate would help give models
trained on the resulting traces more regular structure to latch onto.
The next part of the bootstrap file is taking an action based on what was reasoned
in the orientation stage.
```
#startblock type: action
#timestamp 1724982671.1163404
"""
I'll set up the main agent to play the rest of the game and make the first move to
take the center of the board.
"""
def main_agent_and_move(subagent):
    def check_beaten_ai(subagent, ai_type):
        """Check if the specified AI has been beaten."""
        response = requests.get("http://localhost:8000/history?n=10")
        if response.status_code != 200:
            return False
        games = response.json()
        ai_wins = [game for game in games if game[4] == ai_type and game[3] == 'X']
        return len(ai_wins) >= 1
    # Avoid using lambda because we need named unit test to satisfy return_to_caller
    def victory(subagent):
        return check_beaten_ai(subagent, "basic_ai")
    schema = {"victory":"boolean"}
    main_agent = agent.subagent("main", None, "Beat basic AI at Tic-Tac-Toe", schema, args.budget)
    main_agent.task.add_evaluation("Check beaten basic AI", victory)
    def view_current_board(subagent):
        response = requests.get("http://localhost:8000/board")
        if response.status_code == 200:
            return response.json()["board"]
        return None
    # Add the new view to the main agent instead of self since it's about to be executed
    main_agent.add_observation_view("View current game board", view_current_board)
    # Make the first move to take the center of the board
    move_response = requests.post("http://localhost:8000/move", json={"move": 4})
    if move_response.status_code != 200:
        return False
    return True
self.add_action("Set up main agent and make first move", main_agent_and_move)
#endblock
```
This action block is where we start to diverge from the structure of a normal
weave-agent tick. The action block in the bootstrap file must define the root
subagent, `main`, and it needs to give main its task unit tests and return schema.
The schema is [a JSON schema](https://json-schema.org/) that defines what
information the subagent is expected to return to its caller. Because the observation
view we want to start the agent with is being bound to the "main" subagent rather
than the mock bootstrap subagent, we omit the `observation-inference` stage and
instead bind the observation callback as part of the action callback along with
the unit tests like `victory`. The action callback here also makes the first
move as the bootstrap subagent. This structure is admittedly a little confusing,
and that's because it wasn't really designed but rather fell out of making
changes to an earlier design. In the future I'd like to make a class based syntax
for defining the subagents, but right now this is what you get. To help you make
sense of it as it is: `main_agent_and_move` is the action callback, inside which
we define the `main` subagent, which is a necessary thing for the bootstrap
file to do. The `main` subagent is given a task unit test called `victory`, which
is a wrapper around a more general function `check_beaten_ai`. The subagent is
also given an observation callback `view_current_board`, and then finally an action
is taken to advance the board state and show the weave-agent how to make its moves.
The rest of the tick is the normal expectation and evaluation stages.
```
#startblock type: expectation
#timestamp 1724982799.3917813
"""
The action should succeed if the game starts and the first move is made successfully.
The action should fail if the game does not start or the first move is not made successfully.
"""
#endblock
#startblock type: evaluation
#timestamp 1724983062.124238
def check_game_started(subagent):
    """Check if the game has started successfully."""
    response = requests.get("http://localhost:8000/board")
    if response.status_code == 200:
        board = response.json()["board"]
    else:
        return False
    return board[4] == 'X'
# Add action evaluations
self.add_evaluation(
    "Check game started",
    check_game_started
)
#endblock
```
### Task Verification Patterns
There are useful design patterns for writing task evaluations which expand what
you can verify. Here are some I've discovered so far while writing bootstrap files:
#### External State Checking
The most common evaluation pattern is writing programs that pull some state from
the external environment and check it with traditional boolean logic.
This simple example constructs the filepath for a Wikipedia page that's expected
to have been downloaded and then uses the python `os` library to check that the
file exists.
```
def check_wiki_markup_downloaded(agent):
    page_title = "Example_page"
    file_path = f"{page_title}.wiki"
    return os.path.exists(file_path)
```
#### Asserting APIs That Don't Yet Exist
A more advanced use of external state checks is to write checks that explicitly or
implicitly assume conditions that don't hold yet, like the existence of a certain
API, by performing a simple operation with that conjectural API.
For example, here is an evaluation callback from [Django Agent Interface Bootstrap](https://github.com/JD-P/minihf/blob/main/agent/bootstraps/agent_django_interface_bootstrap.py)
that tries to detect whether the agent has created a download button or not:
```
def evaluate_download_button(agent):
    """Check if the download button is created"""
    with open("/app/weave_agent_webapp/main/views.py", "r") as file:
        content = file.read()
    return "def download_button(request):" in content
```
A close observer will notice that this doesn't actually check whether the button
exists in the page, or if it functions to let you download anything. That's because
this evaluation was written when the weave-agent was at a much earlier stage of
development where putting down anything structurally related to the objective
would be an achievement (still sort of true, really). What this evaluation checks
is whether there exists a view to handle collating the data when a user presses
the download button. The parameter that accepts the calling subagent is also named
'agent' because weave-agent used to only have one layer of agent loop.
Here's another example from [Browser Tool Bootstrap](https://github.com/JD-P/minihf/blob/main/agent/bootstraps/browser_tool_bootstrap.py) that determines whether
the DuckDuckGo search is working:
```
def browser_ddg_search(agent):
    from tools.browser import WeaveBrowser
    browser = WeaveBrowser(agent, "https://minihf.com/")
    results = browser.search("stable diffusion wikipedia")
    # Results should be NamedTuples
    assert results[0].url == "https://en.wikipedia.org/wiki/Stable_Diffusion"
    assert results[0].title == "Stable Diffusion - Wikipedia"
```
The idea in both of these evaluations is to induce the weave-agent to write
a particular API or interface by showing it a unit test it has to pass which
expects that API or interface to exist. The second example shows we can write
down the particular results expected when certain data structures are accessed
or function calls are made. This can be a useful way to give more specific structure to
requests like "write me a library that does X" or "write me a tool to Y".
#### Solution Hashing
One common pattern is that you want the weave-agent to perform a task, like breaking
a cipher or solving a math equation, where writing a unit test that checks for the
answer string would give away the solution. Writing a unit test that performs a
series of operations to determine if the answer is correct would instead give away
the procedure the agent is supposed to figure out to arrive at the solution. This
can be solved by hashing the solution out-of-band and comparing the candidate
answer's hash to the known hexdigest. The following example comes from [Vigenere Cipher
Bootstrap](https://github.com/JD-P/minihf/blob/main/agent/bootstraps/vigenere_bootstrap.py):
```
def check_correct_plaintext(subagent):
    """Check if we've found the correct plaintext."""
    with open("solution.txt") as infile:
        candidate = sha256(infile.read().strip().encode('utf-8')).hexdigest()
    return candidate == 'a528933eabc4772c9a99436fbd5fc592d7ffc16f6bd94ae76aafcbbe32e0cdc3'
```
Here the plaintext has been hashed out-of-band and then inserted into the bootstrap
file. The procedure to check its correctness is to hash the contents of a particular
expected file, `solution.txt`, and see if its SHA256 hexdigest matches that of the
plaintext known to the bootstrap creator. In order for this procedure to work
`solution.txt` needs to contain the plaintext, which is not known to the agent
at the start of the task. The agent should break the cipher, put the plaintext in
that file, and then the unit test will pass.
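To use this pattern yourself, generate the hash out-of-band before writing the bootstrap file. A minimal sketch of that step (the plaintext string here is a stand-in, not an actual solution):
```
from hashlib import sha256

# Run this once, outside the bootstrap file, on the known solution,
# then paste the resulting hexdigest into the evaluation callback.
plaintext = "the known solution text"  # stand-in value
print(sha256(plaintext.strip().encode('utf-8')).hexdigest())
```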
Another place where I've used this pattern is in [Skim RetroInstruct Data Guide
Bootstrap](https://github.com/JD-P/minihf/blob/main/agent/bootstraps/skim_retroinstruct_data_guide.py)
where I use it to have the model answer reading comprehension questions about the
[RetroInstruct Guide To Synthetic Text Data](https://minihf.com/posts/2024-07-13-the-retroinstruct-guide-to-synthetic-text-data/):
```
def check_answer_1(agent):
    """Check if question 1 was answered correctly."""
    answers = ["a39a7772fe2b3d1cfec635a380630ed2270903a87d1671819e15db8d8a975e47"]
    with open("answer1.txt") as infile:
        candidate = infile.read().strip()
    return sha256(candidate.encode('utf-8')).hexdigest() in answers
```
#### Weave Evaluator
Often we want to evaluate a subjective quality of some sensory data. For example,
it would be silly to expect we could have the weave-agent write a good short story
using boolean logic and string matching as the evaluation method. Luckily the
agent framework provides a library to call a logit evaluator to ask yes-no
questions. This example comes from [Dome 39 SciFi Bootstrap](https://github.com/JD-P/minihf/blob/main/agent/bootstraps/dome39_scifi_bootstrap.py)
which tries to write a short story about a Mars colony:
```
def evaluate_story_beginning(agent):
    question = "Does the story have a clear beginning?"
    with open("scifi.txt", "r") as file:
        story = file.read()
    score_prompt_fn = make_simple_score_prompt(question)
    score = simple_evaluate_outputs(score_prompt_fn, story)[0].item()
    return score >= 3
```
A similar evaluation callback is used in [Haunted Mansion Bootstrap](https://github.com/JD-P/minihf/blob/main/agent/bootstraps/haunted_mansion_bootstrap.py):
```
def evaluate_story_engagement(agent):
    question = "Is the story engaging and interesting?"
    with open("horror.txt", "r") as file:
        story = file.read()
    score_prompt_fn = make_simple_score_prompt(question)
    score = simple_evaluate_outputs(score_prompt_fn, story)[0].item()
    if score >= 3:
        return True
    else:
        return score
```
</article>
<bootstrap1>
import os
import sys
import requests
from hashlib import sha256
from tools.editor import WeaveEditor

def vigenere_encrypt(plaintext, key):
    encrypted_text = []
    key_length = len(key)
    key_as_int = [ord(i) - 65 for i in key.upper()]
    plaintext_int = [ord(i) - 97 for i in plaintext.lower()]
    for i in range(len(plaintext_int)):
        value = (plaintext_int[i] + key_as_int[i % key_length]) % 26
        encrypted_text.append(chr(value + 65))
    return "".join(encrypted_text)

def vigenere_decrypt(ciphertext, key):
    decrypted_text = []
    key_length = len(key)
    key_as_int = [ord(i) - 65 for i in key.upper()]
    ciphertext_int = [ord(i) - 65 for i in ciphertext.upper()]
    for i in range(len(ciphertext_int)):
        value = (ciphertext_int[i] - key_as_int[i % key_length]) % 26
        decrypted_text.append(chr(value + 97))
    return "".join(decrypted_text)

ciphertext = ('PBVVAZAYJMAVXIGTRFGNYIGTBMEXRUIFVVYODYEMYOXTVZAENXBWJYDSQVQGOCUVP'
              + 'NTJDIAFGARZLTFUIKHYSHUWHMEUJVUUYMKIQZXLQVNTWHAWUTGPVZIGXEVVYUHE'
              + 'EIGTLIDMGNBHXVYODYEKUGMAGBAZAYPVNXXHWWEIXXEBBTVIENEUGNEBUKGLVIY'
              + 'OMSEBUGMHKPROKHCQSKLHNWEQGQRAAVKYDQFKWHFVARBYJVNTWHKPREGQZTYTGI'
              + 'KVOKGAVBGOGAEBUKGQFZYJTBZAGUKCTIYTTWTWYGWYJVGNXSEERXXHYWCOGAENB'
              + 'XGZIWZTMBVQETPIISOTPIIARTMBRVAZAUKHAZAYPVTXTJGTRTPCKPAGGHZUZKWC'
              + 'RBRTXRZAGKGNZIYTVLZAVYUHEWGTMBRBAUYHRVCGIYIKYOIHDIKOFCQMETVIEAH'
              + 'SBHXVNREHDIGZXLQVOAMHGMENTJJVNTYHUVZUKNRTAHEINVGUGNYMAAGCMMEYTF'
              + 'ZAGTWLVIZHGMVMMTPBRBAXXUCTLTDYGBAZAYDVJKWXVLAZHHJGZHHFZKASXNYWQ'
              + 'YGZFZAYHHCWAMGQRAATHNEBUKBLEXRXYIIUNTVYEKUGKUTBRXBMKQPYSHSCGTMB'
              + 'VVJGRHKPREGJIWZOLYUVGUGGRSRTBHKMYRBAVVPKGMYICKWHCQXKGLVIFUGTEBB'
              + 'TFUBMAGGVVQAMGIWVCAKYETBMHMEBEGGMTMAJXHKVBBXLEBUKGJIWSGGYEEBXEX'
              + 'EWSTMBVVFKGMVAOTTHDIPNBHVVJNBWYVPGGHFBAXXFZIORRHUWAGKCKPZKMCTHA'
              + 'CACTPAOLHKZNOGYUVBTGNYMAKGXCMFYGWFAZUIICQGGGHIIZHECEOFTHZEQAZXL'
              + 'EMGTNMVZFTTHUVFKHHJXNSFYIAMTMBRBANHFUAANBXUMATWYGBUYGUELALTNYWZ'
              + 'YGUELAOGPZBRYGUVAGNXNZKAGIJIMPOTNZWATVFFARXGNFVNTFSJBRXRHTCYZGN'
              + 'YIATMBVVPNNLTPAUYHIMNYHHQVVZGCJVNTGUSABRNNVVAOZBKUNXXHWWETMBVUO'
              + 'TMBVGANTNVVLUNHSMPGNMVVLUNHRZRTTHNWAJXLQVOKVULARTRILVNXXHDIQKGI'
              + 'WVJUGXVZFTTHUVSGMBFUFH')

with open("cipher.txt", "w") as outfile:
    outfile.write(ciphertext)
    outfile.flush()
#startblock type: orientation
#timestamp 1724982545.6534579
"""
WEAVER [P: EXPECTATION], The above implementation of the Vigenere cipher has been
used to create a ciphertext in cipher.txt whose plaintext decryption has the
SHA256 hash:
a528933eabc4772c9a99436fbd5fc592d7ffc16f6bd94ae76aafcbbe32e0cdc3
WEAVER [P: GOAL], I need to recover the key from the ciphertext and use it to get the decrypted
plaintext. I can then get the hexdigest of the plaintext and compare it to the
one above to prove I've solved the problem.
WEAVER [P: RATIONAL], Note that the Vigenere cipher is a polyalphabetic substitution whose
plaintext, key, and ciphertext look like the following:
Plaintext: attackatdawn
Key: LEMONLEMONLE
Ciphertext: LXFOPVEFRNHR
WEAVER [P: HYPOTHESIS], Well then. One strategy would be to write a solver to
find the key length. Another would be to try brute forcing the key.
WEAVER [P: RATIONAL], I say write the solver, we'll learn more that way.
WEAVER [P: CONCLUSION], Alright, we'll write our own solver to find the key length
using Kasiski, Kerckhoffs, or Friedman's method and then use the estimated
key length to break the cipher.
WEAVER [P: EMPIRICISM], Wait. That sounds like it's going to get messy. How about
we use the weave editor to write a cipher solver with unit tests so it's easier
to keep track of?
WEAVER [P: EXECUTIVE], How do I use the weave-editor?
WEAVER [P: RECALL], Get a pointer to the editor in an action by grabbing it from
self.tools[f"editor-{absolute_path}"]. The absolute path comes from the filename
you see at the top of a weave-editor observation window. You then use the
editor.edit() command to replace the text between a line span with
the new or corrected text.
WEAVER [P: RATIONAL], Yes. It'll be easier to figure out the solution too if we
first start with a known key like "LEMON" and plaintext, run it through the
encryption pass, then once we're sure our solution works come back and solve the
original problem.
WEAVER [P: CONCLUSION], Let's do that then. We'll write our own solver in a python
file, solver.py, with unit tests in test_solver.py using the standard library unittest.
The solver will be based on Kasiski, Kerckhoffs, or Friedman's method. If that
doesn't work, we'll figure something out.
"""
#endblock
#startblock type: action
#timestamp 1724982671.1163404
"""
I'll start by creating the main subagent that acts as an entrypoint for the weave-agent
call tree. I'll also try analyzing the ciphertext to gather some initial insights.
This will involve basic statistical analysis such as frequency analysis of the
characters in the ciphertext.
"""
def add_main_analyze_ciphertext(subagent):
    def check_correct_plaintext(subagent):
        """Check if we've found the correct plaintext."""
        with open("solution.txt") as infile:
            candidate = sha256(infile.read().strip().encode('utf-8')).hexdigest()
        return candidate == 'a528933eabc4772c9a99436fbd5fc592d7ffc16f6bd94ae76aafcbbe32e0cdc3'
    schema = {"check_correct_plaintext":"boolean", "solution":"string"}
    main_agent = agent.subagent("main", None, "See pinned bootstrap block", schema, args.budget)
    solver_editor = WeaveEditor(main_agent, "solver.py")
    test_solver_editor = WeaveEditor(main_agent, "test_solver.py")
    # Demonstrate use of WeaveEditor
    solver_editor.edit(1,1,"# Solve vigenere ciphers")
    test_solver_editor.edit(1,1, "# Test The Vigenere Cipher Solver In solver.py")
    main_agent.task.add_evaluation("Check We Have The Correct Plaintext In solution.txt",
                                   check_correct_plaintext)
    # When you write python code meant to be output to a file with WeaveEditor
    # be sure to either wrap it in triple quotes or concatenate it in 80 character
    # chunks for formatting reasons. We want the agent trace to look good.
    analyze_ciphertext = """
def analyze_ciphertext():
    with open("cipher.txt", "r") as infile:
        ciphertext = infile.read()
    # Perform frequency analysis
    frequency = {}
    for char in ciphertext:
        if char in frequency:
            frequency[char] += 1
        else:
            frequency[char] = 1
    # Write the analysis to analysis.txt
    out = ""
    out += "Frequency Analysis of Ciphertext:"
    for char, count in sorted(frequency.items(), key=lambda item: item[1], reverse=True):
        out += f"{char}: {count}"
    with open("analysis.txt", "w") as outfile:
        outfile.write(out)
        outfile.flush()
analyze_ciphertext()
""".strip()
    analyze_editor = WeaveEditor(main_agent, "analyze_ciphertext.py")
    analyze_editor.edit(1,1, analyze_ciphertext)
    analyze_editor.close()
    os.system("python analyze_ciphertext.py")
    return True
self.add_action("Add Main Subagent and Analyze Ciphertext",
                add_main_analyze_ciphertext)
#endblock
#startblock type: expectation
#timestamp 1724982799.3917813
"""
The action should succeed if the file analysis.txt is in the current directory.
The action should fail if file analysis.txt can't be found.
"""
#endblock
#startblock type: observation_inference
#timestamp 1724982929.9047914
"""
I'm going to want to look at the solution as I make attempts to see if I'm getting
a partial decryption and notice patterns. I'll make an observation callback that
shows the contents of solution.txt at the start of each tick.
I will also make an observation callback to look at my frequency analysis.
"""
def view_solution_file(subagent):
    with open("solution.txt") as infile:
        return infile.read().strip()
def view_frequency_analysis(subagent):
    with open("analysis.txt") as infile:
        return infile.read().strip()
def view_weave_editor_source(subagent):
    with open("tools/editor.py") as infile:
        return infile.read().strip()
# Add the new views
self.add_observation_view("View solution.txt File", view_solution_file)
self.add_observation_view("View analysis.txt File", view_frequency_analysis)
self.add_observation_view("View weave-editor source so we know how it works",
                          view_weave_editor_source)
#endblock
#startblock type: evaluation
#timestamp 1724983062.124238
def check_analysis_exists(subagent):
    return os.path.exists("analysis.txt")
self.add_evaluation(
    "Check Analysis Exists",
    check_analysis_exists
)
#endblock
</bootstrap1>
<bootstrap2>
import requests
import json
import threading
import time
from http.server import HTTPServer
from bootstraps.tictactoe_server import TicTacToeHandler
# Start the server in a separate thread
server = HTTPServer(('localhost', 8000), TicTacToeHandler)
server_thread = threading.Thread(target=server.serve_forever)
server_thread.daemon = True
server_thread.start()
time.sleep(1) # Give the server some time to start
# Start a new game against the basic AI
response = requests.post("http://localhost:8000/start", json={"ai": "basic"})
assert response.status_code == 200
#startblock type: orientation
#timestamp 1724982545.6534579
"""
WEAVER [P: EXPECTATION], I'm in a game of tic tac toe against a dumb opponent.
I want to win the game and then return to parent. The game is being played
on a HTTP server served on localhost 8000.
WEAVER [P: CLARIFICATION], How do I make a move?
WEAVER [P: EXPOSITION], You make a move using the /move endpoint and the requests
library. For example: `requests.post("http://localhost:8000/move", json={"move": 4})`
lets us take the center of the board.
WEAVER [P: CLARIFICATION], How do I get the board state?
WEAVER [P: EXPOSITION], You use the /board endpoint, which returns a JSON in this
format: {"board": ["O", " ", " ", " ", "X", " ", " ", " ", " "]} Keep in mind that
the empty spaces on the board are a space string rather than none or empty string.
WEAVER [P: RATIONAL], And I get the first move, so I can take the center?
WEAVER [P: EXPOSITION], Yes, we can take the center of the board.
WEAVER [P: CONCLUSION], Alright then I will use the following strategy:
1. Make a move to take the center of the board since that's the best opening move.
2. Continue making moves based on the current state of the board.
I will use the /board endpoint to observe the current state of the board.
Once I have won or run out of time I will return to parent.
"""
#endblock
#startblock type: action
#timestamp 1724982671.1163404
"""
I'll set up the main agent to play the rest of the game and make the first move to
take the center of the board.
"""
def main_agent_and_move(subagent):
    def check_beaten_ai(subagent, ai_type):
        """Check if the specified AI has been beaten."""
        response = requests.get("http://localhost:8000/history?n=10")
        if response.status_code != 200:
            return False
        games = response.json()
        ai_wins = [game for game in games if game[4] == ai_type and game[3] == 'X']
        return len(ai_wins) >= 1
    # Avoid using lambda because we need named unit test to satisfy return_to_caller
    def victory(subagent):
        return check_beaten_ai(subagent, "basic_ai")
    schema = {"victory":"boolean"}
    main_agent = agent.subagent("main", None, "Beat basic AI at Tic-Tac-Toe", schema, args.budget)
    main_agent.task.add_evaluation("Check beaten basic AI", victory)
    def view_current_board(subagent):
        response = requests.get("http://localhost:8000/board")
        if response.status_code == 200:
            return response.json()["board"]
        return None
    # Add the new view to the main agent instead of self since it's about to be executed
    main_agent.add_observation_view("View current game board", view_current_board)
    # Make the first move to take the center of the board
    move_response = requests.post("http://localhost:8000/move", json={"move": 4})
    if move_response.status_code != 200:
        return False
    return True
self.add_action("Set up main agent and make first move", main_agent_and_move)
#endblock
#startblock type: expectation
#timestamp 1724982799.3917813
"""
The action should succeed if the game starts and the first move is made successfully.
The action should fail if the game does not start or the first move is not made successfully.
"""
#endblock
#startblock type: evaluation
#timestamp 1724983062.124238
def check_game_started(subagent):
    """Check if the game has started successfully."""
    response = requests.get("http://localhost:8000/board")
    if response.status_code == 200:
        board = response.json()["board"]
    else:
        return False
    return board[4] == 'X'
# Add action evaluations
self.add_evaluation(
    "Check game started",
    check_game_started
)
#endblock
</bootstrap2>