urwasm-jetting-2.md

Part 1

In this gist I will describe the Lia interpreter model in greater detail. As an addendum to the first part, I'd like to point out one more advantage of jetting another interpreter on top of Wasm, one that takes a list of actions to be performed, as opposed to jetting just the invocation gate.

In both Vere and Ares, at least right now, a jetting function that is called instead of performing Nock 9 on a jetted core requires that the core not be modified. Therefore, the jetting function may only read from the input core and must allocate new memory for its results.

Since the ++invoke gate takes the module store, including linear memory, as input and returns it as output, any Wasm function invocation from Urbit would require copying the entire store. That overhead might be prohibitive for some memory-heavy applications of Wasm, like emulating x86, and is something to keep in mind even for smaller cases.

On the other hand, interacting with an instantiated module in the case of ++invoke jetting is straightforward: the state of the module is a noun, and to interact with the state we would call various jetted gates that invoke functions and perform I/O on the store. Handling import function calls is also straightforward: ++invoke might return a blocked result with the name of the imported function, which is then resolved in the embedding context, modifying the state of the module if necessary. How would the same be achieved in a stateless fashion?

Let's return to the Lia interpreter, this time with some additions:

++  lia
  |=  $:  module=octs
          actions=(list action)
          shop=(list (list value))
          ext-func=(map (pair cord cord) (list action))
          diff=(each (list action) (list value))
      ==
  ^-  $%  [%0 out=(list value)]
          [%1 name=@tas args=(list value)]
          [%2 ~]
      ==
  =.  +<
    ?:  ?=(%.y -.diff)
      +<(actions (weld actions p.diff))
    +<(shop (snoc shop p.diff))
  ::  (...)

Caching of the store

What is the purpose of the diff field in the sample? Wasm execution is deterministic (it must be if we want to run it on Urbit; the formal specification defines some nondeterministic operations, but the Hoon code serves as a deterministic specification for a subset of the possible behaviors of Wasm). The state of a module is therefore referentially transparent with regard to the input parameters of Lia:

module, the module binary file;
actions, the list of actions to be performed on the module. Instantiation is implicit; the actions include function invocation and I/O on the state, as well as variable declaration and assignment and other necessary logic such as if branches and for loops, for expressivity;
shop, the list of values obtained from resolved Lia blocks. Note that these values resolve Lia, not Wasm, being blocked on an unresolved import; more on this later;
ext-func, the definitions of import functions for Wasm. In the trivial case the list of actions would contain a single call to a function which is external to Lia and is named with a @tas, not a (pair cord cord) like Wasm import functions. In a nontrivial case the list could contain multiple actions, e.g. a memory read followed by a call to a Lia import function which takes octs. The purpose here is for import calls to be able to surface not only Wasm values but also Lia values like octs, giving us the richness of import calls that we could have with the ++invoke jetting model without exposing the entirety of the module's state.

Here diff appears to be semantically useless, since instead of placing changes there we could have placed them directly into the proper field of the sample. But diff is necessary for efficient computation: each time the ++lia jet computes something with a Wasm runtime, it saves a cache of the Lia interpreter in C/Rust, tagged with (a hash of) the noun [module actions shop ext-func]. If ++lia is computed later with the same first four arguments, the jet first looks for a cache, and if it finds one, it computes only the diff: either injecting the results of Lia import resolution into the suspended jet-side Lia interpreter, or performing the list of appended actions. Failing to find a cache, the jet of ++lia does the same thing as ++lia in Hoon: it appends the diff to the appropriate field and runs the whole thing.
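To make the lookup concrete, here is a minimal model of the cache logic in Python. It is only a sketch: the real jet would live in C or Rust and hold an actual Wasm runtime, while here the "interpreter state" is a plain dict of counters. All names (LiaJetModel, cache_key, fold_diff) are hypothetical, and moving a cache entry to its new key after applying the diff is an assumption of this sketch, not something the design above prescribes.

```python
import hashlib


def cache_key(module, actions, shop, ext_func):
    """Model of hashing the noun [module actions shop ext-func]."""
    canon = (
        module,
        list(actions),
        [list(values) for values in shop],
        sorted((name, list(acts)) for name, acts in ext_func.items()),
    )
    return hashlib.sha256(repr(canon).encode()).hexdigest()


def fold_diff(actions, shop, diff):
    """Mirror of the `=. +<` step in ++lia: append diff to the matching field."""
    kind, payload = diff
    if kind == "actions":  # the %.y case: more actions to perform
        return list(actions) + list(payload), list(shop)
    # the %.n case: values that resolve a Lia block
    return list(actions), list(shop) + [list(payload)]


class LiaJetModel:
    """Hypothetical jet-side cache; the state dict stands in for a live runtime."""

    def __init__(self):
        self.cache = {}

    def call(self, module, actions, shop, ext_func, diff):
        new_actions, new_shop = fold_diff(actions, shop, diff)
        # Look the cache up by the first four arguments, before folding diff.
        state = self.cache.pop(cache_key(module, actions, shop, ext_func), None)
        if state is None:
            # Miss: same as the Hoon code, run everything from scratch.
            state = {"actions_run": len(new_actions),
                     "blocks_resolved": len(new_shop)}
            path = "full"
        else:
            # Hit: only the diff is computed on the suspended interpreter.
            kind, payload = diff
            if kind == "actions":
                state["actions_run"] += len(payload)
            else:
                state["blocks_resolved"] += 1
            path = "incremental"
        # The interpreter now reflects the folded sample, so store it under that key.
        self.cache[cache_key(module, new_actions, new_shop, ext_func)] = state
        return path
```

In this model a caller that keeps passing the old fields together with the same diff misses the cache on the repeated call, because the entry has already moved to the key of the folded sample.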

At the price of having to bother with cache reclamation, and perhaps slightly longer event log replays, we get the possibility of running the Wasm runtime at almost full speed, without having to serialize nouns to structs and back or to copy the state of the Wasm module for every interaction with it.

A typical interaction with ++lia would look like this:

Call ++lia for the first time, with an empty diff.
Either Lia returns a success %0 or a failure %2, or:
1. It returns a block %1, which is resolved outside of Lia.
2. The resolution result is placed into diff, and ++lia is called again. On subsequent calls that same result must be placed directly into shop to hit the right cache and avoid recalculating from scratch.
If the call succeeded, new actions can be put into diff and ++lia can be called again to continue interacting with the module. On subsequent calls these new actions must be placed directly into actions to hit the right cache.
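The steps above can be sketched as a small driver loop. This is a Python model, not Hoon, and everything in it is hypothetical: toy_lia is a stand-in with made-up "push" and "ask" actions, results are modeled as tuples tagged 0, 1 or 2 for %0, %1 and %2, and the empty diff is modeled as the actions case with an empty list (an empty values diff would append an empty entry to shop and so change the sample).

```python
def fold(actions, shop, diff):
    """Append diff to the matching field, as the `=. +<` step in ++lia does."""
    kind, payload = diff
    if kind == "actions":
        return list(actions) + list(payload), list(shop)
    return list(actions), list(shop) + [list(payload)]


def toy_lia(module, actions, shop, ext_func, diff):
    """Toy stand-in for ++lia: ("push", v) emits v, ("ask", name) needs a shop entry."""
    actions, shop = fold(actions, shop, diff)
    pending = list(shop)
    out = []
    for op, arg in actions:
        if op == "push":
            out.append(arg)
        elif op == "ask":
            if not pending:
                return (1, arg, [])       # block: import `arg` is unresolved
            out.extend(pending.pop(0))    # consume the next resolved block
        else:
            return (2,)                   # failure
    return (0, out)                       # success


def drive(lia, module, actions, ext_func, resolve):
    """Call lia until it stops blocking, folding each sent diff into the sample."""
    shop = []
    diff = ("actions", [])                # empty diff for the first call
    while True:
        result = lia(module, actions, shop, ext_func, diff)
        # What was just sent as diff goes into the proper field from now on,
        # so that the next call carries the same first four arguments as the cache tag.
        actions, shop = fold(actions, shop, diff)
        if result[0] != 1:
            return result, actions, shop
        _, name, args = result
        diff = ("values", resolve(name, args))


program = [("push", 1), ("ask", "now"), ("push", 2)]
result, actions, shop = drive(toy_lia, b"", program, {}, lambda name, args: [42])
# result is (0, [1, 42, 2]) and shop is [[42]]
```

After a success, the caller can continue with something like `toy_lia(b"", actions, shop, {}, ("actions", [("push", 3)]))`, passing the folded actions and shop returned by drive.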

As for cache reclamation, ++lia could perhaps take another input parameter, hint, which is ignored by the Hoon code and tells the jet how to handle the cache, e.g. how long to keep it in memory. Dojo generators and Gall agents, for example, would probably benefit from different cache reclamation strategies.
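One possible reading of such a hint is a time-to-live per cache entry. The Python sketch below models only that one strategy; the class name, the keep_for parameter and the explicit now argument are all assumptions made for illustration, and a real jet might reclaim by memory pressure or event count instead. Dropping an entry is always safe in this scheme, since a cache miss just makes the jet fall back to running the whole sample like the Hoon code does.

```python
class HintedCache:
    """Hypothetical jet-side cache where the hint is a lifetime for the entry."""

    def __init__(self):
        self._entries = {}  # key -> (expires_at, interpreter state)

    def put(self, key, state, now, keep_for):
        # keep_for models the hint: how long this entry should stay in memory.
        self._entries[key] = (now + keep_for, state)

    def get(self, key, now):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, state = entry
        if now >= expires_at:
            del self._entries[key]  # expired: the caller recomputes from scratch
            return None
        return state

    def sweep(self, now):
        """Drop every expired entry and return how many were reclaimed."""
        dead = [k for k, (expires_at, _) in self._entries.items()
                if now >= expires_at]
        for k in dead:
            del self._entries[k]
        return len(dead)
```

Under this reading, a Dojo generator might pass a short keep_for and a long-running Gall agent a long one.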

Quodss commented Jan 25, 2024