
@dhil
Created September 25, 2024 12:53
dhil/wasm-stack-switching branch `wasm-3.0` diff `WebAssembly/spec` branch `wasm-3.0`
$ git remote -vv
origin [email protected]:dhil/wasm-stack-switching.git (fetch)
origin [email protected]:dhil/wasm-stack-switching.git (push)
spec [email protected]:WebAssembly/spec.git (fetch)
spec [email protected]:WebAssembly/spec.git (push)
upstream [email protected]:WebAssembly/stack-switching.git (fetch)
upstream [email protected]:WebAssembly/stack-switching.git (push)
$ git diff spec/wasm-3.0
diff --git a/README.md b/README.md
index cbc17b46..fde9c1c7 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,34 @@
-[![CI for specs](https://github.com/WebAssembly/spec/actions/workflows/ci-spec.yml/badge.svg)](https://github.com/WebAssembly/spec/actions/workflows/ci-spec.yml)
-[![CI for interpreter & tests](https://github.com/WebAssembly/spec/actions/workflows/ci-interpreter.yml/badge.svg)](https://github.com/WebAssembly/spec/actions/workflows/ci-interpreter.yml)
+# Stack-Switching Proposal for WebAssembly
+
+This repository is a clone of [`WebAssembly/spec`](https://github.com/WebAssembly/spec/). It is meant for discussion, prototype specification, and implementation of a proposal to add
+support for stack-switching.
+
+See the [explainer](proposals/stack-switching/Explainer.md) for a high-level summary of the proposal.
+
+## Previous proposals
+
+The current explainer represents the unification of two previous proposals: Typed Continuations (wasmfx) and Bag of Stacks (bos). (The explainers have now been unified. Once the reference interpreter and examples are adapted for the unified proposal this section will be removed from the README.)
+
+#### Typed Continuations
+
+* See the [explainer](proposals/continuations/Explainer.md) for a high-level summary of the proposal.
+
+* See the [overview](proposals/continuations/Overview.md) for a more formal description of the proposal.
+
+* An [implementation](https://github.com/WebAssembly/stack-switching/tree/wasmfx) is available as an extension to the reference interpreter. It is accessible from the `wasmfx` branch of this repository.
+
+* See the [examples](proposals/continuations/examples) for Wasm code for implementing various different features including lightweight threads, actors, and async/await.
+
+#### Bag of Stacks Proposal
+
+* See the [explainer](proposals/bag-o-stacks/Explainer.md) for a high-level summary of the proposal.
+
+Original README from upstream repository follows.
+
+--------------------------------------------------------------------------------
+
+[![CI for specs](https://github.com/WebAssembly/stack-switching/actions/workflows/ci-spec.yml/badge.svg)](https://github.com/WebAssembly/stack-switching/actions/workflows/ci-spec.yml)
+[![CI for interpreter & tests](https://github.com/WebAssembly/stack-switching/actions/workflows/ci-interpreter.yml/badge.svg)](https://github.com/WebAssembly/stack-switching/actions/workflows/ci-interpreter.yml)
# spec
diff --git a/document/core/conf.py b/document/core/conf.py
index eab24731..307d212e 100644
--- a/document/core/conf.py
+++ b/document/core/conf.py
@@ -66,10 +66,10 @@ editor = u'Andreas Rossberg (editor)'
logo = 'static/webassembly.png'
# The name of the GitHub repository this resides in
-repo = 'spec'
+repo = 'stack-switching'
# The name of the proposal it represents, if any
-proposal = ''
+proposal = 'stack-switching'
# The draft version string (clear out for release cuts)
draft = ' (Draft ' + date.today().strftime("%Y-%m-%d") + ')'
diff --git a/proposals/bag-o-stacks/Explainer.md b/proposals/bag-o-stacks/Explainer.md
new file mode 100644
index 00000000..4e3eb263
--- /dev/null
+++ b/proposals/bag-o-stacks/Explainer.md
@@ -0,0 +1,831 @@
+# Stack Switching Coroutines
+
+## Motivation
+
+Non-local control flow (sometimes called _stack switching_) operators allow applications to manage their own computations. Technically, this means that an application can partition its logic into separable computations and can manage (i.e., schedule) their execution.[^nothreads] Pragmatically, this enables a range of control flows that include supporting the handling of asynchronous events, supporting cooperatively scheduled threads of execution, and supporting yield-style iterators and consumers.
+
+[^nothreads]: In this proposal we are solely concerned with _single threaded_ execution, otherwise known as _cooperative scheduling_: applications that use coroutines have explicit events that mark changes of control between managed computations. In effect, the cooperative coroutines are all _multiplexed_ onto a single thread of computation.
+
+This proposal describes those features of the core WebAssembly virtual machine that are needed to support non-local control flow; it does not attempt to preclude any particular strategy for implementing high-level language features. We also aim for a deliberately low-level, minimal spanning set of concepts: the author of a high-level language compiler should find all the elements necessary for their compiler to generate appropriate WebAssembly code fragments.
+
+## A bag of stacks approach
+
+Informally, the approach outlined in this proposal is aligned with the 'bag of stacks' style: the engine allows many different computations to be modeled but does not establish any pairwise relationship between them. I.e., there is no parent/child relationship between any two computations.
+
+There are two main reasons for adopting this style:
+
+* Requiring the engine to maintain parent/child relationships implies, in many instances, maintaining and checking properties that are potentially onerous and do not significantly enhance the safety of the application. For example, in this design, the engine does not have to search when switching computations: it is the responsibility of the language provider to ensure that the target is always directly known when switching between coroutines. Similarly, when suspending a computation, the engine does not need to dynamically verify that the subject of suspension is legitimate.[^types]
+
+[^types]: However, type safety is guaranteed: a switch to another computation requires a reference to that computation; and that reference is statically verified for type safety.
+
+* Given that WebAssembly is not a source language but a target language for a wide variety of programming languages, the fewer the embodied concepts in a design for coroutining, the less likely it will be for a given programming language provider to encounter semantic difficulties. For example, if a programming language does not itself incorporate a notion of a suspend/resume relationship between coroutines, then adding such a relationship to this design results in ignored features at best and additional complexity and performance penalty at worst.
+
+At the same time, it should be noted, this style is significantly lower level than other possible designs. This means that most language providers will have to make more of an effort to map their language features onto the design. The tradeoff is that this work is not fighting design decisions embodied in the proposal, and that we do not require features that language providers can't or won't use.
+
+## Terminology
+
+Selecting an appropriate terminology to model 'computations that can manage themselves' is challenging. Most terms are either slightly off the mark or so over-used as to have become meaningless. (The term _computation_ is an example of the latter.) In this proposal we standardize certain nomenclature to aid readability:
+
+* Coroutine. A coroutine is a _computation_ that is under the active management of the application itself. This means that the coroutine can be started, paused[^selfPause] and stopped; however, it does not mean that the coroutine is running in parallel: true parallel or multi-threaded execution is beyond the scope of this design.
+
+[^selfPause]: Technically, a coroutine can be started, but must pause or stop itself: it is not possible for an 'outsider' to stop a coroutine.
+
+* Coroutine function. A coroutine function is a _function_ that denotes what the overall computation of a coroutine is. When a coroutine is started, the coroutine function is entered, and when the coroutine function terminates the coroutine is no longer available for execution.
+
+* Event. An event is an _occurrence_ that is of interest to some observer.
+
+* Event description. An event description is a _data value_ (or collection of data values) that the application deems is important to describe the event.
+
+* Stack. A stack is a _resource_ that may be used within the implementation of an engine to support some of the features of WebAssembly. Stacks are used to represent the active frames of a computation (including coroutines), some of the local variables used and so on. We often use the term _switching stacks_ to imply switching between coroutines.
+
+* Stack switch. A stack switch is an _event_ that is associated with the transfer of active execution from one coroutine to another. Stack switch events are typically also associated with event descriptions that encode the reason for the switch.
+
+## Stacks, Coroutines and Stack references
+
+Executing coroutines require internal resources that enable the execution system to keep track of their execution. These resources include the memory needed to represent local variables, arguments and function calls as well as the execution status of the coroutine. I.e., machines need a _stack resource_ to hold the state of execution. In this proposal, we do not expose stacks as first class entities; however, we do model suspended computations with a _stack reference_.
+
+### The coroutine abstraction
+
+Computations have an extent in time and space: they are initiated, execute instructions, and eventually terminate. A _coroutine_ is a computation that is potentially addressable directly by the application itself. A coroutine is analogous to a function call, with the additional potential for being _suspended_, _resumed_ and for suspended coroutines[^susponly] to be referenced as the value of an expression.
+
+[^susponly]: We do not allow the actively executing coroutine to be explicitly referenced.
+
+#### The state of a coroutine
+
+In addition to storage for variables and function calls, particularly when describing the operations that can be applied to coroutines, it is useful to talk about the coroutine's execution status:
+
+* The `suspended` state implies that the coroutine has suspended execution. It may be resumed; but until then, the coroutine will not be executing any instructions.
+
+* The `active` state implies that the coroutine is currently executing; and that it is _the_ currently active coroutine. There is always exactly one active coroutine in a single threaded WebAssembly application.
+
+* The `moribund` state implies that the coroutine has terminated execution and cannot perform any additional computation -- attempting to resume a moribund computation will result in an execution trap. Any stack resources previously associated with the moribund coroutine may have been released.
+
+The status of a coroutine is not directly inspectable by WebAssembly instructions.
+
+Only the `active` coroutine may be suspended, and only coroutines that are in the `suspended` state may be resumed. Attempting to resume a `moribund` coroutine will result in a trap.
+
+We should note here that terms such as _resuming_ or _suspending_ are meant somewhat informally. The only operation that this proposal supports is _switching_ between coroutines: suspending and resuming are simply informal names of usage patterns of switching.
+
+### Stack references
+
+A stack reference is a first-class value denoting a coroutine in a _suspended_ state. Associated with stack references is the `stack` type:
+
+```wasm
+(type $c (stack <params> <rt>))
+```
+
+where `<params>` are the types of values to be sent to the suspended computation as part of _waking it up_.
+
+The `<rt>` parameter is somewhat special: it is also a reference to a stack type; specifically, it should be the stack type of the currently executing coroutine. This is the type of the stack reference that is needed to switch back to the current coroutine.
+
+The return stack type must be of the form:
+
+```wasm
+(ref null? $c)
+```
+
+where `$c` is the index of a stack type.
+
+>This affects which instructions are legal: the `switch_retire` instruction passes a null stack as the return stack, whereas the regular `switch` instruction never passes a null stack.
+>
+>This, in turn, permits potential optimizations: null checks can be avoided in those cases where a null return stack is not permitted.
+
+For example, a stack that is expecting a pair of `i32` values and is expected to signal back a single `i32` would have the type signature `$cp` from the definition:
+
+```wasm
+(rec
+ (type $cp
+ (stack (param i32) (param i32) (ref $co)))
+ (type $co
+ (stack (param i32) (ref $cp))))
+```
+
+All stack references participate in such recursively defined groups of types. The reason is straightforward: when switching from one coroutine to another, the default expectation is that the computation will eventually switch back. In general, the collection of messages between coroutines forms a closed conversation governed by a particular use case.
+
+This recognizes that, in many cases, the communication patterns between coroutines are _asymmetric_: one coroutine expects event descriptions that fit one type and its partner coroutines expect a different form of event description.
+
+Stack references are single use: when used to switch to a coroutine the stack reference becomes invalid afterwards -- the engine is expected to trap if a stack reference is used twice.
+
+Stack references are created in two circumstances: when a coroutine is created and when a coroutine is switched from. Stack references are also consumed in two ways: when used to switch to a coroutine, the target stack reference is used, and when a coroutine finally completes no new stack reference for the returning coroutine is created (i.e., null is sent as the return stack).
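+
+As an informal illustration (reusing the `$cp`/`$co` types defined above; the function itself is ours, not part of the proposal), a stack reference must be replaced after every switch:
+
+```wasm
+(func $useOnce (param $target (ref $cp)) (result i32)
+  ;; this switch consumes $target; using the old reference again would trap
+  (switch $cp (i32.const 1) (i32.const 2) (local.get $target))
+  ;; when switched back to, we receive an i32 and a fresh (ref $cp)
+  (local.set $target) ;; store the fresh reference, replacing the stale one
+  ;; the received i32 remains on the stack as the result
+)
+```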
+
+#### Type safety in switching
+
+In order for a switch between coroutines to be valid, the type of the target stack reference must be consistent with the current state of the stack -- the value stack on the originating coroutine must be populated with the appropriate list of values corresponding to the parameters of the stack reference being used; in addition, the target coroutine must be _expecting_ the same set of values. These values are transferred during the switch. Both of these conditions can be verified at code validation time.
+
+Given this, we can statically verify that WebAssembly programs that switch between coroutines are guaranteed to observe type safety during the switch.
+
+#### Subtyping
+
+Like function types, stack types are contravariant in their parameters.
+
+```pseudo
+C |- stack t_1* rt_1 <: stack t_2* rt_2
+-- C |- t_2* rt_2 <: t_1* rt_1
+```
+
+`stack t_1* rt_1` is a subtype of `stack t_2* rt_2` iff:
+ - `t_2* rt_2` is a subtype of `t_1* rt_1`.
+
+The top type for stack references is `stack` and the bottom type is `nostack`. Like other bottom types, the nostack type is uninhabited.
+
+```pseudo
+absheaptype ::= ... | stack | nostack
+```
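+
+As an informal illustration of the contravariance rule above (the type names are ours): since `(ref $f) <: (ref func)` for any function type `$f`, a stack that accepts any function reference may be used where a stack expecting the more specific reference is required:
+
+```pseudo
+(type $general  (stack (param (ref func)) rt))
+(type $specific (stack (param (ref $f)) rt))
+
+(ref $f) <: (ref func)   =>   $general <: $specific
+```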
+
+### Life-cycle of a coroutine
+
+A coroutine is allocated in the suspended state using the `stack.new` instruction. The initial `switch` to the newly allocated coroutine performs the equivalent of a function call on the new stack resource. In addition to the arguments provided to the `switch`, an additional argument is provided that is a stack reference to the caller code -- the caller is suspended as a result of the `switch` instruction.
+
+During the normal execution of a coroutine, it is expected that it will switch to other coroutines using further `switch` instructions. It is only possible for a WebAssembly code to switch to a coroutine if the code has available to it the stack reference of the associated suspended coroutine.
+
+This direct access aspect implies that higher-level programming language features that rely on dynamic scoping must be realized using other facilities of WebAssembly. For one such approach, we refer the reader to [this proposal](dynamic scoping url).
+
+Eventually, the coroutine will be ready for termination; it signals this by switching to another coroutine -- using the `switch_retire` instruction. This instruction is a `switch` instruction, but it also causes the switching coroutine to become `moribund` and its associated computation resources to become available for release.
+
+Note that coroutine functions are _not_ permitted to return normally, nor are they permitted to abort by throwing exceptions. Returning from a coroutine, or allowing an exception to be propagated out, results in a trap.
+
+>The primary justification for this is that the control flow patterns of switching coroutines do not typically embody a reasonable logical relationship that can be utilized when returning results. For example, a scheduler is responsible for ensuring the execution of one or more coroutines; but, schedulers are not typically interested in the _result_ of the computations of the coroutines they manage. Instead, return results (normal or exceptional) would typically be communicated to another coroutine – using normal switch_retire instructions.
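+
+Putting the pieces together, the life-cycle can be sketched as follows (an illustrative fragment; the type and function names are ours, not part of the proposal):
+
+```wasm
+(rec
+  (type $task (stack (param i32) (ref $main)))
+  (type $main (stack (param i32) (ref null $task))))
+
+(func $worker (param $n i32) (param $main (ref $main))
+  ;; ... compute with $n ...
+  ;; terminate by handing a result back and retiring this coroutine;
+  ;; the target receives a null return stack reference
+  (switch_retire (i32.add (local.get $n) (i32.const 1))
+                 (local.get $main)))
+
+(func $run (result i32)
+  ;; allocate the coroutine suspended, then start it with its argument
+  (switch $task (i32.const 41) (stack.new $task $worker))
+  (drop) ;; the null return stack reference sent by switch_retire
+)
+```
+
+Note that `$worker` never returns normally: it ends with a `switch_retire`, as required.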
+
+#### The Life-cycle of a stack reference
+
+Stack references identify coroutines that are in a suspended state. They are created as a coroutine becomes suspended when computation switches to a different coroutine. Stack references are consumed when the corresponding coroutine is switched to -- using a `switch` instruction.
+
+Once a stack reference has been used to `switch` to its identified coroutine, it is no longer valid. Any attempt to switch to a stack reference that has already been used will result in a trap. Unfortunately, the design of WebAssembly means that it is not possible to statically validate that any given stack reference is actually valid -- it is the responsibility of the application program to ensure that stack references are used just once.
+
+>It may seem that this can result in a large number of values being created and becoming garbage. However, stack references are implicitly references to the underlying stack resource which is _stable_ across the lifetime of the coroutine itself. Thus, one reasonable implementation strategy is to represent stack references as a pair: the stack resource and a counter. The counter -- which would also be stored in the stack resource -- is incremented every time the coroutine switches and is checked when the coroutine is switched to.
+>
+> Since the stack reference pair is never unpacked by WebAssembly code, it can be stored as a fat value in the value stack, in local variables, globals and in tables.[^shared]
+
+[^shared]: This implementation strategy becomes more complex when threading is taken into account, and the possibility of shared stack references arise.
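+
+The counter-based strategy sketched in the note above might be rendered as engine-internal pseudocode (not part of the proposal) as follows:
+
+```pseudo
+switch to reference (resource, n):
+  if resource.counter != n then trap     ;; reference already consumed
+  resource.counter := resource.counter + 1
+  current.counter := current.counter + 1
+  transfer control to resource, sending (current, current.counter)
+```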
+
+#### Coroutine identity
+
+Coroutines do not have a specific identity in this proposal. Instead, a stack reference denotes the particular state of a suspended coroutine. Such a reference is only created when switching from a coroutine or when a `stack.new` instruction is executed to create a new coroutine.
+
+>It is not possible for WebAssembly code to discover which coroutine it is running on; indeed the currently active coroutine has no valid stack reference. One consequence of this design is that when a WebAssembly function calls another function from another module (say), that module cannot discover the identity of the coroutine and misuse it. Overall, this is in keeping with a capability-based approach to resource management.
+
+In the rest of this document we introduce the key instructions, give some worked examples and answer some of the frequently asked questions.
+
+## Instructions
+
+We introduce instructions for creating, switching between, and retiring stacks.
+
+### `stack.new` Create a new stack
+
+```pseudo
+ C |- stack.new x y : [] -> (ref x)
+ -- expand(C.TYPES[x]) = stack t* rt
+ -- expand(C.FUNCS[y]) = func t* rt -> []
+```
+
+`stack.new` takes two immediates: a type index `x` and a function index `y`. It is valid with type `[] -> (ref x)` iff:
+
+ - The expansion of the type at index `x` is a stack type with parameters `t* rt`.
+ - The expansion of the type of the function at index `y` is a function type `t* rt -> []`.
+
+Let `f` be the function at index `y`. `stack.new` allocates a new suspended stack that expects to receive the arguments for `f`. Once the allocated stack is switched to, it will continue on to call `f` with the provided arguments and a reference to the previous active stack, or a null value if the previous active stack has been retired.
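+
+For illustration (the names are ours), a symmetric coroutine type paired with a matching coroutine function:
+
+```wasm
+(rec
+  (type $st (stack (param i64) (ref null $st))))
+
+(func $body (param i64) (param (ref null $st))
+  ...)
+
+;; valid because $body's parameters match $st's parameters:
+(stack.new $st $body) ;; : [] -> (ref $st)
+```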
+
+### `switch` Switch to a stack
+
+```pseudo
+ C |- switch x : t_1* (ref null x) -> t_2* rt
+ -- expand(C.TYPES[x]) = stack t_1* (ref null? y)
+ -- expand(C.TYPES[y]) = stack t_2* rt
+```
+
+`switch` takes one immediate: a type index `x`. It is valid with type `t_1* (ref null x) -> t_2* rt` iff:
+
+ - The expansion of the type at index `x` is a stack type with parameters `t_1* (ref null? y)`.
+ - The expansion of the type at index `y` is a stack type with parameters `t_2* rt`.
+
+If its stack reference operand is null or detached, `switch` traps. Otherwise, `switch` switches to the stack denoted by its stack reference operand, popping and sending the expected values `t_1*` along with a reference of type `(ref y)` denoting the prior active stack. The parameters of the stack type at index `y` determine what types will be received when the prior active stack is switched back to.
+
+> TODO: Describe checking whether a switch is allowed and trapping if it is not.
+
+### `switch_retire` Switch to a stack and retire the old stack
+
+```pseudo
+ C |- switch_retire x : t_1* (ref null x) -> t_2*
+ -- expand(C.TYPES[x]) = stack t_1* (ref null y)
+```
+
+`switch_retire` takes one immediate: a type index `x`. It is valid with type `t_1* (ref null x) -> t_2*` iff:
+
+ - The expansion of the type at index `x` is a stack type with parameters `t_1* (ref null y)`.
+
+`switch_retire` is very much like `switch`, except that it requires the target stack to be expecting a nullable stack reference and that instead of sending a reference to the previous active stack, it sends a null reference. This makes the previous active stack unreachable and potentially allows the engine to reclaim its resources eagerly. Since the previous active stack can never be resumed and the instructions following the `switch_retire` can never be executed, this instruction is valid with any result type.
+
+### `stack.bind` Partial application of stack arguments
+
+```pseudo
+ C |- stack.bind x y : t_1* (ref null x) -> (ref y)
+ -- expand(C.TYPES[x]) = stack t_1* t_2* rt
+ -- expand(C.TYPES[y]) = stack t_2* rt
+```
+
+`stack.bind` takes two immediates: type indices `x` and `y`. It is valid with type `t_1* (ref null x) -> (ref y)` iff:
+
+ - The expansion of the type at index `x` is a stack type with parameters `t_1* t_2* rt`.
+ - The expansion of the type at index `y` is a stack type with parameters `t_2* rt`.
+
+`stack.bind` takes a prefix of the arguments expected by a stack of type `x` as well as a reference to such a stack. It binds the provided arguments to the stack and returns a new stack reference to the same underlying stack, now expecting only the remaining, unbound arguments. Binding also detaches all outstanding references to the stack.
+
+> Note: `stack.bind` is implementable in userspace either by bundling the bound values with the continuation or by introducing intermediate stack types that allow the values to be bound incrementally over the course of multiple switches to the target stack.
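+
+A small illustration (the types are ours): binding the first of two pending arguments:
+
+```wasm
+(rec
+  (type $full (stack (param i32) (param i32) (ref $ret)))
+  (type $rest (stack (param i32) (ref $ret)))
+  (type $ret  (stack (param i32) (ref null $full))))
+
+;; given $target : (ref null $full)
+(stack.bind $full $rest
+  (i32.const 7)        ;; bound immediately
+  (local.get $target))
+;; result : (ref $rest) -- the stack now expects only the remaining i32
+```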
+
+## Examples
+
+We look at three examples in order of increasing complexity and sophistication: a yield-style generator, cooperative threading and handling asynchronous I/O.
+
+The objectives behind these examples are to demonstrate how common usage patterns may be implemented and to exemplify how a compiler might target the features of this proposal.
+
+### Yield-style generators
+
+The so-called yield-style generator pattern consists of a pair: a generator function that generates elements and a consumer that consumes those elements -- the latter often taking the form of a `for`-loop. When the generator has found the next element it yields it to the consumer, and when the consumer needs the next element it waits for it. Yield-style generators represent the simplest use case for stack switching in general, which is why we lead with them here.
+
+There are several potential styles of the generator pattern in source languages: the source language may provide special function forms for the generator expression; or the source language may permit _any_ higher-order function to act as the core of a generator, simply passing a special yield function to that higher-order function. Another approach, common in OO-style languages, is to use the iterator pattern. Our example takes its cue from the first style.
+
+#### Communicating between the generator and its consumer
+
+One problem that must be addressed by any language compiler that emits coroutining code is how to represent any events that may occur. In the case of the generator pattern, there are two kinds of events that need to be modeled: the generator needs to be able to signal individual data elements as they are generated and it needs to be able to signal the end of the stream. Conversely, the consumer needs to be able to ask for the next element; and, in some cases, the consumer needs to be able to _cancel_ the generator -- to ask it to stop producing values.
+
+>The two sides of this conversation are not identical: the consumer sends control signals and the generator produces a stream of values. This asymmetry is prevalent in interactions amongst coroutines and is the main reason that the stack type structure has two type definitions.
+
+For our yield-style generator example, we identify four different events, two originating from the generator and two from the consumer:
+
+* `#yield <value>`. This event communicates a single data value from the generator, together with the encoding of the `#yield` code.
+
+* `#end`. This communicates that there are no data values (left) to get from the generator.
+
+>Note that the `#end` event only needs a single value to represent it; but, since the vector of values must have the same WebAssembly type for all kinds of messages from the generator, we will actually use two values: the encoding of `#end` and a dummy zero value.
+
+* `#next`. This is a message from the consumer to the generator: to produce the next value.
+
+* `#cancel`. This is a message from the consumer to cancel the generator.
+
+The required set of messages can be modeled using a recursive pattern of stack types, as in the WebAssembly type declaration:
+
+```wasm
+(rec
+ (type $genCmd (stack (param i32) (ref $genResp)))
+ (type $genResp (stack (param i32) (param i32) (ref null $genCmd))))
+```
+where `$genCmd` has a single `i32` parameter, which contains the command to the generator, and `$genResp` has two `i32` parameters, one encoding a response sentinel (e.g., `#yield` means there is a value and `#end` signals the end of the stream).
+
+>Note that, for convenience, the sentinel in a response is the second of the two i32 values.
+
+#### Generating elements of an array
+
+In this example, we implement an extremely minimal generator: one which iterates over the elements of an `i32` array. The array is assumed to lie in linear memory, and we pass to the generator function the address of the base of the array, where to start the iteration and the number of elements in it:
+
+```wasm
+(rec
+ ;; generic types for any generator of i32s
+ (type $toConsumer (stack (param $val i32) (param (ref null $toGenerator))))
+ (type $toGenerator (stack (param (ref $toConsumer))))
+)
+
+;; types to initialize the array generator specifically
+(type $finishInit (stack (param (ref $toGenerator))))
+(type $initArrayGen (stack (param $from i32) (param $to i32) (param $els i32) (param (ref $finishInit))))
+
+(func $arrayGenerator (param $from i32) (param $to i32) (param $els i32) (param $finishInit (ref $finishInit))
+ (local $toConsumer (ref $toConsumer))
+
+ ;; switch back to the consumer now that $from, $to, and $els have been initialized.
+  (switch $finishInit (local.get $finishInit))
+ (local.set $toConsumer)
+
+ (block $on-end
+ (loop $l
+      (br_if $on-end (i32.ge_u (local.get $from) (local.get $to)))
+
+      (switch $toConsumer ;; load and yield a value to the consumer
+        (i32.load (i32.add (local.get $els)
+                           (i32.mul (local.get $from)
+                                    (i32.const 4))))
+        (local.get $toConsumer))
+ (local.set $toConsumer) ;; remember the consumer
+
+ ;; continue to the next element
+ (local.set $from (i32.add (local.get $from) (i32.const 1)))
+ (br $l)
+ )
+ ) ;; $on-end
+
+ (switch_retire
+ (i32.const 0) ;; dummy value
+    (local.get $toConsumer))
+)
+```
+
+Whenever the `$arrayGenerator` function yields after its initialization -- including when it finally finishes -- it passes back two things: the value being yielded and a new stack reference that allows the consumer to resume the generator; a null reference signals the end of the generated elements.
+
+When there are no more elements to yield, the `$arrayGenerator` issues the `switch_retire` instruction which simultaneously discards the generator's resources and communicates the end to the consumer by sending a null return stack reference. We also pass a dummy value of zero to comply with type safety requirements.
+
+Whenever a `switch` instruction is used, it must be followed by instructions that store the return stack reference and use any sent values. In this example, the return stack reference is stored in the `$toConsumer` local variable, replacing its previous value which is no longer valid.
+
+>There is one aspect of building a generator that is not addressed by our code so far: how to start it. We will look at this in more detail as we look at the consumer side of yield-style generators next.
+
+#### Consuming generated elements
+
+The consumer of a generator/consumer pair is typically represented as a `for` loop in high level languages. However, we need to go 'under the covers' a little in order to realize our example.
+
+In WebAssembly, our `addAllElements` function creates the generator -- using the `stack.new` and `switch` instructions -- and employs a loop that repeatedly switches to it until the generator reports that there are no more elements. The code takes the form:
+
+```wasm
+(func $addAllElements (param $from i32) (param $to i32) (param $els i32) (result i32)
+ (local $total i32) ;; initialized to 0
+ (local $toGenerator (ref $toGenerator))
+
+ ;; create the generator stack and switch to it with the initialization parameters.
+ (switch $initArrayGen
+ (local.get $from)
+ (local.get $to)
+ (local.get $els)
+ (stack.new $initArrayGen $arrayGenerator))
+ (local.set $toGenerator)
+
+ (block $on-end
+ (loop $l
+ (switch $toGenerator (local.get $toGenerator))
+ (br_on_null $on-end) ;; check whether we have ended
+
+ (local.set $toGenerator) ;; remember the new generator reference
+
+ ;; add the yielded value to the total
+ (local.get $total)
+ (i32.add)
+ (local.set $total)
+ (br $l)
+ )
+ ) ;; $on-end
+ (local.get $total)
+)
+```
+
+The loop uses a `br_on_null` instruction to determine when the generator has signaled the end by retiring and producing a null stack reference.
+
+#### Simplifying initialization with `stack.bind`
+
+Initializing the generator in this example required two stack switches and two additional stack types just to move the initialization values into the generator stack. This initialization can be simplified using the `stack.bind` instruction:
+
+```wasm
+;; no separate stack type necessary for finishing initialization.
+(type $initArrayGen (stack (param $from i32) (param $to i32) (param $els i32) (param (ref $toConsumer))))
+
+(func $arrayGenerator (param $from i32) (param $to i32) (param $els i32) (param $toConsumer (ref $toConsumer))
+
+ ;; no switch necessary to finish initialization.
+
+ (block $on-end ...
+)
+
+(func $addAllElements (param $from i32) (param $to i32) (param $els i32) (result i32)
+ (local $total i32) ;; initialized to 0
+ (local $toGenerator (ref null $toGenerator)) ;; locals must have a defaultable (nullable) type
+
+ ;; create the generator stack and partially apply the initialization parameters.
+ (stack.bind $initArrayGen $toGenerator
+ (local.get $from)
+ (local.get $to)
+ (local.get $els)
+ (stack.new $initArrayGen $arrayGenerator))
+ (local.set $toGenerator)
+
+ (block $on-end ...
+)
+```
+
+#### Flattening Communication
+
+The set of possible messages between coroutines is typically tied to the actual pattern being implemented: in this case we have a generator/consumer pair. But, in general, the messages form a _protocol_; typically, each message in a protocol will have different argument values with different types.
+
+There are various options for modeling protocols more-or-less precisely. However, in those programming languages that support algebraic data types, we can use them as a kind of poor man's protocol description. We cannot easily use algebraic data types to capture the ordering of messages, however.
+
+When implementing generators, it is quite important to perform as few allocations as possible (yield-style generators are effectively competing with Java-style iterators and with normal inline while loops). So, for this example, we use a vector of values for each event description; one value is a sentinel -- encoded using the equivalent of an enumerated type -- and the remaining vector values depend on the event itself[^1].
+
+[^1]: An alternate strategy could be to pass a pointer to a data structure describing the event. A toolchain might be able to avoid multiple allocations by reusing the data structure.
+
+This strategy is effectively one of _flattening_ the type that represents the messages in the generator/consumer conversation into a vector of values. This involves determining the maximum number of values that may be communicated to/from a coroutine and _padding_ in the situations where the actual event does not need all the slots. Computing these vectors is the responsibility of the code generator.
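+
+As a concrete sketch (the type and message names here are illustrative, not part of the proposal), a toolchain flattening a two-message protocol -- `#yield(i32)` and `#end` -- into a single stack type might emit:
+
+```wasm
+;; one payload slot, padded with 0 for #end, plus an i32 sentinel
+;; (0 = #yield, 1 = #end) and the return stack reference
+(type $consumerMsg (stack
+  (param i32)                     ;; payload slot: the yielded value, or padding
+  (param i32)                     ;; sentinel discriminating the message
+  (param (ref null $producerMsg)) ;; return stack reference
+))
+```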
+
+
+### Cooperative Coroutines
+
+Cooperative coroutines, sometimes known as _green threads_ or _fibers_, allow an application to be structured in such a way that different responsibilities may be handled by different computations. The reasons for splitting into such coroutines may vary; but one common scenario is to allow multiple tasks to proceed at their own pace.
+
+In our formulation of fibers, we take an _arena_ based approach: when a program wishes to fork into separate fibers it does so by creating an arena or pool of fibers that represent the different activities. The arena computation as a whole only terminates when all of the fibers within it have completed. This allows a so-called _structured concurrency_ architecture that greatly enhances composability[^2].
+
+[^2]: However, how cooperative coroutines are actually structured depends on the source language and its approach to handling coroutines. We present one alternative.
+
+#### Structure of a Fiber
+
+One can argue that the most salient aspect of a fiber, compared (say) to a generator/consumer pair, is the notion of _persistence_. A fiber has an identity that persists throughout its lifetime. This is in direct tension with the single-use nature of stack references.
+
+The second architectural feature of fibers is the implied _scheduler_ or _arena_. An arena is responsible for managing a collection of fibers, and ensuring that each gets a reasonable chance at execution; typically realized via some form of scheduler.
+
+A straightforward approach to modeling fiber identity is to capture it with a user data structure: two fibers are considered the same if they share the same fiber structure. This structure would typically have a mutable field to hold the stack reference while the fiber is suspended; the field is null while the fiber is active (or dead).
+
+It is also likely that, in practice, a language runtime would include other language specific information in the same data structure: access to fiber-local variables is an obvious example. However, we will assume a minimal structure that has two fields in it:
+
+```wasm
+(type $fiber (struct
+  (field $stack (mut (ref null $fbr)))
+  (field $arena (ref $arena))
+))
+
+(type $arena (struct
+  ;; the arena scheduler's current stack reference
+  (field $arenaCont (mut (ref null $sched)))
+))
+```
+
+A scheduler needs to be able to select which of its fibers to execute next. It also needs to be able to inform its fibers whether they are being resumed normally, or are being canceled.
+
+Similarly, a fiber needs to communicate to its scheduler why it is yielding to the scheduler: it may be reporting success, an exception, or simply yielding. A final category of communication from the fiber is ‘yield with a reason’; such as when the fiber is requesting asynchronous I/O or a delayed timer.
+
+Together, these messages form the fiber protocol.
+
+In our example, we are going to assume that the only messages that a scheduler is expected to understand (from its managed fibers) are #pause and #terminate.[^wake] The assumption is that other messages are really directed at other fibers rather than at the scheduler itself (and therefore use different mechanisms).
+
+[^wake]: A slightly fuller exposition would typically also include a #wake command to allow a fiber to request that a given sibling be scheduled. We omit this for the sake of brevity.
+
+Conversely, a scheduler has one of two messages to send to its fibers: #resume and #cancel.
+
+We can model this pattern of communication using the following type declaration:
+
+```wasm
+(rec
+  ;; what a fiber sends its arena: a payload value, a command sentinel and,
+  ;; implicitly, the fiber's own stack reference (null when it retires)
+  (type $sched (stack (param i32) (param i32) (param (ref null $fbr))))
+  ;; what an arena sends a fiber: a command sentinel and, implicitly,
+  ;; the arena's own stack reference
+  (type $fbr (stack (param i32) (param (ref null $sched))))
+)
+```
+
+where we also define the constants in the enumerations:
+
+```c
+typedef enum{
+ pause = 0,
+ terminate = 1
+} fbrCmd;
+
+typedef enum{
+ resume = 0,
+ cancel = 1
+} arenaCmd;
+```
+
+#### Running Fibers
+
+The WebAssembly implementation of `#pause` has two pieces: switching to the arena and maintaining the fiber structure so that the fiber can be resumed later on. Since the fiber's new stack reference is not available until after the switch to the scheduler, the code has two parts: one executed by the fiber and the other by the scheduler.
+
+The `$pause` function below is given a reference to the currently running fiber's structure; we assume that the arena's stack reference is accessible through the `$fiber` structure:
+
+```wasm
+(func $pause (param $fiber (ref $fiber)) (result i32)
+  (local $cmd i32)
+  (local $sched (ref null $sched))
+  (i32.const 0)       ;; padding: a #pause carries no payload
+  (i32.const #pause)
+  (struct.get $arena $arenaCont
+    (struct.get $fiber $arena (local.get $fiber)))
+  (switch $sched)     ;; switch to the scheduler
+  (local.set $sched)  ;; the scheduler's new stack reference
+  (local.set $cmd)    ;; decode why we are being woken up
+  (struct.set $arena $arenaCont
+    (struct.get $fiber $arena (local.get $fiber))
+    (local.get $sched)) ;; update the arena's stack
+  (local.get $cmd)      ;; return the resume or cancel signal
+  (return)
+)
+```
+
+As can be seen, most of this code is simply accessing structures and updating them. In this case, `$pause` has to access the arena's stack, and update it when the fiber is resumed by the arena scheduler. Similarly, the arena has to manage the fiber's data structure:
+
+```wasm
+  ...
+  (local $val i32)
+  (local $cmd i32)
+  (local $cont (ref null $fbr))
+  ...
+  (i32.const #resume)     ;; we are resuming the fiber
+  (struct.get $fiber $stack
+    (local.get $resumee)) ;; the fiber we are going to resume
+  (switch $fbr)
+  (local.set $cont)       ;; the fiber's new stack reference (null if it retired)
+  (local.set $cmd)        ;; what did the fiber send us?
+  (local.set $val)        ;; the payload, e.g. a final result
+  (if (i32.eq (local.get $cmd) (i32.const #pause))
+    (then
+      (struct.set $fiber $stack
+        (local.get $resumee)
+        (local.get $cont))) ;; remember where to resume the fiber
+    (else
+      ...))                 ;; kill off the fiber
+  ...
+```
+
+This is not a complete function: we are just highlighting that part of the arena scheduler that is relevant to resuming a fiber. Apart from the manipulation of data structures, perhaps the most salient aspect of this code is an apparent inversion: the arena code is responsible for managing the storage of its client fibers, and the fiber is responsible for keeping track of the arena's stack. This is a result of computations not being able to address themselves -- of the active coroutine not having a 'pointer to itself'.
+
+It also implies that the design of functions such as `$pause` is fundamentally intertwined with that of the fiber's arena scheduler. However, the combination of a scheduler and a suite of library functions that give fibers their capabilities results in a complete package.
+
+#### A Fiber Function
+
+Given this, we can present a more complete example, building on the generator example above. This example uses fibers to split the processing of an array into segments: each fiber is responsible for adding up the elements of its segment.[^proxy]
+
+[^proxy]: This example is intended to serve as a proxy for a much more realistic situation: collecting multiple resources by splitting each into a separate task.
+
+Like stack functions, fiber functions take an extra argument: a reference to the `$fiber` structure.
+
+```wasm
+(func $adderFiber (param $fiber (ref $fiber))
+  (param $els i32) (param $from i32) (param $to i32)
+  (result i32)
+  (local $total i32)
+  ;; $genResp is the generator's stack type (cf. the generator example above)
+  (local $generator (ref null $genResp))
+
+  (switch $genResp
+    (local.get $els)
+    (local.get $from)
+    (local.get $to)
+    (stack.new $genResp $arrayGenerator))
+
+  (block $done
+    (block $on-end (result i32)
+      (loop $l (param i32 i32 (ref null $genResp)) (result i32)
+        (local.set $generator) ;; remember the generator reference that came back
+        (block $on-yield (result i32) ;; the value 'returned' by the generator
+          (br_table $on-yield $on-end) ;; dispatch on the sentinel
+        )
+        ;; add the yielded value to the running total
+        (local.get $total)
+        (i32.add)
+        (local.set $total)
+        (local.get $fiber)
+        (call $pause) ;; $pause returns #resume (zero) if we should continue
+        (br_if $done) ;; a non-zero command means we have been canceled
+        (switch $genResp ;; ask the generator for the next element
+          (i32.const 0)  ;; padding
+          (i32.const #next)
+          (local.get $generator))
+        (br $l)
+      ) ;; fiber loop
+    )
+    (drop) ;; discard the dummy value sent with the end-of-generation message
+  )
+  (switch_retire ;; report the total to the arena
+    (local.get $total)
+    (i32.const #terminate)
+    (struct.get $arena $arenaCont
+      (struct.get $fiber $arena (local.get $fiber)))
+  ) ;; unreachable
+)
+```
+
+#### Managing Tasks in an Arena
+
+The core principle of structured concurrency is the same single entry/single exit principle behind structured programming: a group of concurrent activities (a.k.a. tasks) arranged into an _arena_ which is not terminated until its component tasks are accounted for.[^newtask] An arena denotes a set of computations that have some application-related role: i.e., the arena embodies a set of _tasks_.
+
+[^newtask]: In addition, to be consistent, any _new_ computations started are always associated with an arena.
+
+There are several legitimate varieties of arenas: in one scenario, all the tasks are _competing_ and the first one that completes results in all the others being canceled. Another common pattern involves the arena completing only when all the tasks have completed.
+
+This arena takes an array of fibers and terminates as soon as the first one completes, canceling the others:
+
+```wasm
+(type $fiberArray (array (ref $fiber)))
+
+(func $cancelingArena (param $fibers (ref $fiberArray))
+  (result i32)
+  (local $len i32)
+  (local $ix i32)
+  (local $jx i32)
+  (local $total i32)
+  (local $cont (ref null $fbr))
+  (local.set $len (array.len (local.get $fibers)))
+  (block $on-endGreen (result i32)
+    (loop $for-ix (result i32)
+      (block $on-pauseGreen (result i32)
+        (switch $fbr
+          (i32.const #resume) ;; our message is 'go'
+          (struct.get $fiber $stack
+            (array.get $fiberArray
+              (local.get $fibers) (local.get $ix)))) ;; pick up the fiber's stack
+        (local.set $cont) ;; the fiber's new stack reference
+        (br_table $on-pauseGreen $on-endGreen) ;; dispatch on the sentinel
+      )
+      (drop) ;; discard the padding sent with a #pause
+      (struct.set $fiber $stack ;; update the fiber structure
+        (array.get $fiberArray (local.get $fibers) (local.get $ix))
+        (local.get $cont))
+      (local.set $ix ;; move round-robin to the next fiber
+        (i32.rem_u
+          (i32.add (local.get $ix) (i32.const 1))
+          (local.get $len)))
+      (br $for-ix)
+    )
+  )
+  (local.set $total) ;; a fiber ended: record its result
+  ;; now cancel all the other fibers
+  (local.set $jx (i32.const 0))
+  (loop $for-jx
+    (block $no-cancel
+      (br_if $no-cancel
+        (i32.eq
+          (local.get $ix)
+          (local.get $jx)))
+      (switch $fbr
+        (i32.const #cancel) ;; our message is 'stop'
+        (struct.get $fiber $stack
+          (array.get $fiberArray
+            (local.get $fibers) (local.get $jx))))
+      (drop)
+      (drop)
+      (drop) ;; discard the results of this cancelation
+    )
+    (local.set $jx
+      (i32.add
+        (local.get $jx)
+        (i32.const 1)))
+    (br_if $for-jx
+      (i32.lt_u (local.get $jx) (local.get $len)))
+  )
+  (local.get $total)
+)
+```
+
+Although we are using structure and array concepts which are part of WasmGC, it would be straightforward -- if a little tedious -- to use tables and linear memory to store the relevant structures needed to support fibers.
+
+### Asynchronous I/O
+
+In our third example, we look at integrating coroutines with access to asynchronous APIs, which are accessed via module imports.
+
+On the web, asynchronous functions use the `Promise` pattern: an asynchronous I/O operation first returns a `Promise` that 'holds' the I/O request; at some point after the request is resolved, a callback function attached to the `Promise` is invoked.[^other]
+
+[^other]: While non-Web embeddings of WebAssembly may not use `Promise`s in exactly the same way, the overall architecture of using promise-like entities to support async I/O is widespread.
+ One specific feature that may be special to the Web is that it is not possible for an application to be informed of the result of an I/O request until after the currently executing code has completed and the browser's event loop has been invoked.
+
+#### JavaScript Promise Integration
+
+The JavaScript Promise Integration API (JSPI) allows a WebAssembly module to call a `Promise`-returning import and have it result in the WebAssembly module being suspended. In addition, calling into a marked export converts the normal return convention into a `Promise`-based convention: instead of returning the result indicated by the WebAssembly function, the marked function returns a `Promise`. This is `resolved` if the function returns normally (presumably after suspending at least once); if the function throws an exception, that exception is transmuted to a `reject` of the `Promise`.
+
+In normal mode, JSPI operates on the entire call chain -- between the call to the WebAssembly module itself and the call to a `Promise`-bearing import. It assumes that the WebAssembly code is synchronous in nature. However, what happens when your language already has coroutines?
+
+#### Our Scenario
+
+We would like to enable applications where several coroutines can make independent requests to a `fetch` import and only 'return' when all of them have been issued. Specifically, our example involves multiple tasks -- modeled using fibers -- making `fetch` requests and responding when all the requests complete.
+
+This implies a combination of local scheduling of tasks and, possibly, a _tree_ of arenas reflecting a hierarchical structure in the application.
+
+#### A `fetch`ing Task
+
+On the surface, the code for `fetch`ing data is very simple:[^async-c]
+
+```c
+async fetcher(string url){
+ string text = await simple_fetch(url);
+ doSomething(text);
+}
+```
+
+[^async-c]: In an extension to the C language, we have invented a new type of function--the `async` function. In our mythical extension, only `async` functions are permitted to use the `await` expression form.
+
+#### Importing `fetch`
+
+The actual `fetch` Web API is quite complex, and we do not intend to explore that complexity. Instead, we will use a simplified `simple_fetch` that takes a URL string and returns a `Promise` of a `string`. (Again, we ignore issues such as `fetch` failures here.)
+
+JSPI works by wrapping imports so that calls to them can suspend. More precisely, a marked import uses a different calling convention: when the callee returns -- with a `Promise` -- the calling computation is suspended. When the `Promise` is ultimately resolved, the `then` function attached to the `Promise` is called by the task runner. That `then` function resumes the suspended computation, passing it the resolved value of the `Promise`.
+
+Balancing these suspending imports is a promising export: we mark one or more exports as promising. This affects how the export is invoked: it is executed on a separate stack. This stack is suspended when a suspending import is called, and resumed when the corresponding `Promise` is resolved.
+
+>The exports are marked as promising because the original function is wrapped in a function that returns a `Promise`; that `Promise` is resolved with the value eventually returned from the export.
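+
+Conceptually (this is a sketch of the behavior only, not the actual JSPI machinery; the helper names are invented), the wrapper generated for a promising export behaves like:
+
+```pseudo
+func promisingWrapper(args):
+  s = newStack(export, args)      // run the export on its own stack
+  return new Promise((resolve, reject) =>
+    runOn(s, onReturn = resolve, onThrow = reject))
+```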
+
+Our goal is to achieve similar functionality to JSPI, but to allow the application to continue in some fashion when calling a suspending import.
+
+The pseudocode below describes a function that calls a 'raw' fetch function and suspends back to the arena scheduler with an #ioPause request, sending the `Promise` returned by `simple_fetch`:
+
+```pseudo
+func callSimpleFetch(f:Fiber, u:Url):
+  p: Promise = simple_fetch(u)
+  f.suspend(#ioPause(p))
+  return p.value
+```
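+
+The arena side of this exchange is not shown above. As a sketch in the same pseudocode (the names and fiber-state fields are illustrative): the scheduler recognizes the #ioPause message, parks the fiber, and attaches a callback to the `Promise` so that the fiber becomes runnable again once the `Promise` settles.
+
+```pseudo
+func onIoPause(f:Fiber, p:Promise):
+  f.state = #blocked                // do not schedule f until the Promise settles
+  p.then(v => { f.state = #runnable; f.wakeValue = v })
+```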
+
+#### Trees of schedulers
+
+In a complete application, there will likely be a tree of schedulers; each scheduler must also yield to its parent arena scheduler. This is the primary route by which the WebAssembly application as a whole can eventually yield back to the browser's event loop.
+
+When, finally, the application does 'return' to the event loop, it must be the case that all the schedulers in the tree -- and all the fibers referenced from those schedulers -- are suspended.
+
+Ultimately, the application will be reentered when one of the Promises has been resolved and the event loop invokes the appropriate fiber. This must be handled with some care to avoid creating issues with the hierarchy of arena schedulers. In effect, the tree of schedulers will need to be re-awoken in the correct order so that each client fiber has access to its own scheduler in a valid state.
+
+One way to achieve this is for each arena scheduler to keep track of any Promises its client fibers create and, when it yields to its parent, create a special composite Promise consisting of an array of client Promises. This allows the tree to be correctly woken up when a client Promise is resolved.
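+
+As a sketch (assuming the embedder exposes a `Promise.any`-style combinator; the names are illustrative), a scheduler yielding to its parent might combine its clients' pending `Promise`s like this:
+
+```pseudo
+func yieldToParent(arena:Arena):
+  pending = [f.promise for f in arena.fibers if f.state == #blocked]
+  // wake this arena as soon as any client Promise settles
+  arena.parent.suspend(#ioPause(Promise.any(pending)))
+```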
+
+We do not claim that this is the only way of managing the asynchronous activities; indeed, individual language toolchains will have their own language specific requirements. However, this example is primarily intended to explain how one might implement the integration between WebAPIs such as `fetch` and a coroutining aware programming language.
+
+## Frequently Asked Questions
+
+### What is the difference between this proposal and the 'typed continuations' proposal?
+
+This proposal shares some common elements with the typed continuations proposal: for example, we have statically typed first class continuations – called stack references in this proposal.
+
+However, this proposal is also significantly simpler. There is no built-in mechanism for searching for effect handlers, no built-in mechanism for maintaining a suspend/resume relationship, no mechanism for automatic propagation of results, and no built-in mechanism for dispatching when a stack is re-entered.
+
+The reasoning behind this 'concept shedding' is straightforward: we have tried to focus on the essential features that _all_ language providers will need to support _their own_ models of concurrency.
+
+On the other hand, in many realistic situations, higher-level abstractions are still required. We saw this with the [API](#a-fibers-api) that allowed us to model long-running computations as opposed to instantaneous snapshots. Using WebAssembly itself to realize such abstractions is also important: higher-level abstractions are often less universal.
+
+### How do coroutines relate to the JS Promise integration API?
+
+JSPI focuses on the behavior of the whole application: it is targeted at enabling so-called legacy code (which is not aware of asynchronous computation) to access many of the Web APIs which are fundamentally asynchronous in nature.
+
+Internally, the implementation of JSPI requires many if not most of the techniques needed to support coroutines; however, this is largely hidden from the developer using JSPI.
+
+JSPI can be used to implement coroutine language features. However, this carries significant performance penalties: each time an application suspends using JSPI, it will not be re-entered until the browser's task runner invokes the associated `Promise`'s `then` function. This effectively eliminates one of the key benefits of coroutines: allowing an application to manage and schedule its own computations.
+
+A legitimate question remains of whether it is possible to polyfill JSPI in terms of coroutines. It definitely is possible to do so, albeit involving substantial amounts of extra JavaScript and WebAssembly code.
+
+### What other instructions might we want to include in this proposal?
+
+We may choose to add other instructions to the proposal to round out the instruction set or if there is specific demand for them. Potential additions include:
+
+ - `stack.new_ref`: a variant of `stack.new` that takes a function reference operand instead of a function index immediate. The immediate-based instructions can be specified in terms of the `*_ref` variants, but the `*_ref` variants would be less efficient in real implementations.
+ - `switch_throw`: Switch to a stack and throw an exception rather than sending the expected values. This can instead be accomplished by sending a sentinel value that informs the recipient that it should throw an exception itself, but `switch_throw` would be more direct.
+ - `return_switch`: Combines a `return_call` with a stack switch. Returns out of the current frame, switches to another stack, and calls into a new function once control returns to the original stack. This may end up being useful in combination with shared-everything threads, where creating shareable stack references would require careful management of the kinds of frames on the stack.
+ - `return_switch_throw`: Combines both of the above.
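+
+For instance, the sentinel-based alternative to `switch_throw` mentioned above might look as follows (the `$toTarget` type, the `#throw` sentinel, and the `$appExn` tag are all illustrative):
+
+```wasm
+;; sender: ask the target to throw instead of processing a value
+(switch $toTarget
+  (local.get $exnArg)  ;; payload for the exception
+  (i32.const #throw)   ;; sentinel meaning 'please throw'
+  (local.get $target))
+
+;; receiver: dispatch on the sentinel and throw locally
+(if (i32.eq (local.get $sentinel) (i32.const #throw))
+  (then (throw $appExn (local.get $payload))))
+```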
+
+### Why are we using 'lexical scoping' rather than 'dynamic scoping'?
+
+A key property of this design is that, in order for a WebAssembly program to switch between coroutines, the target of the switch is explicitly identified. This so-called lexical scoping approach is in contrast with a dynamic approach -- commonly used for exception handling -- where the engine is expected to search the current evaluation context to decide where to suspend to (say).
+
+#### Implementation
+
+In a lexically-scoped design, the engine is explicitly told by the program where to transfer control to.
+Thus, the only additional obligation the engine has to implement, besides the actual transfer of control, is validation that the target is _valid_ from the current control point.
+
+In a dynamically-scoped design, the engine has to search for the transfer target. This search is typically not simple to specify or implement, since the _criteria_ for a successful search depend on the language, both current and future.
+
+By requiring the program to determine the target, the computation of this target becomes a burden for the toolchain rather than for the WebAssembly engine implementor.
+
+#### Symmetric coroutining (and its cousin: task chaining)
+
+With symmetric coroutines you can have an (often binary) collection of coroutines that directly yield to each other via application-specific messages. We saw a simple example of this in our [generator example](#generating-elements-of-an-array).
+
+Similar patterns arise when chaining tasks together, where one computation is intended to be followed by another. Involving a scheduler in this situation creates difficulties for types: the communication patterns between the tasks are often private, and the types of the data involved are not known to a general-purpose scheduler.
+
+A lexically-scoped design supports these common horizontal control-transfer patterns more directly, simply, and efficiently than a dynamically-scoped design, which would typically bake in a parent scheduler.
+
+#### Composing Components
+
+In applications where multiple _components_ are combined to form an application, the risks of dynamic scoping may be unacceptable. By definition, components have a _shared nothing_ interface where the only permitted communications are those sanctioned by a common _interface language_. This includes prohibiting exceptions from crossing component boundaries (except via coercion), and likewise task switches.
+
+By requiring explicit identification of the target of a switch we make the task (sic) of implementing component boundaries more manageable when coroutining is involved. In fact, this is envisaged in the components design by using _streaming_ and _future_ metaphors to allow for this kind of control flow between components.
+
+In addition, due to a focus on stack references, it is only possible for one component to be aware of coroutines in another component if they are given explicit access to them (by passing them as arguments of functions for example). Furthermore, since a stack reference is inherently single use, there is reduced risk of _leakage_ across the component boundary.
+
+Finally, the active component does not itself hold a valid stack reference to its own coroutine. So calling a function across a component boundary cannot result in the current coroutine's identity being discovered by the callee. This enhances the security boundary between components.
+
+#### Dynamically Scoped Extensions to WebAssembly
+
+In a [companion proposal](), we explore a simple extension to WebAssembly that can be efficiently realized and that brings a dynamic scoping mechanism to WebAssembly. The two proposals are separate, but their combination can be used to realize a dynamically scoped effect handler scenario.
+
+### How are exceptions handled?
+
+One popular feature of exception handling systems is that of _automatic exception propagation_; where an exception is automatically propagated from its point of origin to an outer scope that is equipped to respond to it. However, this policy is generally incompatible with many forms of coroutining.
+
+The reason is that, when a coroutine is resumed, it may be from a context that does not resemble the context when it was executing previously; indeed it may be resumed from a context that cannot handle any application exceptions.
+
+This happens today in the browser, for example. When a `Promise`'s resolve and/or reject callbacks are called, it is typically from the context of the so-called microtask runner. The microtask runner cannot handle results from the tasks it runs; instead, the convention is to map results into calls to the relevant callback function of the appropriate `Promise` chain. If that chain does not exist, the task runner simply drops results, exceptional or not.
+
+This, in turn, implies that application-specific actions need to be taken when an exception bubbles out of a coroutine. In general, we expect a great deal of variability in how results are transmitted from coroutines and, as a result, choose not to specify any automatic propagation mechanism.
+
+So, in this proposal, results do _not_ propagate out of a coroutine. Instead, the application uses the `switch_retire` instruction to simultaneously terminate and send a final result to a coroutine that can take responsibility for it. In the case of exceptions, one pattern that may apply is for the coroutine function to catch any exceptions not handled by the application logic and convert them into a message to another coroutine, which may then rethrow the exception. The key here is that this routing logic is application or language specific: it is not mandated by the engine.
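+
+A sketch of such a pattern, combining the exception-handling proposal's `try_table` and `exnref` with `switch_retire` (the `$toParent` stack type, the sentinels, and the `$applicationLogic` function are all illustrative):
+
+```wasm
+(func $taskBody (param $toParent (ref null $toParent))
+  (local $exn exnref)
+  (block $caught (result exnref)
+    (try_table (catch_all_ref $caught)
+      (call $applicationLogic))  ;; run the task's own logic
+    ;; normal completion: report success to the parent coroutine
+    (switch_retire (i32.const #done) (local.get $toParent)))
+  (local.set $exn)
+  ;; an exception escaped the application logic: route it to the parent,
+  ;; which may rethrow it in its own context with throw_ref
+  (switch_retire (local.get $exn) (i32.const #failed) (local.get $toParent))
+)
+```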
+
+### What about structured concurrency?
+
+As we noted above, structured concurrency is an approach to managing concurrent applications; one that enforces the equivalent of the single entry/single exit control flow property from structured programming. However, there are many other patterns of coroutining possible. Some languages, for example, stipulate a single global coroutine scheduler and do not support hierarchies of arena managers.
+
+On the other hand, if a WebAssembly application is constructed from multiple languages across multiple component boundaries, it is highly likely that it will have multiple schedulers. Furthermore, in the context of a Web browser, these schedulers will be forced to work together: a given language's scheduler will be required to yield to other schedulers if it wants its asynchronous I/O results to be resolved.
+
+Structured concurrency is not built into our proposal. However, it is straightforward for a toolchain to generate patterns such as the arena pattern we illustrated [above](#cooperative-coroutines). It is our opinion that this degree of choice is advisable in order to avoid unnecessary obstacles in the path of a language implementer.
+
+### How does one support opt-out and opt-in?
+
+The only way that a suspended computation can be resumed is if one has access to its stack reference. As such, opt-out is extremely straightforward: simply do not allow code to become aware of the stack reference.
+
+Supporting full opt-in, where only specially prepared code can switch, is more difficult, especially in the so-called _sandwich scenario_: if a module invokes an import that re-enters the module via another export, this design allows that code to invoke switching patterns without being explicitly enabled to do so.
+
+It is our opinion that the main method for preventing the sandwich scenario is to prevent modules that do not support switching from importing functions from suspending modules. Solutions of this kind would have utility beyond preventing abuse of suspendable computations, and should perhaps be pursued as a separate effort.
+
+### How are different patterns combined?
+
+A given fragment of code may be involved with more than one coroutining pattern. We saw a simple example of this [here](#cooperative-coroutines). It is straightforward to combine scenarios when one considers that each is distinguished by its own protocol.
+
+For example, most implementations of the generator pattern will not also be an implementation of the green thread pattern; and conversely, all the suspended green threads that a fiber scheduler is managing will be waiting for a go signal from the scheduler: they will not be waiting in the queue for the next yielded element from a generator.
+
+It is much more likely that different patterns will also be distinguished by separate suspension points and switching targets: a single switching target denotes a point in one conversation, and different conversations will not intersect at any single switching location.
+
+Pragmatically, what this means is that each coroutining conversation that a given piece of code is involved in will be 'represented' by a different target. If a piece of code is both part of a generator/consumer pair and is itself a green thread, then, when pausing as a green thread, the scheduler's stack reference is identified as the target of the switch; when yielding a value (say), the code references the consumer's stack reference. At no point is any given stack reference part of more than one coroutining conversation.
+
+### How are event descriptions represented?
+
+Event descriptions are the data packages that are exchanged when switching from one coroutine to another. For any given coroutining conversation, the space of such event descriptions is likely to be fixed by the designers of the conversation. In such a case, the set of possible event descriptions can often be modeled in terms of an _algebraic data type_ i.e., a sum of possible data values.
+
+WebAssembly does not, at this time, directly support algebraic data types. However, they can be _realized_ in a number of ways. The approach that we have been following in this design is one of _flattening_: we unpack all the possible argument data elements into a vector of values and then use a _discriminator_ value to signal which case is actually involved.
+
+There are two immediate consequences of this: we have to arrange for the vector of values to be large enough to encompass all the possible elements, and for any given event description there may be unused spaces in the vector. This last aspect may require the implementer to use slightly more general types than they otherwise would: for example, to use a nullable type where that may not be implied by the application logic.
+
+However, in practice, we don't anticipate that this wastage will be significant. This is, in part, because another strategy is available to the implementer where the flattening approach does not work: boxing. In a boxed approach, the event description consists of a single value: a pointer to a heap-allocated structure; it is up to the application code to construct this structure and to decode it as appropriate.
+
+For scenarios that are more dynamic in nature, where it is not possible to predict the contents of the event description at compile time, some form of boxing is likely to be necessary when using this design. However, some form of algebraic data type capability may be added to WebAssembly in the future, in which case it could be used to advantage for communicating event descriptions.
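+
+As a hypothetical sketch of the flattening approach (all names and types here are illustrative, not part of the proposal), consider a conversation with two kinds of events: a `yield` event carrying an `i32` and a `done` event carrying a reference. The flattened event description reserves one slot per possible field, with a discriminator selecting the case and a nullable type for the slot that the `yield` case leaves empty:
+
+```wast
+;; Illustrative only: a flattened description for a two-case event.
+(type $obj (struct))         ;; stand-in for an application result type
+(tag $event
+  (param i32                 ;; discriminator: 0 = yield, 1 = done
+         i32                 ;; 'yield' payload; ignored when done
+         (ref null $obj)))   ;; 'done' payload; null when yielding
+```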
+
+### How can this proposal be implemented?
+
+Implementing this proposal in a production engine raises several issues: how stack references (and any underlying resources) are managed, how the sizes of stack memories are managed, how stack switching is integrated with calls to 'external' functions that may make special assumptions about the stack memory they use, and how the key operations are implemented performantly. Many of these concerns arise from the fact that most language runtimes were not crafted with coroutining in mind.
+
+#### Growing stacks
+
+When a new coroutine is established, using the `stack.new` instruction, the engine must also allocate memory in which the stack frames of functions can be stored. Normally, we expect the `stack.new` instruction to result in a new stack allocation and for subsequent function calls to be executed on this new stack memory. This allows for a rapid switch between coroutines, since we can switch simply by ensuring that the `SP` register of the processor points to the new target.
+
+The engine also has to decide how much memory to allocate, and there needs to be a strategy for dealing with the case where that memory is exhausted. The primary issue is determining how much memory to allocate for the newly created stack. It is often not feasible to allocate a large block for each coroutine: if an application uses large numbers of coroutines, this can result in a lot of wasted memory. In addition, it is quite likely that most coroutines will have very small memory requirements, with only a few needing larger memories.
+
+However, given the capability for switching between coroutines, it is quite conceivable to allow stacks to grow automatically when their stack memory is exhausted. This could be done by creating a new, larger memory and copying the existing stack into it, or by allowing execution stacks to be segmented.
+
+Which approach is taken depends on the larger requirements of the WebAssembly engine itself.
diff --git a/proposals/continuations/Explainer.md b/proposals/continuations/Explainer.md
new file mode 100644
index 00000000..904a52bf
--- /dev/null
+++ b/proposals/continuations/Explainer.md
@@ -0,0 +1,1681 @@
+# Typed continuations
+
+This document provides an informal presentation of the *typed
+continuations* proposal, a minimal and compatible extension to Wasm
+for structured non-local control flow. The proposal is minimal in the
+sense that it leverages Wasm's existing instruction set and type
+system. It extends the instruction set with instructions to suspend,
+resume, and abort computations, and extends the type system with a
+single new reference type for *continuations*.
+
+## Table of contents
+
+1. [Motivation](#motivation)
+2. [Additional requirements](#additional-requirements)
+3. [Instruction set](#instruction-set)
+ 1. [Declaring control tags](#declaring-control-tags)
+ 2. [Creating continuations](#creating-continuations)
+ 3. [Invoking continuations](#invoking-continuations)
+ 4. [Suspending continuations](#suspending-continuations)
+ 5. [Binding continuations](#binding-continuations)
+ 6. [Trapping continuations](#trapping-continuations)
+4. [Examples](#examples)
+ 1. [Lightweight threads (static)](#lightweight-threads-static)
+ 2. [Lightweight threads (dynamic)](#lightweight-threads-dynamic)
+ 3. [Actors](#actors)
+ 4. [Async/await](#asyncawait)
+ 5. [Delimited continuations](#delimited-continuations)
+5. [Implementation strategies](#implementation-strategies)
+6. [Design considerations and extensions](#design-considerations-and-extensions)
+ 1. [Memory management](#memory-management)
+ 2. [Linear versus constant time dispatch](#linear-versus-constant-time-dispatch)
+ 3. [Named handlers](#named-handlers)
+ 4. [Direct switching](#direct-switching)
+ 5. [Control/prompt as an alternative basis](#controlprompt-as-an-alternative-basis)
+ 6. [Coupling of continuation capture and dispatch](#coupling-of-continuation-capture-and-dispatch)
+ 7. [Tail-resumptive handlers](#tail-resumptive-handlers)
+ 8. [Multi-shot continuations](#multi-shot-continuations)
+ 9. [Interoperability, legacy code, and the barrier instruction](#interoperability-legacy-code-and-the-barrier-instruction)
+ 10. [First-class tags](#first-class-tags)
+ 11. [Shallow versus deep handlers](#shallow-versus-deep-handlers)
+
+## Motivation
+
+Non-local control flow features provide the ability to suspend the
+current execution context and later resume it. Many
+industrial-strength programming languages feature a wealth of
+non-local control flow features such as async/await, coroutines,
+generators/iterators, effect handlers, call/cc, and so forth. For some
+programming languages non-local control flow is central to their
+identity, meaning that they rely on non-local control flow for
+efficiency, e.g. to support massively scalable concurrency.
+
+Currently, Wasm lacks support for implementing such features directly
+and efficiently without a circuitous global transformation of source
+programs on the producer side. One possible strategy is to add special
+support for each individual non-local control flow feature to Wasm,
+but this strategy does not scale to the next 700 non-local control flow
+features. Instead, the goal of this proposal is to introduce a unified
+structured mechanism that is sufficiently general to cover present
+use-cases as well as being forwards compatible with future use-cases,
+while admitting efficient implementations.
+
+The proposed mechanism is based on proven technology: *delimited
+continuations*. An undelimited continuation represents the rest of a
+computation from a certain point in its execution. A delimited
+continuation is a more modular form of continuation, representing the
+rest of a computation from a particular point in its execution up to a
+*delimiter* or *prompt*. Operationally, one may think of undelimited
+continuations as stacks and delimited continuations as segmented
+stacks.
+
+In their raw form delimited continuations do not readily fit into the
+Wasm ecosystem, as the Wasm type system is not powerful enough to type
+them. The gist of the problem is that the classic treatment of
+delimited continuations provides only one universal control tag
+(i.e. the mechanism which transforms a runtime stack into a
+programmatic data object). In order to use Wasm's simple type system
+to type delimited continuations, we use the idea of multiple *named*
+control tags from Plotkin and Pretnar's effect handlers. Each control
+tag is declared module-wide along with its payload type and return
+type. This declaration can be used to readily type points of non-local
+transfer of control. From an operational perspective we may view
+control tags as a means for writing an interface for the possible
+kinds of non-local transfers (or stack switches) that a computation
+may perform.
+
+### Typed continuation primer
+
+A *continuation* is a first-class program object that represents the
+remainder of computation from a certain point in the execution of a
+program --- intuitively, its current stack. The typed continuations
+proposal is based on a structured notion of delimited continuations. A
+*delimited continuation* is a continuation whose extent is delimited
+by some *control delimiter*, meaning it represents the remainder of
+computation from a certain point up to (and possibly including) its
+control delimiter -- intuitively, a segment of the stack. An
+alternative to delimited continuations is undelimited continuations
+which represent the remainder of the *entire* program. Delimited
+continuations are preferable as they are more modular and more
+fine-grained in the sense that they provide a means for suspending
+local execution contexts rather than the entire global execution
+context. In particular, delimited continuations are more expressive,
+as an undelimited continuation is merely a delimited continuation
+whose control delimiter is placed at the start of the program.
+
+The crucial feature of the typed continuations proposal that makes it
+more structured than conventional delimited continuations is *control
+tags*. A control tag is a typed symbolic entity that suspends the
+current execution context and reifies it as a *continuation object*
+(henceforth, just *continuation*) up to its control delimiter. The
+type of a control tag communicates the type of its payload as well as
+its expected return type, i.e. the type of data that must be supplied
+to its associated continuation upon resumption. In other words,
+control tags define an *interface* for constructing continuations.
+
+A second aspect of the design that aids modularity by separating
+concerns is that the construction of continuations is distinct from
+*handling* of continuations. A continuation is handled at the
+delimiter of a control tag rather than at the invocation site of the
+control tag. Control tags are a mild extension of exception tags as in
+the exception handling proposal. The key difference is that in
+addition to a payload type, a control tag also declares a return
+type. Roughly, control tags can be thought of as resumable exceptions.
+
+Typed continuations may be efficiently implemented using segmented
+stacks, but other implementations are also possible.
+
+## Additional requirements
+
+ * **No GC dependency**: We intend every language to be able to use
+ typed continuations to implement non-local flow abstractions
+ irrespective of whether its memory is managed by a GC. Thus this
+ proposal must not depend on the presence of a full-blown GC as in
+   the GC proposal; rather, reference counting or a similar technique
+ must be sufficient in cases where some form of memory management is
+ necessary.
+
+ * **Debugging friendliness**: The addition of continuations must
+ preserve compatibility with standard debugging formats such as
+ DWARF, meaning it must be possible to obtain a sequential
+ unobstructed stack trace in the presence of continuations.
+
+ * **Exception handling compatibility**: [The exception handling
+ proposal](https://github.com/WebAssembly/exception-handling) adds
+ special support for one kind of non-local control flow abstraction,
+ namely, exception handlers. Exceptions must continue to work in the
+ presence of typed continuations and vice versa.
+
+ * **Preserve Wasm invariants of legacy code**: The proposal must
+ provide a means to protect the invariants of existing Wasm
+ code. For instance, this means that in the presence of code that
+ uses typed continuations it should be possible to ensure that other
+ legacy code cannot suspend. The mechanism for protecting invariants
+ need not be automatic (in the same vein as explicit synchronisation
+ might be needed when adding threads and shared memory).
+
+## Instruction set
+
+The proposal adds a new reference type for continuations.
+
+```wast
+ (cont $t)
+```
+
+A continuation type is given in terms of a function type `$t`, whose
+parameters `tp*` describe the expected stack shape prior to
+resuming/starting the continuation, and whose return types `tr*`
+describe the stack shape after the continuation has run to completion.
+
+As a shorthand, we will often write the function type inline and write a continuation type as
+```wast
+ (cont [tp*] -> [tr*])
+```
+
+### Declaring control tags
+
+A control tag is similar to an exception extended with a result type
+(or list thereof). Operationally, a control tag may be thought of as a
+*resumable* exception. A tag declaration provides the type signature
+of a control tag.
+
+```wast
+ (tag $e (param tp*) (result tr*))
+```
+
+The `$e` is the symbolic index of the control tag in the index space
+of tags. The parameter types `tp*` describe the expected stack layout
+prior to invoking the tag, and the result types `tr*` describe the
+stack layout following an invocation of the operation. In this
+document we will sometimes write `$e : [tp*] -> [tr*]` as shorthand
+for indicating that such a declaration is in scope.
+
+### Creating continuations
+
+The following instruction creates a continuation in *suspended state*
+from a function.
+
+```wast
+ cont.new $ct : [(ref $ft)] -> [(ref $ct)]
+ where:
+ - $ft = func [t1*] -> [t2*]
+ - $ct = cont $ft
+```
+
+The instruction takes as operand a reference to
+a function of type `[t1*] -> [t2*]`. The body of this function is a
+computation that may perform non-local control flow.
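+
+For example, the following minimal sketch reifies a nullary function as a
+suspended continuation and then runs it to completion with `resume`
+(the names are illustrative):
+
+```wast
+(module
+  (type $f (func))
+  (type $k (cont $f))
+
+  (elem declare func $task)
+  (func $task)  ;; the computation to run inside the continuation
+
+  (func (export "run")
+    ;; create a suspended continuation from $task, then run it
+    (resume $k (cont.new $k (ref.func $task)))
+  )
+)
+```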
+
+
+### Invoking continuations
+
+There are two ways to invoke (or run) a continuation.
+
+The first way to invoke a continuation resumes the continuation under
+a *handler*, which handles subsequent control suspensions within the
+continuation.
+
+```wast
+ resume $ct (tag $e $l)* : [tp* (ref $ct)] -> [tr*]
+ where:
+ - $ct = cont [tp*] -> [tr*]
+```
+
+The `resume` instruction is parameterised by a continuation type and a
+handler dispatch table defined by a collection of pairs of control
+tags and labels. Each pair maps a control tag to a label pointing to
+its corresponding handler code. The `resume` instruction consumes its
+continuation argument, meaning a continuation may be resumed only
+once.
+
+The second way to invoke a continuation is to raise an exception at
+the control tag invocation site. This amounts to performing "an
+abortive action" which causes the stack to be unwound.
+
+
+```wast
+  resume_throw $ct $exn (tag $e $l)* : [tp* (ref $ct)] -> [tr*]
+ where:
+ - $ct = cont [ta*] -> [tr*]
+ - $exn : [tp*] -> []
+```
+
+The instruction `resume_throw` is parameterised by a continuation
+type, the exception to be raised at the control tag invocation site,
+and a handler dispatch table. As with `resume`, this instruction also
+fully consumes its continuation argument. Operationally, this
+instruction raises the exception `$exn` with parameters of type `tp*`
+at the control tag invocation point in the context of the supplied
+continuation. As an exception is being raised (the continuation is not
+actually being supplied a value) the parameter types for the
+continuation `ta*` are unconstrained.
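+
+For instance (a minimal sketch; the names are illustrative), a scheduler
+might cancel a suspended thread by raising an `$abort` exception at its
+suspension point, unwinding the thread's stack. If the thread does not
+catch `$abort`, the exception propagates to the caller of `$cancel`:
+
+```wast
+(module
+  (type $f (func))
+  (type $k (cont $f))
+  (tag $abort)  ;; exception with no payload: $abort : [] -> []
+
+  ;; Cancel a suspended thread by raising $abort inside it.
+  (func $cancel (param $thread (ref $k))
+    (resume_throw $k $abort (local.get $thread))
+  )
+)
+```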
+
+### Suspending continuations
+
+A computation running inside a continuation can suspend itself by
+invoking one of the declared control tags.
+
+
+```wast
+ suspend $e : [tp*] -> [tr*]
+ where:
+ - $e : [tp*] -> [tr*]
+```
+
+The instruction `suspend` invokes the control tag named `$e` with
+arguments of types `tp*`. Operationally, the instruction transfers
+control out of the continuation to the nearest enclosing handler for
+`$e`. This behaviour is similar to how raising an exception transfers
+control to the nearest exception handler that handles the
+exception. The key difference is that the continuation at the
+suspension point expects to be resumed later with arguments of types
+`tr*`.
+
+### Binding continuations
+
+The parameter list of a continuation may be shrunk via `cont.bind`. This
+instruction provides a way to partially apply a given
+continuation. This facility turns out to be important in practice due
+to the block and type structure of Wasm as in order to return a
+continuation from a block, all branches within the block must agree on
+the type of continuation. By using `cont.bind`, one can
+programmatically ensure that the branches within a block each return a
+continuation with compatible type (the [Examples](#examples) section
+provides several example usages of `cont.bind`).
+
+
+```wast
+ cont.bind $ct1 $ct2 : [tp1* (ref $ct1)] -> [(ref $ct2)]
+ where:
+ $ct1 = cont [tp1* tp2*] -> [tr*]
+ $ct2 = cont [tp2*] -> [tr*]
+```
+
+The instruction `cont.bind` binds the arguments of type `tp1*` to a
+continuation of type `$ct1`, yielding a modified continuation of type
+`$ct2` which expects fewer arguments. This instruction also consumes
+its continuation argument, and yields a new continuation that can be
+supplied to `resume`, `resume_throw`, or `cont.bind`.
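+
+As a sketch of partial application (types and values here are
+illustrative), binding the first argument of a two-argument continuation
+yields a continuation expecting only the remaining argument:
+
+```wast
+(module
+  (type $f2 (func (param i32 i32)))
+  (type $k2 (cont $f2))
+  (type $f1 (func (param i32)))
+  (type $k1 (cont $f1))
+
+  ;; Bind 42 as the first argument of $k; the result still
+  ;; expects the second i32 when resumed.
+  (func $bind-first (param $k (ref $k2)) (result (ref $k1))
+    (cont.bind $k2 $k1 (i32.const 42) (local.get $k))
+  )
+)
+```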
+
+### Trapping continuations
+
+In order to ensure that control cannot be captured across certain
+abstraction or language boundaries, we provide an instruction for
+explicitly trapping attempts to reify stacks across a certain point.
+
+```wast
+ barrier $l bt instr* end : [t1*] -> [t2*]
+ where:
+ - bt = [t1*] -> [t2*]
+ - instr* : [t1*] -> [t2*]
+```
+
+The `barrier` instruction is a block with label `$l`, block type
+`bt = [t1*] -> [t2*]`, whose body is the instruction sequence given
+by `instr*`. Operationally, `barrier` may be viewed as a "catch-all"
+handler that handles any control tag by trapping.
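+
+For instance (a sketch; the names are illustrative), wrapping a call in
+a `barrier` ensures that the callee cannot capture the surrounding
+stack; any attempt to suspend during the call traps instead:
+
+```wast
+(module
+  (tag $yield)
+
+  (func $job
+    (suspend $yield)  ;; attempts to suspend
+  )
+
+  (func (export "protected")
+    (barrier $b
+      (call $job)  ;; traps: $yield cannot cross the barrier
+    )
+  )
+)
+```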
+
+## Continuation lifetime
+
+### Producing continuations
+
+There are three different ways in which continuations are produced
+(`cont.new`, `suspend`, and `cont.bind`). A fresh continuation object is
+allocated with `cont.new` and the current continuation is reused with
+`suspend` and `cont.bind`.
+
+The `cont.bind` instruction is directly analogous to the mildly
+controversial `func.bind` instruction from the function references
+proposal. However, whereas the latter necessitates the allocation of a
+new closure, continuations are single-shot, so no allocation is
+necessary: all allocation happens when the original continuation is
+created by preallocating one slot for each continuation argument.
+
+### Consuming continuations
+
+There are three different ways in which continuations are consumed
+(`resume`, `resume_throw`, and `cont.bind`). A continuation is resumed with a
+particular handler with `resume`. A continuation is aborted with
+`resume_throw`. A continuation is partially applied with `cont.bind`.
+
+In order to ensure that continuations are one-shot, `resume`,
+`resume_throw`, and `cont.bind` destructively modify the continuation
+object such that any subsequent use of the same continuation object
+will result in a trap.
+
+## Examples
+
+### Lightweight threads (static)
+
+(The full code for this example is [here](examples/static-lwt.wast).)
+
+Lightweight threads are one of the primary use-cases for typed
+continuations. In their most basic *static* form we assume a fixed
+collection of cooperative threads with a single tag that allows a
+thread to signal that it is willing to yield.
+
+```wast
+(module $lwt
+ (tag $yield (export "yield"))
+)
+(register "lwt")
+```
+
+The `$yield` tag takes no parameter and has no result. Having
+declared it, we can now write some cooperative threads as functions.
+
+```wast
+(module $example
+ (tag $yield (import "lwt" "yield"))
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (func $thread1 (export "thread1")
+ (call $log (i32.const 10))
+ (suspend $yield)
+ (call $log (i32.const 11))
+ (suspend $yield)
+ (call $log (i32.const 12))
+ )
+
+ (func $thread2 (export "thread2")
+ (call $log (i32.const 20))
+ (suspend $yield)
+ (call $log (i32.const 21))
+ (suspend $yield)
+ (call $log (i32.const 22))
+ )
+
+ (func $thread3 (export "thread3")
+ (call $log (i32.const 30))
+ (suspend $yield)
+ (call $log (i32.const 31))
+ (suspend $yield)
+ (call $log (i32.const 32))
+ )
+)
+(register "example")
+```
+
+Our intention is to interleave the execution of `$thread1`,
+`$thread2`, and `$thread3`, using `(suspend $yield)` to suspend
+execution to a scheduler which will perform a context switch.
+
+If we were to try to run any of these functions at the top-level then
+they would trap as soon as they try to suspend with the `$yield`
+tag, because we have not yet specified how to handle it.
+
+We now define a scheduler.
+
+```wast
+(module $scheduler
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (tag $yield (import "lwt" "yield"))
+
+ ;; queue interface
+ (func $queue-empty (import "queue" "queue-empty") (result i32))
+ (func $dequeue (import "queue" "dequeue") (result (ref null $cont)))
+ (func $enqueue (import "queue" "enqueue") (param $k (ref $cont)))
+
+ (func $run (export "run")
+ (loop $l
+ (if (call $queue-empty) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (resume $cont (tag $yield $on_yield)
+ (call $dequeue)
+ )
+ (br $l) ;; thread terminated
+ ) ;; $on_yield (result (ref $cont))
+ (call $enqueue) ;; continuation of current thread
+ (br $l)
+ )
+ )
+)
+(register "scheduler")
+```
+
+We assume a suitable interface to a queue of active threads
+represented as continuations. The scheduler is a loop which repeatedly
+runs the continuation (thread) at the head of the queue. It does so by
+resuming the continuation with a handler for the `$yield` tag. The
+handler `(tag $yield $on_yield)` specifies that the `$yield` tag
+is handled by running the code immediately following the block
+labelled with `$on_yield`, the `$on_yield` clause. The result of the
+block `(result (ref $cont))` declares that there will be a
+continuation on the stack when suspending with the `$yield` tag,
+which is the continuation of the currently executing thread. The
+`$on_yield` clause enqueues this continuation and proceeds to the next
+iteration of the loop.
+
+In order to interleave our three test threads together, we create a
+new continuation for each, enqueue the continuations, and invoke the
+scheduler. The `cont.new` operation turns a function reference into a
+corresponding continuation reference.
+
+```wast
+(module
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (func $scheduler (import "scheduler" "run"))
+ (func $enqueue (import "queue" "enqueue") (param (ref $cont)))
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (func $thread1 (import "example" "thread1"))
+ (func $thread2 (import "example" "thread2"))
+ (func $thread3 (import "example" "thread3"))
+
+ (elem declare func $thread1 $thread2 $thread3)
+
+ (func (export "run")
+ (call $enqueue (cont.new $cont (ref.func $thread1)))
+ (call $enqueue (cont.new $cont (ref.func $thread2)))
+ (call $enqueue (cont.new $cont (ref.func $thread3)))
+
+ (call $log (i32.const -1))
+ (call $scheduler)
+ (call $log (i32.const -2))
+ )
+)
+
+(invoke "run")
+```
+
+The output is as follows.
+```
+-1 : i32
+10 : i32
+20 : i32
+30 : i32
+11 : i32
+21 : i32
+31 : i32
+12 : i32
+22 : i32
+32 : i32
+-2 : i32
+```
+The threads are interleaved as expected.
+
+### Lightweight threads (dynamic)
+
+(The full code for this example is [here](examples/lwt.wast).)
+
+We can make our lightweight threads functionality considerably more
+expressive by allowing new threads to be forked dynamically.
+
+```wast
+(module $lwt
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (tag $yield (export "yield"))
+ (tag $fork (export "fork") (param (ref $cont)))
+)
+(register "lwt")
+```
+
+We declare a new `$fork` tag that takes a continuation as a
+parameter and (like `$yield`) returns no result. Now we modify our
+example to fork each of the three threads from a single main thread.
+
+```wast
+(module $example
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (tag $yield (import "lwt" "yield"))
+ (tag $fork (import "lwt" "fork") (param (ref $cont)))
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (elem declare func $thread1 $thread2 $thread3)
+
+ (func $main (export "main")
+ (call $log (i32.const 0))
+ (suspend $fork (cont.new $cont (ref.func $thread1)))
+ (call $log (i32.const 1))
+ (suspend $fork (cont.new $cont (ref.func $thread2)))
+ (call $log (i32.const 2))
+ (suspend $fork (cont.new $cont (ref.func $thread3)))
+ (call $log (i32.const 3))
+ )
+
+ (func $thread1
+ (call $log (i32.const 10))
+ (suspend $yield)
+ (call $log (i32.const 11))
+ (suspend $yield)
+ (call $log (i32.const 12))
+ )
+
+ (func $thread2
+ (call $log (i32.const 20))
+ (suspend $yield)
+ (call $log (i32.const 21))
+ (suspend $yield)
+ (call $log (i32.const 22))
+ )
+
+ (func $thread3
+ (call $log (i32.const 30))
+ (suspend $yield)
+ (call $log (i32.const 31))
+ (suspend $yield)
+ (call $log (i32.const 32))
+ )
+)
+(register "example")
+```
+
+As with the static example we define a scheduler module.
+```wast
+(module $scheduler
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (tag $yield (import "lwt" "yield"))
+ (tag $fork (import "lwt" "fork") (param (ref $cont)))
+
+ (func $queue-empty (import "queue" "queue-empty") (result i32))
+ (func $dequeue (import "queue" "dequeue") (result (ref null $cont)))
+ (func $enqueue (import "queue" "enqueue") (param $k (ref null $cont)))
+ ...
+)
+(register "scheduler")
+```
+
+In this example we illustrate five different schedulers. First, we
+write a baseline synchronous scheduler which simply runs the current
+thread to completion without actually yielding.
+
+```wast
+ (func $sync (export "sync") (param $nextk (ref null $cont))
+ (loop $l
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (block $on_fork (result (ref $cont) (ref $cont))
+ (resume $cont (tag $yield $on_yield)
+ (tag $fork $on_fork)
+ (local.get $nextk)
+ )
+ (local.set $nextk (call $dequeue))
+ (br $l) ;; thread terminated
+ ) ;; $on_fork (result (ref $cont) (ref $cont))
+ (local.set $nextk) ;; current thread
+ (call $enqueue) ;; new thread
+ (br $l)
+ )
+ ;; $on_yield (result (ref $cont))
+ (local.set $nextk) ;; carry on with current thread
+ (br $l)
+ )
+ )
+```
+
+The `$nextk` parameter represents the continuation of the next
+thread. The loop is repeatedly executed until `$nextk` is null
+(meaning that all threads have finished). The body of the loop is the
+code inside the two nested blocks. It resumes the next continuation,
+dequeues the next continuation, and then continues to the next
+iteration of the loop. The handler passed to `resume` specifies how to
+handle both `$yield` and `$fork` tags. Yielding carries on executing
+the current thread (this scheduler is synchronous). Forking enqueues
+the new thread and continues executing the current thread.
+
+As with the static example, the result of the `$on_yield` block
+`(result (ref $cont))` declares that there will be a continuation on
+the stack when suspending with the `$yield` tag, which is the
+continuation of the currently executing thread. The result of the
+`$on_fork` block `(result (ref $cont) (ref $cont))` declares that
+there will be two continuations on the stack when suspending with the
+`$fork` tag: the first is the parameter passed to fork (the new
+thread) and the second is the continuation of the currently executing
+thread.
+
+Running the synchronous scheduler on the example produces the following output.
+```
+0 : i32
+1 : i32
+2 : i32
+3 : i32
+10 : i32
+11 : i32
+12 : i32
+20 : i32
+21 : i32
+22 : i32
+30 : i32
+31 : i32
+32 : i32
+```
+First the main thread runs to completion, then each of the forked
+threads in sequence.
+
+Following a similar pattern, we define four different asynchronous
+schedulers.
+
+```wast
+ ;; four asynchronous schedulers:
+ ;; * kt and tk don't yield on encountering a fork
+ ;; 1) kt runs the continuation, queuing up the new thread for later
+ ;; 2) tk runs the new thread first, queuing up the continuation for later
+ ;; * ykt and ytk do yield on encountering a fork
+ ;; 3) ykt runs the continuation, queuing up the new thread for later
+ ;; 4) ytk runs the new thread first, queuing up the continuation for later
+
+ ;; no yield on fork, continuation first
+ (func $kt (export "kt") (param $nextk (ref null $cont))
+ (loop $l
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (block $on_fork (result (ref $cont) (ref $cont))
+ (resume $cont (tag $yield $on_yield)
+ (tag $fork $on_fork)
+ (local.get $nextk)
+ )
+ (local.set $nextk (call $dequeue))
+ (br $l) ;; thread terminated
+ ) ;; $on_fork (result (ref $cont) (ref $cont))
+ (local.set $nextk) ;; current thread
+ (call $enqueue) ;; new thread
+ (br $l)
+ )
+ ;; $on_yield (result (ref $cont))
+ (call $enqueue) ;; current thread
+ (local.set $nextk (call $dequeue)) ;; next thread
+ (br $l)
+ )
+ )
+
+ ;; no yield on fork, new thread first
+ (func $tk (export "tk") (param $nextk (ref null $cont))
+ (loop $l
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (block $on_fork (result (ref $cont) (ref $cont))
+ (resume $cont (tag $yield $on_yield)
+ (tag $fork $on_fork)
+ (local.get $nextk)
+ )
+ (local.set $nextk (call $dequeue))
+ (br $l) ;; thread terminated
+ ) ;; $on_fork (result (ref $cont) (ref $cont))
+ (call $enqueue) ;; current thread
+ (local.set $nextk) ;; new thread
+ (br $l)
+ )
+ ;; $on_yield (result (ref $cont))
+ (call $enqueue) ;; current thread
+ (local.set $nextk (call $dequeue)) ;; next thread
+ (br $l)
+ )
+ )
+
+ ;; yield on fork, continuation first
+ (func $ykt (export "ykt") (param $nextk (ref null $cont))
+ (loop $l
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (block $on_fork (result (ref $cont) (ref $cont))
+ (resume $cont (tag $yield $on_yield)
+ (tag $fork $on_fork)
+ (local.get $nextk)
+ )
+ (local.set $nextk (call $dequeue))
+ (br $l) ;; thread terminated
+ ) ;; $on_fork (result (ref $cont) (ref $cont))
+ (call $enqueue) ;; current thread
+ (call $enqueue) ;; new thread
+ (local.set $nextk (call $dequeue)) ;; next thread
+ (br $l)
+ )
+ ;; $on_yield (result (ref $cont))
+ (call $enqueue) ;; current thread
+ (local.set $nextk (call $dequeue)) ;; next thread
+ (br $l)
+ )
+ )
+
+ ;; yield on fork, new thread first
+ (func $ytk (export "ytk") (param $nextk (ref null $cont))
+ (loop $l
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (block $on_fork (result (ref $cont) (ref $cont))
+ (resume $cont (tag $yield $on_yield)
+ (tag $fork $on_fork)
+ (local.get $nextk)
+ )
+ (local.set $nextk (call $dequeue))
+ (br $l) ;; thread terminated
+ ) ;; $on_fork (result (ref $cont) (ref $cont))
+ (local.set $nextk)
+ (call $enqueue) ;; new thread
+ (call $enqueue (local.get $nextk)) ;; current thread
+ (local.set $nextk (call $dequeue)) ;; next thread
+ (br $l)
+ )
+ ;; $on_yield (result (ref $cont))
+ (call $enqueue) ;; current thread
+ (local.set $nextk (call $dequeue)) ;; next thread
+ (br $l)
+ )
+ )
+```
+
+Each `$on_yield` clause is identical, enqueuing the continuation of the
+current thread and dequeuing the continuation of the next thread. The
+`$on_fork` clauses implement different behaviours for scheduling the
+current and newly forked threads.
+
+We run our example using each of the five schedulers.
+
+```wast
+(module
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (func $scheduler1 (import "scheduler" "sync") (param $nextk (ref null $cont)))
+ (func $scheduler2 (import "scheduler" "kt") (param $nextk (ref null $cont)))
+ (func $scheduler3 (import "scheduler" "tk") (param $nextk (ref null $cont)))
+ (func $scheduler4 (import "scheduler" "ykt") (param $nextk (ref null $cont)))
+ (func $scheduler5 (import "scheduler" "ytk") (param $nextk (ref null $cont)))
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (func $main (import "example" "main"))
+
+ (elem declare func $main)
+
+ (func (export "run")
+ (call $log (i32.const -1))
+ (call $scheduler1 (cont.new $cont (ref.func $main)))
+ (call $log (i32.const -2))
+ (call $scheduler2 (cont.new $cont (ref.func $main)))
+ (call $log (i32.const -3))
+ (call $scheduler3 (cont.new $cont (ref.func $main)))
+ (call $log (i32.const -4))
+ (call $scheduler4 (cont.new $cont (ref.func $main)))
+ (call $log (i32.const -5))
+ (call $scheduler5 (cont.new $cont (ref.func $main)))
+ (call $log (i32.const -6))
+ )
+)
+
+(invoke "run")
+```
+
+The output is as follows, demonstrating the different scheduling behaviours.
+```
+-1 : i32
+0 : i32
+1 : i32
+2 : i32
+3 : i32
+10 : i32
+11 : i32
+12 : i32
+20 : i32
+21 : i32
+22 : i32
+30 : i32
+31 : i32
+32 : i32
+-2 : i32
+0 : i32
+1 : i32
+2 : i32
+3 : i32
+10 : i32
+20 : i32
+30 : i32
+11 : i32
+21 : i32
+31 : i32
+12 : i32
+22 : i32
+32 : i32
+-3 : i32
+0 : i32
+10 : i32
+1 : i32
+20 : i32
+11 : i32
+2 : i32
+30 : i32
+21 : i32
+12 : i32
+3 : i32
+31 : i32
+22 : i32
+32 : i32
+-4 : i32
+0 : i32
+1 : i32
+10 : i32
+2 : i32
+20 : i32
+11 : i32
+3 : i32
+30 : i32
+21 : i32
+12 : i32
+31 : i32
+22 : i32
+32 : i32
+-5 : i32
+0 : i32
+10 : i32
+1 : i32
+11 : i32
+20 : i32
+2 : i32
+12 : i32
+21 : i32
+30 : i32
+3 : i32
+22 : i32
+31 : i32
+32 : i32
+-6 : i32
+```
+
+### Actors
+
+TODO
+
+### Async/await
+
+TODO
+
+### Delimited continuations
+
+(The full code for this example is [here](examples/control-lwt.wast).)
+
+Conventional unstructured delimited continuations can be directly
+implemented using our typed continuations design. Here we illustrate
+how to implement lightweight threads on top of the control/prompt
+delimited control operators.
+
+First we implement control/prompt.
+
+```wast
+;; interface to control/prompt
+(module $control
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ ;; we sometimes write contref as shorthand for a reference to a continuation
+
+ (type $cont-func (func (param (ref $cont)))) ;; [contref ([] -> [])] -> []
+ (type $cont-cont (cont $cont-func)) ;; cont ([contref ([] -> [])] -> [])
+
+ ;; Implementation of a generic delimited control operator using
+ ;; effect handlers.
+ ;;
+ ;; For lightweight threads we have no payload. More general types
+ ;; for control and prompt are:
+ ;;
+ ;; control : [([contref ([ta*] -> [tr*])] -> [tr*])] -> [ta*]
+ ;; prompt : [contref ([] -> [tr*])] -> [tr*]
+ ;;
+ ;; (We can also give more refined types if we want to support
+ ;; answer-type modification and various flavours of answer-type
+ ;; polymorphism - but these are well outside the scope of a Wasm
+ ;; proposal!)
+ ;;
+ ;; (Technically this is control0/prompt0 rather than
+ ;; control/prompt.)
+ (tag $control (export "control") (param (ref $cont-func))) ;; control : [([contref ([] -> [])] -> [])] -> []
+ (func $prompt (export "prompt") (param $nextk (ref null $cont)) ;; prompt : [(contref ([] -> []))] -> []
+ (block $on_control (result (ref $cont-func) (ref $cont))
+ (resume $cont (tag $control $on_control)
+ (local.get $nextk))
+ (return)
+ ) ;; $on_control (param (ref $cont-func) (ref $cont))
+ (let (local $h (ref $cont-func)) (local $k (ref $cont))
+ (call_ref (local.get $k) (local.get $h))
+ )
+ )
+)
+(register "control")
+```
+
+The `$control` tag amounts to a universal control tag, which takes a
+second-order function `$h` as an argument (it's second-order in that
+it's a function that itself takes a function, wrapped in a
+continuation, as an argument). The implementation of `$prompt` is the
+universal handler for `$control`, which simply applies the
+second-order function `$h` to the captured continuation.
+
+In the above code we have specialised `$control` and `$prompt` to the
+case where the continuation has no parameters and no results, as this
+suffices for implementing lightweight threads. A continuation
+parameter corresponds to the result of a control tag, so in the
+absence of parametric polymorphism, in order to simulate standard
+control tags in general we would need one copy of `$control` for each
+type of result we wanted to support.
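+
+For instance, a hypothetical copy of `$control` specialised to
+continuations expecting an `i32` result might be declared as follows
+(the names are invented for illustration and are not part of the
+example code):
+
+```wast
+;; hypothetical specialisation of $control to an i32 result type
+(type $i32-func (func (param i32)))                  ;; [i32] -> []
+(type $i32-cont (cont $i32-func))                    ;; cont ([i32] -> [])
+(type $i32-cont-func (func (param (ref $i32-cont)))) ;; [contref ([i32] -> [])] -> []
+
+;; control-i32 : [([contref ([i32] -> [])] -> [])] -> [i32]
+(tag $control-i32 (param (ref $i32-cont-func)))
+```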
+
+The following example is just like the one we implemented for dynamic
+lightweight threads using `$yield` and `$fork` tags decoupled from
+handlers for defining different schedulers. Here instead we
+parameterise the whole example by the behaviour of yielding and
+forking as `$yield` and `$fork` functions.
+
+```wast
+(module $example
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (type $cont-func (func (param (ref $cont)))) ;; [cont ([] -> [])] -> []
+ (type $cont-cont (cont $cont-func)) ;; cont ([cont ([] -> [])] -> [])
+
+ (type $func-cont-func-func (func (param (ref $func)) (param (ref $cont-func)))) ;; ([] -> []) -> ([contref ([] -> [])] -> []) -> []
+ (type $func-cont-func-cont (cont $func-cont-func-func)) ;; cont (([] -> []) -> ([contref ([] -> [])] -> []) -> [])
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (elem declare func $main $thread1 $thread2 $thread3)
+
+ (func $main (export "main") (param $yield (ref $func)) (param $fork (ref $cont-func))
+ (call $log (i32.const 0))
+ (call_ref
+ (cont.bind $func-cont-func-cont $cont (local.get $yield) (local.get $fork)
+ (cont.new $func-cont-func-cont (ref.func $thread1)))
+ (local.get $fork))
+ (call $log (i32.const 1))
+ (call_ref
+ (cont.bind $func-cont-func-cont $cont (local.get $yield) (local.get $fork)
+ (cont.new $func-cont-func-cont (ref.func $thread2)))
+ (local.get $fork))
+ (call $log (i32.const 2))
+ (call_ref
+ (cont.bind $func-cont-func-cont $cont (local.get $yield) (local.get $fork)
+ (cont.new $func-cont-func-cont (ref.func $thread3)))
+ (local.get $fork))
+ (call $log (i32.const 3))
+ )
+
+ (func $thread1 (param $yield (ref $func)) (param $fork (ref $cont-func))
+ (call $log (i32.const 10))
+ (call_ref (local.get $yield))
+ (call $log (i32.const 11))
+ (call_ref (local.get $yield))
+ (call $log (i32.const 12))
+ )
+
+ (func $thread2 (param $yield (ref $func)) (param $fork (ref $cont-func))
+ (call $log (i32.const 20))
+ (call_ref (local.get $yield))
+ (call $log (i32.const 21))
+ (call_ref (local.get $yield))
+ (call $log (i32.const 22))
+ )
+
+ (func $thread3 (param $yield (ref $func)) (param $fork (ref $cont-func))
+ (call $log (i32.const 30))
+ (call_ref (local.get $yield))
+ (call $log (i32.const 31))
+ (call_ref (local.get $yield))
+ (call $log (i32.const 32))
+ )
+)
+(register "example")
+```
+
+The function type `$func-cont-func-func` is the type of a function
+that takes an implementation of a `$yield` function and an
+implementation of a `$fork` function as parameters; the continuation
+type `$func-cont-func-cont` is the type of a continuation with the
+same parameters.
+
+We now define a scheduler module analogous to that of the previous
+dynamic lightweight thread example. As before, we will implement five
+different schedulers.
+
+```wast
+(module
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (type $cont-func (func (param (ref $cont)))) ;; [contref ([] -> [])] -> []
+  (type $cont-cont (cont $cont-func)) ;; cont ([contref ([] -> [])] -> [])
+
+ (type $func-cont-func-func (func (param (ref $func)) (param (ref $cont-func)))) ;; ([] -> []) -> ([cont ([] -> [])] -> []) -> []
+ (type $func-cont-func-cont (cont $func-cont-func-func)) ;; cont (([] -> []) -> ([cont ([] -> [])] -> []) -> [])
+
+  (elem declare func
+    $handle-yield-sync $handle-yield
+    $handle-fork-sync $handle-fork-kt $handle-fork-tk $handle-fork-ykt $handle-fork-ytk
+    $yield-sync $yield
+    $fork-sync $fork-kt $fork-tk $fork-ykt $fork-ytk)
+
+ ;; control/prompt interface
+ (tag $control (import "control" "control") (param (ref $cont-func))) ;; control : ([cont ([] -> [])] -> []) -> []
+ (func $prompt (import "control" "prompt") (param $nextk (ref null $cont))) ;; prompt : cont ([] -> []) -> []
+
+ ;; queue interface
+ (func $queue-empty (import "queue" "queue-empty") (result i32))
+ (func $dequeue (import "queue" "dequeue") (result (ref null $cont)))
+ (func $enqueue (import "queue" "enqueue") (param $k (ref $cont)))
+ ...
+(register "scheduler")
+```
+
+Unlike before, with control/prompt a generic scheduler loop must be
+decoupled from the implementation of each operation (yield / fork),
+as the latter are passed in as arguments to user code.
+
+```wast
+ ;; generic boilerplate scheduler
+ (func $scheduler (param $nextk (ref null $cont))
+ (loop $loop
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (call $prompt (local.get $nextk))
+ (local.set $nextk (call $dequeue))
+ (br $loop)
+ )
+ )
+```
+
+The scheduler loop simply keeps on calling prompt with the next thread
+in the queue until the queue of threads is exhausted.
+
+For each scheduler, we invoke the generic scheduler using a
+continuation parameterised by suitable implementations of yield and
+fork.
+
+First, we do the baseline synchronous scheduler.
+
+```wast
+ ;; synchronous scheduler
+ (func $handle-yield-sync (param $k (ref $cont))
+ (call $scheduler (local.get $k))
+ )
+  (func $yield-sync
+    (suspend $control (ref.func $handle-yield-sync))
+  )
+ (func $handle-fork-sync (param $t (ref $cont)) (param $k (ref $cont))
+ (call $enqueue (local.get $t))
+ (call $scheduler (local.get $k))
+ )
+ (func $fork-sync (param $t (ref $cont))
+ (suspend $control (func.bind (type $cont-func) (local.get $t) (ref.func $handle-fork-sync)))
+ )
+  (func $sync (export "sync") (param $k (ref $func-cont-func-cont))
+    (call $scheduler
+      (cont.bind $func-cont-func-cont $cont (ref.func $yield-sync) (ref.func $fork-sync) (local.get $k)))
+  )
+```
+
+The `func.bind` instruction is needed in the implementations of fork.
+More generally, `func.bind` is needed for any operation that takes
+arguments. One could use another continuation here instead, but
+constructing a new continuation every time an operation is invoked
+seems unnecessarily wasteful.
+
+All of the asynchronous schedulers make use of the same implementation
+of yield, which enqueues the continuation of the current thread and
+dequeues the next available thread.
+
+```wast
+ ;; asynchronous yield (used by all asynchronous schedulers)
+ (func $handle-yield (param $k (ref $cont))
+ (call $enqueue (local.get $k))
+ (call $scheduler (call $dequeue))
+ )
+ (func $yield
+ (suspend $control (ref.func $handle-yield))
+ )
+```
+
+Each asynchronous scheduler uses its own implementation of fork.
+
+```wast
+ ;; four asynchronous implementations of fork:
+ ;; * kt and tk don't yield on encountering a fork
+ ;; 1) kt runs the continuation, queuing up the new thread for later
+ ;; 2) tk runs the new thread first, queuing up the continuation for later
+ ;; * ykt and ytk do yield on encountering a fork
+ ;; 3) ykt runs the continuation, queuing up the new thread for later
+ ;; 4) ytk runs the new thread first, queuing up the continuation for later
+
+ ;; no yield on fork, continuation first
+ (func $handle-fork-kt (param $t (ref $cont)) (param $k (ref $cont))
+ (call $enqueue (local.get $t))
+ (call $scheduler (local.get $k))
+ )
+ (func $fork-kt (param $t (ref $cont))
+ (suspend $control (func.bind (type $cont-func) (local.get $t) (ref.func $handle-fork-kt)))
+ )
+ (func $kt (export "kt") (param $k (ref $func-cont-func-cont))
+ (call $scheduler
+ (cont.bind $func-cont-func-cont $cont (ref.func $yield) (ref.func $fork-kt) (local.get $k)))
+ )
+
+ ;; no yield on fork, new thread first
+ (func $handle-fork-tk (param $t (ref $cont)) (param $k (ref $cont))
+ (call $enqueue (local.get $k))
+ (call $scheduler (local.get $t))
+ )
+ (func $fork-tk (param $t (ref $cont))
+ (suspend $control (func.bind (type $cont-func) (local.get $t) (ref.func $handle-fork-tk)))
+ )
+ (func $tk (export "tk") (param $k (ref $func-cont-func-cont))
+ (call $scheduler
+ (cont.bind $func-cont-func-cont $cont (ref.func $yield) (ref.func $fork-tk) (local.get $k)))
+ )
+
+ ;; yield on fork, continuation first
+ (func $handle-fork-ykt (param $t (ref $cont)) (param $k (ref $cont))
+ (call $enqueue (local.get $k))
+ (call $enqueue (local.get $t))
+ (call $scheduler (call $dequeue))
+ )
+ (func $fork-ykt (param $t (ref $cont))
+ (suspend $control (func.bind (type $cont-func) (local.get $t) (ref.func $handle-fork-ykt)))
+ )
+ (func $ykt (export "ykt") (param $k (ref $func-cont-func-cont))
+ (call $scheduler
+ (cont.bind $func-cont-func-cont $cont (ref.func $yield) (ref.func $fork-ykt) (local.get $k)))
+ )
+
+ ;; yield on fork, new thread first
+ (func $handle-fork-ytk (param $t (ref $cont)) (param $k (ref $cont))
+ (call $enqueue (local.get $t))
+ (call $enqueue (local.get $k))
+ (call $scheduler (call $dequeue))
+ )
+ (func $fork-ytk (param $t (ref $cont))
+ (suspend $control (func.bind (type $cont-func) (local.get $t) (ref.func $handle-fork-ytk)))
+ )
+ (func $ytk (export "ytk") (param $k (ref $func-cont-func-cont))
+ (call $scheduler
+ (cont.bind $func-cont-func-cont $cont (ref.func $yield) (ref.func $fork-ytk) (local.get $k)))
+ )
+)
+(register "scheduler")
+```
+
+Invoking the schedulers is much like in our original dynamic
+lightweight threads example, but the types are more complex due to
+the need to parameterise the handled computation (`$main` in this
+case) by the implementations of forking and yielding.
+
+```wast
+(module
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (type $cont-func (func (param (ref $cont)))) ;; [contref ([] -> [])] -> []
+ (type $cont-cont (cont $cont-func)) ;; cont ([contref ([] -> [])] -> [])
+
+ (type $func-cont-func-func (func (param (ref $func)) (param (ref $cont-func)))) ;; ([] -> []) -> ([contref ([] -> [])] -> []) -> []
+ (type $func-cont-func-cont (cont $func-cont-func-func)) ;; contref (([] -> []) -> ([contref ([] -> [])] -> []) -> [])
+
+ (func $scheduler-sync (import "scheduler" "sync") (param $nextk (ref $func-cont-func-cont)))
+ (func $scheduler-kt (import "scheduler" "kt") (param $nextk (ref $func-cont-func-cont)))
+ (func $scheduler-tk (import "scheduler" "tk") (param $nextk (ref $func-cont-func-cont)))
+ (func $scheduler-ykt (import "scheduler" "ykt") (param $nextk (ref $func-cont-func-cont)))
+ (func $scheduler-ytk (import "scheduler" "ytk") (param $nextk (ref $func-cont-func-cont)))
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (func $main (import "example" "main") (param $yield (ref $func)) (param $fork (ref $cont-func)))
+
+ (elem declare func $main)
+
+ (func $run (export "run")
+ (call $log (i32.const -1))
+ (call $scheduler-sync (cont.new $func-cont-func-cont (ref.func $main)))
+ (call $log (i32.const -2))
+ (call $scheduler-kt (cont.new $func-cont-func-cont (ref.func $main)))
+ (call $log (i32.const -3))
+ (call $scheduler-tk (cont.new $func-cont-func-cont (ref.func $main)))
+ (call $log (i32.const -4))
+ (call $scheduler-ykt (cont.new $func-cont-func-cont (ref.func $main)))
+ (call $log (i32.const -5))
+ (call $scheduler-ytk (cont.new $func-cont-func-cont (ref.func $main)))
+ (call $log (i32.const -6))
+ )
+)
+```
+
+The output of running this code is just as in the direct
+implementation of dynamic lightweight threads.
+
+## Implementation strategies
+
+### Segmented stacks
+
+TODO
+
+<!--
+Segmented stacks is an implementation technique for
+continuations (cite: Dybvig et al., Chez Scheme, Multicore OCaml). The
+principal idea underpinning segmented stacks is to view each
+continuation as representing a separate stack.
+
+Initially there is one stack, which may create other stacks. Each
+child stack may create further stacks. Lineage of stacks is maintained
+by linking children to their parent.
+
+```ioke
+ (stack 1)
+ (active)
+ |-----------------------|
+ | |
+ | ... |
+ | |
+ | (resume $h1 $c); |
+>| $c = (cont.new $f) |
+ | |
+ . .
+ . .
+
+```
+
+The first stack may perform some computation, before the stack pointer
+`>` moves to the `cont.new` instruction. Execution of the `cont.new`
+instruction creates a new suspended stack for the computation implied
+by `$f`.
+
+
+```ioke
+ (stack 1) (stack 2)
+ (active)
+ |---------------------|
+ | |
+ | ... |
+ | | (suspended)
+ | | |---------------------|
+>| resume $h1 $c -------------| $f() |
+ . . . .
+ . . . .
+ . .
+
+```
+
+Stack 1 maintains control after the creation of stack 2, and thus
+execution continues on stack 1. The next instruction, `resume`,
+suspends stack 1 and transfers control to the new stack 2. Before
+transferring control, the instruction installs a delimiter `$h1` on
+stack 1. The transfer of control reverses the parent-child link, such
+that stack 2 now points back to stack 1.
+
+```ioke
+ (stack 1) (stack 2)
+ (active)
+ |---------------------|
+ | |
+ | ... |
+ | | (suspended)
+ | | ----|---------------------|
+>| $h1 |<-----/ | $f() |
+ . . | |
+ . . . .
+ . . . .
+ . .
+```
+
+The stack pointer moves to the top of stack 2, and thus execution
+continues on stack 2.
+
+
+```ioke
+ (stack 1) (stack 2)
+ (suspended)
+ |---------------------|
+ | |
+ | ... |
+ | | (active)
+ | | ----|---------------------|
+ | $h1 |<-----/ >| $f() |
+ . . | |
+ . . . .
+ . . . .
+ . .
+```
+
+As execution continues on stack 2 it may eventually perform a
+`suspend`, which will cause another transfer of
+control. Supposing it invokes `suspend` with some `$e` handled
+by `$h1`, then stack 2 will transfer control back to stack 1.
+
+
+```ioke
+ (stack 1) (stack 2)
+ (suspended)
+ |---------------------|
+ | |
+ | ... |
+ | | (active)
+ | | ----|---------------------|
+ | $h1 |<-----/ | ... |
+ . . >| suspend $e |
+ . . . .
+ . . . .
+ . .
+```
+
+The suspension reverses the parent-child link again, and leaves behind
+a "hole" on stack 2, that can be filled by an invocation of
+`resume`.
+
+```ioke
+ (stack 1) (stack 2)
+ (suspended)
+ |---------------------|
+ | |
+ | ... |
+ | | (active)
+ | | ----|---------------------|
+>| $h1 |------/ | ... |
+ . . | [ ] |
+ . . . .
+ . . . .
+ . .
+```
+
+-->
+
+
+### Continuation-passing style
+
+TODO
+
+### Virtual memory
+
+TODO
+
+### Stack cut'n'paste
+
+TODO
+
+### OS threads
+
+TODO
+
+## Design considerations and extensions
+
+### Memory management
+
+The current proposal does not require a general garbage collector as
+the linearity of continuations guarantees that there are no cycles in
+continuation objects. In theory, we could dispense with automated
+memory management altogether if we took seriously the idea that
+failure to use a continuation constitutes a bug in the producer. In
+practice, for most producers enforcing such a discipline is
+unrealistic and not something an engine can rely on anyway. To prevent
+space leaks, most engines will need some form of automated memory
+management for unconsumed continuations. Due to the acyclicity of
+continuations, a reference counting scheme is sufficient.
+
+### Linear versus constant time dispatch
+
+The `suspend` instruction relies on traversing a stack of
+handlers in order to find the appropriate handler, similarly to
+exception handling. A potential problem is that this can incur a
+linear runtime cost, especially if we think in terms of segmented
+stacks, where `suspend` must search the active stack chain for a
+suitable handler for its argument. Practical experience from Multicore
+OCaml suggests that for critical use cases (async/await, lightweight
+threads, actors, etc.) the depth of the handler stack tends to be
+small so the cost of this linear traversal is negligible. Nonetheless,
+future applications may benefit from constant-time dispatch. To enable
+constant-time dispatch we would need to know the target stack a
+priori, which might be achieved either by maintaining a shadow stack
+or by extending `suspend` to explicitly target a named handler.
+
+### Named handlers
+
+We can accommodate named handlers by introducing a new reference type
+`handler t*`, which essentially is a unique prompt created by
+executing a variant of the `resume` instruction and is passed to the
+continuation:
+
+```wast
+ resume_with $ht $ct (tag $e $l)* : [ t1* (ref $ht) (ref $ct) ] -> [ t2* ]
+ where:
+ - $ht = handler t2*
+ - $ct = cont ([ (ref $ht) t1* ] -> [ t2* ])
+```
+
+The handler reference is similar to a prompt in a system of
+multi-prompt continuations. However, since it is created fresh for
+each handler, multiple activations of the same prompt cannot exist by
+construction.
+
+This instruction is complemented by an instruction for suspending to a
+specific handler:
+
+```wast
+ suspend_to $ht $e : [ s* (ref $ht) ] -> [ t* ]
+ where:
+ - $ht = handler tr*
+ - $e : [ s* ] -> [ t* ]
+```
+
+If the handler is not currently active, e.g., because an outer handler
+has been suspended, then this instruction would trap.
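+
+Under this sketched typing, a computation parameterised by a named
+handler might look as follows. This is hypothetical code: `handler`
+types and `suspend_to` are not part of the current proposal, and all
+names are invented.
+
+```wast
+;; hypothetical: suspending to a specific named handler
+(type $ht (handler i32))  ;; a handler whose result type is [i32]
+(tag $ask (result i32))   ;; control tag : [] -> [i32]
+
+;; the handled computation receives its handler reference explicitly
+(func $client (param $h (ref $ht)) (result i32)
+  (i32.add
+    (suspend_to $ht $ask (local.get $h))
+    (suspend_to $ht $ask (local.get $h))))
+```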
+
+### Direct switching
+
+The current proposal uses the asymmetric suspend/resume pair of
+primitives that is characteristic of effect handlers. It does not
+include a symmetric way of switching to another continuation directly,
+without going through a handler, and it is conceivable that the double
+hop through a handler might involve unnecessary overhead for use cases
+like lightweight threading.
+
+Though there is currently no evidence that the double hop overhead is
+significant in practice, if it does turn out to be important for some
+applications then the current proposal can be extended with a more
+symmetric `switch_to` primitive.
+
+Given named handlers, it is possible to introduce a somewhat magic
+instruction for switching directly to another continuation:
+
+```wast
+ switch_to : [ t1* (ref $ct1) (ref $ht) ] -> [ t2* ]
+ where:
+ - $ht = handler t3*
+    - $ct1 = cont ([ (ref $ht) (ref $ct2) t1* ] -> [ t3* ])
+ - $ct2 = cont ([ t2* ] -> [ t3* ])
+```
+
+This behaves as if there were a built-in tag
+
+```wast
+ (tag $Switch (param t1* (ref $ct1)) (result t3*))
+```
+
+with which the computation suspends to the handler, and the handler
+implicitly handles this by resuming to the continuation argument,
+thereby effectively switching to it in one step. Like `suspend_to`,
+this would trap if the handler was not currently active.
+
+The fact that the handler implicitly resumes, passing itself as a
+handler to the target continuation, makes this construct behave like a
+deep handler, which is slightly at odds with the rest of the proposal.
+
+In addition to the handler, `switch_to` also passes the new
+continuation to the target, which allows the target to switch back to
+it in a symmetric fashion. Notably, in such a use case, `$ct1` and
+`$ct2` would be the same type (and hence recursive).
+
+In fact, symmetric switching need not necessarily be tied to named
+handlers, since there could also be an indirect version with dynamic
+handler lookup:
+
+```wast
+ switch : [ t1* (ref $ct1) ] -> [ t2* ]
+ where:
+ - $ct1 = cont ([ (ref $ct2) t1* ] -> [ t3* ])
+ - $ct2 = cont ([ t2* ] -> [ t3* ])
+```
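+
+In the symmetric case where `$ct1` and `$ct2` coincide, the
+continuation type becomes recursive. The following hypothetical
+sketch (with invented names; `switch` is not part of the current
+proposal) shows a coroutine passing control back and forth with a
+partner:
+
+```wast
+;; hypothetical: symmetric switching over a recursive continuation type
+(rec
+  (type $ft (func (param (ref $ct))))  ;; [contref $ct] -> []
+  (type $ct (cont $ft)))               ;; cont ([contref $ct] -> [])
+
+;; repeatedly hand control to the partner, receiving the partner's
+;; continuation back each time control returns
+(func $ping (param $partner (ref $ct))
+  (loop $l
+    (local.set $partner (switch $ct (local.get $partner)))
+    (br $l)))
+```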
+
+It seems undesirable that every handler implicitly handles the
+built-in `$Switch` tag, so this should be opt-in by a mode flag on the
+resume instruction(s).
+
+### Control/prompt as an alternative basis
+
+An alternative to our typed continuations proposal is to use more
+established delimited control operators such as control/prompt and
+shift/reset. As illustrated in the examples section, control/prompt
+can be viewed as a special instance of the current proposal with a
+single universal control tag `control` and a handler for each
+`prompt`.
+
+As `control` amounts to a universal control tag it correspondingly has
+a higher-order type. As illustrated by the example, this requires more
+complicated types than with the current proposal and depends on
+greater use of function closures.
+
+When considered as a source language feature, effect handlers are
+preferable to control/prompt because they are more modular and easier
+to reason about. Effect handlers naturally provide a separation of
+concerns. Users program to an effect interface, whereas `control`
+allows (and indeed requires) them to essentially rewrite the
+implementation inline (in practice this is unmanageable, so one
+abstracts over a few key behaviours using functions as illustrated in
+the example). Of course, intermediate languages have different
+requirements to source languages, so modularity and ease of reasoning
+may be less critical. Nonetheless, they should not be discounted
+entirely.
+
+### Coupling of continuation capture and dispatch
+
+A possible concern with the current design is that it relies on a
+specific form of dispatch based on tags. Suspending not only captures
+the current continuation up to the nearest prompt, but also dispatches
+to the handler clause associated with the given tag. It might be
+tempting to try to decouple continuation capture from dispatch, but it
+is unclear what other form of dispatch would be useful or whether
+there is a clean way to enable such decoupling.
+
+With control/prompt there is no coupling of continuation capture with
+dispatch, because there is no dispatch. But this is precisely because
+`control` behaves as a universal tag, which requires behaviour to be
+given inline via a closure, breaking modularity and necessitating a
+higher-order type even for simple uses of continuations like
+lightweight threads.
+
+This is not to say that control/prompt or a generalisation to
+multiprompt delimited continuations is necessarily a bad low-level
+implementation technique. For instance, the
+[libmprompt](https://github.com/koka-lang/libmprompt) C library
+implements effect handlers on top of multiprompt delimited
+continuations. However, a key difference there is that the C
+implementation does not require static stack typing, something that is
+fundamental to the design of Wasm. Thus, the implementation does not
+need to contend directly with the higher-order type of `control`.
+
+### Tail-resumptive handlers
+
+A handler is said to be *tail-resumptive* if the handler invokes the
+continuation in tail-position in every control tag clause. The
+canonical example of a tail-resumptive handler is dynamic binding
+(which can be useful to implement implicit parameters to
+computations). The control tag clauses of a tail-resumptive handler
+can be inlined at the control tag invocation sites, because they do
+not perform any non-trivial control flow manipulation; they simply
+retrieve a value. Inlining clause definitions means that no time is
+spent constructing continuation objects.
+
+The present iteration of this proposal does not include facilities for
+identifying and inlining tail-resumptive handlers. None of the
+critical use-cases requires such a facility. Nevertheless, it is
+natural to envisage a future iteration of this proposal that includes
+an extension for distinguishing tail-resumptive handlers.
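+
+To illustrate, dynamic binding can be written as a tail-resumptive
+handler using only instructions from the current proposal. The
+following minimal sketch (with invented names) handles a `$get` tag
+whose clause always resumes immediately with the bound value:
+
+```wast
+;; sketch of dynamic binding as a tail-resumptive handler
+(type $func (func))                  ;; [] -> []
+(type $cont (cont $func))            ;; cont ([] -> [])
+(type $i32-func (func (param i32)))  ;; [i32] -> []
+(type $i32-cont (cont $i32-func))    ;; cont ([i32] -> [])
+
+(tag $get (result i32))  ;; read the implicitly bound value
+
+;; run $k with the implicit parameter bound to $v; the $get clause
+;; resumes in tail position, reinstalling the handler via the loop
+(func $with-value (param $v i32) (param $k (ref null $cont))
+  (local $g (ref null $i32-cont))
+  (loop $l
+    (block $on_get (result (ref $i32-cont))
+      (resume $cont (tag $get $on_get) (local.get $k))
+      (return))
+    (local.set $g)
+    (local.set $k (cont.bind $i32-cont $cont (local.get $v) (local.get $g)))
+    (br $l)))
+```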
+
+### Multi-shot continuations
+
+Continuations in this proposal are *single-shot* (aka *linear*),
+meaning that they must be invoked exactly once (though this is not
+statically enforced). A continuation can be invoked either by resuming
+it (with `resume`) or by aborting it (with `resume_throw`). Some
+applications such as backtracking, probabilistic programming, and
+process duplication exploit *multi-shot* continuations, but none of
+the critical use cases require multi-shot continuations. Nevertheless,
+it is natural to envisage a future iteration of this proposal that
+includes support for multi-shot continuations by way of a continuation
+clone instruction.
+
+### Interoperability, legacy code, and the barrier instruction
+
+The barrier instruction provides a direct way of preventing control
+tags from being suspended outside a particular computation.
+
+Consider a module A written using an existing C/C++ compiler that
+targets a Wasm backend. Let us assume that module A depends on a
+second Wasm module B. Now suppose that the producer for module B is
+updated to take advantage of typed continuations. In order to ensure
+that suspensions arising in calls to B do not pass through A,
+potentially causing unexpected changes to the semantics of A, the
+producer for module A can ensure that all external calls are wrapped
+in the barrier instruction.
+
+It might seem preferable to somehow guarantee that support for typed
+continuations is not enabled by default, meaning that no changes to
+the producer for module A would be necessary. But it is unclear what
+such an approach would look like in practice and whether it would
+actually be feasible. In any case, using the barrier instruction the
+producer for B could make module B safe for linking with an unchanged
+module A by wrapping the barrier instruction around all of the
+functions exported by module B.
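+
+The latter pattern might be sketched as follows (names invented for
+illustration):
+
+```wast
+;; sketch: module B shields its callers from suspensions by wrapping
+;; each export in a barrier; a suspension reaching the barrier traps
+(func $work
+  ;; ... may suspend internally, under B's own handlers ...
+)
+(func (export "work")
+  (barrier (call $work)))
+```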
+
+Questions of Wasm interoperability and support for legacy code are
+largely orthogonal to the typed continuations proposal and similar
+issues already arise with extensions such as exceptions.
+
+### First-class tags
+
+In the current proposal tags are statically defined in a module
+header. This should suffice for supporting the critical
+use-cases. However, for some purposes, such as implementing richer
+forms of control operators such as effect handlers, it might be useful
+to add support for dynamically generated tags. These could be used,
+for instance, for more efficiently compiling effect handlers that take
+advantage of features such as Multicore OCaml's functors, where the
+type of an effect (tag) may not be fully known at compile time.
+
+### Shallow versus deep handlers
+
+The effect handlers feature, which underlies the design of the typed
+continuations proposal, classically comes in two forms: shallow and
+deep handlers. With shallow handlers, the installation of handlers is
+completely decoupled from resuming a continuation. With deep handlers,
+the handler that produced the continuation is automatically
+reinstalled when a continuation is resumed. The typed continuations
+proposal adopts a hybrid of shallow and deep handlers, which we call
+*sheep handlers*. Like a shallow handler, there is no automatic
+reinstallation of an existing handler. But like deep handlers a new
+handler is installed when a continuation is resumed: the new handler
+is written explicitly as part of the `resume` instruction.
+
+
+TODO: resuspend (aka OCaml's reperform, and analogous to exception proposal's rethrow)
+
+TODO: return clauses
+
+TODO: preemption / asynchrony / interrupts
+
+TODO: how do we interact with parametric polymorphism?
+
+TODO: lexically-scoped handlers
+
+TODO: parametric tags / existential types?
+
+TODO: tag subtyping?
+
+TODO: compare to asyncify?
+
+TODO: compare to Wasm/k?
+
+TODO: compare to the Koka Wasm backend?
diff --git a/proposals/continuations/Overview.md b/proposals/continuations/Overview.md
new file mode 100644
index 00000000..7b4e5fc0
--- /dev/null
+++ b/proposals/continuations/Overview.md
@@ -0,0 +1,166 @@
+# Typed Continuations for WebAssembly
+
+## Language Extensions
+
+Based on [typed reference proposal](https://github.com/WebAssembly/function-references/blob/master/proposals/function-references/Overview.md) and [exception handling proposal](https://github.com/WebAssembly/exception-handling/blob/master/proposals/exception-handling/Exceptions.md).
+
+
+### Types
+
+#### Defined Types
+
+* `cont <typeidx>` is a new form of defined type
+ - `(cont $ft) ok` iff `$ft ok` and `$ft = [t1*] -> [t2*]`
+
+
+### Instructions
+
+* `cont.new <typeidx>` creates a new continuation
+ - `cont.new $ct : [(ref null? $ft)] -> [(ref $ct)]`
+ - iff `$ct = cont $ft`
+
+* `cont.bind <typeidx> <typeidx>` binds a continuation to (partial) arguments
+ - `cont.bind $ct $ct' : [t3* (ref null? $ct)] -> [(ref $ct')]`
+ - iff `$ct = cont $ft`
+ - and `$ft = [t3* t1*] -> [t2*]`
+ - and `$ct' = cont $ft'`
+ - and `$ft' = [t1'*] -> [t2'*]`
+ - and `[t1*] -> [t2*] <: [t1'*] -> [t2'*]`
+
+* `suspend <tagidx>` suspends the current continuation
+ - `suspend $t : [t1*] -> [t2*]`
+ - iff `tag $t : [t1*] -> [t2*]`
+
+* `resume <typeidx> (tag <tagidx> <labelidx>)*` resumes a continuation
+ - `resume $ct (tag $t $l)* : [t1* (ref null? $ct)] -> [t2*]`
+ - iff `$ct = cont $ft`
+ - and `$ft = [t1*] -> [t2*]`
+ - and `(tag $t : [te1*] -> [te2*])*`
+ - and `(label $l : [te1'* (ref null? $ct')])*`
+ - and `([te1*] <: [te1'*])*`
+ - and `($ct' = cont $ft')*`
+ - and `$ft' = [t1'*] -> [t2'*]`
+ - and `([te2*] -> [t2*] <: [t1'*] -> [t2'*])*`
+
+* `resume_throw <typeidx> <tagidx> (tag <tagidx> <labelidx>)` aborts a continuation
+ - `resume_throw $ct $e (tag $t $l): [te* (ref null? $ct)] -> [t2*]`
+ - iff `(tag $e : [te*] -> [])`
+ - and `$ct = cont $ft`
+ - and `$ft = [t1*] -> [t2*]`
+ - and `(tag $t : [te1*] -> [te2*])*`
+ - and `(label $l : [te1'* (ref null? $ct')])*`
+ - and `([te1*] <: [te1'*])*`
+ - and `($ct' = cont $ft')*`
+ - and `$ft' = [t1'*] -> [t2'*]`
+ - and `([te2*] -> [t2*] <: [t1'*] -> [t2'*])*`
+
+* `barrier <blocktype> <instr>* end` blocks suspension
+ - `barrier bt instr* end : [t1*] -> [t2*]`
+ - iff `bt = [t1*] -> [t2*]`
+ - and `instr* : [t1*] -> [t2*]` with labels extended with `[t2*]`
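+
+For example (a hypothetical sketch), a `suspend` enclosed in a
+`barrier` traps instead of reaching the surrounding handler:
+
+```wast
+(module
+  (type $func (func))
+  (type $cont (cont $func))
+  (tag $yield)
+  (elem declare func $task)
+  (func $task
+    (barrier (suspend $yield)))  ;; suspension is blocked: traps
+  (func (export "main")
+    (block $on_yield (result (ref $cont))
+      (resume $cont (tag $yield $on_yield)
+        (cont.new $cont (ref.func $task)))
+      (return))
+    (drop)))  ;; not reached dynamically: the barrier traps first
+```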
+
+
+## Reduction Semantics
+
+### Store extensions
+
+* New store component `tags` for allocated tags
+ - `S ::= {..., tags <taginst>*}`
+
+* A *tag instance* represents a control tag
+ - `taginst ::= {type <tagtype>}`
+
+* New store component `conts` for allocated continuations
+ - `S ::= {..., conts <cont>?*}`
+
+* A continuation is a context annotated with its hole's arity
+ - `cont ::= (E : n)`
+
+
+### Administrative instructions
+
+* `(ref.cont a)` represents a continuation value, where `a` is a *continuation address* indexing into the store's `conts` component
+ - `ref.cont a : [] -> [(ref $ct)]`
+ - iff `S.conts[a] = epsilon \/ S.conts[a] = (E : n)`
+ - and `$ct = cont $ft`
+ - and `$ft = [t1^n] -> [t2*]`
+
+* `(handle{(<tagaddr> <labelidx>)*}? <instr>* end)` represents an active handler (or a barrier when no handler list is present)
+ - `(handle{(a $l)*}? instr* end) : [t1*] -> [t2*]`
+ - iff `instr* : [t1*] -> [t2*]`
+ - and `(S.tags[a].type = [te1*] -> [te2*])*`
+ - and `(label $l : [te1'* (ref null? $ct')])*`
+ - and `([te1*] <: [te1'*])*`
+ - and `($ct' = cont $ft')*`
+ - and `([te2*] -> [t2*] <: $ft')*`
+
+
+### Handler contexts
+
+```
+H^ea ::=
+ _
+ val* H^ea instr*
+ label_n{instr*} H^ea end
+ frame_n{F} H^ea end
+ catch{...} H^ea end
+ handle{(ea' $l)*} H^ea end (iff ea notin ea'*)
+```
+
+
+### Reduction
+
+* `S; F; (ref.null t) (cont.new $ct) --> S; F; trap`
+
+* `S; F; (ref.func fa) (cont.new $ct) --> S'; F; (ref.cont |S.conts|)`
+ - iff `S' = S with conts += (E : n)`
+ - and `E = _ (invoke fa)`
+ - and `$ct = cont $ft`
+ - and `$ft = [t1^n] -> [t2*]`
+
+* `S; F; (ref.null t) (cont.bind $ct $ct') --> S; F; trap`
+
+* `S; F; (ref.cont ca) (cont.bind $ct $ct') --> S; F; trap`
+ - iff `S.conts[ca] = epsilon`
+
+* `S; F; v^n (ref.cont ca) (cont.bind $ct $ct') --> S'; F; (ref.cont |S.conts|)`
+ - iff `S.conts[ca] = (E' : n')`
+ - and `$ct' = cont $ft'`
+ - and `$ft' = [t1'*] -> [t2'*]`
+ - and `n = n' - |t1'*|`
+ - and `S' = S with conts[ca] = epsilon with conts += (E : |t1'*|)`
+ - and `E = E'[v^n _]`
+
+* `S; F; (ref.null t) (resume $ct (tag $e $l)*) --> S; F; trap`
+
+* `S; F; (ref.cont ca) (resume $ct (tag $e $l)*) --> S; F; trap`
+ - iff `S.conts[ca] = epsilon`
+
+* `S; F; v^n (ref.cont ca) (resume $ct (tag $t $l)*) --> S'; F; handle{(ea $l)*} E[v^n] end`
+ - iff `S.conts[ca] = (E : n)`
+ - and `(ea = F.tags[$t])*`
+ - and `S' = S with conts[ca] = epsilon`
+
+* `S; F; (ref.null t) (resume_throw $ct $e (tag $t $l)*) --> S; F; trap`
+
+* `S; F; (ref.cont ca) (resume_throw $ct $e (tag $t $l)*) --> S; F; trap`
+ - iff `S.conts[ca] = epsilon`
+
+* `S; F; v^m (ref.cont ca) (resume_throw $ct $e (tag $t $l)*) --> S'; F; handle{(ea $l)*} E[v^m (throw $e)] end`
+ - iff `S.conts[ca] = (E : n)`
+ - and `(ea = F.tags[$t])*`
+ - and `S.tags[F.tags[$e]].type = [t1^m] -> [t2*]`
+ - and `S' = S with conts[ca] = epsilon`
+
+* `S; F; (barrier bt instr* end) --> S; F; handle instr* end`
+
+* `S; F; (handle{(ea $l)*}? v* end) --> S; F; v*`
+
+* `S; F; (handle H^ea[(suspend $e)] end) --> S; F; trap`
+ - iff `ea = F.tags[$e]`
+
+* `S; F; (handle{(ea1 $l1)* (ea $l) (ea2 $l2)*} H^ea[v^n (suspend $e)] end) --> S'; F; v^n (ref.cont |S.conts|) (br $l)`
+ - iff `ea notin ea1*`
+ - and `ea = F.tags[$e]`
+ - and `S.tags[ea].type = [t1^n] -> [t2^m]`
+ - and `S' = S with conts += (H^ea : m)`
diff --git a/proposals/continuations/examples/actor-lwt.wast b/proposals/continuations/examples/actor-lwt.wast
new file mode 100644
index 00000000..60d87959
--- /dev/null
+++ b/proposals/continuations/examples/actor-lwt.wast
@@ -0,0 +1,585 @@
+;; Actors via lightweight threads
+
+;; actor interface
+(module $actor
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ ;; self : [i32] -> []
+ ;; spawn : [cont ([] -> [])] -> [i32]
+ ;; send : [i32 i32] -> []
+ ;; recv : [] -> [i32]
+ (tag $self (export "self") (result i32))
+ (tag $spawn (export "spawn") (param (ref $cont)) (result i32))
+ (tag $send (export "send") (param i32 i32))
+ (tag $recv (export "recv") (result i32))
+)
+(register "actor")
+
+;; a simple example - pass a message through a chain of actors
+(module $chain
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (type $i-func (func (param i32))) ;; [i32] -> []
+ (type $i-cont (cont $i-func)) ;; cont ([i32] -> [])
+
+ ;; self : [i32] -> []
+ ;; spawn : [cont ([] -> [])] -> [i32]
+ ;; send : [i32 i32] -> []
+ ;; recv : [] -> [i32]
+ (tag $self (import "actor" "self") (result i32))
+ (tag $spawn (import "actor" "spawn") (param (ref $cont)) (result i32))
+ (tag $send (import "actor" "send") (param i32 i32))
+ (tag $recv (import "actor" "recv") (result i32))
+
+ (elem declare func $next)
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (func $next (param $p i32)
+ (local $s i32)
+ (local.set $s (suspend $recv))
+ (call $log (i32.const -1))
+ (suspend $send (local.get $s) (local.get $p))
+ )
+
+ ;; send the message 42 through a chain of n actors
+ (func $chain (export "chain") (param $n i32)
+ (local $p i32)
+ (local.set $p (suspend $self))
+
+ (loop $l
+ (if (i32.eqz (local.get $n))
+ (then (suspend $send (i32.const 42) (local.get $p)))
+ (else (local.set $p (suspend $spawn (cont.bind $i-cont $cont (local.get $p) (cont.new $i-cont (ref.func $next)))))
+ (local.set $n (i32.sub (local.get $n) (i32.const 1)))
+ (br $l))
+ )
+ )
+ (call $log (suspend $recv))
+ )
+)
+(register "chain")
+
+;; queues of threads and mailboxes
+(module $queue
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ ;; table (threads) and memory (mailboxes) as simple queues
+ (table $queue 0 (ref null $cont))
+ (memory 1)
+
+ (tag $too-many-mailboxes)
+
+ (global $qdelta i32 (i32.const 10))
+
+ (global $qback-k (mut i32) (i32.const 0))
+ (global $qfront-k (mut i32) (i32.const 0))
+
+ (func $queue-empty-k (export "queue-empty") (result i32)
+ (i32.eq (global.get $qfront-k) (global.get $qback-k))
+ )
+
+ (func $dequeue-k (export "dequeue-k") (result (ref null $cont))
+ (local $i i32)
+ (if (call $queue-empty-k)
+ (then (return (ref.null $cont)))
+ )
+ (local.set $i (global.get $qfront-k))
+ (global.set $qfront-k (i32.add (local.get $i) (i32.const 1)))
+ (table.get $queue (local.get $i))
+ )
+
+ (func $enqueue-k (export "enqueue-k") (param $k (ref $cont))
+ ;; Check if queue is full
+ (if (i32.eq (global.get $qback-k) (table.size $queue))
+ (then
+ ;; Check if there is enough space in the front to compact
+ (if (i32.lt_u (global.get $qfront-k) (global.get $qdelta))
+ (then
+ ;; Space is below threshold, grow table instead
+ (drop (table.grow $queue (ref.null $cont) (global.get $qdelta)))
+ )
+ (else
+ ;; Enough space, move entries up to head of table
+ (global.set $qback-k (i32.sub (global.get $qback-k) (global.get $qfront-k)))
+ (table.copy $queue $queue
+ (i32.const 0) ;; dest = new front = 0
+ (global.get $qfront-k) ;; src = old front
+ (global.get $qback-k) ;; len = new back = old back - old front
+ )
+ (table.fill $queue ;; null out old entries to avoid leaks
+ (global.get $qback-k) ;; start = new back
+ (ref.null $cont) ;; init value
+ (global.get $qfront-k) ;; len = old front = old front - new front
+ )
+ (global.set $qfront-k (i32.const 0))
+ )
+ )
+ )
+ )
+ (table.set $queue (global.get $qback-k) (local.get $k))
+ (global.set $qback-k (i32.add (global.get $qback-k) (i32.const 1)))
+ )
+
+ (global $qback-mb (mut i32) (i32.const 0))
+ (global $qfront-mb (mut i32) (i32.const 0))
+
+ (func $queue-empty-mb (export "queue-empty-mb") (result i32)
+ (i32.eq (global.get $qfront-mb) (global.get $qback-mb))
+ )
+
+ (func $dequeue-mb (export "dequeue-mb") (result i32)
+ (local $i i32)
+ (local $mb i32)
+ (if (call $queue-empty-mb)
+ (then (return (i32.const -1)))
+ )
+ (local.set $i (global.get $qfront-mb))
+ (global.set $qfront-mb (i32.add (local.get $i) (i32.const 1)))
+ (local.set $mb (i32.load (i32.mul (local.get $i) (i32.const 4))))
+ (return (local.get $mb))
+ )
+
+ (func $enqueue-mb (export "enqueue-mb") (param $mb i32)
+ ;; Check if queue is full
+ (if (i32.eq (global.get $qback-mb) (i32.const 16383))
+ (then
+ ;; Check if there is enough space in the front to compact
+ (if (i32.lt_u (global.get $qfront-mb) (global.get $qdelta))
+ (then
+ ;; Space is below threshold, throw exception
+ (throw $too-many-mailboxes)
+ )
+ (else
+ ;; Enough space, move entries up to head of table
+ (global.set $qback-mb (i32.sub (global.get $qback-mb) (global.get $qfront-mb)))
+ (memory.copy
+ (i32.const 0) ;; dest = new front = 0
+ (i32.mul (global.get $qfront-mb) (i32.const 4)) ;; src = old front
+ (i32.mul (global.get $qback-mb) (i32.const 4)) ;; len = new back = old back - old front
+ )
+ (memory.fill ;; null out old entries to avoid leaks
+ (i32.mul (global.get $qback-mb) (i32.const 4)) ;; start = new back
+ (i32.const -1) ;; init value
+ (i32.mul (global.get $qfront-mb) (i32.const 4)) ;; len = old front = old front - new front
+ )
+ (global.set $qfront-mb (i32.const 0))
+ )
+ )
+ )
+ )
+ (i32.store (i32.mul (global.get $qback-mb) (i32.const 4)) (local.get $mb))
+ (global.set $qback-mb (i32.add (global.get $qback-mb) (i32.const 1)))
+ )
+)
+(register "queue")
+
+(module $mailboxes
+ ;; Stupid implementation of mailboxes that raises an exception if
+ ;; there are too many mailboxes or if more than one message is sent
+ ;; to any given mailbox.
+ ;;
+ ;; Sufficient for the simple chain example.
+
+ ;; -1 means empty
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (tag $too-many-mailboxes)
+ (tag $too-many-messages)
+
+ (memory 1)
+
+ (global $msize (mut i32) (i32.const 0)) ;; current number of mailboxes
+ (global $mmax i32 (i32.const 1024)) ;; maximum number of mailboxes
+
+ (func $init (export "init")
+ (global.set $msize (i32.const 0))
+ (memory.fill (i32.const 0) (i32.const -1) (i32.mul (global.get $mmax) (i32.const 4)))
+ )
+
+ (func $empty-mb (export "empty-mb") (param $mb i32) (result i32)
+ (local $offset i32)
+ (local.set $offset (i32.mul (local.get $mb) (i32.const 4)))
+ (i32.eq (i32.load (local.get $offset)) (i32.const -1))
+ )
+
+ (func $new-mb (export "new-mb") (result i32)
+ (local $mb i32)
+
+ (if (i32.ge_u (global.get $msize) (global.get $mmax))
+ (then (throw $too-many-mailboxes))
+ )
+
+ (local.set $mb (global.get $msize))
+ (global.set $msize (i32.add (global.get $msize) (i32.const 1)))
+ (return (local.get $mb))
+ )
+
+ (func $send-to-mb (export "send-to-mb") (param $v i32) (param $mb i32)
+ (local $offset i32)
+ (local.set $offset (i32.mul (local.get $mb) (i32.const 4)))
+ (if (call $empty-mb (local.get $mb))
+ (then (i32.store (local.get $offset) (local.get $v)))
+ (else (throw $too-many-messages))
+ )
+ )
+
+ (func $recv-from-mb (export "recv-from-mb") (param $mb i32) (result i32)
+ (local $v i32)
+ (local $offset i32)
+ (local.set $offset (i32.mul (local.get $mb) (i32.const 4)))
+ (local.set $v (i32.load (local.get $offset)))
+ (i32.store (local.get $offset) (i32.const -1))
+ (local.get $v)
+ )
+)
+(register "mailboxes")
+
+
+;; a simple example - pass a message through a chain of actors
+(module $chain
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (type $i-func (func (param i32)))
+ (type $i-cont (cont $i-func))
+
+ (tag $self (import "actor" "self") (result i32))
+ (tag $spawn (import "actor" "spawn") (param (ref $cont)) (result i32))
+ (tag $send (import "actor" "send") (param i32 i32))
+ (tag $recv (import "actor" "recv") (result i32))
+
+ (elem declare func $next)
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (func $next (param $p i32)
+ (local $s i32)
+ (local.set $s (suspend $recv))
+ (call $log (i32.const -1))
+ (suspend $send (local.get $s) (local.get $p))
+ )
+
+ ;; send the message 42 through a chain of n actors
+ (func $chain (export "chain") (param $n i32)
+ (local $p i32)
+ (local.set $p (suspend $self))
+
+ (loop $l
+ (if (i32.eqz (local.get $n))
+ (then (suspend $send (i32.const 42) (local.get $p)))
+ (else (local.set $p (suspend $spawn (cont.bind $i-cont $cont (local.get $p) (cont.new $i-cont (ref.func $next)))))
+ (local.set $n (i32.sub (local.get $n) (i32.const 1)))
+ (br $l))
+ )
+ )
+ (call $log (suspend $recv))
+ )
+)
+(register "chain")
+
+;; interface to lightweight threads
+(module $lwt
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (tag $yield (export "yield"))
+ (tag $fork (export "fork") (param (ref $cont)))
+)
+(register "lwt")
+
+;; queue of threads
+(module $queue
+ (type $func (func))
+ (type $cont (cont $func))
+
+ ;; Table as simple queue (keeping it simple, no ring buffer)
+ (table $queue 0 (ref null $cont))
+ (global $qdelta i32 (i32.const 10))
+ (global $qback (mut i32) (i32.const 0))
+ (global $qfront (mut i32) (i32.const 0))
+
+ (func $queue-empty (export "queue-empty") (result i32)
+ (i32.eq (global.get $qfront) (global.get $qback))
+ )
+
+ (func $dequeue (export "dequeue") (result (ref null $cont))
+ (local $i i32)
+ (if (call $queue-empty)
+ (then (return (ref.null $cont)))
+ )
+ (local.set $i (global.get $qfront))
+ (global.set $qfront (i32.add (local.get $i) (i32.const 1)))
+ (table.get $queue (local.get $i))
+ )
+
+ (func $enqueue (export "enqueue") (param $k (ref $cont))
+ ;; Check if queue is full
+ (if (i32.eq (global.get $qback) (table.size $queue))
+ (then
+ ;; Check if there is enough space in the front to compact
+ (if (i32.lt_u (global.get $qfront) (global.get $qdelta))
+ (then
+ ;; Space is below threshold, grow table instead
+ (drop (table.grow $queue (ref.null $cont) (global.get $qdelta)))
+ )
+ (else
+ ;; Enough space, move entries up to head of table
+ (global.set $qback (i32.sub (global.get $qback) (global.get $qfront)))
+ (table.copy $queue $queue
+ (i32.const 0) ;; dest = new front = 0
+ (global.get $qfront) ;; src = old front
+ (global.get $qback) ;; len = new back = old back - old front
+ )
+ (table.fill $queue ;; null out old entries to avoid leaks
+ (global.get $qback) ;; start = new back
+ (ref.null $cont) ;; init value
+ (global.get $qfront) ;; len = old front = old front - new front
+ )
+ (global.set $qfront (i32.const 0))
+ )
+ )
+ )
+ )
+ (table.set $queue (global.get $qback) (local.get $k))
+ (global.set $qback (i32.add (global.get $qback) (i32.const 1)))
+ )
+)
+(register "queue")
+
+;; simple scheduler for lightweight threads
+(module $scheduler
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (tag $yield (import "lwt" "yield"))
+ (tag $fork (import "lwt" "fork") (param (ref $cont)))
+
+ (func $queue-empty (import "queue" "queue-empty") (result i32))
+ (func $dequeue (import "queue" "dequeue") (result (ref null $cont)))
+ (func $enqueue (import "queue" "enqueue") (param $k (ref $cont)))
+
+ (func $run (export "run") (param $main (ref $cont))
+ (call $enqueue (local.get $main))
+ (loop $l
+ (if (call $queue-empty) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (block $on_fork (result (ref $cont) (ref $cont))
+ (resume $cont (tag $yield $on_yield) (tag $fork $on_fork)
+ (call $dequeue)
+ )
+ (br $l) ;; thread terminated
+ ) ;; $on_fork (result (ref $cont) (ref $cont))
+ (call $enqueue) ;; current thread
+ (call $enqueue) ;; new thread
+ (br $l)
+ )
+ ;; $on_yield (result (ref $cont))
+ (call $enqueue) ;; current thread
+ (br $l)
+ )
+ )
+)
+(register "scheduler")
+
+(module $mailboxes
+ ;; Stupid implementation of mailboxes that raises an exception if
+ ;; there are too many mailboxes or if more than one message is sent
+ ;; to any given mailbox.
+ ;;
+ ;; Sufficient for the simple chain example.
+
+ ;; -1 means empty
+
+ (tag $too-many-mailboxes)
+ (tag $too-many-messages)
+
+ (memory 1)
+
+ (global $msize (mut i32) (i32.const 0))
+ (global $mmax i32 (i32.const 1024)) ;; maximum number of mailboxes
+
+ (func $init (export "init")
+ (memory.fill (i32.const 0) (i32.const -1) (i32.mul (global.get $mmax) (i32.const 4)))
+ )
+
+ (func $empty-mb (export "empty-mb") (param $mb i32) (result i32)
+ (local $offset i32)
+ (local.set $offset (i32.mul (local.get $mb) (i32.const 4)))
+ (i32.eq (i32.load (local.get $offset)) (i32.const -1))
+ )
+
+ (func $new-mb (export "new-mb") (result i32)
+ (local $mb i32)
+
+ (if (i32.ge_u (global.get $msize) (global.get $mmax))
+ (then (throw $too-many-mailboxes))
+ )
+
+ (local.set $mb (global.get $msize))
+ (global.set $msize (i32.add (global.get $msize) (i32.const 1)))
+ (return (local.get $mb))
+ )
+
+ (func $send-to-mb (export "send-to-mb") (param $v i32) (param $mb i32)
+ (local $offset i32)
+ (local.set $offset (i32.mul (local.get $mb) (i32.const 4)))
+ (if (call $empty-mb (local.get $mb))
+ (then (i32.store (local.get $offset) (local.get $v)))
+ (else (throw $too-many-messages))
+ )
+ )
+
+ (func $recv-from-mb (export "recv-from-mb") (param $mb i32) (result i32)
+ (local $v i32)
+ (local $offset i32)
+ (local.set $offset (i32.mul (local.get $mb) (i32.const 4)))
+ (local.set $v (i32.load (local.get $offset)))
+ (i32.store (local.get $offset) (i32.const -1))
+ (local.get $v)
+ )
+)
+(register "mailboxes")
+
+;; actors implemented via lightweight threads
+(module $actor-as-lwt
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (type $i-func (func (param i32))) ;; [i32] -> []
+ (type $i-cont (cont $i-func)) ;; cont ([i32] -> [])
+
+ (type $ic-func (func (param i32 (ref $cont)))) ;; [i32 (cont ([] -> []))] -> []
+ (type $ic-cont (cont $ic-func)) ;; cont ([i32 (cont ([] -> []))] -> [])
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ ;; lwt interface
+ (tag $yield (import "lwt" "yield"))
+ (tag $fork (import "lwt" "fork") (param (ref $cont)))
+
+ ;; mailbox interface
+ (func $init (import "mailboxes" "init"))
+ (func $empty-mb (import "mailboxes" "empty-mb") (param $mb i32) (result i32))
+ (func $new-mb (import "mailboxes" "new-mb") (result i32))
+ (func $send-to-mb (import "mailboxes" "send-to-mb") (param $v i32) (param $mb i32))
+ (func $recv-from-mb (import "mailboxes" "recv-from-mb") (param $mb i32) (result i32))
+
+ ;; queue interface
+ (func $queue-empty (import "queue" "queue-empty") (result i32))
+ (func $dequeue (import "queue" "dequeue") (result (ref null $cont)))
+ (func $enqueue (import "queue" "enqueue") (param $k (ref $cont)))
+
+ ;; actor interface
+ ;; self : [i32] -> []
+ ;; spawn : [cont ([] -> [])] -> [i32]
+ ;; send : [i32 i32] -> []
+ ;; recv : [] -> [i32]
+ (tag $self (import "actor" "self") (result i32))
+ (tag $spawn (import "actor" "spawn") (param (ref $cont)) (result i32))
+ (tag $send (import "actor" "send") (param i32 i32))
+ (tag $recv (import "actor" "recv") (result i32))
+
+ (elem declare func $actk)
+
+ (func $actk (param $mine i32) (param $nextk (ref $cont))
+ (local $ik (ref $i-cont))
+ (local $k (ref $cont))
+ (local $you (ref $cont))
+ (local $yours i32)
+ (loop $l
+ (block $on_self (result (ref $i-cont))
+ (block $on_spawn (result (ref $cont) (ref $i-cont))
+ (block $on_send (result i32 i32 (ref $cont))
+ (block $on_recv (result (ref $i-cont))
+ (resume $cont (tag $self $on_self)
+ (tag $spawn $on_spawn)
+ (tag $send $on_send)
+ (tag $recv $on_recv)
+ (local.get $nextk)
+ )
+ (return)
+ ) ;; $on_recv (result (ref $i-cont))
+ (local.set $ik)
+ ;; block this thread until the mailbox is non-empty
+ (loop $blocked
+ (if (call $empty-mb (local.get $mine))
+ (then (suspend $yield)
+ (br $blocked))
+ )
+ )
+ (local.set $nextk (cont.bind $i-cont $cont (call $recv-from-mb (local.get $mine)) (local.get $ik)))
+ (br $l)
+ ) ;; $on_send (result i32 i32 (ref $cont))
+ (local.set $k)
+ (call $send-to-mb)
+ (local.set $nextk (local.get $k))
+ (br $l)
+ ) ;; $on_spawn (result (ref $cont) (ref $i-cont))
+ (local.set $ik)
+ (local.set $you)
+ (call $new-mb)
+ (local.set $yours)
+ (suspend $fork (cont.bind $ic-cont $cont
+ (local.get $yours)
+ (local.get $you)
+ (cont.new $ic-cont (ref.func $actk))))
+ (local.set $nextk (cont.bind $i-cont $cont (local.get $yours) (local.get $ik)))
+ (br $l)
+ ) ;; $on_self (result (ref $i-cont))
+ (local.set $ik)
+ (local.set $nextk (cont.bind $i-cont $cont (local.get $mine) (local.get $ik)))
+ (br $l)
+ )
+ )
+
+ (func $act (export "act") (param $k (ref $cont))
+ (call $init)
+ (call $actk (call $new-mb) (local.get $k))
+ )
+)
+(register "actor-as-lwt")
+
+;; composing the actor and scheduler handlers together
+(module $actor-scheduler
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (type $cont-func (func (param (ref $cont)))) ;; [(cont ([] -> []))] -> []
+ (type $cont-cont (cont $cont-func)) ;; cont ([(cont ([] -> []))] -> [])
+
+ (elem declare func $act $scheduler)
+
+ (func $act (import "actor-as-lwt" "act") (param $k (ref $cont)))
+ (func $scheduler (import "scheduler" "run") (param $k (ref $cont)))
+
+ (func $run-actor (export "run-actor") (param $k (ref $cont))
+ (call $scheduler (cont.bind $cont-cont $cont (local.get $k) (cont.new $cont-cont (ref.func $act))))
+ )
+)
+(register "actor-scheduler")
+
+(module
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (type $i-func (func (param i32))) ;; [i32] -> []
+ (type $i-cont (cont $i-func)) ;; cont ([i32] -> [])
+
+ (elem declare func $chain)
+
+ (func $run-actor (import "actor-scheduler" "run-actor") (param $k (ref $cont)))
+ (func $chain (import "chain" "chain") (param $n i32))
+
+ (func $run-chain (export "run-chain") (param $n i32)
+ (call $run-actor (cont.bind $i-cont $cont (local.get $n) (cont.new $i-cont (ref.func $chain))))
+ )
+)
+
+(invoke "run-chain" (i32.const 64))
diff --git a/proposals/continuations/examples/actor.wast b/proposals/continuations/examples/actor.wast
new file mode 100644
index 00000000..151c08d5
--- /dev/null
+++ b/proposals/continuations/examples/actor.wast
@@ -0,0 +1,395 @@
+;; Actors
+
+;; actor interface
+(module $actor
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ ;; self : [i32] -> []
+ ;; spawn : [cont ([] -> [])] -> [i32]
+ ;; send : [i32 i32] -> []
+ ;; recv : [] -> [i32]
+ (tag $self (export "self") (result i32))
+ (tag $spawn (export "spawn") (param (ref $cont)) (result i32))
+ (tag $send (export "send") (param i32 i32))
+ (tag $recv (export "recv") (result i32))
+)
+(register "actor")
+
+;; a simple example - pass a message through a chain of actors
+(module $chain
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (type $i-func (func (param i32))) ;; [i32] -> []
+ (type $i-cont (cont $i-func)) ;; cont ([i32] -> [])
+
+ ;; self : [i32] -> []
+ ;; spawn : [cont ([] -> [])] -> [i32]
+ ;; send : [i32 i32] -> []
+ ;; recv : [] -> [i32]
+ (tag $self (import "actor" "self") (result i32))
+ (tag $spawn (import "actor" "spawn") (param (ref $cont)) (result i32))
+ (tag $send (import "actor" "send") (param i32 i32))
+ (tag $recv (import "actor" "recv") (result i32))
+
+ (elem declare func $next)
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (func $next (param $p i32)
+ (local $s i32)
+ (local.set $s (suspend $recv))
+ (call $log (i32.const -1))
+ (suspend $send (local.get $s) (local.get $p))
+ )
+
+ ;; send the message 42 through a chain of n actors
+ (func $chain (export "chain") (param $n i32)
+ (local $p i32)
+ (local.set $p (suspend $self))
+
+ (loop $l
+ (if (i32.eqz (local.get $n))
+ (then (suspend $send (i32.const 42) (local.get $p)))
+ (else (local.set $p (suspend $spawn (cont.bind $i-cont $cont (local.get $p) (cont.new $i-cont (ref.func $next)))))
+ (local.set $n (i32.sub (local.get $n) (i32.const 1)))
+ (br $l))
+ )
+ )
+ (call $log (suspend $recv))
+ )
+)
+(register "chain")
+
+;; queues of threads and mailboxes
+(module $queue
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ ;; table (threads) and memory (mailboxes) as simple queues
+ (table $queue 0 (ref null $cont))
+ (memory 1)
+
+ (tag $too-many-mailboxes)
+
+ (global $qdelta i32 (i32.const 10))
+
+ (global $qback-k (mut i32) (i32.const 0))
+ (global $qfront-k (mut i32) (i32.const 0))
+
+ (func $queue-empty-k (export "queue-empty") (result i32)
+ (i32.eq (global.get $qfront-k) (global.get $qback-k))
+ )
+
+ (func $dequeue-k (export "dequeue-k") (result (ref null $cont))
+ (local $i i32)
+ (if (call $queue-empty-k)
+ (then (return (ref.null $cont)))
+ )
+ (local.set $i (global.get $qfront-k))
+ (global.set $qfront-k (i32.add (local.get $i) (i32.const 1)))
+ (table.get $queue (local.get $i))
+ )
+
+ (func $enqueue-k (export "enqueue-k") (param $k (ref $cont))
+ ;; Check if queue is full
+ (if (i32.eq (global.get $qback-k) (table.size $queue))
+ (then
+ ;; Check if there is enough space in the front to compact
+ (if (i32.lt_u (global.get $qfront-k) (global.get $qdelta))
+ (then
+ ;; Space is below threshold, grow table instead
+ (drop (table.grow $queue (ref.null $cont) (global.get $qdelta)))
+ )
+ (else
+ ;; Enough space, move entries up to head of table
+ (global.set $qback-k (i32.sub (global.get $qback-k) (global.get $qfront-k)))
+ (table.copy $queue $queue
+ (i32.const 0) ;; dest = new front = 0
+ (global.get $qfront-k) ;; src = old front
+ (global.get $qback-k) ;; len = new back = old back - old front
+ )
+ (table.fill $queue ;; null out old entries to avoid leaks
+ (global.get $qback-k) ;; start = new back
+ (ref.null $cont) ;; init value
+ (global.get $qfront-k) ;; len = old front = old front - new front
+ )
+ (global.set $qfront-k (i32.const 0))
+ )
+ )
+ )
+ )
+ (table.set $queue (global.get $qback-k) (local.get $k))
+ (global.set $qback-k (i32.add (global.get $qback-k) (i32.const 1)))
+ )
+
+ (global $qback-mb (mut i32) (i32.const 0))
+ (global $qfront-mb (mut i32) (i32.const 0))
+
+ (func $queue-empty-mb (export "queue-empty-mb") (result i32)
+ (i32.eq (global.get $qfront-mb) (global.get $qback-mb))
+ )
+
+ (func $dequeue-mb (export "dequeue-mb") (result i32)
+ (local $i i32)
+ (local $mb i32)
+ (if (call $queue-empty-mb)
+ (then (return (i32.const -1)))
+ )
+ (local.set $i (global.get $qfront-mb))
+ (global.set $qfront-mb (i32.add (local.get $i) (i32.const 1)))
+ (local.set $mb (i32.load (i32.mul (local.get $i) (i32.const 4))))
+ (return (local.get $mb))
+ )
+
+ (func $enqueue-mb (export "enqueue-mb") (param $mb i32)
+ ;; Check if queue is full
+ (if (i32.eq (global.get $qback-mb) (i32.const 16383))
+ (then
+ ;; Check if there is enough space in the front to compact
+ (if (i32.lt_u (global.get $qfront-mb) (global.get $qdelta))
+ (then
+ ;; Space is below threshold, throw exception
+ (throw $too-many-mailboxes)
+ )
+ (else
+ ;; Enough space, move entries up to head of table
+ (global.set $qback-mb (i32.sub (global.get $qback-mb) (global.get $qfront-mb)))
+ (memory.copy
+ (i32.const 0) ;; dest = new front = 0
+ (i32.mul (global.get $qfront-mb) (i32.const 4)) ;; src = old front
+ (i32.mul (global.get $qback-mb) (i32.const 4)) ;; len = new back = old back - old front
+ )
+ (memory.fill ;; null out old entries to avoid leaks
+ (i32.mul (global.get $qback-mb) (i32.const 4)) ;; start = new back
+ (i32.const -1) ;; init value
+ (i32.mul (global.get $qfront-mb) (i32.const 4)) ;; len = old front = old front - new front
+ )
+ (global.set $qfront-mb (i32.const 0))
+ )
+ )
+ )
+ )
+ (i32.store (i32.mul (global.get $qback-mb) (i32.const 4)) (local.get $mb))
+ (global.set $qback-mb (i32.add (global.get $qback-mb) (i32.const 1)))
+ )
+)
+(register "queue")
+
+(module $mailboxes
+ ;; Stupid implementation of mailboxes that raises an exception if
+ ;; there are too many mailboxes or if more than one message is sent
+ ;; to any given mailbox.
+ ;;
+ ;; Sufficient for the simple chain example.
+
+ ;; -1 means empty
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (tag $too-many-mailboxes)
+ (tag $too-many-messages)
+
+ (memory 1)
+
+ (global $msize (mut i32) (i32.const 0)) ;; current number of mailboxes
+ (global $mmax i32 (i32.const 1024)) ;; maximum number of mailboxes
+
+ (func $init (export "init")
+ (global.set $msize (i32.const 0))
+ (memory.fill (i32.const 0) (i32.const -1) (i32.mul (global.get $mmax) (i32.const 4)))
+ )
+
+ (func $empty-mb (export "empty-mb") (param $mb i32) (result i32)
+ (local $offset i32)
+ (local.set $offset (i32.mul (local.get $mb) (i32.const 4)))
+ (i32.eq (i32.load (local.get $offset)) (i32.const -1))
+ )
+
+ (func $new-mb (export "new-mb") (result i32)
+ (local $mb i32)
+
+ (if (i32.ge_u (global.get $msize) (global.get $mmax))
+ (then (throw $too-many-mailboxes))
+ )
+
+ (local.set $mb (global.get $msize))
+ (global.set $msize (i32.add (global.get $msize) (i32.const 1)))
+ (return (local.get $mb))
+ )
+
+ (func $send-to-mb (export "send-to-mb") (param $v i32) (param $mb i32)
+ (local $offset i32)
+ (local.set $offset (i32.mul (local.get $mb) (i32.const 4)))
+ (if (call $empty-mb (local.get $mb))
+ (then (i32.store (local.get $offset) (local.get $v)))
+ (else (throw $too-many-messages))
+ )
+ )
+
+ (func $recv-from-mb (export "recv-from-mb") (param $mb i32) (result i32)
+ (local $v i32)
+ (local $offset i32)
+ (local.set $offset (i32.mul (local.get $mb) (i32.const 4)))
+ (local.set $v (i32.load (local.get $offset)))
+ (i32.store (local.get $offset) (i32.const -1))
+ (local.get $v)
+ )
+)
+(register "mailboxes")
+
+;; actors implemented directly
+(module $scheduler
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (type $i-func (func (param i32))) ;; [i32] -> []
+ (type $i-cont (cont $i-func)) ;; cont ([i32] -> [])
+
+ (type $i-cont-func (func (param (ref $i-cont)))) ;; [(cont ([i32] -> []))] -> []
+ (type $i-cont-cont (cont $i-cont-func)) ;; cont ([(cont ([i32] -> []))] -> [])
+
+ ;; mailbox interface
+ (func $init (import "mailboxes" "init"))
+ (func $empty-mb (import "mailboxes" "empty-mb") (param $mb i32) (result i32))
+ (func $new-mb (import "mailboxes" "new-mb") (result i32))
+ (func $send-to-mb (import "mailboxes" "send-to-mb") (param $v i32) (param $mb i32))
+ (func $recv-from-mb (import "mailboxes" "recv-from-mb") (param $mb i32) (result i32))
+
+ ;; queue interface
+ (func $dequeue-mb (import "queue" "dequeue-mb") (result i32))
+ (func $enqueue-mb (import "queue" "enqueue-mb") (param i32))
+ (func $dequeue-k (import "queue" "dequeue-k") (result (ref null $cont)))
+ (func $enqueue-k (import "queue" "enqueue-k") (param (ref $cont)))
+
+ ;; actor interface
+ ;; self : [i32] -> []
+ ;; spawn : [cont ([] -> [])] -> [i32]
+ ;; send : [i32 i32] -> []
+ ;; recv : [] -> [i32]
+ (tag $self (import "actor" "self") (result i32))
+ (tag $spawn (import "actor" "spawn") (param (ref $cont)) (result i32))
+ (tag $send (import "actor" "send") (param i32 i32))
+ (tag $recv (import "actor" "recv") (result i32))
+
+ (elem declare func $recv-againf)
+
+ ;; We implement blocking by reinvoking recv with the original
+ ;; handler. This is a common pattern nicely supported by shallow but
+ ;; not deep handlers. However, it does require composing the new
+ ;; reinvoked recv with the continuation. We simulate this behaviour
+ ;; (inefficiently, perhaps) by resuming the continuation with an
+ ;; identity handler and then building a new continuation. Might an
+ ;; instruction for composing or extending continuations be palatable
+ ;; / desirable?
+ ;;
+ ;; The resume_throw operation can be implemented (inefficiently)
+ ;; with continuation composition.
+
+ ;; compose recv with an existing continuation
+ (func $recv-againf (param $ik (ref $i-cont))
+ (local $res i32)
+ (suspend $recv)
+ (local.set $res)
+ (resume $i-cont (local.get $res) (local.get $ik))
+ )
+ (func $recv-again (param $ik (ref $i-cont)) (result (ref $cont))
+ (cont.bind $i-cont-cont $cont (local.get $ik) (cont.new $i-cont-cont (ref.func $recv-againf)))
+ )
+
+ ;; There are multiple ways of avoiding the need for
+ ;; $recv-again. Here are a couple.
+ ;;
+ ;; 1) Build handlers on top of lightweight threads (with fork and
+ ;; yield). Then we can just keep on yielding until the mailbox is
+ ;; non-empty, and delegate the actual scheduling to a separate
+ ;; handler.
+ ;;
+ ;; 2) Distinguish between unblocked and blocked threads in the
+ ;; thread queue. Typing makes this a bit of a pain to hack up
+ ;; directly in Wasm, but in practice this is not difficult, and
+ ;; similar to what existing actor implementations do.
+
+ (func $run (export "run") (param $nextk (ref null $cont))
+ (local $mine i32) ;; current mailbox
+ (local $ik (ref $i-cont))
+ (local $k (ref $cont))
+ (local $you (ref $cont))
+ (local $yours i32)
+ (call $init)
+ (local.set $mine (call $new-mb))
+ (loop $l
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (block $on_self (result (ref $i-cont))
+ (block $on_spawn (result (ref $cont) (ref $i-cont))
+ (block $on_send (result i32 i32 (ref $cont))
+ (block $on_recv (result (ref $i-cont))
+ (resume $cont (tag $self $on_self)
+ (tag $spawn $on_spawn)
+ (tag $send $on_send)
+ (tag $recv $on_recv)
+ (local.get $nextk)
+ )
+ (local.set $mine (call $dequeue-mb))
+ (local.set $nextk (call $dequeue-k))
+ (br $l)
+ ) ;; $on_recv (result (ref $i-cont))
+ (local.set $ik)
+ ;; block this thread until the mailbox is non-empty
+ (if (call $empty-mb (local.get $mine))
+ (then (call $enqueue-mb (local.get $mine))
+ (call $enqueue-k (call $recv-again (local.get $ik)))
+ (local.set $mine (call $dequeue-mb))
+ (local.set $nextk (call $dequeue-k))
+ (br $l))
+ )
+ (local.set $nextk (cont.bind $i-cont $cont (call $recv-from-mb (local.get $mine)) (local.get $ik)))
+ (br $l)
+ ) ;; $on_send (result i32 i32 (ref $cont))
+ (local.set $k)
+ (call $send-to-mb)
+ (local.set $nextk (local.get $k))
+ (br $l)
+ ) ;; $on_spawn (result (ref $cont) (ref $i-cont))
+ (local.set $ik)
+ (local.set $you)
+ (call $new-mb)
+ (local.set $yours)
+ (call $enqueue-mb (local.get $yours))
+ (call $enqueue-k (local.get $you))
+ (local.set $nextk (cont.bind $i-cont $cont (local.get $yours) (local.get $ik)))
+ (br $l)
+ ) ;; $on_self (result (ref $i-cont))
+ (local.set $ik)
+ (local.set $nextk (cont.bind $i-cont $cont (local.get $mine) (local.get $ik)))
+ (br $l)
+ )
+ )
+)
+(register "scheduler")
+
+(module
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (type $i-func (func (param i32))) ;; [i32] -> []
+ (type $i-cont (cont $i-func)) ;; cont ([i32] -> [])
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (elem declare func $chain)
+
+ (func $act (import "scheduler" "run") (param $k (ref null $cont)))
+ (func $chain (import "chain" "chain") (param $n i32))
+
+ (func $run-chain (export "run-chain") (param $n i32)
+ (call $act (cont.bind $i-cont $cont (local.get $n) (cont.new $i-cont (ref.func $chain))))
+ )
+)
+
+(assert_return (invoke "run-chain" (i32.const 64)))
diff --git a/proposals/continuations/examples/async-await.wast b/proposals/continuations/examples/async-await.wast
new file mode 100644
index 00000000..53570c3b
--- /dev/null
+++ b/proposals/continuations/examples/async-await.wast
@@ -0,0 +1,335 @@
+;; async-await interface
+(module $async-await
+ (type $i-func (func (param i32)))
+ (type $i-cont (cont $i-func))
+
+ ;; We use yield and fulfill to simulate asynchronous operations.
+ ;;
+ ;; Given a suitable asynchronous I/O API, they needn't be exposed to
+ ;; user code.
+ ;;
+ ;; yield : [] -> []
+  ;; fulfill : [i32 i32] -> []
+ (tag $yield (export "yield"))
+ (tag $fulfill (export "fulfill") (param i32) (param i32))
+
+ ;; async : [cont ([i32] -> [])] -> [i32]
+ ;; await : [i32] -> [i32]
+ (tag $async (export "async") (param (ref $i-cont)) (result i32))
+ (tag $await (export "await") (param i32) (result i32))
+)
+(register "async-await")
+
+(module $example
+ (type $i-func (func (param i32)))
+ (type $i-cont (cont $i-func))
+
+ (type $iii-fun (func (param i32 i32 i32)))
+ (type $iii-cont (cont $iii-fun))
+
+ ;; yield : [] -> []
+  ;; fulfill : [i32 i32] -> []
+ ;; async : [cont ([i32] -> [])] -> [i32]
+ ;; await : [i32] -> [i32]
+ (tag $yield (import "async-await" "yield"))
+ (tag $fulfill (import "async-await" "fulfill") (param i32) (param i32))
+ (tag $async (import "async-await" "async") (param (ref $i-cont)) (result i32))
+ (tag $await (import "async-await" "await") (param i32) (result i32))
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (elem declare func $sum)
+
+ ;; an asynchronous function that computes i + i+1 + ... + j
+ ;;
+ ;; (instead of computing synchronously, it allows other computations
+ ;; to execute each time round the loop)
+ ;;
+ ;; the final result is written to the promise $p
+ (func $sum (param $i i32) (param $j i32) (param $p i32)
+ (local $a i32)
+ (loop $l
+ (call $log (local.get $i))
+ (local.set $a (i32.add (local.get $a) (local.get $i)))
+ (local.set $i (i32.add (local.get $i) (i32.const 1)))
+ (if (i32.le_u (local.get $i) (local.get $j))
+ (then (suspend $yield)
+ (br $l))
+ )
+ )
+ (suspend $fulfill (local.get $p) (local.get $a))
+ )
+
+ ;; compute p = 1+..+3; q = 5+..+7; r = 10+...+15 asynchronously
+ ;; once p and q have finished computing, compute x = p*q
+ ;; once r has finished computing, return x+r
+ (func $run (export "run")
+ (local $p i32)
+ (local $q i32)
+ (local $r i32)
+
+ (local $x i32)
+ (local $y i32)
+
+ (call $log (i32.const -1))
+ (local.set $p (suspend $async (cont.bind $iii-cont $i-cont (i32.const 1) (i32.const 3) (cont.new $iii-cont (ref.func $sum)))))
+ (call $log (i32.const -2))
+ (local.set $q (suspend $async (cont.bind $iii-cont $i-cont (i32.const 5) (i32.const 7) (cont.new $iii-cont (ref.func $sum)))))
+ (call $log (i32.const -3))
+ (local.set $r (suspend $async (cont.bind $iii-cont $i-cont (i32.const 10) (i32.const 15) (cont.new $iii-cont (ref.func $sum)))))
+ (call $log (i32.const -4))
+
+ (local.set $x (i32.mul (suspend $await (local.get $p))
+ (suspend $await (local.get $q))))
+
+ (call $log (i32.const -5))
+
+ (local.set $y (i32.add (suspend $await (local.get $r)) (local.get $x)))
+
+ (call $log (i32.const -6))
+ (call $log (local.get $y))
+ (call $log (i32.const -7))
+ )
+)
+(register "example")
+
+;; queue of threads
+(module $queue
+ (type $func (func))
+ (type $cont (cont $func))
+
+ ;; Table as simple queue (keeping it simple, no ring buffer)
+ (table $queue 0 (ref null $cont))
+ (global $qdelta i32 (i32.const 10))
+ (global $qback (mut i32) (i32.const 0))
+ (global $qfront (mut i32) (i32.const 0))
+
+ (func $queue-empty (export "queue-empty") (result i32)
+ (i32.eq (global.get $qfront) (global.get $qback))
+ )
+
+ (func $dequeue (export "dequeue") (result (ref null $cont))
+ (local $i i32)
+ (if (call $queue-empty)
+ (then (return (ref.null $cont)))
+ )
+ (local.set $i (global.get $qfront))
+ (global.set $qfront (i32.add (local.get $i) (i32.const 1)))
+ (table.get $queue (local.get $i))
+ )
+
+ (func $enqueue (export "enqueue") (param $k (ref null $cont))
+ ;; Check if queue is full
+ (if (i32.eq (global.get $qback) (table.size $queue))
+ (then
+ ;; Check if there is enough space in the front to compact
+ (if (i32.lt_u (global.get $qfront) (global.get $qdelta))
+ (then
+ ;; Space is below threshold, grow table instead
+ (drop (table.grow $queue (ref.null $cont) (global.get $qdelta)))
+ )
+ (else
+ ;; Enough space, move entries up to head of table
+ (global.set $qback (i32.sub (global.get $qback) (global.get $qfront)))
+ (table.copy $queue $queue
+ (i32.const 0) ;; dest = new front = 0
+ (global.get $qfront) ;; src = old front
+ (global.get $qback) ;; len = new back = old back - old front
+ )
+ (table.fill $queue ;; null out old entries to avoid leaks
+ (global.get $qback) ;; start = new back
+ (ref.null $cont) ;; init value
+ (global.get $qfront) ;; len = old front = old front - new front
+ )
+ (global.set $qfront (i32.const 0))
+ )
+ )
+ )
+ )
+ (table.set $queue (global.get $qback) (local.get $k))
+ (global.set $qback (i32.add (global.get $qback) (i32.const 1)))
+ )
+)
+(register "queue")
+
+;; promises
+(module $promise
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (type $i-func (func (param i32)))
+ (type $i-cont (cont $i-func))
+
+ ;; a simplistic implementation of promises that assumes a maximum of
+ ;; 1000 promises and a maximum of one observer per promise
+
+ (tag $too-many-promises)
+ (tag $too-many-observers)
+
+ (global $num-promises (mut i32) (i32.const 0))
+ (global $max-promises i32 (i32.const 1000))
+ (table $observers 1000 (ref null $i-cont)) ;; observers waiting for promises to be fulfilled
+ (memory 1) ;; promise values
+
+ ;; create and return a new promise
+ (func $new (export "new") (result i32)
+ (local $offset i32)
+ (local $p i32)
+ (if (i32.eq (global.get $num-promises) (global.get $max-promises))
+ (then (throw $too-many-promises)))
+ (local.set $p (global.get $num-promises))
+ (local.set $offset (i32.mul (local.get $p) (i32.const 4)))
+ (table.set $observers (local.get $p) (ref.null $i-cont))
+ (i32.store (local.get $offset) (i32.const -1))
+ (global.set $num-promises (i32.add (local.get $p) (i32.const 1)))
+ (return (local.get $p))
+ )
+
+ ;; check whether promise $p is fulfilled
+ (func $fulfilled (export "fulfilled") (param $p i32) (result i32)
+ (local $offset i32)
+ (local.set $offset (i32.mul (local.get $p) (i32.const 4)))
+ (i32.ne (i32.load (local.get $offset)) (i32.const -1))
+ )
+
+ ;; current value of promise $p
+ (func $read (export "read") (param $p i32) (result i32)
+ (local $offset i32)
+ (local.set $offset (i32.mul (local.get $p) (i32.const 4)))
+ (i32.load (local.get $offset))
+ )
+
+ ;; register an observer for when promise $p is fulfilled
+ (func $await (export "await") (param $p i32) (param $k (ref $i-cont))
+ (if (ref.is_null (table.get $observers (local.get $p)))
+ (then (table.set $observers (local.get $p) (local.get $k)))
+ (else (throw $too-many-observers))
+ )
+ )
+
+ ;; fulfill promise $p with value $v
+ (func $fulfill (export "fulfill") (param $p i32) (param $v i32) (result (ref null $cont))
+ (local $offset i32)
+ (local $k (ref null $i-cont))
+ (local.set $offset (i32.mul (local.get $p) (i32.const 4)))
+ (i32.store (local.get $offset) (local.get $v))
+ (local.set $k (table.get $observers (local.get $p)))
+ (if (ref.is_null (local.get $k))
+ (then (return (ref.null $cont)))
+ )
+ (return (cont.bind $i-cont $cont (local.get $v) (local.get $k)))
+ )
+)
+(register "promise")
+
+;; async-await scheduler
+(module $scheduler
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (type $i-func (func (param i32)))
+ (type $i-cont (cont $i-func))
+
+ ;; async-await interface
+ ;;
+ ;; yield : [] -> []
+  ;; fulfill : [i32 i32] -> []
+ ;; async : [cont ([i32] -> [])] -> [i32]
+ ;; await : [i32] -> [i32]
+ (tag $yield (import "async-await" "yield"))
+ (tag $fulfill (import "async-await" "fulfill") (param i32) (param i32))
+ (tag $async (import "async-await" "async") (param (ref $i-cont)) (result i32))
+ (tag $await (import "async-await" "await") (param i32) (result i32))
+
+ ;; queue interface
+ (func $queue-empty (import "queue" "queue-empty") (result i32))
+ (func $dequeue (import "queue" "dequeue") (result (ref null $cont)))
+ (func $enqueue (import "queue" "enqueue") (param $k (ref null $cont)))
+
+ ;; promise interface
+ (func $new-promise (import "promise" "new") (result i32))
+ (func $promise-fulfilled (import "promise" "fulfilled") (param $p i32) (result i32))
+ (func $promise-value (import "promise" "read") (param $p i32) (result i32))
+ (func $await-promise (import "promise" "await") (param $p i32) (param $k (ref $i-cont)))
+ (func $fulfill-promise (import "promise" "fulfill") (param $p i32) (param $v i32) (result (ref null $cont)))
+
+ (func $run (export "run") (param $nextk (ref null $cont))
+ (local $p i32)
+ (local $v i32)
+ (local $ik (ref $i-cont))
+ (local $ak (ref $i-cont))
+ (local $k (ref null $cont))
+ (loop $l
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (block $on_fulfill (result i32 i32 (ref $cont))
+ (block $on_async (result (ref $i-cont) (ref $i-cont))
+ (block $on_await (result i32 (ref $i-cont))
+ (resume $cont (tag $yield $on_yield)
+ (tag $fulfill $on_fulfill)
+ (tag $async $on_async)
+ (tag $await $on_await)
+ (local.get $nextk)
+ )
+ (local.set $nextk (call $dequeue))
+ (br $l) ;; thread terminated
+ ) ;; $on_await (result i32 (ref $i-cont))
+ (local.set $ik)
+ (local.set $p)
+ (if (call $promise-fulfilled (local.get $p))
+ ;; if promise fulfilled then run continuation partially applied to value
+ (then (local.set $nextk (cont.bind $i-cont $cont (call $promise-value (local.get $p)) (local.get $ik))))
+ ;; else add continuation to promise and run next continuation from the queue
+ (else (call $await-promise (local.get $p) (local.get $ik))
+ (local.set $nextk (call $dequeue)))
+ )
+ (br $l)
+      ) ;; $on_async (result (ref $i-cont) (ref $i-cont))
+ (local.set $ik)
+ (local.set $ak)
+ ;; create new promise
+ (call $new-promise)
+ (local.set $p)
+ ;; enqueue continuation partially applied to promise
+ (call $enqueue (cont.bind $i-cont $cont (local.get $p) (local.get $ik)))
+ ;; run computation partially applied to promise
+ (local.set $nextk (cont.bind $i-cont $cont (local.get $p) (local.get $ak)))
+ (br $l)
+ ) ;; $on_fulfill (result i32 i32 (ref $cont))
+ (local.set $nextk)
+ (local.set $v)
+ (local.set $p)
+ (call $fulfill-promise (local.get $p) (local.get $v))
+ (local.set $k)
+ (if (ref.is_null (local.get $k))
+ (then)
+ (else (call $enqueue (local.get $k)))
+ )
+ (br $l)
+ ) ;; $on_yield (result (ref $cont))
+ (call $enqueue) ;; current thread
+ (local.set $nextk (call $dequeue)) ;; next thread
+ (br $l)
+ )
+ )
+)
+(register "scheduler")
+
+(module
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (func $scheduler (import "scheduler" "run") (param $nextk (ref null $cont)))
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (func $run-example (import "example" "run"))
+
+ (elem declare func $run-example)
+
+ (func (export "run")
+ (call $scheduler (cont.new $cont (ref.func $run-example)))
+ )
+)
+
+(invoke "run")
diff --git a/proposals/continuations/examples/control-lwt.wast b/proposals/continuations/examples/control-lwt.wast
new file mode 100644
index 00000000..d7c00cb3
--- /dev/null
+++ b/proposals/continuations/examples/control-lwt.wast
@@ -0,0 +1,356 @@
+;; dynamic lightweight threads via control/prompt
+
+;; interface to control/prompt
+(module $control
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (type $cont-func (func (param (ref $cont)))) ;; [cont ([] -> [])] -> []
+ (type $cont-cont (cont $cont-func)) ;; cont ([cont ([] -> [])] -> [])
+
+ ;; Implementation of a generic delimited control operator using
+ ;; effect handlers.
+ ;;
+ ;; For lightweight threads we have no payload. More general types
+ ;; for control and prompt are:
+ ;;
+ ;; control : ([cont ([ta*] -> [tr*])] -> [tr*]) -> [ta*]
+ ;; prompt : cont ([] -> [tr*]) -> [tr*]
+ ;;
+ ;; (We can also give more refined types if we want to support
+ ;; answer-type modification and various flavours of answer-type
+ ;; polymorphism - but these are well outside the scope of a Wasm
+ ;; proposal!)
+ ;;
+ ;; (Technically this is control0/prompt0 rather than
+ ;; control/prompt.)
+ (tag $control (export "control") (param (ref $cont-cont))) ;; control : ([cont ([] -> [])] -> []) -> []
+ (func $prompt (export "prompt") (param $nextk (ref null $cont)) ;; prompt : cont ([] -> []) -> []
+ (local $h (ref $cont-cont))
+ (local $k (ref $cont))
+ (block $on_control (result (ref $cont-cont) (ref $cont))
+ (resume $cont (tag $control $on_control)
+ (local.get $nextk))
+ (return)
+    ) ;; $on_control (result (ref $cont-cont) (ref $cont))
+ (local.set $k)
+ (local.set $h)
+ (resume $cont-cont (local.get $k) (local.get $h))
+ )
+)
+(register "control")
+
+;; With control/prompt we abstract over the yield and fork operations
+;; using functions rather than tags.
+
+(module $example
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (type $cont-func (func (param (ref $cont)))) ;; [cont ([] -> [])] -> []
+ (type $cont-cont (cont $cont-func)) ;; cont ([cont ([] -> [])] -> [])
+
+ (type $func-cont-func-func (func (param (ref $func)) (param (ref $cont-func)))) ;; ([] -> []) -> ([cont ([] -> [])] -> []) -> []
+ (type $func-cont-func-cont (cont $func-cont-func-func)) ;; cont (([] -> []) -> ([cont ([] -> [])] -> []) -> [])
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (elem declare func $main $thread1 $thread2 $thread3)
+
+ (func $main (export "main") (param $yield (ref $func)) (param $fork (ref $cont-func))
+ (call $log (i32.const 0))
+ (call_ref $cont-func
+ (cont.bind $func-cont-func-cont $cont (local.get $yield) (local.get $fork)
+ (cont.new $func-cont-func-cont (ref.func $thread1)))
+ (local.get $fork))
+ (call $log (i32.const 1))
+ (call_ref $cont-func
+ (cont.bind $func-cont-func-cont $cont (local.get $yield) (local.get $fork)
+ (cont.new $func-cont-func-cont (ref.func $thread2)))
+ (local.get $fork))
+ (call $log (i32.const 2))
+ (call_ref $cont-func
+ (cont.bind $func-cont-func-cont $cont (local.get $yield) (local.get $fork)
+ (cont.new $func-cont-func-cont (ref.func $thread3)))
+ (local.get $fork))
+ (call $log (i32.const 3))
+ )
+
+ (func $thread1 (param $yield (ref $func)) (param $fork (ref $cont-func))
+ (call $log (i32.const 10))
+ (call_ref $func (local.get $yield))
+ (call $log (i32.const 11))
+ (call_ref $func (local.get $yield))
+ (call $log (i32.const 12))
+ )
+
+ (func $thread2 (param $yield (ref $func)) (param $fork (ref $cont-func))
+ (call $log (i32.const 20))
+ (call_ref $func (local.get $yield))
+ (call $log (i32.const 21))
+ (call_ref $func (local.get $yield))
+ (call $log (i32.const 22))
+ )
+
+ (func $thread3 (param $yield (ref $func)) (param $fork (ref $cont-func))
+ (call $log (i32.const 30))
+ (call_ref $func (local.get $yield))
+ (call $log (i32.const 31))
+ (call_ref $func (local.get $yield))
+ (call $log (i32.const 32))
+ )
+)
+(register "example")
+
+(module $queue
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ ;; Table as simple queue (keeping it simple, no ring buffer)
+ (table $queue 0 (ref null $cont))
+ (global $qdelta i32 (i32.const 10))
+ (global $qback (mut i32) (i32.const 0))
+ (global $qfront (mut i32) (i32.const 0))
+
+ (func $queue-empty (export "queue-empty") (result i32)
+ (i32.eq (global.get $qfront) (global.get $qback))
+ )
+
+ (func $dequeue (export "dequeue") (result (ref null $cont))
+ (local $i i32)
+ (if (call $queue-empty)
+ (then (return (ref.null $cont)))
+ )
+ (local.set $i (global.get $qfront))
+ (global.set $qfront (i32.add (local.get $i) (i32.const 1)))
+ (table.get $queue (local.get $i))
+ )
+
+ (func $enqueue (export "enqueue") (param $k (ref $cont))
+ ;; Check if queue is full
+ (if (i32.eq (global.get $qback) (table.size $queue))
+ (then
+ ;; Check if there is enough space in the front to compact
+ (if (i32.lt_u (global.get $qfront) (global.get $qdelta))
+ (then
+ ;; Space is below threshold, grow table instead
+ (drop (table.grow $queue (ref.null $cont) (global.get $qdelta)))
+ )
+ (else
+ ;; Enough space, move entries up to head of table
+ (global.set $qback (i32.sub (global.get $qback) (global.get $qfront)))
+ (table.copy $queue $queue
+ (i32.const 0) ;; dest = new front = 0
+ (global.get $qfront) ;; src = old front
+ (global.get $qback) ;; len = new back = old back - old front
+ )
+ (table.fill $queue ;; null out old entries to avoid leaks
+ (global.get $qback) ;; start = new back
+ (ref.null $cont) ;; init value
+ (global.get $qfront) ;; len = old front = old front - new front
+ )
+ (global.set $qfront (i32.const 0))
+ )
+ )
+ )
+ )
+ (table.set $queue (global.get $qback) (local.get $k))
+ (global.set $qback (i32.add (global.get $qback) (i32.const 1)))
+ )
+)
+(register "queue")
+
+(module $scheduler
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (type $cont-func (func (param (ref $cont)))) ;; [cont ([] -> [])] -> []
+ (type $cont-cont (cont $cont-func)) ;; cont ([cont ([] -> [])] -> [])
+
+ (type $func-cont-func-func (func (param (ref $func)) (param (ref $cont-func)))) ;; ([] -> []) -> ([cont ([] -> [])] -> []) -> []
+ (type $func-cont-func-cont (cont $func-cont-func-func)) ;; cont (([] -> []) -> ([cont ([] -> [])] -> []) -> [])
+
+ (type $func-cont-cont (func (param (ref $cont)) (param (ref $cont))))
+ (type $cont-cont-func (cont $func-cont-cont))
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ ;; queue interface
+ (func $queue-empty (import "queue" "queue-empty") (result i32))
+ (func $dequeue (import "queue" "dequeue") (result (ref null $cont)))
+ (func $enqueue (import "queue" "enqueue") (param $k (ref $cont)))
+
+ (elem declare func
+ $handle-yield-sync $handle-yield
+ $handle-fork-sync $handle-fork-kt $handle-fork-tk $handle-fork-ykt $handle-fork-ytk
+ $yield
+ $fork-sync $fork-kt $fork-tk $fork-ykt $fork-ytk)
+
+ ;; control/prompt interface
+ (tag $control (import "control" "control") (param (ref $cont-cont))) ;; control : ([cont ([] -> [])] -> []) -> []
+ (func $prompt (import "control" "prompt") (param $nextk (ref null $cont))) ;; prompt : cont ([] -> []) -> []
+
+ ;; generic boilerplate scheduler
+ ;;
+ ;; with control/prompt the core scheduler loop must be decoupled
+ ;; from the implementations of each operation (yield / fork) as the
+ ;; latter are passed in as arguments to user code
+ (func $scheduler (param $nextk (ref null $cont))
+ (loop $loop
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (call $prompt (local.get $nextk))
+ (local.set $nextk (call $dequeue))
+ (br $loop)
+ )
+ )
+
+  ;; A func.bind instruction would be useful in the implementations
+  ;; of fork, and more generally for any operation that takes
+  ;; arguments.
+  ;;
+  ;; Lacking func.bind, we use cont.new and cont.bind to build a
+  ;; continuation instead, but constructing a new continuation every
+  ;; time an operation is invoked seems unnecessarily wasteful.
+
+ ;; synchronous scheduler
+ (func $handle-yield-sync (param $k (ref $cont))
+ (call $scheduler (local.get $k))
+ )
+ (func $yield-sync
+    (suspend $control (cont.new $cont-cont (ref.func $handle-yield-sync)))
+ )
+ (func $handle-fork-sync (param $t (ref $cont)) (param $k (ref $cont))
+ (call $enqueue (local.get $t))
+ (call $scheduler (local.get $k))
+ )
+ (func $fork-sync (param $t (ref $cont))
+ (suspend $control
+ (cont.bind $cont-cont-func $cont-cont (local.get $t)
+ (cont.new $cont-cont-func (ref.func $handle-fork-sync))))
+ )
+ (func $sync (export "sync") (param $k (ref $func-cont-func-cont))
+ (call $scheduler
+ (cont.bind $func-cont-func-cont $cont (ref.func $yield) (ref.func $fork-sync) (local.get $k)))
+ )
+
+ ;; asynchronous yield (used by all asynchronous schedulers)
+ (func $handle-yield (param $k (ref $cont))
+ (call $enqueue (local.get $k))
+ (call $scheduler (call $dequeue))
+ )
+ (func $yield
+ (suspend $control (cont.new $cont-cont (ref.func $handle-yield)))
+ )
+ ;; four asynchronous implementations of fork:
+ ;; * kt and tk don't yield on encountering a fork
+ ;; 1) kt runs the continuation, queuing up the new thread for later
+ ;; 2) tk runs the new thread first, queuing up the continuation for later
+ ;; * ykt and ytk do yield on encountering a fork
+ ;; 3) ykt runs the continuation, queuing up the new thread for later
+ ;; 4) ytk runs the new thread first, queuing up the continuation for later
+
+ ;; no yield on fork, continuation first
+ (func $handle-fork-kt (param $t (ref $cont)) (param $k (ref $cont))
+ (call $enqueue (local.get $t))
+ (call $scheduler (local.get $k))
+ )
+ (func $fork-kt (param $t (ref $cont))
+ (suspend $control
+ (cont.bind $cont-cont-func $cont-cont (local.get $t)
+ (cont.new $cont-cont-func (ref.func $handle-fork-kt))))
+ )
+ (func $kt (export "kt") (param $k (ref $func-cont-func-cont))
+ (call $scheduler
+ (cont.bind $func-cont-func-cont $cont (ref.func $yield) (ref.func $fork-kt) (local.get $k)))
+ )
+
+ ;; no yield on fork, new thread first
+ (func $handle-fork-tk (param $t (ref $cont)) (param $k (ref $cont))
+ (call $enqueue (local.get $k))
+ (call $scheduler (local.get $t))
+ )
+ (func $fork-tk (param $t (ref $cont))
+ (suspend $control
+ (cont.bind $cont-cont-func $cont-cont (local.get $t)
+ (cont.new $cont-cont-func (ref.func $handle-fork-tk))))
+ )
+ (func $tk (export "tk") (param $k (ref $func-cont-func-cont))
+ (call $scheduler
+ (cont.bind $func-cont-func-cont $cont (ref.func $yield) (ref.func $fork-tk) (local.get $k)))
+ )
+
+ ;; yield on fork, continuation first
+ (func $handle-fork-ykt (param $t (ref $cont)) (param $k (ref $cont))
+ (call $enqueue (local.get $k))
+ (call $enqueue (local.get $t))
+ (call $scheduler (call $dequeue))
+ )
+ (func $fork-ykt (param $t (ref $cont))
+ (suspend $control
+ (cont.bind $cont-cont-func $cont-cont (local.get $t)
+ (cont.new $cont-cont-func (ref.func $handle-fork-ykt))))
+ )
+ (func $ykt (export "ykt") (param $k (ref $func-cont-func-cont))
+ (call $scheduler
+ (cont.bind $func-cont-func-cont $cont (ref.func $yield) (ref.func $fork-ykt) (local.get $k)))
+ )
+
+ ;; yield on fork, new thread first
+ (func $handle-fork-ytk (param $t (ref $cont)) (param $k (ref $cont))
+ (call $enqueue (local.get $t))
+ (call $enqueue (local.get $k))
+ (call $scheduler (call $dequeue))
+ )
+ (func $fork-ytk (param $t (ref $cont))
+ (suspend $control
+ (cont.bind $cont-cont-func $cont-cont (local.get $t)
+ (cont.new $cont-cont-func (ref.func $handle-fork-ytk))))
+ )
+ (func $ytk (export "ytk") (param $k (ref $func-cont-func-cont))
+ (call $scheduler
+ (cont.bind $func-cont-func-cont $cont (ref.func $yield) (ref.func $fork-ytk) (local.get $k)))
+ )
+)
+(register "scheduler")
+
+(module
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (type $cont-func (func (param (ref $cont)))) ;; [cont ([] -> [])] -> []
+ (type $cont-cont (cont $cont-func)) ;; cont ([cont ([] -> [])] -> [])
+
+ (type $func-cont-func-func (func (param (ref $func)) (param (ref $cont-func)))) ;; ([] -> []) -> ([cont ([] -> [])] -> []) -> []
+ (type $func-cont-func-cont (cont $func-cont-func-func)) ;; cont (([] -> []) -> ([cont ([] -> [])] -> []) -> [])
+
+ (func $scheduler-sync (import "scheduler" "sync") (param $nextk (ref $func-cont-func-cont)))
+ (func $scheduler-kt (import "scheduler" "kt") (param $nextk (ref $func-cont-func-cont)))
+ (func $scheduler-tk (import "scheduler" "tk") (param $nextk (ref $func-cont-func-cont)))
+ (func $scheduler-ykt (import "scheduler" "ykt") (param $nextk (ref $func-cont-func-cont)))
+ (func $scheduler-ytk (import "scheduler" "ytk") (param $nextk (ref $func-cont-func-cont)))
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (func $main (import "example" "main") (param $yield (ref $func)) (param $fork (ref $cont-func)))
+
+ (elem declare func $main)
+
+ (func $run (export "run")
+ (call $log (i32.const -1))
+ (call $scheduler-sync (cont.new $func-cont-func-cont (ref.func $main)))
+ (call $log (i32.const -2))
+ (call $scheduler-kt (cont.new $func-cont-func-cont (ref.func $main)))
+ (call $log (i32.const -3))
+ (call $scheduler-tk (cont.new $func-cont-func-cont (ref.func $main)))
+ (call $log (i32.const -4))
+ (call $scheduler-ykt (cont.new $func-cont-func-cont (ref.func $main)))
+ (call $log (i32.const -5))
+ (call $scheduler-ytk (cont.new $func-cont-func-cont (ref.func $main)))
+ (call $log (i32.const -6))
+ )
+)
+
+(invoke "run")
diff --git a/proposals/continuations/examples/fun-actor-lwt.wast b/proposals/continuations/examples/fun-actor-lwt.wast
new file mode 100644
index 00000000..a9cfff91
--- /dev/null
+++ b/proposals/continuations/examples/fun-actor-lwt.wast
@@ -0,0 +1,398 @@
+;; Actors via lightweight threads - functional version
+
+;; actor interface
+(module $actor
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (tag $self (export "self") (result i32))
+ (tag $spawn (export "spawn") (param (ref $cont)) (result i32))
+ (tag $send (export "send") (param i32 i32))
+ (tag $recv (export "recv") (result i32))
+)
+(register "actor")
+
+;; a simple example - pass a message through a chain of actors
+(module $chain
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (type $i-func (func (param i32)))
+ (type $i-cont (cont $i-func))
+
+ (tag $self (import "actor" "self") (result i32))
+ (tag $spawn (import "actor" "spawn") (param (ref $cont)) (result i32))
+ (tag $send (import "actor" "send") (param i32 i32))
+ (tag $recv (import "actor" "recv") (result i32))
+
+ (elem declare func $next)
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (func $next (param $p i32)
+ (local $s i32)
+ (local.set $s (suspend $recv))
+ (call $log (i32.const -1))
+ (suspend $send (local.get $s) (local.get $p))
+ )
+
+ ;; send the message 42 through a chain of n actors
+ (func $chain (export "chain") (param $n i32)
+ (local $s i32)
+ (local $p i32)
+ (local.set $p (suspend $self))
+
+ (loop $l
+ (if (i32.eqz (local.get $n))
+ (then (suspend $send (i32.const 42) (local.get $p)))
+ (else (local.set $p (suspend $spawn (cont.bind $i-cont $cont (local.get $p) (cont.new $i-cont (ref.func $next)))))
+ (local.set $n (i32.sub (local.get $n) (i32.const 1)))
+ (br $l))
+ )
+ )
+ (local.set $s (suspend $recv))
+ (call $log (local.get $s))
+ )
+)
+(register "chain")
+
+;; interface to lightweight threads
+(module $lwt
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (tag $yield (export "yield"))
+ (tag $fork (export "fork") (param (ref $cont)))
+)
+(register "lwt")
+
+;; queue of threads
+(module $queue
+ (type $func (func))
+ (type $cont (cont $func))
+
+ ;; Table as simple queue (keeping it simple, no ring buffer)
+ (table $queue 0 (ref null $cont))
+ (global $qdelta i32 (i32.const 10))
+ (global $qback (mut i32) (i32.const 0))
+ (global $qfront (mut i32) (i32.const 0))
+
+ (func $queue-empty (export "queue-empty") (result i32)
+ (i32.eq (global.get $qfront) (global.get $qback))
+ )
+
+ (func $dequeue (export "dequeue") (result (ref null $cont))
+ (local $i i32)
+ (if (call $queue-empty)
+ (then (return (ref.null $cont)))
+ )
+ (local.set $i (global.get $qfront))
+ (global.set $qfront (i32.add (local.get $i) (i32.const 1)))
+ (table.get $queue (local.get $i))
+ )
+
+ (func $enqueue (export "enqueue") (param $k (ref $cont))
+ ;; Check if queue is full
+ (if (i32.eq (global.get $qback) (table.size $queue))
+ (then
+ ;; Check if there is enough space in the front to compact
+ (if (i32.lt_u (global.get $qfront) (global.get $qdelta))
+ (then
+ ;; Space is below threshold, grow table instead
+ (drop (table.grow $queue (ref.null $cont) (global.get $qdelta)))
+ )
+ (else
+ ;; Enough space, move entries up to head of table
+ (global.set $qback (i32.sub (global.get $qback) (global.get $qfront)))
+ (table.copy $queue $queue
+ (i32.const 0) ;; dest = new front = 0
+ (global.get $qfront) ;; src = old front
+ (global.get $qback) ;; len = new back = old back - old front
+ )
+ (table.fill $queue ;; null out old entries to avoid leaks
+ (global.get $qback) ;; start = new back
+ (ref.null $cont) ;; init value
+ (global.get $qfront) ;; len = old front = old front - new front
+ )
+ (global.set $qfront (i32.const 0))
+ )
+ )
+ )
+ )
+ (table.set $queue (global.get $qback) (local.get $k))
+ (global.set $qback (i32.add (global.get $qback) (i32.const 1)))
+ )
+)
+(register "queue")
+
+;; simple scheduler
+(module $scheduler
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (tag $yield (import "lwt" "yield"))
+ (tag $fork (import "lwt" "fork") (param (ref $cont)))
+
+ (func $queue-empty (import "queue" "queue-empty") (result i32))
+ (func $dequeue (import "queue" "dequeue") (result (ref null $cont)))
+ (func $enqueue (import "queue" "enqueue") (param $k (ref $cont)))
+
+ (func $run (export "run") (param $main (ref $cont))
+ (call $enqueue (local.get $main))
+ (loop $l
+ (if (call $queue-empty) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (block $on_fork (result (ref $cont) (ref $cont))
+ (resume $cont (tag $yield $on_yield) (tag $fork $on_fork)
+ (call $dequeue)
+ )
+ (br $l) ;; thread terminated
+ ) ;; $on_fork (result (ref $cont) (ref $cont))
+ (call $enqueue) ;; current thread
+ (call $enqueue) ;; new thread
+ (br $l)
+ )
+ ;; $on_yield (result (ref $cont))
+ (call $enqueue) ;; current thread
+ (br $l)
+ )
+ )
+)
+(register "scheduler")
+
+(module $mailboxes
+ ;; Stupid implementation of mailboxes that raises an exception if
+ ;; there are too many mailboxes or if more than one message is sent
+ ;; to any given mailbox.
+ ;;
+ ;; Sufficient for the simple chain example.
+
+ ;; -1 means empty
+
+ (tag $too-many-mailboxes)
+ (tag $too-many-messages)
+
+ (memory 1)
+
+ (global $msize (mut i32) (i32.const 0))
+ (global $mmax i32 (i32.const 1024)) ;; maximum number of mailboxes
+
+ (func $init (export "init")
+ (memory.fill (i32.const 0) (i32.const -1) (i32.mul (global.get $mmax) (i32.const 4)))
+ )
+
+ (func $empty-mb (export "empty-mb") (param $mb i32) (result i32)
+ (local $offset i32)
+ (local.set $offset (i32.mul (local.get $mb) (i32.const 4)))
+ (i32.eq (i32.load (local.get $offset)) (i32.const -1))
+ )
+
+ (func $new-mb (export "new-mb") (result i32)
+ (local $mb i32)
+
+ (if (i32.ge_u (global.get $msize) (global.get $mmax))
+ (then (throw $too-many-mailboxes))
+ )
+
+ (local.set $mb (global.get $msize))
+ (global.set $msize (i32.add (global.get $msize) (i32.const 1)))
+ (return (local.get $mb))
+ )
+
+ (func $send-to-mb (export "send-to-mb") (param $v i32) (param $mb i32)
+ (local $offset i32)
+ (local.set $offset (i32.mul (local.get $mb) (i32.const 4)))
+ (if (call $empty-mb (local.get $mb))
+ (then (i32.store (local.get $offset) (local.get $v)))
+ (else (throw $too-many-messages))
+ )
+ )
+
+ (func $recv-from-mb (export "recv-from-mb") (param $mb i32) (result i32)
+ (local $v i32)
+ (local $offset i32)
+ (local.set $offset (i32.mul (local.get $mb) (i32.const 4)))
+ (local.set $v (i32.load (local.get $offset)))
+ (i32.store (local.get $offset) (i32.const -1))
+ (local.get $v)
+ )
+)
+(register "mailboxes")
+
+;; actors via lightweight threads
+(module $actor-as-lwt
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (type $i-func (func (param i32)))
+ (type $i-cont (cont $i-func))
+
+ (type $icont-func (func (param i32 (ref $cont))))
+ (type $icont-cont (cont $icont-func))
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ ;; lwt interface
+ (tag $yield (import "lwt" "yield"))
+ (tag $fork (import "lwt" "fork") (param (ref $cont)))
+
+ ;; mailbox interface
+ (func $init (import "mailboxes" "init"))
+ (func $empty-mb (import "mailboxes" "empty-mb") (param $mb i32) (result i32))
+ (func $new-mb (import "mailboxes" "new-mb") (result i32))
+ (func $send-to-mb (import "mailboxes" "send-to-mb") (param $v i32) (param $mb i32))
+ (func $recv-from-mb (import "mailboxes" "recv-from-mb") (param $mb i32) (result i32))
+
+ ;; actor interface
+ (tag $self (import "actor" "self") (result i32))
+ (tag $spawn (import "actor" "spawn") (param (ref $cont)) (result i32))
+ (tag $send (import "actor" "send") (param i32 i32))
+ (tag $recv (import "actor" "recv") (result i32))
+
+ (elem declare func $act-nullary $act-res)
+
+ ;; resume with $ik applied to $res
+ (func $act-res (param $mine i32) (param $res i32) (param $ik (ref $i-cont))
+ (local $yours i32)
+ (local $k (ref $cont))
+ (local $you (ref $cont))
+ (block $on_self (result (ref $i-cont))
+ (block $on_spawn (result (ref $cont) (ref $i-cont))
+ (block $on_send (result i32 i32 (ref $cont))
+ (block $on_recv (result (ref $i-cont))
+ ;; this should really be a tail call to the continuation
+ ;; do we need a 'return_resume' operator?
+ (resume $i-cont (tag $self $on_self)
+ (tag $spawn $on_spawn)
+ (tag $send $on_send)
+ (tag $recv $on_recv)
+ (local.get $res) (local.get $ik)
+ )
+ (return)
+ ) ;; $on_recv (result (ref $i-cont))
+ (local.set $ik)
+ ;; block this thread until the mailbox is non-empty
+ (loop $l
+ (if (call $empty-mb (local.get $mine))
+ (then (suspend $yield)
+ (br $l))
+ )
+ )
+ (call $recv-from-mb (local.get $mine))
+ (local.set $res)
+ (return_call $act-res (local.get $mine) (local.get $res) (local.get $ik))
+ ) ;; $on_send (result i32 i32 (ref $cont))
+ (local.set $k)
+ (call $send-to-mb)
+ (return_call $act-nullary (local.get $mine) (local.get $k))
+ ) ;; $on_spawn (result (ref $cont) (ref $i-cont))
+ (local.set $ik)
+ (local.set $you)
+ (call $new-mb)
+ (local.set $yours)
+ (suspend $fork (cont.bind $icont-cont $cont
+ (local.get $yours)
+ (local.get $you)
+ (cont.new $icont-cont (ref.func $act-nullary))))
+ (return_call $act-res (local.get $mine) (local.get $yours) (local.get $ik))
+ ) ;; $on_self (result (ref $i-cont))
+ (local.set $ik)
+ (return_call $act-res (local.get $mine) (local.get $mine) (local.get $ik))
+ )
+
+ ;; resume with nullary continuation
+ (func $act-nullary (param $mine i32) (param $k (ref $cont))
+ (local $res i32)
+ (local $ik (ref $i-cont))
+ (local $you (ref $cont))
+ (local $yours i32)
+ (block $on_self (result (ref $i-cont))
+ (block $on_spawn (result (ref $cont) (ref $i-cont))
+ (block $on_send (result i32 i32 (ref $cont))
+ (block $on_recv (result (ref $i-cont))
+ ;; this should really be a tail call to the continuation
+ ;; do we need a 'return_resume' operator?
+ (resume $cont (tag $self $on_self)
+ (tag $spawn $on_spawn)
+ (tag $send $on_send)
+ (tag $recv $on_recv)
+ (local.get $k)
+ )
+ (return)
+ ) ;; $on_recv (result (ref $i-cont))
+ (local.set $ik)
+ ;; block this thread until the mailbox is non-empty
+ (loop $l
+ (if (call $empty-mb (local.get $mine))
+ (then (suspend $yield)
+ (br $l))
+ )
+ )
+ (call $recv-from-mb (local.get $mine))
+ (local.set $res)
+ (return_call $act-res (local.get $mine) (local.get $res) (local.get $ik))
+ ) ;; $on_send (result i32 i32 (ref $cont))
+ (local.set $k)
+ (call $send-to-mb)
+ (return_call $act-nullary (local.get $mine) (local.get $k))
+ ) ;; $on_spawn (result (ref $cont) (ref $i-cont))
+ (local.set $ik)
+ (local.set $you)
+ (call $new-mb)
+ (local.set $yours)
+ (suspend $fork (cont.bind $icont-cont $cont
+ (local.get $yours)
+ (local.get $you)
+ (cont.new $icont-cont (ref.func $act-nullary))))
+ (return_call $act-res (local.get $mine) (local.get $yours) (local.get $ik))
+ ) ;; $on_self (result (ref $i-cont))
+ (local.set $ik)
+ (return_call $act-res (local.get $mine) (local.get $mine) (local.get $ik))
+ )
+
+ (func $act (export "act") (param $k (ref $cont))
+ (call $init)
+ (call $act-nullary (call $new-mb) (local.get $k))
+ )
+)
+(register "actor-as-lwt")
+
+;; composing the actor and scheduler handlers together
+(module $actor-scheduler
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (type $cont-func (func (param (ref $cont))))
+ (type $cont-cont (cont $cont-func))
+
+ (type $f-func (func (param (ref $func))))
+
+ (elem declare func $act $scheduler)
+
+ (func $act (import "actor-as-lwt" "act") (param $k (ref $cont)))
+ (func $scheduler (import "scheduler" "run") (param $k (ref $cont)))
+
+ (func $run-actor (export "run-actor") (param $k (ref $cont))
+ (call $scheduler (cont.bind $cont-cont $cont (local.get $k) (cont.new $cont-cont (ref.func $act))))
+ )
+)
+(register "actor-scheduler")
+
+(module
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (type $i-func (func (param i32)))
+ (type $i-cont (cont $i-func))
+
+ (elem declare func $chain)
+
+ (func $run-actor (import "actor-scheduler" "run-actor") (param $k (ref $cont)))
+ (func $chain (import "chain" "chain") (param $n i32))
+
+ (func $run-chain (export "run-chain") (param $n i32)
+ (call $run-actor (cont.bind $i-cont $cont (local.get $n) (cont.new $i-cont (ref.func $chain))))
+ )
+)
+
+(invoke "run-chain" (i32.const 64))
diff --git a/proposals/continuations/examples/fun-lwt.wast b/proposals/continuations/examples/fun-lwt.wast
new file mode 100644
index 00000000..2b57b53d
--- /dev/null
+++ b/proposals/continuations/examples/fun-lwt.wast
@@ -0,0 +1,267 @@
+;; functional lightweight threads
+
+;; interface to lightweight threads
+(module $lwt
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (tag $yield (export "yield")) ;; [] -> []
+ (tag $fork (export "fork") (param (ref $cont))) ;; [cont ([] -> [])] -> []
+)
+(register "lwt")
+
+(module $example
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (tag $yield (import "lwt" "yield")) ;; [] -> []
+ (tag $fork (import "lwt" "fork") (param (ref $cont))) ;; [cont ([] -> [])] -> []
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (elem declare func $thread1 $thread2 $thread3)
+
+ (func $main (export "main")
+ (call $log (i32.const 0))
+ (suspend $fork (cont.new $cont (ref.func $thread1)))
+ (call $log (i32.const 1))
+ (suspend $fork (cont.new $cont (ref.func $thread2)))
+ (call $log (i32.const 2))
+ (suspend $fork (cont.new $cont (ref.func $thread3)))
+ (call $log (i32.const 3))
+ )
+
+ (func $thread1
+ (call $log (i32.const 10))
+ (suspend $yield)
+ (call $log (i32.const 11))
+ (suspend $yield)
+ (call $log (i32.const 12))
+ )
+
+ (func $thread2
+ (call $log (i32.const 20))
+ (suspend $yield)
+ (call $log (i32.const 21))
+ (suspend $yield)
+ (call $log (i32.const 22))
+ )
+
+ (func $thread3
+ (call $log (i32.const 30))
+ (suspend $yield)
+ (call $log (i32.const 31))
+ (suspend $yield)
+ (call $log (i32.const 32))
+ )
+)
+(register "example")
+
+(module $queue
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ ;; Table as simple queue (keeping it simple, no ring buffer)
+ (table $queue 0 (ref null $cont))
+ (global $qdelta i32 (i32.const 10))
+ (global $qback (mut i32) (i32.const 0))
+ (global $qfront (mut i32) (i32.const 0))
+
+ (func $queue-empty (export "queue-empty") (result i32)
+ (i32.eq (global.get $qfront) (global.get $qback))
+ )
+
+ (func $dequeue (export "dequeue") (result (ref null $cont))
+ (local $i i32)
+ (if (call $queue-empty)
+ (then (return (ref.null $cont)))
+ )
+ (local.set $i (global.get $qfront))
+ (global.set $qfront (i32.add (local.get $i) (i32.const 1)))
+ (table.get $queue (local.get $i))
+ )
+
+ (func $enqueue (export "enqueue") (param $k (ref $cont))
+ ;; Check if queue is full
+ (if (i32.eq (global.get $qback) (table.size $queue))
+ (then
+ ;; Check if there is enough space in the front to compact
+ (if (i32.lt_u (global.get $qfront) (global.get $qdelta))
+ (then
+ ;; Space is below threshold, grow table instead
+ (drop (table.grow $queue (ref.null $cont) (global.get $qdelta)))
+ )
+ (else
+ ;; Enough space, move entries up to head of table
+ (global.set $qback (i32.sub (global.get $qback) (global.get $qfront)))
+ (table.copy $queue $queue
+ (i32.const 0) ;; dest = new front = 0
+ (global.get $qfront) ;; src = old front
+ (global.get $qback) ;; len = new back = old back - old front
+ )
+ (table.fill $queue ;; null out old entries to avoid leaks
+ (global.get $qback) ;; start = new back
+ (ref.null $cont) ;; init value
+ (global.get $qfront) ;; len = old front = old front - new front
+ )
+ (global.set $qfront (i32.const 0))
+ )
+ )
+ )
+ )
+ (table.set $queue (global.get $qback) (local.get $k))
+ (global.set $qback (i32.add (global.get $qback) (i32.const 1)))
+ )
+)
+(register "queue")
+
+(module $scheduler
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (tag $yield (import "lwt" "yield")) ;; [] -> []
+ (tag $fork (import "lwt" "fork") (param (ref $cont))) ;; [cont ([] -> [])] -> []
+
+ (func $queue-empty (import "queue" "queue-empty") (result i32))
+ (func $dequeue (import "queue" "dequeue") (result (ref null $cont)))
+ (func $enqueue (import "queue" "enqueue") (param $k (ref $cont)))
+
+ ;; synchronous scheduler (run current thread to completion without
+ ;; yielding)
+ (func $sync (export "sync") (param $nextk (ref null $cont))
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (block $on_fork (result (ref $cont) (ref $cont))
+ (resume $cont
+ (tag $yield $on_yield)
+ (tag $fork $on_fork)
+ (local.get $nextk)
+ )
+ (return_call $sync (call $dequeue))
+ ) ;; $on_fork (result (ref $cont) (ref $cont))
+ (local.set $nextk)
+ (call $enqueue)
+ (return_call $sync (local.get $nextk))
+ ) ;; $on_yield (result (ref $cont))
+ (return_call $sync)
+ )
+
+ ;; four different schedulers:
+ ;; * kt and tk don't yield on encountering a fork
+ ;; 1) kt runs the continuation, queuing up the new thread for later
+ ;; 2) tk runs the new thread first, queuing up the continuation for later
+ ;; * ykt and ytk do yield on encountering a fork
+ ;; 3) ykt runs the continuation, queuing up the new thread for later
+ ;; 4) ytk runs the new thread first, queuing up the continuation for later
+
+ ;; no yield on fork, continuation first
+ (func $kt (export "kt") (param $nextk (ref null $cont))
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (block $on_fork (result (ref $cont) (ref $cont))
+ (resume $cont
+ (tag $yield $on_yield)
+ (tag $fork $on_fork)
+ (local.get $nextk)
+ )
+ (return_call $kt (call $dequeue))
+ ) ;; $on_fork (result (ref $cont) (ref $cont))
+ (local.set $nextk)
+ (call $enqueue)
+ (return_call $kt (local.get $nextk))
+ ) ;; $on_yield (result (ref $cont))
+ (call $enqueue)
+ (return_call $kt (call $dequeue))
+ )
+
+ ;; no yield on fork, new thread first
+ (func $tk (export "tk") (param $nextk (ref null $cont))
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (block $on_fork (result (ref $cont) (ref $cont))
+ (resume $cont
+ (tag $yield $on_yield)
+ (tag $fork $on_fork)
+ (local.get $nextk))
+ (return_call $tk (call $dequeue))
+ ) ;; $on_fork (result (ref $cont) (ref $cont))
+ (return_call $tk (call $enqueue))
+ ) ;; $on_yield (result (ref $cont))
+ (call $enqueue)
+ (return_call $tk (call $dequeue))
+ )
+
+ ;; yield on fork, continuation first
+ (func $ykt (export "ykt") (param $nextk (ref null $cont))
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (block $on_fork (result (ref $cont) (ref $cont))
+ (resume $cont
+ (tag $yield $on_yield)
+ (tag $fork $on_fork)
+ (local.get $nextk)
+ )
+ (return_call $ykt (call $dequeue))
+ ) ;; $on_fork (result (ref $cont) (ref $cont))
+ (call $enqueue)
+ (call $enqueue)
+ (return_call $ykt (call $dequeue))
+ ) ;; $on_yield (result (ref $cont))
+ (call $enqueue)
+ (return_call $ykt (call $dequeue))
+ )
+
+ ;; yield on fork, new thread first
+ (func $ytk (export "ytk") (param $nextk (ref null $cont))
+ (local $k (ref $cont))
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (block $on_fork (result (ref $cont) (ref $cont))
+ (resume $cont (tag $yield $on_yield) (tag $fork $on_fork) (local.get $nextk))
+ (return_call $ytk (call $dequeue))
+ ) ;; $on_fork (result (ref $cont) (ref $cont))
+ (local.set $k)
+ (call $enqueue)
+ (call $enqueue (local.get $k))
+ (return_call $ytk (call $dequeue))
+ ) ;; $on_yield (result (ref $cont))
+ (call $enqueue)
+ (return_call $ytk (call $dequeue))
+ )
+)
+
+(register "scheduler")
+
+(module
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (func $scheduler-sync (import "scheduler" "sync") (param $nextk (ref null $cont)))
+ (func $scheduler-kt (import "scheduler" "kt") (param $nextk (ref null $cont)))
+ (func $scheduler-tk (import "scheduler" "tk") (param $nextk (ref null $cont)))
+ (func $scheduler-ykt (import "scheduler" "ykt") (param $nextk (ref null $cont)))
+ (func $scheduler-ytk (import "scheduler" "ytk") (param $nextk (ref null $cont)))
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (func $main (import "example" "main"))
+
+ (elem declare func $main)
+
+ (func (export "run")
+ (call $log (i32.const -1))
+ (call $scheduler-sync (cont.new $cont (ref.func $main)))
+ (call $log (i32.const -2))
+ (call $scheduler-kt (cont.new $cont (ref.func $main)))
+ (call $log (i32.const -3))
+ (call $scheduler-tk (cont.new $cont (ref.func $main)))
+ (call $log (i32.const -4))
+ (call $scheduler-ykt (cont.new $cont (ref.func $main)))
+ (call $log (i32.const -5))
+ (call $scheduler-ytk (cont.new $cont (ref.func $main)))
+ (call $log (i32.const -6))
+ )
+)
+
+(invoke "run")
+
diff --git a/proposals/continuations/examples/fun-pipes.wast b/proposals/continuations/examples/fun-pipes.wast
new file mode 100644
index 00000000..4c4008de
--- /dev/null
+++ b/proposals/continuations/examples/fun-pipes.wast
@@ -0,0 +1,88 @@
+;; Simple pipes example (functional version)
+(module $pipes
+ (type $pfun (func (result i32)))
+ (type $cfun (func (param i32) (result i32)))
+ (type $producer (cont $pfun))
+ (type $consumer (cont $cfun))
+
+ (tag $send (export "send") (param i32))
+ (tag $receive (export "receive") (result i32))
+
+ (func $piper (param $n i32) (param $p (ref $producer)) (param $c (ref $consumer))
+ (block $on-receive (result (ref $consumer))
+ (resume $consumer (tag $receive $on-receive) (local.get $n) (local.get $c))
+ (return)
+ ) ;; receive
+ (local.set $c)
+ (return_call $copiper (local.get $c) (local.get $p))
+ )
+
+ (func $copiper (param $c (ref $consumer)) (param $p (ref $producer))
+ (local $n i32)
+ (block $on-send (result i32 (ref $producer))
+ (resume $producer (tag $send $on-send) (local.get $p))
+ (return)
+ ) ;; send
+ (local.set $p)
+ (local.set $n)
+ (return_call $piper (local.get $n) (local.get $p) (local.get $c))
+ )
+
+ (func $pipe (export "pipe") (param $p (ref $producer)) (param $c (ref $consumer))
+ (call $piper (i32.const -1) (local.get $p) (local.get $c))
+ )
+)
+(register "pipes")
+
+(module
+ (type $pfun (func (result i32)))
+ (type $cfun (func (param i32) (result i32)))
+
+ (type $producer (cont $pfun))
+ (type $consumer (cont $cfun))
+
+ (tag $send (import "pipes" "send") (param i32))
+ (tag $receive (import "pipes" "receive") (result i32))
+
+ (func $pipe (import "pipes" "pipe") (param $p (ref $producer)) (param $c (ref $consumer)))
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (elem declare func $nats $sum)
+
+ ;; send n, n+1, ...
+ (func $nats (param $n i32) (result i32)
+ (loop $l
+ (call $log (i32.const -1))
+ (call $log (local.get $n))
+ (suspend $send (local.get $n))
+ (local.set $n (i32.add (local.get $n) (i32.const 1)))
+ (br $l)
+ )
+ (unreachable)
+ )
+
+ ;; receive 10 nats and return their sum
+ (func $sum (param $dummy i32) (result i32)
+ (local $i i32)
+ (local $a i32)
+ (local.set $i (i32.const 10))
+ (local.set $a (i32.const 0))
+ (loop $l
+ (local.set $a (i32.add (local.get $a) (suspend $receive)))
+ (call $log (i32.const -2))
+ (call $log (local.get $a))
+ (local.set $i (i32.sub (local.get $i) (i32.const 1)))
+ (br_if $l (i32.ne (local.get $i) (i32.const 0)))
+ )
+ (return (local.get $a))
+ )
+
+ (func (export "run") (param $n i32)
+ (call $pipe (cont.bind $consumer $producer (local.get $n) (cont.new $consumer (ref.func $nats)))
+ (cont.new $consumer (ref.func $sum))
+ )
+ )
+)
+
+(invoke "run" (i32.const 0))
diff --git a/proposals/continuations/examples/fun-state.wast b/proposals/continuations/examples/fun-state.wast
new file mode 100644
index 00000000..440aaedf
--- /dev/null
+++ b/proposals/continuations/examples/fun-state.wast
@@ -0,0 +1,61 @@
+;; Simple state example - functional with heterogeneous continuations
+(module $state
+ (tag $get (result i32))
+ (tag $set (param i32))
+
+ (type $gf (func (param i32) (result i32)))
+ (type $sf (func (result i32)))
+
+ (type $gk (cont $gf))
+ (type $sk (cont $sf))
+
+ (func $getting (param $k (ref $gk)) (param $s i32) (result i32)
+ (block $on_get (result (ref $gk))
+ (block $on_set (result i32 (ref $sk))
+ (resume $gk (tag $get $on_get) (tag $set $on_set)
+ (local.get $s) (local.get $k)
+ )
+ (return)
+ ) ;; $on_set (result i32 (ref $sk))
+ (return_call $setting)
+ ) ;; $on_get (result (ref $gk))
+ (local.get $s)
+ (return_call $getting)
+ )
+
+ (func $setting (param $s i32) (param $k (ref $sk)) (result i32)
+ (block $on_get (result (ref $gk))
+ (block $on_set (result i32 (ref $sk))
+ (resume $sk (tag $get $on_get) (tag $set $on_set)
+ (local.get $k)
+ )
+ (return)
+ ) ;; $on_set (result i32 (ref $sk))
+ (return_call $setting)
+ ) ;; $on_get (result (ref $gk))
+ (local.get $s)
+ (return_call $getting)
+ )
+
+ (func $f (result i32)
+ (suspend $set (i32.const 7))
+ (i32.add
+ (suspend $get)
+ (i32.mul
+ (i32.const 2)
+ (suspend $set (i32.const 3))
+ (i32.add
+ (i32.const 3)
+ (suspend $get)
+ )
+ )
+ )
+ )
+
+ (elem declare func $f)
+ (func (export "run") (result i32)
+ (call $setting (i32.const 0) (cont.new $sk (ref.func $f)))
+ )
+)
+
+(assert_return (invoke "run") (i32.const 19))
diff --git a/proposals/continuations/examples/generators.wast b/proposals/continuations/examples/generators.wast
new file mode 100644
index 00000000..a7ce4e05
--- /dev/null
+++ b/proposals/continuations/examples/generators.wast
@@ -0,0 +1,166 @@
+;; Generators
+
+;; generator interface
+(module $generator
+ (tag $yield (export "yield") (param i32))
+)
+(register "generator")
+
+(module $examples
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (tag $yield (import "generator" "yield") (param i32))
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ ;; yields successive natural numbers
+ (func $naturals (export "naturals")
+ (local $n i32)
+ (loop $l
+ (suspend $yield (local.get $n))
+ (local.set $n (i32.add (local.get $n) (i32.const 1)))
+ (br $l)
+ )
+ )
+
+ ;; yields 1-2-3
+ (func $one-two-three (export "one-two-three")
+ (suspend $yield (i32.const 1))
+ (suspend $yield (i32.const 2))
+ (suspend $yield (i32.const 3))
+ )
+
+ ;; yields successive Fibonacci numbers
+ (func $fibonacci (export "fibonacci")
+ (local $a i32)
+ (local $b i32)
+ (local $t i32)
+ (local.set $b (i32.const 1))
+ (loop $l
+ (suspend $yield (local.get $a))
+ (local.set $t (local.get $a))
+ (local.set $a (local.get $b))
+ (local.set $b (i32.add (local.get $t) (local.get $a)))
+ (br $l)
+ )
+ )
+
+ (func $print-first (export "print-first") (param $n i32) (param $k (ref $cont))
+ (loop $l
+ (block $on_yield (result i32 (ref $cont))
+ (if (local.get $n)
+ (then (resume $cont (tag $yield $on_yield) (local.get $k)))
+ )
+ (return)
+ ) ;; $on_yield (result i32 (ref $cont))
+ (local.set $k)
+ (call $log)
+ (local.set $n (i32.add (local.get $n) (i32.const -1)))
+ (br $l)
+ )
+ (unreachable)
+ )
+
+ (func $sum-first (export "sum-first") (param $n i32) (param $k (ref $cont)) (result i32)
+ (local $sum i32)
+ (loop $l
+ (block $on_yield (result i32 (ref $cont))
+ (if (local.get $n)
+ (then (resume $cont (tag $yield $on_yield) (local.get $k)))
+ )
+ (return (local.get $sum))
+ ) ;; $on_yield (result i32 (ref $cont))
+ (local.set $k)
+ (local.set $sum (i32.add (local.get $sum)))
+ (local.set $n (i32.add (local.get $n) (i32.const -1)))
+ (br $l)
+ )
+ (unreachable)
+ )
+)
+(register "examples")
+
+;; storing generators in a global table and then accessing them through i32 handles
+;; without knowledge of handlers
+(module $manager
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (tag $yield (import "generator" "yield") (param i32))
+
+ (table $active 0 (ref null $cont))
+
+ (func $init (export "init") (param $k (ref $cont)) (result i32)
+ (table.grow $active (local.get $k) (i32.const 1))
+ )
+
+ (func $next (export "next") (param $g i32) (result i32)
+ (local $next_k (ref $cont))
+ (local $next_v i32)
+ (block $on_yield (result i32 (ref $cont))
+ (resume $cont (tag $yield $on_yield)
+ (table.get $active (local.get $g))
+ )
+ (return (i32.const -1))
+ ) ;; $on_yield (result i32 (ref $cont))
+ (local.set $next_k)
+ (local.set $next_v)
+ (table.set $active (local.get $g) (local.get $next_k))
+ (return (local.get $next_v))
+ )
+)
+(register "manager")
+
+(module
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (elem declare func $naturals $fibonacci $one-two-three)
+
+ (func $log (import "spectest" "print_i32") (param i32))
+ (func $naturals (import "examples" "naturals"))
+ (func $fibonacci (import "examples" "fibonacci"))
+ (func $one-two-three (import "examples" "one-two-three"))
+ (func $print-first (import "examples" "print-first") (param $n i32) (param $k (ref $cont)))
+ (func $sum-first (import "examples" "sum-first") (param $n i32) (param $k (ref $cont)) (result i32))
+ (func $init (import "manager" "init") (param $k (ref $cont)) (result i32))
+ (func $next (import "manager" "next") (param i32) (result i32))
+
+ (func $print-with-next (param $n i32) (param $gen i32)
+ (loop $l
+ (if (i32.eqz (local.get $n)) (then (return)))
+ (call $next (local.get $gen))
+ (call $log)
+ (local.set $n (i32.add (local.get $n) (i32.const -1)))
+ (br $l)
+ )
+ )
+
+ (func $interleave-naturals-and-fib
+ (local $gen1 i32)
+ (local $gen2 i32)
+ (local.set $gen1 (call $init (cont.new $cont (ref.func $naturals))))
+ (local.set $gen2 (call $init (cont.new $cont (ref.func $fibonacci))))
+ (call $print-with-next (i32.const 5) (local.get $gen1))
+ (call $print-with-next (i32.const 5) (local.get $gen2))
+ (call $print-with-next (i32.const 5) (local.get $gen1))
+ (call $print-with-next (i32.const 5) (local.get $gen2))
+ (call $print-with-next (i32.const 5) (local.get $gen1))
+ (call $print-with-next (i32.const 5) (local.get $gen2))
+ (call $print-with-next (i32.const 5) (local.get $gen1))
+ (call $print-with-next (i32.const 5) (local.get $gen2))
+ )
+
+ (func $main (export "main")
+ (call $interleave-naturals-and-fib)
+ (call $print-first (i32.const 42) (cont.new $cont (ref.func $naturals)))
+ (call $print-first (i32.const 42) (cont.new $cont (ref.func $fibonacci)))
+ (call $sum-first (i32.const 101) (cont.new $cont (ref.func $naturals)))
+ (call $log)
+ (call $sum-first (i32.const 10) (cont.new $cont (ref.func $one-two-three)))
+ (call $log)
+ )
+)
+
+(invoke "main")
diff --git a/proposals/continuations/examples/lwt.wast b/proposals/continuations/examples/lwt.wast
new file mode 100644
index 00000000..65232d5b
--- /dev/null
+++ b/proposals/continuations/examples/lwt.wast
@@ -0,0 +1,294 @@
+;; dynamic lightweight threads
+
+;; interface to lightweight threads
+(module $lwt
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (tag $yield (export "yield")) ;; [] -> []
+ (tag $fork (export "fork") (param (ref $cont))) ;; [cont ([] -> [])] -> []
+)
+(register "lwt")
+
+(module $example
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (tag $yield (import "lwt" "yield")) ;; [] -> []
+ (tag $fork (import "lwt" "fork") (param (ref $cont))) ;; [cont ([] -> [])] -> []
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (elem declare func $thread1 $thread2 $thread3)
+
+ (func $main (export "main")
+ (call $log (i32.const 0))
+ (suspend $fork (cont.new $cont (ref.func $thread1)))
+ (call $log (i32.const 1))
+ (suspend $fork (cont.new $cont (ref.func $thread2)))
+ (call $log (i32.const 2))
+ (suspend $fork (cont.new $cont (ref.func $thread3)))
+ (call $log (i32.const 3))
+ )
+
+ (func $thread1
+ (call $log (i32.const 10))
+ (suspend $yield)
+ (call $log (i32.const 11))
+ (suspend $yield)
+ (call $log (i32.const 12))
+ )
+
+ (func $thread2
+ (call $log (i32.const 20))
+ (suspend $yield)
+ (call $log (i32.const 21))
+ (suspend $yield)
+ (call $log (i32.const 22))
+ )
+
+ (func $thread3
+ (call $log (i32.const 30))
+ (suspend $yield)
+ (call $log (i32.const 31))
+ (suspend $yield)
+ (call $log (i32.const 32))
+ )
+)
+(register "example")
+
+;; queue of threads
+(module $queue
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ ;; Table as simple queue (keeping it simple, no ring buffer)
+ (table $queue 0 (ref null $cont))
+ (global $qdelta i32 (i32.const 10))
+ (global $qback (mut i32) (i32.const 0))
+ (global $qfront (mut i32) (i32.const 0))
+
+ (func $queue-empty (export "queue-empty") (result i32)
+ (i32.eq (global.get $qfront) (global.get $qback))
+ )
+
+ (func $dequeue (export "dequeue") (result (ref null $cont))
+ (local $i i32)
+ (if (call $queue-empty)
+ (then (return (ref.null $cont)))
+ )
+ (local.set $i (global.get $qfront))
+ (global.set $qfront (i32.add (local.get $i) (i32.const 1)))
+ (table.get $queue (local.get $i))
+ )
+
+ (func $enqueue (export "enqueue") (param $k (ref null $cont))
+ ;; Check if queue is full
+ (if (i32.eq (global.get $qback) (table.size $queue))
+ (then
+ ;; Check if there is enough space in the front to compact
+ (if (i32.lt_u (global.get $qfront) (global.get $qdelta))
+ (then
+ ;; Space is below threshold, grow table instead
+ (drop (table.grow $queue (ref.null $cont) (global.get $qdelta)))
+ )
+ (else
+ ;; Enough space, move entries up to head of table
+ (global.set $qback (i32.sub (global.get $qback) (global.get $qfront)))
+ (table.copy $queue $queue
+ (i32.const 0) ;; dest = new front = 0
+ (global.get $qfront) ;; src = old front
+ (global.get $qback) ;; len = new back = old back - old front
+ )
+ (table.fill $queue ;; null out old entries to avoid leaks
+ (global.get $qback) ;; start = new back
+ (ref.null $cont) ;; init value
+ (global.get $qfront) ;; len = old front = old front - new front
+ )
+ (global.set $qfront (i32.const 0))
+ )
+ )
+ )
+ )
+ (table.set $queue (global.get $qback) (local.get $k))
+ (global.set $qback (i32.add (global.get $qback) (i32.const 1)))
+ )
+)
+(register "queue")
+
+(module $scheduler
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (tag $yield (import "lwt" "yield")) ;; [] -> []
+ (tag $fork (import "lwt" "fork") (param (ref $cont))) ;; [cont ([] -> [])] -> []
+
+ (func $queue-empty (import "queue" "queue-empty") (result i32))
+ (func $dequeue (import "queue" "dequeue") (result (ref null $cont)))
+ (func $enqueue (import "queue" "enqueue") (param $k (ref null $cont)))
+
+ ;; synchronous scheduler (run current thread to completion without
+ ;; yielding)
+ (func $sync (export "sync") (param $nextk (ref null $cont))
+ (loop $l
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (block $on_fork (result (ref $cont) (ref $cont))
+ (resume $cont
+ (tag $yield $on_yield)
+ (tag $fork $on_fork)
+ (local.get $nextk)
+ )
+ (local.set $nextk (call $dequeue))
+ (br $l) ;; thread terminated
+ ) ;; $on_fork (result (ref $cont) (ref $cont))
+ (local.set $nextk) ;; current thread
+ (call $enqueue) ;; new thread
+ (br $l)
+ ) ;; $on_yield (result (ref $cont))
+ (local.set $nextk) ;; carry on with current thread
+ (br $l)
+ )
+ )
+
+ ;; four asynchronous schedulers:
+ ;; * kt and tk don't yield on encountering a fork
+ ;; 1) kt runs the continuation, queuing up the new thread for later
+ ;; 2) tk runs the new thread first, queuing up the continuation for later
+ ;; * ykt and ytk do yield on encountering a fork
+ ;; 3) ykt runs the continuation, queuing up the new thread for later
+ ;; 4) ytk runs the new thread first, queuing up the continuation for later
+
+ ;; no yield on fork, continuation first
+ (func $kt (export "kt") (param $nextk (ref null $cont))
+ (loop $l
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (block $on_fork (result (ref $cont) (ref $cont))
+ (resume $cont
+ (tag $yield $on_yield)
+ (tag $fork $on_fork)
+ (local.get $nextk)
+ )
+ (local.set $nextk (call $dequeue))
+ (br $l) ;; thread terminated
+ ) ;; $on_fork (result (ref $cont) (ref $cont))
+ (local.set $nextk) ;; current thread
+ (call $enqueue) ;; new thread
+ (br $l)
+ ) ;; $on_yield (result (ref $cont))
+ (call $enqueue) ;; current thread
+ (local.set $nextk (call $dequeue)) ;; next thread
+ (br $l)
+ )
+ )
+
+ ;; no yield on fork, new thread first
+ (func $tk (export "tk") (param $nextk (ref null $cont))
+ (loop $l
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (block $on_fork (result (ref $cont) (ref $cont))
+ (resume $cont
+ (tag $yield $on_yield)
+ (tag $fork $on_fork)
+ (local.get $nextk)
+ )
+ (local.set $nextk (call $dequeue))
+ (br $l) ;; thread terminated
+ ) ;; $on_fork (result (ref $cont) (ref $cont))
+ (call $enqueue) ;; current thread
+ (local.set $nextk) ;; new thread
+ (br $l)
+ ) ;; $on_yield (result (ref $cont))
+ (call $enqueue) ;; current thread
+ (local.set $nextk (call $dequeue)) ;; next thread
+ (br $l)
+ )
+ )
+
+ ;; yield on fork, continuation first
+ (func $ykt (export "ykt") (param $nextk (ref null $cont))
+ (loop $l
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (block $on_fork (result (ref $cont) (ref $cont))
+ (resume $cont
+ (tag $yield $on_yield)
+ (tag $fork $on_fork)
+ (local.get $nextk)
+ )
+ (local.set $nextk (call $dequeue))
+ (br $l) ;; thread terminated
+ ) ;; $on_fork (result (ref $cont) (ref $cont))
+ (call $enqueue) ;; current thread
+ (call $enqueue) ;; new thread
+ (local.set $nextk (call $dequeue)) ;; next thread
+ (br $l)
+ ) ;; $on_yield (result (ref $cont))
+ (call $enqueue) ;; current thread
+ (local.set $nextk (call $dequeue)) ;; next thread
+ (br $l)
+ )
+ )
+
+ ;; yield on fork, new thread first
+ (func $ytk (export "ytk") (param $nextk (ref null $cont))
+ (loop $l
+ (if (ref.is_null (local.get $nextk)) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (block $on_fork (result (ref $cont) (ref $cont))
+ (resume $cont
+ (tag $yield $on_yield)
+ (tag $fork $on_fork)
+ (local.get $nextk)
+ )
+ (local.set $nextk (call $dequeue))
+ (br $l) ;; thread terminated
+ ) ;; $on_fork (result (ref $cont) (ref $cont))
+ (local.set $nextk)
+ (call $enqueue) ;; new thread
+ (call $enqueue (local.get $nextk)) ;; current thread
+ (local.set $nextk (call $dequeue)) ;; next thread
+ (br $l)
+ ) ;; $on_yield (result (ref $cont))
+ (call $enqueue) ;; current thread
+ (local.set $nextk (call $dequeue)) ;; next thread
+ (br $l)
+ )
+ )
+)
+(register "scheduler")
+
+(module
+ (type $func (func)) ;; [] -> []
+ (type $cont (cont $func)) ;; cont ([] -> [])
+
+ (func $scheduler-sync (import "scheduler" "sync") (param $nextk (ref null $cont)))
+ (func $scheduler-kt (import "scheduler" "kt") (param $nextk (ref null $cont)))
+ (func $schedule-tk (import "scheduler" "tk") (param $nextk (ref null $cont)))
+ (func $scheduler-ykt (import "scheduler" "ykt") (param $nextk (ref null $cont)))
+ (func $scheduler-ytk (import "scheduler" "ytk") (param $nextk (ref null $cont)))
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (func $main (import "example" "main"))
+
+ (elem declare func $main)
+
+ (func (export "run")
+ (call $log (i32.const -1))
+ (call $scheduler-sync (cont.new $cont (ref.func $main)))
+ (call $log (i32.const -2))
+ (call $scheduler-kt (cont.new $cont (ref.func $main)))
+ (call $log (i32.const -3))
+ (call $schedule-tk (cont.new $cont (ref.func $main)))
+ (call $log (i32.const -4))
+ (call $scheduler-ykt (cont.new $cont (ref.func $main)))
+ (call $log (i32.const -5))
+ (call $scheduler-ytk (cont.new $cont (ref.func $main)))
+ (call $log (i32.const -6))
+ )
+)
+
+(invoke "run")
diff --git a/proposals/continuations/examples/pipes.wast b/proposals/continuations/examples/pipes.wast
new file mode 100644
index 00000000..e3581785
--- /dev/null
+++ b/proposals/continuations/examples/pipes.wast
@@ -0,0 +1,95 @@
+;; Simple pipes example
+(module $pipes
+ (type $pfun (func (result i32)))
+ (type $cfun (func (param i32) (result i32)))
+ (type $producer (cont $pfun))
+ (type $consumer (cont $cfun))
+
+ (tag $send (export "send") (param i32))
+ (tag $receive (export "receive") (result i32))
+
+ (func $piper (export "pipe") (param $p (ref $producer)) (param $c (ref $consumer))
+ (local $n i32)
+ (local $consuming i32)
+
+ (local.set $n (i32.const -1))
+ (local.set $consuming (i32.const 1))
+
+ (loop $l
+ (if (local.get $consuming)
+ (then
+ (block $on-receive (result (ref $consumer))
+ (resume $consumer (tag $receive $on-receive) (local.get $n) (local.get $c))
+ (return)
+ ) ;; receive
+ (local.set $c)
+ (local.set $consuming (i32.const 0))
+ (br $l)
+ )
+ ) ;; else producing
+ (block $on-send (result i32 (ref $producer))
+ (resume $producer (tag $send $on-send) (local.get $p))
+ (return)
+ ) ;; send
+ (local.set $p)
+ (local.set $n)
+ (local.set $consuming (i32.const 1))
+ (br $l)
+ )
+ )
+)
+
+(register "pipes")
+
+(module
+ (type $pfun (func (result i32)))
+ (type $cfun (func (param i32) (result i32)))
+
+ (type $producer (cont $pfun))
+ (type $consumer (cont $cfun))
+
+ (tag $send (import "pipes" "send") (param i32))
+ (tag $receive (import "pipes" "receive") (result i32))
+
+ (func $pipe (import "pipes" "pipe") (param $p (ref $producer)) (param $c (ref $consumer)))
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (elem declare func $nats $sum)
+
+ ;; send n, n+1, ...
+ (func $nats (param $n i32) (result i32)
+ (loop $l
+ (call $log (i32.const -1))
+ (call $log (local.get $n))
+ (suspend $send (local.get $n))
+ (local.set $n (i32.add (local.get $n) (i32.const 1)))
+ (br $l)
+ )
+ (unreachable)
+ )
+
+ ;; receive 10 nats and return their sum
+ (func $sum (param $dummy i32) (result i32)
+ (local $i i32)
+ (local $a i32)
+ (local.set $i (i32.const 10))
+ (local.set $a (i32.const 0))
+ (loop $l
+ (local.set $a (i32.add (local.get $a) (suspend $receive)))
+ (call $log (i32.const -2))
+ (call $log (local.get $a))
+ (local.set $i (i32.sub (local.get $i) (i32.const 1)))
+ (br_if $l (i32.ne (local.get $i) (i32.const 0)))
+ )
+ (return (local.get $a))
+ )
+
+ (func (export "run") (param $n i32)
+ (call $pipe (cont.bind $consumer $producer (local.get $n) (cont.new $consumer (ref.func $nats)))
+ (cont.new $consumer (ref.func $sum))
+ )
+ )
+)
+
+(invoke "run" (i32.const 0))
diff --git a/proposals/continuations/examples/static-lwt.wast b/proposals/continuations/examples/static-lwt.wast
new file mode 100644
index 00000000..22bd0f34
--- /dev/null
+++ b/proposals/continuations/examples/static-lwt.wast
@@ -0,0 +1,151 @@
+;; static lightweight threads
+
+;; interface to a fixed collection of lightweight threads
+(module $lwt
+ (tag $yield (export "yield"))
+)
+(register "lwt")
+
+(module $example
+ (tag $yield (import "lwt" "yield"))
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (func $thread1 (export "thread1")
+ (call $log (i32.const 10))
+ (suspend $yield)
+ (call $log (i32.const 11))
+ (suspend $yield)
+ (call $log (i32.const 12))
+ )
+
+ (func $thread2 (export "thread2")
+ (call $log (i32.const 20))
+ (suspend $yield)
+ (call $log (i32.const 21))
+ (suspend $yield)
+ (call $log (i32.const 22))
+ )
+
+ (func $thread3 (export "thread3")
+ (call $log (i32.const 30))
+ (suspend $yield)
+ (call $log (i32.const 31))
+ (suspend $yield)
+ (call $log (i32.const 32))
+ )
+)
+(register "example")
+
+;; queue of threads
+(module $queue
+ (type $func (func))
+ (type $cont (cont $func))
+
+ ;; Table as simple queue (keeping it simple, no ring buffer)
+ (table $queue 0 (ref null $cont))
+ (global $qdelta i32 (i32.const 10))
+ (global $qback (mut i32) (i32.const 0))
+ (global $qfront (mut i32) (i32.const 0))
+
+ (func $queue-empty (export "queue-empty") (result i32)
+ (i32.eq (global.get $qfront) (global.get $qback))
+ )
+
+ (func $dequeue (export "dequeue") (result (ref null $cont))
+ (local $i i32)
+ (if (call $queue-empty)
+ (then (return (ref.null $cont)))
+ )
+ (local.set $i (global.get $qfront))
+ (global.set $qfront (i32.add (local.get $i) (i32.const 1)))
+ (table.get $queue (local.get $i))
+ )
+
+ (func $enqueue (export "enqueue") (param $k (ref $cont))
+ ;; Check if queue is full
+ (if (i32.eq (global.get $qback) (table.size $queue))
+ (then
+ ;; Check if there is enough space in the front to compact
+ (if (i32.lt_u (global.get $qfront) (global.get $qdelta))
+ (then
+ ;; Space is below threshold, grow table instead
+ (drop (table.grow $queue (ref.null $cont) (global.get $qdelta)))
+ )
+ (else
+ ;; Enough space, move entries up to head of table
+ (global.set $qback (i32.sub (global.get $qback) (global.get $qfront)))
+ (table.copy $queue $queue
+ (i32.const 0) ;; dest = new front = 0
+ (global.get $qfront) ;; src = old front
+ (global.get $qback) ;; len = new back = old back - old front
+ )
+ (table.fill $queue ;; null out old entries to avoid leaks
+ (global.get $qback) ;; start = new back
+ (ref.null $cont) ;; init value
+ (global.get $qfront) ;; len = old front = old front - new front
+ )
+ (global.set $qfront (i32.const 0))
+ )
+ )
+ )
+ )
+ (table.set $queue (global.get $qback) (local.get $k))
+ (global.set $qback (i32.add (global.get $qback) (i32.const 1)))
+ )
+)
+(register "queue")
+
+(module $scheduler
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (tag $yield (import "lwt" "yield"))
+
+ ;; queue interface
+ (func $queue-empty (import "queue" "queue-empty") (result i32))
+ (func $dequeue (import "queue" "dequeue") (result (ref null $cont)))
+ (func $enqueue (import "queue" "enqueue") (param $k (ref $cont)))
+
+ (func $run (export "run")
+ (loop $l
+ (if (call $queue-empty) (then (return)))
+ (block $on_yield (result (ref $cont))
+ (resume $cont (tag $yield $on_yield)
+ (call $dequeue)
+ )
+ (br $l) ;; thread terminated
+ ) ;; $on_yield (result (ref $cont))
+ (call $enqueue) ;; continuation of current thread
+ (br $l)
+ )
+ )
+)
+(register "scheduler")
+
+(module
+ (type $func (func))
+ (type $cont (cont $func))
+
+ (func $scheduler (import "scheduler" "run"))
+ (func $enqueue (import "queue" "enqueue") (param (ref $cont)))
+
+ (func $log (import "spectest" "print_i32") (param i32))
+
+ (func $thread1 (import "example" "thread1"))
+ (func $thread2 (import "example" "thread2"))
+ (func $thread3 (import "example" "thread3"))
+
+ (elem declare func $thread1 $thread2 $thread3)
+
+ (func (export "run")
+ (call $enqueue (cont.new $cont (ref.func $thread1)))
+ (call $enqueue (cont.new $cont (ref.func $thread2)))
+ (call $enqueue (cont.new $cont (ref.func $thread3)))
+
+ (call $log (i32.const -1))
+ (call $scheduler)
+ (call $log (i32.const -2))
+ )
+)
+
+(invoke "run")
diff --git a/proposals/stack-switching-requirements/requirements.md b/proposals/stack-switching-requirements/requirements.md
new file mode 100644
index 00000000..32f83c0c
--- /dev/null
+++ b/proposals/stack-switching-requirements/requirements.md
@@ -0,0 +1,35 @@
+# Multi-Stack Requirements
+
+## Goals
+Capabilities to permit Wasm applications to safely and efficiently implement common patterns of non-local control flow.
+
+## Critical Use Cases
+* Async/await
+* Green Threads
+* Yield-style generators
+* First class continuations (e.g., Scheme, react-style programming over large trees)
+
+## Critical Success Factors
+* Must respect browser implementation constraints
+* Must permit competitive implementations in modern Wasm engines
+* Must interoperate with JS Promises
+ * Must not allow stack switching to leak into JavaScript
+* Must be consistent with modern Control Flow Integrity measures
+* Must be compatible with existing and proposed Wasm extensions; e.g.,
+ * Exception handling
+ * Threading
+ * Garbage Collection
+* Must enable applications/engines to maintain critical invariants:
+ * Maintaining the integrity of the host embedder’s event loop
+ * Preserving critical sections
+ * Ensuring reachability of linear memory GC roots
+ * Maintaining correctness of application shadow stacks
+
+## Criteria
+1. Prefer languages and features that are most likely to lead to adoption
+1. Prefer approaches that are known to be efficiently implementable
+ * Without undue reliance on in-engine optimization techniques
+1. Prefer ‘small’ designs that are compatible with future features
+1. Prefer expressive orthogonally composable sets of features
+1. Prefer designs that facilitate reasoning about execution environment
+1. Prefer designs that enable embedder-neutral Wasm modules
diff --git a/proposals/stack-switching/Explainer.md b/proposals/stack-switching/Explainer.md
new file mode 100644
index 00000000..5915ad2b
--- /dev/null
+++ b/proposals/stack-switching/Explainer.md
@@ -0,0 +1,876 @@
+# Stack-switching
+
+This proposal adds typed stack-switching to WebAssembly, enabling a
+single WebAssembly instance to manage multiple execution stacks
+concurrently. The primary use-case for stack-switching is to add
+direct support for modular compilation of advanced non-local control
+flow idioms, e.g. coroutines, async/await, generators, lightweight
+threads, and so forth. This document outlines the new instructions and
+validation rules to facilitate stack-switching.
+
+## Table of contents
+1. [Motivation](#motivation)
+1. [Continuations](#continuations)
+1. [Introduction to continuation-based stack-switching](#introduction-to-continuation-based-stack-switching)
+ 1. [Generators](#generators)
+ 1. [Task scheduling](#task-scheduling)
+1. [Instruction set extension](#instruction-set-extension)
+ 1. [Declaring control tags](#declaring-control-tags)
+ 1. [Creating continuations](#creating-continuations)
+ 1. [Invoking continuations](#invoking-continuations)
+ 1. [Suspending continuations](#suspending-continuations)
+ 1. [Partial continuation application](#partial-continuation-application)
+ 1. [Continuation lifetime](#continuation-lifetime)
+1. [Design considerations](#design-considerations)
+ 1. [Asymmetric switching](#asymmetric-switching)
+ 1. [Symmetric switching](#symmetric-switching)
+ 1. [Partial application](#partial-application)
+ 1. [One-shot continuations](#one-shot-continuations)
+1. [Specification changes](#specification-changes)
+ 1. [Types](#types)
+ 1. [Tags](#tags)
+ 1. [Instructions](#instructions)
+ 1. [Execution](#execution)
+ 1. [Binary format](#binary-format)
+
+## Motivation
+
+Non-local control flow features provide the ability to suspend the
+current execution context and later resume it. Many
+industrial-strength programming languages feature a wealth of
+non-local control flow features such as async/await, coroutines,
+generators/iterators, effect handlers, and so forth. For some
+programming languages non-local control flow is central to their
+identity, meaning that they rely on non-local control flow for
+efficiency, e.g. to support massively scalable concurrency.
+
+Rather than build specific control flow mechanisms for all possible
+varieties of non-local control flow, our strategy is to build a single
+mechanism, *continuations*, that can be used by language providers to
+construct their own language specific features.
+
+## Continuations
+
+A continuation represents a snapshot of execution on a particular
+stack. Stack-switching is realised by instructions for suspending and
+resuming continuations. Continuations are composable, meaning that
+when a suspended continuation is resumed it is spliced onto the
+current continuation. This splicing establishes a parent-child
+relationship between the current and resumed continuation. In this
+respect the design provides a form of *asymmetric switching*.
+
+When suspending, we provide a tag and payload, much like when raising
+an exception. Correspondingly, when resuming a suspended continuation
+a *handler* is installed which specifies different behaviours for the
+different kinds of tag with which the resumed continuation may
+subsequently be suspended. Unlike for a normal exception handler the
+handler is passed the suspended continuation as well as a payload.
+
+We also offer an alternative to the interface based on suspending and
+resuming continuations by way of an instruction for direct switching
+between continuations. Direct switching combines suspending the
+current continuation with resuming a previously suspended peer
+continuation. Direct switching establishes a peer-to-peer relationship
+between the current continuation and its peer. In this respect the
+design provides a form of *symmetric switching*.
+
+## Introduction to continuation-based stack-switching
+
+We illustrate the proposed stack-switching mechanism using two
+examples: generators and task scheduling. The generators example uses
+asymmetric stack-switching and the task scheduling example uses
+symmetric stack-switching.
+
+### Generators
+
+The first example illustrates a generator-consumer pattern. Execution
+switches back and forth between a generator and a consumer execution
+stack. Whenever execution switches from the generator to the consumer
+the generator also passes a value to the consumer.
+
+The stack-switching proposal reuses *tags* from the exception handling
+proposal. The following tag is used to coordinate between the
+generator and the consumer.
+
+```wat
+(tag $gen (param i32))
+```
+
+To switch execution to the consumer, the generator must suspend
+execution. The suspend instruction takes a tag, here `$gen`. The tag
+is used at runtime to determine how to continue execution, by
+identifying the active *suspend handler* for that tag. In our example,
+this handler will be provided by the consumer.
+
+Much like with exceptions, a tag's parameter types specify the types
+of values passed from the suspend site to the corresponding handler.
+In our example, the tag's single `i32` parameter is the value produced
+by the generator and passed to the consumer.
+
+The proposal also extends tags with result types (not used in this
+example). These specify the types of values passed from a resume site
+to a previously suspended continuation when it is resumed.
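+
+As an illustrative sketch (the tag name `$ask` is hypothetical and not
+part of this example), a tag declaring both parameter and result types
+could be written as follows:
+
+```wat
+;; i32 payload passed to the handler on suspend;
+;; i32 passed back to the continuation when it is resumed
+(tag $ask (param i32) (result i32))
+```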
+
+The overall module implementing our example has the following shape.
+
+```wat
+(module $generator
+ (type $ft (func))
+ ;; Types of continuations used by the generator:
+ ;; No need for param or result types: No data passed back to the
+ ;; generator when resuming it, and $generator function has no return
+ ;; values.
+ (type $ct (cont $ft))
+
+ ;; Tag used to coordinate between generator and consumer: The i32 param
+ ;; corresponds to the generated values passed; no values passed back from
+ ;; generator to consumer.
+ (tag $gen (param i32))
+
+
+ (func $print (import "spectest" "print_i32") (param i32))
+
+ ;; Simple generator yielding values from 100 down to 1
+ (func $generator ...)
+ (elem declare func $generator)
+
+ (func $consumer ...)
+
+)
+```
+
+The module defines *continuation type* `$ct` based on function type
+`$ft`. Suspended continuations of type `(ref $ct)` take no arguments
+and return no results. As we shall see, the generator and consumer
+manipulate suspended continuations of type `(ref $ct)`.
+
+The generator is defined as follows.
+
+```wat
+;; Simple generator yielding values from 100 down to 1
+(func $generator
+ (local $i i32)
+ (local.set $i (i32.const 100))
+ (loop $l
+ ;; Suspend execution, pass current value of $i to consumer
+ (suspend $gen (local.get $i))
+ ;; Decrement $i and exit loop once $i reaches 0
+ (local.tee $i (i32.sub (local.get $i) (i32.const 1)))
+ (br_if $l)
+ )
+)
+```
+
+It executes 100 iterations of a loop and returns afterwards. Execution
+is suspended on each iteration using the `$gen` tag. The value passed
+from the `suspend` instruction to the handler (i.e., the value
+produced by the generator) is just the current value of the loop
+counter.
+
+The consumer is defined as follows.
+
+```wat
+(func $consumer
+ (local $c (ref $ct))
+ ;; Create continuation executing function $generator.
+ ;; Execution only starts when resumed for the first time.
+ (local.set $c (cont.new $ct (ref.func $generator)))
+
+ (loop $loop
+ (block $on_gen (result i32 (ref $ct))
+ ;; Resume continuation $c
+ (resume $ct (on $gen $on_gen) (local.get $c))
+ ;; $generator returned: no more data
+ (return)
+ )
+ ;; Generator suspended, stack now contains [i32 (ref $ct)]
+ ;; Save continuation to resume it in the next iteration
+ (local.set $c)
+ ;; Stack now contains the i32 value produced by $generator
+ (call $print)
+
+ (br $loop)
+ )
+)
+```
+
+It uses `cont.new` to create a continuation executing the
+generator. This instruction creates a value of reference type `(ref
+$ct)`, saved in `$c`. It then runs a loop, where the `resume`
+instruction is used to continue execution of the continuation
+currently saved in `$c` on each iteration.
+
+In general, a `resume` instruction takes not only a suspended
+continuation as an argument, but also additional values to be passed
+to the suspended continuation when it is resumed. These are specified
+in the parameters of the continuation's type. In our example, `$ct`
+has no parameters, indicating that no data is passed from the consumer
+to the generator.
+
+When a suspended continuation is resumed it is spliced onto the
+current continuation (which may in fact be the top-level continuation
+corresponding to the main stack). This splicing establishes a
+parent-child relationship between the current and resumed
+continuation. This asymmetric relationship affects execution in two
+ways, which we now discuss.
+
+First, in the `resume` instruction the *handler clause* `(on $gen
+$on_gen)` installs a suspend handler for that tag while executing the
+continuation. This means that if during execution of `$c`, the
+continuation executes the instruction `suspend $gen`, execution
+continues in the block `$on_gen`. In general, executing an instruction
+`suspend $e` for some tag `$e` means that execution continues at the
+*innermost* ancestor whose `resume` instruction installed a suspend
+handler for `$e`. This behaviour is directly analogous to the search
+for a matching exception handler after raising an exception. However,
+it is more general in that the handler is also passed the suspended
+current continuation. The extent of a suspended continuation captures
+execution from the instruction immediately following `suspend $e` up
+to the `resume` instruction that handles `$e`. In other words, as well
+as resuming a suspended computation and installing a handler, the
+`resume` instruction also acts as a *delimiter* for new suspended
+continuations created by performing `suspend $e` in the scope of the
+resumed continuation.
+
+When the generator executes `suspend $gen`, execution continues in the
+`$on_gen` block in `$consumer`. In that case, two values are pushed
+onto the Wasm value stack. The topmost value is a new suspended
+continuation. It is the continuation of executing the generator
+following the `suspend` instruction (up to the handler). The other
+value is the `i32` value passed from the generator to the consumer, as
+required by the tag's definition. In our example, the consumer simply
+prints the generated value and saves the new continuation in `$c` to
+be resumed in the next iteration.
+
+Second, the parent-child relationship determines where execution
+continues after a continuation returns. Control simply transfers to
+the next instruction after the `resume` instruction that resumed the
+continuation in the parent, just as a normal function call returns to
+the instruction after its call site. For instance, in our example once
+the loop counter `$i` reaches 0, the `$generator` function returns and
+we have reached the end of the continuation. Execution then continues
+in the parent, immediately after the `resume` instruction executed by the
+consumer, and the consumer also returns.
+
+The full definition of this module can be found
+[here](examples/generator.wast).
+
+### Task scheduling
+
+The second example demonstrates how to implement task scheduling with
+the stack-switching instructions. Specifically, suppose we want to
+schedule a number of tasks, represented by functions `$task_0` to
+`$task_n`, to be executed concurrently. Scheduling is cooperative,
+meaning that tasks explicitly yield execution so that a scheduler may
+pick the next task to run.
+
+One option is to use the asymmetric stack-switching approach we used
+for the generator example. We define a function `$entry` that resumes
+the initial task and installs a handler for a `$yield` tag inside an
+event loop. In order to yield execution, tasks simply perform
+`(suspend $yield)`, transferring control back to the parent
+continuation, which is the event loop. The event loop then selects the
+next task (if any) from a task queue and resumes it.
+
+This approach is illustrated by the following skeleton code.
+
+```wat
+(module $scheduler1
+
+ (type $ft (func))
+ ;; Continuation type of all tasks
+ (type $ct (cont $ft))
+
+
+ ;; Tag used to yield execution in one task and resume another one.
+ (tag $yield)
+
+ ;; Used by scheduler to manage task continuations
+ (table $task_queue 1000 (ref null $ct))
+
+ ;; Entry point, becomes parent of all tasks.
+ ;; Also acts as scheduler when tasks yield or finish.
+ (func $entry (param $initial_task (ref $ft))
+ ;; initialise $task_queue with $initial_task
+ ...
+ (loop $resume_next
+ ;; pick $next_task from queue, or return if no more tasks.
+ ...
+ (block $on_yield (result (ref $ct))
+ (resume $ct (on $yield $on_yield) (local.get $next_task))
+ ;; task finished execution
+ (br $resume_next)
+ )
+ ;; task suspended: put continuation in queue, then loop to determine next
+ ;; one to resume.
+ ...
+ )
+ )
+
+ (func $task_0
+ ...
+ ;; To yield execution, simply suspend to scheduling logic in $entry.
+ (suspend $yield)
+ ...
+ )
+
+ ...
+
+ (func $task_n ...)
+
+)
+```
+
+Note that `$entry` performs all scheduling; it is responsible for
+picking the next task to resume in two different circumstances: a) if
+the most recent task suspended itself, and b) if it simply ran to
+completion and returned. However, notice that this asymmetric approach
+requires two stack switches in order to change execution from one task
+to another: first when suspending from a task to the event loop,
+and second when the event loop resumes the next task.
+
+Our proposal provides a mechanism to optimise the particular pattern
+shown here, where suspending one continuation is followed by a handler
+resuming another continuation `$c`. We can use the `switch`
+instruction, which also relies on tags, to transfer control from the
+original continuation directly to `$c`, thus avoiding the need for an
+intermediate stack switch to the parent.
+
+Concretely, executing `switch $ct $yield (local.get $c)` in our
+example behaves equivalently to `(suspend $yield)`, assuming that the
+active (ordinary) handler for `$yield` immediately resumes `$c` and
+additionally passes the continuation obtained from handling `$yield`
+along as an argument to `$c`. However, as mentioned above, using a
+`switch` instruction here has the advantage that a Wasm engine can
+implement it directly using only a single stack switch.
+Each `switch` instruction is annotated with the type of the
+continuation switched to.
+
+The key idea is to inline scheduling logic in the tasks themselves,
+reducing (or avoiding altogether) the need to switch stacks to the
+event loop when switching tasks.
+
+This alternative approach is illustrated by the following skeleton
+code.
+
+```wat
+(module $scheduler2
+ (rec
+ (type $ft (func (param (ref null $ct))))
+ ;; Continuation type of all tasks
+ (type $ct (cont $ft))
+ )
+
+ ;; Tag used to yield execution in one task and resume another one.
+ (tag $yield)
+
+ ;; Used by scheduler to manage task continuations
+ (table $task_queue 1000 (ref null $ct))
+
+ ;; Entry point, becomes parent of all tasks.
+ ;; Only acts as scheduler when tasks finish.
+ (func $entry (param $initial_task (ref $ft))
+ ;; initialise $task_queue with $initial_task
+ ...
+ (loop $resume_next
+ ;; pick $next_task from queue, or return if no more tasks.
+ ;; Note that there is no suspend handler for $yield
+ ...
+ (resume $ct (on $yield switch) (ref.null $ct) (local.get $next_task))
+ ;; task finished execution: loop to pick next one
+ (br $resume_next)
+ ...
+ )
+ )
+
+ (func $task_0 (type $ft)
+ ;; If $c is not null, put in task_queue.
+ ...
+ ;; To yield execution, call $yield_to_next
+ (call $yield_to_next)
+ ...
+ )
+
+ ...
+
+ (func $task_n (type $ft) ...)
+
+ ;; Determines next task to switch to directly.
+ (func $yield_to_next
+ ;; determine $next_task
+ ...
+ (block $done
+ (br_if $done (ref.is_null (local.get $next_task)))
+ ;; Switch to $next_task.
+ ;; The switch instruction implicitly passes a reference to the currently
+ ;; executing continuation as an argument to $next_task.
+ (switch $ct $yield (local.get $next_task))
+ ;; If we get here, some other continuation switch-ed directly to us, or
+ ;; $entry resumed us.
+ ;; In the first case, we receive the continuation that switched to us here
+ ;; and we need to enqueue it in the task list.
+ ;; In the second case, the passed continuation reference will be null.
+ ...
+ )
+ ;; Just return if no other task in queue, making the $yield_to_next call
+ ;; a noop.
+ )
+
+)
+```
+
+Here, the event loop is still responsible for resuming tasks from the
+task queue, starting with some initial task. Thus, it will still be
+the parent of all task continuations, as in the previous version.
+However, it is no longer responsible for handling suspensions of
+tasks. Instead, it only resumes the next task from the queue whenever
+the previously running task returns. When one task wishes to yield
+execution to another, the switch is handled by the tasks themselves,
+using the `switch` instruction.
+
+That the event loop does not handle suspensions is reflected in its
+`resume` instruction, which does not install a suspend handler for the
+`$yield` tag. Instead, the `resume` instruction installs a *switch
+handler* for the tag `$yield`. In order to yield execution, a task
+calls a separate function `$yield_to_next`. The scheduling logic picks
+the next task `$next_task` and switches directly to it. Here, the
+target continuation (i.e., `$next_task` in `$yield_to_next`) receives
+the suspended current continuation (i.e., the one that just called
+`$yield_to_next`) as an argument. The payload-passing mechanism used
+for integer values in the generator example is now used to pass
+continuation references. The task that we switched to is now
+responsible for enqueuing the previous continuation (i.e., the one
+received as a payload) in the task list.
+
+As a minor complication, we need to encode the fact that the
+continuation switched to receives the current one as an argument in
+the type of the continuations handled by all scheduling logic. This
+means the type `$ct` must be recursive: a continuation of this type
+takes a value of type `(ref null $ct)` as a parameter. In order to
+give the same type to continuations that have yielded execution (those
+created by `switch`) and those continuations that correspond to
+beginning the execution of a `$task_i` function (those created by
+`cont.new`), we add a `(ref null $ct)` parameter to all of the
+`$task_i` functions. Finally, observe that the event loop passes a
+null continuation to any continuation it resumes, indicating to the
+resumed continuation that there is no previous continuation to enqueue
+in the task list.
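+
+Concretely, the scheduler example declares this recursive continuation
+type with a `rec` group, as in the full example linked below:
+
+```wast
+  (rec
+    (type $ft (func (param (ref null $ct))))
+    (type $ct (cont $ft)))
+```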
+
+Note that installing a switch handler for `$yield` in `$entry` is
+strictly necessary. It acts as a delimiter, determining the extent of
+the suspended continuation created when performing `switch` with tag
+`$yield`. This form of stack-switching is symmetric in the following
+sense. Rather than switching back to the parent (as `suspend` would),
+`switch` effectively replaces the continuation under the handler for
+`$yield` in the event loop with a different continuation.
+
+The proposal also allows passing additional payloads when performing a
+`switch` instruction, besides the suspended current continuation. For
+simplicity, our example does not make use of this feature, as we can
+see from the type `$ct`, which has no further parameters besides the
+continuation argument required by `switch`. However, this mechanism
+could be used to optimise the implementation of task scheduling
+further.
+
+In `$scheduler2`, if a `$task_i` function finishes and therefore
+returns, two stack switches are required to continue execution in the
+next task in the queue. This is because the returning
+continuation switches to the parent (i.e., the event loop), which then
+resumes the next task. To avoid this additional stack switch, we could
+add boilerplate code to all of our task functions. Immediately before
+a task function would ordinarily return, it should instead switch to
+the next task. When doing so, it should pass a new flag to the target
+continuation to indicate that the source continuation should not be
+enqueued in the task list, but should instead be cancelled. Cancellation
+can be implemented using another instruction, `resume_throw`, which is
+described later in the document.
+
+Full versions of `$scheduler1` and `$scheduler2` can be found
+[here](examples/scheduler1.wast) and [here](examples/scheduler2.wast).
+
+## Instruction set extension
+
+Here we give an informal account of the proposed instruction set
+extension. In the [specification changes](#specification-changes) we
+give a more formal account of the validation rules and changes to the
+binary format.
+
+For simplicity we ignore subtyping in this section, but in the
+[specification changes](#specification-changes) we take full account
+of subtyping.
+
+The proposal adds a new reference type for continuations.
+
+```wast
+ (cont $ft)
+```
+
+A continuation type is specified in terms of a function type `$ft`,
+whose parameter types `t1*` describe the expected stack shape prior to
+resuming/starting the continuation, and whose return types `t2*`
+describe the stack shape after the continuation has run to completion.
+
+As a shorthand, we will often write the function type inline and write
+a continuation type as
+
+```wast
+ (cont [t1*] -> [t2*])
+```
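+
+For example, the generator example in this document declares its
+continuation type by first naming a function type (the inline arrow
+form above is merely shorthand notation):
+
+```wast
+  (type $ft (func))      ;; [] -> []
+  (type $ct (cont $ft))  ;; shorthand: (cont [] -> [])
+```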
+
+### Declaring control tags
+
+Control tags generalise exception tags to include result
+types. Operationally, a control tag may be thought of as a *resumable*
+exception. A tag declaration provides the type signature of a control
+tag.
+
+```wast
+ (tag $t (param t1*) (result t2*))
+```
+
+The `$t` is the symbolic index of the control tag in the index space
+of tags. The parameter types `t1*` describe the expected stack layout
+prior to invoking the tag, and the result types `t2*` describe the
+stack layout following an invocation of the operation.
+
+We will often write `$t : [t1*] -> [t2*]` as shorthand for indicating
+that such a declaration is in scope.
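+
+For example, the `$yield` tag used by the generator carries an `i32`
+payload and returns nothing, while a tag with a non-empty result type
+(hypothetical here, but permitted by this proposal) describes an
+operation whose handler must pass a value back on resumption:
+
+```wast
+  (tag $yield (param i32))  ;; $yield : [i32] -> []
+  (tag $ask (result i32))   ;; $ask : [] -> [i32] (hypothetical)
+```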
+
+### Creating continuations
+
+The following instruction creates a *suspended continuation* from a
+function.
+
+```wast
+ cont.new $ct : [(ref $ft)] -> [(ref $ct)]
+ where:
+ - $ft = func [t1*] -> [t2*]
+ - $ct = cont $ft
+```
+
+It takes a reference to a function of type `[t1*] -> [t2*]` whose body
+may perform non-local control flow.
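+
+For example, the generator example creates a suspended continuation
+from the `$generator` function; execution of the function body only
+starts once the continuation is resumed:
+
+```wast
+  (elem declare func $generator)
+  ;; ... in some function body:
+  (local.set $c (cont.new $ct (ref.func $generator)))
+```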
+
+### Invoking continuations
+
+There are three ways to invoke a suspended continuation.
+
+The first way to invoke a continuation is to resume the suspended
+continuation under a *handler*. The handler specifies what to do when
+control is subsequently suspended again.
+
+```wast
+ resume $ct hdl* : [t1* (ref $ct)] -> [t2*]
+ where:
+ - $ct = cont [t1*] -> [t2*]
+```
+
+The `resume` instruction is parameterised by a continuation type and a
+handler dispatch table `hdl`. The shape of `hdl` can be either:
+
+1. `(on $e $l)` mapping the control tag `$e` to the label
+`$l`. Intercepting `$e` causes a branch to `$l`.
+
+2. `(on $e switch)` allowing a direct switch with control tag `$e`.
+
+The `resume` instruction consumes its continuation argument, meaning
+that a continuation may be resumed only once.
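+
+For example, the consumer in the generator example resumes under a
+handler that maps `$yield` to the label `$on_yield`; when the
+generator suspends, control branches there with the tag payload and
+the suspended continuation on the stack:
+
+```wast
+  (block $on_yield (result i32 (ref $ct))
+    (resume $ct (on $yield $on_yield) (local.get $c))
+    ;; $generator returned normally
+    (return)
+  )
+  ;; stack here: [i32 (ref $ct)]
+```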
+
+The second way to invoke a continuation is to raise an exception at
+the control tag invocation site which causes the stack to be unwound.
+
+```wast
+  resume_throw $ct $exn hdl* : [te* (ref $ct)] -> [t2*]
+ where:
+ - $ct = cont [t1*] -> [t2*]
+ - $exn : [te*] -> []
+```
+
+The `resume_throw` instruction is parameterised by a continuation
+type, the exception to be raised at the control tag invocation site,
+and a handler dispatch table. As with `resume`, this instruction also
+fully consumes its continuation argument. This instruction raises the
+exception `$exn` with parameters of type `te*` at the control tag
+invocation point in the context of the supplied continuation. As an
+exception is being raised (the continuation is not actually being
+supplied a value) the parameter types for the continuation `t1*` are
+unconstrained.
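+
+As a sketch (the exception tag `$abort` and the local `$task` are
+hypothetical), a scheduler could cancel a suspended task by forcing an
+exception at its suspension point; the exception then propagates and
+may be caught by ordinary exception handlers:
+
+```wast
+  (tag $abort)  ;; exception tag, $abort : [] -> []
+  ;; ... in some function body:
+  (resume_throw $ct $abort (local.get $task))
+```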
+
+The third way to invoke a continuation is to perform a symmetric
+switch.
+
+```wast
+ switch $ct1 $e : [t1* (ref $ct1)] -> [t2*]
+ where:
+ - $e : [] -> [t*]
+ - $ct1 = cont [t1* (ref $ct2)] -> [t*]
+ - $ct2 = cont [t2*] -> [t*]
+```
+
+The `switch` instruction is parameterised by the type of the
+continuation argument (`$ct1`) and a control tag
+(`$e`). It suspends the current continuation (of type `$ct2`), then
+performs a direct switch to the suspended peer continuation (of type
+`$ct1`), passing in the required parameters (including the just
+suspended current continuation, in order to allow the peer to switch
+back again). As with `resume` and `resume_throw`, the `switch`
+instruction fully consumes its suspended continuation argument.
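+
+This is precisely what `$yield_to_next` in the scheduler example does:
+it switches to the next task and, upon being switched (or resumed)
+back, receives a continuation reference on the stack:
+
+```wast
+  (switch $ct $yield (local.get $next_task))
+  ;; control has come back: the stack holds the (ref null $ct) passed
+  ;; by whoever switched to us (null if we were resumed by $entry)
+  (local.set $received_task)
+```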
+
+### Suspending continuations
+
+The current continuation can be suspended.
+
+```wast
+ suspend $e : [t1*] -> [t2*]
+ where:
+ - $e : [t1*] -> [t2*]
+```
+
+The `suspend` instruction invokes the control tag `$e` with arguments
+of types `t1*`. It suspends the current continuation up to the nearest
+enclosing handler for `$e`. This behaviour is similar to how raising
+an exception transfers control to the nearest exception handler that
+handles the exception. The key difference is that the continuation at
+the suspension point expects to be resumed later with arguments of
+types `t2*`.
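+
+For example, the generator suspends with the tag
+`$yield : [i32] -> []`, passing the current value to the handler and
+expecting no values back upon resumption:
+
+```wast
+  (suspend $yield (local.get $i))
+```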
+
+### Partial continuation application
+
+A suspended continuation can be partially applied to a prefix of its
+arguments yielding another suspended continuation.
+
+```wast
+ cont.bind $ct1 $ct2 : [t1* (ref $ct1)] -> [(ref $ct2)]
+ where:
+ - $ct1 = cont [t1* t3*] -> [t2*]
+ - $ct2 = cont [t3*] -> [t2*]
+```
+
+The `cont.bind` instruction binds a prefix of its arguments of type
+`t1*` to a suspended continuation of type `$ct1`, yielding a modified
+suspended continuation of type `$ct2`. The `cont.bind` instruction
+also consumes its continuation argument, and yields a new continuation
+that can be supplied to `resume`,`resume_throw`, `switch` or
+`cont.bind`.
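+
+For example (type and local names here are illustrative), a
+continuation expecting an `i32` argument can be specialised to one
+expecting no further arguments:
+
+```wast
+  (type $ft1 (func (param i32)))
+  (type $ct1 (cont $ft1))  ;; cont [i32] -> []
+  (type $ft2 (func))
+  (type $ct2 (cont $ft2))  ;; cont [] -> []
+  ;; ... in some function body, with $k1 : (ref $ct1):
+  (local.set $k2 (cont.bind $ct1 $ct2 (i32.const 42) (local.get $k1)))
+```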
+
+### Continuation lifetime
+
+#### Producing continuations
+
+There are four different ways in which continuations may be produced
+(`cont.new`, `suspend`, `cont.bind`, `switch`). A fresh continuation
+object is allocated with `cont.new`, while the current continuation is
+reused with `suspend`, `cont.bind`, and `switch`.
+
+The `cont.bind` instruction is similar to the `func.bind` instruction
+that was initially part of the function references proposal. However,
+whereas the latter necessitates the allocation of a new closure, as
+continuations are single-shot no allocation is necessary: all
+allocation happens when the original continuation is created by
+preallocating one slot for each continuation argument.
+
+#### Consuming continuations
+
+There are four different ways in which suspended continuations are
+consumed (`resume`, `resume_throw`, `switch`, `cont.bind`). A suspended
+continuation may be resumed under a particular handler with `resume`;
+aborted with `resume_throw`; directly switched to via `switch`; or
+partially applied with `cont.bind`.
+
+In order to ensure that continuations are one-shot, `resume`,
+`resume_throw`, `switch`, and `cont.bind` destructively modify the
+suspended continuation such that any subsequent use of the same
+suspended continuation will result in a trap.
+
+## Design considerations
+
+In this section we discuss some key design considerations.
+
+### Asymmetric switching
+
+Resuming a suspended continuation establishes a parent-child
+relationship, which aligns with the caller-callee relationship of
+standard function calls. This means that no special plumbing is needed
+in order to compose the non-local control features we define with
+built-in non-local control features such as traps, exceptions, and
+embedder integration.
+
+### Symmetric switching
+
+Direct switching to a suspended peer continuation is semantically
+equivalent to suspending the current continuation with a special
+switch tag whose payload is the suspended peer continuation in the
+context of a handler which resumes the peer continuation. However,
+direct switching can (and should) be optimised to avoid the need to
+switch control to the handler before switching control to the peer.
+
+### Partial application
+
+Partial application can be important in practice due to the block and
+type structure of Wasm: in order to return a continuation from a
+block, all branches within the block must agree on the type of the
+continuation. Using `cont.bind`, a producer can ensure that the
+branches within a block each produce a continuation of the same type.
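+
+A sketch of this pattern (names illustrative): one branch already
+holds a nullary continuation, while the other adapts an
+`i32`-expecting continuation with `cont.bind`, so that both branches
+agree on the type `(ref $ct)`:
+
+```wast
+  ;; with $ct = cont [] -> [] and $cti = cont [i32] -> []
+  (if (result (ref $ct)) (local.get $cond)
+    (then (local.get $k0))  ;; already (ref $ct)
+    (else (cont.bind $cti $ct (i32.const 0) (local.get $k1)))
+  )
+```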
+
+### One-shot continuations
+
+Continuations in the current proposal are single-shot (aka linear),
+meaning that they should be invoked exactly once. A continuation can
+be invoked either by resuming it (with `resume`); by aborting it (with
+`resume_throw`); or by switching to it (with `switch`). An attempt to
+invoke a continuation more than once results in a trap. Some
+applications such as backtracking, probabilistic programming, and
+process duplication exploit multi-shot continuations, but none of our
+critical use-cases requires multi-shot continuations.
+
+## Specification changes
+
+This proposal is based on the [function references proposal](https://github.com/WebAssembly/function-references)
+and [exception handling proposal](https://github.com/WebAssembly/exception-handling).
+
+### Types
+
+We extend the structure of composite types and heap types as follows.
+
+- `cont <typeidx>` is a new form of composite type
+ - `(cont $ft) ok` iff `$ft ok` and `$ft = [t1*] -> [t2*]`
+
+We add two new continuation heap types and their subtyping hierarchy:
+- `heaptypes ::= ... | cont | nocont`
+- `nocont ok` and `cont ok` always
+- `nocont` is the bottom type of continuation types, whereas `cont` is the top type, i.e. `nocont <: cont`
+
+### Tags
+
+We change the wellformedness condition for tag types to be more liberal, i.e.
+
+- `(tag $t (type $ft)) ok` iff `$ft ok` and `$ft = [t1*] -> [t2*]`
+
+In other words, the return type of tag types is allowed to be non-empty.
+
+### Instructions
+
+The new instructions and their validation rules are as follows. To simplify the presentation, we write this:
+
+```
+C.types[$ct] ~~ cont [t1*] -> [t2*]
+```
+
+where we really mean this:
+
+```
+C.types[$ct] ~~ cont $ft
+C.types[$ft] ~~ func [t1*] -> [t2*]
+```
+
+This abbreviation will be formalised with an auxiliary function or other means in the spec.
+
+- `cont.new <typeidx>`
+ - Create a new continuation from a given typed funcref.
+ - `cont.new $ct : [(ref null $ft)] -> [(ref $ct)]`
+ - iff `C.types[$ct] ~~ cont [t1*] -> [t2*]`
+
+- `cont.bind <typeidx> <typeidx>`
+ - Partially apply a continuation.
+ - `cont.bind $ct $ct' : [t3* (ref null $ct)] -> [(ref $ct')]`
+ - iff `C.types[$ct] ~~ cont [t3* t1*] -> [t2*]`
+ - and `C.types[$ct'] ~~ cont [t1'*] -> [t2'*]`
+ - and `[t1*] -> [t2*] <: [t1'*] -> [t2'*]`
+
+- `resume <typeidx> hdl*`
+ - Execute a given continuation.
+ - If the executed continuation suspends with a control tag `$t`, the corresponding handler `(on $t H)` is executed.
+ - `resume $ct hdl* : [t1* (ref null $ct)] -> [t2*]`
+ - iff `C.types[$ct] ~~ cont [t1*] -> [t2*]`
+ - and `(hdl : t2*)*`
+
+- `resume_throw <typeidx> <exnidx> hdl*`
+ - Execute a given continuation, but force it to immediately throw the annotated exception.
+ - Used to abort a continuation.
+ - `resume_throw $ct $e hdl* : [te* (ref null $ct)] -> [t2*]`
+ - iff `C.types[$ct] ~~ cont [t1*] -> [t2*]`
+ - and `C.tags[$e] : tag $ft`
+ - and `C.types[$ft] ~~ func [te*] -> []`
+ - and `(hdl : t2*)*`
+
+- `hdl = (on <tagidx> <labelidx>) | (on <tagidx> switch)`
+ - Handlers attached to `resume` and `resume_throw`, handling control tags for `suspend` and `switch`, respectively.
+ - `(on $e $l) : t*`
+ - iff `C.tags[$e] = tag $ft`
+ - and `C.types[$ft] ~~ func [t1*] -> [t2*]`
+ - and `C.labels[$l] = [t1'* (ref null? $ct)]`
+ - and `t1* <: t1'*`
+ - and `C.types[$ct] ~~ cont [t2'*] -> [t'*]`
+ - and `[t2*] -> [t*] <: [t2'*] -> [t'*]`
+ - `(on $e switch) : t*`
+ - iff `C.tags[$e] = tag $ft`
+ - and `C.types[$ft] ~~ func [] -> [t*]`
+
+- `suspend <tagidx>`
+ - Use a control tag to suspend the current computation.
+ - `suspend $t : [t1*] -> [t2*]`
+ - iff `C.tags[$t] = tag $ft`
+ - and `C.types[$ft] ~~ func [t1*] -> [t2*]`
+
+- `switch <typeidx> <tagidx>`
+ - Switch to executing a given continuation directly, suspending the current execution.
+ - The suspension and switch are performed from the perspective of a parent `(on $e switch)` handler, determined by the annotated control tag.
+ - `switch $ct1 $e : [t1* (ref null $ct1)] -> [t2*]`
+ - iff `C.tags[$e] = tag $ft`
+ - and `C.types[$ft] ~~ func [] -> [t*]`
+ - and `C.types[$ct1] ~~ cont [t1* (ref null? $ct2)] -> [te1*]`
+ - and `te1* <: t*`
+ - and `C.types[$ct2] ~~ cont [t2*] -> [te2*]`
+ - and `t* <: te2*`
+
+### Execution
+
+The same control tag may be used simultaneously by `throw`, `suspend`,
+`switch`, and their associated handlers. When searching for a handler
+for an event, only handlers for the matching kind of event are
+considered, e.g. only `(on $e $l)` handlers can handle `suspend`
+events and only `(on $e switch)` handlers can handle `switch`
+events. The handler search continues past handlers for the wrong kind
+of event, even if they use the correct tag.
+
+### Binary format
+
+We extend the binary format of composite types, heap types, and instructions.
+
+#### Composite types
+
+| Opcode | Type | Parameters | Note |
+| ------ | --------------- | ---------- |------|
+| -0x20 | `func t1* t2*` | `t1* : vec(valtype)` `t2* : vec(valtype)` | from Wasm 1.0 |
+| -0x23 | `cont $ft` | `$ft : typeidx` | new |
+
+#### Heap Types
+
+The opcode for heap types is encoded as an `s33`.
+
+| Opcode | Type | Parameters | Note |
+| ------ | --------------- | ---------- | ---- |
+| i >= 0 | i | | from function-references |
+| -0x0b | `nocont` | | new |
+| -0x18 | `cont` | | new |
+
+#### Instructions
+
+We use the opcode space `0xe0-0xe5` for the six new instructions.
+
+| Opcode | Instruction | Immediates |
+| ------ | ------------------------ | ---------- |
+| 0xe0 | `cont.new $ct` | `$ct : u32` |
+| 0xe1 | `cont.bind $ct $ct'` | `$ct : u32`, `$ct' : u32` |
+| 0xe2 | `suspend $t` | `$t : u32` |
+| 0xe3 | `resume $ct hdl*` | `$ct : u32` (for hdl see below) |
+| 0xe4 | `resume_throw $ct $e hdl*` | `$ct : u32`, `$e : u32` (for hdl see below) |
+| 0xe5 | `switch $ct1 $e` | `$ct1 : u32`, `$e : u32` |
+
+In the case of `resume` and `resume_throw` we use a leading byte to
+indicate the shape of `hdl` as follows.
+
+| Opcode | On clause shape | Immediates |
+| ------ | --------------- | ---------- |
+| 0x00 | `(on $t $h)` | `$t : u32`, `$h : u32` |
+| 0x01 | `(on $t switch)` | `$t : u32` |
diff --git a/proposals/stack-switching/examples/generator.wast b/proposals/stack-switching/examples/generator.wast
new file mode 100644
index 00000000..4908a4f0
--- /dev/null
+++ b/proposals/stack-switching/examples/generator.wast
@@ -0,0 +1,54 @@
+(module $generator
+ (type $ft (func))
+ ;; Types of continuations used by the generator:
+  ;; No need for param or result types: no data is passed back to the
+  ;; generator when resuming it, and the $generator function has no
+  ;; return values.
+ (type $ct (cont $ft))
+
+ ;; Tag used to coordinate between generator and consumer: The i32 param
+ ;; corresponds to the generated values passed; no values passed back from
+ ;; generator to consumer.
+ (tag $yield (param i32))
+
+
+ (func $print (import "spectest" "print_i32") (param i32))
+
+ ;; Simple generator yielding values from 100 down to 1
+ (func $generator
+ (local $i i32)
+ (local.set $i (i32.const 100))
+ (loop $l
+ ;; Suspend execution, pass current value of $i to consumer
+ (suspend $yield (local.get $i))
+ ;; Decrement $i and exit loop once $i reaches 0
+ (local.tee $i (i32.sub (local.get $i) (i32.const 1)))
+ (br_if $l)
+ )
+ )
+ (elem declare func $generator)
+
+ (func $consumer
+ (local $c (ref $ct))
+ ;; Create continuation executing function $generator.
+ ;; Execution only starts when resumed for the first time.
+ (local.set $c (cont.new $ct (ref.func $generator)))
+
+ (loop $loop
+ (block $on_yield (result i32 (ref $ct))
+ ;; Resume continuation $c
+ (resume $ct (on $yield $on_yield) (local.get $c))
+ ;; $generator returned: no more data
+ (return)
+ )
+ ;; Generator suspended, stack now contains [i32 (ref $ct)]
+ ;; Save continuation to resume it in next iteration
+ (local.set $c)
+ ;; Stack now contains the i32 value yielded by $generator
+ (call $print)
+
+ (br $loop)
+ )
+ )
+
+)
diff --git a/proposals/stack-switching/examples/scheduler1.wast b/proposals/stack-switching/examples/scheduler1.wast
new file mode 100644
index 00000000..fd919e09
--- /dev/null
+++ b/proposals/stack-switching/examples/scheduler1.wast
@@ -0,0 +1,160 @@
+;; queue of threads
+(module $queue
+
+ (type $ft (func))
+ (type $ct (cont $ft))
+
+ ;; Table as simple queue (keeping it simple, no ring buffer)
+ (table $task_queue 0 (ref null $ct))
+ (global $qdelta i32 (i32.const 10))
+ (global $qback (mut i32) (i32.const 0))
+ (global $qfront (mut i32) (i32.const 0))
+
+ (func $queue_empty (export "queue-empty") (result i32)
+ (i32.eq (global.get $qfront) (global.get $qback))
+ )
+
+ (func $dequeue (export "dequeue") (result (ref null $ct))
+ (local $i i32)
+ (if (call $queue_empty)
+ (then (return (ref.null $ct)))
+ )
+ (local.set $i (global.get $qfront))
+ (global.set $qfront (i32.add (local.get $i) (i32.const 1)))
+ (table.get $task_queue (local.get $i))
+ )
+
+ (func $enqueue (export "enqueue") (param $k (ref null $ct))
+ ;; Check if queue is full
+ (if (i32.eq (global.get $qback) (table.size $task_queue))
+ (then
+ ;; Check if there is enough space in the front to compact
+ (if (i32.lt_u (global.get $qfront) (global.get $qdelta))
+ (then
+ ;; Space is below threshold, grow table instead
+ (drop (table.grow $task_queue (ref.null $ct) (global.get $qdelta)))
+ )
+ (else
+ ;; Enough space, move entries up to head of table
+ (global.set $qback (i32.sub (global.get $qback) (global.get $qfront)))
+ (table.copy $task_queue $task_queue
+ (i32.const 0) ;; dest = new front = 0
+ (global.get $qfront) ;; src = old front
+ (global.get $qback) ;; len = new back = old back - old front
+ )
+ (table.fill $task_queue ;; null out old entries to avoid leaks
+ (global.get $qback) ;; start = new back
+ (ref.null $ct) ;; init value
+ (global.get $qfront) ;; len = old front = old front - new front
+ )
+ (global.set $qfront (i32.const 0))
+ )
+ )
+ )
+ )
+ (table.set $task_queue (global.get $qback) (local.get $k))
+ (global.set $qback (i32.add (global.get $qback) (i32.const 1)))
+ )
+)
+(register "queue")
+
+(module $scheduler1
+ (type $ft (func))
+ ;; Continuation type of all tasks
+ (type $ct (cont $ft))
+
+
+ (func $task_enqueue (import "queue" "enqueue") (param (ref null $ct)))
+ (func $task_dequeue (import "queue" "dequeue") (result (ref null $ct)))
+ (func $task_queue-empty (import "queue" "queue-empty") (result i32))
+ (func $print_i32 (import "spectest" "print_i32") (param i32))
+
+ ;; Tag used to yield execution in one task and resume another one.
+ (tag $yield)
+
+ ;; Entry point, becomes parent of all tasks.
+ ;; Also acts as scheduler when tasks yield or finish.
+ (func $entry (param $initial_task (ref $ft))
+ (local $next_task (ref null $ct))
+
+ ;; initialise $task_queue with initial task
+ (call $task_enqueue (cont.new $ct (local.get $initial_task)))
+
+ (loop $resume_next
+ ;; pick $next_task from queue, or return if no more tasks.
+ (if (call $task_queue-empty)
+ (then (return))
+ (else (local.set $next_task (call $task_dequeue)))
+ )
+ (block $on_yield (result (ref $ct))
+ (resume $ct (on $yield $on_yield) (local.get $next_task))
+ ;; task finished execution
+ (br $resume_next)
+ )
+ ;; task suspended: put continuation in queue, then loop to determine next
+ ;; one to resume.
+ (call $task_enqueue)
+ (br $resume_next)
+ )
+ )
+
+ ;; To simplify the example, all task_i functions execute this function. Each
+ ;; task has an $id, but this is only used for printing.
+ ;; $to_spawn represents another task that this function will add to the task
+ ;; queue, unless the reference is null.
+ (func $task_impl
+ (param $id i32)
+ (param $to_spawn (ref null $ft))
+
+ (if (ref.is_null (local.get $to_spawn))
+ (then)
+ (else (call $task_enqueue (cont.new $ct (local.get $to_spawn)))))
+
+ (call $print_i32 (local.get $id))
+ (suspend $yield)
+ (call $print_i32 (local.get $id))
+ )
+
+ ;; The actual $task_i functions simply call $task_impl, with i as the value
+ ;; for $id, and $task_(i+1) as the task to spawn, except for $task_3, which
+ ;; does not spawn another task.
+ ;;
+ ;; The observant reader may note that all $task_i functions may be seen as
+ ;; partial applications of $task_impl.
+ ;; Indeed, we could obtain *continuations* running each $task_i from a
+ ;; continuation running $task_impl and cont.bind.
+
+ (func $task_3
+ (i32.const 3)
+ (ref.null $ft)
+ (call $task_impl)
+ )
+ (elem declare func $task_3)
+
+ (func $task_2
+ (i32.const 2)
+ (ref.func $task_3)
+ (call $task_impl)
+ )
+ (elem declare func $task_2)
+
+ (func $task_1
+ (i32.const 1)
+ (ref.func $task_2)
+ (call $task_impl)
+ )
+ (elem declare func $task_1)
+
+ (func $task_0
+ (i32.const 0)
+ (ref.func $task_1)
+ (call $task_impl)
+ )
+ (elem declare func $task_0)
+
+
+ (func (export "main")
+ (call $entry (ref.func $task_0))
+ )
+)
+(invoke "main")
diff --git a/proposals/stack-switching/examples/scheduler2.wast b/proposals/stack-switching/examples/scheduler2.wast
new file mode 100644
index 00000000..8fe4bfed
--- /dev/null
+++ b/proposals/stack-switching/examples/scheduler2.wast
@@ -0,0 +1,198 @@
+;; queue of threads
+(module $queue
+ (rec
+ (type $ft (func (param (ref null $ct))))
+ (type $ct (cont $ft)))
+
+ ;; Table as simple queue (keeping it simple, no ring buffer)
+ (table $task_queue 0 (ref null $ct))
+ (global $qdelta i32 (i32.const 10))
+ (global $qback (mut i32) (i32.const 0))
+ (global $qfront (mut i32) (i32.const 0))
+
+ (func $queue_empty (export "queue-empty") (result i32)
+ (i32.eq (global.get $qfront) (global.get $qback))
+ )
+
+ (func $dequeue (export "dequeue") (result (ref null $ct))
+ (local $i i32)
+ (if (call $queue_empty)
+ (then (return (ref.null $ct)))
+ )
+ (local.set $i (global.get $qfront))
+ (global.set $qfront (i32.add (local.get $i) (i32.const 1)))
+ (table.get $task_queue (local.get $i))
+ )
+
+ (func $enqueue (export "enqueue") (param $k (ref null $ct))
+ ;; Check if queue is full
+ (if (i32.eq (global.get $qback) (table.size $task_queue))
+ (then
+ ;; Check if there is enough space in the front to compact
+ (if (i32.lt_u (global.get $qfront) (global.get $qdelta))
+ (then
+ ;; Space is below threshold, grow table instead
+ (drop (table.grow $task_queue (ref.null $ct) (global.get $qdelta)))
+ )
+ (else
+ ;; Enough space, move entries up to head of table
+ (global.set $qback (i32.sub (global.get $qback) (global.get $qfront)))
+ (table.copy $task_queue $task_queue
+ (i32.const 0) ;; dest = new front = 0
+ (global.get $qfront) ;; src = old front
+ (global.get $qback) ;; len = new back = old back - old front
+ )
+ (table.fill $task_queue ;; null out old entries to avoid leaks
+ (global.get $qback) ;; start = new back
+ (ref.null $ct) ;; init value
+ (global.get $qfront) ;; len = old front = old front - new front
+ )
+ (global.set $qfront (i32.const 0))
+ )
+ )
+ )
+ )
+ (table.set $task_queue (global.get $qback) (local.get $k))
+ (global.set $qback (i32.add (global.get $qback) (i32.const 1)))
+ )
+)
+(register "queue")
+
+(module $scheduler2
+ (rec
+ (type $ft (func (param (ref null $ct))))
+ ;; Continuation type of all tasks
+ (type $ct (cont $ft))
+ )
+
+ (func $task_enqueue (import "queue" "enqueue") (param (ref null $ct)))
+ (func $task_dequeue (import "queue" "dequeue") (result (ref null $ct)))
+ (func $task_queue-empty (import "queue" "queue-empty") (result i32))
+ (func $print_i32 (import "spectest" "print_i32") (param i32))
+
+ ;; Tag used to yield execution in one task and resume another one.
+ (tag $yield)
+
+ ;; Entry point, becomes parent of all tasks.
+ ;; Only acts as scheduler when tasks finish.
+ (func $entry (param $initial_task (ref $ft))
+ (local $next_task (ref null $ct))
+
+ ;; initialise $task_queue with initial task
+ (call $task_enqueue (cont.new $ct (local.get $initial_task)))
+
+ (loop $resume_next
+ ;; pick $next_task from queue, or return if no more tasks.
+ ;; Note that there is no suspend handler for $yield
+ (if (call $task_queue-empty)
+ (then (return))
+ (else (local.set $next_task (call $task_dequeue)))
+ )
+ (resume $ct (on $yield switch)
+ (ref.null $ct) (local.get $next_task))
+ ;; task finished execution: loop to pick next one
+ (br $resume_next)
+ )
+ )
+
+ ;; To simplify the example, all task_i functions execute this function. Each
+ ;; task has an $id, but this is only used for printing.
+ ;; $to_spawn represents another task that this function will add to the task
+ ;; queue, unless the reference is null.
+ ;; $c corresponds to the continuation parameter of the original $task_i
+ ;; functions.
+ ;; This means that it is the previous continuation we just switch-ed away
+ ;; from, or a null reference if the task was resumed from $entry.
+ (func $task_impl
+ (param $id i32)
+ (param $to_spawn (ref null $ft))
+ (param $c (ref null $ct))
+
+ (if (ref.is_null (local.get $c))
+ (then)
+ (else (call $task_enqueue (local.get $c))))
+
+ (if (ref.is_null (local.get $to_spawn))
+ (then)
+ (else (call $task_enqueue (cont.new $ct (local.get $to_spawn)))))
+
+ (call $print_i32 (local.get $id))
+ (call $yield_to_next)
+ (call $print_i32 (local.get $id))
+ )
+
+ ;; The actual $task_i functions simply call $task_impl, with i as the value
+ ;; for $id, and $task_(i+1) as the task to spawn, except for $task_3, which
+ ;; does not spawn another task.
+ ;;
+ ;; The observant reader may note that all $task_i functions may be seen as
+ ;; partial applications of $task_impl.
+ ;; Indeed, we could obtain *continuations* running each $task_i from a
+ ;; continuation running $task_impl and cont.bind.
+
+ (func $task_3 (type $ft)
+ (i32.const 3)
+ (ref.null $ft)
+ (local.get 0)
+ (call $task_impl)
+ )
+ (elem declare func $task_3)
+
+ (func $task_2 (type $ft)
+ (i32.const 2)
+ (ref.func $task_3)
+ (local.get 0)
+ (call $task_impl)
+ )
+ (elem declare func $task_2)
+
+ (func $task_1 (type $ft)
+ (i32.const 1)
+ (ref.func $task_2)
+ (local.get 0)
+ (call $task_impl)
+ )
+ (elem declare func $task_1)
+
+ (func $task_0 (type $ft)
+ (i32.const 0)
+ (ref.func $task_1)
+ (local.get 0)
+ (call $task_impl)
+ )
+ (elem declare func $task_0)
+
+
+ ;; Determines next task to switch to directly.
+ (func $yield_to_next
+ (local $next_task (ref null $ct))
+ (local $received_task (ref null $ct))
+
+ ;; determine $next_task
+ (local.set $next_task (call $task_dequeue))
+
+ (block $done
+ (br_if $done (ref.is_null (local.get $next_task)))
+ ;; Switch to $next_task.
+ ;; The switch instruction implicitly passes a reference to the currently
+ ;; executing continuation as an argument to $next_task.
+ (switch $ct $yield (local.get $next_task))
+ ;; If we get here, some other continuation switch-ed directly to us, or
+ ;; $entry resumed us.
+ ;; In the first case, we receive the continuation that switched to us here
+ ;; and we need to enqueue it in the task list.
+ ;; In the second case, the passed continuation reference will be null.
+ (local.set $received_task)
+ (if (ref.is_null (local.get $received_task))
+ (then)
+ (else (call $task_enqueue (local.get $received_task))))
+ )
+ ;; Just return if no other task in queue, making the $yield_to_next call
+ ;; a noop.
+ )
+
+ (func (export "main")
+ (call $entry (ref.func $task_0))
+ )
+)
+(invoke "main")