@etsangsplk
Forked from ELLIOTTCABLE/ast.c
Created May 16, 2020 19:41
#include <stdbool.h> /* for `bool`, used by `struct e(node)` below */

struct e(node);
struct e(ast);
typedef struct e(node) * e(node);
typedef struct e(ast) * e(ast);
/* FIXME: I am slightly uncomfortable imposing a hard limit on the maximum size of “documents” that this Paws
* interpreter can handle. I’d love to use some sort of arbitrary-integer representation for this, but
* at the moment, that’s extra work I cannot afford. */
typedef unsigned long long int e(ast_index);
/* There are three basic types of `node`s in a cPaws `AST`:
*
* - `PHRASE` nodes, the most basic, are generally a single ‘word’ in the code: the `AST` representing
* `foo bar baz` contains three `PHRASE` nodes, `“foo”`, `“bar”`, and `“baz”`. They need not, however, be a
* single ‘word’; `PHRASE` nodes may contain multiple ‘words’ when surrounded by double-quotes
* (e.g. `foo “bar baz”` contains only two `PHRASE` nodes: `“foo”` and `“bar baz”`).
*
* - `EXPRESSION` nodes consist of a series of juxtaposed sub-nodes, which may be any of the three `node_type`s.
* Their sub-nodes are juxtaposed by dint of separation by whitespace (which may not include newlines, if the
* `EXPRESSION` node in question is the direct descendant of a new `SCOPE`, as newlines imply new
* sub-expressions.)
*   - A `PHRASE` node inside an `EXPRESSION` node is obviously the basic building block of the language; it
*     implies a juxtaposition with the previous node (or, if there is no previous node in *this* `EXPRESSION`,
*     then instead implies a juxtaposition with the closest parent `SCOPE`).
*   - Another `EXPRESSION` node as a child of this `EXPRESSION` comprises a sub-expression, which implies a
*     juxtaposition of the *result* of said sub-expression with the previous node (or alternatively, the
*     closest parent `SCOPE`; see above.) Sub-expressions are denoted by opening and closing parentheses within
*     the parent `EXPRESSION` (e.g. `foo (bar baz)` is an `EXPRESSION` with two nodes: the `PHRASE` `foo` and
*     the (sub-)`EXPRESSION` `bar baz`, which itself contains two `PHRASE` nodes, `“bar”` and `“baz”`.)
*   - A new `SCOPE` node as a child of the `EXPRESSION` comprises a new sub-scope.
*
* - `SCOPE` nodes indicate sub-sections of a program within which the juxtapositions of the first sub-node of
* each `EXPRESSION` and sub-expression within that `SCOPE` are resolved against that `SCOPE`.
*
* In libspace terms, this implies that within a given `execution`, instantiated for a given `SCOPE`, all
* `EXPRESSION`s evaluated will have their first node effectively juxtaposed against that `execution`’s
* `locals`-`fork`.
*/
enum e(node_type) { e(PHRASE) = 0, e(EXPRESSION), e(SCOPE) };
/* `node` is the core of our `AST` implementation. A given document, read into `Paws.c`, is represented by an
* impure singly-linked-list of these “nodes.” Each node includes a pointer to the `next` linear `node` in the
* parent document.
*
* Two `node_type`s (`EXPRESSION` and `SCOPE`) are capable of having children, and such `node` instances also
* encapsulate a pointer to the first such child. The `last` child in an enclosing node is boolean-flagged as
* such, with its `next` pointer referencing said enclosing `node` instead of the laterally subsequent node.
*
* The last node in a `SCOPE` sourcing from a foreign Iunit may be missing its `next` pointer if the subsequent
* node (parent node) from the original document was irrelevant to the portions of the stuffspace shared with
* this interpreter.
*
* Every node includes unsigned, numeric `ast_index`es for the first and last character *of that node*. These
* ranges are not necessarily contiguous, and together they need not cover the entire document. Meaningless
* whitespace is usually not included in the `ast_index`-range of any terminal `node`. The range between the
* `start` and `end` indices for the `node_type`s with children will fully encompass the ranges of each of
* their child nodes.
*
* `PHRASE`s, as terminal nodes, provide a pointer to their static libspace representation in place of a
* pointer to `child`ren. Currently, this means only a pointer to a preallocated libspace `label` for that
* `PHRASE`. This pointer is typecast as a `thing` to be future-proof, but need not currently be annotated, as
* the `thing` annotation will be ignored, and the memory the pointer references treated as a `label` instance.
*/
struct e(node) {
  e(node)           next;
  enum e(node_type) isa;
  bool              last;
  e(ast_index)      start;
  e(ast_index)      end;
  union /* e(node_content) */ {
    e(node)  /* content.*/child;
    e(thing) /* content.*/content;
  } content;
};
struct e(ast) { void* nothing; }; // Reserved
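
The linking rules described in the comments above (siblings chained through `next`, with the `last` child pointing back at its enclosing node) can be sketched for the document `foo (bar baz)`. This is only an illustration under stated assumptions, not the gist author's implementation: every `demo_` name is hypothetical, a plain `const char *` stands in for the libspace `label` that a real `PHRASE` points to, the sub-expression's index range is assumed to include its parentheses, and the root node is assumed to have a null `next`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the e(...)-wrapped names in the gist. */
typedef unsigned long long int demo_index;
enum demo_type { DEMO_PHRASE = 0, DEMO_EXPRESSION, DEMO_SCOPE };

typedef struct demo_node *demo_node;
struct demo_node {
  demo_node next;      /* next sibling, or the enclosing node when `last` is set */
  enum demo_type isa;
  bool last;
  demo_index start;
  demo_index end;
  union {
    demo_node child;   /* EXPRESSION and SCOPE: first child */
    const char *label; /* PHRASE: assumed stand-in for the libspace `label` */
  } content;
};

/* Fill in every field of a node except its content. */
void demo_init(demo_node n, enum demo_type isa, demo_index start,
               demo_index end, demo_node next, bool last) {
  n->next = next;
  n->isa = isa;
  n->last = last;
  n->start = start;
  n->end = end;
  n->content.child = NULL;
}

/* Append `text` to the NUL-terminated string in `out`, truncating at `size`. */
void demo_append(char *out, size_t size, const char *text) {
  size_t used = strlen(out);
  if (used < size) snprintf(out + used, size - used, "%s", text);
}

/* Render a node as text; any node with children is wrapped in parentheses
 * (a choice made for this demo only). */
void demo_render(demo_node node, char *out, size_t size) {
  assert(node != NULL && out != NULL);
  if (node->isa == DEMO_PHRASE) {
    demo_append(out, size, node->content.label);
    return;
  }
  demo_append(out, size, "(");
  for (demo_node c = node->content.child; c != NULL; c = c->next) {
    if (c != node->content.child) demo_append(out, size, " ");
    demo_render(c, out, size);
    if (c->last) break; /* c->next is the enclosing node, not a sibling */
  }
  demo_append(out, size, ")");
}

/* Find the enclosing node: follow siblings until the one flagged `last`,
 * whose `next` is the parent. Returns NULL when no parent is reachable. */
demo_node demo_parent(demo_node node) {
  for (demo_node c = node; c != NULL; c = c->next)
    if (c->last) return c->next;
  return NULL;
}

/* Build the tree for `foo (bar baz)` and render it into `out`. */
const char *demo_example(char *out, size_t size) {
  struct demo_node root, foo, sub, bar, baz;
  assert(out != NULL && size > 0);
  demo_init(&root, DEMO_EXPRESSION, 0, 12, NULL, false);
  demo_init(&foo, DEMO_PHRASE, 0, 2, &sub, false);
  demo_init(&sub, DEMO_EXPRESSION, 4, 12, &root, true); /* last child of root */
  demo_init(&bar, DEMO_PHRASE, 5, 7, &baz, false);
  demo_init(&baz, DEMO_PHRASE, 9, 11, &sub, true);      /* last child of sub */
  root.content.child = &foo;
  sub.content.child = &bar;
  foo.content.label = "foo";
  bar.content.label = "bar";
  baz.content.label = "baz";
  assert(demo_parent(&bar) == &sub && demo_parent(&foo) == &root);
  out[0] = '\0';
  demo_render(&root, out, size);
  return out;
}
```

Calling `demo_example(buf, sizeof buf)` with a reasonably sized `char` buffer leaves the string `(foo (bar baz))` in `buf`. Because the child loop in `demo_render` also stops on a null `next`, it tolerates the case noted above where the last node of a foreign `SCOPE` is missing its `next` pointer.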