Created
May 1, 2011 02:58
struct e(node);
struct e(ast);
typedef struct e(node) * e(node);
typedef struct e(ast) * e(ast);
/* FIXME: I am slightly uncomfortable imposing a hard limit on the maximum size of “documents” that this Paws
 *        interpreter can handle. I’d love to use some sort of arbitrary-integer representation for this, but
 *        at the moment, that’s extra work I cannot afford. */
typedef unsigned long long int e(ast_index);
/* There are three basic types of `node`s in a cPaws `AST`:
 *
 * - `PHRASE` nodes, the most basic, are generally a single ‘word’ in the code: the `AST` representing
 *   `foo bar baz` contains three `PHRASE` nodes, `“foo”`, `“bar”`, and `“baz”`. They need not, however, be a
 *   single ‘word’; `PHRASE` nodes may contain multiple ‘words’ when surrounded by double-quotes
 *   (e.g. `foo “bar baz”` contains only two `PHRASE` nodes: `“foo”` and `“bar baz”`.)
 *
 * - `EXPRESSION` nodes consist of a series of juxtaposed sub-nodes, which may be any of the three `node_type`s.
 *   Their sub-nodes are juxtaposed by dint of separation by whitespace (which may not include newlines, if the
 *   `EXPRESSION` node in question is the direct descendant of a new `SCOPE`, as newlines imply new
 *   sub-expressions.)
 *    - A `PHRASE` node inside an `EXPRESSION` node is the basic building block of the language; it
 *      implies a juxtaposition with the previous node (or, if there is no previous node in *this* `EXPRESSION`,
 *      then instead implies a juxtaposition with the closest parent `SCOPE`.)
 *    - Another `EXPRESSION` node as a child of this `EXPRESSION` comprises a sub-expression, which implies a
 *      juxtaposition of the *result* of said sub-expression with the previous node (or alternatively, the
 *      closest parent `SCOPE`; see above.) Sub-expressions are denoted by opening and closing parentheses within
 *      the parent `EXPRESSION` (e.g. `foo (bar baz)` is an `EXPRESSION` with two nodes: the `PHRASE` `foo` and
 *      the (sub-)`EXPRESSION` `bar baz`, which itself contains two `PHRASE` nodes, `“bar”` and `“baz”`.)
 *    - A new `SCOPE` node as a child of the `EXPRESSION` comprises a new sub-scope.
 *
 * - `SCOPE` nodes indicate sub-sections of a program within which the juxtapositions of the first sub-node of
 *   each `EXPRESSION` and sub-expression within that `SCOPE` are resolved against that `SCOPE`.
 *
 *   In libspace terms, this implies that within a given `execution`, instantiated for a given `SCOPE`, all
 *   `EXPRESSION`s evaluated will have their first node effectively juxtaposed against that `execution`’s
 *   `locals`-`fork`.
 */
enum e(node_type) { e(PHRASE) = 0, e(EXPRESSION), e(SCOPE) };
/* `node` is the core of our `AST` implementation. A given document, read into `Paws.c`, is represented by an
 * impure singly-linked-list of these “nodes.” Each node includes a pointer to the `next` linear `node` in the
 * parent document.
 *
 * Two `node_type`s (`EXPRESSION` and `SCOPE`) are capable of having children, and such `node` instances also
 * encapsulate a pointer to the first such child. The `last` child in an enclosing node is boolean-flagged as
 * such, with its `next` pointer referencing said enclosing `node` instead of the laterally subsequent node.
 *
 * The last node in a `SCOPE` sourcing from a foreign Iunit may be missing its `next` pointer if the subsequent
 * node (parent node) from the original document was irrelevant to the portions of the stuffspace shared with
 * this interpreter.
 *
 * Every node includes unsigned, numeric `ast_index`es for the first and last character *of that node*. The
 * ranges of adjacent nodes are not necessarily contiguous, and together need not cover the entire document:
 * meaningless whitespace is usually not included in the `ast_index`-range of any terminal `node`. The range
 * between the `start` and `end` indices for the `node_type`s with children will fully encompass the ranges
 * for each of their child nodes.
 *
 * `PHRASE`s, as terminal nodes, provide a pointer to their static libspace representation in the stead of a
 * pointer to `child`ren. Currently, this means only a pointer to a preallocated libspace `label` for that
 * `PHRASE`. This pointer is typecast as a `thing` to be future-proof, but need not currently be annotated, as
 * the `thing` annotation will be ignored, and the memory the pointer refers to treated as a `label` instance.
 */
struct e(node) {
  e(node)             next;
  enum e(node_type)   isa;
  bool                last;
  e(ast_index)        start;
  e(ast_index)        end;
  union /* e(node_content) */ {
    e(node)  /* content.*/child;
    e(thing) /* content.*/content; } // »
                        content; };
struct e(ast) { void* nothing; }; // Reserved
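
The linking rules described in the comments above can be made concrete with a small, standalone sketch. It is not part of `Paws.c`: it assumes `e(...)` is only a namespacing macro and so uses plain names, the helpers `node_init`, `node_adopt`, `count_phrases` and `example_phrase_count` are hypothetical, string literals stand in for libspace `label`s, and it assumes a sub-expression's index range includes its parentheses. It builds the tree for `foo (bar baz)` and then visits every node using only the `child`, `next` and `last` fields, with no recursion or explicit stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Plain-named stand-ins for the e(...) types above (assumption: e() only namespaces names). */
typedef unsigned long long ast_index;
enum node_type { PHRASE = 0, EXPRESSION, SCOPE };

struct node {
  struct node    *next;    /* next sibling, or the enclosing node when `last` is set */
  enum node_type  isa;
  bool            last;
  ast_index       start, end;
  union {
    struct node *child;    /* EXPRESSION / SCOPE: first child */
    const void  *content;  /* PHRASE: stand-in for the libspace `label` pointer */
  } content;
};

/* Hypothetical helper: initialise a childless, unlinked node. */
static void node_init(struct node *n, enum node_type isa, ast_index start, ast_index end) {
  n->next = NULL;  n->isa = isa;  n->last = false;
  n->start = start;  n->end = end;
  n->content.child = NULL;
}

/* Hypothetical helper: link `count` children under `parent`, in order. The final
 * child is flagged `last`, and its `next` points back up at `parent`. */
static void node_adopt(struct node *parent, struct node **children, size_t count) {
  if (count == 0) return;
  parent->content.child = children[0];
  for (size_t i = 0; i + 1 < count; i++) {
    children[i]->next = children[i + 1];
    children[i]->last = false;
  }
  children[count - 1]->next = parent;
  children[count - 1]->last = true;
}

/* Count the PHRASE nodes at or below `root`, following only child/next/last. */
static size_t count_phrases(const struct node *root) {
  if (root->isa == PHRASE) return 1;
  size_t phrases = 0;
  const struct node *n = root->content.child;
  while (n != NULL) {
    if (n->isa == PHRASE) phrases++;
    else if (n->content.child != NULL) { n = n->content.child; continue; }
    /* Finished with `n`: climb through every enclosing node that is itself a last child. */
    while (n != NULL && n != root && n->last) n = n->next;
    if (n == NULL || n == root) break;  /* a NULL `next` covers the foreign-`SCOPE` case */
    n = n->next;
  }
  return phrases;
}

/* Build `foo (bar baz)`: characters 0..12, sub-expression at 4..12 including its parentheses. */
static size_t example_phrase_count(void) {
  struct node root, foo, sub, bar, baz;
  node_init(&root, EXPRESSION, 0, 12);
  node_init(&foo,  PHRASE,     0,  2);  foo.content.content = "foo";
  node_init(&sub,  EXPRESSION, 4, 12);
  node_init(&bar,  PHRASE,     5,  7);  bar.content.content = "bar";
  node_init(&baz,  PHRASE,     9, 11);  baz.content.content = "baz";

  struct node *inner[] = { &bar, &baz };
  struct node *outer[] = { &foo, &sub };
  node_adopt(&sub,  inner, 2);
  node_adopt(&root, outer, 2);

  assert(baz.last && baz.next == &sub);                  /* last child points at its parent */
  assert(sub.start <= bar.start && baz.end <= sub.end);  /* parent range encloses children */
  return count_phrases(&root);
}
```

Because the last child's `next` leads back to its enclosing node, the walk can climb out of a finished sub-expression without keeping a stack; that appears to be the practical payoff of the `last` flag in this layout.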