@Floweynt
Last active February 8, 2023 04:01

Language specifications

Tokens

Tokens are defined as follows:

match_id = [a-zA-Z_][a-zA-Z0-9_]*

TOK_IDENTIFIER := match_id
TOK_LANG_IDENTIFIER := @match_id
TOK_INTEGER := [0-9]+ (a suffixed form, [0-9]+match_id, is not currently supported)
TOK_CHAR := '.'
TOK_STRING := "([^"\\]|\\.)*"
TOK_OPERATOR :=
    "::"
    "++"
    "--"
    "."
    "+"
    "-"
    "!"
    "~"
    "*"
    "&"
    "/"
    "%"
    "<<"
    ">>"
    "<=>"
    "<"
    "<="
    ">"
    ">="
    "=="
    "!="
    "^"
    "|"
    "&&"
    "||"
    "="
    "+="
    "-="
    "*="
    "/="
    "%="
    "<<="
    ">>="
    "&="
    "^="
    "|="
    "->"

TOK_PAREN_OPEN := '('
TOK_PAREN_CLOSE := ')'
TOK_BRACKET_OPEN := '['
TOK_BRACKET_CLOSE := ']'
TOK_BRACE_OPEN := '{'
TOK_BRACE_CLOSE := '}'
TOK_SEMICOLON := ';'
TOK_COLON := ':'
TOK_COMMA := ','
TOK_BKSLASH := '\\'
TOK_ELLIPSIS := "..."

TOK_KW_AUTO := "auto"
TOK_KW_CONST := "const"
TOK_KW_VAR := "var"
TOK_KW_CONSTEVAL := "consteval"
TOK_KW_COMPTIME := "comptime"
TOK_KW_USING := "using"
TOK_KW_NAMESPACE := "namespace"
TOK_KW_YIELD := "yield"
TOK_KW_MATCH := "match"
TOK_KW_CASE := "case"
TOK_KW_IF := "if"
TOK_KW_ELSE := "else"
TOK_KW_WHILE := "while"
TOK_KW_FOR := "for"
TOK_KW_STRUCT := "struct"
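
The token definitions above can be turned into a small lexer. The following is a minimal Python sketch, not the author's implementation. The function name tokenize, the longest-match ordering of operators, the skipping of whitespace, and the keyword token names (formed as TOK_KW_ plus the upper-cased keyword) are all assumptions made for illustration. Comments and TOK_FLOATING are not handled because the token list does not define them.

```python
import re

# Operators copied from the TOK_OPERATOR list.
OPERATORS = [
    "::", "++", "--", ".", "+", "-", "!", "~", "*", "&", "/", "%",
    "<<", ">>", "<=>", "<", "<=", ">", ">=", "==", "!=", "^", "|",
    "&&", "||", "=", "+=", "-=", "*=", "/=", "%=", "<<=", ">>=",
    "&=", "^=", "|=", "->",
]
KEYWORDS = {
    "auto", "const", "var", "consteval", "comptime", "using", "namespace",
    "yield", "match", "case", "if", "else", "while", "for", "struct",
}
PUNCT = {
    "(": "TOK_PAREN_OPEN", ")": "TOK_PAREN_CLOSE",
    "[": "TOK_BRACKET_OPEN", "]": "TOK_BRACKET_CLOSE",
    "{": "TOK_BRACE_OPEN", "}": "TOK_BRACE_CLOSE",
    ";": "TOK_SEMICOLON", ":": "TOK_COLON", ",": "TOK_COMMA",
    "\\": "TOK_BKSLASH",
}

# Assumption: longer operators are tried first, so "<<=" wins over "<<".
OPERATOR_PATTERN = "|".join(
    re.escape(op) for op in sorted(OPERATORS, key=len, reverse=True)
)

# Order matters: "..." must be tried before the "." operator,
# and "::" (an operator) before ":" (TOK_COLON).
TOKEN_SPEC = [
    ("TOK_ELLIPSIS", r"\.\.\."),
    ("TOK_LANG_IDENTIFIER", r"@[a-zA-Z_][a-zA-Z0-9_]*"),
    ("TOK_IDENTIFIER", r"[a-zA-Z_][a-zA-Z0-9_]*"),
    ("TOK_INTEGER", r"[0-9]+"),
    ("TOK_CHAR", r"'.'"),
    ("TOK_STRING", r'"(?:[^"\\]|\\.)*"'),
    ("TOK_OPERATOR", OPERATOR_PATTERN),
    ("PUNCT", r"[()\[\]{};:,\\]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (token_kind, text) pairs for the given source string."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = m.lastgroup, m.group()
        pos = m.end()
        if kind == "SKIP":
            continue  # whitespace separates tokens but produces none
        if kind == "TOK_IDENTIFIER" and text in KEYWORDS:
            kind = "TOK_KW_" + text.upper()
        elif kind == "PUNCT":
            kind = PUNCT[text]
        tokens.append((kind, text))
    return tokens
```

For instance, tokenize("var i: auto = 1;") produces a TOK_KW_VAR token, an identifier, a colon, TOK_KW_AUTO, the "=" operator, an integer, and a semicolon.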

Syntax

The body of any compiled file should consist of a (possibly empty) list of top-level-stmt.
The syntax definitions are as follows:

top-level-stmt ::= variable-def-expr | using-expr | namespace-stmt
binary-op-expr ::= [ unary-expr (operator) ]
block-expr ::= [ expr ';' ] // (note that the semicolon is optional on some expressions, such as if) 
char-literal ::= TOK_CHAR
expr ::= variable-def-expr | using-expr | if-expr | binary-op-expr
identifier-expr ::= TOK_IDENTIFIER
if-expr ::= if ( expr ) expr ';' [ else if ( expr ) expr ] <else expr>
lambda-expr ::= TOK_BKSLASH ( variable-def-expr ) <TOK_OPERATOR(->) type-expr > block-expr
lang-id-expr ::= TOK_LANG_IDENTIFIER
namespace-stmt ::= TOK_KW_NAMESPACE TOK_IDENTIFIER [ '::' TOK_IDENTIFIER ] block-expr
paren-expr ::= ( expr )
simple-primary-expr ::= identifier-expr | lang-id-expr | TOK_INTEGER | TOK_FLOATING 
    | TOK_STRING | TOK_CHAR | paren-expr | lambda-expr
primary-expr ::= simple-primary-expr [ expr ] | ( expr )
struct-literal ::= TOK_KW_STRUCT { [variable-def-expr] }
type-expr ::= TOK_KW_AUTO | expr
unary-expr ::= unary-operator primary-expr
using-expr ::= TOK_KW_USING TOK_IDENTIFIER = expr
variable-def-expr ::= [ modifiers: consteval, comptime, const, var ] TOK_IDENTIFIER <TOK_ELLIPSIS> : type-expr < = expr >

Examples

Declare a variable:

var i: auto = 1;
var i: std::int = 2;
const i: auto = 1.2;

Declare a type:

comptime const Int = std::int;
using Float = std::Float;

Declare a function:

// with type deduction, forward declarations need to specify the type
auto fun(var i: int) -> void;
auto fun(var i: int) -> auto { ... } // OK, explicit auto, forced to be void by declaration
auto fun(var i: int) { ... } // OK, implicit auto, forced to be void by declaration

auto bar(var i: int) -> auto {...} // OK, explicit auto, type deduced
auto bar(var i: int) {...} // OK, implicit auto, type deduced
auto bar(var i: int) -> char {...} // OK, explicit type

auto baz(var i: int); // BAD, forward decl needs an explicit, non-deduced type
auto baz(var i: int) -> auto; // BAD, forward decl needs an explicit, non-deduced type

Semantics

Entry point

An executable should have an entry point function declared as auto main(<pointer semantics not decided yet>) -> void

Values

Most constructs have a value.
For example, blocks are expressions:

{
    var i: auto = 9;
    var j: auto = 12;
    yield i + j * j;
} // this has a value of type "int"

{
    {
        yield 4;
    } // has type int
} // has type void

However:

{
    var i: auto = 0;
    i++;
} // has type "void"

This block has type void because it contains no yield expression.

A block can contain only one yield expression, and any code after it is not executed.
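
The block rules above can be sketched as a small type function. This is a hypothetical Python model, not part of the specification: a block is represented as a list of tuples, where ("yield", type) is a yield expression, ("block", [...]) is a nested block, and ("stmt",) is any other statement. The name block_type and this representation are assumptions made for illustration.

```python
def block_type(statements):
    """Return the type of a block under the yield rules described above."""
    # Only yields written directly in this block count; a yield inside a
    # nested ("block", [...]) entry gives that inner block its value instead.
    yields = [entry[1] for entry in statements if entry[0] == "yield"]
    if len(yields) > 1:
        raise TypeError("a block may contain at most one yield expression")
    # Without a yield, the block has type void.
    return yields[0] if yields else "void"
```

With this model, a block that only wraps an inner block containing yield 4; comes out as void, matching the nested example above.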

Most control-flow constructs have a type, provided they meet specific requirements:

  • Every branch either does not return or has some type.
  • If the branches do not all have the same type, the expression has type void.

For example:

... = if(i == 0) 1;
    else if(i == 1) 13;
    else if(i == 2)
    {
        var j: auto = some_function(i);
        yield j + i;
    }
    else i + 2;

// The following will not work
some_int = if(i == 0) i++; // no else statement
some_int = if(i == 1) { i++; } else { i--; } // blocks do not automatically have a non-void value
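
The typing rule for conditionals can be sketched as follows. This is a minimal Python model under stated assumptions, not the compiler's actual logic: types are plain strings, the marker "noreturn" stands for a branch that does not return, an if without an else is treated as void, and an if whose branches all fail to return is treated as "noreturn". The name if_expr_type is hypothetical.

```python
NORETURN = "noreturn"  # assumed marker for a branch that never returns


def if_expr_type(branch_types, has_else):
    """Return the type of an if-expression given the type of each branch."""
    if not has_else:
        # Assumption: without an else there is a path that produces no value.
        return "void"
    # Branches that do not return place no constraint on the result type.
    returning = {t for t in branch_types if t != NORETURN}
    if not returning:
        return NORETURN  # assumption: every branch diverges
    if len(returning) == 1:
        return returning.pop()
    # Branches disagree, so the whole expression is void.
    return "void"
```

Under this model the first example above is int (four int branches with an else), while both failing examples come out as void.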

Loops have similar semantics, except that they use break val instead of yield (yield does not make sense here, because giving the body of a for loop a value cannot reasonably be implemented without language-level dynamic arrays):

for(var i: auto = 0; i < 10; i++)
{
    if(i % 3) break i;
}

else -1;

All break expressions within the body of the loop, but outside of any nested loop, must have the same type, and the else statement must have that type as well. A break without a value (break;) is treated as having type void. For loops have stricter behaviour than conditionals because this feature cannot be used accidentally, whereas a conditional branch may happen to have a value while existing only to cause a side effect.
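
The stricter loop rule can be sketched in the same style. This is a hypothetical Python model, not the compiler's implementation: break_types lists the types of the breaks that belong to this loop (breaks in nested loops are excluded by the caller), and a missing else is assumed to count as void. The name loop_expr_type is made up for illustration.

```python
def loop_expr_type(break_types, else_type="void"):
    """Return the type of a loop expression, rejecting mismatched breaks."""
    # Unlike conditionals, a mismatch is an error rather than a fallback to void.
    types = set(break_types) | {else_type}
    if len(types) != 1:
        raise TypeError(f"break and else types disagree: {sorted(types)}")
    return else_type
```

For the loop above, loop_expr_type(["int"], "int") gives int, since break i and else -1 are both int.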
