These are mostly syntactic sugar for deeper code generation and integration with the user's AST.
They are also tied to a top-down concept that helps guide parsing by supporting context.
E.g: How do you validate the type of a function parameter before the end of the body?
StructuredRules grammar is a superset of BNF.
StructuredRules cater to some of the most common paths and aim to make them easier to execute.
For the purposes of this document:
- Specification language: BNF-based;
- Actions: Yacc/Bison style (
{ ... }
); - Language: Golang
- Variables: Yacc/Bison style:
$$
is the current context/return value,$0
is the first captured attribute, ...
This is a core concept that is re-used in several contexts that ultimately make it intuitive.
ProductionName( [ non-terminal or production ] )
This has several purposes that will be expanded above, but is one of the fundamentals of StructuredRules.
We use them in production rules all the time, we almost never want to see them, so we end up counting them.
A structured rule expression discards unnamed matches automatically. So that BNF rules continue to behave as they would without StructuredRules, this requires the production uses the parentheses syntax, with or without an anchored match:
IfStmt() : "if" IfExpr Statement "else" Statement ";" ;
StructName
: "struct" identifier { $$ = $1 };
StructName("struct") : identifier ; // no action required
Inheritance
: ":" identifier { $$ = $1 }
| empty
;
Structure
: StructName [ Inheritance ] StructBody
;
?
suffix.
You can make any rule optional on a case-by-case basis by suffixing it with a question-mark:
Structure : StructName Inheritance? StructBody { $$ = $1 /* nil if inheritance wasn't used */ }
Or you can make a rule always-optional by suffixing the declaration with a question-mark:
Inheritance?(":") : identifier;
Structure : StructName Inheritance // invalid: must use 'Inheritance?'
Structure : StructName Inheritance? StructBody;
Note The question-mark should be syntactic and not part of the name so that error messages do not introduce the '?' for end-users, but when processing the grammar "Inheritance" should be rejected with a note that "Inheritance is optional, use Inheritance? instead".
StructDecl
: "struct" { $$ = new_struct(); }
StructName ...
{ end_struct($$); }
;
gocc note: gocc doesn't appear to provide for this
Since we're introducing programmer-familiar concepts, lets use the brace-enclosure concept.
StructDecl("struct") {
...
}
This effectively behaves as though there were two rules injecting actions at the start and end of the rule.
StructDecl
: "struct" { $$ = push_context(StructDecl_Open($$)) } /* pass current context to function, return becomes context for this scope */
...
{ descoped = pop_context(); return descoped }
;
StructuredRuleDecl
: StructuredRuleDecl StructuredRuleModifiers "{" StructuredRuleBody "}" [ ";" ];
StructuredRule
: StructuredRuleIdentity "(" StructuredRuleAnchor ")" /* T0 = rule's name, T2 = rule's keyword */
| StructuredRuleIdentity "(" ")"
;
StructuredRuleName
: ProductionName "?" { $$ = &StructuredRule{ Name: $0, Optional: true } }
| ProductionName { $$ = &StructuredRule{ Name: $0, Optional: false } }
;
StructuredRuleAnchor
: Keyword
| Between
| empty
;
Keyword : StringLiteral ;
Between
: StringLiteral "..." StringLiteral // 0+ case
| StringLiteral "(" Repetitions ")" StringLiteral
;
Repetitions
: integer "," integer
| integer "," "N"
| integer
;
StructuredRuleBody
: Expressions
;
Expressions
: Expressions Expression
| Expression
;
Expression:
// discard non-terminals
: NonTerminal
// Call a corresponding function on the context with the value of rule's match
| "." StructuredRule
// Capture everything else
| ProductionName
| StructuredRule
;
// Regular EBNF here
Grammar : Definitions;
Definitions : Struct | Function;
Struct("struct") {
.StructName(identifier) [ ":" .ParentName(identifier) ] StructBody
/*
Because this is a brace-enclosed rule, we call Struct_Open and Struct_Close before and after the rule.
. StructName( identifier )
// Expands to: StructName: identifier { $context.StructName($0) } }
[ ":" . ParentName( identifier )]
// Expands to: ParentName? : ":" identifier { $context.ParentName($1) } }
// Ordinary rule
StructBody { $context.StructBody($$) }
*/
}
// Declares a rule called "Function" anchored to the keyword 'fn'
Function("fn") {
. FunctionName( identifier )
. ParameterList
. FunctionBody
}
StructBody("{" (1, N) "}") {
/* But it also produces code equivalent to:
newContext := context.StructBody_Open()
err := ...match struct body...(newContext)
context.StructBody_Close(newContext, err)
*/
// The '$n: <name>' syntax adds a type-switch for you rather than you using
// a naked cast or having to implement your own switch statements.
// Also, note that in theory StructMember_Close could do this for you,
. StructMember { context.AddMember($$: ast.StructMember) }
}
StructMember() {
. MemberName( identifier )
. TypeName
. ArrayDim ? // allow space between rule and ?
fieldTerminator // captured and named
}
fieldTerminator : ";" | "\n";
ParameterList("(" (0, N) ")") {
ParameterName( identifier ) TypeName { context.AddParameter($0, $1); }
}
TypeName
: "int"
| "bool"
| "string"
;
FunctionBody("{" (0, N) "}") {
}
- Allow space between ProductionName and ??
-
- In a longer list, "...?" is less visible than "[ ... ]", "... ?" might be easier to sight;
- Missing an explanation of why you still need actions, and what happens to matches that you passed to functions:
// StructBody() here is silently "lost" because there's no action and you didn't pass it to a method
Struct() { .StructDecl() .Inheritance?() StructBody() }