@gmarik
Forked from zaach/0_new.md
Created February 8, 2021 19:11
New Jison 0.3 features

Jison 0.3 improves both parser and lexer grammars. The new features are demonstrated in the BlooP/FlooP example below.

For lexers:

  • Patterns may use unquoted characters instead of strings
  • Two new options, enabled with %options flex case-insensitive:
      • flex: the rule with the longest match is used, and no word boundary patterns are added
      • case-insensitive: all patterns are case insensitive
  • User code section is included in the generated module
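
To illustrate what the flex option changes, here is a rough sketch in plain JavaScript that compares a first-match strategy with a longest-match strategy on two rules resembling the keyword and {ID} rules in the lexer below. This is a toy model, not Jison's generated lexer: the `rules` table, `firstMatch`, and `longestMatch` are hypothetical names, and letting earlier rules win ties is an assumption of the sketch.

```javascript
// Illustrative only: a toy rule matcher, not Jison's generated lexer.
// Each rule is [pattern, token]; patterns are anchored at the start of input.
const rules = [
  [/^IF/i, 'IF'],
  [/^[A-Z-]+\??/i, 'IDENT'],
];

// First-match strategy: return the first rule whose pattern matches.
function firstMatch(input) {
  for (const [pattern, token] of rules) {
    const m = pattern.exec(input);
    if (m) return { token: token, text: m[0] };
  }
  return null;
}

// Longest-match strategy (what the flex option describes): try every rule
// and keep the longest match. Assumption: earlier rules win ties.
function longestMatch(input) {
  let best = null;
  for (const [pattern, token] of rules) {
    const m = pattern.exec(input);
    if (m && (best === null || m[0].length > best.text.length)) {
      best = { token: token, text: m[0] };
    }
  }
  return best;
}
```

With these rules, `firstMatch('IFFY')` stops at the `IF` rule and consumes only `IF`, while `longestMatch('IFFY')` picks `IDENT` for the whole word. The sketch leaves out the word boundary patterns mentioned above, which is why the first-match case misfires here.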

For parsers:

  • Arrow syntax for semantic actions
  • EBNF syntax (enabled using the %ebnf declaration)
      • Operators include repetition (*), non-empty repetition (+), grouping (()), alternation within groups (|), and option (?)
  • User code section and code blocks are included in the generated module
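
To get a feel for the values a semantic action works with under %ebnf, the sketch below models in plain JavaScript how a repetition such as `(identifier ',')*` could be accumulated into an array and then combined with an optional trailing item, much as the `$7.concat([$8])` action in the grammar below does. The helper names `reduceRepetition` and `collectList` are hypothetical. The sketch also assumes that an absent optional item shows up as `undefined`, which the source does not state, and unlike the grammar's action it drops that missing item.

```javascript
// Hypothetical helpers: a simplified model, not Jison's generated code.

// Model `item*` as a list that starts empty and grows by one value per item.
function reduceRepetition(items) {
  const list = [];            // zero items: the value is an empty array
  for (const item of items) {
    list.push(item);          // each further item is appended in order
  }
  return list;
}

// Model `(item ',')* item?`: the leading items plus an optional last one.
// Assumption: a missing optional item arrives as undefined, so it is dropped.
function collectList(leading, last) {
  const list = reduceRepetition(leading);
  return last === undefined ? list : list.concat([last]);
}

console.log(collectList(['a', 'b'], 'c')); // [ 'a', 'b', 'c' ]
console.log(collectList([], undefined));   // []
```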

Also, Robert Plummer has created a PHP port of Jison's parser.

See the grammar below for more examples.

ID [A-Z-]+"?"?
NUM ([1-9][0-9]+|[0-9])
%options flex case-insensitive
%%
\s+ /* ignore */
{NUM} return 'NUMBER'
DEFINE return 'DEFINE'
PROCEDURE return 'PROCEDURE'
BLOCK return 'BLOCK'
BEGIN return 'BEGIN'
OUTPUT return 'OUTPUT'
CELL return 'CELL'
IF return 'IF'
THEN return 'THEN'
LOOP return 'LOOP'
"MU-LOOP" return yy.bloop ? 'INVALID' : 'MU_LOOP'
AT return 'AT'
MOST return 'MOST'
TIMES return 'TIMES'
ABORT return 'ABORT'
END return 'END'
QUIT return 'QUIT'
AND return 'AND'
YES return 'YES'
NO return 'NO'
{ID} return 'IDENT'
"." return '.'
"''" return 'QUOTE'
"[" return '['
"]" return ']'
"(" return '('
")" return ')'
"{" return '{'
"}" return '}'
":" return ':'
";" return ';'
"," return ','
"+" return '+'
"*" return '*'
"×" return '*' //non-ascii
"<=" return '<='
"⇐" return '<=' //non-ascii
"<" return '<'
">" return '>'
"=" return '='
<<EOF>> return 'EOF'
. return 'INVALID'
/* BlooP and FlooP parser - http://en.wikipedia.org/wiki/BlooP_and_FlooP */
/* Code blocks are inserted at the top of the generated module. */
%{
var ast = require('./ast'),
Program = ast.Program,
ProcedureStmt = ast.ProcedureStmt,
BlockStmt = ast.BlockStmt,
LoopStmt = ast.LoopStmt,
MuLoopStmt = ast.MuLoopStmt,
NumberLit = ast.NumberLit,
BooleanLit = ast.BooleanLit,
OutputExpr = ast.OutputExpr,
Identifier = ast.Identifier,
CellExpr = ast.CellExpr,
PlusExpr = ast.PlusExpr,
TimesExpr = ast.TimesExpr,
ApplyExpr = ast.ApplyExpr,
LessCond = ast.LessCond,
GreaterCond = ast.GreaterCond,
EqualCond = ast.EqualCond,
CompoundCond = ast.CompoundCond,
AssignStmt = ast.AssignStmt,
IfThenStmt = ast.IfThenStmt,
QuitStmt = ast.QuitStmt,
AbortStmt = ast.AbortStmt;
%}
%nonassoc '+'
%nonassoc '*'
/* enable EBNF grammar syntax */
%ebnf
%%
program
: procedure* EOF
{ return Program({},$1) }
;
procedure
: DEFINE PROCEDURE QUOTE IDENT QUOTE '[' (identifier ',')* identifier? ']' ':' block '.'
-> ProcedureStmt({name:$4},[$7.concat([$8]),$11])
;
block
: BLOCK NUMBER ':' BEGIN (statement ';')+ BLOCK NUMBER ':' END
-> BlockStmt({id: $2},$5)
;
statement
: cell '<=' expression -> AssignStmt({}, [$1, $3])
| output '<=' expression -> AssignStmt({}, [$1, $3])
| LOOP (AT MOST)? expression TIMES ':' block -> LoopStmt({}, [$3, $6])
| MU_LOOP ':' block -> MuLoopStmt({}, [$3])
| IF condition ',' THEN ':' (statement | block) -> IfThenStmt({}, [$2, $6])
| QUIT BLOCK NUMBER -> QuitStmt({id: $3})
| ABORT LOOP NUMBER -> AbortStmt({id: $3})
;
condition
: expression
| expression '<' expression -> LessCond({}, [$1, $3])
| expression '>' expression -> GreaterCond({}, [$1, $3])
| expression '=' expression -> EqualCond({}, [$1, $3])
| '{' condition AND condition '}' -> CompoundCond({}, [$2, $4])
;
expression
: NUMBER -> NumberLit({value: $1}, [])
| identifier
| IDENT '[' (expression ',')* expression? ']' -> ApplyExpr({name:$1}, $3.concat([$4]))
| cell
| output
| NO -> BooleanLit({value: false}, [])
| YES -> BooleanLit({value: true}, [])
| expression '+' expression -> PlusExpr({}, [$1, $3])
| expression '*' expression -> TimesExpr({}, [$1, $3])
;
output
: OUTPUT -> OutputExpr({},[])
;
cell
: CELL '(' NUMBER ')' -> CellExpr({id: $3})
;
identifier
: IDENT -> Identifier({value: $1})
;
%%
// additional user code here
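
The grammar requires an `./ast` module that is not shown here. Below is a minimal sketch of what such a module might look like, assuming each node type is a factory called without `new` as `Type(attrs, children)`, which is how the semantic actions above use them. The `node` helper and the `type`, `attrs`, and `children` field names are hypothetical.

```javascript
// Hypothetical stand-in for './ast'; the real module is not part of this gist.

// Build a factory that can be called without `new`, matching calls such as
// NumberLit({value: $1}, []) or CellExpr({id: $3}) in the grammar.
function node(type) {
  return function (attrs, children) {
    return { type: type, attrs: attrs || {}, children: children || [] };
  };
}

const ast = {};
[
  'Program', 'ProcedureStmt', 'BlockStmt', 'LoopStmt', 'MuLoopStmt',
  'NumberLit', 'BooleanLit', 'OutputExpr', 'Identifier', 'CellExpr',
  'PlusExpr', 'TimesExpr', 'ApplyExpr', 'LessCond', 'GreaterCond',
  'EqualCond', 'CompoundCond', 'AssignStmt', 'IfThenStmt', 'QuitStmt',
  'AbortStmt',
].forEach(function (name) {
  ast[name] = node(name);
});

// Example: a tree shaped like the one the actions would build for
// `CELL(0) <= 1 + 2` (token values are the matched strings).
const assign = ast.AssignStmt({}, [
  ast.CellExpr({ id: '0' }),
  ast.PlusExpr({}, [
    ast.NumberLit({ value: '1' }, []),
    ast.NumberLit({ value: '2' }, []),
  ]),
]);
```

In a real file the object would be exported (for example with `module.exports = ast;`) so that `require('./ast')` in the code block at the top of the grammar can find it.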