@dkincaid
Last active December 22, 2015 09:58
RTF Parser Early Tests
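// Variant 1 (MinRtf-control-rule): control is a parser rule built from separate tokens.
grammar MinRtf ;
document : (control | text )+ ;
text : TEXT ;
control : KEYWORD INT? SPACE? ;
KEYWORD : '\\' (ASCIILETTER)+ ;
fragment ASCIILETTER : [A-Za-z] ;
fragment DIGIT : [0-9] ;
INT : '-'? DIGIT+ ;
WS : [\r\n] -> skip ;
SPACE : ' ' ;
// TEXT is the longest match right after a keyword, so it beats INT and SPACE
// and swallows the parameter and the delimiter space.
TEXT : ~('\\' | '\r' | '\n' )+ ;
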
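// Variant 2 (MinRtf-control-token): the whole control word is a single lexer token.
grammar MinRtf ;
document : (control | text )+ ;
text : (TEXT | SPACE) ;
control : CONTROL ;
// CONTROL is longer than KEYWORD alone, so it wins and keeps the parameter
// and the delimiter space inside one token.
CONTROL : KEYWORD INT? SPACE? ;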
KEYWORD : '\\' (ASCIILETTER)+ ;
fragment ASCIILETTER : [A-Za-z] ;
fragment DIGIT : [0-9] ;
INT : '-'? DIGIT+ ;
WS : [\r\n] -> skip ;
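SPACE : ' ' ;
// Assumption: TEXT is not defined in this listing although the text rule uses it;
// the definition from the control-rule variant is repeated here.
TEXT : ~('\\' | '\r' | '\n' )+ ;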

Desired results

this one \b0 that one -> (document (text this one ) (control \b 0) (text that one))
this one \b0that one -> (document (text this one ) (control \b 0) (text that one))
this one \b that one -> (document (text this one ) (control \b) (text that one))
this one \b\i that one -> (document (text this one ) (control \b) (control \i) (text that one))
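
The desired results above can be written down as an executable reference, which makes it easier to compare each grammar variant against the target. The following is only a sketch in Python, not ANTLR: `parse`, `to_tree`, and the two regular expressions are hypothetical names, and the delimiter rule (at most one space after a control word is consumed) is an assumption read off the four examples.

```python
import re

# Control word: backslash + ASCII letters, an optional signed integer
# parameter, and at most one delimiter space that is consumed (assumption).
CONTROL = re.compile(r"\\([A-Za-z]+)(-?[0-9]+)? ?")
# Text: anything up to the next backslash or line break.
TEXT = re.compile(r"[^\\\r\n]+")


def parse(src):
    """Split src into ("control", keyword, param) and ("text", value, None) nodes."""
    nodes = []
    pos = 0
    while pos < len(src):
        if src[pos] in "\r\n":  # line breaks are skipped, as in the WS rule
            pos += 1
            continue
        m = CONTROL.match(src, pos)
        if m:
            nodes.append(("control", "\\" + m.group(1), m.group(2)))
            pos = m.end()
            continue
        m = TEXT.match(src, pos)
        if m is None:
            raise ValueError("unexpected character at offset %d" % pos)
        nodes.append(("text", m.group(0), None))
        pos = m.end()
    return nodes


def to_tree(nodes):
    """Render nodes in the same LISP-like style as the result listings."""
    parts = []
    for kind, value, param in nodes:
        if kind == "control":
            suffix = "" if param is None else " " + param
            parts.append("(control " + value + suffix + ")")
        else:
            parts.append("(text " + value + ")")
    return "(document " + " ".join(parts) + ")"


if __name__ == "__main__":
    for line in [r"this one \b0 that one", r"this one \b0that one",
                 r"this one \b that one", r"this one \b\i that one"]:
        print(line, "->", to_tree(parse(line)))
```

Treating keyword, parameter, and delimiter as one unit is what the control-token variant does at the lexer level; the sketch additionally separates keyword and parameter, which is the part the desired output asks for.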

MinRtf-control-rule

this one \b0 that one -> (document (text this one ) (control \b) (text 0 that one))
this one \b0that one -> (document (text this one ) (control \b) (text 0that one))
this one \b that one -> (document (text this one ) (control \b) (text  that one))
this one \b\i that one -> (document (text this one ) (control \b) (control \i) (text that one))

MinRtf-control-token

this one \b0 that one -> (document (text this one ) (control \b0 ) (text that one))
this one \b0that one -> (document (text this one ) (control \b0) (text that one))
this one \b that one -> (document (text this one ) (control \b ) (text that one))
this one \b\i that one -> (document (text this one ) (control \b) (control \i ) (text that one))

TstLexer/TstParser

this one \b0 that one -> (document (text this one ) (control \b 0 ) (text that one))
this one \b0that one -> (document (text this one ) (control \b 0) (text that one))
this one \b that one -> (document (text this one ) (control \b) (text that one))
this one \b\i that one -> (document (text this one ) (control \b) (text that one)), plus token recognition errors at '\' and 'i'
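// Lexer-mode variant: after a keyword, CTRL mode handles the optional parameter and delimiter.
lexer grammar TstLexer;
KEYWORD : '\\' (ASCIILETTER)+ -> pushMode(CTRL) ;
fragment ASCIILETTER : [A-Za-z] ;
fragment DIGIT : [0-9] ;
WS : [\r\n] -> skip ;
SPACE : ' ' ;
TEXT : ~('\\' | '{' | '}' | '\n' | '\r')+ ;
mode CTRL;
// CTRL only accepts a delimiter or a parameter, so a keyword followed directly
// by another keyword (\b\i) cannot be matched and the mode is not popped.
IGNORE_SPACE : (' ' | '\n' | '\r') -> skip, popMode ;
INT : '-'? DIGIT+ SPACE? -> popMode ;
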
parser grammar TstParser ;
options { tokenVocab=TstLexer; }
document : (control | text )+ ;
text : (TEXT | SPACE) ;
control : KEYWORD INT* ;