We're writing support for the Neon language in PhpStorm, and we want it to be heavily covered by automated unit tests. The first step in processing any programming language is a lexical analyzer, or lexer for short, which takes source code and splits it into the individual words of the language, called tokens (symbols, punctuation and whitespace count as tokens too).
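To make the idea of tokens concrete, here is a minimal hand-written sketch in Java. It is not the Neon lexer and not how PhpStorm does it: the `TinyLexer` class and the three token types (`WORD`, `SYMBOL`, `WHITESPACE`) are made up for illustration, and real Neon has a much richer token set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a toy lexer with a hypothetical token set, not the real Neon grammar.
final class TinyLexer {
    public enum TokenType { WORD, SYMBOL, WHITESPACE }

    public static final class Token {
        public final TokenType type;
        public final String text;

        Token(TokenType type, String text) {
            this.type = type;
            this.text = text;
        }

        @Override
        public String toString() {
            return type + "(\"" + text + "\")";
        }
    }

    // One alternative per token type. The last alternative matches any single
    // character, so every character of the input ends up in exactly one token.
    private static final Pattern TOKEN = Pattern.compile(
            "(?<WORD>[A-Za-z0-9_]+)|(?<WHITESPACE>\\s+)|(?<SYMBOL>.)",
            Pattern.DOTALL);

    public static List<Token> tokenize(String source) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            TokenType type;
            if (m.group("WORD") != null) {
                type = TokenType.WORD;
            } else if (m.group("WHITESPACE") != null) {
                type = TokenType.WHITESPACE;
            } else {
                type = TokenType.SYMBOL;
            }
            tokens.add(new Token(type, m.group()));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Print one token per line for a small sample input.
        for (Token token : tokenize("key: value")) {
            System.out.println(token);
        }
    }
}
```

For the input `key: value` this produces four tokens: a `WORD`, a `SYMBOL` for the colon, a `WHITESPACE` and another `WORD`. Note that whitespace is kept as a token rather than thrown away, which matters for an editor that has to account for every character in the file.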
Because parsing strings by hand is tedious and boring, clever people have made tools to help us a little. Flex is the de facto standard tool for writing lexers, but we'll use its Java port, JFlex. In a flex file you describe patterns for the various types of tokens and associate a piece of code with each. Have a look at the flex file for the very simple [properties](https://github.com/JetBrains/intellij-community/blob/master/plugins/properties/src/com/intellij/lang/properties/pa