Last active
August 29, 2015 13:57
A Pygments lexer for Cassandra Query Language (CQL)
import re

from pygments.lexer import RegexLexer
from pygments.token import (Comment, Keyword, Literal, Name, Number,
                            Operator, Punctuation, String, Text)


class CqlLexer(RegexLexer):
    """
    Lexer for Cassandra Query Language.
    Spec available `here <https://cassandra.apache.org/doc/cql3/CQL.html>`_.
    """
    name = 'CQL'
    aliases = ['cql']
    filenames = ['*.cql']
    mimetypes = ['text/x-sql']
    flags = re.IGNORECASE

    tokens = {
        'root': [
            (r'\s+', Text),
            # Single-line comments: "-- ..." and "// ..."
            (r'--.*?\n', Comment.Single),
            (r'//.*\n', Comment.Single),
            (r'/\*', Comment.Multiline, 'multiline-comments'),
            # Keywords
            (r'(ADD|ALL|ALTER|AND|ANY|APPLY|AS|ASC|AUTHORIZE|'
             r'BATCH|BEGIN|BY|'
             r'CLUSTERING|COLUMNFAMILY|COMPACT|CONSISTENCY|CONTAINS|COUNT|CREATE|CUSTOM|'
             r'DELETE|DESC|DROP|DISTINCT|'
             r'EXISTS|FROM|GRANT|IF|IN|INDEX|INSERT|INTO|KEY|KEYSPACE|'
             r'LEVEL|LIMIT|'
             r'MODIFY|NORECURSIVE|NOSUPERUSER|NOT|OF|ON|ORDER|'
             r'PERMISSION|PERMISSIONS|PRIMARY|REVOKE|'
             r'SCHEMA|SELECT|STATIC|STORAGE|SUPERUSER|'
             r'TABLE|TOKEN|TRIGGER|TRUNCATE|TTL|TYPE|'
             r'UPDATE|USE|USER|USERS|USING|'
             r'VALUES|WHERE|WITH|WRITETIME)\b', Keyword),
            # Native and collection types
            (r'(ASCII|BIGINT|BLOB|BOOLEAN|COUNTER|'
             r'DECIMAL|DOUBLE|FLOAT|INET|INT|TEXT|'
             r'TIMESTAMP|TIMEUUID|UUID|VARCHAR|VARINT|'
             r'LIST|SET|MAP)\b', Name.Builtin),
            (r'(TRUE|FALSE)\b', Literal),
            (r'(NAN|INFINITY)\b', Number.Float),
            (r'[\*+<>=-]', Operator),
            # UUID literals
            (r'[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}', Number.Hex),
            # Blob literals (0x followed by hex digits)
            (r'0[xX][0-9a-fA-F]*\b', Number.Hex),
            # Floats need a fractional part or an exponent; otherwise plain
            # integers would be tagged as floats and never reach the next rule.
            (r'[0-9]+(\.[0-9]*([eE][+-]?[0-9]+)?|[eE][+-]?[0-9]+)\b', Number.Float),
            (r'[0-9]+\b', Number.Integer),
            # Single-quoted strings; '' escapes a quote
            (r"'(''|[^'])*'", String.Symbol),
            # Double-quoted identifiers; "" escapes a quote
            (r'"(""|[^"])*"', Name),
            (r'[a-zA-Z][a-zA-Z0-9_]*', Name),
            (r'[;:()\[\],\.\{\}\<\>]', Punctuation)
        ],
        # Each nested /* pushes this state again; each */ pops one level.
        'multiline-comments': [
            (r'/\*', Comment.Multiline, 'multiline-comments'),
            (r'\*/', Comment.Multiline, '#pop'),
            (r'[^/\*]+', Comment.Multiline),
            (r'[/*]', Comment.Multiline)
        ]
    }
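The 'multiline-comments' state pushes itself on every `/*` and pops on every `*/`, which is what lets the lexer follow nested comments. As a rough illustration of that idea outside Pygments, here is a minimal stdlib-only sketch that uses a depth counter in place of the state stack. The `comment_end` helper is hypothetical and not part of the gist.

```python
import re

# Hypothetical helper, not part of the gist: it mirrors the push/pop
# behaviour of the 'multiline-comments' state with a depth counter.
_DELIM = re.compile(r'/\*|\*/')


def comment_end(text, start=0):
    """Return the index just past the `*/` closing the comment that opens
    at `start`, or -1 if that comment is never closed."""
    if not text.startswith('/*', start):
        raise ValueError('no comment opens at this position')
    depth = 0
    for match in _DELIM.finditer(text, start):
        if match.group() == '/*':
            depth += 1          # comparable to pushing 'multiline-comments'
        else:
            depth -= 1          # comparable to '#pop'
            if depth == 0:
                return match.end()
    return -1
```

Called on a string that starts with a nested comment, `comment_end` should return the position after the outermost `*/`, so slicing the text up to that index yields the whole comment.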
Author
A few minor potential improvements from a quick look (keeping in mind that I don't know Pygments):
- QUORUM/LOCAL_QUORUM/ONE/TWO/THREE are not CQL3 keywords and not part of the syntax (they were part of CQL2 and of the CQL3 beta, but were removed in the final version; ALL is still a keyword though, for permissions).
- Some keywords are missing: STATIC (from C* 2.0.6 onwards), TRIGGER, PASSWORD. We also support NAN and INFINITY as floating-point constants.
- This doesn't seem to handle float (3.2, 4.12e-12, ...), blob (0x, 0xa092b) and boolean (true, false) literals.
- Strictly speaking, double-quoted strings are "Name", not "String.Symbol" (though maybe it just looks ugly if you change that, I haven't really checked).
- The CQL grammar doesn't allow '_' as the first character of a 'Name'.
- A bunch of the 'Operator' entries are not really supported, but I suppose that's no biggie :)
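To make the literal and identifier rules from the list above concrete, here is a small stdlib-only sketch that classifies a single token with plain regular expressions. The `classify` function and the `LITERALS` table are illustrative names, not part of the lexer, and the patterns are my reading of the points above rather than the official CQL grammar.

```python
import re

# Illustrative patterns only; order matters because 'true' or 'nan' would
# otherwise also match the generic name pattern.
LITERALS = [
    ('boolean', re.compile(r'true|false', re.IGNORECASE)),
    ('float', re.compile(r'nan|infinity', re.IGNORECASE)),
    ('blob', re.compile(r'0x[0-9a-f]*', re.IGNORECASE)),
    # A float needs a fractional part or an exponent.
    ('float', re.compile(r'[0-9]+(\.[0-9]*([eE][+-]?[0-9]+)?|[eE][+-]?[0-9]+)')),
    ('integer', re.compile(r'[0-9]+')),
    # A name must start with a letter, so a leading '_' is rejected.
    ('name', re.compile(r'[a-zA-Z][a-zA-Z0-9_]*')),
]


def classify(token):
    """Return the kind of the first pattern that matches the whole token,
    or None if no pattern matches."""
    for kind, pattern in LITERALS:
        if pattern.fullmatch(token):
            return kind
    return None
```

With these patterns, `classify('4.12e-12')` comes out as a float, `classify('0xa092b')` as a blob, and `classify('_id')` as `None` because of the leading underscore.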
I should probably submit a pull request, now that I have finally managed to push these changes to Bitbucket: https://bitbucket.org/al3xandru/pygments-main/branches