Regexes could be a lot more readable with a little bit more verbosity
- Match characters in single quotes
- Use plain keywords for groups of chars (like whitespace)
- Slightly alter some syntax to make it more explicit
Match an exact string with single quotes: 'hello world'
Match either of two strings: 'hello' | 'world'
(matches 'hello' or 'world')
Groupings: ['hello' | 'hi'] ' world'
(matches 'hello world' or 'hi world')
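For comparison, here is a rough sketch of how the three forms above might map onto conventional regex syntax, using Python's re module. The variable names are made up for illustration, and treating [ ... ] as a non-capturing group is an assumption, not something the notes state.

```python
import re

# 'hello world' -> a literal string; re.escape guards any metacharacters
literal = re.compile(re.escape("hello world"))

# 'hello' | 'world' -> plain alternation
either = re.compile(r"hello|world")

# ['hello' | 'hi'] ' world' -> assumed to be a non-capturing group
greeting = re.compile(r"(?:hello|hi) world")

for candidate in ["hello world", "hi world", "hey world"]:
    print(candidate, bool(greeting.fullmatch(candidate)))
```

The quoted form never needs escaping rules for characters like . or +, which is where most of the readability gain would come from.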
Match zero or more of something: ['hi' 0+] ' world'
(matches ' world', 'hi world', 'hihi world', 'hihihi world', etc)
Match one or more of something: ['hi' 1+] ' world'
(matches 'hi world', 'hihi world', etc, but not ' world')
Match two or three of something: ['hi' 2-3] ' world'
(matches 'hihi world' or 'hihihi world')
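A minimal sketch of the conventional equivalents of the three quantifiers, assuming 0+ corresponds to *, 1+ to +, and 2-3 to {2,3}. Note that the zero-or-more form also matches a bare ' world' with no 'hi' at all.

```python
import re

# ['hi' 0+] ' world'
zero_or_more = re.compile(r"(?:hi)* world")

# ['hi' 1+] ' world'
one_or_more = re.compile(r"(?:hi)+ world")

# ['hi' 2-3] ' world'
two_to_three = re.compile(r"(?:hi){2,3} world")

for text in [" world", "hi world", "hihi world", "hihihihi world"]:
    print(
        repr(text),
        bool(zero_or_more.fullmatch(text)),
        bool(one_or_more.fullmatch(text)),
        bool(two_to_three.fullmatch(text)),
    )
```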
Match alphanumerics (aka [a-zA-Z0-9]): alphanum
Match alphabet (aka [a-zA-Z]): alpha
Match ranges: [ 'a' - 'z' | 'A' - 'Z' ]
Match not something: [not 'a' - 'z']
(anything except a-z)
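The character-class forms above could correspond to conventional bracket expressions roughly as follows. This is only a sketch: the names are illustrative, and each pattern matches a single character.

```python
import re

# alphanum
alphanum = re.compile(r"[a-zA-Z0-9]")

# alpha
alpha = re.compile(r"[a-zA-Z]")

# [ 'a' - 'z' | 'A' - 'Z' ] -> two ranges combined in one class
letter_range = re.compile(r"[a-zA-Z]")

# [not 'a' - 'z'] -> negated class
not_lower = re.compile(r"[^a-z]")

for ch in ["a", "Q", "7", "_"]:
    print(ch, bool(alphanum.fullmatch(ch)), bool(alpha.fullmatch(ch)), bool(not_lower.fullmatch(ch)))
```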
Match (capture) groups: ('hello') ' ' ('world')
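Assuming parentheses mark groups whose matched text can be read back afterwards (the notes do not spell this out), the conventional equivalent would be ordinary capturing groups. A small sketch:

```python
import re

# ('hello') ' ' ('world') -> two capturing groups around the literals
pair = re.compile(r"(hello) (world)")

match = pair.fullmatch("hello world")
first, second = match.groups()
print(first, second)
```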
Other keywords: digit, whitespace, startln, endln, startstr, endstr, word-boundary
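One possible mapping from these keywords to conventional regex tokens is sketched below. The notes do not define the keywords precisely, so the KEYWORDS table is an assumption (for example, digit is taken to mean ASCII 0-9, and startln/endln are taken to need multiline mode).

```python
import re

# Hypothetical keyword table; the right-hand sides are assumptions.
KEYWORDS = {
    "digit": r"[0-9]",
    "whitespace": r"\s",
    "startln": r"^",        # line anchors, used with re.MULTILINE
    "endln": r"$",
    "startstr": r"\A",      # anchors for the whole string
    "endstr": r"\Z",
    "word-boundary": r"\b",
}

text = "id 42\nid 7"

# startln 'id' whitespace [digit 1+] endln
per_line = re.compile(
    KEYWORDS["startln"] + "id" + KEYWORDS["whitespace"]
    + "(?:" + KEYWORDS["digit"] + ")+" + KEYWORDS["endln"],
    re.MULTILINE,
)
line_matches = per_line.findall(text)

# startstr 'id' whitespace [digit 1+] endstr
whole_string = re.compile(
    KEYWORDS["startstr"] + "id" + KEYWORDS["whitespace"]
    + "(?:" + KEYWORDS["digit"] + ")+" + KEYWORDS["endstr"]
)

# word-boundary 'hi' word-boundary
standalone_hi = re.compile(KEYWORDS["word-boundary"] + "hi" + KEYWORDS["word-boundary"])

print(line_matches)
print(whole_string.search(text))
print(bool(standalone_hi.search("say hi")), bool(standalone_hi.search("this")))
```

The line anchors match once per line, while the string anchors fail here because the text spans two lines.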
HTML tags: '<' [alpha 1+] as tag ' ' [any 0+] '>' [any 0+] '</' tag '>'
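The HTML example relies on naming a group (as tag) and reusing it later. A rough conventional equivalent uses a named group and a backreference. This sketch assumes any means "any character except a newline", and like any regex it is not a real HTML parser.

```python
import re

# '<' [alpha 1+] as tag ' ' [any 0+] '>' [any 0+] '</' tag '>'
html_tag = re.compile(r"<(?P<tag>[a-zA-Z]+) .*>.*</(?P=tag)>")

ok = html_tag.search('<a href="x">link</a>')
mismatched = html_tag.search('<a href="x">link</b>')
no_attributes = html_tag.search("<p>text</p>")

print(ok.group("tag"))
print(mismatched, no_attributes)
```

As written, the pattern requires a literal space after the tag name, so a tag without attributes such as `<p>text</p>` does not match.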