
@alganet
Last active August 30, 2017 13:38
Greppy
<?php
// Simple empty matching. This explains a lot about how the idea works:
//
// By default, expressions are delimited by a line, which
// implies /^$/. Everything else goes inside that. Calling nothing
// more matches any line.
//
// p() is a function call. It could be Pattern::create(), but I
// believe providing a function is sane. It can be provided
// as a namespaced function (Greppy\...\pattern_create as p)
// and doesn't violate any standard.
//
p()->test(''); //true
// The wildcard. This is used for creating a pattern that the
// library doesn't support.
//
// The regex is still processed, though. Delimiters are not
// mandatory, since you can nest wildcards just like you can
// concatenate regular expressions stored in variables.
p('[^/]*')->test('yay no slashes here'); //true
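//
// For comparison, a minimal sketch of the plain-PCRE equivalent, assuming
// the wildcard is simply wrapped in the default ^ .. $ anchors. The `~`
// delimiter and the variable name are choices made for this sketch (the
// pattern itself contains a slash); Greppy does not prescribe them.
$wildcardResult = preg_match('~^[^/]*$~', 'yay no slashes here') === 1; // true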
// A more elaborate sample. Regular expressions are complicated mostly
// by their flow, not their syntax. The syntax only expresses
// (beautifully) that flow.
//
// This is an attempt to translate most patterns we use in common
// regular expressions to a DSL
//
// The calls are read like this: Pattern for char until, oh wait,
// start capturing, well... find a digit. Ok. Now match this.
//
// A translated regex is: ^.*(\d)$
//
// Its simple pieces are:
//
// - ^ .. $ // Provided by default by p()
// - . // ->char
// - * // ->until
// - ( // ->capture
// - \d // ->digit
// - ) // Implicit, opening capture auto-closes it.
p()->char->until->capture->digit->match('slide-5'); //5
// The regex above, and the flow described by both its pieces and
// its fluent interface, match the graph shown by the regular
// expression debugger Debuggex:
//
// http://www.debuggex.com/?re=^.*%28\d%29%24&str=slide-5
//
// The graph shows a straight line with two loops (one for the `until`
// and one for the `capture` steps) and the steps for the line
// capture.
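//
// For reference, a sketch of the same match done by hand with
// preg_match(), assuming the DSL compiles to exactly the regex quoted
// above. $slideMatch is a name made up for this sketch.
preg_match('/^.*(\d)$/', 'slide-5', $slideMatch);
// $slideMatch[0] holds the whole line, $slideMatch[1] the captured
// digit as the string '5'.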
// The ->match() method seen above returns the captured groups.
// You can use it alongside the capture() feature:
// Returns array('slide-number' => 5);
p()->char->until->capture('slide-number')->digit->matchAll('slide-5');
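// A hedged plain-PCRE counterpart of the named capture. PCRE group names
// may only contain letters, digits, and underscores, so a name such as
// 'slide-number' would presumably have to be translated by the library.
// This sketch assumes an underscore instead; $namedMatch is hypothetical.
preg_match('/^.*(?P<slide_number>\d)$/', 'slide-5', $namedMatch);
// $namedMatch['slide_number'] is the string '5'; PCRE does not cast
// captures to integers.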
// The default line anchoring (^ .. $) is actually optional; ->inline drops it:
p()->inline->atLeast->digit->match('hey my numba is 874'); // 874
// The sample above translates to \d+. It is also very similar to the graph on
// Debuggex: http://www.debuggex.com/?re=\d%2B&str=hey+my+numba+is+874
// with a single loop for the "at least".
//
// Note that the modifier is written before the \d matcher but is compiled
// after it. This is easy to accomplish since only a few modifiers
// and matchers exist.
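//
// The unanchored case can be checked directly against PCRE. This is a
// sketch assuming ->inline simply omits the ^ and $ anchors; $inlineMatch
// is a name invented here.
preg_match('/\d+/', 'hey my numba is 874', $inlineMatch);
// $inlineMatch[0] is the string '874'.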
// Capture: (\d)
p()->capture->digit;
// Silent Capture (a non-capturing group): (?:\d)
p()->silent->digit;
// Named Capture: (?P<numero>\d)
p()->capture('numero')->digit;
// Reusing captures. Expression: (?P<part>[/].*[^/])
p()->capture('part')->char('/')->char->until->not('/')->match('/foo/bar/baz');
// Result from above: array('/foo', '/bar', '/baz');
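//
// A note for anyone checking this against plain PCRE: `.*` is greedy
// there, so the expression as written matches the whole string once.
// How Greppy would get from that to one entry per path segment is not
// specified here. The second call is only one possible rewrite, assumed
// for this sketch, that yields the three parts listed above.
preg_match_all('~(?P<part>[/].*[^/])~', '/foo/bar/baz', $greedyParts);
// $greedyParts['part'] is array('/foo/bar/baz')
preg_match_all('~(?P<part>[/][^/]+)~', '/foo/bar/baz', $segmentParts);
// $segmentParts['part'] is array('/foo', '/bar', '/baz')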
@wesleyvicthor
🐼