- The rules for the flags `/g` and `/y` are complicated.
- Regular expression operations keep state in RegExp objects, via the property `.lastIndex`. That leads to several pitfalls if a RegExp object is used multiple times. It also makes it impossible to freeze instances of RegExp.
- `String.prototype.replace()` accepts a callback where accessing named captures is inconvenient (via an object that is passed as the last argument).
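The pitfalls listed above can be reproduced with the API as it exists today. The following is a minimal sketch; the variable names are only for illustration.

```javascript
// A RegExp with /g remembers where the previous match ended.
const regExp = /a/g;
const results = [
  regExp.test('aXa'), // match at index 0, .lastIndex becomes 1
  regExp.test('aXa'), // match at index 2, .lastIndex becomes 3
  regExp.test('aXa'), // no match from index 3 on, .lastIndex is reset to 0
];
console.log(results); // [ true, true, false ]

// A frozen RegExp cannot update .lastIndex, so the stateful operation throws.
const frozen = Object.freeze(/a/g);
let freezeError;
try {
  frozen.test('aXa');
} catch (err) {
  freezeError = err;
}
console.log(freezeError instanceof TypeError); // true

// Named captures arrive as the last callback argument,
// after all numbered captures, the offset and the input string.
const swapped = 'first=1'.replace(
  /(?<key>\w+)=(?<value>\w+)/,
  (match, p1, p2, offset, input, groups) => `${groups.value}=${groups.key}`
);
console.log(swapped); // 1=first
```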
- Improving the RegExp API without introducing a new constructor.
Common characteristics:
- Options object:
  - `.startIndex = 0`
  - `.returnIndices = false`
- Don’t change the RegExp object in any way.
  - Completely ignore `.lastIndex`.
- Completely ignore the following flags:
  - `/g` (`.global`): not needed because each method is either global or non-global
  - `/y` (`.sticky`): replaced by the assertion `\G`
  - `/d` (`.hasIndices`): replaced by option `.returnIndices`
    - Not as important but I’d prefer an option for a method over a RegExp flag because this toggle is more about how an operation works than about how a RegExp matches.
.execOnce(str, options?): MatchObject
.execMany(str, options?): Iterable<MatchObject>
.testOnce(str, options?): boolean
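These three signatures can be approximated with today's API. The following is only a sketch under several assumptions: the helpers are standalone functions rather than `RegExp.prototype` methods, `MatchObject` is taken to be the match array that `.exec()` returns today (with `null` for "no match"), and the assertion `\G` is not emulated. The helper name `cloneForOperation` is hypothetical.

```javascript
// Hypothetical helper: copy a RegExp, dropping /g, /y and /d and
// adding back only what the operation itself needs.
function cloneForOperation(regExp, {returnIndices = false} = {}) {
  const flags = regExp.flags.replace(/[gyd]/g, '');
  // /g makes .exec() start at .lastIndex of the private clone
  return new RegExp(regExp.source, flags + 'g' + (returnIndices ? 'd' : ''));
}

function execOnce(regExp, str, {startIndex = 0, returnIndices = false} = {}) {
  const clone = cloneForOperation(regExp, {returnIndices});
  clone.lastIndex = startIndex;
  return clone.exec(str);
}

function* execMany(regExp, str, {startIndex = 0, returnIndices = false} = {}) {
  const clone = cloneForOperation(regExp, {returnIndices});
  clone.lastIndex = startIndex;
  // .matchAll() works on its own copy and starts at the clone's .lastIndex
  yield* str.matchAll(clone);
}

function testOnce(regExp, str, options) {
  return execOnce(regExp, str, options) !== null;
}

// Usage: neither the flags nor .lastIndex of the original RegExp matter.
const stateful = /a/g;
stateful.lastIndex = 2;
console.log(execOnce(stateful, 'aXa').index); // 0
console.log(stateful.lastIndex); // 2 (unchanged)
console.log([...execMany(/a/, 'aXa', {startIndex: 1})].map(m => m.index)); // [ 2 ]
console.log(testOnce(/a/y, 'Xa')); // true, because /y is ignored
```

Working on a clone is what keeps the original object untouched, so it could even be frozen.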
Better callback type signature: callback(matchObject)
.replaceOnce(stringOrRegExp, stringOrCallback, options?): string
.replaceMany(stringOrRegExp, stringOrCallback, options?): string
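One way to picture `callback(matchObject)` is as a thin wrapper around today's `.replace()`. The sketch below rests on assumptions: the functions are standalone (the string is passed as the first argument), only the RegExp-plus-callback case is covered, the `options` parameter is omitted, and the match object is modeled on the array that `.exec()` returns today. The helper names `toMatchObject` and `replaceWith` are hypothetical.

```javascript
// Hypothetical helper: repackage the positional callback arguments of
// String.prototype.replace() into a single exec()-like match object.
function toMatchObject(args) {
  const rest = [...args];
  const last = rest[rest.length - 1];
  // The groups object is only passed if the pattern has named captures
  const groups = typeof last === 'object' ? rest.pop() : undefined;
  const input = rest.pop();
  const index = rest.pop();
  // What remains is the matched text followed by the numbered captures
  return Object.assign(rest, {index, input, groups});
}

function replaceWith(str, regExp, callback, global) {
  // Ignore /g, /y and /d on the argument; the method decides globality
  const flags = regExp.flags.replace(/[gyd]/g, '') + (global ? 'g' : '');
  const clone = new RegExp(regExp.source, flags);
  return str.replace(clone, (...args) => callback(toMatchObject(args)));
}

const replaceOnce = (str, regExp, callback) => replaceWith(str, regExp, callback, false);
const replaceMany = (str, regExp, callback) => replaceWith(str, regExp, callback, true);

// Usage: named captures are available directly on the match object.
const tag = (m) => `<${m.groups.digits}@${m.index}>`;
console.log(replaceMany('a1b22', /(?<digits>\d+)/, tag)); // a<1@1>b<22@3>
console.log(replaceOnce('a1b22', /(?<digits>\d+)/g, tag)); // a<1@1>b22
```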
Open questions:
- Should these methods forward to Symbol-keyed properties of `stringOrRegExp`?
  - Another option: turn them into `RegExp.prototype.*` methods.
String.prototype.search(patternStringOrRegExp): number
String.prototype.split(verbatimStringOrRegExp?, limit?): Array<string>
Open question:
- Would it make sense to support an additional argument `options`?
RegExp.prototype.exec
RegExp.prototype.test
String.prototype.match
String.prototype.matchAll
String.prototype.replace
String.prototype.replaceAll
- Matches at the current matching position (0 or `.startIndex`).
- Loosely related to the `^` assertion
How should legacy methods handle `\G`?
- It clashes with flag `/y` because with that flag, a regular expression implicitly starts with `\G`.
  - Thus: throwing an exception when `\G` is used with `/y` seems the best option.
- Other than that, we could specify a “current position” for all legacy methods (sometimes `.lastIndex`, sometimes 0) and use that with `\G`.
- `/y` is ignored by `.split()`, so supporting `\G` would be an improvement.
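Since JavaScript does not support `\G` today, the assertion itself cannot be demonstrated, but the existing behavior it relates to can. A short sketch of how `/y` anchors a match at `.lastIndex` and how `.split()` disregards the flag:

```javascript
// With /y, a match must start exactly at .lastIndex,
// as if the pattern began with an implicit \G.
const sticky = /b/y;
const atZero = sticky.test('ab'); // false: there is no 'b' at index 0
sticky.lastIndex = 1;
const atOne = sticky.test('ab'); // true: 'b' is at index 1
console.log(atZero, atOne, sticky.lastIndex); // false true 2

// The same pattern without /y may match anywhere.
const anywhere = /b/.test('ab');
console.log(anywhere); // true

// .split() behaves the same with and without /y.
const withSticky = 'a-b-c'.split(/-/y);
const withoutSticky = 'a-b-c'.split(/-/);
console.log(withSticky, withoutSticky); // [ 'a', 'b', 'c' ] [ 'a', 'b', 'c' ]
```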
- List of upcoming RegExp proposals (collapsed section “Future: Active proposals”): https://github.com/slevithan/awesome-regex#javascript-regex-evolution
- Currently, there is no plan to support multi-line RegExp literals in JavaScript. A template tag is a good alternative and would be very useful for the proposed flag `/x`.
- Two TC39 members have expressed an interest in adding a template tag for RegExps to JavaScript (source).
- A template tag could look like this: https://github.com/slevithan/regex
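To make the idea of a template tag for multi-line regular expressions concrete, here is a deliberately naive sketch. The names `multiLineRegExp` and `escapeForRegExp` are hypothetical, and this is not the API of the linked library. It assumes `/x`-like rules where whitespace and `#` comments are ignored everywhere in the pattern text; a real implementation would have to treat character classes and escaped `#` specially.

```javascript
// Hypothetical helper: escape RegExp syntax characters in interpolated text
function escapeForRegExp(text) {
  return text.replace(/[\\^$.*+?()[\]{}|]/g, '\\$&');
}

// Hypothetical template tag: builds a RegExp from a multi-line pattern
function multiLineRegExp(templateStrings, ...substitutions) {
  let source = '';
  templateStrings.raw.forEach((chunk, i) => {
    source += chunk
      .replace(/#.*$/gm, '') // strip comments up to the end of the line
      .replace(/\s+/g, ''); // strip all whitespace, including line breaks
    if (i < substitutions.length) {
      // Interpolated values are matched verbatim
      source += escapeForRegExp(String(substitutions[i]));
    }
  });
  return new RegExp(source, 'u');
}

const separator = '-';
const date = multiLineRegExp`
  (?<year>\d{4})   # four-digit year
  ${separator}
  (?<month>\d{2})  # two-digit month
`;
console.log(date.source); // (?<year>\d{4})-(?<month>\d{2})
console.log(date.exec('2024-05').groups.month); // 05
```

Using the raw template strings means that backslashes such as `\d` do not have to be doubled.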
- `All` is already taken.
- It’s just a first idea; suggestions welcome!
  - Other options: `Multi`, `OnceOrMore`
- I find non-overloaded methods easier to understand (they are also easier to statically type): `.execOnce` and `.execMany` have different return types.
- I also like to avoid single big methods that do too much.
- Precedents for method pairs in the current API:
  - `.replace()` and `.replaceAll()`
  - `.match()` and `.matchAll()`
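The point about overloading can be seen in the existing `.match()`, whose result shape depends on a flag of its argument. A brief illustration using only current behavior:

```javascript
// Without /g: a single match object with .index (and .input, .groups)
const single = 'a1b2'.match(/\d/);
console.log(single[0], single.index); // 1 1

// With /g: a plain array of matched strings, without positions
const all = 'a1b2'.match(/\d/g);
console.log(all, all.index); // [ '1', '2' ] undefined

// .matchAll() has one return type but rejects a RegExp without /g
let matchAllError;
try {
  'a1b2'.matchAll(/\d/);
} catch (err) {
  matchAllError = err;
}
console.log(matchAllError instanceof TypeError); // true
```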
- Thanks to Steven Levithan for inspiring this proposal and for his feedback on my ideas.
(Edit: The following feedback was based on an earlier version with significant differences. See also the earlier related discussion here.)
This is great! Lots of good ideas that work well together and provide a cleaner, easier to use, and less surprising API with fewer footguns.
Bikeshedding about naming aside, one concern is how much this bundles into one proposal. Some things have to be bundled, but it can be split into two independent proposals without taking anything away:
- `\G`, plus new `RegExp` and `String` methods that improve API signatures and completely move away from `lastIndex` and flags `/dgy` (which modify how various methods apply regexes and the shape of their results), as opposed to flags that modify the meaning of regexes (`/ims`, etc.).

Also, to decouple even more, it can be explicitly stated that flags `x` and `n` are not added in the proposal, even though their behavior is always on in the template tag. Nothing stops separate proposals from adding `x` and `n` to regex literals and the `RegExp` constructor (before, after, or alongside the introduction of such a tag).

The flagship improvement for #2 is of course moving away from the statefulness of regexes, which has long been a source of bugs and developer surprise (here's one example, and it would probably be good to collect more).
`\G` is a great feature. It's more flexible than `/y` and it's broadly supported in other regex flavors (.NET, Perl, PCRE, Java, Ruby, Boost.Regex, etc.). But since the way it's most commonly used overlaps with `/y`, it probably wouldn't make sense to add unless coupled with a proposal like this that also essentially deprecates `/y`. However, there is still the issue of how exactly it should work. A few options:
- … `/y`, every regex/string method not introduced in the proposal could throw if the regex they're provided uses `\G`. This is probably easiest, but it's arguably not the best for users and there is no precedent for it, apart from `matchAll` and `replaceAll` throwing based on flags.
- … `lastIndex` (since that would probably break the case for introducing it), it could track the match-end position on its target strings (in a property that might or might not be user-visible). This would in any case be an improvement on `lastIndex`, since particular regexes can of course be applied to more than one string.
- … `\G`. This would be needed when using flag `/g` with string methods `replace`, `replaceAll`, `match`, and `matchAll`, but wouldn't be needed by `search` or the `RegExp` methods `exec` and `test`. It would additionally be needed by string `split` with or without flag `/g`, since the `\G` assertion should work like any other assertion (`^`, etc.). This would offer another improvement on `/y` since `/y` is ignored by `split`.

It can be disabled via a modifier in the pattern: `(?-n:...)`, and maybe `(?-n)...` in the future. But apart from that, if you allowed turning it off via some other option, then the same should probably be allowed for disabling `x` and `v`. I'd favor not offering an additional option to turn off any flags that are on by default when using the tag. Always-on `n` might be more controversial than always-on `x` and `v`, but it has multiple benefits:
- … `(?:...)`.
- … `regex`'s behavior for `/n`, it avoids the footgun of referring to named captures by number.
- … `nosubs`) and in other flavors the numbering of named captures is inconsistent (e.g. in JS it's left to right for all captures, but in .NET it's unnamed captures first followed by named captures).