- The dot matches (almost) any character
- Non-greedy RegEx
- Left-Non-Greedy RegEx
- Matching Series Of Words
- Splitting a String
- Negating a Reg Exp
- Replacing Characters With a Pattern Including The Matching Characters
- Validate A String With Specific Characters
- Frequent Password Validation Rules
- Converting a URL To A Regex
- Replacing characters except the nth ones?
The belief that the dot can match any character leads to many mistakes. The exceptions to this rule are line-break characters. Line-break characters are:
\n: New line.
\r\n: New line in Windows.
A typical mistake occurs when removing content between a start and an end pattern in a multiline document. For example, let's assume that we have the following piece of text:
Hello,
I'm Charles
delelemestart
I hate you all, and I want you to go to hell.
delelemeend
How can I help you my dear friend?
If we wish to delete Charles' nasty block of thought, the following will not replace anything:
text.replace(/delelemestart(.*?)delelemeend/g,'')
Instead, we need to use:
text.replace(/delelemestart((.|\n)*?)delelemeend/g,'')
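As an alternative to (.|\n), here is a minimal sketch that assumes an engine supporting the s (dotAll) flag from ES2018, such as a current Node.js or browser. The [\s\S] spelling is a common fallback for older engines. The text variable simply rebuilds the example document above.

```javascript
// Rebuild the example document, one array entry per line.
const text = [
  "Hello,",
  "I'm Charles",
  "delelemestart",
  "I hate you all, and I want you to go to hell.",
  "delelemeend",
  "How can I help you my dear friend?",
].join("\n");

// Without the s flag the dot stops at line breaks, so nothing is replaced.
const unchanged = text.replace(/delelemestart(.*?)delelemeend/g, "");

// With the s (dotAll) flag the dot also matches line-break characters.
const cleaned = text.replace(/delelemestart.*?delelemeend/gs, "");

// [\s\S] matches any character at all, with or without the s flag.
const cleanedLegacy = text.replace(/delelemestart[\s\S]*?delelemeend/g, "");

console.log(cleaned);
```

Both variants avoid the alternation in (.|\n), which some engines handle slowly on long inputs.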
Example:
"This is another <Hello World> test -> Yeaaaaaaaahh!!!".match(/<(.*?)>/)[1];
//> 'Hello World'
If you want to support multiline, replace (.*?) with ((.|\n)*?).
How do you match the content located between two specific symbols? For example, say you want to retrieve the text between < and > in the following text: This is another <Hello World> test -> Yeaaaaaaaahh!!!.
In the example above, you expect the result of your regex to be Hello World, but instead you get Hello World> test -.
You're most likely using the standard greedy capture:
"This is another <Hello World> test -> Yeaaaaaaaahh!!!".match(/<(.*)>/)[1];
//> 'Hello World> test -'
What you need instead is a non-greedy capture:
"This is another <Hello World> test -> Yeaaaaaaaahh!!!".match(/<(.*?)>/)[1];
//> 'Hello World'
If you need all the matches in your text, use the g option:
"This is another <Hello World> test -> Yeaaaaaaaahh!!! So <happy>".match(/<(.*?)>/g);
//> ['<Hello World>', '<happy>']
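Note that with the g option, match returns the full matches and drops the capture groups. If you want only the captured content, one option is matchAll. This is a sketch assuming an ES2020+ engine; the variable names are illustrative.

```javascript
const input = "This is another <Hello World> test -> Yeaaaaaaaahh!!! So <happy>";

// matchAll yields one match object per occurrence; index 1 is the first capture group.
const captured = Array.from(input.matchAll(/<(.*?)>/g), (m) => m[1]);

console.log(captured);
//> [ 'Hello World', 'happy' ]
```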
The above is great, and is generally referred to as a right-non-greedy regex. To highlight the difference, let's have a look at the following example:
"This is another <Hello <World> test -> Yeaaaaaaaahh!!!".match(/<(.*?)>/)[1];
//> 'Hello <World'
The result of the above regex is Hello <World, but what if we wanted World?
"This is another <Hello <World> test -> Yeaaaaaaaahh!!!".match(/<([^<]*?)>/)[1];
//> 'World'
The above example works well when the delimiters are single characters. However, this trick stops working when the delimiters are made of words or multiple characters. The following example is quite difficult to solve with a regex:
"This is another _bla_Hello _bla_World_blip_ test -> Yeaaaaaaaahh!!!"
Where the opening delimiter is _bla_ and the ending delimiter is _blip_.
The trick is to replace the opening delimiter with a rare character (i.e., one that should never appear in the string). If the context of your problem allows for such a character, then the following trick will work:
"This is another _bla_Hello _bla_World_blip_ test -> Yeaaaaaaaahh!!!".replace(/_bla_/g, '░').match(/░([^░]*?)_blip_/)[1];
//> 'World'
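The placeholder trick can be wrapped in a small helper. This is only a sketch: innermostBetween and escapeRegExp are hypothetical names, and it assumes the placeholder is a single character that does not occur in the input.

```javascript
// Escape regex special characters so a delimiter is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: returns the text between the LAST opening delimiter
// that precedes a closing delimiter and that closing delimiter, or null.
function innermostBetween(text, open, close, placeholder = "\u2591") {
  if (text.includes(placeholder)) {
    throw new Error("placeholder already present in text");
  }
  // Swap every opening delimiter for the single-character placeholder.
  const swapped = text.split(open).join(placeholder);
  const p = escapeRegExp(placeholder);
  const re = new RegExp(`${p}([^${p}]*?)${escapeRegExp(close)}`);
  const m = swapped.match(re);
  return m ? m[1] : null;
}

const sample = "This is another _bla_Hello _bla_World_blip_ test -> Yeaaaaaaaahh!!!";
console.log(innermostBetween(sample, "_bla_", "_blip_"));
//> 'World'
```

Throwing when the placeholder is already present keeps the helper from silently returning a wrong match.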
// This will match any string that starts (^) with '/ar/' (\/ar\/) or (|) '/es/' (\/es\/).
// The group is required: without it, ^ would only apply to the first alternative.
const regex = /^(\/ar\/|\/es\/)/
regex.test("/ar/learn/overview/") // true
regex.test("/es/learn/overview/") // true
regex.test("/it/learn/overview/") // false
The simplest split drops the separator (here, the underscore and the character that follows it):
"hello_world".split(/_.{1}/g)
// > [ 'hello', 'orld' ]
To keep the separator in the result, split on a positive lookahead, which matches a position without consuming any characters:
"hello_world".split(/(?=_.{1})/g)
// > [ 'hello', '_world' ]
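Two related variations, sketched under the assumption of an ES2018+ engine (lookbehind is not available in older ones): a lookbehind keeps the separator at the end of the left-hand piece, and a capturing group returns the separator as its own item.

```javascript
const input = "hello_world";

// Lookbehind: split at the position right AFTER the underscore.
const separatorAtEnd = input.split(/(?<=_)/);
console.log(separatorAtEnd);
//> [ 'hello_', 'world' ]

// Capturing group: split includes the captured separator in the output array.
const separatorAsItem = input.split(/(_)/);
console.log(separatorAsItem);
//> [ 'hello', '_', 'world' ]
```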
Simply use a negative lookahead:
(?!regexp)
or, if you're negating a set of characters, a negated character class:
[^characters]
If we take the example above:
// This will match any string that DOES NOT start (^) with '/ar/' (\/ar\/) or (|) '/es/' (\/es\/)
const regex = /^(?!\/ar\/|\/es\/)/
regex.test("/ar/learn/overview/") // false
regex.test("/es/learn/overview/") // false
regex.test("/it/learn/overview/") // true
// The following will remove all non-alphanumeric characters:
const slurp = name.replace(/[^a-zA-Z0-9]/g, '')
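Here is a small runnable sketch of both negation styles. The paths array and the rawName value are made-up sample data.

```javascript
// Negative lookahead: keep only the paths that do NOT start with '/ar/' or '/es/'.
const notArOrEs = /^(?!\/ar\/|\/es\/)/;
const paths = ["/ar/learn/overview/", "/es/learn/overview/", "/it/learn/overview/"];
const otherPaths = paths.filter((p) => notArOrEs.test(p));
console.log(otherPaths);
//> [ '/it/learn/overview/' ]

// Negated character class: strip everything that is not a letter or a digit.
const rawName = "Jean-Luc O'Neil 3rd";
const slurp = rawName.replace(/[^a-zA-Z0-9]/g, "");
console.log(slurp);
//> 'JeanLucONeil3rd'
```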
Use $&, which inserts the whole matched substring into the replacement string:
"https://neap.co".replace(/(neap|co)/g, 'hello_$&')
// 'https://hello_neap.hello_co'
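For comparison, the same result can be reached with a numbered group reference or with a replacer function. This is a sketch with illustrative variable names.

```javascript
const url = "https://neap.co";

// $& inserts the whole match.
const withWholeMatch = url.replace(/(neap|co)/g, "hello_$&");

// $1 inserts the first capture group, which here equals the whole match.
const withGroup = url.replace(/(neap|co)/g, "hello_$1");

// A replacer function receives the match as its first argument.
const withCallback = url.replace(/neap|co/g, (match) => `hello_${match}`);

console.log(withWholeMatch, withGroup, withCallback);
```

All three produce 'https://hello_neap.hello_co'. The function form is the most flexible when the replacement needs extra logic.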
For example, if you want to only allow uppercase and lowercase letters as well as +, -, & and % in your input:
/^[a-zA-Z\+\-&%]+$/
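A quick sketch of the pattern in use. isValidInput is a hypothetical helper name. Inside a character class only the hyphen needs escaping here, so the pattern is written slightly shorter than above.

```javascript
// Same character set as above: letters plus +, -, & and %.
const allowedChars = /^[a-zA-Z+\-&%]+$/;

// Hypothetical helper: true when every character of value is allowed.
function isValidInput(value) {
  return allowedChars.test(value);
}

console.log(isValidInput("Rock&Roll")); // true
console.log(isValidInput("A+B-C%"));    // true
console.log(isValidInput("50%"));       // false, digits are not in the set
console.log(isValidInput(""));          // false, + requires at least one character
```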
Minimum eight characters, at least one lowercase letter, one uppercase letter and one number:
^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z]).{8,}$
Minimum eight characters, at least one lowercase letter, one uppercase letter, one number and one special character:
^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[!@#\$%\^&\*\(\)_\-\+=\[{\]}\\\|;:'",<\.>\/\?`~]).{8,}$
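The two rules can be tried out as follows. The passwordProblems helper is a hypothetical addition that splits the first rule into separate checks so that a caller can report which requirement failed. It is a sketch for single-line input, not a complete password policy.

```javascript
const basicRule = /^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z]).{8,}$/;
const strictRule = /^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[!@#\$%\^&\*\(\)_\-\+=\[{\]}\\\|;:'",<\.>\/\?`~]).{8,}$/;

console.log(basicRule.test("Passw0rd"));   // true
console.log(basicRule.test("password1"));  // false, no uppercase letter
console.log(strictRule.test("Passw0rd"));  // false, no special character
console.log(strictRule.test("Passw0rd!")); // true

// Hypothetical helper: one check per lookahead of the basic rule.
function passwordProblems(password) {
  const problems = [];
  if (password.length < 8) problems.push("shorter than 8 characters");
  if (!/[0-9]/.test(password)) problems.push("no digit");
  if (!/[a-z]/.test(password)) problems.push("no lowercase letter");
  if (!/[A-Z]/.test(password)) problems.push("no uppercase letter");
  return problems;
}

console.log(passwordProblems("password1"));
//> [ 'no uppercase letter' ]
```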
Escape every regex special character in the URL so that it is matched literally:
url.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
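The escaped string can then be passed to the RegExp constructor. This is a minimal sketch: escapeRegExp is a hypothetical helper name, the sample URL is made up, and anchoring with ^ and $ is one possible choice (drop the anchors to search inside a longer text).

```javascript
// Prefix every regex special character with a backslash.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const url = "https://neap.co/search?q=a+b";
const escaped = escapeRegExp(url);
console.log(escaped);
//> https://neap\.co/search\?q=a\+b

// Build a regex that matches exactly this URL and nothing else.
const urlRegex = new RegExp(`^${escaped}$`);

console.log(urlRegex.test(url));                            // true
console.log(urlRegex.test("https://neapXco/search?q=a+b")); // false, the dot is literal now
```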
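Use the offset argument that replace passes to the replacer function (the position of the match in the string). The next example skips the match at position 0, i.e. the first character: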
'HelloWorld'.replace(/[A-Z]/g, (l,idx) => idx ? ` ${l.toLowerCase()}` : l) // 'Hello world'
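The offset is a position in the string, not a match counter. To skip the nth match rather than the match at a given position, one option is to count matches in the callback. replaceExceptNth is a hypothetical helper, sketched under the assumption that the pattern carries the g flag.

```javascript
// Hypothetical helper: replace every match of pattern except the nth one (1-based).
function replaceExceptNth(text, pattern, n, replacement) {
  let count = 0;
  return text.replace(pattern, (match) => {
    count += 1;
    // Return the match unchanged for the nth occurrence.
    return count === n ? match : replacement;
  });
}

console.log(replaceExceptNth("a-b-c-d", /-/g, 2, "+"));
//> 'a+b-c+d'
```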