Regular expressions often appear daunting with their Perl-esque syntax. Countless tutorials online will get you started, but few highlight the traps most of us fall into.
Note: The examples use JavaScript syntax.
We'll be using this text:
<p>Hello <STRONG>世界</STRONG>!</p>
1. Greedy by default
Quantifiers such as * and + match as much as they can. Appending ? to a quantifier tells it not to be greedy:
/(<.*>)/ // Greedy: Returns the whole string, up to the last >
/(<.*?>)/ // Non-greedy: Returns <p>, stopping at the first >
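To see the difference for yourself, here is a minimal sketch using String.prototype.match on the sample text. The variable names (sample, greedy, lazy) are only illustrative.

```javascript
// Sample text from the article.
const sample = "<p>Hello <STRONG>世界</STRONG>!</p>";

// Greedy: .* grabs as much as it can, then backtracks to the last ">".
const greedy = sample.match(/(<.*>)/);
console.log(greedy[1]); // <p>Hello <STRONG>世界</STRONG>!</p>

// Non-greedy: .*? stops at the first ">" that lets the pattern succeed.
const lazy = sample.match(/(<.*?>)/);
console.log(lazy[1]); // <p>
```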
2. Case sensitive by default
/(<strong>)/ // Returns nothing
/(<strong>)/i // Using i flag: Returns <STRONG>
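As a quick check, the sketch below runs both patterns against the sample text. Note that match returns null when nothing matches, so guard before reading the capture group. The variable names are illustrative.

```javascript
const html = "<p>Hello <STRONG>世界</STRONG>!</p>";

// Without the i flag, lowercase "strong" does not match "STRONG".
const caseSensitive = html.match(/(<strong>)/);
console.log(caseSensitive); // null

// With the i flag, letter case is ignored.
const caseInsensitive = html.match(/(<strong>)/i);
console.log(caseInsensitive[1]); // <STRONG>
```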
3. Matching Unicode characters
This will return nothing, because \w only matches ASCII letters, digits and the underscore:
/<strong>(\w+)<\/strong>/i
One trick is to use \S, which matches anything but whitespace characters:
/<strong>(\S+)<\/strong>/i // Returns 世界
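Here is a small sketch of the \w versus \S behaviour. It also shows the limit of the trick: because \S excludes whitespace, it stops working as soon as the tag content contains a space. The second input string is a made-up example, not part of the original text.

```javascript
const markup = "<p>Hello <STRONG>世界</STRONG>!</p>";

// \w is ASCII-only in JavaScript, so it cannot match 世界.
const withWord = markup.match(/<strong>(\w+)<\/strong>/i);
console.log(withWord); // null

// \S matches any non-whitespace character, including CJK characters.
const withNonSpace = markup.match(/<strong>(\S+)<\/strong>/i);
console.log(withNonSpace[1]); // 世界

// Caveat: content with a space defeats \S+ (hypothetical input).
const withSpaces = "<strong>你好 世界</strong>".match(/<strong>(\S+)<\/strong>/i);
console.log(withSpaces); // null
```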
Note: Unicode support is coming to regular expressions with ES6, through the u flag.
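For the curious, here is a rough sketch of what the u flag changes, assuming an engine that supports it. The last pattern uses the \p{L} Unicode property escape, which is not part of ES6 itself: it arrived later, in ES2018, and only works together with the u flag. The variable names are illustrative.

```javascript
// U+1F600 is stored as two UTF-16 code units (a surrogate pair).
const face = "\u{1F600}";

// Without u, "." matches a single code unit, so the pair does not fit.
const withoutU = /^.$/.test(face);
console.log(withoutU); // false

// With u, "." matches a whole code point.
const withU = /^.$/u.test(face);
console.log(withU); // true

// ES2018 and later: \p{L} matches any Unicode letter (requires the u flag).
const letters = "<p>Hello <STRONG>世界</STRONG>!</p>".match(/<strong>(\p{L}+)<\/strong>/iu);
console.log(letters[1]); // 世界
```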