This tutorial looks at a regex (regular expression) created to find an email in a text.
This regex finds emails, but what is an email, actually? In this case, an email is defined as a run of characters before an @ symbol, followed by a domain after the @ symbol (the domain ending in .com, .org, etc.). The regex looks like this
It is made up of 3 groups, which we will go over in the grouping and capturing segment.
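To have something to experiment with, here is a rough Python sketch of the pattern, pieced together from the parts described in this tutorial. The first two bracket expressions and the {2,6} quantifier come from the text. The third bracket expression ([a-z.]) and the literal dot in front of it are assumptions made for illustration, and the helper name `is_email` is made up.

```python
import re

# Pieced together from the parts this tutorial describes.
# ASSUMPTION: the third bracket expression ([a-z.]) and the escaped dot
# before it are not spelled out in the text; they are a guess.
EMAIL_PATTERN = re.compile(r"^([a-z0-9_.-]+)@([\da-z.-]+)\.([a-z.]{2,6})$")


def is_email(text: str) -> bool:
    """Return True when the whole string matches the pattern (hypothetical helper)."""
    return EMAIL_PATTERN.match(text) is not None


print(is_email("ruben@example.com"))  # True
print(is_email("not an email"))       # False
```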
- Anchors
- Quantifiers
- OR Operator
- Character Classes
- Flags
- Grouping and Capturing
- Bracket Expressions
- Greedy and Lazy Match
- Boundaries
- Back-references
- Look-ahead and Look-behind
The anchors in this regex are the "^" at the beginning and the "$" at the end. All these do is assert the position of the search at the beginning of the string (the caret) and at the end of the string (the dollar sign), so if the string contains anything besides the email, the whole search fails to match.
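The effect of the anchors is easiest to see side by side. This is a minimal sketch using a simplified, made-up pattern (letters only) rather than the tutorial's full regex, once with ^ and $ and once without.

```python
import re

# Hypothetical simplified patterns, for illustration only.
anchored = re.compile(r"^[a-z]+@[a-z]+\.[a-z]{2,6}$")
unanchored = re.compile(r"[a-z]+@[a-z]+\.[a-z]{2,6}")

text = "contact me at ruben@example.com today"

# With anchors, the entire string has to be the email, so this finds nothing.
anchored_hit = anchored.search(text)
# Without anchors, the email can be found in the middle of the string.
unanchored_hit = unanchored.search(text)

print(anchored_hit)            # None
print(unanchored_hit.group())  # ruben@example.com
```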
The quantifiers at play here are the + signs and the curly brackets ({2,6}) in group 3. The + in groups 1 and 2 tells the search to match one or more consecutive characters from the bracket expression right before it,
the first group's being [a-z0-9_.-] and the second's being [\da-z.-].
The third group has curly brackets instead, telling the search to match between 2 and 6 characters from its bracket expression. No more, no less.
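Here is a small sketch of both quantifiers on their own. The first bracket expression is taken from the text. The one paired with {2,6} is an assumption, since the tutorial does not spell out the third group's brackets.

```python
import re

one_or_more = re.compile(r"^[a-z0-9_.-]+$")
# ASSUMPTION: [a-z.] stands in for the third group's bracket expression.
two_to_six = re.compile(r"^[a-z.]{2,6}$")

# "+" needs at least one character, with no upper limit.
plus_results = {s: bool(one_or_more.match(s)) for s in ["", "a", "ruben_schulz-01"]}
# "{2,6}" needs at least 2 and at most 6 characters.
range_results = {s: bool(two_to_six.match(s)) for s in ["c", "com", "museum", "company"]}

print(plus_results)   # {'': False, 'a': True, 'ruben_schulz-01': True}
print(range_results)  # {'c': False, 'com': True, 'museum': True, 'company': False}
```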
The OR operator in regex is technically the pipe (|), and this regex doesn't use it. The square brackets do a similar job for single characters, though: everything smushed together in between them is a choice the search can pick from. For instance, in group one the choices are a-z, 0-9, _, ., and -, so it looks for a consecutive string containing any mix of those characters.
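To show how square brackets relate to an actual OR, here is a short sketch with made-up patterns. For single characters the two forms behave the same, but only the pipe can choose between whole words.

```python
import re

# Hypothetical patterns, for illustration only.
bracket = re.compile(r"^[abc]$")          # any ONE of a, b, c
alternation = re.compile(r"^(?:a|b|c)$")  # the same choice written with |

same_answers = all(
    bool(bracket.match(ch)) == bool(alternation.match(ch)) for ch in "abcd"
)
print(same_answers)  # True

# The pipe can also pick between multi-character options, which brackets cannot.
word_choice = re.compile(r"^(?:com|org)$")
print(bool(word_choice.match("org")))  # True
print(bool(word_choice.match("cog")))  # False
```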
In group 2, instead of spelling out 0-9, the regex uses the character class "\d", which is a class for digits, aka 0-9.
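As a quick check that \d and 0-9 do the same job here, this sketch runs group 2's bracket expression both ways over a few made-up sample strings.

```python
import re

with_class = re.compile(r"^[\da-z.-]+$")    # group 2 as written, using \d
spelled_out = re.compile(r"^[0-9a-z.-]+$")  # the same thing with 0-9 spelled out

# Made-up samples. Note: in Python 3, \d on str patterns also accepts
# non-ASCII digits unless re.ASCII is passed; for plain ASCII text the two agree.
samples = ["example", "mail-2.example", "web_01", ""]
results = [(s, bool(with_class.match(s)), bool(spelled_out.match(s))) for s in samples]

for sample, a, b in results:
    print(repr(sample), a, b)
```

Both columns should agree on every row: the first two samples match, while "web_01" fails because the underscore is not in group 2's brackets.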
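Flags are things like putting "i" at the very end of the regex (after its closing delimiter) to make the whole search case insensitive, but as far as I'm aware there are none in this regex, so no need to worry about them! It does mean the search only matches lowercase letters.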
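Here is a sketch of what a case-insensitivity flag would change. Python passes flags as an argument (re.IGNORECASE) instead of writing "i" after the pattern. The pattern text is pieced together from this tutorial, with the third bracket expression being an assumption.

```python
import re

# ASSUMPTION: the third bracket expression ([a-z.]) is a guess for illustration.
pattern_text = r"^([a-z0-9_.-]+)@([\da-z.-]+)\.([a-z.]{2,6})$"

no_flag = re.compile(pattern_text)
with_flag = re.compile(pattern_text, re.IGNORECASE)

shouty = "Ruben@Example.COM"
print(no_flag.match(shouty))          # None: uppercase letters are not in the brackets
print(bool(with_flag.match(shouty)))  # True
```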
Grouping is done with simple parentheses. It is used to extract info from the search. There are 3 groups here: one for the name, one for the domain name, and one for the domain specifier (.com).
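To see the extraction in action, this sketch pulls the three captured pieces out of a match. The pattern is pieced together from this tutorial, and its third bracket expression is an assumption.

```python
import re

# ASSUMPTION: the third bracket expression ([a-z.]) is a guess for illustration.
EMAIL_PATTERN = re.compile(r"^([a-z0-9_.-]+)@([\da-z.-]+)\.([a-z.]{2,6})$")

match = EMAIL_PATTERN.match("ruben@example.com")

# groups() returns one string per pair of parentheses, in order.
name, domain, tld = match.groups()
print(name)    # ruben
print(domain)  # example
print(tld)     # com
```

In this sketch the dot sits outside the third pair of parentheses, so the last group holds "com" without the dot.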
Bracket expressions by themselves (which are just anything inside square brackets) match a single character from whatever you specified inside the brackets, but they are very powerful when mixed with a quantifier, like we did here. Every group here has a bracket expression, and they're all extended by a quantifier at the end.
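Greedy and lazy describe how much a quantifier grabs: greedy takes as many characters as it can, and lazy takes as few as it can. All three groups here are greedy. Groups 1 and 2 have the + quantifier, which can theoretically match unlimited characters, and group 3's {2,6} grabs up to 6 characters before settling for fewer. A quantifier only becomes lazy when you put a ? right after it (like +? or {2,6}?), and this regex doesn't do that.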
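The difference shows up when the same quantifier is run with and without a trailing ?. This is a minimal sketch on a made-up string of letters, using simplified patterns instead of the tutorial's full regex.

```python
import re

letters = "abcdefgh"

# Greedy quantifiers take as much as they are allowed to.
greedy_plus = re.match(r"[a-z]+", letters).group()
greedy_range = re.match(r"[a-z]{2,6}", letters).group()

# Adding "?" makes them lazy: they take as little as they are allowed to.
lazy_plus = re.match(r"[a-z]+?", letters).group()
lazy_range = re.match(r"[a-z]{2,6}?", letters).group()

print(greedy_plus)   # abcdefgh
print(greedy_range)  # abcdef
print(lazy_plus)     # a
print(lazy_range)    # ab
```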
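Boundaries are \b and \B (the inverse). \b matches the position between a word character and a non-word character, so you put it at the beginning and end of a word to make sure the search matches the whole word and not part of a bigger one. We don't have any of those in this regex.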
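Even though this regex skips them, boundaries are quick to try out. This sketch uses a made-up sentence and records where each pattern finds "cat".

```python
import re

text = "cat concat cats"

# \b on both sides: "cat" must stand alone as a whole word.
whole_word = [m.start() for m in re.finditer(r"\bcat\b", text)]
# \B in front: "cat" must be glued to a word character before it.
word_ending = [m.start() for m in re.finditer(r"\Bcat\b", text)]

print(whole_word)   # [0]  (the standalone "cat")
print(word_ending)  # [7]  (the "cat" at the end of "concat")
```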
Back-references are things like \1 placed after a group, which match the same text that was matched by whatever number group you put. It's complicated and we ain't using it!!
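For the curious, here is a small sketch of a back-reference, using a made-up pattern that spots a word typed twice in a row. It is unrelated to the email regex.

```python
import re

# Hypothetical pattern: a word, a space, then the exact same word again (\1).
doubled_word = re.compile(r"\b([a-z]+) \1\b")

hit = doubled_word.search("this is is a test")
print(hit.group())   # is is
print(hit.group(1))  # is

print(doubled_word.search("this is a test"))  # None
```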
Look-ahead and look-behind only match a search if it comes before or after another search, without the look-ahead/behind being a part of the match itself. Another complicated feature we ain't using here, buddy!!!
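As a quick taste, this sketch borrows the first two bracket expressions from the tutorial and uses a look-ahead and a look-behind to grab what sits on either side of the @ without including the @ itself. These patterns are made up for illustration and are not part of the email regex.

```python
import re

email = "ruben@example.com"

# Look-ahead (?=@): match characters only if an @ comes right after them.
before_at = re.findall(r"[a-z0-9_.-]+(?=@)", email)
# Look-behind (?<=@): match characters only if an @ comes right before them.
after_at = re.findall(r"(?<=@)[\da-z.-]+", email)

print(before_at)  # ['ruben']
print(after_at)   # ['example.com']
```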
My name is Ruben, some people call me Connor. I'm a full stack developer and musician from Boston. Check out some stuff I have made here --> https://github.com/RubeySchulz