@RubeySchulz
Last active July 11, 2022 13:50
Email Regex Tutorial

This tutorial walks through a regex (regular expression) written to match an email address.

Summary

This regex matches emails, but what is an email, actually? In this case, an email is defined as a run of characters before an @ symbol, followed by a domain after the @ symbol (the domain ending in something like .com, .org, etc). The regex looks like this:

/^([a-z0-9_.-]+)@([\da-z.-]+)\.([a-z.]{2,6})$/

It's made up of 3 groups, which we will go over in the Grouping and Capturing section.
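
To get a feel for the regex before breaking it down, here is a minimal sketch that runs it against a few strings. The sample strings are made up for illustration, and the dot before the last group is written as `\.` so that it only matches a literal period.

```javascript
// The tutorial's regex, with the dot before group 3 escaped as \.
const emailRegex = /^([a-z0-9_.-]+)@([\da-z.-]+)\.([a-z.]{2,6})$/;

// Hypothetical sample inputs, chosen only for illustration.
const samples = ["jane.doe@example.com", "no-at-symbol.com", "bob@site"];

// test() returns true when the whole string fits the pattern.
const results = samples.map((s) => emailRegex.test(s));

console.log(results); // [ true, false, false ]
```

The second string fails because it has no @, and the third fails because nothing like .com follows the domain name.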

Regex Components

Anchors

The anchors in this regex are the "^" at the beginning and the "$" at the end. The "^" (the caret) asserts that the match starts at the beginning of the string, and the "$" (the dollar sign) asserts that it stops at the end of the string. Together they mean the whole string has to be an email: if there is anything else before or after it, the regex won't match.
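
A small sketch of what the anchors change, assuming a made-up sentence with an email in the middle of it. The `unanchored` variant is not part of the tutorial's regex; it is only there for comparison.

```javascript
// Same pattern twice: once with ^ and $, once without (for comparison only).
const anchored = /^([a-z0-9_.-]+)@([\da-z.-]+)\.([a-z.]{2,6})$/;
const unanchored = /([a-z0-9_.-]+)@([\da-z.-]+)\.([a-z.]{2,6})/;

// Hypothetical input: an email surrounded by other text.
const text = "contact: jane@example.com today";

const anchoredResult = anchored.test(text);   // false, extra text around the email
const unanchoredResult = unanchored.test(text); // true, the email is found inside the text
const found = text.match(unanchored)[0];      // "jane@example.com"

console.log(anchoredResult, unanchoredResult, found);
```

So with the anchors in place, the regex works as a check on a whole string rather than as a search through a longer text.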

Quantifiers

The quantifiers at play here are the + signs in groups 1 and 2 and the curly brackets ({2,6}) in group 3. A quantifier applies to whatever comes right before it, which in this regex is always a bracket expression. The + in groups 1 and 2 tells the search to match one or more consecutive characters from that bracket expression.

The first bracket expression is [a-z0-9_.-] and the second is [\da-z.-].

The third group has curly brackets instead, telling the search to match between 2 and 6 characters from its bracket expression, [a-z.]. No more, no less.
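
Here is a short sketch of both quantifiers on their own, using simplified patterns and made-up inputs rather than the full email regex.

```javascript
// + means "one or more" of the preceding bracket expression.
const oneOrMore = /^[a-z]+$/;
const plusResults = ["", "a", "abc"].map((s) => oneOrMore.test(s));

// {2,6} means "at least 2 and at most 6" of the preceding bracket expression.
const twoToSix = /^[a-z.]{2,6}$/;
const rangeResults = ["c", "com", "museum", "digital"].map((s) => twoToSix.test(s));

console.log(plusResults);  // [ false, true, true ]
console.log(rangeResults); // [ false, true, true, false ]
```

"digital" fails because it has 7 letters, one more than {2,6} allows.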

OR Operator

There is no actual OR operator (the | symbol) in this regex, but the square brackets do a similar job. Everything listed between the brackets is a choice, and any one of those choices can match a single character. For instance, in group one the choices are a-z, 0-9, _, ., and -. With the + after it, the search looks for a consecutive string containing any mix of those characters.
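
To show how the brackets behave like an OR, this sketch compares group one's bracket expression with the same choices spelled out using the | operator. The `alternation` pattern is an illustration, not something taken from the tutorial's regex.

```javascript
// One character from the bracket expression used in group one.
const bracket = /^[a-z0-9_.-]$/;

// The same choices written out with | (illustration only).
const alternation = /^(?:[a-z]|[0-9]|_|\.|-)$/;

// Hypothetical single-character inputs.
const chars = ["a", "7", "_", ".", "-", "@", "A"];

const bracketResults = chars.map((c) => bracket.test(c));
const alternationResults = chars.map((c) => alternation.test(c));

console.log(bracketResults);     // [ true, true, true, true, true, false, false ]
console.log(alternationResults); // same as above
```

Both patterns agree on every input, and the bracket version is a lot shorter to write.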

Character Classes

In group 2, instead of spelling out 0-9, the regex uses the character class "\d", which is shorthand for any digit, aka 0-9.
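
A quick sketch showing that \d and 0-9 are interchangeable inside group 2's bracket expression. The inputs are made up for illustration.

```javascript
// Group 2's bracket expression, written with the \d shorthand and with 0-9.
const withShorthand = /^[\da-z.-]+$/;
const withRange = /^[0-9a-z.-]+$/;

// Hypothetical domain-like inputs.
const inputs = ["mail-2.example", "Mail", "123"];

const shorthandResults = inputs.map((s) => withShorthand.test(s));
const rangeResults = inputs.map((s) => withRange.test(s));

console.log(shorthandResults); // [ true, false, true ]
console.log(rangeResults);     // [ true, false, true ]
```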

Flags

Flags go after the closing slash of the regex, like an "i" for case insensitivity. As far as I'm aware there are none in this regex, so no need to worry about them! Just keep in mind that without the "i" flag, uppercase letters won't match.
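
As a sketch of what a flag would change, here is the same regex with and without the "i" flag. The flagged version is an assumption for illustration and is not part of the tutorial's regex.

```javascript
// The regex as written in this tutorial (no flags).
const caseSensitive = /^([a-z0-9_.-]+)@([\da-z.-]+)\.([a-z.]{2,6})$/;

// The same regex with the i flag added after the closing slash (illustration only).
const caseInsensitive = /^([a-z0-9_.-]+)@([\da-z.-]+)\.([a-z.]{2,6})$/i;

// Hypothetical input with uppercase letters.
const email = "Jane@Example.COM";

const withoutFlag = caseSensitive.test(email); // false
const withFlag = caseInsensitive.test(email);  // true

console.log(withoutFlag, withFlag);
```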

Grouping and Capturing

Grouping is done with simple parentheses. Each group captures the part of the text it matched, so you can extract info from the search. There are 3 groups here: one for the name, one for the domain name, and one for the domain specifier (com, org, etc).
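
Here is a minimal sketch of pulling the three groups out of a match with `String.prototype.match`. The sample email and the variable names are made up for illustration.

```javascript
const emailRegex = /^([a-z0-9_.-]+)@([\da-z.-]+)\.([a-z.]{2,6})$/;

// Hypothetical input.
const match = "jane.doe@example.com".match(emailRegex);

// Index 0 is the whole match; indexes 1 to 3 are the captured groups.
const [whole, name, domain, specifier] = match;

console.log(name);      // "jane.doe"
console.log(domain);    // "example"
console.log(specifier); // "com"
```

If the string is not an email, `match` returns null instead of an array, so real code would want to check for that before destructuring.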

Bracket Expressions

Bracket expressions (which are just anything inside square brackets) match a single character out of whatever you specified inside the brackets. By themselves that isn't much, but they are very powerful when mixed with a quantifier like we did here. Every group here has a bracket expression, and they're all extended by a quantifier at the end.

Greedy and Lazy Match

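Greedy and lazy describe how much a quantifier tries to match: greedy matches as much as it can, and lazy matches as little as it can. Quantifiers are greedy by default, and you make one lazy by putting a ? after it (like +? or {2,6}?). All three groups here are greedy. The + in groups 1 and 2 can theoretically match infinite characters, and the {2,6} in group 3 grabs up to 6 characters whenever it can, only settling for fewer if the rest of the match needs it.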
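
The difference is easiest to see on a simple string. This sketch uses a made-up run of eight "a" characters, with each quantifier in its greedy and lazy form.

```javascript
// Hypothetical input: eight "a" characters in a row.
const text = "aaaaaaaa";

const greedyPlus = text.match(/a+/)[0];        // takes all 8
const lazyPlus = text.match(/a+?/)[0];         // takes just 1
const greedyRange = text.match(/a{2,6}/)[0];   // takes the maximum, 6
const lazyRange = text.match(/a{2,6}?/)[0];    // takes the minimum, 2

console.log(greedyPlus.length, lazyPlus.length);   // 8 1
console.log(greedyRange.length, lazyRange.length); // 6 2
```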

Boundaries

Boundaries are \b and \B (the inverse). A \b matches the position between a word character and a non-word character, so putting one at the beginning and end of a literal text makes it match only as a whole word. We don't have any of those in this regex.
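
Since the email regex doesn't use boundaries, here is a small sketch with a made-up word ("cat") to show what they would do.

```javascript
// \b on both sides: "cat" only as a whole word.
const wholeWord = /\bcat\b/;

// \B on both sides: "cat" only when it sits inside a longer word.
const insideWord = /\Bcat\B/;

// Hypothetical inputs.
const sentence = "a cat sat";
const longWord = "concatenate";

const wholeWordResults = [wholeWord.test(sentence), wholeWord.test(longWord)];
const insideWordResults = [insideWord.test(sentence), insideWord.test(longWord)];

console.log(wholeWordResults);  // [ true, false ]
console.log(insideWordResults); // [ false, true ]
```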

Back-references

These are things like \1, which go after a group and match the same text that was matched by whatever number group you put. It's complicated and we ain't using it!!
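
For the curious, here is a minimal sketch of a back-reference in action. The pattern for spotting a doubled word is made up for illustration and has nothing to do with the email regex.

```javascript
// ([a-z]+) captures a word, and \1 requires that exact same word again after a space.
const doubledWord = /\b([a-z]+) \1\b/;

// Hypothetical inputs.
const hasDouble = doubledWord.test("the the cat"); // true
const noDouble = doubledWord.test("the cat");      // false

// Group 1 holds the word that was repeated.
const repeated = "it is is fine".match(doubledWord)[1]; // "is"

console.log(hasDouble, noDouble, repeated);
```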

Look-ahead and Look-behind

These only match a search if it comes right before (look-ahead) or right after (look-behind) another search, without the look-ahead/behind being a part of the match itself. Another complicated search we ain't using here buddy!!!
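
Here is a rough sketch of both, reusing bracket expressions from the email regex. The two patterns are assumptions made for illustration, and the look-behind syntax needs a JavaScript engine that supports ES2018.

```javascript
// Hypothetical input.
const email = "jane@example.com";

// Look-ahead: match the name only if an @ comes right after it.
const beforeAt = /[a-z0-9_.-]+(?=@)/;
const namePart = email.match(beforeAt)[0]; // "jane"

// Look-behind: match the domain only if an @ comes right before it.
const afterAt = /(?<=@)[\da-z.-]+/;
const domainPart = email.match(afterAt)[0]; // "example.com"

console.log(namePart, domainPart);
```

Notice the @ itself does not show up in either result, since the look-ahead and look-behind only check for it.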

Author

My name is Ruben, some people call me Connor. I'm a full stack developer and musician from Boston. Check out some stuff I have made here --> https://github.com/RubeySchulz
