This tutorial demonstrates how to use regular expressions (regex) to validate an email address. To follow along and test your expressions, go to Regex101.
- Intro
- Quantifiers
- Anchors
- OR Operator
- Character Classes
- Flags
- Grouping and Capturing
- Bracket Expressions
- Character Escapes
A regular expression is a string of characters that helps you match a search pattern in text.
Now let's break down a regex for the case of an email address. Here's an example email: [email protected]
In a valid email we expect a username followed by an `@` character, followed by the email provider, followed by a dot `.`, and the top-level domain (TLD), for example `.com` or `.org`.
Here is an example of an email regex:
/^([a-z0-9A-Z\d\.-_]+)@([a-z\d-]+)\.([a-z]{2,6})?$/
All characters between the forward slashes make up the expression. The forward slashes mark the beginning and end of the expression; this is how your code editor knows you're creating a regex. In many code editors, syntax highlighting changes the color of the forward slashes (for example, to red) as soon as you type them.
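As a minimal sketch of the delimiter syntax, here is how the expression could be written as a JavaScript regex literal. JavaScript is an assumption here, and the variable names and the sample address are made up for illustration.

```javascript
// The forward slashes delimit a regex literal; they are not part of the pattern.
const emailPattern = /^([a-z0-9A-Z\d\.-_]+)@([a-z\d-]+)\.([a-z]{2,6})?$/;

// The literal produces a RegExp object.
const isRegExp = emailPattern instanceof RegExp;

// test() reports whether the pattern matches a string (hypothetical sample address).
const matchesSample = emailPattern.test("jane.doe@example.com");

console.log(isRegExp, matchesSample); // true true
```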
Next up, let's discuss anchors. Anchors do not match any character; they match a position before, after, or between characters, so they require the match to occur at a certain position within the test string.
The anchors in this expression are the caret, `^`, and the dollar sign, `$`. The caret requires the group immediately following it, `([a-z0-9A-Z\d\.-_]+)`, to start at the beginning of the test string; if the text that would match this group appears anywhere else, it will not be a match. Likewise, the dollar sign requires the match to finish at the end of the test string.
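To see what the anchors contribute, the sketch below compares the expression with and without `^` and `$`. It assumes JavaScript; the variable names and the input sentence are hypothetical.

```javascript
// Same pattern with and without the ^ and $ anchors.
const anchored = /^([a-z0-9A-Z\d\.-_]+)@([a-z\d-]+)\.([a-z]{2,6})?$/;
const unanchored = /([a-z0-9A-Z\d\.-_]+)@([a-z\d-]+)\.([a-z]{2,6})?/;

// Hypothetical input: an address surrounded by other text.
const sentence = "write to jane@example.com today";

// With anchors, the whole string must be an email, so this fails.
const anchoredResult = anchored.test(sentence);
// Without anchors, the pattern may match anywhere inside the string.
const unanchoredResult = unanchored.test(sentence);

console.log(anchoredResult);   // false
console.log(unanchoredResult); // true
```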
Also note the sections of the expression. There are three sections, or groups, each containing regex characters. The groups are delimited by parentheses `( )`: group one is `([a-z0-9A-Z\d\.-_]+)`, group two is `([a-z\d-]+)`, and group three is `([a-z]{2,6})`.
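Because each group is wrapped in parentheses, the text it matches is also captured and can be read back. A small sketch, assuming JavaScript and a hypothetical sample address:

```javascript
const emailPattern = /^([a-z0-9A-Z\d\.-_]+)@([a-z\d-]+)\.([a-z]{2,6})?$/;

// match() returns an array: index 0 is the whole match, 1..3 are the three groups.
const result = "jane.doe@example.com".match(emailPattern);
const [whole, username, provider, tld] = result;

console.log(username); // jane.doe
console.log(provider); // example
console.log(tld);      // com
```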
Inside each group is a bracket expression `[ ]`. Everything inside the brackets is the set of characters allowed in the test string at that position. Let's take group one for example, `([a-z0-9A-Z\d\.-_]+)`. This bracket accepts all lowercase letters a-z, all uppercase letters A-Z, all digits 0-9, and special characters such as the dot `.` and underscore `_`. Note that because the hyphen sits between `\.` and `_`, it is read as a range running from `.` to `_` rather than as a literal hyphen, so a hyphen itself is not accepted here, while other characters in that range, such as `/` and `@`, are. To allow a literal hyphen, place it last in the bracket, as group two does.
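The bracket from group one can be tried on its own to see exactly which characters it admits. The sketch below assumes JavaScript; the sample strings are made up, and `literalHyphenClass` is a hypothetical alternative, not the pattern used in this tutorial.

```javascript
// Group one's bracket on its own, anchored so the whole string must fit.
const groupOneClass = /^[a-z0-9A-Z\d\.-_]+$/;

const dotOk = groupOneClass.test("jane.doe");        // true
const underscoreOk = groupOneClass.test("jane_doe"); // true
// `\.-_` is read as a range from "." to "_", which does not contain "-".
const hyphenOk = groupOneClass.test("jane-doe");     // false
// "/" lies inside that range, so it is accepted.
const slashOk = groupOneClass.test("jane/doe");      // true

// Hypothetical alternative: a hyphen placed last in the bracket is literal.
const literalHyphenClass = /^[a-zA-Z\d._-]+$/;
const hyphenOkAlt = literalHyphenClass.test("jane-doe"); // true

console.log(dotOk, underscoreOk, hyphenOk, slashOk, hyphenOkAlt);
```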
A character class in a regex defines a set of characters. One character class used in this regex is `\d`, which matches any single digit from 0-9.
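A quick way to check what `\d` accepts, sketched in JavaScript (an assumption) with illustrative variable names. The anchors are added so that the whole input must be exactly one digit.

```javascript
// \d matches one digit; ^ and $ force the input to be exactly one character.
const singleDigit = /^\d$/;

const digitResult = singleDigit.test("7");      // true
const letterResult = singleDigit.test("a");     // false: not a digit
const twoDigitsResult = singleDigit.test("42"); // false: two characters, \d matches only one

console.log(digitResult, letterResult, twoDigitsResult); // true false false
```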
An escape character, such as the backslash, precedes a reserved character that would otherwise be interpreted as a special character, so that it is matched literally. Note the backslash dot `\.` after the second group, `([a-z\d-]+)`. An unescaped dot is a wildcard that matches any single character (except a line break, by default); the backslash makes it match a literal dot instead.
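The difference between an escaped and an unescaped dot can be seen with two tiny patterns. This is a sketch assuming JavaScript; the patterns and inputs are invented for illustration.

```javascript
// Unescaped dot: wildcard for any single character.
const wildcardDot = /^a.c$/;
// Escaped dot: matches only a literal ".".
const literalDot = /^a\.c$/;

const wildcardOnLetter = wildcardDot.test("abc"); // true
const wildcardOnDot = wildcardDot.test("a.c");    // true
const literalOnLetter = literalDot.test("abc");   // false
const literalOnDot = literalDot.test("a.c");      // true

console.log(wildcardOnLetter, wildcardOnDot, literalOnLetter, literalOnDot);
```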
Quantifiers in regex specify how many characters are expected. The quantifiers discussed here are `{2,6}` and `+`.
The `+` quantifier means that the characters within the brackets must match one or more times.
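To illustrate "one or more", the sketch below applies `+` to the bracket from group two. JavaScript is assumed, and the inputs are hypothetical.

```javascript
// Group two's bracket followed by +, anchored to cover the whole input.
const oneOrMore = /^[a-z\d-]+$/;

const manyChars = oneOrMore.test("example"); // true: several allowed characters
const oneChar = oneOrMore.test("e");         // true: one character is enough
const noChars = oneOrMore.test("");          // false: + needs at least one character

console.log(manyChars, oneChar, noChars); // true true false
```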
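The quantifier in `([a-z]{2,6})` requires the group at the end to be between 2 and 6 characters long. The `?` that follows the group is also a quantifier: it means zero or one occurrence, so the group as a whole is optional.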
Examples of accepted top-level domains (the leading dot is matched by `\.`, and the letters by `([a-z]{2,6})`):
- .com
- .eu
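The length limits of `{2,6}` can be checked in isolation. A small sketch, assuming JavaScript, with made-up inputs:

```javascript
// Group three's bracket and quantifier, anchored to cover the whole input.
const tldPattern = /^[a-z]{2,6}$/;

const twoLetters = tldPattern.test("eu");        // true: minimum length
const threeLetters = tldPattern.test("com");     // true
const oneLetter = tldPattern.test("c");          // false: shorter than 2
const sevenLetters = tldPattern.test("website"); // false: longer than 6

console.log(twoLetters, threeLetters, oneLetter, sevenLetters);
```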
For example, all of these emails will be accepted because they match our regex pattern.
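One way to try the pattern against several addresses at once is sketched below. JavaScript is assumed, and every address is a hypothetical sample chosen for illustration; a few rejected inputs are included for contrast.

```javascript
const emailPattern = /^([a-z0-9A-Z\d\.-_]+)@([a-z\d-]+)\.([a-z]{2,6})?$/;

// Hypothetical addresses that fit the pattern.
const accepted = [
  "jane.doe@example.com",
  "jane_doe99@mail-server.org",
  "JD@example.eu",
];

// Hypothetical inputs that do not fit the pattern.
const rejected = [
  "jane.doe@example",     // no dot after the provider
  "jane doe@example.com", // space is not allowed in the username
  "jane@Example.com",     // provider must be lowercase (no case-insensitive flag)
];

const acceptedResults = accepted.map((address) => emailPattern.test(address));
const rejectedResults = rejected.map((address) => emailPattern.test(address));

console.log(acceptedResults); // [ true, true, true ]
console.log(rejectedResults); // [ false, false, false ]
```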
Follow me for more tutorials like this.
Contact Email
Github