This documentation explains the components of regular expressions (regex) and how they are used, with one practical purpose as the running example: validating a hexadecimal color code.
The following regex is used when validating a hexadecimal color code:
/^#?([0-9a-f]{6}|[0-9a-f]{3})$/i
The values are validated based on the following rules:
- Optionally begins with a hash (#)
- Must be 3 or 6 characters in length
- Uses the letters a-f and/or digits from 0-9
- Case insensitive
- All of the rules above must be satisfied to pass
The following are examples of hexadecimal color codes that will pass:
// Examples
#ffffff
#FFF
#fFfFfF
#000
#1f2f3F
#345678
The following are examples of hexadecimal color codes that will fail:
// Examples
#fffff => // 5 characters
#fF => // 2 characters
#1f2f3T => // uses invalid character "T"
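The pass and fail lists above can be checked with a short script. This is a minimal sketch in JavaScript, assuming a Node.js or browser runtime; the names HEX_COLOR and isHexColor are illustrative and not part of the original regex.

```javascript
// The regex from this document, stored once so it can be reused.
const HEX_COLOR = /^#?([0-9a-f]{6}|[0-9a-f]{3})$/i;

// Hypothetical helper: returns true when the input is a valid hex color code.
function isHexColor(input) {
  return HEX_COLOR.test(input);
}

const shouldPass = ["#ffffff", "#FFF", "#fFfFfF", "#000", "#1f2f3F", "#345678"];
const shouldFail = ["#fffff", "#fF", "#1f2f3T"];

for (const code of shouldPass) {
  console.log(code, isHexColor(code)); // true for every entry
}
for (const code of shouldFail) {
  console.log(code, isHexColor(code)); // false for every entry
}
```

Because the hash is optional, isHexColor("fff") also returns true.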
Here's a brief breakdown of the validation occurring in the regex /^#?([0-9a-f]{6}|[0-9a-f]{3})$/i:
- ( / ) - marks the opening of the regex
- ( ^ ) - asserts position at the start of the string
- ( # ) - matches the hash character #
- ( ? ) - makes the previous token optional, so it is matched zero or one time
- Capturing Group ([0-9a-f]{6}|[0-9a-f]{3})
  - Option 1 (uses 6-character hexadecimal value): [0-9a-f]{6}
    - 0-9 matches a single character in the range between 0 and 9
    - a-f matches a single character in the range between a and f (case insensitive)
    - {6} matches the previous token exactly 6 times
  - Option 2 (uses 3-character hexadecimal value): [0-9a-f]{3}
    - 0-9 matches a single character in the range between 0 and 9
    - a-f matches a single character in the range between a and f (case insensitive)
    - {3} matches the previous token exactly 3 times
- ( $ ) - asserts position at the end of the string
- ( / ) - marks the ending of the regex
- ( i ) - makes the entire regex case insensitive
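As a way to see how the pieces in the breakdown fit together, the sketch below assembles the same pattern from strings with the JavaScript RegExp constructor. The variable names are illustrative assumptions, and building the regex this way is only for demonstration; the literal form works just as well.

```javascript
// Each piece mirrors one entry in the breakdown.
const startAnchor = "^";          // start of the string
const optionalHash = "#?";        // a hash, matched zero or one time
const sixDigits = "[0-9a-f]{6}";  // option 1 inside the capturing group
const threeDigits = "[0-9a-f]{3}"; // option 2 inside the capturing group
const endAnchor = "$";            // end of the string

// With the constructor, the "i" flag is passed as a separate argument
// and the surrounding slashes are not written.
const assembled = new RegExp(
  startAnchor + optionalHash + "(" + sixDigits + "|" + threeDigits + ")" + endAnchor,
  "i"
);

console.log(assembled.source); // ^#?([0-9a-f]{6}|[0-9a-f]{3})$
console.log(assembled.flags);  // i
console.log(assembled.test("#1f2f3F")); // true
```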
- Anchors
- Quantifiers
- OR Operator
- Character Classes
- Flags
- Grouping and Capturing
- Greedy and Lazy Match
The regex example /^#?([0-9a-f]{6}|[0-9a-f]{3})$/i uses 2 anchors, ( ^ ) and ( $ ). The ( ^ ) symbol indicates the beginning of a string while the ( $ ) symbol indicates the end of a string. For reference the string is what we're looking to validate.
Anchors are unique in that they match a position within a string, not a character. Thus, in the example above, the two anchors mark the beginning and the end of the string we're looking to match rather than any specific character.
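To illustrate what the anchors contribute, the sketch below compares the regex with a copy that has both anchors removed. The unanchored variant and the sample input are assumptions made only for this comparison.

```javascript
const anchored = /^#?([0-9a-f]{6}|[0-9a-f]{3})$/i;
// Same pattern with both anchors removed (for comparison only).
const unanchored = /#?([0-9a-f]{6}|[0-9a-f]{3})/i;

const input = "color: #fff;";

// The anchored regex requires the whole string to be a color code.
console.log(anchored.test(input)); // false
console.log(anchored.test("#fff")); // true

// The unanchored regex is satisfied by a color code anywhere in the string.
console.log(unanchored.test(input)); // true
console.log(input.match(unanchored)[0]); // #fff
```

Without the anchors the pattern searches rather than validates, which is why both are needed here.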
Quantifiers indicate that the preceding token must be matched a certain number of times. By default, quantifiers are greedy, and will match as many characters as possible.
While there are several different quantifiers, the regex /^#?([0-9a-f]{6}|[0-9a-f]{3})$/i uses only 3 of them, and technically only 2 different quantifier types.
The ( ? ) is called an optional quantifier because the regex treats the token/character preceding the ( ? ) as optional: it matches that token zero or one time, so it returns matches with and without it. In the expression, the ( ? ) is preceded by the ( # ) token, meaning the regex will match with and without the ( # ) character attached.
Below is another good example of an optional quantifier:
// Example
colou?r => // accepts "color" or "colour"
The second type of quantifier used in the regex is written with curly braces and is expressed as {6} and {3}. This quantifier type matches the specified quantity of the previous token. For example, {1,3} will match 1 to 3, {3} will match exactly 3, and {3,} will match 3 or more.
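The quantifier forms described above can be tried out directly. In this sketch the anchors ( ^ ) and ( $ ) are added to each pattern so that the whole string is tested, and the letter "a" is an arbitrary token chosen for illustration.

```javascript
// Optional quantifier: "u" may appear zero or one time.
const colorWord = /^colou?r$/;
console.log(colorWord.test("color"));   // true
console.log(colorWord.test("colour"));  // true
console.log(colorWord.test("colouur")); // false

// Curly-brace quantifiers applied to the letter "a".
const exactlyThree = /^a{3}$/;
const oneToThree = /^a{1,3}$/;
const threeOrMore = /^a{3,}$/;

console.log(exactlyThree.test("aaa"));  // true
console.log(exactlyThree.test("aaaa")); // false
console.log(oneToThree.test("aa"));     // true
console.log(oneToThree.test("aaaa"));   // false
console.log(threeOrMore.test("aaaaa")); // true
console.log(threeOrMore.test("aa"));    // false
```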
Here's the first part of the regular expression:
[0-9a-f]{6}
Here we're saying, "validate if the string matches any of the following characters 0-9 or a-f and is 6 characters in length exactly ({6})".
The second part of the expression is very similar to the first:
[0-9a-f]{3}
This is almost the same as the first part, but here we're saying, "validate if the string matches any of the following characters 0-9 or a-f and is 3 characters in length exactly ({3})".
The regular expression /^#?([0-9a-f]{6}|[0-9a-f]{3})$/i is interesting because it has 2 different outcomes which would both validate the input given.
Luckily it can test for both cases in a single expression. The ( | ) character is called an alternation. It acts like a boolean OR, matching one sequence or another. It can operate within a group, or on a whole expression. The patterns will be tested in order, from left to right.
Hexadecimal color codes can be 6 characters in length (#ffffff) or 3 characters in length (#fff), without counting the hashtag. Both are valid hexadecimal color codes, so the regex needs to account for both cases. Also worth noting, this regex only accepts inputs of length 6 or 3, so a hexadecimal color code of #ffff would fail to validate.
In reference to the regex, the alternation character tells us that the engine will first try to match a hexadecimal code with a length of 6, then try a length of 3 if that fails.
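The left-to-right order of an alternation is easiest to see once the anchors are taken away. The sketch below is an illustration under that assumption; the reordered patterns are hypothetical variants and not part of the original regex.

```javascript
// Anchors removed on purpose so the effect of ordering is visible.
const sixFirst = /#?([0-9a-f]{6}|[0-9a-f]{3})/i;
const threeFirst = /#?([0-9a-f]{3}|[0-9a-f]{6})/i;

// The first alternative that succeeds wins.
console.log("#ffffff".match(sixFirst)[0]);   // #ffffff
console.log("#ffffff".match(threeFirst)[0]); // #fff

// With the anchors in place, the engine backtracks into the second
// alternative when the first one leaves characters before the end,
// so the ordering no longer changes which inputs validate.
const anchoredThreeFirst = /^#?([0-9a-f]{3}|[0-9a-f]{6})$/i;
console.log(anchoredThreeFirst.test("#ffffff")); // true
console.log(anchoredThreeFirst.test("#ffff"));   // false
```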
Character classes match a character from a specific set. There are a number of predefined character classes and you can also define your own sets.
In the regular expression /^#?([0-9a-f]{6}|[0-9a-f]{3})$/i which checks for the validity of a hexadecimal color code we define one character class:
[0-9a-f]
It's actually written twice, once for each of the two possible hexadecimal lengths. As written above, the character class matches any digit 0-9 and any lowercase letter a-f.
There's also an i token at the end of the regex, which ultimately influences the character class so that it matches a letter a-f in either case.
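Here is a small sketch of the character class on its own, tested against single characters. The anchors are added so that exactly one character is checked, and the [0-9a-fA-F] variant is shown only as an assumed alternative way to accept both cases without a flag.

```javascript
const lowerHexDigit = /^[0-9a-f]$/; // no flag: lowercase letters only
const anyHexDigit = /^[0-9a-f]$/i;  // "i" flag: either case

console.log(lowerHexDigit.test("7")); // true
console.log(lowerHexDigit.test("c")); // true
console.log(lowerHexDigit.test("C")); // false
console.log(anyHexDigit.test("C"));   // true
console.log(anyHexDigit.test("g"));   // false

// Without the flag, both cases can be listed inside the class instead.
const explicitHexDigit = /^[0-9a-fA-F]$/;
console.log(explicitHexDigit.test("C")); // true
```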
Expression flags change how the expression is interpreted. Flags follow the closing forward slash of the expression (ex. /.+/igm ).
As stated in the previous section, adding an i token at the end of the regex changes how the expression is interpreted. The ignore case flag (i) makes the whole expression case-insensitive, hence it accepts both #fffFFF and #fffffF as valid hexadecimal color values.
Here's another example of i in use:
// Example
/aBc/i => // matches "AbC", "aBc", "abc", and "ABC"
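The effect of the flag can be checked by running the same pattern with and without it. This is a minimal sketch; the variable names are illustrative.

```javascript
const withoutFlag = /aBc/;
const withFlag = /aBc/i;

// Prints each input followed by the result without and with the flag.
for (const text of ["AbC", "aBc", "abc", "ABC"]) {
  console.log(text, withoutFlag.test(text), withFlag.test(text));
}
// Only "aBc" matches without the flag; all four match with it.

// The flags are also exposed as properties on the RegExp object.
console.log(withFlag.flags);      // i
console.log(withFlag.ignoreCase); // true
```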
We can group multiple tokens together and create a capture group for extracting a substring or using a backreference. Here's a quick example:
// Example
// The plus (`+`) matches 1 or more of the preceding token, in this case the group `ha`
(ha)+ => // matches "hahaha", and the leading "ha" in "haa" and "hah!"
The regex for hexadecimal color codes has 1 capturing group:
([0-9a-f]{6}|[0-9a-f]{3})
And as we saw earlier the alternation token ( | ) allows us to have a first and second option for matching hexadecimal codes.
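To show what the capturing group gives us beyond validation, the sketch below uses String.prototype.match to pull out the digits without the hash. The variable names are illustrative assumptions.

```javascript
const hexColor = /^#?([0-9a-f]{6}|[0-9a-f]{3})$/i;

const match = "#1f2f3F".match(hexColor);
console.log(match[0]); // #1f2f3F (the whole match, including the hash)
console.log(match[1]); // 1f2f3F (capture group 1, the digits only)

// match() returns null when the input does not validate.
console.log("#1f2f3T".match(hexColor)); // null

// (ha)+ from the earlier example: the group repeats as a unit.
console.log("hahaha".match(/(ha)+/)[0]); // hahaha
console.log("haa".match(/(ha)+/)[0]);    // ha
```

Index 0 of the result always holds the full match, and each capturing group follows in order starting at index 1.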
The regex gives us a good starting point for looking at greedy and lazy matching.
First, let's take a look at greedy matching. In the regex, {6} and {3} are the quantifiers. Because each one requires an exact count, there is nothing for them to be greedy about, but a range quantifier shows the behavior.
What if we changed the regex to look like this:
/^#?([0-9a-f]{3,6})$/i
In this instance it would validate hexadecimal color codes with lengths from 3 up to 6, so lengths of 4 and 5 would pass as well. We don't want that, which is why the original regex uses alternation instead. The greediness is in how {3,6} matches: it first takes as many characters as it can (up to 6) and only gives characters back, by backtracking, if the rest of the expression fails to match. Remember that the regex engine is eager to return a match. Once it has one, it will not continue backtracking further to see if there is another possible match.
Like the plus ( + ) and the star ( * ), repetition using curly braces is greedy by default.
Lazy matching is the opposite. Adding a ( ? ) after a quantifier, as in {3,6}?, makes it match as few characters as possible.
This is a different use of ( ? ) from the one in the regex. In ( #? ), the ( ? ) is itself the quantifier that makes the hashtag optional. That part of the expression is saying, "Match the hashtag if there is one, but it's totally fine if there's no hashtag, you can match that too".
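As a sketch of the difference between greedy and lazy quantifiers, the snippet below runs a range quantifier in both forms. In JavaScript, a quantifier becomes lazy when a ? is appended to it. The anchors are removed here as an assumption of the example, so that the quantifier alone decides how much is consumed.

```javascript
// Anchors removed so the quantifier alone decides how much is consumed.
const greedy = /[0-9a-f]{3,6}/i;
const lazy = /[0-9a-f]{3,6}?/i;

console.log("#1f2f3F".match(greedy)[0]); // 1f2f3F (as many as allowed)
console.log("#1f2f3F".match(lazy)[0]);   // 1f2 (as few as allowed)

// The range version of the full regex accepts lengths 4 and 5 as well,
// which the original alternation rejects.
const rangeVersion = /^#?([0-9a-f]{3,6})$/i;
const original = /^#?([0-9a-f]{6}|[0-9a-f]{3})$/i;

console.log(rangeVersion.test("#ffff"));  // true
console.log(rangeVersion.test("#fffff")); // true
console.log(original.test("#ffff"));      // false
console.log(original.test("#fffff"));     // false
```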
If you have any questions about this project contact me directly at [email protected]
Visit this project's gist at https://gist.github.com/warrenp11/cc25ad43eecf1fec5df9802203a5dd8f
View more of my projects at https://www.github.com/warrenp11