Created August 27, 2017 14:08
A guide to the most commonly used regular expressions in Python
import re

# Raw string, so the backslashes in the notes below are kept literally.
r'''
Identifiers:
\d = any digit
\D = anything but a digit
\s = any whitespace character (space, tab, newline, ...)
\S = anything but a whitespace character
\w = any word character (letter, digit, or underscore)
\W = anything but a word character
. = any character, except for a new line
\b = word boundary (the empty position between a word character and a non-word character)
\. = a literal period. Must use the backslash, because . normally means any character.
Modifiers:
{1,3} = match 1 to 3 repetitions of the preceding item. Example: \d{1,3} matches 1-3 digits
+ = match 1 or more repetitions
? = match 0 or 1 repetitions
* = match 0 or more repetitions
$ = matches at the end of the string (or just before a trailing new line)
^ = matches at the start of the string
| = matches either/or. Example: x|y will match either x or y
[] = character set: matches any one of the characters listed inside
{x} = match exactly x repetitions of the preceding item
{x,y} = match between x and y repetitions of the preceding item
White Space Characters:
\n = new line
\s = any whitespace character
\t = tab
\e = escape (not supported by Python's re module; use \x1b instead)
\f = form feed
\r = carriage return
Characters to REMEMBER TO ESCAPE IF USED AS LITERALS!
. + * ? [ ] $ ^ ( ) { } | \
Brackets:
[] = quant[ia]tative = will match either quantitative or quantatative
[a-z] = matches any one lowercase letter a-z
[1-5a-qA-Z] = matches any one digit 1-5, lowercase letter a-q, or uppercase letter A-Z
Compilation FLAGS
Compilation flags let you modify some aspects of how regular expressions work.
Flags are available in the re module under two names: a long name such as
IGNORECASE and a short, one-letter form such as I.
Flag                          Meaning
ASCII, A                      Makes several escapes like \w, \b, \s and \d
                              match only on ASCII characters with the
                              respective property.
DOTALL, S                     Make . match any character, including newlines
IGNORECASE, I                 Do case-insensitive matches
LOCALE, L                     Do a locale-aware match
MULTILINE, M                  Multi-line matching, affecting ^ and $
VERBOSE, X (for 'extended')   Enable verbose REs, which can be organized
                              more cleanly and understandably
MATCH FUNCTION (matches only at the beginning of the string)
matchObj = re.match(pattern, string, flags=0)
SEARCH FUNCTION (scans the string for the first match anywhere)
searchObj = re.search(pattern, string, flags=0)
'''
# search function of re: scans the whole string for the first match
def search(pattern, string, flags=0):
    try:
        searchObj = re.search(pattern, string, flags)
        if searchObj:
            # groups() holds only the captured (...) groups; it is () if the pattern has none
            return searchObj.groups()
        else:
            return 'No match found!'
    except Exception as e:
        return str(e)
# match function of re: matches only at the beginning of the string
def match(pattern, string, flags=0):
    try:
        matchObj = re.match(pattern, string, flags)
        if matchObj:
            # groups() holds only the captured (...) groups; it is () if the pattern has none
            return matchObj.groups()
        else:
            return 'No match found!'
    except Exception as e:
        return str(e)
# findall function of re: returns all non-overlapping matches as a list
def findall(pattern, string, flags=0):
    try:
        find = re.findall(pattern, string, flags)
        return find
    except Exception as e:
        return str(e)
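
The notes above can be tried out directly against the standard re module. The following is a minimal sketch, not part of the original gist: the sample text and all variable names are assumptions chosen for illustration. It exercises the identifiers, repetition modifiers, a couple of compilation flags, the difference between re.match and re.search, and the need to escape a literal period.

```python
import re

# Hypothetical sample text, not taken from the original gist.
text = ("Jessica is 15 years old, and Daniel is 27 years old.\n"
        "Edward is 97, and his grandfather, Oscar, is 102.")

# \d{1,3}: runs of one to three digits.
ages = re.findall(r"\d{1,3}", text)

# [A-Z][a-z]*: one uppercase letter followed by zero or more lowercase letters.
names = re.findall(r"[A-Z][a-z]*", text)

# re.match only looks at the start of the string; re.search scans all of it.
at_start = re.match(r"\d+", text)
first_number = re.search(r"\d+", text).group()

# groups() returns only the captured (...) groups, and () when there are none.
pair = re.search(r"(\w+) is (\d+)", text).groups()
no_groups = re.search(r"\d+", text).groups()

# MULTILINE makes ^ match at the start of every line, not just the string.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)

# IGNORECASE makes the lowercase pattern match the capitalized name.
any_case = re.findall(r"jessica", text, re.IGNORECASE)

# An escaped period matches only a literal "."; a bare . matches any character.
literal_dot = re.findall(r"old\.", text)
any_char = re.findall(r"old.", text)

print(ages, names, first_number, pair, line_starts, any_case, literal_dot, any_char)
```

Note that groups() returns an empty tuple when the pattern has no capturing groups, so the search and match helpers above return () for a pattern such as \d+ even though it matched; calling group() on the match object is one way to get the matched text itself.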