@arthurwuhoo
Created June 30, 2016 13:57
# =====================================================================================================================
# REGULAR EXPRESSION EXERCISES
# =====================================================================================================================
# Exercise 1 ----------------------------------------------------------------------------------------------------------
# Match both "manhood" and "hoola hoop" but not "wahoo".
grepl("hoo.+$", c("manhood", "hoola hoop", "wahoo"))
# Exercise 2 ----------------------------------------------------------------------------------------------------------
# Match dates in "YYYY/MM/DD" or "YYYY-MM-DD" format.
# First let's figure out how to match months.
grepl("(0[1-9]|1[012])", c("01", "06", "12", "15", "25", "31"))
# Now matching days.
grepl("(0[1-9]|[12][0-9]|3[01])", c("01", "06", "12", "15", "25", "31", "00", "32"))
# Now put it all together, anchoring both ends so that trailing characters (e.g. "1972/06/160") are rejected.
grepl("^[[:digit:]]{4}[/-](0[1-9]|1[012])[/-](0[1-9]|[12][0-9]|3[01])$", c("1972/06/16", "1972-06-16", "1972 06 16", "1972-16-06", "16/06/1972", "06-16-1972"))
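# A possible refinement, offered as a sketch rather than as part of the original exercise: capture the first
# separator and reuse it through a backreference, so that mixed separators such as "1972/06-16" are rejected.
# The name `date_pattern` is illustrative only.
date_pattern <- "^[[:digit:]]{4}([/-])(0[1-9]|1[012])\\1(0[1-9]|[12][0-9]|3[01])$"
grepl(date_pattern, c("1972/06/16", "1972-06-16", "1972/06-16", "1972-06/16"), perl = TRUE)
# The pattern only checks the shape of the string, so an impossible date such as "1972/02/31" still matches.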
# Exercise 3 ----------------------------------------------------------------------------------------------------------
# Match a username which should conform to the following:
#
# - between 4 and 16 characters;
# - must consist of uppercase or lowercase letters, digits, underscores or hyphens;
# - cannot begin with an underscore or hyphen.
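# The first character may be a letter or a digit; only a leading underscore or hyphen is ruled out.
grepl("^[[:alnum:]][[:alnum:]_-]{3,15}$", c("user_name13", "silly-53_name", "_invalid_name"))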
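# A few boundary cases for the rules above, as a sketch. `username_pattern` is an illustrative name, and the pattern
# assumes that a leading digit is acceptable, since the rules only forbid a leading underscore or hyphen.
username_pattern <- "^[[:alnum:]][[:alnum:]_-]{3,15}$"
# In order: 3 characters (too short), 4 characters, 16 characters, 17 characters (too long),
# a leading hyphen, and a name containing a space.
grepl(username_pattern, c("abc", "abcd", "a234567890123456", "a2345678901234567", "-dash", "bad name"))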
# Exercise 4 ----------------------------------------------------------------------------------------------------------
# Match a 3 or 6 character RGB color code in hexadecimal format. The leading "#" is optional.
grepl("^#?([0-9a-f]{3}|[0-9a-f]{6})$", c("#D3a113", "#3F2", "D3a113", "#33333", "#4d82H4"), ignore.case = TRUE)
# Exercise 5 ----------------------------------------------------------------------------------------------------------
# Match "data" or "Data" but only when followed by "science" or "Science".
gsub("[Dd]ata (?=[Ss]cience)", "", c("data science", "Data Science", "data fud"), perl = TRUE)
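# To test for the match or extract it rather than delete it, the same lookahead pattern can be passed to grepl() and
# regmatches(). A minimal sketch; `phrases` and `lookahead_pattern` are illustrative names.
phrases <- c("data science", "Data Science", "data fud")
lookahead_pattern <- "[Dd]ata (?=[Ss]cience)"
# Which elements contain "data " or "Data " directly before "science" or "Science"?
grepl(lookahead_pattern, phrases, perl = TRUE)
# Extract the matched text; the lookahead itself is not consumed, so "science" is not part of the result.
regmatches(phrases, regexpr(lookahead_pattern, phrases, perl = TRUE))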