An example of regular expressions using capture groups in R.
## An example of regular expressions using capture groups
## in R. See:
## http://stackoverflow.com/questions/14700799/r-regex-gsub-extract-part-of-pattern/14714370#14714370
############################################################
# example data
data <-
"Station lat lon
1940 K01R 31-08N 092-34W
1941 K01T 28-08N 094-24W
1942 K03Y 48-47N 096-57W
1943 K04V 38-05-50N 106-10-07W
1944 K05F 31-25-16N 097-47-49W
1945 K06D 48-53-04N 099-37-15W"
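## read string into a data.frame; the header line has one fewer
## field than the data rows, so read.table uses the first column
## (1940, 1941, ...) as row names
df <- read.table(text=data, header=TRUE, stringsAsFactors=FALSE)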
## here's the pattern we want to extract
pattern <- "(\\d{1,3})-(\\d{1,3})(?:-(\\d{1,3}))?([NSWE]{1})"
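## A quick sanity check of the pattern, as a minimal sketch: anchor it
## with ^ and $ and confirm it accepts both the degrees-minutes and the
## degrees-minutes-seconds forms. The names `samples` and `anchored` are
## illustrative only, and perl=TRUE is an assumption made here so that
## \\d and (?:...) are handled by PCRE.

```r
# Pattern copied from the script above
pattern <- "(\\d{1,3})-(\\d{1,3})(?:-(\\d{1,3}))?([NSWE]{1})"

# A few values in both formats, plus one string that should not match
samples <- c("31-08N", "092-34W", "38-05-50N", "106-10-07W", "K01R")

# Anchor the pattern so the whole string has to match
anchored <- paste0("^", pattern, "$")
matches <- grepl(anchored, samples, perl = TRUE)
print(setNames(matches, samples))
```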
## The stringr library's str_match function returns a character matrix
## in which the first column is the whole matched string and additional
## columns hold the contents of each capture group in the regex.
library(stringr)
str_match(df$lat, pattern)
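## For comparison, a hedged sketch of a similar extraction using only
## base R: regexec() locates the match and its capture groups, and
## regmatches() pulls out the substrings. This assumes R >= 3.3.0, where
## regexec() accepts perl=TRUE; the names `lat` and `m` are illustrative.
## How an unmatched optional group is reported (empty string versus NA)
## may differ from stringr.

```r
# Pattern and latitude values copied from the script above
pattern <- "(\\d{1,3})-(\\d{1,3})(?:-(\\d{1,3}))?([NSWE]{1})"
lat <- c("31-08N", "28-08N", "48-47N", "38-05-50N", "31-25-16N", "48-53-04N")

# One character vector per input: whole match first, then each capture group
found <- regmatches(lat, regexec(pattern, lat, perl = TRUE))

# Stack the per-string vectors into a matrix, one row per input string
m <- do.call(rbind, found)
print(m)
```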
## Alternatively, the package gsubfn defines a strapply function,
## which R-ishly applies a function to each matching string.
## True to its early version number (0.6-5), I came across some
## bugs.
# http://code.google.com/p/gsubfn
# install.packages('gsubfn')
library(gsubfn)
parts <- strapply(df$lat, pattern, FUN=c, simplify=rbind, backref=NULL)
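## One possible next step, sketched under stated assumptions: turn the
## captured degrees, minutes, optional seconds and hemisphere into
## signed decimal degrees. This is not part of the original example; it
## uses base R (R >= 3.3.0 for regexec(perl=TRUE)) rather than `parts`,
## and the names `coords`, `m` and `decimal` are illustrative.

```r
# Pattern copied from the script above; sample values in both formats
pattern <- "(\\d{1,3})-(\\d{1,3})(?:-(\\d{1,3}))?([NSWE]{1})"
coords <- c("31-08N", "092-34W", "38-05-50N", "106-10-07W")

# Columns: whole match, degrees, minutes, seconds (may be empty), hemisphere
m <- do.call(rbind, regmatches(coords, regexec(pattern, coords, perl = TRUE)))

deg <- as.numeric(m[, 2])
mins <- as.numeric(m[, 3])
secs <- as.numeric(m[, 4])   # a missing seconds field becomes NA
secs[is.na(secs)] <- 0       # treat missing seconds as zero

# South and West are negative by the usual convention
sgn <- ifelse(m[, 5] %in% c("S", "W"), -1, 1)
decimal <- sgn * (deg + mins / 60 + secs / 3600)
print(setNames(decimal, coords))
```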
cbare commented Feb 5, 2013

Written in response to a StackOverflow question: R regex / gsub : extract part of pattern
