@arnauldvm
Created December 13, 2016 10:09
Parse a vector of lines with a Perl-style regex containing named capture groups and produce a data.frame
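# Returns one column per named capture group and one row per matching line.
parseLines = function(perlNamedPattern, linesVector) {
  match = regexpr(perlNamedPattern, linesVector, perl=TRUE)
  # regexpr() returns -1 for lines that do not match
  found_idx = as.vector(match) > 0
  # Renamed from names/start/length/end to avoid shadowing base R functions
  capture_names = attr(match, "capture.names")
  capture_start = attr(match, "capture.start")
  capture_length = attr(match, "capture.length")
  capture_end = capture_start + capture_length - 1
  results_matrix = c()
  # Build one row per capture group; the matrix is transposed below
  for (col in seq_along(capture_names)) {
    results_matrix = rbind(results_matrix, substr(linesVector, capture_start[, col], capture_end[, col]))
  }
  # stringsAsFactors=FALSE keeps captured text as character on R < 4.0 as well
  results_df = data.frame(t(results_matrix), stringsAsFactors=FALSE)
  names(results_df) = capture_names
  # drop=FALSE keeps a data.frame even when the pattern has a single capture group
  return(results_df[found_idx, , drop=FALSE])
}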
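
The function relies on the attributes that regexpr() attaches to its result when perl=TRUE and the pattern has named groups. The sketch below inspects those attributes directly and extracts one group by hand. The sample lines, the pattern, and the group names key and value are made up for illustration and are not part of the gist.

```r
# Hypothetical sample input: two lines that match and one that does not.
lines = c("alice=42", "no match here", "bob=7")
pattern = "(?<key>[a-z]+)=(?<value>[0-9]+)"

m = regexpr(pattern, lines, perl=TRUE)

# Match position per line; -1 marks a line without a match.
found = as.vector(m) > 0

# Per-group metadata: one row per line, one column per capture group.
group_names = attr(m, "capture.names")
starts = attr(m, "capture.start")
lens = attr(m, "capture.length")

# Extract the "value" group from every line, then keep only matching lines.
col = which(group_names == "value")
values = substr(lines, starts[, col], starts[, col] + lens[, col] - 1)
kept = values[found]
print(kept)
```

With the same inputs, parseLines(pattern, lines) should return a two-row data.frame with columns key and value, since the non-matching line is filtered out by found_idx.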