@nischalshrestha
Last active April 5, 2019 21:24
regex for dataframe syntax
import sys
import re
# Patterns were built and exported as Python regexes with https://www.debuggex.com
# TODO: make the regexes ignore whitespace where it doesn't matter
# First case: Python df[['col', ...]] <-> R df[c('col', ...)]
# Python: a bracketed list of quoted column names, e.g. ['a', 'b']
list_re = r"\s*\[\s*('\w+')+\s*(,\s*'\w+')*\s*\]"
list_vars = re.compile(list_re)
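# R: a c() vector of quoted column names, e.g. c('a', 'b')
# (raw string avoids the invalid "\w" escape in a normal string literal)
list_re_r = r"c\(('\w+')+(,\s*'\w+')*\)"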
list_vars_r = re.compile(list_re_r)
# Recognizes the full column-subset syntax: df[[...]] in Python, df[c(...)] in R
subset_cols_re = r"df\s*\[" + list_re + r"\]"
subset_cols = re.compile(subset_cols_re)
subset_cols_re_r = r"df\[" + list_re_r + r"\]"
subset_cols_r = re.compile(subset_cols_re_r)
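
A minimal usage sketch for the patterns above, restated so it runs on its own. The `py_to_r` helper is hypothetical and not part of the original gist; it only illustrates one way the `df[['col'...]] <-> df[c('col'...)]` mapping mentioned in the comments might be applied, assuming the whole input string is a single subset expression.

```python
import re

# Patterns restated from the gist so this sketch is self-contained.
list_re = r"\s*\[\s*('\w+')+\s*(,\s*'\w+')*\s*\]"
list_re_r = r"c\(('\w+')+(,\s*'\w+')*\)"
subset_cols = re.compile(r"df\s*\[" + list_re + r"\]")
subset_cols_r = re.compile(r"df\[" + list_re_r + r"\]")


def py_to_r(expr):
    """Hypothetical helper: rewrite df[['a', 'b']] as df[c('a', 'b')].

    Returns None when expr is not a Python-style column subset.
    """
    if subset_cols.fullmatch(expr) is None:
        return None
    # Pull out the quoted column names in order of appearance.
    cols = re.findall(r"'\w+'", expr)
    return "df[c(" + ", ".join(cols) + ")]"


python_expr = "df[['a', 'b']]"
r_expr = py_to_r(python_expr)
print(r_expr)

# The translated string should be accepted by the R-side pattern.
print(subset_cols_r.fullmatch(r_expr) is not None)
```

Using `fullmatch` rather than `search` is a deliberate choice in this sketch: it rejects inputs where the subset expression is only part of a longer line, which keeps the helper from silently discarding surrounding code.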