@odoublewen
Last active August 2, 2019 19:51
read multiple csv or txt files, adding filename as a column, using data.table vs tidyverse
dirname = 'path/to/files'
# list.files() expects a regular expression, not a glob, so convert with glob2rx()
files = list.files(dirname, pattern = glob2rx("*_data.txt"))
# tidyverse approach
library(tidyverse)
data = tibble(filename=files) %>%
  mutate(file_contents=map(filename, ~ read_tsv(file.path(dirname, .)))) %>%
  unnest(file_contents)  # name the list-column; a bare unnest() is deprecated in tidyr >= 1.0
# data.table approach
library(data.table)
data <- rbindlist(lapply(files, function(x) {
  dt = fread(file.path(dirname, x))
  dt$filename = x
  dt
}))
# above, but functionalized
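fread_directory <- function(dirname, pattern='*', label='filename') {
  # pattern is a glob such as '*_data.txt'; list.files() expects a regex, so convert it
  files = list.files(dirname, pattern = glob2rx(pattern))
  # literal part of the glob, stripped from each filename to build the label
  scrub = sub('*', '', pattern, fixed=TRUE)
  rbindlist(lapply(files, function(x) {
    dt = fread(file.path(dirname, x))
    # match scrub literally; skip the substitution when there is nothing to strip
    value = if (nzchar(scrub)) sub(scrub, '', x, fixed=TRUE) else x
    # set() adds the column named by label by reference
    set(dt, j = label, value = value)
    dt
  }))
}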