
@JoachimGoedhart
Last active May 14, 2024 14:09
Reads all *.csv files from the working directory and combines the data into a single dataframe
require(data.table)
require(dplyr)
#Get a list of all csv files in the directory that is set as the working directory
filelist = list.files(pattern="*.csv$")
#read all csv files with data.table::fread() and put in df_input_list
df_input_list <- lapply(filelist, fread)
#reading in csv files can also be done using the base R function read.csv(), without needing to load package "data.table":
# df_input_list <- lapply(filelist, read.csv)
#get the filenames and strip the extension (everything from the first '.' onward) for use as "id"
names(df_input_list) <- gsub(filelist, pattern="\\..*", replacement="")
#Merge all the dataframes and use the filenames as id
df_merged <- bind_rows(df_input_list, .id = "id")
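#Since data.table is already loaded, the merge step can alternatively be done with data.table::rbindlist(),
#which also uses the list names for the id column. A minimal sketch of that variant, reusing df_input_list from above
#(fill = TRUE tolerates files whose columns differ slightly):
# df_merged <- rbindlist(df_input_list, idcol = "id", fill = TRUE)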
@MiGuyB commented May 14, 2024

1.9 GB of data across 12 files read and merged in less than 10 seconds. You did a really good job. Using read.csv instead of data.table::fread() is much slower.
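
To check the speed difference on a given set of files, a minimal sketch is to time both readers with base R's system.time(), assuming the same filelist as in the script above:

#time both readers on the same set of files; fread() is typically much faster
system.time(lapply(filelist, fread))
system.time(lapply(filelist, read.csv))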

@JoachimGoedhart (Author) commented

Thanks, I appreciate the feedback!
