@cwickham
Last active November 26, 2018 21:36
Example of combining `bind_rows()` with `map()` and `read_csv()`

library(tidyverse)
## ── Attaching packages ──────────── tidyverse 1.2.1 ──

## ✔ ggplot2 3.0.0     ✔ purrr   0.2.5
## ✔ tibble  1.4.2     ✔ dplyr   0.7.5
## ✔ tidyr   0.8.1     ✔ stringr 1.3.1
## ✔ readr   1.1.1     ✔ forcats 0.3.0

## ── Conflicts ─────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

If you have some data files with a similar structure:

files <- fs::dir_ls(glob = "*.csv")

For each file name, read it in:

data_list <- map(files, read_csv)
## Parsed with column specification:
## cols(
##   quarter = col_integer(),
##   sales = col_double()
## )
## Parsed with column specification:
## cols(
##   quarter = col_integer(),
##   sales = col_double()
## )
## Parsed with column specification:
## cols(
##   quarter = col_integer(),
##   sales = col_double()
## )
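If the "Parsed with column specification" messages get noisy, one option is to state the column types yourself. This is only a sketch: the temporary directory and the `sales_spec` name are made up for illustration, and the values are rounded from the example files.

```r
library(readr)
library(purrr)

# Hypothetical setup: write two small CSVs to a temporary directory so the
# sketch runs on its own (values rounded from the example files).
tmp <- tempfile("sales_")
dir.create(tmp)
write_csv(data.frame(quarter = 1:4, sales = c(1.13, 0.887, 0.714, 1.08)),
          file.path(tmp, "bc.csv"))
write_csv(data.frame(quarter = 1:4, sales = c(2.04, 1.32, 0.250, 3.37)),
          file.path(tmp, "ca.csv"))

files <- fs::dir_ls(tmp, glob = "*.csv")

# Declaring the types up front silences the parsing messages and makes
# every file parse with the same column types.
sales_spec <- cols(quarter = col_integer(), sales = col_double())
data_list <- map(files, function(f) read_csv(f, col_types = sales_spec))
```

A side benefit is that a file with an unexpected value in a column shows up as a parsing problem instead of silently changing that column's type.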

Then concatenate them together with bind_rows():

bind_rows(data_list)
## # A tibble: 12 x 2
##    quarter sales
##      <int> <dbl>
##  1       1 1.13 
##  2       2 0.887
##  3       3 0.714
##  4       4 1.08 
##  5       1 2.04 
##  6       2 1.32 
##  7       3 0.250
##  8       4 3.37 
##  9       1 1.55 
## 10       2 0.866
## 11       3 3.53 
## 12       4 0.839
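"Similar structure" matters because `bind_rows()` matches columns by name and fills in `NA` where a data frame lacks a column. A small sketch with made-up data frames (the `region` column is hypothetical and not part of the example files):

```r
library(dplyr)

# Two hypothetical data frames; only `b` has a `region` column.
a <- tibble(quarter = 1:2, sales = c(1.13, 0.887))
b <- tibble(quarter = 1:2, sales = c(2.04, 1.32), region = "ca")

# Columns are matched by name; rows from `a` get NA for `region`.
combined <- bind_rows(a, b)
```

So files that differ slightly will still bind, but it is worth checking the result for unexpected `NA`s.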

If you need to keep track of which data came from which file:

tibble(file_names = files) %>% 
  mutate(data = map(file_names, read_csv)) %>% 
  unnest(data)
## Parsed with column specification:
## cols(
##   quarter = col_integer(),
##   sales = col_double()
## )
## Parsed with column specification:
## cols(
##   quarter = col_integer(),
##   sales = col_double()
## )
## Parsed with column specification:
## cols(
##   quarter = col_integer(),
##   sales = col_double()
## )

## # A tibble: 12 x 3
##    file_names quarter sales
##    <fs::path>   <int> <dbl>
##  1 bc.csv           1 1.13 
##  2 bc.csv           2 0.887
##  3 bc.csv           3 0.714
##  4 bc.csv           4 1.08 
##  5 ca.csv           1 2.04 
##  6 ca.csv           2 1.32 
##  7 ca.csv           3 0.250
##  8 ca.csv           4 3.37 
##  9 or.csv           1 1.55 
## 10 or.csv           2 0.866
## 11 or.csv           3 3.53 
## 12 or.csv           4 0.839
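Another way to get the same kind of result, sketched under the assumption that the list of data frames is named: `bind_rows()` has an `.id` argument that turns the list names into a column. The temporary directory and the `file_name` column name below are made up for illustration, and the values are rounded from the example files.

```r
library(readr)
library(purrr)
library(dplyr)

# Hypothetical setup: write two small CSVs to a temporary directory so the
# sketch runs on its own (values rounded from the example files).
tmp <- tempfile("sales_")
dir.create(tmp)
write_csv(data.frame(quarter = 1:4, sales = c(1.13, 0.887, 0.714, 1.08)),
          file.path(tmp, "bc.csv"))
write_csv(data.frame(quarter = 1:4, sales = c(2.04, 1.32, 0.250, 3.37)),
          file.path(tmp, "ca.csv"))

files <- fs::dir_ls(tmp, glob = "*.csv")

# Name each path by its file name, so the list returned by map() carries
# "bc.csv" and "ca.csv" as its names.
data_list <- files %>%
  set_names(fs::path_file(files)) %>%
  map(function(f) read_csv(f, col_types = cols()))

# .id adds a column holding the name of the list element each row came from.
combined <- bind_rows(data_list, .id = "file_name")
```

Here `file_name` comes out as a plain character column, and the nested `data` column is never created.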
---
title: Example of combining `bind_rows()` with `map()` and `read_csv()`
output: github_document
---
```{r}
library(tidyverse)
```
If you have some data files with a similar structure:
```{r}
files <- fs::dir_ls(glob = "*.csv")
```
For each file name, read it in:
```{r}
data_list <- map(files, read_csv)
```
Then concatenate them together with `bind_rows()`:
```{r}
bind_rows(data_list)
```
If you need to keep track of which data came from which file:
```{r}
tibble(file_names = files) %>%
  mutate(data = map(file_names, read_csv)) %>%
  unnest(data)
```
bc.csv
quarter,sales
1,1.1348993325975116
2,0.8866861386940291
3,0.7139273806986334
4,1.0782276279804233

ca.csv
quarter,sales
1,2.0425583607608324
2,1.322612284640682
3,0.2502279566730268
4,3.3737260623085983

or.csv
quarter,sales
1,1.5538028030747417
2,0.8664526714500033
3,3.525723954452635
4,0.8387123578739287