map_dfr()1. Read in all sheets of an Excel file and bind them into one large data frame.
View the sheets in our workbook
path <- (here::here("import-data", "data", "all-school_attendance.xlsx"))
path %>%
readxl::excel_sheets()
[1] "Sheet1" "Sheet2" "Sheet3"
We first use readxl::excel_sheets() to grab the sheet
names and put them into a character vector. Then we set the names of the
sheets in the vector using purrr::set_names(). Last we read
the sheets in using readxl::read_excel() and bind them
using purrr::map_dfr(). We add the
purrr::map_dfr() argument .id to add a new
variable that we call source so we know which sheet each
row comes from.
data <- path %>%
readxl::excel_sheets() %>%
purrr::set_names() %>%
purrr::map_dfr(readxl::read_excel, path = path, .id = "source")
head(data)
# A tibble: 6 x 26
source YEAR PROPORTIONAL_ATTENDANCE_TOTAL_90ABV PROPORTIONAL_ATTENDANCE_TOTAL
<chr> <dbl> <dbl> <dbl>
1 Sheet1 2015 518. 534
2 Sheet1 2016 519 534
3 Sheet1 2017 528 551
4 Sheet1 2018 529. 564.
5 Sheet1 2019 558. 582.
6 Sheet2 2015 518. 534
# i 22 more variables: PROPORTIONAL_ATTENDANCE_TOTAL_PCT <dbl>,
# PROPORTIONAL_ATTENDANCE_BLACK_90ABV <chr>,
# PROPORTIONAL_ATTENDANCE_BLACK <chr>,
# PROPORTIONAL_ATTENDANCE_BLACK_PCT <chr>,
# PROPORTIONAL_ATTENDANCE_HISPANIC_90ABV <dbl>,
# PROPORTIONAL_ATTENDANCE_HISPANIC <dbl>,
# PROPORTIONAL_ATTENDANCE_HISPANIC_PCT <dbl>, ...
purrr::map_dfr() is now superseded.
You can still use it but the preferred method of binding files now is
using purrr::list_rbind() from purrr version
1.0.2. Similar to dplyr::bind_rows(), used in
purrr::map_dfr(), purrr::list_rbind() matches
by variable names and requires variables to be the same types. It also
allows some variables to be missing from some sheets (filling with NA as
needed).data <- path %>%
readxl::excel_sheets() %>%
purrr::set_names() %>%
purrr::map(readxl::read_excel, path = path) %>%
purrr::list_rbind(names_to = "sheet_name")
head(data)
# A tibble: 6 x 26
sheet_name YEAR PROPORTIONAL_ATTENDANCE_TOTAL_90ABV PROPORTIONAL_ATTENDANCE~1
<chr> <dbl> <dbl> <dbl>
1 Sheet1 2015 518. 534
2 Sheet1 2016 519 534
3 Sheet1 2017 528 551
4 Sheet1 2018 529. 564.
5 Sheet1 2019 558. 582.
6 Sheet2 2015 518. 534
# i abbreviated name: 1: PROPORTIONAL_ATTENDANCE_TOTAL
# i 22 more variables: PROPORTIONAL_ATTENDANCE_TOTAL_PCT <dbl>,
# PROPORTIONAL_ATTENDANCE_BLACK_90ABV <chr>,
# PROPORTIONAL_ATTENDANCE_BLACK <chr>,
# PROPORTIONAL_ATTENDANCE_BLACK_PCT <chr>,
# PROPORTIONAL_ATTENDANCE_HISPANIC_90ABV <dbl>,
# PROPORTIONAL_ATTENDANCE_HISPANIC <dbl>, ...
Return to Import Files