dir_ls()
1. Review files in the folder “obs_2022-03-15”
Note: I am currently working in an R project. I used the
here::here() function, which builds file paths starting from
the top level (root directory) of my current project. Any
subfolders are then listed as further arguments, from the outermost
folder to the innermost. In this case my
folder “obs_2022-03-15” is within another folder called “file-system”,
which is located in the top level of my project directory.
If you aren’t working in R projects you can write out your entire file
path, or build a relative file path using something like
fs::path() inside fs::dir_ls().
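As a minimal sketch of the non-project case, the snippet below builds a relative path with fs::path() and passes it to fs::dir_ls(). The temporary sandbox folder and the empty placeholder files are assumptions made only so the example runs anywhere; the folder names simply mirror the ones used here.

```r
library(fs)

# Hypothetical sandbox so the example does not depend on a real project
root <- fs::path_temp("path-demo")
fs::dir_create(fs::path(root, "file-system", "obs_2022-03-15"))
fs::file_create(fs::path(root, "file-system", "obs_2022-03-15",
                         c("obs1.xlsx", "obs2.xlsx")))

# A relative path is resolved against the current working directory,
# so move into the sandbox, list the folder, then move back
old_wd <- setwd(as.character(root))
rel_files <- fs::dir_ls(fs::path("file-system", "obs_2022-03-15"))
setwd(old_wd)

rel_files
```

Because the path passed to fs::dir_ls() is relative, the paths it returns are relative too, which is the main visible difference from the here::here() version.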
fs::dir_ls(here::here("file-system", "obs_2022-03-15"))
[1] "C:/Users/Me/project/file-system/obs_2022-03-15/obs1.xlsx"
[2] "C:/Users/Me/project/file-system/obs_2022-03-15/obs2.xlsx"
[3] "C:/Users/Me/project/file-system/obs_2022-03-15/obs3.xlsx"
[4] "C:/Users/Me/project/file-system/obs_2022-03-15/summary_obs.xlsx"
Notice that you get the full path names here.
fs::dir_ls() returns a character vector of file paths.
basename()
1. View just the file names in your path
base::basename() removes all of the path up to and
including the last path separator.
base::basename(fs::dir_ls(here::here("file-system", "obs_2022-03-15")))
[1] "obs1.xlsx" "obs2.xlsx" "obs3.xlsx" "summary_obs.xlsx"
You can also use base::dir() or
base::list.files() to retrieve just the file names. If you
add the argument full.names = TRUE to
base::list.files() you will get the full path.
base::dir() is an alias for base::list.files().
base::list.files(here::here("file-system", "obs_2022-03-15"))
[1] "obs1.xlsx" "obs2.xlsx" "obs3.xlsx" "summary_obs.xlsx"
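To make the effect of full.names concrete, here is a small base R sketch. The sandbox folder and the two empty placeholder files are made up for illustration and are not part of the project above.

```r
# Hypothetical sandbox folder with two empty placeholder files
demo_dir <- file.path(tempdir(), "list-files-demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("obs1.xlsx", "obs2.xlsx")))

names_only <- list.files(demo_dir)                     # file names only
full_paths <- list.files(demo_dir, full.names = TRUE)  # folder plus file name
via_alias  <- dir(demo_dir)                            # dir() gives the same result

names_only
full_paths
```

Stripping the folder from full_paths with basename() should give back names_only.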
Last, there is also a function from the fs package that
you can use to pull just the file names:
fs::path_file().
fs::path_file(fs::dir_ls(here::here("file-system", "obs_2022-03-15")))
[1] "obs1.xlsx" "obs2.xlsx" "obs3.xlsx" "summary_obs.xlsx"
file_info()
1. View detailed information about your files
Let’s review all of our files in the “analysis_data” folder.
Note: To see everything this function returns, type ?file_info into the console.
fs::file_info(fs::dir_ls(here::here("file-system", "analysis_data"))) %>%
  dplyr::select(path, type, modification_time)
# A tibble: 3 x 3
path type modification_time
<chr> <fct> <dttm>
1 C:/Users/Me/project/file-system/summary_obs_v01.xlsx file 2022-07-30 20:26:52
2 C:/Users/Me/project/file-system/summary_obs_v02.xlsx file 2022-07-30 20:26:52
3 C:/Users/Me/project/file-system/summary_obs_v03.xlsx file 2022-08-24 12:39:01
2. Find the most recent version of a file based on modification time
Use dplyr::slice() and
base::which.max() to select the path with the most recent
modification time.
fs::file_info(fs::dir_ls(here::here("file-system"))) %>%
  dplyr::select(path, type, modification_time) %>%
  dplyr::slice(base::which.max(modification_time))
# A tibble: 1 x 3
path type modification_time
<chr> <fct> <dttm>
1 C:/Users/Me/project/file-system/summary_obs_v03.xlsx file 2022-08-24 12:39:01
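The sketch below reproduces the same idea on throwaway files so it can be run anywhere. The sandbox folder, the file names, and the timestamps are all made up, and I use base R indexing in place of dplyr::slice() to keep the dependencies small. The second file is deliberately given the latest timestamp to show that modification time follows what happened on disk, not the version in the name.

```r
library(fs)

# Hypothetical sandbox: three empty placeholder files
demo_dir <- file.path(tempdir(), "mtime-demo")
dir.create(demo_dir, showWarnings = FALSE)
paths <- file.path(demo_dir, paste0("summary_obs_v0", 1:3, ".xlsx"))
file.create(paths)

# Made-up modification times; v02 is deliberately the newest on disk
stamps <- as.POSIXct(c("2022-07-30 20:26:52",
                       "2022-08-24 12:39:01",
                       "2022-07-30 20:26:52"), tz = "UTC")
for (i in seq_along(paths)) {
  Sys.setFileTime(paths[i], stamps[i])
}

# Same logic as slice(which.max(modification_time)), written in base R
info <- fs::file_info(fs::dir_ls(demo_dir))
newest_by_time <- info$path[which.max(info$modification_time)]

basename(newest_by_time)
```

Here the file picked by modification time should be the v02 file, even though a v03 file exists.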
3. Find the most recent version of a file based on file name
I don’t always trust the modification time to give me my most recent version. You’ll notice that I’ve also versioned my files by adding “_v#” as a suffix to each file name. I can use this information instead to find the most recent version.
Note: In this scenario I don’t need to use
fs::file_info() as I am only choosing my most recent file
version based on my naming convention.
Note: I am using dplyr::arrange() to sort my file
names descending (get the most recent version on top) and then selecting
the top file name using dplyr::slice_head().
Note: This same code would work if I had used a date suffix (“_YYYY-MM-DD”) to version files rather than “_v#”, because dates written in that order also sort chronologically when sorted as text.
tibble::as_tibble(fs::dir_ls(here::here("file-system"))) %>%
  dplyr::arrange(dplyr::desc(value)) %>%
  dplyr::slice_head(n = 1)
# A tibble: 1 x 1
value
<chr>
1 C:/Users/Me/project/file-system/summary_obs_v03.xlsx
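One caveat worth sketching: sorting by name compares text, so it only matches version order when the numbers are zero-padded (as in “_v01”). The file names below are hypothetical and intentionally not padded. The second approach, pulling the digits out with a regular expression and comparing them as integers, is one possible workaround and not part of the code above.

```r
# Hypothetical file names; the version numbers are NOT zero-padded
files <- c("summary_obs_v2.xlsx", "summary_obs_v9.xlsx", "summary_obs_v10.xlsx")

# Text sort: compares character by character, so "v9" lands above "v10"
by_name <- sort(files, decreasing = TRUE)[1]

# Numeric sort: extract the digits after "_v" and compare them as integers
version <- as.integer(sub("^.*_v([0-9]+)\\.xlsx$", "\\1", files))
by_number <- files[which.max(version)]

by_name
by_number
```

With padded names such as “_v02”, “_v09”, “_v10”, both approaches would agree, which is why the padded convention is convenient.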
And last, if I wanted to go one step further and import the most
recent version of a file, I could simply save the tibble from above and
pull the “value” column with our path using dplyr::pull()
and use that character vector in an import function such as
readxl::read_excel().
newest_file <- tibble::as_tibble(fs::dir_ls(here::here("file-system"))) %>%
  dplyr::arrange(dplyr::desc(value)) %>%
  dplyr::slice_head(n = 1) %>%
  dplyr::pull(value)
most_recent <- readxl::read_excel(newest_file)
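If the folder also holds unrelated files, it may help to filter by a name pattern before sorting and to stop early when nothing matches. The helper below, newest_by_name(), is a hypothetical name of my own and only a sketch in base R. The sandbox folder and the date-suffixed placeholder files are likewise made up.

```r
# Hypothetical helper: return the matching path whose file name sorts last
newest_by_name <- function(dir, pattern) {
  candidates <- list.files(dir, pattern = pattern, full.names = TRUE)
  if (length(candidates) == 0) {
    stop("No files in '", dir, "' match pattern '", pattern, "'")
  }
  candidates[order(basename(candidates), decreasing = TRUE)][1]
}

# Sandbox with date-suffixed placeholder files plus one unrelated file
demo_dir <- file.path(tempdir(), "newest-demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("summary_obs_2022-07-30.xlsx",
                                  "summary_obs_2022-08-24.xlsx",
                                  "notes.txt")))

newest_file <- newest_by_name(demo_dir, "^summary_obs_.*\\.xlsx$")
basename(newest_file)
```

The returned path could then be handed to an import function such as readxl::read_excel(), exactly as in the step above.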