Package: fs


Function: dir_ls()


1. Review files in the folder “obs_2022-03-15”

  • Note: I am currently working in an R project. I used the here::here() function which implicitly sets my directory to the top level (root directory) of my current project. Then any subsequent folders can be listed in descending order. In this case my folder “obs_2022-03-15” is within another folder called “file-system” which is located in the top level of my project directory.

  • If you aren’t working in R projects you can list your entire file path, or set a relative file path using something like fs::path() inside fs::dir_ls().

fs::dir_ls(here::here("file-system", "obs_2022-03-15"))
[1] "C:/Users/Me/project/file-system/obs_2022-03-15/obs1.xlsx"       
[2] "C:/Users/Me/project/file-system/obs_2022-03-15/obs2.xlsx"       
[3] "C:/Users/Me/project/file-system/obs_2022-03-15/obs3.xlsx"       
[4] "C:/Users/Me/project/file-system/obs_2022-03-15/summary_obs.xlsx"

Notice you will get the full path names here. fs::dir_ls() creates a character vector of file names.

  • Note: One thing to note here is that your files may not print out in the order that you see them in your directory (i.e. the order your operating system puts them in). That is because your operating system, such as Windows, has its own ordering system. However, when the files are displayed in R, you will see the files in alphabetical order. So if you have files doc1, doc2, doc10, in that order in your directory, they may appear as doc1, doc10, doc2 in RStudio.


Package: base


Function: basename()


1. View just the file names in your path

base::basename() removes all of the path up to and including the last path separator

base::basename(fs::dir_ls(here::here("file-system", "obs_2022-03-15")))
[1] "obs1.xlsx"        "obs2.xlsx"        "obs3.xlsx"        "summary_obs.xlsx"

You can also use base::dir() or base::list.files() to retrieve just the file names. If you add the argument full.names = TRUE to base::list.files() you will get the full path.

  • Note: base::dir() is an alias for base::list.files()
base::list.files(here::here("file-system", "obs_2022-03-15"))
[1] "obs1.xlsx"        "obs2.xlsx"        "obs3.xlsx"        "summary_obs.xlsx"

Last, there is also a function from the fs package that you can use to pull just the file names, fs::path_file()

fs::path_file(fs::dir_ls(here::here("file-system", "obs_2022-03-15")))
[1] "obs1.xlsx"        "obs2.xlsx"        "obs3.xlsx"        "summary_obs.xlsx"

Package: fs


Function: file_info()


1. View detailed information about your files

Let’s review all of our files in the “analysis_data” folder.

  • Note: There are more columns you can view with this function but I have selected 3 columns of interest to view here. To view all options type ?file_info into the console.
fs::file_info(fs::dir_ls(here::here("file-system", "analysis_data"))) %>%
  dplyr::select(path, type, modification_time)
# A tibble: 3 x 3
  path                                                 type  modification_time  
  <chr>                                                <fct> <dttm>             
1 C:/Users/Me/project/file-system/summary_obs_v01.xlsx file  2022-07-30 20:26:52
2 C:/Users/Me/project/file-system/summary_obs_v02.xlsx file  2022-07-30 20:26:52
3 C:/Users/Me/project/file-system/summary_obs_v03.xlsx file  2022-08-24 12:39:01

2. Find the most recent version of a file based on modification time

  • Note: I am using dplyr::slice() and base::which.max() to select the path with the most recent modification time.
fs::file_info(fs::dir_ls(here::here("file-system"))) %>%
  dplyr::select(path,
                type, modification_time) %>%
  dplyr::slice(base::which.max(modification_time))
# A tibble: 1 x 3
  path                                                 type  modification_time  
  <chr>                                                <fct> <dttm>             
1 C:/Users/Me/project/file-system/summary_obs_v03.xlsx file  2022-08-24 12:39:01

3. Find the most recent version of a file based on file name

I don’t always trust the modification time to give me my most recent version.You’ll notice that I’ve also versioned our files by adding “_v#” as a suffix to each file name. I can instead use this information to find the most recent version.

  • Note: In this scenario I don’t need to use fs::file_info() as I am only choosing my most recent file version based on my naming convention.

  • Note: I am using dplyr::arrange() to sort my file names descending (get the most recent version on top) and then selecting the top file name using dplyr::slice_head().

  • Note: This same code would work if I had used a date suffix (“_YYYY-MM-DD”) to version files rather than “_v#”

tibble::as_tibble(fs::dir_ls(here::here("file-system"))) %>%
  dplyr::arrange(desc(value)) %>%
  dplyr::slice_head(n=1)
# A tibble: 1 x 1
  value                                               
  <chr>                                               
1 C:/Users/Me/project/file-system/summary_obs_v03.xlsx

And last, if I wanted to go one step further and import the most recent version of a file, I could simply save the tibble from above and pull the “value” column with our path using dplyr::pull() and use that character vector in an import function such as readxl::read_excel().

newest_file <- tibble::as_tibble(fs::dir_ls(
  here::here("file-system"))) %>%
  dplyr::arrange(desc(value)) %>%
  dplyr::slice_head(n=1) %>%
  dplyr::pull(value)

most_recent <- readxl::read_excel(newest_file)

Return to File System