Package: dplyr


Function: rename_with()


  • Note: Unlike dplyr::rename(), which renames individual columns by name, dplyr::rename_with() renames columns by applying a function to their names.

  • Note: You can either rename all variables or select variables. The default argument is .cols = everything().


1. Replace periods in all variable names with underscores

Review the data (d2)

# A tibble: 3 x 3
  Var.1 Var.2 Var.3
  <chr> <dbl> <dbl>
1 a         1     4
2 b        NA     5
3 c         3     1
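
The original page prints d2 but does not show how it was created. To follow along, here is a minimal sketch that rebuilds it, assuming the data were entered by hand with tibble::tibble(); the values are copied from the printout above.

```r
library(tibble)

# Assumed reconstruction of d2: values are copied from the printed tibble above.
d2 <- tibble(
  Var.1 = c("a", "b", "c"),
  Var.2 = c(1, NA, 3),
  Var.3 = c(4, 5, 1)
)

# Confirm that the column names contain periods.
names(d2)
```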

Modify variable names by adding the stringr::str_replace() function.

  • Note: In regular expressions (regex), “.” is a special character that matches any character. To match a literal period we need to escape it with a backslash, and because the backslash must itself be escaped inside an R string, the pattern is written “\\.”.

  • Note: If a variable name contained more than one period, you could use stringr::str_replace_all() to replace every period, rather than just the first one.

  • Note: In this first example, we create an anonymous function using the formula shorthand ~ so that we can supply the pattern and replacement arguments of stringr::str_replace().

d2 %>% 
  dplyr::rename_with(~ stringr::str_replace(., pattern = "\\.", replacement = "_"))
# A tibble: 3 x 3
  Var_1 Var_2 Var_3
  <chr> <dbl> <dbl>
1 a         1     4
2 b        NA     5
3 c         3     1

Instead of creating an anonymous function, we could also pass the function name to dplyr::rename_with() and supply its additional arguments through the dots (...).

d2 %>% 
  dplyr::rename_with(stringr::str_replace, pattern = "\\.", replacement = "_")
# A tibble: 3 x 3
  Var_1 Var_2 Var_3
  <chr> <dbl> <dbl>
1 a         1     4
2 b        NA     5
3 c         3     1

  • Note: Last, if I only wanted to replace the periods in Var.1 and Var.2, I could add the .cols argument.

d2 %>%
  dplyr::rename_with(~ stringr::str_replace(., "\\.", "_"), .cols = c(Var.1, Var.2))
# A tibble: 3 x 3
  Var_1 Var_2 Var.3
  <chr> <dbl> <dbl>
1 a         1     4
2 b        NA     5
3 c         3     1
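
To illustrate the earlier note about names with more than one period, the sketch below compares stringr::str_replace() and stringr::str_replace_all() on a plain character vector. The names in nms are hypothetical and do not come from d2. The last call uses stringr::fixed() as one possible alternative to escaping the period.

```r
library(stringr)

# Hypothetical variable names with two periods each (not from the original data).
nms <- c("Var.1.a", "Var.2.b")

# Replaces only the first period in each name.
first_only <- str_replace(nms, "\\.", "_")

# Replaces every period in each name.
all_periods <- str_replace_all(nms, "\\.", "_")

# fixed() matches the period literally, so no escaping is needed.
all_fixed <- str_replace_all(nms, fixed("."), "_")

first_only
all_periods
```

The same functions can be passed to dplyr::rename_with() in exactly the way shown for str_replace() above.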

2. Remove the prefixes 1:, 2: and 3: from the variable names

Review the data (d6)

# A tibble: 3 x 3
  `1:Var1` `2:Var2` `3:Var3`
  <chr>       <dbl>    <dbl>
1 a               1        4
2 b              NA        5
3 c               3        1

Modify variable names by adding the stringr::str_remove() function.

d6 %>% 
  dplyr::rename_with(~ stringr::str_remove(., "[:digit:]:"))
# A tibble: 3 x 3
  Var1   Var2  Var3
  <chr> <dbl> <dbl>
1 a         1     4
2 b        NA     5
3 c         3     1
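
The pattern above removes a single digit followed by a colon, which is enough for d6. As a hedged sketch of what could be done if a prefix had more than one digit, the example below uses an anchored pattern on hypothetical names (the "12:Var2" name is an assumption, not part of d6).

```r
library(stringr)

# Hypothetical names: the second one has a two-digit prefix.
nms <- c("1:Var1", "12:Var2")

# Pattern from the example above: removes one digit plus the colon,
# which leaves a stray digit behind for the two-digit prefix.
single_digit <- str_remove(nms, "[:digit:]:")

# Anchored alternative: removes all leading digits plus the colon.
leading_digits <- str_remove(nms, "^[0-9]+:")

single_digit
leading_digits
```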

3. For any variable that contains the word “var”, remove the prefix “s_”

Review the data (d5)

# A tibble: 3 x 3
  s_id  s_var1 s_var2
  <chr> <chr>   <dbl>
1 a     m         100
2 b     f         150
3 c     n         160

Modify variable names by adding the stringr::str_remove() function.

  • Note: Here we override the default, .cols = everything(), and select only the columns we want with the tidyselect selection helper contains().

  • Note: Other tidyselect selection helpers include starts_with(), ends_with(), matches(), num_range(), where() and more.

d5 %>% 
  dplyr::rename_with(~ stringr::str_remove(., "s_"), .cols = tidyselect::contains("var"))
# A tibble: 3 x 3
  s_id  var1   var2
  <chr> <chr> <dbl>
1 a     m       100
2 b     f       150
3 c     n       160
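
As a sketch of how some of the other selection helpers could be used in .cols, the example below rebuilds d5 (an assumed reconstruction from the printout above) and applies toupper() to different subsets of columns. The choice of toupper() is only for illustration.

```r
library(dplyr)

# Assumed reconstruction of d5: values are copied from the printed tibble above.
d5 <- tibble(
  s_id   = c("a", "b", "c"),
  s_var1 = c("m", "f", "n"),
  s_var2 = c(100, 150, 160)
)

# matches() selects by regular expression: names that end in a digit.
by_regex <- d5 %>%
  rename_with(toupper, .cols = matches("[0-9]$"))

# where() selects by column content: numeric columns only.
by_type <- d5 %>%
  rename_with(toupper, .cols = where(is.numeric))

names(by_regex)
names(by_type)
```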


Package: purrr


Function: set_names()


  • Note: You can also use purrr::set_names() to modify names, but it does not integrate with tidyselect, so you cannot restrict it to selected variables. It works well when you want to modify all variable names.


1. Remove “s_” in all variable names

Review the data (d5)

# A tibble: 3 x 3
  s_id  s_var1 s_var2
  <chr> <chr>   <dbl>
1 a     m         100
2 b     f         150
3 c     n         160

Modify variable names by adding the stringr::str_remove() function.

  • Note: If “s_” appeared more than once in a variable name, you could use stringr::str_remove_all() to remove every instance, rather than just the first one.

d5 %>%
  purrr::set_names(~ stringr::str_remove(., "s_"))
# A tibble: 3 x 3
  id    var1   var2
  <chr> <chr> <dbl>
1 a     m       100
2 b     f       150
3 c     n       160
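
Besides a function, purrr::set_names() also accepts a character vector of new names. The sketch below shows both forms side by side on an assumed reconstruction of d5 (values copied from the printout above). The vector form requires one name per column, in column order.

```r
library(dplyr)

# Assumed reconstruction of d5: values are copied from the printed tibble above.
d5 <- tibble(
  s_id   = c("a", "b", "c"),
  s_var1 = c("m", "f", "n"),
  s_var2 = c(100, 150, 160)
)

# Function form: the function is applied to the existing names.
by_function <- d5 %>%
  purrr::set_names(~ stringr::str_remove(., "s_"))

# Vector form: supply the new names directly, one per column.
by_vector <- d5 %>%
  purrr::set_names(c("id", "var1", "var2"))

# Both approaches produce the same names.
identical(names(by_function), names(by_vector))
```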
