rename_with()
Note: Unlike dplyr::rename(), where you can rename individual columns, dplyr::rename_with() renames columns using a function.
Note: You can either rename all variables or select variables. The default argument is .cols = everything().
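As a minimal sketch of the difference between the two functions, the snippet below rebuilds the d2 tibble shown in the first example (the construction code and the new name id are assumptions for illustration) and uses base R's toupper() as the renaming function.

```r
library(dplyr)

# Assumed reconstruction of the d2 tibble used in the examples below
d2 <- tibble::tibble(
  Var.1 = c("a", "b", "c"),
  Var.2 = c(1, NA, 3),
  Var.3 = c(4, 5, 1)
)

# rename(): each column is renamed individually (new_name = old_name)
renamed_one <- d2 %>% dplyr::rename(id = Var.1)

# rename_with(): a function is applied to the selected names
# (.cols defaults to everything(), so all names are changed)
renamed_all <- d2 %>% dplyr::rename_with(toupper)

names(renamed_one)  # "id" "Var.2" "Var.3"
names(renamed_all)  # "VAR.1" "VAR.2" "VAR.3"
```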
1. Replace periods in all variable names with underscores
Review the data (d2)
# A tibble: 3 x 3
Var.1 Var.2 Var.3
<chr> <dbl> <dbl>
1 a 1 4
2 b NA 5
3 c 3 1
Modify variable names by adding the stringr::str_replace() function.
Note: In regular expressions (regex), “.” has a special meaning: it matches any character. To match a literal period, we need to escape it by adding “\\” in front of it.
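To see why the escape matters, here is a small sketch on a plain character vector (the vector x is made up for illustration). It also shows stringr::fixed(), which treats the pattern as a literal string and is one possible alternative to escaping.

```r
library(stringr)

x <- c("Var.1", "Var.2")

# Unescaped "." matches any character, so the first character is replaced
unescaped <- str_replace(x, ".", "_")

# Escaped "\\." matches a literal period
escaped <- str_replace(x, "\\.", "_")

# fixed() matches the pattern literally, so no escaping is needed
literal <- str_replace(x, fixed("."), "_")

unescaped  # "_ar.1" "_ar.2"
escaped    # "Var_1" "Var_2"
literal    # "Var_1" "Var_2"
```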
Note: If there were more than one period in the variable names, you could also use stringr::str_replace_all() to replace all instances of periods, rather than just the first instance.
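A quick sketch of that difference, using a hypothetical variable name that contains two periods:

```r
library(stringr)

# Hypothetical variable name containing two periods
x <- "Var.1.a"

# str_replace() changes only the first match
first_only <- str_replace(x, "\\.", "_")

# str_replace_all() changes every match
every_one <- str_replace_all(x, "\\.", "_")

first_only  # "Var_1.a"
every_one   # "Var_1_a"
```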
Note: In this first example, we are creating an anonymous function using the formula shorthand ~ to supply the stringr::str_replace() arguments of pattern and replacement.
d2 %>%
dplyr::rename_with(~ stringr::str_replace(., pattern = "\\.", replacement = "_"))
# A tibble: 3 x 3
Var_1 Var_2 Var_3
<chr> <dbl> <dbl>
1 a 1 4
2 b NA 5
3 c 3 1
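The formula shorthand is one of several ways to write the renaming function. As a sketch (with d2 rebuilt here as an assumption so the snippet runs on its own), the same call can be written with an explicit function(x), where x plays the role of the . placeholder:

```r
library(dplyr)

# Assumed reconstruction of the d2 tibble shown above
d2 <- tibble::tibble(
  Var.1 = c("a", "b", "c"),
  Var.2 = c(1, NA, 3),
  Var.3 = c(4, 5, 1)
)

# Formula shorthand: . stands for the vector of column names
with_formula <- d2 %>%
  dplyr::rename_with(~ stringr::str_replace(., "\\.", "_"))

# The same renaming written as an explicit anonymous function
with_function <- d2 %>%
  dplyr::rename_with(function(x) stringr::str_replace(x, "\\.", "_"))

identical(names(with_formula), names(with_function))  # TRUE
```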
Instead of creating an anonymous function, we could also “pass the dots”: additional arguments supplied through the ... argument of dplyr::rename_with() are passed on to the renaming function.
d2 %>%
dplyr::rename_with(stringr::str_replace, pattern = "\\.", replacement = "_")
# A tibble: 3 x 3
Var_1 Var_2 Var_3
<chr> <dbl> <dbl>
1 a 1 4
2 b NA 5
3 c 3 1
To rename only select variables, list them in the .cols argument.
d2 %>%
dplyr::rename_with(~ stringr::str_replace(., "\\.", "_"), .cols = c(Var.1, Var.2))
# A tibble: 3 x 3
Var_1 Var_2 Var.3
<chr> <dbl> <dbl>
1 a 1 4
2 b NA 5
3 c 3 1
2. Remove 1:, 2: and 3: from Var1, Var2 and Var3
Review the data (d6)
# A tibble: 3 x 3
`1:Var1` `2:Var2` `3:Var3`
<chr> <dbl> <dbl>
1 a 1 4
2 b NA 5
3 c 3 1
Modify variable names by adding the stringr::str_remove() function.
d6 %>%
dplyr::rename_with(~ stringr::str_remove(., "[:digit:]:"))
# A tibble: 3 x 3
Var1 Var2 Var3
<chr> <dbl> <dbl>
1 a 1 4
2 b NA 5
3 c 3 1
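The pattern above removes the first single digit that is followed by a colon. The sketch below tries it on plain strings and compares it with a stricter variant, "^\\d+:", which is an assumption for illustration rather than part of the original example: it anchors the match at the start of the name and allows prefixes of more than one digit.

```r
library(stringr)

old_names <- c("1:Var1", "2:Var2", "3:Var3")

# Pattern from the example: one digit followed by a colon
cleaned <- str_remove(old_names, "[:digit:]:")

# Hypothetical name with a two-digit prefix
long_prefix <- "12:Var12"

# The one-digit pattern only removes "2:", leaving the leading "1" behind
one_digit <- str_remove(long_prefix, "[:digit:]:")

# Assumed stricter variant: start of string, one or more digits, then a colon
anchored <- str_remove(long_prefix, "^\\d+:")

cleaned    # "Var1" "Var2" "Var3"
one_digit  # "1Var12"
anchored   # "Var12"
```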
3. For any variable that contains the word “var”, remove the prefix “s_”
Review the data (d5)
# A tibble: 3 x 3
s_id s_var1 s_var2
<chr> <chr> <dbl>
1 a m 100
2 b f 150
3 c n 160
Modify variable names by adding the stringr::str_remove() function.
Note: We are using the .cols argument here, changing it from its default of .cols = everything() and selecting the columns we want with the tidyselect selection helper contains().
Note: Other tidyselect selection helpers include starts_with(), ends_with(), matches(), num_range(), where() and more.
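As a rough sketch of how a few of these helpers behave, the snippet below rebuilds d5 (the construction code is an assumption) and uses toupper() as a stand-in renaming function so that the selected columns are easy to spot in the resulting names.

```r
library(dplyr)

# Assumed reconstruction of the d5 tibble shown above
d5 <- tibble::tibble(
  s_id   = c("a", "b", "c"),
  s_var1 = c("m", "f", "n"),
  s_var2 = c(100, 150, 160)
)

# starts_with(): names that begin with a literal prefix
by_prefix <- d5 %>% rename_with(toupper, .cols = starts_with("s_var")) %>% names()

# ends_with(): names that end with a literal suffix
by_suffix <- d5 %>% rename_with(toupper, .cols = ends_with("2")) %>% names()

# matches(): names that match a regular expression
by_regex <- d5 %>% rename_with(toupper, .cols = matches("var[0-9]$")) %>% names()

# where(): columns whose contents satisfy a predicate function
by_type <- d5 %>% rename_with(toupper, .cols = where(is.numeric)) %>% names()

by_prefix  # "s_id" "S_VAR1" "S_VAR2"
by_suffix  # "s_id" "s_var1" "S_VAR2"
by_regex   # "s_id" "S_VAR1" "S_VAR2"
by_type    # "s_id" "s_var1" "S_VAR2"
```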
d5 %>%
dplyr::rename_with(~ stringr::str_remove(., "s_"), .cols = tidyselect::contains("var"))
# A tibble: 3 x 3
s_id var1 var2
<chr> <chr> <dbl>
1 a m 100
2 b f 150
3 c n 160
set_names()
You can use purrr::set_names() to modify names as well, but it doesn’t have the nice compatibility with tidyselect that allows you to modify only select variable names. But purrr::set_names() works well to modify all variable names.
1. Remove “s_” in all variable names
Review the data (d5)
# A tibble: 3 x 3
s_id s_var1 s_var2
<chr> <chr> <dbl>
1 a m 100
2 b f 150
3 c n 160
Modify variable names by adding the stringr::str_remove() function.
Note: You could also use stringr::str_replace_all() with an empty replacement, or stringr::str_remove_all(), to remove all instances of s_ in a variable name, rather than just the first instance.
d5 %>%
purrr::set_names(~ stringr::str_remove(., "s_"))
# A tibble: 3 x 3
id var1 var2
<chr> <chr> <dbl>
1 a m 100
2 b f 150
3 c n 160
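For comparison, here is a sketch of three ways to reach the same names, with d5 rebuilt as an assumption so the snippet runs on its own: purrr::set_names() with a function, purrr::set_names() with an explicit character vector, and dplyr::rename_with() with its default .cols = everything().

```r
library(dplyr)

# Assumed reconstruction of the d5 tibble shown above
d5 <- tibble::tibble(
  s_id   = c("a", "b", "c"),
  s_var1 = c("m", "f", "n"),
  s_var2 = c(100, 150, 160)
)

# set_names() with a function applied to the existing names
by_function <- d5 %>%
  purrr::set_names(~ stringr::str_remove(., "s_"))

# set_names() with the complete vector of new names written out
by_vector <- d5 %>%
  purrr::set_names(c("id", "var1", "var2"))

# rename_with() applied to all columns gives the same names
by_rename_with <- d5 %>%
  dplyr::rename_with(~ stringr::str_remove(., "s_"))

names(by_function)     # "id" "var1" "var2"
names(by_vector)       # "id" "var1" "var2"
names(by_rename_with)  # "id" "var1" "var2"
```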