rename()
The dplyr::rename() formula is new value = old value. This is the opposite of dplyr::recode(), where the formula is old value = new value.
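To make the direction of the two formulas concrete, here is a minimal sketch. The one-column tibble `d` and the replacement value "apple" are made up for illustration only.

```r
# Hypothetical one-column data frame, for illustration only
d <- tibble::tibble(Var1 = c("a", "b", "c"))

# rename(): new name on the left, old name on the right
renamed <- dplyr::rename(d, stu_id = Var1)
names(renamed)
#> [1] "stu_id"

# recode(): old value on the left, new value on the right
recoded <- dplyr::recode(d$Var1, a = "apple")
recoded
#> [1] "apple" "b"     "c"
```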
1. Set all column names using an existing file (for example, a data dictionary).
Review the data (d1)
# A tibble: 3 x 3
Var1 Var2 Var3
<chr> <dbl> <dbl>
1 a 2 3.6
2 b NA 8.5
3 c 3 NA
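Read in and review the data dictionary
Note: The new names should be in the first column and the old names in the second. If the columns are in a different order, use dplyr::relocate() to reorder them.
dict <- readxl::read_excel("dictionary.xlsx")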
# A tibble: 3 x 2
new_name old_name
<chr> <chr>
1 stu_id Var1
2 read_score Var2
3 math_score Var3
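Column order matters because the conversion to a named vector takes names from the first column and values from the second. The sketch below shows one way dplyr::relocate() could fix a dictionary whose columns arrive in the opposite order. The object `dict_wrong` is hypothetical and built in memory rather than read from a file.

```r
# Hypothetical dictionary with the old names in the first column
dict_wrong <- tibble::tibble(
  old_name = c("Var1", "Var2", "Var3"),
  new_name = c("stu_id", "read_score", "math_score")
)

# Move new_name to the front so it supplies the names of the vector
dict_fixed <- dplyr::relocate(dict_wrong, new_name)

# First column becomes the names, second column becomes the values
dict_names <- tibble::deframe(dict_fixed)
dict_names
#>     stu_id read_score math_score
#>     "Var1"     "Var2"     "Var3"
```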
In this case we can use the existing file to rename the variables for us, rather than us hand entering “stu_id = Var1, read_score = Var2, math_score = Var3”. You can see how this would save us time if we have many variables to rename.
The first thing we need to do is use the function tibble::deframe() to convert our two-column dictionary data frame to a named vector. The first column becomes the names and the second column becomes the values.
dict_names <- tibble::deframe(dict)
dict_names
stu_id read_score math_score
"Var1" "Var2" "Var3"
Now we can rename our variables using this vector.
Note: We wrap the vector in tidyselect::all_of() to remove ambiguity between columns and external variables. See this link for more details: https://tidyselect.r-lib.org/reference/faq-external-vector.html
d1 %>%
dplyr::rename(tidyselect::all_of(dict_names))
# A tibble: 3 x 3
stu_id read_score math_score
<chr> <dbl> <dbl>
1 a 2 3.6
2 b NA 8.5
3 c 3 NA
There are many other ways to rename variables using another column of names. However, I like this way best because of its versatility.
Even if the data dictionary rows are not in the same order as the variables in our data frame, the renaming will still work.
For example, here is the dictionary where the rows are in a different order.
dict2
# A tibble: 3 x 2
new_name old_name
<chr> <chr>
1 math_score Var3
2 stu_id Var1
3 read_score Var2
And we can still use this to rename our variables.
dict_names <- tibble::deframe(dict2)
dict_names
math_score stu_id read_score
"Var3" "Var1" "Var2"
d1 %>%
dplyr::rename(tidyselect::all_of(dict_names))
# A tibble: 3 x 3
stu_id read_score math_score
<chr> <dbl> <dbl>
1 a 2 3.6
2 b NA 8.5
3 c 3 NA
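The order independence shown above can also be checked programmatically. This is a small sketch that rebuilds d1 and both dictionaries in memory (instead of reading dictionary.xlsx) and compares the resulting column names.

```r
# Rebuild the example data and dictionary in memory
d1 <- tibble::tibble(
  Var1 = c("a", "b", "c"),
  Var2 = c(2, NA, 3),
  Var3 = c(3.6, 8.5, NA)
)
dict <- tibble::tibble(
  new_name = c("stu_id", "read_score", "math_score"),
  old_name = c("Var1", "Var2", "Var3")
)

# Same dictionary with the rows shuffled (math_score, stu_id, read_score)
dict2 <- dict[c(3, 1, 2), ]

out1 <- dplyr::rename(d1, tidyselect::all_of(tibble::deframe(dict)))
out2 <- dplyr::rename(d1, tidyselect::all_of(tibble::deframe(dict2)))

# Columns keep their position in d1 regardless of dictionary row order
identical(names(out1), names(out2))
#> [1] TRUE
```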
We can also use this approach even if the dictionary doesn’t have new names for all of our variables. Take this dictionary, which can only rename 2 of our variables.
dict3
# A tibble: 2 x 2
new_name old_name
<chr> <chr>
1 stu_id Var1
2 read_score Var2
We can still rename.
dict_names <- tibble::deframe(dict3)
dict_names
stu_id read_score
"Var1" "Var2"
d1 %>%
dplyr::rename(tidyselect::all_of(dict_names))
# A tibble: 3 x 3
stu_id read_score Var3
<chr> <dbl> <dbl>
1 a 2 3.6
2 b NA 8.5
3 c 3 NA
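The reverse situation, where the dictionary lists a variable that is not in the data, behaves differently: tidyselect::all_of() raises an error, while tidyselect::any_of() skips the missing entry. A minimal sketch, assuming a made-up dictionary entry (sci_score = "Var9") that has no matching column in d1:

```r
# Rebuild the example data in memory
d1 <- tibble::tibble(
  Var1 = c("a", "b", "c"),
  Var2 = c(2, NA, 3),
  Var3 = c(3.6, 8.5, NA)
)

# Hypothetical named vector: "Var9" does not exist in d1
dict_names <- c(stu_id = "Var1", sci_score = "Var9")

# all_of() is strict and fails on the missing column
res <- try(dplyr::rename(d1, tidyselect::all_of(dict_names)), silent = TRUE)
inherits(res, "try-error")
#> [1] TRUE

# any_of() renames what it can find and ignores the rest
out <- dplyr::rename(d1, tidyselect::any_of(dict_names))
names(out)
#> [1] "stu_id" "Var2"   "Var3"
```

Whether to prefer the strict or the lenient helper depends on whether a missing variable should stop the script or be silently tolerated.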