relocate()
1. Relocate variables to be in the same order as your data dictionary
Review the data (d5)
# A tibble: 4 x 4
s_id t_last_name t_first_name grade
<dbl> <chr> <chr> <dbl>
1 10 simpson homer 4
2 15 simpson marge 5
3 12 simpson homer 3
4 13 flanders marge 2
Review the data dictionary (dict)
Notice that the variables in the dictionary are in a different order than the variables in the data.
# A tibble: 4 x 2
var_name label
<chr> <chr>
1 s_id student id
2 grade grade level
3 t_first_name teacher first name
4 t_last_name teacher last name
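If you want to follow along, here is a minimal sketch that recreates both objects from the printed output above. The values are copied from the two tibbles shown; building them with tibble::tibble() is my assumption, since in practice you would most likely read both in from files.

```r
library(dplyr)

# Example data, recreated from the printed output above
d5 <- tibble::tibble(
  s_id = c(10, 15, 12, 13),
  t_last_name = c("simpson", "simpson", "simpson", "flanders"),
  t_first_name = c("homer", "marge", "homer", "marge"),
  grade = c(4, 5, 3, 2)
)

# Data dictionary, recreated from the printed output above
dict <- tibble::tibble(
  var_name = c("s_id", "grade", "t_first_name", "t_last_name"),
  label = c("student id", "grade level",
            "teacher first name", "teacher last name")
)

# Same set of variables, but not in the same order
same_set <- setequal(names(d5), dict$var_name)
same_order <- identical(names(d5), dict$var_name)
```

Here same_set is TRUE and same_order is FALSE, which is exactly the situation relocate() is meant to fix.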
We can now use our data dictionary to create a character vector of all of the variables in the correct order.
We use dplyr::pull() to extract the one column with the names of the variables in our data dictionary and return it as a character vector.
var_order <- dict %>%
  dplyr::select(var_name) %>%
  dplyr::pull()
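As a side note, pull() also accepts a column name directly, so the select() step is optional. The sketch below rebuilds the dictionary so it runs on its own; the object names var_order_short and var_order_base are mine, used only for illustration.

```r
library(dplyr)

# Data dictionary, recreated from the printed output above
dict <- tibble::tibble(
  var_name = c("s_id", "grade", "t_first_name", "t_last_name"),
  label = c("student id", "grade level",
            "teacher first name", "teacher last name")
)

# Two-step version: select the column, then pull it
var_order <- dict %>%
  dplyr::select(var_name) %>%
  dplyr::pull()

# One-step version: name the column inside pull()
var_order_short <- dict %>%
  dplyr::pull(var_name)

# Base R equivalent
var_order_base <- dict$var_name
```

All three return the same character vector, so which one you use is a matter of preference.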
We can now use this character vector to reorder variables in our data frame.
We use tidyselect::all_of() to select the variables named in a character vector (an environment variable, as opposed to a column in the data).
d5 %>%
  dplyr::relocate(all_of(var_order))
# A tibble: 4 x 4
s_id grade t_first_name t_last_name
<dbl> <dbl> <chr> <chr>
1 10 4 homer simpson
2 15 5 marge simpson
3 12 3 homer simpson
4 13 2 marge flanders
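If you want to confirm the result in code rather than by eye, one option is to compare the column names of the reordered data against the dictionary order. This is a sketch under the assumption that the reordered data is saved to a new object, here called d5_ordered (a name I made up for this example).

```r
library(dplyr)

# Example data, recreated from the printed output above
d5 <- tibble::tibble(
  s_id = c(10, 15, 12, 13),
  t_last_name = c("simpson", "simpson", "simpson", "flanders"),
  t_first_name = c("homer", "marge", "homer", "marge"),
  grade = c(4, 5, 3, 2)
)

# Variable order taken from the data dictionary
var_order <- c("s_id", "grade", "t_first_name", "t_last_name")

# Save the reordered data so it can be checked
d5_ordered <- d5 %>%
  dplyr::relocate(dplyr::all_of(var_order))

# Stops with an error if the column order does not match the dictionary
stopifnot(identical(names(d5_ordered), var_order))
```

Because relocate() only moves columns, the rows and values are unchanged.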
If your data frame contains variables that are not in your data dictionary, I recommend first adding them to the dictionary so that everything is accounted for.
However, if you do reorder a data frame that has more variables than your data dictionary, the variables listed in the dictionary are moved to the front and all variables not listed are left at the end of your dataset, in their existing order.
Here is an example of this situation.
Our data dictionary (dict2), which lists only some of the variables.
# A tibble: 3 x 2
var_name label
<chr> <chr>
1 grade grade level
2 t_first_name teacher first name
3 t_last_name teacher last name
var_order <- dict2 %>%
  dplyr::select(var_name) %>%
  dplyr::pull()

d5 %>%
  dplyr::relocate(all_of(var_order))
# A tibble: 4 x 4
grade t_first_name t_last_name s_id
<dbl> <chr> <chr> <dbl>
1 4 homer simpson 10
2 5 marge simpson 15
3 3 homer simpson 12
4 2 marge flanders 13
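Since this situation produces no error or warning, it can help to list the undocumented variables before reordering. The sketch below uses base R's setdiff(); the object name not_in_dict is hypothetical.

```r
library(dplyr)

# Example data, recreated from the printed output above
d5 <- tibble::tibble(
  s_id = c(10, 15, 12, 13),
  t_last_name = c("simpson", "simpson", "simpson", "flanders"),
  t_first_name = c("homer", "marge", "homer", "marge"),
  grade = c(4, 5, 3, 2)
)

# Partial dictionary, recreated from the printed output above
dict2 <- tibble::tibble(
  var_name = c("grade", "t_first_name", "t_last_name"),
  label = c("grade level", "teacher first name", "teacher last name")
)

var_order <- dict2 %>%
  dplyr::pull(var_name)

# Variables that are in the data but missing from the dictionary
not_in_dict <- setdiff(names(d5), var_order)
not_in_dict
```

In this example not_in_dict contains only "s_id", the variable that ended up at the end of the dataset.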
The other scenario that might happen is that your data dictionary has more variables than your data frame. In this situation you will get an error that says “Can’t subset columns that don’t exist”. You will need to either filter your data dictionary to the variables that exist in your current data or fix the discrepancies between your data dictionary and your data frame.
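Here is a rough sketch of the first option, filtering the dictionary down to the variables that exist in the data. The dictionary dict3 and its extra variable school are hypothetical and made up for this example. I also show tidyselect::any_of(), which ignores names that are not in the data, as an alternative. Keep in mind that any_of() hides the discrepancy rather than fixing it.

```r
library(dplyr)

# Example data, recreated from the printed output above
d5 <- tibble::tibble(
  s_id = c(10, 15, 12, 13),
  t_last_name = c("simpson", "simpson", "simpson", "flanders"),
  t_first_name = c("homer", "marge", "homer", "marge"),
  grade = c(4, 5, 3, 2)
)

# Hypothetical dictionary with one variable ("school") that is not in d5
dict3 <- tibble::tibble(
  var_name = c("s_id", "grade", "school", "t_first_name", "t_last_name"),
  label = c("student id", "grade level", "school name",
            "teacher first name", "teacher last name")
)

# Variables documented in the dictionary but absent from the data
not_in_data <- setdiff(dict3$var_name, names(d5))

# Option 1: keep only the dictionary rows whose variable exists in the data
var_order <- dict3 %>%
  dplyr::filter(var_name %in% names(d5)) %>%
  dplyr::pull(var_name)

d5_filtered <- d5 %>%
  dplyr::relocate(dplyr::all_of(var_order))

# Option 2: any_of() silently skips names that are not in the data
d5_any <- d5 %>%
  dplyr::relocate(dplyr::any_of(dict3$var_name))
```

Both options return the columns in the order s_id, grade, t_first_name, t_last_name. I would still review not_in_data afterwards, since a variable missing from the data may point to a real problem.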