Package: dplyr


Function: relocate()


1. Relocate variables to be in the same order as your data dictionary

Review the data (d5)

# A tibble: 4 x 4
   s_id t_last_name t_first_name grade
  <dbl> <chr>       <chr>        <dbl>
1    10 simpson     homer            4
2    15 simpson     marge            5
3    12 simpson     homer            3
4    13 flanders    marge            2

Review the data dictionary

Notice that the order of variables in the dictionary differs from the current order of variables in the data.

# A tibble: 4 x 2
  var_name     label             
  <chr>        <chr>             
1 s_id         student id        
2 grade        grade level       
3 t_first_name teacher first name
4 t_last_name  teacher last name 
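
To follow along, the two objects printed above could be recreated roughly as follows. This is only a sketch: the values are typed in by hand from the printed output, and how d5 and dict were originally built is not shown here, so the use of tibble::tribble() is an assumption.

```r
library(dplyr)

# Student data, copied from the printed tibble above
d5 <- tibble::tribble(
  ~s_id, ~t_last_name, ~t_first_name, ~grade,
  10,    "simpson",    "homer",       4,
  15,    "simpson",    "marge",       5,
  12,    "simpson",    "homer",       3,
  13,    "flanders",   "marge",       2
)

# Data dictionary, copied from the printed tibble above
dict <- tibble::tribble(
  ~var_name,      ~label,
  "s_id",         "student id",
  "grade",        "grade level",
  "t_first_name", "teacher first name",
  "t_last_name",  "teacher last name"
)
```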

We can now create a character vector of all of the variables in the correct order using our data dictionary.

  • Note: We use dplyr::pull() to extract the var_name column from our data dictionary as a character vector. pull() accepts the column name directly, so a separate select() step is not needed.
var_order <- dict %>%
  dplyr::pull(var_name)

We can now use this character vector to reorder variables in our data frame.

  • Note: We wrap the vector in tidyselect::all_of() because var_order is a character vector stored in our environment, not a column in the data. dplyr re-exports all_of(), so it is available once dplyr is loaded.
d5 %>%
  dplyr::relocate(all_of(var_order))
# A tibble: 4 x 4
   s_id grade t_first_name t_last_name
  <dbl> <dbl> <chr>        <chr>      
1    10     4 homer        simpson    
2    15     5 marge        simpson    
3    12     3 homer        simpson    
4    13     2 marge        flanders   
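
The call above only prints the reordered data. A minimal sketch of saving the result and confirming that the new column order matches the dictionary is shown below. The object name d5_ordered is hypothetical, and the data and var_order are retyped by hand from the output above so that the block runs on its own.

```r
library(dplyr)

# Data and variable order, retyped from the printed output above
d5 <- tibble::tibble(
  s_id         = c(10, 15, 12, 13),
  t_last_name  = c("simpson", "simpson", "simpson", "flanders"),
  t_first_name = c("homer", "marge", "homer", "marge"),
  grade        = c(4, 5, 3, 2)
)
var_order <- c("s_id", "grade", "t_first_name", "t_last_name")

# Assign the result; relocate() does not modify d5 in place
d5_ordered <- d5 %>%
  dplyr::relocate(dplyr::all_of(var_order))

# Check that the column names now follow the dictionary order
identical(names(d5_ordered), var_order)
# TRUE
```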

If your data frame contains variables that are not in your data dictionary, I recommend first adding them to the dictionary so that every variable is accounted for.

If you do reorder with a dictionary that is missing some variables, the variables listed in the dictionary are moved to the front in dictionary order, and all variables not listed in the dictionary follow at the end of your dataset.

Here is an example of this situation.

Our data dictionary (dict2) lists only three of the four variables; s_id is missing.

# A tibble: 3 x 2
  var_name     label             
  <chr>        <chr>             
1 grade        grade level       
2 t_first_name teacher first name
3 t_last_name  teacher last name 
var_order <- dict2 %>%
  dplyr::pull(var_name)
d5 %>%
  dplyr::relocate(all_of(var_order))
# A tibble: 4 x 4
  grade t_first_name t_last_name  s_id
  <dbl> <chr>        <chr>       <dbl>
1     4 homer        simpson        10
2     5 marge        simpson        15
3     3 homer        simpson        12
4     2 marge        flanders       13
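
Before reordering, it can help to list which variables in the data are missing from the dictionary. One possible check, using base R's setdiff(), is sketched here. The object name missing_from_dict is hypothetical, and the column names are retyped by hand from the output above.

```r
# Column names of d5 and the partial dictionary order, retyped from above
data_vars <- c("s_id", "t_last_name", "t_first_name", "grade")
var_order <- c("grade", "t_first_name", "t_last_name")

# Variables present in the data but absent from the dictionary
missing_from_dict <- setdiff(data_vars, var_order)
missing_from_dict
# "s_id"
```

With a real data frame, data_vars would simply be names(d5).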

The opposite can also happen: your data dictionary lists variables that do not exist in your data frame. In this situation all_of() throws an error along the lines of “Can’t subset columns that don’t exist”. You will need to either filter your data dictionary to the variables that exist in your current data, or fix the discrepancies between your data dictionary and your data frame.
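
As a rough sketch of the first option, the dictionary can be filtered to the variables present in the data before pulling the names. The dictionary dict3 and its extra variable school are hypothetical and exist only for this illustration. The last step shows tidyselect::any_of(), which ignores names that are not in the data; that alternative is not part of the workflow described above.

```r
library(dplyr)

# Data retyped from the printed output above
d5 <- tibble::tibble(
  s_id         = c(10, 15, 12, 13),
  t_last_name  = c("simpson", "simpson", "simpson", "flanders"),
  t_first_name = c("homer", "marge", "homer", "marge"),
  grade        = c(4, 5, 3, 2)
)

# Hypothetical dictionary with one variable (school) that is not in d5
dict3 <- tibble::tribble(
  ~var_name,      ~label,
  "s_id",         "student id",
  "grade",        "grade level",
  "school",       "school name (hypothetical)",
  "t_first_name", "teacher first name",
  "t_last_name",  "teacher last name"
)

# Keep only the dictionary rows whose variable exists in the data
var_order <- dict3 %>%
  dplyr::filter(var_name %in% names(d5)) %>%
  dplyr::pull(var_name)

d5_filtered <- d5 %>%
  dplyr::relocate(dplyr::all_of(var_order))

# Alternative: any_of() skips names that are missing from the data
d5_any <- d5 %>%
  dplyr::relocate(dplyr::any_of(dict3$var_name))
```

Filtering the dictionary keeps the strictness of all_of() for everything else, whereas any_of() silently skips missing names, which can hide typos in the dictionary.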
