Package: labelled


Function: set_value_labels()


1. Set labels for one variable, Var2, using a long formatted data dictionary

Review the data (d10)

# A tibble: 3 x 3
  Var1   Var2  Var3
  <chr> <dbl> <dbl>
1 a         1     3
2 b      -999     1
3 c         2  -999

In this scenario, we have an existing dataset (a data dictionary) that provides the value labels for our variables.

Review our data dictionary (dict_long)

Note that this dataset is not in the usual data dictionary format I would think of, where each row corresponds to one variable. Instead, it is a specialized long-format codebook, with one row per variable and value pair, that you may have received or created solely for the purpose of working with value labels.

# A tibble: 7 x 3
  var   value label    
  <chr> <dbl> <chr>    
1 Var2      1 yes      
2 Var2      2 no       
3 Var2   -999 missing  
4 Var3      1 never    
5 Var3      2 sometimes
6 Var3      3 always   
7 Var3   -999 missing  

Add value labels

  • Note: First we need to create a named vector: the values, named by their labels. We can do this using dplyr::pull(). Note that the label column is provided second. This is important because the second argument supplies the “names”.
var_labels <- dict_long %>%
  dplyr::filter(var == "Var2") %>%
  dplyr::pull(value, label) 
var_labels
    yes      no missing 
      1       2    -999 

We can also do this using tibble::deframe(). You will get the same result.

However, note that the order in which you list the columns is reversed. With this method, the label (which will become the “names”) must be provided first.

var_labels <- dict_long %>%
  dplyr::filter(var == "Var2") %>%
  dplyr::select(label, value) %>%
  tibble::deframe()
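
To check that the two approaches really agree, here is a minimal sketch. The dict_long tibble below is a hand-typed reconstruction of only the Var2 rows printed above, and the names labels_pull and labels_deframe are illustrative, not part of the original example.

```r
library(magrittr)  # provides %>%

# Assumed reconstruction of dict_long, limited to the Var2 rows shown above
dict_long <- tibble::tribble(
  ~var,   ~value, ~label,
  "Var2",      1, "yes",
  "Var2",      2, "no",
  "Var2",   -999, "missing"
)

# Method 1: pull() takes the values first, then the column that supplies the names
labels_pull <- dict_long %>%
  dplyr::filter(var == "Var2") %>%
  dplyr::pull(value, label)

# Method 2: deframe() expects the names column first, then the values
labels_deframe <- dict_long %>%
  dplyr::filter(var == "Var2") %>%
  dplyr::select(label, value) %>%
  tibble::deframe()

# Compare the two named vectors
identical(labels_pull, labels_deframe)
```

If the comparison returns FALSE on your own dictionary, the column order passed to dplyr::select() is the first thing to check.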

Now we can use this vector to add our value labels.

d10 <- d10 %>%
  labelled::set_value_labels(Var2 = var_labels)

labelled::val_labels(d10)
$Var1
NULL

$Var2
    yes      no missing 
      1       2    -999 

$Var3
NULL
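
If you want to run this section end to end, the following sketch rebuilds both objects by hand from the printed output above (the construction code is an assumption, since the original objects are only shown as output) and then applies the same steps.

```r
library(magrittr)  # provides %>%

# Assumed reconstruction of d10 from the printed tibble
d10 <- tibble::tibble(
  Var1 = c("a", "b", "c"),
  Var2 = c(1, -999, 2),
  Var3 = c(3, 1, -999)
)

# Assumed reconstruction of the long-format dictionary
dict_long <- tibble::tribble(
  ~var,   ~value, ~label,
  "Var2",      1, "yes",
  "Var2",      2, "no",
  "Var2",   -999, "missing",
  "Var3",      1, "never",
  "Var3",      2, "sometimes",
  "Var3",      3, "always",
  "Var3",   -999, "missing"
)

# Named vector of values (names = labels) for Var2 only
var_labels <- dict_long %>%
  dplyr::filter(var == "Var2") %>%
  dplyr::pull(value, label)

# Attach the value labels to Var2; Var1 and Var3 are left untouched
d10 <- d10 %>%
  labelled::set_value_labels(Var2 = var_labels)

# Inspect the labels and confirm Var2 now carries the labelled class
labelled::val_labels(d10)
inherits(d10$Var2, "haven_labelled")
```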

Function: labelled()


1. Set labels for multiple variables (Var2 and Var3) with the same label values using a long formatted data dictionary

Review the data (d5)

# A tibble: 3 x 3
  Var1   Var2  Var3
  <chr> <dbl> <dbl>
1 d         1    NA
2 e         2     2
3 f         1     1

This time several variables share the same labels, so we want to apply one set of labels across multiple variables.

Review our data dictionary (dict_long2)

# A tibble: 4 x 3
  var   value label
  <chr> <dbl> <chr>
1 Var2      1 yes  
2 Var2      2 no   
3 Var3      1 yes  
4 Var3      2 no   

Since labels are repeated in our data dictionary, filter to just one of the variables that have the labels we need.

var_labels <- dict_long2 %>%
  dplyr::filter(var == "Var2") %>%
  dplyr::pull(value, label) 

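Or you could remove duplicate labels using dplyr::distinct(). This assumes each label maps to the same value for every variable.

var_labels <- dict_long2 %>%
  dplyr::distinct(label, .keep_all = TRUE) %>%
  dplyr::pull(value, label)

Add value labels

Use dplyr::across() to apply labelled::labelled() to each variable.

d5 <- d5 %>%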
  dplyr::mutate(dplyr::across(Var2:Var3, 
            ~labelled::labelled(., labels = var_labels)))

d5 %>% 
  labelled::val_labels()
$Var1
NULL

$Var2
yes  no 
  1   2 

$Var3
yes  no 
  1   2 
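
As a runnable version of this scenario, the sketch below rebuilds d5 and dict_long2 by hand from the printed output above (an assumption, since the original objects are only shown as output) and applies the shared labels to both variables.

```r
library(magrittr)  # provides %>%

# Assumed reconstruction of d5 from the printed tibble
d5 <- tibble::tibble(
  Var1 = c("d", "e", "f"),
  Var2 = c(1, 2, 1),
  Var3 = c(NA, 2, 1)
)

# Assumed reconstruction of the long-format dictionary
dict_long2 <- tibble::tribble(
  ~var,   ~value, ~label,
  "Var2",      1, "yes",
  "Var2",      2, "no",
  "Var3",      1, "yes",
  "Var3",      2, "no"
)

# Keep one row per label, then build the named vector (names = labels)
var_labels <- dict_long2 %>%
  dplyr::distinct(label, .keep_all = TRUE) %>%
  dplyr::pull(value, label)

# Convert Var2 and Var3 to labelled vectors that share the same value labels
d5 <- d5 %>%
  dplyr::mutate(dplyr::across(Var2:Var3,
            ~labelled::labelled(., labels = var_labels)))

labelled::val_labels(d5)
```

Note that labelled::labelled() needs the label values to be of the same type as the column (here both are doubles), so a dictionary read in with character values would need converting first.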
