set_variable_labels()
1. Set labels for one or more variables (id, gradelevel, q1) using a data dictionary
Review the data (d11)
# A tibble: 3 x 3
id gradelevel q1
<dbl> <dbl> <dbl>
1 1234 6 4
2 2345 6 5
3 3456 7 3
Review our data dictionary (dict)
# A tibble: 3 x 2
varname label
<chr> <chr>
1 id study ID
2 gradelevel student grade level
3 q1 do you get along with your teacher?
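The two tibbles printed above are shown but not constructed in the text. A minimal sketch that would reproduce them, assuming the column types in the printouts (doubles for the data, character for the dictionary):

```r
# Example data, matching the printed d11
d11 <- tibble::tibble(
  id = c(1234, 2345, 3456),
  gradelevel = c(6, 6, 7),
  q1 = c(4, 5, 3)
)

# Data dictionary: variable names in the first column, labels in the second
dict <- tibble::tibble(
  varname = c("id", "gradelevel", "q1"),
  label = c(
    "study ID",
    "student grade level",
    "do you get along with your teacher?"
  )
)

d11
dict
```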
Add variable labels
labelled::set_variable_labels() works with named lists or
character vectors. I prefer to work with named lists, which are matched
to variables by name. With a character vector, if the labels are not in
the exact same order as our data frame variables, they will be
incorrectly assigned.
We can use tibble::deframe() to create a named character
vector and turn it into a list using base::as.list(). Make
sure your varname variable is first in your data dictionary and
your label variable is second. The first variable will
become your “names”.
var_labels <- dict %>%
tibble::deframe() %>%
base::as.list()
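To illustrate the ordering risk described above, here is a small sketch, assuming the d11 and dict shown earlier. dict_shuffled is a hypothetical reordered dictionary introduced only for this illustration. As I understand the labelled documentation, an unnamed character vector with one element per column is applied by position, while a named list is matched by variable name.

```r
d11 <- tibble::tibble(
  id = c(1234, 2345, 3456),
  gradelevel = c(6, 6, 7),
  q1 = c(4, 5, 3)
)
dict <- tibble::tibble(
  varname = c("id", "gradelevel", "q1"),
  label = c("study ID", "student grade level",
            "do you get along with your teacher?")
)

# Hypothetical dictionary whose rows are in a different order than the data
dict_shuffled <- dict[c(3, 1, 2), ]

# Unnamed character vector: labels are assigned by column position
by_position <- labelled::set_variable_labels(
  d11,
  .labels = unname(dict_shuffled$label)
)
labelled::var_label(by_position$id)

# Named list: labels are matched by variable name
by_name <- labelled::set_variable_labels(
  d11,
  .labels = base::as.list(tibble::deframe(dict_shuffled))
)
labelled::var_label(by_name$id)
```

In the first call, id should end up with the label meant for q1; in the second, it should receive "study ID" regardless of row order.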
We could also get the same named list using
dplyr::pull() in conjunction with
base::as.list(). Note, however, that the order of the arguments
is reversed here. In dplyr::pull(), label comes first (the values) and
varname comes second (the names) in order for varname
to become our “names”.
var_labels <- dict %>%
dplyr::pull(label, varname) %>%
base::as.list()
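As a quick check that the two approaches agree, the sketch below builds the list both ways and compares the results. It assumes the dict tibble shown earlier; via_deframe and via_pull are illustrative names, not part of the original code.

```r
dict <- tibble::tibble(
  varname = c("id", "gradelevel", "q1"),
  label = c("study ID", "student grade level",
            "do you get along with your teacher?")
)

# Names come from the first column, values from the second
via_deframe <- base::as.list(tibble::deframe(dict))

# Values come from the first argument (label), names from the second (varname)
via_pull <- base::as.list(dplyr::pull(dict, label, varname))

names(via_deframe)
all.equal(via_deframe, via_pull)
```

If the comparison is not TRUE, the most likely cause is that the varname and label columns (or arguments) were swapped.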
Now we can use this list to add our variable labels using the argument .labels.
d11 <- d11 %>%
labelled::set_variable_labels(.labels = var_labels)
labelled::var_label(d11)
$id
[1] "study ID"
$gradelevel
[1] "student grade level"
$q1
[1] "do you get along with your teacher?"
This also works if you only want to add variable labels to some variables, not all. There may be times when some of your variables already have labels or you simply do not want to add labels to all variables. If there are existing labels and those variables are not in your data dictionary, those labels will remain. If they are in your data dictionary, those labels will be replaced with your data dictionary labels.
Here is an example where we choose to only add labels to 2 of our variables.
We use dplyr::filter() to narrow the data dictionary to
just the two variables of interest. You’ll see labels are only added to gradelevel and
q1.
var_labels <- dict %>%
dplyr::filter(varname %in% c("gradelevel", "q1")) %>%
tibble::deframe() %>%
base::as.list()
d11 <- d11 %>%
labelled::set_variable_labels(.labels = var_labels)
labelled::var_label(d11)
$id
NULL
$gradelevel
[1] "student grade level"
$q1
[1] "do you get along with your teacher?"
If “id” already had a label, like in the example below, and that variable was not in the data dictionary, that label would be retained.
Again, we narrow the data dictionary to only gradelevel
and q1 to demonstrate this point.
$id
[1] "research study id"
$gradelevel
NULL
$q1
NULL
var_labels <- dict %>%
dplyr::filter(varname %in% c("gradelevel", "q1")) %>%
tibble::deframe() %>%
base::as.list()
d11 <- d11 %>%
labelled::set_variable_labels(.labels = var_labels)
labelled::var_label(d11)
$id
[1] "research study id"
$gradelevel
[1] "student grade level"
$q1
[1] "do you get along with your teacher?"
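The example above starts from a data frame in which only id carries a label, but the code that creates that starting state is not shown. One way it could be set up, assuming a fresh, unlabelled d11, is to pass the label as a name-value pair:

```r
d11 <- tibble::tibble(
  id = c(1234, 2345, 3456),
  gradelevel = c(6, 6, 7),
  q1 = c(4, 5, 3)
)

# Label only id; gradelevel and q1 stay unlabelled
d11 <- labelled::set_variable_labels(d11, id = "research study id")

labelled::var_label(d11)
```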
However, if id is in our data dictionary, the data dictionary label will write over the existing label.
var_labels <- dict %>%
tibble::deframe() %>%
base::as.list()
d11 <- d11 %>%
labelled::set_variable_labels(.labels = var_labels)
labelled::var_label(d11)
$id
[1] "study ID"
$gradelevel
[1] "student grade level"
$q1
[1] "do you get along with your teacher?"
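Here is an example. The data dictionary (dict2) includes a variable, q2, that is not in our data. With .strict = FALSE, the label for q2 is ignored rather than causing an error.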
Review our data dictionary (dict2)
# A tibble: 4 x 2
varname label
<chr> <chr>
1 id study ID
2 gradelevel student grade level
3 q1 do you get along with your teacher?
4 q2 something
Add our labels
var_labels <- dict2 %>%
tibble::deframe() %>%
base::as.list()
d11 <- d11 %>%
labelled::set_variable_labels(.labels = var_labels, .strict = FALSE)
labelled::var_label(d11)
$id
[1] "study ID"
$gradelevel
[1] "student grade level"
$q1
[1] "do you get along with your teacher?"
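Because .strict = FALSE skips unmatched labels rather than stopping, it can help to check beforehand which dictionary entries have no matching column. A minimal sketch using base::setdiff(), assuming the d11 and dict2 shown above; not_in_data and not_in_dict are illustrative names:

```r
d11 <- tibble::tibble(
  id = c(1234, 2345, 3456),
  gradelevel = c(6, 6, 7),
  q1 = c(4, 5, 3)
)
dict2 <- tibble::tibble(
  varname = c("id", "gradelevel", "q1", "q2"),
  label = c("study ID", "student grade level",
            "do you get along with your teacher?", "something")
)

# Dictionary variables with no matching column in the data
not_in_data <- base::setdiff(dict2$varname, names(d11))
not_in_data

# Data columns that the dictionary does not cover
not_in_dict <- base::setdiff(names(d11), dict2$varname)
not_in_dict
```

Reviewing both vectors before labelling makes it easier to tell an intentional omission from a typo in the dictionary.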