Package: dplyr


Function: across()


  • Note: dplyr::across() lets us use dplyr::select() semantics (column ranges, c(), and selection helpers) inside functions like dplyr::summarise() or dplyr::mutate(), so one transformation can be applied to several columns at once.


1. Round all mean scores to 2 digits

Review the data (d5)

# A tibble: 3 x 4
     id scale1_mean scale2_mean scale3_mean
  <dbl> <chr>       <chr>       <chr>      
1    10 23.455      22.133      21.921     
2    11 19.001      21.677      23.808     
3    12 17.465      18.111      24.393     

We want to round all mean scores to 2 digits.

Since we want to apply the same transformation to multiple variables, we can use the dplyr::across() function.

  • Note: I am using the dplyr::mutate() function to modify our existing variables.

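  • Note: I am using the base::round() function with the digits argument to round our variables. base::round() only works on numeric columns, so columns stored as character (as the <chr> headers in the printout suggest) must first be converted, for example with as.numeric(). To learn more about how base R does rounding, see Rounding.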

d5 %>%
  dplyr::mutate(dplyr::across(scale1_mean:scale3_mean,
                ~ round(.x, digits = 2)))
# A tibble: 3 x 4
     id scale1_mean scale2_mean scale3_mean
  <dbl> <chr>       <chr>       <chr>      
1    10 23.45       22.13       21.92      
2    11 19.00       21.68       23.81      
3    12 17.46       18.11       24.39      
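The printout above lists the score columns as <chr>, and base::round() raises an error on character input. The following is a minimal sketch of one way to handle that case, assuming the scores really are stored as character. Here d5 is rebuilt by hand from the printed values (an assumption, since the original object is not shown), and d5_rounded is a name introduced only for illustration.

```r
library(dplyr)

# Assumption: d5 rebuilt by hand from the printout, with scores stored as character
d5 <- tibble::tribble(
  ~id, ~scale1_mean, ~scale2_mean, ~scale3_mean,
  10,  "23.455",     "22.133",     "21.921",
  11,  "19.001",     "21.677",     "23.808",
  12,  "17.465",     "18.111",     "24.393"
)

# Convert each selected column to numeric, then round it to 2 digits
d5_rounded <- d5 %>%
  dplyr::mutate(dplyr::across(scale1_mean:scale3_mean,
                ~ round(as.numeric(.x), digits = 2)))
```

After this step the score columns are numeric (<dbl>) rather than character.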

Similar to dplyr::select(), you can also select your variables by listing their names inside c().

d5 %>%
  dplyr::mutate(dplyr::across(c(scale1_mean, scale2_mean, scale3_mean),
                ~ round(.x, digits = 2)))
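If the variable names are held in a character vector (quoted names) rather than typed as bare names, the vector can be wrapped in dplyr::all_of(). This is a small sketch under the assumption that the scores are numeric. The objects d5 (rebuilt by hand from the printout), mean_vars, and d5_rounded are introduced only for illustration.

```r
library(dplyr)

# Assumption: d5 rebuilt by hand from the printout, with numeric scores
d5 <- tibble::tribble(
  ~id, ~scale1_mean, ~scale2_mean, ~scale3_mean,
  10,  23.455,       22.133,       21.921,
  11,  19.001,       21.677,       23.808,
  12,  17.465,       18.111,       24.393
)

# Hypothetical character vector holding the column names
mean_vars <- c("scale1_mean", "scale2_mean", "scale3_mean")

# all_of() selects exactly the columns named in the vector
# and errors if any of them is missing from the data
d5_rounded <- d5 %>%
  dplyr::mutate(dplyr::across(dplyr::all_of(mean_vars),
                ~ round(.x, digits = 2)))
```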

Or use a tidyselect selection helper such as contains().

d5 %>%
  dplyr::mutate(dplyr::across(contains("mean"),
                ~ round(.x, digits = 2)))
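One thing to watch with contains("mean") is that it matches every column with "mean" anywhere in its name. The sketch below uses a hypothetical data set, d5_extra, with an extra character column mean_group (not part of the original data) to show how the narrower helper dplyr::ends_with() leaves that column alone.

```r
library(dplyr)

# Hypothetical data: two score columns plus an unrelated column
# whose name also contains "mean"
d5_extra <- tibble::tribble(
  ~id, ~scale1_mean, ~scale2_mean, ~mean_group,
  10,  23.455,       22.133,       "low",
  11,  19.001,       21.677,       "high"
)

# ends_with("_mean") selects scale1_mean and scale2_mean only,
# so round() is never applied to the character column mean_group
d5_extra_rounded <- d5_extra %>%
  dplyr::mutate(dplyr::across(dplyr::ends_with("_mean"),
                ~ round(.x, digits = 2)))
```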

Function: if_any() or if_all()


  • Note: dplyr::if_any() and dplyr::if_all() apply a predicate function to a selection of columns and return a logical vector with one value per row, which makes them useful for filtering. They are available in dplyr 1.0.4 and later. dplyr::if_any() returns TRUE for a row when the predicate is true for at least one of the selected variables. dplyr::if_all() returns TRUE when it is true for all of them.
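The difference between the two functions is easiest to see inside dplyr::filter(). The following sketch uses a hypothetical toy data set (toy, with columns a and b) that is not part of the original examples.

```r
library(dplyr)

# Hypothetical toy data, used only to contrast if_any() and if_all()
toy <- tibble::tribble(
  ~id, ~a, ~b,
  1,   5,  6,
  2,   0,  9,
  3,   7,  0,
  4,   1,  2
)

# Keep rows where at least one of a, b is greater than 4
any_rows <- toy %>%
  dplyr::filter(dplyr::if_any(a:b, ~ .x > 4))

# Keep rows where both a and b are greater than 4
all_rows <- toy %>%
  dplyr::filter(dplyr::if_all(a:b, ~ .x > 4))
```

Here any_rows keeps ids 1, 2, and 3, while all_rows keeps only id 1.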


1. Create a new variable that indicates whether the survey was complete or not

Review the data (d6)

# A tibble: 4 x 4
     id    q1    q2    q3
  <dbl> <dbl> <dbl> <dbl>
1    10     1     2     3
2    11     2     1     4
3    12     3    NA     4
4    13    NA    NA    NA

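Create our new complete variable, which is 0 if every survey item (q1 to q3) is missing and 1 otherwise. Note that a partially answered survey, such as id 12, still counts as complete under this rule.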

  • Note: I am using the dplyr::mutate() function to create our new variable.

  • Note: I am using dplyr::case_when() to recode existing variables into a new variable.

d6 %>%
  dplyr::mutate(complete = 
           dplyr::case_when(
             dplyr::if_all(q1:q3, ~ is.na(.x)) ~ 0,
             TRUE ~ 1
           ))
# A tibble: 4 x 5
     id    q1    q2    q3 complete
  <dbl> <dbl> <dbl> <dbl>    <dbl>
1    10     1     2     3        1
2    11     2     1     4        1
3    12     3    NA     4        1
4    13    NA    NA    NA        0
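If "complete" should instead mean that no item is missing, one possible variation swaps dplyr::if_all() for dplyr::if_any(). This is a sketch of that stricter rule, not the definition used above. d6 is rebuilt by hand from the printout, and the names d6_strict and complete_strict are introduced only for illustration.

```r
library(dplyr)

# Assumption: d6 rebuilt by hand from the printout
d6 <- tibble::tribble(
  ~id, ~q1, ~q2, ~q3,
  10,  1,   2,   3,
  11,  2,   1,   4,
  12,  3,   NA,  4,
  13,  NA,  NA,  NA
)

# Stricter rule: 0 as soon as any item is missing, 1 only when all are answered
d6_strict <- d6 %>%
  dplyr::mutate(complete_strict =
           dplyr::case_when(
             dplyr::if_any(q1:q3, ~ is.na(.x)) ~ 0,
             TRUE ~ 1
           ))
```

Under this rule ids 10 and 11 are coded 1, while ids 12 and 13 are coded 0.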
