Package: dplyr


Function: filter()


  • Note: Using dplyr::across() inside dplyr::filter() is deprecated. Use dplyr::if_any() or dplyr::if_all() instead; both apply a predicate to a selection of columns within dplyr::filter(). These functions are available as of dplyr 1.0.4. dplyr::if_any() returns TRUE for a row when the predicate is true for at least one of the selected variables. dplyr::if_all() returns TRUE for a row only when the predicate is true for every selected variable. See Filter using if_all or if_any for further explanation.
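To make the difference between the two predicates concrete, here is a minimal sketch on a small made-up tibble. The data frame toy and its columns x and y are hypothetical and are not part of the original page. It compares each helper with the equivalent hand-written condition.

```r
library(dplyr)

# Hypothetical toy data, not taken from the original page
toy <- tibble::tibble(
  x = c(1, -999, -999),
  y = c(2, 5, -999)
)

# if_any(): a row matches when the predicate holds in at least one column
any_hit <- toy %>%
  dplyr::filter(dplyr::if_any(c(x, y), ~ . == -999))

# The same condition written by hand with |
any_manual <- toy %>%
  dplyr::filter(x == -999 | y == -999)

# if_all(): a row matches only when the predicate holds in every column
all_hit <- toy %>%
  dplyr::filter(dplyr::if_all(c(x, y), ~ . == -999))

# The same condition written by hand with &
all_manual <- toy %>%
  dplyr::filter(x == -999 & y == -999)
```

With only two columns the hand-written versions are short, but the helpers scale to any column selection without retyping the condition.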


1. Remove any row that has -999 for AT LEAST ONE variable.

Review the data (d19)

# A tibble: 6 x 4
  extra2 extra3    id test_score
   <dbl>  <dbl> <dbl>      <dbl>
1      1      2    10        205
2   -999      0    11        220
3      3   -999    12        250
4   -999      0    13        217
5   -999   -999  -999       -999
6   -999   -999  -999       -999
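The page does not show how d19 was created. To run the examples yourself, one possible way to rebuild it from the printed values is sketched below; the use of tibble::tribble() is an assumption, not necessarily how the original data was loaded.

```r
library(dplyr)

# Rebuild d19 from the values printed above (construction method is assumed)
d19 <- tibble::tribble(
  ~extra2, ~extra3,  ~id, ~test_score,
        1,       2,   10,         205,
     -999,       0,   11,         220,
        3,    -999,   12,         250,
     -999,       0,   13,         217,
     -999,    -999, -999,        -999,
     -999,    -999, -999,        -999
)

d19
```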

Filter out any row that has -999 for at least one variable

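  • Note: We use the tidyselect selection helper everything() to refer to all variables. The leading ! negates dplyr::if_any(), so rows where at least one variable equals -999 are dropped.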
d19 %>% 
  dplyr::filter(!dplyr::if_any(tidyselect::everything(), ~ . == -999))
# A tibble: 1 x 4
  extra2 extra3    id test_score
   <dbl>  <dbl> <dbl>      <dbl>
1      1      2    10        205
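The same result can also be reached without negation by asking that every variable differ from -999. The sketch below compares the two forms; it rebuilds d19 from the printed values (an assumed construction) and assumes the data contains no missing values.

```r
library(dplyr)

# d19 rebuilt from the printed values (construction method is assumed)
d19 <- tibble::tribble(
  ~extra2, ~extra3,  ~id, ~test_score,
        1,       2,   10,         205,
     -999,       0,   11,         220,
        3,    -999,   12,         250,
     -999,       0,   13,         217,
     -999,    -999, -999,        -999,
     -999,    -999, -999,        -999
)

# Form used above: drop rows where any variable equals -999
negated_any <- d19 %>%
  dplyr::filter(!dplyr::if_any(tidyselect::everything(), ~ . == -999))

# Alternative: keep rows where every variable differs from -999
all_differ <- d19 %>%
  dplyr::filter(dplyr::if_all(tidyselect::everything(), ~ . != -999))
```

Which form to use is a matter of readability; both keep only the row with id 10 for this data.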

2. Keep any row that has -999 for ALL variables.

Review the data (d19)

# A tibble: 6 x 4
  extra2 extra3    id test_score
   <dbl>  <dbl> <dbl>      <dbl>
1      1      2    10        205
2   -999      0    11        220
3      3   -999    12        250
4   -999      0    13        217
5   -999   -999  -999       -999
6   -999   -999  -999       -999

Keep any row that has -999 for all variables

d19 %>% 
  dplyr::filter(dplyr::if_all(tidyselect::everything(), ~ . == -999))
# A tibble: 2 x 4
  extra2 extra3    id test_score
   <dbl>  <dbl> <dbl>      <dbl>
1   -999   -999  -999       -999
2   -999   -999  -999       -999
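As a complementary sketch, negating dplyr::if_all() drops only the rows that are -999 everywhere and keeps rows that are partly filled in. This variation is not in the original page; d19 is again rebuilt from the printed values (an assumed construction).

```r
library(dplyr)

# d19 rebuilt from the printed values (construction method is assumed)
d19 <- tibble::tribble(
  ~extra2, ~extra3,  ~id, ~test_score,
        1,       2,   10,         205,
     -999,       0,   11,         220,
        3,    -999,   12,         250,
     -999,       0,   13,         217,
     -999,    -999, -999,        -999,
     -999,    -999, -999,        -999
)

# Drop rows where every variable equals -999; keep everything else
not_all_missing <- d19 %>%
  dplyr::filter(!dplyr::if_all(tidyselect::everything(), ~ . == -999))
```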
