Package: dplyr


Function: filter()

Examples using multiple criteria on the same variable (numeric)


1. Keep any row that has -999 OR 0 for extra3.

Review the data (d8).

# A tibble: 4 x 6
  extra1 extra2 extra3 stu_id test_score tch_name
  <chr>   <dbl>  <dbl>  <dbl>      <dbl> <chr>   
1 a           1      2     10        205 harris  
2 b        -999      0     11        220 steve   
3 c        -999   -999     12        250 harris  
4 d           4      0     13        217 lewis   
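
To run the examples on this page yourself, you need an object named d8. The original page does not show how d8 was created, so the following is only a sketch that rebuilds it from the printed output above, assuming every numeric column is stored as a double (as the <dbl> labels suggest).

```r
library(tibble)

# Assumed reconstruction of d8, typed in from the printed tibble.
# The real data may have been read from a file instead.
d8 <- tibble(
  extra1     = c("a", "b", "c", "d"),
  extra2     = c(1, -999, -999, 4),
  extra3     = c(2, 0, -999, 0),
  stu_id     = c(10, 11, 12, 13),
  test_score = c(205, 220, 250, 217),
  tch_name   = c("harris", "steve", "harris", "lewis")
)

d8
```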

Keep any row that has -999 or 0 for extra3.

  • Note: Use the | operator to denote “or”: a row is kept if it meets either criterion. (Both can never be true at once here, because the two criteria test the same variable.)
d8 %>% 
  dplyr::filter(extra3 == -999 | extra3 == 0)
# A tibble: 3 x 6
  extra1 extra2 extra3 stu_id test_score tch_name
  <chr>   <dbl>  <dbl>  <dbl>      <dbl> <chr>   
1 b        -999      0     11        220 steve   
2 c        -999   -999     12        250 harris  
3 d           4      0     13        217 lewis   

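You would get the same result using the xor() function. xor() keeps a row only when exactly one of the two conditions is true, which matches | here because a single value of extra3 cannot equal both -999 and 0.

d8 %>% 
  dplyr::filter(xor(extra3 == -999, extra3 == 0))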
# A tibble: 3 x 6
  extra1 extra2 extra3 stu_id test_score tch_name
  <chr>   <dbl>  <dbl>  <dbl>      <dbl> <chr>   
1 b        -999      0     11        220 steve   
2 c        -999   -999     12        250 harris  
3 d           4      0     13        217 lewis   
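
The two approaches stop agreeing as soon as both conditions can be true for the same row. The sketch below uses a plain vector with the same values as extra3 and a deliberately overlapping pair of conditions (chosen for illustration, not taken from the original page) to show the difference.

```r
# Same values as the extra3 column of d8
extra3 <- c(2, 0, -999, 0)

# Illustrative conditions that overlap: 0 satisfies both
a <- extra3 <= 0   # FALSE  TRUE  TRUE  TRUE
b <- extra3 == 0   # FALSE  TRUE FALSE  TRUE

either      <- a | b      # TRUE when at least one condition holds
exactly_one <- xor(a, b)  # TRUE when one condition holds but not both

either       # FALSE  TRUE  TRUE  TRUE
exactly_one  # FALSE FALSE  TRUE FALSE
```

If the goal is "either condition", | is usually the safer choice; xor() is worth reaching for only when rows meeting both conditions really should be excluded.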

Finally, you can use the %in% operator, which keeps a row when extra3 matches any value in the supplied vector.

d8 %>% 
  dplyr::filter(extra3 %in% c(-999, 0))
# A tibble: 3 x 6
  extra1 extra2 extra3 stu_id test_score tch_name
  <chr>   <dbl>  <dbl>  <dbl>      <dbl> <chr>   
1 b        -999      0     11        220 steve   
2 c        -999   -999     12        250 harris  
3 d           4      0     13        217 lewis   
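
One difference between == and %in% shows up only with missing values. d8 has none, so the sketch below appends an NA purely for illustration. With ==, the comparison itself is NA; with %in%, it is FALSE. filter() drops the row in both cases, but the distinction matters once you negate the condition with !.

```r
# extra3 values from d8, plus one NA added only for this illustration
extra3 <- c(2, 0, -999, 0, NA)

by_or <- extra3 == -999 | extra3 == 0  # last element is NA
by_in <- extra3 %in% c(-999, 0)        # last element is FALSE

by_or  # FALSE  TRUE  TRUE  TRUE    NA
by_in  # FALSE  TRUE  TRUE  TRUE FALSE

# Negating: !NA is still NA, but !FALSE is TRUE
not_or <- !by_or  # the NA position stays NA
not_in <- !by_in  # the NA position becomes TRUE
```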

2. Keep any row whose value for extra3 is between 0 and 10.

Review the data (d8).

# A tibble: 4 x 6
  extra1 extra2 extra3 stu_id test_score tch_name
  <chr>   <dbl>  <dbl>  <dbl>      <dbl> <chr>   
1 a           1      2     10        205 harris  
2 b        -999      0     11        220 steve   
3 c        -999   -999     12        250 harris  
4 d           4      0     13        217 lewis   

Keep any row that has a value between 0 and 10 for extra3.

  • Note: Use the operator & to denote that both criteria must be met.
d81 <- d8 %>% 
  dplyr::filter(extra3 >= 0 & extra3 <= 10)

d81
# A tibble: 3 x 6
  extra1 extra2 extra3 stu_id test_score tch_name
  <chr>   <dbl>  <dbl>  <dbl>      <dbl> <chr>   
1 a           1      2     10        205 harris  
2 b        -999      0     11        220 steve   
3 d           4      0     13        217 lewis   

You would get the same result using the dplyr::between() function. It includes both 0 and 10.

d8 %>% 
  dplyr::filter(dplyr::between(extra3, 0, 10))
# A tibble: 3 x 6
  extra1 extra2 extra3 stu_id test_score tch_name
  <chr>   <dbl>  <dbl>  <dbl>      <dbl> <chr>   
1 a           1      2     10        205 harris  
2 b        -999      0     11        220 steve   
3 d           4      0     13        217 lewis   
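
Because dplyr::between() includes both endpoints, it cannot express a range that excludes them. A minimal sketch of the difference, using the extra3 values plus a 10 appended only to exercise the upper bound:

```r
library(dplyr)

# extra3 values from d8, plus 10 added only for this illustration
extra3 <- c(2, 0, -999, 0, 10)

# Inclusive range: equivalent to extra3 >= 0 & extra3 <= 10
inclusive <- between(extra3, 0, 10)

# Exclusive range: written out with strict comparisons
exclusive <- extra3 > 0 & extra3 < 10

inclusive  # TRUE  TRUE FALSE  TRUE  TRUE
exclusive  # TRUE FALSE FALSE FALSE FALSE
```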

Function: filter()

Examples using multiple criteria on the same variable (character)


1. Keep any row whose tch_name is either “harris” OR “lewis”.

Review the data (d8).

# A tibble: 4 x 6
  extra1 extra2 extra3 stu_id test_score tch_name
  <chr>   <dbl>  <dbl>  <dbl>      <dbl> <chr>   
1 a           1      2     10        205 harris  
2 b        -999      0     11        220 steve   
3 c        -999   -999     12        250 harris  
4 d           4      0     13        217 lewis   

Keep any row where tch_name is “harris” or “lewis”.

  • Note: Use the | operator to denote “or”: a row is kept if it meets either criterion. (Both can never be true at once here, because the two criteria test the same variable.)
d8 %>% 
  dplyr::filter(tch_name == "harris" | tch_name == "lewis")
# A tibble: 3 x 6
  extra1 extra2 extra3 stu_id test_score tch_name
  <chr>   <dbl>  <dbl>  <dbl>      <dbl> <chr>   
1 a           1      2     10        205 harris  
2 c        -999   -999     12        250 harris  
3 d           4      0     13        217 lewis   

You would get the same result using the %in% operator.

d8 %>% 
  dplyr::filter(tch_name %in% c("harris", "lewis"))
# A tibble: 3 x 6
  extra1 extra2 extra3 stu_id test_score tch_name
  <chr>   <dbl>  <dbl>  <dbl>      <dbl> <chr>   
1 a           1      2     10        205 harris  
2 c        -999   -999     12        250 harris  
3 d           4      0     13        217 lewis   

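You would also get the same result using the stringr::str_detect() function, where | inside the pattern is the regular-expression “or”. Note that str_detect() matches anywhere in the string, so this pattern would also keep any name that merely contains “harris” or “lewis”.

d8 %>% 
  dplyr::filter(stringr::str_detect(tch_name, "harris|lewis"))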
# A tibble: 3 x 6
  extra1 extra2 extra3 stu_id test_score tch_name
  <chr>   <dbl>  <dbl>  <dbl>      <dbl> <chr>   
1 a           1      2     10        205 harris  
2 c        -999   -999     12        250 harris  
3 d           4      0     13        217 lewis   
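
The str_detect() approach matches d8 exactly only because no teacher name contains another as a substring. As a sketch of how this could go wrong, the vector below adds a hypothetical name, "harrison", that is not in the original data, and compares the unanchored pattern with one anchored by ^ and $ to require a whole-string match.

```r
library(stringr)

# Teacher names from d8, plus a hypothetical "harrison" for illustration
tch_name <- c("harris", "steve", "harrison", "lewis")

# Unanchored: TRUE wherever the pattern appears inside the string
loose <- str_detect(tch_name, "harris|lewis")

# Anchored: TRUE only when the whole string is "harris" or "lewis"
exact <- str_detect(tch_name, "^(harris|lewis)$")

loose  # TRUE FALSE  TRUE  TRUE
exact  # TRUE FALSE FALSE  TRUE
```

When you want exact matches against a fixed list of values, %in% is usually the simpler option; str_detect() is more useful when you actually want pattern matching.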
