filter()
1. Keep any row where q1 equals q2.
Review the data (d13)
# A tibble: 3 x 3
     id q1     q2
  <dbl> <chr>  <chr>
1    10 harris Harris
2    20 steve  steve
3    30 lewis  <NA>
Filter rows
d13 %>%
  dplyr::filter(q1 == q2)
# A tibble: 1 x 3
     id q1    q2
  <dbl> <chr> <chr>
1    20 steve steve
2. Remove any row where q1 does not equal
q2, while also keeping rows that have NA for
q2.
Review the data (d13)
# A tibble: 3 x 3
     id q1     q2
  <dbl> <chr>  <chr>
1    10 harris Harris
2    20 steve  steve
3    30 lewis  <NA>
Filter out any row where q1 is not equal to
q2
Note: Use the logical operator != to denote not equal to.
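Note: I make an explicit call to keep rows with NA values for q2. If I did not do this, filter would drop the last row of data, because q1 != q2 evaluates to NA for that row and filter keeps only rows where the condition is TRUE.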
d13 %>%
  dplyr::filter(q1 != q2 | is.na(q2))
# A tibble: 2 x 3
     id q1     q2
  <dbl> <chr>  <chr>
1    10 harris Harris
2    30 lewis  <NA>
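To see why the is.na(q2) clause matters, it can help to look at the logical vector each condition produces. The sketch below uses base R only and recreates the q1 and q2 columns as plain vectors, assuming they hold the same values as d13; the names neq and keep are made up for illustration.

```r
# q1 and q2 recreated as plain vectors (assumed to match the columns of d13)
q1 <- c("harris", "steve", "lewis")
q2 <- c("Harris", "steve", NA)

# Comparing a value with NA returns NA rather than TRUE or FALSE
neq <- q1 != q2
print(neq)
# [1]  TRUE FALSE    NA

# NA | TRUE is TRUE, so the row with a missing q2 is now kept
keep <- neq | is.na(q2)
print(keep)
# [1]  TRUE FALSE  TRUE
```

Because filter keeps a row only when its condition is TRUE, the third row survives only with the second condition.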
There are other ways to keep NAs as well. As we saw in Filter rows that contain NA values, one option is the %in% operator.
However, in this scenario, where we are comparing two vectors, you cannot use the %in% operator the same way that we used it in Filter rows that contain NA values. Without row-wise evaluation, %in% checks each q1 value against the entire q2 column rather than against the q2 value in the same row. Here you need to add the dplyr::rowwise() function. This is also a better method when both q1 and q2 might have NA values and you want to keep rows when either column has NA, but not when both columns have NA.
d13 %>%
  dplyr::rowwise() %>%
  dplyr::filter(!q1 %in% q2)
# A tibble: 2 x 3
# Rowwise:
     id q1     q2
  <dbl> <chr>  <chr>
1    10 harris Harris
2    30 lewis  <NA>
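The d13 data has no row with NA in q1 or in both columns, so it does not show how the row-wise approach treats those cases. The sketch below uses a hypothetical tibble, d_na (not part of the original data), with NA in q1 only, in q2 only, and in both, and compares the result with and without dplyr::rowwise().

```r
library(dplyr)

# Hypothetical data for illustration: rows 3 to 5 cover the three NA patterns
d_na <- tibble::tibble(
  id = c(1, 2, 3, 4, 5),
  q1 = c("harris", "steve", "lewis", NA, NA),
  q2 = c("Harris", "steve", NA, "jones", NA)
)

# Row-wise: each q1 value is checked against the q2 value in the same row
kept <- d_na %>%
  dplyr::rowwise() %>%
  dplyr::filter(!q1 %in% q2) %>%
  dplyr::ungroup()

print(kept$id)
# [1] 1 3 4

# Without rowwise(): each q1 value is checked against the whole q2 column,
# so the NA in q1 on row 4 matches an NA elsewhere in q2 and is dropped
not_rowwise <- d_na %>%
  dplyr::filter(!q1 %in% q2)

print(not_rowwise$id)
# [1] 1 3
```

Rows with NA in only one column (3 and 4) are kept in the row-wise version, while row 5, where both columns are NA, is dropped because NA %in% NA is TRUE.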
Yet another way to keep these rows is to use the dplyr::case_when() function. A row where q1 == q2 evaluates to NA does not match the first condition, so it falls through to TRUE ~ TRUE and is kept.
d13 %>%
  dplyr::filter(
    dplyr::case_when(
      q1 == q2 ~ FALSE,
      TRUE ~ TRUE))
# A tibble: 2 x 3
     id q1     q2
  <dbl> <chr>  <chr>
1    10 harris Harris
2    30 lewis  <NA>
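As a rough check on how case_when() handles the missing value, the sketch below evaluates the same two conditions on plain vectors, assumed to match the q1 and q2 columns of d13. The name keep is made up for illustration.

```r
library(dplyr)

# q1 and q2 recreated as plain vectors (assumed to match the columns of d13)
q1 <- c("harris", "steve", "lewis")
q2 <- c("Harris", "steve", NA)

# An NA condition does not count as a match, so row 3 falls through to TRUE ~ TRUE
keep <- dplyr::case_when(
  q1 == q2 ~ FALSE,
  TRUE ~ TRUE
)

print(keep)
# [1]  TRUE FALSE  TRUE
```

The result contains no NA, so filter keeps rows 1 and 3 without a separate is.na() check.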