if_else()Note: dplyr::if_else() will evaluate in
order, so proceed from most specific to most general. The function will
not write over a previous evaluation even if something meets a new
criteria.
Note: dplyr::if_else() will not
evaluate NAs as TRUE or FALSE and therefore will return NAs. If you want
to specifically evaluate NAs you will need to call out NAs.
1. Collapse all “Google Meet” in character var (online_platform) into one category
Review the data (d6)
# A tibble: 7 x 2
tch_id online_platform
<dbl> <chr>
1 105 google
2 106 meet
3 107 Zoom
4 108 Zoom
5 109 Google Meet
6 112 Microsoft Teams
7 113 <NA>
Collapse all spellings of Google Meet into one uniform category
stringr::str_detect() we can detect
different spellings of google meet. I enter a pattern (“google|meet”)
that means to look for any value in online_platform that contains the
word “google” OR “meet” and recategorize those to “Google Meet”d6 %>%
dplyr::mutate(
online_platform_new =
dplyr::if_else(
stringr::str_detect(online_platform, "google|meet"),
"Google Meet",
online_platform
)
)
# A tibble: 7 x 3
tch_id online_platform online_platform_new
<dbl> <chr> <chr>
1 105 google Google Meet
2 106 meet Google Meet
3 107 Zoom Zoom
4 108 Zoom Zoom
5 109 Google Meet Google Meet
6 112 Microsoft Teams Microsoft Teams
7 113 <NA> <NA>
We can also do this using %in% instead of
stringr::str_detect()
d6 %>%
dplyr::mutate(
online_platform_new =
dplyr::if_else(
online_platform %in% c("google", "meet", "Google Meet"),
"Google Meet",
online_platform
)
)
# A tibble: 7 x 3
tch_id online_platform online_platform_new
<dbl> <chr> <chr>
1 105 google Google Meet
2 106 meet Google Meet
3 107 Zoom Zoom
4 108 Zoom Zoom
5 109 Google Meet Google Meet
6 112 Microsoft Teams Microsoft Teams
7 113 <NA> <NA>
2. Collapse all “Google Meet” in character var (online_platform) into one category and address case sensitivity
Review the data (d12)
# A tibble: 7 x 2
tch_id online_platform
<dbl> <chr>
1 105 Google
2 106 meet
3 107 Zoom
4 108 Zoom
5 109 Google Meet
6 112 Microsoft Teams
7 113 <NA>
Collapse all spellings of Google Meet into one uniform category
stringr::str_detect() we can detect
different spellings of google meet. It is important to note that this
function is case sensitive. As we saw above in the data, the
“google|meet” pattern we used before will no longer catch our upper case
“Google”. So there are a number of ways we can deal with this. The first
way we will try is by adding stringr::regex() and adding the argument
ignore_case = TRUE.d12 %>%
dplyr::mutate(
online_platform_new =
dplyr::if_else(
stringr::str_detect(
online_platform,
stringr::regex("google|meet", ignore_case = TRUE)
),
"Google Meet",
online_platform
)
)
# A tibble: 7 x 3
tch_id online_platform online_platform_new
<dbl> <chr> <chr>
1 105 Google Google Meet
2 106 meet Google Meet
3 107 Zoom Zoom
4 108 Zoom Zoom
5 109 Google Meet Google Meet
6 112 Microsoft Teams Microsoft Teams
7 113 <NA> <NA>
We can also do this by adding (?i) to the pattern, which means match case insensitive.
d12 %>%
dplyr::mutate(
online_platform_new = dplyr::if_else(
stringr::str_detect(online_platform, "(?i)google|meet"),
"Google Meet",
online_platform
)
)
# A tibble: 7 x 3
tch_id online_platform online_platform_new
<dbl> <chr> <chr>
1 105 Google Google Meet
2 106 meet Google Meet
3 107 Zoom Zoom
4 108 Zoom Zoom
5 109 Google Meet Google Meet
6 112 Microsoft Teams Microsoft Teams
7 113 <NA> <NA>
3. Collapse a numeric variable (Var2) into discrete numeric categories
Review the data (d10)
# A tibble: 5 x 2
Var1 Var2
<chr> <dbl>
1 a 1
2 b 16
3 c 3
4 d 4
5 e NA
Create an indicator variable. 1 = 8 or lower, 0 = 9 or higher
*Note: NA will remain as NA because if_else does not evaluate it as TRUE or FALSE.
d10 %>%
dplyr::mutate(Var2_new = dplyr::if_else(Var2 <= 8, 1, 0))
# A tibble: 5 x 3
Var1 Var2 Var2_new
<chr> <dbl> <dbl>
1 a 1 1
2 b 16 0
3 c 3 1
4 d 4 1
5 e NA NA
*Note: If you want to evaluate NA, you will need to specifically add it
d10 %>%
dplyr::mutate(Var2_new = dplyr::if_else(Var2 <= 8 |
is.na(Var2), 1, 0))
# A tibble: 5 x 3
Var1 Var2 Var2_new
<chr> <dbl> <dbl>
1 a 1 1
2 b 16 0
3 c 3 1
4 d 4 1
5 e NA 1
You could also specifically add NA this way as well
d10 %>%
dplyr::mutate(Var2_new = dplyr::if_else(Var2 > 8 &
!is.na(Var2), 0, 1))
# A tibble: 5 x 3
Var1 Var2 Var2_new
<chr> <dbl> <dbl>
1 a 1 1
2 b 16 0
3 c 3 1
4 d 4 1
5 e NA 1
case_when()Note: dplyr::case_when() will evaluate
in order, so proceed from most specific to most general. The function
will not write over a previous evaluation even if something meets a new
criteria.
1. Collapse all “Google Meet” into one category
Review the data (d6)
# A tibble: 7 x 2
tch_id online_platform
<dbl> <chr>
1 105 google
2 106 meet
3 107 Zoom
4 108 Zoom
5 109 Google Meet
6 112 Microsoft Teams
7 113 <NA>
Collapse all spellings of Google Meet into one uniform category
d6 %>%
dplyr::mutate(
online_platform_new =
dplyr::case_when(
stringr::str_detect(online_platform, "google|meet") ~ "Google Meet",
TRUE ~ online_platform
)
)
# A tibble: 7 x 3
tch_id online_platform online_platform_new
<dbl> <chr> <chr>
1 105 google Google Meet
2 106 meet Google Meet
3 107 Zoom Zoom
4 108 Zoom Zoom
5 109 Google Meet Google Meet
6 112 Microsoft Teams Microsoft Teams
7 113 <NA> <NA>
d6 %>%
dplyr::mutate(online_platform_new =
dplyr::case_when(
stringr::str_detect(online_platform, "google|meet") ~ "Google Meet"
))
# A tibble: 7 x 3
tch_id online_platform online_platform_new
<dbl> <chr> <chr>
1 105 google Google Meet
2 106 meet Google Meet
3 107 Zoom <NA>
4 108 Zoom <NA>
5 109 Google Meet <NA>
6 112 Microsoft Teams <NA>
7 113 <NA> <NA>
d6 %>%
dplyr::mutate(
online_platform_new =
dplyr::case_when(
stringr::str_detect(online_platform, "google|meet") ~ "Google Meet",
TRUE ~ "Other"
)
)
# A tibble: 7 x 3
tch_id online_platform online_platform_new
<dbl> <chr> <chr>
1 105 google Google Meet
2 106 meet Google Meet
3 107 Zoom Other
4 108 Zoom Other
5 109 Google Meet Other
6 112 Microsoft Teams Other
7 113 <NA> Other
dplyr::case_when() you need to call out the specific type
of NA. In this case NA_character_.d6 %>%
dplyr::mutate(
online_platform_new =
dplyr::case_when(
is.na(online_platform) ~ NA_character_,
stringr::str_detect(online_platform, "google|meet") ~ "Google Meet",
TRUE ~ "Other"
)
)
# A tibble: 7 x 3
tch_id online_platform online_platform_new
<dbl> <chr> <chr>
1 105 google Google Meet
2 106 meet Google Meet
3 107 Zoom Other
4 108 Zoom Other
5 109 Google Meet Other
6 112 Microsoft Teams Other
7 113 <NA> <NA>
2. Collapse all “Google Meet” into one category and clarify that “google” means Google Hangouts
Review the data (d9)
# A tibble: 10 x 2
tch_id online_platform
<dbl> <chr>
1 105 google
2 106 meet
3 107 Zoom
4 108 Zoom
5 109 Google Meet
6 112 Microsoft Teams
7 113 google meet
8 114 google met
9 115 hangouts
10 116 <NA>
Collapse all spellings of Google Meet into one uniform category and clarify that google means Google Hangouts
Note: Here I am adding the base::tolower() function
to deal with the varying capitalization. You can also use the solutions
we showed above in dplyr::if_else() where we add either
(?i) to the pattern or add stringr::regex() around the
pattern with the argument ignore_case = TRUE.
Note: Note that I am not putting “google” at the top because if I did, all instances of “google” would be recoded to google hangouts.
d9 %>%
dplyr::mutate(online_platform =
tolower(online_platform)) %>%
dplyr::mutate(
online_platform_new =
dplyr::case_when(
stringr::str_detect(online_platform, "met|meet") ~ "google meet",
stringr::str_detect(online_platform, "google|hangouts") ~ "google hangouts",
TRUE ~ online_platform
)
)
# A tibble: 10 x 3
tch_id online_platform online_platform_new
<dbl> <chr> <chr>
1 105 google google hangouts
2 106 meet google meet
3 107 zoom zoom
4 108 zoom zoom
5 109 google meet google meet
6 112 microsoft teams microsoft teams
7 113 google meet google meet
8 114 google met google meet
9 115 hangouts google hangouts
10 116 <NA> <NA>
d9 %>%
dplyr::mutate(online_platform =
tolower(online_platform)) %>%
dplyr::mutate(
online_platform_new =
dplyr::case_when(
stringr::str_detect(online_platform, "google|hangouts") ~ "google hangouts",
stringr::str_detect(online_platform, "met|meet") ~ "google meet",
TRUE ~ online_platform
)
)
# A tibble: 10 x 3
tch_id online_platform online_platform_new
<dbl> <chr> <chr>
1 105 google google hangouts
2 106 meet google meet
3 107 zoom zoom
4 108 zoom zoom
5 109 google meet google hangouts
6 112 microsoft teams microsoft teams
7 113 google meet google hangouts
8 114 google met google hangouts
9 115 hangouts google hangouts
10 116 <NA> <NA>
3. Collapse a numeric variable (Var2) into discrete character categories
Review the data (d8)
# A tibble: 5 x 3
Var1 Var2 Var3
<chr> <dbl> <dbl>
1 a 1 2
2 b 16 0
3 c 3 1
4 d 4 3
5 e 14 0
Collapse Var2 into low, medium, high categories.
Note: We use the & operator to say if a value meets both conditions
Note: We cannot use dplyr::between because our left
side is not >=
d8 %>%
dplyr::mutate(Var2_new =
dplyr::case_when(Var2 > 14 ~ "high",
Var2 > 6 & Var2 <= 14 ~ "medium",
TRUE ~ "low"))
# A tibble: 5 x 4
Var1 Var2 Var3 Var2_new
<chr> <dbl> <dbl> <chr>
1 a 1 2 low
2 b 16 0 high
3 c 3 1 low
4 d 4 3 low
5 e 14 0 medium
dplyr::case_when() evaluates in order, we
don’t actually have to add the right side of our “medium” logic because
all of the highs have been assigned and will not be written over by the
mediumsd8 %>%
dplyr::mutate(Var2_new =
dplyr::case_when(Var2 > 14 ~ "high",
Var2 > 6 ~ "medium",
TRUE ~ "low"))
# A tibble: 5 x 4
Var1 Var2 Var3 Var2_new
<chr> <dbl> <dbl> <chr>
1 a 1 2 low
2 b 16 0 high
3 c 3 1 low
4 d 4 3 low
5 e 14 0 medium
recode()1. Collapse all “Google Meet” into one category
Review the data (d6)
# A tibble: 7 x 2
tch_id online_platform
<dbl> <chr>
1 105 google
2 106 meet
3 107 Zoom
4 108 Zoom
5 109 Google Meet
6 112 Microsoft Teams
7 113 <NA>
Collapse all spellings of Google Meet into one uniform category
d6 %>%
dplyr::mutate(
online_platform_new =
dplyr::recode(online_platform, google = "Google Meet", meet = "Google Meet")
)
# A tibble: 7 x 3
tch_id online_platform online_platform_new
<dbl> <chr> <chr>
1 105 google Google Meet
2 106 meet Google Meet
3 107 Zoom Zoom
4 108 Zoom Zoom
5 109 Google Meet Google Meet
6 112 Microsoft Teams Microsoft Teams
7 113 <NA> <NA>
Return to Recode