str_trim()
1. Remove spaces from item1
Review the data (d1)
# A tibble: 5 x 2
stu_id item1
<dbl> <chr>
1 1234 " yes"
2 2345 "no"
3 3456 "maybe"
4 4567 "no"
5 5678 "yes"
I can see here that my variable item1 has some values with extra leading spaces. This is going to cause issues when I try to do things like filter or group using this variable.
For example, if I get a table of values for item1, the yes values do not group together: " yes" and "yes" are counted as two separate values.
d1 %>%
janitor::tabyl(item1)
item1 n percent
yes 1 0.2
maybe 1 0.2
no 2 0.4
yes 1 0.2
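The same problem shows up when filtering. The sketch below rebuilds d1 from the printout above (the single leading space in " yes" is an assumption based on that printout) and shows that an exact match on "yes" misses the padded value.

```r
library(dplyr)

# d1 rebuilt from the printed tibble; the exact amount of padding is assumed
d1 <- dplyr::tibble(
  stu_id = c(1234, 2345, 3456, 4567, 5678),
  item1  = c(" yes", "no", "maybe", "no", "yes")
)

# Exact matching keeps only "yes"; the row holding " yes" is silently dropped
yes_rows <- d1 %>%
  dplyr::filter(item1 == "yes")

nrow(yes_rows)
```

Two students answered yes, but only one row comes back until the extra space is removed.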
So I want to remove all white space on the left side of my variable.
Note: I am using dplyr::mutate() to overwrite the existing “item1” variable with its trimmed version.
Note: I used the argument side = “left” here, but I could also use “right” or “both” (which is the default).
d1 %>%
dplyr::mutate(item1 = stringr::str_trim(item1, side = "left"))
# A tibble: 5 x 2
stu_id item1
<dbl> <chr>
1 1234 yes
2 2345 no
3 3456 maybe
4 4567 no
5 5678 yes
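To see what each value of side does, here is a small sketch on a single string. The string x is made up for illustration and is not part of d1.

```r
# Hypothetical string with two spaces on each side
x <- "  yes  "

left  <- stringr::str_trim(x, side = "left")   # strips leading spaces only
right <- stringr::str_trim(x, side = "right")  # strips trailing spaces only
both  <- stringr::str_trim(x)                  # side = "both" is the default

# Character counts make the remaining padding visible
nchar(c(left, right, both))
```

Here left and right each keep two spaces on the untrimmed side, while both leaves just "yes".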
2. Remove spaces from item1 and item2
Review the data (d2)
# A tibble: 5 x 3
stu_id item1 item2
<dbl> <chr> <chr>
1 1234 " yes" "yes"
2 2345 "no" "no "
3 3456 "maybe" "no"
4 4567 "no" "no"
5 5678 "yes" "maybe"
Here the spacing issues span multiple variables (“item1” and “item2”), so I want to remove spaces from both. I can use dplyr::across() to apply str_trim() to both variables in one step.
d2 %>%
dplyr::mutate(dplyr::across(item1:item2, ~ stringr::str_trim(.x, side = "both")))
# A tibble: 5 x 3
stu_id item1 item2
<dbl> <chr> <chr>
1 1234 yes yes
2 2345 no no
3 3456 maybe no
4 4567 no no
5 5678 yes maybe
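If there are many character variables, naming each one gets tedious. As an alternative sketch (not the approach used above), across() can select every character column with where(is.character). The d2 below is rebuilt from the printout, so its exact spacing is an assumption.

```r
library(dplyr)

# d2 rebuilt from the printed tibble; the exact padding is assumed
d2 <- dplyr::tibble(
  stu_id = c(1234, 2345, 3456, 4567, 5678),
  item1  = c(" yes", "no", "maybe", "no", "yes"),
  item2  = c("yes", "no ", "no", "no", "maybe")
)

# Trim every character column; the numeric stu_id column is left untouched
d2_clean <- d2 %>%
  dplyr::mutate(dplyr::across(where(is.character), stringr::str_trim))

d2_clean
```

This trades explicit column names for a type-based selection, so it will also trim any other character columns that happen to be in the data.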
str_squish()
1. Remove all excessive leading, trailing and middle spaces from item1
Review the data (d7)
# A tibble: 4 x 2
id item1
<dbl> <chr>
1 1 " 1, and 2, and 3"
2 2 "1, and 3"
3 3 "3"
4 4 "1, and 2, and 3"
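Remove the excessive spaces from item1. str_squish() trims both ends and reduces any repeated spaces inside the string to a single space; it does not remove every space.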
d7 %>%
dplyr::mutate(item1 = stringr::str_squish(item1))
# A tibble: 4 x 2
id item1
<dbl> <chr>
1 1 1, and 2, and 3
2 2 1, and 3
3 3 3
4 4 1, and 2, and 3
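To make the difference from str_trim() concrete, here is a small sketch on one string. The string x is made up for illustration and is not taken from d7.

```r
# Hypothetical string with extra spaces at the ends and in the middle
x <- "  1,   and 2,  and 3 "

trimmed  <- stringr::str_trim(x)    # ends cleaned, inner runs of spaces kept
squished <- stringr::str_squish(x)  # ends cleaned, inner runs collapsed to one space

trimmed
squished
```

str_trim() returns "1,   and 2,  and 3", while str_squish() returns "1, and 2, and 3".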