Package: stringr


Function: str_trim()


1. Remove spaces from item1

Review the data (d1)

# A tibble: 5 x 2
  stu_id item1  
   <dbl> <chr>  
1   1234 " yes" 
2   2345 "no"   
3   3456 "maybe"
4   4567 "no"   
5   5678 "yes"  

I can see here that my variable item1 has a value with an added leading space (" yes" in row 1). This is going to cause issues when I try to filter or group using this variable, because " yes" and "yes" are treated as different values.
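To make the filtering issue concrete, here is a minimal sketch. It assumes d1 can be rebuilt by hand from the printout above with tibble::tribble(); the object name yes_rows is only for illustration.

```r
library(dplyr)  # attached for the %>% pipe

# Hand-built copy of d1, assumed from the printed tibble above
d1 <- tibble::tribble(
  ~stu_id, ~item1,
  1234, " yes",
  2345, "no",
  3456, "maybe",
  4567, "no",
  5678, "yes"
)

# Only the exact string "yes" matches, so " yes" (student 1234) is dropped
yes_rows <- d1 %>%
  dplyr::filter(item1 == "yes")

yes_rows$stu_id  # only 5678 is returned
```

Student 1234 answered yes as well, but the leading space keeps that row out of the result.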

For example, if I get a table of values for item1, I can see that the yes values are not grouping together.

d1 %>%
  janitor::tabyl(item1)
 item1 n percent
   yes 1     0.2
 maybe 1     0.2
    no 2     0.4
   yes 1     0.2

So I want to remove all white space on the left side of my variable.

  • Note: I am using dplyr::mutate() to overwrite the existing "item1" variable with the trimmed values.

  • Note: I used the argument side = "left" here, but I could also use "right" or "both" (which is the default value).

d1 %>%
  dplyr::mutate(item1 = stringr::str_trim(item1, side = "left"))
# A tibble: 5 x 2
  stu_id item1
   <dbl> <chr>
1   1234 yes  
2   2345 no   
3   3456 maybe
4   4567 no   
5   5678 yes  
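To compare the three values of the side argument, here is a small sketch on a single made-up string (x is a hypothetical value, not part of d1).

```r
library(stringr)

# Hypothetical value with two spaces on each side
x <- "  yes  "

left  <- stringr::str_trim(x, side = "left")   # "yes  "
right <- stringr::str_trim(x, side = "right")  # "  yes"
both  <- stringr::str_trim(x)                  # "yes" (side = "both" is the default)

# nchar() makes the remaining whitespace visible: 7, 5, 5, 3
nchar(c(x, left, right, both))
```

For d1, side = "left" is enough because the only stray space is a leading one. Using the default would also work and is the safer choice when I don't know where the spaces are.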

2. Remove spaces from item1 and item2

Review the data (d2)

# A tibble: 5 x 3
  stu_id item1   item2  
   <dbl> <chr>   <chr>  
1   1234 " yes"  "yes"  
2   2345 "no"    "no "  
3   3456 "maybe" "no"   
4   4567 "no"    "no"   
5   5678 "yes"   "maybe"

I can see here that I have spacing issues across multiple variables ("item1" has a leading space and "item2" has a trailing space), so I want to remove spaces from both sides of both variables. I can use dplyr::across() to take care of this in one step.

d2 %>%
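  # Passing extra arguments through across()'s ... is deprecated as of dplyr 1.1.0,
  # so the side argument is supplied inside a lambda instead
  dplyr::mutate(dplyr::across(item1:item2, ~ stringr::str_trim(.x, side = "both")))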
# A tibble: 5 x 3
  stu_id item1 item2
   <dbl> <chr> <chr>
1   1234 yes   yes  
2   2345 no    no   
3   3456 maybe no   
4   4567 no    no   
5   5678 yes   maybe
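If there are many character variables, listing them by name gets tedious. One possible variation, sketched below, selects every character column with where(is.character). This is an assumption about how the approach might be extended, not something the example above requires; d2 is rebuilt by hand from the printout and d2_trimmed is a name used only for illustration.

```r
library(dplyr)  # attached for the %>% pipe and the where() selection helper

# Hand-built copy of d2, assumed from the printed tibble above
d2 <- tibble::tribble(
  ~stu_id, ~item1,  ~item2,
  1234,    " yes",  "yes",
  2345,    "no",    "no ",
  3456,    "maybe", "no",
  4567,    "no",    "no",
  5678,    "yes",   "maybe"
)

# Trim both sides (the default) of every character column; stu_id is numeric and untouched
d2_trimmed <- d2 %>%
  dplyr::mutate(dplyr::across(where(is.character), stringr::str_trim))

# After trimming, the yes values group together
d2_trimmed %>%
  dplyr::count(item1)
```

Because str_trim() trims both sides by default, no extra argument is needed here.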

Package: stringr


Function: str_squish()


1. Remove all excess leading, trailing, and internal spaces from item1

Review the data (d7)

# A tibble: 4 x 2
     id item1              
  <dbl> <chr>              
1     1 " 1,  and 2, and 3"
2     2 "1, and 3"         
3     3 "3"                
4     4 "1, and 2,  and 3" 

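Remove the excess spaces from item1. str_squish() trims leading and trailing whitespace and collapses repeated internal whitespace into a single space; it does not remove single spaces between words.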

d7 %>%
  dplyr::mutate(item1 = stringr::str_squish(item1))
# A tibble: 4 x 2
     id item1          
  <dbl> <chr>          
1     1 1, and 2, and 3
2     2 1, and 3       
3     3 3              
4     4 1, and 2, and 3
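To see how str_squish() differs from str_trim(), here is a short sketch that applies both to the first value of item1 in d7 (the names x, trimmed and squished are only for illustration).

```r
library(stringr)

# First value of item1 in d7: one leading space and a double space after "1,"
x <- " 1,  and 2, and 3"

trimmed  <- stringr::str_trim(x)    # "1,  and 2, and 3" (internal double space remains)
squished <- stringr::str_squish(x)  # "1, and 2, and 3"

trimmed == squished  # FALSE
```

str_trim() only touches the ends of the string, so str_squish() is the one to reach for when the extra spaces are in the middle.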
