Package: stringr


Function: str_replace()


1. Replace values in school

Review the data (d13)

# A tibble: 3 x 2
     id school                   
  <dbl> <chr>                    
1    10 hickman sr. high school  
2    20 west (senior) high school
3    30 east sr. high school     

Replace “sr.” and “(senior)” with “senior”

First, use dplyr::mutate() to overwrite the existing school variable. Inside mutate(), stringr::str_replace() takes the text to find in its pattern argument and the new text in its replacement argument.

  • Note: The pattern argument is interpreted as a regular expression (regex). Periods and parentheses have special meaning in regex, but you can escape them with a double backslash (for example, \\.) or by wrapping them in brackets (for example, [.]), as is done here.
d13 %>%
  dplyr::mutate(school = stringr::str_replace(school, pattern = "sr[.]|[(]senior[)]", replacement = "senior"))
# A tibble: 3 x 2
     id school                    
  <dbl> <chr>                     
1    10 hickman senior high school
2    20 west senior high school   
3    30 east senior high school   
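
The note above mentions two ways to escape special characters. As a minimal sketch, the code below checks that the bracket form and the double-backslash form give the same result. The standalone vector school is a hypothetical stand-in for the school column of d13, not part of the original data.

```r
library(stringr)

# Hypothetical vector mirroring the school column in d13
school <- c("hickman sr. high school",
            "west (senior) high school",
            "east sr. high school")

# Bracket escaping, as used in the example above
bracket <- str_replace(school, "sr[.]|[(]senior[)]", "senior")

# The same pattern written with double-backslash escaping
backslash <- str_replace(school, "sr\\.|\\(senior\\)", "senior")

# Both patterns should produce identical results
identical(bracket, backslash)
```

Which form to use is mostly a matter of readability; brackets avoid stacking backslashes inside an R string.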

2. Replace values in item1

Review the data (d4)

# A tibble: 4 x 2
     id item1          
  <dbl> <chr>          
1     1 1, and 2, and 3
2     2 1, and 3       
3     3 3              
4     4 1, and 2, and 3

Replace all instances of the word “and” in item1 with “or”

Notice that stringr::str_replace() replaces only the first instance of “and” in each string.

d4 %>%
  dplyr::mutate(item1 = stringr::str_replace(item1, "and", "or"))
# A tibble: 4 x 2
     id item1         
  <dbl> <chr>         
1     1 1, or 2, and 3
2     2 1, or 3       
3     3 3             
4     4 1, or 2, and 3

To replace all instances, use stringr::str_replace_all() instead.

d4 %>%
  dplyr::mutate(item1 = stringr::str_replace_all(item1, "and", "or"))
# A tibble: 4 x 2
     id item1        
  <dbl> <chr>        
1     1 1, or 2, or 3
2     2 1, or 3      
3     3 3            
4     4 1, or 2, or 3
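
One thing to watch for: the pattern "and" also matches those letters inside longer words. The sketch below shows one possible safeguard, wrapping the pattern in regex word boundaries (\\b). The vector item is a hypothetical example and is not taken from d4.

```r
library(stringr)

# Hypothetical strings; the second contains "and" inside another word
item <- c("1, and 2, and 3", "bread and sandwiches")

# A plain pattern also replaces the "and" inside "sandwiches"
loose <- str_replace_all(item, "and", "or")

# Word boundaries restrict the match to the standalone word "and"
strict <- str_replace_all(item, "\\band\\b", "or")

loose
strict
```

For the d4 data the two patterns give the same result, so the boundary only matters when the text can contain words such as “sandwiches”.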

Function: str_replace_all()


1. Replace values in tags with a named vector of new values

Review the data (d17)

# A tibble: 4 x 2
     id tags                                    
  <dbl> <chr>                                   
1   123 long tag 1, long tag 3, long tag 5      
2   124 other long tag 2, long tag 3            
3   126 long tag 1, other long tag 2, long tag 6
4   127 long tag 6                              

First, let’s create a named vector in which the names are the old values and the values are the replacements.

name_replace <- tibble::tribble(~old_name, ~new_name,
                "long tag 1", "tag1",
                "other long tag 2", "tag2",
                "long tag 3", "tag3",
                "long tag 5", "tag5",
                "long tag 6", "tag6") |>
  tibble::deframe()

Now we can use this vector to replace the values in our tags variable.

d17 %>%
  dplyr::mutate(tags = 
                  stringr::str_replace_all(tags, name_replace))
# A tibble: 4 x 2
     id tags            
  <dbl> <chr>           
1   123 tag1, tag3, tag5
2   124 tag2, tag3      
3   126 tag1, tag2, tag6
4   127 tag6            
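
For a short list of replacements, the named vector can also be written directly with c() rather than built from a tibble. The sketch below is illustrative only: the vector tags is a hypothetical stand-in for the tags column of d17, and only three of the five replacements are included.

```r
library(stringr)

# Named vector: names are the patterns to find, values are the replacements
name_replace <- c("long tag 1" = "tag1",
                  "other long tag 2" = "tag2",
                  "long tag 3" = "tag3")

# Hypothetical vector resembling the tags column in d17
tags <- c("long tag 1, long tag 3",
          "other long tag 2, long tag 3")

# Every pattern in the named vector is applied to each string
new_tags <- str_replace_all(tags, name_replace)
new_tags
```

The tribble approach shown earlier may be easier to maintain when the lookup table is long or already stored as a data frame.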

2. Redact names from an open text variable

Review the data (d18)

# A tibble: 4 x 4
  tch_id support1 support2 support_other                       
   <dbl>    <dbl>    <dbl> <chr>                               
1   1234        1        0 ""                                  
2   1235        0        0 "Mr. Lewis helps me in my classroom"
3   1237        1        1 "Professional development X"        
4   1241        0        1 "Sherry provides coaching"          

Replace names with “<redacted>”

d18 %>%
  dplyr::mutate(support_other = stringr::str_replace_all(support_other, "Mr[.] Lewis|Sherry", "<redacted>"))
# A tibble: 4 x 4
  tch_id support1 support2 support_other                        
   <dbl>    <dbl>    <dbl> <chr>                                
1   1234        1        0 ""                                   
2   1235        0        0 "<redacted> helps me in my classroom"
3   1237        1        1 "Professional development X"         
4   1241        0        1 "<redacted> provides coaching"       
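
Because the pattern is a regular expression, an unescaped period matches any single character, not only a literal period. The sketch below compares an unescaped and an escaped pattern on a hypothetical vector, comments, that is not part of d18.

```r
library(stringr)

# Hypothetical open-text responses
comments <- c("Mr. Lewis helps me in my classroom",
              "Mrs Lewis covers my class")

# Unescaped period: "." matches any character, so "Mrs Lewis" also matches
unescaped <- str_replace_all(comments, "Mr. Lewis|Sherry", "<redacted>")

# Escaped period: only the literal text "Mr. Lewis" matches
escaped <- str_replace_all(comments, "Mr[.] Lewis|Sherry", "<redacted>")

unescaped
escaped
```

When redacting, matching too much may be acceptable, but escaping the period keeps the pattern limited to the names you intended.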
