Package: stringr


Function: str_extract()


1. Extract specified phrases from a list of tags in a variable

Review the data (d17)

# A tibble: 4 x 2
     id tags                                    
  <dbl> <chr>                                   
1   123 long tag 1, long tag 3, long tag 5      
2   124 other long tag 2, long tag 3            
3   126 long tag 1, other long tag 2, long tag 6
4   127 long tag 6                              

The tags variable holds several comma-separated tags per row, but we want a new variable that keeps only the tags from a specified list. First we create a vector of the tags of interest.

tags_of_interest <- c("long tag 1", "other long tag 2", "long tag 6")

Now we create the new variable, special_tags, which pulls any tag on that list out of the existing tags variable.
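
To see why this works, it may help to try str_extract() on a single value first. The sketch below assumes one tags string copied from row 2 of d17 (the name one_row is only for illustration). Given one string and a vector of patterns, str_extract() returns one result per pattern, with NA where a pattern is not found.

```r
library(stringr)

tags_of_interest <- c("long tag 1", "other long tag 2", "long tag 6")

# Hypothetical single value, copied from row 2 of d17
one_row <- "other long tag 2, long tag 3"

# One string, three patterns: one result per pattern, NA when there is no match
hits <- str_extract(one_row, tags_of_interest)
hits
# [1] NA                 "other long tag 2" NA

# Drop the NAs and collapse what is left into a single string
collapsed <- paste0(na.omit(hits), collapse = "; ")
collapsed
# [1] "other long tag 2"
```

Doing this once per row is what the rowwise step in the pipeline takes care of.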

  • Note: Because we are adding dplyr::rowwise(), we need to ungroup() at the end so that it doesn't slow down our future operations.
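d17 %>%
  # Work one row at a time so str_extract() sees a single tags string
  dplyr::rowwise() %>%
  dplyr::mutate(
    special_tags =
      # One result per tag of interest, NA where the tag is absent
      stringr::str_extract(tags, tags_of_interest) %>%
      na.omit() %>%
      # Join the tags that were found into one string
      paste0(collapse = "; ")
  ) %>%
  # Remove the rowwise grouping
  dplyr::ungroup()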
# A tibble: 4 x 3
     id tags                                     special_tags                   
  <dbl> <chr>                                    <chr>                          
1   123 long tag 1, long tag 3, long tag 5       long tag 1                     
2   124 other long tag 2, long tag 3             other long tag 2               
3   126 long tag 1, other long tag 2, long tag 6 long tag 1; other long tag 2; ~
4   127 long tag 6                               long tag 6                     
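
A possible alternative that avoids rowwise() altogether is sketched below. It is not the approach used above: it assumes the tags of interest contain no regex special characters, joins them into a single pattern with "|", and uses stringr::str_extract_all(). The d17 data is re-created here only so the example is self-contained, and the names pattern and result are illustrative.

```r
library(dplyr)
library(stringr)

# Re-create the example data shown above
d17 <- tibble::tibble(
  id = c(123, 124, 126, 127),
  tags = c(
    "long tag 1, long tag 3, long tag 5",
    "other long tag 2, long tag 3",
    "long tag 1, other long tag 2, long tag 6",
    "long tag 6"
  )
)

tags_of_interest <- c("long tag 1", "other long tag 2", "long tag 6")

# Single alternation pattern: "long tag 1|other long tag 2|long tag 6"
pattern <- paste(tags_of_interest, collapse = "|")

result <- d17 %>%
  mutate(
    # str_extract_all() returns a list with one character vector per row;
    # vapply() collapses each vector into a single string
    special_tags = str_extract_all(tags, pattern) %>%
      vapply(paste0, character(1), collapse = "; ")
  )

result
```

For this data the special_tags column comes out the same as in the rowwise version. In general the two can differ: here the tags appear in the order they occur in the string rather than the order of tags_of_interest, and a tag that occurs twice in a string would be listed twice.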
