Package: tidyr


Function: separate_rows()


  • Note: Unlike tidyr::separate() which splits columns into new columns, tidyr::separate_rows() splits a column with multiple delimited values into new rows.

  • Note: Since writing this, tidyr::separate_rows() has been superseded by separate_longer_delim(). See this link for more info.


1. Split a variable (toppings) into multiple rows

Review the data (d5)

# A tibble: 3 x 2
  name  toppings               
  <chr> <chr>                  
1 Marge anchovies,garlic,tomato
2 Homer sausage,red pepper     
3 Ned   cheese                 

Split toppings into new rows

  • Note: The original variable will be removed unless you add the argument remove=FALSE
  • Note: If no separator is indicated, the function will split values wherever a non-alphanumeric character appears. If you want to be clear where to separate, add the argument sep=“,” as I did below otherwise the words red and pepper will be split into separate rows.
d5 %>% 
  tidyr::separate_rows(toppings, sep=",")
# A tibble: 6 x 2
  name  toppings  
  <chr> <chr>     
1 Marge anchovies 
2 Marge garlic    
3 Marge tomato    
4 Homer sausage   
5 Homer red pepper
6 Ned   cheese    
  • Note: While I think tidyr::separate_rows() is a very clean way of accomplishing the goal above, I think it is nice to see that there are other methods to accomplishing this result. One of those other ways is using stringr::str_split().

Notice here though that you have to use dplyr::mutate() to create a new column (or in this case replace the existing column).

d5 %>%
  mutate(toppings = stringr::str_split(toppings, pattern = ",")) 
# A tibble: 3 x 2
  name  toppings 
  <chr> <list>   
1 Marge <chr [3]>
2 Homer <chr [2]>
3 Ned   <chr [1]>

You’ll also notice that you do not get a normal character vector for toppings. You get a nested list-column that now requires us to tidyr::unnest() in order to see the information.

So while this method does work, you can see what tidyr::separte_rows() is much simpler for this use case.

d5 %>%
  mutate(toppings = stringr::str_split(toppings, pattern = ",")) %>%
  tidyr::unnest()
# A tibble: 6 x 2
  name  toppings  
  <chr> <chr>     
1 Marge anchovies 
2 Marge garlic    
3 Marge tomato    
4 Homer sausage   
5 Homer red pepper
6 Ned   cheese    

Return to Separate