Separate rows

Package: tidyr

Function: `separate_rows()`

Note: Unlike tidyr::separate() which splits columns into new columns, tidyr::separate_rows() splits a column with multiple delimited values into new rows.
Note: Since writing this, tidyr::separate_rows() has been superseded by separate_longer_delim(). See this link for more info.

1. Split a variable (toppings) into multiple rows

Review the data (d5)

# A tibble: 3 x 2
  name  toppings               
  <chr> <chr>                  
1 Marge anchovies,garlic,tomato
2 Homer sausage,red pepper     
3 Ned   cheese

Split toppings into new rows

Note: The original variable will be removed unless you add the argument remove=FALSE
Note: If no separator is indicated, the function will split values wherever a non-alphanumeric character appears. If you want to be clear where to separate, add the argument sep=“,” as I did below otherwise the words red and pepper will be split into separate rows.

d5 %>% 
  tidyr::separate_rows(toppings, sep=",")

# A tibble: 6 x 2
  name  toppings  
  <chr> <chr>     
1 Marge anchovies 
2 Marge garlic    
3 Marge tomato    
4 Homer sausage   
5 Homer red pepper
6 Ned   cheese

Note: While I think tidyr::separate_rows() is a very clean way of accomplishing the goal above, I think it is nice to see that there are other methods to accomplishing this result. One of those other ways is using stringr::str_split().

Notice here though that you have to use dplyr::mutate() to create a new column (or in this case replace the existing column).

d5 %>%
  mutate(toppings = stringr::str_split(toppings, pattern = ","))

# A tibble: 3 x 2
  name  toppings 
  <chr> <list>   
1 Marge <chr [3]>
2 Homer <chr [2]>
3 Ned   <chr [1]>

You’ll also notice that you do not get a normal character vector for toppings. You get a nested list-column that now requires us to tidyr::unnest() in order to see the information.

So while this method does work, you can see what tidyr::separte_rows() is much simpler for this use case.

d5 %>%
  mutate(toppings = stringr::str_split(toppings, pattern = ",")) %>%
  tidyr::unnest()

# A tibble: 6 x 2
  name  toppings  
  <chr> <chr>     
1 Marge anchovies 
2 Marge garlic    
3 Marge tomato    
4 Homer sausage   
5 Homer red pepper
6 Ned   cheese

Return to Separate

Separate rows

Package: tidyr

Function: separate_rows()

Function: `separate_rows()`