str_extract()1. Extract all numeric values from
item1
Review the data (d3)
# A tibble: 4 x 2
id item1
<dbl> <chr>
1 1 12cm
2 2 4cm
3 3 6cm
4 4 2.5CM
We first need to create our new variable item1 which
will write over our existing item1 variable since I named
it the same name. We do this using dplyr::mutate().
Next we use the argument pattern = in our
stringr function to denote what pattern we want to extract.
If it wasn’t for the 2.5 value, this pattern would be fairly simple. We
could use something like “\\d*” to say zero or more digits.
But since we have a “2.5” value, the pattern is a little more complicated. We say the pattern starts with zero or more digits (“^\\d*”) and then an optional period (“\\.?”) and then zero or more digits (“\\d*”).
d3 %>%
dplyr::mutate(item1 = stringr::str_extract(item1, pattern = "^\\d*\\.?\\d*"))
# A tibble: 4 x 2
id item1
<dbl> <chr>
1 1 12
2 2 4
3 3 6
4 4 2.5
extract()1. Extract all numeric values from
item1
Review the data (d3)
# A tibble: 4 x 2
id item1
<dbl> <chr>
1 1 12cm
2 2 4cm
3 3 6cm
4 4 2.5CM
Using tidyr::extract() you no longer need to use
dplyr::mutate() because you have the argument into
where you name the new extracted variable/s.
The other thing to note is that in the regex argument, each pattern associated with an “into” variable needs to be surrounded by parentheses.
d3 %>%
tidyr::extract(col =item1, into = "item1", regex = "(^\\d*\\.?\\d*)")
# A tibble: 4 x 2
id item1
<dbl> <chr>
1 1 12
2 2 4
3 3 6
4 4 2.5
Note again that the variable is still character and would need to be converted to numeric as a next step.
parse_number()1. Extract all numeric values from
item1
Review the data (d3)
# A tibble: 4 x 2
id item1
<dbl> <chr>
1 1 12cm
2 2 4cm
3 3 6cm
4 4 2.5CM
There is another function readr::parse_number() that
pulls out all numeric values from a string variable and converts the
variable to numeric for us, which is pretty cool.
d3 %>%
dplyr::mutate(item1 = readr::parse_number(item1))
# A tibble: 4 x 2
id item1
<dbl> <dbl>
1 1 12
2 2 4
3 3 6
4 4 2.5
Return to Strings