Package: base


Function: as.numeric()


1. Convert a character variable (Var3) to numeric

Review the data (d2)

# A tibble: 3 x 5
  Var1   Var2 Var3  Var4       Var5 
  <chr> <int> <chr> <chr>      <lgl>
1 b         2 3.6   10/10/2004 TRUE 
2 a        NA 8.5   12/14/2007 FALSE
3 c         3 X     08/09/2020 TRUE 

View the class for Var3

  • Note: Var3 was read in as character because an “X” was used to denote missing values in the data.
class(d2$Var3)
[1] "character"

Convert Var3 to numeric.

  • Note: In the case of Var3 you will get a warning message that says “NAs introduced by coercion”. In this case, I am okay with that because “X”s were used to denote missing values and I want those to be converted to NA.

HOWEVER, if your variable contains unexpected character values (spaces, extra decimal points, letters) that you were unaware of, those values will also be converted to NA, which you may not want. Whenever you get the warning message above, always look into the reason before moving on. It may be that your variable requires some sort of transformation (such as a recode) before converting the type.
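
One way to look into the warning before converting is to list the values that would fail to convert. This is only a minimal sketch: it assumes Var3 from d2 is rebuilt by hand as a plain vector, and the names var3, converted and bad_values are illustrative, not part of the original data.

```r
# Hypothetical stand-in for d2$Var3 from the example data
var3 <- c("3.6", "8.5", "X")

# Attempt the conversion quietly so the result can be inspected directly
converted <- suppressWarnings(as.numeric(var3))

# Values that were not missing before but are NA after conversion
bad_values <- unique(var3[!is.na(var3) & is.na(converted)])
bad_values
```

If bad_values contains anything other than the missing-value code you expected (here “X”), those entries probably need a recode before the type is converted.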

  • Note: We are recoding with dplyr::mutate() and overwriting the original variable by giving the new variable the same name as the original.
d2 <- d2 %>% 
  dplyr::mutate(Var3 = as.numeric(Var3))

class(d2$Var3)
[1] "numeric"
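
If you would rather make the recode explicit and avoid the warning altogether, one option is to turn the “X” values into NA before converting. The sketch below assumes dplyr (and tibble, which is installed with it) is available and rebuilds only the Var3 column under the hypothetical name d2_example. dplyr::na_if() replaces a given value with NA.

```r
library(dplyr)

# Hypothetical copy of the relevant column from d2
d2_example <- tibble::tibble(Var3 = c("3.6", "8.5", "X"))

d2_example <- d2_example %>%
  dplyr::mutate(
    Var3 = dplyr::na_if(Var3, "X"),  # recode "X" to NA first
    Var3 = as.numeric(Var3)          # nothing left that fails to parse
  )

d2_example$Var3
class(d2_example$Var3)
```

Because only the values you named are recoded, any other unexpected text would still trigger the warning, which is usually what you want.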

2. Convert a logical variable (Var5) to numeric

Review the data (d2)

# A tibble: 3 x 5
  Var1   Var2 Var3  Var4       Var5 
  <chr> <int> <chr> <chr>      <lgl>
1 b         2 3.6   10/10/2004 TRUE 
2 a        NA 8.5   12/14/2007 FALSE
3 c         3 X     08/09/2020 TRUE 

View the class for Var5

class(d2$Var5)
[1] "logical"

Convert Var5 to numeric.

d2 <- d2 %>% 
  dplyr::mutate(Var5 = as.numeric(Var5))

d2$Var5
[1] 1 0 1
class(d2$Var5)
[1] "numeric"
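
As a quick illustration of the rule being applied here: TRUE becomes 1, FALSE becomes 0, and a missing logical value stays NA. The vector flags below is made up for the example and is not part of d2.

```r
# Made-up logical vector that includes a missing value
flags <- c(TRUE, FALSE, NA)

flags_num <- as.numeric(flags)
flags_num         # 1 0 NA
class(flags_num)  # "numeric"
```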

3. Convert all character variables to numeric

Review the data (d4)

# A tibble: 3 x 4
     ID Var2  Var3  Var4 
  <dbl> <chr> <chr> <chr>
1     1 2     3.6   4    
2     2 X     8.5   6    
3     3 2.5   X     X    

View the class for all variables

  • Note: Var2, Var3 and Var4 are read in as character variables because an “X” was used to denote missing values in the data.

  • Note: Another way to deal with columns that use “X” to denote NA is to read in the data using a function where you explicitly state what the missing values are. Example: readr::read_csv("file.csv", na = "X"). If you read in your file this way, the columns would have been read in as numeric rather than character.
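
The read-time approach can be sketched as follows, assuming the readr package is installed. The file contents are made up to mirror d4 and are written to a temporary file so the example is self-contained; path and d4_example are illustrative names.

```r
# Write a small, made-up CSV to a temporary file for the demonstration
path <- tempfile(fileext = ".csv")
writeLines(c("ID,Var2,Var3,Var4",
             "1,2,3.6,4",
             "2,X,8.5,6",
             "3,2.5,X,X"), path)

# Treat "X" as missing while reading
d4_example <- readr::read_csv(path, na = "X")

# Check the class of every column
sapply(d4_example, class)
```

Note that supplying na = "X" replaces readr's default missing-value strings ("" and "NA"), so if your file also uses those you may prefer na = c("", "NA", "X").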

str(d4)
tibble [3 x 4] (S3: tbl_df/tbl/data.frame)
 $ ID  : num [1:3] 1 2 3
 $ Var2: chr [1:3] "2" "X" "2.5"
 $ Var3: chr [1:3] "3.6" "8.5" "X"
 $ Var4: chr [1:3] "4" "6" "X"

Convert all character variables to numeric variables

  • Note: Using the function dplyr::across(), we apply the same transformation across multiple columns.
  • Note: You must wrap is.character, a predicate function (one that returns TRUE or FALSE), in the tidyselect selection helper where().
  • Note: We are recoding with dplyr::mutate() and overwriting the original variables.
d4 <- d4 %>% 
  dplyr::mutate(dplyr::across(where(is.character), as.numeric))

View the class for all variables

str(d4)
tibble [3 x 4] (S3: tbl_df/tbl/data.frame)
 $ ID  : num [1:3] 1 2 3
 $ Var2: num [1:3] 2 NA 2.5
 $ Var3: num [1:3] 3.6 8.5 NA
 $ Var4: num [1:3] 4 6 NA

You can also call out the exact variables you want to convert

d4 %>% 
  dplyr::mutate(dplyr::across(Var2:Var4, as.numeric))
# A tibble: 3 x 4
     ID  Var2  Var3  Var4
  <dbl> <dbl> <dbl> <dbl>
1     1   2     3.6     4
2     2  NA     8.5     6
3     3   2.5  NA      NA

Or, in the case of this data frame, since you essentially want all variables to be numeric (ID just happens to already be numeric), you could convert all variables to numeric using the tidyselect selection helper everything().

d4 %>% 
  dplyr::mutate(dplyr::across(tidyselect::everything(), as.numeric))
# A tibble: 3 x 4
     ID  Var2  Var3  Var4
  <dbl> <dbl> <dbl> <dbl>
1     1   2     3.6     4
2     2  NA     8.5     6
3     3   2.5  NA      NA
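
To check for yourself that the three ways of selecting columns give the same result on this data, here is a small reproducible sketch. It assumes dplyr is installed and rebuilds d4 by hand as d4_example; the object names by_type, by_range and by_all are illustrative only.

```r
library(dplyr)

# Rebuild d4 by hand (values copied from the example data)
d4_example <- tibble::tibble(
  ID   = c(1, 2, 3),
  Var2 = c("2", "X", "2.5"),
  Var3 = c("3.6", "8.5", "X"),
  Var4 = c("4", "6", "X")
)

# suppressWarnings() hides the expected "NAs introduced by coercion" warnings
by_type  <- suppressWarnings(mutate(d4_example, across(where(is.character), as.numeric)))
by_range <- suppressWarnings(mutate(d4_example, across(Var2:Var4, as.numeric)))
by_all   <- suppressWarnings(mutate(d4_example, across(everything(), as.numeric)))

# Compare the results of the three selections
identical(by_type, by_range)
identical(by_type, by_all)
```

The where(is.character) version is the safest default, because it leaves columns of other types (dates, logicals, factors) untouched if they are added to the data later.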

4. Convert a factor variable (Var3) to numeric

Review the data (d3)

# A tibble: 3 x 4
  Var1   Var2 Var3  Var4      
  <chr> <int> <fct> <chr>     
1 b         2 3     10/10/2004
2 a        NA 8     12/14/2007
3 c         3 2     08/09/2020

View the class for Var3

class(d3$Var3)
[1] "factor"

Convert Var3 to numeric.

  • Note: We MUST convert the factor variable to character before converting to numeric or we will not retain our original values. Instead, base::as.numeric() will return the underlying integer codes of the factor levels (3=2, 8=3, 2=1), which is not what we want. Compare the first example with the second.

Don’t do this

d3 %>% 
  dplyr::mutate(Var3 = as.numeric(Var3))
# A tibble: 3 x 4
  Var1   Var2  Var3 Var4      
  <chr> <int> <dbl> <chr>     
1 b         2     2 10/10/2004
2 a        NA     3 12/14/2007
3 c         3     1 08/09/2020

Do this

d3 <- d3 %>% 
  dplyr::mutate(Var3 = as.numeric(as.character(Var3)))

d3
# A tibble: 3 x 4
  Var1   Var2  Var3 Var4      
  <chr> <int> <dbl> <chr>     
1 b         2     3 10/10/2004
2 a        NA     8 12/14/2007
3 c         3     2 08/09/2020
class(d3$Var3)
[1] "numeric"
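
The same behaviour can be seen on a bare vector, which may make it clearer why the detour through character is needed. The factor f below is a made-up stand-in for d3$Var3. The last line shows an alternative that the base R help page for factor() describes as slightly more efficient than as.numeric(as.character(f)).

```r
# Made-up stand-in for d3$Var3; levels are sorted as "2", "3", "8"
f <- factor(c("3", "8", "2"))

codes  <- as.numeric(f)                # integer codes of the levels: 2 3 1
values <- as.numeric(as.character(f))  # original values: 3 8 2
alt    <- as.numeric(levels(f))[f]     # same values via the levels: 3 8 2

codes
values
alt
```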
