compare_df_cols()
1. Compare two data frames that should have identical column types to see if there are any differences
Read in the data
entry1 <- readr::read_csv("project-a_forms_entry1.csv")
entry3 <- readr::read_csv("project-b_forms_entry3.csv")
Review entry1
# A tibble: 4 x 5
  stu_id grade    q1    q2    q3
   <dbl> <dbl> <dbl> <dbl> <dbl>
1   1234     1     2     4     6
2   1235     2     1     5     6
3   1236     1    NA    12     4
4   1237     3     3     2     4
Review entry3
# A tibble: 4 x 5
  stu_id grade    q1    q2    q3
   <dbl> <chr> <dbl> <dbl> <dbl>
1   1234 1         1     4     6
2   1235 2         1     5     6
3   1236 1        NA     1     4
4   1237 3         3     2     4
Now check whether the column classes differ between the two data frames.
The function janitor::compare_df_cols() lists the class of every column
in each data frame, which shows whether the data frames will bind
together by rows.
janitor::compare_df_cols(entry1, entry3)
  column_name  entry1    entry3
1       grade numeric character
2          q1 numeric   numeric
3          q2 numeric   numeric
4          q3 numeric   numeric
5      stu_id numeric   numeric
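When only a yes/no answer is needed, janitor also provides compare_df_cols_same(), which returns a single TRUE or FALSE. The sketch below is a minimal illustration, not part of the original walkthrough: it rebuilds two small stand-in data frames by hand (an assumption, so that it runs without the CSV files) and keeps only the columns needed to show the check.

```r
# Hypothetical stand-ins for the two files, reduced to two columns
entry1 <- data.frame(
  stu_id = c(1234, 1235, 1236, 1237),
  grade  = c(1, 2, 1, 3)                 # numeric, as in entry1
)
entry3 <- data.frame(
  stu_id = c(1234, 1235, 1236, 1237),
  grade  = c("1", "2", "1", "3")         # character, as in entry3
)

# TRUE only if every shared column has a compatible class for row-binding
can_bind <- janitor::compare_df_cols_same(entry1, entry3)
can_bind
```

Here can_bind is FALSE because grade is numeric in one data frame and character in the other.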
If we only wanted to see the differences, we could add the argument
return = "mismatch". We can easily see here that
grade is a different variable type across files (numeric in entry1,
character in entry3), and we will need to transform the variable in
entry1 or entry3 if we want to bind the two data frames.
janitor::compare_df_cols(entry1, entry3, return = "mismatch")
  column_name  entry1    entry3
1       grade numeric character
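One possible way to resolve the mismatch is sketched below. It is an illustration under stated assumptions rather than the author's method: the two data frames are rebuilt by hand from the printed output above so the code runs without the CSV files, every grade value in entry3 is assumed to be a clean number stored as text, and the names combined and source are hypothetical.

```r
# Stand-ins for the two files, rebuilt from the printed output (assumption)
entry1 <- data.frame(
  stu_id = c(1234, 1235, 1236, 1237),
  grade  = c(1, 2, 1, 3),
  q1     = c(2, 1, NA, 3),
  q2     = c(4, 5, 12, 2),
  q3     = c(6, 6, 4, 4)
)
entry3 <- data.frame(
  stu_id = c(1234, 1235, 1236, 1237),
  grade  = c("1", "2", "1", "3"),
  q1     = c(1, 1, NA, 3),
  q2     = c(4, 5, 1, 2),
  q3     = c(6, 6, 4, 4)
)

# Convert grade in entry3 to numeric so it matches entry1.
# Assumption: no grade value contains non-numeric text.
entry3$grade <- as.numeric(entry3$grade)

# Re-run the comparison; the mismatch table should now have no rows
mismatches <- janitor::compare_df_cols(entry1, entry3, return = "mismatch")

# Bind by rows, recording which file each row came from
combined <- dplyr::bind_rows(entry1 = entry1, entry3 = entry3, .id = "source")
```

Converting entry3 rather than entry1 keeps grade numeric, which seems to fit the other columns; if grade should be treated as a label instead, as.character() on entry1 would work equally well.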