Package: dplyr


Function: count()


1. Count how many distinct schools we have

Review the data (d1)

# A tibble: 6 x 3
  school tch_id stu_id
  <chr>   <dbl>  <dbl>
1 a          12     30
2 b          13     20
3 a          12     50
4 b          17     22
5 c          18     25
6 c          18     35
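
To follow along, d1 can be rebuilt by hand. This is a minimal sketch that assumes the data are exactly the six rows printed above; the original page does not show how d1 was created, so the use of tibble::tibble() here is an assumption.

```r
library(dplyr)  # also makes the %>% pipe available

# Recreate the example data shown above (values copied from the printout).
d1 <- tibble::tibble(
  school = c("a", "b", "a", "b", "c", "c"),
  tch_id = c(12, 13, 12, 17, 18, 18),
  stu_id = c(30, 20, 50, 22, 25, 35)
)

d1
```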

Count the number of distinct schools

Because each school appears on more than one row of the dataset, you need to remove duplicates before counting. We can use dplyr::distinct() to keep one row per school and then dplyr::count() to count those rows.

d1 %>%
  dplyr::distinct(school) %>%
  dplyr::count()
# A tibble: 1 x 1
      n
  <int>
1     3

Alternatively, we could use dplyr::n_groups(), which returns the number of groups in a grouped data frame. You first group by your variable of interest (in this case school) and then call the function.

  • Note: dplyr::count() and dplyr::n_groups() return different kinds of output. The former returns a tibble and the latter returns a single integer.
d1 %>%
  dplyr::group_by(school) %>%
  dplyr::n_groups()
[1] 3
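
The difference in output type can be checked directly. The sketch below rebuilds d1 from the printed data (an assumption, since the page does not show its construction) and stores both results under the illustrative names n_tbl and n_int.

```r
library(dplyr)

# Example data, copied from the printout of d1.
d1 <- tibble::tibble(
  school = c("a", "b", "a", "b", "c", "c"),
  tch_id = c(12, 13, 12, 17, 18, 18),
  stu_id = c(30, 20, 50, 22, 25, 35)
)

# Approach 1: distinct() then count() gives a one-row tibble.
n_tbl <- d1 %>%
  dplyr::distinct(school) %>%
  dplyr::count()

# Approach 2: group_by() then n_groups() gives a bare integer.
n_int <- d1 %>%
  dplyr::group_by(school) %>%
  dplyr::n_groups()

is.data.frame(n_tbl)  # TRUE
is.integer(n_int)     # TRUE

# pull() extracts the count from the tibble as a plain vector.
dplyr::pull(n_tbl, n)
```

If the count is needed as a number for further arithmetic, n_groups() or the pull() step avoids carrying a tibble around.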
