Package: base


Function: sample()


1. Randomly assign treatment to cases

Review the data (d1)

# A tibble: 14 x 2
      id grade
   <dbl> <dbl>
 1    10     3
 2    11     3
 3    12     3
 4    13     4
 5    14     4
 6    15     4
 7    16     3
 8    17     3
 9    18     4
10    19     4
11    20     3
12    21     4
13    22     4
14    23     3
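
To follow along, d1 can be rebuilt from the printout above. This is only a sketch: it uses a plain data.frame rather than a tibble (an assumption, so that no packages are needed), with both columns stored as doubles to match the <dbl> column types shown.

```r
# Rebuild the example data from the printed table.
# Assumption: a base data.frame stands in for the original tibble.
d1 <- data.frame(
  id    = as.numeric(10:23),
  grade = c(3, 3, 3, 4, 4, 4, 3, 3, 4, 4, 3, 4, 4, 3)
)

# Count cases per grade; each grade has 7 cases.
grade_counts <- table(d1$grade)
grade_counts
```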

I want to randomly assign a treatment variable (treat) to cases, where 1 = treatment and 0 = control. I want the two groups to be the same size.

First, I need to call base::set.seed() and choose any number as my seed. Setting the seed ensures that if I run this code again later, I will get the same random assignment each time. Very important!
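
A quick way to see what the seed buys you is to draw twice from the same seed and compare. This is a minimal sketch; the object names first_draw and second_draw are made up for illustration.

```r
# Draw a random ordering of 1..14, resetting the seed before each draw.
set.seed(1234)
first_draw <- sample(1:14)

set.seed(1234)
second_draw <- sample(1:14)

# With the same seed, both draws are the same permutation.
identical(first_draw, second_draw)
#> [1] TRUE
```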

Within my base::sample() function I can use base::rep() to replicate the values 1 and 0, and set how many times the pair is repeated in the times argument. I could put the literal number 7 in this argument, or if I wanted R to do the calculation for me, I could use dplyr::n() to count the number of rows and divide it by 2.
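
To make the pieces concrete, here is a small sketch of what base::rep() builds and what base::sample() then does with it, outside of any data frame. The names pool, shuffled, and odd_pool are hypothetical. The last step is my own assumption about how one might handle an odd number of rows, where dividing by 2 does not give a whole number; it is not part of the approach described above.

```r
# The pair c(1, 0) repeated 7 times gives 14 assignments: seven 1s and seven 0s.
pool <- rep(c(1, 0), times = 7)
pool
#>  [1] 1 0 1 0 1 0 1 0 1 0 1 0 1 0

# sample() with no size argument returns a random permutation of its input,
# so the shuffled vector still holds seven 1s and seven 0s.
set.seed(1234)
shuffled <- sample(pool)
table(shuffled)

# Assumption: for an odd row count (here 5), length.out builds a pool of
# exactly that length, with one extra 1.
odd_pool <- rep(c(1, 0), length.out = 5)
odd_pool
#> [1] 1 0 1 0 1
```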

  • Note: I am using dplyr::mutate() to create my new “treat” variable.

library(dplyr)  # attaches the %>% pipe used below
base::set.seed(1234)

d1_new <- d1 %>%
  dplyr::mutate(treat = base::sample(base::rep(c(1,0), times = dplyr::n()/2)))

d1_new
# A tibble: 14 x 3
      id grade treat
   <dbl> <dbl> <dbl>
 1    10     3     0
 2    11     3     0
 3    12     3     0
 4    13     4     1
 5    14     4     1
 6    15     4     1
 7    16     3     0
 8    17     3     0
 9    18     4     0
10    19     4     0
11    20     3     1
12    21     4     1
13    22     4     1
14    23     3     1

Let’s make sure we have an equal number of treatment and control cases

d1_new %>%
  janitor::tabyl(treat)
 treat n percent
     0 7     0.5
     1 7     0.5

If I cared about grade level and wanted to randomly assign treatment evenly within grade, I could add a dplyr::group_by() statement and group by grade before randomly assigning.

Notice now that treatment cannot be evenly divided, though (each grade level has 7 participants). In this case you can repeat each value a different number of times. For example, here I chose to repeat 1 three times and 0 four times.

base::set.seed(1234)

d1_new <- d1 %>%
  dplyr::group_by(grade) %>%
  dplyr::mutate(treat = base::sample(base::rep(c(1,0), times = c(3,4))))

d1_new
# A tibble: 14 x 3
# Groups:   grade [2]
      id grade treat
   <dbl> <dbl> <dbl>
 1    10     3     0
 2    11     3     1
 3    12     3     0
 4    13     4     0
 5    14     4     0
 6    15     4     1
 7    16     3     0
 8    17     3     1
 9    18     4     1
10    19     4     0
11    20     3     1
12    21     4     0
13    22     4     1
14    23     3     0

Let’s review our treatment assignment by grade now

d1_new %>%
  janitor::tabyl(grade, treat)
 grade 0 1
     3 4 3
     4 4 3
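
The fixed counts c(3, 4) work here because both grades have exactly 7 cases. As a rough sketch of how the counts could instead be computed per group, the example below uses base R's ave() in place of dplyr::group_by() (a swapped-in technique, not the approach used above) and a hypothetical helper, assign_half(), which gives the extra case to control whenever a group has an odd size. The data frame is rebuilt from the printout so the block runs on its own.

```r
# Rebuild the example data (assumption: a base data.frame instead of a tibble).
d1 <- data.frame(
  id    = as.numeric(10:23),
  grade = c(3, 3, 3, 4, 4, 4, 3, 3, 4, 4, 3, 4, 4, 3)
)

# Hypothetical helper: shuffle floor(n/2) ones and ceiling(n/2) zeros,
# where n is the number of cases in the group passed in.
assign_half <- function(x) {
  n <- length(x)
  sample(rep(c(1, 0), times = c(floor(n / 2), ceiling(n / 2))))
}

set.seed(1234)

# ave() applies assign_half() separately within each grade and returns
# the results in the original row order.
d1$treat <- ave(d1$id, d1$grade, FUN = assign_half)

# Each grade should show 4 control (0) and 3 treatment (1) cases.
table(d1$grade, d1$treat)
```

With dplyr, the same idea should carry over by replacing times = c(3,4) in the grouped mutate() call with counts derived from dplyr::n().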
