pivot_wider()tidyr::pivot_wider(). Type ?pivot_wider in your console to see all arguments. But the most basic arguments to be aware of include:1. Restructure our longitudinal data into wide format using the “wave” column
Review the data (d2)
# A tibble: 6 x 4
tch_id treatment wave q1
<dbl> <dbl> <dbl> <dbl>
1 1234 1 1 3
2 1234 1 2 5
3 2345 0 1 4
4 2345 0 2 5
5 3456 1 1 NA
6 3456 1 2 2
Restructure the data into wide format, adding “w#_” as a prefix to the “q1” variable.
d2 %>%
tidyr::pivot_wider(
names_from = wave,
names_glue = "w{wave}_{.value}",
values_from = q1)
# A tibble: 3 x 4
tch_id treatment w1_q1 w2_q1
<dbl> <dbl> <dbl> <dbl>
1 1234 1 3 5
2 2345 0 4 5
3 3456 1 NA 2
2. Restructure our longitudinal data into wide format using the “year” and “cycle” column
Review the data (d7)
# A tibble: 8 x 5
id year cycle q1 q2
<dbl> <chr> <dbl> <dbl> <dbl>
1 1234 1819 1 1 2
2 1234 1819 2 2 3
3 1234 1920 1 2 4
4 1234 1920 2 2 3
5 2345 1819 1 0 1
6 2345 1819 2 1 3
7 2345 1920 1 2 3
8 2345 1920 2 3 3
Restructure the data into wide format, adding the year and cycle as a prefix to “q1” and “q2”.
d7 %>%
pivot_wider(
names_from = c(year,cycle),
values_from = c(q1, q2),
names_glue = "y{year}_c{cycle}_{.value}"
)
# A tibble: 2 x 9
id y1819_c1_q1 y1819_c2_q1 y1920_c1_q1 y1920_c2_q1 y1819_c1_q2 y1819_c2_q2
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1234 1 2 2 2 2 3
2 2345 0 1 2 3 1 3
# ... with 2 more variables: y1920_c1_q2 <dbl>, y1920_c2_q2 <dbl>
3. Restructure the data into wide format to calculate a statistic such as intraclass correlation coefficient
Review the data (d3)
# A tibble: 6 x 3
tch_id rater_id score
<dbl> <dbl> <dbl>
1 1234 16 23
2 1234 22 24
3 2345 16 22
4 2345 22 19
5 3456 16 18
6 3456 22 20
For ICC analysis, we typically only want two columns of data, one for rater 1 and one for rater 2 (or one column per rater if we have more than 2 raters per case).
d3 %>%
tidyr::pivot_wider(names_from=rater_id,
values_from=score,
names_prefix= "rater_") %>%
select(-tch_id)
# A tibble: 3 x 2
rater_16 rater_22
<dbl> <dbl>
1 23 24
2 22 19
3 18 20
If we are only concerned with the pairs, and not the individual raters, we can also use this method even if the raters are not consistent. Take this data for example, where the raters change, but we have a column to indicate which rater is primary and which rater is secondary.
# A tibble: 7 x 4
tch_id rater_id primary score
<dbl> <dbl> <dbl> <dbl>
1 1234 16 1 23
2 1234 22 2 24
3 2345 16 1 22
4 2345 25 2 19
5 3456 18 1 18
6 3456 22 2 20
7 4567 18 1 22
We can still restructure this data.
d4 %>%
tidyr::pivot_wider(
id_cols = tch_id,
names_from=primary,
values_from=score,
names_prefix= "rater_") %>%
dplyr::select(-tch_id)
# A tibble: 4 x 2
rater_1 rater_2
<dbl> <dbl>
1 23 24
2 22 19
3 18 20
4 22 NA
Return to Restructure