I have a data frame in the format shown below. Status has two levels (PRE, POST).
| SI_mean | TU_mean | ED_mean | MT_mean | DT_mean | SK_mean | ATT_mean | Status |
|---|---|---|---|---|---|---|---|
| 2.6 | 2.75 | 2.6 | 2.8 | 3.4 | 2.5 | 3.8 | PRE |
| 3. | 3. | 2.4 | 2.4 | 3. | 3. | 4. | PRE |
| 2.4 | 2.75 | 2.4 | 2.2 | 2.6 | 2.25 | 2.8 | PRE |
I want to use wilcox.test to compare the values between the two Status levels for each column. So my first attempt was:
df |>
summarise(across(contains("mean"),~wilcox.test(.x~Status)$p.value))
but I was greeted with
Error in `summarise()`:
ℹ In argument: `across(contains("mean"), ~wilcox.test(.x ~ Status)$p.value)`.
ℹ In row 1.
Caused by error in `across()`:
! Can't compute column `SI_mean`.
Caused by error in `wilcox.test.formula()`:
! grouping factor must have exactly 2 levels
So I switched to long format, which works as expected:
df |> pivot_longer(contains("mean"),names_to = "Variable",values_to = "Mean") |>
group_by(Variable) |>
summarise(
wilcox_p_value = wilcox.test(Mean~Status)$p.value
)
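But why does `summarise` fail in wide format? I'm just interested in what I have misunderstood about the `summarise` function, and how I could make it work on the wide format.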
Data
df=structure(list(SI_mean = c(2.6, 3, 2.4, 3, 3, 3.2, 2.2, 4, 3.8,
2.8, 3.6, 2, 3.6, 3.6, 3.8, 3.2, 3, 4, 4, 3, 3.2, 4, 4, 3.2,
3.2, 3, 3.2, 3.8, 4, 4, 4, 3), TU_mean = c(2.75, 3, 2.75, 3,
3, 2.75, 3, 3.5, 3.75, 2.5, 3.25, 2, 3.5, 4, 3, 3.25, 3, 4, 4,
3, 4, 4, 4, 3.25, 3.25, 3, 3, 3.25, 4, 4, 4, 3), ED_mean = c(2.6,
2.4, 2.4, 3, 2.8, 4, 2, 3.8, 2.6, 2, 2.8, 2, 3, 3.4, 3, 1, 3,
4, 3.8, 3, 3, 4, 4, 3.2, 4, 2.6, 4, 4, 3.8, 3.6, 4, 3), MT_mean = c(2.8,
2.4, 2.2, 3, 2.8, 3.4, 2.2, 3.6, 3.4, 3, 2.6, 1.8, 3.4, 3, 4,
2, 3, 3.4, 3.4, 3, 4, 4, 4, 3.2, 4, 2.8, 4, 4, 3.8, 3.6, 4, 3
), DT_mean = c(3.4, 3, 2.6, 3, 3, 3.8, 2.4, 3.6, 3, 3, 2.8, 2.4,
3.6, 3.6, 3, 2.2, 3, 4, 4, 4, 3.6, 4, 4, 3.6, 3.8, 2.8, 4, 4,
4, 3.8, 4, 3), SK_mean = c(2.5, 3, 2.25, 3, 3, 3.5, 2.25, 4,
3.25, 2.25, 2.5, 2.5, 3.75, 3.75, 4, 1, 2, 4, 3.25, 3, 3.75,
4, 4, 2.75, 4, 3, 4, 4, 4, 4, 4, 3), ATT_mean = c(3.8, 4, 2.8,
3, 3, 3.8, 3, 3.6, 3, 4, 4, 3, 3.8, 3.6, 4, 3.8, 4, 4, 4, 4,
4, 4, 4, 3.6, 3.8, 3, 4, 4, 4, 4, 4, 4), Status = c("PRE", "PRE",
"PRE", "PRE", "PRE", "PRE", "PRE", "PRE", "PRE", "PRE", "PRE",
"PRE", "PRE", "PRE", "PRE", "PRE", "PRE", "POST", "POST", "POST",
"POST", "POST", "POST", "POST", "POST", "POST", "POST", "POST",
"POST", "POST", "POST", "POST")), class = c("rowwise_df", "tbl_df",
"tbl", "data.frame"), row.names = c(NA, -32L), groups = structure(list(
.rows = structure(list(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L,
10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L,
21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L,
32L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, -32L), class = c("tbl_df",
"tbl", "data.frame")))