Calculating statistics on subsets of data

Here is a small reproducible example of my data:


> mydata <- structure(list(subject = c(1, 1, 1, 2, 2, 2), time = c(0, 1, 2, 0, 1, 2), measure = c(10, 12, 8, 7, 0, 0)), .Names = c("subject", "time", "measure"), row.names = c(NA, -6L), class = "data.frame")


> mydata


subject  time  measure
1        0     10
1        1     12
1        2      8
2        0      7
2        1      0
2        2      0

I would like to generate a new variable containing the subject-specific mean of measure, like this:


subject  time  measure  mn_measure
1        0     10       10
1        1     12       10
1        2      8       10
2        0      7       2.333
2        1      0       2.333
2        2      0       2.333

Is there a simple way to do this, other than looping over all the records programmatically or first reshaping to wide format?


宝慕林4294392
3 Answers

慕妹3242003

Use the base R function ave(), which, despite its confusing name, can compute a variety of statistics, including the mean:

> within(mydata, mean <- ave(measure, subject, FUN = mean))
  subject time measure      mean
1       1    0      10 10.000000
2       1    1      12 10.000000
3       1    2       8 10.000000
4       2    0       7  2.333333
5       2    1       0  2.333333
6       2    2       0  2.333333

Note that I use within() only to shorten the code. Here is the equivalent without within():

> mydata$mean <- ave(mydata$measure, mydata$subject, FUN = mean)
> mydata
  subject time measure      mean
1       1    0      10 10.000000
2       1    1      12 10.000000
3       1    2       8 10.000000
4       2    0       7  2.333333
5       2    1       0  2.333333
6       2    2       0  2.333333
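To illustrate the point that ave() is not limited to means, here is a minimal sketch that passes other functions as FUN. The column names max_measure and cum_measure are made up for this illustration, and the data frame is rebuilt with data.frame() only so the snippet runs on its own.

```r
# Same example data as in the question, rebuilt so the snippet is self-contained
mydata <- data.frame(
  subject = c(1, 1, 1, 2, 2, 2),
  time    = c(0, 1, 2, 0, 1, 2),
  measure = c(10, 12, 8, 7, 0, 0)
)

# ave() returns a vector as long as its input; a one-value summary
# is repeated for every row of the group it belongs to
mydata$mn_measure  <- ave(mydata$measure, mydata$subject, FUN = mean)
mydata$max_measure <- ave(mydata$measure, mydata$subject, FUN = max)   # hypothetical column name

# FUN may also return one value per row, e.g. a running total within each subject
mydata$cum_measure <- ave(mydata$measure, mydata$subject, FUN = cumsum) # hypothetical column name

print(mydata)
```

FUN has to be passed by name, because every unnamed argument after the first is treated as a grouping variable.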

翻过高山走不出你

Or with the data.table package:

library(data.table)
dt <- data.table(mydata, key = "subject")
dt[, mn_measure := mean(measure), by = subject]
#    subject time measure mn_measure
# 1:       1    0      10  10.000000
# 2:       1    1      12  10.000000
# 3:       1    2       8  10.000000
# 4:       2    0       7   2.333333
# 5:       2    1       0   2.333333
# 6:       2    2       0   2.333333
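If more than one per-subject statistic is needed, the same := idea can add several columns in one call. This is only a sketch, assuming the data.table package is installed; max_measure is a made-up column name, and the data frame is rebuilt so the snippet runs on its own.

```r
library(data.table)  # assumes data.table is installed

# Same example data as in the question
mydata <- data.frame(
  subject = c(1, 1, 1, 2, 2, 2),
  time    = c(0, 1, 2, 0, 1, 2),
  measure = c(10, 12, 8, 7, 0, 0)
)

dt <- as.data.table(mydata)

# The functional form of := adds several columns by reference in one grouped pass;
# each per-subject value is recycled across that subject's rows
dt[, `:=`(
  mn_measure  = mean(measure),
  max_measure = max(measure)   # hypothetical extra column for illustration
), by = subject]

print(dt)
```

Because := modifies dt in place, no reassignment is needed, and the original mydata data frame is left unchanged.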