Calculate the difference between values in consecutive rows, by group


This is my df (a data.frame):

    group value
    1     10
    1     20
    1     25
    2     5
    2     10
    2     15

I need to calculate the difference between values in consecutive rows, by group.

So I need a result like this:

    group value diff
    1     10    NA # because there is no previous value
    1     20    10 # value[2] - value[1]
    1     25    5  # value[3] - value[2]
    2     5     NA # because the group changed
    2     10    5  # value[5] - value[4]
    2     15    5  # value[6] - value[5]

I can handle this with ddply, but it takes too much time, because my df contains a very large number of groups (more than 1,000,000).


Is there a more efficient way to do this?


扬帆大鱼
3 answers

拉风的咖菲猫

The data.table package can do this fairly quickly with its shift function.

    require(data.table)
    df <- data.table(group = rep(c(1, 2), each = 3), value = c(10, 20, 25, 5, 10, 15))
    # setDT(df)  # use this instead if df is already a data.frame
    df[, diff := value - shift(value), by = group]
    #    group value diff
    # 1:     1    10   NA
    # 2:     1    20   10
    # 3:     1    25    5
    # 4:     2     5   NA
    # 5:     2    10    5
    # 6:     2    15    5
    setDF(df)  # if you want to convert back to a plain data.frame

Alternatively, use the lag function from dplyr:

    library(dplyr)
    df %>%
      group_by(group) %>%
      mutate(Diff = value - lag(value))
    #   group value  Diff
    #   <int> <int> <int>
    # 1     1    10    NA
    # 2     1    20    10
    # 3     1    25     5
    # 4     2     5    NA
    # 5     2    10     5
    # 6     2    15     5

For alternatives that predate data.table::shift and dplyr::lag, see the edit history of this answer.

蛊毒传说

You can use the base function ave() for this:

    df <- data.frame(group = rep(c(1, 2), each = 3), value = c(10, 20, 25, 5, 10, 15))
    df$diff <- ave(df$value, factor(df$group), FUN = function(x) c(NA, diff(x)))

which returns

      group value diff
    1     1    10   NA
    2     1    20   10
    3     1    25    5
    4     2     5   NA
    5     2    10    5
    6     2    15    5
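With more than a million groups, the cost of calling a function once per group may start to matter. The following is a minimal base R sketch, not taken from the answers here, that computes all differences in one vectorized pass and then blanks out the first row of each group. It assumes df has at least one row and that the rows of each group are contiguous (for example, df is sorted by group); the name first_in_group is introduced only for illustration.

```r
df <- data.frame(group = rep(c(1, 2), each = 3),
                 value = c(10, 20, 25, 5, 10, 15))

# Difference from the previous row, ignoring groups for now.
d <- c(NA, diff(df$value))

# TRUE on the first row of every run of identical group values.
# Assumption: each group's rows are contiguous.
first_in_group <- c(TRUE, df$group[-1] != df$group[-nrow(df)])

# The first row of a group has no previous value within that group.
d[first_in_group] <- NA

df$diff <- d
df$diff
# NA 10  5 NA  5  5
```

If the contiguity assumption does not hold, sorting first with df <- df[order(df$group), ] restores it, at the cost of changing the row order.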

蝴蝶不菲

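Try tapply:

    df$diff <- as.vector(unlist(tapply(df$value, df$group, FUN = function(x) { return(c(NA, diff(x))) })))

Note that this concatenates the results in group order, so it assumes the rows of df are already sorted by group.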
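How much row order matters for the tapply one-liner can be checked on a small example. The sketch below uses a made-up data frame whose groups are deliberately interleaved and compares the tapply result with ave(); the helper name lagged_diff is introduced only for illustration.

```r
# Rows deliberately interleaved so that groups are not contiguous.
df <- data.frame(group = c(1, 2, 1, 2, 1, 2),
                 value = c(10, 5, 20, 10, 25, 15))

lagged_diff <- function(x) c(NA, diff(x))

# tapply returns one vector per group, ordered by group level;
# unlisting gives group 1's diffs followed by group 2's, not row order.
by_tapply <- as.vector(unlist(tapply(df$value, df$group, lagged_diff)))
by_tapply
# NA 10  5 NA  5  5

# ave writes each group's result back to that group's original row positions.
by_ave <- ave(df$value, df$group, FUN = lagged_diff)
by_ave
# NA NA 10  5  5  5
```

On this interleaved input only the ave() result lines up with the rows of df. Sorting with df <- df[order(df$group), ] before the tapply call would make the two agree.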