Create a counter within consecutive runs of certain values

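I have hourly values. I want to count how many consecutive hours the value has been zero since it was last non-zero. This is an easy job for a spreadsheet or a for loop, but I am hoping for a snappy vectorized one-liner to do it.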


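x <- c(1, 0, 1, 0, 0, 0, 1, 1, 0, 0)
df <- data.frame(x, zcount = NA)

# The first row starts the count: 1 if it is zero, otherwise 0.
df$zcount[1] <- ifelse(df$x[1] == 0, 1, 0)

# Each later row extends the previous count on a zero and resets on a non-zero.
for (i in 2:nrow(df)) {
  df$zcount[i] <- ifelse(df$x[i] == 0, df$zcount[i - 1] + 1, 0)
}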

Desired output:


R> df
   x zcount
1  1      0
2  0      1
3  1      0
4  0      1
5  0      2
6  0      3
7  1      0
8  1      0
9  0      1
10 0      2


慕斯王
3 answers

森栏

Here's a way, building on Joshua's rle approach (edited to use seq_len and lapply, as per Marek's suggestion):

> (!x) * unlist(lapply(rle(x)$lengths, seq_len))
 [1] 0 1 0 1 2 3 0 0 1 2

Update. Just for kicks, here's another way to do it, around 5 times faster:

cumul_zeros <- function(x) {
  x <- !x
  rl <- rle(x)
  len <- rl$lengths
  v <- rl$values
  cumLen <- cumsum(len)
  z <- x
  # replace the 0 at the end of each zero-block in z by the
  # negative of the length of the preceding 1-block....
  iDrops <- c(0, diff(v)) < 0
  z[ cumLen[ iDrops ] ] <- -len[ c(iDrops[-1], FALSE) ]
  # ... to ensure that the cumsum below does the right thing.
  # We zap the cumsum with x so only the cumsums for the 1-blocks survive:
  x * cumsum(z)
}

Try an example:

> cumul_zeros(c(1,1,1,0,0,0,0,0,1,1,1,0,0,1,1))
 [1] 0 0 0 1 2 3 4 5 0 0 0 1 2 0 0

Now compare times on a million-length vector:

> x <- sample(0:1, 1000000, TRUE)
> system.time( z <- cumul_zeros(x))
   user  system elapsed
   0.15    0.00    0.14
> system.time( z <- (!x) * unlist( lapply( rle(x)$lengths, seq_len)))
   user  system elapsed
   0.75    0.00    0.75

Moral of the story: one-liners are nicer and easier to understand, but not always the fastest!
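For readers who want to see why the one-liner works, here is a minimal sketch that splits it into named steps, using the x from the question. The intermediate names (runs, within_run, zcount_vec) are only illustrative and do not appear in the answer.

```r
x <- c(1, 0, 1, 0, 0, 0, 1, 1, 0, 0)

# Length of each run of identical values.
runs <- rle(x)$lengths                        # 1 1 1 3 2 2

# Position of every element inside its own run.
within_run <- unlist(lapply(runs, seq_len))   # 1 1 1 1 2 3 1 2 1 2

# !x is TRUE only where x is zero, so multiplying zeroes out the non-zero runs.
zcount_vec <- (!x) * within_run               # 0 1 0 1 2 3 0 0 1 2
```

The counter runs inside every run, including runs of ones. The final multiplication is what restricts it to the zero runs.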

千万里不及你

William Dunlap's posts on R-help are the place to look for all things related to run lengths. His f7 from this post is

f7 <- function(x){ tmp<-cumsum(x);tmp-cummax((!x)*tmp)}

and in the current situation it is called as f7(!x). In terms of performance:

> x <- sample(0:1, 1000000, TRUE)
> system.time(res7 <- f7(!x))
   user  system elapsed
  0.076   0.000   0.077
> system.time(res0 <- cumul_zeros(x))
   user  system elapsed
  0.345   0.003   0.349
> identical(res7, res0)
[1] TRUE
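The cumsum/cummax trick in f7 is compact, so a step-by-step trace on the question's x may help. This is only a sketch: the names is_zero, running, and baseline are introduced here for illustration and are not part of the original post.

```r
f7 <- function(x) { tmp <- cumsum(x); tmp - cummax((!x) * tmp) }

x <- c(1, 0, 1, 0, 0, 0, 1, 1, 0, 0)
is_zero <- !x                                # TRUE where x is zero

# Running total of zeros seen so far.
running <- cumsum(is_zero)                   # 0 1 1 2 3 4 4 4 5 6

# Keep the running total only at non-zero positions, then carry the
# latest such value forward with cummax (the total never decreases).
baseline <- cummax((!is_zero) * running)     # 0 0 1 1 1 1 4 4 4 4

# Zeros seen since the last non-zero value.
result <- running - baseline                 # 0 1 0 1 2 3 0 0 1 2
```

Carrying the baseline forward with cummax is valid because the running total is non-decreasing, so the most recent value at a non-zero position is also the largest one so far.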

慕盖茨4494581

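rle will "count how many consecutive hours the value has been zero since it was last non-zero", but not in the format of your "desired output". Note the lengths for the elements whose corresponding values are zero:

rle(x)
# Run Length Encoding
#   lengths: int [1:6] 1 1 1 3 2 2
#   values : num [1:6] 1 0 1 0 1 0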
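One possible way to get from the rle summary to the per-row format is base R's sequence(), which concatenates seq_len() over a vector of lengths. The sketch below assumes the x from the question. The names r, zero_runs, and zcount_seq are illustrative only.

```r
x <- c(1, 0, 1, 0, 0, 0, 1, 1, 0, 0)
r <- rle(x)

# Lengths of the zero runs only: the per-run summary that rle gives directly.
zero_runs <- r$lengths[r$values == 0]        # 1 3 2

# Expand to one counter per element, then blank out the non-zero rows.
zcount_seq <- sequence(r$lengths) * (x == 0) # 0 1 0 1 2 3 0 0 1 2
```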