Create a sequential number (counter) for rows within each group of a data frame

How can we generate a unique id number within each group of a data frame? Here is some data grouped by "personid":


    personid date measurement
    1         x     23
    1         x     32
    2         y     21
    3         x     23
    3         z     23
    3         y     23

I want to add an id column that numbers the rows within each subset defined by "personid", always starting at 1. This is my desired output:


    personid date measurement id
    1         x     23         1
    1         x     32         2
    2         y     21         1
    3         x     23         1
    3         z     23         2
    3         y     23         3

I'd appreciate any help.


互换的青春
3 Answers

GCT1015

The misleadingly named function ave(), with the argument FUN=seq_along, does this nicely, even if your personid column is not strictly ordered.

    df <- read.table(text = "personid date measurement
    1         x     23
    1         x     32
    2         y     21
    3         x     23
    3         z     23
    3         y     23", header=TRUE)

    ## First with your data.frame
    ave(df$personid, df$personid, FUN=seq_along)
    # [1] 1 2 1 1 2 3

    ## Then with another, in which personid is *not* in order
    df2 <- df[c(2:6, 1),]
    ave(df2$personid, df2$personid, FUN=seq_along)
    # [1] 1 1 1 2 3 2
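The ave() calls in this answer only print the counter. A minimal sketch of storing it as a new column, assuming the column should be called id as in the question; passing seq_along(df$personid) as the first argument is my own variation, chosen so the result stays an integer vector even if personid were stored as character:

```r
# Rebuild the example data from the question.
df <- data.frame(
  personid    = c(1, 1, 2, 3, 3, 3),
  date        = c("x", "x", "y", "x", "z", "y"),
  measurement = c(23, 32, 21, 23, 23, 23)
)

# ave() returns a vector aligned with the original rows, so the result
# can be assigned straight back as a new column.
# The first argument only supplies the "shape" and type of the result;
# the second argument is the grouping variable.
df$id <- ave(seq_along(df$personid), df$personid, FUN = seq_along)

print(df)
```

Because ave() keeps the original row order, this also works when the rows of one personid are not adjacent.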

拉莫斯之舞

Some dplyr alternatives, using the convenience functions row_number and n.

    library(dplyr)
    df %>% group_by(personid) %>% mutate(id = row_number())
    df %>% group_by(personid) %>% mutate(id = 1:n())
    df %>% group_by(personid) %>% mutate(id = seq_len(n()))
    df %>% group_by(personid) %>% mutate(id = seq_along(personid))

You can also use getanID from the splitstackshape package. Note that the input dataset is returned as a data.table.

    library(splitstackshape)
    getanID(data = df, id.vars = "personid")
    #    personid date measurement .id
    # 1:        1    x          23   1
    # 2:        1    x          32   2
    # 3:        2    y          21   1
    # 4:        3    x          23   1
    # 5:        3    z          23   2
    # 6:        3    y          23   3

温温酱

Using data.table, and assuming you want to number the rows by date within each personid subset:

    library(data.table)
    DT <- data.table(Data)  # Data is the data frame from the question

    DT[, id := order(date), by = personid]
    ##    personid date measurement id
    ## 1:        1    x          23  1
    ## 2:        1    x          32  2
    ## 3:        2    y          21  1
    ## 4:        3    x          23  1
    ## 5:        3    z          23  3
    ## 6:        3    y          23  2

If you do not want to order by date:

    DT[, id := 1:.N, by = personid]
    ##    personid date measurement id
    ## 1:        1    x          23  1
    ## 2:        1    x          32  2
    ## 3:        2    y          21  1
    ## 4:        3    x          23  1
    ## 5:        3    z          23  2
    ## 6:        3    y          23  3

Any of the following would also work:

    DT[, id := seq_along(measurement), by = personid]
    DT[, id := seq_along(date), by = personid]

The equivalent commands using plyr:

    library(plyr)
    # ordering by date
    ddply(Data, .(personid), mutate, id = order(date))
    # in original order
    ddply(Data, .(personid), mutate, id = seq_along(date))
    ddply(Data, .(personid), mutate, id = seq_along(measurement))
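One caveat on the order(date) variants in this answer: order() returns the positions that would sort a vector, which is not always the same as each element's rank. The two happen to agree on the example data, but they can differ on other inputs. A small base R sketch of the difference, using a made-up vector d and rank() with ties.method = "first" as one possible alternative (my assumption about the intended result, not part of the original answer):

```r
# Toy vector, chosen so that order() and rank() disagree.
d <- c("c", "a", "b")
ord <- order(d)                        # positions that would sort d
rnk <- rank(d, ties.method = "first")  # sorted position of each element

# Per-group rank by date on the question's data, in base R.
df <- data.frame(
  personid = c(1, 1, 2, 3, 3, 3),
  date     = c("x", "x", "y", "x", "z", "y")
)
# The helper receives the row indices of one group and ranks their dates;
# ties are numbered in order of appearance.
df$id <- ave(seq_along(df$date), df$personid,
             FUN = function(i) rank(df$date[i], ties.method = "first"))

print(ord)
print(rnk)
print(df)
```

For d, order() gives 2 3 1 while rank() gives 3 1 2. On the question's data the ranked id column is 1 2 1 1 3 2, which matches the order(date) output shown above.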