
Aggregate by unique identifier and concatenate related values into a string


I have a need that I think can be met with aggregate or reshape, but I can't quite figure it out.


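I have a list of names (brand) and accompanying ID numbers (id). The data is in long form, so a name can have multiple IDs. I'd like to de-duplicate by name (brand) and concatenate the multiple possible id values into a single comma-separated string.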


For example:


brand            id 

RadioShack       2308

Rag & Bone       4466

Ragu             1830

Ragu             4518

Ralph Lauren     1638

Ralph Lauren     2719

Ralph Lauren     2720

Ralph Lauren     2721

Ralph Lauren     2722 

should become:


RadioShack       2308

Rag & Bone       4466

Ragu             1830,4518

Ralph Lauren     1638,2719,2720,2721,2722

How can I do this?


开满天机
620 views, 3 answers
3 Answers

神不在的星期二

Let's call your data DF.

    > aggregate(id ~ brand, data = DF, c)
             brand                           id
    1   RadioShack                         2308
    2   Rag & Bone                         4466
    3         Ragu                   1830, 4518
    4 Ralph Lauren 1638, 2719, 2720, 2721, 2722

Another alternative using aggregate is:

    result <- aggregate(id ~ brand, data = DF, paste, collapse = ",")

This produces the same result, and id is no longer a list. Thanks to @Frank for the comment. To see the class of each column, try:

    > sapply(result, class)
          brand          id
       "factor" "character"

As @DavidArenburg mentioned in the comments, another alternative is to use the toString function, which joins the values with ", " (a comma followed by a space):

    aggregate(id ~ brand, data = DF, toString)
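For anyone who wants to reproduce this, here is a minimal sketch. The code that builds DF is an assumption (the original post never showed how DF was created); it simply retypes the sample from the question and then applies the paste variant from this answer.

```r
# Rebuild the sample data from the question.
# Assumption: the original post did not show how DF was constructed.
DF <- data.frame(
  brand = c("RadioShack", "Rag & Bone", "Ragu", "Ragu",
            rep("Ralph Lauren", 5)),
  id = c(2308, 4466, 1830, 4518, 1638, 2719, 2720, 2721, 2722),
  stringsAsFactors = FALSE
)

# One row per brand, with its ids joined into a single comma-separated string
result <- aggregate(id ~ brand, data = DF, FUN = paste, collapse = ",")
print(result)
```

Because stringsAsFactors = FALSE is set here, brand comes back as character rather than factor, so sapply(result, class) may differ from the output shown above depending on how your own DF was built.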

慕姐8265434

A clean one-liner with data.table:

    library(data.table)
    setDT(DF)

There are two options.

Result as a list:

    DF[ , .(id = list(id)), by = brand]
              brand                       id
    1:   RadioShack                     2308
    2:   Rag & Bone                     4466
    3:         Ragu                1830,4518
    4: Ralph Lauren 1638,2719,2720,2721,2722

Result as a string:

    DF[ , .(id = paste(id, collapse=",")), by = brand]
              brand                       id
    1:   RadioShack                     2308
    2:   Rag & Bone                     4466
    3:         Ragu                1830,4518
    4: Ralph Lauren 1638,2719,2720,2721,2722

Note: even though the two results look the same when printed, they are actually very different and allow for different operations. With the list option (the first one), you can go on to apply functions to the original ids. The string option makes the information easier to display (including exporting to CSV or Excel), but to operate on the ids you will need to split them back apart.
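If you chose the string option and later need the individual ids again, a rough sketch of splitting them back apart in base R might look like this. The collapsed data frame below is a hypothetical stand-in for the string result, and the name ids is only for illustration.

```r
# Hypothetical stand-in for the collapsed (string) result
collapsed <- data.frame(
  brand = c("Ragu", "Ralph Lauren"),
  id = c("1830,4518", "1638,2719,2720,2721,2722"),
  stringsAsFactors = FALSE
)

# Split each string on commas and convert the pieces back to numbers
ids <- lapply(strsplit(collapsed$id, ",", fixed = TRUE), as.numeric)
names(ids) <- collapsed$brand

# Example of working on the recovered ids: count them per brand
print(lengths(ids))
```

This round trip is the extra step the list option avoids, which is why the list form is more convenient if you plan to keep computing on the ids.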