
Simultaneously merge multiple data.frames in a list

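I have a list of many data.frames that I want to merge. The problem is that each data.frame differs in its number of rows and columns, but they all share the key variables (which I have called "var1" and "var2" in the code below). If the data.frames had identical columns, I could simply rbind them, and plyr's rbind.fill would do the job, but that is not the case with these data.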

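Because the merge command only works on two data.frames at a time, I turned to the Internet for ideas. I found the function below, which worked perfectly in R 2.7.2, the version I had at the time: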

merge.rec <- function(.list, ...){
    if(length(.list)==1) return(.list[[1]])
    Recall(c(list(merge(.list[[1]], .list[[2]], ...)), .list[-(1:2)]), ...)}

I would call the function like this:

df <- merge.rec(my.list, by.x = c("var1", "var2"), 
                by.y = c("var1", "var2"), all = T, suffixes=c("", ""))

But in any R version after 2.7.2, including 2.11 and 2.12, this code fails with the following error:

Error in match.names(clabs, names(xi)) : 
  names do not match previous names

(I have seen other references to this error elsewhere, none with a solution.)

Is there any way to solve this?


慕粉4167745
4 answers

SMILET

Another question asked specifically how to perform multiple left joins using dplyr in R. That question was marked as a duplicate of this one, so I am answering here, using the three sample data frames below:

library(dplyr)
# tibble() replaces the deprecated data_frame()
x <- tibble(i = c("a","b","c"), j = 1:3)
y <- tibble(i = c("b","c","d"), k = 4:6)
z <- tibble(i = c("c","d","a"), l = 7:9)

Update June 2018: I have divided the answer into three sections, one for each way of performing the merge. If you are already using the tidyverse packages, you probably want the purrr way. For comparison, a base R version using the same sample data follows.

Join them with reduce from the purrr package

The purrr package provides a reduce function with a concise syntax:

library(tidyverse)
list(x, y, z) %>% reduce(left_join, by = "i")
# A tibble: 3 x 4
#   i         j     k     l
#   <chr> <int> <int> <int>
# 1 a         1    NA     9
# 2 b         2     4    NA
# 3 c         3     5     7

You can also perform other joins, such as a full_join or an inner_join:

list(x, y, z) %>% reduce(full_join, by = "i")
# A tibble: 4 x 4
#   i         j     k     l
#   <chr> <int> <int> <int>
# 1 a         1    NA     9
# 2 b         2     4    NA
# 3 c         3     5     7
# 4 d        NA     6     8

list(x, y, z) %>% reduce(inner_join, by = "i")
# A tibble: 1 x 4
#   i         j     k     l
#   <chr> <int> <int> <int>
# 1 c         3     5     7

dplyr::left_join() with base R Reduce()

list(x,y,z) %>%
    Reduce(function(dtf1,dtf2) left_join(dtf1,dtf2,by="i"), .)
#   i j  k  l
# 1 a 1 NA  9
# 2 b 2  4 NA
# 3 c 3  5  7

Base R merge() with base R Reduce()

For comparison, here is a base R version of the left join:

Reduce(function(dtf1, dtf2) merge(dtf1, dtf2, by = "i", all.x = TRUE),
       list(x,y,z))
#   i j  k  l
# 1 a 1 NA  9
# 2 b 2  4 NA
# 3 c 3  5  7
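The question's data are keyed on two columns, var1 and var2, rather than one. The following is a minimal base R sketch of the same Reduce/merge pattern with a two-column key. The three data frames and the column names p, q and r are made up for illustration; only var1 and var2 come from the question.

```r
# Hypothetical toy data: three data frames sharing the keys var1 and var2,
# but differing in row count and in their other columns.
df1 <- data.frame(var1 = c(1, 2, 3), var2 = c("a", "b", "c"), p = c(10, 20, 30))
df2 <- data.frame(var1 = c(2, 3, 4), var2 = c("b", "c", "d"), q = c(0.2, 0.3, 0.4))
df3 <- data.frame(var1 = c(1, 4),    var2 = c("a", "d"),      r = c(TRUE, FALSE))
my_list <- list(df1, df2, df3)

# Fold merge() pairwise over the list: a full outer join on both keys.
merged <- Reduce(
  function(left, right) merge(left, right, by = c("var1", "var2"), all = TRUE),
  my_list
)

print(merged)
```

Because every non-key column name is unique across the inputs here, merge never has to add suffixes, which sidesteps the name clashes behind the error in the question.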

慕标5265247

Reduce makes this fairly easy:

merged.data.frame = Reduce(function(...) merge(..., all=TRUE), list.of.data.frames)

Here is a full example using some mock data:

set.seed(1)
list.of.data.frames = list(data.frame(x=1:10, a=1:10), data.frame(x=5:14, b=11:20), data.frame(x=sample(20, 10), y=runif(10)))
merged.data.frame = Reduce(function(...) merge(..., all=TRUE), list.of.data.frames)
tail(merged.data.frame)
# output from the original answer; the random values depend on the R version's RNG
#    x  a  b         y
#12 12 NA 18        NA
#13 13 NA 19        NA
#14 14 NA 20 0.4976992
#15 15 NA NA 0.7176185
#16 16 NA NA 0.3841037
#17 19 NA NA 0.3800352

And here is an example that replicates the problem with the my.list data (match.by holds the key column names):

merged.data.frame = Reduce(function(...) merge(..., by=match.by, all=TRUE), my.list)
merged.data.frame[, 1:12]
#  matchname party st district chamber senate1993 name.x v2.x v3.x v4.x senate1994 name.y
#1   ALGIERE   200 RI      026       S         NA   <NA>   NA   NA   NA         NA   <NA>
#2     ALVES   100 RI      019       S         NA   <NA>   NA   NA   NA         NA   <NA>
#3    BADEAU   100 RI      032       S         NA   <NA>   NA   NA   NA         NA   <NA>

Note: this looks like it is arguably a bug in merge. The problem is that nothing checks whether adding the suffixes (which handle overlapping non-key names) actually makes the names unique. At a certain point merge uses [.data.frame, which does apply make.unique to the names, and that causes the rbind to fail.

# first merge will end up with 'name.x' & 'name.y'
merge(my.list[[1]], my.list[[2]], by=match.by, all=TRUE)
# [1] matchname    party        st           district     chamber      senate1993   name.x
# [8] votes.year.x senate1994   name.y       votes.year.y
#<0 rows> (or 0-length row.names)

# as there is no clash, we retain 'name.x' & 'name.y' and get 'name' again
merge(merge(my.list[[1]], my.list[[2]], by=match.by, all=TRUE), my.list[[3]], by=match.by, all=TRUE)
# [1] matchname    party        st           district     chamber      senate1993   name.x
# [8] votes.year.x senate1994   name.y       votes.year.y senate1995   name         votes.year
#<0 rows> (or 0-length row.names)

# the next merge will fail as 'name' will get renamed to a pre-existing field.

The easiest fix is not to leave the renaming of duplicated fields (of which there are many here) up to merge. For example:

my.list2 = Map(function(x, i) setNames(x, ifelse(names(x) %in% match.by,
      names(x), sprintf('%s.%d', names(x), i))), my.list, seq_along(my.list))

The merge/Reduce will then work fine.
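Since my.list itself is not reproduced in this thread, the renaming fix can be tried on toy data instead. The sketch below assumes a hypothetical key column id and four made-up data frames that all carry a clashing non-key column called name. It illustrates the idea and is not the answerer's exact data.

```r
# Hypothetical key and toy data: every frame has a non-key column 'name',
# so repeated merges would otherwise need suffixes to keep names apart.
match_by <- "id"
frames <- list(
  data.frame(id = 1:3, name = c("a", "b", "c")),
  data.frame(id = 2:4, name = c("B", "C", "D")),
  data.frame(id = 3:5, name = c("x", "y", "z")),
  data.frame(id = 1:2, name = c("p", "q"))
)

# Suffix each non-key column with the position of its data frame in the list,
# so all column names are unique before merge() ever sees them.
renamed <- Map(
  function(df, i) {
    is_key <- names(df) %in% match_by
    setNames(df, ifelse(is_key, names(df), sprintf("%s.%d", names(df), i)))
  },
  frames, seq_along(frames)
)

merged <- Reduce(function(a, b) merge(a, b, by = match_by, all = TRUE), renamed)
print(names(merged))
```

Doing the renaming up front keeps the provenance of each column visible (name.1 came from the first frame, and so on) and means the result does not depend on how a given R version resolves suffix clashes.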

ibeautiful

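You can use merge_all from the reshape package. Arguments are passed on to merge through the ... argument:

reshape::merge_all(list_of_dataframes, ...)

The original answer also pointed to a resource covering different methods of merging data frames.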

蛊毒传说

You can use recursion to do this. I have not verified the following, but it should give you the right idea:

MergeListOfDf = function( data , ... )
{
    # a single data frame needs no merging (guards against endless recursion)
    if ( length( data ) == 1 )
    {
        return( data[[ 1 ]] )
    }
    if ( length( data ) == 2 )
    {
        return( merge( data[[ 1 ]] , data[[ 2 ]] , ... ) )
    }
    # merge the rest of the list first, then merge the result with the first element
    return( merge( MergeListOfDf( data[ -1 ] , ... ) , data[[ 1 ]] , ... ) )
}
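To see how the recursive idea relates to the Reduce approach from the other answers, here is a small self-contained comparison. The function name merge_recursive and the toy data frames are hypothetical. This is a sketch of the technique, not the answerer's tested code.

```r
# Hypothetical recursive helper: merge the head of the list with the
# recursively merged tail. Extra arguments are forwarded to merge().
merge_recursive <- function(dfs, ...) {
  if (length(dfs) == 0) stop("need at least one data frame")
  if (length(dfs) == 1) return(dfs[[1]])
  merge(dfs[[1]], merge_recursive(dfs[-1], ...), ...)
}

# Made-up data: one shared key 'x', one distinct value column per frame.
frames <- list(
  data.frame(x = 1:4, a = 11:14),
  data.frame(x = 3:6, b = 21:24),
  data.frame(x = c(1L, 6L), d = c(31, 32))
)

rec <- merge_recursive(frames, by = "x", all = TRUE)
red <- Reduce(function(l, r) merge(l, r, by = "x", all = TRUE), frames)

print(rec)
print(isTRUE(all.equal(rec, red, check.attributes = FALSE)))
```

The recursion folds from the right and Reduce folds from the left, but for a full outer join on uniquely named columns the two results should agree. Reduce has the practical advantage of not growing the call stack with the length of the list.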