Selecting a subset of columns in an R data.table

I have a data table with a number of columns, for example:


library(data.table)
dt <- data.table(matrix(runif(10*10), 10, 10))

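I want to perform some operations on the data table, such as producing a correlation matrix (cor(dt)). To do that, I first want to drop a few columns that contain non-numeric values or values outside a certain range.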


Suppose I want the correlation matrix excluding V1, V2, V3 and V5.


Here is my current approach:


cols<-!(colnames(dt)=="V1" | colnames(dt)=="V2" | colnames(dt)=="V3" | colnames(dt)=="V5")

new_dt<-subset(dt,,cols)

cor(new_dt)

Given how elegant data.table syntax usually is, this feels cumbersome. Is there a better way?


婷婷同学_
3 answers

Cats萌萌

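Use with=FALSE:

cols = paste("V", c(1,2,3,5), sep="")
dt[, !cols, with=FALSE]

I suggest reading through the "Introduction to data.table" vignette.

Update: from v1.10.2 onwards, you can also write:

dt[, ..cols]

The .. prefix tells data.table to look up cols in the calling scope, so this form selects the columns named in cols. See the first NEWS item under v1.10.2 for additional explanation.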
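To make the difference between these forms concrete, here is a minimal sketch that builds the same six-column subset three ways. The names drop, keep, via_not, via_dots and via_sd are illustrative and not from the answer, and the seed is arbitrary; the ..keep form assumes data.table v1.10.2 or later.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only so the example is reproducible
dt <- data.table(matrix(runif(10 * 10), 10, 10))

drop <- paste0("V", c(1, 2, 3, 5))  # columns to exclude
keep <- setdiff(names(dt), drop)    # columns to retain

# Negated character vector; with = FALSE makes j a vector of column names
via_not <- dt[, !drop, with = FALSE]

# ".." prefix: look up `keep` in the calling scope and select those columns
via_dots <- dt[, ..keep]

# Same selection expressed through .SDcols
via_sd <- dt[, .SD, .SDcols = keep]

# All three hold the same columns, so the correlation matrix is 6 x 6
cor(via_not)
```

Computing keep with setdiff() is a design choice for the sketch: it turns an exclusion list into an explicit selection, which can then be reused wherever a plain vector of column names is expected.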

宝慕林4294392

You can do

dt[, !c("V1","V2","V3","V5"), with=FALSE]

to get

            V4         V6         V7        V8         V9        V10
 1: 0.88612076 0.94727825 0.50502208 0.6702523 0.24186706 0.96263313
 2: 0.11121752 0.13969145 0.19092645 0.9589867 0.27968190 0.07796870
 3: 0.50179822 0.10641301 0.08540322 0.3297847 0.03643195 0.18082180
 4: 0.09787517 0.07312777 0.88077548 0.3218041 0.75826099 0.55847774
 5: 0.73475574 0.96644484 0.58261312 0.9921499 0.78962675 0.04976212
 6: 0.88861117 0.85690337 0.27723130 0.3662264 0.50881663 0.67402625
 7: 0.33933983 0.83392047 0.30701697 0.6138122 0.85107176 0.58609504
 8: 0.89907094 0.61389815 0.19957386 0.3968331 0.78876682 0.90546328
 9: 0.54136123 0.08274569 0.25190790 0.1920462 0.15142604 0.12134807
10: 0.36511064 0.88117171 0.05730210 0.9441072 0.40125023 0.62828674

繁星coding

This seems to be an improvement:

> cols<-!(colnames(dt) %in% c("V1","V2","V3","V5"))
> new_dt<-subset(dt,,cols)
> cor(new_dt)
            V4          V6          V7          V8         V9         V10
V4   1.0000000  0.14141578 -0.44466832  0.23697216 -0.1020074  0.48171747
V6   0.1414158  1.00000000 -0.21356218 -0.08510977 -0.1884202 -0.22242274
V7  -0.4446683 -0.21356218  1.00000000 -0.02050846  0.3209454 -0.15021528
V8   0.2369722 -0.08510977 -0.02050846  1.00000000  0.4627034 -0.07020571
V9  -0.1020074 -0.18842023  0.32094540  0.46270335  1.0000000 -0.19224973
V10  0.4817175 -0.22242274 -0.15021528 -0.07020571 -0.1922497  1.00000000

This one is not as easy to read, but it may be useful when you need to specify the columns with a numeric vector. The pattern is anchored with ^ and $ so that V1 does not also match V10:

subset(dt, , !grepl(paste0("^V", c(1:3,5), "$", collapse="|"), colnames(dt)))
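When the columns are given as a numeric vector, a regular expression is not strictly needed. The sketch below, which uses illustrative names (idx, by_position, by_name, loose, exact) that do not appear in the answers, drops the columns by position and by constructed name, and also shows why an unanchored pattern is risky for names such as V1 and V10.

```r
library(data.table)

dt <- data.table(matrix(runif(10 * 10), 10, 10))
idx <- c(1:3, 5)  # positions of the columns to drop

# Drop by position: negative integers with with = FALSE
by_position <- dt[, -idx, with = FALSE]

# Drop by name: build "V1", "V2", ... from the same numeric vector
by_name <- dt[, !paste0("V", idx), with = FALSE]

# An unanchored pattern matches any name that merely contains "V1"
loose <- grepl("V1|V2|V3|V5", names(dt))
names(dt)[loose]  # "V1" "V2" "V3" "V5" "V10"

# Anchoring the pattern restricts it to exact column names
exact <- grepl("^V(1|2|3|5)$", names(dt))
names(dt)[exact]  # "V1" "V2" "V3" "V5"
```

Dropping by position assumes the column order is known and stable; building the names from the numeric vector is usually safer if columns may be reordered.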