Understanding the order() function

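I'm trying to understand how the order() function works. I was under the impression that it returned a permutation of indices which, when sorted, would sort the original vector.

For instance,

    > a <- c(45,50,10,96)
    > order(a)
    [1] 3 1 2 4

I would have expected this to return c(2, 3, 1, 4), since the sorted list would be 10 45 50 96.

Can someone help me understand the return value of this function?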


守着一只汪
3 Answers

大话西游666

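The definition of order is that a[order(a)] is in increasing order. This works with the example below, where the correct order is the fourth, second, first, then third element. You may have been looking for rank, which returns the rank of the elements:

    R> a <- c(4.1, 3.2, 6.1, 3.1)
    R> order(a)
    [1] 4 2 1 3
    R> rank(a)
    [1] 3 2 4 1

So rank tells you what position each number holds in the sorted sequence, while order tells you how to pick the elements to get them in ascending order.

plot(a, rank(a)/length(a)) will give a graph of the CDF. To see why order is useful, though, try

    plot(a, rank(a)/length(a), type="S")

which gives a mess, because the data are not in increasing order. If you instead did

    oo <- order(a)
    plot(a[oo], rank(a[oo])/length(a), type="S")

or simply

    oo <- order(a)
    plot(a[oo], (1:length(a))/length(a), type="S")

you get a line graph of the CDF.

I'll bet you're thinking of rank.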
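The CDF idea above can be checked without drawing anything. The following is a minimal sketch, assuming the same vector a as in this answer; the names x, y and y_ref are only illustrative, and ecdf() from the stats package is used purely as a cross-check.

```r
a <- c(4.1, 3.2, 6.1, 3.1)

# order(a) lists the positions of a from the smallest value to the largest
oo <- order(a)

# x-coordinates in increasing order, y-coordinates as cumulative proportions
x <- a[oo]
y <- seq_along(a) / length(a)

# Cross-check against R's empirical CDF evaluated at the same points
y_ref <- ecdf(a)(x)

# plot(x, y, type = "S")   # uncomment to draw the step plot
```

Indexing with oo puts the x-values in increasing order, which is why the step plot no longer zigzags.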

眼眸繁星

To sort a 1D vector or a single column of data, just call the sort function and pass in your sequence.

The order function, on the other hand, is what you need to sort two-dimensional data, i.e., multiple columns of data collected in a matrix or data frame.

        Stadium Home Week Qtr Away Off Def Result       Kicker Dist
    751     Out  PHI   14   4  NYG PHI NYG   Good      D.Akers   50
    491     Out   KC    9   1  OAK OAK  KC   Good S.Janikowski   32
    702     Out  OAK   15   4  CLE CLE OAK   Good     P.Dawson   37
    571     Out   NE    1   2  OAK OAK  NE Missed S.Janikowski   43
    654     Out  NYG   11   2  PHI NYG PHI   Good      J.Feely   26
    307     Out  DEN   14   2  BAL DEN BAL   Good       J.Elam   48
    492     Out   KC   13   3  DEN  KC DEN   Good      L.Tynes   34
    691     Out  NYJ   17   3  BUF NYJ BUF   Good     M.Nugent   25
    164     Out  CHI   13   2   GB CHI  GB   Good      R.Gould   25
    80      Out  BAL    1   2  IND IND BAL   Good M.Vanderjagt   20

Here is an excerpt of data on field goal attempts in the 2008 NFL season, a data frame I've called 'fg'. Suppose that these 10 data points represent all of the field goals attempted in 2008; further suppose you want to know the distance of the longest field goal attempted that year, who kicked it, and whether it was good or not; you also want to know the second-longest, the third-longest, and so on; and finally you want the shortest field goal attempt.

Well, you could just do this:

    sort(fg$Dist, decreasing=TRUE)

which returns: 50 48 43 37 34 32 26 25 25 20

That is correct, but not very useful. It does tell us the distance of the longest field goal attempt, the second-longest, ... as well as the shortest; however, that's all we know. For example, we don't know who the kicker was, whether the attempt was successful, etc. What we need is the entire data frame sorted on the "Dist" column (put another way, we want to sort all of the data rows on the single attribute Dist). That would look like this:

        Stadium Home Week Qtr Away Off Def Result       Kicker Dist
    751     Out  PHI   14   4  NYG PHI NYG   Good      D.Akers   50
    307     Out  DEN   14   2  BAL DEN BAL   Good       J.Elam   48
    571     Out   NE    1   2  OAK OAK  NE Missed S.Janikowski   43
    702     Out  OAK   15   4  CLE CLE OAK   Good     P.Dawson   37
    492     Out   KC   13   3  DEN  KC DEN   Good      L.Tynes   34
    491     Out   KC    9   1  OAK OAK  KC   Good S.Janikowski   32
    654     Out  NYG   11   2  PHI NYG PHI   Good      J.Feely   26
    691     Out  NYJ   17   3  BUF NYJ BUF   Good     M.Nugent   25
    164     Out  CHI   13   2   GB CHI  GB   Good      R.Gould   25
    80      Out  BAL    1   2  IND IND BAL   Good M.Vanderjagt   20

This is what order does. It is 'sort' for two-dimensional data; put another way, it returns a 1D integer index made up of row numbers such that, if you arrange the rows according to that vector, you get a correct row-oriented sort on the column Dist.

Here's how it works. Above, sort was used to sort the Dist column; to sort the entire data frame on the Dist column, we use 'order' in exactly the same way as 'sort' was used above:

    ndx = order(fg$Dist, decreasing=TRUE)

(I usually bind the array returned from 'order' to the variable 'ndx', which stands for 'index', because I am going to use it as an index array to sort.)

That was step 1. Here's step 2: 'ndx', the array returned by 'order', is then used as an index array to re-order the data frame 'fg':

    fg_sorted = fg[ndx,]

fg_sorted is the re-ordered data frame shown immediately above.

In sum, 'order' is used to create an index array (which specifies the sort order of the column you want sorted), and that index array is then used to re-order the data frame (or matrix).
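The two steps above can be tried on a small data frame. This is a minimal sketch, assuming a cut-down version of 'fg' with only five of the rows and three of the columns from the table; the row selection, the row order, and the name fg_tiebreak are illustrative choices, not part of the original answer. It also shows one way to break ties by passing a second key to order.

```r
# Hypothetical cut-down 'fg': five rows, three columns taken from the table
fg <- data.frame(
  Kicker = c("D.Akers", "S.Janikowski", "S.Janikowski", "R.Gould", "M.Nugent"),
  Result = c("Good", "Good", "Missed", "Good", "Good"),
  Dist   = c(50, 32, 43, 25, 25),
  row.names = c("751", "491", "571", "164", "691"),
  stringsAsFactors = FALSE
)

# Step 1: build the index array (row positions, longest attempt first)
ndx <- order(fg$Dist, decreasing = TRUE)

# Step 2: use it to re-order whole rows, so Kicker and Result travel with Dist
fg_sorted <- fg[ndx, ]

# Extra keys act as tie-breakers: longest first, then kicker name A to Z
fg_tiebreak <- fg[order(-fg$Dist, fg$Kicker), ]
```

Negating a numeric key is a common way to mix descending and ascending keys in a single order call; it only works for numeric columns.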

汪汪一只猫

?order tells you which element of the original vector needs to be put first, second, etc., in order to sort the original vector, whereas ?rank tells you which element has the lowest, second lowest, etc., value. For example:

    > a <- c(45, 50, 10, 96)
    > order(a)
    [1] 3 1 2 4
    > rank(a)
    [1] 2 3 1 4

So order(a) is saying, 'put the third element first when you sort...', whereas rank(a) is saying, 'the first element is the second lowest...'. (Note that they both agree on which element is lowest, etc.; they just present the information differently.) Thus we see that we can use order() to sort, but we can't use rank() that way:

    > a[order(a)]
    [1] 10 45 50 96
    > sort(a)
    [1] 10 45 50 96
    > a[rank(a)]
    [1] 50 10 45 96

In general, order() will not equal rank(). They do coincide when the vector has already been sorted:

    > b <- sort(a)
    > order(b)==rank(b)
    [1] TRUE TRUE TRUE TRUE

Also, since order() (essentially) operates on the ranks of the data, you can apply it to rank(a) without losing any information, but the other way around produces gibberish:

    > order(rank(a))==order(a)
    [1] TRUE TRUE TRUE TRUE
    > rank(order(a))==rank(a)
    [1] FALSE FALSE FALSE  TRUE
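One way to see that the two functions carry the same information is that, as permutations, each is the inverse of the other. The sketch below is illustrative only; the vectors s and d and the other variable names are assumptions introduced for the example. It also shows what happens with tied values, where rank() averages by default.

```r
a <- c(45, 50, 10, 96)

# With no ties, applying order() twice recovers the ranks
ranks_via_order <- order(order(a))

# An unsorted vector for which order() and rank() still coincide
s <- c(2, 1, 3)
same_on_s <- all(order(s) == rank(s))

# With ties, rank() averages by default, while order() keeps tied
# elements in their original relative order
d <- c(25, 50, 25, 20)
avg_ranks   <- rank(d)
first_ranks <- rank(d, ties.method = "first")
via_order   <- order(order(d))
```

Here ranks_via_order matches rank(a), and via_order matches first_ranks rather than the averaged ranks, because order() resolves ties by original position.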