How to sort within groups so that the first row gets the largest number, the second row the smallest, the third row the second largest, and so on

5 Answers

慕慕森

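Take the sorted order and apply a quadratic function to it whose root sits at the middle rank, (len(s)+1)/2, shifted by a small offset. This way the highest scores go to the extreme values; the sign of the eps offset decides whether the highest value ranks above the lowest or the other way around. I added a small group at the end to show that it handles duplicate values and odd group sizes correctly.

    import numpy as np
    import pandas as pd

    def extremal_rank(s):
        eps = 10**-4
        # Rank 1..n in sorted order, centre on the middle rank, then square.
        y = (pd.Series(np.arange(1, len(s)+1), index=s.sort_values().index)
             - (len(s)+1)/2 + eps)**2
        return y.reindex_like(s)

    # group_keys=False keeps the original row index so the assignment aligns
    df['rnk'] = df.groupby('Group', group_keys=False)['Performance'].apply(extremal_rank)
    df = df.sort_values(['Group', 'rnk'], ascending=[True, False])

Output:

        Group             Name  Performance     rnk
     2      A     Chad Webster          142  6.2505
     0      A     Sheldon Webb           33  6.2495
     4      A   Elijah Mendoza          122  2.2503
     1      A       Traci Dean           64  2.2497
     3      A       Ora Harmon          116  0.2501
     5      A  June Strickland           68  0.2499
     8      B        Joel Gill          132  2.2503
     9      B     Vernon Stone           80  2.2497
     7      B     Betty Sutton          127  0.2501
     6      B     Beth Vasquez           95  0.2499
    11      C                b          110  9.0006
    12      C                c           68  8.9994
    10      C                a          110  4.0004
    13      C                d           68  3.9996
    15      C                f           70  1.0002
    16      C                g           70  0.9998
    14      C                e           70  0.0000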
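The arithmetic behind this ranking can be checked without pandas. The following is a minimal sketch in plain Python, not the answerer's code: `extremal_scores` is a hypothetical helper name, and the values are assumed to be group A's Performance column in original row order, as read from the output above.

```python
def extremal_scores(values, eps=1e-4):
    """Score each value by the squared distance of its rank from the middle rank."""
    n = len(values)
    # Rank 1 goes to the smallest value, rank n to the largest.
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = (rank - (n + 1) / 2 + eps) ** 2
    return scores


performance = [33, 64, 142, 116, 122, 68]  # assumed: group A, rows 0..5
scores = extremal_scores(performance)

# Sorting by score, largest first, alternates max, min, 2nd max, 2nd min, ...
ranked = [v for _, v in sorted(zip(scores, performance), reverse=True)]
print(ranked)  # [142, 33, 122, 64, 116, 68]
```

With a positive `eps` the largest value scores slightly higher than the smallest (about 6.2505 versus 6.2495 here), which is what breaks the tie between the two ends of the ranking.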

倚天杖

You can avoid groupby altogether: use sort_values on Performance once descending and once ascending, concat the two sorted dataframes, then use sort_index and drop_duplicates to get the expected output:

    import numpy as np
    import pandas as pd

    df_ = (pd.concat([df.sort_values(['Group', 'Performance'], ascending=[True, False])
                        .reset_index(), # need the original index for later drop_duplicates
                      df.sort_values(['Group', 'Performance'], ascending=[True, True])
                        .reset_index()
                        .set_index(np.arange(len(df))+0.5)], # for later sort_index
                     axis=0)
             .sort_index()
             .drop_duplicates('index', keep='first')
             .reset_index(drop=True)
           [['Group', 'Name', 'Performance']]
          )
    print(df_)

Output:

       Group             Name  Performance
    0      A     Chad Webster          142
    1      A     Sheldon Webb           33
    2      A   Elijah Mendoza          122
    3      A       Traci Dean           64
    4      A       Ora Harmon          116
    5      A  June Strickland           68
    6      B        Joel Gill          132
    7      B     Vernon Stone           80
    8      B     Betty Sutton          127
    9      B     Beth Vasquez           95
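The half-step index is the part of this approach that is easiest to miss, so here is a rough sketch of the same idea on a plain list. It is an illustration under assumptions, not the answerer's code: a single group is used, row labels are simply list positions, and the loop stands in for `sort_index()` followed by `drop_duplicates(keep='first')`.

```python
values = [33, 64, 142, 116, 122, 68]  # assumed: one group, labels are positions 0..5

desc = sorted(range(len(values)), key=lambda i: -values[i])  # largest first
asc = sorted(range(len(values)), key=lambda i: values[i])    # smallest first

# Whole-number keys for the descending pass, half-step keys for the ascending pass,
# so sorting by key interleaves the two passes: 0, 0.5, 1, 1.5, ...
keyed = [(k, i) for k, i in enumerate(desc)]
keyed += [(k + 0.5, i) for k, i in enumerate(asc)]

seen, result = set(), []
for _, i in sorted(keyed):   # plays the role of sort_index()
    if i not in seen:        # plays the role of drop_duplicates(keep='first')
        seen.add(i)
        result.append(values[i])

print(result)  # [142, 33, 122, 64, 116, 68]
```

Every row appears twice in the interleaved sequence, once from each pass; keeping only the first occurrence is what trims the sequence back to the original length.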

德玛西亚99

Apply a sorted concatenation of nlargest and nsmallest to each group:

    import pandas as pd

    (df.groupby('Group')[df.columns[1:]]
       .apply(lambda x:
              pd.concat([x.nlargest(x.shape[0]//2, 'Performance').reset_index(),
                         x.nsmallest(x.shape[0]-x.shape[0]//2, 'Performance').reset_index()])
                .sort_index(kind='mergesort')  # stable: the larger row stays ahead of the smaller one
                .drop(columns='index'))
       .reset_index()
       .drop(columns='level_1'))

Output:

       Group             Name  Performance
    0      A     Chad Webster          142
    1      A     Sheldon Webb           33
    2      A   Elijah Mendoza          122
    3      A       Traci Dean           64
    4      A       Ora Harmon          116
    5      A  June Strickland           68
    6      B        Joel Gill          132
    7      B     Vernon Stone           80
    8      B     Betty Sutton          127
    9      B     Beth Vasquez           95
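For a single group, the same split into a "largest half" and a "smallest half" can be sketched with the standard library's `heapq`. This is an illustrative stand-in rather than the pandas code above; `alternate_extremes` is a hypothetical name, and the sample values are assumed to be groups B and C as they appear in the outputs on this page.

```python
import heapq


def alternate_extremes(values):
    """Interleave the largest half (descending) with the smallest half (ascending)."""
    half = len(values) // 2
    largest = heapq.nlargest(half, values)                  # descending order
    smallest = heapq.nsmallest(len(values) - half, values)  # ascending order
    out = []
    for k, low in enumerate(smallest):
        if k < len(largest):   # for odd sizes the smallest half has one extra item
            out.append(largest[k])
        out.append(low)
    return out


print(alternate_extremes([95, 127, 80, 132]))                # [132, 80, 127, 95]
print(alternate_extremes([70, 110, 68, 70, 110, 68, 70]))    # [110, 68, 110, 68, 70, 70, 70]
```

With an odd group size the extra element lands in the smallest half, so the median ends up last.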

qq_笑_17

Just another way, using a custom function with np.empty:

    import numpy as np
    import pandas as pd

    def mysort(s):
        arr = s.to_numpy()
        c = np.empty(arr.shape, dtype=arr.dtype)
        # Size of the upper half, rounded up for odd lengths
        idx = arr.shape[0]//2 if not arr.shape[0]%2 else arr.shape[0]//2+1
        # Even rows take the largest values, odd rows the smallest (reversed)
        c[0::2], c[1::2] = arr[:idx], arr[idx:][::-1]
        return pd.DataFrame(c, columns=s.columns)

    print(df.sort_values("Performance", ascending=False).groupby("Group").apply(mysort))

Output:

            Group             Name Performance
    Group
    A     0     A     Chad Webster         142
          1     A     Sheldon Webb          33
          2     A   Elijah Mendoza         122
          3     A       Traci Dean          64
          4     A       Ora Harmon         116
          5     A  June Strickland          68
    B     0     B        Joel Gill         132
          1     B     Vernon Stone          80
          2     B     Betty Sutton         127
          3     B     Beth Vasquez          95
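The strided assignment is the core of this answer, and it works the same way on a plain Python list. A small sketch, assuming the input is already sorted in descending order (`zigzag` is a hypothetical name used only for illustration):

```python
def zigzag(sorted_desc):
    """Fill even slots with the top half and odd slots with the reversed bottom half."""
    n = len(sorted_desc)
    idx = (n + 1) // 2                   # size of the upper half, rounded up
    out = [None] * n
    out[0::2] = sorted_desc[:idx]        # even slots: largest values, descending
    out[1::2] = sorted_desc[idx:][::-1]  # odd slots: smallest values, ascending
    return out


print(zigzag(sorted([33, 64, 142, 116, 122, 68], reverse=True)))
# [142, 33, 122, 64, 116, 68]
print(zigzag([5, 4, 3, 2, 1]))
# [5, 1, 4, 2, 3]
```

Rounding `idx` up matters for odd lengths: the even slots then hold one more element than the odd slots, which matches the sizes of `out[0::2]` and `out[1::2]`.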

冉冉说

Let's try detecting the min and max rows with groupby().transform(), then sorting:

    groups = df.groupby('Group')['Performance']
    mins, maxs = groups.transform('min'), groups.transform('max')

    (df.assign(temp=df['Performance'].eq(mins) | df['Performance'].eq(maxs))
       .sort_values(['Group', 'temp', 'Performance'],
                    ascending=[True, False, False])
       .drop('temp', axis=1))

Output:

       Group             Name  Performance
    2      A     Chad Webster          142
    0      A     Sheldon Webb           33
    4      A   Elijah Mendoza          122
    3      A       Ora Harmon          116
    5      A  June Strickland           68
    1      A       Traci Dean           64
    8      B        Joel Gill          132
    9      B     Vernon Stone           80
    7      B     Betty Sutton          127
    6      B     Beth Vasquez           95
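It may help to see what this flag-then-sort key does on one group in isolation. The sketch below uses plain Python and assumes group A's values; it suggests that only the maximum and minimum are lifted to the top, while the remaining rows stay in plain descending order instead of continuing to alternate, which agrees with the output shown above.

```python
values = [33, 64, 142, 116, 122, 68]  # assumed: group A's Performance values
lo, hi = min(values), max(values)

# Key mirrors sort_values(['temp', 'Performance'], ascending=[False, False]):
# flagged extremes first, then larger values first.
result = sorted(values, key=lambda v: (v in (lo, hi), v), reverse=True)

print(result)  # [142, 33, 122, 116, 68, 64]
```

For groups of four or fewer rows this coincides with the fully alternating order, as group B shows; longer groups would need one of the other approaches to alternate all the way down.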
