How to use groupby in Python to split data into groups and compute the mean of another column

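I have a DataFrame like the one below. I want to use the "part1" column as the basis for splitting the data into 3 groups (each containing the same number of rows) and then compute the mean of part2 within each group. For example, row 0 and row 1 fall into group B, whose mean is (0.67 + (-0.03)) / 2 = 0.32.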


import pandas as pd

df = pd.DataFrame({
    "date": ["20130101", "20130101", "20130103", "20130103", "20130105", "20130105"],
    "part1": [0.5, 0.7, 1.3, 1.5, 0.1, 0.3],
    "part2": [0.67, -0.03, 1.95, -3.25, -0.3, 0.6]
})

       date  part1  part2  output
0  20130101    0.5   0.67    0.32
1  20130101    0.7  -0.03    0.32
2  20130103    1.3   1.95   -0.65
3  20130103    1.5  -3.25   -0.65
4  20130105    0.1  -0.3     0.15
5  20130105    0.3   0.6     0.15


潇湘沐
697 views, 3 answers
3 Answers

慕容森

If you want to compute the mean for each date, you can use groupby as follows:

import pandas as pd

df = pd.DataFrame({
    "date": ["20130101", "20130101", "20130103", "20130103", "20130105", "20130105"],
    "part1": [0.5, 0.7, 1.3, 1.5, 0.1, 0.3],
    "part2": [0.67, -0.03, 1.95, -3.25, -0.3, 0.6]
})

# One row per date, holding the mean of every numeric column
df.groupby("date").mean().reset_index()

Result:

       date  part1  part2
0  20130101    0.6   0.32
1  20130103    1.4  -0.65
2  20130105    0.2   0.15
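The question's table repeats the group mean on every row, whereas the aggregation above returns one row per date. A minimal sketch of one way to get that per-row column is `transform("mean")`; the column name `output` is taken from the question's table, and grouping by `date` is only an assumption that happens to reproduce it for this sample data:

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["20130101", "20130101", "20130103", "20130103", "20130105", "20130105"],
    "part1": [0.5, 0.7, 1.3, 1.5, 0.1, 0.3],
    "part2": [0.67, -0.03, 1.95, -3.25, -0.3, 0.6],
})

# transform("mean") returns a Series aligned with the original rows,
# so each row receives the mean of part2 for its own date.
df["output"] = df.groupby("date")["part2"].transform("mean")

print(df.round(2))
```

Unlike `.mean().reset_index()`, this keeps all six rows, so the result can sit next to the original columns.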

犯罪嫌疑人X

You can pass a function as the by argument of the pandas groupby method.

from functools import partial
import pandas as pd

df = pd.DataFrame({
    "date": ["20130101", "20130101", "20130103", "20130103", "20130105", "20130105"],
    "part1": [0.5, 0.7, 1.3, 1.5, 0.1, 0.3],
    "part2": [0.67, -0.03, 1.95, -3.25, -0.3, 0.6]
})

def grouper(df, val):
    # groupby calls this once per index label; look up that row's part1
    foo = df.loc[val, 'part1']
    # Hard-coded, non-overlapping thresholds chosen for this sample data
    if 0.0 < foo < 0.4:
        return 0
    elif 0.4 <= foo < 1.0:
        return 1
    elif foo >= 1.0:
        return 2

# partial binds df, so groupby only has to supply the index label
grouped = df['part2'].groupby(by=partial(grouper, df)).mean()

This gives:

0    0.15
1    0.32
2   -0.65
Name: part2, dtype: float64
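The thresholds in grouper are hard-coded for this particular data. As a hedged alternative that is not part of the answer above, `pd.qcut` can derive three equal-sized bins from part1 directly, which is closer to what the question asks for. The labels "A", "B", "C" and the column names `group` and `output` are assumptions for illustration (the question only names group B):

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["20130101", "20130101", "20130103", "20130103", "20130105", "20130105"],
    "part1": [0.5, 0.7, 1.3, 1.5, 0.1, 0.3],
    "part2": [0.67, -0.03, 1.95, -3.25, -0.3, 0.6],
})

# qcut cuts part1 at its quantiles, so each of the 3 bins holds
# the same number of rows here (lowest values -> "A", highest -> "C").
df["group"] = pd.qcut(df["part1"], q=3, labels=["A", "B", "C"])

# Mean of part2 within each bin, broadcast back onto every row.
df["output"] = df.groupby("group", observed=True)["part2"].transform("mean")

print(df.round(2))
```

With this data, rows 0 and 1 land in group B, matching the example in the question. If part1 contains many tied values, qcut may not be able to form equal-sized bins, so the result should be checked on real data.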