How can I efficiently bin a column and group by in pandas?

I have the following DataFrame:


date  = ['2015-02-03 23:00:00','2015-02-03 23:30:00','2015-02-04 00:00:00','2015-02-04 00:30:00','2015-02-04 01:00:00','2015-02-04 01:30:00','2015-02-04 02:00:00','2015-02-04 02:30:00','2015-02-04 03:00:00','2015-02-04 03:30:00','2015-02-04 04:00:00','2015-02-04 04:30:00','2015-02-04 05:00:00','2015-02-04 05:30:00','2015-02-04 06:00:00','2015-02-04 06:30:00','2015-02-04 07:00:00','2015-02-04 07:30:00','2015-02-04 08:00:00','2015-02-04 08:30:00','2015-02-04 09:00:00','2015-02-04 09:30:00','2015-02-04 10:00:00','2015-02-04 10:30:00','2015-02-04 11:00:00','2015-02-04 11:30:00','2015-02-04 12:00:00','2015-02-04 12:30:00','2015-02-04 13:00:00','2015-02-04 13:30:00','2015-02-04 14:00:00','2015-02-04 14:30:00','2015-02-04 15:00:00','2015-02-04 15:30:00','2015-02-04 16:00:00','2015-02-04 16:30:00','2015-02-04 17:00:00','2015-02-04 17:30:00','2015-02-04 18:00:00','2015-02-04 18:30:00','2015-02-04 19:00:00','2015-02-04 19:30:00','2015-02-04 20:00:00','2015-02-04 20:30:00','2015-02-04 21:00:00','2015-02-04 21:30:00','2015-02-04 22:00:00','2015-02-04 22:30:00','2015-02-04 23:00:00','2015-02-04 23:30:00']

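import pandas as pd

# `value` holds one reading per timestamp in `date`; its definition is not shown in the post
df = pd.DataFrame({'value':value,'index':date})
# the timestamps include seconds, so the format string needs %S as well
df.index = pd.to_datetime(df['index'],format='%Y-%m-%d %H:%M:%S')
df.drop(['index'],axis=1,inplace=True)
print(df)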


                     value
index                     
2015-02-03 23:00:00  33.24
2015-02-03 23:30:00  31.71
2015-02-04 00:00:00  34.39
2015-02-04 00:30:00  34.49
2015-02-04 01:00:00  34.67
2015-02-04 01:30:00  34.46

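I would like to do the following efficiently:

For each year, compute the percentage of values that are strictly below 0, at least 0 but strictly below 20, and 20 or above.

I know about the functions cut and groupby, but I can't figure out a way to combine the two to do this elegantly.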


The expected result would look something like:


        inf0    supequal0_inf20    supequal20
2015    0.2     0.6                0.2
2016    0.7     0.1                0.2
2017    0.1     0.8                0.1

Thank you very much for your help,


红糖糍粑
1 Answer

皈依舞

Given your df, I don't know about elegant, but this should work:

import numpy as np

# altered bins for demonstration purposes
binned = pd.cut(x=df.value, bins=[-np.inf, 40, 50, np.inf], right=False, labels=['low', 'mid', 'high'])
grouped = binned.groupby([pd.Grouper(freq='Y'), binned]).count() / binned.groupby(pd.Grouper(freq='Y')).count()

Result of print(grouped):

index       value
2015-12-31  low      0.520000
            mid      0.380000
            high     0.100000
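
For the bins actually asked about (below 0, from 0 up to 20, and 20 or above) and the wide one-row-per-year layout, the same cut-then-groupby idea might be sketched as follows. This is only an illustration, not the answerer's code: the post does not show `value`, so the sample readings below are made up, and the column labels are borrowed from the expected result in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (the real `value` list is not shown in the post).
idx = pd.to_datetime([
    '2015-03-01 00:00:00', '2015-03-01 00:30:00', '2015-03-01 01:00:00',
    '2015-03-01 01:30:00', '2015-03-01 02:00:00',
    '2016-03-01 00:00:00', '2016-03-01 00:30:00', '2016-03-01 01:00:00',
    '2016-03-01 01:30:00',
])
df = pd.DataFrame(
    {'value': [-5.0, 0.0, 10.0, 19.9, 20.0, -1.0, -2.0, 5.0, 30.0]},
    index=idx,
)

labels = ['inf0', 'supequal0_inf20', 'supequal20']

# right=False closes each bin on the left: [-inf, 0), [0, 20), [20, inf)
binned = pd.cut(df['value'], bins=[-np.inf, 0, 20, np.inf],
                right=False, labels=labels)

# Count rows per (year, bin); observed=False keeps bins that are empty in a year.
counts = (
    binned.groupby([df.index.year, binned], observed=False)
          .size()
          .unstack(fill_value=0)
)

# Divide each row by its total to turn counts into fractions.
result = counts.div(counts.sum(axis=1), axis=0)
print(result)
# With the sample data: 2015 -> 0.2, 0.6, 0.2 and 2016 -> 0.5, 0.25, 0.25
```

Grouping on `df.index.year` instead of `pd.Grouper(freq='Y')` gives plain year labels (2015, 2016) rather than year-end timestamps, which is closer to the layout shown in the question.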