慕慕森
The columns need to be numeric before we can run calculations on them, in this case sum:

    # Example dataframe
    import pandas as pd

    df = pd.DataFrame({'date': ['2019-01-04', '2019-01-04', '2019-01-03', '2018-12-22', '2018-08-31'],
                       'replies_count': ['46', '143', '64', '154', '50'],
                       'polarity': [10, 20, 30, 40, 50]})
    print(df)

             date replies_count  polarity
    0  2019-01-04            46        10
    1  2019-01-04           143        20
    2  2019-01-03            64        30
    3  2018-12-22           154        40
    4  2018-08-31            50        50

Check the column types. replies_count holds strings, so its dtype is object:

    print(df.dtypes)

    date             object
    replies_count    object
    polarity          int64
    dtype: object

Apply groupby with sum. Only the numeric column polarity appears in the result:

    print(df.groupby('date').sum())

                polarity
    date
    2018-08-31        50
    2018-12-22        40
    2019-01-03        30
    2019-01-04        30

Now change the type of the replies_count column to int and run the same groupby and sum:

    df['replies_count'] = df['replies_count'].astype(int)
    print(df.groupby('date').sum())

                replies_count  polarity
    date
    2018-08-31             50        50
    2018-12-22            154        40
    2019-01-03             64        30
    2019-01-04            189        30

As we can see, the column is now included in the result.
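A slightly more defensive variant of the same idea, as a sketch rather than the only way to do it. It assumes the column might contain values that do not parse as numbers, so it uses pd.to_numeric with errors='coerce' (unparseable values become NaN) instead of astype(int), and it names the columns to aggregate explicitly. The explicit selection is there because, as far as I know, newer pandas releases (2.0 and later) no longer silently drop a string column in groupby(...).sum() and may concatenate the strings instead, so the output shown above depends on the pandas version. The variable name totals is my own choice.

```python
import pandas as pd

# Same example data as in the answer above
df = pd.DataFrame({
    'date': ['2019-01-04', '2019-01-04', '2019-01-03', '2018-12-22', '2018-08-31'],
    'replies_count': ['46', '143', '64', '154', '50'],
    'polarity': [10, 20, 30, 40, 50],
})

# Convert the string column to numbers.
# errors='coerce' turns any value that cannot be parsed into NaN
# instead of raising, which astype(int) would do.
df['replies_count'] = pd.to_numeric(df['replies_count'], errors='coerce')

# Name the columns to aggregate so the result does not depend on
# how a given pandas version treats non-numeric columns.
totals = df.groupby('date')[['replies_count', 'polarity']].sum()
print(totals)
```

If some values really are unparseable, the NaN entries are skipped by sum by default, so it is worth checking df['replies_count'].isna().sum() before trusting the totals.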