
Group a DataFrame by values less than one second apart - pandas

Suppose I have a pandas DataFrame like this:


>>> import pandas as pd
>>> df = pd.DataFrame({
...     'dt': pd.to_datetime(['2018-12-10 16:35:34.246',
...                           '2018-12-10 16:36:34.243',
...                           '2018-12-10 16:38:34.216',
...                           '2018-12-10 16:42:34.123']),
...     'value': [1, 2, 3, 4],
... })
>>> df
                       dt  value
0 2018-12-10 16:35:34.246      1
1 2018-12-10 16:36:34.243      2
2 2018-12-10 16:38:34.216      3
3 2018-12-10 16:42:34.123      4

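I want to group this DataFrame by the 'dt' column, but in a way that treats values differing by less than one second as the same. After grouping, I want to sum the 'value' column within each group, and I want the DataFrame to keep the same length, so rows whose timestamps differ by less than one second all carry the same, repeated sum. So far I have tried: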


>>> df.groupby('dt', as_index=False)['value'].sum()
                       dt  value
0 2018-12-10 16:35:34.246      1
1 2018-12-10 16:36:34.243      2
2 2018-12-10 16:38:34.216      3
3 2018-12-10 16:42:34.123      4

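But as you can see, the DataFrame is unchanged, because groupby only puts rows with exactly equal 'dt' values in the same group.


My desired output is: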


                       dt  value
0 2018-12-10 16:35:34.246      3
1 2018-12-10 16:36:34.243      3
2 2018-12-10 16:38:34.216      3
3 2018-12-10 16:42:34.123      4


qq_笑_17
2 Answers

繁花如伊

A brute-force solution is to take the absolute difference between your datetime series and each datetime value in turn, then compare it against a threshold:

# data from @StephenCowley
threshold = pd.Timedelta(seconds=1)
df['val'] = [df.loc[(df['dt'] - t).abs() < threshold, 'value'].sum()
             for t in df['dt']]
print(df)
                       dt  value  val
0 2018-12-10 16:35:34.246      1    3
1 2018-12-10 16:35:34.243      2    3
2 2018-12-10 16:38:34.216      3    3
3 2018-12-10 16:42:34.123      4    4

Note that in this data the second timestamp is 16:35:34.243, so the first two rows are less than one second apart and both get the sum 1 + 2 = 3. Every row is compared against every other row, so the work grows quadratically with the number of rows.
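The same pairwise comparison can also be written without the Python-level loop, using NumPy broadcasting. The following is only a minimal sketch, not part of the original answer: it assumes the data printed in the answer (second timestamp 16:35:34.243), and the names `times` and `close` are introduced here purely for illustration.

```python
import numpy as np
import pandas as pd

# Assumed data: the rows printed in the answer above.
df = pd.DataFrame({
    'dt': pd.to_datetime(['2018-12-10 16:35:34.246',
                          '2018-12-10 16:35:34.243',
                          '2018-12-10 16:38:34.216',
                          '2018-12-10 16:42:34.123']),
    'value': [1, 2, 3, 4],
})

threshold = np.timedelta64(1, 's')

# n x n boolean matrix: close[i, j] is True when rows i and j
# are less than one second apart (each row is close to itself).
times = df['dt'].to_numpy()
close = np.abs(times[:, None] - times[None, :]) < threshold

# For each row i, add up 'value' over every row j that is close to it.
df['val'] = close.astype(int) @ df['value'].to_numpy()

print(df)
```

Like the list comprehension, this builds an n x n comparison, so memory use grows quadratically; it is likely only practical for frames with a few thousand rows.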