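Python: groupby with a custom sub-level of interest

Given a pandas data frame like the one below, I want to group by "users" and sum the amount column, but only for rows that meet a specially defined sub-criterion on the time column.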


   amount  time users
0      11     0     A
1      23    10     A
2      12    20     A
3      34    30     A
4      56    40     B
5      77    50     B
6      89    60     C

For this I have pairs of range_start and range_end, e.g. as tuples in a list. These sub_group_ranges on the time column should let me apply groupby().sum() to each batch within the data frame.


sub_group_ranges = [(0,0),(20,30),(40,50),(60,60)]

The result should look like this. The number of intervals per user is arbitrary.


   sum_amount_on_timerange user
0                       57    A
1                      133    B
2                       89    C

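I found this post, which is similar, but I don't understand how to use it when my intervals are not contiguous (i.e. the end of one interval is not the start of the next).

If someone knows what to look for, that would be great. Thanks a lot.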


哆啦的时光机

三国纷争

I'm not sure I fully understand what you are trying to do, but here is something that might help:

import numpy as np
import pandas as pd

# The users and time columns from the question
users = ['A', 'A', 'A', 'A', 'B', 'B', 'C']
time = [0, 10, 20, 30, 40, 50, 60]

df = pd.DataFrame([users,time]).T
df.columns = ['users','time']

def filter_time_range(ele,trange):
    # Keep the value only if trange[0] < ele <= trange[1];
    # note that the lower bound is exclusive here
    if (ele>trange[0]) and (ele<=trange[1]):
        return ele
    else:
        return np.nan

sub_group_ranges = [(0,0),(20,30),(40,50),(60,60)]
for trange in sub_group_ranges:
    df[str(trange)] = df['time'].apply(lambda x: filter_time_range(x,trange))

df

This results in

  users time  (0, 0)  (20, 30)  (40, 50)  (60, 60)
0     A    0     NaN       NaN       NaN       NaN
1     A   10     NaN       NaN       NaN       NaN
2     A   20     NaN       NaN       NaN       NaN
3     A   30     NaN      30.0       NaN       NaN
4     B   40     NaN       NaN       NaN       NaN
5     B   50     NaN       NaN      50.0       NaN
6     C   60     NaN       NaN       NaN       NaN

and your grouping by user:

df.groupby(['users']).sum()

       (0, 0)  (20, 30)  (40, 50)  (60, 60)
users
A         0.0      30.0       0.0       0.0
B         0.0       0.0      50.0       0.0
C         0.0       0.0       0.0       0.0

I left amount out of my data frame.
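For the sums asked for in the question, one possible approach is to build a boolean mask of the rows whose time falls in any of the ranges, then group only those rows. This is a minimal sketch, not a definitive solution: it assumes both ends of each range are inclusive (which is what the expected totals of 57, 133 and 89 imply), and the names in_any_range and result are only illustrative.

```python
import pandas as pd

# Data frame and ranges as given in the question
df = pd.DataFrame({
    "amount": [11, 23, 12, 34, 56, 77, 89],
    "time": [0, 10, 20, 30, 40, 50, 60],
    "users": ["A", "A", "A", "A", "B", "B", "C"],
})
sub_group_ranges = [(0, 0), (20, 30), (40, 50), (60, 60)]

# Assumption: a row counts if start <= time <= end for at least one range.
# Series.between includes both endpoints by default.
in_any_range = pd.Series(False, index=df.index)
for start, end in sub_group_ranges:
    in_any_range |= df["time"].between(start, end)

# Sum amount per user over the selected rows only
result = (
    df[in_any_range]
    .groupby("users")["amount"]
    .sum()
    .reset_index(name="sum_amount_on_timerange")
)
print(result)
```

Because the mask is just the OR of the individual range checks, the ranges do not need to be contiguous or sorted. A user with no rows in any range would simply not appear in result.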