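For each record in a table, I want to compute a cumulative count based on two categorical columns.
In the table below, I want to obtain the cum_count column, which is computed from the industry and deal_status columns. The idea is that, for each record, it counts the number of deals previously won in the same industry.
For example, the last record has cum_count = 3 because exactly 3 earlier records with deal_status = won and industry = x have been seen before it.
The pandas GroupBy.cumcount function does this for a single variable...
How can I do this for the case I described?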
import pandas as pd

# Sample data; cum_count is the desired output column.
df = pd.DataFrame({'time': [1, 2, 3, 4, 5, 6, 7],
                   'company': ["ciaA", "ciaB", "ciaA", "ciaC", "ciaA", "ciaD", "ciaE"],
                   'industry': ["x", "y", "x", "x", "x", "y", "x"],
                   'deal_status': ["won", "lost", "won", "won", "lost", "won", "lost"],
                   'cum_count': [0, 0, 1, 2, 3, 0, 3]})
time  company  industry  deal_status  cum_count
1     ciaA     x         won          0
2     ciaB     y         lost         0
3     ciaA     x         won          1
4     ciaC     x         won          2
5     ciaA     x         lost         3
6     ciaD     y         won          0
7     ciaE     x         lost         3
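
One possible way to get this column, sketched under the assumption that the rows are already sorted by time: turn deal_status into a 0/1 "won" flag, take a running sum of that flag within each industry, and subtract the current row's own flag so that only earlier deals are counted. The names won and df are just illustrative.

```python
import pandas as pd

# Same sample data as in the question, without the expected column.
df = pd.DataFrame({
    'time': [1, 2, 3, 4, 5, 6, 7],
    'company': ["ciaA", "ciaB", "ciaA", "ciaC", "ciaA", "ciaD", "ciaE"],
    'industry': ["x", "y", "x", "x", "x", "y", "x"],
    'deal_status': ["won", "lost", "won", "won", "lost", "won", "lost"],
})

# 1 for a won deal, 0 otherwise (assumes rows are ordered by time).
won = df['deal_status'].eq('won').astype(int)

# Running total of won deals per industry, minus the current row's
# own flag, so each row only counts deals won before it.
df['cum_count'] = won.groupby(df['industry']).cumsum() - won

print(df)
```

If the data is not ordered by time, sorting with df.sort_values('time') first would be needed for "previously" to mean what is intended.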
缥缈止盈