How to flag the last duplicate element in a pandas DataFrame

Use Series.duplicated or DataFrame.duplicated on the relevant column with keep='last'. This marks every duplicate except the last occurrence as True, so invert the mask and cast it to integer to map True/False to 1/0, or use numpy.where:

    import numpy as np

    df['Last_dup1'] = (~df['Policy_id'].duplicated(keep='last')).astype(int)
    # equivalent, using numpy.where
    df['Last_dup1'] = np.where(df['Policy_id'].duplicated(keep='last'), 0, 1)

Or:

    df['Last_dup1'] = (~df.duplicated(subset=['Policy_id'], keep='last')).astype(int)
    # equivalent, using numpy.where
    df['Last_dup1'] = np.where(df.duplicated(subset=['Policy_id'], keep='last'), 0, 1)

    print (df)
       Id Policy_id  Start_Date  Last_dup  Last_dup1
    0   0      b123  2019/02/24         0          0
    1   1      b123  2019/03/24         0          0
    2   2      b123  2019/04/24         1          1
    3   3      c123  2018/09/01         0          0
    4   4      c123  2018/10/01         1          1
    5   5      d123  2017/02/24         0          0
    6   6      d123  2017/03/24         1          1
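For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the printed output above (without the expected Last_dup column) and computes the flag both ways. The names dup_mask and Last_dup2 are only illustrative and do not come from the original answer.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the printed output in the answer
df = pd.DataFrame({
    "Id": range(7),
    "Policy_id": ["b123", "b123", "b123", "c123", "c123", "d123", "d123"],
    "Start_Date": ["2019/02/24", "2019/03/24", "2019/04/24",
                   "2018/09/01", "2018/10/01",
                   "2017/02/24", "2017/03/24"],
})

# True for every row that has a later row with the same Policy_id
dup_mask = df["Policy_id"].duplicated(keep="last")

# Variant 1: invert the boolean mask and cast it to 0/1
df["Last_dup1"] = (~dup_mask).astype(int)

# Variant 2: the same mapping via numpy.where (column name is illustrative)
df["Last_dup2"] = np.where(dup_mask, 0, 1)

print(df)
```

Note that a Policy_id occurring only once would also get 1, because that single row is its own last occurrence. If such rows should get 0 instead, one option is to additionally require df["Policy_id"].duplicated(keep=False) to be True.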
