慕田峪9158850
If you are not sure that 2020 is the first row of every group, use DataFrame.merge:

    # Baseline rows: one 2020 row per (id, variable) group
    df1 = df[df['year'].eq(2020)]
    # suffixes=('_', '') leaves the right-hand (2020) column named 'value'
    df['value'] -= df.merge(df1, how='left', on=['id','variable'], suffixes=('_',''))['value'].values
    print(df)
       id variable  year  value
    0   1        a  2020      0
    1   1        a  2021      1
    2   1        a  2022      3
    3   1        b  2020      0
    4   1        b  2021      5
    5   1        b  2022      7

If 2020 is always the first row of each group, use GroupBy.transform with GroupBy.first:

    # Subtract each group's first value from every row of that group
    df['value'] -= df.groupby(['id','variable'])['value'].transform('first')
    print(df)
       id variable  year  value
    0   1        a  2020      0
    1   1        a  2021      1
    2   1        a  2022      3
    3   1        b  2020      0
    4   1        b  2021      5
    5   1        b  2022      7

EDIT: If the data contains duplicated 2020 rows within a group, one solution is to drop the duplicates first, so that only the first 2020 value is subtracted:

    print(df)
       id variable  year  value
    0   1        a  2020      3
    1   1        a  2020      2
    2   1        a  2022      5
    3   1        b  2020      3
    4   1        b  2021      8
    5   1        b  2022     10

    df1 = df[df['year'].eq(2020)]
    # Keep only the first 2020 row per group before merging
    df['value'] -= df.merge(df1.drop_duplicates(['id','variable']),
                            how='left',
                            on=['id','variable'],
                            suffixes=('_',''))['value'].values
    print(df)
       id variable  year  value
    0   1        a  2020      0
    1   1        a  2020     -1
    2   1        a  2022      2
    3   1        b  2020      0
    4   1        b  2021      5
    5   1        b  2022      7

Or remove the duplicates by aggregating the values first, e.g. with sum:

    print(df)
       id variable  year  value
    0   1        a  2020      3
    1   1        a  2020      1
    2   1        a  2022      5
    3   1        b  2020      3
    4   1        b  2021      8
    5   1        b  2022     10

    # Collapse duplicated (id, variable, year) rows into one row by summing
    df = df.groupby(['id','variable','year'], as_index=False).sum()
    print(df)
       id variable  year  value
    0   1        a  2020      4
    1   1        a  2022      5
    2   1        b  2020      3
    3   1        b  2021      8
    4   1        b  2022     10

    df1 = df[df['year'].eq(2020)]
    df['value'] -= df.merge(df1, how='left', on=['id','variable'], suffixes=('_',''))['value'].values
    print(df)
       id variable  year  value
    0   1        a  2020      0
    1   1        a  2022      1
    2   1        b  2020      0
    3   1        b  2021      5
    4   1        b  2022      7
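The merge-based approach can also be wrapped in a small helper that fails loudly when a group has more than one 2020 row, rather than silently producing extra rows. This is only a sketch: the function name `subtract_baseline`, the temporary column `_base`, and the sample data are hypothetical and not part of the original answer. It relies on the `validate` parameter of `DataFrame.merge`.

```python
import pandas as pd


def subtract_baseline(df, keys, year_col="year", value_col="value", base_year=2020):
    """Return a copy of df with each group's base-year value subtracted.

    Hypothetical helper; '_base' is an assumed temporary column name.
    """
    # Select the base-year rows, keeping only the keys and the value.
    base = (
        df.loc[df[year_col].eq(base_year), keys + [value_col]]
          .rename(columns={value_col: "_base"})
    )
    # validate="many_to_one" raises pandas.errors.MergeError if any group
    # has more than one base-year row.
    merged = df.merge(base, how="left", on=keys, validate="many_to_one")
    out = df.copy()
    # A left merge keeps the row order of df, so values align by position.
    out[value_col] = (merged[value_col] - merged["_base"]).to_numpy()
    return out


# Hypothetical sample data in which 2020 is NOT the first row of group "a".
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 1, 1],
    "variable": ["a", "a", "a", "b", "b", "b"],
    "year": [2021, 2020, 2022, 2020, 2021, 2022],
    "value": [4, 3, 6, 3, 8, 10],
})

result = subtract_baseline(df, ["id", "variable"])
print(result)  # the value column becomes [1, 0, 3, 0, 5, 7]
```

If duplicated 2020 rows are expected, deduplicate or aggregate first (as in the EDIT above) before calling the helper. Groups with no 2020 row at all end up with NaN in the value column.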