I have a df as follows:
import pandas as pd
df = pd.DataFrame({
"ID": ['company A', 'company A', 'company A', 'company B','company B', 'company B', 'company C', 'company C','company C','company C', 'company D', 'company D','company D'],
'Sender': [28, 'delete', 'flag_source', 56, 28, 312, 'delete', 'flag_source', 78, 102, 26, 101, 96],
'Receiver': [129, 28, 'delete', 172, 56, 28, 61, 'delete', 12, 78, 98, 26, 101],
'Date': ['2020-04-12', '2020-03-20', '2020-03-20', '2019-02-11', '2019-01-31', '2018-04-02', '2020-06-29', '2020-06-29', '2019-11-29', '2019-10-01', '2020-04-03', '2020-01-30', '2019-10-18'],
'Sender_type': ['house', 'temp', 'house', 'house', 'house', 'house', 'temp', 'house', 'house','house','house', 'temp', 'house'],
'Receiver_type': ['house', 'house', 'temp', 'house','house','house','house', 'temp', 'house','house','house','house','temp'],
'Price': [32, 50, 47, 21, 23, 19, 52, 39, 12, 22, 61, 53, 19]
})
It looks like this (first six rows shown):
           ID       Sender Receiver        Date Sender_type Receiver_type  Price
0   company A           28      129  2020-04-12       house         house     32
1   company A       delete       28  2020-03-20        temp         house     50  # combine this row with below
2   company A  flag_source   delete  2020-03-20       house          temp     47  # combine this row with above
3   company B           56      172  2019-02-11       house         house     21
4   company B           28       56  2019-01-31       house         house     23
5   company B          312       28  2018-04-02       house         house     19
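Within each 'ID' group (company x), I would like to combine two rows by the following rule: the row whose 'Sender' is 'flag_source' and the row directly above it are merged into one new row. In this new row, 'Sender' is flag_source, 'Receiver' is the value from the row above (so both 'delete' values disappear), 'Date' is the date from the row above, 'Sender_type' and 'Receiver_type' are both 'house', and 'Price' is the value from the row above. The two original rows are then removed. For example, for company A, rows 1 and 2 would be combined into the following new row: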
         ID       Sender Receiver        Date Sender_type Receiver_type  Price
  company A  flag_source       28  2020-03-20       house         house     50
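
One possible way to express this rule in pandas is sketched below. It rests on a few assumptions that are not stated in the question: each 'flag_source' row sits directly below its partner row within the same 'ID' group, the merged row takes the position of the upper row, and a 'flag_source' row with no row above it in its group is left untouched. The function name merge_flag_rows is hypothetical and used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": ['company A', 'company A', 'company A', 'company B', 'company B', 'company B',
           'company C', 'company C', 'company C', 'company C',
           'company D', 'company D', 'company D'],
    'Sender': [28, 'delete', 'flag_source', 56, 28, 312,
               'delete', 'flag_source', 78, 102, 26, 101, 96],
    'Receiver': [129, 28, 'delete', 172, 56, 28, 61, 'delete', 12, 78, 98, 26, 101],
    'Date': ['2020-04-12', '2020-03-20', '2020-03-20', '2019-02-11', '2019-01-31',
             '2018-04-02', '2020-06-29', '2020-06-29', '2019-11-29', '2019-10-01',
             '2020-04-03', '2020-01-30', '2019-10-18'],
    'Sender_type': ['house', 'temp', 'house', 'house', 'house', 'house', 'temp',
                    'house', 'house', 'house', 'house', 'temp', 'house'],
    'Receiver_type': ['house', 'house', 'temp', 'house', 'house', 'house', 'house',
                      'temp', 'house', 'house', 'house', 'house', 'temp'],
    'Price': [32, 50, 47, 21, 23, 19, 52, 39, 12, 22, 61, 53, 19],
})


def merge_flag_rows(frame):
    """Merge each 'flag_source' row into the row directly above it (per ID)."""
    out = frame.copy()

    # Rows whose Sender is the literal string 'flag_source'.
    is_flag = out["Sender"].eq("flag_source")

    # Rows that sit directly above a flag row inside the same ID group.
    # shift(-1) pulls the next row's flag up by one; NaN at group ends becomes False.
    above_flag = is_flag.astype(int).groupby(out["ID"]).shift(-1).eq(1)

    # The upper row already holds the wanted Receiver, Date and Price,
    # so only Sender and the two type columns need to be overwritten.
    out.loc[above_flag, "Sender"] = "flag_source"
    out.loc[above_flag, ["Sender_type", "Receiver_type"]] = "house"

    # Drop the flag rows that were merged (those with a row above them in the group).
    has_row_above = out.groupby("ID").cumcount().gt(0)
    return out[~(is_flag & has_row_above)].reset_index(drop=True)


result = merge_flag_rows(df)
print(result)
```

Because the upper row already carries the Receiver, Date and Price that the new row should have, the sketch edits that row in place and drops the 'flag_source' row, rather than building a new row and deleting both originals. The rows that do not take part in a merge pass through unchanged.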