I need to assign a specific ID to each DataFrame that is built from the following conditions:
fd = df[(df['B'].str.match('.*Color:.*') | df['B'].str.match('.*colorFUL:.*')) & df.A.isnull()]
fd2 = df[(df['B'].str.match('.*Type:.*')) & df.A.isnull()]
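In the output file, the two DataFrames are written one below the other. I need to add a column C in which ID "1" is assigned to the rows from fd and ID "2" to the rows from fd2. This will make it easier to filter the combined DataFrame.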
This is the current output:
A B
nan this has Color:Red
nan Color: Blue,red, green
nan Color: Yellow
nan This has many colors. Color: green, red, Yellow
nan Filter oil Type: Synthetic Motor oil
nan Oil Type : High Mileage Motor oil
Expected output:
A B C
nan this has Color:Red 1
nan Color: Blue,red, green 1
nan Color: Yellow 1
nan This has many colors. Color: green, red, Yellow 1
nan Filter oil Type: Synthetic Motor oil 2
nan Oil Type : High Mileage Motor oil 2
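
One possible way to get this output is to tag each filtered DataFrame with `DataFrame.assign` before stacking them with `pd.concat`. The following is a minimal sketch, not a definitive solution. The sample `df` is rebuilt from the rows shown above, and the names `is_color`, `is_type`, `missing_a`, and `result` are made up for illustration. It also assumes `str.contains` with the pattern `Type\s*:` instead of the original `'.*Type:.*'`, because the row "Oil Type : High Mileage Motor oil" has a space before the colon and would not match the original pattern.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the rows shown in the question.
df = pd.DataFrame({
    "A": [np.nan] * 6,
    "B": [
        "this has Color:Red",
        "Color: Blue,red, green",
        "Color: Yellow",
        "This has many colors. Color: green, red, Yellow",
        "Filter oil Type: Synthetic Motor oil",
        "Oil Type : High Mileage Motor oil",
    ],
})

# Boolean masks; na=False treats missing values in B as non-matches.
is_color = df["B"].str.contains(r"Color:|colorFUL:", na=False)
# Assumption: allow optional whitespace before the colon ("Type :").
is_type = df["B"].str.contains(r"Type\s*:", na=False)
missing_a = df["A"].isnull()

# assign() returns a copy with the new column, so df itself is not modified.
fd = df[is_color & missing_a].assign(C=1)
fd2 = df[is_type & missing_a].assign(C=2)

# Stack the two tagged frames one below the other.
result = pd.concat([fd, fd2], ignore_index=True)
print(result)
```

With column C in place, each group can be recovered later with a filter such as `result[result["C"] == 1]`.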
慕田峪4524236