I have a pandas DataFrame containing a list of members of a large extended family.
import pandas as pd
data = {'child':['Joe','Anna','Anna','Steffani','Bob','Rea','Dani','Dani','Selma','John','Kevin'],
'parents':['Steffani','Bob','Steffani','Dani','Selma','Anna','Selma','John','Kevin','-','Robert'],
}
df = pd.DataFrame(data)
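From this DataFrame, I need to build a new table that shows the relationships in the data by adding several columns on the right. The values in the right-hand columns show the elder (ancestor) relationships, and each column represents one more generation. If I could draw it as a diagram, it might look like this: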
child --> parents --> grandparents --> parents of grandparents --> grandparents of grandparents --> etc.
So the expected output DataFrame would look like this:
    child     parents   A         B       C       D (etc)
---------------------------------------------------------------------------------
0   Joe       Steffani  Dani      Selma   Kevin   <If still possible>
1   Joe       Steffani  Dani      John    -
2   Anna      Bob       Selma     Kevin   Robert
3   Anna      Steffani  Dani      Selma   Kevin
4   Anna      Steffani  Dani      John    -
5   Steffani  Dani      Selma     Kevin   Robert
6   Steffani  Dani      John      -       -
7   Bob       Selma     Kevin     Robert  -
8   Rea       Anna      Bob       Selma   Kevin
9   Rea       Anna      Steffani  Dani    Selma
10  Rea       Anna      Steffani  Dani    John
11  Dani      Selma     Kevin     Robert  -
12  Dani      John      -         -       -
13  Selma     Kevin     Robert    -       -
14  John      -         -         -       -
15  Kevin     Robert    -         -       -
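Currently I build the new table manually with pandas.merge. But I have to repeat the merge many times, until the last column has no further elder relationship to the columns on its left. For example:
Step 1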
df2 = pd.merge(df, df, left_on='parents', right_on='child', how='left').fillna('-')
df2 = df2[['child_x','parents_x','parents_y']]
df2.columns = ['child','parents','A']
Step 2
df3 = pd.merge(df2, df, left_on='A', right_on='child', how='left').fillna('-')
df3 = df3[['child_x','parents_x','A','parents_y']]
df3.columns = ['child','parents','A','B']
Step 3
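Rather than writing out each further step by hand, the repeated merges above could probably be rolled into a loop. The following is only a minimal sketch under some assumptions, not a definitive solution: the helper name expand_ancestors, its placeholder argument, and the temporary column names _key and _next are all made up for illustration; it assumes '-' always means "no known parent", that the data contains no cycles, and that 26 generation columns (A to Z) are enough.

```python
import string

import pandas as pd

data = {
    'child': ['Joe', 'Anna', 'Anna', 'Steffani', 'Bob', 'Rea', 'Dani', 'Dani', 'Selma', 'John', 'Kevin'],
    'parents': ['Steffani', 'Bob', 'Steffani', 'Dani', 'Selma', 'Anna', 'Selma', 'John', 'Kevin', '-', 'Robert'],
}
df = pd.DataFrame(data)


def expand_ancestors(df, placeholder='-'):
    """Hypothetical helper: keep merging until no new elder is found."""
    # Lookup table person -> parent. Rows whose parent is the placeholder
    # are dropped so that '-' never matches anything.
    lookup = (df[df['parents'] != placeholder]
              .rename(columns={'child': '_key', 'parents': '_next'}))
    result = df.copy()
    last = 'parents'
    # Assumption: at most 26 extra columns (A..Z). The bound also keeps
    # the loop finite if the data accidentally contains a cycle.
    for name in string.ascii_uppercase:
        merged = result.merge(lookup, left_on=last, right_on='_key', how='left')
        if merged['_next'].isna().all():
            break  # nobody in the last column has a known parent
        result = (merged.drop(columns='_key')
                        .rename(columns={'_next': name})
                        .fillna({name: placeholder}))
        last = name
    return result


result = expand_ancestors(df)
print(result)
```

With the sample data this keeps going past column C until the last column holds only people without a listed parent, so the number of columns is determined by the data rather than by how many steps were typed out.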