Combining list elements with sublist elements in Pandas

df


col1                       col2

['aa', 'bb', 'cc', 'dd']   [['ee', 'ff', 'gg', 'hh'], ['qq', 'ww', 'ee', 'rr']]

['ss', 'dd', 'ff', 'gg']   [['mm', 'nn', 'vv', 'cc'], ['zz', 'aa', 'jj', 'kk']]

['ss', 'dd']               [['mm', 'nn', 'vv', 'cc'], ['zz', 'aa', 'jj', 'kk']]

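I'd like to run a function that concatenates the elements of the list in col1 with the corresponding elements of the first sublist in col2 (col2 holds several sublists), and then concatenates the same col1 elements with the corresponding elements of the second sublist in col2.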


The result would look like this column:


results

[['aaee', 'bbff', 'ccgg', 'ddhh'],['aaqq', 'bbww', 'ccee', 'ddrr']]

[['ssmm', 'ddnn', 'ffvv', 'ggcc'],['sszz', 'ddaa', 'ffjj', 'ggkk']]

[['ssmm', 'ddnn'],['sszz', 'ddaa']]

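I think this involves looping over the elements of col1 and somehow matching each one with the item at the same position in every sublist of col2. How can I do this?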


Adapted code:


[[[df1.agg(lambda x: get_top_matches(u, w), axis=1) for u, w in zip(x, v)]
  for v in y] for x, y in zip(df1['parent_org_name_list'], df1['children_org_name_sublists'])]

Result:

http://img4.mukewang.com/64c23eeb000192a206500461.jpg

RISEBY
3 answers

慕丝7291255

You can use zip here:

    [[[u + w for u, w in zip(x, v)] for v in y] for x, y in zip(df['col1'], df['col2'])]

Output:

    [[['aaee', 'bbff', 'ccgg', 'ddhh'], ['aaqq', 'bbww', 'ccee', 'ddrr']],
     [['ssmm', 'ddnn', 'ffvv', 'ggcc'], ['sszz', 'ddaa', 'ffjj', 'ggkk']],
     [['ssmm', 'ddnn'], ['sszz', 'ddaa']]]

To assign the result back to your DataFrame, you can do the following:

    df['results'] = [[[u + w for u, w in zip(x, v)] for v in y]
                     for x, y in zip(df['col1'], df['col2'])]
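For anyone who wants to run the zip approach end to end, here is a minimal sketch using the sample data from the question. The helper name `concat_with_sublists` is made up for illustration and is not a pandas function.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": [
        ["aa", "bb", "cc", "dd"],
        ["ss", "dd", "ff", "gg"],
        ["ss", "dd"],
    ],
    "col2": [
        [["ee", "ff", "gg", "hh"], ["qq", "ww", "ee", "rr"]],
        [["mm", "nn", "vv", "cc"], ["zz", "aa", "jj", "kk"]],
        [["mm", "nn", "vv", "cc"], ["zz", "aa", "jj", "kk"]],
    ],
})


def concat_with_sublists(items, sublists):
    """Join items element-wise with every sublist (hypothetical helper)."""
    # zip pairs items[k] with sub[k] and stops at the shorter of the two.
    return [[a + b for a, b in zip(items, sub)] for sub in sublists]


# Build one nested list per row, then assign the whole column at once.
df["results"] = [
    concat_with_sublists(items, sublists)
    for items, sublists in zip(df["col1"], df["col2"])
]

print(df["results"].tolist())
```

Pulling the inner comprehension into a named function is only a readability choice; the behaviour should match the one-liner.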

holdtom

Max, try this loop-based solution. It gives finer control over the transformation, including handling uneven lengths (see the len_limit example):

    import pandas as pd

    df = pd.DataFrame({'c1': [['aa', 'bb', 'cc', 'dd'], ['ss', 'dd', 'ff', 'gg']],
                       'c2': [[['ee', 'ff', 'gg', 'hh'], ['qq', 'ww', 'ee', 'rr']],
                              [['mm', 'nn', 'vv', 'cc'], ['zz', 'aa', 'jj', 'kk']]]})
    df['c3'] = 'empty'  # send string to 'c3' so it is object data type
    print(df)

                     c1                                    c2     c3
    0  [aa, bb, cc, dd]  [[ee, ff, gg, hh], [qq, ww, ee, rr]]  empty
    1  [ss, dd, ff, gg]  [[mm, nn, vv, cc], [zz, aa, jj, kk]]  empty

    for i, row in df.iterrows():
        c3_list = []
        len_limit = len(row['c1'])  # cap each sublist at the length of c1
        for c2_sublist in row['c2']:
            c3_list.append([j1 + j2 for j1, j2 in zip(row['c1'], c2_sublist[:len_limit])])
        df.at[i, 'c3'] = c3_list

    print(df['c3'])

    0    [[aaee, bbff, ccgg, ddhh], [aaqq, bbww, ccee, ...
    1    [[ssmm, ddnn, ffvv, ggcc], [sszz, ddaa, ffjj, ...
    Name: c3, dtype: object
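A side note on uneven lengths: the built-in zip already stops at the shorter input, so the slice with len_limit mainly makes the intent explicit. If the unmatched elements should be kept rather than dropped, `itertools.zip_longest` is one option. The sketch below uses the third row from the question; keeping leftovers is an assumption about what might be wanted, not something the question asks for.

```python
from itertools import zip_longest

short = ["ss", "dd"]
long_sublist = ["mm", "nn", "vv", "cc"]

# zip stops at the shorter input, so 'vv' and 'cc' are dropped.
truncated = [a + b for a, b in zip(short, long_sublist)]

# zip_longest pads the shorter input; fillvalue="" leaves leftovers unchanged.
padded = [a + b for a, b in zip_longest(short, long_sublist, fillvalue="")]

print(truncated)  # ['ssmm', 'ddnn']
print(padded)     # ['ssmm', 'ddnn', 'vv', 'cc']
```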

ITMISS

Try:

    df["results"] = df[["col1", "col2"]].apply(
        lambda x: [list(map(''.join, zip(x["col1"], el))) for el in x["col2"]],
        axis=1)

Output:

    >>> df["results"]
    0    [[aaee, bbff, ccgg, ddhh], [aaqq, bbww, ccee, ...
    1    [[ssmm, ddnn, ffvv, ggcc], [sszz, ddaa, ffjj, ...
    2                         [[ssmm, ddnn], [sszz, ddaa]]
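In case the `map(''.join, zip(...))` part is unclear, here is a small pandas-free sketch of what it does for one row. The variable names are made up for illustration.

```python
col1_value = ["aa", "bb", "cc", "dd"]
sublist = ["ee", "ff", "gg", "hh"]

# zip yields one tuple per position: ('aa', 'ee'), ('bb', 'ff'), ...
pairs = list(zip(col1_value, sublist))

# ''.join concatenates the strings inside each tuple.
joined = list(map("".join, pairs))

print(pairs[0])  # ('aa', 'ee')
print(joined)    # ['aaee', 'bbff', 'ccgg', 'ddhh']
```

Because `''.join` accepts a tuple of any length, the same pattern would also work if more than two lists were zipped together.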