How to apply a function to two columns of a Pandas DataFrame

3 Answers

慕妹3242003

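Here is an example that uses apply on the DataFrame, called with axis=1. Note the difference: rather than trying to pass two values to the function f, rewrite the function to accept a pandas Series object, then index into the Series to get the values you need.

    In [49]: df
    Out[49]:
              0         1
    0  1.000000  0.000000
    1 -0.494375  0.570994
    2  1.000000  0.000000
    3  1.876360 -0.229738
    4  1.000000  0.000000

    In [50]: def f(x):
       ....:     return x[0] + x[1]
       ....:

    In [51]: df.apply(f, axis=1)  # passes a Series object, row-wise
    Out[51]:
    0    1.000000
    1    0.076619
    2    1.000000
    3    1.646622
    4    1.000000

Depending on your use case, it is sometimes helpful to create a pandas groupby object and then use apply on the group.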
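As a rough illustration of the groupby suggestion, here is a minimal sketch. The column names `key`, `a`, `b` and the function `weighted_sum` are made up for this example and do not come from the question. The function receives each group's sub-DataFrame, so it can work with both columns at once.

```python
import pandas as pd

# Hypothetical data: the column names are assumptions for illustration only.
df = pd.DataFrame({
    "key": ["x", "x", "y", "y"],
    "a": [1, 2, 3, 4],
    "b": [10, 20, 30, 40],
})

def weighted_sum(group):
    # `group` is a DataFrame holding the rows of one group,
    # so both columns are available together.
    return (group["a"] * group["b"]).sum()

# Select the two columns first so the function only sees them.
result = df.groupby("key")[["a", "b"]].apply(weighted_sum)
print(result)
```

The result is a Series indexed by the group key, with one value per group.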

千万里不及你

A simple solution is:

    df['col_3'] = df[['col_1','col_2']].apply(lambda x: f(*x), axis=1)
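To make the unpacking step concrete, here is a small runnable sketch. The function `f` below is a hypothetical stand-in (a plain sum), since the asker's own `f` is not shown in this answer.

```python
import pandas as pd

def f(a, b):
    # Hypothetical two-argument function standing in for the asker's f.
    return a + b

df = pd.DataFrame({"col_1": [0, 2, 3], "col_2": [1, 4, 5]})

# Each row arrives as a Series; *x unpacks its values in column order,
# so f receives col_1 as `a` and col_2 as `b`.
df["col_3"] = df[["col_1", "col_2"]].apply(lambda x: f(*x), axis=1)
print(df)
```

Because the arguments are matched by position, the order of the columns in the selection list has to match the order of the parameters of `f`.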

梵蒂冈之花

There is a clean, one-line way to do this in pandas:

    df['col_3'] = df.apply(lambda x: f(x.col_1, x.col_2), axis=1)

This allows f to be a user-defined function with multiple input values, and it accesses the columns by (safe) column name rather than by (unsafe) numeric index.

Example with data (based on the original question):

    import pandas as pd

    df = pd.DataFrame({'ID':['1', '2', '3'], 'col_1': [0, 2, 3], 'col_2':[1, 4, 5]})
    mylist = ['a', 'b', 'c', 'd', 'e', 'f']

    def get_sublist(sta,end):
        return mylist[sta:end+1]

    df['col_3'] = df.apply(lambda x: get_sublist(x.col_1, x.col_2), axis=1)

Output of print(df):

      ID  col_1  col_2      col_3
    0  1      0      1     [a, b]
    1  2      2      4  [c, d, e]
    2  3      3      5  [d, e, f]
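One caveat worth sketching: attribute access such as `x.col_1` only works when the column name is a valid Python identifier. For names containing spaces, bracket indexing on the row does the same job. The column names `start idx` and `end idx` below are hypothetical, chosen only to show the case.

```python
import pandas as pd

# Hypothetical frame whose column names contain spaces,
# so x.start idx would not be valid Python.
df = pd.DataFrame({"start idx": [0, 2, 3], "end idx": [1, 4, 5]})
mylist = ["a", "b", "c", "d", "e", "f"]

def get_sublist(sta, end):
    # Inclusive slice of mylist from sta through end.
    return mylist[sta:end + 1]

# Bracket indexing on the row Series still looks columns up by name.
df["sub"] = df.apply(
    lambda x: get_sublist(x["start idx"], x["end idx"]), axis=1
)
print(df)
```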