慕桂英4014372
You can use groupby on 'A' together with first to find the first corresponding 'B' value for each group (first skips NaN, so it will not pick a missing value).

    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, 3, 2, 3, 1],
                       'B': [20, None, None, 30, 40, None],
                       'C': [4, 8, 2, 9, 1, 3]})

    # find first 'B' value for each 'A'
    lookup = df[['A', 'B']].groupby('A').first()['B']

    # only use rows where 'B' is NaN
    nan_mask = df['B'].isnull()

    # replace NaN values in 'B' with lookup values
    # (df.loc[mask, 'B'] avoids the chained assignment df['B'].loc[mask] = ...)
    df.loc[nan_mask, 'B'] = df.loc[nan_mask, 'A'].map(lookup)

    print(df)

Which outputs:

       A     B  C
    0  1  20.0  4
    1  2  30.0  8
    2  3  40.0  2
    3  2  30.0  9
    4  3  40.0  1
    5  1  20.0  3

If 'B' contains many NaN values, you may want to exclude those rows before calling groupby.

    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, 3, 2, 3, 1],
                       'B': [20, None, None, 30, 40, None],
                       'C': [4, 8, 2, 9, 1, 3]})

    # mark the rows where 'B' is NaN
    nan_mask = df['B'].isnull()

    # find first 'B' value for each 'A', using only the non-NaN rows
    lookup = df[~nan_mask][['A', 'B']].groupby('A').first()['B']

    # replace NaN values in 'B' with lookup values
    df.loc[nan_mask, 'B'] = df.loc[nan_mask, 'A'].map(lookup)

    print(df)
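The lookup-and-assign steps above can also be folded into a single expression. The following is a minimal sketch, not the answer's own method: it swaps the explicit lookup table for `groupby(...).transform('first')`, which broadcasts each group's first non-NaN value back to every row, and then uses `fillna`. The helper name `fill_from_group_first` is hypothetical and chosen only for illustration.

```python
import pandas as pd


def fill_from_group_first(df, key, target):
    """Hypothetical helper: fill NaNs in `target` with the first
    non-NaN `target` value found in the same `key` group."""
    out = df.copy()  # leave the caller's frame untouched
    # transform('first') returns a Series aligned with `out`,
    # holding each group's first non-NaN value on every row
    group_first = out.groupby(key)[target].transform('first')
    # only the NaN entries are replaced; existing values are kept
    out[target] = out[target].fillna(group_first)
    return out


df = pd.DataFrame({'A': [1, 2, 3, 2, 3, 1],
                   'B': [20, None, None, 30, 40, None],
                   'C': [4, 8, 2, 9, 1, 3]})

filled = fill_from_group_first(df, 'A', 'B')
print(filled)
```

On the sample frame this should give the same 'B' column as the answer. A group whose 'B' values are all NaN stays NaN, since there is nothing to fill from.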
大话西游666
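You can call sort_values first and then forward fill column B within the ordering given by column A. This works because sort_values places NaN last by default, so within each 'A' group the missing 'B' values come right after a known one. One way to do it:

    import pandas as pd
    import numpy as np

    x = {'A': [1, 2, 3, 2, 3, 1],
         'B': [20, np.nan, np.nan, 30, 40, np.nan],
         'C': [4, 8, 2, 9, 1, 3]}
    df = pd.DataFrame(x)

    # sort by columns A and B first, then forward fill column B;
    # assignment aligns on the index, so this gets the right values
    # while maintaining the original order of the dataframe
    df['B'] = df.sort_values(by=['A', 'B'])['B'].ffill()

    print(df)

The output will be:

Original data:

       A     B  C
    0  1  20.0  4
    1  2   NaN  8
    2  3   NaN  2
    3  2  30.0  9
    4  3  40.0  1
    5  1   NaN  3

Updated data:

       A     B  C
    0  1  20.0  4
    1  2  30.0  8
    2  3  40.0  2
    3  2  30.0  9
    4  3  40.0  1
    5  1  20.0  3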
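One edge case of the sort-then-fill approach is worth checking: if some 'A' group has no known 'B' value at all, a plain forward fill carries a value over from the previous group. The sketch below illustrates this with an extra row (A = 4) that is not in the original data and is added here purely as an assumption for the demonstration. It compares the plain fill with a group-wise `groupby('A')['B'].ffill()`.

```python
import numpy as np
import pandas as pd

# Sample data from the answer plus one hypothetical row (A=4)
# whose group has no known 'B' value.
df = pd.DataFrame({'A': [1, 2, 3, 2, 3, 1, 4],
                   'B': [20, np.nan, np.nan, 30, 40, np.nan, np.nan],
                   'C': [4, 8, 2, 9, 1, 3, 7]})

# NaN sorts last within each 'A' group (the sort_values default)
ordered = df.sort_values(by=['A', 'B'])

# Plain forward fill: the A=4 row inherits the last value of group A=3
global_fill = ordered['B'].ffill().reindex(df.index)

# Group-wise forward fill: values never cross an 'A' boundary
group_fill = ordered.groupby('A')['B'].ffill().reindex(df.index)

print(pd.DataFrame({'A': df['A'],
                    'global_fill': global_fill,
                    'group_fill': group_fill}))
```

For the six original rows both variants agree. They differ only on the added row, where the group-wise version leaves NaN in place instead of borrowing a value from another group.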