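I want to add new columns to a DataFrame based on the values in specific columns, where the result depends on data contained in another DataFrame.
More specifically, I have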
df_original =
       Crncy  Spread  Duration
    0    EUR     100       1.2
    1    nan     nan       nan
    2            100      3.46
    3    CHF     200       2.5
    4    USD      50       5.0
    ...
df_interpolation =
        CRNCY  TENOR  Adj_EUR  Adj_USD
    0     EUR      1       10       20
    1     EUR      2       20       30
    2     EUR      5       30       40
    3     EUR      7       40       50
    ...
    10    CHF      1       50       10
    11    CHF      2       60       20
    12    CHF      5       70       30
    ...
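I now want to add the columns Adj_EUR and Adj_USD to df_original, computed for each row from its Crncy and Duration values using standard linear interpolation.
So, for each available Crncy, we want to use TENOR and Adj_USD / Adj_EUR from df_interpolation, together with Duration from df_original, to form the interpolated value.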
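To make the intended arithmetic concrete, here is a minimal sketch of linear interpolation between two neighbouring tenor points. The helper name `lerp` is hypothetical and only for illustration; the numbers are the EUR rows of df_interpolation and the Duration of row 0 in df_original.

```python
def lerp(x, x0, y0, x1, y1):
    """Linearly interpolate between the points (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# The EUR row with Duration 1.2 lies between TENOR 1 and TENOR 2.
adj_eur = lerp(1.2, 1, 10, 2, 20)  # uses Adj_EUR at TENOR 1 and 2
adj_usd = lerp(1.2, 1, 20, 2, 30)  # uses Adj_USD at TENOR 1 and 2
```

Up to floating-point rounding, this gives 12 for Adj_EUR and 22 for Adj_USD.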
For example, in pseudocode using scipy's optimize package:
    from scipy import optimize

    # Do this for both 'Adj_EUR' and 'Adj_USD'; shown here for 'Adj_EUR'
    df_original['Adj_EUR'] = float('nan')
    for curr, df in df_original.groupby('Crncy'):
        curve = df_interpolation[df_interpolation['CRNCY'] == curr]
        x_data = curve['TENOR'].to_numpy(dtype=float)
        y_data = curve['Adj_EUR'].to_numpy(dtype=float)
        # Linear fit y = a + b * t; [0] picks the fitted parameters (a, b)
        z_linear = optimize.curve_fit(lambda t, a, b: a + b * t, x_data, y_data)[0]
        # Somehow add the values back to df_original in a new column;
        # assigning to the group copy `df` would not change df_original
        df_original.loc[df.index, 'Adj_EUR'] = z_linear[0] + z_linear[1] * df['Duration']
yielding
       Crncy  Spread  Duration  Adj_EUR  Adj_USD
    0    EUR     100       1.2       12       22
    1    nan     nan       nan      0.0      0.0
    ...
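The row 0 values in this table (12 and 22) are what piecewise-linear interpolation between TENOR 1 and TENOR 2 gives, so one possible approach is `np.interp` applied per currency instead of a single straight-line fit with `curve_fit`. The following is only a sketch under stated assumptions: the function name `add_interpolated` is hypothetical, the toy frames contain only rows 0, 1, 3 and 4 of df_original and the df_interpolation rows shown above, and currencies without a curve are left as NaN.

```python
import numpy as np
import pandas as pd

# Toy data: only rows that are fully shown in the tables above.
df_original = pd.DataFrame(
    {
        "Crncy": ["EUR", np.nan, "CHF", "USD"],
        "Spread": [100, np.nan, 200, 50],
        "Duration": [1.2, np.nan, 2.5, 5.0],
    },
    index=[0, 1, 3, 4],
)
df_interpolation = pd.DataFrame(
    {
        "CRNCY": ["EUR"] * 4 + ["CHF"] * 3,
        "TENOR": [1, 2, 5, 7, 1, 2, 5],
        "Adj_EUR": [10, 20, 30, 40, 50, 60, 70],
        "Adj_USD": [20, 30, 40, 50, 10, 20, 30],
    }
)


def add_interpolated(original, curves, cols=("Adj_EUR", "Adj_USD")):
    """Return a copy of `original` with one interpolated column per name in `cols`."""
    out = original.copy()
    for col in cols:
        out[col] = np.nan
    # groupby skips rows whose Crncy is NaN, so those rows keep NaN.
    for curr, grp in original.groupby("Crncy"):
        curve = curves[curves["CRNCY"] == curr].sort_values("TENOR")
        if curve.empty:
            continue  # no curve available for this currency (USD in the toy data)
        x = curve["TENOR"].to_numpy(dtype=float)
        for col in cols:
            y = curve[col].to_numpy(dtype=float)
            # np.interp needs increasing x; outside [x[0], x[-1]] it returns the end values.
            out.loc[grp.index, col] = np.interp(grp["Duration"].to_numpy(dtype=float), x, y)
    return out


result = add_interpolated(df_original, df_interpolation)
```

Rows without a currency or without a matching curve stay NaN here; calling `.fillna(0.0)` on the two new columns would reproduce the 0.0 entries shown in the table. Note that `np.interp` does not extrapolate, so a Duration beyond the last TENOR gets the last curve value.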
Any pointers on how to do this?