Below is some dummy data that mirrors the data I am working with.
import pandas as pd
import numpy as np

np.random.seed(30)

# Dummy data that represents a percent change.
# Year-end dates 1984-12-31 through 1994-12-31 (11 rows), matching the output below.
# Note: pandas 2.2+ prefers the alias 'YE' over 'Y' for year-end frequency.
datelist = pd.date_range(start='1984-01-01', end='1995-01-01', freq='Y')
df1 = pd.DataFrame({
    "P Change_1": np.random.uniform(low=-0.55528, high=0.0396181, size=(11,)),
    "P Change_2": np.random.uniform(low=-0.55528, high=0.0396181, size=(11,)),
})

# This DataFrame contains the rows we want to operate on.
# Only the last row (1994) is known; every earlier row is NaN.
df2 = pd.DataFrame({
    'Loc1': [None] * 10 + [2.5415],
    'Loc2': [None] * 10 + [3.2126],
})

# Set the datetime index
df1 = df1.set_index(datelist)
df2 = df2.set_index(datelist)
df1:
            P Change_1  P Change_2
1984-12-31   -0.172080   -0.231574
1985-12-31   -0.328773   -0.247018
1986-12-31   -0.160834   -0.099079
1987-12-31   -0.457924    0.000266
1988-12-31    0.017374   -0.501916
1989-12-31   -0.349052   -0.438816
1990-12-31    0.034711    0.036164
1991-12-31   -0.415445   -0.415372
1992-12-31   -0.206852   -0.413107
1993-12-31   -0.313341   -0.181030
1994-12-31   -0.474234   -0.118058
df2:
              Loc1    Loc2
1984-12-31     NaN     NaN
1985-12-31     NaN     NaN
1986-12-31     NaN     NaN
1987-12-31     NaN     NaN
1988-12-31     NaN     NaN
1989-12-31     NaN     NaN
1990-12-31     NaN     NaN
1991-12-31     NaN     NaN
1992-12-31     NaN     NaN
1993-12-31     NaN     NaN
1994-12-31  2.5415  3.2126
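DataFrame details:
Loc1 corresponds to P Change_1, Loc2 corresponds to P Change_2, and so on. Starting with Loc1, I want to either fill the DataFrame that contains Loc1 and Loc2 with the calculated values, or compute a new DataFrame with columns Calc1 and Calc2.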
Calculation:
I want to start from the 1994 value of Loc1 and compute a new value for 1993 as Loc1 1993 = Loc1 1994 + (Loc1 1994 * P Change_1 1993). The filled-in value would be 2.5415 + (-0.313341 * 2.5415), which is approximately 1.74514.
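As a quick check of the arithmetic, here is a minimal sketch of that single 1994-to-1993 step. The variable names are placeholders introduced only for illustration; the numbers are copied from the printed tables.

```python
# Values copied from the printed tables; the variable names are illustrative only.
loc1_1994 = 2.5415            # Loc1 at 1994-12-31
p_change_1_1993 = -0.313341   # P Change_1 at 1993-12-31

# Loc1 1993 = Loc1 1994 + (Loc1 1994 * P Change_1 1993)
loc1_1993 = loc1_1994 + loc1_1994 * p_change_1_1993

print(round(loc1_1993, 5))  # approximately 1.74514
```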
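This value of 1.74514 would replace the NaN for 1993, and I then want to use that calculated value to get the 1992 value. In other words, the next step is Loc1 1992 = Loc1 1993 + (Loc1 1993 * P Change_1 1992). I want to repeat this row by row until the earliest value in the time series has been filled in.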
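To make the intended recursion concrete, here is a rough sketch of it, not necessarily the best approach. It assumes columns are paired by position (Loc1 with P Change_1, Loc2 with P Change_2) and hard-codes the percent changes from the df1 printout so that it runs on its own. The helper names `fill_backward` and `fill_backward_cumprod` are hypothetical. The second helper relies on the fact that each value is the 1994 value multiplied by the product of (1 + change) over all later rows, so a reversed cumulative product should give the same result without a Python loop.

```python
import numpy as np
import pandas as pd

# Year-end dates and percent changes copied from the df1 printout.
dates = pd.to_datetime([f"{year}-12-31" for year in range(1984, 1995)])
df1 = pd.DataFrame({
    "P Change_1": [-0.172080, -0.328773, -0.160834, -0.457924, 0.017374, -0.349052,
                   0.034711, -0.415445, -0.206852, -0.313341, -0.474234],
    "P Change_2": [-0.231574, -0.247018, -0.099079, 0.000266, -0.501916, -0.438816,
                   0.036164, -0.415372, -0.413107, -0.181030, -0.118058],
}, index=dates)
df2 = pd.DataFrame({
    "Loc1": [np.nan] * 10 + [2.5415],
    "Loc2": [np.nan] * 10 + [3.2126],
}, index=dates)


def fill_backward(changes, levels):
    """Hypothetical helper: fill `levels` upward from its last row.

    Columns are paired by position (Loc1 with P Change_1, and so on).
    """
    chg = changes.to_numpy(dtype=float)
    vals = levels.to_numpy(dtype=float, copy=True)
    # Walk from the second-to-last row up to the first row.
    for i in range(len(vals) - 2, -1, -1):
        # value[t] = value[t+1] + value[t+1] * change[t]
        vals[i] = vals[i + 1] + vals[i + 1] * chg[i]
    return pd.DataFrame(vals, index=levels.index, columns=levels.columns)


def fill_backward_cumprod(changes, levels):
    """Hypothetical helper: same recursion via a reversed cumulative product."""
    factors = 1.0 + changes.to_numpy(dtype=float)
    factors[-1] = 1.0  # the last row is the known starting value, so it is not scaled
    growth = np.cumprod(factors[::-1], axis=0)[::-1]
    vals = growth * levels.to_numpy(dtype=float)[-1]
    return pd.DataFrame(vals, index=levels.index, columns=levels.columns)


filled = fill_backward(df1, df2)
print(filled)
```

Both helpers return a new DataFrame and leave df2 untouched, so the result can either be assigned back to df2 or renamed to Calc1 and Calc2 columns.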
What is the best way to implement this row-by-row equation? I hope this makes sense, and any help is greatly appreciated!