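Here I am adding times to data that only comes with dates. The values are 5 minutes apart, i.e. 288 values per date.
The code works when the input DataFrame covers 1 day (288 rows) or less, but raises an error when the input is longer. Any idea what I'm missing? Thanks in advance.
Relevant part of the code: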
import datetime as dt
import pandas as pd

print("Print df_raw:\n", df_raw)
df = df_raw[:288]
# df = df_raw[:289] # Gives KeyError, see traceback below
print("\nPrint df BEFORE groubpy/apply:\n", df)
df.loc[:,'date'] = pd.to_datetime(df.date)

def f(x):
    # Within one date group, offset each row by 5 minutes times its position
    x['DT']=[val+dt.timedelta(minutes=(pos*5)) for val,pos in zip(x.loc[:,'date'], range(0,len(x.loc[:,'date'])))]
    return x

df = df.groupby('date').apply(f)
df = df.set_index('DT').drop(columns='date')
print("\nPrint df AFTER groubpy/apply:\n", df)
Output (288 rows or fewer, works as expected):
Print df_raw:
date values
0 2015-03-10 556.25
0 2015-03-10 516.993
0 2015-03-10 468.75
0 2015-03-10 432.812
0 2015-03-10 87.1095
.. ... ...
84 2014-12-16 None
84 2014-12-16 None
84 2014-12-16 160.938
84 2014-12-16 145.118
84 2014-12-16 125.977
[24480 rows x 2 columns]
Print df BEFORE groubpy/apply:
date values
0 2015-03-10 556.25
0 2015-03-10 516.993
0 2015-03-10 468.75
0 2015-03-10 432.812
0 2015-03-10 87.1095
.. ... ...
0 2015-03-10 781.446
0 2015-03-10 743.36
0 2015-03-10 708.985
0 2015-03-10 669.922
0 2015-03-10 632.422
[288 rows x 2 columns]
Print df AFTER groubpy/apply:
values
DT
2015-03-10 00:00:00 556.25
2015-03-10 00:05:00 516.993
2015-03-10 00:10:00 468.75
2015-03-10 00:15:00 432.812
2015-03-10 00:20:00 87.1095
... ...
2015-03-10 23:35:00 781.446
2015-03-10 23:40:00 743.36
2015-03-10 23:45:00 708.985
2015-03-10 23:50:00 669.922
2015-03-10 23:55:00 632.422
[288 rows x 1 columns]
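For comparison, here is a minimal sketch of building the same 5-minute timestamps without groupby/apply, by numbering the rows within each date with cumcount and turning that position into a timedelta. The frame df_demo and its six rows are made up for illustration (a tiny stand-in for df_raw with the same kind of repeated index labels). Whether this sidesteps the KeyError on the real data is an assumption, not something verified here.

```python
import pandas as pd

# Hypothetical stand-in for df_raw: two dates, three readings each,
# with repeated (non-unique) index labels like those printed above.
df_demo = pd.DataFrame(
    {
        "date": ["2015-03-10"] * 3 + ["2015-03-09"] * 3,
        "values": [556.25, 516.993, 468.75, 432.812, 87.1095, 781.446],
    },
    index=[0, 0, 0, 1, 1, 1],
)
df_demo["date"] = pd.to_datetime(df_demo["date"])

# Position of each row within its date: 0, 1, 2, ... restarting per date.
pos = df_demo.groupby("date").cumcount()

# Convert positions to 5-minute offsets. Using the raw NumPy values keeps
# the addition purely positional, so the repeated index labels play no role.
offsets = pd.to_timedelta(pos.to_numpy() * 5, unit="min")
df_demo["DT"] = df_demo["date"] + offsets

result = df_demo.set_index("DT").drop(columns="date")
print(result)
```

On the real data the same three steps would be applied to df_raw directly instead of df_demo, with 288 rows per date rather than three.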