I have a short DataFrame and want to:
put it on a one-minute schedule,
add rows for the missing minutes (from 09:05 to 09:20),
then resample it every 5 minutes.
    time  cars     flow
0   9:07   737       In
1   9:06    22      Out
2   9:18    42       In
3   9:19    36  Unknown
What I tried:
import pandas as pd

data = {'time': ["9:07", "9:06", "9:18", "9:19"],
        'cars': [737, 22, 42, 36],
        'flow': ["In", "Out", "In", "Unknown"]}
df = pd.DataFrame(data)
# one-minute index covering 09:05 to 09:20
idx = pd.date_range("9:05", "09:20", freq="1min")
idx = idx.rename('time')
df = df.set_index('time')
df.index = pd.to_datetime(df.index)
# fill_value=0 also writes 0 into the 'flow' column of the added rows
df = df.reindex(idx, fill_value=0)
df = df.groupby('flow').resample('5min')['cars'].sum()  # how_many_volume
print(df)
It returns:
flow     time
0        2020-10-21 09:05:00      0
         2020-10-21 09:10:00      0
         2020-10-21 09:15:00      0
         2020-10-21 09:20:00      0
In       2020-10-21 09:05:00    737
         2020-10-21 09:10:00      0
         2020-10-21 09:15:00     42
Out      2020-10-21 09:05:00     22
Unknown  2020-10-21 09:15:00     36
But what I want is:
In       2020-10-21 09:05:00    737
         2020-10-21 09:10:00      0
         2020-10-21 09:15:00     42
         2020-10-21 09:20:00      0
Out      2020-10-21 09:05:00     22
         2020-10-21 09:10:00      0
         2020-10-21 09:15:00      0
         2020-10-21 09:20:00      0
Unknown  2020-10-21 09:05:00      0
         2020-10-21 09:10:00      0
         2020-10-21 09:15:00     36
         2020-10-21 09:20:00      0
Is there a way to achieve this?
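One possible approach, offered as a sketch rather than a definitive answer: aggregate per flow into 5-minute bins first, then reindex the result against the full product of flows and bins so that every combination appears, with 0 where nothing was observed. The fixed date 2020-10-21 is an assumption taken from the printed output above, and the names `bins`, `flows`, `full_index` and `result` are illustrative only.

```python
import pandas as pd

data = {
    "time": ["9:07", "9:06", "9:18", "9:19"],
    "cars": [737, 22, 42, 36],
    "flow": ["In", "Out", "In", "Unknown"],
}
df = pd.DataFrame(data)

# Assumption: pin the timestamps to a fixed date so the result is reproducible.
df["time"] = pd.to_datetime("2020-10-21 " + df["time"], format="%Y-%m-%d %H:%M")

# Every 5-minute bin from 09:05 to 09:20, and every flow label seen in the data.
bins = pd.date_range("2020-10-21 09:05", "2020-10-21 09:20", freq="5min", name="time")
flows = pd.Index(sorted(df["flow"].unique()), name="flow")
full_index = pd.MultiIndex.from_product([flows, bins])

# Sum cars per (flow, 5-minute bin), then fill the missing combinations with 0.
result = (
    df.groupby(["flow", pd.Grouper(key="time", freq="5min")])["cars"]
    .sum()
    .reindex(full_index, fill_value=0)
)
print(result)
```

Reindexing after the aggregation avoids writing 0 into the `flow` column, which is what produced the extra `0` group in the attempt above.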