Filling datetime gaps by resampling with previous values (MultiIndex)

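I am trying to fix a dataset that does not have a row for every date. The idea is simply to fill the gaps between the missing dates and complete the other columns with the previous values.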

              ds     SKU  Estoque  leadtime
0     2018-01-02  504777       45        11
1     2018-01-04  504777       42        11
2     2018-01-05  504777       41        11
3     2018-01-09  504777       40        11
4     2018-01-12  504777       37        11
5     2018-01-13  504777       36        11
6     2018-01-15  504777       35        11
...          ...     ...      ...       ...
6629  2018-08-14  857122       11        10
6630  2018-08-15  857122       10        10
6631  2018-08-16  857122        9        10
6632  2018-08-17  857122        7        10
6633  2018-08-23  857122       14        10
6634  2018-08-24  857122       13        10

I have already tried:

df.set_index('ds', inplace=True)
df = df.resample("D")

or

df.resample("D", how='first', fill_method='ffill')

But I just get this:

DatetimeIndexResampler [freq=<Day>, axis=0, closed=left, label=left, convention=start, base=0]

When I try:

(df.groupby('SKU')
   .resample('D')
   .last()
   .reset_index()
   .set_index('ds'))

I get this error:

ValueError: cannot insert SKU, already exists

This is the result I am trying to get:

             ds     SKU  Estoque  leadtime
0    2018-01-02  504777       45        11
1    2018-01-03  504777       45        11
2    2018-01-04  504777       42        11
3    2018-01-05  504777       41        11
4    2018-01-06  504777       41        11
5    2018-01-07  504777       41        11
6    2018-01-08  504777       41        11
7    2018-01-09  504777       40        11
...         ...     ...      ...       ...

PS: If I set the date as the index, I get duplicate index values. I need to isolate each product first (group by).

温温酱

1 Answer

慕勒3428872

In your case, you may need to chain the resample with apply:

# df.set_index('ds', inplace=True)  # run this first if ds is not already the index
df.groupby('SKU').apply(lambda x: x.resample('D').ffill()).reset_index(level=0, drop=True)
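For anyone who wants to try the per-SKU idea end to end, here is a minimal, self-contained sketch. It makes some assumptions: the sample frame is a small excerpt in the shape of the question's data, the name filled is only illustrative, and it uses an explicit loop with pd.concat in place of groupby(...).apply(...). The loop is a swapped-in variant, not the one-liner above; it is chosen because the way apply treats the grouping column has changed in recent pandas releases, so the loop may behave more predictably across versions.

```python
import pandas as pd

# Assumed sample: a few rows per SKU, in the shape of the question's data.
df = pd.DataFrame({
    "ds": pd.to_datetime([
        "2018-01-02", "2018-01-04", "2018-01-05", "2018-01-09",
        "2018-08-17", "2018-08-23", "2018-08-24",
    ]),
    "SKU": [504777, 504777, 504777, 504777, 857122, 857122, 857122],
    "Estoque": [45, 42, 41, 40, 7, 14, 13],
    "leadtime": [11, 11, 11, 11, 10, 10, 10],
})

# Handle each SKU on its own so the date index is unique within each piece,
# upsample to daily frequency, and forward-fill every column from the
# previous existing row.
parts = [
    df[df["SKU"] == sku].set_index("ds").resample("D").ffill()
    for sku in df["SKU"].unique()
]

# Stack the per-SKU pieces and turn the date index back into a column.
filled = pd.concat(parts).reset_index()

print(filled)
```

With this sample, the rows for SKU 504777 run daily from 2018-01-02 to 2018-01-09, and Estoque reads 45, 45, 42, 41, 41, 41, 41, 40, which matches the result asked for in the question. Because each SKU is resampled separately, no dates are generated between the last date of one SKU and the first date of the next.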
