I have a strange problem with a datetime column. Assume a dataframe with dates in the column start_date:
>>> df2.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 641 entries, 9 to 1394
Data columns (total 2 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   number      641 non-null    object
 1   start_date  641 non-null    datetime64[ns]
dtypes: datetime64[ns](1), object(1)
memory usage: 15.0+ KB
When I set start_date as the index, the DatetimeIndex appears to be incomplete:
>>> df2 = df2.set_index('start_date')
>>> df2.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 641 entries, 2020-01-01 to 2020-03-01
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   number  641 non-null    object
dtypes: object(1)
memory usage: 10.0+ KB
In fact, the dataframe contains entries beyond that date range:
>>> df3 = df2.copy()
>>> df3 = df3.reset_index()
>>> df3 = df3[pd.to_datetime(df3['start_date']).dt.month > 3]
>>> df3 = df3.set_index('start_date')
>>> df3.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 393 entries, 2020-04-01 to 2020-09-01
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   number  393 non-null    object
dtypes: object(1)
memory usage: 6.1+ KB
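As you can see, there are entries with dates up to 2020-09-01. So why does info() sometimes report only the narrower range 2020-01-01 to 2020-03-01? I cannot detect a gap or anything similar in the start_date index.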
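One thing that may be worth checking is whether the index is simply not sorted. As far as I can tell, the range printed by info() is taken from the first and last index labels, not from the minimum and maximum. Here is a minimal sketch of that check; the four-row frame `demo` and its values are made up for illustration and are not my real data:

```python
import io

import pandas as pd

# Hypothetical data: dates deliberately out of chronological order,
# with the latest date (2020-09-01) in the middle of the frame.
dates = pd.to_datetime(["2020-01-01", "2020-09-01", "2020-04-01", "2020-03-01"])
demo = pd.DataFrame({"number": ["a", "b", "c", "d"]}, index=dates)
demo.index.name = "start_date"

# Capture the info() text and pick out the index summary line.
buf = io.StringIO()
demo.info(buf=buf)
summary_line = next(
    line for line in buf.getvalue().splitlines() if line.startswith("DatetimeIndex")
)
print(summary_line)

# Compare the summary with the true extent of the index.
print(demo.index.is_monotonic_increasing)
print(demo.index.min(), demo.index.max())

# Sorting the index makes the first/last labels match min/max.
sorted_demo = demo.sort_index()
print(sorted_demo.index[0], sorted_demo.index[-1])
```

If `is_monotonic_increasing` is False on the real frame, calling `df2.sort_index()` before `info()` should make the reported range cover all dates.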
拉莫斯之舞