猿问

从日期范围取消嵌套字段创建 DataFrame 行

我有以下熊猫数据框:


ID       start_date        end_date        codes         type

1        2019-01-01       2019-01-05      [x, y]          A

2        2019-01-01       2019-01-05      [x, y, z]       B

我想要做的是为每个代码生成与两个日期之间的范围相同的行数。输出将是这样的:


ID          date              codes        type

1        2019-01-01            x            A

1        2019-01-02            x            A

1        2019-01-03            x            A

1        2019-01-04            x            A

1        2019-01-05            x            A

1        2019-01-01            y            A

1        2019-01-02            y            A

1        2019-01-03            y            A

1        2019-01-04            y            A

1        2019-01-05            y            A

2        2019-01-01            x            B

2        2019-01-02            x            B

.....

非常感谢!


杨魅力
浏览 127回答 1
1回答

慕妹3146593

熊猫 > 0.25.0#if necessary#df['start_date']= pd.to_datetime(df['start_date'])#df['end_date']= pd.to_datetime(df['end_date'])new_df = (df.melt(['ID','type','codes'],value_name = 'date')&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; .set_index('date')&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; .groupby(['ID','type'])&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; .resample('D').ffill()&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; .drop(columns = 'variable')&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; .explode('codes')&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; .reset_index(level=[0,1],drop=True)&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; .sort_values(['ID','type','codes'])&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; .reset_index()&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; .reindex(columns = ['ID','date','codes','type'])&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;)print(new_df)熊猫 < 0.25.0#if necessary#df['start_date']= pd.to_datetime(df['start_date'])#df['end_date']= pd.to_datetime(df['end_date'])new_df = (df.melt(['ID','type','codes'],value_name = 'date')&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; .set_index('date')&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; .groupby(['ID','type'])&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; .resample('D').ffill()&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; .drop(columns = 'variable'))new_df = (new_df.reindex(new_df.index.repeat(new_df.codes.str.len()))&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; .assign(codes=np.concatenate(new_df.codes.values))&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; .reset_index(level=[0,1],drop=True)&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; .sort_values(['ID','type','codes'])&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; .reset_index()&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; .reindex(columns = ['ID','date','codes','type']))print(new_df)输出&nbsp; &nbsp; ID&nbsp; &nbsp; &nbsp; &nbsp;date codes type0&nbsp; &nbsp; 1 2019-01-01&nbsp; &nbsp; &nbsp;x&nbsp; &nbsp; A1&nbsp; &nbsp; 1 2019-01-02&nbsp; &nbsp; &nbsp;x&nbsp; &nbsp; A2&nbsp; &nbsp; 1 2019-01-03&nbsp; &nbsp; &nbsp;x&nbsp; &nbsp; A3&nbsp; &nbsp; 1 2019-01-04&nbsp; &nbsp; &nbsp;x&nbsp; &nbsp; A4&nbsp; &nbsp; 1 2019-01-05&nbsp; &nbsp; &nbsp;x&nbsp; &nbsp; A5&nbsp; &nbsp; 1 2019-01-01&nbsp; &nbsp; &nbsp;y&nbsp; &nbsp; A6&nbsp; &nbsp; 1 2019-01-02&nbsp; &nbsp; &nbsp;y&nbsp; &nbsp; A7&nbsp; &nbsp; 1 2019-01-03&nbsp; &nbsp; &nbsp;y&nbsp; &nbsp; A8&nbsp; &nbsp; 1 2019-01-04&nbsp; &nbsp; &nbsp;y&nbsp; &nbsp; A9&nbsp; &nbsp; 1 2019-01-05&nbsp; &nbsp; &nbsp;y&nbsp; &nbsp; A10&nbsp; &nbsp;2 2019-01-01&nbsp; &nbsp; &nbsp;x&nbsp; &nbsp; B11&nbsp; &nbsp;2 2019-01-02&nbsp; &nbsp; &nbsp;x&nbsp; &nbsp; B12&nbsp; &nbsp;2 2019-01-03&nbsp; &nbsp; &nbsp;x&nbsp; &nbsp; B13&nbsp; &nbsp;2 2019-01-04&nbsp; &nbsp; &nbsp;x&nbsp; &nbsp; B14&nbsp; &nbsp;2 2019-01-05&nbsp; &nbsp; &nbsp;x&nbsp; &nbsp; B15&nbsp; &nbsp;2 2019-01-01&nbsp; &nbsp; &nbsp;y&nbsp; &nbsp; B16&nbsp; &nbsp;2 2019-01-02&nbsp; &nbsp; &nbsp;y&nbsp; &nbsp; B17&nbsp; &nbsp;2 2019-01-03&nbsp; &nbsp; &nbsp;y&nbsp; &nbsp; B18&nbsp; &nbsp;2 2019-01-04&nbsp; &nbsp; &nbsp;y&nbsp; &nbsp; B19&nbsp; &nbsp;2 2019-01-05&nbsp; &nbsp; &nbsp;y&nbsp; &nbsp; B20&nbsp; &nbsp;2 2019-01-01&nbsp; &nbsp; &nbsp;z&nbsp; &nbsp; B21&nbsp; &nbsp;2 2019-01-02&nbsp; &nbsp; &nbsp;z&nbsp; &nbsp; B22&nbsp; &nbsp;2 2019-01-03&nbsp; &nbsp; &nbsp;z&nbsp; &nbsp; B23&nbsp; &nbsp;2 2019-01-04&nbsp; &nbsp; &nbsp;z&nbsp; &nbsp; B24&nbsp; &nbsp;2 2019-01-05&nbsp; &nbsp; &nbsp;z&nbsp; &nbsp; B
随时随地看视频慕课网APP

相关分类

Python
我要回答