使用 timedelta 将 df1 中的每一行的 pandas DataFrame

我有两个熊猫数据框。我想将所有行保留在df2等于Typein TypeANDdf1之间Date(Date- df11 天或 + 1 天)。我怎样才能做到这一点?


df1


   IBSN  Type          Date

0     1     X    2014-08-17

1     1     Y    2019-09-22

df2


   IBSN  Type          Date

0     2     X    2014-08-16

1     2     D    2019-09-22

2     9     X    2014-08-18

3     3     H    2019-09-22

4     3     Y    2019-09-23

5     5     G    2019-09-22

资源


   IBSN  Type          Date

0     2     X    2014-08-16 <-- keep because Type = df1[0]['Type'] AND Date = df1[0]['Date'] - 1

1     9     X    2014-08-18 <-- keep because Type = df1[0]['Type'] AND Date = df1[0]['Date'] + 1

2     3     Y    2019-09-23 <-- keep because Type = df1[1]['Type'] AND Date = df1[1]['Date'] + 1



喵喔喔
浏览 148回答 2
2回答

月关宝盒

这应该这样做:import pandas as pdfrom datetime import timedelta# create dummy datadf1 = pd.DataFrame([[1, 'X', '2014-08-17'], [1, 'Y', '2019-09-22']], columns=['IBSN', 'Type', 'Date'])df1['Date'] = pd.to_datetime(df1['Date'])&nbsp; # might not be necessary if your Date column already contain datetime objectsdf2 = pd.DataFrame([[2, 'X', '2014-08-16'], [2, 'D', '2019-09-22'], [9, 'X', '2014-08-18'], [3, 'H', '2019-09-22'], [3, 'Y', '2014-09-23'], [5, 'G', '2019-09-22']], columns=['IBSN', 'Type', 'Date'])df2['Date'] = pd.to_datetime(df2['Date'])&nbsp; # might not be necessary if your Date column already contain datetime objects# add date boundaries to the first dataframedf1['Date_from'] = df1['Date'].apply(lambda x: x - timedelta(days=1))df1['Date_to'] = df1['Date'].apply(lambda x: x + timedelta(days=1))# merge the date boundaries to df2 on 'Type'. Filter rows where date is between# data_from and date_to (inclusive). Drop 'date_from' and 'date_to' columnsdf2 = df2.merge(df1.loc[:, ['Type', 'Date_from', 'Date_to']], on='Type', how='left')df2[(df2['Date'] >= df2['Date_from']) & (df2['Date'] <= df2['Date_to'])].\&nbsp; &nbsp; drop(['Date_from', 'Date_to'], axis=1)请注意,根据您的逻辑,df2(3 Y 2014-09-23)中的第 4 行不应保留,因为其日期(2014)不在 df1(2019 年)的给定日期之间。

白猪掌柜的

假设Date两个数据框中的列已经在 dtype 中datetime。我会构造IntervalIndex分配给df1. to的Map列Type。最后检查相等性以创建要切片的掩码df1df2iix = pd.IntervalIndex.from_arrays(df1.Date + pd.Timedelta(days=-1),&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;df1.Date + pd.Timedelta(days=1), closed='both')df1 = df1.set_index(iix)s = df2['Date'].map(df1.Type)df_final = df2[df2.Type == s]Out[1131]:&nbsp; &nbsp;IBSN Type&nbsp; &nbsp; &nbsp; &nbsp;Date0&nbsp; &nbsp; &nbsp;2&nbsp; &nbsp; X 2014-08-162&nbsp; &nbsp; &nbsp;9&nbsp; &nbsp; X 2014-08-184&nbsp; &nbsp; &nbsp;3&nbsp; &nbsp; Y 2019-09-23
打开App,查看更多内容
随时随地看视频慕课网APP

相关分类

Python