How do I find the distance from each row to the nearest row that satisfies a condition?

import datetime

import pandas as pd

df = pd.DataFrame({'date': {0: datetime.date(2020, 8, 15),
  1: datetime.date(2020, 8, 16),
  2: datetime.date(2020, 8, 16),
  3: datetime.date(2020, 8, 17),
  4: datetime.date(2020, 8, 17),
  5: datetime.date(2020, 8, 18),
  6: datetime.date(2020, 8, 19),
  7: datetime.date(2020, 8, 19)},
 'sign_change': {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 0, 6: 1, 7: 1},
 'distance (desired_output)': {0: 2, 1: 1, 2: 1, 3: 0, 4: 0, 5: 1, 6: 0, 7: 0}})



         date  sign_change  distance (desired_output)
0  2020-08-15            0                          2
1  2020-08-16            0                          1
2  2020-08-16            0                          1
3  2020-08-17            1                          0
4  2020-08-17            1                          0
5  2020-08-18            0                          1
6  2020-08-19            1                          0
7  2020-08-19            1                          0

For each row, I want to find the distance (in days) to the nearest row where sign_change == 1. I have entered the desired output by hand in the dataframe above.


泛舟湖上清波郎朗
2 Answers

回首忆惘然

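Let's try broadcasting:

import numpy as np

# True for the rows that still need a distance (sign_change != 1)
s = df.sign_change != 1
# Pairwise |date difference| between the sign_change == 1 rows (axis 0)
# and the remaining rows (axis 1); min(0) keeps the smallest gap per remaining row
offset = (np.abs(df.loc[s,'date'].values[None,:] - df.loc[~s,['date']].values).min(0)
            / pd.to_timedelta('1D')
         )
df['distance'] = 0
df.loc[s,'distance'] = offset

Output:

         date  sign_change  distance (desired_output)  distance
0  2020-08-15            0                          2       2.0
1  2020-08-16            0                          1       1.0
2  2020-08-16            0                          1       1.0
3  2020-08-17            1                          0       0.0
4  2020-08-17            1                          0       0.0
5  2020-08-18            0                          1       1.0
6  2020-08-19            1                          0       0.0
7  2020-08-19            1                          0       0.0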
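To make the broadcast step easier to follow, here is a minimal, self-contained sketch on the question's data. The helper name `nearest_flag_distance` and its intermediate variables are illustrative assumptions, not part of the answer above. It also converts the column with `pd.to_datetime` first and assumes at least one row has sign_change == 1.

```python
import datetime

import numpy as np
import pandas as pd


def nearest_flag_distance(dates: pd.Series, flag: pd.Series) -> pd.Series:
    """Days from each row to the nearest row where flag == 1 (hypothetical helper)."""
    dates = pd.to_datetime(dates)
    flagged = dates[flag == 1].to_numpy()   # shape (k,): dates of the flagged rows
    every_row = dates.to_numpy()[:, None]   # shape (n, 1): one row per input row
    # Broadcasting (n, 1) against (k,) gives an (n, k) matrix of pairwise gaps.
    gaps = np.abs(every_row - flagged)
    # Smallest gap per row, converted from timedelta64 to a number of days.
    days = gaps.min(axis=1) / np.timedelta64(1, "D")
    return pd.Series(days.astype(int), index=dates.index)


df = pd.DataFrame({
    "date": [datetime.date(2020, 8, d) for d in (15, 16, 16, 17, 17, 18, 19, 19)],
    "sign_change": [0, 0, 0, 1, 1, 0, 1, 1],
})
df["distance"] = nearest_flag_distance(df["date"], df["sign_change"])
```

Because every row is compared against every flagged row, the intermediate matrix takes n x k memory, so this form suits small to medium frames best.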

繁星coding

You can use where, bfill() and ffill(). Essentially, .where keeps the date in rows where sign_change is 1 and returns NaN everywhere else. From there you can bfill, which fills each gap backward from the next sign_change == 1 date, or ffill, which fills each gap forward from the previous one. Then take the difference between that filled date and the row's own date. Finally, .fillna(0) handles the last value in the dataframe, which has no later date to fill from.

Solution #1 - only looks forward to the next date (for the overall nearest date, see Solution #2):

df['date'] = pd.to_datetime(df['date'])  # .dt.days below needs a datetime64 column
df['distance (desired_output)'] = ((df['date'].where(df['sign_change'] == 1).bfill()
                                    - df['date']).dt.days).fillna(0)
df

Out[1]:
        date  sign_change  distance (desired_output)
0 2020-08-15            0                        2.0
1 2020-08-16            0                        1.0
2 2020-08-16            0                        1.0
3 2020-08-17            1                        0.0
4 2020-08-17            1                        0.0
5 2020-08-18            0                        1.0
6 2020-08-19            1                        0.0
7 2020-08-19            0                        0.0

Solution #2 - compares the ffill() and bfill() series and returns the smallest number of days to the closest date, whether it falls before or after the row:

import datetime
import numpy as np
import pandas as pd

df = pd.DataFrame({'date': {0: datetime.date(2020, 8, 15),
  1: datetime.date(2020, 8, 16),
  2: datetime.date(2020, 8, 16),
  3: datetime.date(2020, 8, 17),
  4: datetime.date(2020, 8, 17),
  5: datetime.date(2020, 8, 18),
  6: datetime.date(2020, 8, 19),
  7: datetime.date(2020, 8, 19),
  8: datetime.date(2020, 8, 20),
  9: datetime.date(2020, 8, 21)},
 'sign_change': {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 0, 6: 1, 7: 1, 8: 0, 9: 0},
 'distance (desired_output)': {0: 2, 1: 1, 2: 1, 3: 0, 4: 0, 5: 1, 6: 0, 7: 0}})
df['date'] = pd.to_datetime(df['date'])
s = (df['date'].where(df['sign_change'] == 1))
b = (s.bfill() - df['date']).dt.days        # days to the next sign_change == 1 row
f = (s.ffill() - df['date']).dt.days.abs()  # days since the previous sign_change == 1 row
# Take b when it is the smaller gap or when no earlier match exists (f is NaN); otherwise take f
df['distance (desired_output)'] = np.where((b <= f) | (f.isnull()), b, f)
df

Out[2]:
        date  sign_change  distance (desired_output)
0 2020-08-15            0                        2.0
1 2020-08-16            0                        1.0
2 2020-08-16            0                        1.0
3 2020-08-17            1                        0.0
4 2020-08-17            1                        0.0
5 2020-08-18            0                        1.0
6 2020-08-19            1                        0.0
7 2020-08-19            1                        0.0
8 2020-08-20            0                        1.0
9 2020-08-21            0                        2.0
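As a compact variant of the where / bfill / ffill idea, the two directions can also be combined with `np.fmin`, which ignores NaN when the other value is present. The sketch below is only an illustration under a few assumptions: the function name `days_to_nearest_flag` is made up, the column names follow the question, and the rows are assumed to be sorted by date.

```python
import datetime

import numpy as np
import pandas as pd


def days_to_nearest_flag(df: pd.DataFrame) -> pd.Series:
    """Days to the nearest sign_change == 1 row, looking both ways (hypothetical helper)."""
    dates = pd.to_datetime(df["date"])
    flagged = dates.where(df["sign_change"] == 1)  # NaT where sign_change != 1
    to_next = (flagged.bfill() - dates).dt.days    # NaN after the last flagged row
    to_prev = (dates - flagged.ffill()).dt.days    # NaN before the first flagged row
    # fmin returns the non-NaN value when only one side exists, else the smaller one.
    return np.fmin(to_next, to_prev)


df = pd.DataFrame({
    "date": [datetime.date(2020, 8, d) for d in (15, 16, 16, 17, 17, 18, 19, 19, 20, 21)],
    "sign_change": [0, 0, 0, 1, 1, 0, 1, 1, 0, 0],
})
df["distance"] = days_to_nearest_flag(df)
```

If no row has sign_change == 1, both directions stay NaN and so does the result, which leaves it to the caller to decide how to fill it.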
