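How do I access last year's values for a year-over-year comparison? DatetimeIndex

I have a dataframe containing 2 years of dates and the revenue for each date: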


import pandas as pd
import datetime
import itertools
import time
import plotly.graph_objects as go

startDate = datetime.date(2018,1,1)
endDate = datetime.date(2019,12,31)

date_range = pd.date_range(start=startDate, end=endDate)
performance_df = pd.DataFrame(date_range)
performance_df.columns = ['Date']
performance_df = performance_df.set_index(['Date'])
print(performance_df)

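I need to access last year's revenue and put it in a new df column next to the current year's revenue, so that I can compare year-over-year performance. My desired output looks something like this: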


                 Revenue    LY Revenue
Date
2018-01-01  25891.846787           NaN
2018-01-02  25851.615541           NaN
2018-01-03  25037.711900           NaN
2018-01-04  26715.764965           NaN
2018-01-05  23988.356950           NaN
...                  ...           ...
2019-12-27   3539.618050  25744.075480
2019-12-28   3534.997476  27119.697589
2019-12-29   3527.721147  28894.626077
2019-12-30   3489.915430  30321.364425
2019-12-31   3287.543337  29665.558703

How would you achieve this? So far I have only managed to get last year's dates from the index, like this:


performance_df['Last year dates'] = (performance_df['Revenue'].index - pd.Timedelta(days=365))

But I want the corresponding revenue for that date, not just the date itself.


HUWWW
3 Answers

慕田峪4524236

Use Series.shift:

performance_df['LY Revenue'] = performance_df['Revenue'].shift(365)
print(performance_df)

            Revenue  LY Revenue
Date
2018-01-01 25891.8%        nan%
2018-01-02 25851.6%        nan%
2018-01-03 25037.7%        nan%
2018-01-04 26715.8%        nan%
2018-01-05 23988.4%        nan%
...             ...         ...
2019-12-27  3539.6%    25744.1%
2019-12-28  3535.0%    27119.7%
2019-12-29  3527.7%    28894.6%
2019-12-30  3489.9%    30321.4%
2019-12-31  3287.5%    29665.6%

[730 rows x 2 columns]

Here you can see the start of 2019:

print(performance_df[364:366])

            Revenue  LY Revenue
Date
2018-12-31 29665.6%        nan%
2019-01-01 28601.7%    25891.8%
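One assumption is worth making explicit here: shift(365) moves values by 365 rows, not by 365 days. It lines up with the same calendar date only when the index has exactly one row per day, with no gaps and no leap day in between. Below is a minimal sketch using synthetic data; the Revenue values and the dropped date are made up for illustration and are not taken from the question.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data: one row per day, Revenue is just a running counter.
idx = pd.date_range("2018-01-01", "2019-12-31", freq="D", name="Date")
df = pd.DataFrame({"Revenue": np.arange(len(idx), dtype=float)}, index=idx)

# Positional shift: row i receives the value of row i - 365.
df["LY Revenue"] = df["Revenue"].shift(365)

# With a gap-free daily index, 2019-01-01 is paired with 2018-01-01.
print(df.loc[pd.Timestamp("2019-01-01")])

# Drop one (arbitrary) day from 2018 and repeat the positional shift.
gappy = df[["Revenue"]].drop(pd.Timestamp("2018-06-15"))
gappy["LY Revenue"] = gappy["Revenue"].shift(365)

# Every row after the gap is now paired with the wrong date:
# 2019-01-02 receives the revenue of 2018-01-01.
print(gappy.loc[pd.Timestamp("2019-01-02")])
```

If the real data can have missing days, a label-based approach (grouping by month and day, or shifting with freq) is likely the safer choice.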

料青山看我应如是

IIUC, you need this. It only works when your index is a datetime. What we do here is group by month and day using the datetime values; this should work even when the dates span leap years and normal years.

performance_df['LY_Revenue'] = performance_df.groupby([performance_df.index.month, performance_df.index.day])['Revenue'].shift()
print(performance_df)

Output:

                 Revenue    LY_Revenue
Date
2018-01-01  25891.846787           NaN
2018-01-02  25851.615541           NaN
2018-01-03  25037.711900           NaN
2018-01-04  26715.764965           NaN
2018-01-05  23988.356950           NaN
2018-01-06  19029.057983           NaN
2018-01-07  16935.481705           NaN
2018-01-08  22756.072913           NaN
2018-01-09  30385.672829           NaN
2018-01-10  32970.132175           NaN
2018-01-11  31089.167075           NaN
2018-01-12  24262.972415           NaN
2018-01-13  18261.273832           NaN
2018-01-14  18304.754084           NaN
2018-01-15  26297.835665           NaN
2018-01-16  32619.669405           NaN
2018-01-17  35565.262225           NaN
2018-01-18  33229.971940           NaN
2018-01-19  25405.647136           NaN
2018-01-20  19980.890375           NaN
2018-01-21  20487.553161           NaN
2018-01-22  29709.032322           NaN
2018-01-23  38164.493648           NaN
2018-01-24  39050.801147           NaN
2018-01-25  36612.554433           NaN
2018-01-26  28169.782524           NaN
2018-01-27  22086.641618           NaN
2018-01-28  21631.662706           NaN
2018-01-29  28419.945290           NaN
2018-01-30  35644.617364           NaN
...                  ...           ...
2019-12-02   2973.892113  28289.697207
2019-12-03   2674.316864  34737.317368
2019-12-04   2460.238549  40574.910348
2019-12-05   2800.034200  40556.066887
2019-12-06   3195.262337  39927.322507
2019-12-07   3107.693557  34634.748383
2019-12-08   2961.140812  27666.467364
2019-12-09   2340.478044  27774.363832
2019-12-10   1931.373925  33950.846875
2019-12-11   1847.123639  39518.061312
2019-12-12   2179.325333  39587.568701
2019-12-13   2438.035383  38832.660311
2019-12-14   2379.865127  32258.462222
2019-12-15   2255.598970  23343.008315
2019-12-16   1870.926018  23914.895775
2019-12-17   1620.608382  28173.094175
2019-12-18   1511.311007  30306.555827
2019-12-19   1685.967616  28284.310392
2019-12-20   2099.849763  24228.754426
2019-12-21   2430.507619  20495.999365
2019-12-22   2701.975519  19302.936445
2019-12-23   2997.630051  21391.090777
2019-12-24   2977.347247  21072.220129
2019-12-25   2893.576704  19770.681250
2019-12-26   3207.467022  22751.205447
2019-12-27   3539.618050  25744.075480
2019-12-28   3534.997476  27119.697589
2019-12-29   3527.721147  28894.626077
2019-12-30   3489.915430  30321.364425
2019-12-31   3287.543337  29665.558703
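To make the leap-year behaviour concrete, here is a small sketch on synthetic data that does contain a leap day (2019 to 2020; the values are a made-up running counter, not the question's revenue). It compares the month/day grouping with a plain positional shift(365). It assumes the rows are sorted chronologically, since the grouped shift pairs each date with the previous row in the same (month, day) group.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data covering a normal year (2019) and a leap year (2020).
idx = pd.date_range("2019-01-01", "2020-12-31", freq="D", name="Date")
df = pd.DataFrame({"Revenue": np.arange(len(idx), dtype=float)}, index=idx)

# Group rows that share the same (month, day) and shift within each group,
# so each date is paired with the previous occurrence of that calendar day.
df["LY_Revenue"] = df.groupby([df.index.month, df.index.day])["Revenue"].shift()

# Positional shift for comparison: it counts rows, not calendar days.
df["LY_positional"] = df["Revenue"].shift(365)

# Around the leap day the two columns start to disagree.
print(df.loc["2020-02-27":"2020-03-02"])
```

In this sketch 2020-03-01 is paired with 2019-03-01 by the grouped shift, while the positional shift pairs it with 2019-03-02. 2020-02-29 has no earlier 29 February in the data, so its LY_Revenue stays NaN.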

大话西游666

Since your data is time-indexed, you can shift with freq:

performance_df['LY Revenue'] = performance_df.Revenue.shift(freq='365d')

Output:

            Revenue  LY Revenue
Date
2018-01-01 25891.8%        nan%
2018-01-02 25851.6%        nan%
2018-01-03 25037.7%        nan%
2018-01-04 26715.8%        nan%
2018-01-05 23988.4%        nan%
2018-01-06 19029.1%        nan%
2018-01-07 16935.5%        nan%
2018-01-08 22756.1%        nan%
2018-01-09 30385.7%        nan%
...
2019-12-21  2430.5%    20496.0%
2019-12-22  2702.0%    19302.9%
2019-12-23  2997.6%    21391.1%
2019-12-24  2977.3%    21072.2%
2019-12-25  2893.6%    19770.7%
2019-12-26  3207.5%    22751.2%
2019-12-27  3539.6%    25744.1%
2019-12-28  3535.0%    27119.7%
2019-12-29  3527.7%    28894.6%
2019-12-30  3489.9%    30321.4%
2019-12-31  3287.5%    29665.6%

Note, however, that 365D is not necessarily one year in general.
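As an illustration of that caveat, the sketch below contrasts a fixed 365-day shift with a calendar-aware lookup on synthetic data that includes a leap year. The data and column names are made up for the example. The lookup via reindex with pd.DateOffset(years=1) is one possible alternative, not the method used in the answer above.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data spanning a leap year (2020).
idx = pd.date_range("2019-01-01", "2020-12-31", freq="D", name="Date")
df = pd.DataFrame({"Revenue": np.arange(len(idx), dtype=float)}, index=idx)

# Fixed 365-day shift: the index labels move forward by 365 days and the
# column assignment then aligns the shifted series on the original index.
df["LY 365d"] = df["Revenue"].shift(freq="365D")

# Calendar-aware lookup: for each row, compute the date one calendar year
# earlier and fetch the revenue recorded on that date (NaN if not present).
last_year_dates = df.index - pd.DateOffset(years=1)
df["LY calendar"] = df["Revenue"].reindex(last_year_dates).to_numpy()

# After the leap day the 365-day shift is off by one calendar day.
print(df.loc["2020-02-27":"2020-03-02"])
```

In this sketch the 365-day column pairs 2020-03-01 with 2019-03-02, while the calendar lookup pairs it with 2019-03-01. The calendar lookup maps 2020-02-29 to 2019-02-28, because subtracting one year from a leap day falls back to the last day of February.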