猿问

对同一天求和求平均值

我有我在 excel 中按天数排序的数据,我现在想要做的是获得每天的每日收益总和。这里的问题是我这些天有多个条目。所以我可能只有一个 2018-12-05 的每日返回条目,但 2018-12-06 的 5 个条目。我希望我只获得 2018-12-06 的一个条目,其中包含累积每日回报(因此所有累积回报加在一起)和平均每日回报(因此累积回报除以当天的条目数量。对于2018-12-06 这将除以 5)。


所以我现在拥有的数据是这样的:


            Dates  Last.Price  Daily.Return

19788  2018-11-23       75.18     -0.001199

19789  2018-11-23      129.04     -0.026490

19790  2018-11-26       77.84     -0.035382

19791  2018-11-26      127.98      0.008215

19792  2018-11-27       79.50     -0.021326

19793  2018-11-27      122.68      0.041413

19794  2018-11-28       80.27     -0.009686

19795  2018-11-29       80.00      0.003364

最终的数据框应该是这样的


              Dates  Last.Price  Cum.Return   Average.Return

19788  2018-11-23       75.18     -0.027689    -0.0138445

19790  2018-11-26       77.84     -0.027167    -0.0135835

19792  2018-11-27       79.50      0.020087     0.0100435

19794  2018-11-28       80.27     -0.009686    -0.009686

19795  2018-11-29       80.00      0.003364     0.003364

到目前为止,我有以下代码来总结每日回报。但是它的总和不正确。而且我不知道如何实现平均每日回报。


df = pd.read_csv('/Python Test/SP500Acquirer.csv')


def sum_from_days_prior(row, df):

    '''returns sum of values in row month, 

    from all dates in df prior to row date'''


    day = pd.to_datetime(row).day


    all_dates_prior = df[df.index <= row]

    same_day = all_dates_prior[all_dates_prior.index.day == day]


    return same_day["Daily.Return"].sum()



df.set_index('Dates', inplace = True)

df.index = pd.to_datetime(df.index)

df["Dates"] = df.index

df.sort_index(inplace = True)


df["Day"] = df["Dates"].apply(lambda row: sum_from_days_prior (row, df))

df.drop("Dates", axis = 1, inplace = True)


print(df.tail(20))

如前所述,此代码未正确总结每日收益。而且我不知道如何获得这些天的平均回报。


有只小跳蛙
浏览 159回答 1
1回答

天涯尽头无女友

我认为您需要agg使用函数进行聚合first,sum并且mean:因为 columnDaily.Return是由列表中定义的多个函数聚合的,所以MultiIndex在输出中获取。因此有必要将其展平 - 最简单的是map与join.df = df.groupby('Dates').agg({'Last.Price':'first', 'Daily.Return':['mean','sum']})print (df)&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;Last.Price Daily.Return&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; first&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;mean&nbsp; &nbsp; &nbsp; &nbsp;sumDates&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;2018-11-23&nbsp; &nbsp; &nbsp; 75.18&nbsp; &nbsp; -0.013844 -0.0276892018-11-26&nbsp; &nbsp; &nbsp; 77.84&nbsp; &nbsp; -0.013583 -0.0271672018-11-27&nbsp; &nbsp; &nbsp; 79.50&nbsp; &nbsp; &nbsp;0.010044&nbsp; 0.0200872018-11-28&nbsp; &nbsp; &nbsp; 80.27&nbsp; &nbsp; -0.009686 -0.0096862018-11-29&nbsp; &nbsp; &nbsp; 80.00&nbsp; &nbsp; &nbsp;0.003364&nbsp; 0.003364print (df.columns)MultiIndex(levels=[['Last.Price', 'Daily.Return'], ['first', 'mean', 'sum']],&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;labels=[[0, 1, 1], [0, 1, 2]])df.columns = df.columns.map('_'.join)print (df)&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;Last.Price_first&nbsp; Daily.Return_mean&nbsp; Daily.Return_sumDates&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp;2018-11-23&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;75.18&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; -0.013844&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;-0.0276892018-11-26&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;77.84&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; -0.013583&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;-0.0271672018-11-27&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;79.50&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;0.010044&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 0.0200872018-11-28&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;80.27&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; -0.009686&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;-0.0096862018-11-29&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;80.00&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;0.003364&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 0.003364最后一rename栏:d = {'Last.Price_first':'Last.Price',&nbsp; &nbsp; &nbsp;'Daily.Return_sum': 'Cum.Return',&nbsp; &nbsp; &nbsp;'Daily.Return_mean': 'Average.Return'}df = df.rename(columns=d)print (df)&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Last.Price&nbsp; Average.Return&nbsp; Cum.ReturnDates&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;2018-11-23&nbsp; &nbsp; &nbsp; &nbsp;75.18&nbsp; &nbsp; &nbsp; &nbsp;-0.013844&nbsp; &nbsp;-0.0276892018-11-26&nbsp; &nbsp; &nbsp; &nbsp;77.84&nbsp; &nbsp; &nbsp; &nbsp;-0.013583&nbsp; &nbsp;-0.0271672018-11-27&nbsp; &nbsp; &nbsp; &nbsp;79.50&nbsp; &nbsp; &nbsp; &nbsp; 0.010044&nbsp; &nbsp; 0.0200872018-11-28&nbsp; &nbsp; &nbsp; &nbsp;80.27&nbsp; &nbsp; &nbsp; &nbsp;-0.009686&nbsp; &nbsp;-0.0096862018-11-29&nbsp; &nbsp; &nbsp; &nbsp;80.00&nbsp; &nbsp; &nbsp; &nbsp; 0.003364&nbsp; &nbsp; 0.003364
随时随地看视频慕课网APP

相关分类

Python
我要回答