How can I process a DataFrame without a for loop?

My DataFrame is:


            Date        Open        High         Low       Close   Adj Close     Volume
5932  2016-08-18  218.339996  218.899994  218.210007  218.860001  207.483215   52989300
5933  2016-08-19  218.309998  218.750000  217.740005  218.539993  207.179825   75443000
5934  2016-08-22  218.259995  218.800003  217.830002  218.529999  207.170364   61368800
5935  2016-08-23  219.250000  219.600006  218.899994  218.970001  207.587479   53399200
5936  2016-08-24  218.800003  218.910004  217.360001  217.850006  206.525711   71728900
5937  2016-08-25  217.399994  218.190002  217.220001  217.699997  206.383514   69224800
5938  2016-08-26  217.919998  219.119995  216.250000  217.289993  205.994827  122506300
5939  2016-08-29  217.440002  218.669998  217.399994  218.360001  207.009201   68606100
5940  2016-08-30  218.259995  218.589996  217.350006  218.000000  206.667908   58114500
5941  2016-08-31  217.610001  217.750000  216.470001  217.380005  206.080124   85269500
5942  2016-09-01  217.369995  217.729996  216.029999  217.389999  206.089645   97844200
5943  2016-09-02  218.389999  218.869995  217.699997  218.369995  207.018692   79293900
5944  2016-09-06  218.699997  219.119995  217.860001  219.029999  207.644394   56702100


Currently, I am doing this:


        # Most recent Open value (get_values() was removed in pandas 1.0)
        state['open_price'] = lookback.Open.iloc[-1]

        # Copy the first LOOKBACK_DAYS rows into the state, one key per row and column
        for ind, row in lookback.reset_index().iterrows():
            if ind < self.LOOKBACK_DAYS:
                state['close_' + str(self.LOOKBACK_DAYS - ind)] = row.Close
                state['open_' + str(self.LOOKBACK_DAYS - ind)] = row.Open
                state['volume_' + str(self.LOOKBACK_DAYS - ind)] = row.Volume

But this is very slow. Is there a more vectorized way to do this?


弑天下
1 Answer

狐的传说

One way is to cheat and use the underlying array via .values. I'll also include the steps I used to create an equivalent example:

    import pandas as pd
    from itertools import product

    initial = ['cash', 'num_shares', 'somethingsomething']
    initial_series = pd.Series([1, 2, 3], index=initial)
    print(initial_series)

    # Output:
    cash                  1
    num_shares            2
    somethingsomething    3
    dtype: int64

OK, these are just some mocked values for the start of the output Series.

    df = pd.read_clipboard(sep=r'\s\s+')  # pure magic
    print(df.head())

    # Output:
                Date        Open    ...      Adj Close    Volume
    5932  2016-08-18  218.339996    ...     207.483215  52989300
    5933  2016-08-19  218.309998    ...     207.179825  75443000
    5934  2016-08-22  218.259995    ...     207.170364  61368800
    5935  2016-08-23  219.250000    ...     207.587479  53399200
    5936  2016-08-24  218.800003    ...     206.525711  71728900

    [5 rows x 7 columns]

df is now essentially the DataFrame you provided in your example. The clipboard trick, which I picked up elsewhere, is great for pandas MCVEs.

    to_select = ['Close', 'Open', 'Volume']
    SOMELOOKBACK = 6000  # mocked
    final_index = [f"{name}_{index}" for index, name in product((SOMELOOKBACK - df.index), to_select)]

This prepares the index, which looks like this:

    ['Close_68', 'Open_68', 'Volume_68', 'Close_67', 'Open_67', 'Volume_67', ...]

Now just select the relevant columns from the DataFrame, use .values to get a 2D array, and flatten it to get the final Series.

    final_series = pd.Series(df[to_select].values.flatten(), index=final_index)
    # Series.append was removed in pandas 2.0; pd.concat gives the same result
    result = pd.concat([initial_series, final_series])
    print(result)

    # Output:
    cash                  1.000000e+00
    num_shares            2.000000e+00
    somethingsomething    3.000000e+00
    Close_68              2.188600e+02
    Open_68               2.183400e+02
    Volume_68             5.298930e+07
    Close_67              2.185400e+02
    Open_67               2.183100e+02
    Volume_67             7.544300e+07
    Close_66              2.185300e+02
    Open_66               2.182600e+02
    Volume_66             6.136880e+07
    ...
    Close_48              2.133700e+02
    Open_48               2.134800e+02
    Volume_48             1.552364e+08
    Length: 66, dtype: float64
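For comparison with the loop in the question, here is a minimal sketch of the same flatten idea, using the question's key naming (`close_N`, `open_N`, `volume_N`, counted by row position rather than by index label as in the answer above). The function names `build_state_loop` and `build_state_vectorized`, the value `LOOKBACK_DAYS = 3`, and the four-row sample frame are assumptions made for illustration; they are not from the original post.

```python
import numpy as np
import pandas as pd

LOOKBACK_DAYS = 3  # assumed value, for illustration only

# Last four rows of the question's data, trimmed to the columns that are used
lookback = pd.DataFrame(
    {
        "Date": ["2016-08-31", "2016-09-01", "2016-09-02", "2016-09-06"],
        "Open": [217.610001, 217.369995, 218.389999, 218.699997],
        "Close": [217.380005, 217.389999, 218.369995, 219.029999],
        "Volume": [85269500, 97844200, 79293900, 56702100],
    },
    index=[5941, 5942, 5943, 5944],
)


def build_state_loop(lookback, lookback_days):
    """Row-by-row version, mirroring the loop from the question."""
    state = {"open_price": lookback.Open.iloc[-1]}
    for ind, row in lookback.reset_index().iterrows():
        if ind < lookback_days:
            state["close_" + str(lookback_days - ind)] = row.Close
            state["open_" + str(lookback_days - ind)] = row.Open
            state["volume_" + str(lookback_days - ind)] = row.Volume
    return state


def build_state_vectorized(lookback, lookback_days):
    """Same keys and values, built from one flattened 2D array."""
    cols = ["Close", "Open", "Volume"]
    window = lookback[cols].iloc[:lookback_days]
    # Lag label for each row: lookback_days, lookback_days - 1, ...
    lags = lookback_days - np.arange(len(window))
    # Row-major key order matches the row-major order of ravel()
    keys = [f"{col.lower()}_{lag}" for lag in lags for col in cols]
    values = window.to_numpy().ravel()
    state = {"open_price": lookback["Open"].iloc[-1]}
    state.update(zip(keys, values))
    return state


loop_state = build_state_loop(lookback, LOOKBACK_DAYS)
fast_state = build_state_vectorized(lookback, LOOKBACK_DAYS)
```

One difference to keep in mind: flattening mixed float and integer columns yields a single float64 array, so the volume values come back as floats in the vectorized version.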