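I want to find the last valid index of each column in a first DataFrame and use it to index a second DataFrame.
So, suppose I have the following DataFrame (df1):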
            Site 1  Site 2  Site 3  Site 4  Site 5  Site 6
Date
2000-01-01    13.0    28.0    76.0      45    90.0    58.0
2001-01-01    77.0    75.0    57.0       3    41.0    24.0
2002-01-01    50.0    29.0     2.0      65    48.0    21.0
2003-01-01     7.0    48.0    14.0      63    12.0    66.0
2004-01-01    11.0    90.0    11.0       5    47.0     6.0
2005-01-01    50.0     4.0    31.0       1    40.0    79.0
2006-01-01    30.0    98.0    91.0      96    43.0    39.0
2007-01-01    50.0    20.0    54.0      65     NaN    47.0
2008-01-01    24.0    84.0    52.0      84     NaN    81.0
2009-01-01    56.0    61.0    57.0      25     NaN    36.0
2010-01-01    87.0    45.0    68.0      65     NaN    71.0
2011-01-01    22.0    50.0    92.0      91     NaN    48.0
2012-01-01    12.0    44.0    79.0      77     NaN    25.0
2013-01-01     1.0    22.0    34.0      57     NaN    25.0
2014-01-01    94.0     NaN    86.0      97     NaN    91.0
2015-01-01     2.0     NaN    98.0      44     NaN    79.0
2016-01-01    81.0     NaN    35.0      87     NaN    32.0
2017-01-01    59.0     NaN    95.0      32     NaN    58.0
2018-01-01     NaN     NaN     3.0      14     NaN     NaN
2019-01-01     NaN     NaN    48.0       9     NaN     NaN
2020-01-01     NaN     NaN     NaN      49     NaN     NaN
Now I can find the last valid index of each column using "last_valid_index()":
lvi = df1.apply(lambda series: series.last_valid_index())
Which yields:
Site 1 2017-01-01
Site 2 2013-01-01
Site 3 2019-01-01
Site 4 2020-01-01
Site 5 2006-01-01
Site 6 2017-01-01
How can I apply this to another DataFrame, using these indexes to slice the time series in that other DataFrame? An example of the other DataFrame can be created as follows:
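import numpy as np
import pandas as pd

# Seed NumPy's global RNG so the example is reproducible
np.random.seed(30)
# 21 yearly timestamps; 'YS' (year start) is the current alias for the older 'AS'
idx = pd.date_range(start='2000-01-01', end='2020-01-01', freq='YS')
# Assumption: the line that built df2 is missing from the post; random integers
# with the same six column names as df1 stand in for it here
df2 = pd.DataFrame(np.random.randint(0, 100, size=(len(idx), 6)), columns=[f'Site {i}' for i in range(1, 7)])
df2 = df2.set_index(idx)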
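How can I use that "lvi" variable to index df2?
To do this manually, I could use:
df_s1 = df2['Site 1'].loc['2000-01-01':'2017-01-01']
Is there a better way to approach this? Also, does each column essentially have to be its own DataFrame for this to work? Any help is greatly appreciated!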
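For reference, here is a minimal sketch of two ways the "lvi" cutoffs could be applied to a second frame. It uses a small made-up df1/df2 (three sites, six years) rather than the data above, and the names "sliced" and "masked" are only illustrative. It also assumes every column of df1 has at least one valid value, since last_valid_index() returns None otherwise.

```python
import numpy as np
import pandas as pd

# Small stand-in frames (hypothetical data, not the frames from the question)
idx = pd.date_range("2000-01-01", periods=6, freq="YS")
df1 = pd.DataFrame(
    {
        "Site 1": [1.0, 2.0, 3.0, 4.0, np.nan, np.nan],
        "Site 2": [1.0, np.nan, 3.0, np.nan, np.nan, np.nan],
        "Site 3": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    },
    index=idx,
)
rng = np.random.default_rng(30)
df2 = pd.DataFrame(
    rng.integers(0, 100, size=(6, 3)), index=idx, columns=df1.columns
)

# Last valid timestamp per column of df1
lvi = df1.apply(lambda s: s.last_valid_index())

# Option A: one Series per column, each cut at its own last valid index
# (label slicing with .loc includes the end label)
sliced = {col: df2.loc[:lvi[col], col] for col in df2.columns}

# Option B: keep a single DataFrame and blank out rows after each cutoff
masked = df2.apply(lambda s: s.where(s.index <= lvi[s.name]))

print(masked)
```

Option A gives columns of different lengths, so they end up as separate objects (here a dict of Series). Option B keeps everything in one DataFrame of the original shape, with NaN after each column's cutoff, so the columns do not have to be split into their own frames.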