I want to split my dataframe by year

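I have a dataframe with a column of date values in datetime64 format. I want to split my dataframe into separate dataframes by year. I wrote the code below; it works, but it is very impractical.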


Hopefully someone has a better solution!


# Import libs
import numpy as np
import pandas as pd
from random import sample

# Make some random dataframe with two columns
date = np.arange('2005-02', '2008-03', dtype='datetime64[D]')

status = ["X"]*(int(round(0.9*len(date),0))) + ['y']*(int(round(0.05*len(date),0))) + ['z']*(int(round(0.05*len(date),0)))
newstatus = sample(status, len(status))

data = {'Data': date, 'Status': newstatus}
df = pd.DataFrame(data)

# Extract year from date and make dummies index for splitting
df['Year'] = pd.DatetimeIndex(df['Data']).year
df = pd.get_dummies(df, columns=['Year'])

# Split on dummies
df_2007, df_2006, df_2005, df_2008 = df, df, df, df
df_2008 = df_2008[df_2008.Year_2008 != 0]
df_2007 = df_2007[df_2007.Year_2007 != 0]
df_2006 = df_2006[df_2006.Year_2006 != 0]
df_2005 = df_2005[df_2005.Year_2005 != 0]

# Remove dummies
years = ['Year_2005', 'Year_2006', 'Year_2007', 'Year_2008']
df_2008 = df_2008.drop(years, axis=1)
df_2007 = df_2007.drop(years, axis=1)
df_2006 = df_2006.drop(years, axis=1)
df_2005 = df_2005.drop(years, axis=1)


斯蒂芬大帝
1 Answer

HUWWW

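Maybe this can help you:

years = df['Data'].dt.year.unique()  # I'm guessing Data should be Date really but I'll go along with it.
dfs = {y: df[df['Data'].dt.year == y] for y in years}

This creates a dictionary whose keys are the years and whose values are the dataframes for each year. That means dfs[2008] gives you the dataframe containing the 2008 data.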
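The same year-keyed dictionary could probably also be built with `groupby`, which avoids re-filtering the whole dataframe once per year. This is only a sketch under assumptions: the sample frame below is a hypothetical stand-in that reuses the question's `Data` and `Status` column names, and the name `dfs_by_year` is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the question's 'Data' and 'Status' columns.
dates = np.arange('2005-02', '2008-03', dtype='datetime64[D]')
df = pd.DataFrame({'Data': dates, 'Status': ['X'] * len(dates)})

# Group the rows by the year of the datetime column.
# Iterating over the groupby yields (year, sub-dataframe) pairs.
dfs_by_year = {
    int(year): group
    for year, group in df.groupby(df['Data'].dt.year)
}

# Each value is the dataframe for that year, with the original columns only.
print(sorted(dfs_by_year))
print(dfs_by_year[2008].head())
```

As with the dictionary comprehension in the answer, `dfs_by_year[2008]` should return the rows for 2008, and no helper `Year` column or dummy columns need to be added or dropped afterwards.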