Groupby with 2 columns - "pandas.core.groupby.generic"

For my current project, I plan to group a Pandas DataFrame by stock_symbol as the first criterion and quarter as the second.


From other threads, I have seen that a construct like group_data = df.groupby(['stock_symbol', 'quarter']) could be a possible solution. In my case, however, the only terminal output I get is <pandas.core.groupby.generic.DataFrameGroupBy object at 0x11fdcbf10>.


Can anyone spot the error in my reasoning? The relevant part of the code is shown below:


# Datetime conversion
df['date'] = pd.to_datetime(df['date'])

# Adding of 'Quarter' column
df['quarter'] = df['date'].dt.to_period('Q')

# Grouping both the Stock Symbol and the Quarter column
group_data = df.groupby(['stock_symbol', 'quarter'])
print(group_data)

The function to be called in this operation is shown below:


# Word frequency analysis
from sklearn.feature_extraction.text import CountVectorizer

def get_top_n_bigram(corpus, n=None):
    # Fit a bigram vocabulary on the corpus, ignoring English stop words
    vec = CountVectorizer(ngram_range=(2, 2), stop_words='english').fit(corpus)
    bag_of_words = vec.transform(corpus)
    # Total count of each bigram across all documents
    sum_words = bag_of_words.sum(axis=0)
    words_freq = [(word, sum_words[0, idx]) for word, idx in vec.vocabulary_.items()]
    words_freq = sorted(words_freq, key=lambda x: x[1], reverse=True)
    return words_freq[:n]




qq_花开花谢_0
101 views, 1 answer
1 answer

慕斯王

Here is one way to achieve what you are after.

Custom function:

def get_top_n_bigram(row):
    corpus = row['txt_main'] + row['txt_pro'] + row['txt_con'] + row['txt_adviceMgmt']
    n = 2  # the top n
    vec = CountVectorizer(ngram_range=(2, 2), stop_words='english').fit(corpus)
    bag_of_words = vec.transform(corpus)
    sum_words = bag_of_words.sum(axis=0)
    words_freq = [(word, sum_words[0, idx]) for word, idx in vec.vocabulary_.items()]
    words_freq = sorted(words_freq, key=lambda x: x[1], reverse=True)
    return words_freq[:n]

Call groupby with the defined function using apply:

df['date'] = pd.to_datetime(df['date'])
df['quarter'] = df['date'].dt.to_period('Q')
newdf = df.groupby(['stock_symbol', 'quarter']).apply(get_top_n_bigram).to_frame(name='frequencies')
print(newdf)

                                                 frequencies
stock_symbol quarter
AMG          2011Q3        [(smart driven, 2), (driven risk, 2)]
             2013Q1  [(asset management, 2), (smart working, 1)]
             2014Q1    [(audit firm, 3), (employment agency, 2)]
MMM          2017Q2              [(working 3m, 1), (3m time, 1)]
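As background on why the question's print call shows only <pandas.core.groupby.generic.DataFrameGroupBy object at ...>: groupby on its own just records how the rows are split, and nothing is computed until an aggregation, an iteration, or an apply call is made on it. The following is a minimal sketch of that behaviour. The sample rows are made up for illustration, and only the column names stock_symbol, date, quarter and txt_main come from the thread.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question,
# the values are invented for illustration only.
df = pd.DataFrame({
    "stock_symbol": ["AMG", "AMG", "AMG", "MMM"],
    "date": ["2011-07-15", "2011-08-01", "2013-02-10", "2017-05-20"],
    "txt_main": ["smart driven", "driven risk", "asset management", "working 3m"],
})
df["date"] = pd.to_datetime(df["date"])
df["quarter"] = df["date"].dt.to_period("Q")

grouped = df.groupby(["stock_symbol", "quarter"])

# Printing `grouped` shows only the object's repr; no result exists yet.
# An aggregation produces a regular result, here the row count per group.
group_sizes = grouped.size()
print(group_sizes)

# Iterating yields ((stock_symbol, quarter), sub-DataFrame) pairs.
for (symbol, quarter), frame in grouped:
    print(symbol, quarter, len(frame))

# apply() calls a function once per group and collects the return values.
joined_text = grouped["txt_main"].apply(" ".join)
print(joined_text)
```

The same pattern is what the apply call in the answer relies on: the custom function receives one sub-DataFrame per (stock_symbol, quarter) pair and its return value becomes that group's entry in the result.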