pandas - Count occurrences of a value in a DataFrame for each unique value in another column


Suppose I have a DataFrame like this:

        term  score
0       this      0
1       that      1
2  the other      3
3  something      2
4   anything      1
5  the other      2
6       that      2
7       this      0
8  something      1

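How can I count the occurrences of each value in the term column for every unique value in the score column, producing a result like this: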

        term  score 0  score 1  score 2  score 3
0       this        2        0        0        0
1       that        0        1        1        0
2  the other        0        0        1        1
3  something        0        1        1        0
4   anything        0        1        0        0

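Related questions I have read here include "Python Pandas counting and summing specific conditions" and "COUNTIF in pandas python over multiple columns with multiple conditions", but neither seems to be what I am trying to do. pivot_table, as mentioned in this question, looks like it could be relevant, but I am held back by my lack of experience and the terseness of the pandas documentation. Thanks for any suggestions.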

MMMHUHU


2 Answers

幕布斯7119047

Use groupby with size, reshape with unstack, and finally add_prefix:

    df = df.groupby(['term', 'score']).size().unstack(fill_value=0).add_prefix('score ')

Or use crosstab:

    df = pd.crosstab(df['term'], df['score']).add_prefix('score ')

Or pivot_table:

    df = (df.pivot_table(index='term', columns='score', aggfunc='size', fill_value=0)
            .add_prefix('score '))
    print(df)

    score      score 0  score 1  score 2  score 3
    term
    anything         0        1        0        0
    something        0        1        1        0
    that             0        1        1        0
    the other        0        0        1        1
    this             2        0        0        0
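To try the three approaches above end to end, here is a minimal sketch that rebuilds the question's sample data and runs each one. The variable names (`by_groupby`, `by_crosstab`, `by_pivot`, `result`) are illustrative and not part of the original answer. All three return rows sorted alphabetically by term, so the last step, which is an addition of mine rather than something the answer proposed, reindexes by first appearance to match the layout requested in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "term": ["this", "that", "the other", "something", "anything",
             "the other", "that", "this", "something"],
    "score": [0, 1, 3, 2, 1, 2, 2, 0, 1],
})

# 1) Count rows per (term, score) pair, then move score into the columns.
by_groupby = (df.groupby(["term", "score"]).size()
                .unstack(fill_value=0)
                .add_prefix("score "))

# 2) crosstab builds the same frequency table directly.
by_crosstab = pd.crosstab(df["term"], df["score"]).add_prefix("score ")

# 3) pivot_table with aggfunc="size" counts rows per cell.
by_pivot = (df.pivot_table(index="term", columns="score",
                           aggfunc="size", fill_value=0)
              .add_prefix("score "))

# Assumption: the asker wants terms in order of first appearance and
# "term" as an ordinary column, as in the desired output.
result = (by_crosstab.reindex(df["term"].unique())
                     .rename_axis(columns=None)
                     .reset_index())
print(result)
```

Which of the three to use is mostly a matter of taste; crosstab is the shortest when only counts are needed.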


翻翻过去那场雪

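You can also use get_dummies with set_index, then sum over the index level:

    # sum(level=0) was removed in pandas 2.0; grouping by the index level is the equivalent
    (pd.get_dummies(df.set_index('term'), columns=['score'], prefix_sep=' ')
       .groupby(level=0, sort=False).sum()
       .reset_index())

Output:

            term  score 0  score 1  score 2  score 3
    0       this        2        0        0        0
    1       that        0        1        1        0
    2  the other        0        0        1        1
    3  something        0        1        1        0
    4   anything        0        1        0        0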
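A self-contained sketch of the get_dummies route, written for recent pandas versions. Two details are assumptions of mine rather than part of the answer: `groupby(level=0, sort=False).sum()` stands in for `sum(level=0)`, which pandas 2.0 removed, and `dtype=int` keeps the indicator columns numeric, since newer versions default to booleans. The final comparison against crosstab is only a sanity check.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "term": ["this", "that", "the other", "something", "anything",
             "the other", "that", "this", "something"],
    "score": [0, 1, 3, 2, 1, 2, 2, 0, 1],
})

# One indicator column per score value, named "score 0" ... "score 3".
dummies = pd.get_dummies(df.set_index("term"), columns=["score"],
                         prefix_sep=" ", dtype=int)

# Sum the indicators per term; sort=False keeps first-appearance order.
counts = dummies.groupby(level=0, sort=False).sum().reset_index()
print(counts)

# Sanity check: the counts should agree with a plain crosstab.
check = pd.crosstab(df["term"], df["score"]).reindex(counts["term"].tolist())
same_as_crosstab = (
    counts.drop(columns="term").to_numpy().tolist() == check.to_numpy().tolist()
)
```

Because the rows are grouped with `sort=False`, the terms come out in the order they first appear, which matches the layout requested in the question without an extra reindex.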

