I would like to know how to reshape a table to get the result I want.
My sample dataset:
import pandas as pd

df = pd.DataFrame({
    "ID": [111, 111, 111, 111, 222, 222, 222, 333, 333],
    "Section": ["CS01", "CS01", "IT01", "IT01", "CS02", "CS02", "CS02", "HS01", "HS01"],
    "Subject": ["Hist", "Pol", "Pol", "Arts", "Pol", "Hist", "Arts", "Pol", "Hist"],
    "Activity": ["Quiz 1", "Quiz 2", "Quiz 3", "Quiz 1", "Quiz 2", "Quiz 3", "Quiz 1", "Quiz 2", "Quiz 3"],
    "Pass": [1, 0, 0, 1, 1, 1, 0, 1, 0],
})
What it looks like:
    ID Section Subject Activity  Pass
0  111    CS01    Hist   Quiz 1     1
1  111    CS01     Pol   Quiz 2     0
2  111    IT01     Pol   Quiz 3     0
3  111    IT01    Arts   Quiz 1     1
4  222    CS02     Pol   Quiz 2     1
5  222    CS02    Hist   Quiz 3     1
6  222    CS02    Arts   Quiz 1     0
7  333    HS01     Pol   Quiz 2     1
8  333    HS01    Hist   Quiz 3     0
What I am trying to get:
ID   Section  Subject  Quiz 1       Quiz 2       Quiz 3
                       0   1   NA   0   1   NA   0   1   NA
111  CS01     Hist     0   1   0    0   0   1    0   0   1
111  CS01     Pol      0   0   1    1   0   0    0   0   1
111  IT01     Arts     0   1   0    0   0   1    0   0   1
111  IT01     Pol      0   0   1    0   0   1    1   0   0
222  CS02     Arts     1   0   0    0   0   0    0   0   0
222  CS02     Hist     0   0   1    0   0   1    0   1   0
222  CS02     Pol      0   0   1    0   1   0    0   0   1
333  HS01     Hist     0   0   1    0   0   1    1   0   0
333  HS01     Pol      0   0   1    0   1   0    0   0   1
What I want is to have the "Subject" column at level 2 and the "Pass" column at level 1, along with an "NA" column.
So far I only have this:
df.groupby(["ID","Section", "Subject","Activity"])["Pass"].value_counts().unstack().fillna(0)
But this has no "NA" column, and "Activity" is not at level 2.
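
One possible way to get close to the desired layout, offered only as a sketch: pivot so that each "Activity" becomes a column, then expand every activity into three indicator columns (0, 1, NA) and glue them together under a two-level column header. This assumes each (ID, Section, Subject, Activity) combination appears at most once, and that a quiz with no record for a row should be flagged as 1 under "NA". The names `wide`, `parts` and `result` are just illustrative. Note that under this assumption the 222 / CS02 / Arts row would get NA = 1 for Quiz 2 and Quiz 3, which differs from the zeros shown in the desired table above.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [111, 111, 111, 111, 222, 222, 222, 333, 333],
    "Section": ["CS01", "CS01", "IT01", "IT01", "CS02", "CS02", "CS02", "HS01", "HS01"],
    "Subject": ["Hist", "Pol", "Pol", "Arts", "Pol", "Hist", "Arts", "Pol", "Hist"],
    "Activity": ["Quiz 1", "Quiz 2", "Quiz 3", "Quiz 1", "Quiz 2", "Quiz 3", "Quiz 1", "Quiz 2", "Quiz 3"],
    "Pass": [1, 0, 0, 1, 1, 1, 0, 1, 0],
})

# One row per (ID, Section, Subject), one column per Activity.
# Combinations with no record are left as NaN.
wide = df.pivot_table(
    index=["ID", "Section", "Subject"],
    columns="Activity",
    values="Pass",
    aggfunc="first",
)

# For each activity, build three 0/1 indicator columns:
# failed (0), passed (1), and no record (NA).
parts = {}
for activity in wide.columns:
    col = wide[activity]
    parts[activity] = pd.DataFrame({
        "0": (col == 0).astype(int),
        "1": (col == 1).astype(int),
        "NA": col.isna().astype(int),
    })

# Concatenating a dict along axis=1 uses its keys as the top column level,
# so "Activity" ends up above the 0 / 1 / NA columns.
result = pd.concat(parts, axis=1)
print(result)
```

If the index levels are needed as ordinary columns, calling `result.reset_index()` afterwards should move ID, Section and Subject back out of the index.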