Creating a DataFrame of per-group percentages

I have the following DataFrame:


    import pandas as pd

    data = {'ID': [279, 224, 221, 329, 333],
            'GROUP': ['BLACK', 'BLACK', 'BLUE', 'GREEN', 'BLACK'],
            'ITEM_1': ['Delhi', 'Kanpur', 'Delhi', 'Kannauj', 'Delhi'],
            'ITEM_2': ['Msc', 'Kanpur', 'Kanpur', 'Phd', 'Kanpur']}

    df = pd.DataFrame(data)
    df = df.set_index('ID')


     ID  Group    Item_1   Item_2
    279    A      Delhi    Msc
    224    A      Kanpur   Kanpur
    221    B      Delhi    Kanpur
    329    C      Kannauj  Phd
    333    A      Delhi    Kanpur

(A, B and C here stand for the GROUP values BLACK, BLUE and GREEN in the code.)

How can I create the following DataFrame, with one row per group and one column per distinct item, i.e.


            Delhi    Kanpur    Kannauj    Msc    Phd
    A        2/6      3/6        0        1/6     0
    B        1/2      1/2        0         0      0
    C         0        0        1/2        0     1/2

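In other words, I want a DataFrame of percentages relative to each group's total. Any ideas would be greatly appreciated. I tried groupby(['GROUP']) with .apply(lambda r: r/r.sum(), axis=1), but that does not produce the DataFrame I need.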


慕工程0101907
1 answer

慕仙森

First, combine the two item columns into a single Series:

    all_items = pd.concat([df.ITEM_1, df.ITEM_2])
    all_items

    Out[8]:
    ID
    279      Delhi
    224     Kanpur
    221      Delhi
    329    Kannauj
    333      Delhi
    279        Msc
    224     Kanpur
    221     Kanpur
    329        Phd
    333     Kanpur
    dtype: object

Then attach it back to the GROUP column of df:

    temp_df = pd.concat([df[["GROUP"]].copy(), df[["GROUP"]].copy()])
    # The assignment lines up because both objects carry the same (duplicated) ID index.
    temp_df["ITEM"] = all_items
    temp_df.reset_index(inplace=True)
    temp_df["temp_col"] = 1
    temp_df

    Out[15]:
        ID  GROUP     ITEM  temp_col
    0  279  BLACK    Delhi         1
    1  224  BLACK   Kanpur         1
    2  221   BLUE    Delhi         1
    3  329  GREEN  Kannauj         1
    4  333  BLACK    Delhi         1
    5  279  BLACK      Msc         1
    6  224  BLACK   Kanpur         1
    7  221   BLUE   Kanpur         1
    8  329  GREEN      Phd         1
    9  333  BLACK   Kanpur         1

Finally, pivot the table and divide each row by its own total:

    my_pivot = temp_df.pivot_table(values="temp_col", index="GROUP", columns="ITEM", aggfunc="sum", fill_value=0)
    # Dividing by len(df) would normalize by the number of rows in df,
    # not by each group's total, so divide every row by its own sum instead.
    to_div = my_pivot.sum(axis=1)
    my_pivot = my_pivot.div(to_div, axis=0)
    my_pivot

    Out[31]:
    ITEM      Delhi  Kannauj  Kanpur       Msc  Phd
    GROUP
    BLACK  0.333333      0.0     0.5  0.166667  0.0
    BLUE   0.500000      0.0     0.5  0.000000  0.0
    GREEN  0.000000      0.5     0.0  0.000000  0.5

Done.
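For comparison, the same table can probably be produced more compactly. The sketch below is an alternative, not the method used in the answer: it assumes `DataFrame.melt` to stack the two item columns and `pd.crosstab` with `normalize="index"` to divide each group's counts by that group's total. The names `long_df`, `shares` and `percent` are made up for illustration.

```python
import pandas as pd

data = {
    "ID": [279, 224, 221, 329, 333],
    "GROUP": ["BLACK", "BLACK", "BLUE", "GREEN", "BLACK"],
    "ITEM_1": ["Delhi", "Kanpur", "Delhi", "Kannauj", "Delhi"],
    "ITEM_2": ["Msc", "Kanpur", "Kanpur", "Phd", "Kanpur"],
}
df = pd.DataFrame(data).set_index("ID")

# Stack ITEM_1 and ITEM_2 into one long ITEM column, keeping GROUP next to it.
# (long_df is a hypothetical name; melt discards the ID index by default.)
long_df = df.melt(id_vars="GROUP", value_vars=["ITEM_1", "ITEM_2"], value_name="ITEM")

# Count each (GROUP, ITEM) pair and divide every row by its own total.
shares = pd.crosstab(long_df["GROUP"], long_df["ITEM"], normalize="index")

# Optional: the same values expressed as percentages, rounded for display.
percent = (shares * 100).round(1)

print(shares)
print(percent)
```

Each row of `shares` sums to 1, so group BLACK (three rows, six item entries in total) gets Delhi 2/6, Kanpur 3/6 and Msc 1/6, which matches the fractions asked for in the question.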