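pandas: group values by file extension

I have a DataFrame containing many file types (.svg, .png, .csv, etc.). Some of the files have no extension.

How can I put the files without an extension into a group of their own and make a pie chart like this?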

http://img1.mukewang.com/63b4f038000165b306450579.jpg

import pandas as pd

df = pd.DataFrame({'file_name': ['filelist.xml', 'sheet002', 'sheet005.htm', 'image1.jpg', 'image3.jpg',
                                 'kings.xls', 'Kings.png', 'Kings', 'Riders', 'Royals.pdf', 'Royals.csv', 'Riders.xml'],
                   'created_at': ['2020-01-01 23:00:34'] * 2 + ['2018-01-01 13:01:34'] * 3 + ['2020-01-01 22:00:00'] * 4 + ['2018-02-01 23:00:34'] * 3,
                   'size': [8760] * 3 + [789] * 4 + [863] * 2 + [673] * 3})

df_unknown = df[df['file_name'].apply(lambda x: len(x.rsplit('.', 1))) < 2]

Edit: I have a lot of distinct values. The pie chart cannot show them all.

http://img1.mukewang.com/63b4f04e0001214d06480402.jpg

MYYA
2 answers

烙印99

You can use where to set the values that do not contain a . to 'unknown', and plot the pie chart from value_counts (the labels here come out without the leading dot, e.g. xml):

(df.file_name.where(df.file_name.str.contains(r'\.'), 'unknown')
             .str.split('.').str[-1]
             .value_counts()
             .plot.pie())

Another approach is to use str.extract and fillna (the labels here keep the leading dot, e.g. .xml):

(df.file_name.str.extract(r'(\.\w+$)', expand=False)
                 .fillna('unknown')
                 .value_counts()
                 .plot.pie())

Update: to chart the sum of size for each group:

(df['size'].groupby(df.file_name.str.extract(r'(\.\w+$)', expand=False)
                    .fillna('unknown'))
            .sum().plot.pie())
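For anyone who wants to check the intermediate values before plotting, here is a minimal sketch of the str.extract approach on the question's sample data. The variable names ext, counts and sizes are my own and are not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question (created_at is omitted, it is not needed here).
df = pd.DataFrame({
    'file_name': ['filelist.xml', 'sheet002', 'sheet005.htm', 'image1.jpg',
                  'image3.jpg', 'kings.xls', 'Kings.png', 'Kings', 'Riders',
                  'Royals.pdf', 'Royals.csv', 'Riders.xml'],
    'size': [8760] * 3 + [789] * 4 + [863] * 2 + [673] * 3,
})

# Capture the trailing ".ext"; names without one give NaN, which is relabelled.
ext = df['file_name'].str.extract(r'(\.\w+$)', expand=False).fillna('unknown')

counts = ext.value_counts()            # number of files per extension
sizes = df['size'].groupby(ext).sum()  # total size per extension

print(counts)
print(sizes)
```

Either Series can then be passed to .plot.pie() as shown in the answer.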

翻阅古今

Try os.path.splitext + GroupBy.sum:

import os

(df['size'].groupby(df['file_name'].map(os.path.splitext)
                                   .str[-1]
                                   .replace({'': 'unknown'}))
           .sum())

file_name
.csv         673
.htm        8760
.jpg        1578
.pdf         673
.png         789
.xls         789
.xml        9433
unknown    10486
Name: size, dtype: int64

From here, plotting should be straightforward (in IPython or Jupyter, _ holds the previous result):

import matplotlib.pyplot as plt

_.plot.pie()
plt.show()
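Regarding the edit in the question (too many distinct values for one pie chart): neither answer covers it, but one possible way is to keep only the largest groups and fold the rest into a single slice. The sketch below is only an illustration under my own assumptions: the cutoff TOP_N and the label 'other' are hypothetical choices, not something from the question or the answers.

```python
import os
import pandas as pd

# Sample data copied from the question (created_at is omitted, it is not needed here).
df = pd.DataFrame({
    'file_name': ['filelist.xml', 'sheet002', 'sheet005.htm', 'image1.jpg',
                  'image3.jpg', 'kings.xls', 'Kings.png', 'Kings', 'Riders',
                  'Royals.pdf', 'Royals.csv', 'Riders.xml'],
    'size': [8760] * 3 + [789] * 4 + [863] * 2 + [673] * 3,
})

# Extension per file; an empty extension is relabelled as 'unknown'.
ext = (df['file_name']
       .map(lambda name: os.path.splitext(name)[1])
       .replace({'': 'unknown'}))

# Total size per extension, largest first.
sizes = df['size'].groupby(ext).sum().sort_values(ascending=False)

TOP_N = 3  # hypothetical cutoff: how many slices to keep as-is
top = sizes.iloc[:TOP_N]
rest_total = sizes.iloc[TOP_N:].sum()

# Append everything beyond the cutoff as one combined 'other' slice.
collapsed = pd.concat([top, pd.Series({'other': rest_total})])

print(collapsed)
```

collapsed can be plotted with collapsed.plot.pie() in the same way as above. A share-based threshold (for example, folding groups below a few percent of the total) would work along the same lines.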
