
Calculating the average price of each product in a Pandas DataFrame

I have a data frame that looks like this:


import pandas as pd


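Z = pd.DataFrame({'Product': ['Apple', 'Apple', 'Apple', 'Orange', 'Orange'], 'Selling Price': [1.1, 1.2, 1.3, 2.1, 2.2]})

There are thousands of unique products and hundreds of millions of selling prices. How can I efficiently report the average selling price of each unique product?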


Result = pd.DataFrame({'Product': ['Apple', 'Orange'], 'Average Selling Price': [1.2, 2.15]})

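The challenge is that the data is stored in hundreds of different .csv files (the file names are stored in the list files), and I cannot load all of it into my environment at once. So I would do something like this: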


for i in files:
    X = pd.read_csv(i)
    # add unique products to the data frame Z
    # add the sum of their selling prices to Z
    # add the number of times the product was sold

# for each unique product, divide the sum of selling prices by the number of times that product was sold

Thanks for any help you can provide!


陪伴而非守候
1 Answer

当年话下

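# Running per-product totals, combined across all files
final_df = pd.DataFrame()
for i in files:
    X = pd.read_csv(i)
    # Per-file count and sum of selling prices for each product
    X_agg = X.groupby('Product', as_index=False).agg({'Selling Price': ['count', 'sum']})
    X_agg.columns = ['Product', 'sale_count', 'selling_sum']
    # Fold this file's totals into the running totals so final_df stays small
    final_df = pd.concat([final_df, X_agg])
    final_df = final_df.groupby('Product', as_index=False).agg({'sale_count': 'sum', 'selling_sum': 'sum'})

# Average selling price = total of selling prices / number of sales
final_df['Average Selling Price'] = final_df['selling_sum'] / final_df['sale_count']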
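The reason for carrying a sum and a count per product, rather than averaging each file and then averaging those averages, is that files can contain different numbers of rows per product, so a mean of means would weight them incorrectly. Here is a minimal, self-contained sketch of the same idea that collects the per-file totals in a list and combines them once at the end. The function name average_price_per_product and the two in-memory "files" are made up for illustration; with real data you would pass the list of .csv paths instead.

```python
import io

import pandas as pd


def average_price_per_product(files):
    """Combine per-file sums and counts, then divide once at the end."""
    partials = []
    for f in files:
        # Only the two needed columns are read, which keeps memory use down
        chunk = pd.read_csv(f, usecols=['Product', 'Selling Price'])
        # One row per product: total of selling prices and number of sales
        partials.append(
            chunk.groupby('Product')['Selling Price'].agg(['sum', 'count'])
        )
    # Add up the per-file totals for each product (the index is Product)
    totals = pd.concat(partials).groupby(level=0).sum()
    totals['Average Selling Price'] = totals['sum'] / totals['count']
    return totals[['Average Selling Price']].reset_index()


# Two tiny in-memory stand-ins for the real .csv files (illustrative data)
file_a = io.StringIO("Product,Selling Price\nApple,1.1\nApple,1.2\nOrange,2.1\n")
file_b = io.StringIO("Product,Selling Price\nApple,1.3\nOrange,2.2\n")

result = average_price_per_product([file_a, file_b])
print(result)
```

Each partial result has at most one row per product, so even with hundreds of files the list stays far smaller than the raw data. If a single file is itself too large to load, pd.read_csv also accepts a chunksize argument, and the same sum-and-count step can be applied to each chunk.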