Filter a pandas DataFrame based on column values from another DataFrame


df1:

    Id  Country  Product
    1   india    cotton
    2   germany  shoes
    3   algeria  bags

df2:

    id  Country  Product  Qty  Sales
    1   India    cotton   25   635
    2   India    cotton   65   335
    3   India    cotton   96   455
    4   India    cotton   78   255
    5   germany  shoes    25   635
    6   germany  shoes    65   458
    7   germany  shoes    96   455
    8   germany  shoes    69   255
    9   algeria  bags     25   635
    10  algeria  bags     89   788
    11  algeria  bags     96   455
    12  algeria  bags     78   165

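I need to filter df2 by the Country and Product columns of df1 and create a new DataFrame for each match. For example, df1 contains 3 unique Country/Product combinations, so the result should be 3 DataFrames.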

Output:

df_India_Cotton:

    id  Country  Product  Qty  Sales
    1   India    cotton   25   635
    2   India    cotton   65   335
    3   India    cotton   96   455
    4   India    cotton   78   255

df_germany_Product:

    id  Country  Product  Qty  Sales
    1   germany  shoes    25   635
    2   germany  shoes    65   458
    3   germany  shoes    96   455
    4   germany  shoes    69   255

df_algeria_Product:

    id  Country  Product  Qty  Sales
    1   algeria  bags     25   635
    2   algeria  bags     89   788
    3   algeria  bags     96   455
    4   algeria  bags     78   165

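I could also pull out each of these DataFrames with basic subsetting in pandas:

    df2[(df2.Country == 'India') & (df2.Product == 'cotton')]

That works for one combination, but df1 may contain many unique Country/Product combinations, so writing each filter by hand does not scale.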

繁花如伊


2 Answers

智慧大石

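You can create a dictionary and store all of the DataFrames in it. See the code below:

    d = {}
    for i in range(len(df1)):
        # Build a key such as 'India_cotton' from the current row of df1
        name = df1.Country.iloc[i] + '_' + df1.Product.iloc[i]
        # Keep only the rows of df2 that match this Country/Product pair
        d[name] = df2[(df2.Country == df1.Country.iloc[i]) & (df2.Product == df1.Product.iloc[i])]

You can then retrieve each DataFrame by its key (the key is built from the values in df1, and the comparison is case-sensitive). For example, d['India_cotton'] gives:

    id  Country  Product  Qty  Sales
    1   India    cotton   25   635
    2   India    cotton   65   335
    3   India    cotton   96   455
    4   India    cotton   78   255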
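The tables in the question spell the country "india" in df1 and "India" in df2, so an exact-match filter would return no rows for that pair. Below is a minimal sketch of the same dictionary idea that compares values in lowercase. It assumes case-insensitive matching is what is wanted; the helper name `split_by_pairs` and the lowercase key format are hypothetical, not part of the original answer.

```python
import pandas as pd

# Sample data mirroring the question (df1 has "india", df2 has "India").
df1 = pd.DataFrame({
    "Id": [1, 2, 3],
    "Country": ["india", "germany", "algeria"],
    "Product": ["cotton", "shoes", "bags"],
})
df2 = pd.DataFrame({
    "id": list(range(1, 13)),
    "Country": ["India"] * 4 + ["germany"] * 4 + ["algeria"] * 4,
    "Product": ["cotton"] * 4 + ["shoes"] * 4 + ["bags"] * 4,
    "Qty": [25, 65, 96, 78, 25, 65, 96, 69, 25, 89, 96, 78],
    "Sales": [635, 335, 455, 255, 635, 458, 455, 255, 635, 788, 455, 165],
})


def split_by_pairs(selector, details):
    """Hypothetical helper: map 'country_product' to the matching rows of details."""
    # Assumption: letter case should be ignored when matching.
    wanted = set(zip(selector["Country"].str.lower(),
                     selector["Product"].str.lower()))
    # Group on lowercase copies of the key columns; the rows keep their original values.
    keys = [details["Country"].str.lower(), details["Product"].str.lower()]
    result = {}
    for (country, product), group in details.groupby(keys):
        if (country, product) in wanted:
            result[f"{country}_{product}"] = group.reset_index(drop=True)
    return result


frames = split_by_pairs(df1, df2)
print(sorted(frames))  # ['algeria_bags', 'germany_shoes', 'india_cotton']
print(frames["india_cotton"])
```

Grouping once and looking each pair up in a set avoids rescanning df2 for every row of df1, which may matter when df1 holds many combinations.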


DIEA

Try creating two groupby objects, and use the first to select from the second:

    import pandas as pd

    # The Country/Product pairs to select
    selector_df = pd.DataFrame(data={
        'Country': 'india germany algeria'.split(),
        'Product': 'cotton shoes bags'.split(),
    })

    # The detail rows to be split up
    details_df = pd.DataFrame(data={
        'Country': 'india india india india germany germany germany germany algeria algeria algeria algeria'.split(),
        'Product': 'cotton cotton cotton cotton shoes shoes shoes shoes bags bags bags bags'.split(),
        'qty': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    })

    selectorgroups = selector_df.groupby(by=['Country', 'Product'])
    datagroups = details_df.groupby(by=['Country', 'Product'])

    # Each tag is a (Country, Product) tuple; look it up in the detail groups
    for tag, group in selectorgroups:
        print(tag)
        try:
            print(datagroups.get_group(tag))
        except KeyError:
            print('tag does not exist in datagroup')
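If the goal is to keep the selected groups rather than print them, one alternative is to inner-merge the detail rows with the selector pairs and group the result once. The sketch below is an illustration under stated assumptions, not the answerer's method: the data is a shortened version of the frames above, and the "peru"/"wool" pair is made up to show how a selector pair with no detail rows is reported.

```python
import pandas as pd

selector_df = pd.DataFrame({
    # "peru"/"wool" is a made-up pair with no matching detail rows.
    "Country": ["india", "germany", "peru"],
    "Product": ["cotton", "shoes", "wool"],
})
details_df = pd.DataFrame({
    "Country": ["india"] * 2 + ["germany"] * 2 + ["algeria"] * 2,
    "Product": ["cotton"] * 2 + ["shoes"] * 2 + ["bags"] * 2,
    "qty": [1, 2, 3, 4, 5, 6],
})

# The inner merge keeps only detail rows whose (Country, Product) pair is in the selector.
pairs = selector_df[["Country", "Product"]].drop_duplicates()
matched = details_df.merge(pairs, on=["Country", "Product"], how="inner")

# One DataFrame per surviving pair, keyed by the (Country, Product) tuple.
frames = {
    tag: group.reset_index(drop=True)
    for tag, group in matched.groupby(["Country", "Product"])
}

# Selector pairs that matched nothing, instead of catching KeyError per lookup.
missing = [tag for tag in pairs.itertuples(index=False, name=None)
           if tag not in frames]

print(sorted(frames))  # [('germany', 'shoes'), ('india', 'cotton')]
print(missing)         # [('peru', 'wool')]
```

Here `frames[('india', 'cotton')]` holds the two matching rows, and pairs absent from the details end up in `missing` rather than raising an exception.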
