How to find the minimum of the ratio of two DataFrame columns


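I want to find the minimum of the ratio of two columns, using only the rows whose value in a third column appears in a given list. My DataFrame is: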

   ID  size  price
0   1     5    300
1   2    10    500
2   3    20    600
3   4    35    800
4   5    65    900
5   6    70   1000

I want the lowest price/size, taken only from the rows whose ID is in this list:

ids_wanted = [1,4,6]

I wrote the code below and it works, but building a new DataFrame just for this task feels expensive and unnecessary.

import numpy as np
import pandas as pd

# Build the sample DataFrame from three Series sharing one index
index = [0, 1, 2, 3, 4, 5]
i = pd.Series([1, 2, 3, 4, 5, 6], index=index)
s = pd.Series([5, 10, 20, 35, 65, 70], index=index)
p = pd.Series([300, 500, 600, 800, 900, 1000], index=index)
df = pd.DataFrame(np.c_[i, s, p], columns=["ID", "size", "price"])
print("original df:\n", df, "\n")

# Keep only the rows whose ID is in the wanted list
ids_wanted = [1, 4, 6]
df_with_ids_wanted = df.loc[df['ID'].isin(ids_wanted)]
print("df with ids wanted:\n", df_with_ids_wanted, "\n")

# Add a price/size column to the filtered rows and take its minimum
price_per_byte = df_with_ids_wanted['price'] / df_with_ids_wanted['size']
df_with_ids_wanted_ppb = df_with_ids_wanted.assign(pricePerByte=price_per_byte)
print("df with ids wanted and price/size column:\n", df_with_ids_wanted_ppb, "\n")
min_ppb = df_with_ids_wanted_ppb['pricePerByte'].min()
print("min price per byte:", min_ppb)

Output:

original df:
   ID  size  price
0   1     5    300
1   2    10    500
2   3    20    600
3   4    35    800
4   5    65    900
5   6    70   1000

df with ids wanted:
   ID  size  price
0   1     5    300
3   4    35    800
5   6    70   1000

df with ids wanted and price/size column:
   ID  size  price  pricePerByte
0   1     5    300     60.000000
3   4    35    800     22.857143
5   6    70   1000     14.285714

min price per byte: 14.285714285714286

Helenr


2 Answers

手掌心

If you want something concise, you can try this:

i = range(1, 7)
s = [5, 10, 20, 35, 65, 70]
p = [300, 500, 600, 800, 900, 1000]
df = pd.DataFrame({"ID": i, "size": s, "price": p})
df

Output:

   ID  size  price
0   1     5    300
1   2    10    500
2   3    20    600
3   4    35    800
4   5    65    900
5   6    70   1000

The next step looks like this:

id_chosen = [1, 4, 6]
(df[df.ID.isin(id_chosen)]["price"] / df[df.ID.isin(id_chosen)]["size"]).min()

Output:

14.285714285714286

Or:

min_div = (df[df.ID.isin(id_chosen)]["price"] / df[df.ID.isin(id_chosen)]["size"]).min()
print("the minimum price/size is {}".format(min_div))

Output:

the minimum price/size is 14.285714285714286

This way you don't have to create a new named DataFrame. Hope this helps.
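The one-liner above evaluates the `isin` filter twice. As a minimal sketch of the same idea (not the answerer's code), the boolean mask can be built once and reused with `.loc` for both columns. The names `mask`, `ratio`, and `min_ratio` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "size": [5, 10, 20, 35, 65, 70],
    "price": [300, 500, 600, 800, 900, 1000],
})
ids_wanted = [1, 4, 6]

# Build the boolean mask once and reuse it for both columns.
mask = df["ID"].isin(ids_wanted)

# Divide only the selected rows; the result is a Series, not a DataFrame.
ratio = df.loc[mask, "price"] / df.loc[mask, "size"]

min_ratio = ratio.min()
print("the minimum price/size is {}".format(min_ratio))
```

This still allocates two small temporary Series for the selected rows, but it never stores a filtered copy of the whole DataFrame or adds a column to it.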


慕哥6287543

I would do something like this:

import pandas as pd

# Named "data" rather than "dict" to avoid shadowing the built-in
data = {'id': [1, 2, 3, 4, 5, 6],
        'size': [5, 10, 20, 35, 65, 70],
        'price': [300, 500, 600, 800, 900, 1000]}
df = pd.DataFrame(data)

# Compute the ratio for every row, then filter and sort
df['price/byte'] = df['price'] / df['size']
ids_wanted = [1, 4, 6]
subset = df[df['id'].isin(ids_wanted)]
sorted_values = subset.sort_values(by='price/byte', ascending=True)

# The first row after an ascending sort holds the minimum
print(sorted_values['price/byte'].iloc[0])
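Sorting the whole subset is more work than needed when only the smallest value matters. As a hedged alternative sketch (a swapped-in technique, not what this answer proposes), `Series.idxmin` returns the index label of the minimum, which also tells you which row produced it. The names `best_label` and `best_row` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "size": [5, 10, 20, 35, 65, 70],
    "price": [300, 500, 600, 800, 900, 1000],
})
ids_wanted = [1, 4, 6]

# Filter first, so the division only runs on the wanted rows.
subset = df[df["id"].isin(ids_wanted)]
ratio = subset["price"] / subset["size"]

# idxmin gives the index label of the smallest ratio, with no sort required.
best_label = ratio.idxmin()
best_row = df.loc[best_label]

print("id with the lowest price/size:", best_row["id"])
print("lowest price/size:", ratio.loc[best_label])
```

If only the number is needed, `ratio.min()` is enough; `idxmin` is useful when the matching `id` is wanted as well.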
