Get the percentage of values in a pandas column that fall in the top n% (e.g. 25%, 50%) or below n%

You are looking for df.quantile and some basic math. Displaying these values in your table is not very useful: it would add three more columns, each holding the same value repeated len(df) times. So I give them as simple statements instead:

    import pandas as pd
    import random

    # shuffle the data to show that this works on unsorted data
    random.seed(42)
    data = [[f"product {i+1:3d}", i*10] for i in range(100)]
    random.shuffle(data)

    df = pd.DataFrame(data, columns=['name', 'price'])

    # calculate the quantile series
    q25 = df.quantile(.25, numeric_only=True)
    q50 = df.quantile(.5, numeric_only=True)
    q75 = df.quantile(.75, numeric_only=True)

    print(q25, q50, q75, sep="\n\n")

    print(f"Bottom 25% of prices are below/equal to {q25.price} that's", end=" ")
    print(f"{len(df[df.price <= q25.price]) / (len(df) / 100)}% of all items")

    print(f"Bottom 50% of prices are below/equal to {q50.price} that's", end=" ")
    print(f"{len(df[df.price <= q50.price]) / (len(df) / 100)}% of all items")

    print(f"Bottom 75% of prices are below/equal to {q75.price} that's", end=" ")
    print(f"{len(df[df.price <= q75.price]) / (len(df) / 100)}% of all items")

The (unshuffled) data frame looks like:

               name  price
    0   product   1      0
    1   product   2     10
    2   product   3     20
    ..          ...    ...
    97  product  98    970
    98  product  99    980
    99  product 100    990

    [100 rows x 2 columns]

Output:

    price    247.5
    Name: 0.25, dtype: float64

    price    495.0
    Name: 0.5, dtype: float64

    price    742.5
    Name: 0.75, dtype: float64
    Bottom 25% of prices are below/equal to 247.5 that's 25.0% of all items
    Bottom 50% of prices are below/equal to 495.0 that's 50.0% of all items
    Bottom 75% of prices are below/equal to 742.5 that's 75.0% of all items
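The question also asks about the top n%, which the statements above do not cover directly. A minimal sketch of how the same idea might be generalized to any n follows. The helper names `share_at_or_below` and `share_at_or_above` are hypothetical (they are not part of pandas), and the sketch assumes that "top n%" means values at or above the (1 - n) quantile:

```python
import pandas as pd


def share_at_or_below(series, q):
    """Return (threshold, percent of values <= threshold) for quantile q.

    Hypothetical helper for illustration; q is a fraction such as 0.25.
    """
    threshold = series.quantile(q)
    # The mean of a boolean Series is the fraction of True entries.
    percent = (series <= threshold).mean() * 100
    return threshold, percent


def share_at_or_above(series, q):
    """Return (threshold, percent of values >= threshold) for the top q fraction.

    Assumption: "top q" means values at or above the (1 - q) quantile.
    """
    threshold = series.quantile(1 - q)
    percent = (series >= threshold).mean() * 100
    return threshold, percent


# Same prices as in the answer: 0, 10, ..., 990
prices = pd.Series(range(0, 1000, 10), name="price")

for q in (0.25, 0.5, 0.75):
    low_thr, low_pct = share_at_or_below(prices, q)
    top_thr, top_pct = share_at_or_above(prices, q)
    print(f"Bottom {q:.0%}: <= {low_thr} ({low_pct}% of items); "
          f"top {q:.0%}: >= {top_thr} ({top_pct}% of items)")
```

With repeated values or very small data sets the reported share can differ from n, because every value equal to the threshold is counted.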
