Return only the DataFrame columns that satisfy a where clause


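Starting from an arbitrary DataFrame, I want to return a DataFrame that contains only those columns that have more than one distinct value.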

I have:

X = df.nunique()

which looks like:

Id 5

MSSubClass 3

MSZoning 1

LotFrontage 5

LotArea 5

Street 1

Alley 0

LotShape 2

I then convert it from a Series to a DataFrame:

X = X.to_frame(name = 'dcount')

Then I use a where clause to keep only the values greater than 1:

X.where(X[['dcount']]>1)

which looks like:

dcount

Id 5.0

MSSubClass 3.0

MSZoning NaN

LotFrontage 5.0

LotArea 5.0

Street NaN

Alley NaN

LotShape 2.0

...

But now I want only those column names (in the index of X) whose dcount is not NaN, so that I can ultimately go back to my original DataFrame df and redefine it as:

df=df[[list_of_columns]]

How should this be done? I have tried a dozen approaches and it is a PitA. I suspect there is a way to do it in 1 or 2 lines of code.

森林海


1 Answer

天涯尽头无女友

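You can use boolean indexing and avoid converting the counts Series to a DataFrame:

counts = df.nunique()
df = df[counts[counts > 1].index]

The key is to note that the index of your Series counts is the column labels. So you can filter the Series and then extract the required index via pd.Series.index.

Here is a demo:

import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1], 'B': [1, 2, 3],
                   'C': [4, 5, 5], 'D': [0, 0, 0]})

counts = df.nunique()
df = df[counts[counts > 1].index]

print(df)

   B  C
0  1  4
1  2  5
2  3  5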
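As a supplementary sketch (not part of the original answer), the same filter can probably be written in one line by passing the boolean mask straight to df.loc, and the where-based attempt from the question can be finished by dropping the NaN rows. The names mask, filtered, and same are made up for illustration, and the sample data simply reuses the demo values. Note that nunique() ignores NaN by default, which is why an all-NaN column such as Alley shows a count of 0.

```python
import pandas as pd

# Sample data reusing the demo values; not the asker's real dataset.
df = pd.DataFrame({'A': [1, 1, 1], 'B': [1, 2, 3],
                   'C': [4, 5, 5], 'D': [0, 0, 0]})

# Boolean Series indexed by column label:
# True where the column has more than one distinct value.
mask = df.nunique() > 1

# Option 1: select columns directly with the mask.
filtered = df.loc[:, mask]

# Option 2: finish the approach from the question.
# where() turns non-matching counts into NaN; dropna() removes those rows,
# leaving an index that holds only the wanted column labels.
X = df.nunique().to_frame(name='dcount')
list_of_columns = X.where(X[['dcount']] > 1).dropna().index.tolist()
same = df[list_of_columns]

print(filtered)
print(list_of_columns)
```

Both options should select the same columns here. The mask version skips the intermediate DataFrame and the float conversion that where() introduces.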
