
How to select rows from a pandas DataFrame whose array column contains a specific value

I have a pandas DataFrame with an array column:


id,classes,text

71,`["performer_146", "performer_42"]`,`adipiscing urna. molestie `

72,["performer_42"],`a ligula odio elementum, neque suscipit. egestas Maecenas`

73,["performer_146"],`vestibulum orci nec vestibulum, ligula orci et mauris lobortis, et Aliquam`

74,["performer_0"],tincidunt non interdum nunc ultrices mi accumsan elementum arcu venenatis

75,`["performer_146", "performer_42"]`, orci elementum non finibus dolor. Cras

76,`["performer_42", "performer_146"]`,`mi lectus Maecenas eleifend neque amet, `

77,["performer_146"],` platea placerat. odio Morbi rutrum, eu Cras`

I read this CSV and convert the values of the "classes" column to lists:


import pandas as pd

import ast


df = pd.read_csv(filename, quotechar='`')

df['classes'] = df['classes'].apply(lambda x: ast.literal_eval(x))

Now I want to select the rows whose "classes" value contains "performer_0". Something like this:


df['performer_0' in df['classes']]

But this code does not work:


Traceback (most recent call last):
  File "d:\pyenv\pandas\lib\site-packages\pandas\core\indexes\base.py", line 2657, in get_loc
    return self._engine.get_loc(key)
  File "pandas\_libs\index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: False


How can I do this?


杨魅力
2 Answers

沧海一幻觉

The simplest way I have found is to combine apply with boolean selection:

df[df['classes'].apply(lambda x: 'performer_0' in x)]
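To make the difference concrete, here is a minimal sketch comparing the expression from the question with the apply-based mask. It assumes a small hand-built DataFrame standing in for the parsed CSV (the rows and the names label_check, mask and selected are illustrative, not from the original post). The `in` operator applied to a Series tests the index labels rather than the values, so the original expression evaluates to a single False, and df[False] is then looked up as a column name, which is what raises KeyError: False.

```python
import pandas as pd

# Illustrative stand-in for the parsed CSV; 'classes' already holds Python lists.
df = pd.DataFrame({
    "id": [71, 72, 73, 74],
    "classes": [
        ["performer_146", "performer_42"],
        ["performer_42"],
        ["performer_146"],
        ["performer_0"],
    ],
})

# `in` on a Series checks the index labels (0, 1, 2, 3), not the values,
# so this yields one bool for the whole Series instead of a row-wise mask.
label_check = "performer_0" in df["classes"]

# Row-wise membership test: one bool per row, usable as a boolean mask.
mask = df["classes"].apply(lambda classes: "performer_0" in classes)
selected = df[mask]
print(selected)
```

Since apply calls the lambda once per row in Python, this is simple and readable but not vectorized; for a frame of this size that should not matter.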

蓝山帝景

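If you are on pandas 0.25+, you can use explode:

df[df['classes'].explode().eq('performer_0').any(level=0)]

Note that the level argument of any() was deprecated in pandas 1.3 and removed in 2.0; on newer versions, .groupby(level=0).any() is the equivalent.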
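A minimal sketch of the explode approach, written with groupby(level=0).any() so that it should run on both older and current pandas releases. The sample DataFrame and the names exploded, mask and selected are illustrative assumptions, not part of the original post.

```python
import pandas as pd

# Illustrative stand-in for the parsed CSV; 'classes' already holds Python lists.
df = pd.DataFrame({
    "id": [71, 72, 73, 74],
    "classes": [
        ["performer_146", "performer_42"],
        ["performer_42"],
        ["performer_146"],
        ["performer_0"],
    ],
})

# explode() produces one row per list element and repeats the original
# index label for every element that came from the same row.
exploded = df["classes"].explode()

# Compare element-wise, then collapse back to one bool per original row
# by grouping on the repeated index labels.
mask = exploded.eq("performer_0").groupby(level=0).any()
selected = df[mask]
print(selected)
```

The resulting mask is indexed by the original row labels, so it aligns with df when used for selection.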