Remove a list of strings from a pandas column

I need to remove a list of strings:


list_strings=['describe','include','any']

from a pandas column:


My_Column


include details about your goal

describe expected and actual results

show some code anywhere

I tried:


df['My_Column']=df['My_Column'].str.replace('|'.join(list_strings), '')

but it also removes parts of words.


For example:


My_Column


details about your goal

expected and actual results

show some code where # here it should be anywhere

My expected output:


My_Column


details about your goal

expected and actual results

show some code anywhere 


慕娘9325324
3 Answers

慕姐4208626

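Use a "word boundary" expression such as \b. The alternation is wrapped in a non-capturing group so that \b applies to every word in the list, and regex=True is passed explicitly because pandas 2.0 and later treat the pattern as a literal string by default.

In [46]: df.My_Column.str.replace(r'\b(?:{})\b'.format('|'.join(list_strings)), '', regex=True)
Out[46]:
0         details about your goal
1     expected and actual results
2         show some code anywhere
Name: My_Column, dtype: object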
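For reference, here is a minimal self-contained sketch of the word-boundary approach. It assumes the sample data from the question; escaping the words with re.escape and the final whitespace clean-up are additions for illustration, not part of the original answer.

```python
import re
import pandas as pd

# Sample data taken from the question
df = pd.DataFrame({
    "My_Column": [
        "include details about your goal",
        "describe expected and actual results",
        "show some code anywhere",
    ]
})
list_strings = ["describe", "include", "any"]

# Escape each word, then wrap the alternation in one non-capturing group
# so that \b applies to every alternative, not just the first and last.
pattern = r"\b(?:{})\b".format("|".join(map(re.escape, list_strings)))

cleaned = (
    df["My_Column"]
    .str.replace(pattern, "", regex=True)  # drop whole words only
    .str.replace(r"\s+", " ", regex=True)  # collapse leftover double spaces
    .str.strip()                           # drop leading/trailing spaces
)
print(cleaned.tolist())
```

Because "any" in "anywhere" is followed by a word character, \b does not match there and the word is left intact.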

慕的地6264312

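Your problem is that pandas does not see words; it only sees a sequence of characters. So when you ask pandas to remove "any", it does not first split the text into words. One option is to do that yourself, perhaps like this:

import pandas as pd

# Your data
df = pd.DataFrame({'My_Column': ['Include details about your goal', 'Describe expected and actual results', 'Show some code anywhere']})
list_strings = ['describe', 'include', 'any']  # make sure it's lower case

def remove_words(s):
    # Keep only the words whose lower-cased form is not in list_strings
    if s is not None:
        return ' '.join(x for x in s.split() if x.lower() not in list_strings)

# Apply the function to your column
df.My_Column = df.My_Column.map(remove_words)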
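If the column can contain missing values, a slightly more defensive variant of the same idea might look like the sketch below. The name stop_words and the isinstance check are assumptions added for illustration; they are not part of the original answer.

```python
import pandas as pd

# Hypothetical name; a lower-case set gives fast membership tests
stop_words = {"describe", "include", "any"}

def remove_words(value):
    # Leave non-string entries (None, NaN) untouched
    if not isinstance(value, str):
        return value
    # Compare whole words, case-insensitively
    return " ".join(w for w in value.split() if w.lower() not in stop_words)

s = pd.Series(["Include details about your goal", None, "Show some code anywhere"])
result = s.map(remove_words)
```

Note that splitting on whitespace compares whole tokens, so a word with attached punctuation such as "any," would not be removed.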

慕神8447489

The first argument of the .str.replace() method must be a string or a compiled regular expression, not a list as you have. You probably want:

list_strings = ['Describe', 'Include', 'any']  # Note capital D and capital I
for s in [f"\\b{s}\\b" for s in list_strings]:  # surrounded by word boundaries (\b)
    df['My_Column'] = df['My_Column'].str.replace(s, '', regex=True)  # regex=True is required in pandas 2.0+

to get

                     My_Column
0      details about your goal
1  expected and actual results
2      Show some code anywhere
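As an alternative to looping and matching the capitalization by hand, a single compiled pattern with re.IGNORECASE can be passed instead. This is only a sketch, assuming the capitalized sample data used in the answers above; the variable name compiled is hypothetical.

```python
import re
import pandas as pd

df = pd.DataFrame({
    "My_Column": [
        "Include details about your goal",
        "Describe expected and actual results",
        "Show some code anywhere",
    ]
})
list_strings = ["describe", "include", "any"]

# One compiled pattern: whole words only, matched case-insensitively
compiled = re.compile(
    r"\b(?:{})\b".format("|".join(map(re.escape, list_strings))),
    flags=re.IGNORECASE,
)

# A compiled pattern must be used with regex=True
df["My_Column"] = df["My_Column"].str.replace(compiled, "", regex=True).str.strip()
print(df["My_Column"].tolist())
```

This makes one pass over the column rather than one pass per word.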