How can I use one pandas DataFrame column as a regular expression to replace text in another column in Python?

I want to find text in one column based on another column ('words') of a pandas DataFrame.

#import re

import pandas as pd

df = pd.DataFrame([['I like apple pie','apple'],['Nice banana and lemon','banana|lemon']], columns=['text','words'])

df['text'] = df['text'].str.replace(r''+df['words'].str, '*'+group(0)+'*')

I want to mark the words that are found with *.

How can I do that?

The desired output is:

I like *apple* pie

Nice *banana* and *lemon*

慕桂英546537


2 Answers

喵喔喔

You can capture the word from words and use a backreference in the replacement to wrap it in *:

import re
import pandas as pd
df = pd.DataFrame([['I like apple pie','apple'],['Nice banana and     lemon','banana|lemon']], columns=['text','words'])
df['text'] = df['text'].replace(r'('+df['words']+')', r'*\1*', regex=True)
print(df)

Prints:

                            text         words
0             I like *apple* pie         apple
1  Nice *banana* and     *lemon*  banana|lemon
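For comparison, here is a minimal row-wise sketch using only re.sub from the standard library. It pairs each text with the pattern from its own row, and it uses \g<0> (the whole match) in the replacement, which is roughly what the group(0) in the question was reaching for. The helper name mark_words is made up for illustration and is not part of pandas.

```python
import re
import pandas as pd

df = pd.DataFrame(
    [['I like apple pie', 'apple'],
     ['Nice banana and lemon', 'banana|lemon']],
    columns=['text', 'words'],
)

def mark_words(text, pattern):
    # Hypothetical helper: wrap every match of `pattern` in asterisks.
    # \g<0> refers to the whole match, so the pattern needs no capture group.
    return re.sub(pattern, r'*\g<0>*', text)

# Apply each row's own pattern to that row's text only.
df['text'] = [mark_words(t, p) for t, p in zip(df['text'], df['words'])]
print(df['text'].tolist())
```

This trades the vectorised call for a plain Python loop, which may be slower on large frames but makes the row-to-pattern pairing explicit.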

Cats萌萌

IIUC, use (?i), the inline equivalent of re.I:

df.text.replace(regex=r'(?i)'+ df.words,value="*")
Out[131]: 
0        I like * pie
1    Nice * and     *
Name: text, dtype: object

Since you updated the question:

df.words=df.words.str.split('|')
s=df.words.apply(pd.Series).stack()
df.text.replace(dict(zip(s,'*'+s+'*')),regex=True)
Out[139]: 
0               I like *apple* pie
1    Nice *banana* and     *lemon*
Name: text, dtype: object
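If the entries in words are meant as literal words rather than regex syntax, a case-insensitive variant could look like the sketch below. It assumes '|' is only ever a separator, escapes each word with re.escape, adds \b word boundaries, and passes re.IGNORECASE instead of the inline (?i). The sample data has been changed to mixed case to show the effect, and the helper name mark_words_ignore_case is hypothetical.

```python
import re
import pandas as pd

# Sample data altered from the question (mixed case) for illustration only.
df = pd.DataFrame(
    [['I like Apple pie', 'apple'],
     ['Nice BANANA and lemon', 'banana|lemon']],
    columns=['text', 'words'],
)

def mark_words_ignore_case(text, words):
    # Assumption: '|' separates literal words; escape any regex metacharacters.
    alternatives = '|'.join(re.escape(w) for w in words.split('|'))
    # \b keeps 'apple' from matching inside a longer word such as 'pineapple'.
    pattern = r'\b(?:' + alternatives + r')\b'
    # \g<0> re-inserts the matched text, so its original casing is preserved.
    return re.sub(pattern, r'*\g<0>*', text, flags=re.IGNORECASE)

marked = [mark_words_ignore_case(t, w) for t, w in zip(df['text'], df['words'])]
print(marked)
```

Unlike replacing with a fixed "*", this keeps the matched word in the output and only adds the markers around it.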
