
How to use a regex to remove a specific portion of text after a regex match

Suppose I have this list:


names = ['your name', 'the name', 'his name', 'her name', 'their name', 'employer name', "employer's name", "father's name",
        "mother's name", "maiden name", "son's name", "daughter's name", "brother's name", "sister's name"]

And suppose I have this text:


text = "What is your name?  Well,  uh it's John Smith.  Thanks for asking. Anyway, I'd doing well."

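How can I use a regex to find each element of the names list in the text and replace the block of text immediately following that element (say, 50 characters long) with "[name]"? My output would then be: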


text = "What is your name [name] Anyway, I'd doing well."

So far I have the code below, but it replaces only the element itself with "[name]", not the text that follows the element.


import re

def my_replace3(match):
    # The match is ignored, so the matched name itself is replaced
    return " [name] "

def no_name(text):
    names = ['your name', 'the name', 'his name', 'her name', 'their name', 'employer name', "employer's name", "father's name",
        "mother's name", "maiden name", "son's name", "daughter's name", "brother's name", "sister's name"]
    regex = re.compile(r'\b(' + '|'.join(names) + r')\b', re.IGNORECASE)
    text = re.sub(regex, my_replace3, text)
    return text

I'm not much of a regex expert, so any help would be greatly appreciated.


www说
1 answer

三国纷争

If you want to replace the 50 characters after the match, append .{50} to the regex. Then use a backreference in the replacement string to copy the matched name into the replacement.

import re

def no_name(text):
    names = ['your name', 'the name', 'his name', 'her name', 'their name', 'employer name', "employer's name", "father's name",
        "mother's name", "maiden name", "son's name", "daughter's name", "brother's name", "sister's name"]
    regex = re.compile(r'\b(' + '|'.join(map(re.escape, names)) + r')\b.{50}', re.IGNORECASE)
    text = re.sub(regex, r'\1 [name]', text)
    return text

You should also use re.escape() whenever you insert strings that are meant to match literally into a regex, in case any of them contains regex operators.