re.findall returns the right number of matches, but they are all empty strings

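I'm trying to build a list of literals from a string containing a PL/FOL formula. The relevant code finds the matches, but returns them as empty strings.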


I tried re.escape(formula), which had no effect. I also tried simpler variants of the findall pattern, but those produced empty lists.


import re

def clean(formula):
    formula = formula.strip()
    # Remove spaces just inside parentheses.
    formula = re.sub(r"\( +", "(", formula)
    formula = re.sub(r" +\)", ")", formula)
    # Put exactly one space on each side of every binary operator.
    formula = re.sub(r"(?P<b_ops>[&v→↔])", r" \g<b_ops> ", formula)
    formula = re.sub(r"[ ]+", " ", formula)
    # Make an inventory of literals for the original formula.
    orig_lit_inv = re.findall(r"[~]*[A-Z]([a-u]|[w-z]){0,}", formula)
    print(orig_lit_inv)

this_WFF = "(P) & ~(~(Q → (R & ~S)))"
clean(formula=this_WFF)


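When I print the result, I get ['', '', '', '']. In other words, it finds the matches but returns empty strings for them, when it should at least return the letter matched by [A-Z]. With this_WFF as the argument, clean(formula) should print ['P', 'Q', 'R', '~S'].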


SMILET
1 Answer

凤凰求蛊

Quoting the documentation for re.findall:

    If one or more capturing groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group.

Your regex contains one capturing group, so findall returns only what that group captured and never the part matched by [A-Z]. In this_WFF no literal has a lowercase letter after the capital, so the group repeats zero times and its content is the empty string for every match.

Change ([a-u]|[w-z]) to (?:[a-u]|[w-z]) to see the difference:

>>> import re
>>> this_WFF = "(P) & ~(~(Q → (R & ~S)))"
>>> def clean(formula):
...     formula = formula.strip()
...     formula = re.sub(r"\( +", "(", formula)
...     formula = re.sub(r" +\)", ")", formula)
...     formula = re.sub(r"(?P<b_ops>[&v→↔])", r" \g<b_ops> ", formula)
...     formula = re.sub(r"[ ]+", " ", formula)
...     # Make an inventory of literals for the original formula.
...     orig_lit_inv = re.findall(r"[~]*[A-Z]([a-u]|[w-z]){0,}", formula)
...     print(orig_lit_inv)
...
>>> clean(this_WFF)
['', '', '', '']
>>> def clean(formula):
...     formula = formula.strip()
...     formula = re.sub(r"\( +", "(", formula)
...     formula = re.sub(r" +\)", ")", formula)
...     formula = re.sub(r"(?P<b_ops>[&v→↔])", r" \g<b_ops> ", formula)
...     formula = re.sub(r"[ ]+", " ", formula)
...     # Make an inventory of literals for the original formula.
...     orig_lit_inv = re.findall(r"[~]*[A-Z](?:[a-u]|[w-z]){0,}", formula)
...     print(orig_lit_inv)
...
>>> clean(this_WFF)
['P', 'Q', 'R', '~S']

Since the regex no longer contains a capturing group, findall simply returns the contents of "group 0" (that is, the entire match) in the result.
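
The capturing-group behavior can also be checked in isolation, outside clean(). The sketch below is only an illustration: the variable names and the second test string "~Pab v Qc" are made up for the demo, and the single character class [a-uw-z] is an equivalent simplification of ([a-u]|[w-z]) that is not part of the original answer.

```python
import re

formula = "(P) & ~(~(Q → (R & ~S)))"
with_suffix = "~Pab v Qc"  # hypothetical input with lowercase suffixes

capturing_pat = r"[~]*[A-Z]([a-u]|[w-z]){0,}"
non_capturing_pat = r"[~]*[A-Z](?:[a-u]|[w-z]){0,}"
no_group_pat = r"~*[A-Z][a-uw-z]*"  # same language, no group needed

# With a capturing group, findall returns only the group's content.
# The group repeats zero times here, so each result is ''.
capturing = re.findall(capturing_pat, formula)

# When the group does repeat, only its last iteration is kept.
capturing_suffix = re.findall(capturing_pat, with_suffix)

# Without a capturing group, findall returns the whole match.
non_capturing = re.findall(non_capturing_pat, formula)
no_group = re.findall(no_group_pat, with_suffix)

# finditer yields match objects, so group(0) is the full match
# even when the pattern still contains a capturing group.
via_finditer = [m.group(0) for m in re.finditer(capturing_pat, formula)]

print(capturing)         # ['', '', '', '']
print(capturing_suffix)  # ['b', 'c']
print(non_capturing)     # ['P', 'Q', 'R', '~S']
print(no_group)          # ['~Pab', 'Qc']
print(via_finditer)      # ['P', 'Q', 'R', '~S']
```

If the group is needed for some other reason, re.finditer with m.group(0) is probably the least invasive option; otherwise dropping the group or making it non-capturing keeps the findall call unchanged.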