Python: How to get all possible matches with a regular expression

I'm trying to find the text between parentheses, but I want something like the expected output below.


    import re

    s = "( abc (def) kkk ( mno) sdd ( xyz ) )"
    p = re.findall(r"\(.*?\)", s)
    for i in p:
        print(i)

Output:


( abc (def) ,

( mno),

( xyz )

Expected:


( abc (def) ,

( abc (def) kkk ( mno) ,

( abc (def) kkk ( mno) sdd ( xyz ) ,

( abc (def) kkk ( mno) sdd ( xyz ) ) ,

(def) ,

(def) kkk ( mno)  ,

(def) kkk ( mno) sdd ( xyz ) ,

(def) kkk ( mno) sdd ( xyz ) ) ,

( mno) ,

( mno) sdd ( xyz ) ,

( mno) sdd ( xyz ) ) ,

( xyz ) ,

( xyz ) ) 


幕布斯7119047
1 Answer

动漫人物

Python's built-in re module only returns non-overlapping matches, so re.findall() cannot produce this result directly. It is easier to find the positions of ( and ) in the text, combine them into (start, stop) tuples, and slice the string. This uses enumerate(iterable), collections.defaultdict() and itertools.product():

    from collections import defaultdict
    from itertools import product

    s = "( abc (def) kkk ( mno) sdd ( xyz ) )"

    # get positions of all opening and closing ()
    d = defaultdict(list)
    for idx, c in enumerate(s):
        if c in "()":
            d[c].append(idx)
    print(d)

    # combine all positions
    pos = list(product(d["("], d[")"]))
    print(pos)

    # slice the text only if the ( comes before the ), else skip
    for start, stop in pos:
        if start < stop:
            print(s[start:stop + 1])

Output:

    # d
    defaultdict(<class 'list'>, {'(': [0, 6, 16, 27], ')': [10, 21, 33, 35]})

    # pos
    [(0, 10), (0, 21), (0, 33), (0, 35), (6, 10), (6, 21), (6, 33), (6, 35),
     (16, 10), (16, 21), (16, 33), (16, 35), (27, 10), (27, 21), (27, 33), (27, 35)]

    # texts from pos
    ( abc (def)
    ( abc (def) kkk ( mno)
    ( abc (def) kkk ( mno) sdd ( xyz )
    ( abc (def) kkk ( mno) sdd ( xyz ) )
    (def)
    (def) kkk ( mno)
    (def) kkk ( mno) sdd ( xyz )
    (def) kkk ( mno) sdd ( xyz ) )
    ( mno)
    ( mno) sdd ( xyz )
    ( mno) sdd ( xyz ) )
    ( xyz )
    ( xyz ) )
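The same idea can be wrapped in a reusable function. The sketch below is one possible variant, not the answer's exact code: the name all_paren_spans is made up for illustration, and it uses re.finditer() to collect the positions instead of enumerate() with a defaultdict. It assumes, as above, that every ( paired with any later ) counts as a match.

```python
import re
from itertools import product


def all_paren_spans(text):
    """Return every substring that starts at a '(' and ends at a later ')'.

    Hypothetical helper for illustration; the name is not from the answer.
    """
    # Indices of every opening and closing parenthesis.
    opens = [m.start() for m in re.finditer(r"\(", text)]
    closes = [m.start() for m in re.finditer(r"\)", text)]
    # Pair each opening index with each closing index that comes after it.
    return [text[a:b + 1] for a, b in product(opens, closes) if a < b]


s = "( abc (def) kkk ( mno) sdd ( xyz ) )"
results = all_paren_spans(s)
for piece in results:
    print(piece)
```

Returning a list rather than printing inside the function makes it easy to filter the results further, for example to keep only spans with balanced parentheses.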