Python's built-in re module does not return overlapping matches. It is easier to find the positions of ( and ) in the text, build sensible (start, end) tuples from them, and slice the string.

Using enumerate(iterable), collections.defaultdict() and itertools.product():

    from collections import defaultdict
    from itertools import product

    s = "( abc (def) kkk ( mno) sdd ( xyz ) )"

    # get positions of all opening and closing ()
    d = defaultdict(list)
    for idx, c in enumerate(s):
        if c in "()":
            d[c].append(idx)
    print(d)

    # combine every "(" position with every ")" position
    pos = list(product(d["("], d[")"]))
    print(pos)

    # slice the text; skip pairs where the ")" comes before the "("
    for start, stop in pos:
        if start < stop:
            print(s[start:stop + 1])

Output:

    # d
    defaultdict(<class 'list'>, {'(': [0, 6, 16, 27], ')': [10, 21, 33, 35]})

    # pos
    [(0, 10), (0, 21), (0, 33), (0, 35), (6, 10), (6, 21), (6, 33), (6, 35), (16, 10), (16, 21), (16, 33), (16, 35), (27, 10), (27, 21), (27, 33), (27, 35)]

    # texts from pos
    ( abc (def)
    ( abc (def) kkk ( mno)
    ( abc (def) kkk ( mno) sdd ( xyz )
    ( abc (def) kkk ( mno) sdd ( xyz ) )
    (def)
    (def) kkk ( mno)
    (def) kkk ( mno) sdd ( xyz )
    (def) kkk ( mno) sdd ( xyz ) )
    ( mno)
    ( mno) sdd ( xyz )
    ( mno) sdd ( xyz ) )
    ( xyz )
    ( xyz ) )
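If this is needed in more than one place, the same position-and-slice approach can be wrapped in a small helper. The sketch below is only one possible packaging: the function name paren_slices is hypothetical and not part of any library, and it assumes the brackets of interest are exactly "(" and ")".

```python
from collections import defaultdict
from itertools import product


def paren_slices(text):
    """Return every substring that starts at a "(" and ends at a later ")".

    Hypothetical helper name; it only repackages the enumerate /
    defaultdict / product approach described in this answer.
    """
    positions = defaultdict(list)
    for idx, ch in enumerate(text):
        if ch in "()":
            positions[ch].append(idx)

    # .get() avoids creating empty entries when a bracket type is absent.
    opens = positions.get("(", [])
    closes = positions.get(")", [])

    # Keep only pairs where the opening bracket precedes the closing one.
    return [text[start:stop + 1]
            for start, stop in product(opens, closes)
            if start < stop]


s = "( abc (def) kkk ( mno) sdd ( xyz ) )"
results = paren_slices(s)
print(len(results))   # 13
print(results[4])     # (def)
```

Note that the slices are not required to be balanced: "( abc (def)" is returned even though its first bracket is never closed inside the slice. If only balanced groups are wanted, an extra filter would be needed on top of this.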