Removing a block of sentences whose start and end are clearly marked

As you said, each block is delimited by `###` and `$$$`. We can treat the text as a sequence of index ranges between those markers, and use `re.finditer` to locate the block boundaries.

    import re

    # `text` is the input string from the question.
    # Start index of every '###' marker.
    starts = [m.start() for m in re.finditer(re.escape('###'), text)]
    # [0, 105, 349]

    # End index of every '$$$' marker ('$' is a regex metacharacter, so escape it).
    ends = [m.end() for m in re.finditer(re.escape('$$$'), text)]
    # [104, 348, 558]

    # Pair each start with its matching end.
    nBlocks = [list(pair) for pair in zip(starts, ends)]
    # [[0, 104], [105, 348], [349, 558]]

    # Find where the search phrase occurs.
    find = '22 feb 2017 21 april'
    where = text.index(find)
    # 10

    # Remove the block that contains the phrase.
    cleanText = text
    for start, end in nBlocks:
        if start <= where < end:
            # text before the block + text after it
            # (end + 1 skips the newline that follows '$$$')
            cleanText = text[:start] + text[end + 1:]
            break

    print(cleanText)

Output:

    ###
    risk true stories people never thought they d dare share
    risk true stories people never
    true stories people never thought
    stories people never thought they
    people never thought they d
    never thought they d dare
    thought they d dare share
    $$$
    ###
    everyone hanging out without me mindy kaling non fiction
    everyone hanging out without me
    hanging out without me mindy
    out without me mindy kaling
    without me mindy kaling non
    me mindy kaling non fiction
    $$$
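For comparison, the index bookkeeping can be folded into a single non-greedy regex that matches a whole block at a time. This is only a sketch of an alternative, not the approach used in the answer: the helper name `remove_block` and the `sample` string are made up for illustration, and it assumes blocks are well formed and never nested.

```python
import re

def remove_block(text, phrase, start="###", end="$$$"):
    """Return `text` without the first start...end block that contains `phrase`.

    Hypothetical helper: assumes every start marker has a matching end
    marker and that blocks do not nest.
    """
    # Non-greedy match from a start marker to the nearest end marker,
    # plus the newline that follows the end marker, if any.
    pattern = re.compile(
        re.escape(start) + r".*?" + re.escape(end) + r"\n?",
        re.DOTALL,
    )
    for match in pattern.finditer(text):
        if phrase in match.group(0):
            # Keep everything before and after the matching block.
            return text[:match.start()] + text[match.end():]
    # Phrase not found inside any block: leave the text unchanged.
    return text

# Made-up sample data with two blocks.
sample = (
    "###\nalpha one\nalpha two\n$$$\n"
    "###\nbeta one\nbeta two\n$$$\n"
)

print(remove_block(sample, "beta one"))
```

Returning the text unchanged when the phrase is absent avoids the exception that `text.index` raises in that case.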

