Modifying code to handle a long list of strings

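I want to prepare a long list of data for a task. I have been able to put together code that does the task on a single instance, but now I want it to run over a list. Here is what I have tried.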


A single instance for testing:


sentences = ['if the stimulus bill had become hamstrung by a filibuster threat or recalcitrant conservadems']

antecedents = ['bill had become hamstrung by']

The actual use case is two columns of a pandas DataFrame, which I have converted to lists:


f = tra_df['sentence'].tolist()

b = tra_df['antecedent'].tolist()


Code for the single use case:


import re

results = []

# Build the label string: one "1" per word of the antecedent phrase
ous = 1
ayx = ' '.join([str(elem) for elem in antecedents])
ayxx = ayx.split(" ")
antlabels = []
for i in range(len(ayxx)):
    antlabels.append(ous)
lab = ' '.join([str(elem) for elem in antlabels])

# Build the regex string required
rx = '({})'.format('|'.join(re.escape(el) for el in antecedents))

# Generator to yield replaced sentences
it = (re.sub(rx, lab, sentence) for sentence in sentences)

# Build list of paired new sentences and old to filter out where not the same
results = [new_sentence for old_sentence, new_sentence in zip(sentences, it) if old_sentence != new_sentence]

# Replace every token other than '1' with '0'
nw_results = ' '.join([str(elem) for elem in results])
ew_results = nw_results.split(" ")
new_results = ['0' if i != '1' else i for i in ew_results]
labels = [int(e) for e in new_results]

labels


This is the result I get:


[0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]


Slightly modified code for the large list:


for sentences, antecedents in zip(f, b):
    gobels = []
    #def format_labels(antecedents,sentences):
    results = []
    #lab =[]
    ous = 1
    ayx = ' '.join([str(elem) for elem in antecedents])
    ayxx = ayx.split(" ")
    antlabels = []
    for i in range(len(ayxx)):
        antlabels.append(ous)
        lab = ' '.join([str(elem) for elem in antlabels])


Now I get a long list containing only 1s, rather than a list of strings of 0s and 1s.


What is going wrong?
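One likely cause, judging only from the code shown: in the single-instance test `antecedents` is a list holding one phrase, but inside the `zip(f, b)` loop it is a bare string, so the comprehension walks over characters rather than phrases. The sketch below reproduces just that step; the names `as_list` and `as_string` are illustrative and do not appear in the original code.

```python
# Illustrative names; the phrase itself is the one from the question.
as_list = ['bill had become hamstrung by']   # single-instance test: a list with one phrase
as_string = 'bill had become hamstrung by'   # what zip(f, b) yields per row: a plain string

# Iterating a list yields the whole phrase, so the split gives one item per word.
words_from_list = ' '.join([str(elem) for elem in as_list]).split(" ")

# Iterating a string yields single characters, so the join puts a space
# between every character and the split gives one item per character
# (plus empty strings where the original spaces were).
tokens_from_string = ' '.join([str(elem) for elem in as_string]).split(" ")

print(len(words_from_list))      # 5
print(len(tokens_from_string))   # much larger than 5
```

If this is what happens, `lab` ends up as a long run of 1s per row. The loop as posted also stops before the `re.sub` step and never stores a per-row result, which would need fixing too.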


开满天机
1 answer

牧羊人nacy

Something like this might scale better. There may also be a more Pythonic way to do it.

a = '1 2 3 4 5'
b = '3 4 6'
a = a.split()
b = b.split()
for idx, val in enumerate(b):
    try:
        a[a.index(val)] = True
    except ValueError:
        pass
for idx, val in enumerate(a):
    if val is not True:
        a[idx] = False
print([1.0 if i else 0.0 for i in a])
# [0.0, 0.0, 1.0, 1.0, 0.0]
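For the two-list case in the question, the same idea could be wrapped in a function and applied pairwise. This is a minimal sketch, not the answer's exact method: it assumes the antecedent should match as a contiguous run of whitespace-separated tokens, and the helper name `label_tokens` and the second sample row are hypothetical.

```python
def label_tokens(sentence, antecedent):
    """Return one 0/1 label per whitespace token; 1 marks the antecedent span."""
    tokens = sentence.split()
    target = antecedent.split()
    labels = [0] * len(tokens)
    n = len(target)
    if n == 0:
        return labels
    # Slide a window over the sentence and mark every exact token-level match.
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] == target:
            labels[start:start + n] = [1] * n
    return labels


# Stand-ins for tra_df['sentence'].tolist() and tra_df['antecedent'].tolist().
sentences = [
    'if the stimulus bill had become hamstrung by a filibuster threat or recalcitrant conservadems',
    'a b c d',  # hypothetical row whose antecedent does not occur
]
antecedents = ['bill had become hamstrung by', 'x y']

# One label list per (sentence, antecedent) pair.
all_labels = [label_tokens(s, a) for s, a in zip(sentences, antecedents)]
print(all_labels[0])  # [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
print(all_labels[1])  # [0, 0, 0, 0]
```

Unlike the regex version, this compares whole tokens, so an antecedent that only matches part of a word is left unlabeled. Rows with no match come back as all zeros instead of being dropped.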