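I am trying to return a list of tuples, each consisting of a keyword and the number of times that keyword and its synonyms appear in each document.
When the input is a single string (e.g. "happy") I have no problem, but when I try more inputs (e.g. "happy" and "sad"), the code only prints the output for the last string ("sad").
Here is my code: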
class Entry:
    def __init__(self, input_word, input_synonyms):
        self.word = input_word
        self.synonyms = input_synonyms

e1 = Entry("sad", ["unhappy", "upset"])
e2 = Entry("happy", ["cheerful", "joyful"])
Thesaurus = [e1, e2]

doc1 = ["the", "man", "is", "sad", "very", "sad", "and", "unhappy", "and", "upset"]
doc2 = ["the", "boy", "is", "happy", "cheerful", "and", "joyful"]
Corpus = [doc1, doc2]

def search(keyword):
    all_words = [keyword]
    for entry in Thesaurus:
        if entry.word == keyword:
            for word in entry.synonyms:
                all_words.append(word)
    store = []
    for search_word in all_words:
        count = 0
        for document in Corpus:
            for word in document:
                if search_word == word:
                    count = count + 1
        store.append([search_word, count])
    return store

input_ = "happy" and "sad"
output = search(input_)
print(output)
Console output:
[['sad', 2], ['unhappy', 1], ['upset', 1]]
Expected output:
[['happy', 1], ['cheerful', 1], ['joyful', 1], ['sad', 2], ['unhappy', 1], ['upset', 1]]
Is there any way to fix this?
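One likely cause: in Python, "happy" and "sad" is a boolean expression, not a pair of inputs. Because "happy" is a non-empty (truthy) string, the expression evaluates to its last operand, "sad", so search only ever receives one keyword. Below is a minimal sketch of one possible fix, assuming it is acceptable to pass the keywords as a list. The helper name search_many is hypothetical and not part of the original code, and search is condensed with list.extend and list.count while keeping the original behaviour of summing counts across the whole Corpus.

```python
class Entry:
    def __init__(self, input_word, input_synonyms):
        self.word = input_word
        self.synonyms = input_synonyms

Thesaurus = [
    Entry("sad", ["unhappy", "upset"]),
    Entry("happy", ["cheerful", "joyful"]),
]
Corpus = [
    ["the", "man", "is", "sad", "very", "sad", "and", "unhappy", "and", "upset"],
    ["the", "boy", "is", "happy", "cheerful", "and", "joyful"],
]

# `and` returns its last operand when the first one is truthy,
# so this is just the single string "sad".
print("happy" and "sad")  # sad

def search(keyword):
    # Collect the keyword plus any synonyms listed in the thesaurus.
    all_words = [keyword]
    for entry in Thesaurus:
        if entry.word == keyword:
            all_words.extend(entry.synonyms)
    # Count each word across every document in the corpus.
    return [[w, sum(doc.count(w) for doc in Corpus)] for w in all_words]

def search_many(keywords):
    # Hypothetical helper: run search() once per keyword and
    # concatenate the results in the order the keywords were given.
    results = []
    for keyword in keywords:
        results.extend(search(keyword))
    return results

output = search_many(["happy", "sad"])
print(output)
# [['happy', 1], ['cheerful', 1], ['joyful', 1], ['sad', 2], ['unhappy', 1], ['upset', 1]]
```

Note that this still sums the counts over all documents, as the original search does. If a separate count per document is wanted, the inner sum would need to be replaced by one count per entry in Corpus.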
月关宝盒