This program is meant to find the similarity between a sentence and a set of words, as well as their similarity in terms of synonyms. When I first wrote it I downloaded nltk, and it ran without errors, but when I ran the program again a few days later I got an error.
import nltk
nltk.download('stopwords')
nltk.download('wordnet')
nltk.download('punkt')  # word_tokenize needs the punkt tokenizer models
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.corpus import wordnet as wn
filtered_uploaded_sentences = []
uploaded_sentence_synset = []
database_word_synset = []
uploaded_doc_sentence=" The issue of text semantics, such as word semantics and sentence semantics has received increasing attentions in recent years. However, rare research focuses on the document-level semantic matching due to its complexity. Long documents usually have sophisticated structure and massive information, which causes hardship to measure their semantic similarity. The semantic similarity between words, sentences, texts, and documents is widely studied in various fields, including natural language processing, document semantic comparison, artificial intelligence, semantic web, and semantic search engines. "
database_word=["car","complete",'focus',"semantics"]
stop_words = stopwords.words('english')  # avoid shadowing the stopwords module
uploaded_sentence_words_tokenized = word_tokenize(uploaded_doc_sentence)
#filtering the sentence and synset
for word in uploaded_sentence_words_tokenized:
    if word not in stop_words:
        filtered_uploaded_sentences.append(word)
print (filtered_uploaded_sentences)
for sentences_are in filtered_uploaded_sentences:
    uploaded_sentence_synset.append(wn.synsets(sentences_are))
print(uploaded_sentence_synset)
#for finding similarity in the words
for databasewords in database_word:
    database_word_synset.append(wn.synsets(databasewords)[0])
print(database_word_synset)
IndexError: list index out of range
This error appears when uploaded_doc_sentence is short, and also when a long sentence is used. It is raised at this line:

check.append(wn.wup_similarity(data, sen[0]))

I want to compare the sentences with the words and store the results, something of this type:
#the similarity main function for words
check = []
for data in database_word_synset:
    for sen in uploaded_sentence_synset:
        check.append(wn.wup_similarity(data, sen[0]))
print(check)
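The likely cause of the IndexError is that `wn.synsets(token)` returns an empty list for tokens WordNet does not know (punctuation, inflected or rare words such as "attentions"), so `sen[0]` fails on those entries. A minimal sketch of the guard that avoids this, using a hypothetical `fake_synsets` stand-in for `wn.synsets` (the real lookup needs the WordNet corpus downloaded):

```python
# Toy stand-in for wn.synsets: unknown tokens come back as an empty
# list, exactly like the real WordNet lookup does.
def fake_synsets(token):
    known = {"car": ["car.n.01"], "focus": ["focus.n.01"]}
    return known.get(token, [])  # unknown token -> []

tokens = ["car", ",", "focus", "attentions"]
uploaded_sentence_synset = [fake_synsets(t) for t in tokens]

check = []
for sen in uploaded_sentence_synset:
    if sen:                      # guard: skip tokens with no synsets
        check.append(sen[0])     # safe now; sen is non-empty
print(check)                     # only tokens with synsets survive
```

In the real loop the same guard wraps the similarity call: `if sen: check.append(wn.wup_similarity(data, sen[0]))`. Note that `wup_similarity` itself can also return `None` for some synset pairs, so the stored results may need a second filter.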