I want to count the occurrences of specific words (conjunctions: "also", "although", "and", "as", "because", "before", "but", "for", "if", "nor", "of", "or", "since", "that", "though", "until", "when", "whenever", "whereas", "which", "while", "yet") as well as punctuation marks in a txt file.
This is what I have done:
def count(fname, words_list):
    if fname:
        try:
            file = open(str(fname), 'r')
            full_text = file.readlines()
            file.close()
            count_result = dict()
            for word in words_list:
                for text in full_text:
                    if word in count_result:
                        count_result[word] = count_result[word] + text.count(word)
                    else:
                        count_result[word] = text.count(word)
            return count_result
        except:
            print('Something really bad just happened!')
print(count('sample2.txt', ["also", "although", "and", "as", "because", "before", "but", "for", "if", "nor", "of",
                            "or", "since", "that", "though", "until", "when", "whenever", "whereas",
                            "which", "while", "yet", ",", ";", "-", "'"]))
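But it counts substrings, so for example the "as" inside "was" is counted as an occurrence of "as". How can I fix this, or is there another way to achieve it? Thanks.
The expected output is something like: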
{'also': 0, 'although': 0, 'and': 27, 'as': 2, 'because': 0, 'before': 2, 'but': 4, 'for': 2, 'if': 2, 'nor': 0, 'of': 13, 'or': 2, 'since': 0, 'that': 10, 'though': 2, 'until': 0, 'when': 3, 'whenever': 0, 'whereas': 0, 'which': 0, 'while': 0, 'yet': 0, ',': 41, ';': 3, '-': 1, "'": 17, 'words_per_sentence': 25.4286, 'sentences_per_par': 1.75}
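One possible way to stop words from being counted inside longer words is to split the text into whole-word tokens first and count those, while still counting punctuation with plain str.count. The sketch below is only an illustration under some assumptions: the function name count_tokens and the sample sentence are made up, matching is made case-insensitive by lowercasing (the code in the question is case-sensitive), and a token is taken to be a run of ASCII letters, so "that's" is split into "that" and "s". It does not compute words_per_sentence or sentences_per_par.

```python
import re
from collections import Counter


def count_tokens(text, words, punctuation):
    """Count whole-word occurrences of `words` and occurrences of `punctuation` marks.

    Hypothetical helper for illustration; not taken from the question.
    """
    # Tokenise into runs of letters, so "was" is a single token
    # and can never be counted as "as".
    tokens = Counter(re.findall(r"[a-z]+", text.lower()))
    result = {word: tokens[word] for word in words}
    # Punctuation marks are single characters here, so substring counting is safe.
    for mark in punctuation:
        result[mark] = text.count(mark)
    return result


sample = "It was as cold as ice, and I was late; but that's fine."
print(count_tokens(sample, ["as", "and", "but", "or"], [",", ";", "'"]))
# {'as': 2, 'and': 1, 'but': 1, 'or': 0, ',': 1, ';': 1, "'": 1}
```

To apply this to a file, the whole content could be read with open(fname).read() and passed in as text. If case-sensitive counts are needed, the .lower() call can be dropped and the pattern changed to match both cases.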