
Counting a list of specific words in a text file - Python

I want to count the occurrences of specific words (conjunctions: "also", "although", "and", "as", "because", "before", "but", "for", "if", "nor", "of", "or", "since", "that", "though", "until", "when", "whenever", "whereas", "which", "while", "yet") as well as punctuation marks in a txt file.


This is what I have done:


def count(fname, words_list):
    if fname:
        try:
            file = open(str(fname), 'r')
            full_text = file.readlines()
            file.close()
            count_result = dict()
            for word in words_list:
                for text in full_text:
                    if word in count_result:
                        count_result[word] = count_result[word] + text.count(word)
                    else:
                        count_result[word] = text.count(word)
            return count_result
        except:
            print('Something really bad just happened!')


print(count('sample2.txt', ["also", "although", "and", "as", "because", "before", "but", "for", "if", "nor", "of",
                            "or", "since", "that", "though", "until", "when", "whenever", "whereas",
                            "which", "while", "yet", ",", ";", "-", "'"]))

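But it also counts matches inside other words: any word that merely contains "as" is counted as "as". How can I fix this, or is there another way to achieve it? Thanks.


The expected output is something like: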


{'also': 0, 'although': 0, 'and': 27, 'as': 2, 'because': 0, 'before': 2, 'but': 4, 'for': 2, 'if': 2, 'nor': 0, 'of': 13, 'or': 2, 'since': 0, 'that': 10, 'though': 2, 'until': 0, 'when': 3, 'whenever': 0, 'whereas': 0, 'which': 0, 'while': 0, 'yet': 0, ',': 41, ';': 3, '-': 1, "'": 17, 'words_per_sentence': 25.4286, 'sentences_per_par': 1.75}


人到中年有点甜
2 Answers

料青山看我应如是

def word_count(fname, word_list):
    # Start every target word at zero so absent words still appear in the result
    count_w = dict()
    for w in word_list:
        count_w[w] = 0
    with open(fname) as input_text:
        text = input_text.read()
        words = text.lower().split()
        for word in words:
            # Strip surrounding punctuation so "and," is compared as "and"
            _word = word.strip('.,:-)()')
            if _word in count_w:
                count_w[_word] += 1
    return count_w


def punctaction_count(fname, punctaction):
    count_p = dict()
    for p in punctaction:
        count_p[p] = 0
    with open(fname) as input_text:
        # Punctuation is counted character by character
        for c in input_text.read():
            if c in punctaction:
                count_p[c] += 1
    return count_p


print(word_count('c_prog.txt', ["also", "although", "and", "as", "because", "before", "but", "for", "if", "nor",
                                "of", "or", "since", "that", "though", "until", "when", "whenever", "whereas",
                                "which", "while", "yet"]))
print(punctaction_count('c_prog.txt', [",", ";", "-", "'"]))

If you want to do this in one function:

def word_count(fname, word_list, punctaction):
    count_w = dict()
    for w in word_list:
        count_w[w] = 0
    count_p = dict()
    for p in punctaction:
        count_p[p] = 0
    with open(fname) as input_text:
        text = input_text.read()
        words = text.lower().split()
        for word in words:
            _word = word.strip('.,:-)()')
            if _word in count_w:
                count_w[_word] += 1
        for c in text:
            if c in punctaction:
                count_p[c] += 1
    # Merge the punctuation counts into the word counts
    count_w.update(count_p)
    return count_w


print(word_count('c_prog.txt', ["also", "although", "and", "as", "because", "before", "but", "for", "if", "nor",
                                "of", "or", "since", "that", "though", "until", "when", "whenever", "whereas",
                                "which", "while", "yet"], [",", ";", "-", "'"]))
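The answer above avoids the problem because it compares whole tokens, whereas str.count matches substrings. As an alternative, here is a minimal sketch using regular-expression word boundaries. It is an illustration, not part of the original answer; the function name count_whole_words and the sample sentence are made up for the example.

```python
import re


def count_whole_words(text, words):
    """Count whole-word, case-insensitive occurrences of each word in text."""
    counts = {}
    for w in words:
        # \b matches at a word boundary, so "as" does not match inside "has"
        pattern = r'\b' + re.escape(w) + r'\b'
        counts[w] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


sample = "He has as many as she has."
print(sample.count("as"))                 # substring count: 4
print(count_whole_words(sample, ["as"]))  # {'as': 2}
```

This only works for the words. Punctuation marks such as "," have no word boundary around them, so they are better counted per character, as in the answer above.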

Cats萌萌

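Since Python 2.7 and 3.1 there is a special Counter dict (collections.Counter) for what you are trying to achieve.

Since you haven't posted any sample output, I'll just give you an approach you can use. Maintain a list and append to it each of the words you care about as you come across them. For example, when you reach the word "also", append it to the list:

>>> l = []
>>> l.append("also")
>>> l
['also']

Similarly, when you encounter the word "although", the list becomes:

>>> l.append("although")
>>> l
['also', 'although']

If you encounter "also" again, append it to the list again. The list becomes:

['also', 'although', 'also']

Now use Counter to count the occurrences of the list elements:

>>> from collections import Counter
>>> l = ['also', 'although', 'also']
>>> result = Counter(l)
>>> l
['also', 'although', 'also']
>>> result
Counter({'also': 2, 'although': 1})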
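To tie the Counter idea back to the question, here is a rough sketch that works on text already read from the file. It is not from the original answer: the name count_terms, the tokenizing regex, and the sample sentence are assumptions made for illustration.

```python
import re
from collections import Counter


def count_terms(text, words, punctuation):
    """Return whole-word counts for words and per-character counts for punctuation."""
    # Assumption: a "word" is a run of ASCII letters; the text is lowercased first
    tokens = re.findall(r"[a-z]+", text.lower())
    word_counts = Counter(tokens)
    char_counts = Counter(text)
    # A Counter returns 0 for missing keys, so absent words still appear in the result
    result = {w: word_counts[w] for w in words}
    result.update({p: char_counts[p] for p in punctuation})
    return result


sample = "Although it rained, we stayed; and we also sang - and it's fine."
print(count_terms(sample, ["also", "although", "and", "as"], [",", ";", "-", "'"]))
# {'also': 1, 'although': 1, 'and': 2, 'as': 0, ',': 1, ';': 1, '-': 1, "'": 1}
```

For a file, the text could be read with open(fname).read() and passed in as the first argument.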