How do I build a list of the files in a directory and process them one by one? - Python

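I want to list all the text files in a directory and then create a separate list holding the contents of each file, for example document1=[], then document2=[], and so on. Using the names document1 and document2, I then want to compute word frequencies and run other processing. The code below runs, but it does not give the lists distinct names such as document1.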

import glob
import math
import re

a = 0  # number of documents processed

# get all the files
flist = glob.glob(r'D:/Final Year Project/Development process/Text_data_extraction/MyFolder/*.txt')

# open each file >> tokenize the content >> and store it in a set
for fname in flist:
    tfile = open(fname, "r")
    line = tfile.read()
    a += 1
    line = line.lower()  # lowercase
    line = re.sub("</?.*?>", " <> ", line)  # remove tags
    line = re.sub("(\\d|\\W)+", " ", line)  # remove special characters and digits
    l_ist = line.split("\n")
    print('document')
    print(l_ist)
    tfile.close()  # close the file

print("Number of documents:")
print(a)

暮色呼如

Views: 188 · Answers: 2

2 answers

慕尼黑的夜晚无繁华

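I believe that instead of giving only the text file names, you should give the directory path together with the naming pattern. For "document1, document2, ..." use a loop, or, if the number of document files is fixed, loop over that count.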
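One way to read the suggestion above: rather than creating separate variables named document1, document2, and so on, keep the token lists in a dictionary keyed by those names and fill it in a loop. The following is only a minimal sketch under some assumptions. The helper names tokenize, load_documents and word_frequencies are made up for illustration, the "MyFolder" path is a placeholder, the files are assumed to be UTF-8, and collections.Counter stands in for whatever frequency calculation is actually needed.

```python
import glob
import os
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text, drop tags, digits and punctuation, and split on whitespace."""
    text = text.lower()
    text = re.sub(r"</?.*?>", " ", text)   # remove tags
    text = re.sub(r"(\d|\W)+", " ", text)  # remove digits and special characters
    return text.split()


def load_documents(pattern):
    """Return a dict such as {'document1': [tokens], 'document2': [tokens], ...}."""
    documents = {}
    # sorted() makes the numbering reproducible (plain lexicographic order of file names)
    for index, fname in enumerate(sorted(glob.glob(pattern)), start=1):
        # Assumption: the files are UTF-8 encoded
        with open(fname, "r", encoding="utf-8") as tfile:
            documents["document%d" % index] = tokenize(tfile.read())
    return documents


def word_frequencies(documents):
    """Map each document name to a Counter of its tokens."""
    return {name: Counter(tokens) for name, tokens in documents.items()}


if __name__ == "__main__":
    # Hypothetical folder; replace with the real directory path
    docs = load_documents(os.path.join("MyFolder", "*.txt"))
    for name, counts in word_frequencies(docs).items():
        print(name, counts.most_common(5))
    print("Number of documents:", len(docs))
```

With this layout, docs["document1"] plays the role of the document1 list from the question, and len(docs) replaces the manual counter. Note that sorted() orders file names as strings, so a file named doc10.txt would be numbered before doc2.txt.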
