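I have a folder containing thousands of files. I am trying to parse the XML tags in them with beautifulsoup4.
I can do this for each file individually, but I can't get my script to work with a for loop.
Here is my code so far: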
import bs4 as bs
import glob

path = r"~/Desktop/pythontest/*.txt"
files = glob.glob(path)

# ------------------------READ AND PARSE TEXT-----------------------------------------
for f in files:
    # open file in read mode
    source = open(f, "rt")
    # parse xml as soup
    soup = bs.BeautifulSoup(source, "lxml")
    soupText = soup.get_text()
    text = soupText.replace(r"\n", " ")
    # close file
    source.close()

# --------------------------OVERWRITE FILE---------------------------------------------
for f in files:
    # open file in write mode
    source = open(f, "wt")
    # overwrite the file with the soup
    source.write((text))
    # close file
    source.close()

print(text)
When I run it, the console gives me this:
Traceback (most recent call last):
  File "./camltest.py", line 34, in <module>
    print(text)
NameError: name 'text' is not defined
I suspect this is a scope problem, but I can't fix it. Any suggestions? Thanks.
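
One possible reading of the traceback, offered as a guess rather than a confirmed diagnosis: `glob.glob` does not expand `~`, so `files` may be an empty list. In that case neither loop body runs, and `text` is never assigned before `print(text)` is reached. Two further things stand out in the posted code. Writing in a separate second loop would put the text of the last file into every file, and `r"\n"` matches a literal backslash followed by `n`, not a newline. The sketch below reads and rewrites each file within a single loop. The names `rewrite_files` and `extract_text` are hypothetical, chosen only for this illustration, and the tag-stripping step is passed in as a function so that the sketch itself needs only the standard library.

```python
import glob
import os


def rewrite_files(pattern, extract_text):
    """Overwrite each file matching `pattern` with extract_text(its contents).

    `rewrite_files` and `extract_text` are hypothetical names for this sketch.
    """
    # glob does not expand "~", so expand it explicitly first.
    files = glob.glob(os.path.expanduser(pattern))
    if not files:
        # With no matches the loop body would never run; fail loudly instead.
        raise FileNotFoundError(f"no files match {pattern!r}")

    for path in files:
        # Read and convert one file ...
        with open(path, "rt") as source:
            text = extract_text(source.read())
        # "\n" (not r"\n") matches real newline characters.
        text = text.replace("\n", " ")
        # ... and write it back before moving on to the next file.
        with open(path, "wt") as target:
            target.write(text)
    return files
```

With the setup from the question, this could be called as `rewrite_files("~/Desktop/pythontest/*.txt", lambda markup: bs.BeautifulSoup(markup, "lxml").get_text())`. Since the files are overwritten in place, trying it on a copy of the folder first seems prudent.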