I want to get the text of the a tags, and this is what I wrote:
import time
from urllib.request import urlopen
from bs4 import BeautifulSoup
import re
import chardet
html = urlopen("http://www.gdmzzx.com/html/xiaoyuandongtai/")
bsObj = BeautifulSoup(html)
list_all = bsObj.findAll("a",href = re.compile("/html/xiaoyuandongtai/.+"))
for each in list_all:
    print(isinstance(each.get_text(), str))
    print(each.get_text())
    #print(chardet.detect(each))
    #print(each.get_text().encode("utf-8"))
But the output is garbled. How should I write this?
I figured out how to write it:
bsObj = BeautifulSoup(html, from_encoding="gbk")  # bs4 spells it from_encoding; fromEncoding is the older BeautifulSoup 3 name
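For anyone wondering why naming the encoding helps, here is a rough standard-library sketch, not the bs4 code above. It assumes the page is served as GBK bytes, which is what the fix suggests. The sample markup, the LinkTextParser class, and the link_texts helper are all made up for illustration. It decodes the same bytes once as GBK and once as UTF-8, then pulls out the link text the same way the findAll call does.

```python
import re
from html.parser import HTMLParser


class LinkTextParser(HTMLParser):
    """Hypothetical helper: collect the text of <a> tags whose href matches a pattern."""

    def __init__(self, href_pattern):
        super().__init__()
        self.href_pattern = re.compile(href_pattern)
        self.texts = []
        self._buffer = None  # None while we are outside a matching <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if self.href_pattern.search(href):
                self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._buffer is not None:
            self.texts.append("".join(self._buffer).strip())
            self._buffer = None


def link_texts(raw_bytes, encoding, href_pattern):
    """Decode raw_bytes with the given encoding, then extract matching link texts."""
    parser = LinkTextParser(href_pattern)
    parser.feed(raw_bytes.decode(encoding, errors="replace"))
    parser.close()
    return parser.texts


# Made-up sample page, stored as GBK bytes (an assumption about the real site).
sample = '<a href="/html/xiaoyuandongtai/1.html">校园动态</a>'.encode("gbk")
pattern = r"/html/xiaoyuandongtai/.+"

good = link_texts(sample, "gbk", pattern)    # decoded with the matching codec
bad = link_texts(sample, "utf-8", pattern)   # decoded with the wrong codec

print(good)
print(bad)
```

The first list should contain the readable link text, while the second shows the kind of garbled characters from the question. Passing from_encoding="gbk" to BeautifulSoup addresses the same mismatch by telling the parser which codec to use instead of letting it guess.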