
Problem when crawling a gb2312-encoded website?

I want to get the text of the a tags, and I wrote this:

# -*- coding: utf-8 -*-

import time
from urllib.request import urlopen
from bs4 import BeautifulSoup
import re
import chardet

html = urlopen("http://www.gdmzzx.com/html/xiaoyuandongtai/")
bsObj = BeautifulSoup(html)
list_all = bsObj.findAll("a",href = re.compile("/html/xiaoyuandongtai/.+"))
for each in list_all:
    print(isinstance(each.get_text(),str))
    print(each.get_text())
    #print(chardet.detect(each))
    #print(each.get_text().encode("utf-8"))

But the output is garbled text. How should I write this?

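I figured it out. Tell BeautifulSoup the page encoding explicitly (in bs4 the keyword is from_encoding; fromEncoding is the older spelling):
bsObj = BeautifulSoup(html, from_encoding="gbk")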
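
The fix works because the page bytes are GB2312, and GBK is a superset of it, so naming the encoding stops the parser from guessing wrong. Another way to see the same thing is to decode the bytes yourself before parsing. The following is only a rough sketch using the standard library: the sample bytes stand in for what urlopen(...).read() would return, and decode_page is a hypothetical helper, not something from the original post.

```python
import re

# Hypothetical sample standing in for the raw bytes of the real page.
raw = (
    '<a href="/html/xiaoyuandongtai/1.html">校园动态</a>'
    '<a href="/about.html">关于</a>'
).encode("gb2312")


def decode_page(raw_bytes, encoding="gbk"):
    """Decode page bytes with an explicit codec (hypothetical helper).

    errors="replace" keeps going if a stray byte is not valid GBK.
    """
    return raw_bytes.decode(encoding, errors="replace")


# Decoding with the wrong codec is what produces the garbled output.
garbled = raw.decode("latin-1")

# Decoding as GBK recovers the original characters.
text = decode_page(raw)

# Same filter as the question: only links under /html/xiaoyuandongtai/.
links = re.findall(
    r'<a[^>]*href="/html/xiaoyuandongtai/[^"]+"[^>]*>(.*?)</a>', text
)
print(links)
```

A string decoded this way can also be handed straight to BeautifulSoup, for example BeautifulSoup(text, "html.parser"); since it is already a str, no from_encoding argument is needed in that case.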


慕尼黑的夜晚无繁华