I'm writing a script that goes through a list of links and parses information from each page.
It works for most sites, but on some of them it chokes with: "UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 13: ordinal not in range(128)"
It dies inside client.py, in Python 3's urllib stack.
The exact link is: http://finance.yahoo.com/news/cafés-growing-faster-than-fast-food-peers-144512056.html
There are plenty of similar posts here, but none of the answers seem to work for me.
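For reference, the '\xe9' in the message looks like it is the é in "cafés"; as a quick isolated check (not part of my script), encoding that string with the ASCII codec reproduces the same kind of error:

    # isolated check, assuming the é is what the ASCII codec rejects
    'cafés'.encode('ascii')
    # UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 3: ordinal not in range(128)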
My code is:
from urllib import request
from urllib.error import HTTPError, URLError  # needed for the except clauses below
import socket                                 # needed for socket.timeout

def __request(link, debug=0):
    try:
        html = request.urlopen(link, timeout=35).read()  # made this long as I was getting lots of timeouts
        unicode_html = html.decode('utf-8', 'ignore')
    # NOTE the except HTTPError must come first, otherwise except URLError will also catch an HTTPError.
    except HTTPError as e:
        if debug:
            print('The server couldn\'t fulfill the request for ' + link)
            print('Error code: ', e.code)
        return ''
    except URLError as e:
        if isinstance(e.reason, socket.timeout):
            print('timeout')
        return ''
    else:
        return unicode_html
This is how the request function gets called:

    link = 'http://finance.yahoo.com/news/cafés-growing-faster-than-fast-food-peers-144512056.html'
    page = __request(link)
The traceback is:
Traceback (most recent call last):
File "<string>", line 250, in run_nodebug
File "C:\reader\get_news.py", line 276, in <module>
main()
File "C:\reader\get_news.py", line 255, in main
body = get_article_body(item['link'],debug=0)
File "C:\reader\get_news.py", line 155, in get_article_body
page = __request('na',url)
File "C:\reader\get_news.py", line 50, in __request
html = request.urlopen(link, timeout=35).read()
File "C:\Python33\Lib\urllib\request.py", line 156, in urlopen
return opener.open(url, data, timeout)
File "C:\Python33\Lib\urllib\request.py", line 469, in open
response = self._open(req, data)
File "C:\Python33\Lib\urllib\request.py", line 487, in _open
Any help is appreciated, it's driving me crazy; I think I've tried every combination of x.decode and the like.
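A rough sketch of one idea (not something from my script, and I'm not sure it's the right fix): assuming the unencoded é in the URL path is what urlopen trips over, percent-encoding the path with urllib.parse.quote before making the request should keep the URL pure ASCII. The _ascii_safe helper below is only an illustration:

    from urllib.parse import urlsplit, urlunsplit, quote

    def _ascii_safe(url):
        # Hypothetical helper: percent-encode any non-ASCII characters in the
        # path/query (é becomes %C3%A9) so the request urllib builds stays ASCII.
        parts = urlsplit(url)
        return urlunsplit((parts.scheme, parts.netloc,
                           quote(parts.path, safe='/%'),
                           quote(parts.query, safe='=&%'),
                           parts.fragment))

    link = 'http://finance.yahoo.com/news/cafés-growing-faster-than-fast-food-peers-144512056.html'
    print(_ascii_safe(link))
    # http://finance.yahoo.com/news/caf%C3%A9s-growing-faster-than-fast-food-peers-144512056.html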