I'm trying to suppress some of the errors that surface through my RetryMiddleware in Scrapy. The script runs into them when the maximum retry limit is exceeded. I'm using proxies within the middleware. The weird part is that the exception the script throws is already in the EXCEPTIONS_TO_RETRY list. It's perfectly fine for the script to sometimes exceed the maximum number of retries without success; I just don't want to see the error even when it happens, meaning I want to suppress or bypass it.
The error looks like this:
Traceback (most recent call last):
  File "middleware.py", line 43, in process_request
    defer.returnValue((yield download_func(request=request,spider=spider)))
twisted.internet.error.TCPTimedOutError: TCP connection timed out: 10060: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond..
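For context, a blunt way to merely hide such tracebacks (a sketch, under the assumption that they are emitted through Scrapy's scrapy.core.scraper logger, which prints download errors) would be to raise that logger's level; note this silences all download errors, not just retry-related ones, and does not change retry behavior:

    import logging

    # Assumption: the "Error downloading ..." tracebacks come from
    # Scrapy's scraper logger. Raising its level hides ALL download
    # errors, not only the retry-exhausted ones.
    logging.getLogger('scrapy.core.scraper').setLevel(logging.CRITICAL)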
Here is what the process_exception method of my RetryMiddleware looks like:
from twisted.internet import defer
from twisted.internet.error import (
    TimeoutError, DNSLookupError, ConnectionRefusedError, ConnectionDone,
    ConnectError, ConnectionLost, TCPTimedOutError,
)
from twisted.web.client import ResponseFailed
from scrapy.core.downloader.handlers.http11 import TunnelError


class RetryMiddleware(object):
    cus_retry = 3
    EXCEPTIONS_TO_RETRY = (defer.TimeoutError, TimeoutError, DNSLookupError,
                           ConnectionRefusedError, ConnectionDone, ConnectError,
                           ConnectionLost, TCPTimedOutError, TunnelError,
                           ResponseFailed)

    def process_exception(self, request, exception, spider):
        if isinstance(exception, self.EXCEPTIONS_TO_RETRY) \
                and not request.meta.get('dont_retry', False):
            return self._retry(request, exception, spider)

    def _retry(self, request, reason, spider):
        retries = request.meta.get('cus_retry', 0) + 1
        if retries <= self.cus_retry:
            r = request.copy()
            r.meta['cus_retry'] = retries
            r.meta['proxy'] = f'https://{ip}:{port}'  # ip/port: placeholders for a proxy from the pool
            r.dont_filter = True
            return r
        else:
            print("done retrying")
How can I get rid of the errors listed in EXCEPTIONS_TO_RETRY?
PS: The script runs into this error when the maximum retry limit is reached, no matter which site I pick.
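For reference, note why the traceback still appears: once cus_retry is exhausted, _retry falls through after printing "done retrying" and implicitly returns None, so process_exception also returns None and Scrapy keeps processing, and logging, the original exception. A minimal sketch of one way to swallow that final failure, relying on Scrapy's documented behavior of not logging failures that carry IgnoreRequest, is to raise scrapy.exceptions.IgnoreRequest instead of falling through:

    from scrapy.exceptions import IgnoreRequest

    def _retry(self, request, reason, spider):
        retries = request.meta.get('cus_retry', 0) + 1
        if retries <= self.cus_retry:
            r = request.copy()
            r.meta['cus_retry'] = retries
            r.dont_filter = True
            return r
        # Retries exhausted: drop the request quietly. Scrapy treats
        # IgnoreRequest as a deliberate drop and prints no traceback,
        # so the TCPTimedOutError never surfaces in the logs.
        raise IgnoreRequest(f"gave up on {request.url} after {self.cus_retry} retries")

This keeps the retry behavior unchanged and only replaces the silent fall-through with an explicit, unlogged drop.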