Lxp1245121592
2019-08-20 21:01
lxp:douban lixiaopeng$ scrapy crawl douban_spider
2019-08-20 20:57:19 [scrapy.utils.log] INFO: Scrapy 1.7.3 started (bot: douban)
2019-08-20 20:57:19 [scrapy.utils.log] INFO: Versions: lxml 4.4.1.0, libxml2 2.9.9, cssselect 1.1.0, parsel 1.5.2, w3lib 1.21.0, Twisted 19.7.0, Python 3.7.0b1 (v3.7.0b1:9561d7f501, Jan 30 2018, 16:11:47) - [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)], pyOpenSSL 19.0.0 (OpenSSL 1.1.1c 28 May 2019), cryptography 2.7, Platform Darwin-18.7.0-x86_64-i386-64bit
2019-08-20 20:57:19 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'douban', 'DOWNLOAD_DELAY': 0.5, 'NEWSPIDER_MODULE': 'douban.spiders', 'SPIDER_MODULES': ['douban.spiders'], 'USER_AGENT': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36'}
2019-08-20 20:57:20 [scrapy.extensions.telnet] INFO: Telnet Password: 24a1717313d85e58
2019-08-20 20:57:20 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.memusage.MemoryUsage',
'scrapy.extensions.logstats.LogStats']
Unhandled error in Deferred:
2019-08-20 20:57:20 [twisted] CRITICAL: Unhandled error in Deferred:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/crawler.py", line 184, in crawl
return self._crawl(crawler, *args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/crawler.py", line 188, in _crawl
d = crawler.crawl(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/twisted/internet/defer.py", line 1613, in unwindGenerator
return _cancellableInlineCallbacks(gen)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/twisted/internet/defer.py", line 1529, in _cancellableInlineCallbacks
_inlineCallbacks(None, g, status)
--- <exception caught here> ---
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
result = g.send(result)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/crawler.py", line 86, in crawl
self.engine = self._create_engine()
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/crawler.py", line 111, in _create_engine
return ExecutionEngine(self, lambda _: self.stop())
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/core/engine.py", line 69, in __init__
self.downloader = downloader_cls(crawler)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/core/downloader/__init__.py", line 86, in __init__
self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/middleware.py", line 53, in from_crawler
return cls.from_settings(crawler.settings, crawler)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/middleware.py", line 34, in from_settings
mwcls = load_object(clspath)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/utils/misc.py", line 46, in load_object
mod = import_module(module)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 994, in _gcd_import
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 723, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/downloadermiddlewares/httpproxy.py", line 5, in <module>
from urllib2 import _parse_proxy
builtins.SyntaxError: invalid syntax (urllib2.py, line 220)
2019-08-20 20:57:20 [twisted] CRITICAL:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
result = g.send(result)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/crawler.py", line 86, in crawl
self.engine = self._create_engine()
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/crawler.py", line 111, in _create_engine
return ExecutionEngine(self, lambda _: self.stop())
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/core/engine.py", line 69, in __init__
self.downloader = downloader_cls(crawler)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/core/downloader/__init__.py", line 86, in __init__
self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/middleware.py", line 53, in from_crawler
return cls.from_settings(crawler.settings, crawler)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/middleware.py", line 34, in from_settings
mwcls = load_object(clspath)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/utils/misc.py", line 46, in load_object
mod = import_module(module)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 994, in _gcd_import
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 723, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/downloadermiddlewares/httpproxy.py", line 5, in <module>
from urllib2 import _parse_proxy
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/urllib2.py", line 220
raise AttributeError, attr
^
SyntaxError: invalid syntax
python3.3以后不支持urllib2,按住ctrl键点击`File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/downloadermiddlewares/httpproxy.py", line 5, in <module>`行进入httpproxy.py,删除第5行和urllib2有关的语句保存就好了
Python最火爬虫框架Scrapy入门与实践
67418 学习 · 223 问题
相似问题