2018-09-25 17:24:37 [scrapy.utils.log] INFO: Scrapy 1.5.1 started (bot: douban)
2018-09-25 17:24:37 [scrapy.utils.log] INFO: Versions: lxml 4.2.5.0, libxml2 2.9.5, cssselect 1.0.3, parsel 1.5.0, w3lib 1.19.0, Twisted 18.7.0, Python 3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 08:06:12) [MSC v.1900 64 bit (AMD64)], pyOpenSSL 18.0.0 (OpenSSL 1.1.0i 14 Aug 2018), cryptography 2.3.1, Platform Windows-10-10.0.17134-SP0
2018-09-25 17:24:37 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'douban', 'NEWSPIDER_MODULE': 'douban.spiders', 'ROBOTSTXT_OBEY': True, 'SPIDER_MODULES': ['douban.spiders'], 'USER_AGENT': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36'}
After running it, the user_agent printed in the log is the one shown above, which matches what is set in settings.py:
# Crawl responsibly by identifying yourself (and your website) on the user-agent
Shouldn't it pick a USER_AGENT at random from user_agent_list?
That was a slip on my part in the tutorial: the header name is user-agent, with a hyphen, not an underscore.
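For anyone hitting the same problem, here is a minimal sketch of what the corrected middleware might look like. The class name RandomUserAgentMiddleware, the contents of user_agent_list, and the FakeRequest stand-in are assumptions for illustration; the thread only mentions that a user_agent_list exists. The one point taken from the thread is that the header key is written with a hyphen.

```python
import random

# Hypothetical pool: the thread names `user_agent_list` but not its contents.
user_agent_list = [
    "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0",
]


class RandomUserAgentMiddleware:
    """Downloader middleware sketch: pick a random User-Agent per request."""

    def process_request(self, request, spider):
        # The key must be "User-Agent" (hyphen). "User_Agent" would be sent
        # as a separate, unrelated header and the real one stays unchanged.
        request.headers["User-Agent"] = random.choice(user_agent_list)
        return None  # returning None lets Scrapy keep processing the request


# Stand-in for a Scrapy Request so the sketch runs without Scrapy installed.
class FakeRequest:
    def __init__(self):
        self.headers = {}


req = FakeRequest()
RandomUserAgentMiddleware().process_request(req, spider=None)
print(req.headers["User-Agent"])

# In a real project the class would also need to be enabled in settings.py,
# for example (module path and priority are assumptions):
# DOWNLOADER_MIDDLEWARES = {"douban.middlewares.RandomUserAgentMiddleware": 400}
```

Note that the "Overridden settings" line in the startup log only echoes the values from settings.py, so it will likely show the same USER_AGENT even when the middleware works. To check what was actually sent, it is probably more reliable to log response.request.headers inside the spider callback.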