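After saving and running my script (shown below), I tried to view the results in the terminal, but nothing showed up.
The code is very simple, yet I can't seem to find a fix.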
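import scrapy


class TickersSpider(scrapy.Spider):
    name = 'tickers'
    # Bare domain only: no "www." prefix, scheme, or trailing slash.
    allowed_domains = ['seekingalpha.com']
    start_urls = ['https://seekingalpha.com/market-news/on-the-move']

    def parse(self, response):
        # Text of every article title link on the page.
        articles_all = response.xpath('//div[@class="title"]/a/text()').getall()
        # Titles containing "remarket gainers" (covers "Premarket" and "premarket").
        # The selector method is xpath(); the response object has no path().
        articles_gainers = response.xpath('//div[@class="title"]/a[contains(text(), "remarket gainers")]/text()').getall()
        yield {
            'articles': articles_all,
            'articles_gainers': articles_gainers
        }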
I also double-checked that I am running it from the correct directory. This is what the terminal shows when I run scrapy crawl tickers:
2020-07-25 16:53:35 [scrapy.utils.log] INFO: Scrapy 2.2.0 started (bot: seekingalpha)
2020-07-25 16:53:35 [scrapy.utils.log] INFO: Versions: lxml 4.5.2.0, libxml2 2.9.10, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 20.3.0, Python 3.7.7 (default, May 6 2020, 11:45:54) [MSC v.1916 64 bit (AMD64)], pyOpenSSL 19.1.0 (OpenSSL 1.1.1g 21 Apr 2020), cryptography 3.0, Platform Windows-10-10.0.18362-SP0
2020-07-25 16:53:35 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
2020-07-25 16:53:35 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'seekingalpha',
'NEWSPIDER_MODULE': 'seekingalpha.spiders',
'ROBOTSTXT_OBEY': True,
'SPIDER_MODULES': ['seekingalpha.spiders']}
2020-07-25 16:53:35 [scrapy.extensions.telnet] INFO: Telnet Password: 2cb47f969c26a413
2020-07-25 16:53:35 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.logstats.LogStats']
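One detail worth checking in a spider like this is the format of the allowed_domains entries. Scrapy's offsite filtering compares request hostnames against that list, so entries are expected to be bare domain names. An entry such as 'www.seekingalpha.com/' (with a "www." prefix and a trailing slash) can never equal the hostname seekingalpha.com. The sketch below is a simplified, standard-library approximation of that kind of suffix matching, not Scrapy's own code; host_allowed is a hypothetical helper name used only for illustration. As far as I know, the start URLs themselves are normally exempt from this filter, so a bad entry would mostly affect links followed later rather than the first request.

```python
import re
from urllib.parse import urlparse


def host_allowed(url, allowed_domains):
    """Return True if the URL's host equals, or is a subdomain of, an allowed domain.

    Hypothetical helper: a simplified approximation of suffix-based
    domain matching, written for illustration only.
    """
    host = urlparse(url).hostname or ""
    pattern = r"^(.*\.)?(%s)$" % "|".join(re.escape(d) for d in allowed_domains)
    return re.search(pattern, host) is not None


start_url = "https://seekingalpha.com/market-news/on-the-move"

# Entry with "www." and a trailing slash: cannot match a bare hostname.
print(host_allowed(start_url, ["www.seekingalpha.com/"]))
# Bare domain: matches the host itself and any subdomain of it.
print(host_allowed(start_url, ["seekingalpha.com"]))
print(host_allowed("https://www.seekingalpha.com/x", ["seekingalpha.com"]))
```

With the bare domain, both seekingalpha.com and www.seekingalpha.com pass the check, which is usually what is wanted for a single-site spider.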
慕码人8056858