
Scrapy does not show the yielded results in the terminal

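After saving and running my script, shown below, I tried to view the results in the terminal, without success.

The code is very simple, but I can't seem to find a fix.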


import scrapy


class TickersSpider(scrapy.Spider):
    name = 'tickers'
    allowed_domains = ['www.seekingalpha.com/']
    start_urls = ['https://seekingalpha.com/market-news/on-the-move']

    def parse(self, response):
        articles_all = response.xpath('//div[@class="title"]/a/text()').getall()
        articles_gainers = response.path('//div[@class="title"]/a[contains(text(), "remarket gainers")]/text()').getall()

        yield {
            'articles': articles_all,
            'articles_gainers': articles_gainers
        }

        

I also double-checked that I am running it from the correct directory. This is what the terminal shows when I run scrapy crawl tickers:

2020-07-25 16:53:35 [scrapy.utils.log] INFO: Scrapy 2.2.0 started (bot: seekingalpha)
2020-07-25 16:53:35 [scrapy.utils.log] INFO: Versions: lxml 4.5.2.0, libxml2 2.9.10, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 20.3.0, Python 3.7.7 (default, May  6 2020, 11:45:54) [MSC v.1916 64 bit (AMD64)], pyOpenSSL 19.1.0 (OpenSSL 1.1.1g  21 Apr 2020), cryptography 3.0, Platform Windows-10-10.0.18362-SP0
2020-07-25 16:53:35 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
2020-07-25 16:53:35 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'seekingalpha',
 'NEWSPIDER_MODULE': 'seekingalpha.spiders',
 'ROBOTSTXT_OBEY': True,
 'SPIDER_MODULES': ['seekingalpha.spiders']}
2020-07-25 16:53:35 [scrapy.extensions.telnet] INFO: Telnet Password: 2cb47f969c26a413
2020-07-25 16:53:35 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.logstats.LogStats']

RISEBY
1 Answer

慕码人8056858

The problem is a typo in your code:

    articles_gainers = response.path('//div[@class="title"]/a[contains(text(), "remarket gainers")]/text()').getall()

It should be response.xpath() rather than response.path(). That is what the exception message is telling you:

    AttributeError: 'HtmlResponse' object has no attribute 'path'