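A website that requires authentication provides a search service. Each search consists of two steps.
First, a request retrieves basic information (stock, dimensions, etc.) from a product serial number.
Second, given the previous search and a few additional fields, a second request returns the product's price. The problem is that the steps must be called in strict order. For example, given two products A and B, the following sequence produces an error: basic_info(A), basic_info(B), get_price(A) => error, because the server expects get_price(B). Since authentication is required, I cannot discard the cookies. In the scenario below, is there a way to guarantee the order in which the requests are made?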
# Methods of the spider class
def after_auth_success(self, response):
    # Queue one basic-info request per product
    for product in prod_list:
        yield FormRequest("basic_info_url", ..., callback=self.on_basic_info)

def on_basic_info(self, response):
    # Follow up with the price request for the same product
    yield FormRequest("get_price_url", ..., callback=self.on_price_info)

def on_price_info(self, response):
    # Scrape result...
    # <the price is scraped correctly only if the requests are made in order>
    yield result
Expected result:
Only one thread runs the sequence:
basic_info_url | get_price_url | basic_info_url | get_price_url ...
Actual result:
With CONCURRENT_REQUESTS=1, all basic_info_url requests are invoked first, and all get_price_url requests are invoked only afterwards.
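
One possible approach, sketched here without Scrapy itself, is to chain the requests: schedule only the first product's basic_info request, and let the price callback schedule the basic_info request for the next product. The snippet below is a library-free simulation of that idea. The `Request` class, the `run` scheduler (assumed to be a single-worker FIFO queue, which reproduces the ordering reported above) and all `naive_*` / `chained_*` names are hypothetical stand-ins, not Scrapy APIs. In a real spider, the list of remaining products would presumably travel along with each request, for example via the request's `meta` dict.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List


@dataclass
class Request:
    """Hypothetical stand-in for a request: a label plus a callback."""
    url: str
    callback: Callable[..., Iterable["Request"]]
    kwargs: dict = field(default_factory=dict)


def run(start: Iterable[Request]) -> List[str]:
    """Process one request at a time from a FIFO queue; return the call order."""
    queue = deque(start)
    log = []
    while queue:
        req = queue.popleft()
        log.append(req.url)
        # Whatever the callback yields is appended to the end of the queue.
        queue.extend(req.callback(**req.kwargs))
    return log


# Naive version: every basic_info request is queued up front.
def naive_start(products: List[str]) -> Iterator[Request]:
    for p in products:
        yield Request(f"basic_info({p})", naive_on_basic_info, {"product": p})


def naive_on_basic_info(product: str) -> Iterator[Request]:
    yield Request(f"get_price({product})", on_done)


def on_done() -> Iterator[Request]:
    return iter(())  # nothing further to schedule


# Chained version: only one request is in flight; each callback schedules the next.
def chained_start(products: List[str]) -> Iterator[Request]:
    if products:
        yield Request(f"basic_info({products[0]})", chained_on_basic_info,
                      {"remaining": list(products)})


def chained_on_basic_info(remaining: List[str]) -> Iterator[Request]:
    yield Request(f"get_price({remaining[0]})", chained_on_price_info,
                  {"remaining": remaining})


def chained_on_price_info(remaining: List[str]) -> Iterator[Request]:
    rest = remaining[1:]
    if rest:  # move on to the next product only after the price was fetched
        yield Request(f"basic_info({rest[0]})", chained_on_basic_info,
                      {"remaining": rest})


if __name__ == "__main__":
    print(run(naive_start(["A", "B"])))
    print(run(chained_start(["A", "B"])))
```

Because the chained version never has more than one pending request, the order no longer depends on how the scheduler prioritises its queue. The trade-off is that a failed request stops the whole chain unless the error path also schedules the next product.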