Scrapy request gets some of the responses, but not all

It looks like part of the HTML is loaded dynamically, so Scrapy never sees it. The data itself is still present in the HTML, inside a JSON structure assigned to window.__PRELOADED_STATE__ in a script tag. You can try to get it like this:

    import json

    # get the script with the data
    json_data = response.xpath('//script[contains(text(), "__PRELOADED_STATE__")]/text()').extract_first()

    # load the data in a python dictionary
    dict_data = json.loads(json_data.split('window.__PRELOADED_STATE__ =')[-1])
    items = dict_data['itemList']
    print(len(items))  # prints 36 in my case

    # go through the dictionary and get the product_urls
    for item in items:
        product_url = item['product']['pdURL']
        ...
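If the script text has anything after the JSON object (a trailing semicolon, for example), a plain json.loads on the split result can fail. Below is a minimal standalone sketch of a more tolerant variant, using only the standard library so it can be tried outside a spider. The helper names extract_preloaded_state and product_urls, and the sample HTML, are made up for illustration; only the marker and the itemList / product / pdURL keys come from the answer above.

```python
import json

MARKER = "window.__PRELOADED_STATE__ ="


def extract_preloaded_state(html: str) -> dict:
    """Return the JSON object assigned to window.__PRELOADED_STATE__ (hypothetical helper)."""
    start = html.find(MARKER)
    if start == -1:
        raise ValueError("__PRELOADED_STATE__ marker not found in HTML")
    # raw_decode does not skip leading whitespace, so strip it first.
    payload = html[start + len(MARKER):].lstrip()
    # raw_decode parses one JSON value and ignores whatever follows it,
    # such as a trailing ';' or the closing </script> tag.
    data, _end = json.JSONDecoder().raw_decode(payload)
    return data


def product_urls(state: dict) -> list:
    """Collect pdURL values from the itemList entries (hypothetical helper)."""
    return [item["product"]["pdURL"] for item in state.get("itemList", [])]


# Made-up sample page, only to show the expected shape of the data.
sample_html = """<html><body><script>
window.__PRELOADED_STATE__ = {"itemList": [{"product": {"pdURL": "/p/1"}}, {"product": {"pdURL": "/p/2"}}]};
</script></body></html>"""

if __name__ == "__main__":
    state = extract_preloaded_state(sample_html)
    print(product_urls(state))
```

Inside a spider, the same helper could presumably be called with response.text instead of sample_html.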

