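*Hi everyone, I'm new to BeautifulSoup and don't fully understand how to extract data. I want to extract the top ten titles from Amazon's Best Sellers list and store them in an array.
My goal is to build an Amazon top-10 list and repeat the process for different categories. I only want to extract each product's "title".
Here is my code:*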
from bs4 import BeautifulSoup
import requests
headers = {'User-Agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9'}
url_amazon = "https://www.amazon.co.uk/Best-Sellers-Electronics/zgbs/electronics"
response = requests.get(url_amazon, headers = headers)
soup = BeautifulSoup(response.content, "lxml")
print(soup.prettify())
title = soup.find("h1", class_ = "a-size-large a-spacing-medium zg-margin-left-15 a-text-bold").text
print(title)
titles = []
for item in soup.findAll("div", attrs = {"class" : "a-fixed-left-grid-col a-col-right"}):
    name = item.find("div", attrs = {"class" : "p13n-sc-truncated"})
    if name is not None:
        titles.append(name.text)
    else:
        titles.append("unknown title")
print(len(titles))
for i in titles:
    print(i)
The output is: "unknown title"
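
One way to narrow this down is to run the same extraction logic against a small, hand-written HTML fragment, so the loop can be checked separately from whatever Amazon actually returns. The sketch below is only an illustration: `SAMPLE_HTML` and the helper `extract_titles` are hypothetical names, the markup is an assumption rather than Amazon's real page structure, and it uses CSS selectors (`select` / `select_one`) with the built-in `html.parser` instead of `findAll` and `lxml`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real Amazon page may differ.
SAMPLE_HTML = """
<div class="a-fixed-left-grid-col a-col-right">
  <div class="p13n-sc-truncated"> First product </div>
</div>
<div class="a-fixed-left-grid-col a-col-right">
  <span>no title element here</span>
</div>
"""


def extract_titles(html, limit=10):
    """Return up to `limit` titles; use "unknown title" where none is found."""
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    # The CSS selector matches divs carrying both classes, in any order.
    for item in soup.select("div.a-fixed-left-grid-col.a-col-right")[:limit]:
        name = item.select_one("div.p13n-sc-truncated")
        if name is not None:
            titles.append(name.get_text(strip=True))
        else:
            titles.append("unknown title")
    return titles


print(extract_titles(SAMPLE_HTML))
```

Passing `response.text` from the real request to the same function shows which step fails: an empty list means the outer container selector matched nothing, while a list of "unknown title" entries means the containers were found but the inner title selector was not.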