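What I'm trying to do:
I'm trying to write a script that scrapes a website for product information.
Currently, the program uses a for loop to get each product's price and unique ID.
The for loop contains two if statements to stop it from scraping NoneType values.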
import requests
from bs4 import BeautifulSoup
def average(price_list):
    return sum(price_list) / len(price_list)
# Requests search data from Website
page_link = 'URL'
page_response = requests.get(page_link, timeout=5) # gets the webpage (search) from Website
page_content = BeautifulSoup(page_response.content, 'html.parser') # turns the webpage it just retrieved into a BeautifulSoup-object
# Selects the product listings from page content so we can work with these
product_listings = page_content.find_all("div", {"class": "unit flex align-items-stretch result-item"})
prices = [] # Creates a list to add the prices to
uids = [] # Creates a list to store the unique ids
for product in product_listings:
    ## UIDS
    if product.find('a')['id'] is not None:
        uid = product.find('a')['id']
        uids.append(uid)
    # PRICES
    if product.find('p', class_ = 'result-price man milk word-break') is not None: # assures that the loop only finds the prices
        price = int(product.p.text[:-2].replace(u'\xa0', '')) # makes a temporary variable where the last two chars of the string (,-) and whitespace are removed, turns into int
        prices.append(price) # adds the price to the list
The problem:
On the line if product.find('a')['id'] is not None: I get Exception has occurred: TypeError
'NoneType' object is not subscriptable.
However, if I run print(product.find('a')['id']), I get the value I'm looking for, which confuses me. Doesn't that mean the value causing the error is not a NoneType?
Also, if product.find('p', class_ = 'result-price man milk word-break') is not None: works flawlessly.
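Here is a minimal sketch that reproduces the same error. It assumes (this is a guess, not something I have confirmed on the real page) that at least one result item contains no <a> tag; the HTML and the "result-item" class below are made up for illustration. In that case find('a') returns None for that one item, and the ['id'] lookup fails before the is not None comparison is ever evaluated, even though the same expression works on the other items:

```python
from bs4 import BeautifulSoup

# Hypothetical HTML: the second listing has no <a> tag at all.
html = """
<div class="result-item"><a id="uid-1">First product</a></div>
<div class="result-item"><p>Listing without a link</p></div>
"""

soup = BeautifulSoup(html, "html.parser")
listings = soup.find_all("div", {"class": "result-item"})

results = []
for product in listings:
    try:
        # Same expression as in the failing if statement
        results.append(product.find("a")["id"])
    except TypeError as exc:
        # find('a') returned None, and None cannot be subscripted
        results.append(f"TypeError: {exc}")

print(results)
```

If this is what happens on the real page, it would explain why print shows a valid id for some listings while the loop as a whole still crashes.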
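What I've tried:
I tried assigning product.find('p', class_ = 'result-price man milk word-break') to a variable and then using that variable in the for loop, but that didn't work. I've also done my fair share of googling, to no avail. The problem may be that I'm relatively new to programming and don't know what to search for. I did find a lot of answers that seem to deal with related issues, but none of them worked in my code.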
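For reference, this is roughly the variable-based check applied to the <a> lookup instead of the price lookup. It is only a sketch under the same assumption as before (some listings have no <a> tag); extract_uids and the sample HTML are hypothetical names and data, not part of my script. The result of find('a') is tested for None first, and Tag.get('id') is used so that a link without an id attribute yields None rather than raising:

```python
from bs4 import BeautifulSoup


def extract_uids(listings):
    """Collect the id of the first <a> in each listing, skipping listings without one."""
    uids = []
    for product in listings:
        link = product.find("a")  # None when the listing has no <a> tag
        if link is None:
            continue
        uid = link.get("id")  # None when the <a> tag has no id attribute
        if uid is not None:
            uids.append(uid)
    return uids


# Hypothetical HTML covering the three cases: id present, no <a>, <a> without id.
sample_html = """
<div class="result-item"><a id="uid-1">First product</a></div>
<div class="result-item"><p>Listing without a link</p></div>
<div class="result-item"><a href="#">Link without an id</a></div>
<div class="result-item"><a id="uid-4">Fourth product</a></div>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")
sample_listings = sample_soup.find_all("div", {"class": "result-item"})
print(extract_uids(sample_listings))
```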
Any help would be greatly appreciated!
米脂