Filtering with bs4 in Python

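I'm trying to write a script that checks the Steam store, but I'm having trouble filtering out all the listings that have no discount. I only want to keep the listings whose discount div contains a span tag, such as <span>-percentage</span>, and drop the ones that don't. Here is my code: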


from urllib.request import urlopen
from datetime import date
import requests as rq
from bs4 import BeautifulSoup as bsoup

inp = str(input('what would you like to search up?'))
w = ('https://store.steampowered.com/search/?term=' + inp)
page = rq.get(w)
soup = bsoup(page.content, 'html.parser')
soup.prettify()
sales = soup.find_all('div', class_="responsive_search_name_combined")

for sale in sales:
    p = soup.find('div', class_="col search_price responsive_secondrow")
    d = soup.find_all('div', class_="col search_discount responsive_secondrow")
    n = soup.find('span', class_="title")

    if None in (d, n, p):
        continue
    print(d)

And the output (which contains both the things I want to filter out and the things I want to keep):


<span>-16%</span>
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
<span>-19%</span>
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">

And so on. I tried replacing d = soup.find_all('div', class_="col search_discount responsive_secondrow") with d = soup.find_all('span', string="-16%") to see whether that would work, but it didn't. I want to keep the span tags but not the div tags. Can anyone help?


翻过高山走不出你
1 answer

哈士奇WWW

您只需try-except在最后一个for循环中添加一个块即可解决您的问题。这是完整的代码:from urllib.request import urlopenfrom datetime import dateimport requests as rqfrom bs4 import BeautifulSoup as bsoupinp = str(input('what would you like to search up?'))w = ('https://store.steampowered.com/search/?term=' + inp)page = rq.get(w)soup = bsoup(page.content, 'html.parser')soup.prettify()sales = soup.find_all('div', class_="responsive_search_name_combined")final = []for sale in sales:&nbsp; &nbsp; p = soup.find('div', class_="col search_price responsive_secondrow")&nbsp; &nbsp; d = soup.find_all('div', class_="col search_discount responsive_secondrow")&nbsp; &nbsp; n = soup.find('span', class_="title")&nbsp; &nbsp; try:&nbsp; &nbsp; &nbsp; &nbsp; for element in d:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; span = element.span&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; if span:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; final.append(span.text)&nbsp; &nbsp; except:&nbsp; &nbsp; &nbsp; &nbsp; passprint(final)输出:what would you like to search up?>? among us['-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%']
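The code above calls soup.find_all on the whole page on every pass of the loop, which is why the same discounts repeat in the output. A possible variation is to search inside each result row instead, so every listing is checked once. The sketch below is only an illustration: SAMPLE_HTML and the game titles are made up, it reuses the class names from the question, it assumes the discount and title elements sit inside each responsive_search_name_combined div, and the real Steam markup may differ. The helper name discounted_titles is hypothetical.

```python
from bs4 import BeautifulSoup

# Made-up sample markup that reuses the class names from the question.
# The real Steam search page may be structured differently.
SAMPLE_HTML = """
<div class="responsive_search_name_combined">
  <span class="title">Game A</span>
  <div class="col search_discount responsive_secondrow">
    <span>-16%</span>
  </div>
  <div class="col search_price responsive_secondrow">$8.39</div>
</div>
<div class="responsive_search_name_combined">
  <span class="title">Game B</span>
  <div class="col search_discount responsive_secondrow">
  </div>
  <div class="col search_price responsive_secondrow">$19.99</div>
</div>
<div class="responsive_search_name_combined">
  <span class="title">Game C</span>
  <div class="col search_discount responsive_secondrow">
    <span>-19%</span>
  </div>
  <div class="col search_price responsive_secondrow">$4.04</div>
</div>
"""


def discounted_titles(html):
    """Return (title, discount) pairs for rows whose discount div holds a span."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for row in soup.find_all("div", class_="responsive_search_name_combined"):
        # Search within this row only, not the whole document.
        title = row.find("span", class_="title")
        # CSS selector: a span that is a direct child of the discount div.
        discount = row.select_one("div.search_discount > span")
        if title is None or discount is None:
            continue  # skip rows without a discount
        results.append((title.get_text(strip=True), discount.get_text(strip=True)))
    return results


print(discounted_titles(SAMPLE_HTML))
# → [('Game A', '-16%'), ('Game C', '-19%')]
```

To try this against the live page, pass page.content from the requests call in place of SAMPLE_HTML. Whether the selectors still match depends on the markup Steam currently serves.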
