Unable to get data from inside an HTML tag

I can't get the value of the "alt" attribute inside the HTML tags.


from bs4 import BeautifulSoup
import re

soup=BeautifulSoup("""<div class="couponTable">
    <div id="tgCou1" class="tgCoupon couponRow"><span class="spBtnMinus"></span><!-- react-text: 67 -->Wednesday Matches<!-- /react-text --></div>
    <div class="cflag"><img src="/ContentServer/jcbw/images/flag_JLC.gif?CV=L302R1g" alt="Japanese League Cup" title="Japanese League Cup" class="cfJLC"></div>
    <div class="cflag"><img src="/ContentServer/jcbw/images/flag_JLC.gif?CV=L302R1g" alt="Japanese League Cup" title="Japanese League Cup" class="cfJLC"></div>
    </div></div></div>""")

lines=soup.find_all('div')
for line in lines:
    print(re.findall(r'\w+',line['alt'])[0])


森林海
1 answer

catspeake

If you only need the alt value, select the img tags rather than the div tags. alt is an attribute of img, so line['alt'] on a div raises a KeyError. You also don't need a regular expression to extract the alt value.

from bs4 import BeautifulSoup

soup = BeautifulSoup("""<div class="couponTable">
    <div id="tgCou1" class="tgCoupon couponRow"><span class="spBtnMinus"></span><!-- react-text: 67 -->Wednesday Matches<!-- /react-text --></div>
    <div class="cflag"><img src="/ContentServer/jcbw/images/flag_JLC.gif?CV=L302R1g" alt="Japanese League Cup" title="Japanese League Cup" class="cfJLC"></div>
    <div class="cflag"><img src="/ContentServer/jcbw/images/flag_JLC.gif?CV=L302R1g" alt="Japanese League Cup" title="Japanese League Cup" class="cfJLC"></div>
    </div></div></div>""", 'html.parser')

# Select the <img> tags, which are the elements that carry the alt attribute
lines = soup.find_all('img')
for line in lines:
    print(line['alt'])

Output:

Japanese League Cup
Japanese League Cup
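
As a follow-up to the answer above, here is a minimal sketch of what happens when an attribute is missing and two ways to guard against it. The shortened HTML, the second img without an alt, and the variable names (div_has_alt, alts, maybe_alts) are made up for illustration and are not part of the original question.

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical markup: the second <img> deliberately has no alt.
html = """<div class="couponTable">
<div id="tgCou1" class="tgCoupon couponRow">Wednesday Matches</div>
<div class="cflag"><img src="flag_JLC.gif" alt="Japanese League Cup" class="cfJLC"></div>
<div class="cflag"><img src="flag_none.gif" class="cfNone"></div>
</div>"""

soup = BeautifulSoup(html, "html.parser")

# Indexing a tag with ['alt'] raises KeyError when the attribute is absent,
# which is what happens for the <div> tags in the question's loop.
first_div = soup.find("div")
try:
    first_div["alt"]
    div_has_alt = True
except KeyError:
    div_has_alt = False

# alt=True keeps only the <img> tags that actually have an alt attribute.
alts = [img["alt"] for img in soup.find_all("img", alt=True)]

# Tag.get returns None instead of raising when the attribute is missing.
maybe_alts = [img.get("alt") for img in soup.find_all("img")]

print(div_has_alt)  # False
print(alts)         # ['Japanese League Cup']
print(maybe_alts)   # ['Japanese League Cup', None]
```

Filtering with alt=True is probably the simpler choice when images without an alt should just be skipped. Tag.get is handy when you want to keep every image and decide later what to do with the missing values.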