Python BeautifulSoup: find an element after a specific string

I have the following HTML code:


<div class="xyOfqd">
<div class="aAAD">
   <div class="Bgbcca">Updated</div>
   <span class="hthtb">
      <div>
         <span class="hthtb">September 30, 2018</span>
      </div>
   </span>
</div>
<div class="aAAD">
   <div class="Bgbcca">Text1</div>
   <span class="hthtb">
      <div><span class="hthtb">Text2</span></div>
   </span>
</div>
<div class="aAAD">
   <div class="Bgbcca">MyText</div>
   <span class="hthtb">
      <div>
         <span class="hthtb">Text3</span>
      </div>
   </span>
</div>
<div class="aAAD">
   <div class="Bgbcca">Text4</div>
   <span class="hthtb">
      <div><span class="hthtb">Text5</span></div>
   </span>
</div>
<div class="aAAD">
   <div class="Bgbcca">Text6</div>
   <span class="hthtb">
      <div><span class="hthtb">Text7</span></div>
   </span>
</div>
<div class="aAAD">
   <div class="Bgbcca">Text8</div>
   <span class="hthtb">
      <div>
         <span class="hthtb">
            <div>Text9</div>
            <div><a href="https://google.com">Text10</a></div>
         </span>
      </div>
   </span>
</div>
<div class="aAAD">
   <div class="Bgbcca">Text11</div>
   <span class="hthtb">
      <div><span class="hthtb">Text12</span></div>
   </span>
</div>

How can I find the element containing Text3 that comes right after the div element with the text MyText?


素胚勾勒不出你
3 answers

POPMUISE

You can use an lxml.html solution:

from lxml import html

source = """<div class="xyOfqd">
<div class="aAAD">
   <div class="Bgbcca">Updated</div>
   ...
   <span class="hthtb">
      <div><span class="hthtb">Text12</span></div>
   </span>
</div>"""

# Parse the HTML string into an element tree.
tree = html.fromstring(source)
# Select the div whose text is exactly "MyText", step to its sibling span,
# then read the text of the span nested inside it.
print(tree.xpath('//div[.="MyText"]/following-sibling::span/div/span/text()'))
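
Since the question asks about BeautifulSoup specifically, here is a minimal sketch of the same idea with bs4. It assumes the label is always a div with class Bgbcca and the value always sits in the next sibling span, as in the posted HTML. The helper name text_after_label and the shortened SAMPLE_HTML are made up for illustration and are not part of the original post.

```python
from bs4 import BeautifulSoup

# Shortened copy of the question's markup (two of the blocks), for illustration only.
SAMPLE_HTML = """
<div class="xyOfqd">
<div class="aAAD">
   <div class="Bgbcca">Text1</div>
   <span class="hthtb">
      <div><span class="hthtb">Text2</span></div>
   </span>
</div>
<div class="aAAD">
   <div class="Bgbcca">MyText</div>
   <span class="hthtb">
      <div>
         <span class="hthtb">Text3</span>
      </div>
   </span>
</div>
</div>
"""


def text_after_label(markup, label):
    """Return the text of the span following the div whose text is `label`, or None."""
    soup = BeautifulSoup(markup, "html.parser")
    # Locate the label div by its class and its whitespace-stripped text.
    label_div = soup.find(
        lambda tag: tag.name == "div"
        and "Bgbcca" in tag.get("class", [])
        and tag.get_text(strip=True) == label
    )
    if label_div is None:
        return None
    # The value is held in the next sibling <span> of the label div.
    value_span = label_div.find_next_sibling("span", class_="hthtb")
    if value_span is None:
        return None
    return value_span.get_text(strip=True)


print(text_after_label(SAMPLE_HTML, "MyText"))
```

With the sample above this prints Text3. Note that get_text(strip=True) joins all nested strings without a separator, so for a block like the Text9/Text10 one you may prefer to iterate over value_span.stripped_strings instead.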