Jsoup: selecting the text that follows a tag when there are many tags

I want to use jsoup to extract the text that follows certain <strong> tags. Is there a way to select it?


The sample markup is:


<div class="content">

<div name="panel-summary" id="summary">

    <p>

    <strong>A: </strong>*thank you* **I want to retrieve this text**<br>

    <strong>B: </strong>*Bla..bla* *I don't want this text*<br>

    <strong>C: </strong>*what ever text* *I dont want this*                         

        <strong>D: </strong>*anythinh text* *I want this*<br>

        <strong>E: </strong>*Bla..bla* *I don't want this text*t<br>

        <strong>F: </strong>*anythinh text* *I want this*<br>

    </p>


    <p>I want this</p>

When it finishes, an ID is created automatically, for example id=123.


牧羊人nacy
1 answer

青春有我

If we can assume that the <strong> elements you are looking for always contain A:, D: or F:, we can select them with strong:matchesOwn(regex), where the regex is A:|D:|F:. Once the strong elements are handled, we can move on to the second <p> and read its content with text().

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

String html = "<div class=\"content\">\n" +
        "<div name=\"panel-summary\" id=\"summary\">\n" +
        "    <p>\n" +
        "    <strong>A: </strong>*thank you* **I want to retrieve this text**<br>\n" +
        "    <strong>B: </strong>*Bla..bla* *I don't want this text*<br>\n" +
        "    <strong>C: </strong>*what ever text* *I dont want this*                         \n" +
        "        <strong>D: </strong>*anythinh text* *I want this*<br>\n" +
        "        <strong>E: </strong>*Bla..bla* *I don't want this text*t<br>\n" +
        "        <strong>F: </strong>*anythinh text* *I want this*<br>\n" +
        "    </p>\n" +
        "\n" +
        "    <p>I want this</p>";

Document doc = Jsoup.parse(html);

// Both <p> elements inside the element with id="summary"
Elements pElements = doc.select("#summary p");

// <strong> elements in the first <p> whose own text contains A:, D: or F:
Elements strongElements = pElements.first().select("strong:matchesOwn(A:|D:|F:)");

for (Element strong : strongElements) {
    // nextSibling() returns the next node, including text nodes
    System.out.println(strong.nextSibling());
}

System.out.println("---");

// Text content of <p>I want this</p>
System.out.println(pElements.get(1).text());

Output:

*thank you* **I want to retrieve this text**
*anythinh text* *I want this*
*anythinh text* *I want this*
---
I want this

If you would rather not depend on the content of the <strong> elements but only on their position, select all of them, for example

Elements allStrElemens = doc.select("#summary p strong");

and pick the ones you need by index (remember that indexes start at 0), for example

System.out.println(allStrElemens.get(0).nextSibling());
System.out.println(allStrElemens.get(3).nextSibling());
System.out.println(allStrElemens.get(5).nextSibling());
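One detail of the A:|D:|F: regex used with :matchesOwn may be worth checking. As far as I know, jsoup applies the pattern with "find" semantics, so it matches anywhere in the element's own text; treat that as an assumption and verify it against your jsoup version. The sketch below uses only java.util.regex (no jsoup) to show the difference between the unanchored pattern and an anchored variant. The class name LabelRegexDemo, the helper found, and the extra label "AA: " are hypothetical and exist only for illustration.

```java
import java.util.regex.Pattern;

public class LabelRegexDemo {
    // Unanchored pattern from the answer: matches anywhere in the label text.
    static final Pattern LOOSE = Pattern.compile("A:|D:|F:");

    // Anchored variant (an assumption about what is wanted):
    // the label must start with exactly one of A, D or F followed by a colon.
    static final Pattern STRICT = Pattern.compile("^[ADF]:");

    // "find"-style check: true if the pattern occurs somewhere in the text.
    static boolean found(Pattern pattern, String ownText) {
        return pattern.matcher(ownText).find();
    }

    public static void main(String[] args) {
        // "AA: " is a made-up label that is not in the original markup.
        String[] labels = {"A: ", "B: ", "D: ", "AA: ", "F: "};
        for (String label : labels) {
            System.out.println("'" + label + "' loose=" + found(LOOSE, label)
                    + " strict=" + found(STRICT, label));
        }
    }
}
```

With the labels from the question (A: to F:) both patterns select the same elements. The anchored form only matters if other labels could contain one of the wanted ones as a substring, in which case a selector such as strong:matchesOwn(^[ADF]:) might be the safer choice.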