How to parse text from a tag that contains "<?>"

My goal is to get this text: 27. The method according to claim 23 wherein...

How do I go about retrieving the text inside a tag that contains <? ... ?>. From a Google search, I believe these are called PHP short tags.


I am using lxml and XPath, and they just don't seem to register it as a tag or node. I tried itertext(), but it didn't work well.


 <claim id="CLM-00027" num="00027">

            <claim-text>                <?insert-start id="REI-00005" date="20191203" ?>27. The method according to claim 23 wherein the amorphous metal is selected from the group consisting of Zr based alloys, Ti based alloys, Al based alloys, Fe based alloys, La based alloys, Cu based alloys, Mg based alloys, Pt based alloys, and Pd based alloys.                <?insert-end id="REI-00005" ?></claim-text>

        </claim>


哔哔one
2 Answers

UYOU

Here is a snippet that uses XPath to reach the deepest "valid" tag, then goes from that tag's first child to the child's tail to reach the actual text.

    from lxml import etree

    xml = """ <claim id="CLM-00027" num="00027">
                <claim-text>                <?insert-start id="REI-00005" date="20191203" ?>27. The method according to claim 23 wherein the amorphous metal is selected from the group consisting of Zr based alloys, Ti based alloys, Al based alloys, Fe based alloys, La based alloys, Cu based alloys, Mg based alloys, Pt based alloys, and Pd based alloys.                <?insert-end id="REI-00005" ?></claim-text>
            </claim>"""

    root = etree.fromstring(xml)
    # Deepest real element; lxml keeps the <? ... ?> processing instructions as its children.
    e = root.xpath("/claim/claim-text")
    # The first child is the insert-start instruction; the text that follows it is stored in its tail.
    # Indexing stands in for the deprecated getchildren().
    res = e[0][0].tail
    print(res)

Output (followed by trailing whitespace):

    27. The method according to claim 23 wherein the amorphous metal is selected from the group consisting of Zr based alloys, Ti based alloys, Al based alloys, Fe based alloys, La based alloys, Cu based alloys, Mg based alloys, Pt based alloys, and Pd based alloys.
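As a possible variation on the approach above, the processing instruction can be selected by its target name instead of by position, or the whole text of the element can be read with XPath's string() function. The following is only a sketch: the XML is a shortened stand-in for the question's sample, and the variable names are illustrative.

```python
from lxml import etree

# Shortened stand-in for the sample in the question (assumption: same structure).
XML = """<claim id="CLM-00027" num="00027">
    <claim-text>    <?insert-start id="REI-00005" date="20191203" ?>27. The method according to claim 23 wherein...    <?insert-end id="REI-00005" ?></claim-text>
</claim>"""

root = etree.fromstring(XML)

# Select the processing instruction by target name rather than by child index.
pis = root.xpath("/claim/claim-text/processing-instruction('insert-start')")
# The text following the instruction lives in its tail.
text_after_pi = pis[0].tail.strip()

# string() concatenates the descendant text nodes of the element;
# processing instructions contribute nothing to that value.
all_text = root.xpath("string(/claim/claim-text)").strip()

print(text_after_pi)
print(all_text)
```

Selecting by target name keeps working if other nodes appear before the instruction, while string() is the simpler choice when everything inside claim-text is wanted.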

守着一只汪

Access the specific child node by index.

    from xml.etree import ElementTree as ET

    tree = ET.parse('path_to_your.xml')
    root = tree.getroot()
    # root is <claim>, so root[0] is <claim-text>. The standard-library parser
    # drops processing instructions by default, leaving the text in .text.
    print(root[0].text)

Output (with the surrounding whitespace preserved):

        27. The method according to claim 23 wherein the amorphous metal is selected from the group consisting of Zr based alloys, Ti based alloys, Al based alloys, Fe based alloys, La based alloys, Cu based alloys, Mg based alloys, Pt based alloys, and Pd based alloys.
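To try the same idea without a file on disk, the sample can be parsed from a string and the surrounding whitespace stripped. This is a minimal sketch using only the standard library; the XML is a shortened stand-in for the question's sample, and looking the element up by name with find() is a swapped-in alternative to the index used above.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the sample in the question (assumption: same structure).
XML = """<claim id="CLM-00027" num="00027">
    <claim-text>    <?insert-start id="REI-00005" date="20191203" ?>27. The method according to claim 23 wherein...    <?insert-end id="REI-00005" ?></claim-text>
</claim>"""

root = ET.fromstring(XML)              # root is the <claim> element
claim_text = root.find("claim-text")   # same element as root[0] here

# The default parser discards the <? ... ?> instructions, so the text on
# either side of them ends up in the element's .text attribute.
text = claim_text.text.strip()
print(text)
```

Looking the element up by name is a little more robust than an index if the order of children in a claim ever changes.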