
XPath to find rows containing exactly one th and one td

I need help writing an XPath expression that finds the rows of a table containing only one th and one td.


Sample HTML


<!DOCTYPE html>

<html>

<head>

    <title></title>

</head>

<body>

    <table>

        <tr>

            <th>test</th>

            <td>abc</td>

        </tr>

        <tr>

            <th>test1</th>

            <td>abc</td>

            <td>abc</td>

        </tr>

        <tr>

            <th>test2</th>

            <td>abc</td>

        </tr>

    </table>

</body>

</html>

For this HTML I expect only the first and last rows. Any row that contains anything other than one th and one td should be skipped.


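I got as far as '//table/tr[th and td]', but this XPath also matches rows with duplicate td elements, and it does not filter out rows that contain other elements such as <a>.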


FFIVE
105 views, 2 answers
2 answers

宝慕林4294392

This works; it is not very elegant, but it does the job. I extended your sample HTML with a few more problematic nodes:

test = """<!DOCTYPE html>
<html>
<head>
    <title></title>
</head>
<body>
    <table>
        <tr>
            <th>test</th>
            <td>abc</td>
        </tr>
        <tr>
            <th>test1</th>
            <td>abc</td>
            <td>abc</td>
        </tr>
        <tr>
            <th>test2</th>
            <td>abc</td>
        </tr>
        <tr>
            <a>test3</a>
            <td>abcd</td>
        </tr>
        <tr>
            <td>test4</td>
            <td>abcd</td>
        </tr>
    </table>
</body>
    """

import lxml.html

doc = lxml.html.fromstring(test)
good_tags = ['th', 'td']

# Collect every row, then inspect the elements inside each one.
targs = doc.xpath('//tr')
for targ in targs:
    # Note: './/*' selects all descendant elements of the row, not only its direct children.
    tr = targ.xpath('.//*')
    # Keep the row only if it holds exactly two elements, one th and one td.
    if len(tr) == 2 and (tr[0].tag != tr[1].tag) and tr[0].tag in good_tags and tr[1].tag in good_tags:
        print(lxml.html.tostring(targ).decode())

Output:

<tr>
            <th>test</th>
            <td>abc</td>
        </tr>
<tr>
            <th>test2</th>
            <td>abc</td>
        </tr>

绝地无双

A one-liner XPath: //tr[count(./*)=2 and count(./th)=1 and count(./td)=1]
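
To check the one-liner against the extended sample from the other answer, here is a minimal sketch using lxml. The names SAMPLE, ROW_XPATH, ALT_XPATH and matching_headers are made up for this illustration, and ALT_XPATH is an equivalent form I am assuming you might prefer, not something from the original answer.

```python
import lxml.html

# Condensed copy of the extended sample: rows test1, test3 and test4 should be rejected.
SAMPLE = """<html><body><table>
  <tr><th>test</th><td>abc</td></tr>
  <tr><th>test1</th><td>abc</td><td>abc</td></tr>
  <tr><th>test2</th><td>abc</td></tr>
  <tr><a>test3</a><td>abcd</td></tr>
  <tr><td>test4</td><td>abcd</td></tr>
</table></body></html>"""

# The one-liner: exactly two element children, one of them th and one td.
ROW_XPATH = "//tr[count(./*)=2 and count(./th)=1 and count(./td)=1]"

# Hypothetical shorter variant: with exactly two children, "at least one th
# and at least one td" already implies exactly one of each.
ALT_XPATH = "//tr[count(*)=2 and th and td]"


def matching_headers(html, xpath=ROW_XPATH):
    """Return the th text of every row selected by the given XPath."""
    doc = lxml.html.fromstring(html)
    return [row.findtext("th") for row in doc.xpath(xpath)]


if __name__ == "__main__":
    print(matching_headers(SAMPLE))
    print(matching_headers(SAMPLE, ALT_XPATH))
```

Because count(./*) counts only element children, the whitespace between tags does not affect the match. Unlike the loop in the other answer, which looks at all descendants, this predicate checks direct children only, so a th or td with nested markup inside it would still qualify.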