Stripping a single element in lxml

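I need to remove an XML element while keeping its data. The lxml function strip_tags does remove elements, but it works recursively, and I want to strip a single element.

I tried the answer from this post, but remove deletes the entire element.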

from lxml import etree as ET

xml = """
<groceries>
  One <fruit state="rotten">apple</fruit> a day keeps the doctor away.
  This <fruit state="fresh">pear</fruit> is fresh.
</groceries>
"""

tree = ET.fromstring(xml)

# Remove every rotten fruit element from its parent
for bad in tree.xpath("//fruit[@state='rotten']"):
    bad.getparent().remove(bad)

print(ET.tostring(tree, pretty_print=True).decode())

I want to get:

<groceries>
  One apple a day keeps the doctor away.
  This <fruit state="fresh">pear</fruit> is fresh.
</groceries>

But I get:

<groceries>
  This <fruit state="fresh">pear</fruit> is fresh.
</groceries>

I also tried strip_tags:

for bad in tree.xpath("//fruit[@state='rotten']"):
    ET.strip_tags(bad.getparent(), bad.tag)

Output:

<groceries>
  One apple a day keeps the doctor away.
  This pear is fresh.
</groceries>

But this strips every fruit element, and I only want to strip the one with state='rotten'.

Cats萌萌


1 Answer

ibeautiful

Maybe someone else has a better idea, but here is one possible workaround:

# Assumes `from lxml import etree` and the `tree` parsed in the question
bad = tree.xpath(".//fruit[@state='rotten']")[0]  # for simplicity, I didn't bother with a for loop in this case
txt = bad.text + bad.tail  # collect the text content of bad; strangely enough it's not just 'apple', because the text after </fruit> is stored in bad.tail
bad.getparent().text += txt  # add the collected text to the parent's existing text
tree.remove(bad)  # this gets rid only of this specific 'bad'
print(etree.tostring(tree).decode())

Output:

<groceries>
  One apple a day keeps the doctor away.
  This <fruit state="fresh">pear</fruit> is fresh.
</groceries>
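The workaround above appends to the parent's text, which only lands in the right place when the stripped element is the first child of its parent. Below is a minimal sketch of a more general helper, not part of the original answer. The name strip_element is hypothetical. It relies on lxml's text/tail model: text that follows an element is stored in that element's tail, and remove() drops the tail together with the element. The helper therefore moves the element's text to the previous sibling's tail (or to the parent's text if there is no previous sibling) and re-attaches any children in the element's old position.

```python
from lxml import etree


def strip_element(elem):
    """Hypothetical helper: drop elem's tags but keep its text, children and tail."""
    parent = elem.getparent()
    if parent is None:
        raise ValueError("cannot strip the root element")

    lead = elem.text or ""
    children = list(elem)
    if children:
        # The text after elem must end up after elem's last child.
        last = children[-1]
        last.tail = (last.tail or "") + (elem.tail or "")
    else:
        # No children: elem's text and tail become one continuous string.
        lead += elem.tail or ""

    prev = elem.getprevious()
    if prev is not None:
        # Text between siblings lives in the previous sibling's tail.
        prev.tail = (prev.tail or "") + lead
    else:
        # elem is the first child, so the text belongs to the parent.
        parent.text = (parent.text or "") + lead

    index = parent.index(elem)
    parent.remove(elem)  # also discards elem.tail, which was copied above
    for offset, child in enumerate(children):
        parent.insert(index + offset, child)  # moves the child (and its tail)


xml = """<groceries>
  One <fruit state="rotten">apple</fruit> a day keeps the doctor away.
  This <fruit state="fresh">pear</fruit> is fresh.
</groceries>"""

tree = etree.fromstring(xml)
# xpath() returns a list, so modifying the tree inside the loop is safe.
for bad in tree.xpath("//fruit[@state='rotten']"):
    strip_element(bad)

result = etree.tostring(tree).decode()
print(result)
```

Because the helper looks at the previous sibling rather than always writing to the parent's text, it should also work when the rotten fruit is not the first child, which is the case the shorter workaround does not cover.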
