Remove all occurrences of a specific attribute from XML

I have an XML file with the following content:


<document>

  <section>

    <section SectionName="abstract">

     <paragraph>

    <word Endpoint="1" SciomeSRIE_Sentence.ExposureSentence="1">gutkha</word>

    <word ExposureSentence="1">split_identifier ,</word>

    <word ExposureSentence="1">and</word>

    <word ExposureSentence="1">what</word>

    <word ExposureSentence="1">role</word>

    <word ExposureSentence="1">split_identifier ,</word>

    <word ExposureSentence="1">if</word>

    <word ExposureSentence="1">any</word>

    <word ExposureSentence="1">split_identifier ,</word>

    <word ExposureSentence="1">nicotine</word>

    <word ExposureSentence="1">contributes</word>

    <word ExposureSentence="1">to</word>

    <word ExposureSentence="1">the</word>

    <word ExposureSentence="1">effects</word>

    <word ExposureSentence="1">split_identifier .</word>

    <word EB_NLP_Tagger.Participant="3" AnimalGroupSentence="1" DoseGroupSentence="1" ExposureSentence="2">Adult</word>

    <word EB_NLP_Tagger.Participant="3" Sex="1" AnimalGroupSentence="1" DoseGroupSentence="1" ExposureSentence="2">male</word>

    <word EB_NLP_Tagger.Participant="3" Species="1" AnimalGroupSentence="1" DoseGroupSentence="1" ExposureSentence="2">mice</word>

    <word AnimalGroupSentence="1" DoseGroupSentence="1" ExposureSentence="2">were</word>

    <word AnimalGroupSentence="1" DoseGroupSentence="1" ExposureSentence="2">treated</word>

    <word AnimalGroupSentence="1" DoseGroupSentence="1" ExposureSentence="2">daily</word>

    <word AnimalGroupSentence="1" DoseGroupSentence="1" ExposureSentence="2">for</word>


慕田峪7331174
2 Answers

慕标5832272

XPath makes this simple:

    import java.io.ByteArrayInputStream;
    import java.util.Arrays;
    import java.util.stream.Collectors;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathExpression;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NodeList;

    public static void main(String... args)
            throws Exception {
        // `xml` is assumed to be a String that already holds the XML document
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        Document doc = dBuilder.parse(new ByteArrayInputStream(xml.getBytes()));

        XPathFactory xPathfactory = XPathFactory.newInstance();
        XPath xpath = xPathfactory.newXPath();

        // Find word elements with ExposureSentence attribute
        XPathExpression query = xpath.compile("//word[@ExposureSentence]");
        NodeList words = (NodeList) query.evaluate(doc, XPathConstants.NODESET);
        for (int i = 0; i < words.getLength(); i++) {
            // Remove the attribute
            ((Element) words.item(i)).removeAttribute("ExposureSentence");
        }

        // Handle ComponentName: drop "ExposureSentence" from its comma-separated text
        query = xpath.compile("//ComponentName");
        NodeList componentNames = (NodeList) query.evaluate(doc, XPathConstants.NODESET);
        for (int i = 0; i < componentNames.getLength(); i++) {
            String content = componentNames.item(i).getTextContent();
            componentNames.item(i).setTextContent(
                Arrays.stream(content.split(","))
                    .map(String::trim)
                    .filter(s -> !s.equals("ExposureSentence"))
                    .collect(Collectors.joining(", ")));
        }

        // Omitted: Save the XML
    }
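The answer above leaves out the step that saves the XML. One possible way to do the whole round trip with the JDK's built-in APIs is sketched below. The class `AttributeStripper` and the method `stripAttribute` are hypothetical names chosen for illustration, and the sketch assumes the document fits in memory as a String. It selects the attribute nodes directly with `//@name` instead of selecting `word` elements, so the attribute is removed from any element that carries it.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
class AttributeStripper {

    // Returns the XML text with every attribute named attrName removed.
    static String stripAttribute(String xml, String attrName) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // Select the attribute nodes themselves, on any element.
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList attrs = (NodeList) xpath.evaluate(
                    "//@" + attrName, doc, XPathConstants.NODESET);
            for (int i = 0; i < attrs.getLength(); i++) {
                Attr attr = (Attr) attrs.item(i);
                attr.getOwnerElement().removeAttributeNode(attr);
            }

            // Serialize the modified DOM back to text.
            Transformer transformer =
                    TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not process XML", e);
        }
    }
}
```

Calling `AttributeStripper.stripAttribute(xml, "ExposureSentence")` should leave differently named attributes such as `SciomeSRIE_Sentence.ExposureSentence` untouched, because the XPath name test matches the full attribute name only. To write to a file instead of a String, the `StreamResult` can wrap a `java.io.File`.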

元芳怎么了

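I think the simplest solution is to replace every occurrence of ExposureSentence="1" using a simple regular expression. Read the entire XML content as a String and replace each occurrence of that text directly, with no XML parsing required. With XML parsing, you have to parse the document, write the manipulation logic, and then rebuild the XML infoset.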
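A minimal sketch of the regular-expression approach follows. The class `RegexAttributeStripper` and its `strip` method are hypothetical names. Two details come from the sample in the question: the attribute value is not always "1" (some words carry ExposureSentence="2"), so the pattern accepts any double-quoted value, and the attribute SciomeSRIE_Sentence.ExposureSentence ends with the same text, so the pattern requires leading whitespace to avoid cutting that name in half. The sketch assumes double-quoted values and would also alter matching text that happens to appear inside element content, which is the usual risk of editing XML without a parser.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class RegexAttributeStripper {

    // Leading whitespace is required, so "SciomeSRIE_Sentence.ExposureSentence"
    // (where the name is preceded by a dot) is not matched.
    private static final Pattern EXPOSURE =
            Pattern.compile("\\s+ExposureSentence=\"[^\"]*\"");

    // Removes the attribute, together with the whitespace before it.
    static String strip(String xml) {
        return EXPOSURE.matcher(xml).replaceAll("");
    }
}
```

For example, `RegexAttributeStripper.strip("<word ExposureSentence=\"1\">and</word>")` returns `<word>and</word>`.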