
Regex to replace text outside HTML tags

I have this HTML:


"This is simple html text <span class='simple'>simple simple text text</span> text"

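I need to match words only when they are outside any HTML tag. That is, if I want to match "simple" and "text", I should get results only from "This is simple html text" and from the trailing "text": 1 match for "simple" and 2 matches for "text". Can anyone help me? I'm using jQuery.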


var pattern = new RegExp("(\\b" + value + "\\b)", 'gi');

if (pattern.test(text)) {
    text = text.replace(pattern, "<span class='notranslate'>$1</span>");
}

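value is the word I want to match (in this case "simple")

text is "This is simple html text <span class='simple'>simple simple text text</span> text"

I need to wrap every selected word (in this example, "simple") in a <span>, but only the words that are outside any HTML tag. The result for this example should be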


This is <span class='notranslate'>simple</span> html <span class='notranslate'>text</span> <span class='simple'>simple simple text text</span> <span class='notranslate'>text</span>

I don't want to replace any text inside


<span class='simple'>simple simple text text</span>

It should stay the same as it was before the replacement.


江户川乱折腾
2 answers

白衣染霜花

OK, try this regular expression:

(text|simple)(?![^<]*>|[^<>]*</)

The example works on regex101.

Breakdown:

(            # Open capture group
  text       # Match 'text'
|            # Or
  simple     # Match 'simple'
)            # End capture group
(?!          # Negative lookahead start (will cause match to fail if contents match)
  [^<]*      # Any number of non-'<' characters
  >          # A > character
|            # Or
  [^<>]*     # Any number of non-'<' and non-'>' characters
  </         # The characters < and /
)            # End negative lookahead.

The negative lookahead prevents a match when text or simple sits inside a tag (the first alternative) or in text that is immediately followed by a closing tag (the second alternative).
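
To tie this back to the code in the question, here is a minimal sketch of how the pattern could be built from a list of words and applied with replace. The helper names escapeRegExp and wrapOutsideTags are hypothetical, introduced only for illustration; the \b word boundaries and the notranslate class come from the question, and the lookahead is the one broken down above.

```javascript
// Hypothetical helper: escape regex metacharacters so a word is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: wrap each listed word in a notranslate span,
// skipping occurrences inside a tag or directly before a closing tag.
function wrapOutsideTags(text, words) {
  const alternation = words.map(escapeRegExp).join("|");
  const pattern = new RegExp(
    "\\b(" + alternation + ")\\b(?![^<]*>|[^<>]*</)",
    "gi"
  );
  return text.replace(pattern, "<span class='notranslate'>$1</span>");
}

const input =
  "This is simple html text <span class='simple'>simple simple text text</span> text";
const output = wrapOutsideTags(input, ["simple", "text"]);
console.log(output);
```

For the sample string this should produce the result the question asks for. Note that the lookahead is a heuristic rather than an HTML parser: any word that is followed by a closing tag with no other tag in between is treated as being inside an element, so markup with nested or sibling elements may need a DOM-based approach (walking the text nodes) instead.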