
Matching the longest string in a regex OR when alternatives share a common substring

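In a regex OR (alternation), when several alternatives share a common prefix, the regex matches the first alternative listed in the OR rather than the longest one.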

For example, with

regex = (KA|KARNATAKA)
input = KARNATAKA

the output is 2 matches:

match1 = KA
match2 = KA

What I want is for the regex OR to make the longest possible match on the given input, which in my example is match1 = KARNATAKA.
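The behavior described above can be reproduced with a small Java sketch. The class name `AlternationOrderDemo` and the helper `findAll` are hypothetical names introduced only for this illustration; the pattern and input are the ones from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
final class AlternationOrderDemo {

    // Collects every match that find() reports, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Shorter alternative listed first: it is tried first and wins.
        System.out.println(findAll("(KA|KARNATAKA)", "KARNATAKA")); // [KA, KA]
        // Longer alternative listed first: the whole input is matched once.
        System.out.println(findAll("(KARNATAKA|KA)", "KARNATAKA")); // [KARNATAKA]
    }
}
```

In `java.util.regex`, alternatives are tried left to right and the first one that lets the overall match succeed is used, which is why the order inside the group changes the result.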

Here is the example in a regex client.

So what I am doing now is sorting the regex OR alternatives by length in descending order.
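A minimal sketch of that sorting workaround, assuming the alternatives are plain literal strings held in a list. The class name `LongestFirstAlternation` and its `compile` method are hypothetical; `Pattern.quote` is used so that any regex metacharacters in the literals are matched as ordinary text.

```java
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, assuming the alternatives are literal strings.
final class LongestFirstAlternation {

    // Builds "(alt1|alt2|...)" with the longest literal first.
    static Pattern compile(List<String> literals) {
        String body = literals.stream()
                .sorted(Comparator.comparingInt(String::length).reversed())
                .map(Pattern::quote) // treat each alternative as literal text
                .collect(Collectors.joining("|"));
        return Pattern.compile("(" + body + ")");
    }

    public static void main(String[] args) {
        Matcher matcher = compile(List.of("KA", "KARNATAKA")).matcher("KARNATAKA");
        while (matcher.find()) {
            System.out.println(matcher.group(1)); // prints KARNATAKA once
        }
    }
}
```

Sorting by length is enough here because a longer alternative can only be shadowed by a shorter one that is tried before it.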

My question is: can we specify in the regex itself that it should match the longest string? Or is sorting the only way?

I have already looked at this question, and I did not see any solution other than sorting.


撒科打诨
2 answers

侃侃无极

You can use word boundaries (\b) to avoid matching a prefix. For the case you mention, the following regex will match only KA or KARNATAKA as whole words:

(\bKA\b|\bKARNATAKA\b)
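A short Java sketch of how the word-boundary pattern behaves on a few inputs. The class name `WordBoundaryDemo` and the sample inputs other than KARNATAKA are made up for illustration. Note that this approach relies on the terms being delimited by non-word characters, so it is not a general longest-match mechanism.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for the word-boundary pattern from this answer.
final class WordBoundaryDemo {

    static final Pattern PATTERN = Pattern.compile("(\\bKA\\b|\\bKARNATAKA\\b)");

    // Collects every whole-word match found in the input.
    static List<String> findAll(String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = PATTERN.matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group(1));
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findAll("KARNATAKA"));        // [KARNATAKA]
        System.out.println(findAll("KA and KARNATAKA")); // [KA, KARNATAKA]
        // No boundary after KARNATAKA here, so nothing matches.
        System.out.println(findAll("KARNATAKAS"));       // []
    }
}
```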

繁星点点滴滴

You can create a helper method for this:

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public final class PatternHelper {

    // Reorders the alternatives of a parenthesized OR group so that
    // longer alternatives come before shorter ones.
    public static Pattern compileSortedOr(String regex) {
        // group(1): text before the group, group(2): the alternatives,
        // group(3): text after the group.
        Matcher matcher = Pattern.compile("(.*)\\((.*\\|.*)\\)(.*)").matcher(regex);
        if (matcher.matches()) {
            List<String> conditions = Arrays.asList(matcher.group(2).split("\\|"));
            List<String> sortedConditions = conditions.stream()
                    .sorted((c1, c2) -> c2.length() - c1.length()) // longest first
                    .collect(Collectors.toList());
            return Pattern.compile(matcher.group(1) +
                                   "(" +
                                   String.join("|", sortedConditions) +
                                   ")" +
                                   matcher.group(3));
        }
        // No OR group found: compile the expression unchanged.
        return Pattern.compile(regex);
    }
}
```

```java
Matcher matcher = PatternHelper.compileSortedOr("(KA|KARNATAKA)").matcher("KARNATAKA");
if (matcher.matches()) {
    System.out.println(matcher.group(1));
}
```

Output:

KARNATAKA

PS: This only works for simple expressions without nested parentheses. If you expect very complex expressions, it will need to be adjusted.