JavaScript regular expression that ignores matches nested inside parentheses

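How would I use JavaScript to create a regular expression that finds all the text between comma delimiters but ignores commas inside nested parentheses? For example, in the sample subject below I expect 3 matches: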

Sample subject:

one, two, start (a, b) end

Expected matches:

  1. "one"

  2. "two"

  3. "start (a, b) end"

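After spending almost a whole day trying (and failing) to figure this out, I remembered my old friend Stackoverflow. Can anyone help? Perhaps some technique other than regular expressions is better suited to this task?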


LEATH
4 answers

炎炎设计

You can write your own parser and keep a "stack" to track whether a bracket has been opened earlier. The example below works with (), [], {}, or any other pair you like, and the pairs can be nested inside each other. You can use it like this:

    const mySplit = customSplitFactory({
      delimiter: ',',
      escapedPairs: {
        '(': ')',
        '{': '}',
        '[': ']'
      }
    });

    mySplit('one, two, start (a, b) end'); // ["one"," two"," start (a, b) end"]

Code and demo:

    // Generic factory function
    function customSplitFactory({ delimiter, escapedPairs }) {
      const escapedStartChars = Object.keys(escapedPairs);

      return (str) => {
        const result = str.split('')
          // For each character
          .reduce((res, char) => {
            // If it's a start escape char `(`, `[`, ...
            if (escapedStartChars.includes(char)) {
              // Add the corresponding end char to the stack
              res.escapeStack.push(escapedPairs[char]);
              // Add the char to the current group
              res.currentGroup.push(char);
            // If it's the end escape char we were waiting for `)`, `]`, ...
            } else if (
              res.escapeStack.length &&
              char === res.escapeStack[res.escapeStack.length - 1]
            ) {
              // Remove it from the stack
              res.escapeStack.pop();
              // Add the char to the current group
              res.currentGroup.push(char);
            // If it's a delimiter and the escape stack is empty
            } else if (char === delimiter && !res.escapeStack.length) {
              if (res.currentGroup.length) {
                // Push the current group into the results
                res.groups.push(res.currentGroup.join(''));
              }
              // Reset it
              res.currentGroup = [];
            } else {
              // Otherwise, just push the char into the current group
              res.currentGroup.push(char);
            }
            return res;
          }, {
            groups: [],
            currentGroup: [],
            escapeStack: []
          });

        // If the current group was not added to the results yet
        if (result.currentGroup.length) {
          result.groups.push(result.currentGroup.join(''));
        }

        return result.groups;
      };
    }

    // Usage
    const mySplit = customSplitFactory({
      delimiter: ',',
      escapedPairs: {
        '(': ')',
        '{': '}',
        '[': ']'
      }
    });

    function demo(s) { // Just for this demo
      const res = mySplit(s);
      console.log([s, res].map(JSON.stringify).join(' // '));
    }

    demo('one, two, start (a, b) end,');   // ["one"," two"," start (a, b) end"]
    demo('one, two, start {a, b} end');    // ["one"," two"," start {a, b} end"]
    demo('one, two, start [{a, b}] end,'); // ["one"," two"," start [{a, b}] end"]
    demo('one, two, start ((a, b)) end,'); // ["one"," two"," start ((a, b)) end"]
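The results above keep the space that follows each comma, while the question expects "two" rather than " two". As a minimal sketch of the same stack idea written as a plain loop that also trims each piece, something like the following could work. The name `splitTopLevel` and the fixed set of bracket pairs are assumptions made for illustration, not part of the answer above.

```javascript
// Hypothetical helper: split on `delimiter` only when no bracket is open,
// trimming whitespace around each piece.
function splitTopLevel(str, delimiter = ',') {
  // Assumed bracket pairs: opening character -> expected closing character
  const pairs = new Map([['(', ')'], ['[', ']'], ['{', '}']]);
  const parts = [];
  const stack = []; // closing characters we are still waiting for
  let current = '';

  for (const char of str) {
    if (pairs.has(char)) {
      // Opening bracket: remember which closing bracket ends it
      stack.push(pairs.get(char));
    } else if (stack.length > 0 && char === stack[stack.length - 1]) {
      // The closing bracket we were waiting for
      stack.pop();
    } else if (char === delimiter && stack.length === 0) {
      // Top-level delimiter: finish the current piece and skip the delimiter
      parts.push(current.trim());
      current = '';
      continue;
    }
    current += char;
  }
  parts.push(current.trim());

  // Drop empty pieces, e.g. the one produced by a trailing delimiter
  return parts.filter((part) => part !== '');
}

console.log(splitTopLevel('one, two, start (a, b) end'));
// ["one", "two", "start (a, b) end"]
```

Whether empty pieces should be dropped or kept is a design choice; this sketch drops them, in the same spirit as the factory above, which skips empty groups.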

森林海

You need to consider the special case, the parentheses, and handle it first:

    var str, mtc;
    str = "one, two, start (a, b) end, hello";
    mtc = str.match(/[^,]*\([^\)]+\)[^,]+|[^,]+/g);
    console.log(mtc);
    // Expected output: ["one", " two", " start (a, b) end", " hello"]

The first alternative handles the parentheses:

    patt = /[^,]*\([^\)]+\)[^,]+/g
    // Matches any run of characters that are not commas,
    // then the character "(", then one or more characters that are not ")",
    // then ")", followed by one or more characters that are not commas

The rest is easy: the second alternative just matches a run of characters that contains no comma:

    patt = /[^,]+/g
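To get the exact matches listed in the question, the matches can be trimmed afterwards. The sketch below wraps the same pattern in a hypothetical function, `matchOutsideParens` (a name introduced here for illustration). Note that the pattern assumes a single level of parentheses; input with nested groups such as (a, (b, c), d) would still be split inside the parentheses.

```javascript
// Same pattern as above: a parenthesised segment first, plain segments second
const segmentPattern = /[^,]*\([^\)]+\)[^,]+|[^,]+/g;

// Hypothetical wrapper: returns the trimmed segments, or [] when nothing matches
function matchOutsideParens(str) {
  const matches = str.match(segmentPattern) || []; // match() returns null on no match
  return matches.map((segment) => segment.trim());
}

console.log(matchOutsideParens('one, two, start (a, b) end'));
// ["one", "two", "start (a, b) end"]
```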

千巷猫影

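As some of the comments suggest, you can use the split function. Example:

    let str = "one, two, start (a, b) end,";
    let matches = str.split(/(?<!(?:\"|\{|\()[a-zA-Z0-9]*),(?![a-zA-Z0-9]*\)|\}|\")/);

matches will be an array containing ["one", " two", " start (a, b) end", ""]. The group inside the lookbehind is written as a non-capturing group, (?:...), because split also inserts the contents of capturing groups into the resulting array.

Documentation: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/split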
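Two details are easy to miss with this approach: split adds the contents of any capturing group in the separator to the result (as undefined entries when the group did not participate), and the pieces keep their surrounding whitespace. The following sketch, under those assumptions, uses a non-capturing group and cleans up the pieces. The name `splitOutsideBrackets` is hypothetical. The lookbehind only protects a comma that directly follows an alphanumeric run after the opening bracket, so input such as (a b, c) would still be split.

```javascript
// Non-capturing group (?:...) so that split() adds no capture entries
const separator = /(?<!(?:"|\{|\()[a-zA-Z0-9]*),(?![a-zA-Z0-9]*\)|\}|")/;

// Hypothetical helper: split, trim each piece, and drop empty pieces
function splitOutsideBrackets(str) {
  return str
    .split(separator)
    .map((piece) => piece.trim())
    .filter((piece) => piece !== '');
}

console.log(splitOutsideBrackets('one, two, start (a, b) end,'));
// ["one", "two", "start (a, b) end"]
```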

摇曳的蔷薇

If you don't need to handle mismatched braces, this can be simplified to a naive balanced-brace counter.

As written, it makes a best effort and treats everything else as normal text:

- If a closing brace is detected, it tries to find the matching starting brace and closes it, treating the enclosed segment as text.
- If no starting brace is found, the closing brace is treated as normal text.

    const braces = {'{':'}','[':']','(':')'}
    // create object map of ending braces to starting braces
    const inv_braces = Object.fromEntries(Object.entries(braces).map(x=>x.reverse()))

    // pre-build break-point scanning regex
    // group1 comma detection, group2 start braces, group3 end braces
    const red = new RegExp(
      `(,)|` +
      `([${Object.keys(braces).join('')}])|` +
      `([${Object.values(braces).map(x=>`\\${x}`).join('')}])`, 'g')

    const element_extract = str => {
      let res = []
      let stack = [], next, last = -1

      // search until no more break-points found
      while(next = red.exec(str)) {
        const [,comma,begin,end] = next, {index} = next

        if(begin) stack.push(begin) // beginning brace, push to stack
        else if(end){ // ending brace, pop off stack to starting brace
          const start = stack.lastIndexOf(inv_braces[end])
          if(start!==-1) stack.length = start
        }
        // empty stack and comma, slice string and push to results
        else if(!stack.length && comma) res.push(str.slice(last+1,last=index))
      }
      if(last<str.length) res.push(str.slice(last+1)) // final element
      return res
    }

    const data = [
      "one, two, start (a, b) end",
      "one, two, start ((a, (b][,c)]) ((d,e),f)) end, two",
      "one, two ((a, (b,c)) ((d,e),f)) three, start (a, (b,c)) ((d,e),f) end, four",
      "(a, (b,c)) ((d,e)],f))"
    ]
    for(const x of data) console.log(element_extract(x))

Notes:

- Escaping could be added by adding another match group for \ and advancing the index to skip the escaped character.
- A regex string sanitizer could be added to allow matching special characters.
- A second regex that skips commas could be added as an optimization (see the edit history).
- Support for variable-length delimiters could be added by replacing the comma matcher and including the delimiter length in the calculation. The same goes for braces. For example, I could use (\s*,\s*) instead of (,) to strip whitespace, or use '{{':'}}' as braces by adapting the regex builder to use '|' instead of a character class.

I left these out for simplicity.
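The variable-length delimiter note can be sketched as follows. This is one possible reading of that note, not the answer author's code: the names `scanner`, `openerFor` and `extractElements` are hypothetical, the brace set is hard-coded, and `String.prototype.matchAll` is used in place of the `exec` loop.

```javascript
// Group 1: delimiter with optional surrounding whitespace,
// group 2: opening braces, group 3: closing braces
const scanner = /(\s*,\s*)|([{[(])|([}\])])/g;
// Closing brace -> the opening brace it pairs with
const openerFor = { '}': '{', ']': '[', ')': '(' };

function extractElements(str) {
  const result = [];
  const stack = [];
  let start = 0; // index where the current element begins

  for (const match of str.matchAll(scanner)) {
    const [whole, delimiter, open, close] = match;
    if (open) {
      stack.push(open);
    } else if (close) {
      // Unwind to the matching opening brace; ignore a stray closing brace
      const at = stack.lastIndexOf(openerFor[close]);
      if (at !== -1) stack.length = at;
    } else if (delimiter && stack.length === 0) {
      // Top-level delimiter: slice up to it, then skip its full length
      result.push(str.slice(start, match.index));
      start = match.index + whole.length;
    }
  }
  result.push(str.slice(start)); // final element
  return result;
}

console.log(extractElements('one, two, start (a, b) end'));
// ["one", "two", "start (a, b) end"]
```

Because the whitespace is consumed as part of the delimiter, the elements come out without the leading space; whitespace at the very start or end of the input is left untouched.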