Using a regex wildcard to get a tag without the surrounding text

What you want is access to a capture group. I prefer named capture groups, and a very simple helper function takes care of that:

    package main

    import (
        "fmt"
        "regexp"
    )

    // Our example input
    const input = "X-sync-status: done\r\n"

    // We anchor the regex to the beginning of the input with "^".
    // Then we have a fixed string until our capturing group begins.
    // Within our capturing group, we want all consecutive word
    // characters (letters, digits, underscore) that follow.
    const regexString = `(?i)^X-sync-status: (?P<status>\w*)`

    // MustCompile panics at startup if the expression is invalid,
    // so a regexp that compiles here is safe to use.
    var syncStatusRegexp = regexp.MustCompile(regexString)

    // The helper function...
    func namedResults(re *regexp.Regexp, in string) map[string]string {
        result := make(map[string]string)

        // ... does the matching
        match := re.FindStringSubmatch(in)
        if match == nil {
            // No match: return an empty map instead of panicking below.
            return result
        }

        // and puts the value for each named capturing group
        // into the result map
        for i, name := range re.SubexpNames() {
            if i != 0 && name != "" {
                result[name] = match[i]
            }
        }
        return result
    }

    func main() {
        fmt.Println(namedResults(syncStatusRegexp, input)["status"])
    }

Note that your current regular expression is slightly off, because it also captures the space. With your current regex, the result would be " done" instead of "done".

Edit: Of course, you can do this more cheaply without a regex:

    fmt.Print(strings.Trim(strings.Split(input, ":")[1], " \r\n"))

Edit 2: I was curious how much cheaper the split approach is, so I put together a very rough comparison:

    package main

    import (
        "fmt"
        "log"
        "regexp"
        "strings"
    )

    // Our example input
    const input = "X-sync-status: done\r\n"

    // Same expression as above: anchored at the beginning of the input,
    // followed by the fixed header name, then a named group capturing
    // all consecutive word characters.
    const regexString = `(?i)^X-sync-status: (?P<status>\w*)`

    // MustCompile panics at startup if the expression is invalid.
    var syncStatusRegexp = regexp.MustCompile(regexString)

    // statusBySplit takes everything after the first ":" and trims
    // the surrounding spaces and the trailing line break.
    func statusBySplit(in string) string {
        return strings.Trim(strings.Split(in, ":")[1], " \r\n")
    }

    // statusByRegexp returns the first capture group of the match.
    func statusByRegexp(re *regexp.Regexp, in string) string {
        return re.FindStringSubmatch(in)[1]
    }

    [...]

and a small benchmark:

    package main

    import "testing"

    func BenchmarkRegexp(b *testing.B) {
        for i := 0; i < b.N; i++ {
            statusByRegexp(syncStatusRegexp, input)
        }
    }

    func BenchmarkSplit(b *testing.B) {
        for i := 0; i < b.N; i++ {
            statusBySplit(input)
        }
    }

I then ran each of them 5 times with 1, 2, and 4 CPUs available. In my opinion, the results are quite convincing:

    go test -run=^$ -test.bench=.  -test.benchmem -test.cpu 1,2,4 -test.count=5
    goos: darwin
    goarch: amd64
    pkg: github.com/mwmahlberg/so-regex
    BenchmarkRegexp          5000000               383 ns/op              32 B/op          1 allocs/op
    BenchmarkRegexp          5000000               382 ns/op              32 B/op          1 allocs/op
    BenchmarkRegexp          5000000               382 ns/op              32 B/op          1 allocs/op
    BenchmarkRegexp          5000000               382 ns/op              32 B/op          1 allocs/op
    BenchmarkRegexp          5000000               384 ns/op              32 B/op          1 allocs/op
    BenchmarkRegexp-2        5000000               384 ns/op              32 B/op          1 allocs/op
    BenchmarkRegexp-2        5000000               382 ns/op              32 B/op          1 allocs/op
    BenchmarkRegexp-2        5000000               384 ns/op              32 B/op          1 allocs/op
    BenchmarkRegexp-2        5000000               382 ns/op              32 B/op          1 allocs/op
    BenchmarkRegexp-2        5000000               382 ns/op              32 B/op          1 allocs/op
    BenchmarkRegexp-4        5000000               382 ns/op              32 B/op          1 allocs/op
    BenchmarkRegexp-4        5000000               382 ns/op              32 B/op          1 allocs/op
    BenchmarkRegexp-4        5000000               380 ns/op              32 B/op          1 allocs/op
    BenchmarkRegexp-4        5000000               380 ns/op              32 B/op          1 allocs/op
    BenchmarkRegexp-4        5000000               377 ns/op              32 B/op          1 allocs/op
    BenchmarkSplit          10000000               161 ns/op              80 B/op          3 allocs/op
    BenchmarkSplit          10000000               161 ns/op              80 B/op          3 allocs/op
    BenchmarkSplit          10000000               164 ns/op              80 B/op          3 allocs/op
    BenchmarkSplit          10000000               165 ns/op              80 B/op          3 allocs/op
    BenchmarkSplit          10000000               162 ns/op              80 B/op          3 allocs/op
    BenchmarkSplit-2        10000000               159 ns/op              80 B/op          3 allocs/op
    BenchmarkSplit-2        10000000               167 ns/op              80 B/op          3 allocs/op
    BenchmarkSplit-2        10000000               161 ns/op              80 B/op          3 allocs/op
    BenchmarkSplit-2        10000000               159 ns/op              80 B/op          3 allocs/op
    BenchmarkSplit-2        10000000               159 ns/op              80 B/op          3 allocs/op
    BenchmarkSplit-4        10000000               159 ns/op              80 B/op          3 allocs/op
    BenchmarkSplit-4        10000000               161 ns/op              80 B/op          3 allocs/op
    BenchmarkSplit-4        10000000               159 ns/op              80 B/op          3 allocs/op
    BenchmarkSplit-4        10000000               160 ns/op              80 B/op          3 allocs/op
    BenchmarkSplit-4        10000000               160 ns/op              80 B/op          3 allocs/op
    PASS
    ok      github.com/mwmahlberg/so-regex  61.340s

This clearly shows that, for splitting off the tag, using split is more than twice as fast as the precompiled regex (roughly 160 ns/op versus roughly 382 ns/op), although it allocates more (80 B and 3 allocations per operation versus 32 B and 1). For your use case, I would clearly go with split.
