
Using a regex wildcard to get a header value without the surrounding text

I am trying to get the "done" value below, which sits in a byte slice returned at the end of a chunked HTTP stream.

X-sync-status: done\r\n

This is the Go regex I have so far:

syncStatusRegex = regexp.MustCompile("(?i)X-sync-status:(.*)\r\n")

I just want it to return this part:

(.*)

This is the code that gets the status:

syncStatus := strings.TrimSpace(string(syncStatusRegex.Find(body)))
fmt.Println(syncStatus)

How can I make it return only "done" and not the header name?


慕运维8079593
1 answer

慕少森

What you want is to access a capturing group. I prefer named capturing groups, and a very simple helper function takes care of them:

package main

import (
    "fmt"
    "regexp"
)

// Our example input
const input = "X-sync-status: done\r\n"

// We anchor the regex to the beginning of the input with "^".
// Then we have a fixed string until our capturing group begins.
// Within our capturing group, we want all consecutive word
// characters (letters, digits, underscore) that follow.
const regexString = `(?i)^X-sync-status: (?P<status>\w*)`

// We ensure our regexp is valid and can be used.
var syncStatusRegexp *regexp.Regexp = regexp.MustCompile(regexString)

// The helper function...
func namedResults(re *regexp.Regexp, in string) map[string]string {

    // ... does the matching
    match := re.FindStringSubmatch(in)

    result := make(map[string]string)

    // No match at all: return the empty map instead of indexing a nil slice.
    if match == nil {
        return result
    }

    // and puts the value for each named capturing group
    // into the result map
    for i, name := range re.SubexpNames() {
        if i != 0 && name != "" {
            result[name] = match[i]
        }
    }
    return result
}

func main() {
    fmt.Println(namedResults(syncStatusRegexp, input)["status"])
}

Note that your current regex is slightly off, because it also captures the whitespace. With your current regex, the result would be " done" instead of "done".

Edit: Of course, you can do this more cheaply without a regex:

fmt.Print(strings.Trim(strings.Split(input, ":")[1], " \r\n"))

Edit 2: I was curious how much cheaper the split approach is, so I came up with a very rough comparison:

package main

import (
    "fmt"
    "log"
    "regexp"
    "strings"
)

// Our example input
const input = "X-sync-status: done\r\n"

// We anchor the regex to the beginning of the input with "^".
// Then we have a fixed string until our capturing group begins.
// Within our capturing group, we want all consecutive word
// characters (letters, digits, underscore) that follow.
const regexString = `(?i)^X-sync-status: (?P<status>\w*)`

// We ensure our regexp is valid and can be used.
var syncStatusRegexp *regexp.Regexp = regexp.MustCompile(regexString)

// statusBySplit takes everything after the first colon and trims
// spaces and the trailing CRLF.
func statusBySplit(in string) string {
    return strings.Trim(strings.Split(in, ":")[1], " \r\n")
}

// statusByRegexp returns the first capturing group of the match.
func statusByRegexp(re *regexp.Regexp, in string) string {
    return re.FindStringSubmatch(in)[1]
}

[...]

and a small benchmark:

package main

import "testing"

func BenchmarkRegexp(b *testing.B) {
    for i := 0; i < b.N; i++ {
        statusByRegexp(syncStatusRegexp, input)
    }
}

func BenchmarkSplit(b *testing.B) {
    for i := 0; i < b.N; i++ {
        statusBySplit(input)
    }
}

I then ran each of them 5 times with 1, 2 and 4 available CPUs. IMHO, the results are quite conclusive:

go test -run=^$ -test.bench=.  -test.benchmem -test.cpu 1,2,4 -test.count=5
goos: darwin
goarch: amd64
pkg: github.com/mwmahlberg/so-regex
BenchmarkRegexp      5000000    383 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp      5000000    382 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp      5000000    382 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp      5000000    382 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp      5000000    384 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp-2    5000000    384 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp-2    5000000    382 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp-2    5000000    384 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp-2    5000000    382 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp-2    5000000    382 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp-4    5000000    382 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp-4    5000000    382 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp-4    5000000    380 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp-4    5000000    380 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp-4    5000000    377 ns/op    32 B/op    1 allocs/op
BenchmarkSplit      10000000    161 ns/op    80 B/op    3 allocs/op
BenchmarkSplit      10000000    161 ns/op    80 B/op    3 allocs/op
BenchmarkSplit      10000000    164 ns/op    80 B/op    3 allocs/op
BenchmarkSplit      10000000    165 ns/op    80 B/op    3 allocs/op
BenchmarkSplit      10000000    162 ns/op    80 B/op    3 allocs/op
BenchmarkSplit-2    10000000    159 ns/op    80 B/op    3 allocs/op
BenchmarkSplit-2    10000000    167 ns/op    80 B/op    3 allocs/op
BenchmarkSplit-2    10000000    161 ns/op    80 B/op    3 allocs/op
BenchmarkSplit-2    10000000    159 ns/op    80 B/op    3 allocs/op
BenchmarkSplit-2    10000000    159 ns/op    80 B/op    3 allocs/op
BenchmarkSplit-4    10000000    159 ns/op    80 B/op    3 allocs/op
BenchmarkSplit-4    10000000    161 ns/op    80 B/op    3 allocs/op
BenchmarkSplit-4    10000000    159 ns/op    80 B/op    3 allocs/op
BenchmarkSplit-4    10000000    160 ns/op    80 B/op    3 allocs/op
BenchmarkSplit-4    10000000    160 ns/op    80 B/op    3 allocs/op
PASS
ok      github.com/mwmahlberg/so-regex  61.340s

This clearly shows that, for extracting the header value, splitting is more than twice as fast as the precompiled regex (roughly 160 ns/op versus 380 ns/op), although it allocates more (80 B and 3 allocations per call versus 32 B and 1). For your use case, I would clearly go with split.
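
The question works on a byte slice in which the header arrives at the end of a longer body, while the examples above match a string that starts with the header. In Go's regexp package, "^" only matches at the very start of the input unless the m flag is set, so a sketch closer to the original situation might look like the following. The function name syncStatus and the sample body are assumptions made up for illustration; FindSubmatch is the []byte counterpart of FindStringSubmatch.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// (?i) makes the match case-insensitive; (?m) lets "^" match at the start
// of every line, so the header is found even after other body content.
// [ \t]* skips optional blanks after the colon and ([^\r\n]*) captures
// the value up to, but not including, the line terminator.
var syncStatusRegex = regexp.MustCompile(`(?im)^X-sync-status:[ \t]*([^\r\n]*)`)

// syncStatus (hypothetical helper) returns the header value and reports
// whether the header was present in body at all.
func syncStatus(body []byte) (string, bool) {
	m := syncStatusRegex.FindSubmatch(body)
	if m == nil {
		return "", false
	}
	// m[0] is the whole match, m[1] the first capturing group.
	return strings.TrimSpace(string(m[1])), true
}

func main() {
	// Made-up example body: some payload followed by the status header.
	body := []byte("5\r\nhello\r\n0\r\nX-sync-status: done\r\n\r\n")
	status, ok := syncStatus(body)
	fmt.Println(status, ok)
}
```

Returning a second boolean keeps "header missing" distinguishable from "header present but empty", which indexing the submatch slice directly would turn into a panic.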