
How do I get the inner HTML or just the text of a tag?

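How can we get the value of the anchor text in the example below? Here is my code. I can get the values of href and title using html.ElementNode. I need to get the text value using only golang.org/x/net/html, without any other library.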


Example: from <a href="https:xyz.com">Text XYZ</a>, I want to get "Text XYZ".


// html.ElementNode works for getting the href and title values, but I get no text value with html.TextNode.
if n.Type == html.TextNode && n.Data == "a" {
    for _, a := range n.Attr {
        if a.Key == "href" {
            text = a.Val
        }
    }
}


繁花不似锦
108 views · 1 answer
1 Answer

qq_花开花谢_0

Given the HTML:

<a href="http://example.com/1">Go to <b>example</b> 1</a>
<p>Some para text</p>
<a href="http://example.com/2">Go to <b>example</b> 2</a>

Do you expect only the text?

Go to example 1
Go to example 2

Do you expect the inner HTML?

Go to <b>example</b> 1
Go to <b>example</b> 2

Or do you expect something else?

The following program gives you both the text and the inner HTML. Each time it finds an anchor node, it saves that node and then continues down the node's subtree. For every node it visits below the saved anchor, it appends the text of each TextNode to a string and renders each direct child of the anchor into a buffer (html.Render already writes a node together with its descendants, so deeper nodes must not be rendered again). Finally, once all children have been traversed and control is back at the saved anchor node, it prints the text string and the HTML buffer, resets both, and sets the anchor node back to nil.

I got the idea of using a buffer and html.Render, and of saving a specific node, from "Golang parse HTML, extract all content with tags".

The following is also on the Playground:

package main

import (
    "bytes"
    "fmt"
    "log"
    "strings"

    "golang.org/x/net/html"
)

func main() {
    s := `
    <a href="http://example.com/1">Go to <b>example</b> 1</a>
    <p>Some para text</p>
    <a href="http://example.com/2">Go to <b>example</b> 2</a>
    `

    doc, err := html.Parse(strings.NewReader(s))
    if err != nil {
        log.Fatal(err)
    }

    var nAnchor *html.Node        // anchor currently being traversed, or nil
    var sTxt string               // text collected inside the anchor
    var bufInnerHtml bytes.Buffer // inner HTML collected inside the anchor

    var f func(*html.Node)
    f = func(n *html.Node) {
        if n.Type == html.ElementNode && n.Data == "a" {
            nAnchor = n
        }

        if nAnchor != nil {
            // Render only the anchor's direct children, so the <a> tag and
            // its attributes are skipped and nested nodes are written once.
            if n.Parent == nAnchor {
                html.Render(&bufInnerHtml, n)
            }
            if n.Type == html.TextNode {
                sTxt += n.Data
            }
        }

        for c := n.FirstChild; c != nil; c = c.NextSibling {
            f(c)
        }

        // Back at the saved anchor: print the results and reset the state.
        if n == nAnchor {
            fmt.Println("Text:", sTxt)
            fmt.Println("InnerHTML:", bufInnerHtml.String())
            sTxt = ""
            bufInnerHtml.Reset()
            nAnchor = nil
        }
    }

    f(doc)
}

Output:

Text: Go to example 1
InnerHTML: Go to <b>example</b> 1
Text: Go to example 2
InnerHTML: Go to <b>example</b> 2
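If only the text of one node is needed, the same depth-first walk can be wrapped in a small helper that returns a string instead of relying on shared state. The sketch below is only an illustration: `Node`, `NodeType`, `textOf` and `exampleAnchor` are hypothetical names, and `Node` is a stand-in that mirrors the `Type`, `Data`, `FirstChild` and `NextSibling` fields of `html.Node` so the example runs with the standard library alone. With golang.org/x/net/html the helper would presumably take a `*html.Node` and compare against `html.TextNode` instead.

```go
package main

import (
	"fmt"
	"strings"
)

// NodeType and Node are hypothetical stand-ins that mirror the fields of
// html.Node used below (Type, Data, FirstChild, NextSibling).
type NodeType int

const (
	TextNode NodeType = iota
	ElementNode
)

type Node struct {
	Type        NodeType
	Data        string // tag name for elements, text for text nodes
	FirstChild  *Node
	NextSibling *Node
}

// textOf concatenates the Data of every text node at or beneath n,
// in document order.
func textOf(n *Node) string {
	var sb strings.Builder
	var walk func(*Node)
	walk = func(n *Node) {
		if n == nil {
			return
		}
		if n.Type == TextNode {
			sb.WriteString(n.Data)
		}
		for c := n.FirstChild; c != nil; c = c.NextSibling {
			walk(c)
		}
	}
	walk(n)
	return sb.String()
}

// exampleAnchor builds, by hand, the tree for <a>Go to <b>example</b> 1</a>.
func exampleAnchor() *Node {
	tail := &Node{Type: TextNode, Data: " 1"}
	bold := &Node{
		Type:        ElementNode,
		Data:        "b",
		FirstChild:  &Node{Type: TextNode, Data: "example"},
		NextSibling: tail,
	}
	head := &Node{Type: TextNode, Data: "Go to ", NextSibling: bold}
	return &Node{Type: ElementNode, Data: "a", FirstChild: head}
}

func main() {
	fmt.Printf("%q\n", textOf(exampleAnchor())) // prints "Go to example 1"
}
```

Returning the string from a helper keeps each anchor's result independent, so no reset step is needed between anchors.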