Decoding HTML character sets in Go

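The Go project provides extension packages for this: charset (golang.org/x/net/html/charset) and encoding (golang.org/x/text/encoding). The following code ensures that the html package can parse documents correctly:

    import (
        "bufio"
        "io"

        "golang.org/x/net/html"
        "golang.org/x/net/html/charset"
        "golang.org/x/text/encoding/htmlindex"
    )

    // detectContentCharset peeks at up to 1024 bytes of r without consuming
    // them and returns the detected charset name, defaulting to "utf-8".
    func detectContentCharset(r *bufio.Reader) string {
        // For bodies shorter than 1024 bytes, Peek returns the available
        // bytes together with io.EOF, so check the data, not the error.
        if data, _ := r.Peek(1024); len(data) > 0 {
            if _, name, ok := charset.DetermineEncoding(data, ""); ok {
                return name
            }
        }
        return "utf-8"
    }

    // Decode parses the HTML body using the specified encoding and
    // returns the HTML document. The parameter is named label so that it
    // does not shadow the charset package.
    func Decode(body io.Reader, label string) (interface{}, error) {
        // Wrap the body once and keep reading from the same buffered reader,
        // so the bytes peeked during detection are not lost to the parser.
        r := bufio.NewReader(body)
        if label == "" {
            label = detectContentCharset(r)
        }
        e, err := htmlindex.Get(label)
        if err != nil {
            return nil, err
        }
        var src io.Reader = r
        if name, _ := htmlindex.Name(e); name != "utf-8" {
            src = e.NewDecoder().Reader(r)
        }
        node, err := html.Parse(src)
        if err != nil {
            return nil, err
        }
        return node, nil
    }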
