Error when unmarshalling a simple XML file in Golang

I am trying to write a very simple parser in Go for a large XML file (dblp.xml). An excerpt is shown below:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE dblp SYSTEM "dblp.dtd">
<dblp>
    <article key="journals/cacm/Gentry10" mdate="2010-04-26">
        <author>Craig Gentry</author>
        <title>Computing arbitrary functions of encrypted data.</title>
        <pages>97-105</pages>
        <year>2010</year>
        <volume>53</volume>
        <journal>Commun. ACM</journal>
        <number>3</number>
        <ee>http://doi.acm.org/10.1145/1666420.1666444</ee>
        <url>db/journals/cacm/cacm53.html#Gentry10</url>
    </article>

    <article key="journals/cacm/Gentry10" mdate="2010-04-26">
        <author>Craig Gentry Number2</author>
        <title>Computing arbitrary functions of encrypted data.</title>
        <pages>97-105</pages>
        <year>2010</year>
        <volume>53</volume>
        <journal>Commun. ACM</journal>
        <number>3</number>
        <ee>http://doi.acm.org/10.1145/1666420.1666444</ee>
        <url>db/journals/cacm/cacm53.html#Gentry10</url>
    </article>
</dblp>

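My code is below. Something seems to go wrong at xml.Unmarshal(byteValue, &articles), because I cannot get any of the XML values in the output. Can you help me figure out what is wrong with my code?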


犯罪嫌疑人X
1 Answer

青春有我

One specific line in your code returns an error that you are ignoring:

xml.Unmarshal(byteValue, &articles)

If you change it to

err = xml.Unmarshal(byteValue, &articles)
if err != nil {
    fmt.Println(err.Error())
}

you will see the reported error: xml: encoding "ISO-8859-1" declared but Decoder.CharsetReader is nil. As a best practice, you should always check returned errors.

To fix this, you can either remove the encoding attribute (encoding="ISO-8859-1") from the XML, which is only safe if the file's bytes are already valid UTF-8 (for example, pure ASCII), or change the unmarshalling code slightly:

package main

import (
    "encoding/xml"
    "fmt"
    "io"
    "os"

    "golang.org/x/text/encoding/charmap"
)

// Dblp contains the array of articles in the dblp XML.
type Dblp struct {
    XMLName xml.Name  `xml:"dblp"`
    Dblp    []Article `xml:"article"`
}

// Article contains the article element tags and attributes.
type Article struct {
    XMLName xml.Name `xml:"article"`
    Key     string   `xml:"key,attr"`
    Year    string   `xml:"year"`
}

func main() {
    xmlFile, err := os.Open("dblp.xml")
    if err != nil {
        fmt.Println(err)
        return // nothing to parse if the file could not be opened
    }
    fmt.Println("Successfully Opened TestDblp.xml")
    // Defer closing the file until main returns, after parsing is done.
    defer xmlFile.Close()

    var articles Dblp
    decoder := xml.NewDecoder(xmlFile)
    // Tell the decoder how to convert non-UTF-8 input to UTF-8.
    decoder.CharsetReader = makeCharsetReader
    err = decoder.Decode(&articles)
    if err != nil {
        fmt.Println(err)
        return
    }

    for i := 0; i < len(articles.Dblp); i++ {
        fmt.Println("Entered loop")
        fmt.Println("get title: " + articles.Dblp[i].Key)
        fmt.Println("get year: " + articles.Dblp[i].Year)
    }
}

func makeCharsetReader(charset string, input io.Reader) (io.Reader, error) {
    if charset == "ISO-8859-1" {
        // Windows-1252 agrees with ISO-8859-1 on all printable characters,
        // so it should do here; charmap.ISO8859_1 is the exact match.
        return charmap.Windows1252.NewDecoder().Reader(input), nil
    }
    return nil, fmt.Errorf("unknown charset: %s", charset)
}

Running the program above prints:

Successfully Opened TestDblp.xml
Entered loop
get title: journals/cacm/Gentry10
get year: 2010
Entered loop
get title: journals/cacm/Gentry10
get year: 2010