Parsing invalid XML in Go

There is an XML feed at http://www.guru.com/rss/jobs/. When I try to parse it with encoding/xml, I get this error:


XML syntax error on line 1: invalid XML name: t


I know this XML is broken, but how can I ignore the broken part and still parse the items that come before it?


The last item in the XML looks like this:


<item>
    <title>Online Ad Posting Data Entry Jobs</t
    <?xml version="1.0" encoding="utf-8"?>
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
        <channel>
            <title>Guru Jobs</title>
            <link>http://www.guru.com</link>
            <description>Guru Jobs</description>
            <lastBuildDate>Sun, 15 Nov 2015 11:04:51 GMT</lastBuildDate>
            <language>en-us</language>
            <atom:link href='http://www.guru.com/rss/jobs/' rel="self" type="application/rss+xml" />
        </channel>
    </rss>
    itle>
    <link>http://www.guru.com/jobs/online-ad-posting-data-entry-jobs/1189496</link>
    <guid>http://www.guru.com/jobs/online-ad-posting-data-entry-jobs/1189496</guid>
</item>

Code sample:


// Rss2 collects every <item> found under <channel>.
type Rss2 struct {
    ItemList []Item `xml:"channel>item"`
}

// Item holds the fields of a single feed entry.
type Item struct {
    Title       string `xml:"title"`
    Link        string `xml:"link"`
    Description string `xml:"description"`
    PubDate     string `xml:"pubDate"`
    GUID        string `xml:"guid"`
}

r := Rss2{}
reader := bytes.NewReader(xmlRead)
decoder := xml.NewDecoder(reader)
decoder.CharsetReader = charset.NewReaderLabel
decoder.Strict = false
err = decoder.Decode(&r)
if err != nil {
    // Print the error directly rather than using it as a format string.
    fmt.Println(err)
}


素胚勾勒不出你
2 answers

慕桂英546537

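XML tags have to be opened and closed properly. Judging from the XML you posted, the XML declaration is not at the start of the document. <?xml version="1.0" encoding="utf-8"?> should be the very first thing in the file. Hope this helps.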
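To check whether a payload has this problem, one option is to look for an XML declaration anywhere other than byte offset 0. The following is only a minimal sketch: `declOffsets` is a hypothetical helper, not part of encoding/xml, and the sample bytes are a shortened stand-in for the feed in the question.

```go
package main

import (
	"bytes"
	"fmt"
)

// declOffsets returns the byte offset of every "<?xml" marker in data.
// Note: this is a plain byte search, so it also matches other processing
// instructions whose target starts with "xml" (e.g. <?xml-stylesheet ...?>).
func declOffsets(data []byte) []int {
	marker := []byte("<?xml")
	var offsets []int
	for pos := 0; ; {
		i := bytes.Index(data[pos:], marker)
		if i < 0 {
			return offsets
		}
		offsets = append(offsets, pos+i)
		pos += i + len(marker)
	}
}

func main() {
	// Shortened stand-in for the broken tail of the feed.
	feed := []byte(`<item><title>Online Ad Posting Data Entry Jobs</t<?xml version="1.0" encoding="utf-8"?><rss version="2.0">`)
	for _, off := range declOffsets(feed) {
		if off != 0 {
			fmt.Printf("stray XML declaration at byte offset %d\n", off)
		}
	}
}
```

A declaration at any offset other than 0 suggests that two documents were concatenated or that one response was written into the middle of another.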

翻翻过去那场雪

The XML in the question appears to be malformed. Here is a corrected version of the XML file, along with the Go code.

XML file:

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
    <title>Guru Jobs</title>
    <link>http://www.guru.com</link>
    <description>Guru Jobs</description>
    <lastBuildDate>Sun, 15 Nov 2015 11:04:51 GMT</lastBuildDate>
    <language>en-us</language>
    <atom:link href='http://www.guru.com/rss/jobs/' rel="self" type="application/rss+xml" />
    <item>
        <title>Imaging for Bespoke Curtain Website</title>
        <link>http://www.guru.com/jobs/imaging-for-bespoke-curtain-website/1203083</link>
        <guid>http://www.guru.com/jobs/imaging-for-bespoke-curtain-website/1203083</guid>
        <description><![CDATA[<b>Description:</b> Hi,We are currently developing a made to measure curtain website and are looking for help in develo...<br><b>Category:</b> Web, Software & IT<br><b>Required skills:</b> ecommerce, imaging software, opencart, web development<br><b>Fixed Price budget:</b> $500-$1k<br><b>Job type:</b> Public<br><b>Freelancer Location:</b> Worldwide<br>]]>
        </description>
        <pubDate>Mon, 04 Jan 2016 12:14:09 GMT</pubDate>
    </item>
</channel>
</rss>

Sample Go code:

package main

import (
    "encoding/xml"
    "fmt"
    "io/ioutil"
)

// Rss2 collects every <item> found under <channel>.
type Rss2 struct {
    ItemList []Item `xml:"channel>item"`
}

// Item holds the fields of a single feed entry.
type Item struct {
    Title       string `xml:"title"`
    Link        string `xml:"link"`
    Description string `xml:"description"`
    PubDate     string `xml:"pubDate"`
    GUID        string `xml:"guid"`
}

func main() {
    r := Rss2{}
    // Read the corrected XML file from disk; stop if it cannot be read.
    xmlContent, err := ioutil.ReadFile("example2.xml")
    if err != nil {
        panic(err)
    }
    if err := xml.Unmarshal(xmlContent, &r); err != nil {
        panic(err)
    }
    fmt.Println("RSS item :", r)
}

You can now iterate over r.ItemList and pick out the data you need.
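If the feed cannot be corrected at the source, a different approach (not the one used in the answer above) is to read it token by token and decode one <item> at a time, keeping whatever was decoded before the first syntax error. The sketch below only illustrates that idea: `parseItemsLenient` and `sampleFeed` are hypothetical names, and the sample data is a shortened stand-in for the real feed.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// Item mirrors a subset of the struct from the question.
type Item struct {
	Title string `xml:"title"`
	Link  string `xml:"link"`
	GUID  string `xml:"guid"`
}

// sampleFeed is made-up data: two complete items followed by a
// truncated one, similar in shape to the broken feed in the question.
const sampleFeed = `<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
    <title>Guru Jobs</title>
    <item>
        <title>First job</title>
        <link>http://example.com/jobs/1</link>
    </item>
    <item>
        <title>Second job</title>
        <link>http://example.com/jobs/2</link>
    </item>
    <item>
        <title>Online Ad Posting Data Entry Jobs</t
    <?xml version="1.0" encoding="utf-8"?>
    <rss version="2.0">`

// parseItemsLenient walks the token stream and decodes each <item>
// element on its own. On the first error it returns the items decoded
// so far together with that error, so the caller can decide what to do.
func parseItemsLenient(data string) ([]Item, error) {
	dec := xml.NewDecoder(strings.NewReader(data))
	var items []Item
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return items, nil // clean end of input
		}
		if err != nil {
			return items, err // syntax error outside an item
		}
		start, ok := tok.(xml.StartElement)
		if !ok || start.Name.Local != "item" {
			continue
		}
		var it Item
		if err := dec.DecodeElement(&it, &start); err != nil {
			return items, err // this item is broken; keep the earlier ones
		}
		items = append(items, it)
	}
}

func main() {
	items, err := parseItemsLenient(sampleFeed)
	if err != nil {
		fmt.Println("stopped early:", err)
	}
	for _, it := range items {
		fmt.Println(it.Title, it.Link)
	}
}
```

Returning the partial slice together with the error lets the caller log the problem and still use the items that parsed cleanly.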
