
How do I parse values from a span in a web page?

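I am trying to scrape the list of names of the top products from an e-commerce site. However, the result is empty, and I would like to know what is missing. The output is:

Visiting:  https://www.amazon.in/gp/bestsellers/electronics/ref=zg_bs_nav_0/
End of scraping:  https://www.amazon.in/gp/bestsellers/electronics/ref=zg_bs_nav_0/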


Code:


package main

import (
    "encoding/csv"
    "fmt"
    "log"
    "os"

    "github.com/gocolly/colly"
)

func main() {
    fetchURL := "https://www.amazon.in/gp/bestsellers/electronics/ref=zg_bs_nav_0/"
    fileName := "results.csv"
    file, err := os.Create(fileName)
    if err != nil {
        log.Fatalf("ERROR: Could not create file %q: %s\n", fileName, err)
    }
    defer file.Close()
    writer := csv.NewWriter(file)
    defer writer.Flush()

    writer.Write([]string{"Sl. No."})

    c := colly.NewCollector()

    c.OnRequest(func(r *colly.Request) {
        fmt.Println("Visiting: ", r.URL)
    })

    c.OnHTML(`.a-section a-spacing-none aok-relative`, func(e *colly.HTMLElement) {
        number := e.ChildText(".zg-badge-text")
        name := e.ChildText(".p13n-sc-truncated")

        writer.Write([]string{
            number,
            name,
        })
    })

    c.Visit(fetchURL)
    fmt.Println("End of scraping: ", fetchURL)
}


qq_花开花谢_0
1 answer

慕哥6287543

You need to add a User-Agent header for the data to be returned. p13n-sc-truncated also appears to be a generated class name. You can use the following example:

package main

import (
    "log"
    "strings"

    "github.com/gocolly/colly"
)

// AmazonData holds one entry from the bestseller list.
type AmazonData struct {
    Index int
    Link  string
    Title string
}

func main() {
    c := colly.NewCollector()

    var data []AmazonData
    count := 1

    // Walk the ordered bestseller list and collect each item's link and title.
    c.OnHTML(`#zg-ordered-list`, func(e *colly.HTMLElement) {
        e.ForEach("li .zg-item", func(_ int, elem *colly.HTMLElement) {
            link := elem.DOM.Find("a")
            linkHref, _ := link.Attr("href")
            data = append(data, AmazonData{
                Index: count,
                Link:  linkHref,
                Title: strings.TrimSpace(link.Find("div").Text()),
            })
            count++
        })
        log.Println(data)
    })

    // Send a browser-like User-Agent with every request.
    c.OnRequest(func(r *colly.Request) {
        r.Headers.Set("User-Agent", "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36")
    })

    if err := c.Visit("https://www.amazon.in/gp/bestsellers/electronics/ref=zg_bs_nav_0/"); err != nil {
        log.Fatal(err)
    }
}