Reading a csv into [][]byte

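I'm seeing some strange behavior when reading a csv file into a two-dimensional byte slice. The first 42 rows are fine; after that, extra line endings seem to end up in the data, which corrupts the rows that were already stored: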


First row during the first 42 appends:


row 0: 504921600000000000,truck_0,South,Trish,H-2,v2.3,1500,150,12,52.31854,4.72037,124,0,221,0,25

First rows after the 43rd row has been appended:


row 0: 504921600000000000,truck_49,South,Andy,F-150,v2.0,2000,200,15,38.9349,179.94282,289,0,269,0

row 1: 25

Minimal code to reproduce the problem:


package main

import (
    "bufio"
    "log"
    "os"
)

type fileDataSource struct {
    scanner *bufio.Scanner
}

type batch struct {
    rows [][]byte
}

func (b *batch) Len() uint {
    return uint(len(b.rows))
}

func (b *batch) Append(row []byte) {
    b.rows = append(b.rows, row)
    for index, row := range b.rows {
        log.Printf("row %d: %s\n", index, string(row))
    }
    if len(b.rows) > 43 {
        log.Fatalf("asdf")
    }
}

type factory struct{}

func (f *factory) New() *batch {
    return &batch{rows: make([][]byte, 0)}
}

func main() {
    file, _ := os.Open("/tmp/data1.csv")
    scanner := bufio.NewScanner(bufio.NewReaderSize(file, 4<<20))
    b := batch{}

    for scanner.Scan() {
        b.Append(scanner.Bytes())
    }
}

I expect rows [][]byte to contain the csv data line by line.



慕标5832272
2 answers

湖上湖

As already suggested, you really should consider using encoding/csv. That said, the cause of your problem is explained in the godoc above the Bytes() function:

// Bytes returns the most recent token generated by a call to Scan.
// The underlying array may point to data that will be overwritten
// by a subsequent call to Scan. It does no allocation.
func (s *Scanner) Bytes() []byte {
    return s.token
}

So the returned byte slice may be modified by a subsequent call to Scan(). To avoid this, you need to copy the byte slice, for example:

for scanner.Scan() {
    row := scanner.Bytes()
    bs := make([]byte, len(row))
    copy(bs, row)
    b.Append(bs)
}

哔哔one

You need to make a copy of the data returned by Bytes.

https://pkg.go.dev/bufio@go1.19.3#Scanner.Bytes

Bytes returns the most recent token generated by a call to Scan. The underlying array may point to data that will be overwritten by a subsequent call to Scan. It does no allocation.

for scanner.Scan() {
    row := make([]byte, len(scanner.Bytes()))
    copy(row, scanner.Bytes())
    b.Append(row)
}

https://go.dev/play/p/Lqot-wOXiwh