
How do I track the progress of a multipart upload to S3 using Go?

I'm trying to use the PutPart method provided by Mitchell Hashimoto's goamz fork. Sadly, every time I get a part back and check its size, it reports the size of the entire file rather than just the chunk.


For example, when uploading a 15MB file, I expect to see:


Uploading...
Processing 1 part of 3 and uploaded 5242880.0 bytes.
 Processing 2 part of 3 and uploaded 5242880.0 bytes.
 Processing 3 part of 3 and uploaded 5242880.0 bytes.

Instead, I see:


Uploading...
Processing 1 part of 3 and uploaded 15728640 bytes.
 Processing 2 part of 3 and uploaded 15728640 bytes.
 Processing 3 part of 3 and uploaded 15728640 bytes.

Is this due to a problem with file.Read(partBuffer)? Any help would be greatly appreciated.

I'm using Go 1.5.1 on a Mac.


package main

import (
    "bufio"
    "fmt"
    "math"
    "net/http"
    "os"

    "github.com/mitchellh/goamz/aws"
    "github.com/mitchellh/goamz/s3"
)

func check(err error) {
    if err != nil {
        panic(err)
    }
}

func main() {
    fmt.Println("Test")

    auth, err := aws.GetAuth("XXXXX", "XXXXXXXXXX")
    check(err)

    client := s3.New(auth, aws.USWest2)

    b := s3.Bucket{
        S3:   client,
        Name: "some-bucket",
    }

    fileToBeUploaded := "testfile"
    file, err := os.Open(fileToBeUploaded)
    check(err)
    defer file.Close()

    fileInfo, _ := file.Stat()
    fileSize := fileInfo.Size()
    bytes := make([]byte, fileSize)

    // read into buffer
    buffer := bufio.NewReader(file)
    _, err = buffer.Read(bytes)
    check(err)
    filetype := http.DetectContentType(bytes)

    // set up for multipart upload
    multi, err := b.InitMulti("/"+fileToBeUploaded, filetype, s3.ACL("bucket-owner-read"))
    check(err)

    const fileChunk = 5242880 // 5MB
    totalPartsNum := uint64(math.Ceil(float64(fileSize) / float64(fileChunk)))
    parts := []s3.Part{}

    fmt.Println("Uploading...")
    for i := uint64(1); i < totalPartsNum; i++ {

        partSize := int(math.Min(fileChunk, float64(fileSize-int64(i*fileChunk))))
        partBuffer := make([]byte, partSize)

        _, err := file.Read(partBuffer)
        check(err)

        part, err := multi.PutPart(int(i), file) // write to S3 bucket part by part
        check(err)

        fmt.Printf("Processing %d part of %d and uploaded %d bytes.\n ", int(i), int(totalPartsNum), int(part.Size))
        parts = append(parts, part)
    }

    err = multi.Complete(parts)
    check(err)

    fmt.Println("\n\nPutPart upload completed")
}
沧海一幻觉

3 Answers

繁花不似锦

Your code needs a few changes to work: the reader you hand to multi.PutPart(n, r) must contain only that part's data (e.g. strings.NewReader(...)), not the whole file. Remember that PutPart sends one part of a multipart upload, reading all of its content from r, and that every part except the last must be at least 5MB in size. This is described in the goamz documentation.

The points I changed to make it work:

Here I create HeaderPart with all of the file's bytes:

HeaderPart := strings.NewReader(string(bytes))

Here io.ReadFull(HeaderPart, partBuffer) fills the whole partBuffer := make([]byte, partSize) buffer, which on each iteration corresponds to the next section of the file.

When we call multi.PutPart(int(i)+1, strings.NewReader(string(partBuffer))) we add +1 because part numbers start at 1, not 0, and instead of passing the target file we pass just that part's content via strings.NewReader.

Your code is below; it now works fine.

package main

import (
    "bufio"
    "fmt"
    "io"
    "math"
    "net/http"
    "os"
    "strings"

    "launchpad.net/goamz/aws"
    "launchpad.net/goamz/s3"
)

func check(err error) {
    if err != nil {
        panic(err)
    }
}

func main() {
    fmt.Println("Test")

    auth := aws.Auth{
        AccessKey: "xxxxxxxxxxx", // change this to yours
        SecretKey: "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    }

    client := s3.New(auth, aws.USWest2)

    b := s3.Bucket{
        S3:   client,
        Name: "some-bucket",
    }

    fileToBeUploaded := "testfile"
    file, err := os.Open(fileToBeUploaded)
    check(err)
    defer file.Close()

    fileInfo, _ := file.Stat()
    fileSize := fileInfo.Size()
    bytes := make([]byte, fileSize)

    // read into buffer
    buffer := bufio.NewReader(file)
    _, err = buffer.Read(bytes)
    check(err)
    filetype := http.DetectContentType(bytes)

    // set up for multipart upload
    multi, err := b.InitMulti("/"+fileToBeUploaded, filetype, s3.ACL("bucket-owner-read"))
    check(err)

    const fileChunk = 5242880 // 5MB
    totalPartsNum := uint64(math.Ceil(float64(fileSize) / float64(fileChunk)))
    parts := []s3.Part{}

    fmt.Println("Uploading...")

    // a reader over the whole file contents; each iteration reads the next chunk from it
    HeaderPart := strings.NewReader(string(bytes))

    for i := uint64(0); i < totalPartsNum; i++ {

        partSize := int(math.Min(fileChunk, float64(fileSize-int64(i*fileChunk))))
        partBuffer := make([]byte, partSize)

        n, errx := io.ReadFull(HeaderPart, partBuffer)
        check(errx)

        // part numbers start at 1; pass only this part's content, not the whole file
        part, err := multi.PutPart(int(i)+1, strings.NewReader(string(partBuffer)))
        check(err)

        fmt.Printf("Processing %d part of %d and uploaded %d bytes.\n ", int(i)+1, int(totalPartsNum), n)
        parts = append(parts, part)
    }

    err = multi.Complete(parts)
    check(err)

    fmt.Println("\n\nPutPart upload completed")
}

Qyouu

The data you read into partBuffer is not used at all. You pass file to multi.PutPart, and it reads the entire content of file, seeking back to the start as necessary, which throws away all the work you have done. The minimal change to your code would be to pass bytes.NewReader(partBuffer) into PutPart instead of file: bytes.Reader implements the io.ReadSeeker interface that PutPart needs, and it reports its size as the size of partBuffer.

An alternative approach is the io.SectionReader type. Instead of reading the data into buffers yourself, you create a series of SectionReaders over file with the sizes and offsets you want and pass them to PutPart; they pass the reads through to the underlying file reader. This works just as well, greatly reduces the amount of code you have to write (and error-check), and avoids buffering whole chunks of data in RAM unnecessarily.
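For illustration, a minimal sketch of the io.SectionReader approach described above (an editor's sketch, not part of the original answer). It reuses the question's variables (file, fileSize, fileChunk, totalPartsNum, multi, parts, check) and only shows the shape of the upload loop:

// replacement for the question's upload loop: no per-part buffers needed
for i := int64(0); i < int64(totalPartsNum); i++ {
    offset := i * fileChunk
    size := int64(math.Min(fileChunk, float64(fileSize-offset)))

    // io.NewSectionReader reads size bytes of file starting at offset and
    // implements io.ReadSeeker, which is what PutPart expects
    section := io.NewSectionReader(file, offset, size)

    part, err := multi.PutPart(int(i)+1, section)
    check(err)

    fmt.Printf("Processing %d part of %d and uploaded %d bytes.\n", i+1, totalPartsNum, part.Size)
    parts = append(parts, part)
}

Because SectionReader uses ReadAt under the hood, it does not matter that the file offset was already advanced by the earlier content-type detection read.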

噜噜哒

The problem here is probably caused by not reading the file fully. Read can be a little subtle:

Read reads up to len(p) bytes into p. It returns the number of bytes read (0 <= n <= len(p)) and any error encountered. Even if Read returns n < len(p), it may use all of p as scratch space during the call. If some data is available but not len(p) bytes, Read conventionally returns what is available instead of waiting for more.

So you should probably use io.ReadFull or (better) io.CopyN.

That said, I think you should try switching to the official AWS Go SDK. It has a handy uploader that takes care of all of this for you:

package main

import (
    "log"
    "os"

    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func main() {
    bucketName := "test-bucket"
    keyName := "test-key"

    file, err := os.Open("example")
    if err != nil {
        log.Fatalln(err)
    }
    defer file.Close()

    sess := session.New()
    uploader := s3manager.NewUploader(sess)

    // Perform an upload.
    result, err := uploader.Upload(&s3manager.UploadInput{
        Bucket: &bucketName,
        Key:    &keyName,
        Body:   file,
    })
    if err != nil {
        log.Fatalln(err)
    }
    log.Println(result)
}

You can find more documentation on godoc.org.
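For completeness, a rough sketch of how progress reporting could be layered on top of the s3manager uploader (an editor's addition, not part of the original answer). progressReader is a made-up helper that counts bytes as the uploader reads them from the file, which approximates upload progress since reads run slightly ahead of the actual part uploads:

package main

import (
    "log"
    "os"
    "sync/atomic"

    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

// progressReader wraps a file and logs how many bytes have been read so far.
type progressReader struct {
    f     *os.File
    read  int64
    total int64
}

func (r *progressReader) Read(p []byte) (int, error) {
    n, err := r.f.Read(p)
    if n > 0 {
        read := atomic.AddInt64(&r.read, int64(n))
        log.Printf("read %d of %d bytes for upload", read, r.total)
    }
    return n, err
}

func main() {
    file, err := os.Open("example")
    if err != nil {
        log.Fatalln(err)
    }
    defer file.Close()

    info, err := file.Stat()
    if err != nil {
        log.Fatalln(err)
    }

    bucketName := "test-bucket"
    keyName := "test-key"

    sess := session.New()
    uploader := s3manager.NewUploader(sess)

    // Body is a plain io.Reader here (not a Seeker), so the uploader buffers
    // it into parts itself; the wrapper still sees every read.
    _, err = uploader.Upload(&s3manager.UploadInput{
        Bucket: &bucketName,
        Key:    &keyName,
        Body:   &progressReader{f: file, total: info.Size()},
    })
    if err != nil {
        log.Fatalln(err)
    }
    log.Println("upload completed")
}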