Recursively traverse all files in all folders as fast as possible in Go

I'm facing a problem that, even after a day spent on forums, I still can't fully understand or solve.


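So here it is: I wrote a function that loops through all folders and their subfolders. It does two things:
- For each file found, it lists the file's name.
- For each folder found, it calls the same parent function again to look for child files and folders.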


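To put it simply, this function recursively lists every file in the tree. But my goal is to do this as fast as possible, so I start a new goroutine every time I encounter a new folder.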


Problem:

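My problem is that when the tree is too large (too many folders nested inside folders and subfolders...), the program spawns too many threads and fails with an error. So I raised that limit, but then the machine itself could no longer keep up :/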


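So my question is: how can I build a worker system (with a pool size) that fits my code? However I look at it, I don't see how to say, for example, "spawn new goroutines up to a certain limit, then wait until the buffer has drained".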


Source code:

https://github.com/LaM0uette/FilesDIR/tree/V0.5


Main:


package main

import (
    "FilesDIR/globals"
    "FilesDIR/task"
    "fmt"
    "log"
    "runtime/debug"
    "sync"
    "time"
)

func main() {
    timeStart := time.Now()
    debug.SetMaxThreads(5 * 1000)

    var wg sync.WaitGroup

    // task.DrawStart()

    /*
        err := task.LoopDir(globals.SrcPath)
        if err != nil {
            log.Print(err.Error())
        }
    */

    // globals.SrcPath is my path with ~2,000,000 files (a server at my company).
    err := task.LoopDirsFiles(globals.SrcPath, &wg)
    if err != nil {
        log.Print(err.Error())
    }

    wg.Wait()

    fmt.Println("FINI: Nb Fichiers: ", task.Id)

    timeEnd := time.Since(timeStart)
    fmt.Println(timeEnd)
}



肥皂起泡泡
1 answer

撒科打诨

If you don't want to use any external packages, you can write a separate worker routine that processes files, then start as many workers as you like. After that, walk the tree recursively in your main thread and send jobs to the workers. Whenever a worker is free, it picks the next job from the jobs channel and processes it.

package main

import (
    "FilesDIR/globals"
    "fmt"
    "log"
    "os"
    "path/filepath"
    "sync"
)

var (
    wg   sync.WaitGroup // value, not pointer: a nil *sync.WaitGroup would panic
    jobs = make(chan string)
)

// loopFilesWorker prints the file names of every directory received on jobs.
func loopFilesWorker() {
    for path := range jobs {
        entries, err := os.ReadDir(path)
        if err != nil {
            // Log and carry on, so one unreadable directory does not stop the worker.
            log.Print(err)
            wg.Done()
            continue
        }
        for _, entry := range entries {
            if !entry.IsDir() {
                fmt.Println(entry.Name())
            }
        }
        wg.Done()
    }
}

func LoopDirsFiles(path string) error {
    entries, err := os.ReadDir(path)
    if err != nil {
        return err
    }
    // Add this path as a job for the workers.
    // wg.Add runs before the goroutine starts, so wg.Wait cannot return early.
    // The send happens in a goroutine because it blocks while every worker is busy.
    wg.Add(1)
    go func() {
        jobs <- path
    }()
    for _, entry := range entries {
        if entry.IsDir() {
            // Recursively go further down the tree.
            if err := LoopDirsFiles(filepath.Join(path, entry.Name())); err != nil {
                log.Print(err)
            }
        }
    }
    return nil
}

func main() {
    // Start as many workers as you want; here, 10.
    for w := 1; w <= 10; w++ {
        go loopFilesWorker()
    }
    // Start the recursion.
    if err := LoopDirsFiles(globals.SrcPath); err != nil {
        log.Print(err)
    }
    wg.Wait()
    close(jobs) // lets the workers exit their range loops
}