C# parallel copy - small file issue

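I have a C# Azure Function that reads file content from a Blob and writes it to an Azure Data Lake destination. The code works for large files (~8 MB and above), but for small files the destination file is written with 0 bytes. I tried changing the chunk size to a smaller number and the number of parallel threads to 1, but the behavior stays the same. I am simulating the code in Visual Studio 2017.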

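Please find the code snippet I am using below. I have read the documentation on the limitations of Parallel.ForEach, but did not come across anything specific to a file size issue. (https://docs.microsoft.com/en-us/dotnet/standard/parallel-programming/potential-pitfalls-in-data-and-task-parallelism)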

    int bufferLength = 1 * 1024 * 1024; // 1 MB chunk
    long blobRemainingLength = blob.Properties.Length;
    var outPutStream = new MemoryStream();
    Queue<KeyValuePair<long, long>> queues = new Queue<KeyValuePair<long, long>>();
    long offset = 0;

    // Split the blob into (offset, length) ranges of at most bufferLength bytes.
    while (blobRemainingLength > 0)
    {
        long chunkLength = (long)Math.Min(bufferLength, blobRemainingLength);
        queues.Enqueue(new KeyValuePair<long, long>(offset, chunkLength));
        offset += chunkLength;
        blobRemainingLength -= chunkLength;
    }

    Console.WriteLine("Number of Queues: " + queues.Count);

    Parallel.ForEach(queues,
        new ParallelOptions()
        {
            // Gets or sets the maximum number of concurrent tasks
            MaxDegreeOfParallelism = 10
        }, (queue) =>
        {
            using (var ms = new MemoryStream())
            {
                // Download one range of the blob, blocking until it completes.
                blob.DownloadRangeToStreamAsync(ms, queue.Key, queue.Value).GetAwaiter().GetResult();

                // mystream is the destination stream; it is not declared in this snippet.
                lock (mystream)
                {
                    var bytes = ms.ToArray();
                    Console.WriteLine("Processing on thread {0}", Thread.CurrentThread.ManagedThreadId);
                    mystream.Write(bytes, 0, bytes.Length);
                }
            }
        });
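The chunk-planning loop above can be checked in isolation. Below is a minimal sketch, not the asker's actual code: `ChunkPlanner`, `PlanChunks`, and `CopyInOrder` are hypothetical names, and a byte array stands in for the blob so the example runs without the Azure SDK. It suggests that a blob smaller than the buffer still produces exactly one range, so the loop itself is unlikely to be what drops small files. It also stores each downloaded part by index, so the write order does not depend on which thread finishes first.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public static class ChunkPlanner
{
    // Splits totalLength bytes into (offset, length) pairs of at most bufferLength bytes.
    public static List<KeyValuePair<long, long>> PlanChunks(long totalLength, int bufferLength)
    {
        var chunks = new List<KeyValuePair<long, long>>();
        long offset = 0;
        long remaining = totalLength;
        while (remaining > 0)
        {
            long chunkLength = Math.Min(bufferLength, remaining);
            chunks.Add(new KeyValuePair<long, long>(offset, chunkLength));
            offset += chunkLength;
            remaining -= chunkLength;
        }
        return chunks;
    }

    // Copies source to destination in parallel chunks.
    // Assumption: the source array is a stand-in for the blob range download.
    public static void CopyInOrder(byte[] source, Stream destination, int bufferLength)
    {
        var chunks = PlanChunks(source.Length, bufferLength);
        var parts = new byte[chunks.Count][];

        Parallel.For(0, chunks.Count, new ParallelOptions { MaxDegreeOfParallelism = 10 }, i =>
        {
            var part = new byte[chunks[i].Value];
            Array.Copy(source, chunks[i].Key, part, 0, part.Length);
            parts[i] = part; // slot i belongs to chunk i, so no lock is needed
        });

        // Write sequentially in offset order, then flush the destination.
        foreach (var part in parts)
        {
            destination.Write(part, 0, part.Length);
        }
        destination.Flush();
    }

    public static void Main()
    {
        var small = PlanChunks(3, 1024 * 1024);
        Console.WriteLine("Chunks for a 3-byte blob: " + small.Count);
    }
}
```

This version holds every part in memory until the final write, which trades memory for ordering; for very large blobs a bounded approach would be preferable.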

小怪兽爱吃肉

1 Answer

慕神8447489

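I found the problem in my code. The ADL stream writer was not being flushed and disposed properly. After adding the necessary code, the parallel copy works correctly for both small and large files.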
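The thread does not show the added code, so the following is only a sketch of the general effect, under stated assumptions: `BufferedStream` over a `MemoryStream` stands in for the ADL stream writer and its destination, and `FlushDemo` and `BytesVisibleBeforeDispose` are hypothetical names. It illustrates how a payload smaller than a writer's internal buffer can sit in that buffer until the writer is flushed or disposed, which would be consistent with small files arriving as 0 bytes.

```csharp
using System;
using System.IO;

public static class FlushDemo
{
    // Writes payload through a buffered writer and reports how many bytes
    // had reached the destination before the writer was flushed and disposed.
    public static long BytesVisibleBeforeDispose(MemoryStream destination, byte[] payload, int bufferSize)
    {
        long visible;
        using (var writer = new BufferedStream(destination, bufferSize))
        {
            writer.Write(payload, 0, payload.Length);

            // A payload smaller than bufferSize is still held by the writer here.
            visible = destination.Length;

            // Push buffered bytes to the destination explicitly.
            writer.Flush();
        } // Dispose flushes again if needed and closes the destination stream.
        return visible;
    }

    public static void Main()
    {
        var destination = new MemoryStream();
        var payload = new byte[100];

        long before = BytesVisibleBeforeDispose(destination, payload, 1024 * 1024);

        Console.WriteLine("Bytes at destination before flush: " + before);
        // ToArray remains usable after the MemoryStream has been closed.
        Console.WriteLine("Bytes at destination after dispose: " + destination.ToArray().Length);
    }
}
```

In the asker's code the equivalent step would presumably be wrapping `mystream` in a `using` block, or calling `Flush()` and `Dispose()` on it once `Parallel.ForEach` returns.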
