Getting bad images after data augmentation in PyTorch

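You are converting the images to PyTorch tensors, and in PyTorch images have the size [C, H, W]. When you visualise them, you convert the tensors back to NumPy arrays, where images have the size [H, W, C]. You are therefore trying to rearrange the dimensions, but you are using torch.reshape, which does not swap dimensions; it only partitions the same data differently.

An example makes this clearer:

    import torch

    # Incrementing numbers with size 2 x 3 x 3
    image = torch.arange(2 * 3 * 3).reshape(2, 3, 3)
    # => tensor([[[ 0,  1,  2],
    #             [ 3,  4,  5],
    #             [ 6,  7,  8]],
    #
    #            [[ 9, 10, 11],
    #             [12, 13, 14],
    #             [15, 16, 17]]])

    # Reshape keeps the same order of elements but for a different size
    # The numbers are still incrementing from left to right
    image.reshape(3, 3, 2)
    # => tensor([[[ 0,  1],
    #             [ 2,  3],
    #             [ 4,  5]],
    #
    #            [[ 6,  7],
    #             [ 8,  9],
    #             [10, 11]],
    #
    #            [[12, 13],
    #             [14, 15],
    #             [16, 17]]])

To reorder the dimensions, use permute:

    # Dimensions are swapped
    # Now the numbers increment from top to bottom
    image.permute(1, 2, 0)
    # => tensor([[[ 0,  9],
    #             [ 1, 10],
    #             [ 2, 11]],
    #
    #            [[ 3, 12],
    #             [ 4, 13],
    #             [ 5, 14]],
    #
    #            [[ 6, 15],
    #             [ 7, 16],
    #             [ 8, 17]]])

With .astype(np.uint8), the tissue images are completely black. PyTorch images are represented as floating point values in [0, 1], whereas uint8 NumPy images use integer values in [0, 255]. Casting the floating point values to np.uint8 truncates them, so the result contains only 0s and 1s: everything that was not equal to 1 is set to 0, and the whole image appears black.

You need to multiply the values by 255 to bring them into the range [0, 255]:

    import numpy as np

    img = img.permute(1, 2, 0) * 255
    img = img.numpy().astype(np.uint8)

This conversion is also done automatically when you convert the tensor to a PIL image with transforms.ToPILImage (or with TF.to_pil_image if you prefer the functional version), and PIL images can be converted directly to NumPy arrays. That way you don't have to worry about the dimensions, the range of values or the type, and the code above can be replaced with:

    import torchvision.transforms.functional as TF

    img = np.array(TF.to_pil_image(img))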
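To see the black-image effect in isolation, here is a minimal sketch that reproduces it with NumPy alone, so it runs without PyTorch installed. It uses np.transpose as the NumPy counterpart of permute. The helper name chw_float_to_hwc_uint8 and the random 3 x 4 x 5 array are only illustrative assumptions, not part of the original code.

```python
import numpy as np


def chw_float_to_hwc_uint8(img):
    """Hypothetical helper: [C, H, W] floats in [0, 1] -> [H, W, C] uint8."""
    hwc = np.transpose(img, (1, 2, 0))     # swap axes, like tensor.permute(1, 2, 0)
    scaled = np.clip(hwc, 0.0, 1.0) * 255  # bring the values into [0, 255]
    return scaled.astype(np.uint8)         # the cast truncates the fractional part


# Stand-in for an image tensor: random floats in [0, 1) with layout [C, H, W]
rng = np.random.default_rng(0)
chw = rng.random((3, 4, 5), dtype=np.float32)

# Casting without scaling: every value below 1 is truncated to 0 (black image)
naive = np.transpose(chw, (1, 2, 0)).astype(np.uint8)

# Scaling first keeps the image content
fixed = chw_float_to_hwc_uint8(chw)

print(naive.max(), fixed.shape, fixed.dtype)
```

The clip call is an extra safeguard I added for values that fall slightly outside [0, 1] after augmentation; drop it if your pipeline guarantees the range.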
