How to make an autoencoder work on a small image dataset

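I have a dataset containing three images. When I create an autoencoder and train it on these three images, the output I get is exactly the same for every image, and it looks like a blend of all three.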

My results look like this:

Input image 1:

Output image 1:

Input image 2:

Output image 2:

Input image 3:

Output image 3:

As you can see, the output is exactly the same for every input, and although it matches each input reasonably well, it is not perfect.

This is a dataset of only three images, so the reconstruction should be perfect (or at least different for each image).
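As a rough check of that expectation, here is a minimal sketch that does not use Keras at all. It assumes the network is purely linear (the Dense layers in my code have no activation) and uses NumPy's SVD instead of training; the random array is only a hypothetical stand-in for get_data(). Three points always lie in a 2-dimensional affine subspace, so a 2-number code plus a bias term should in principle be able to reconstruct all three images exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for get_data(): 3 grayscale images, 24x32, 1 channel
x_train = rng.random((3, 24, 32, 1))

# Flatten each image to a vector of 768 pixels
flat = x_train.reshape(len(x_train), -1)

# Center the data; the mean plays the role of the decoder bias
mean = flat.mean(axis=0)
centered = flat - mean

# Rows of vt are the principal directions of the centered data
_, _, vt = np.linalg.svd(centered, full_matrices=False)
components = vt[:2]                     # keep 2 directions (encoding_dim = 2)

codes = centered @ components.T         # "encode": shape (3, 2)
reconstructed = codes @ components + mean  # "decode": shape (3, 768)

max_error = float(np.max(np.abs(reconstructed - flat)))
print("largest reconstruction error:", max_error)
```

If this reasoning is right, the architecture itself has enough capacity for three images, which would point at the optimization (optimizer, learning rate, number of epochs) rather than at the bottleneck size.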

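The three-image dataset worries me because when I work with a 500-image dataset, all I get back is a blank white screen, since that is the best average of all the images.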
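To illustrate what "best average" means here, the following sketch (NumPy only, with random data as a hypothetical stand-in for the real images) compares the MSE of a few constant outputs. Among outputs that ignore the input entirely, the pixel-wise mean image gives the lowest MSE, which would explain why a network that has not yet learned to use its input settles on a blend.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for get_data(): 3 grayscale images, 24x32, 1 channel
x_train = rng.random((3, 24, 32, 1))

def mse_of_constant(images, constant):
    """MSE when the same `constant` image is predicted for every input."""
    return float(np.mean((images - constant) ** 2))

# Pixel-wise mean over the dataset
mean_image = x_train.mean(axis=0)

mse_mean = mse_of_constant(x_train, mean_image)
# Always predicting the first image is worse than predicting the mean
mse_first = mse_of_constant(x_train, x_train[0])
# Shifting the mean image by a small offset is also worse
mse_shifted = mse_of_constant(x_train, mean_image + 0.05)

print(mse_mean, mse_first, mse_shifted)
```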

I am using Keras, and the code is very simple.

from keras.models import Sequential
from keras.layers import Dense, Flatten, Reshape
import numpy as np

# returns a numpy array with shape (3, 24, 32, 1)
# there are 3 images that are each 24x32 and are black and white (1 color channel)
x_train = get_data()

# this is the size of our encoded representations
# encode down to two numbers (I have tested using 3; I still have the same issue)
encoding_dim = 2

# the shape without the batch amount
input_shape = x_train.shape[1:]

# how many output neurons we need to create an image
input_dim = np.prod(input_shape)

# simple feedforward network
# I've also tried convolutional layers; same issue
autoencoder = Sequential([
    Flatten(),             # flatten
    Dense(encoding_dim),   # encode
    Dense(input_dim),      # decode
    Reshape(input_shape)   # reshape decoding
])

# adadelta optimizer works better than adam, same issue with both
autoencoder.compile(optimizer='adadelta', loss='mse')

# train it to output the same thing it gets as input
# I've tried epochs up to 30000 with no improvement;
# still predicts the same image for all three inputs
autoencoder.fit(x_train, x_train,
                epochs=10,
                batch_size=1,
                verbose=1)

out = autoencoder.predict(x_train)

I then take the outputs (out[0], out[1], out[2]) and convert them back into images. These are the output images shown above.
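For reference, a minimal sketch of that conversion step, plus a numeric check of whether the outputs really are identical rather than judging by eye. The helper names to_uint8_image and pairwise_max_abs_diff are hypothetical, the pixel values are assumed to lie in [0, 1], and the constant array only stands in for the real result of autoencoder.predict(x_train).

```python
import numpy as np

def to_uint8_image(arr):
    """Convert an (H, W, 1) float array, assumed to be in [0, 1], to (H, W) uint8."""
    scaled = np.clip(arr, 0.0, 1.0) * 255
    return scaled.round().astype(np.uint8).squeeze(-1)

def pairwise_max_abs_diff(batch):
    """Entry [i, j] is the largest absolute pixel difference between items i and j."""
    n = len(batch)
    diffs = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            diffs[i, j] = np.max(np.abs(batch[i] - batch[j]))
    return diffs

# Hypothetical stand-in for autoencoder.predict(x_train): three identical outputs
out = np.full((3, 24, 32, 1), 0.5)

diffs = pairwise_max_abs_diff(out)
img = to_uint8_image(out[0])

print(diffs)       # all zeros here, because the stand-in outputs are identical
print(img.shape, img.dtype)
```

With real predictions, off-diagonal values that are small but nonzero would suggest the outputs are starting to separate, which is easier to track across training runs than comparing rendered images.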

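This worries me because it suggests that the autoencoder is not retaining any information about the input image, which is not how an encoder should behave.

How can I get the encoder to produce outputs that differ according to the input image?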

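Edit:

A colleague suggested dropping the autoencoder and using a 1-layer feedforward neural network instead. I tried that and the same thing happened, until I set the batch size to 1 and trained for 1400 epochs, at which point it worked perfectly. This makes me think that more epochs might solve the problem, but I am not sure.

Edit:

Training the autoencoder for 10,000 epochs (with a batch size of 3) made the second output image look different from the first and third. The same thing happened with the non-encoder version after about 400 epochs (also with a batch size of 3). This is further evidence that training for more epochs may be the solution.

Next I will test with a batch size of 1 to see whether that helps more, and then try training for many more epochs to see whether that solves the problem completely.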

不负相思意
