Because of RAM limitations, I followed these instructions and built a generator that draws mini-batches and passes them to Keras's fit_generator. But even though I subclass Sequence, Keras fails to prepare the queue with multiprocessing.
Here is my multiprocessing generator:
import os
import numpy as np
from skimage.io import imread
from skimage.transform import resize
from keras.utils import Sequence

class My_Generator(Sequence):
    def __init__(self, image_filenames, labels, batch_size):
        self.image_filenames, self.labels = image_filenames, labels
        self.batch_size = batch_size

    def __len__(self):
        # Keras expects an int here; np.ceil returns a float
        return int(np.ceil(len(self.image_filenames) / float(self.batch_size)))

    def __getitem__(self, idx):
        batch_x = self.image_filenames[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_y = self.labels[idx * self.batch_size:(idx + 1) * self.batch_size]
        return np.array([resize(imread(file_name), (200, 200))
                         for file_name in batch_x]), np.array(batch_y)
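As a sanity check of the batch arithmetic above (the filenames here are dummy data, not from my dataset), note that Keras requires `__len__` to return an int, so the float from np.ceil must be cast explicitly:

```python
import numpy as np

# Dummy file list to illustrate how __len__ and __getitem__ slice batches.
filenames = [f"{i}.csv" for i in range(7)]
batch_size = 3

# int() cast is required: Keras iterates range(len(sequence)),
# and range() rejects the float that np.ceil returns.
num_batches = int(np.ceil(len(filenames) / float(batch_size)))
batches = [filenames[i * batch_size:(i + 1) * batch_size]
           for i in range(num_batches)]
print(num_batches)   # 3
print(batches[-1])   # ['6.csv'] -- the last batch is allowed to be smaller
```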
The main code:
batch_size = 100
num_epochs = 10
training_filenames = []
mask_training = []
validation_filenames = []
mask_validation = []
I want the generator to read batches from the folder by ID, in separate threads (the IDs look like {number}.csv for the raw image and {number}_label.csv for the mask). I originally built another, more elegant class that stored all the data in a single .h5 file instead of a directory, but it hit the same blocking problem. So if you have code that does this, I'll gladly take that too.
for dirpath, _, fnames in os.walk('./train/'):
    for fname in fnames:
        if 'label' not in fname:
            training_filenames.append(os.path.abspath(os.path.join(dirpath, fname)))
        else:
            mask_training.append(os.path.abspath(os.path.join(dirpath, fname)))

for dirpath, _, fnames in os.walk('./validation/'):
    for fname in fnames:
        if 'label' not in fname:
            validation_filenames.append(os.path.abspath(os.path.join(dirpath, fname)))
        else:
            mask_validation.append(os.path.abspath(os.path.join(dirpath, fname)))
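The two os.walk loops above rely on traversal order to keep images and masks aligned. A more robust sketch (pair_by_id is a hypothetical helper, not from my code) pairs each raw file with its mask by the shared numeric ID:

```python
import os

def pair_by_id(fnames):
    """Split a mixed file list into index-aligned (images, masks) lists,
    matching '{number}.csv' with '{number}_label.csv' by the number."""
    raws, masks = {}, {}
    for path in fnames:
        base = os.path.splitext(os.path.basename(path))[0]
        if base.endswith('_label'):
            masks[base[:-len('_label')]] = path
        else:
            raws[base] = path
    # Keep only IDs that have both a raw image and a mask
    ids = sorted(set(raws) & set(masks))
    return [raws[i] for i in ids], [masks[i] for i in ids]

images, labels = pair_by_id(['./train/2.csv', './train/1_label.csv',
                             './train/1.csv', './train/2_label.csv'])
# images and labels are now index-aligned: 1.csv <-> 1_label.csv, etc.
```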
my_training_batch_generator = My_Generator(training_filenames, mask_training, batch_size)
my_validation_batch_generator = My_Generator(validation_filenames, mask_validation, batch_size)
num_training_samples = len(training_filenames)
num_validation_samples = len(validation_filenames)
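For completeness, the training call that fails looks roughly like this (a sketch; `model` is the compiled Keras model, which I have not shown). With a Sequence subclass, these keyword arguments are supposed to enable a multiprocessing-fed batch queue:

```python
model.fit_generator(
    generator=my_training_batch_generator,
    steps_per_epoch=num_training_samples // batch_size,
    epochs=num_epochs,
    validation_data=my_validation_batch_generator,
    validation_steps=num_validation_samples // batch_size,
    use_multiprocessing=True,  # safe with Sequence, unlike plain generators
    workers=4,                 # number of processes filling the queue
    max_queue_size=16)
```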