
UnidentifiedImageError: cannot identify image file

Hello, I am training a model with TensorFlow and Keras. The dataset was downloaded from https://www.microsoft.com/en-us/download/confirmation.aspx?id=54765

It is a zip archive, which I split into the following directories:


.
├── test
│   ├── Cat
│   └── Dog
└── train
    ├── Cat
    └── Dog

The test/Cat and test/Dog folders each contain 1000 jpg photos, and the train/Cat and train/Dog folders each contain 11500 jpg photos.

The data is loaded with this code:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

batch_size = 16

# Data augmentation and preprocess
train_datagen = ImageDataGenerator(rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    validation_split=0.20) # set validation split

# Train dataset
train_generator = train_datagen.flow_from_directory(
    'PetImages/train',
    target_size=(244, 244),
    batch_size=batch_size,
    class_mode='binary',
    subset='training') # set as training data

# Validation dataset
validation_generator = train_datagen.flow_from_directory(
    'PetImages/train',
    target_size=(244, 244),
    batch_size=batch_size,
    class_mode='binary',
    subset='validation') # set as validation data

test_datagen = ImageDataGenerator(rescale=1./255)

# Test dataset
test_datagen = test_datagen.flow_from_directory(
    'PetImages/test')

The model is trained with the following code:


history = model.fit(train_generator,
                    validation_data=validation_generator,
                    epochs=5)

I get the following output:


Epoch 1/5
1150/1150 [==============================] - ETA: 0s - loss: 0.0505 - accuracy: 0.9906

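But when the epoch reaches this point, I get the following error:

UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7f9e185347d0>

How can I fix this so that training can complete?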


繁花不似锦
4 Answers

呼如林

Try this script to check whether the images are in a valid format. It stops with an exception at the first file PIL cannot read, and the last path printed is the offending file.

import os
from PIL import Image

folder_path = r'data\img'
extensions = []
for fldr in os.listdir(folder_path):
    sub_folder_path = os.path.join(folder_path, fldr)
    for filee in os.listdir(sub_folder_path):
        file_path = os.path.join(sub_folder_path, filee)
        # Show the file currently being checked on a single, overwritten line
        print('** Path: {}  **'.format(file_path), end="\r", flush=True)
        # Raises UnidentifiedImageError if PIL cannot identify the file
        im = Image.open(file_path)
        rgb_im = im.convert('RGB')
        # Collect the distinct file extensions seen so far
        if filee.split('.')[1] not in extensions:
            extensions.append(filee.split('.')[1])

智慧大石

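I don't know whether this is still relevant, but for anyone who runs into the same problem in the future: in this specific case, the dog_cat dataset contains two corrupted files:

Cat/666.jpg
Dog/11702.jpg

Just delete them and it will work.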
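To locate files like these without knowing their names in advance, a quick pre-check using only the standard library can help. The sketch below is an assumption-laden heuristic, not a full decode: it flags `.jpg`/`.jpeg` files that are empty or do not begin with the JPEG Start-Of-Image marker (`FF D8`). The function name `find_suspect_jpgs` is made up for illustration, and `PetImages` is the root folder from the question. A flagged file may still be readable by PIL if it is a different image format saved under a `.jpg` name, so treat the result as a list of candidates to inspect.

```python
import os

JPEG_SOI = b"\xff\xd8"  # a JPEG stream starts with this Start-Of-Image marker


def find_suspect_jpgs(root):
    """Return sorted paths of .jpg/.jpeg files under root that are empty
    or do not start with the JPEG signature (hypothetical helper)."""
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith((".jpg", ".jpeg")):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                header = fh.read(2)  # an empty file yields b""
            if header != JPEG_SOI:
                suspects.append(path)
    return sorted(suspects)


if __name__ == "__main__":
    # 'PetImages' is the dataset root used in the question
    for p in find_suspect_jpgs("PetImages"):
        print(p)
```

Because it only reads two bytes per file, this runs quickly over the whole dataset, but it will not catch a file that has a valid header and a truncated body.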

侃侃尔雅

I ran into this problem before, so I wrote a Python script that checks whether the train and test directories contain valid image files. The file extension must be one of jpg, png, bmp, or gif, so the script checks the extension first. It then tries to read the image with cv2. If the file is not a valid image, an exception is raised. In each case the name of the bad file is printed. At the end, the list named bad_list contains the paths of the bad files. Note that the directories must be named "test" and "train".

import os
import cv2

bad_list = []
dir = r'c:\PetImages'
subdir_list = os.listdir(dir)  # list of the sub directories in the directory, i.e. train or test
for d in subdir_list:  # iterate through the sub directories train and test
    dpath = os.path.join(dir, d)  # path to sub directory
    if d in ['test', 'train']:
        class_list = os.listdir(dpath)  # list of classes, i.e. dog or cat
        for klass in class_list:  # iterate through the two classes
            class_path = os.path.join(dpath, klass)  # path to class directory
            file_list = os.listdir(class_path)  # list of files in class directory
            for f in file_list:  # iterate through the files
                fpath = os.path.join(class_path, f)
                index = f.rfind('.')  # find index of the last period in the filename
                ext = f[index+1:]  # get the file's extension
                if ext not in ['jpg', 'png', 'bmp', 'gif']:
                    print(f'file {fpath}  has an invalid extension {ext}')
                    bad_list.append(fpath)
                else:
                    try:
                        # cv2.imread returns None for an unreadable file,
                        # so accessing .shape raises and lands in the except branch
                        img = cv2.imread(fpath)
                        size = img.shape
                    except Exception:
                        print(f'file {fpath} is not a valid image file ')
                        bad_list.append(fpath)
print(bad_list)
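The script above only reports the bad files. One way to act on bad_list, sketched here as an assumption rather than part of the original answer, is to move the files out of the dataset instead of deleting them, so nothing is lost if a file was flagged by mistake. The helper name `quarantine` and the destination folder are hypothetical. The destination should sit outside the train and test directories so that flow_from_directory does not pick the files up again.

```python
import os
import shutil


def quarantine(bad_paths, root, dest):
    """Move each path in bad_paths (located under root) into dest,
    preserving its layout relative to root. Returns the new paths.
    Hypothetical helper, not part of any library."""
    moved = []
    for src in bad_paths:
        rel = os.path.relpath(src, root)          # e.g. train/Cat/666.jpg
        target = os.path.join(dest, rel)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.move(src, target)
        moved.append(target)
    return moved
```

For example, `quarantine(bad_list, r'c:\PetImages', r'c:\PetImages_bad')` would move every flagged file into a parallel folder tree, where `c:\PetImages_bad` is an illustrative path.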

慕尼黑5688855

Instead of appending to a list of corrupted files, we can also delete each file as soon as it fails to open...

import os
from PIL import Image

folder_path = r"C:\Users\ImageDatasets"
extensions = []
for fldr in os.listdir(folder_path):
    sub_folder_path = os.path.join(folder_path, fldr)
    for filee in os.listdir(sub_folder_path):
        file_path = os.path.join(sub_folder_path, filee)
        print('** Path: {}  **'.format(file_path), end="\r", flush=True)
        try:
            im = Image.open(file_path)
        except Exception:
            # PIL could not identify the file: report it and delete it
            print(file_path)
            os.remove(file_path)
            continue
        else:
            rgb_im = im.convert('RGB')
            # Collect the distinct file extensions seen so far
            if filee.split('.')[1] not in extensions:
                extensions.append(filee.split('.')[1])