
How do I load a set of images from a directory into Python to use as a training set?

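I have been able to pull datasets from URLs and links to use as training/test sets, but I would like to extend this to images. Basically, if I have 150 images of cats, how do I load those images and classify them?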


Current code, which pulls the Iris dataset from a URL:


import pandas

from pandas.plotting import scatter_matrix

import matplotlib.pyplot as plt

from sklearn import model_selection

from sklearn.metrics import classification_report

from sklearn.metrics import confusion_matrix

from sklearn.metrics import accuracy_score

from sklearn.neighbors import KNeighborsClassifier

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"

names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']

dataset = pandas.read_csv(url, names=names)

print(dataset.shape)

print(dataset.head(20))

print(dataset.loc[1])

print(dataset.describe())

print(dataset.iloc[1, 0])

dataset.hist()

plt.show()

scatter_matrix(dataset)

plt.show()


array = dataset.values

X = array[:,0:4]

Y = array[:,4]

validation_size = 0.20

seed = 7


X_train, X_validation, Y_train, Y_validation = model_selection.train_test_split(X, Y, test_size=validation_size, random_state=seed)

scoring = 'accuracy'

models = []

models.append(('KNN', KNeighborsClassifier()))

# evaluate each model in turn

results = []

names = []

for name, model in models:

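    # shuffle=True is needed for random_state to take effect (recent scikit-learn raises an error otherwise)
    kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)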

    cv_results = model_selection.cross_val_score(model, X_train, Y_train, cv=kfold, scoring=scoring)

    results.append(cv_results)

    names.append(name)

    msg = "%s: %f (%f)" % (name, cv_results.mean(), cv_results.std())

    print(msg)



fig = plt.figure()

fig.suptitle('Algorithm Comparison')

ax = fig.add_subplot(111)

plt.boxplot(results)

ax.set_xticklabels(names)

plt.show()

knn = KNeighborsClassifier()

knn.fit(X_train, Y_train)

predictions = knn.predict(X_validation)

print(accuracy_score(Y_validation, predictions))

print(confusion_matrix(Y_validation, predictions))

print(classification_report(Y_validation, predictions))


繁星点点滴滴
2 Answers

长风秋雁

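You can read images with sequential filenames using a library of your choice:

import skimage.io as ski  # imread lives in the skimage.io submodule

filenames = ['image-%03d.jpg' % n for n in range(150)]
images = []
for f in filenames:
    im = ski.imread(f)
    images.append(im)

images is then a list of images. You can also iterate over any other kind of filename, or use the os module to pick up only the files with a particular extension from a directory. The principle is the same; just build filenames however you need.

However, I would recommend pims, possibly together with a processing pipeline:

import pims
import numpy as np

images = pims.ImageSequence('images-*.jpg')

@pims.pipeline
def grayarr(im):
    # keep only the first colour channel
    return np.array(im)[:,:,0]

images = grayarr(images)

At this point you can index images with numpy-like slicing. pims is especially useful when you are dealing with too many images to hold in RAM. You can read about this in the pims documentation.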
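The os-module variant mentioned above could look roughly like the sketch below. It is only an illustration: the helper name list_images and its extensions parameter are made up for this example, and it uses nothing beyond the standard library.

```python
import os

def list_images(directory, extensions=(".jpg", ".jpeg")):
    """Return sorted paths of the files in `directory` with a matching extension.

    Hypothetical helper for illustration; not part of any library.
    """
    paths = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Compare extensions case-insensitively so 'CAT.JPG' is found as well.
        ext = os.path.splitext(name)[1].lower()
        if os.path.isfile(path) and ext in extensions:
            paths.append(path)
    return paths
```

The returned list can be used as filenames in the reading loop, for example filenames = list_images('cats') where 'cats' is a placeholder directory name.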

婷婷同学_

You can use glob to pull the files from a directory:

from PIL import Image
import glob

list_of_images = []
for filename in glob.glob('file_directory/*.jpg'):  # assuming you are dealing with .jpg files
    training_set = Image.open(filename)
    list_of_images.append(training_set)
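A list of PIL images is not yet something a scikit-learn classifier such as the KNeighborsClassifier in the question can consume; it expects a 2-D array with one row per sample. One possible way to bridge that gap is sketched below, assuming it is acceptable to convert every image to grayscale and resize it to a small fixed size. The function name images_to_matrix and the 32x32 default are choices made for this example, not something taken from the answers.

```python
import numpy as np
from PIL import Image

def images_to_matrix(images, size=(32, 32)):
    """Turn PIL images into a 2-D array with one flattened image per row.

    Hypothetical helper for illustration; `size` is an arbitrary choice.
    """
    rows = []
    for im in images:
        # Grayscale plus a fixed size, so every image yields the same number of features.
        arr = np.asarray(im.convert("L").resize(size), dtype=np.float32)
        rows.append(arr.ravel() / 255.0)  # scale pixel values to [0, 1]
    return np.stack(rows)
```

With this, X = images_to_matrix(list_of_images) could take the place of the Iris feature array, with Y holding one label per image. Note that classification only makes sense when the data contains more than one class, so 150 cat images would need to be paired with labeled images of something else.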