Applying k-nearest neighbors to a dataset in Python

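I have a dataset in a CSV file in which every attribute is numeric, and I want to apply k-nearest neighbors to it.

My code has some errors, and I'm hoping someone can help me fix them.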

叮当猫咪

3 Answers

临摹微笑

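    import numpy as np

    class knn:
        def __init__(self, x, y, k):
            self.k = k
            self.x_data = x  # training features, shape (n_samples, n_features)
            self.y_data = y  # training labels, shape (n_samples,)

        def predict(self, test):
            # Euclidean distance from the test point to every training row
            dist = np.sqrt(np.sum((self.x_data - test) ** 2, axis=1))
            # indices of the k smallest distances (k must be less than n_samples)
            closest = np.argpartition(dist, self.k)[0:self.k]
            # count the labels among those k neighbors
            a, b = np.unique(self.y_data[closest], return_counts=True)
            # most frequent label(s); ties return more than one label
            return a[np.where(b == b.max())]

Here x is the features, y is the labels, and k is the number of neighbors. I hope this helps!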
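For reference, here is a rough sketch of how this approach might be wired up to CSV data. The question doesn't show the file, so the inline `csv_text`, the assumption that the label sits in the last column, and the function name `knn_predict` are all hypothetical. The function follows the same steps as the `predict` method above, just written as a plain function so the snippet runs on its own.

```python
import io
import numpy as np

# Hypothetical stand-in for the CSV file: numeric features, label in the last column.
csv_text = """1.0,1.1,0
1.2,0.9,0
0.8,1.0,0
5.0,5.2,1
5.1,4.9,1
4.8,5.0,1
"""

# np.loadtxt accepts a file path or a file-like object.
data = np.loadtxt(io.StringIO(csv_text), delimiter=",")
x_train = data[:, :-1]             # every column except the last
y_train = data[:, -1].astype(int)  # last column, assumed to be the label

def knn_predict(x_data, y_data, test, k):
    """Return the most frequent label(s) among the k nearest training rows."""
    dist = np.sqrt(np.sum((x_data - test) ** 2, axis=1))
    closest = np.argpartition(dist, k)[:k]
    labels, counts = np.unique(y_data[closest], return_counts=True)
    return labels[counts == counts.max()]

near_first_group = knn_predict(x_train, y_train, np.array([1.0, 1.0]), k=3)
near_second_group = knn_predict(x_train, y_train, np.array([5.0, 5.0]), k=3)
print(near_first_group, near_second_group)
```

With a real file you would pass its path to `np.loadtxt` instead of the `StringIO` object. An odd `k` makes ties less likely when there are two classes.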

智慧大石

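It looks like instance1[x] and instance2[x] are strings at the point where you subtract them. You can't subtract one string from another, so change that line to something like distance += pow((int(instance1[x]) - int(instance2[x])), 2). This converts the values to int so they can be subtracted. If the values are not whole numbers, use float instead of int.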
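A minimal sketch of that fix in context, assuming the rows come from `csv.reader` (which yields every field as a string) and that the distance is the usual Euclidean one. The question's code isn't shown here, so the function name `euclidean_distance`, the `length` parameter, and the sample rows are hypothetical.

```python
import csv
import io
import math

def euclidean_distance(instance1, instance2, length):
    """Euclidean distance over the first `length` fields of two rows."""
    distance = 0.0
    for x in range(length):
        # Convert each field before subtracting; float also handles decimals.
        distance += pow(float(instance1[x]) - float(instance2[x]), 2)
    return math.sqrt(distance)

# Hypothetical two-row CSV: two numeric features followed by a label.
rows = list(csv.reader(io.StringIO("1,2,a\n4,6,b\n")))
d = euclidean_distance(rows[0], rows[1], 2)
print(d)
```

Converting once when the file is loaded, rather than inside the distance loop, would avoid repeating the conversion for every comparison.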

幕布斯6054654

Replace the line that defines sortedVotes with sortedVotes = sorted(classVotes.items(), key=operator.itemgetter(1), reverse=True)
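Here is a sketch of where that line would typically sit, assuming each neighbor is a row whose last element is the class label. The question's code isn't shown, so the function name `get_response` and the sample rows are hypothetical. In Python 3, dictionaries have `items()` but not the Python 2 method `iteritems()`, which is one common reason a line like this fails.

```python
import operator

def get_response(neighbors):
    """Return the label with the most votes among the neighbors."""
    classVotes = {}
    for neighbor in neighbors:
        label = neighbor[-1]  # assumed: label is the last element of the row
        classVotes[label] = classVotes.get(label, 0) + 1
    # Sort (label, count) pairs by count, largest first.
    sortedVotes = sorted(classVotes.items(), key=operator.itemgetter(1), reverse=True)
    return sortedVotes[0][0]

winner = get_response([[1.0, 1.1, "a"], [1.2, 0.9, "a"], [5.0, 5.2, "b"]])
print(winner)
```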
