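Challenges in Image Recognition
From top to bottom: viewpoint variation, illumination changes, scale variation, deformation, background clutter, occlusion, and intra-class variation.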
KNN
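The k-nearest neighbors algorithm (KNN) is a simple supervised classification method.
Training does nothing more than store the entire training set, which takes O(n) time for n training samples.
At prediction time, the k training samples closest to the test sample vote, and the label with the most votes becomes the prediction. k is usually chosen to be odd to reduce the chance of tied votes, as shown in the figure below.
When k=1 (Nearest Neighbor, NN), the label of the single closest training sample is used as the classification result.
KNN has two hyperparameters: the distance metric and k. The distance is usually the L1 or the L2 distance.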
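To make the two distance metrics concrete, here is a minimal sketch in NumPy. The function names `l1_distance` and `l2_distance` and the two sample vectors are illustrative assumptions, not part of the cs231n code.

```python
import numpy as np


def l1_distance(a, b):
    """L1 (Manhattan) distance: sum of absolute differences."""
    return np.sum(np.abs(a - b))


def l2_distance(a, b):
    """L2 (Euclidean) distance: square root of the sum of squared differences."""
    return np.sqrt(np.sum((a - b) ** 2))


# Hypothetical example vectors.
a = np.array([0.0, 0.0])
b = np.array([3.0, 4.0])

print(l1_distance(a, b))  # 7.0
print(l2_distance(a, b))  # 5.0
```

The two metrics can rank neighbors differently, which is why the metric is treated as a hyperparameter alongside k.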
    #encoding=utf-8
    from cs231n.data_utils import load_CIFAR10
    import numpy as np
    from cs231n.classifiers import KNearestNeighbor


    def main():
        X_train, y_train, X_test, y_test = load_CIFAR10('../cifar-10-batches-py')

        num_training = 48000
        mask = list(range(num_training))
        X_train = X_train[mask]
        y_train = y_train[mask]

        num_test = 1000
        mask = list(range(num_test))
        X_test = X_test[mask]
        y_test = y_test[mask]

        # Reshape the image data into rows
        print(X_train.shape)
        # Output: (48000, 32, 32, 3)
        X_train = np.reshape(X_train, (X_train.shape[0], -1))
        print(X_train.shape)
        # Output: (48000, 3072)
        X_test = np.reshape(X_test, (X_test.shape[0], -1))
        print(X_train.shape, X_test.shape)
        # Output: (48000, 3072) (1000, 3072)

        classifier = KNearestNeighbor()
        classifier.train(X_train, y_train)

        y_test_pred = classifier.predict(X_test, k=5)
        print(y_test_pred)

        # Compute and display the accuracy
        num_correct = np.sum(y_test_pred == y_test)
        accuracy = float(num_correct) / num_test
        print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))
        # Output: Got 348 / 1000 correct => accuracy: 0.348000


    if __name__ == "__main__":
        main()
Computing the L2 Distance
    def compute_distances_two_loops(self, X):
        # Compute every test/train distance one pair at a time.
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            for j in range(num_train):
                dists[i, j] = np.sqrt(np.sum((X[i] - self.X_train[j]) ** 2))
        return dists

    def compute_distances_one_loop(self, X):
        # Broadcast one test row against all training rows at once.
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            dists[i, :] = np.sqrt(np.sum((X[i, :] - self.X_train) ** 2, axis=1))
        return dists

    def compute_distances_no_loops(self, X):
        # Use ||x - y||^2 = ||x||^2 + ||y||^2 - 2 * x.y for all pairs at once.
        X_squared = np.sum(X ** 2, axis=1)
        Y_squared = np.sum(self.X_train ** 2, axis=1)
        XY = np.dot(X, self.X_train.T)
        squared = X_squared[:, np.newaxis] + Y_squared - 2 * XY
        # Clamp at zero: rounding error can leave tiny negative values,
        # which would turn into NaN under the square root.
        dists = np.sqrt(np.maximum(squared, 0))
        return dists
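The loop-free version rests on the expansion ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x·y, evaluated for every test/train pair through broadcasting. The sketch below checks that identity numerically against the explicit double loop. The standalone function names, the random data, and the `np.maximum` clamp against rounding error are assumptions made for illustration.

```python
import numpy as np


def dists_two_loops(X, X_train):
    """Reference implementation: one pair at a time."""
    num_test, num_train = X.shape[0], X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
        for j in range(num_train):
            dists[i, j] = np.sqrt(np.sum((X[i] - X_train[j]) ** 2))
    return dists


def dists_no_loops(X, X_train):
    """Vectorized implementation based on the squared-norm expansion."""
    x_sq = np.sum(X ** 2, axis=1)[:, np.newaxis]  # shape (num_test, 1)
    t_sq = np.sum(X_train ** 2, axis=1)           # shape (num_train,)
    cross = X @ X_train.T                         # shape (num_test, num_train)
    # Clamp tiny negative values caused by floating-point rounding.
    return np.sqrt(np.maximum(x_sq + t_sq - 2 * cross, 0.0))


# Hypothetical random data: 20 training rows and 5 test rows of dimension 8.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 8))
X = rng.normal(size=(5, 8))

slow = dists_two_loops(X, X_train)
fast = dists_no_loops(X, X_train)
print(slow.shape, np.allclose(slow, fast))  # (5, 20) True
```

The vectorized form replaces the Python-level loops with a single matrix product, which is where most of the speedup on CIFAR-10-sized inputs would be expected to come from.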
Predicting Labels
    def predict_labels(self, dists, k=1):
        """
        Given a matrix of distances between test points and training points,
        predict a label for each test point.

        Inputs:
        - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
          gives the distance between the ith test point and the jth training point.

        Returns:
        - y: A numpy array of shape (num_test,) containing predicted labels for the
          test data, where y[i] is the predicted label for the test point X[i].
        """
        num_test = dists.shape[0]
        y_pred = np.zeros(num_test)
        for i in range(num_test):
            test_row = dists[i, :]

            # np.argsort returns the indices that would sort the input
            # in ascending order.
            sorted_row = np.argsort(test_row)

            # Get the labels of the k closest training points.
            closest_y = self.y_train[sorted_row[0:k]]

            # Majority vote: pick the most frequent label.
            y_pred[i] = np.argmax(np.bincount(closest_y))

        return y_pred
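The voting step combines `np.argsort`, `np.bincount`, and `np.argmax`. A small sketch on hypothetical toy data (the `vote` helper, the distances, and the labels are all made up for illustration) shows how the prediction changes with k and what happens when votes tie.

```python
import numpy as np

# Hypothetical toy data: distances from one test point to six training points,
# and the labels of those training points.
dists_row = np.array([0.9, 0.2, 0.5, 0.1, 0.7, 0.3])
y_train = np.array([2, 1, 0, 1, 2, 0])


def vote(dists_row, y_train, k):
    """Return the majority label among the k nearest training points."""
    nearest = np.argsort(dists_row)[:k]  # indices of the k smallest distances
    closest_y = y_train[nearest]         # their labels
    counts = np.bincount(closest_y)      # counts[c] = number of votes for label c
    return int(np.argmax(counts))        # first label with the highest count


print(vote(dists_row, y_train, k=1))  # 1
print(vote(dists_row, y_train, k=3))  # 1
print(vote(dists_row, y_train, k=4))  # 0 (labels 0 and 1 tie with two votes each)
```

Because `np.argmax` returns the first maximum, a tie resolves to the smallest label value rather than to the closer neighbor. Note also that `np.bincount` requires non-negative integer labels.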
Full Code
https://github.com/misads/cs231n_assignment/tree/master/knn
References
https://blog.csdn.net/xieyi4650/article/details/53152421
https://blog.csdn.net/normol/article/details/84145071
cs231n introduction: http://cs231n.github.io/classification/