SGD implementation in Python


I know SGD has been asked about on SO before, but I would like opinions on my code, shown below:

import numpy as np
import matplotlib.pyplot as plt

# Generating data
m,n = 10000,4
x = np.random.normal(loc=0,scale=1,size=(m,4))
theta_0 = 2
theta = np.append([],[1,0.5,0.25,0.125]).reshape(n,1)
y = np.matmul(x,theta) + theta_0*np.ones(m).reshape((m,1)) + np.random.normal(loc=0,scale=0.25,size=(m,1))

# input features
x0 = np.ones([m,1])
X = np.append(x0,x,axis=1)

# defining the cost function
def compute_cost(X,y,theta_GD):
    return np.sum(np.power(y-np.matmul(np.transpose(theta_GD),X),2))/2

# initializations
theta_GD = np.append([theta_0],[theta]).reshape(n+1,1)
alp = 1e-5
num_iterations = 10000

# Batch Sum
def batch(i,j,theta_GD):
    batch_sum = 0
    for k in range(i,i+9):
        batch_sum += float((y[k]-np.transpose(theta_GD).dot(X[k]))*X[k][j])
    return batch_sum

# Gradient Step
def gradient_step(theta_current, X, y, alp,i):
    for j in range(0,n):
        theta_current[j]-= alp*batch(i,j,theta_current)/10
    theta_updated = theta_current
    return theta_updated

# gradient descent
cost_vec = []
for i in range(num_iterations):
    cost_vec.append(compute_cost(X[i], y[i], theta_GD))
    theta_GD = gradient_step(theta_GD, X, y, alp,i)
plt.plot(cost_vec)
plt.xlabel('iterations')
plt.ylabel('cost')

I am trying mini-batch GD with a batch size of 10. I get extremely oscillatory behavior in the MSE. Where is the problem? Thanks.

P.S. I am following Ng's lecture: https://www.coursera.org/learn/machine-learning/lecture/9zJUs/mini-batch-gradient-descent
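
Several things in the code above could each produce a noisy or misleading cost curve. The cost is evaluated on the single row X[i], y[i], so the plot mostly shows per-sample noise. theta_GD starts at the true parameters, so there is nothing left to learn. range(i,i+9) covers 9 rows rather than 10 and runs past the end of the data once i gets close to m. range(0,n) skips the last of the n+1 parameters. Finally, subtracting alp times the sum of (y - prediction)*x moves uphill on the squared error rather than downhill. Below is a minimal sketch of one way to write the mini-batch loop, not a definitive fix. The fixed seed, the zero initialization, alpha=0.01, the per-epoch shuffle, the mean (rather than summed) cost, and the names minibatch_gd, cost_history and theta_hat are all assumptions that do not appear in the question.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed: an assumption, for reproducibility

# Same synthetic data as in the question
m, n = 10000, 4
x = rng.normal(loc=0, scale=1, size=(m, n))
true_theta = np.array([2, 1, 0.5, 0.25, 0.125]).reshape(n + 1, 1)
X = np.hstack([np.ones((m, 1)), x])  # prepend the bias column
y = X @ true_theta + rng.normal(loc=0, scale=0.25, size=(m, 1))


def compute_cost(X, y, theta):
    """Halved mean squared error over all rows passed in."""
    residual = X @ theta - y
    return float(np.sum(residual ** 2) / (2 * len(y)))


def minibatch_gd(X, y, alpha=0.01, batch_size=10, num_epochs=1):
    """Mini-batch gradient descent for linear regression (illustrative)."""
    theta = np.zeros((X.shape[1], 1))  # start away from the true values
    cost_history = []
    for _ in range(num_epochs):
        order = rng.permutation(len(y))  # shuffle once per epoch
        for start in range(0, len(y), batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            # Gradient of the halved MSE on this batch; every parameter,
            # including the bias, is updated in the same step.
            grad = Xb.T @ (Xb @ theta - yb) / len(idx)
            theta -= alpha * grad
            # Track the cost on the full data set, not on a single row
            cost_history.append(compute_cost(X, y, theta))
    return theta, cost_history


theta_hat, cost_history = minibatch_gd(X, y)
print(theta_hat.ravel())
print(cost_history[0], cost_history[-1])
```

Passing cost_history to plt.plot, as in the question, should then show a curve that falls and flattens. Some jitter from batch to batch is expected with mini-batches, so a perfectly smooth curve is not the goal.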

蓝山帝景
