A problem with applying gradient descent?

iamcodylee · posted 2016-1-18 14:16:01

Last edited by iamcodylee on 2016-1-19 09:46

I'm using the data from the Stanford OpenClassroom (Deep Learning) course as my training set. The exercise uses Logistic Regression and Newton's method, but for training I used only gradient descent.

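from numpy import exp, mat, ones, shape

def sigmoid(inX):
    # Logistic function, applied element-wise
    return 1.0 / (1 + exp(-inX))

def gradDescent(dataMatIn, classLabels):
    x = mat(dataMatIn)                 # m x n design matrix
    y = mat(classLabels).transpose()   # m x 1 column of 0/1 labels
    m, n = shape(x)
    alpha = 0.001                      # learning rate
    maxCycles = 100                    # number of iterations
    theta = ones((n, 1))
    for k in range(maxCycles):
        h = sigmoid(x * theta)         # predicted probabilities
        error = h - y
        # Step along the negative gradient of the average cost
        theta = theta - alpha * (x.transpose() * error / m)
    return theta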


The training result is [[ 0.97807688],[ 0.25847312],[-0.15421594]]
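To judge whether a theta like this gives a sensible boundary, it helps to turn it into a line that can be plotted over the data. The sketch below assumes theta is ordered as [intercept, weight for x1, weight for x2], which matches the three values reported but is not stated in the post; `boundary_x2` is a hypothetical helper name.

```python
def boundary_x2(theta, x1):
    """Return the x2 value on the decision boundary for a given x1.

    Assumes theta = [intercept, weight for x1, weight for x2], so the
    boundary is the line theta[0] + theta[1]*x1 + theta[2]*x2 = 0.
    """
    t0, t1, t2 = theta
    if t2 == 0:
        raise ValueError("theta[2] is zero: the boundary is vertical in x1/x2 space")
    return -(t0 + t1 * x1) / t2


# theta reported in the post after 100 iterations
theta = [0.97807688, 0.25847312, -0.15421594]
for x1 in (20.0, 40.0, 60.0):
    print(x1, boundary_x2(theta, x1))
```

Plotting two such points and joining them gives the boundary; points on one side are predicted 1, points on the other side 0.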

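Questions:
1. This decision boundary is obviously wrong, but I've gone over the code for a long time and still can't find where the problem is.
2. My understanding is that "LR and Newton's method" is an improved way of fitting LR. Compared with plain LR, in this case the theta the two end up with should be very close, right?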
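On question 2: Newton's method and gradient descent are two optimizers for the same convex logistic cost, so both should arrive at the same theta once they have actually converged. The sketch below illustrates this on a tiny made-up dataset with one feature plus an intercept (not the exercise data). All function names and the hyperparameters are assumptions chosen for the illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient(theta, xs, ys):
    """Gradient of the average logistic cost for the model theta[0] + theta[1]*x."""
    m = len(xs)
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        err = sigmoid(theta[0] + theta[1] * x) - y
        g0 += err / m
        g1 += err * x / m
    return g0, g1

def hessian(theta, xs):
    """Entries (h00, h01, h11) of the symmetric 2x2 Hessian of the average cost."""
    m = len(xs)
    h00 = h01 = h11 = 0.0
    for x in xs:
        p = sigmoid(theta[0] + theta[1] * x)
        w = p * (1.0 - p) / m
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return h00, h01, h11

def fit_gradient_descent(xs, ys, alpha=0.5, max_cycles=5000):
    theta = [0.0, 0.0]
    for _ in range(max_cycles):
        g0, g1 = gradient(theta, xs, ys)
        theta = [theta[0] - alpha * g0, theta[1] - alpha * g1]
    return theta

def fit_newton(xs, ys, max_cycles=20):
    theta = [0.0, 0.0]
    for _ in range(max_cycles):
        g0, g1 = gradient(theta, xs, ys)
        h00, h01, h11 = hessian(theta, xs)
        det = h00 * h11 - h01 * h01
        # Solve H * step = g for the 2x2 case using the explicit inverse
        s0 = (h11 * g0 - h01 * g1) / det
        s1 = (h00 * g1 - h01 * g0) / det
        theta = [theta[0] - s0, theta[1] - s1]
    return theta

# Hypothetical, non-separable toy data (not from the OpenClassroom exercise)
xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 0, 1, 0, 1, 1]
theta_gd = fit_gradient_descent(xs, ys)
theta_newton = fit_newton(xs, ys)
print(theta_gd, theta_newton)
```

Newton's method needs far fewer iterations here (20 versus 5000), which is the usual reason to prefer it on small problems. If gradient descent gives a very different theta, the likely cause is that it has not converged yet rather than that the two methods disagree.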

Follow-up:
After raising maxCycles to 200,000 the classifier starts to work, but the result still doesn't look very good (figure omitted).
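Needing 200,000 iterations is typical of gradient descent on unscaled features with a small alpha. One common remedy, which the thread itself does not try, is to standardize each feature column (but not the intercept column of ones) before training. The sketch below shows only the scaling step. `standardize` is a hypothetical helper, and the scores are made up rather than taken from the exercise data.

```python
import math

def standardize(column):
    """Scale a list of numbers to zero mean and unit standard deviation."""
    m = len(column)
    mean = sum(column) / m
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / m)
    if std == 0:
        raise ValueError("constant column cannot be standardized")
    return [(v - mean) / std for v in column], mean, std


# Hypothetical raw scores, not taken from the exercise data
scores = [20.0, 35.0, 50.0, 65.0]
scaled, mean, std = standardize(scores)
print(scaled, mean, std)
```

The returned mean and std have to be kept: any new input must be scaled with the same values before it is multiplied by the trained theta, and the decision boundary has to be mapped back to the original units for plotting.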

morinson · posted 2016-1-18 17:12:20

Is your code in Python?

I think Stanford's version was written in Octave.

morinson · posted 2016-1-18 17:18:17

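I never passed advanced math. I can read the program, but I can't tell where the math goes wrong.

Waiting for someone else to clear this up.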

小谢 · posted 2016-1-18 17:27:54

Try changing line 14 of the code (the theta update) to theta = theta - alpha * x.transpose() * error

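morinson · posted 2016-1-18 21:19:58

I read a good article today and learned that the sigmoid function in the OP's code is a smooth approximation of the "Heaviside step function", which is used to represent a jump.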

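iamcodylee · posted 2016-1-19 09:41:52

小谢 posted 2016-1-18 17:27
Try changing line 14 of the code (the theta update) to theta = theta - alpha * x.transpose() * error

That doesn't work. With that change my cost function can no longer be computed,
and the line drawn from the resulting theta is wrong too.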
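One plausible reason the cost "can no longer be computed" after dropping `/m`, which the thread does not confirm: the step becomes m times larger, the scores x·theta grow large, the sigmoid rounds to exactly 0 or 1, and the cost then evaluates log(0). The sketch below shows the rounding and a cost written so that log(0) never appears. `naive_sigmoid` and `stable_cost` are hypothetical names, and the score of 40 is an illustrative value.

```python
import math

def naive_sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def stable_cost(z, y):
    """Logistic loss for one sample with score z = x . theta and label y in {0, 1}.

    Equals -y*log(h) - (1-y)*log(1-h) with h = sigmoid(z), but is written as
    softplus(z) - y*z so that log(0) is never evaluated.
    """
    softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return softplus - y * z

z = 40.0                      # a large score, as overly big steps can produce
print(naive_sigmoid(z))       # rounds to exactly 1.0 in double precision
print(stable_cost(z, 0))      # finite, whereas log(1 - 1.0) is undefined
```

Averaging `stable_cost` over all samples gives the same cost as the textbook formula whenever that formula is computable, so it can be used to monitor training even when the steps are too large.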

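小谢 · posted 2016-1-19 10:14:18

iamcodylee posted 2016-1-19 09:41
That doesn't work. With that change my cost function can no longer be computed,
and the line drawn from the resulting theta is wrong too.

Here is a post on logistic regression based on the gradient ascent algorithm: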
http://www.robot-ai.org/forum.ph ... &extra=page%3D1

The code below, taken from that post, is very similar to your gradDescent method. I suggest taking a look; it may help you find the problem.
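from numpy import mat, ones, shape

# dataMatIn: 2-D NumPy array, columns are features, rows are training samples
# classLabels: class labels
# Relies on the sigmoid() function from the original post
def gradAscent(dataMatIn, classLabels):
    # Convert to NumPy matrix types
    dataMatrix = mat(dataMatIn)
    labelMat = mat(classLabels).transpose()
    m, n = shape(dataMatrix)
    alpha = 0.001      # step size toward the target
    maxCycles = 500    # number of iterations
    weights = ones((n, 1))
    for k in range(maxCycles):
        h = sigmoid(dataMatrix * weights)
        error = (labelMat - h)  # h is a column vector with one element per sample
        weights = weights + alpha * dataMatrix.transpose() * error
    return weights     # return the trained regression coefficients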
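For comparison, gradAscent and gradDescent perform the same update: flipping the sign of the error and of the step cancels out, and the only real difference is the 1/m factor, which amounts to a different effective alpha. The sketch below checks this for one step on made-up data in plain Python. `descent_step`, `ascent_step`, and the toy data are assumptions for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predictions(X, theta):
    return [sigmoid(sum(xj * tj for xj, tj in zip(row, theta))) for row in X]

def descent_step(X, y, theta, alpha):
    """theta - alpha * X^T (h - y) / m, as in gradDescent."""
    m = len(X)
    h = predictions(X, theta)
    return [t - alpha * sum((h[i] - y[i]) * X[i][j] for i in range(m)) / m
            for j, t in enumerate(theta)]

def ascent_step(X, y, weights, alpha):
    """weights + alpha * X^T (y - h), as in gradAscent."""
    m = len(X)
    h = predictions(X, weights)
    return [w + alpha * sum((y[i] - h[i]) * X[i][j] for i in range(m))
            for j, w in enumerate(weights)]

# Hypothetical toy data: intercept column plus one feature
X = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0], [1.0, 0.0]]
y = [1, 0, 1, 0]
start = [1.0, 1.0]
m = len(X)
a = descent_step(X, y, start, alpha=0.001 * m)  # descent with alpha scaled by m
b = ascent_step(X, y, start, alpha=0.001)       # ascent without the 1/m factor
print(a, b)
```

So removing `/m` from gradDescent is equivalent to multiplying alpha by the number of training samples, which speeds up training when the step stays stable and makes it diverge when it does not.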
