May 27, 2009

Practical Optimization

In the standard form of a convex optimization problem, the equality constraints are required to be affine.

Descent methods for unconstrained optimization:
See the green notebook, p. 13.
The most general framework is steepest descent with respect to a chosen norm:
  • L2 norm -> gradient descent
  • L1 norm -> coordinate descent
  • Quadratic norm defined by a matrix P -> step direction -P^{-1}\nabla f(x)
  • Quadratic norm defined by the Hessian -> Newton's method
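A minimal sketch of the L2 case, plain gradient descent with a fixed step size (the quadratic objective and the step size below are assumptions chosen for illustration):

% Gradient descent sketch on f(x) = 0.5*x'*A*x - b'*x (hypothetical data).
A = [3 1; 1 2];  b = [1; -1];
x = zeros(2,1);                      % starting point
alpha = 0.1;                         % fixed step size (no line search)
for k = 1:200
    g = A*x - b;                     % gradient of the quadratic
    if norm(g) < 1e-8, break; end    % stop when the gradient is small
    x = x - alpha*g;                 % steepest-descent step in the L2 norm
end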


Line search methods:
Exact line search is possible, but it can be expensive.
Other methods:
  • MoreThuente
  • Backtracking
  • Brent
For backtracking, see Convex Optimization, p. 464.
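A minimal backtracking sketch in the style described there (the objective, starting point, and the parameter values alpha and beta are assumptions):

% Backtracking line search sketch: shrink t until the sufficient-decrease
% (Armijo) condition holds along a descent direction dx.
f     = @(x) 0.5*sum(x.^2) + log(1 + exp(x(1)));    % hypothetical smooth objective
gradf = @(x) x + [exp(x(1))/(1 + exp(x(1))); 0];    % its gradient
x  = [2; -1];
dx = -gradf(x);                                     % gradient descent direction
alpha = 0.3;  beta = 0.8;                           % typical backtracking parameters
t = 1;
while f(x + t*dx) > f(x) + alpha*t*(gradf(x)'*dx)
    t = beta*t;
end
x = x + t*dx;                                       % accepted step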

Quasi-Newton methods
That is, Newton's method in which the Hessian is approximated rather than computed exactly.
\nabla f(x_k+\Delta x)=\nabla f(x_k)+B \Delta x,
which is called the secant equation. Besides being symmetric positive definite, B may be subject to additional constraints, depending on the approximation method.
Specific approximation methods include
DFP, BFGS (L-BFGS), Broyden, Broyden family, SR1

BFGS (and its limited-memory variant L-BFGS) seems to be the most commonly used.
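A sketch of the BFGS update of the inverse-Hessian approximation H, applied to a small quadratic (the problem data, the unit step, and the loop are illustrative assumptions; a real implementation would add a line search and safeguards):

% BFGS sketch: maintain H, an approximation of the inverse Hessian, and
% update it from s = x_{k+1} - x_k and y = gradf(x_{k+1}) - gradf(x_k)
% so that the secant condition H*y = s holds.
A = [3 1; 1 2];  b = [1; -1];
gradf = @(x) A*x - b;                % gradient of f(x) = 0.5*x'*A*x - b'*x
x = zeros(2,1);  H = eye(2);         % start from the identity
for k = 1:50
    g = gradf(x);
    if norm(g) < 1e-10, break; end
    d = -H*g;                        % quasi-Newton direction
    s = d;  xnew = x + s;            % unit step (no line search in this sketch)
    y = gradf(xnew) - g;
    rho = 1/(y'*s);
    I = eye(2);
    H = (I - rho*(s*y'))*H*(I - rho*(y*s')) + rho*(s*s');   % BFGS inverse update
    x = xnew;
end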

The Matlab function for unconstrained optimization is
[x,fval,exitflag,output,grad,hessian] = fminunc(fun,x0,options)
In addition to computing f(x), the function fun may also compute the gradient and Hessian.
The options structure has settings indicating that fun provides these outputs:
grad: Gradient at x
hessian: Hessian at x

Other options include large scale / medium scale, tolerances, the iteration limit, and so on.
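A usage sketch with a user-supplied gradient (the Rosenbrock objective, starting point, and tolerance values are illustrative assumptions; GradObj and the other option names follow the optimset interface quoted below):

% In myfun.m: return the objective value and, when requested, the gradient.
function [f, g] = myfun(x)
    f = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;            % Rosenbrock function
    if nargout > 1
        g = [-400*x(1)*(x(2) - x(1)^2) - 2*(1 - x(1));
              200*(x(2) - x(1)^2)];
    end
end

% Then, at the command line or in a script:
options = optimset('GradObj','on','TolFun',1e-8,'MaxIter',500);
[x, fval, exitflag] = fminunc(@myfun, [-1.2; 1], options);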
On the algorithms used by this function:

Algorithms


Large-Scale Optimization

By default fminunc chooses the large-scale algorithm if you supply the gradient in fun (and the GradObj option is set to 'on' using optimset). This algorithm is a subspace trust-region method and is based on the interior-reflective Newton method described in [2] and [3]. Each iteration involves the approximate solution of a large linear system using the method of preconditioned conjugate gradients (PCG). See Large Scale fminunc Algorithm, Trust-Region Methods for Nonlinear Minimization and Preconditioned Conjugate Gradient Method.

Medium-Scale Optimization

fminunc, with the LargeScale option set to 'off' with optimset, uses the BFGS Quasi-Newton method with a cubic line search procedure. This quasi-Newton method uses the BFGS ([1],[5],[8], and [9]) formula for updating the approximation of the Hessian matrix. You can select the DFP ([4],[6], and [7]) formula, which approximates the inverse Hessian matrix, by setting the HessUpdate option to 'dfp' (and the LargeScale option to 'off'). You can select a steepest descent method by setting HessUpdate to 'steepdesc' (and LargeScale to 'off'), although this is not recommended.
The optimset function is used to construct the options structure.
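For instance, selecting the medium-scale DFP update described in the excerpt might look like this (a sketch; myfun and the starting point are the hypothetical names from the earlier example):

options = optimset('LargeScale','off','HessUpdate','dfp');
[x, fval] = fminunc(@myfun, [-1.2; 1], options);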

Note: if fun does not return the gradient, the optimization algorithm estimates it by finite differences.
Pay attention to what a partial derivative means; I tripped over this today by computing the gradient as
(f(x+delta)-f(x))/delta_i.
The correct way is (f(x+delta_i)-f(x))/delta_i, where delta_i perturbs only the i-th coordinate.
In other words, if x is high-dimensional, f has to be evaluated many times.
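A sketch of a forward-difference gradient, one coordinate at a time (the objective and the step size h are assumptions):

% Forward-difference gradient: perturb a single coordinate per evaluation,
% so computing the full gradient costs n+1 evaluations of f.
f = @(x) sum(x.^2) + x(1)*x(2);       % hypothetical objective
x = [1; 2; 3];
h = 1e-6;                              % per-coordinate step size
g = zeros(size(x));
for i = 1:numel(x)
    e = zeros(size(x));  e(i) = h;     % perturb only the i-th coordinate
    g(i) = (f(x + e) - f(x)) / h;
end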

Conjugate gradient
This is used to solve
Ax=b
or, equivalently, to minimize
f(x) = 1/2 x^T A x - b^T x + c,  with A symmetric positive definite.

Understanding: x^T A y defines an inner product, so the corresponding inner-product space has an orthonormal basis. x and y are called conjugate when x^T A y = 0. The principle of CG is to descend along the directions of such a basis, which is constructed with a Gram-Schmidt process. The actual iterative algorithm also uses the optimal step length, which makes the error conjugate to the current direction.
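A minimal sketch of the linear CG iteration in its standard form (the system below is a hypothetical example):

% Linear conjugate gradient for Ax = b, A symmetric positive definite.
A = [4 1; 1 3];  b = [1; 2];
x = zeros(2,1);
r = b - A*x;                           % residual = negative gradient
d = r;                                 % first search direction
for k = 1:numel(b)
    alpha = (r'*r) / (d'*A*d);         % optimal step length along d
    x = x + alpha*d;
    rnew = r - alpha*(A*d);
    if norm(rnew) < 1e-10, break; end
    beta = (rnew'*rnew) / (r'*r);      % makes the new direction A-conjugate to d
    d = rnew + beta*d;
    r = rnew;
end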

For an ellipse (a level set of this quadratic), the gradient does not point toward the center, but under the change of variables
u = A^{1/2} x
it does: in the new coordinates f becomes 1/2 u^T u - (A^{-1/2}b)^T u + c, whose level sets are circles, so the gradient points straight at the center.

(Figure: red is the gradient, green is the transformed gradient.)

CG can also be used to minimize nonlinear functions (unconstrained).
minimize.m implements exactly this; it requires that the gradient can be computed.
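A rough nonlinear-CG sketch in the Fletcher-Reeves flavour (this is only an illustration of the idea, not the algorithm used by minimize.m; the objective, line-search interval, and tolerances are assumptions):

% Nonlinear CG sketch (Fletcher-Reeves beta, crude line search via fminbnd).
f     = @(x) (x(1)-3)^2 + 5*(x(2)+1)^2 + 0.5*x(1)*x(2);        % hypothetical objective
gradf = @(x) [2*(x(1)-3) + 0.5*x(2); 10*(x(2)+1) + 0.5*x(1)];  % its gradient
x = [0; 0];  g = gradf(x);  d = -g;
for k = 1:100
    t = fminbnd(@(t) f(x + t*d), 0, 1);      % line search along d
    x = x + t*d;
    gnew = gradf(x);
    if norm(gnew) < 1e-8, break; end
    beta = (gnew'*gnew)/(g'*g);              % Fletcher-Reeves coefficient
    d = -gnew + beta*d;
    g = gnew;
end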

Matlab constrained optimization: fmincon
The parameters are similar to fminunc, but the inequality and equality constraints must also be specified. Note that nonlinear inequality constraints and nonlinear equality constraints both go in the parameter nonlcon:

function [c,ceq] = mycon(x)
c = ...     % Compute nonlinear inequalities at x (a vector)
ceq = ...   % Compute nonlinear equalities at x (a vector)
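A usage sketch that fills in the mycon skeleton with a hypothetical constraint and objective, just to show where nonlcon goes in the call:

% In mycon.m: keep x inside the unit disk, with x(1) + x(2) = 1.
function [c, ceq] = mycon(x)
    c   = x(1)^2 + x(2)^2 - 1;     % nonlinear inequality, c(x) <= 0
    ceq = x(1) + x(2) - 1;         % nonlinear equality, ceq(x) = 0
end

% Then:
fun = @(x) (x(1)-2)^2 + (x(2)-2)^2;                          % hypothetical objective
x0  = [0; 0];
[x, fval] = fmincon(fun, x0, [], [], [], [], [], [], @mycon);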

Algorithm


Trust-Region-Reflective Optimization

The trust-region-reflective algorithm is a subspace trust-region method and is based on the interior-reflective Newton method described in [3] and [4]. Each iteration involves the approximate solution of a large linear system using the method of preconditioned conjugate gradients (PCG). See the trust-region and preconditioned conjugate gradient method descriptions in fmincon Trust Region Reflective Algorithm.

Active-Set Optimization

fmincon uses a sequential quadratic programming (SQP) method. In this method, the function solves a quadratic programming (QP) subproblem at each iteration. fmincon updates an estimate of the Hessian of the Lagrangian at each iteration using the BFGS formula (see fminunc and references [7] and [8]).
fmincon performs a line search using a merit function similar to that proposed by [6], [7], and [8]. The QP subproblem is solved using an active set strategy similar to that described in [5]. fmincon Active Set Algorithm describes this algorithm in detail.
See also SQP Implementation for more details on the algorithm used.

Interior-Point Optimization

This algorithm is described in [1], [41], and [9].
