Descent direction

In optimization, a descent direction is a vector \mathbf{p}\in\mathbb R^n that, in the sense defined below, moves the iterates toward a local minimum \mathbf{x}^* of the objective function f:\mathbb R^n\to\mathbb R.

Suppose we are computing \mathbf{x}^* by an iterative method, such as line search. We define a descent direction \mathbf{p}_k\in\mathbb R^n at the kth iterate to be any \mathbf{p}_k such that \langle\mathbf{p}_k,\nabla f(\mathbf{x}_k)\rangle < 0, where \langle , \rangle denotes the inner product. The motivation for such an approach is that, by Taylor's theorem, f(\mathbf{x}_k + \alpha\mathbf{p}_k) \approx f(\mathbf{x}_k) + \alpha\langle\mathbf{p}_k,\nabla f(\mathbf{x}_k)\rangle for small \alpha > 0, so sufficiently small steps along \mathbf{p}_k are guaranteed to reduce f.
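The definition can be checked numerically. The following sketch (an illustration, not part of the article; the function f(\mathbf{x}) = \lVert\mathbf{x}\rVert^2 and the particular iterate are assumptions) verifies that a direction with negative inner product against the gradient does reduce f for a small step:

```python
import numpy as np

# Illustrative example: f(x) = ||x||^2, whose gradient is 2x.
# Any p with <p, grad f(x)> < 0 is a descent direction, and a
# sufficiently small step along p reduces f.

def f(x):
    return float(np.dot(x, x))

def grad_f(x):
    return 2.0 * x

x_k = np.array([1.0, 2.0])            # current iterate (assumed for the example)
p_k = np.array([-1.0, 0.0])           # candidate direction

directional = np.dot(p_k, grad_f(x_k))
assert directional < 0                # p_k satisfies the descent condition

alpha = 1e-3                          # small step size
assert f(x_k + alpha * p_k) < f(x_k)  # f decreases along p_k
```

Here the directional derivative \langle\mathbf{p}_k,\nabla f(\mathbf{x}_k)\rangle = -2 < 0, so the step reduces f as Taylor's theorem predicts.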

Using this definition, the negative of a non-zero gradient is always a descent direction, as  \langle -\nabla f(\mathbf{x}_k), \nabla f(\mathbf{x}_k) \rangle = -\langle \nabla f(\mathbf{x}_k), \nabla f(\mathbf{x}_k) \rangle < 0 .

Numerous methods exist to compute descent directions, all with differing merits. For example, one could use gradient descent or the conjugate gradient method.
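As a minimal sketch of one such method, the loop below applies gradient descent with a fixed step size, taking the negative gradient as the descent direction at each iterate (the objective, step size, and starting point are assumptions for illustration):

```python
import numpy as np

# Gradient descent sketch: the descent direction at each iterate is the
# negative gradient, here with a fixed step size.

def grad_f(x):
    return 2.0 * x          # gradient of f(x) = ||x||^2 (assumed objective)

x = np.array([3.0, -4.0])   # starting point (assumed)
alpha = 0.1                 # fixed step size (assumed)
for _ in range(100):
    p = -grad_f(x)          # descent direction: negative gradient
    x = x + alpha * p

print(np.linalg.norm(x))    # distance to the minimizer x* = 0
```

For this quadratic objective, each step scales the iterate by (1 - 2\alpha), so the iterates converge to the minimizer at the origin.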

More generally, if P is a positive definite matrix, then d = -P \nabla f(x) is a descent direction [1] at x, since \langle d, \nabla f(x)\rangle = -\nabla f(x)^\top P\, \nabla f(x) < 0 whenever \nabla f(x) \neq 0. This generality is used in preconditioned gradient descent methods.
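This can also be verified numerically. In the sketch below (the particular matrix and gradient are assumptions for illustration), P is symmetric positive definite, so the preconditioned direction satisfies the descent condition:

```python
import numpy as np

# For a symmetric positive definite P, the direction d = -P @ grad is a
# descent direction: <d, grad> = -grad^T P grad < 0 for any nonzero grad.

grad = np.array([1.0, -2.0])  # example gradient (assumed)
P = np.array([[2.0, 0.5],
              [0.5, 1.0]])    # symmetric positive definite (det > 0, trace > 0)

d = -P @ grad
assert np.dot(d, grad) < 0    # descent condition holds
```

With P equal to the identity this recovers the negative-gradient direction as a special case.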

  1. J. M. Ortega and W. C. Rheinboldt (1970). Iterative Solution of Nonlinear Equations in Several Variables. p. 243. doi:10.1137/1.9780898719468.
This article is issued from Wikipedia - version of Thursday, October 3, 2013. The text is available under the Creative Commons Attribution/Share Alike license, but additional terms may apply for the media files.