Karush–Kuhn–Tucker conditions

In mathematical optimization, the Karush–Kuhn–Tucker (KKT) conditions (also known as the Kuhn–Tucker conditions) are first-order necessary conditions for a solution in nonlinear programming to be optimal, provided that some regularity conditions are satisfied. By allowing inequality constraints, the KKT approach to nonlinear programming generalizes the method of Lagrange multipliers, which allows only equality constraints. The system of equations corresponding to the KKT conditions is usually not solved directly, except in the few special cases where a closed-form solution can be derived analytically. In general, many optimization algorithms can be interpreted as methods for numerically solving the KKT system of equations.^[1]

The KKT conditions were originally named after Harold W. Kuhn and Albert W. Tucker, who first published the conditions in 1951.^[2] Later scholars discovered that the necessary conditions for this problem had been stated by William Karush in his master's thesis in 1939.^[3]^[4]

Nonlinear optimization problem

Consider the following nonlinear optimization problem:

Maximize $f(x)$

subject to $g_i(x) \leq 0, \quad h_j(x) = 0,$

where $x$ is the optimization variable, $f$ is the objective or utility function, $g_i \ (i = 1, \ldots,m)$ are the inequality constraint functions, and $h_j \ (j = 1,\ldots,l)$ are the equality constraint functions. The numbers of inequality and equality constraints are denoted $m$ and $l$, respectively.

Necessary conditions

Suppose that the objective function $f : \mathbb{R}^n \rightarrow \mathbb{R}$ and the constraint functions $g_i : \mathbb{R}^n \rightarrow \mathbb{R}$ and $h_j : \mathbb{R}^n \rightarrow \mathbb{R}$ are continuously differentiable at a point $x^*$. If $x^*$ is a local optimum that satisfies some regularity conditions (see below), then there exist constants $\mu_i\ (i = 1,\ldots,m)$ and $\lambda_j\ (j = 1,\ldots,l)$, called KKT multipliers, such that

[Figure: inequality constraint diagram for optimization problems]

Stationarity: for maximizing $f(x)$: $\nabla f(x^*) = \sum_{i=1}^m \mu_i \nabla g_i(x^*) + \sum_{j=1}^l \lambda_j \nabla h_j(x^*)$; for minimizing $f(x)$: $-\nabla f(x^*) = \sum_{i=1}^m \mu_i \nabla g_i(x^*) + \sum_{j=1}^l \lambda_j \nabla h_j(x^*)$

Primal feasibility: $g_i(x^*) \le 0 \text{ for all } i = 1, \ldots, m$; $h_j(x^*) = 0 \text{ for all } j = 1, \ldots, l$

Dual feasibility: $\mu_i \ge 0 \text{ for all } i = 1, \ldots, m$

Complementary slackness: $\mu_i g_i (x^*) = 0 \text{ for all } i = 1,\ldots,m.$

In the particular case $m=0$ , i.e., when there are no inequality constraints, the KKT conditions turn into the Lagrange conditions, and the KKT multipliers are called Lagrange multipliers.
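The stationarity, feasibility and complementary slackness conditions listed above can be checked numerically at a candidate point. The following is a minimal sketch in pure Python for the maximization form. The toy problem (maximize $-(x-2)^2-(y-1)^2$ subject to $x+y-2\le 0$), the function names and the tolerance are assumptions chosen for illustration and do not come from the cited sources.

```python
# Toy problem (an illustrative assumption, not taken from the article):
#   maximize f(x, y) = -(x - 2)^2 - (y - 1)^2
#   subject to g(x, y) = x + y - 2 <= 0

def grad_f(p):
    x, y = p
    return [-2.0 * (x - 2.0), -2.0 * (y - 1.0)]

def g(p):
    x, y = p
    return x + y - 2.0

def grad_g(p):
    return [1.0, 1.0]

def kkt_residuals(p, mu):
    """Return how far each KKT condition (maximization form) is from holding."""
    gf, gg = grad_f(p), grad_g(p)
    return {
        # grad f(x*) = mu * grad g(x*)
        "stationarity": max(abs(gf[k] - mu * gg[k]) for k in range(2)),
        # g(x*) <= 0: positive only when the constraint is violated
        "primal_feasibility": max(0.0, g(p)),
        # mu >= 0: positive only when the multiplier is negative
        "dual_feasibility": max(0.0, -mu),
        # mu * g(x*) = 0
        "complementary_slackness": abs(mu * g(p)),
    }

def satisfies_kkt(p, mu, tol=1e-9):
    return all(v <= tol for v in kkt_residuals(p, mu).values())

# Candidate: the point of the line x + y = 2 closest to (2, 1), with mu = 1.
print(kkt_residuals((1.5, 0.5), 1.0))
print(satisfies_kkt((1.5, 0.5), 1.0))
# The unconstrained maximizer (2, 1) is stationary with mu = 0 but infeasible.
print(satisfies_kkt((2.0, 1.0), 0.0))
```

With several constraints the same checks would be applied to each $g_i$ and $h_j$ in turn; the sketch keeps a single inequality constraint to stay short.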

If some of the functions are non-differentiable, subdifferential versions of the KKT conditions are available.^[5]

Regularity conditions (or constraint qualifications)

In order for a minimum point $x^*$ to satisfy the above KKT conditions, the problem should satisfy some regularity conditions; the most commonly used ones are listed below:

Linearity constraint qualification: If $g_i$ and $h_j$ are affine functions, then no other condition is needed.
Linear independence constraint qualification (LICQ): the gradients of the active inequality constraints and the gradients of the equality constraints are linearly independent at $x^*$ .
Mangasarian–Fromovitz constraint qualification (MFCQ): the gradients of the active inequality constraints and the gradients of the equality constraints are positive-linearly independent at $x^*$ .
Constant rank constraint qualification (CRCQ): for each subset of the gradients of the active inequality constraints and the gradients of the equality constraints, the rank is constant in a neighborhood of $x^*$.
Constant positive linear dependence constraint qualification (CPLD): for each subset of the gradients of the active inequality constraints and the gradients of the equality constraints, if it is positive-linearly dependent at $x^*$ then it is positive-linearly dependent in a neighborhood of $x^*$.
Quasi-normality constraint qualification (QNCQ): if the gradients of the active inequality constraints and the gradients of the equality constraints are positive-linearly dependent at $x^*$ with associated multipliers $\lambda_j$ for equalities and $\mu_i$ for inequalities, then there is no sequence $x_k\to x^*$ such that $\lambda_j \neq 0 \Rightarrow \lambda_j h_j(x_k)>0$ and $\mu_i \neq 0 \Rightarrow \mu_i g_i(x_k)>0$.
Slater condition: for a convex problem, there exists a point $x$ such that $h_j(x)=0$ for all $j$ and $g_i(x) < 0$ for all $i$.

A collection ($v_1,\ldots,v_n$) is positive-linearly dependent if there exist $a_1\geq 0,\ldots,a_n\geq 0$, not all zero, such that $a_1v_1+\cdots+a_nv_n=0$.

It can be shown that LICQ⇒MFCQ⇒CPLD⇒QNCQ and LICQ⇒CRCQ⇒CPLD⇒QNCQ (and the converses are not true), although MFCQ is not equivalent to CRCQ.^[6] In practice, weaker constraint qualifications are preferred since they yield stronger optimality results: the KKT conditions are then necessary for a wider class of problems.
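As an illustration of LICQ, the sketch below tests whether the gradients of the active constraints at a point are linearly independent by comparing their rank with their count. The example problem (maximize $x$ subject to $y-(1-x)^3\le 0$ and $-y\le 0$) is a standard textbook case chosen here as an assumption; the helper names `rank`, `active_gradients` and `licq_holds` are hypothetical, and only inequality constraints are handled to keep the code short.

```python
def rank(rows, tol=1e-12):
    """Rank of a small matrix (list of rows) by Gaussian elimination."""
    m = [list(map(float, r)) for r in rows]
    n_cols = len(m[0]) if m else 0
    r = 0
    for c in range(n_cols):
        # Partial pivoting: largest entry of column c among the remaining rows.
        pivot = max(range(r, len(m)), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) <= tol:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def active_gradients(p, constraints, tol=1e-9):
    """Gradients of the inequality constraints g_i with g_i(p) = 0."""
    return [grad(p) for g, grad in constraints if abs(g(p)) <= tol]

def licq_holds(p, constraints):
    grads = active_gradients(p, constraints)
    return rank(grads) == len(grads)

# g1(x, y) = y - (1 - x)^3 <= 0  and  g2(x, y) = -y <= 0
def g1(p):
    return p[1] - (1.0 - p[0]) ** 3

def grad_g1(p):
    return [3.0 * (1.0 - p[0]) ** 2, 1.0]

def g2(p):
    return -p[1]

def grad_g2(p):
    return [0.0, -1.0]

constraints = [(g1, grad_g1), (g2, grad_g2)]

# At the maximizer (1, 0) both constraints are active with gradients
# (0, 1) and (0, -1), which are linearly dependent, so LICQ fails there.
print(licq_holds((1.0, 0.0), constraints))
# At (0, 1) only g1 is active and its gradient (3, 1) is nonzero.
print(licq_holds((0.0, 1.0), constraints))
```

At $(1, 0)$ the objective gradient $(1, 0)$ cannot be written as a combination of $(0, 1)$ and $(0, -1)$, so no KKT multipliers exist at this maximizer, which is why a constraint qualification is required for the conditions to be necessary.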

Sufficient conditions

In some cases, the necessary conditions are also sufficient for optimality. In general, the necessary conditions are not sufficient for optimality and additional information is necessary, such as the Second Order Sufficient Conditions (SOSC). For smooth functions, the SOSC involve second derivatives, which explains their name.

The necessary conditions are sufficient for optimality if the objective function $f$ of a maximization problem is a concave function, the inequality constraints $g_i$ are continuously differentiable convex functions and the equality constraints $h_j$ are affine functions.

It was shown by Martin in 1985 that the broader class of functions for which the KKT conditions guarantee global optimality are the so-called Type 1 invex functions.^[7]^[8]

Second Order Sufficient Conditions

For smooth, nonlinear optimization problems, a second-order sufficient condition is given as follows. Consider $x^*, \lambda^*, \rho^*$ satisfying the Karush–Kuhn–Tucker conditions above for a minimization problem, where $\lambda^*$ and $\rho^*$ denote the multipliers of the equality and inequality constraints, respectively. Suppose that strict complementarity holds at $x^*$ (i.e. all $\rho^*_i > 0$). If, for all $s \ne 0$ such that

$\left[ \frac{\partial g(x^*)}{\partial x}, \frac{\partial h(x^*)}{\partial x} \right]^T s = 0,$

the inequality

$s^T \nabla ^2_{xx}L(x^*,\lambda^*,\rho^*)s > 0$

holds, then $x^*$ is a strict constrained local minimum.
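The second-order condition above only requires positive curvature along directions tangent to the active constraints, not in every direction. The sketch below illustrates this on a hypothetical two-variable problem: minimize $x^2 - y^2$ subject to $y - 1 \le 0$, with candidate point $(0, 1)$ and multiplier $2$. It assumes the Lagrangian has the form $L = f + \rho\, g$, which the text does not spell out, and the helper `sosc_holds_2d` is specialized to one active constraint in two variables.

```python
# Hypothetical example: minimize f(x, y) = x^2 - y^2  subject to  g(x, y) = y - 1 <= 0.
# At (0, 1): grad f = (0, -2), grad g = (0, 1), so -grad f = rho * grad g with rho = 2 > 0.
# Assumed Lagrangian: L = f + rho * g. Since g is affine, the Hessian of L equals
# the Hessian of f.
H = [[2.0, 0.0],
     [0.0, -2.0]]

def quad_form(H, s):
    """Evaluate s^T H s for a 2x2 matrix H."""
    return sum(s[i] * H[i][j] * s[j] for i in range(2) for j in range(2))

def sosc_holds_2d(H, active_grad):
    """Check s^T H s > 0 on {s : active_grad . s = 0} in R^2.

    Assumes exactly one active constraint with a nonzero gradient, so the
    tangent space is a line spanned by the perpendicular vector (-b, a).
    Because s^T H s scales with the square of the length of s, testing one
    spanning vector is enough.
    """
    a, b = active_grad
    s = (-b, a)
    return quad_form(H, s) > 0

print(sosc_holds_2d(H, (0.0, 1.0)))   # curvature along the constraint boundary
print(quad_form(H, (0.0, 1.0)))       # curvature across the boundary
```

The Hessian is indefinite on the whole plane, yet the condition holds on the tangent direction $(1, 0)$, consistent with $(0, 1)$ being a strict local minimum of the constrained problem even though the objective is unbounded below on the feasible set.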

Economics

Value function

Reconsider the optimization problem as a maximization problem with constant inequality constraints:

Maximize $f(x)$

subject to $g_i(x) \le a_i, \quad h_j(x) = 0.$

The value function is defined as

$V(a_1, \ldots, a_m) = \sup\limits_x f(x)$

subject to $g_i(x) \le a_i, \quad h_j(x) = 0, \quad j\in\{1,\ldots, l\},\ i\in\{1,\ldots,m\}.$

(So the domain of $V$ is $\{a \in \mathbb{R}^m \mid \text{for some } x\in X,\ g_i(x) \leq a_i,\ i \in \{1,\ldots,m\}\}$.)

Given this definition, each coefficient $\mu_i$ is the rate at which the value function increases as $a_i$ increases. Thus, if each $a_i$ is interpreted as a resource constraint, the coefficients indicate how much increasing a resource will increase the optimum value of the function $f$. This interpretation is especially important in economics and is used, for instance, in utility maximization problems.
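The interpretation of $\mu_i$ as a rate of change can be checked on a small example. The sketch below uses a hypothetical problem, maximize $-(x-2)^2-(y-1)^2$ subject to $x+y\le a$, whose maximizer is available in closed form when $a \le 3$, and compares the multiplier with a finite-difference estimate of $dV/da$. The functions `solve` and `value` and the step size are assumptions made for illustration.

```python
def solve(a):
    """Closed-form maximizer and multiplier for the toy problem, valid for a <= 3.

    The maximizer is the point of the line x + y = a closest to (2, 1).
    """
    t = (3.0 - a) / 2.0            # shift from (2, 1) along the direction (1, 1)
    x, y = 2.0 - t, 1.0 - t
    mu = 3.0 - a                   # from stationarity: grad f = mu * grad g
    return (x, y), mu

def value(a):
    """Value function V(a): the optimal objective for resource level a."""
    (x, y), _ = solve(a)
    return -(x - 2.0) ** 2 - (y - 1.0) ** 2

a, step = 2.0, 1e-6
_, mu = solve(a)
# Central finite difference approximating dV/da at a.
dV_da = (value(a + step) - value(a - step)) / (2.0 * step)
print(mu, dV_da)
```

Here $V(a) = -(3-a)^2/2$, so $dV/da = 3 - a$, which coincides with the multiplier. Once $a \ge 3$ the constraint is no longer binding, the multiplier is zero, and extra resource does not raise the optimum.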

Generalizations

With an extra constant multiplier $\mu_0$, which may be zero, in front of $\nabla f(x^*)$, the KKT stationarity conditions turn into

$\mu_0 \nabla f(x^*) + \sum_{i=1}^m \mu_i \nabla g_i(x^*) + \sum_{j=1}^l \lambda_j \nabla h_j(x^*) = 0,$

which are called the Fritz John conditions.
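A short worked example may help show what the extra multiplier buys. The problem below is a standard textbook case, used here as an assumption rather than taken from the cited sources: at its minimizer no KKT multipliers exist, but the stationarity condition with $\mu_0 = 0$ can still be satisfied.

```latex
\begin{aligned}
&\text{Minimize } f(x, y) = -x
  \quad \text{subject to} \quad
  g_1(x, y) = y - (1 - x)^3 \le 0, \qquad g_2(x, y) = -y \le 0. \\[4pt]
&\text{The minimizer is } (x^*, y^*) = (1, 0), \text{ where both constraints are active and} \\
&\qquad \nabla f = (-1, 0), \qquad \nabla g_1 = (0, 1), \qquad \nabla g_2 = (0, -1). \\[4pt]
&\text{Stationarity with } \mu_0: \quad
  \mu_0 (-1, 0) + \mu_1 (0, 1) + \mu_2 (0, -1) = (-\mu_0,\ \mu_1 - \mu_2) = (0, 0). \\[4pt]
&\text{Hence } \mu_0 = 0 \text{ and } \mu_1 = \mu_2 \ge 0,
  \text{ for instance } (\mu_0, \mu_1, \mu_2) = (0, 1, 1). \\
&\text{With } \mu_0 = 1 \text{ (the KKT case) the first component gives } -1 = 0,
  \text{ so no KKT multipliers exist.}
\end{aligned}
```

The active constraint gradients are linearly dependent at $(1, 0)$, so the usual constraint qualifications fail there, which is consistent with the KKT conditions not holding at this minimizer.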

The KKT conditions belong to a wider class of first-order necessary conditions (FONC), which allow for non-smooth functions using subderivatives.

References

1. Boyd, Stephen; Vandenberghe, Lieven (2004). Convex Optimization. Cambridge: Cambridge University Press. p. 244. ISBN 0-521-83378-7. MR 2061575.
2. Kuhn, H. W.; Tucker, A. W. (1951). "Nonlinear programming". Proceedings of 2nd Berkeley Symposium. Berkeley: University of California Press. pp. 481–492. MR 47303.
3. Karush, W. (1939). "Minima of Functions of Several Variables with Inequalities as Side Constraints". M.Sc. Dissertation. Dept. of Mathematics, Univ. of Chicago, Chicago, Illinois.
4. Kjeldsen, Tinne Hoff (2000). "A contextualized historical analysis of the Kuhn-Tucker theorem in nonlinear programming: the impact of World War II". Historia Math. 27 (4): 331–361. doi:10.1006/hmat.2000.2289. MR 1800317.
5. Ruszczyński, Andrzej (2006). Nonlinear Optimization. Princeton, NJ: Princeton University Press. ISBN 978-0691119151. MR 2199043.
6. Eustaquio, Rodrigo; Karas, Elizabeth; Ribeiro, Ademir. Constraint Qualification for Nonlinear Programming (PDF) (Technical report). Federal University of Parana.
7. Martin, D. H. (1985). "The Essence of Invexity". J. Optim. Theory Appl. 47 (1): 65–76. doi:10.1007/BF00941316.
8. Hanson, M. A. (1999). "Invexity and the Kuhn-Tucker Theorem". J. Math. Anal. Appl. 236 (2): 594–604. doi:10.1006/jmaa.1999.6484.

This article is issued from Wikipedia - version of the Sunday, February 21, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.