Namespaces
Variants
Actions

Difference between revisions of "Karush-Kuhn-Tucker conditions"

From Encyclopedia of Mathematics
Jump to: navigation, search
(Added comment.)
(TeX done and links)
Line 1: Line 1:
{{TEX|part}}
+
{{TEX|done}}
  
 
''KKT conditions, Karush–Kuhn–Tucker optimality conditions, KKT optimality conditions, Kuhn–Tucker conditions''
 
''KKT conditions, Karush–Kuhn–Tucker optimality conditions, KKT optimality conditions, Kuhn–Tucker conditions''
Line 11: Line 11:
 
\end{array}
 
\end{array}
 
\end{equation}
 
\end{equation}
with $f$, $g_i$, $h_j$ all continuously differentiable, and form the [[Lagrangian]] function
+
with $f$, $g_i$, $h_j$ all [[Continuous function|continuously]] [[Differentiable function|differentiable]] ($C^1$-functions), and form the [[Lagrangian]] function
 
\begin{equation}
 
\begin{equation}
 
L(x,\lambda,\mu)=f(x)+\sum_{i=1}^{p}\lambda_i g_i(x) + \sum_{j=1}^{q}\mu_j h_j(x).
 
L(x,\lambda,\mu)=f(x)+\sum_{i=1}^{p}\lambda_i g_i(x) + \sum_{j=1}^{q}\mu_j h_j(x).
 
\end{equation}
 
\end{equation}
 
The Karush–Kuhn–Tucker optimality conditions now are:
 
The Karush–Kuhn–Tucker optimality conditions now are:
\begin{equation}
+
\begin{equation} \label{eq:1}
 
\begin{array}{rl}
 
\begin{array}{rl}
 
\dfrac{\partial}{\partial x_k}L(x,\lambda,\mu)=0, & k=1,\ldots,n,\\
 
\dfrac{\partial}{\partial x_k}L(x,\lambda,\mu)=0, & k=1,\ldots,n,\\
Line 31: Line 31:
 
This generalizes the familiar [[Lagrange multipliers]] rule to the case where there are also inequality constraints.
 
This generalizes the familiar [[Lagrange multipliers]] rule to the case where there are also inequality constraints.
  
The result was obtained independently by Karush in 1939, by F. John in 1948, and by H.W. Kuhn and J.W. Tucker in 1951, see [[#References|[a1]]].
+
The result was obtained independently by Karush in 1939, by F. John in 1948, and by H.W. Kuhn and J.W. Tucker in 1951, see [[#References|[1]]], [[#References|[7]]].
  
One suitable constraint qualification (for the case where there are no equality constraints) is $\overline{D}(x^*)=D(x^*)$, where $\overline{D}(x^*)$ is the closure of \begin{equation} D(x^*)=\left\{d\colon\exists\sigma > 0\colon\sigma\geq\tau\geq 0 \Rightarrow x^*+\tau d \quad\text{feasible}\right\} \end{equation} and \begin{equation} D(x^*)=\left\{d\colon\nabla g_i(x^*)^T d\geq 0,\quad i=1,\ldots,p\right\}, \end{equation} where $\nabla g_i(x^*)$ is the [[gradient]] vector of $g_i$ at $x^*$, \begin{equation} \nabla g_i(x^*)^T=\left(\dfrac{\partial g_i}{\partial x_1}(x^*),\ldots,\dfrac{\partial g_i}{\partial x_n}(x^*)\right), \end{equation} see [[#References|[a2]]].
+
One suitable constraint qualification (for the case where there are no equality constraints) is $\overline{D}(x^*)=D(x^*)$, where $\overline{D}(x^*)$ is the closure of \begin{equation} D(x^*)=\left\{d\colon\exists\sigma > 0\colon\sigma\geq\tau\geq 0 \Rightarrow x^*+\tau d \quad\text{feasible}\right\} \end{equation} and \begin{equation} D(x^*)=\left\{d\colon\nabla g_i(x^*)^T d\geq 0,\quad i=1,\ldots,p\right\}, \end{equation} where $\nabla g_i(x^*)$ is the [[gradient]] vector of $g_i$ at $x^*$, \begin{equation} \nabla g_i(x^*)^T=\left(\dfrac{\partial g_i}{\partial x_1}(x^*),\ldots,\dfrac{\partial g_i}{\partial x_n}(x^*)\right), \end{equation} see [[#References|[2]]].
  
 
Given a KKT-triple $(x^*,\lambda^*,\mu^*)$, there are also (as in the case of Lagrange multipliers) second-order sufficient conditions that ensure local optimality of $x^*$.
 
Given a KKT-triple $(x^*,\lambda^*,\mu^*)$, there are also (as in the case of Lagrange multipliers) second-order sufficient conditions that ensure local optimality of $x^*$.
  
====Comment====
 
  
See also: [[Fritz John condition]].
 
  
 +
====Comment====
  
 +
The KKT conditions \eqref{eq:1} provide a criterion for testing whether a solution which has been found by other methods is in fact an optimal solution.
  
====References====
+
The KKT conditions have been generalized in various directions:
 
+
* to necessary or sufficient conditions, or both types, for an extremum of a function subject to equality or inequality constraints [[#References|[6]]]; especially in case of [[Convex function (of a real variable)|convex functions]] $f$ and $g_i$ and affine $h_j$, the KKT conditions \eqref{eq:1} are sufficient, see [[#References|[9]]], pp. 243–246.
<table>
+
* to changing the geometrical structure of the underlying manifold [[#References|[6]]], [[#References|[8]]].
<TR><TD valign="top">[a1]</TD> <TD valign="top">  A.V. Fiacco,  G.P. McCormick,  "Nonlinear programming, sequential unconstrained minimization techniques" , SIAM  (1968)  (Edition: Reprinted)</TD></TR>
+
* [[Fritz John condition]].
<TR><TD valign="top">[a2]</TD> <TD valign="top">  W.I. Zangwill,  "Nonlinear programming" , Prentice-Hall  (1969)</TD></TR>
 
<TR><TD valign="top">[a3]</TD> <TD valign="top">  O.L. Mangasarian,  "Nonlinear programming" , McGraw-Hill  (1969)</TD></TR>
 
<TR><TD valign="top">[a4]</TD> <TD valign="top">  G.P. McCormick,  "Nonlinear programming: theory, algorithms, and applications" , Wiley  (1983)</TD></TR>
 
<TR><TD valign="top">[a5]</TD> <TD valign="top">  Jong-Shi Pang,  "Complementarity problems"  R. Horst (ed.)  P.M. Pardalos (ed.) , ''Handbook of Global Optimization'' , Kluwer Acad. Publ.  (1995)  pp. 271–338</TD></TR>
 
</table>
 
 
 
 
 
 
 
====Kuhn-Tucker conditions====
 
 
 
Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k1101701.png" /> be a complete finite-dimensional [[Riemannian manifold|Riemannian manifold]], let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k1101702.png" /> be a convex [[Objective function|objective function]] and let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k1101703.png" /> be a totally convex subset described by a system of inequalities <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k1101704.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k1101705.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k1101706.png" /> are concave functions (cf. [[Concave function|Concave function]]). The solution of the [[Convex programming|convex programming]] problem <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k1101707.png" /> is completely characterized by a saddle-point theorem initially stated on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k1101708.png" /> [[#References|[a2]]] by H.W. Kuhn and A.W. Tucker.
 
 
 
Suppose <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k1101709.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k11017010.png" /> are of class <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k11017011.png" />, and let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k11017012.png" />. Introduce the Lagrange function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k11017013.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k11017014.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k11017015.png" />. A point <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k11017016.png" /> is the optimal solution of the convex programming problem <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k11017017.png" /> if and only if there exists a <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k11017018.png" /> such that the so-called Kuhn–Tucker conditions hold:
 
 
 
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k11017019.png" /></td> </tr></table>
 
 
 
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/k/k110/k110170/k11017020.png" /></td> </tr></table>
 
 
 
These equations provide a criterion for testing whether a solution which has been found by other methods is in fact an optimal solution.
 
 
 
The Kuhn–Tucker theorem has been generalized in various directions:
 
 
 
to necessary or sufficient conditions, or both types, for an extremum of a function subject to equality or inequality constraints [[#References|[a1]]];
 
 
 
to changing the geometrical structure of the underlying manifold [[#References|[a1]]], [[#References|[a3]]].
 
  
  
Line 78: Line 53:
  
 
<table>
 
<table>
<TR><TD valign="top">[a1]</TD> <TD valign="top">  K.J. Arrow,  L. Hurwicz,  H. Uzawa,  "Studies in linear and nonlinear programming" , Stanford Univ. Press  (1958)</TD></TR>
+
<TR><TD valign="top">[1]</TD> <TD valign="top">  A.V. Fiacco,  G.P. McCormick,  "Nonlinear programming, sequential unconstrained minimization techniques" , SIAM  (1968)  (Edition: Reprinted)</TD></TR>
<TR><TD valign="top">[a2]</TD> <TD valign="top">  H.W. Kuhn,  A.W. Tucker,  "Nonlinear programming" , ''Proc. Second Berkeley Symp. Math. Stat. Probab.'' , California Univ. Press  (1951)  pp. 481–492</TD></TR>
+
<TR><TD valign="top">[2]</TD> <TD valign="top">  W.I. Zangwill,  "Nonlinear programming" , Prentice-Hall  (1969)</TD></TR>
<TR><TD valign="top">[a3]</TD> <TD valign="top">  C. Udrişte,  "Convex functions and optimization methods on Riemannian manifolds" , Kluwer Acad. Publ.  (1994)</TD></TR>
+
<TR><TD valign="top">[3]</TD> <TD valign="top">  O.L. Mangasarian,  "Nonlinear programming" , McGraw-Hill  (1969)</TD></TR>
 +
<TR><TD valign="top">[4]</TD> <TD valign="top">  G.P. McCormick,  "Nonlinear programming: theory, algorithms, and applications" , Wiley  (1983)</TD></TR>
 +
<TR><TD valign="top">[5]</TD> <TD valign="top">  Jong-Shi Pang,  "Complementarity problems"  R. Horst (ed.)  P.M. Pardalos (ed.) , ''Handbook of Global Optimization'' , Kluwer Acad. Publ.  (1995)  pp. 271–338</TD></TR>
 +
<TR><TD valign="top">[6]</TD> <TD valign="top">  K.J. Arrow,  L. Hurwicz,  H. Uzawa,  "Studies in linear and nonlinear programming" , Stanford Univ. Press  (1958)</TD></TR>
 +
<TR><TD valign="top">[7]</TD> <TD valign="top">  H.W. Kuhn,  A.W. Tucker,  "Nonlinear programming" , ''Proc. Second Berkeley Symp. Math. Stat. Probab.'' , California Univ. Press  (1951)  pp. 481–492</TD></TR>
 +
<TR><TD valign="top">[8]</TD> <TD valign="top">  C. Udrişte,  "Convex functions and optimization methods on Riemannian manifolds" , Kluwer Acad. Publ.  (1994)</TD></TR>
 +
<TR><TD valign="top">[9]</TD> <TD valign="top">  S.P. Boyd,  L. Vandenberghe,  "Convex Optimization" , Cambridge University Press  (2004). {{DOI|10.1017/CBO9780511804441}}</TD></TR>
 
</table>
 
</table>

Revision as of 09:42, 3 May 2016


KKT conditions, Karush–Kuhn–Tucker optimality conditions, KKT optimality conditions, Kuhn–Tucker conditions

Consider the general mathematical programming problem: \begin{equation} \begin{array}{lll} \mbox{minimize} & f(x), & x\in\mathbb{R}^n \\ \mbox{subject to} & g_i(x)\leq 0, & i=1,\ldots,p,\\ & h_j(x)=0, & j=1,\ldots,q, \end{array} \end{equation} with $f$, $g_i$, $h_j$ all continuously differentiable ($C^1$-functions), and form the Lagrangian function \begin{equation} L(x,\lambda,\mu)=f(x)+\sum_{i=1}^{p}\lambda_i g_i(x) + \sum_{j=1}^{q}\mu_j h_j(x). \end{equation} The Karush–Kuhn–Tucker optimality conditions now are: \begin{equation} \label{eq:1} \begin{array}{rl} \dfrac{\partial}{\partial x_k}L(x,\lambda,\mu)=0, & k=1,\ldots,n,\\ \lambda_i\geq 0, & i=1,\ldots,p,\\ \lambda_i g_i(x)=0, & i=1,\ldots,p,\\ g_i(x)\leq 0, & i=1,\ldots,p,\\ h_j(x)=0, & j=1,\ldots,q. \end{array} \end{equation} Note that the two last conditions just say that $x$ is feasible and that the third conditions could be replaced by $\sum\lambda_i g_i(x)=0$ (given the second and fourth). Under suitable constraint qualifications, the KKT optimality conditions are necessary conditions for a local solution $x^*$ of the general mathematical programming problem, in the following sense.

If $x^*$ is a local minimum, then there are corresponding $\lambda^*$, $\mu^*$ such that the triple $(x^*,\lambda^*,\mu^*)$ satisfies the KKT optimality conditions. A triple satisfying the KKT optimality conditions is sometimes called a KKT-triple.

This generalizes the familiar Lagrange multipliers rule to the case where there are also inequality constraints.

The result was obtained independently by Karush in 1939, by F. John in 1948, and by H.W. Kuhn and J.W. Tucker in 1951, see [1], [7].

One suitable constraint qualification (for the case where there are no equality constraints) is $\overline{D}(x^*)=D(x^*)$, where $\overline{D}(x^*)$ is the closure of \begin{equation} D(x^*)=\left\{d\colon\exists\sigma > 0\colon\sigma\geq\tau\geq 0 \Rightarrow x^*+\tau d \quad\text{feasible}\right\} \end{equation} and \begin{equation} D(x^*)=\left\{d\colon\nabla g_i(x^*)^T d\geq 0,\quad i=1,\ldots,p\right\}, \end{equation} where $\nabla g_i(x^*)$ is the gradient vector of $g_i$ at $x^*$, \begin{equation} \nabla g_i(x^*)^T=\left(\dfrac{\partial g_i}{\partial x_1}(x^*),\ldots,\dfrac{\partial g_i}{\partial x_n}(x^*)\right), \end{equation} see [2].

Given a KKT-triple $(x^*,\lambda^*,\mu^*)$, there are also (as in the case of Lagrange multipliers) second-order sufficient conditions that ensure local optimality of $x^*$.


Comment

The KKT conditions \eqref{eq:1} provide a criterion for testing whether a solution which has been found by other methods is in fact an optimal solution.

The KKT conditions have been generalized in various directions:

  • to necessary or sufficient conditions, or both types, for an extremum of a function subject to equality or inequality constraints [6]; especially in case of convex functions $f$ and $g_i$ and affine $h_j$, the KKT conditions \eqref{eq:1} are sufficient, see [9], pp. 243–246.
  • to changing the geometrical structure of the underlying manifold [6], [8].
  • Fritz John condition.


References

[1] A.V. Fiacco, G.P. McCormick, "Nonlinear programming, sequential unconstrained minimization techniques" , SIAM (1968) (Edition: Reprinted)
[2] W.I. Zangwill, "Nonlinear programming" , Prentice-Hall (1969)
[3] O.L. Mangasarian, "Nonlinear programming" , McGraw-Hill (1969)
[4] G.P. McCormick, "Nonlinear programming: theory, algorithms, and applications" , Wiley (1983)
[5] Jong-Shi Pang, "Complementarity problems" R. Horst (ed.) P.M. Pardalos (ed.) , Handbook of Global Optimization , Kluwer Acad. Publ. (1995) pp. 271–338
[6] K.J. Arrow, L. Hurwicz, H. Uzawa, "Studies in linear and nonlinear programming" , Stanford Univ. Press (1958)
[7] H.W. Kuhn, A.W. Tucker, "Nonlinear programming" , Proc. Second Berkeley Symp. Math. Stat. Probab. , California Univ. Press (1951) pp. 481–492
[8] C. Udrişte, "Convex functions and optimization methods on Riemannian manifolds" , Kluwer Acad. Publ. (1994)
[9] S.P. Boyd, L. Vandenberghe, "Convex Optimization" , Cambridge University Press (2004). DOI 10.1017/CBO9780511804441
How to Cite This Entry:
Karush-Kuhn-Tucker conditions. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Karush-Kuhn-Tucker_conditions&oldid=38773
This article was adapted from an original article by M. Hazewinkel (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article