Lagrange multipliers

2020 Mathematics Subject Classification: Primary: 49-XX [MSN][ZBL]

Lagrange multipliers are auxiliary variables with the help of which one constructs a Lagrange function for investigating problems on conditional extrema.

Definition If $f, g_1, \ldots, g_m: \mathbb R^n \supset \Omega \to \mathbb R$ are given functions, a conditional extremal point of $f$ under the constraints $g_1, \ldots, g_m$ is a point $x^\star \in \Omega$ with the property that $f (x^\star)$ is the maximum (resp. minimum) value taken by $f$ on the set \begin{equation}\label{e:constrained_set} \Sigma :=\{y\in \Omega : g_i (y) = g_i (x^\star) =: b_i\quad \forall i \in \{ 1, \ldots , m\}\}\, . \end{equation} If instead $f(x^\star)$ is the maximum (resp. minimum) value taken by $f$ on $\Sigma \cap U$ for some neighborhood $U$ of $x^\star$, then $x^\star$ is called a local conditional extremal point.
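For instance, if $n=2$, $m=1$, $f(x,y) = x^2 + y^2$ and $g_1 (x,y) = x + y$, then $x^\star = (1/2, 1/2)$ is a (global) conditional minimum point of $f$ under the constraint $g_1$: here $b_1 = 1$, the set $\Sigma$ is the line $\{x+y=1\}$, and on it $$x^2 + y^2 = \frac{(x+y)^2 + (x-y)^2}{2} \geq \frac{1}{2} = f (x^\star)\, .$$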

The method of Lagrange multipliers gives necessary conditions for local conditional extremal points. More precisely, we have the following:

Theorem 1 Assume $\Omega$ is an open set and let $f, g_1, \ldots, g_m: \mathbb R^n \supset \Omega \to \mathbb R$ be $C^1$ functions. If $x^\star$ is a local conditional extremal point of $f$ under the constraints $g_1, \ldots , g_m$, then the gradients $\nabla f (x^\star), \nabla g_1 (x^\star), \ldots , \nabla g_m (x^\star)$ are linearly dependent.

The conclusion above is in fact usually stated when $m<n$ and $\nabla g_1 (x^\star), \ldots, \nabla g_m (x^\star)$ are linearly independent, i.e. when $b = (b_1, \ldots , b_m)$ is a regular value of the function $g = (g_1, \ldots, g_m)$ (at least if we restrict $g$ to some neighborhood $V$ of $x^\star$). The necessary condition of Theorem 1 can then be translated into the identity \begin{equation}\label{e:lagrange_m} \nabla f (x^\star) = \lambda^\star_1 \nabla g_1 (x^\star) + \ldots + \lambda^\star_m \nabla g_m (x^\star)\, . \end{equation} The Lagrange multipliers are then the real numbers $\lambda^\star_1, \ldots , \lambda^\star_m$ appearing in \eqref{e:lagrange_m}. Observe that, if we define the Lagrange function \begin{equation}\label{e:lagrange_f} F(x,\lambda) = f(x) + \sum_{i=1}^m\lambda_i (b_i - g_i(x))\, , \end{equation} then the conditions $x^\star\in \Sigma = \{g=b\}$ and \eqref{e:lagrange_m} are equivalent to the fact that $(x^\star, \lambda^\star)$ is a critical point of $F$. We can therefore summarize the discussion above in the following statement, which in fact can be easily seen to be equivalent to Theorem 1:

Theorem 2 Assume that $m<n$, that $b$ is a regular value for $g$ and that $x^\star$ is a local conditional extremal point for $f$ under the constraint $g$ with $g (x^\star) = b$. Then there is $\lambda^\star\in \mathbb R^m$ such that $(x^\star, \lambda^\star)$ is a critical point of $F$.
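The regularity assumption in Theorem 2 cannot be dropped. Consider, for instance, $n=2$, $m=1$, $f(x,y) = x$ and $g_1 (x,y) = x^3 - y^2$ with $b_1 = 0$. On $\Sigma = \{x^3 = y^2\}$ we have $x \geq 0$, hence the origin is a (global) conditional minimum point of $f$; however $\nabla g_1 (0,0) = (0,0)$, so that $0$ is not a regular value of $g_1$ and no $\lambda^\star$ can satisfy \eqref{e:lagrange_m}, since $$\nabla f (0,0) = (1,0) \neq \lambda^\star \nabla g_1 (0,0) = (0,0) \qquad \mbox{for every } \lambda^\star \in \mathbb R\, .$$ The conclusion of Theorem 1 nevertheless holds, because $\nabla g_1 (0,0)$ vanishes and therefore the pair $\nabla f (0,0), \nabla g_1 (0,0)$ is linearly dependent.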

Observe that under the hypothesis of the latter theorem, the set $\Sigma$ is a $C^1$ submanifold of dimension $n-m$. Indeed the theorem is usually proved via the Implicit function theorem, reducing it to the usual necessary condition for unconstrained extrema of a differentiable function. Observe also that the coordinates of the point $x^\star = (x_1^\star,\dots,x_n^\star)$ together with the Lagrange multipliers $\lambda^\star = (\lambda_1^\star,\dots,\lambda_m^\star)$ give us $m+n$ real numbers which satisfy a system of $m+n$ equations: $m$ equations are indeed given by the constraint $g (x^\star) = b$ and $n$ by the identity \eqref{e:lagrange_m}.
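As an illustration, return to the example above, namely the minimization of $f(x,y) = x^2 + y^2$ under the constraint $g_1 (x,y) = x + y = b$ (now with a general right-hand side $b$). The Lagrange function is $F(x,y,\lambda) = x^2 + y^2 + \lambda (b - x - y)$ and the system of $m+n = 3$ equations reads $$\frac{\partial F}{\partial x} = 2x - \lambda = 0\, ,\qquad \frac{\partial F}{\partial y} = 2y - \lambda = 0\, ,\qquad \frac{\partial F}{\partial \lambda} = b - x - y = 0\, ,$$ whose unique solution is $x^\star = y^\star = b/2$, $\lambda^\star = b$.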

The Lagrange multipliers $\lambda^\star_i$, $i=1,\dots,m$, have the following interpretation [Ha]. Suppose that $x^\star$ provides a relative extremum of the function $f$ under the constraints $g$ and set $z^\star = f(x^\star)$. The values of $x^\star$, $\lambda^\star$ and $z^\star$ depend on the values of $b$. Under suitable assumptions such dependence is $C^1$ in some $\varepsilon$-neighbourhood of $g (x^\star)$. Under these assumptions the function $z^\star$ is also continuously differentiable with respect to the $b_i$, and its partial derivatives with respect to the $b_i$ are equal to the corresponding Lagrange multipliers $\lambda_i^\star$, calculated for the given $b=(b_1,\dots,b_m)$: \begin{equation}\label{e:costs} \frac{\partial z^\star}{\partial b_i} = \lambda_i^\star,\quad i=1,\dots,m\, . \end{equation} In applied problems $z$ is often interpreted as profit or cost, and the right-hand sides $b_i$ as the amounts of certain available resources. Then the absolute value of $\lambda_i^\star$ is the ratio of a unit of cost to a unit of the $i$-th resource: the numbers $\lambda_i^\star$ show how the optimal value of $z$ (maximum profit or minimum cost) changes if the amount of the $i$-th resource is increased by one. This interpretation of the Lagrange multipliers is very useful because it can be extended to the case of constraints in the form of inequalities.
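In the example above, $z^\star (b) = f (b/2, b/2) = b^2/2$, and indeed $$\frac{\partial z^\star}{\partial b} = b = \lambda^\star\, ,$$ in accordance with \eqref{e:costs}.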

In the calculus of variations suitable versions of the method of Lagrange multipliers have been developed in several infinite-dimensional settings, namely when the sought conditional extremal points are functions and both the cost to be minimized and the constraints are suitable functionals. In this case the vector of Lagrange multipliers might itself be infinite dimensional.
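The simplest instance is the classical isoperimetric problem: to find an extremal of the functional $J[u] = \int_a^b F(x, u, u')\, dx$ among the functions with prescribed values at $a$ and $b$ satisfying the integral constraint $K[u] = \int_a^b G(x, u, u')\, dx = c$, one introduces a constant Lagrange multiplier $\lambda$ and writes the Euler equation of $F + \lambda G$: $$\frac{d}{dx}\, \frac{\partial (F + \lambda G)}{\partial u'} - \frac{\partial (F + \lambda G)}{\partial u} = 0\, .$$ In problems with pointwise constraints, such as the Lagrange problem and problems of optimal control, the multiplier is instead a function of the independent variable (cf. [Bl]).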

In the theory of optimal control and in the Pontryagin maximum principle, the Lagrange multipliers are usually called conjugate variables.

Comments

The same arguments as used above lead to the interpretation of the Lagrange multipliers $\lambda_i^\star$ as sensitivity coefficients (with respect to changes in the $b_i$).


References

[Bl] G.A. Bliss, "Lectures on the calculus of variations", Chicago Univ. Press (1947) MR0017881 Zbl 0036.34401
[Br] A.E. Bryson, Y.-C. Ho, "Applied optimal control", Blaisdell (1969) MR0446628
[Ha] G.F. Hadley, "Nonlinear and dynamic programming", Addison-Wesley (1964) MR0173543 Zbl 0179.24601
[Ro] R.T. Rockafellar, "Convex analysis", Princeton Univ. Press (1970) MR0274683 Zbl 0193.18401
How to Cite This Entry:
Lagrange multipliers. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Lagrange_multipliers&oldid=24258
This article was adapted from an original article by I.B. Vapnyarskii (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article