Namespaces
Variants
Actions

Difference between revisions of "Mathematical programming"

From Encyclopedia of Mathematics
Jump to: navigation, search
(Category:Operations research, mathematical programming)
m (removed forgotten {{TEX|part}})
 
(4 intermediate revisions by the same user not shown)
Line 1: Line 1:
The branch of mathematics concerned with the theory and methods for solving problems on finding the extrema of functions on sets defined by linear and non-linear constraints (equalities and inequalities) in a finite-dimensional vector space. Mathematical programming is a branch of [[Operations research|operations research]], which comprises a wide class of control problems the mathematical models of which are finite-dimensional extremum problems. The problems of mathematical programming find applications in various areas of human activity where it is necessary to choose one of the possible ways of action, e.g. in solving numerous problems of control and planning of production processes as well as in problems of design and long-term planning. The term  "mathematical programming"  is connected with the fact that the goal of solving various problems is choosing programs of action.
+
{{TEX|done}}{{MSC|90Cxx}}
  
The mathematical formulation of the problem of mathematical programming is: To minimize a scalar function $\phi (x)$ of a vector argument on the set
+
The branch of mathematics concerned with the theory and methods for solving problems on finding the extrema of functions on sets defined by linear and non-linear constraints (equalities and inequalities) in a finite-dimensional [[vector space]]. Mathematical programming is a branch of [[operations research]], which comprises a wide class of control problems the mathematical models of which are finite-dimensional extremum problems. The problems of mathematical programming find applications in various areas of human activity where it is necessary to choose one of the possible ways of action, e.g. in solving numerous problems of control and planning of production processes as well as in problems of design and long-term planning. The term  "mathematical programming"  is connected with the fact that the goal of solving various problems is choosing programs of action.
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m0627002.png" /></td> </tr></table>
+
The mathematical formulation of the problem of mathematical programming is: To minimize a scalar function $\phi(x)$ of a vector argument on the set
 +
\begin{equation}
 +
X=\left\{x\colon q_i(x)\geq 0,\; i=1,\ldots,k;\quad h_j(x)=0,\; j=1,\ldots,m \right\},
 +
\end{equation}
 +
where $q_i(x)$ and $h_j(x)$ are scalar functions. The function $\phi(x)$ is called the objective function, and also the quality criterion, the set $X$ is called the feasible set, or the set of plans, a solution $x^*$ of the mathematical programming problem is an optimum point (or vector), a point of global minimum and also an optimal plan.
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m0627003.png" /></td> </tr></table>
+
In mathematical programming one customarily distinguishes the following branches.
 
+
* [[Linear programming]]: The objective function $\phi(x)$ and the constraints $q_i(x)$ and $h_j(x)$ are linear.
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m0627004.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m0627005.png" /> are scalar functions. The function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m0627006.png" /> is called the objective function, and also the quality criterion, the set <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m0627007.png" /> is called the feasible set, or the set of plans, a solution <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m0627008.png" /> of the mathematical programming problem is an optimum point (or vector), a point of global minimum and also an optimal plan.
+
* [[Quadratic programming]]: The objective function is quadratic and convex, while the feasible set is defined by linear equalities and inequalities.
 
+
* [[Convex programming]]: The objective function and the constraints are convex.
In mathematical programming one customarily distinguishes the following branches. [[Linear programming|Linear programming]]: The objective function $\phi (x)$ and the constraints <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270010.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270011.png" /> are linear. [[Quadratic programming|Quadratic programming]]: The objective function is quadratic and convex, while the feasible set is defined by linear equalities and inequalities. [[Convex programming|Convex programming]]: The objective function and the constraints are convex. [[Discrete programming|Discrete programming]]: The solution is sought only at discrete, say integer, points of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270012.png" />. [[Stochastic programming|Stochastic programming]]: In contrast to deterministic problems, here the data contain an element of indeterminacy. For example, in stochastic problems of minimization of a linear function
+
* [[Discrete programming]]/[[Integer programming]]: The solution is sought only at discrete, say integer, points of $X$.
 
+
* [[Stochastic programming]]: In contrast to deterministic problems, here the data contain an element of indeterminacy. For example, in stochastic problems of minimization of a linear function $\sum_{j=1}^{n}c_j x_j$ under linear constraints $\sum_{j=1}^{n}a_{ij} x_j\geq b_i$, $i=1,2,\ldots$, the parameters $c_j$, $a_{ij}$, $b_i$, or only some of them, are random.
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270013.png" /></td> </tr></table>
+
The problems of linear, quadratic and convex programming have a common property: Local optimality implies global optimality. The so-called [[multi-extremum problem]]s, for which the indicated property does not hold, are both considerably more difficult and less investigated.
 
 
under linear constraints
 
 
 
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270014.png" /></td> </tr></table>
 
 
 
the parameters <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270015.png" />, or only some of them, are random. The problems of linear, quadratic and convex programming have a common property: Local optimality implies global optimality. The so-called multi-extremum problems, for which the indicated property does not hold, are both considerably more difficult and less investigated.
 
 
 
At the basis of the theory of convex, and, in particular, linear and quadratic, programming lies the Kuhn–Tucker theorem, which gives necessary and sufficient conditions for the existence of an optimum point <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270016.png" />: In order that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270017.png" /> be an optimum point, i.e.
 
 
 
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270018.png" /></td> </tr></table>
 
 
 
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270019.png" /></td> </tr></table>
 
 
 
it is necessary and sufficient that there exist a point <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270020.png" /> such that the pair <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270021.png" /> be a [[Saddle point|saddle point]] of the [[Lagrange function|Lagrange function]]
 
 
 
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270022.png" /></td> </tr></table>
 
  
 +
At the basis of the theory of convex, and, in particular, linear and quadratic, programming lie the [[Karush-Kuhn-Tucker conditions]], which gives necessary and sufficient conditions for the existence of an optimum point $x^*$: In order that $x^*$ be an optimum point, i.e.
 +
\begin{equation}
 +
\phi(x^*)=\min_{x\in X}\phi(x),
 +
\end{equation}
 +
\begin{equation}
 +
X=\left\{x\colon f_i(x)\geq 0,\; i=1,\ldots,k\right\},
 +
\end{equation}
 +
it is necessary and sufficient that there exist a point $y^*=(y_1^*,\ldots,y_k^*)$ such that the pair $x^*$, $y^*$ be a [[saddle point]] of the [[Lagrange function]]
 +
\begin{equation}
 +
L(x,y)=\phi(x)+\sum_{i=1}^{k}y_i f_i(x) .
 +
\end{equation}
 
The latter means that
 
The latter means that
 +
\begin{equation}
 +
L(x^*,y)\leq L(x^*,y^*)\leq L(x,y^*),
 +
\end{equation}
 +
for all $x$ and all $y\geq 0$. If the constraints $f_i(x)$ are non-linear, the theorem is valid under some additional assumptions on the feasible set, for example, the existence of a point $x\in X$ such that $f_i(x)>0$, $i=1,\ldots,k$ (Slater's regularity condition).
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270023.png" /></td> </tr></table>
+
If $\phi(x)$ and $f_i(x)$ are [[differentiable function]]s, then the following relations characterize a saddle point:
 
 
for all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270024.png" /> and all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270025.png" />. If the constraints <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270026.png" /> are non-linear, the theorem is valid under some additional assumptions on the feasible set, for example, the existence of a point <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270027.png" /> such that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270028.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270029.png" /> (Slater's regularity condition).
 
 
 
If <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270030.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270031.png" /> are differentiable functions, then the following relations characterize a saddle point:
 
 
 
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270032.png" /></td> </tr></table>
 
 
 
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270033.png" /></td> </tr></table>
 
 
 
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270034.png" /></td> </tr></table>
 
  
 
Thus, the problem of convex programming reduces to the solution of a system of equations and inequalities.
 
Thus, the problem of convex programming reduces to the solution of a system of equations and inequalities.
 +
\begin{equation}
 +
\begin{aligned}
 +
\sum_{j=1}^{n}\frac{\partial L}{\partial x_j}x_j = 0;&\qquad \frac{\partial L}{\partial x_j} \geq 0,& j=1,\ldots,n;\\
 +
\sum_{i=1}^{k}\frac{\partial L}{\partial y_i}y_i = 0;&\qquad \frac{\partial L}{\partial y_i} \leq 0,\quad y_i\geq 0,& i=1,\ldots,k.
 +
\end{aligned}
 +
\end{equation}
 +
On the basis of the Karush-Kuhn-Tucker conditions various iterative minimization methods were developed, which amount to searching for a saddle point of Lagrange's function.
  
On the basis of the Kuhn–Tucker theorem various iterative minimization methods were developed, which amount to searching for a saddle point of Lagrange's function.
+
In mathematical programming one of the main directions concerns computational methods for solving extremum problems. One of the widest used among these methods is the method of feasible directions. In this method a sequence $\{x_m\}$ of points of the set $X$ is constructed by the formula $x_{p+1}=x_p + \alpha_p s_p$. At each iteration, to calculate the point $x_{p+1}$ one has to choose a direction (a vector) $s_p$ and a step length (a number) $\alpha_p$. For $s_p$ one selects from among the feasible directions (i.e. directions at the point $x_p$, a small displacement along which does not lead out of the set $X$) the one that makes an acute angle with the direction $g(x_p)$ of steepest descent of the objective function (with the vector $g(x_p)=-\left.\frac{d \phi(x)}{dx}\right|_{x=x_p}$ if $\phi(x)$ is differentiable). Therefore, along the direction $s_p$ the function $\psi_p(\alpha)=\phi(x_p + \alpha s_p)$ is decreasing. The number $\alpha_p$ is determined from the conditions that $x_{p+1}\in X$ and  $\phi(x_{p+1})<\phi(x_p)$. To calculate $s_p$ at $x_p$ one defines the cone of feasible directions, given by a system of inequalities, and one formulates the problem (of linear programming) of finding the possible direction along which the objective function decreases most quickly. This problem is readily solved using, say, the standard [[simplex method]]. The step length $\alpha_p$ is determined by solving the minimization problem for the function $\psi_p(\alpha)$. Under rather general assumptions it has been shown that the distance from $x_p$ to the set of stationary points of the original problem (i.e. the set of points at which the conditions necessary for a local minimum of $\phi(x)$ on $X$ are satisfied) tends to zero as $p\rightarrow\infty$. In the case when the original problem is a problem in convex programming, then, as $p\rightarrow\infty$, $x_p$ approaches the set of solutions (optimum points) of the original problem. The method of feasible directions admits an approximate solution of the indicated problems of determination of $s_p$ and $\alpha_p$, and in this sense it is stable with respect to computational errors.
  
In mathematical programming one of the main directions concerns computational methods for solving extremum problems. One of the widest used among these methods is the method of feasible directions. In this method a sequence <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270035.png" /> of points of the set <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270036.png" /> is constructed by the formula <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270037.png" />. At each iteration, to calculate the point <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270038.png" /> one has to choose a direction (a vector) <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270039.png" /> and a step length (a number) <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270040.png" />. For <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270041.png" /> one selects from among the feasible directions (i.e. directions at the point <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270042.png" />, a small displacement along which does not lead out of the set <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270043.png" />) the one that makes an acute angle with the direction <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270044.png" /> of steepest descent of the objective function (with the vector <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270045.png" /> if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270046.png" /> is differentiable). Therefore, along the direction <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270047.png" /> the function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270048.png" /> is decreasing. The number <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270049.png" /> is determined from the conditions that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270050.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270051.png" />. To calculate <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270052.png" /> at <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270053.png" /> one defines the cone of feasible directions, given by a system of inequalities, and one formulates the problem (of linear programming) of finding the possible direction along which the objective function decreases most quickly. This problem is readily solved using, say, the standard [[Simplex method|simplex method]]. The step length <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270054.png" /> is determined by solving the minimization problem for the function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270055.png" />. Under rather general assumptions it has been shown that the distance from <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270056.png" /> to the set of stationary points of the original problem (i.e. the set of points at which the conditions necessary for a local minimum of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270057.png" /> on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270058.png" /> are satisfied) tends to zero as <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270059.png" />. In the case when the original problem is a problem in convex programming, then, as <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270060.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270061.png" /> approaches the set of solutions (optimum points) of the original problem. The method of feasible directions admits an approximate solution of the indicated problems of determination of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270062.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270063.png" />, and in this sense it is stable with respect to computational errors.
+
For feasible sets of special structure (from the point of view of the simplicity of choosing $s_p$) one can apply minimization methods different from the method of feasible directions. Thus, in a gradient-projection method, as the descent direction at the point $x_p$ one takes $s_p=y_p - x_p$, where $y_p$ is the projection of $v_p=x_p+g(x_p)$, $g(x_p)=-\left.\frac{d \phi(x)}{dx}\right|_{x=x_p}$, on the set $X$.
  
For feasible sets of special structure (from the point of view of the simplicity of choosing <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270064.png" />) one can apply minimization methods different from the method of feasible directions. Thus, in a gradient-projection method, as the descent direction at the point <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270065.png" /> one takes <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270066.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270067.png" /> is the projection of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270068.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270069.png" />, on the set <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270070.png" />.
+
If $X$ is given by linear constraints, one has the rather promising method of the conditional gradient: One linearizes the objective function at $x_p$ by constructing the function $l(x)=\phi(x)+\langle g(x_p),x_p - x\rangle$, and then, minimizing $l(x)$ on the set $X$, one finds a minimum point $y_p$ of it and one finally sets $s_p=y_p - x_p$.
  
If <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270071.png" /> is given by linear constraints, one has the rather promising method of the conditional gradient: One linearizes the objective function at <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270072.png" /> by constructing the function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270073.png" />, and then, minimizing <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270074.png" /> on the set <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270075.png" />, one finds a minimum point <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270076.png" /> of it and one finally sets <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270077.png" />.
+
Minimization methods for non-smooth functions were successfully elaborated. One of the representatives of this class is the method of the generalized gradient. By definition, a generalized gradient at $x_p$ is any vector $\tilde{g}(x_p)$ such that the inequality $\phi(y)-\phi(x_p)\geq\langle\tilde{g}(x_p),y-x_p\rangle$ holds for all $y\in X$. This method differs from the projection-gradient method in that for the point that is projected one takes $v_p=x_p-\tilde{g}(x_p)$.
  
Minimization methods for non-smooth functions were successfully elaborated. One of the representatives of this class is the method of the generalized gradient. By definition, a generalized gradient at <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270078.png" /> is any vector <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270079.png" /> such that the inequality <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270080.png" /> holds for all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270081.png" />. This method differs from the projection-gradient method in that for the point that is projected one takes <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270082.png" />.
+
Stochastic methods of minimization are also widely used. In the simplest variant of the method of random descent one takes for $s_p$ a random point on the $n$-dimensional unit sphere with center at the coordinate origin, and if $s_p$ is a feasible direction and if, in addition, $\phi(x)$ decreases along this direction, then a step is effected, i.e. one shifts to the point $x_{p+1}$. In the opposite case, the procedure for selecting $s_p$ is carried out anew.
 
 
Stochastic methods of minimization are also widely used. In the simplest variant of the method of random descent one takes for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270083.png" /> a random point on the <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270084.png" />-dimensional unit sphere with centre at the coordinate origin, and if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270085.png" /> is a feasible direction and if, in addition, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270086.png" /> decreases along this direction, then a step is effected, i.e. one shifts to the point <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270087.png" />. In the opposite case, the procedure for selecting <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270088.png" /> is carried out anew.
 
  
 
A characteristic feature of the computational aspect of the methods for solving the problems in mathematical programming is that the application of these methods is invariably connected with the utilization of electronic computers. The main reason for this is that the problems in mathematical programming that formalize situations of control of real systems involve a large amount of work which cannot be performed by manual computation.
 
A characteristic feature of the computational aspect of the methods for solving the problems in mathematical programming is that the application of these methods is invariably connected with the utilization of electronic computers. The main reason for this is that the problems in mathematical programming that formalize situations of control of real systems involve a large amount of work which cannot be performed by manual computation.
  
One of the widespread methods for investigating problems in mathematical programming is the method of penalty functions (cf. [[Penalty functions, method of|Penalty functions, method of]]). This method essentially replaces the given mathematical programming problem by a sequence of parametric problems on unconditional minimization, such that when the parameter tends to infinity (in other cases, to zero) the solutions of these auxiliary problems converge to the solution of the original problem. Note that in an unconditional minimization problem every direction is feasible, and consequently the search for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270089.png" /> requires less effort in comparison with the solution of the same problem by the method of feasible directions. (For example, in the method of steepest descent one takes <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270090.png" />.) The same is true concerning the search for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/m/m062/m062700/m06270091.png" />.
+
One of the widespread methods for investigating problems in mathematical programming is the [[Penalty functions, method of|method of penalty functions]]. This method essentially replaces the given mathematical programming problem by a sequence of parametric problems on unconditional minimization, such that when the parameter tends to infinity (in other cases, to zero) the solutions of these auxiliary problems converge to the solution of the original problem. Note that in an unconditional minimization problem every direction is feasible, and consequently the search for $s_p$ requires less effort in comparison with the solution of the same problem by the method of feasible directions. (For example, in the [[Steepest descent, method of|method of steepest descent]] one takes $s_p=-\left.\frac{d \phi(x)}{dx}\right|_{x=x_p}$.) The same is true concerning the search for $\alpha_p$.
  
 
An important direction of investigation in mathematical programming is the problem of stability. Here essential importance is attached to the study of the class of stable problems, that is, problems for which small perturbations (errors) in the data result in perturbations of the solutions that are also small. In the case of unstable problems an important role is reserved for a procedure of approximating the unstable problem by a sequence of stable problems — this is known as the regularization process.
 
An important direction of investigation in mathematical programming is the problem of stability. Here essential importance is attached to the study of the class of stable problems, that is, problems for which small perturbations (errors) in the data result in perturbations of the solutions that are also small. In the case of unstable problems an important role is reserved for a procedure of approximating the unstable problem by a sequence of stable problems — this is known as the regularization process.
  
Along with finite-dimensional problems, problems of mathematical programming in infinite-dimensional spaces are considered as well. Among the latter there are various extremum problems in mathematical economics, technology, problems of optimization of physical characteristics of nuclear reactors, and others. In the terminology of mathematical programming one formulates the problems in the corresponding function space of [[Variational calculus|variational calculus]] and [[Optimal control|optimal control]].
+
Along with finite-dimensional problems, problems of mathematical programming in infinite-dimensional spaces are considered as well. Among the latter there are various extremum problems in mathematical economics, technology, problems of optimization of physical characteristics of nuclear reactors, and others. In the terminology of mathematical programming one formulates the problems in the corresponding function space of [[variational calculus]] and [[optimal control]].
  
 
Mathematical programming has crystallized as a science in the 1950-s until 1970-s. This is first of all connected with the development of electronic computers and, hence, the mathematical processing of large amounts of data. On this basis one solves problems of control and planning, in which the application of mathematical methods is connected mainly with the construction of mathematical models and related extremal problems, among them problems of mathematical programming.
 
Mathematical programming has crystallized as a science in the 1950-s until 1970-s. This is first of all connected with the development of electronic computers and, hence, the mathematical processing of large amounts of data. On this basis one solves problems of control and planning, in which the application of mathematical methods is connected mainly with the construction of mathematical models and related extremal problems, among them problems of mathematical programming.
  
 
====References====
 
====References====
<table><TR><TD valign="top">[1]</TD> <TD valign="top">  Yu.P. Ivanov,  "Methods of optimization" , Moscow  (1978)  (In Russian)</TD></TR><TR><TD valign="top">[2]</TD> <TD valign="top">  V.G. Karmanov,  "Mathematical programming" , Moscow  (1975)  (In Russian)</TD></TR><TR><TD valign="top">[3]</TD> <TD valign="top">  E. Polak,  "Computational methods in optimization: a unified approach" , Acad. Press  (1971)</TD></TR><TR><TD valign="top">[4]</TD> <TD valign="top">  Yu.M. Ermol'ev,  "Methods of stochastic programming" , Moscow  (1976)  (In Russian)</TD></TR></table>
+
<table>
 
+
<TR><TD valign="top">[1]</TD> <TD valign="top">  Yu.P. Ivanov,  "Methods of optimization" , Moscow  (1978)  (In Russian)</TD></TR>
 
+
<TR><TD valign="top">[2]</TD> <TD valign="top">  V.G. Karmanov,  "Mathematical programming" , Moscow  (1975)  (In Russian)</TD></TR>
 
+
<TR><TD valign="top">[3]</TD> <TD valign="top">  E. Polak,  "Computational methods in optimization: a unified approach" , Acad. Press  (1971)</TD></TR>
====Comments====
+
<TR><TD valign="top">[4]</TD> <TD valign="top">  Yu.M. Ermol'ev,  "Methods of stochastic programming" , Moscow  (1976)  (In Russian)</TD></TR>
 
+
<TR><TD valign="top">[5]</TD> <TD valign="top">  M. Minoux,  "Mathematical programming: theory and algorithms" , Wiley  (1986)</TD></TR>
 
+
<TR><TD valign="top">[6]</TD> <TD valign="top">  G. Zoutendijk,  "Mathematical programming methods" , North-Holland  (1976)</TD></TR>
====References====
+
<TR><TD valign="top">[7]</TD> <TD valign="top">  A.R.G. Heesterman,  "Matrices and simplex algorithms" , Reidel  (1983)</TD></TR>
<table><TR><TD valign="top">[a1]</TD> <TD valign="top">  M. Minoux,  "Mathematical programming: theory and algorithms" , Wiley  (1986)</TD></TR><TR><TD valign="top">[a2]</TD> <TD valign="top">  G. Zoutendijk,  "Mathematical programming methods" , North-Holland  (1976)</TD></TR><TR><TD valign="top">[a3]</TD> <TD valign="top">  A.R.G. Heesterman,  "Matrices and simplex algorithms" , Reidel  (1983)</TD></TR><TR><TD valign="top">[a4]</TD> <TD valign="top">  G.F. Hadley,  "Nonlinear and dynamic programming" , Addison-Wesley  (1964)</TD></TR><TR><TD valign="top">[a5]</TD> <TD valign="top">  W.I. Zangwill,  "Nonlinear programming: a unified approach" , Prentice-Hall  (1969)</TD></TR><TR><TD valign="top">[a6]</TD> <TD valign="top">  S. Zionts,  "Linear and integer programming" , Prentice-Hall  (1973)</TD></TR></table>
+
<TR><TD valign="top">[8]</TD> <TD valign="top">  G.F. Hadley,  "Nonlinear and dynamic programming" , Addison-Wesley  (1964)</TD></TR>
 +
<TR><TD valign="top">[9]</TD> <TD valign="top">  W.I. Zangwill,  "Nonlinear programming: a unified approach" , Prentice-Hall  (1969)</TD></TR>
 +
<TR><TD valign="top">[10]</TD> <TD valign="top">  S. Zionts,  "Linear and integer programming" , Prentice-Hall  (1973)</TD></TR>
 +
</table>
  
 
[[Category:Operations research, mathematical programming]]
 
[[Category:Operations research, mathematical programming]]

Latest revision as of 09:14, 9 June 2016

2020 Mathematics Subject Classification: Primary: 90Cxx [MSN][ZBL]

The branch of mathematics concerned with the theory and methods for solving problems on finding the extrema of functions on sets defined by linear and non-linear constraints (equalities and inequalities) in a finite-dimensional vector space. Mathematical programming is a branch of operations research, which comprises a wide class of control problems the mathematical models of which are finite-dimensional extremum problems. The problems of mathematical programming find applications in various areas of human activity where it is necessary to choose one of the possible ways of action, e.g. in solving numerous problems of control and planning of production processes as well as in problems of design and long-term planning. The term "mathematical programming" is connected with the fact that the goal of solving various problems is choosing programs of action.

The mathematical formulation of the problem of mathematical programming is: To minimize a scalar function $\phi(x)$ of a vector argument on the set \begin{equation} X=\left\{x\colon q_i(x)\geq 0,\; i=1,\ldots,k;\quad h_j(x)=0,\; j=1,\ldots,m \right\}, \end{equation} where $q_i(x)$ and $h_j(x)$ are scalar functions. The function $\phi(x)$ is called the objective function, and also the quality criterion, the set $X$ is called the feasible set, or the set of plans, a solution $x^*$ of the mathematical programming problem is an optimum point (or vector), a point of global minimum and also an optimal plan.

In mathematical programming one customarily distinguishes the following branches.

  • Linear programming: The objective function $\phi(x)$ and the constraints $q_i(x)$ and $h_j(x)$ are linear.
  • Quadratic programming: The objective function is quadratic and convex, while the feasible set is defined by linear equalities and inequalities.
  • Convex programming: The objective function and the constraints are convex.
  • Discrete programming/Integer programming: The solution is sought only at discrete, say integer, points of $X$.
  • Stochastic programming: In contrast to deterministic problems, here the data contain an element of indeterminacy. For example, in stochastic problems of minimization of a linear function $\sum_{j=1}^{n}c_j x_j$ under linear constraints $\sum_{j=1}^{n}a_{ij} x_j\geq b_i$, $i=1,2,\ldots$, the parameters $c_j$, $a_{ij}$, $b_i$, or only some of them, are random.

The problems of linear, quadratic and convex programming have a common property: Local optimality implies global optimality. The so-called multi-extremum problems, for which the indicated property does not hold, are both considerably more difficult and less investigated.

At the basis of the theory of convex, and, in particular, linear and quadratic, programming lie the Karush-Kuhn-Tucker conditions, which gives necessary and sufficient conditions for the existence of an optimum point $x^*$: In order that $x^*$ be an optimum point, i.e. \begin{equation} \phi(x^*)=\min_{x\in X}\phi(x), \end{equation} \begin{equation} X=\left\{x\colon f_i(x)\geq 0,\; i=1,\ldots,k\right\}, \end{equation} it is necessary and sufficient that there exist a point $y^*=(y_1^*,\ldots,y_k^*)$ such that the pair $x^*$, $y^*$ be a saddle point of the Lagrange function \begin{equation} L(x,y)=\phi(x)+\sum_{i=1}^{k}y_i f_i(x) . \end{equation} The latter means that \begin{equation} L(x^*,y)\leq L(x^*,y^*)\leq L(x,y^*), \end{equation} for all $x$ and all $y\geq 0$. If the constraints $f_i(x)$ are non-linear, the theorem is valid under some additional assumptions on the feasible set, for example, the existence of a point $x\in X$ such that $f_i(x)>0$, $i=1,\ldots,k$ (Slater's regularity condition).

If $\phi(x)$ and $f_i(x)$ are differentiable functions, then the following relations characterize a saddle point:

Thus, the problem of convex programming reduces to the solution of a system of equations and inequalities. \begin{equation} \begin{aligned} \sum_{j=1}^{n}\frac{\partial L}{\partial x_j}x_j = 0;&\qquad \frac{\partial L}{\partial x_j} \geq 0,& j=1,\ldots,n;\\ \sum_{i=1}^{k}\frac{\partial L}{\partial y_i}y_i = 0;&\qquad \frac{\partial L}{\partial y_i} \leq 0,\quad y_i\geq 0,& i=1,\ldots,k. \end{aligned} \end{equation} On the basis of the Karush-Kuhn-Tucker conditions various iterative minimization methods were developed, which amount to searching for a saddle point of Lagrange's function.

In mathematical programming one of the main directions concerns computational methods for solving extremum problems. One of the widest used among these methods is the method of feasible directions. In this method a sequence $\{x_m\}$ of points of the set $X$ is constructed by the formula $x_{p+1}=x_p + \alpha_p s_p$. At each iteration, to calculate the point $x_{p+1}$ one has to choose a direction (a vector) $s_p$ and a step length (a number) $\alpha_p$. For $s_p$ one selects from among the feasible directions (i.e. directions at the point $x_p$, a small displacement along which does not lead out of the set $X$) the one that makes an acute angle with the direction $g(x_p)$ of steepest descent of the objective function (with the vector $g(x_p)=-\left.\frac{d \phi(x)}{dx}\right|_{x=x_p}$ if $\phi(x)$ is differentiable). Therefore, along the direction $s_p$ the function $\psi_p(\alpha)=\phi(x_p + \alpha s_p)$ is decreasing. The number $\alpha_p$ is determined from the conditions that $x_{p+1}\in X$ and $\phi(x_{p+1})<\phi(x_p)$. To calculate $s_p$ at $x_p$ one defines the cone of feasible directions, given by a system of inequalities, and one formulates the problem (of linear programming) of finding the possible direction along which the objective function decreases most quickly. This problem is readily solved using, say, the standard simplex method. The step length $\alpha_p$ is determined by solving the minimization problem for the function $\psi_p(\alpha)$. Under rather general assumptions it has been shown that the distance from $x_p$ to the set of stationary points of the original problem (i.e. the set of points at which the conditions necessary for a local minimum of $\phi(x)$ on $X$ are satisfied) tends to zero as $p\rightarrow\infty$. In the case when the original problem is a problem in convex programming, then, as $p\rightarrow\infty$, $x_p$ approaches the set of solutions (optimum points) of the original problem. The method of feasible directions admits an approximate solution of the indicated problems of determination of $s_p$ and $\alpha_p$, and in this sense it is stable with respect to computational errors.

For feasible sets of special structure (from the point of view of the simplicity of choosing $s_p$) one can apply minimization methods different from the method of feasible directions. Thus, in a gradient-projection method, as the descent direction at the point $x_p$ one takes $s_p=y_p - x_p$, where $y_p$ is the projection of $v_p=x_p+g(x_p)$, $g(x_p)=-\left.\frac{d \phi(x)}{dx}\right|_{x=x_p}$, on the set $X$.

If $X$ is given by linear constraints, one has the rather promising method of the conditional gradient: One linearizes the objective function at $x_p$ by constructing the function $l(x)=\phi(x)+\langle g(x_p),x_p - x\rangle$, and then, minimizing $l(x)$ on the set $X$, one finds a minimum point $y_p$ of it and one finally sets $s_p=y_p - x_p$.

Minimization methods for non-smooth functions were successfully elaborated. One of the representatives of this class is the method of the generalized gradient. By definition, a generalized gradient at $x_p$ is any vector $\tilde{g}(x_p)$ such that the inequality $\phi(y)-\phi(x_p)\geq\langle\tilde{g}(x_p),y-x_p\rangle$ holds for all $y\in X$. This method differs from the projection-gradient method in that for the point that is projected one takes $v_p=x_p-\tilde{g}(x_p)$.

Stochastic methods of minimization are also widely used. In the simplest variant of the method of random descent one takes for $s_p$ a random point on the $n$-dimensional unit sphere with center at the coordinate origin, and if $s_p$ is a feasible direction and if, in addition, $\phi(x)$ decreases along this direction, then a step is effected, i.e. one shifts to the point $x_{p+1}$. In the opposite case, the procedure for selecting $s_p$ is carried out anew.

A characteristic feature of the computational aspect of the methods for solving the problems in mathematical programming is that the application of these methods is invariably connected with the utilization of electronic computers. The main reason for this is that the problems in mathematical programming that formalize situations of control of real systems involve a large amount of work which cannot be performed by manual computation.

One of the widespread methods for investigating problems in mathematical programming is the method of penalty functions. This method essentially replaces the given mathematical programming problem by a sequence of parametric problems on unconditional minimization, such that when the parameter tends to infinity (in other cases, to zero) the solutions of these auxiliary problems converge to the solution of the original problem. Note that in an unconditional minimization problem every direction is feasible, and consequently the search for $s_p$ requires less effort in comparison with the solution of the same problem by the method of feasible directions. (For example, in the method of steepest descent one takes $s_p=-\left.\frac{d \phi(x)}{dx}\right|_{x=x_p}$.) The same is true concerning the search for $\alpha_p$.

An important direction of investigation in mathematical programming is the problem of stability. Here essential importance is attached to the study of the class of stable problems, that is, problems for which small perturbations (errors) in the data result in perturbations of the solutions that are also small. In the case of unstable problems an important role is reserved for a procedure of approximating the unstable problem by a sequence of stable problems — this is known as the regularization process.

Along with finite-dimensional problems, problems of mathematical programming in infinite-dimensional spaces are considered as well. Among the latter there are various extremum problems in mathematical economics, technology, problems of optimization of physical characteristics of nuclear reactors, and others. In the terminology of mathematical programming one formulates the problems in the corresponding function space of variational calculus and optimal control.

Mathematical programming has crystallized as a science in the 1950-s until 1970-s. This is first of all connected with the development of electronic computers and, hence, the mathematical processing of large amounts of data. On this basis one solves problems of control and planning, in which the application of mathematical methods is connected mainly with the construction of mathematical models and related extremal problems, among them problems of mathematical programming.

References

[1] Yu.P. Ivanov, "Methods of optimization" , Moscow (1978) (In Russian)
[2] V.G. Karmanov, "Mathematical programming" , Moscow (1975) (In Russian)
[3] E. Polak, "Computational methods in optimization: a unified approach" , Acad. Press (1971)
[4] Yu.M. Ermol'ev, "Methods of stochastic programming" , Moscow (1976) (In Russian)
[5] M. Minoux, "Mathematical programming: theory and algorithms" , Wiley (1986)
[6] G. Zoutendijk, "Mathematical programming methods" , North-Holland (1976)
[7] A.R.G. Heesterman, "Matrices and simplex algorithms" , Reidel (1983)
[8] G.F. Hadley, "Nonlinear and dynamic programming" , Addison-Wesley (1964)
[9] W.I. Zangwill, "Nonlinear programming: a unified approach" , Prentice-Hall (1969)
[10] S. Zionts, "Linear and integer programming" , Prentice-Hall (1973)
How to Cite This Entry:
Mathematical programming. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Mathematical_programming&oldid=33652
This article was adapted from an original article by V.G. Karmanov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article