Mathematical programming

From Encyclopedia of Mathematics
Jump to: navigation, search

2020 Mathematics Subject Classification: Primary: 90Cxx [MSN][ZBL]

The branch of mathematics concerned with the theory and methods for solving problems on finding the extrema of functions on sets defined by linear and non-linear constraints (equalities and inequalities) in a finite-dimensional vector space. Mathematical programming is a branch of operations research, which comprises a wide class of control problems the mathematical models of which are finite-dimensional extremum problems. The problems of mathematical programming find applications in various areas of human activity where it is necessary to choose one of the possible ways of action, e.g. in solving numerous problems of control and planning of production processes as well as in problems of design and long-term planning. The term "mathematical programming" is connected with the fact that the goal of solving various problems is choosing programs of action.

The mathematical formulation of the problem of mathematical programming is: To minimize a scalar function $\phi(x)$ of a vector argument on the set \begin{equation} X=\left\{x\colon q_i(x)\geq 0,\; i=1,\ldots,k;\quad h_j(x)=0,\; j=1,\ldots,m \right\}, \end{equation} where $q_i(x)$ and $h_j(x)$ are scalar functions. The function $\phi(x)$ is called the objective function, and also the quality criterion, the set $X$ is called the feasible set, or the set of plans, a solution $x^*$ of the mathematical programming problem is an optimum point (or vector), a point of global minimum and also an optimal plan.

In mathematical programming one customarily distinguishes the following branches.

  • Linear programming: The objective function $\phi(x)$ and the constraints $q_i(x)$ and $h_j(x)$ are linear.
  • Quadratic programming: The objective function is quadratic and convex, while the feasible set is defined by linear equalities and inequalities.
  • Convex programming: The objective function and the constraints are convex.
  • Discrete programming/Integer programming: The solution is sought only at discrete, say integer, points of $X$.
  • Stochastic programming: In contrast to deterministic problems, here the data contain an element of indeterminacy. For example, in stochastic problems of minimization of a linear function $\sum_{j=1}^{n}c_j x_j$ under linear constraints $\sum_{j=1}^{n}a_{ij} x_j\geq b_i$, $i=1,2,\ldots$, the parameters $c_j$, $a_{ij}$, $b_i$, or only some of them, are random.

The problems of linear, quadratic and convex programming have a common property: Local optimality implies global optimality. The so-called multi-extremum problems, for which the indicated property does not hold, are both considerably more difficult and less investigated.

At the basis of the theory of convex, and, in particular, linear and quadratic, programming lie the Karush-Kuhn-Tucker conditions, which gives necessary and sufficient conditions for the existence of an optimum point $x^*$: In order that $x^*$ be an optimum point, i.e. \begin{equation} \phi(x^*)=\min_{x\in X}\phi(x), \end{equation} \begin{equation} X=\left\{x\colon f_i(x)\geq 0,\; i=1,\ldots,k\right\}, \end{equation} it is necessary and sufficient that there exist a point $y^*=(y_1^*,\ldots,y_k^*)$ such that the pair $x^*$, $y^*$ be a saddle point of the Lagrange function \begin{equation} L(x,y)=\phi(x)+\sum_{i=1}^{k}y_i f_i(x) . \end{equation} The latter means that \begin{equation} L(x^*,y)\leq L(x^*,y^*)\leq L(x,y^*), \end{equation} for all $x$ and all $y\geq 0$. If the constraints $f_i(x)$ are non-linear, the theorem is valid under some additional assumptions on the feasible set, for example, the existence of a point $x\in X$ such that $f_i(x)>0$, $i=1,\ldots,k$ (Slater's regularity condition).

If $\phi(x)$ and $f_i(x)$ are differentiable functions, then the following relations characterize a saddle point:

Thus, the problem of convex programming reduces to the solution of a system of equations and inequalities. \begin{equation} \begin{aligned} \sum_{j=1}^{n}\frac{\partial L}{\partial x_j}x_j = 0;&\qquad \frac{\partial L}{\partial x_j} \geq 0,& j=1,\ldots,n;\\ \sum_{i=1}^{k}\frac{\partial L}{\partial y_i}y_i = 0;&\qquad \frac{\partial L}{\partial y_i} \leq 0,\quad y_i\geq 0,& i=1,\ldots,k. \end{aligned} \end{equation} On the basis of the Karush-Kuhn-Tucker conditions various iterative minimization methods were developed, which amount to searching for a saddle point of Lagrange's function.

In mathematical programming one of the main directions concerns computational methods for solving extremum problems. One of the widest used among these methods is the method of feasible directions. In this method a sequence $\{x_m\}$ of points of the set $X$ is constructed by the formula $x_{p+1}=x_p + \alpha_p s_p$. At each iteration, to calculate the point $x_{p+1}$ one has to choose a direction (a vector) $s_p$ and a step length (a number) $\alpha_p$. For $s_p$ one selects from among the feasible directions (i.e. directions at the point $x_p$, a small displacement along which does not lead out of the set $X$) the one that makes an acute angle with the direction $g(x_p)$ of steepest descent of the objective function (with the vector $g(x_p)=-\left.\frac{d \phi(x)}{dx}\right|_{x=x_p}$ if $\phi(x)$ is differentiable). Therefore, along the direction $s_p$ the function $\psi_p(\alpha)=\phi(x_p + \alpha s_p)$ is decreasing. The number $\alpha_p$ is determined from the conditions that $x_{p+1}\in X$ and $\phi(x_{p+1})<\phi(x_p)$. To calculate $s_p$ at $x_p$ one defines the cone of feasible directions, given by a system of inequalities, and one formulates the problem (of linear programming) of finding the possible direction along which the objective function decreases most quickly. This problem is readily solved using, say, the standard simplex method. The step length $\alpha_p$ is determined by solving the minimization problem for the function $\psi_p(\alpha)$. Under rather general assumptions it has been shown that the distance from $x_p$ to the set of stationary points of the original problem (i.e. the set of points at which the conditions necessary for a local minimum of $\phi(x)$ on $X$ are satisfied) tends to zero as $p\rightarrow\infty$. In the case when the original problem is a problem in convex programming, then, as $p\rightarrow\infty$, $x_p$ approaches the set of solutions (optimum points) of the original problem. The method of feasible directions admits an approximate solution of the indicated problems of determination of $s_p$ and $\alpha_p$, and in this sense it is stable with respect to computational errors.

For feasible sets of special structure (from the point of view of the simplicity of choosing $s_p$) one can apply minimization methods different from the method of feasible directions. Thus, in a gradient-projection method, as the descent direction at the point $x_p$ one takes $s_p=y_p - x_p$, where $y_p$ is the projection of $v_p=x_p+g(x_p)$, $g(x_p)=-\left.\frac{d \phi(x)}{dx}\right|_{x=x_p}$, on the set $X$.

If $X$ is given by linear constraints, one has the rather promising method of the conditional gradient: One linearizes the objective function at $x_p$ by constructing the function $l(x)=\phi(x)+\langle g(x_p),x_p - x\rangle$, and then, minimizing $l(x)$ on the set $X$, one finds a minimum point $y_p$ of it and one finally sets $s_p=y_p - x_p$.

Minimization methods for non-smooth functions were successfully elaborated. One of the representatives of this class is the method of the generalized gradient. By definition, a generalized gradient at $x_p$ is any vector $\tilde{g}(x_p)$ such that the inequality $\phi(y)-\phi(x_p)\geq\langle\tilde{g}(x_p),y-x_p\rangle$ holds for all $y\in X$. This method differs from the projection-gradient method in that for the point that is projected one takes $v_p=x_p-\tilde{g}(x_p)$.

Stochastic methods of minimization are also widely used. In the simplest variant of the method of random descent one takes for $s_p$ a random point on the $n$-dimensional unit sphere with center at the coordinate origin, and if $s_p$ is a feasible direction and if, in addition, $\phi(x)$ decreases along this direction, then a step is effected, i.e. one shifts to the point $x_{p+1}$. In the opposite case, the procedure for selecting $s_p$ is carried out anew.

A characteristic feature of the computational aspect of the methods for solving the problems in mathematical programming is that the application of these methods is invariably connected with the utilization of electronic computers. The main reason for this is that the problems in mathematical programming that formalize situations of control of real systems involve a large amount of work which cannot be performed by manual computation.

One of the widespread methods for investigating problems in mathematical programming is the method of penalty functions. This method essentially replaces the given mathematical programming problem by a sequence of parametric problems on unconditional minimization, such that when the parameter tends to infinity (in other cases, to zero) the solutions of these auxiliary problems converge to the solution of the original problem. Note that in an unconditional minimization problem every direction is feasible, and consequently the search for $s_p$ requires less effort in comparison with the solution of the same problem by the method of feasible directions. (For example, in the method of steepest descent one takes $s_p=-\left.\frac{d \phi(x)}{dx}\right|_{x=x_p}$.) The same is true concerning the search for $\alpha_p$.

An important direction of investigation in mathematical programming is the problem of stability. Here essential importance is attached to the study of the class of stable problems, that is, problems for which small perturbations (errors) in the data result in perturbations of the solutions that are also small. In the case of unstable problems an important role is reserved for a procedure of approximating the unstable problem by a sequence of stable problems — this is known as the regularization process.

Along with finite-dimensional problems, problems of mathematical programming in infinite-dimensional spaces are considered as well. Among the latter there are various extremum problems in mathematical economics, technology, problems of optimization of physical characteristics of nuclear reactors, and others. In the terminology of mathematical programming one formulates the problems in the corresponding function space of variational calculus and optimal control.

Mathematical programming has crystallized as a science in the 1950-s until 1970-s. This is first of all connected with the development of electronic computers and, hence, the mathematical processing of large amounts of data. On this basis one solves problems of control and planning, in which the application of mathematical methods is connected mainly with the construction of mathematical models and related extremal problems, among them problems of mathematical programming.


[1] Yu.P. Ivanov, "Methods of optimization" , Moscow (1978) (In Russian)
[2] V.G. Karmanov, "Mathematical programming" , Moscow (1975) (In Russian)
[3] E. Polak, "Computational methods in optimization: a unified approach" , Acad. Press (1971)
[4] Yu.M. Ermol'ev, "Methods of stochastic programming" , Moscow (1976) (In Russian)
[5] M. Minoux, "Mathematical programming: theory and algorithms" , Wiley (1986)
[6] G. Zoutendijk, "Mathematical programming methods" , North-Holland (1976)
[7] A.R.G. Heesterman, "Matrices and simplex algorithms" , Reidel (1983)
[8] G.F. Hadley, "Nonlinear and dynamic programming" , Addison-Wesley (1964)
[9] W.I. Zangwill, "Nonlinear programming: a unified approach" , Prentice-Hall (1969)
[10] S. Zionts, "Linear and integer programming" , Prentice-Hall (1973)
How to Cite This Entry:
Mathematical programming. Encyclopedia of Mathematics. URL:
This article was adapted from an original article by V.G. Karmanov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article