Maximization and minimization of functions

of a finite number of variables

The problem of finding the extrema of a function $f(x)$, $x = (x^1, \dots, x^n) \in X \subseteq \mathbf{R}^n$. This means

1) finding $\overline{f} = \sup_{x \in X} f(x)$ or $\underline{f} = \inf_{x \in X} f(x)$;

2) searching for maximum or minimum points if $\overline{f}$ or $\underline{f}$ is attained on the admissible set (see Maximum and minimum of a function);

3) the construction of maximizing or minimizing sequences $\{ x_i \}$ such that:

$$ \lim_{i \rightarrow \infty} f(x_i) = \overline{f}, \qquad \lim_{j \rightarrow \infty} f(x_j) = \underline{f}, $$

if $\overline{f}$ or $\underline{f}$ is not attained on $X$.

The investigation of the extrema of functions of discrete arguments is referred to as discrete programming or integer programming. Below, only maximization and minimization of functions of continuous arguments are discussed.

The classical (indirect) methods of maximization and minimization apply only to smooth functions. They use necessary conditions for an extremum in order to locate stationary points. Zeros of the partial derivatives $\partial f / \partial x^{\alpha}$, $\alpha = 1, \dots, n$, are usually computed in practice by some successive approximation method (see [3]). On the other hand, each problem of solving a finite system of functional equations of the form

$$ \phi_k (x^1, \dots, x^n) = 0, \qquad k = 1, \dots, m, \quad m \leq n, $$

can be interpreted as the problem of maximizing or minimizing some function, for example the function

$$ f(x) = \phi_1^2(x) + \dots + \phi_m^2(x) \rightarrow \min, $$

and one of the specific methods for maximizing and minimizing can be applied.
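
For illustration, here is a minimal Python sketch of this reduction; the two-equation system, the starting point and the use of the derivative-free Nelder–Mead simplex method are assumptions made only for the example, not part of the article.

    # Sketch: solve phi_1(x) = phi_2(x) = 0 by minimizing the sum of squares
    # f(x) = phi_1(x)^2 + phi_2(x)^2 with a general-purpose minimizer.
    import numpy as np
    from scipy.optimize import minimize

    def phi(x):
        # illustrative system: x^2 + y^2 - 1 = 0 and x - y = 0
        return np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])

    def f(x):
        return float(np.sum(phi(x)**2))

    result = minimize(f, x0=np.array([0.5, 0.0]), method='Nelder-Mead')
    print(result.x, f(result.x))  # a point near a common zero of phi, f close to 0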

Direct methods for maximizing and minimizing functions are based on direct comparison of the values of $f$ at two or more points.

The practical search for extrema uses iterative algorithms of the form:

$$ x_{i+1} = \widehat{X}(i, x_i, x_{i-1}, \dots, x_{i-j}), $$

where $i$ is the index of the iteration and $\widehat{X}(\cdot)$ is some operator. Here it is usually assumed that

a) the algorithm converges in some sense, most often in the sense

$$ x_\infty = \overline{x} \ \ (x_\infty = \underline{x}) \quad \textrm{or} \quad f(x_\infty) = \overline{f} \ \ (f(x_\infty) = \underline{f}); $$

b) the iteration procedure is local, that is, $j \ll i$ ($j = o(i)$ as $i \rightarrow \infty$); the algorithm "remembers" the values of $x$ only for iterations in some neighbourhood of the current position $x_i$. For $j = 0$ a memoryless simple Markov computational process is obtained.

The operator $\widehat{X}(\cdot)$ may be deterministic (deterministic methods) or may contain stochastic parameters (stochastic methods). In computational practice stochastic methods are often combined with deterministic ones; for example, in the coordinate-wise descent method the direction of descent may be chosen randomly. The probabilistic characteristics of the stochastic parameters may, in turn, be changed from iteration to iteration (search with adaptation and "self-learning" random search).
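
A minimal sketch of such a combination is coordinate-wise descent in which the coordinate is chosen at random; the quadratic test function, step rule and iteration count below are illustrative assumptions.

    # Sketch: coordinate-wise descent with a randomly chosen coordinate direction.
    import numpy as np
    rng = np.random.default_rng(0)

    def f(x):
        return (x[0] - 1.0)**2 + 10.0 * (x[1] + 2.0)**2

    x = np.zeros(2)
    step = 1.0
    for i in range(200):
        k = rng.integers(len(x))            # random coordinate
        e = np.zeros(len(x)); e[k] = 1.0    # unit vector along that coordinate
        if f(x + step * e) < f(x):          # move only if f decreases (relaxation)
            x = x + step * e
        elif f(x - step * e) < f(x):
            x = x - step * e
        else:
            step *= 0.5                     # no improvement: shrink the step
    print(x)  # approaches the minimizer (1, -2)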

Various deterministic methods are widely used in combination, including sequential and parallel computation of an extremum by several methods, composition of algorithms of the form $\widehat{X} = \widehat{X}_2(\widehat{X}_1(\cdot))$, etc. For example, the Levenberg–Marquardt method

$$ x_{i+1} = x_i - ( \alpha_i \nabla\nabla f(x_i) + \beta_i I )^{-1} \nabla f(x_i), $$

coincides with the gradient method for $\alpha_i = 0$ and with the Newton method for $\beta_i = 0$.
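
A minimal sketch of this iteration for a problem with known gradient and Hessian; the test function, the fixed $\alpha_i = 1$ and the decreasing $\beta_i$ are assumptions made only for the example.

    # Sketch of x_{i+1} = x_i - (alpha_i*H(x_i) + beta_i*I)^{-1} * grad f(x_i).
    import numpy as np

    def grad_f(x):   # gradient of f(x) = (x0 - 3)^2 + 2*(x1 + 1)^2
        return np.array([2.0 * (x[0] - 3.0), 4.0 * (x[1] + 1.0)])

    def hess_f(x):   # Hessian of the same f (constant here)
        return np.array([[2.0, 0.0], [0.0, 4.0]])

    x = np.zeros(2)
    for i in range(20):
        alpha, beta = 1.0, 10.0 / (i + 1)   # beta -> 0: the step tends to Newton's
        A = alpha * hess_f(x) + beta * np.eye(2)
        x = x - np.linalg.solve(A, grad_f(x))
    print(x)  # converges towards the minimizer (3, -1)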

One-dimensional optimization, that is, maximizing and minimizing a function $f(x)$, $x \in \mathbf{R}^1$, in addition to its intrinsic interest, is a necessary stage in most practical methods. Specific one-dimensional methods include, for example, the Fibonacci method, the dichotomy method (division by halves) and the parabola method. Methods for maximizing and minimizing functions of several variables include the gradient method, the method of steepest descent (cf. Steepest descent, method of), the coordinate-wise descent method, the simplex method, the scanning method, the method of conjugate gradients, the method of the heavy sphere (cf. Conjugate gradients, method of; Heavy sphere, method of the), the adjustment method, and others.
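
As one example, the dichotomy method can be sketched as follows; the unimodal test function, bracket and tolerances are illustrative assumptions.

    # Sketch of the dichotomy (interval-halving) method for a unimodal f on [a, b].
    def dichotomy_min(f, a, b, eps=1e-6, delta=1e-7):
        # delta < eps/2 keeps both probe points strictly inside the interval
        while b - a > eps:
            mid = 0.5 * (a + b)
            x1, x2 = mid - delta, mid + delta
            if f(x1) < f(x2):
                b = x2      # the minimum lies in [a, x2]
            else:
                a = x1      # the minimum lies in [x1, b]
        return 0.5 * (a + b)

    print(dichotomy_min(lambda x: (x - 2.0)**2 + 1.0, 0.0, 5.0))  # about 2.0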

The algorithms of most of the methods listed fall into the scheme of the descent (ascent) method:

$$ x_{i+1} = x_i \mp \kappa_i y_i, $$

where $f(x_{i+1}) \leq f(x_i)$ or $f(x_{i+1}) \geq f(x_i)$ for all $i$ (the relaxation condition). These methods are distinguished either by the choice of a descent direction $y_i$ or by the way of moving along the descent vector, defined by the step factor $\kappa_i$.
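
A minimal sketch of this scheme with the descent direction $y_i = \nabla f(x_i)$ and a step factor halved until the relaxation condition holds; the test function and starting point are assumptions for the example.

    # Sketch of the descent scheme x_{i+1} = x_i - kappa_i * y_i with y_i = grad f(x_i).
    import numpy as np

    def f(x):
        return x[0]**2 + 5.0 * x[1]**2

    def grad_f(x):
        return np.array([2.0 * x[0], 10.0 * x[1]])

    x = np.array([4.0, -3.0])
    for i in range(100):
        y = grad_f(x)
        kappa = 1.0
        while f(x - kappa * y) > f(x):   # enforce the relaxation condition
            kappa *= 0.5
        x = x - kappa * y
    print(x)  # approaches the minimizer (0, 0)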

Cut methods have been developed for functions whose contours are "valleys with steep slopes" (see Minimization methods for functions depending strongly on a few variables). Ordinary (non-cut) methods, when applied here, give a zig-zag relaxation path, requiring an excessive amount of machine time to compute the extremum.

The comparative effectiveness of the methods is assessed by many, sometimes contradictory, criteria: accuracy of the solution, speed of approach to the solution, reliability of the method, preparation time from problem statement to calculation, convergence of the algorithm, etc. The domain of applicability of each of the well-tried methods is quite limited.

For testing the methods a set of standard test functions, characteristic of different function classes, has been collected (see [1]). Convergence of methods for maximizing and minimizing functions has been extensively studied (see [6], [8]). However, convergence is a property which is neither necessary nor sufficient for the effective termination of a calculation.

All the methods above lead to a local extremum if the first approximation belongs to the domain of attraction of this extremum. The detection of a global extremum is guaranteed only for convex and related unimodal functions. The theory of finding a global extremum is still (1989) in the initial phase of development (see Multi-extremum problem). Another developing area of maximization and minimization of functions is the optimization of non-smooth functions (see [4], [13], [16]). In particular, the problem of minimizing the maximum of a family of functions generally leads to non-smooth functions (see Maximin, numerical methods). Apparently, all of the commonly-used methods of optimization have an interesting physical, economic or biological meaning. The corresponding research is only just developing (see [9]) and leads to the creation of new methods (see also Continuous analogues of iteration methods). If the values of the functions under consideration are determined statistically, with random noise, then one of the methods of stochastic approximation is applied to find the extrema. Here one borders on the design of experiments.

Experimental methods of maximizing and minimizing functions use the reproduction of various physical processes in the search for extrema. A related area is simulation on analogue computers (see [17]). Despite the convenience and low cost of the simplest automatic optimizers, they do not guarantee high accuracy of calculation.

Graphical methods are suitable only for obtaining rough estimates and for the construction of a first approximation for iterative methods.

If the admissible set is given by functional conditions

$$ \phi_k(x) \leq 0, \qquad k = 1, \dots, m, $$

(constraints and restrictions, conditional extremum), then methods of mathematical programming are applied in the search for extrema. This problem can be reduced to a sequence of unconstrained extremum problems by the use of penalty and barrier functions (see Penalty functions, method of).
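
A minimal sketch of such a reduction with a quadratic penalty; the objective, the single constraint and the penalty schedule are illustrative assumptions, and the inner unconstrained problems are solved with scipy's Nelder–Mead minimizer.

    # Sketch of the penalty-function reduction: minimize
    # f(x) + r * sum(max(0, phi_k(x))^2) for an increasing penalty parameter r.
    import numpy as np
    from scipy.optimize import minimize

    def f(x):
        return (x[0] - 2.0)**2 + (x[1] - 2.0)**2

    def phi(x):                     # constraint phi(x) <= 0, i.e. x0 + x1 <= 2
        return np.array([x[0] + x[1] - 2.0])

    x = np.zeros(2)
    for r in [1.0, 10.0, 100.0, 1000.0]:
        penalized = lambda z, r=r: f(z) + r * float(np.sum(np.maximum(0.0, phi(z))**2))
        x = minimize(penalized, x, method='Nelder-Mead').x
    print(x)  # approaches the constrained minimizer (1, 1)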

References

[1] M. Aoki, "Introduction to optimization techniques" , Macmillan (1971)
[2] N.S. Bakhvalov, "Numerical methods: analysis, algebra, ordinary differential equations" , MIR (1977) (Translated from Russian)
[3] I.S. Berezin, N.P. Zhidkov, "Computing methods" , Pergamon (1973) (Translated from Russian)
[4] F.P. Vasil'ev, "Lectures on methods for solving extremal problems" , Moscow (1974) (In Russian)
[5] A.M. Gupal, "Stochastic methods of solving non-smooth extremal problems" , Kiev (1979) (In Russian)
[6] V.G. Karmanov, "Mathematical programming" , Moscow (1975) (In Russian)
[7] N.N. Moiseev, Yu.P. Ivanilov, E.M. Stolyarova, "Methods of optimization" , Moscow (1978) (In Russian)
[8] B.N. Pshenichnyi, Yu.M. Danilin, "Numerical methods in extremal problems" , MIR (1978) (Translated from Russian)
[9] B.S. Razumikhin, "Physical models and methods of the theory of equilibrium in programming and economics" , Reidel (1984) (Translated from Russian)
[10] L.A. Rastrigin, "Extremal control systems" , Moscow (1974) (In Russian)
[11] V.K. Saul'ev, I.I. Samoilova, "Approximation methods for the unconstrained optimization of functions of several variables" J. Soviet Math. , 4 : 6 (1975) pp. 681–705 Itogi Nauk. i Tekhn. Mat. Anal. , 11 (1973) pp. 91–128
[12] D.J. Wilde, "Extremum searching methods" , Prentice-Hall (1965)
[13] V.V. Fedorov, "Numerical maximin methods" , Moscow (1979) (In Russian)
[14] J.M. Ortega, W.C. Rheinboldt, "Iterative solution of non-linear equations in several variables" , Acad. Press (1970)
[15] , The current state of the theory of Operations Research , Moscow (1979) (In Russian)
[16] C. Lemarechal (ed.) R. Mifflin (ed.) , Nonsmooth optimization , Pergamon (1978)
[17] L.C.W. Dixon (ed.) G.P. Szegö (ed.) , Towards global optimisation , 1–2 , North-Holland (1975–1978)
[18] Yu.G. Evtushenko, "Numerical methods for finding global extrema (case of a non-uniform mesh)" USSR Comp. Math. Math. Phys. , 11 : 6 (1971) pp. 38–54 Zh. Vychisl. Mat. i Mat. Fiz. , 11 : 6 (1971) pp. 1390–1403

How to Cite This Entry:
Maximization and minimization of functions. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Maximization_and_minimization_of_functions&oldid=47803
This article was adapted from an original article by Yu.P. Ivanilov, V.V. Okhrimenko (originator), which appeared in the Encyclopedia of Mathematics - ISBN 1402006098.