# Extremal problems, numerical methods

Methods of computational mathematics that are applied to find the extrema (maxima or minima) of functions and functionals.

The numerical solution of extremal problems considered in infinite-dimensional function spaces (for example, problems of optimal control by means of processes described by ordinary or partial differential equations) can be obtained by using appropriate generalizations of many methods of mathematical programming developed for problems of minimization or maximization of functions of finitely many variables. In this context it is very important to make the correct choice of a suitable function space in which a concrete problem should be considered. Such a space is usually chosen by taking into account physical considerations, properties of admissible controls, properties of the solutions of the corresponding initial-boundary value problems for a fixed control, etc.

For example, the problem of optimal control consisting of the minimization of the functional

$$\tag{1 } J ( u) = \int\limits _ {t _ {0} } ^ { T } f ^ { 0 } ( x ( t) , u ( t) , t) d t + F ( x ( T) )$$

under the conditions

$$\tag{2 } \dot{x} = f ( x , u ( t) , t ),\ \ t _ {0} \leq t \leq T ; \ \ x ( t _ {0} ) = x _ {0} ,$$

$$\tag{3 } u = u ( t) \in V ( t) ,\ t _ {0} \leq t \leq T ,$$

is usually considered in the function space $L _ {2} ^ {r} [ t _ {0} , T ]$. Here $x = ( x ^ {1} \dots x ^ {n} )$, $u = ( u ^ {1} \dots u ^ {r} )$, $f = ( f ^ {1} \dots f ^ { n } )$, $f ^ { i } ( x , u , t )$, $i = 1 \dots n$, and $F ( x)$ are given functions, $t _ {0}$ and $T$ known moments in time, $t _ {0} < T$, $x _ {0}$ is a given initial point, $V ( t)$ is a given set in the Euclidean space $E ^ {r}$ for every $t \in [ t _ {0} , T ]$, and $L _ {2} ^ {r} [ t _ {0} , T ]$ is the Hilbert space of $r$- dimensional vector functions $u = u ( t) = ( u ^ {1} ( t) \dots u ^ {r} ( t) )$, $t _ {0} \leq t \leq T$, where each function $u ^ {i} ( t)$ is, together with its square, Lebesgue integrable over $[ t _ {0} , t]$ $( L _ {2} ^ {1} [ t _ {0} , T ] = L _ {2} [ t _ {0} , T ] )$; the inner product of two functions $u$ and $v$ in this space is defined by

$$\langle u , v \rangle _ {L _ {2} ^ {r} } = \int\limits _ {t _ {0} } ^ { T } \sum _ { i= } 1 ^ { r } u ^ {i} ( t) v ^ {i} ( t) d t ,$$

and the norm by

$$\| u \| _ {L _ {2} ^ {r} } = \left ( \int\limits _ {t _ {0} } ^ { T } \sum _ { i= } 1 ^ { r } | u ^ {i} ( t) | ^ {2} d t \right ) ^ {1/2} .$$

For a specific degree of smoothness of the functions $f ^ { i } ( x , u , t )$ and $F ( x)$ the increment of the functional (1) can be represented in the form

$$\tag{4 } J ( u + h ) - J ( u) = - \int\limits _ {t _ {0} } ^ { T } \sum _ { i= } 1 ^ { r } \frac{\partial H ( u) }{\partial u ^ {i} } h ^ {i} ( t) d t + R =$$

$$= \ \langle - \frac{\partial H ( u) }{\partial u } , h \rangle _ {L _ {2} ^ {r} } + R ,\ R = o ( \| h \| _ {L _ {2} ^ {r} } ) ,$$

where

$$\tag{5 } H ( x , \psi , u , t ) = \sum _ { i= } 1 ^ { n } \psi _ {i} f ^ { i } ( x , u , t ) - f ^ { 0 } ( x , u , t ) ,$$

$$\frac{\partial H }{\partial u } = \left ( \frac{\partial H }{\partial u ^ {1} } \dots \frac{\partial H }{\partial u ^ {r} } \right ) ,\ \left . \frac{\partial H ( u) }{\partial u } = \frac{\partial H }{\partial u } \right | _ {\begin{array} {c} u = u ( t) \\ x = x ( t ; u ) \\ \psi = \psi ( t ; u ) \end{array} } ,$$

$x = x ( t ; u )$ is the solution of the problem (2) for $u = u ( t)$, and $\psi = \psi ( t ; u )$ is the solution of the adjoint problem

$$\tag{6 } \left . \dot \psi _ {i} = - \frac{\partial H }{\partial x ^ {i} } \right | _ {\begin{array} {c} u = u ( y) \\ x = x ( t ; u ) \end{array} } ,\ \ t _ {0} \leq t \leq T ,\ \ i = 1 \dots n ,$$

$$\tag{7 } \left . \psi _ {i} ( T) = \frac{\partial F }{\partial x ^ {i} } \right | _ {x = x ( T ; u ) } .$$

Formula (4) implies that the functional (1) is differentiable in $L _ {2} ^ {r} [ t _ {0} , T ]$ and that its gradient is the vector function

$$\tag{8 } J ^ \prime ( u) = - \frac{\partial H ( u) }{\partial u } \in L _ {2} ^ {r} [ t _ {0} , T ] .$$

Hence, (1)–(3) can be solved by applying various methods that use the gradient of the functional. For $V ( t) = E ^ {r}$ one may apply the gradient method

$$u _ {k+} 1 ( t) = u _ {k} ( t) + \alpha _ {k} \frac{\partial H ( u _ {k} ) }{\partial u } ,\ \ k = 0 , 1 , . . . .$$

If $V ( t) = \{ {u = ( u ^ {1} \dots u ^ {r} ) } : {\alpha _ {i} ( t) \leq u ^ {i} \leq \beta _ {i} ( t), i = 1 \dots r } \}$, where $\alpha _ {i} ( t)$ and $\beta _ {i} ( t)$ are given functions in $L _ {2} [ t _ {0} , T ]$, then it is possible to apply the method of gradient projection:

$$u _ {k+} 1 ( t) = P \left ( u _ {k} ( t) + \alpha _ {k} \frac{\partial H ( u _ {k} ) }{\partial u } \right ) ,\ \ k = 0 , 1 \dots$$

where

$$P ( w) = ( w ^ {1} ( t) \dots w ^ {r} ( t) ) ,\ \ t _ {0} \leq t \leq T ,$$

$$w ^ {i} ( t) = \left \{ \begin{array}{ll} \alpha _ {i} ( t) &\textrm{ for } v ^ {i} ( t) \leq \alpha _ {i} ( t) , \\ v ^ {i} ( t) &\textrm{ for } \alpha _ {i} ( t) \leq v ^ {i} ( t) \leq \beta _ {i} ( t) , \\ \beta _ {i} ( t) &\textrm{ for } v ^ {i} ( t) > \beta _ {i} ( t) ,\ i = 1 \dots r . \\ \end{array} \right .$$

The parameter $\alpha _ {k} > 0$ can be chosen from the condition $J ( u _ {k+} 1 ) < J ( u _ {k} )$. The methods of the conditional gradient, dual gradients, etc. (see  and ) can be adapted for the problem (1)–(3) analogously. If this problem is considered under the additional restrictions

$$\tag{9 } x ( t ; u ) \in G ( t) ,\ \ t _ {0} \leq t \leq t ,$$

where $G ( t)$ is a given set in $E ^ {n}$, then these restrictions can be taken into account by using the method of penalty functions (cf. Penalty functions, method of). For example, if

$$G ( t) =$$

$$= \ \{ {x \in E ^ {n} } : {g _ {i} ( x , t ) \leq 0 ,\ i = 1 \dots m ; g _ {i} ( x , t ) = 0 , i = m + 1 \dots s } \} ,$$

then as a penalty function one may take

$$P ( u) = \int\limits _ {t _ {0} } ^ { T } \left ( \sum _ { i= } 1 ^ { m } ( \max \{ g _ {i} ( x ( t ; u ) , t ) ; 0 \} ) ^ {p\right} . +$$

$$+ \left . \sum _ { i= } m+ 1 ^ { s } | g _ {i} ( x ( t ; u ) , t ) | ^ {p} \right ) d t,\ p \geq 1 ,$$

and (1)–(3), (9) is replaced by the problem of minimizing the functional $\Phi _ {k} ( u) = J ( u) + A _ {k} P ( u)$ under the conditions (2) and (3), where $A _ {k}$ is the penalty coefficient and $\lim\limits _ {k \rightarrow \infty } A _ {k} = + \infty$.

Other methods for solving (1)–(3), (9) are based on Pontryagin's maximum principle and on dynamic programming (see Pontryagin maximum principle; Dynamic programming and Variational calculus, numerical methods of).

The problem of minimizing a quadratic functional subject to a system of linear ordinary differential equations or linear partial differential equations can be solved by applying the method of moments (see  and ). This method is described below when applied to the problem of minimizing the functional

$$\tag{10 } J ( u) = | x ( T ; u ) - y | _ {E ^ {n} } ^ {2} ,$$

where $x = x ( t ; u )$ is the solution of the problem

$$\tag{11 } \dot{x} = A ( t) x + B ( t) + f ( t) ,\ \ t _ {0} \leq t \leq T ,\ \ x ( t _ {0} ) = x _ {0} ,$$

and the controls $u = u ( t) \in L _ {2} ^ {r} [ t _ {0} , T ]$ are such that

$$\tag{12 } \| u \| _ {L _ {2} ^ {r} } ^ {2} = \ \int\limits _ {t _ {0} } ^ { T } | u ( t) | _ {E ^ {r} } ^ {2} d t \leq R ^ {2} ;$$

here $A ( t)$, $B ( t)$ and $f ( t)$ are given matrices of order $n \times n$, $n \times r$ and $n \times 1$, respectively, with piecewise-continuous entries on the interval $t _ {0} \leq t \leq T$, $0 < R \leq \infty$, $x , y \in E ^ {n}$ are given points, $| x | _ {E ^ {m} } ^ {2} = \langle x , x \rangle _ {E ^ {m} }$, and $\langle x , y \rangle _ {E ^ {m} } = \sum _ {i=} 1 ^ {m} x ^ {i} y ^ {i}$ is the inner product in $E ^ {m}$. From the rule of Lagrange multipliers it follows that the control $u = u ( t)$ is optimal in the problem (10)–(12) if and only if there is a number $\gamma \geq 0$( the Lagrange multiplier for the constraint (12)) such that

$$\tag{13 } J ^ \prime ( u) + \gamma u = 0 ,$$

$$\tag{14 } \gamma ( \| u \| _ {L _ {2} ^ {r} } - R ) = \ 0 ,\ \| u \| _ {L _ {2} ^ {r} } \leq R ;$$

here $\gamma = 0$ for $R = \infty$. Formulas (5)–(8) imply that the gradient $J ^ \prime ( u)$ of the functional (10) in $L _ {2} ^ {r} [ t _ {0} , T ]$ has the form

$$J ^ \prime ( u) = B ^ {T} ( t) \psi ( t ; u ) ,\ \ t _ {0} \leq t \leq T ,$$

where $\psi = \psi ( t ; u )$ is the solution of the problem

$$\tag{15 } \dot \psi = - A ^ {T} ( t) \psi ,\ \ t _ {0} \leq t \leq T ,$$

$$\tag{16 } \psi ( T) = 2 ( x ( T ; u ) - y ) ;$$

here $A ^ {T}$ and $B ^ {T}$ are the transposed matrices of $A$ and $B$, respectively. Then condition (13) becomes

$$\tag{17 } B ^ {T} ( t) \psi ( t ; u ) + \gamma u ( t) = 0 ,\ \ t _ {0} \leq t \leq T .$$

Condition (16) is equivalent to

$$\tag{18 } \int\limits _ {t _ {0} } ^ { T } \langle u ( t), B ^ {T} ( t) p _ {k} ( t) \rangle _ {E ^ {r} } d t - \frac{1}{2} \langle \psi ( T) , p _ {k} ( T) \rangle _ {E ^ {n} } = a _ {k} ,$$

for $k = 1 \dots n$, where

$$a _ {k} = \langle y , p _ {k} ( T) \rangle _ {E ^ {n} } - < x _ {0} ,\ p _ {k} ( t _ {0} ) > _ {E ^ {n} } - \int\limits _ {t _ {0} } ^ { T } \langle f ( t) , p _ {k} ( t) \rangle _ {E ^ {r} } d t ,$$

and $p _ {k} ( t)$ is the solution of the system (15) satisfying $p _ {k} ( T) = e _ {k} = ( 0 \dots 0 , 1 , 0 \dots 0)$( a unit vector). Hence, the optimal control $u = u ( t)$ in the problem (10)–(12) is determined by solving the system (14), (15), (17), (18) relative to the functions $u ( t)$ and $\psi ( t)$ and the number $\gamma \geq 0$. If $R = \infty$, then $\gamma = 0$, $\psi ( t) = 0$ and (18) reduces to the problem of moments (see Moment problem): To find a function $u = u( t)$, knowing

$$\int\limits _ {t _ {0} } ^ { T } \langle u ( t) , \phi _ {k} ( t) \rangle d t = a _ {k}$$

with respect to the system $\phi _ {k} ( t) = B ^ {T} ( t) p _ {k} ( t)$, $k = 1 \dots n$.

System (14), (15), (17), (18) is a generalized moment problem for (10)–(12) with $0 < R < \infty$( see  and ).

Any solution $\psi ( t)$ of (15) is uniquely representable in the form

$$\tag{19 } \psi ( t) = \sum _ { k= } 1 ^ { n } \psi _ {k} p _ {k} ( t) ,\ \ t _ {0} \leq t \leq T .$$

System (15), (17), (18) has a solution $\psi ( t ; \gamma )$, $u ( t ; \gamma )$ for any fixed $\gamma \geq 0$, and among all its solutions there is a unique one such that $u ( t ; \gamma )$ has the form

$$\tag{20 } u ( t ; \gamma ) = \sum _ { k= } 1 ^ { n } u _ {k} B ^ {T} ( t) p _ {k} ( t) ,\ \ t _ {0} \leq t \leq T .$$

To find $\psi ( t ; \gamma )$ and $u ( t ; \gamma )$ one substitutes (19) and (20) in (17) and (18) and obtains a system of linear algebraic equations with respect to $\psi _ {1} \dots \psi _ {n} , u _ {1} \dots u _ {n}$, from which $\psi _ {1} \dots \psi _ {n}$ are determined uniquely and $u _ {1} \dots u _ {n}$ non-uniquely if the system $\{ {B ^ {T} ( t) p _ {k} ( t) } : {k = 1 \dots n } \}$ is linearly dependent. In the practical solution of the problem (10)–(12) it is advisable to set $\gamma = 0$ and $\psi ( t) = 0$ from the beginning and to use (18) to determine $u ( t ; 0 )$ of the form (20). Next, it must be checked that $\| u ( t ; 0 ) \| _ {L _ {2} ^ {r} } \leq R$. If this inequality holds, then $u ( t ; 0 )$ is the optimal control of (10)–(12) with minimal norm among all optimal controls; the set of all optimal controls in this case consists of controls of the form

$$u ( t) = u ( t ; 0 ) + v ( t) ,\ \ t _ {0} \leq t \leq T ,$$

where $v ( t)$ belongs to the orthogonal complement in $L _ {2} ^ {r} [ t _ {0} ; T ]$ of the linear span of the system of functions

$$\{ {B ^ {T} ( t) p _ {k} ( t) } : {k = 1 \dots n } \} ,$$

$$\| v ( t) \| _ {L _ {2} ^ {r} } \leq R ^ {2} - \| u ( t ; 0 ) \| _ {L _ {2} ^ {r} } ^ {2} .$$

If $\| u ( t ; 0 ) \| _ {L _ {2} ^ {r} } > R$, then for $\gamma > 0$ the solutions $\psi ( t ; \gamma )$ and $u ( t ; \gamma )$ of the form (19) and (20), respectively, are found from (17) and (18), and $\gamma$ is found from the equation

$$\tag{21 } \| u ( t ; \gamma ) \| _ {L _ {2} ^ {r} } = R ,\ \gamma > 0 .$$

The function $\| u ( t ; \gamma ) \| _ {L _ {2} ^ {r} }$ of the variable $\gamma$ is continuous and strictly decreasing for $\gamma > 0$, and $\lim\limits _ {\gamma \rightarrow + \infty } \| u ( t ; 0 ) \| _ {L _ {2} ^ {r} } = 0$; therefore, the desired $\gamma = \gamma _ {*}$ is determined uniquely from (21). The control $u ( t ; \gamma _ {*} )$ is optimal for the problem (10)–(12); for $\| u ( t ; 0 ) \| _ {L _ {2} ^ {r} } > R$ this problem has no other optimal controls.

The method of moments is also applicable for the solution of the high-speed problem for the system (11) and other linear systems (see  and ).

The above methods are also widely used for the numerical solution of problems of optimal control by means of processes described by partial differential equations.

The numerical realizations of many methods for solving problems of optimal control are based on the use of some technique for the approximate solution of the intervening initial-boundary value problems (see Boundary value problem, numerical methods for partial differential equations) and of approximate computations of integrals (see Integration, numerical). As a result, the original problem of optimal control is replaced by a family of approximate problems depending on some parameters (for example, on the steps of the difference grid). About questions regarding the construction of approximate problems and the investigation of their convergence, see .

Large classes of extremal problems are ill posed (see Ill-posed problems) and their solution requires the use of a regularization method (see  and ).

How to Cite This Entry:
Extremal problems, numerical methods. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Extremal_problems,_numerical_methods&oldid=46893
This article was adapted from an original article by F.P. Vasil'ev (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article