# Variational calculus

calculus of variations

The branch of mathematics in which one studies methods for obtaining extrema of functionals which depend on the choice of one or several functions subject to constraints of various kinds (phase, differential, integral, etc.) imposed on these functions. This is the framework of the problems which are still known as problems of classical variational calculus. The term "variational calculus" also has a broader sense: a branch of the theory of extremal problems in which the extrema are studied by the "method of variations" (cf. Variation), i.e. by the method of small perturbations of the arguments and functionals; problems in this wider sense stand in contrast to discrete optimization problems.

The following scheme describes a rather wide range of problems of classical variational calculus. It is required to minimize the functional

$$\tag{1 } J ( x) = \int\limits _ { T } f ( t, x ( t), \dot{x} ( t)) dt,$$

where $T \subset \mathbf R ^ {m}$, $t = ( t _ {1}, \dots, t _ {m} )$, $x = ( x ^ {1}, \dots, x ^ {n} )$,

$$\dot{x} = \left ( \frac{\partial x ^ {i} }{\partial t _ {j} } \right ) ,\ \ f: \mathbf R ^ {m} \times \mathbf R ^ {n} \times \mathbf R ^ {mn} \rightarrow \mathbf R ,$$

subject to the constraints described by equations of the type

$$\tag{2 } \phi(t, x(t), \dot x(t)) = 0 , \\ \phi:\mathbf R^m \times \mathbf R^n \times \mathbf R^{mn} \to \mathbf R^s ,$$

and by certain boundary conditions $x\mid _ {\partial T } \in \Gamma$. Problems of this type are known as Lagrange problems (cf. Lagrange problem). Other types of problems considered are the Mayer problem, the Bolza problem, etc.

The most elementary question in classical variational calculus is the simplest problem in variational calculus, in which $t$ and $x$ in (1) are one-dimensional, the constraints (2) are absent and the boundary conditions are fixed:

$$\tag{3 } J ( x) = \ \int\limits _ { t _ {0} } ^ { {t _ 1 } } L ( t, x, \dot{x} ) dt \rightarrow \inf ; \ \ x ( t _ {0} ) = x _ {0} ,\ \ x ( t _ {1} ) = x _ {1} .$$

This type includes the brachistochrone problem, or the problem of curves of minimum time of descent. This problem is usually considered to be the starting point in the history of the calculus of variations.

The theoretical foundations of classical variational calculus were laid in the 18th century by L. Euler and J.L. Lagrange. They also discovered the important connections of this discipline with mechanics and physics. Many specific problems (on geodesics, surfaces of revolution, isoperimetric problems, etc.) were solved during the first stage of development of this theory — mainly owing to the work of G. Leibniz, Jacob and Johann Bernoulli, Euler and Lagrange.

Variational calculus deals with algorithmic methods for finding extrema, methods of arriving at necessary and sufficient conditions, conditions which ensure the existence of an extremum, qualitative problems, etc. Direct methods occupy an important place among the algorithmic methods for finding extrema.

## Direct methods.

Euler (1768) proposed a method for the approximate (numerical) solution of problems in variational calculus, which received the name of Euler's method of polygonal lines. This marked the beginning of the study of numerical methods for solving extremum problems. Euler's method was the first representative of a large class of methods known as direct methods of variational calculus. These methods are based on reducing the problem of finding the extremum of a functional to that of finding the extremum of a function of several variables.

Problem (3) may be solved by Euler's method of polygonal lines as follows. The interval $[ t _ {0} , t _ {1} ]$ is subdivided into $N$ equal parts of length $\tau = ( t _ {1} - t _ {0} )/N$ by the points $\tau _ {0} = t _ {0} , \tau _ {1} = t _ {0} + \tau, \dots, \tau _ {N} = t _ {0} + N \tau = t _ {1}$. Let the values of the function at these points be $x _ {0} , x _ {1}, \dots, x _ {N}$, respectively. Each set of points $( \tau _ {0} , x _ {0} ), \dots, ( \tau _ {N} , x _ {N} )$ defines a polygonal line. The problem may now be formulated as follows: Among all polygonal lines connecting the points $( \tau _ {0} , x _ {0} )$ and $( \tau _ {N} , x _ {N} )$, find the one for which the functional (3) assumes an extremal value. The value of the derivative $\dot{x}$ on the interval $[ \tau _ {i} , \tau _ {i+ 1 } ]$ is taken to be ${\dot{x} } _ {i} = ( x _ {i+ 1 } - x _ {i} )/ \tau$. The functional $J( x)$ then becomes a function of a finite number of variables $x _ {i}$:

$$J ( x) \sim J ( x _ {0}, \dots, x _ {N} ),$$

and problem (3) is reduced to the problem of finding the extremum of the function $J( x _ {0}, \dots, x _ {N} )$. In order that the polygonal line realizing the extremum of this function approximate the solution of problem (3) to high accuracy, the number $N$ must, as a rule, be sufficiently large. The amount of computation needed to find the extremum of such a function is then so large that "manual" computation is all but impossible. For this reason, direct methods were for a long time not used in fundamental studies of variational calculus.
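The discretization just described is easy to carry out on a computer. The following is a minimal sketch (assuming, for illustration, the integrand $L( t, x, \dot{x} ) = \dot{x} ^ {2}$, for which the exact minimizer between fixed endpoints is a straight line; the interior node values are found by plain numerical gradient descent, not by any historically authentic scheme):

```python
# A minimal sketch of Euler's method of polygonal lines (assumption:
# the integrand L(t, x, xdot) = xdot**2, so the exact minimizer between
# the fixed endpoints is a straight line; the interior node values are
# found by plain numerical gradient descent).

def discretized_J(xs, t0, t1, L):
    """Value of the functional on the polygonal line through the nodes xs."""
    N = len(xs) - 1
    tau = (t1 - t0) / N
    total = 0.0
    for i in range(N):
        xdot_i = (xs[i + 1] - xs[i]) / tau      # slope on [tau_i, tau_{i+1}]
        total += L(t0 + i * tau, xs[i], xdot_i) * tau
    return total

def euler_polygonal(t0, t1, x0, x1, L, N=10, steps=800, lr=0.01):
    """Minimize the discretized functional over the interior nodes."""
    xs = [x0] + [0.0] * (N - 1) + [x1]          # crude initial polygonal line
    h = 1e-6
    for _ in range(steps):
        for i in range(1, N):                   # endpoints stay fixed
            xs[i] += h
            jp = discretized_J(xs, t0, t1, L)
            xs[i] -= 2 * h
            jm = discretized_J(xs, t0, t1, L)
            xs[i] += h                          # restore, then step downhill
            xs[i] -= lr * (jp - jm) / (2 * h)
    return xs

xs = euler_polygonal(0.0, 1.0, 0.0, 1.0, lambda t, x, xd: xd ** 2)
```

For this integrand the computed polygonal line approaches the straight line $x( t) = t$; the sheer number of arithmetic operations involved illustrates why the method only became practical with the advent of computers.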

Direct methods began to be much more extensively studied in the 20th century. At first, new methods were proposed to reduce the problem to finding the extremum of a function in a finite number of variables. These ideas may be clarified by taking, as an example, the minimization of the functional (3) subject to the condition

$$x ( t _ {0} ) = x ( t _ {1} ) = 0.$$

The solution of this problem is sought in the form

$$x ( t) = \sum _ {n = 1 } ^ { N } a _ {n} \phi _ {n} ( t),$$

where $\{ \phi _ {n} ( t) \}$ is some system of functions satisfying the conditions $\phi _ {i} ( t _ {0} ) = \phi _ {i} ( t _ {1} ) = 0$, $i = 1, \dots, N$. The functional $J( x)$ becomes a function of the coefficients, $J( x) \sim J( a _ {1}, \dots, a _ {N} )$, and the problem is reduced to finding the extremum of this function of $N$ variables. Under certain conditions imposed on the system of functions $\{ \phi _ {n} \}$, the solution of the problem tends to that of problem (3) as $N \rightarrow \infty$ (cf. Galerkin method).
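A minimal numerical sketch of this expansion method (assuming, for illustration, the functional $J( x) = \int _ {0} ^ {1} ( \dot{x} {} ^ {2} - 2 x \sin \pi t ) dt$ with $x( 0) = x( 1) = 0$, whose exact minimizer is $x( t) = \sin ( \pi t)/ \pi ^ {2}$, and the basis $\phi _ {n} ( t) = \sin ( n \pi t)$, which vanishes at both endpoints):

```python
import math

# A minimal sketch of the expansion (Ritz) method (assumptions: the
# illustrative functional J(x) = integral_0^1 (xdot**2 - 2*x*sin(pi t)) dt
# with x(0) = x(1) = 0, whose exact minimizer is x(t) = sin(pi t)/pi**2,
# and the basis phi_n(t) = sin(n pi t), vanishing at both endpoints).

def J(a, M=100):
    """Trapezoidal approximation of the functional at coefficients a."""
    total = 0.0
    for k in range(M + 1):
        t = k / M
        x = sum(a[n] * math.sin((n + 1) * math.pi * t) for n in range(len(a)))
        xdot = sum(a[n] * (n + 1) * math.pi * math.cos((n + 1) * math.pi * t)
                   for n in range(len(a)))
        w = 0.5 if k in (0, M) else 1.0         # trapezoid weights
        total += w * (xdot ** 2 - 2 * x * math.sin(math.pi * t)) / M
    return total

a = [0.0, 0.0, 0.0]                             # coefficients a_1, a_2, a_3
h, lr = 1e-6, 0.01
for _ in range(200):                            # gradient descent on J(a_1,...,a_N)
    for n in range(len(a)):
        a[n] += h
        jp = J(a)
        a[n] -= 2 * h
        jm = J(a)
        a[n] += h
        a[n] -= lr * (jp - jm) / (2 * h)
```

Since the chosen basis happens to diagonalize this particular functional, essentially all the work is done by the first coefficient, which tends to $1/ \pi ^ {2}$.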

## The method of variations.

A second direction is the study of necessary and sufficient conditions to be satisfied by a function $x( t)$ realizing the extremum of the functional $J( x)$. The principal method for obtaining necessary conditions is the method of variations. Suppose that some function $x( t)$ has been constructed in one way or another. How can one test whether or not this function is a solution of the variational problem (3)? An answer to this question was first given by Euler in 1744. The answer as formulated below involves the concept, introduced by Lagrange in 1762, of the variation $\delta J$ of the functional $J$ (hence the name "variational calculus"; cf. Variation; Variation of a functional).

For the simplest problem of variational calculus this variation is defined as:

$$\delta J ( x, h) = \ \int\limits _ { t _ {0} } ^ { {t _ 1 } } \left . \left ( \frac{\partial L }{\partial x } - { \frac{d}{dt} } \frac{\partial L }{\partial \dot{x} } \right ) \right | _ {x ( t) } h ( t) dt,$$

where $h( t)$ is an arbitrary smooth function satisfying the conditions $h( t _ {0} ) = h( t _ {1} ) = 0$. The condition $\delta J = 0$ is necessary for the function $x( t)$ to realize an extremum of the functional (3). Hence — and also from the expression for the variation $\delta J$ — one may conclude that, for the function $x( t)$ to furnish an extremum of (3), it must satisfy the following second-order differential equation:

$$\tag{4 } \frac{\partial L }{\partial x } - { \frac{d}{dt} } \frac{\partial L }{\partial \dot{x} } = 0.$$

The above equation is known as the Euler equation; the integral curves of this equation are said to be the extremals of the variational problem under consideration. A function $x( t)$ for which $J( x)$ attains an extremum necessarily represents a solution of the boundary value problem $x( t _ {0} ) = x _ {0}$, $x( t _ {1} ) = x _ {1}$ for equation (4). One has thus obtained a second method for solving the extremal problem: first solve the boundary value problem for the Euler equation (in regular cases the number of solutions is finite), and then test supplementary conditions on each solution obtained, in order to single out the curves which solve the initial problem. However, a significant drawback of this method is that there are no universal methods for solving boundary value problems for (non-linear) ordinary differential equations.
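For a concrete integrand, the boundary value problem for equation (4) can nevertheless be attacked numerically, e.g. by shooting on the unknown initial slope. A minimal sketch (assuming, for illustration, $L( t, x, \dot{x} ) = \dot{x} {} ^ {2} + x ^ {2}$, whose Euler equation reduces to $\ddot{x} = x$; with $x( 0) = 0$, $x( 1) = 1$ the exact solution is $\sinh t / \sinh 1$, so the true initial slope is $1/ \sinh 1$):

```python
import math

# A minimal shooting-method sketch for the boundary value problem of the
# Euler equation (assumption: L(t, x, xdot) = xdot**2 + x**2, whose Euler
# equation (4) reduces to x'' = x; with x(0) = 0, x(1) = 1 the exact
# solution is sinh(t)/sinh(1), so the true initial slope is 1/sinh(1)).

def x_at_1(slope, n=1000):
    """RK4 integration of x' = v, v' = x on [0, 1] with x(0) = 0, v(0) = slope."""
    h = 1.0 / n
    x, v = 0.0, slope
    for _ in range(n):
        k1x, k1v = v, x
        k2x, k2v = v + h / 2 * k1v, x + h / 2 * k1x
        k3x, k3v = v + h / 2 * k2v, x + h / 2 * k2x
        k4x, k4v = v + h * k3v, x + h * k3x
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return x

# Bisect on the unknown initial slope until the boundary condition x(1) = 1 holds.
lo, hi = 0.0, 2.0
for _ in range(60):
    mid = (lo + hi) / 2
    if x_at_1(mid) < 1.0:
        lo = mid
    else:
        hi = mid
slope = (lo + hi) / 2
```

Bisection works here because $x( 1)$ depends monotonically on the initial slope; for genuinely non-linear Euler equations this is precisely where the difficulties mentioned above appear.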

Variational problems with mobile ends are very often encountered. For instance, in the simplest problems, the points $x( t _ {0} )$ and $x( t _ {1} )$ may move along given curves. In problems with mobile ends the condition $\delta J = 0$ implies supplementary conditions to be satisfied by the mobile ends — the so-called transversality condition which, in conjunction with the boundary conditions, yields a closed system of conditions for the boundary value problem.

The principal results concerning the simplest problem of variational calculus are applied to the general case of functionals of the type

$$J ( x) = \ \int\limits _ { t _ {0} } ^ { {t _ 1 } } F \left ( x ( t), { \frac{dx}{dt} }, \dots, \frac{d ^ {s} x }{dt ^ {s} } \right ) dt,$$

where $x( t)$ is a vector function of arbitrary dimension [3].

## Lagrange's problem.

Euler and Lagrange also studied problems on a conditional extremum. The simplest class of problems of this type is the class of so-called isoperimetric problems (cf. Isoperimetric problem). For the case of one-dimensional $t$, Lagrange stated the class of problems (1) and (2), and obtained an analogue of Euler's equations, involving the so-called Lagrange multipliers for these problems. Such an analogue may also be obtained for the most general case of problems (1) and (2). The Lagrange problem assumed special importance in the mid-20th century in connection with the creation of the mathematical theory of optimal control (cf. Optimal control, mathematical theory of). Below the main results concerning the Lagrange problem are given, in terms of this theory. These were obtained by L.S. Pontryagin and his school.

Consider the following case: In problems (1) and (2), $t$ is one-dimensional and the system $\phi ( t, x, \dot{x} ) = 0$ may be solved for some of the components of $\dot{x}$. The resulting problem is that of minimizing the functional

$$\tag{5 } J ( x, u) = \int\limits _ { t _ {0} } ^ { {t _ 1 } } F ( t, x, u) dt$$

under the differential constraint

$$\tag{6 } \dot{x} = f ( t, x, u)$$

and the boundary conditions

$$\tag{7 } ( x ( t _ {0} ), x ( t _ {1} )) \in E.$$

In equations (5)–(7), $x = ( x ^ {1}, \dots, x ^ {n} )$ is a vector function known as the phase vector, $u = ( u ^ {1}, \dots, u ^ {m} )$ is a vector function known as the control, $F: \mathbf R \times {\mathbf R ^ {n} } \times {\mathbf R ^ {m} } \rightarrow \mathbf R$, $f: \mathbf R \times {\mathbf R ^ {n} } \times {\mathbf R ^ {m} } \rightarrow {\mathbf R ^ {n} }$, $E \subset {\mathbf R ^ {2n} }$.

The fixed conditions of problem (3) may serve as an example of boundary conditions of the type (7). In optimal control problems certain "non-classical" conditions such as

$$\tag{8 } u ( t) \in U \subset \mathbf R ^ {m}$$

are imposed in addition to conditions (6) and (7).

## Weak and strong extrema.

Two topologies are usually distinguished in variational calculus — a strong and a weak topology and, correspondingly, one defines strong and weak extrema. For instance, as applied to problem (3) one says that the curve ${x _ {0} } ( t)$ realizes a weak minimum if it is possible to find an $\epsilon > 0$ such that $J( x) \geq J ( {x _ {0} } )$ for all continuously-differentiable functions $x( t)$ satisfying the conditions $x( {t _ {0} } ) = {x _ {0} } ( {t _ {0} } )$, $x( {t _ {1} } ) = {x _ {0} } ( {t _ {1} } )$ and

$$\max _ {t \in [ t _ {0} , t _ {1} ] } | x ( t) - x _ {0} ( t) | + \max _ {t \in [ t _ {0} , t _ {1} ] } \ | \dot{x} ( t) - \dot{x} _ {0} ( t) | < \epsilon .$$

In other words, this fixes the proximity not only of the phase variables, but also of the speeds (controls). One says that a function gives a strong extremum if it is possible to find an $\epsilon > 0$ such that $J( x) \geq J( x _ {0} )$ for all permissible absolutely-continuous functions $x( t)$ (for which $J( x)$ exists) satisfying the conditions $x( t _ {0} ) = {x _ {0} } ( t _ {0} )$, $x( t _ {1} ) = {x _ {0} } ( {t _ {1} } )$ and

$$\max _ {t \in [ t _ {0} , t _ {1} ] } \ | x ( t) - x _ {0} ( t) | \leq \epsilon .$$

This condition expresses proximity of the phase variables only.

If $x _ {0} ( t)$ realizes a strong extremum, it realizes a fortiori a weak extremum as well; accordingly, conditions sufficient for a strong extremum are also sufficient for a weak one. Conversely, if a weak extremum is absent, so is a strong one, i.e. necessary conditions for a weak extremum are also necessary for a strong extremum.

## Necessary and sufficient conditions for an extremum.

Euler's equation, which was discussed above, is a necessary condition for a weak extremum. In the late 1950s, Pontryagin formulated a maximum principle for the problem (5)–(8), which is a necessary condition for a strong extremum (cf. Pontryagin maximum principle). This maximum principle states that if a pair $( x, u)$ supplies a strong extremum in the problem (5)–(8), there exist a vector function $\psi$ and a number $\lambda _ {0} \geq 0$, not both identically zero, such that the relations

$$\tag{9 } \left . \begin{array}{c} \dot x = \frac{\partial H}{\partial \psi}, \qquad -\dot\psi = \frac{\partial H}{\partial x} , \\ \max_{u\in U} H(t, x(t), \psi(t), u, \lambda_0) = H(t, x(t), \psi(t), u(t), \lambda_0) \end{array} \right\}$$

are satisfied for the Hamilton function $H( t, x, \psi , u , {\lambda _ {0} } ) = ( \psi , f ) - {\lambda _ {0} } F$.
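As a minimal illustration of the maximum principle (an example chosen here for its simplicity, not drawn from the text above), consider minimizing

$$J ( x, u) = \int\limits _ { 0 } ^ { 1 } \frac{u ^ {2} }{2} dt ,\ \ \dot{x} = u ,\ \ x ( 0) = 0 ,\ x ( 1) = 1 ,\ U = \mathbf R .$$

Here $H = \psi u - \lambda _ {0} u ^ {2} /2$. Taking $\lambda _ {0} = 1$, the adjoint equation $- \dot\psi = \partial H / \partial x = 0$ makes $\psi$ constant, and the maximum condition gives $u = \psi$, so the optimal control is constant; the boundary conditions then force $u \equiv 1$, i.e. the optimal trajectory is $x ( t) = t$.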

If Pontryagin's maximum principle is applied to the problem (3), it follows that a necessary condition for a curve $x( t)$ to yield a strong minimum in the problem (3) is that it be an extremal (i.e. satisfy Euler's equation (4)) and satisfy the necessary Weierstrass condition (cf. Weierstrass conditions (for a variational extremum))

$$\tag{10 } {\mathcal E} ( t, x ( t), \dot{x} ( t), \xi ) \geq 0 \ \ \textrm{ for } \textrm{ all } t \in [ t _ {0} , t _ {1} ],\ \xi \in \mathbf R ,$$

where

$${\mathcal E} ( t, x, \dot{x} , \xi ) = \ L ( t, x, \xi ) - L ( t, x, \dot{x} ) - ( \xi - \dot{x} ) L _ {\dot{x} } ( t, x, \dot{x} )$$

is the so-called Weierstrass function.
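For example (a simple illustrative case), for $L = \dot{x} {} ^ {2}$ one finds

$${\mathcal E} ( t, x, \dot{x} , \xi ) = \xi ^ {2} - \dot{x} {} ^ {2} - 2 \dot{x} ( \xi - \dot{x} ) = ( \xi - \dot{x} ) ^ {2} \geq 0 ,$$

so condition (10) holds along every extremal, in agreement with the fact that for this integrand every straight line yields a strong minimum.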

In addition to conditions of the type (4) and (10), which are local (i.e. which can be verified at each point of the extremal), there is also a global necessary condition, related to the behaviour of the set of extremals in a neighbourhood of the given extremal (cf. Jacobi condition). For problem (3) Jacobi's condition may be formulated as follows. For the extremal $x( t)$ to supply a minimum in the problem (3), it is necessary that the solution of the Jacobi equation

$$\tag{11 } - { \frac{d}{dt} } \left ( \left . \frac{\partial ^ {2} L }{\partial {\dot{x} } ^ {2} } \right | _ {x ( t) } { \frac{d}{dt} } h ( t) \right ) + \left . \left ( \frac{\partial ^ {2} L }{\partial x ^ {2} } - { \frac{d}{dt} } \frac{\partial ^ {2} L }{\partial x \partial \dot{x} } \right ) \right | _ {x ( t) } h ( t) = 0$$

with the boundary conditions $h( t _ {0} ) = 0$, ${\dot{h} } ( t _ {0} ) \neq 0$, does not have zeros in the interval $( t _ {0} , t _ {1} )$. The zeros of the solution $h( t)$ of equation (11) are said to be points conjugate with the point $t _ {0}$. Thus, Jacobi's condition means that the interval $( t _ {0} , t _ {1} )$ does not contain points which are conjugate with $t _ {0}$.
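Conjugate points are easy to locate numerically once the Jacobi equation is written down. A minimal sketch (assuming, for illustration, $L( t, x, \dot{x} ) = \dot{x} {} ^ {2} - x ^ {2}$, for which equation (11) reduces to $\ddot{h} + h = 0$, so that $h( t) = \sin t$ and the first point conjugate to $t _ {0} = 0$ is $t = \pi$):

```python
import math

# A minimal numerical sketch of locating a conjugate point (assumption:
# L(t, x, xdot) = xdot**2 - x**2, for which the Jacobi equation (11)
# reduces to h'' + h = 0; then h(t) = sin(t), and the first point
# conjugate to t0 = 0 is t = pi).

def first_conjugate_point(t_max=4.0, n=4000):
    """Integrate h' = v, v' = -h by RK4 and return the first zero of h."""
    dt = t_max / n
    h, v, t = 0.0, 1.0, 0.0                     # h(0) = 0, h'(0) = 1
    for _ in range(n):
        k1 = (v, -h)
        k2 = (v + dt / 2 * k1[1], -(h + dt / 2 * k1[0]))
        k3 = (v + dt / 2 * k2[1], -(h + dt / 2 * k2[0]))
        k4 = (v + dt * k3[1], -(h + dt * k3[0]))
        h_new = h + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        v_new = v + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        if h > 0 and h_new <= 0:                # sign change: zero crossed
            return t + dt * h / (h - h_new)     # linear interpolation
        h, v, t = h_new, v_new, t + dt
    return None

tc = first_conjugate_point()
```

Jacobi's condition thus fails for extremals of this integrand on any interval longer than $\pi$.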

Necessary conditions for a weak minimum, $\delta J = 0$, $\delta ^ {2} J \geq 0$, are strict analogues of the minimum conditions $f ^ { \prime } ( x) = 0$, $f ^ { \prime\prime } ( x) \geq 0$ for functions of one variable. If the strong Legendre condition is met, the Jacobi condition is a necessary condition for the second variation to be non-negative. This leads to the following result: For a function $x( t)$ to realize a weak minimum of the functional (3) it is necessary: a) that $x( t)$ satisfies Euler's equation; b) that the Legendre condition $\left . ( \partial ^ {2} L / \partial \dot{x} ^ {2} ) \right | _ {x ( t) } \geq 0$ is satisfied; and c) that, if the strong Legendre condition is satisfied, the interval $( t _ {0} , t _ {1} )$ does not contain points conjugate with $t _ {0}$.

Sufficient conditions for a weak minimum are as follows: The function $x( t)$ must be an extremal on which the strong Legendre condition is met, and the semi-interval $( t _ {0} , t _ {1} ]$ must not contain points conjugate with $t _ {0}$. For a curve $x( t)$ to yield a strong minimum it is sufficient that the sufficient Weierstrass condition, as well as the sufficient conditions for a weak minimum formulated above, be satisfied.

## Problems in optimal control.

One of the principal directions in the development of the calculus of variations is that of non-classical problems much like the problem (5)–(8) formulated above. Problems of this kind have a major practical significance. For instance, let (6) describe the motion of some dynamic object, say a space ship. The control — the vector $u$— is the thrust of its motor. The initial location of the space ship is some orbit, while its final position is an orbit of different radius. The functional $J$ describes the fuel consumption involved in the performance of such a maneuver. The problem (5)–(7) may then be applied to this situation as follows: Determine the law governing the variation of the thrust exerted by the motor of the space ship required to perform the transition from one orbit to the other within a given period of time so as to minimize the fuel consumption. This must be done subject to the control constraints: the thrust of the motor must not exceed a certain given value; the turning angle is also bounded. Thus, the components of the thrust, $u ^ {i}$, $i = 1, 2, 3$, are in this case subject to the constraints

$$a _ {i} ^ {-} \leq u ^ {i} \leq a _ {i} ^ {+} ,$$

where $a _ {i} ^ {-}$ and $a _ {i} ^ {+}$ are given numbers.

A large number of problems can be reduced to the Lagrange problem subject to a supplementary restriction of the type (8). Such problems are known as problems of optimal control. The theory of optimal control required a special apparatus of its own; Pontryagin's maximum principle proved to be such an apparatus.

Another approach to these problems in optimal control theory is also possible. Let $S( t, x)$ be the value of the functional (5) along an optimal solution from a point $( t _ {0} , x _ {0} )$ to a point $( t, x)$. For the function $u( t)$ to be an optimal control in such a case it is necessary (and also sufficient in certain cases) that the partial differential equation

$$\frac{\partial S }{\partial t } + \min _ {u \in U } \left [ \left ( \frac{\partial S }{\partial x } , f ( t, x, u) \right ) + F ( t, x, u) \right ] = 0 ,$$

known as Bellman's equation (cf. Dynamic programming), holds. In problems in classical variational calculus the function $S( t, x)$ (the action integral) must satisfy the Hamilton–Jacobi equation

$$\frac{\partial S }{\partial t } + H \left ( t, x, \frac{\partial S }{\partial x } \right ) = 0,$$

where $H$ is the Hamilton function. In problem (3) this function is the Legendre transform with respect to $\dot{x}$ of the integrand $L( t, x, {\dot{x} } )$. The Hamilton–Jacobi theory is a powerful tool in the study of numerous variational problems connected with classical mechanics.
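In discrete time, Bellman's equation turns into a backward recursion that can be evaluated directly. A minimal sketch (assuming the illustrative problem of minimizing $\int _ {0} ^ {1} u ^ {2} dt$ subject to $\dot{x} = u$, $x( 0) = 0$, $x( 1) = 1$, discretized on a grid in $( t, x)$; the exact optimal cost is $1$, attained by the constant control $u \equiv 1$):

```python
# A minimal sketch of Bellman's backward recursion (assumptions: the
# illustrative problem of minimizing integral_0^1 u**2 dt subject to
# xdot = u, x(0) = 0, x(1) = 1, discretized as
# S_k(x) = min_u [u**2 * dt + S_{k+1}(x + u*dt)] on a grid;
# the exact optimal cost is 1, attained by the constant control u = 1).

Nt, Nx = 10, 21
dt = 1.0 / Nt
grid = [i / (Nx - 1) for i in range(Nx)]         # x-grid on [0, 1]

INF = float("inf")
# Terminal condition: zero cost at the target x(1) = 1, infinite elsewhere.
S = [0.0 if i == Nx - 1 else INF for i in range(Nx)]

for _ in range(Nt):                              # backward in time
    S_new = []
    for x in grid:
        best = INF
        for j, x2 in enumerate(grid):            # control u = (x2 - x)/dt
            if S[j] < INF:
                u = (x2 - x) / dt
                best = min(best, u * u * dt + S[j])
        S_new.append(best)
    S = S_new

optimal_cost = S[0]                              # value S(0, x = 0)
```

The recursion recovers the optimal cost $1$; by convexity of $u ^ {2}$, equal displacements at every step are optimal, which the grid happens to represent exactly.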

The connection between variational calculus and the theory of partial differential equations was discovered as early as the 19th century. It was shown by P.G.L. Dirichlet that solving boundary value problems for the Laplace equation is equivalent to solving some variational problem. Consider, for example, a given linear operator equation

$$\tag{12 } A x = f,$$

where $x( \xi , \eta )$ is some function of two independent variables which vanishes on a closed curve $\Gamma$. Subject to assumptions which are natural in a certain class of physical problems, the problem of finding the solution of equation (12) is equivalent to finding the minimum of the functional

$$\tag{13 } J ( x) = {\int\limits \int\limits } _ \Omega ( A x) x \, d \xi d \eta - 2 {\int\limits \int\limits } _ \Omega fx \, d \xi d \eta ,$$

where $\Omega$ is the domain bounded by the curve $\Gamma$. Equation (12) is in this case the Euler equation for the functional (13).

The reduction of problem (12) to (13) is possible if, for example, $A$ is a positive-definite self-adjoint operator. The connection between problems involving partial differential equations and variational problems makes it possible, in particular, to establish the truth of various existence and uniqueness theorems; it played an important part in the crystallization of the concept of a generalized solution. Such a reduction is very important in numerical mathematics as well, since direct methods of variational calculus can be employed to solve boundary value problems in the theory of partial differential equations.
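In the model case of the Laplace operator (a standard illustration, under the positivity and self-adjointness assumptions just stated), take $A = - \Delta$ with $x \mid _ \Gamma = 0$. Integration by parts gives

$${\int\limits \int\limits } _ \Omega ( A x) x \, d \xi d \eta = {\int\limits \int\limits } _ \Omega \left ( \left ( \frac{\partial x }{\partial \xi } \right ) ^ {2} + \left ( \frac{\partial x }{\partial \eta } \right ) ^ {2} \right ) d \xi d \eta ,$$

so that (13) becomes the Dirichlet integral diminished by the term $2 {\int\limits \int\limits } _ \Omega fx \, d \xi d \eta$, and its Euler equation is exactly $- \Delta x = f$.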

## Qualitative methods.

These methods make it possible to solve problems on the existence and uniqueness of solutions, as well as on the qualitative features of (families of) extremals. It was established in the 20th century that the number of solutions of a variational problem depends on the properties of the space on which the functional is defined. For instance, if the functional $J$ is defined on all possible smooth curves on a torus which connect two given points, or on all possible closed curves on a surface which is topologically equivalent to a torus, the number of critical elements — curves on which the variation $\delta J = 0$ — is infinite in both cases. L.A. Lyusternik and L.G. Shnirel'man [7] showed that on every surface which is topologically equivalent to a sphere there exist at least three closed non-self-intersecting geodesics of different lengths; if the lengths of only two of these geodesics are equal, there exists an infinite number of closed geodesics of equal length. Such problems indicate a close connection between variational calculus and the qualitative theory of differential equations and topology. The development of functional analysis made a substantial contribution to the study of qualitative methods. See also Variational calculus in the large.

## Connection between variational calculus and the theory of cones.

The scope of problems studied in variational calculus keeps increasing. In particular, there is much interest in functionals $J( x)$ of a very general type defined on sets $G _ {k}$ of elements of normed spaces. The concept of variation is difficult to introduce into problems of this kind, and another kind of apparatus has to be utilized. This proved to be the theory of cones in Banach spaces. Consider, for example, the problem of minimizing $f( x)$, where $x$ is an element of a closed set $G$. The cone ${\Gamma _ {G} } ( x _ {0} )$ is the set of non-zero vectors $e$ to each of which corresponds a positive number $\lambda _ {e} ^ {*}$ such that $x _ {0} + \lambda e \in G$ for all $\lambda \in ( 0, {\lambda _ {e} ^ {*} } )$. The cone ${\Gamma _ {f} } ( x _ {0} )$ is the set of non-zero vectors $e$ to each of which corresponds a positive $\lambda _ {e} ^ {*}$ such that

$$f ( x _ {0} + \lambda e) < f ( x _ {0} )$$

for all $\lambda \in ( 0, {\lambda _ {e} ^ {*} } )$. For $x _ {0}$ to realize the minimum of $f( x)$, the intersection of the cones ${\Gamma _ {G} } ( x _ {0} )$ and ${\Gamma _ {f} } ( x _ {0} )$ must be empty. This condition is just as elementary as that of vanishing of the variation, but not all the results which follow from it can be obtained by classical methods of variational calculus. It makes it possible to tackle much more complicated problems, such as studies on extremal values of non-differentiable functionals [6].

#### References

[1] V.I. Smirnov, "A course of higher mathematics", 4, Addison-Wesley (1964) (Translated from Russian) MR0182690 MR0182688 MR0182687 MR0177069 MR0168707 Zbl 0122.29703 Zbl 0121.25904 Zbl 0118.28402 Zbl 0117.03404

[2] M.A. Lavrent'ev, L.A. Lyusternik, "A course in variational calculus", Moscow-Leningrad (1950) (In Russian)

[3] G.A. Bliss, "Lectures on the calculus of variations", Chicago Univ. Press (1947) MR0017881 Zbl 0036.34401

[4] S.G. Mikhlin, "Variationsmethoden der mathematischen Physik", Akademie Verlag (1962) (Translated from Russian) MR0141248 Zbl 0098.36909

[5] L.S. Pontryagin, V.G. Boltyanskii, R.V. Gamkrelidze, E.F. Mishchenko, "The mathematical theory of optimal processes", Wiley (1962) (Translated from Russian) MR0166036 MR0166037 MR0166038 Zbl 0102.32001

[6] B.N. Pshenichnyi, "Necessary conditions for an extremum", Interscience (1962) (Translated from Russian)

[7] L.A. Lyusternik, L.G. Shnirel'man, "Méthodes topologiques dans les problèmes variationnels", Hermann (1934) (Translated from Russian)