# Optimal synthesis control


A solution of a problem in the mathematical theory of optimal control (cf. Optimal control, mathematical theory of), consisting of a synthesis of an optimal control (a feedback synthesis) in the form of a control strategy (a feedback principle), as a function of the current state (position) of a process (see ). The value of the control is defined not only by the current time, but also by the admissible values of the current parameters. In this way the introduction of a positional strategy makes possible an a posteriori realization of a control $u$, corrected on the basis of supplementary information obtained during the process.

The simplest synthesis problem, for example, for a system

$$\tag{1 } \dot{x} = f( t, x, u),\ \ t _ {0} \leq t \leq t _ {1} ,\ \ x \in \mathbf R ^ {n} ,\ u \in \mathbf R ^ {p} ,$$

with constraints

$$\tag{2 } u \in U \subseteq \mathbf R ^ {p} \ \textrm{ or } \ \ \psi ( u) \leq 0,\ \psi : \mathbf R ^ {p} \rightarrow \mathbf R ^ {k} ,$$

and a given "terminal" criterion

$$I( x( \cdot ), u( \cdot )) = \phi ( t _ {1} , x( t _ {1} )),\ \ \phi : \mathbf R ^ {n+1} \rightarrow \mathbf R ^ {1} ,$$

assumes that a solution $u ^ {0}$ is being sought to minimize the functional $I( x( \cdot ), u( \cdot ))$ among the functions of the form $u( t, x)$ for an arbitrary initial position $\{ \tau , x \}$. The natural course is to find for every pair $\{ \tau , x \}$ a solution of the corresponding problem of constructing an optimal programming control

$$u ^ {0} [ t \mid \tau , x],\ \ x = x( \tau ),\ \ \tau \leq t \leq t _ {1} ,$$

as a minimum of that same functional $I( x( \cdot ), u( \cdot ))$ and with those same constraints. It is further supposed that

$$u ^ {0} ( t, x) = u ^ {0} [ t \mid t, x] ;$$

if the function $u ^ {0} ( t, x)$ is correctly defined, while the equation

$$\tag{3 } \dot{x} = f( t, x, u ^ {0} ( t, x)),\ \ x( \tau ) = x,\ \ \tau \leq t \leq t _ {1} ,$$

has a unique solution, then the synthesis problem can be solved; moreover, the optimal values of $I$ found in the classes of programming and synthesis controls coincide (in general, conditions prevail which ensure the existence in a specific sense of the solutions of equation (3), and conditions also prevail which guarantee the optimality of all the trajectories of this equation).

The synthesized function $u ^ {0} ( t, x)$, being an optimal synthesis control, leads to an optimal solution for the minimum of the functional $I$ in the problem of optimal control for any initial position $\{ \tau , x \}$. This is in contrast to an optimal programming control, which in general depends on the fixed starting point $\{ t _ {0} , x ^ {0} \}$ of the process. The solution of an optimal control problem in the form of an optimal synthesis control has many applications, especially in practical procedures for implementing the optimal control in the presence of limited information or perturbations in the dynamics. In these situations a synthesis control is preferable to a programming control.

The search for $u ^ {0} ( t, x)$ in the form of a function of the current state is immediately linked to dynamic programming (see ). The return function (Bellman function, value function) $V( \tau , x)$, being introduced as a minimum (maximum) of a quantity to be optimized (for example, the functional

$$\tag{4 } J( x( \cdot ), u( \cdot )) = \ \int\limits _ \tau ^ { {t _ 1} } f ^ { 0 } ( t, x, u) dt + \phi ( t _ {1} , x( t _ {1} ))$$

for the system (1) if $x( \tau ) = x, t \in [ \tau , t _ {1} ]$), must satisfy the Bellman equation with boundary conditions depending on the aim of the control and $J$. For the system (1), (2) and (4), this equation takes the form

$$\tag{5 } \frac{\partial V }{\partial t } + H \left ( t, x, \frac{\partial V }{\partial x } \right ) = 0,\ \ V( t _ {1} , x) = \phi ( t _ {1} , x),$$

where

$$\tag{6 } H \left ( t, x, \frac{\partial V }{\partial x } \right ) = \min \left \{ \left ( \frac{\partial V }{\partial x } , f( t, x, u) \right ) + f ^ { 0 } ( t, x, u) : u \in U \right \}$$

is the Hamilton function. This equation is connected with the equations figuring in the conditions of the Pontryagin maximum principle, in the same way as the Hamilton–Jacobi equation for a return function is linked in analytical mechanics to the ordinary Hamiltonian differential equations (see Variational principles of classical mechanics).

The derivation of equations (5) for the synthesis problem relies on the optimality principle asserting that a section of an optimal trajectory is also an optimal trajectory (see ). The viability of this approach depends on the correct definition of the informational properties of the process, particularly on the concept of position (current state, see ).

In the time-optimal control problem (regarding the minimal time $T( x)$ for a trajectory of an autonomous system (1) to hit a set $M$, starting from a position $x$), the function $V( \tau , x) = V( x)$ can be considered as a particular kind of potential $V( x) = T( x)$ with respect to $M$. The choice of an optimal control $u ^ {0} ( t, x)$ from conditions (5), (6) now has the form

$$\min \left \{ {\left ( \frac{\partial V }{\partial x } , f( x, u) \right ) } : {u \in U } \right \} = - 1,$$

$$V( x) = 0 \ \textrm{ when } x \in M,$$

which means that $u ^ {0} ( t, x) = u ^ {0} ( x)$ realizes the descent of the optimal trajectory $x ^ {0} ( t)$ relative to the level surfaces of the function $V( x)$ by the fastest method permitted by the condition $u \in U$.
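As a minimal worked illustration (an assumed example, not taken from the source), consider the scalar system $\dot{x} = u$ with $U = \{ u : | u | \leq 1 \}$ and target set $M = \{ 0 \}$. The minimal time is $T( x) = | x |$, and for $V( x) = | x |$, $x \neq 0$, one checks directly that

$$\min \left \{ \frac{d V }{d x } u : | u | \leq 1 \right \} = - \left | \frac{d V }{d x } \right | = - 1,\ \ V( 0) = 0,$$

with the minimum attained at $u ^ {0} ( x) = - \mathop{\rm sign} x$. In this sketch the control always points down the level sets of $V$ at the maximal admissible speed; the only point where $V$ fails to be differentiable is $x = 0$, which belongs to $M$.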

The use of the method of dynamic programming (as a sufficient condition of optimality) will be rigorous if the function $V( \tau , x)$ satisfies certain smoothness conditions everywhere (for example, in problems (3)–(6) the function $V( \tau , x)$ must be continuously differentiable) or if smoothness conditions are satisfied everywhere with the exception of a "special" set $N$. When certain special "conditions of regular synthesis" are fulfilled, the method of dynamic programming is equivalent to Pontryagin's principle, which is then seen to be a necessary and sufficient condition for optimality (see ). Difficulties connected with the a priori verification of the applicability of the method of dynamic programming and with the need to solve the Bellman equation complicate the use of this method. The method of dynamic programming has been extended to problems of optimal synthesis control for discrete (multi-stage) systems, where the corresponding Bellman equation is a finite-difference equation (see , ).
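The finite-difference Bellman equation for a multi-stage system can be sketched as a backward recursion. The example below is a minimal illustration under assumed data: the integer state grid, the three-valued control set, the dynamics `step` and the quadratic costs are all hypothetical choices, not taken from the source. It builds the value function and a feedback table `policy[k][x]`, and includes a brute-force search over open-loop (programming) controls for comparison.

```python
from itertools import product

# Hypothetical problem data (assumptions made for illustration only).
X_MIN, X_MAX = -3, 3
STATES = range(X_MIN, X_MAX + 1)   # finite state grid
CONTROLS = (-1, 0, 1)              # admissible set U
HORIZON = 4                        # number of stages


def step(x, u):
    """Dynamics x_{k+1} = f(x_k, u_k), clipped so the state stays on the grid."""
    return max(X_MIN, min(X_MAX, x + u))


def stage_cost(x, u):
    """Running cost f^0(x, u)."""
    return x * x + u * u


def terminal_cost(x):
    """Terminal cost phi(x)."""
    return x * x


def solve_bellman():
    """Backward recursion V_k(x) = min_u [f^0(x, u) + V_{k+1}(f(x, u))]."""
    V = [dict() for _ in range(HORIZON + 1)]
    policy = [dict() for _ in range(HORIZON)]
    for x in STATES:
        V[HORIZON][x] = terminal_cost(x)
    for k in range(HORIZON - 1, -1, -1):
        for x in STATES:
            best_u = min(
                CONTROLS,
                key=lambda u: stage_cost(x, u) + V[k + 1][step(x, u)],
            )
            policy[k][x] = best_u
            V[k][x] = stage_cost(x, best_u) + V[k + 1][step(x, best_u)]
    return V, policy


def closed_loop_cost(policy, x):
    """Cost obtained by applying the feedback table from the state x at stage 0."""
    total = 0
    for k in range(HORIZON):
        u = policy[k][x]
        total += stage_cost(x, u)
        x = step(x, u)
    return total + terminal_cost(x)


def best_program_cost(x0):
    """Minimum cost over all open-loop control sequences, by exhaustive search."""
    best = None
    for seq in product(CONTROLS, repeat=HORIZON):
        x, total = x0, 0
        for u in seq:
            total += stage_cost(x, u)
            x = step(x, u)
        total += terminal_cost(x)
        best = total if best is None else min(best, total)
    return best


V, policy = solve_bellman()
```

In this deterministic setting the feedback table attains the same cost as the best open-loop sequence from every initial state, which is the coincidence of programming and synthesis optima mentioned earlier for problem (1).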

In problems of optimal control with differential constraints, the method of dynamic programming gives an effective solution of the synthesis problem in closed form for a class of problems which embraces linear systems with quadratic performance criterion (4) (the functions $f ^ { 0 }$, $\phi$ are positive-definite quadratic forms in $x, u$ and in $x$, respectively). This problem is related to the analytic construction of an optimal regulator; if $\phi ( t, x) \equiv 0$, $t _ {1} = \infty$, it becomes a problem of optimal stabilization of the system (the property of asymptotic stability of the equilibrium position of a synthesized system follows directly from the existence of an admissible control) (see , ). The existence of a solution in the given instance is ensured by the property of stabilizability of the system (see ). For linear stationary and periodic systems it is equivalent to the property of controllability of the unstable modes of the system (see Optimal programming control).
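The closed-form character of the linear-quadratic case is easiest to see for a scalar system. The sketch below assumes the hypothetical model $\dot{x} = a x + b u$ with cost $\int _ {0} ^ \infty ( q x ^ {2} + r u ^ {2} ) dt$, $q > 0$, $r > 0$, $b \neq 0$ (these symbols and the function names are illustrative assumptions). Substituting $V( x) = p x ^ {2}$ into the stationary form of (5), (6) gives the quadratic equation $2 a p - b ^ {2} p ^ {2} / r + q = 0$, whose positive root yields the feedback $u ^ {0} ( x) = - k x$.

```python
import math


def scalar_lqr(a, b, q, r):
    """Positive root p of 2*a*p - (b*p)**2 / r + q = 0 and gain k, u = -k*x.

    Assumes a scalar system dx/dt = a*x + b*u with running cost q*x**2 + r*u**2.
    """
    if b == 0 or q <= 0 or r <= 0:
        raise ValueError("this sketch assumes b != 0, q > 0 and r > 0")
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    k = b * p / r
    return p, k


def riccati_residual(a, b, q, r, p):
    """Left-hand side of the scalar algebraic Riccati equation at p."""
    return 2 * a * p - (b * p) ** 2 / r + q


def closed_loop_rate(a, b, k):
    """Coefficient of the synthesized system dx/dt = (a - b*k)*x."""
    return a - b * k


# Open-loop unstable example (a > 0) with hypothetical coefficients.
p, k = scalar_lqr(a=1.0, b=1.0, q=3.0, r=1.0)
```

For any admissible coefficients the closed-loop coefficient equals $- \sqrt {a ^ {2} + b ^ {2} q / r }$, which is negative, so in this scalar sketch $V( x) = p x ^ {2}$ decreases along every trajectory of the synthesized system.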

The solution of the problem of optimal stabilization has shown that the corresponding Bellman function is at the same time the "optimal" Lyapunov function for the initial system with an obtained optimal control. Under these circumstances effective conditions of controllability have been obtained and a complete analogue of Lyapunov's theory of stability (in a first approximation and in critical cases) for problems of stabilization has been created which embraces ordinary quasi-linear and periodic systems and also delay-systems. In the latter case, the role of the Bellman function is played by "optimal" Lyapunov–Krasovskii functionals, given on the sections of the trajectory that correspond to the value of the delay in the system (see , ). The theory of linear, quadratic problems of optimal control is also well-developed for partial differential equations (see ).

In applied problems of optimal synthesis control it is not always possible to measure all phase coordinates of the system. The following problem of observation therefore arises, permitting numerous generalizations: Knowing the realization $y[ t] \in \mathbf R ^ {m}$ for an interval $\sigma \leq t \leq \theta$ of an accessible measurement of the function $y = g( t, x)$ in the coordinates of the system (1) (if $u( t)$ is known, for example, if $u( t) \equiv 0$ and $m \leq n$), find the vector $x( \theta ) \in \mathbf R ^ {n}$ at the given moment $\theta$. Systems which, through a unique realization $y[ t]$, allow one to establish $x( \theta )$, whatever its value, are called completely observable.

The property of complete observability, as well as the construction of the corresponding analytic operations that distinguish $x( \theta )$ and the optimization of these operations, have been well studied for linear systems. Here, a duality principle is known: For every problem of observation, a corresponding equivalent two-point boundary value control problem for the dual system can be established. A consequence of this is that the property of complete observability of a linear system coincides with the property of complete controllability of the dual system with control. Moreover, it turns out that the corresponding dual boundary value problems of optimal observation and optimal control can also be composed in such a way that their solutions coincide (see ). The properties of controllability and observability of linear systems have many generalizations to linear infinite-dimensional systems (equations in a Banach space, systems with deviating argument, partial differential equations). There are also a number of results characterizing the corresponding properties. For non-linear systems, only some local theorems on observability are known. Solutions of the problem of observation have found numerous applications in synthesis problems with incomplete information on coordinates, among them problems of optimal stabilization (see , , ).
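For a time-invariant linear system the duality between observability and controllability can be checked numerically with the standard rank criterion. The sketch below assumes the hypothetical model $\dot{x} = A x$, $y = C x$ and takes $\dot{z} = A ^ {T} z + C ^ {T} v$ as the dual system (sign conventions for the dual system vary, which does not affect the ranks). All function names are illustrative; exact rational arithmetic is used so that the rank is not affected by rounding.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                factor = M[i][c] / M[r][c]
                M[i] = [vi - factor * vr for vi, vr in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r


def observability_matrix(A, C):
    """Rows C, CA, ..., CA^(n-1) stacked vertically."""
    stacked, block = [], C
    for _ in range(len(A)):
        stacked.extend(block)
        block = matmul(block, A)
    return stacked


def controllability_matrix(A, B):
    """Blocks B, AB, ..., A^(n-1)B placed side by side."""
    blocks, block = [], B
    for _ in range(len(A)):
        blocks.append(block)
        block = matmul(A, block)
    return [sum((blk[i] for blk in blocks), []) for i in range(len(A))]


# Double integrator: the state is (position, velocity).
A = [[0, 1], [0, 0]]
C_position = [[1, 0]]   # measure the position only
C_velocity = [[0, 1]]   # measure the velocity only

rank_obs_position = rank(observability_matrix(A, C_position))
rank_obs_velocity = rank(observability_matrix(A, C_velocity))
rank_dual_position = rank(controllability_matrix(transpose(A), transpose(C_position)))
rank_dual_velocity = rank(controllability_matrix(transpose(A), transpose(C_velocity)))
```

In this example measuring the position gives full rank (the velocity can be recovered from the change of position), while measuring only the velocity does not, since the position never influences the output; the dual controllability ranks agree in both cases.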

The problem of synthesis control becomes especially interesting when information on the controls of the controllable process, the initial conditions and the current parameters is subject to uncertainty (perturbations). If the description of this uncertainty has a statistical character, the problems of optimal control are examined within the framework of the theory of stochastic optimal control. This theory, arising from the solution of stochastic problems, has been, to a very large degree, developed for systems of the form

$$\tag{7 } \dot{x} = f( t, x, u) + g( t, x, u) \eta ,\ \ x( t _ {0} ) = x ^ {0} ,$$

with random perturbations $\eta ( t)$ described by Gaussian diffusion processes or by more general classes of Markov processes (the initial vector is also usually taken random). In these circumstances, as a rule, it is assumed that certain probability characteristics of the variable $\eta$ are given (for example, information on the moments of the corresponding distributions or on the parameters of the stochastic equations describing the evolution of the process $\eta ( t)$).

Generally, the use of programming and synthesis controls gives essentially different values of the optimal performance indices $J$ (the roles of these indices can be played, for instance, by average estimates of non-negative functionals defined on trajectories of the process). A stochastic optimal synthesis control now has clear advantages, since the continuous measurement of coordinates of the system enables one to correct the movement with regard to the real course of the random process, which could not be predicted earlier. The method of dynamic programming, combined with the theory of generating operators for the Markov semi-groups associated with stochastic processes, has led to sufficient conditions for optimality. This has led to the solution of a number of problems of stochastic optimal control on finite or infinite time intervals, including those with complete and incomplete information on current coordinates, stochastic problems of pursuit, etc. It is essential that, for the principle of optimality to apply, the control $u$ be formed at every moment of time $t$ as a function of "sufficient coordinates" $z$ of the process which are known to have the Markov property (see , , , ).

It is in this way, in particular, that the theory of optimal stochastic stabilization has been developed, in conjunction with the corresponding Lyapunov theory of stability, for stochastic systems.

For the formulation of an optimal synthesis controller as well as for other aims of control, it is usual to evaluate the state of a stochastic system by means of measurements. The theory of stochastic filtering addresses this question under the condition that the measurement process is disturbed by probabilistic "noises". The most complete solutions known here are for linear systems with quadratic optimality criteria (the so-called Kalman–Bucy filter, see ). In applying this theory to the problem of stochastic optimal synthesis control, conditions have been developed which ensure the validity of the separation principle, allowing the problem of control to be solved independently of the problem of evaluating current positions on the basis of sufficient coordinates of the process (see ;  and  are dedicated to more general procedures of stochastic filtering, as well as to problems of a stochastic optimal control when the control itself is selected from the class of Markov diffusion processes).
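For a scalar linear model the Kalman–Bucy filter reduces to a single Riccati differential equation for the error variance. The sketch below assumes the hypothetical model $dx = a x dt + dw$, $dy = c x dt + dv$ with white-noise intensities $q$ and $r$ (these symbols, the function names and the explicit Euler integration are illustrative assumptions, not part of the source). The variance obeys $\dot{P} = 2 a P + q - c ^ {2} P ^ {2} / r$, which has the same algebraic form as the Riccati equation of the linear-quadratic regulator problem.

```python
import math


def variance_rate(a, c, q, r, p):
    """Right-hand side of the scalar filter Riccati equation."""
    return 2 * a * p + q - (c * p) ** 2 / r


def kalman_bucy_variance(a, c, q, r, p0, t_end, dt=1e-3):
    """Error variance P(t_end) from P(0) = p0, by explicit Euler steps."""
    p = p0
    for _ in range(int(round(t_end / dt))):
        p += dt * variance_rate(a, c, q, r, p)
    return p


def steady_state_variance(a, c, q, r):
    """Non-negative root of 2*a*P + q - (c*P)**2 / r = 0 (assumes c != 0, r > 0)."""
    return r * (a + math.sqrt(a * a + c * c * q / r)) / (c * c)


def filter_gain(c, r, p):
    """Gain L in the estimator d(xhat) = a*xhat*dt + L*(dy - c*xhat*dt)."""
    return p * c / r


# Hypothetical coefficients: stable plant, unit measurement channel.
p_inf = steady_state_variance(a=-1.0, c=1.0, q=3.0, r=1.0)
p_t = kalman_bucy_variance(a=-1.0, c=1.0, q=3.0, r=1.0, p0=5.0, t_end=10.0)
gain = filter_gain(c=1.0, r=1.0, p=p_inf)
```

Under the separation principle, the state estimate produced with this gain would be substituted for the unmeasured state in the feedback law; the sketch covers only the estimation half.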

A strictly formalized solution of the problem of stochastic optimal control is invariably coupled with the problem of a correct foundation for the existence questions of solutions for the corresponding stochastic differential equations. The latter circumstance generated specific difficulties in the solution of problems of stochastic optimal control when non-classical constraints are applied.

An interesting process of dynamic optimization arises in problems of optimal synthesis control under conditions of uncertainty (see Optimal programming control). Synthesis solutions generally permit improvement of the quality of the criteria of the process, as compared to programming solutions, which are none the less the result of statistical optimization (carried out, admittedly, in a space of dynamical systems and control functions). The concepts and methods of game theory are now used to obtain a solution of these problems.

Let there be given a system

$$\tag{8 } \dot{x} = f( t , x , u , w),\ \ t _ {0} \leq t \leq t _ {1} ,$$

with constraints

$$x ^ {0} = x( t _ {0} ) \in X ^ {0} \subseteq \mathbf R ^ {n} ,\ \ u \in U \subseteq \mathbf R ^ {p} ,\ \ w \in W \subseteq \mathbf R ^ {q} ,$$

on the initial vector $x ^ {0}$, the control $u$ and the disturbances $w$. Unlike the control $u$ that is to be determined, which can be regarded as representing a coalition of players, the disturbances $w$ are in this case treated as controls of an opponent player, and one is allowed to examine any strategies $w$ formed from any admissible information. Moreover, the aims of control can be formulated from the point of view of each one of the players separately. If the stated aims are contradictory, then the problem of conflicting control arises. Research into problems of synthesis control under conditions of conflict or uncertainty is the subject of the theory of differential games.

The process of forming an optimal synthesis control under conditions of uncertainty can also be complicated by incomplete information on the current state. So, in the system (8), only the results of indirect measurements of the phase vector $x$ and the realization $y[ t]$ of the function

$$\tag{9 } y( t) = g( t, x, \xi )$$

can be accessible; here the indefinite parameters $\xi$ are restricted by an a priori known constraint, $\xi \in {\mathsf E}$. The values $y[ t]$, $t _ {0} \leq t \leq \theta$ (for a given $u( t)$), make it possible to construct a region of information $X( \theta , y( \cdot ))$ of states of the system (8) in the phase space, along with the realization $y[ t]$, equation (9) and the restriction on $w, \xi$. Among the elements of $X( \theta , y( \cdot )) = X( \theta , \cdot )$ there will be also an unknown true state of the system (8), which can be estimated by choosing a point $x ^ \star ( \theta , \cdot )$ from $X( \theta , \cdot )$ (for example, the "centre of gravity" or the "Chebyshev centre" of $X( \theta , \cdot )$). The study of the evolution of the regions $X( \theta , \cdot )$ and the dynamics of the vectors $x ^ \star ( \theta , \cdot )$ is the purpose of the theory of minimax filtering. The most complete solutions are known for linear systems and convex constraints (see ).
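The evolution of the information regions is simplest to follow in a scalar discrete-time analogue, where each region is an interval and its Chebyshev centre is the midpoint. The sketch below assumes the hypothetical model $x _ {k+1} = a x _ {k} + w _ {k}$, $| w _ {k} | \leq w _ {\max }$, with measurements $y _ {k} = x _ {k} + \xi _ {k}$, $| \xi _ {k} | \leq \xi _ {\max }$; the discretization, the names and the numerical data are illustrative assumptions rather than the construction used in the source.

```python
def predict(interval, a, w_max):
    """Image of the interval under x -> a*x + w with |w| <= w_max."""
    lo, hi = interval
    ends = (a * lo, a * hi)
    return (min(ends) - w_max, max(ends) + w_max)


def correct(interval, y, xi_max):
    """Intersect with the set of states consistent with the measurement y."""
    lo, hi = interval
    new_lo, new_hi = max(lo, y - xi_max), min(hi, y + xi_max)
    if new_lo > new_hi:
        raise ValueError("measurement inconsistent with the a priori bounds")
    return (new_lo, new_hi)


def chebyshev_centre(interval):
    """Point minimizing the worst-case distance to the interval: its midpoint."""
    lo, hi = interval
    return 0.5 * (lo + hi)


def information_sets(x0_set, measurements, a, w_max, xi_max):
    """Information intervals obtained after each successive measurement."""
    current = x0_set
    history = []
    for y in measurements:
        current = correct(predict(current, a, w_max), y, xi_max)
        history.append(current)
    return history


# Hypothetical data: a = 0.5, |w| <= 0.1, |xi| <= 0.2, initial set [-1, 1].
# The measurements correspond to the true states 0.5, 0.15, 0.125.
sets = information_sets((-1.0, 1.0), [0.6, 0.0, 0.125], a=0.5, w_max=0.1, xi_max=0.2)
estimates = [chebyshev_centre(s) for s in sets]
```

By construction every interval contains the true state whenever the disturbances respect their bounds, and the estimation error of the midpoint never exceeds half the interval width.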

In general, the choice of a synthesis strategy of optimal control under conditions of uncertainty (for example, in the form of a functional $u = u( t, X( t, \cdot ))$) must aim at control of the evolution of the domains $X( \theta , \cdot )$ (i.e. the alteration of their configuration and their displacement in space), in accordance with prescribed criteria. For the problem shown, a number of general qualitative results are known, as well as constructive solutions in the class of special linear, convex problems (see , ). Furthermore, information containing measurements (for example, of the functions $y[ t]$ in the systems (8) and (9)) permits an a posteriori re-evaluation during the process of the domain of admissible values of the indefinite parameters, in the direction of narrowing it. In this way, the problem of identification of a mathematical model of a process (for example, the parameters $w$ in equation (8)) is solved at the same time. All that has been said enables one to treat the solutions of the problem of optimal synthesis control under conditions of uncertainty as a procedure of adaptive optimal control, in which a more precise definition of the properties of the model of the process is intertwined with the choice of the controls. The questions of identification of models of dynamical processes and of the problem of adaptive optimal control are studied in detail under the assumption of existence of a probabilistic description of the indefinite parameters (see , ).

If in problems of optimal synthesis control under conditions of uncertainty the parameters $w$, $\xi$ are treated as "controls" of a fictitious opponent player, then the aims of the controls $u$ and $\{ w, \xi \}$ can be different. The latter instance leads to a non-scalar quality criterion of the process. A consequence of this is that the corresponding problems can be considered within the framework of the concepts of equilibrium situations peculiar to the multi-criterion problems of the theory of cooperative games and their generalizations.

How to Cite This Entry:
Optimal synthesis control. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Optimal_synthesis_control&oldid=48057
This article was adapted from an original article by A.B. Kurzhanskii (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article