System identification

A branch of science concerned with the construction of mathematical models of dynamical systems from measured input/output data. The constructed models are mostly of finite-dimensional difference or differential equation form. The area has close connections with statistics and time-series analysis, and also offers a very wide spectrum of applications.

From a formal point of view, a system identification method is a mapping from sets of data to sets of models. An example of a simple model is the discrete-time ARX-model

\begin{equation} \tag{a1} y ( t ) + a _ { 1 } y ( t - 1 ) + \ldots + a _ { n } y ( t - n ) = \end{equation}

\begin{equation*} = b _ { 1 } u ( t - 1 ) + \ldots + b _ { m } u ( t - m ) + e ( t ), \end{equation*}

where $y$ and $u$ are the outputs and inputs, respectively, of the system and $e$ is a realization of a stochastic process (often assumed to be a sequence of independent random variables, cf. also Random variable). Another example is the continuous-time state-space model, described by the linear stochastic differential equation

\begin{equation} \tag{a2} \left\{ \begin{array} { l } { d x ( t ) = A x ( t ) d t + B u ( t ) d t + d w ( t ), } \\ { d y ( t ) = C x ( t ) d t + D u ( t ) d t + d v ( t ), } \end{array} \right. \end{equation}

where $x$ is the vector of (internal) state variables and $w$ and $v$ are Wiener processes (cf. also Wiener process). Artificial neural networks form an example of common non-linear black-box models for dynamical systems.
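
As a concrete numerical illustration, a model of the form (a2) can be simulated on a time grid by an Euler–Maruyama step; in the following sketch the matrices, noise intensities, step size and input signal are hypothetical choices made only for illustration.

```python
import numpy as np

# Minimal sketch: Euler-Maruyama simulation of the state-space model (a2),
#   dx = A x dt + B u dt + dw,   dy = C x dt + D u dt + dv.
# All numerical values below are illustrative assumptions.
rng = np.random.default_rng(0)
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
dt, n_steps = 0.01, 1000
x = np.zeros((2, 1))
dy_record = []
for k in range(n_steps):
    u = np.array([[np.sin(0.5 * k * dt)]])                    # test input u(t)
    dw = 0.05 * np.sqrt(dt) * rng.standard_normal((2, 1))     # Wiener increment dw
    dv = 0.05 * np.sqrt(dt) * rng.standard_normal((1, 1))     # Wiener increment dv
    dy = C @ x * dt + D @ u * dt + dv                         # output increment dy(t)
    x = x + A @ x * dt + B @ u * dt + dw                      # state update dx(t)
    dy_record.append(dy.item())
```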

In any case, the model can be associated with a predictor function $f$ that predicts $y ( t )$ from past (discrete-time) observations

\begin{equation*} Z ^ { t - 1 } = \{ y ( t - 1 ) , u ( t - 1 ) , \dots , y ( 0 ) , u ( 0 ) \}: \end{equation*}

\begin{equation} \tag{a3} \hat{y} ( t | t - 1 ) = f ( Z ^ { t - 1 } , t ). \end{equation}
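
For example, for the ARX-model (a1), with $e$ a sequence of independent random variables, the natural one-step-ahead predictor is

\begin{equation*} \hat{y} ( t | t - 1 ) = - a _ { 1 } y ( t - 1 ) - \ldots - a _ { n } y ( t - n ) + b _ { 1 } u ( t - 1 ) + \ldots + b _ { m } u ( t - m ), \end{equation*}

which is of the form (a3) and is linear both in the data $Z ^ { t - 1 }$ and in the parameters $a _ { i } , b _ { j }$.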

A set of such smoothly parametrized predictor functions, $f ( Z ^ { t - 1 } , t , \theta )$, forms a model structure $\mathcal{M}$ as $\theta$ ranges over a subset $D _ { \mathcal{M} }$ of $\mathbf{R} ^ { d }$. The mapping (estimator or identification method) from observed data $Z ^ { N }$ to $D _ { \mathcal{M} }$, yielding the estimate $\hat { \theta } _ { N }$, can be chosen based on a least-squares fit or as a maximum-likelihood estimator (cf. also Least squares, method of; Maximum-likelihood method). This leads to a mapping of the kind

\begin{equation} \tag{a4} \hat { \theta } _ { N } = \operatorname { arg } \operatorname { min } _ { \theta \in D _ { \mathcal{M} } } \sum _ { t = 1 } ^ { N } l \left( y ( t ) - f ( Z ^ { t - 1 } , t , \theta ) \right), \end{equation}

with a positive scalar-valued function $l$.
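
For the quadratic loss $l ( \varepsilon ) = \varepsilon ^ { 2 }$ and the ARX predictor above, (a4) reduces to ordinary linear least squares. A minimal computational sketch follows; the model orders, data length and noise level are illustrative choices, not prescribed by the article.

```python
import numpy as np

# Minimal sketch: least-squares ARX estimation, i.e. (a4) with quadratic loss, for
#   y(t) + a1*y(t-1) + a2*y(t-2) = b1*u(t-1) + e(t).
# Orders, data length and noise level are illustrative assumptions.
rng = np.random.default_rng(1)
N = 500
u = rng.standard_normal(N)
y = np.zeros(N)
a_true, b_true = (-1.5, 0.7), (1.0,)
for t in range(2, N):
    y[t] = (-a_true[0] * y[t - 1] - a_true[1] * y[t - 2]
            + b_true[0] * u[t - 1] + 0.1 * rng.standard_normal())

# Rewrite the model as y(t) = phi(t)^T theta + e(t) with
# phi(t) = (-y(t-1), -y(t-2), u(t-1)) and theta = (a1, a2, b1).
Phi = np.column_stack([-y[1:N - 1], -y[0:N - 2], u[1:N - 1]])
Y = y[2:N]
theta_hat, *_ = np.linalg.lstsq(Phi, Y, rcond=None)   # estimate of (a1, a2, b1)
```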

When the data $Z ^ { N }$ are described as random variables, the law of large numbers and the central limit theorem can be applied under weak assumptions to infer the asymptotic (as $N \rightarrow \infty$) properties of the random variable $\hat { \theta } _ { N }$. The covariance matrix of the asymptotic (normal) distribution of the estimate takes the typical form

\begin{equation} \tag{a5} P = \operatorname { lim } _ { N \rightarrow \infty } N \cdot \operatorname{Cov} ( \hat{\theta}_ N ) = \end{equation}

\begin{equation*} = \lambda \left[ \operatorname { lim } _ { N \rightarrow \infty } \frac { 1 } { N } \sum _ { t = 1 } ^ { N } \mathsf{E} \frac { \partial } { \partial \theta } f ( Z ^ { t - 1 } , t , \theta ) \left( \frac { \partial } { \partial \theta } f ( Z ^ { t - 1 } , t , \theta ) \right) ^ { T } \right] ^ { - 1 }, \end{equation*}

where $\lambda$ is the variance of the resulting model's prediction errors, and $\mathsf{E}$ denotes mathematical expectation. Explicit expressions for $P$ form the basis for experiment design and other user-oriented issues. For general treatments of system identification, see, e.g., [a5], [a7], and [a3].
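
For instance, in the linear-regression case of the ARX-model with quadratic loss, $\frac { \partial } { \partial \theta } f ( Z ^ { t - 1 } , t , \theta )$ is the regression vector $\varphi ( t ) = ( - y ( t - 1 ) , \dots , - y ( t - n ) , u ( t - 1 ) , \dots , u ( t - m ) ) ^ { T }$, and (a5) specializes to

\begin{equation*} P = \lambda \left[ \operatorname { lim } _ { N \rightarrow \infty } \frac { 1 } { N } \sum _ { t = 1 } ^ { N } \mathsf{E}\, \varphi ( t ) \varphi ( t ) ^ { T } \right] ^ { - 1 }, \end{equation*}

so that $\operatorname { Cov } ( \hat { \theta } _ { N } ) \approx P / N$ can be estimated by replacing $\lambda$ and the expectation by sample quantities.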

By adaptive system identification (also called recursive identification or sequential identification) one means that the mapping from $Z ^ { N }$ to $\hat { \theta } _ { N }$ is constrained to be of the form

\begin{equation} \tag{a6} \left\{ \begin{array} { l } { X _ { N } = H ( N , X _ { N - 1 } , y ( N ) , u ( N ) ), } \\ { \hat { \theta } _ { N } = h ( X _ { N } ), } \end{array} \right. \end{equation}

where $X _ { N }$ is a vector of fixed dimension. This structure allows the computation of the estimate at step (time) $N$ with a fixed amount of calculations. This is instrumental in applications where the model is required "on-line" as the data are measured. Such applications include adaptive control, adaptive filtering, supervision, etc. The structure (a6) often takes the more specific form

\begin{equation} \tag{a7} \left\{ \begin{array} { l } { \hat { \theta } _ { N } = \hat { \theta } _ { N - 1 } + \gamma _ { N } Q _ { 1 } ( X _ { N } , y ( N ) , u ( N ) ), } \\ { X _ { N } = X _ { N - 1 } + \mu _ { N } Q _ { 2 } ( X _ { N - 1 } , y ( N ) , u ( N ) ), } \end{array} \right. \end{equation}

to reflect that the estimate is adjusted from the previous one, usually by a small amount. The convergence analysis of algorithms like (a7) is treated in, e.g., [a6], [a1], [a4], and [a8]. The underlying theory is typically based on averaging, which relates (a7) to an associated differential equation and a subsequent stability analysis of this equation, or on stochastic Lyapunov functions (cf. also Lyapunov stochastic function). It is also of interest to determine the asymptotic distribution of the estimate as the gains $\gamma$ and $\mu$ become small; see, e.g., [a2] and [a8].
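
A standard concrete instance of the recursive structure (a6) is the recursive least-squares algorithm for the ARX-model (a1). The following is a minimal sketch; the orders, initialization and forgetting factor are illustrative choices, not prescribed by the article.

```python
import numpy as np

# Minimal sketch: recursive least squares (RLS) for the ARX-model (a1), as an
# instance of the recursive structure (a6); the "state" X_N consists of the
# current estimate theta together with the matrix P.
def rls_arx(y, u, n=2, m=1, forgetting=1.0):
    d = n + m
    theta = np.zeros(d)                 # initial parameter estimate (assumption)
    P = 1000.0 * np.eye(d)              # large initial "covariance" (assumption)
    for t in range(max(n, m), len(y)):
        # regressor phi(t) = (-y(t-1), ..., -y(t-n), u(t-1), ..., u(t-m))
        phi = np.concatenate([-y[t - n:t][::-1], u[t - m:t][::-1]])
        gain = P @ phi / (forgetting + phi @ P @ phi)
        theta = theta + gain * (y[t] - phi @ theta)     # prediction-error update
        P = (P - np.outer(gain, phi @ P)) / forgetting
    return theta
```

With the forgetting factor equal to one, the recursion reproduces the off-line least-squares estimate (a4) up to the effect of the initialization of $P$; values slightly below one allow the algorithm to track slowly time-varying parameters.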

References

[a1] A. Benveniste, M. Métivier, P. Priouret, "Adaptive algorithms and stochastic approximations" , Springer (1990) Zbl 0752.93073
[a2] L. Guo, L. Ljung, "Performance analysis of general tracking algorithms" IEEE Trans. Automat. Control , 40 (1995) pp. 1388–1402
[a3] E.J. Hannan, M. Deistler, "The statistical theory of linear systems" , Wiley (1988)
[a4] H.J. Kushner, D.S. Clark, "Stochastic approximation methods for constrained and unconstrained systems" , Springer (1978)
[a5] L. Ljung, "System identification: Theory for the user" , Prentice-Hall (1999) (Edition: Second)
[a6] L. Ljung, T. Söderström, "Theory and practice of recursive identification" , MIT (1983)
[a7] T. Söderström, P. Stoica, "System identification" , Prentice-Hall (1989)
[a8] V. Solo, X. Kong, "Adaptive signal processing algorithms" , Prentice-Hall (1995)
This article was adapted from an original article by L. Ljung (originator), which appeared in Encyclopedia of Mathematics, ISBN 1402006098.