Stochastic processes, filtering of

filtration of stochastic processes

The problem of estimating the value of a stochastic process $ Z ( t) $ at the current moment $ t $ given the past of another stochastic process related to it. For example, one may wish to estimate a stationary process $ Z ( t) $ given the values $ X ( s) $, $ s \leq t $, of a stationary process stationarily related to it (see [1], for example). Usually one considers the estimator $ \widehat{Z} ( t) $ which minimizes the mean-square error $ {\mathsf E} | \widehat{Z} ( t) - Z ( t) | ^ {2} $. The use of the term "filter" goes back to the problem of isolating a signal from a "mixture" of signal and random noise. An important case is the problem of optimal filtering, when the connection between $ Z ( t) $ and $ X ( t) $ is described by a stochastic differential equation

$$ d X ( t) = Z ( t) d t + d Y ( t) ,\ \ t > t _ {0} , $$

where the noise is assumed to be independent of $ Z ( t) $ and is given by a standard Wiener process $ Y ( t) $.
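
Concretely, this observation model is easy to simulate on a time grid with the Euler–Maruyama scheme. The sketch below is purely illustrative: the signal $ Z $ is taken to be a stationary Ornstein–Uhlenbeck process (an assumption, not part of the article), and all numerical values are arbitrary.

```python
import numpy as np

# Simulate dX = Z dt + dY by Euler-Maruyama, with an illustrative
# stationary Ornstein-Uhlenbeck signal dZ = -Z dt + dW (assumed).
rng = np.random.default_rng(0)
dt, n = 1e-3, 10_000
Z = np.zeros(n + 1)
X = np.zeros(n + 1)
Z[0] = rng.normal(scale=np.sqrt(0.5))        # stationary variance of the OU signal
dW = rng.normal(scale=np.sqrt(dt), size=n)   # Wiener increments driving the signal
dY = rng.normal(scale=np.sqrt(dt), size=n)   # Wiener observation noise
for k in range(n):
    Z[k + 1] = Z[k] - Z[k] * dt + dW[k]
    X[k + 1] = X[k] + Z[k] * dt + dY[k]      # observed path X(t)
```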

A widely used filtering method is the Kalman–Bucy method, which applies to processes $ Z ( t) $ that are described by linear stochastic differential equations. For example, if, in the above scheme,

$$ d X ( t) = a ( t) Z ( t) d t + d Y ( t) $$

with zero initial conditions, then

$$ \widehat{Z} ( t) = \int\limits _ {t _ {0} } ^ { t } c ( t , s ) d X ( s) , $$

where the weight function $ c ( t , s ) $ is obtained from the equations:

$$ \frac{d}{dt} c ( t , s ) = \ [ a ( t) - b ( t) ] c ( t , s ) ,\ \ t > s , $$

$$ c( s, s) = b( s), $$

$$ \frac{d}{dt} b ( t) = 2 a ( t) b ( t) - [ b ( t) ] ^ {2} + 1 ,\ t > t _ {0} ,\ b ( t _ {0} ) = 0 . $$
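
These equations are straightforward to integrate numerically. The sketch below assumes the standard scheme in which such coefficient equations arise, namely a signal satisfying $ dZ = a(t) Z \, dt + dW(t) $ with observations $ dX = Z \, dt + dY(t) $ (an interpretive assumption, since the article leaves the signal dynamics implicit); under that reading the weight-function representation is equivalent to the recursion $ d\widehat{Z} = [ a(t) - b(t) ] \widehat{Z} \, dt + b(t) \, dX(t) $, which is what the code implements.

```python
import numpy as np

def kalman_bucy_scalar(a, X, dt, t0=0.0):
    """Scalar Kalman-Bucy filter by Euler discretization (a sketch).

    Assumes dZ = a(t) Z dt + dW, dX = Z dt + dY, for which the
    weight-function representation above is equivalent to
        dZhat = [a(t) - b(t)] Zhat dt + b(t) dX,
        db/dt = 2 a(t) b - b^2 + 1,  b(t0) = 0.
    a : callable drift coefficient; X : observed path on the grid.
    """
    n = len(X) - 1
    b = np.zeros(n + 1)        # error variance b(t)
    Zhat = np.zeros(n + 1)     # estimate, zero initial condition
    for k in range(n):
        t = t0 + k * dt
        b[k + 1] = b[k] + (2 * a(t) * b[k] - b[k] ** 2 + 1) * dt
        dX = X[k + 1] - X[k]
        Zhat[k + 1] = Zhat[k] + (a(t) - b[k]) * Zhat[k] * dt + b[k] * dX
    return Zhat, b
```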

The generalization of this method to non-linear equations is called the general stochastic filtering problem or the non-linear filtering problem (see [2]).

In the case when

$$ Z ( t) = \sum _ {k= 1 } ^ { n } c _ {k} Z _ {k} ( t) $$

depends on the unknown parameters $ c _ {1} , \dots, c _ {n} $, one can obtain the interpolation estimator $ \widehat{Z} ( t) $ by estimating these parameters given $ X ( s) $, $ s \leq t $; the method of least squares applies here, along with its generalizations (see [3], for example).
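
As an illustration, discretizing $ dX(t) = \sum_k c_k Z_k(t) \, dt + dY(t) $ on a grid turns the parameter estimation into an ordinary linear least-squares problem for the increments of $ X $. The following sketch (with hypothetical array shapes) uses `numpy.linalg.lstsq`:

```python
import numpy as np

def estimate_coefficients(Zk, X, dt):
    """Least-squares estimates of c_1, ..., c_n from a discretized path.

    Zk : (n, m+1) array, the known component processes Z_k on the grid;
    X  : (m+1,) array, observed path of dX = sum_k c_k Z_k dt + dY.
    """
    dX = np.diff(X)                 # observation increments on the grid
    H = Zk[:, :-1].T * dt           # regressor matrix, rows Z_k(t_j) dt
    c_hat, *_ = np.linalg.lstsq(H, dX, rcond=None)
    return c_hat
```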

References

[1] Yu.A. Rozanov, "Stationary random processes" , Holden-Day (1967) (Translated from Russian)
[2] R.S. Liptser, A.N. Shiryaev, "Statistics of stochastic processes" , 1–2 , Springer (1977–1978) (Translated from Russian)
[3] I.A. Ibragimov, Yu.A. Rozanov, "Gaussian random processes" , Springer (1978) (Translated from Russian)

Comments

In the filtering of stochastic processes one distinguishes two problems. The linear filtering problem is to estimate a stationary stochastic process given a linear function of the past of a real stationary process such that a least-squares criterion is minimized. The stochastic filtering problem or non-linear filtering problem is to determine the conditional probability distribution of a process given the past of a related process.

The linear filtering problem was first formulated and solved by N. Wiener [a18] and A.N. Kolmogorov [a20]. R.E. Kalman reformulated the linear filtering problem for a stochastic system in state-space form. The solution to that problem is known as the Kalman filter for discrete-time processes [a7] and as the Kalman–Bucy filter for continuous-time processes [a8]. The new elements in the problem formulation are the emphasis on recursive filters and on the finite-dimensionality of the state space.

In Wiener–Kolmogorov filtering (cf. [a18], [a20]), one is given a pair of jointly stationary zero-mean normally-distributed stochastic processes $ \{ ( y ( t) , z ( t) ) : t \in \mathbf R \} $ and one would like to obtain the optimal least-squares estimator of $ z ( t) $ from the observed past of $ y $: $ \{ y ( t ^ \prime ) \textrm{ for } t ^ \prime < t \} $. The optimal estimator, $ \widehat{z} ( t) $, is given by the convolution

$$ \widehat{z} ( t) = \int\limits _ {- \infty } ^ { t } G ( t - t ^ \prime ) y ( t ^ \prime ) d t ^ \prime . $$

The convolution kernel is determined by the integral equation

$$ \int\limits _ { 0 } ^ \infty G ( t ^ \prime ) R _ {yy} ( t - t ^ \prime ) d t ^ \prime = \ R _ {zy} ( t) ,\ t > 0, $$

where $ R _ {yy} ( t) = {\mathsf E} \{ y ( t) y ^ {T} ( 0) \} $ and $ R _ {zy} ( t) = {\mathsf E} \{ z ( t) y ^ {T} ( 0) \} $.

This integral equation is a so-called Wiener–Hopf equation, and determines $ G $ as a function of $ R _ {yy} $ and $ R _ {zy} $. The most effective way of solving it is by means of the method of spectral decomposition of a random function.
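
Spectral factorization is the classical analytic route. Numerically, the Wiener–Hopf equation can instead be truncated to a finite interval and discretized, giving a linear system for the kernel values. In the sketch below the covariances, horizon and grid size are illustrative assumptions, not taken from the article:

```python
import numpy as np

# Assumed scalar covariances: R_yy(t) = exp(-|t|), R_zy(t) = 0.5 exp(-|t|).
R_yy = lambda t: np.exp(-np.abs(t))
R_zy = lambda t: 0.5 * np.exp(-np.abs(t))

T, m = 10.0, 2000                # truncation horizon and grid size (assumed)
h = T / m
s = h * (np.arange(m) + 0.5)     # midpoint grid on (0, T)

# Nystrom discretization: sum_j G(s_j) R_yy(s_i - s_j) h = R_zy(s_i), s_i > 0.
M = R_yy(s[:, None] - s[None, :]) * h
G = np.linalg.solve(M, R_zy(s))  # approximate kernel values G(s_i)
```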

In Kalman–Bucy filtering (cf. [a7], [a8]), the model is given by the linear stochastic differential equation

$$ d x = A ( t) x d t + B ( t) d w ( t) , $$

$$ d y = C ( t) x d t + D ( t) d w ( t) , $$

with $ w $ a Wiener process which generates the observed vector process $ y $ through the state vector process $ x $. The matrices $ A ( t) $, $ B ( t) $, $ C ( t) $, $ D ( t) $ are assumed to be known and of suitable dimension, with $ D ( t) D ^ {T} ( t) $ strictly positive-definite. The initial state $ x ( t _ {0} ) $ is normally distributed with known mean $ m _ {0} $ and covariance $ \Pi _ {0} $, and is assumed to be independent of $ w $. Further, $ y ( t _ {0} ) $ is taken to be zero.

The problem is to estimate $ x ( t) $ from the observations $ y ( t ^ \prime ) $ for $ t _ {0} \leq t ^ \prime < t $. The Kalman filter, which generates this estimator, is given by

$$ d \widehat{x} = A ( t) \widehat{x} d t + L ( t) ( d y - C ( t) \widehat{x} d t ) $$

on $ t \geq t _ {0} $ with initial condition $ \widehat{x} ( t _ {0} ) = m _ {0} $; the Kalman gain $ L $ is defined by

$$ L ( t) = ( \Sigma ( t) C ^ {T} ( t) + B ( t) D ^ {T} ( t) ) ( D ( t) D ^ {T} ( t) ) ^ {-1} , $$

where $ \Sigma $ is the solution of the Riccati equation

$$ \dot \Sigma = A ( t) \Sigma + \Sigma A ^ {T} ( t) - ( \Sigma C ^ {T} ( t) + B ( t) D ^ {T} ( t) ) ( D ( t) D ^ {T} ( t) ) ^ {-1} ( C ( t) \Sigma + D ( t) B ^ {T} ( t) ) + B ( t) B ^ {T} ( t) , $$

with $ \Sigma ( t _ {0} ) = \Pi _ {0} $. This differential equation in the symmetric $ n \times n $ matrix $ \Sigma $ can be shown to have a unique symmetric non-negative definite solution, which is in fact the covariance of the estimation error:

$$ \Sigma ( t) = {\mathsf E} \{ ( x ( t) - \widehat{x} ( t) ) ( x ( t) - \widehat{x} ( t) ) ^ {T} \} . $$
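
A direct Euler discretization of the filter and Riccati equations gives a workable, if crude, implementation. The sketch below assumes constant matrices and an equally spaced grid for brevity, though everything extends to time-varying coefficients:

```python
import numpy as np

def kalman_bucy(A, B, C, D, y, dt, m0, Pi0):
    """Euler discretization of the Kalman-Bucy filter (a sketch).

    y : (n+1, p) array, observed path with y[0] = 0;
    returns the estimates xhat and the final error covariance Sigma.
    """
    n_steps = y.shape[0] - 1
    xhat = np.zeros((n_steps + 1, len(m0)))
    xhat[0] = m0                               # initial condition xhat(t0) = m0
    Sigma = np.array(Pi0, dtype=float)         # Sigma(t0) = Pi0
    R_inv = np.linalg.inv(D @ D.T)
    for k in range(n_steps):
        L = (Sigma @ C.T + B @ D.T) @ R_inv    # Kalman gain L(t)
        dy = y[k + 1] - y[k]
        xhat[k + 1] = xhat[k] + A @ xhat[k] * dt + L @ (dy - C @ xhat[k] * dt)
        # Riccati equation for the error covariance Sigma(t)
        Sigma = Sigma + (A @ Sigma + Sigma @ A.T
                         - L @ (C @ Sigma + D @ B.T) + B @ B.T) * dt
    return xhat, Sigma
```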

In the time-invariant Kalman filter one lets the initial observation time go to minus infinity: $ t _ {0} \rightarrow - \infty $. If one assumes that the matrices $ A $, $ B $, $ C $, $ D $ are independent of $ t $ and satisfy certain observability and controllability conditions, then the limiting Kalman filter becomes

$$ d \widehat{x} = A \widehat{x} d t + L ( d y - C \widehat{x} d t ) , $$

where $ L $ is defined by

$$ L = ( \Sigma ^ {+} C ^ {T} + B D ^ {T} ) ( D D ^ {T} ) ^ {-1} $$

where $ \Sigma ^ {+} $ is the (unique) symmetric non-negative definite solution of the algebraic Riccati equation

$$ 0 = A \Sigma + \Sigma A ^ {T} - ( \Sigma C ^ {T} + B D ^ {T} ) ( D D ^ {T} ) ^ {-1} ( C \Sigma + D B ^ {T} ) + B B ^ {T} . $$
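
In practice $ \Sigma ^ {+} $ is computed with standard linear-algebra software. The filtering equation above is the dual of the control-form Riccati equation handled by `scipy.linalg.solve_continuous_are`: substituting $ A \to A ^ {T} $, $ B \to C ^ {T} $, $ Q \to B B ^ {T} $, $ R \to D D ^ {T} $ and cross term $ S \to B D ^ {T} $ recovers it exactly. A minimal sketch with assumed matrices:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative time-invariant system (all values assumed):
# dx = A x dt + B dw,  dy = C x dt + D dw,  with D D^T invertible.
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0, 0.5]])

# Dual substitution maps the filtering ARE onto the control-form ARE.
Sigma_plus = solve_continuous_are(A.T, C.T, B @ B.T, D @ D.T, s=B @ D.T)
L = (Sigma_plus @ C.T + B @ D.T) @ np.linalg.inv(D @ D.T)   # stationary gain
```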

The Kalman filter, in particular its time-invariant version, is one of the most basic results in control theory and signal processing and has found wide application in process control, aerospace engineering, econometrics, etc. Many of these applications involve non-linear systems, and the Kalman filter is applied in a non-rigorous way by a procedure of successive linearization. Such algorithms are known as extended Kalman filters and have proved remarkably effective in practice [a11]. See [a21], [a22] for general surveys of linear filtering theory.
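
A minimal discrete-time rendering of that linearization procedure (a generic sketch, not taken from any of the cited works): at each step the nonlinear state and observation maps are replaced by their Jacobians at the current estimate, and one ordinary Kalman predict/update step is performed on the linearized model.

```python
import numpy as np

def ekf_step(xhat, P, y, f, h, F_jac, H_jac, Q, R):
    """One step of a discrete-time extended Kalman filter (a sketch).

    f, h are the state/observation maps, F_jac, H_jac their Jacobians,
    Q, R the process/observation noise covariances -- all assumed given.
    """
    # Predict: propagate the estimate and the linearized covariance.
    x_pred = f(xhat)
    F = F_jac(xhat)
    P_pred = F @ P @ F.T + Q
    # Update: Kalman gain from the observation map linearized at x_pred.
    H = H_jac(x_pred)
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (y - h(x_pred))
    P_new = (np.eye(len(xhat)) - K @ H) @ P_pred
    return x_new, P_new
```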

The study of the stochastic filtering problem, or non-linear filtering, was initiated by R.L. Stratonovich [a16] and H.J. Kushner [a9]. A generalization and a proof using martingale theory is due to M. Fujisaki, G. Kallianpur and H. Kunita [a4]. See also [a26] and [2]. An approach leading to dynamical equations for a non-normalized conditional density has been developed by Kallianpur and C. Striebel [a6], R.E. Mortensen [a12], M. Zakai [a19], and E. Pardoux [a13]. See also [a25]. None of these filtering formulas is directly implementable, since all are "infinite-dimensional", i.e. they describe the time evolution of conditional distribution or density functions in the form of measure-valued or stochastic partial differential equations. In 1980, V.E. Beneš [a1] discovered a class of non-linear systems for which the conditional density admits a finite-dimensional parametrization, and this has led to extensive research on characterizing such systems and exploring the connection, uncovered by R.W. Brockett and J.M.C. Clark [a3], between non-linear filtering and certain Lie algebras of differential operators; see [a10], [a25]. Further work, e.g. [a23], [a24], has been concerned with establishing the existence of smooth conditional density functions, using methods based on the Malliavin calculus.

Stochastic filtering problems for counting process observations were first considered by D.L. Snyder, see [a15]. Generalizations may be found in [a2], [a14], [a17].

References

[a1] V.E. Beneš, "Exact finite-dimensional filters for certain diffusions with nonlinear drift" Stochastics , 5 (1981) pp. 65–92
[a2] P. Brémaud, "Point processes and queues - Martingale dynamics" , Springer (1981)
[a3] R.W. Brockett, J.M.C. Clark, "The geometry of the conditional density equation" O.L.R. Jacobs (ed.) M.H.A. Davis (ed.) M.A.H. Dempster (ed.) C.J. Harris (ed.) P.C. Parks (ed.) , Analysis and optimization of stochastic systems , Acad. Press (1980) pp. 299–309
[a4] M. Fujisaki, G. Kallianpur, H. Kunita, "Stochastic differential equations for the nonlinear filtering problem" Osaka J. Math. , 9 (1972) pp. 19–40
[a5] A.H. Jazwinski, "Stochastic processes and filtering theory" , Acad. Press (1970)
[a6] G. Kallianpur, C. Striebel, "Estimation of stochastic systems: Arbitrary system processes with additive white noise observation errors" Ann. Math. Statist. , 39 (1968) pp. 785–801
[a7] R.E. Kalman, "A new approach to linear filtering and prediction problems" J. Basic Eng., Trans. ASME, Series D. , 82 : 1 (March 1960) pp. 35–45
[a8] R.E. Kalman, R.S. Bucy, "New results in linear filtering and prediction theory" J. Basic Eng., Trans. ASME, Series D , 83 (1961) pp. 95–108
[a9] H.J. Kushner, "Dynamical equations for optimal nonlinear filtering" J. Diff. Equations , 3 (1967) pp. 179–190
[a10] S.I. Marcus, "Algebraic and geometric methods in nonlinear filtering" SIAM J. Control Optim. , 22 (1984) pp. 814–844
[a11] P.S. Maybeck, "Stochastic models, estimation and control" , 1–3 , Acad. Press (1979–1982)
[a12] R.E. Mortensen, "Optimal control of continuous time stochastic systems" , Doctoral Diss. Dept. Elect. Engin. Univ. California (1966)
[a13] E. Pardoux, "Stochastic partial differential equations and filtering of diffusion processes" Stochastics , 3 (1979) pp. 127–167
[a14] A. Segall, M.H.A. Davis, T. Kailath, "Nonlinear filtering with counting observations" IEEE Trans. Inform. Theory , 21 (1975) pp. 143–149
[a15] D.L. Snyder, "Random point processes" , Wiley (1975)
[a16] R.L. Stratonovitch, "Conditional Markov processes" Theor. Probab. Appl. , 5 (1960) pp. 156–178
[a17] J.H. van Schuppen, "Filtering prediction and smoothing for counting process observations, a martingale approach" SIAM J. Appl. Math. , 32 (1977) pp. 552–570
[a18] N. Wiener, "Extrapolation, interpolation, and smoothing of stationary time series: with engineering applications" , M.I.T. (1949)
[a19] M. Zakai, "On the optimal filtering of diffusion processes" Z. Wahrscheinlichkeitstheorie verw. Gebiete , 11 (1969) pp. 230–243
[a20] A.N. Kolmogorov, "Interpolation and extrapolation of stationary random sequences" Byull. Akad. Nauk. SSSR Ser. Mat. , 5 (1941) pp. 3–14 (In Russian)
[a21] T. Kailath, "Lectures on Wiener and Kalman filtering" , Springer (1981)
[a22] J.C. Willems, "Recursive filtering" Statistica Neerlandica , 32 : 1 (1978) pp. 1–39
[a23] J.M. Bismut, D. Michel, "Diffusions conditionelles I" Funct. Anal. , 44 (1981) pp. 174–211
[a24] J.M. Bismut, D. Michel, "Diffusions conditionelles II" Funct. Anal. , 45 (1982) pp. 274–292
[a25] M. Hazewinkel (ed.) J.C. Willems (ed.) , Stochastic systems: the mathematical theory of filtering and identification and applications , Reidel (1981)
[a26] G. Kallianpur, "Stochastic filtering theory" , Springer (1978)
This article was adapted from an original article by Yu.A. Rozanov (originator), which appeared in the Encyclopedia of Mathematics, ISBN 1402006098.