Statistical problems in the theory of stochastic processes

A branch of mathematical statistics devoted to statistical inferences on the basis of observations represented as a random process. In the most common formulation, the values of a random function $x(t)$ for $t \in T$ are observed, and on the basis of these observations statistical inferences must be made regarding certain characteristics of $x(t)$. Such a broad definition also formally includes all the classical statistics of independent observations. In fact, statistics of random processes is often taken to mean only the statistics of dependent observations, excluding the statistical analysis of a large number of independent realizations of a random process. Here the foundations of the statistical theory, the basic formulations of problems (cf. [[Statistical estimation|Statistical estimation]]; [[Statistical hypotheses, verification of|Statistical hypotheses, verification of]]) and the basic concepts (sufficiency, unbiasedness, consistency, etc.) are the same as in the classical theory. However, when solving concrete problems, significant difficulties and phenomena of a new type arise from time to time. These difficulties are caused partly by the dependence and more complex structure of the process under consideration, and partly, in the case of observations in continuous time, by the need to examine distributions in infinite-dimensional spaces.
  
 
In fact, when solving statistical problems in the theory of random processes, the structure of the process under consideration is crucial, and problems are classified accordingly by the type of process: statistical problems of Gaussian, Markov, stationary, branching, diffusion, and other processes are studied. Of these, the most far-reaching is the statistical theory of stationary processes (time-series analysis).
 
The need for a statistical analysis of random processes arose in the 19th century in the form of the analysis of meteorological and economic series and studies of cyclic processes (price fluctuations, sun spots). In modern times, the number of problems covered by the statistical analysis of random processes has become extraordinarily large; to cite but a few examples: the statistical analysis of random noise, vibrations, turbulence, wave motion in a sea, cardiograms and encephalograms. The theoretical aspects of extracting a signal from a background of noise can be seen, to a significant degree, as a statistical problem in the theory of random processes.
  
Below it is supposed that a segment $x(t)$, $0 \leq t \leq T$, of the random process $x(t)$ is observed, where the parameter $t$ runs either through the whole interval $[0, T]$ or through the integers in this interval. In statistical problems, the distribution $P^T$ of the process $\{x(t): 0 \leq t \leq T\}$ is usually known only to belong to some family of distributions $\{P^T\}$. This family can always be written in parametric form.
  
 
===Example 1.===
The process $x(t)$ is either the sum of a non-random function $s(t)$ (a "signal") and a random function $\xi(t)$ (the "noise"), or is the random function $\xi(t)$ alone. The hypothesis $H_0$: $x(t) = s(t) + \xi(t)$ must be tested against the alternative $H_1$: $x(t) = \xi(t)$ (the problem of detecting a signal in noise). This is an example of testing a statistical hypothesis.
  
 
===Example 2.===
The process $x(t) = s(t) + \xi(t)$, where $s(t)$ is an unknown non-random function (the signal), while $\xi(t)$ is a random process (the noise). The function $s$, or its value $s(t_0)$ at a given point $t_0$, has to be estimated. Similarly, one can suppose that $x(t) = s(t; \theta) + \xi(t)$, where $s$ is a known function depending on an unknown parameter $\theta$, which must be estimated from the observation of $x(t)$ (problems of extracting a signal from a background of noise). These are examples of estimation problems.
  
 
==The likelihood ratio for random processes.==
In statistical problems, likelihood ratios and likelihood functions play an important role (see [[Neyman–Pearson lemma|Neyman–Pearson lemma]]; [[Statistical hypotheses, verification of|Statistical hypotheses, verification of]]; [[Statistical estimation|Statistical estimation]]). The likelihood ratio of two distributions $P_u^T$ and $P_v^T$ is the density

$$
p(x(\cdot); u, v) = p(x(\cdot)) = \frac{dP_u^T}{dP_v^T}(x(\cdot)).
$$
  
 
The likelihood function is the function

$$
L(\theta) = \frac{dP_\theta^T}{d\mu}(x(\cdot)),
$$
  
where $\mu$ is a $\sigma$-finite measure relative to which all measures $P_\theta^T$ are absolutely continuous. In the discrete case, where $t$ runs through the integers of $[0, T]$ and $T < \infty$, the likelihood ratio always exists if the distributions $P_u$ and $P_v$ have positive densities, and it coincides with the ratio of these two densities.
  
If $t$ runs through the entire interval $[0, T]$, then cases may arise in which the measures $P_u^T$ and $P_v^T$ are not absolutely continuous with respect to each other; moreover, situations can arise in which $P_u^T$ and $P_v^T$ are mutually singular, i.e. where for a set $A$ in the space of realizations of $x(t)$,

$$
P_u^T \{ x \in A \} = 0, \qquad P_v^T \{ x \in A \} = 1.
$$
  
In this case $p(x; u, v)$ does not exist. The singularity of the measures $P_\theta^T$ leads to important and somewhat paradoxical statistical results, allowing for error-free inferences concerning the parameter $\theta$. For example, let $\Theta = \{0, 1\}$; the singularity of the measures $P_0^T$ and $P_1^T$ means that, using the test "accept $H_0$ if $x \notin A$, reject $H_0$ if $x \in A$", the hypotheses $H_0$: $\theta = 0$ and $H_1$: $\theta = 1$ are distinguished error-free. The presence of such perfect tests often demonstrates that the statistical problem is not posed entirely satisfactorily and that certain essential random disturbances have been excluded from it.
  
 
===Example 3.===
Let $x(t) = \theta + \xi(t)$, where $\xi(t)$ is a stationary ergodic process with zero mean and $\theta$ is a real parameter. Let the realizations of $\xi(t)$ be, with probability 1, analytic in a strip containing the real axis. According to the ergodic theorem,
  
$$
\lim\limits_{T \rightarrow \infty} \frac{1}{T} \int\limits_0^T x(t)\, dt = \theta,
$$
  
and all measures $P_\theta^\infty$ are mutually singular. Since an analytic function $x(t)$ is completely determined by its values in a neighbourhood of zero, the parameter $\theta$ can be estimated without error from the observations $\{x(t): 0 \leq t \leq T\}$ for any $T > 0$.
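For illustration only, the following sketch (in Python, with $\xi(t)$ taken as a finite sum of random-phase sinusoids; the amplitudes, frequencies and the value of $\theta$ are assumptions made here) shows the time average approaching $\theta$ as $T$ grows.

<pre>
import numpy as np

rng = np.random.default_rng(0)

theta = 1.7                                   # the unknown mean to be recovered
freqs = np.array([1.0, 2 ** 0.5, 5 ** 0.5])   # incommensurate frequencies (assumed)
phases = rng.uniform(0.0, 2 * np.pi, 3)

def x(t):
    """x(t) = theta + xi(t), with xi a zero-mean sum of random-phase sinusoids."""
    return theta + np.cos(np.outer(t, freqs) + phases).sum(axis=1)

for T in (10.0, 100.0, 1000.0):
    t = np.linspace(0.0, T, 100_000)
    print(T, x(t).mean())                     # approximates (1/T) int_0^T x(t) dt
</pre>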
  
 
The calculation of the likelihood ratio in those cases where it exists is a difficult problem. Calculations are often based on the limit relation

$$
p(x(\cdot); u, v) = \lim\limits_{n \rightarrow \infty} \frac{p_u(x(t_1), \dots, x(t_n))}{p_v(x(t_1), \dots, x(t_n))},
$$

where $p_u, p_v$ are the densities of the vector $(x(t_1), \dots, x(t_n))$, while $\{t_1, t_2, \dots\}$ is a dense set in $[0, T]$. Study of the right-hand side of the above equality is also useful in investigating the possible singularity of $P_u$ and $P_v$.
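The following sketch evaluates the right-hand side of this limit relation in an assumed Gaussian setting (common covariance kernel $e^{-|s-t|}$ on $[0, 1]$, means $0$ and $\sin \pi t$, a path simulated under $P_u$); the finite-dimensional log-ratios settle down as the grid becomes dense.

<pre>
import numpy as np

rng = np.random.default_rng(1)

# Assumed setting: Gaussian measures with common kernel r(s,t) = exp(-|s-t|) on [0,1],
# mean 0 under P_u and mean sin(pi t) under P_v; one path is simulated under P_u.
T = 1.0
m_u = lambda t: np.zeros_like(t)
m_v = lambda t: np.sin(np.pi * t)

def log_density_ratio(x, t):
    """log of p_u / p_v for the finite-dimensional Gaussian densities at grid points t."""
    R = np.exp(-np.abs(t[:, None] - t[None, :]))
    du, dv = x - m_u(t), x - m_v(t)
    return -0.5 * du @ np.linalg.solve(R, du) + 0.5 * dv @ np.linalg.solve(R, dv)

t_fine = np.linspace(0.0, T, 401)
R_fine = np.exp(-np.abs(t_fine[:, None] - t_fine[None, :]))
x_fine = m_u(t_fine) + np.linalg.cholesky(R_fine + 1e-10 * np.eye(401)) @ rng.normal(size=401)

# The finite-dimensional ratios stabilize as the grid {t_1, t_2, ...} becomes dense.
for n in (11, 51, 101, 401):
    idx = np.linspace(0, 400, n).astype(int)
    print(n, log_density_ratio(x_fine[idx], t_fine[idx]))
</pre>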
  
 
===Example 4.===
Suppose one observes either $x(t) = w(t)$, where $w(t)$ is a [[Wiener process|Wiener process]] (hypothesis $H_0$), or $x(t) = m(t) + w(t)$, where $m$ is a non-random function (hypothesis $H_1$). The measures $P_0, P_1$ are mutually absolutely continuous if $m' \in L_2(0, T)$, and mutually singular if $m' \notin L_2(0, T)$. The likelihood ratio equals
  
$$
\frac{dP_1^T}{dP_0^T}(x) = \mathop{\rm exp} \left\{ - \frac{1}{2} \int\limits_0^T [m'(t)]^2\, dt + \int\limits_0^T m'(t)\, dx(t) \right\}.
$$
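A minimal numerical sketch of this formula: the signal $m(t) = \sin 2\pi t$, the grid and the simulated path below are illustrative assumptions, and the two integrals are approximated by sums over the discretized path.

<pre>
import numpy as np

rng = np.random.default_rng(2)

T, n = 1.0, 10_000
t = np.linspace(0.0, T, n + 1)
dt = T / n

m = np.sin(2 * np.pi * t)                 # assumed signal m(t), with m' in L_2(0, T)
dm = 2 * np.pi * np.cos(2 * np.pi * t)    # its derivative m'(t)

# Simulate x(t) = m(t) + w(t) under H_1, with w a standard Wiener process.
dw = rng.normal(0.0, np.sqrt(dt), n)
x = m + np.concatenate(([0.0], np.cumsum(dw)))

# Discretized log-likelihood ratio:
#   log dP_1/dP_0 = -(1/2) int_0^T [m'(t)]^2 dt + int_0^T m'(t) dx(t).
dx = np.diff(x)
log_lr = -0.5 * np.sum(dm[:-1] ** 2) * dt + np.sum(dm[:-1] * dx)
print(log_lr)                             # tends to be positive under H_1
</pre>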
  
 
===Example 5.===
Let $x(t) = \theta + \xi(t)$, where $\theta$ is a real parameter and $\xi(t)$ is a stationary Gaussian [[Markov process|Markov process]] with mean zero and known correlation function $r(t) = e^{-\alpha |t|}$, $\alpha > 0$. The measures $P_\theta^T$ are mutually absolutely continuous, with likelihood function

$$
\frac{dP_\theta^T}{dP_0^T}(x) = \mathop{\rm exp} \left\{ \frac{1}{2} \theta x(0) + \frac{1}{2} \theta x(T) + \frac{1}{2} \theta \alpha \int\limits_0^T x(t)\, dt - \frac{1}{2} \theta^2 - \frac{1}{4} \theta^2 \alpha T \right\}.
$$
  
In particular, $x(0) + x(T) + \alpha \int_0^T x(t)\, dt$ is a [[Sufficient statistic|sufficient statistic]] for the family $P_\theta^T$.
  
 
==Linear problems in the statistics of random processes.==
Let the function

$$ \tag{* }
x(t) = \sum_{j=1}^{k} \theta_j \phi_j(t) + \xi(t)
$$
  
be observed, where $\xi(t)$ is a random process with mean zero and known correlation function $r(t, s)$, the $\phi_j$ are known non-random functions, $\theta = (\theta_1, \dots, \theta_k)$ is an unknown parameter (the $\theta_j$ are the regression coefficients), and the parameter set $\Theta$ is a subset of $\mathbf R^k$. Linear estimators for $\theta_j$ are estimators of the form $\sum c_j x(t_j)$, or their limits in the mean square. The problem of finding optimal (in the mean square) unbiased linear estimators reduces to the solution of linear algebraic or linear integral equations in $r$. Indeed, an optimal estimator $\widehat\theta$ is defined by the equations ${\mathsf E}_\theta (\widehat\theta_j \xi) = 0$ for any $\xi$ of the form $\xi = \sum b_j x(t_j)$, $\sum b_j \phi_l(t_j) = 0$. In a number of cases, estimators of $\theta$ obtained by the method of least squares are, asymptotically as $T \rightarrow \infty$, not worse than the optimal linear estimators. Estimators obtained by the method of least squares are calculated more simply and do not depend on $r$.
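The following sketch contrasts the two approaches in an assumed setting ($k = 1$, $\phi_1(t) = \cos t$, exponentially correlated noise observed on a finite grid): the best linear unbiased estimator for the grid observations (generalized least squares) uses the known correlation function $r$, while the least-squares estimator does not.

<pre>
import numpy as np

rng = np.random.default_rng(3)

T, n = 20.0, 2000
t = np.linspace(0.0, T, n)
phi = np.cos(t)                 # the known regression function phi_1(t) (k = 1)
theta = 3.0                     # the regression coefficient to be recovered

R = np.exp(-np.abs(t[:, None] - t[None, :]))        # known correlation function r(s, t)
xi = np.linalg.cholesky(R + 1e-10 * np.eye(n)) @ rng.normal(size=n)
x = theta * phi + xi                                # observation of the form (*)

# Ordinary least squares: does not use r.
theta_ls = (phi @ x) / (phi @ phi)

# Best linear unbiased estimator for the grid observations (generalized least squares).
w = np.linalg.solve(R, phi)
theta_blue = (w @ x) / (w @ phi)

print(theta_ls, theta_blue)
</pre>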
  
 
===Example 6.===
Under the conditions of Example 5, $k = 1$, $\phi_1(t) \equiv 1$. The optimal unbiased [[Linear estimator|linear estimator]] takes the form

$$
\widehat\theta = \frac{1}{2 + \alpha T} \left( x(0) + x(T) + \alpha \int\limits_0^T x(t)\, dt \right).
$$
  
 
The estimator

$$
\theta^\star = \frac{1}{T} \int\limits_0^T x(t)\, dt
$$

has asymptotically the same variance.
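For illustration, the sketch below computes both $\widehat\theta$ and $\theta^\star$ from one simulated realization; the values of $\theta$, $\alpha$, $T$ and the Euler scheme are assumptions made for the simulation.

<pre>
import numpy as np

rng = np.random.default_rng(4)

# Assumed values: theta is the unknown mean, alpha the parameter of the noise
# with r(t) = exp(-alpha |t|); the path is generated by an Euler scheme.
theta, alpha, T, n = 2.0, 1.5, 50.0, 50_000
dt = T / n

xi = np.empty(n + 1)
xi[0] = rng.normal()                                # stationary start, unit variance
for i in range(n):
    xi[i + 1] = xi[i] - alpha * xi[i] * dt + np.sqrt(2 * alpha * dt) * rng.normal()

x = theta + xi
integral = dt * (np.sum(x) - 0.5 * (x[0] + x[-1]))  # trapezoidal int_0^T x(t) dt

theta_hat = (x[0] + x[-1] + alpha * integral) / (2 + alpha * T)   # optimal linear estimator
theta_star = integral / T                                         # simple time average
print(theta_hat, theta_star)
</pre>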
  
 
==Statistical problems of Gaussian processes.==
Let $\{x(t): 0 \leq t \leq T,\ P_\theta^T\}$ be a [[Gaussian process|Gaussian process]] for all $\theta \in \Theta$. For Gaussian processes one has the following alternative: any two measures $P_u^T, P_v^T$ are either mutually absolutely continuous or mutually singular. Since the Gaussian distribution $P_\theta^T$ is completely determined by the mean value $m_\theta(t) = {\mathsf E}_\theta x(t)$ and the correlation function $r_\theta(s, t) = {\mathsf E}_\theta x(s) x(t)$, the likelihood ratio $dP_u^T/dP_v^T$ is expressed in terms of $m_u$, $m_v$, $r_u$, $r_v$, in general in a complicated way. The case where $r_u = r_v = r$, with $r$ a continuous function, is relatively simple. Let $\Theta = \{0, 1\}$ and $r_0 = r_1 = r$; let $\lambda_i$ and $\phi_i(t)$ be the eigenvalues and the corresponding normalized eigenfunctions in $L_2(0, T)$ of the integral equation
  
$$
\lambda \phi(s) = \int\limits_0^T r(s, t) \phi(t)\, dt;
$$
  
let the means $m_0(t)$ and $m_1(t)$ be continuous functions; and let

$$
m_{ij} = \int\limits_0^T m_i(t) \phi_j(t)\, dt.
$$
  
The measures $P_0, P_1$ are absolutely continuous if and only if

$$
\sum_{j=1}^\infty (m_{0j} - m_{1j})^2 \lambda_j^{-1} < \infty.
$$
  
 
Here,

$$
\frac{dP_1^T}{dP_0^T}(x) = \mathop{\rm exp} \left\{ \sum_{j=1}^\infty \frac{m_{1j} - m_{0j}}{\lambda_j} \left( \int\limits_0^T x(t) \phi_j(t)\, dt - \frac{m_{1j} + m_{0j}}{2} \right) \right\}.
$$
  
This equality can be used to devise a test for the hypothesis $H_0$: $m = m_0$ against the alternative $H_1$: $m = m_1$ under the assumption that the function $r$ is known to the observer.
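A discretized sketch of this construction, under assumed choices of kernel and means: the eigenvalues and eigenfunctions of the integral equation are approximated by an eigendecomposition of the discretized kernel, the series above is evaluated (truncated to its leading terms, since the numerical tail is unreliable), and the log-likelihood ratio statistic is computed for a simulated path.

<pre>
import numpy as np

rng = np.random.default_rng(5)

# Assumed setting: r(s,t) = exp(-|s-t|) on [0,1], m_0 = 0, m_1(t) = sin(pi t).
T, n = 1.0, 300
t = np.linspace(0.0, T, n)
dt = T / (n - 1)

R = np.exp(-np.abs(t[:, None] - t[None, :]))
m0, m1 = np.zeros(n), np.sin(np.pi * t)

# Eigenvalues lambda_j and normalized eigenfunctions phi_j of the integral equation,
# approximated by an eigendecomposition of the discretized kernel.
lam, phi = np.linalg.eigh(R * dt)
keep = np.argsort(lam)[::-1][:40]          # leading terms only; the numerical tail is unreliable
lam, phi = lam[keep], phi[:, keep] / np.sqrt(dt)

m0j = phi.T @ m0 * dt
m1j = phi.T @ m1 * dt
print(np.sum((m0j - m1j) ** 2 / lam))      # truncated absolute-continuity series

# Log-likelihood ratio statistic for a path simulated under H_1.
x = m1 + np.linalg.cholesky(R + 1e-10 * np.eye(n)) @ rng.normal(size=n)
xj = phi.T @ x * dt
print(np.sum((m1j - m0j) / lam * (xj - (m1j + m0j) / 2)))   # large values support H_1
</pre>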
  
 
==Statistical problems of stationary processes.==
Let the observation $x(t)$ be a stationary process with mean $m$ and correlation function $r(t)$; let $f(\lambda)$ and $F(\lambda)$ be its spectral density and spectral function, respectively. The basic problems of the statistics of stationary processes concern the testing of hypotheses about, and the estimation of, the characteristics $m$, $r$, $f$, $F$. In the case of an ergodic process $x(t)$, consistent estimators (as $T \rightarrow \infty$) for $m$ and $r(t)$, respectively, are provided by
  
$$
m^\star = \frac{1}{T} \int\limits_0^T x(t)\, dt,
$$

$$
r^\star(t) = \frac{1}{T} \int\limits_0^{T-t} x(t + s) x(s)\, ds.
$$
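In practice the path is observed at discrete points; the following sketch computes the discretized versions of $m^\star$ and $r^\star(t)$ for an assumed sampled stationary sequence standing in for $x(t)$.

<pre>
import numpy as np

rng = np.random.default_rng(6)

# Assumed sampled stationary sequence standing in for x(t): an exact AR(1) scheme
# whose values have mean m and correlation function exp(-alpha |t|) around the mean.
m_true, alpha, T, n = 1.0, 0.5, 500.0, 50_000
dt = T / n
rho = np.exp(-alpha * dt)

x = np.empty(n)
x[0] = m_true + rng.normal()
for i in range(1, n):
    x[i] = m_true + rho * (x[i - 1] - m_true) + np.sqrt(1 - rho ** 2) * rng.normal()

m_star = x.mean()                          # discretized (1/T) int_0^T x(t) dt

def r_star(k):
    """Discretized (1/T) int_0^{T-t} x(t+s) x(s) ds at lag t = k * dt."""
    return np.sum(x[k:] * x[:n - k]) * dt / T

print(m_star, [round(r_star(k), 3) for k in (0, 100, 200)])   # lags t = 0, 1, 2
</pre>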
  
The problem of estimating $m$ when $r$ is known is often treated as a linear problem. This group of problems also includes the more general problem of estimating regression coefficients from observations of the form (*) with stationary $\xi(t)$.
  
Let $x(t)$ have zero mean and spectral density $f(\lambda; \theta)$ depending on a finite-dimensional parameter $\theta \in \Theta$. If the process $x(t)$ is Gaussian, formulas can be derived for the likelihood ratio $dP_\theta/dP_{\theta^0}$ (if this ratio exists), which in a number of cases make it possible to find maximum-likelihood estimators or "good" approximations of them (for large $T$). Under sufficiently broad assumptions these estimators are asymptotically normal $(\theta, c(\theta)/\sqrt{T})$ and asymptotically efficient.
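As a crude illustration of such likelihood-based estimation (all modelling choices below are assumptions made here, and the likelihood is written in terms of the covariance matrix of the sample rather than the spectral density): for a Gaussian process with correlation function $r(t; \theta) = e^{-\theta|t|}$, which corresponds to a rational spectral density, the Gaussian log-likelihood of the sampled values can be maximized over $\theta$ by a grid search.

<pre>
import numpy as np

rng = np.random.default_rng(7)

# Assumed model: zero-mean Gaussian process with r(t; theta) = exp(-theta |t|),
# sampled at n equispaced points; theta is estimated by maximizing the Gaussian
# log-likelihood of the sample over a grid of candidate values.
theta_true, T, n = 2.0, 30.0, 300
t = np.linspace(0.0, T, n)
lags = np.abs(t[:, None] - t[None, :])

x = np.linalg.cholesky(np.exp(-theta_true * lags) + 1e-10 * np.eye(n)) @ rng.normal(size=n)

def loglik(theta):
    R = np.exp(-theta * lags)
    _, logdet = np.linalg.slogdet(R)
    return -0.5 * (logdet + x @ np.linalg.solve(R, x))

grid = np.linspace(0.5, 4.0, 71)
theta_hat = grid[np.argmax([loglik(g) for g in grid])]
print(theta_hat)                           # close to theta_true for moderate n
</pre>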
  
 
===Example 7.===
Let $x(t)$ be a stationary Gaussian process in continuous time with rational spectral density $f(\lambda) = |Q(\lambda)/P(\lambda)|^2$, where $P$ and $Q$ are polynomials. The measures $P_0^T, P_1^T$ corresponding to the rational spectral densities $f_0, f_1$ are absolutely continuous if and only if

$$
\lim\limits_{\lambda \rightarrow \infty} \frac{f_0(\lambda)}{f_1(\lambda)} = 1.
$$

Here the parameter $\theta$ is the set of all coefficients of the polynomials $P, Q$.
  
 
===Example 8.===
An important class of stationary Gaussian processes consists of the auto-regressive processes (cf. [[Auto-regressive process|Auto-regressive process]]) $x(t)$:

$$
x^{(n)}(t) + \theta_n x^{(n-1)}(t) + \dots + \theta_1 x(t) = \xi(t),
$$
  
where $\xi(t)$ is Gaussian white noise of unit intensity and $\theta = (\theta_1, \dots, \theta_n)$ is an unknown parameter. In this case the spectral density is

$$
f(\lambda; \theta) = (2\pi)^{-1} |P(i\lambda)|^{-2},
$$
  
 
where

$$
P(z) = \theta_1 + \theta_2 z + \dots + \theta_n z^{n-1} + z^n.
$$
  
 
The likelihood function is

$$
\frac{dP_\theta^T}{dP_{\theta^0}^T} = \sqrt{\frac{K(\theta)}{K(\theta^0)}}\, \mathop{\rm exp} \left\{ \frac{(\theta_n - \theta_n^0)\, T}{2} - \frac{1}{2} \sum_{j=0}^{n-1} [\lambda_j(\theta) - \lambda_j(\theta^0)] \int\limits_0^T [x^{(j)}(t)]^2\, dt - \frac{1}{2} (\lambda(\theta) - \lambda(\theta^0)) \right\}.
$$
  
Here, $\lambda_j(\theta)$ and $\lambda(\theta)$ are quadratic forms in $\theta$, depending on the values $x^{(j)}(t)$, $j = 1, \dots, n-1$, at the points $t = 0, T$, and $K(\theta)$ is the determinant of the correlation matrix of the vector $(x(0), \dots, x^{(n-1)}(0))$.
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450207.png" /></td> </tr></table>
+
Maximum-likelihood estimators for the auto-regression parameter  $  \theta $
 +
are asymptotically normal and asymptotically efficient. These properties are shared by the solution  $  \theta _ {T}  ^  \star  $
 +
of the approximate likelihood equation
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450208.png" /></td> </tr></table>
+
$$
  
The periodogram is widely used in constructing different kinds of estimators for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450209.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450210.png" /> and criteria for testing hypotheses on these characteristics. Under broad assumptions, the statistics <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450211.png" /> are consistent estimators for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450212.png" />. In particular, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450213.png" /> may serve as an estimator for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450214.png" />. If the sequence <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450215.png" /> converges in an appropriate way to the <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450216.png" />-function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450217.png" />, then the integrals <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450218.png" /> will be consistent estimators for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450219.png" />. Functions of the form <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450220.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450221.png" />, are often used in the capacity of the functions <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450222.png" />. If <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450223.png" /> is a process in discrete time, these estimators can be written in the form
+
\frac{1}{2T}
 +
\sum _ { j= } 0 ^ { n- }  1
 +
\frac{d \lambda _ {j} ( \theta ) }{d \theta _ {i} }
 +
\int\limits _ { 0 } ^ { T }  [ x  ^ {(} j) ( t)]  ^ {2}  dt  = \
 +
\left \{
 +
\begin{array}{ll}
 +
0, & 1 \leq  i \leq  n, \\
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450224.png" /></td> </tr></table>
+
\frac{1}{2}
 +
,  & i= n.  \\
 +
\end{array}
 +
 
 +
\right .$$
 +
 
 +
An important role in statistical studies on the spectrum of a stationary process is played by the [[Periodogram|periodogram]]  $  I _ {T} ( \lambda ) $.
 +
This statistic is defined as
 +
 
 +
$$
 +
I _ {T} ( \lambda )  = \
 +
 
 +
\frac{1}{2 \pi T }
 +
\left | \sum _ { t= } 0 ^ { T }  e ^ {- it \lambda } x( t) \right |
 +
\  \textrm{ (discrete  time)  } ,
 +
$$
 +
 
 +
$$
 +
I _ {T} ( \lambda )  =
 +
\frac{1}{2 \pi T }
 +
\left |
 +
\int\limits _ { 0 } ^ { T }  e ^ {- it \lambda } x( t)  dt
 +
\right |  ^ {2} \  \textrm{ (continuous  time)  } .
 +
$$
 +
 
 +
The periodogram is widely used in constructing different kinds of estimators for  $  f( \lambda ) $,
 +
$  F( \lambda ) $
 +
and criteria for testing hypotheses on these characteristics. Under broad assumptions, the statistics  $  \int I _ {T} ( \lambda ) \phi ( \lambda )  d \lambda $
 +
are consistent estimators for  $  \int f( \lambda ) \phi ( \lambda )  d \lambda $.
 +
In particular,  $  \int _  \alpha  ^  \beta  I _ {T} ( \lambda )  d \lambda $
 +
may serve as an estimator for  $  F( \beta ) - F( \alpha ) $.
 +
If the sequence  $  \phi _ {T} ( \lambda ; \lambda _ {0} ) $
 +
converges in an appropriate way to the  $  \delta $-
 +
function  $  \delta ( \lambda - \lambda _ {0} ) $,
 +
then the integrals  $  \int \phi _ {T} ( \lambda ; \lambda _ {0} ) I _ {T} ( \lambda )  d \lambda $
 +
will be consistent estimators for  $  f( \lambda _ {0} ) $.  
 +
Functions of the form  $  a _ {T} \psi ( a _ {T} ( \lambda - \lambda _ {0} )) $,
 +
$  a _ {T} \rightarrow \infty $,
 +
are often used in the capacity of the functions  $  \phi _ {T} ( \lambda ;  \lambda _ {0} ) $.  
 +
If  $  x( t) $
 +
is a process in discrete time, these estimators can be written in the form
 +
 
 +
$$
 +
 
 +
\frac{1}{2 \pi }
 +
\sum _ { t=- } T+ 1 ^ { T- }  1 e ^ {- it \lambda } r  ^  \star  ( t) c _ {T} ( t),
 +
$$
  
 
where the empirical correlation function is
 
where the empirical correlation function is
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450225.png" /></td> </tr></table>
+
$$
 +
r  ^  \star  ( t)  =
 +
\frac{1}{T}
 +
\sum _ { u= } 0 ^ { T- }  t x( u+ t) x( u),
 +
$$
  
while the non-random coefficients <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450226.png" /> are defined by the choice of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450227.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450228.png" />. This choice, in turn, depends on a priori information on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450229.png" />. A similar representation also holds for processes in continuous time.
+
while the non-random coefficients $  c _ {T} ( t) $
 +
are defined by the choice of $  \psi $
 +
and $  a _ {T} $.  
 +
This choice, in turn, depends on a priori information on $  f( \lambda ) $.  
 +
A similar representation also holds for processes in continuous time.
  
 
Problems in the statistical analysis of stationary processes sometimes also include problems of extrapolation, interpolation and filtration of stationary processes.
 
Problems in the statistical analysis of stationary processes sometimes also include problems of extrapolation, interpolation and filtration of stationary processes.
  
 
==Statistical problems of Markov processes.==
 
==Statistical problems of Markov processes.==
Let the observations <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450230.png" /> belong to a homogeneous [[Markov chain|Markov chain]]. Under sufficiently broad assumptions the likelihood function is
+
Let the observations $  X _ {0} \dots X _ {T} $
 +
belong to a homogeneous [[Markov chain|Markov chain]]. Under sufficiently broad assumptions the likelihood function is
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450231.png" /></td> </tr></table>
+
$$
  
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450232.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450233.png" /> are the initial and transition densities of the distribution. This expression is similar to the likelihood function for a sequence of independent observations, and when the regularity conditions are observed (smoothness in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450234.png" />), a theory can be constructed for hypotheses testing and estimation which is analogous to the corresponding theory for independent observations.
+
\frac{dP _  \theta  ^ {T} }{d \mu  ^ {T} }
 +
  = \
 +
p _ {0} ( X _ {0} ;  \theta ) p( X _ {1}  | X _ {0} ;  \theta ) \dots p( X _ {T} |  X _ {T-} 1 ;  \theta ),
 +
$$
  
A more complex situation arises if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450235.png" /> is a [[Markov process|Markov process]] in continuous time. Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450236.png" /> be a homogeneous Markov process with a finite number of states <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450237.png" /> and differentiable transition probabilities <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450238.png" />. The transition probability matrix is defined by the matrix <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450239.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450240.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450241.png" />. Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450242.png" /> be independent of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450243.png" /> at the initial time. By choosing any matrix <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450244.png" />, one finds
+
where  $  p _ {0} $,
 +
$  p $
 +
are the initial and transition densities of the distribution. This expression is similar to the likelihood function for a sequence of independent observations, and when the regularity conditions are observed (smoothness in  $  \theta \in \Theta \subset  \mathbf R  ^ {k} $),
 +
a theory can be constructed for hypotheses testing and estimation which is analogous to the corresponding theory for independent observations.
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450245.png" /></td> </tr></table>
+
A more complex situation arises if  $  x( t) $
 +
is a [[Markov process|Markov process]] in continuous time. Let  $  x( t) $
 +
be a homogeneous Markov process with a finite number of states  $  N $
 +
and differentiable transition probabilities  $  P _ {ij} ( t) $.
 +
The transition probability matrix is defined by the matrix  $  Q = \| q _ {ij} \| $,
 +
$  q _ {ij} = P _ {ij} ^ { \prime } ( 0) $,
 +
$  q _ {i} = - q _ {ii} $.
 +
Let  $  x( 0) = i _ {0} $
 +
be independent of  $  Q $
 +
at the initial time. By choosing any matrix  $  Q _ {0} = \| q _ {ij}  ^ {0} \| $,
 +
one finds
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450246.png" /></td> </tr></table>
+
$$
  
Here the statistics <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450247.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450248.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450249.png" /> are defined in the following way: <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450250.png" /> is the number of jumps of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450251.png" /> on the interval <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450252.png" />; <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450253.png" /> is the moment of the <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450254.png" />-th jump, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450255.png" />, and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450256.png" />. It follows that the maximum-likelihood estimators for the parameters <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450257.png" /> are: <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450258.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450259.png" /> is the number of transitions from <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450260.png" /> to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450261.png" /> on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450262.png" />, while <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450263.png" /> is the time spent by the process <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450264.png" /> in the state <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450265.png" />.
+
\frac{dP _ {Q}  ^ {T} }{dP _ {Q _ {0}  }  ^ {T} }
 +
( x)  = \
 +
\mathop{\rm exp} \{ ( q _ {i _ {n}  }  ^ {0} - q _ {i _ {n}  } ) T \}
 +
\sum _ {\nu = 0 } ^ { n- }  1
 +
\frac{q _ {j _  \nu  j _ {\nu + 1 }  } }{q _ {j _  \nu  j _ {\nu + 1 }  }  ^ {0} }
 +
\times
 +
$$
 +
 
 +
$$
 +
\times
 +
\mathop{\rm exp} \{ t _  \nu  ( q _ {i _ {n}  } - q _ {i _  \nu  } - q _ {i _  \nu  }  ^ {0} + q _ {i _ {n}  }  ^ {0} ) \} .
 +
$$
 +
 
 +
Here the statistics  $  n( x) $,  
 +
$  t _  \nu  ( x) $,  
 +
$  j _  \nu  ( x) $
 +
are defined in the following way: $  n $
 +
is the number of jumps of $  x( t) $
 +
on the interval $  [ 0, T) $;  
 +
$  \tau _  \nu  $
 +
is the moment of the $  \nu $-
 +
th jump, $  t _  \nu  = \tau _ {\nu + 1 }  - \tau _  \nu  $,  
 +
and $  j _  \nu  = x( \tau _  \nu  ) $.  
 +
It follows that the maximum-likelihood estimators for the parameters $  q _ {ij} $
 +
are: $  q _ {ij}  ^  \star  = m _ {ij} / \mu _ {i} $,  
 +
where $  m _ {ij} $
 +
is the number of transitions from $  i $
 +
to $  j $
 +
on $  [ 0, T) $,
 +
while $  \mu _ {i} $
 +
is the time spent by the process $  x( t) $
 +
in the state $  i $.
  
 
===Example 9.===
 
===Example 9.===
Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450266.png" /> be a [[Birth-and-death process|birth-and-death process]] with constant intensities of birth <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450267.png" /> and death <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450268.png" />. This means that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450269.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450270.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450271.png" />, and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450272.png" /> if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450273.png" />. In this example the number of states is infinite. Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450274.png" />. The likelihood ratio is
+
Let $  x( t) $
 +
be a [[Birth-and-death process|birth-and-death process]] with constant intensities of birth $  \lambda $
 +
and death $  \mu $.  
 +
This means that $  q _ {i,i+} 1 = i \lambda $,
 +
$  q _ {i,i-} 1 = i \mu $,  
 +
$  q _ {ii} = 1- i( \lambda + \mu ) $,  
 +
and $  q _ {ij} = 0 $
 +
if $  | i- j | > 1 $.  
 +
In this example the number of states is infinite. Let $  x( 0) \equiv 1 $.  
 +
The likelihood ratio is
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450275.png" /></td> </tr></table>
+
$$
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450276.png" /></td> </tr></table>
+
\frac{dP _ {\lambda \mu }  ^ {T} }{dP _ {\lambda _ {0}  , \mu _ {0} }  ^ {T}
 +
}
 +
( x) =
 +
$$
  
Here <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450277.png" /> is the total number of births (jumps of measure <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450278.png" />) and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450279.png" /> is the number of deaths (jumps of measure <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450280.png" />). Maximum-likelihood estimators for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450281.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450282.png" /> are
+
$$
 +
= \
 +
\left (
 +
\frac \lambda {\lambda _ {0} }
 +
\right )  ^ {B}
 +
\left (  
 +
\frac \mu {\mu _ {0} }
 +
\right ) ^ {D}  \mathop{\rm exp} \left \{ -( \lambda +
 +
\mu - \lambda _ {0} - \mu _ {0} ) \int\limits _ { 0 } ^ { T }  x( s)  ds \right \} .
 +
$$
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450283.png" /></td> </tr></table>
+
Here  $  B $
 +
is the total number of births (jumps of measure  $  + 1 $)
 +
and  $  D $
 +
is the number of deaths (jumps of measure  $  - 1 $).  
 +
Maximum-likelihood estimators for  $  \lambda $
 +
and  $  \mu $
 +
are
  
Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450284.png" /> be a diffusion process with drift coefficient <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450285.png" /> and diffusion coefficient <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450286.png" />, such that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450287.png" /> satisfies the [[Stochastic differential equation|stochastic differential equation]]
+
$$
 +
\lambda _ {T}  ^  \star  =
 +
\frac{1}{B}
 +
\int\limits _ { 0 } ^ { T }  x( s)  ds,\ \
 +
\mu _ {T}  ^  \star  =
 +
\frac{1}{D}
 +
\int\limits _ { 0 } ^ { T }  x( s)  ds.
 +
$$
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450288.png" /></td> </tr></table>
+
Let  $  x( t) $
 +
be a diffusion process with drift coefficient  $  a $
 +
and diffusion coefficient  $  b $,
 +
such that  $  x( t) $
 +
satisfies the [[Stochastic differential equation|stochastic differential equation]]
  
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450289.png" /> is a Wiener process. Then, under specific restrictions,
+
$$
 +
dx( t)  = a( t, x( t))  dt + b( t, x( t))  dw( t),\ \
 +
x( 0)  = x _ {0} ,
 +
$$
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450290.png" /></td> </tr></table>
+
where  $  w $
 +
is a Wiener process. Then, under specific restrictions,
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450291.png" /></td> </tr></table>
+
$$
  
(here <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450292.png" /> is a fixed coefficient).
+
\frac{dP _ {a,b}  ^ {T} }{dP _ {a _ {0}  ,b }  ^ {T} }
 +
( x)  = \
 +
\mathop{\rm exp} \left \{ - \int\limits _ { 0 } ^ { T } 
 +
\frac{a( t, x( t)) - a _ {0} ( t, x( t)) }{b( t, x( t)) }
 +
 
 +
dx( t) \right . +
 +
$$
 +
 
 +
$$
 +
+ \left .
 +
 
 +
\frac{1}{2}
 +
\int\limits _ { 0 } ^ { T } 
 +
\frac{a( t, x( t)) -
 +
a _ {0} ( t, x( t))  ^ {2} }{b( t, x( t)) }
 +
  dt \right \}
 +
$$
 +
 
 +
(here  $  a _ {0} $
 +
is a fixed coefficient).
  
 
===Example 10.===
 
===Example 10.===
 
Let
 
Let
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450293.png" /></td> </tr></table>
+
$$
 +
dx( t)  = a( t, x( t); \theta )  dt + dw,
 +
$$
 +
 
 +
where  $  a $
 +
is a known function and  $  \theta $
 +
is an unknown real parameter. If Wiener measure is denoted by  $  \mu $,
 +
then the likelihood function is
 +
 
 +
$$
  
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450294.png" /> is a known function and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450295.png" /> is an unknown real parameter. If Wiener measure is denoted by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450296.png" />, then the likelihood function is
+
\frac{dP _  \theta  ^ {T} }{d \mu }
 +
  = \
 +
\mathop{\rm exp} \left \{ \int\limits _ { 0 } ^ { T }  a( t, x( t);  \theta )  dx -
 +
\frac{1}{2}
 +
\int\limits _ { 0 } ^ { T }  a
 +
^ {2} ( t, x( t);  \theta )  dt \right \} ,
 +
$$
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450297.png" /></td> </tr></table>
+
and, under regularity conditions, the Cramér–Rao inequality is satisfied: For an estimator  $  \tau $
 +
with bias  $  \Delta ( \theta ) = {\mathsf E} _  \theta  \tau - \theta $,
  
and, under regularity conditions, the Cramér–Rao inequality is satisfied: For an estimator <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450298.png" /> with bias <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450299.png" />,
+
$$
 +
{\mathsf E} _  \theta  | \tau - \theta |  ^ {2}  \geq 
 +
\frac{( 1 + {d \Delta } / {d \theta }
 +
)  ^ {2} }{ {\mathsf E} _  \theta  \int\limits _ { 0 } ^ { T }  [ ( \partial  / {\partial  \theta } )
 +
a( t, x( t);  \theta )]  ^ {2}  dt }
 +
+ \Delta  ^ {2} ( \theta ).
 +
$$
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450300.png" /></td> </tr></table>
+
If the dependence on  $  \theta $
 +
is linear, the maximum-likelihood estimator is
  
If the dependence on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450301.png" /> is linear, the maximum-likelihood estimator is
+
$$
 +
\theta _ {T}  ^  \star  = \
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450302.png" /></td> </tr></table>
+
\frac{\int\limits _ { 0 } ^ { T }  a( t, x( t))  dt }{\int\limits _ { 0 } ^ { T }  a  ^ {2} ( t, x( t))  dt }
 +
.
 +
$$
  
 
====References====
 
====References====
 
<table><TR><TD valign="top">[1]</TD> <TD valign="top"> U. Grenander, "Stochastic processes and statistical inference" ''Ark. Mat.'' , '''1''' (1950) pp. 195–277 {{MR|0039202}} {{ZBL|0058.35501}} {{ZBL|0041.45807}} </TD></TR><TR><TD valign="top">[2]</TD> <TD valign="top"> E.J. Hannan, "Time series analysis" , Methuen , London (1960) {{MR|0114281}} {{ZBL|0095.13204}} </TD></TR><TR><TD valign="top">[3]</TD> <TD valign="top"> U. Grenander, M. Rosenblatt, "Statistical analysis of stationary time series" , Wiley (1957) {{MR|0084975}} {{ZBL|0080.12904}} </TD></TR><TR><TD valign="top">[4]</TD> <TD valign="top"> U. Grenander, "Abstract inference" , Wiley (1981) {{MR|0599175}} {{ZBL|0505.62069}} </TD></TR><TR><TD valign="top">[5]</TD> <TD valign="top"> Yu.A. Rozanov, "Infinite-dimensional Gaussian distributions" ''Proc. Steklov Inst. Math.'' , '''108''' (1971) ''Trudy Mat. Inst. Steklov.'' , '''108''' (1968) {{MR|0436304}} {{ZBL|}} </TD></TR><TR><TD valign="top">[6]</TD> <TD valign="top"> I.A. Ibragimov, Yu.A. Rozanov, "Gaussian random processes" , Springer (1978) (Translated from Russian) {{MR|0543837}} {{ZBL|0392.60037}} </TD></TR><TR><TD valign="top">[7]</TD> <TD valign="top"> D.R. Brillinger, "Time series. Data analysis and theory" , Holt, Rinehart &amp; Winston (1975) {{MR|0443257}} {{ZBL|0321.62004}} </TD></TR><TR><TD valign="top">[8]</TD> <TD valign="top"> P. Billingsley, "Statistical inference for Markov processes" , Univ. Chicago Press (1961) {{MR|1531450}} {{MR|0123419}} {{ZBL|0106.34201}} </TD></TR><TR><TD valign="top">[9]</TD> <TD valign="top"> R.S. Liptser, A.N. Shiryaev, "Statistics of random processes" , '''1–2''' , Springer (1977–1978) (Translated from Russian) {{MR|1800858}} {{MR|1800857}} {{MR|0608221}} {{MR|0488267}} {{MR|0474486}} {{ZBL|1008.62073}} {{ZBL|1008.62072}} {{ZBL|0556.60003}} {{ZBL|0369.60001}} {{ZBL|0364.60004}} </TD></TR><TR><TD valign="top">[10]</TD> <TD valign="top"> A.M. Yaglom, "Correlation theory of stationary and related random functions" , '''1–2''' , Springer (1987) (Translated from Russian) {{MR|0915557}} {{MR|0893393}} {{ZBL|0685.62078}} {{ZBL|0685.62077}} </TD></TR><TR><TD valign="top">[11]</TD> <TD valign="top"> T.M. Anderson, "The statistical analysis of time series" , Wiley (1971) {{MR|0283939}} {{ZBL|0225.62108}} </TD></TR></table>
 
<table><TR><TD valign="top">[1]</TD> <TD valign="top"> U. Grenander, "Stochastic processes and statistical inference" ''Ark. Mat.'' , '''1''' (1950) pp. 195–277 {{MR|0039202}} {{ZBL|0058.35501}} {{ZBL|0041.45807}} </TD></TR><TR><TD valign="top">[2]</TD> <TD valign="top"> E.J. Hannan, "Time series analysis" , Methuen , London (1960) {{MR|0114281}} {{ZBL|0095.13204}} </TD></TR><TR><TD valign="top">[3]</TD> <TD valign="top"> U. Grenander, M. Rosenblatt, "Statistical analysis of stationary time series" , Wiley (1957) {{MR|0084975}} {{ZBL|0080.12904}} </TD></TR><TR><TD valign="top">[4]</TD> <TD valign="top"> U. Grenander, "Abstract inference" , Wiley (1981) {{MR|0599175}} {{ZBL|0505.62069}} </TD></TR><TR><TD valign="top">[5]</TD> <TD valign="top"> Yu.A. Rozanov, "Infinite-dimensional Gaussian distributions" ''Proc. Steklov Inst. Math.'' , '''108''' (1971) ''Trudy Mat. Inst. Steklov.'' , '''108''' (1968) {{MR|0436304}} {{ZBL|}} </TD></TR><TR><TD valign="top">[6]</TD> <TD valign="top"> I.A. Ibragimov, Yu.A. Rozanov, "Gaussian random processes" , Springer (1978) (Translated from Russian) {{MR|0543837}} {{ZBL|0392.60037}} </TD></TR><TR><TD valign="top">[7]</TD> <TD valign="top"> D.R. Brillinger, "Time series. Data analysis and theory" , Holt, Rinehart &amp; Winston (1975) {{MR|0443257}} {{ZBL|0321.62004}} </TD></TR><TR><TD valign="top">[8]</TD> <TD valign="top"> P. Billingsley, "Statistical inference for Markov processes" , Univ. Chicago Press (1961) {{MR|1531450}} {{MR|0123419}} {{ZBL|0106.34201}} </TD></TR><TR><TD valign="top">[9]</TD> <TD valign="top"> R.S. Liptser, A.N. Shiryaev, "Statistics of random processes" , '''1–2''' , Springer (1977–1978) (Translated from Russian) {{MR|1800858}} {{MR|1800857}} {{MR|0608221}} {{MR|0488267}} {{MR|0474486}} {{ZBL|1008.62073}} {{ZBL|1008.62072}} {{ZBL|0556.60003}} {{ZBL|0369.60001}} {{ZBL|0364.60004}} </TD></TR><TR><TD valign="top">[10]</TD> <TD valign="top"> A.M. Yaglom, "Correlation theory of stationary and related random functions" , '''1–2''' , Springer (1987) (Translated from Russian) {{MR|0915557}} {{MR|0893393}} {{ZBL|0685.62078}} {{ZBL|0685.62077}} </TD></TR><TR><TD valign="top">[11]</TD> <TD valign="top"> T.M. Anderson, "The statistical analysis of time series" , Wiley (1971) {{MR|0283939}} {{ZBL|0225.62108}} </TD></TR></table>

Latest revision as of 14:55, 7 June 2020


A branch of mathematical statistics devoted to statistical inferences on the basis of observations represented as a random process. In the most common formulation, the values of a random function $ x( t) $ for $ t \in T $ are observed, and on the basis of these observations statistical inferences must be made regarding certain characteristics of $ x( t) $. Such a broad definition also formally includes all the classical statistics of independent observations. In fact, statistics of random processes is often taken to mean only the statistics of dependent observations, excluding the statistical analysis of a large number of independent realizations of a random process. Here the foundations of the statistical theory, the basic formulations of problems (cf. Statistical estimation; Statistical hypotheses, verification of), the basic concepts (sufficiency, unbiasedness, consistency, etc.) are the same as in the classical theory. However, when solving concrete problems, significant difficulties and phenomena of a new type arise from time to time. These difficulties are partially caused by the fact of dependence and the more complex structure of the process under consideration, and partially, in the case of observations in continuous time, by the need to examine distributions in infinite-dimensional spaces.

When solving statistical problems in the theory of random processes, the structure of the process under consideration is crucial; accordingly, following the classification of random processes, one studies statistical problems of Gaussian, Markov, stationary, branching, diffusion, and other processes. Of these, the most fully developed is the statistical theory of stationary processes (time-series analysis).

The need for a statistical analysis of random processes arose in the 19th century in the form of analysis of meteorological and economic series and studies on cyclic processes (price fluctuations, sunspots). In modern times, the range of problems covered by the statistical analysis of random processes has become extraordinarily large: examples include the statistical analysis of random noise, vibrations, turbulence, sea waves, cardiograms and encephalograms. The theoretical aspects of extracting a signal from a background of noise can, to a significant degree, be regarded as a statistical problem in the theory of random processes.

In what follows it is assumed that a segment $ x( t) $, $ 0 \leq t \leq T $, of the random process $ x( t) $ is observed, where the parameter $ t $ runs either through the whole interval $ [ 0, T] $ or through the integers in this interval. In statistical problems, the distribution $ P ^ {T} $ of the process $ \{ {x( t) } : {0 \leq t \leq T } \} $ is usually known only to belong to some family of distributions $ \{ P ^ {T} \} $. This family can always be written in parametric form.

Example 1.

The process $ x( t) $ is either the sum of a non-random function $ s( t) $ (a "signal") and a random function $ \xi ( t) $ (the "noise"), or is a single random function $ \xi ( t) $. The hypothesis $ H _ {0} $: $ x( t) = s( t) + \xi ( t) $ must be tested against the alternative $ H _ {1} $: $ x( t) = \xi ( t) $ (the problem of locating a signal in noise). This is an example of testing a statistical hypothesis.

Example 2.

Let $ x( t) = s( t) + \xi ( t) $, where $ s( t) $ is an unknown non-random function (the signal), while $ \xi ( t) $ is a random process (the noise). The function $ s $, or its value $ s( t _ {0} ) $ at a given point $ t _ {0} $, has to be estimated. Similarly, one may have $ x( t) = s( t; \theta ) + \xi ( t) $, where $ s $ is a known function depending on an unknown parameter $ \theta $, which must be estimated from the observation of $ x( t) $ (problems of extracting a signal from a background of noise). These are examples of estimation problems.

The likelihood ratio for random processes.

In statistical problems, likelihood ratios and likelihood functions play an important role (see Neyman–Pearson lemma; Statistical hypotheses, verification of; Statistical estimation). The likelihood ratio of two distributions $ P _ {u} ^ {T} $ and $ P _ {v} ^ {T} $ is the density

$$ p( x( \cdot ); u , v) = p( x( \cdot )) = \ \frac{dP _ {u} ^ {T} }{dP _ {v} ^ {T} } ( x( \cdot )) . $$

The likelihood function is the function

$$ L( \theta ) = \frac{dP _ \theta ^ {T} }{d \mu } ( x( \cdot )), $$

where $ \mu $ is a $ \sigma $-finite measure relative to which all measures $ P _ \theta ^ {T} $ are absolutely continuous. In the discrete case, where $ t $ runs through the integers of $ [ 0, T] $ and $ T < \infty $, the likelihood ratio always exists if the distributions $ P _ {u} $ and $ P _ {v} $ have positive densities, and it coincides with the ratio of these two densities.

If $ t $ runs through the entire interval $ [ 0, T] $, then cases may arise in which the measures $ P _ {u} ^ {T} $ and $ P _ {v} ^ {T} $ are not absolutely continuous with respect to each other; moreover, situations can arise in which $ P _ {u} ^ {T} $ and $ P _ {v} ^ {T} $ are mutually singular, i.e. where for a set $ A $ in the space of realizations of $ x( t) $,

$$ P _ {u} ^ {T} \{ x \in A \} = 0,\ \ P _ {v} ^ {T} \{ x \in A \} = 1. $$

In this case $ p( x; u , v) $ does not exist. The singularity of the measures $ P _ \theta ^ {T} $ leads to important and somewhat paradoxical statistical results, allowing for error-free inferences concerning the parameter $ \theta $. For example, let $ \Theta = \{ 0, 1 \} $; the singularity of the measures $ P _ {0} ^ {T} $ and $ P _ {1} ^ {T} $ means that, using the test "accept $ H _ {0} $ if $ x \notin A $, reject $ H _ {0} $ if $ x \in A $", the hypotheses $ H _ {0} $: $ \theta = 0 $ and $ H _ {1} $: $ \theta = 1 $ are distinguished error-free. The presence of such perfect tests often demonstrates that the statistical problem is not posed entirely satisfactorily and that certain essential random disturbances are excluded from it.

Example 3.

Let $ x( t) = \theta + \xi ( t) $, where $ \xi ( t) $ is a stationary ergodic process with zero average and $ \theta $ is a real parameter. Let the realizations of $ \xi ( t) $ with probability 1 be analytic in a strip containing the real axis. According to the ergodic theorem,

$$ \lim\limits _ {T \rightarrow \infty } \frac{1}{T} \int\limits _ { 0 } ^ { T } x( t) dt = \theta , $$

so that all measures $ P _ \theta ^ \infty $ are mutually singular. Since an analytic function $ x( t) $ is completely determined by its values in a neighbourhood of zero, the parameter $ \theta $ can be estimated without error from the observations $ \{ {x( t) } : {0 \leq t \leq T } \} $ for any $ T > 0 $.

The calculation of the likelihood ratio in those cases where it exists is a difficult problem. Calculations are often based on the limit relation

$$ p( x( \cdot ); u , v) = \ \lim\limits _ {n \rightarrow \infty } \ \frac{p _ {u} ( x( t _ {1} ) \dots x( t _ {n} )) }{p _ {v} ( x( t _ {1} ) \dots x( t _ {n} )) } , $$

where $ p _ {u} , p _ {v} $ are the densities of the vector $ ( x( t _ {1} ) \dots x( t _ {n} )) $, while $ \{ t _ {1} , t _ {2} , . . . \} $ is a dense set in $ [ 0, T] $. Study of the right-hand side of the above equality is also useful in investigating the possible singularity of $ P _ {u} $ and $ P _ {v} $.

Example 4.

Suppose one has either the observation $ x( t) = w( t) $, where $ w( t) $ is a Wiener process (hypothesis $ H _ {0} $), or $ x( t) = m( t) + w( t) $, where $ m $ is a non-random function (hypothesis $ H _ {1} $). The measures $ P _ {0} , P _ {1} $ are mutually absolutely continuous if $ m ^ \prime \in L _ {2} ( 0, T) $, and mutually singular if $ m ^ \prime \notin L _ {2} ( 0, T) $. The likelihood ratio equals

$$ \frac{dP _ {1} ^ {T} }{dP _ {0} ^ {T} } ( x) = \ \mathop{\rm exp} \left \{ - \frac{1}{2} \int\limits _ { 0 } ^ { T } [ m ^ \prime ( t)] ^ {2} dt + \int\limits _ { 0 } ^ { T } m ^ \prime ( t) dx( t) \right \} . $$
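
This likelihood ratio is easy to evaluate numerically from a discretized path. A minimal sketch (the signal, grid and sample sizes below are illustrative assumptions, not from the article):

```python
import numpy as np

rng = np.random.default_rng(0)

T, n = 1.0, 1000
t = np.linspace(0.0, T, n + 1)
dt = T / n

def m(t):
    # hypothetical signal with square-integrable derivative
    return 0.8 * np.sin(2 * np.pi * t)

def m_prime(t):
    return 1.6 * np.pi * np.cos(2 * np.pi * t)

def log_likelihood_ratio(x, t):
    """log dP_1/dP_0 = -(1/2) int (m')^2 dt + int m' dx."""
    dx = np.diff(x)
    mp = m_prime(t[:-1])
    return -0.5 * np.sum(mp ** 2) * dt + np.sum(mp * dx)

# a path under H_1: x = m + w, with w a Wiener process
w = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n))])
print(log_likelihood_ratio(m(t) + w, t))   # tends to be positive under H_1
print(log_likelihood_ratio(w, t))          # tends to be negative under H_0
```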

Example 5.

Let $ x( t) = \theta + \xi ( t) $, where $ \theta $ is a real parameter and $ \xi ( t) $ is a stationary Gaussian Markov process with mean zero and known correlation function $ r( t) = e ^ {- \alpha | t | } $, $ \alpha > 0 $. The measures $ P _ \theta ^ {T} $ are mutually absolutely continuous with likelihood function

$$ \frac{dP _ \theta ^ {T} }{dP _ {0} ^ {T} } ( x) = \ \mathop{\rm exp} \left \{ \frac{1}{2} \theta x( 0) + \frac{1}{2} \theta x( T) + \frac{1}{2} \theta \alpha \int\limits _ { 0 } ^ { T } x( t) dt - \frac{1}{2} \theta ^ {2} - \frac{1}{4} \theta ^ {2} \alpha T \right \} . $$

In particular, $ x( 0) + x( T) + \alpha \int _ {0} ^ {T} x( t) dt $ is a sufficient statistic for the family $ P _ \theta ^ {T} $.

Linear problems in the statistics of random processes.

Let the function

$$ \tag{* } x( t) = \sum _ {j= 1 } ^ { k } \theta _ {j} \phi _ {j} ( t) + \xi ( t) $$

be observed, where $ \xi ( t) $ is a random process with mean zero and known correlation function $ r( t, s) $, $ \phi _ {j} $ are known non-random functions, $ \theta = ( \theta _ {1} \dots \theta _ {k} ) $ is an unknown parameter (the $ \theta _ {j} $ are the regression coefficients), and the parameter set $ \Theta $ is a subset of $ \mathbf R ^ {k} $. Linear estimators for $ \theta _ {j} $ are estimators of the form $ \sum c _ {j} x( t _ {j} ) $, or their limits in the mean square. The problem of finding optimal unbiased linear estimators in the mean square reduces to the solution of linear algebraic or linear integral equations in $ r $. Indeed, an optimal estimator $ \widehat \theta $ is defined by the equations $ {\mathsf E} _ \theta ( \widehat \theta _ {j} \eta ) = 0 $ for any $ \eta $ of the form $ \eta = \sum b _ {j} x( t _ {j} ) $, $ \sum b _ {j} \phi _ {l} ( t _ {j} ) = 0 $. In a number of cases, estimators of $ \theta $ obtained by the method of least squares are, as $ T \rightarrow \infty $, asymptotically no worse than the optimal linear estimators. Estimators obtained by the method of least squares are computed more simply and do not depend on $ r $.
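
On a discrete grid the optimal unbiased linear estimator reduces to the generalized least-squares estimator built from the known correlation function. A small illustrative sketch (the regressors, covariance and grid below are assumptions made for the example), compared with ordinary least squares:

```python
import numpy as np

rng = np.random.default_rng(1)

# grid, regressors phi_j and true regression coefficients (all illustrative)
t = np.linspace(0.0, 10.0, 200)
Phi = np.column_stack([np.ones_like(t), np.sin(t)])
theta_true = np.array([1.0, -0.5])

# known correlation function of the noise, here r(t, s) = exp(-|t - s|)
R = np.exp(-np.abs(t[:, None] - t[None, :]))

# one realization of an observation of the form (*)
x = Phi @ theta_true + rng.multivariate_normal(np.zeros(t.size), R)

# ordinary least squares: simple, does not use r
theta_ls = np.linalg.lstsq(Phi, x, rcond=None)[0]

# optimal unbiased linear (generalized least-squares) estimator: uses r
Rinv_Phi = np.linalg.solve(R, Phi)
theta_opt = np.linalg.solve(Phi.T @ Rinv_Phi, Rinv_Phi.T @ x)

print(theta_ls, theta_opt)
```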

Example 6.

Under the conditions of example 5, $ k= 1 $, $ \phi _ {1} ( t) \equiv 1 $. The optimal unbiased linear estimator takes the form

$$ \widehat \theta = \frac{1}{2 + \alpha T } \left ( x( 0) + x( T) + \alpha \int\limits _ { 0 } ^ { T } x( t) dt \right ) . $$

The estimator

$$ \theta ^ \star = \frac{1}{T} \int\limits _ { 0 } ^ { T } x( t) dt $$

has asymptotically the same variance.
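
An illustrative simulation (assuming the noise of Example 5 with $ \alpha = 1 $; the discretization and sample sizes are ad hoc) comparing the two estimators:

```python
import numpy as np

rng = np.random.default_rng(2)

alpha, theta, T, n = 1.0, 2.0, 50.0, 2000
dt = T / n
rho = np.exp(-alpha * dt)

def sample_path():
    # x(t) = theta + xi(t), with xi a stationary Gaussian Markov process,
    # r(t) = exp(-alpha*|t|), simulated as an Ornstein-Uhlenbeck recursion
    xi = np.empty(n + 1)
    xi[0] = rng.normal()
    for k in range(n):
        xi[k + 1] = rho * xi[k] + np.sqrt(1.0 - rho ** 2) * rng.normal()
    return theta + xi

est_opt, est_avg = [], []
for _ in range(300):
    x = sample_path()
    integral = dt * (np.sum(x) - 0.5 * (x[0] + x[-1]))   # trapezoidal rule
    est_opt.append((x[0] + x[-1] + alpha * integral) / (2.0 + alpha * T))
    est_avg.append(integral / T)

print(np.var(est_opt), np.var(est_avg))   # nearly equal for large T
```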

Statistical problems of Gaussian processes.

Let $ \{ {x( t) } : {0 \leq t \leq T, P _ \theta ^ {T} } \} $ be a Gaussian process for all $ \theta \in \Theta $. For Gaussian processes the following alternative holds: any two measures $ P _ {u} ^ {T} , P _ {v} ^ {T} $ are either mutually absolutely continuous or mutually singular. Since the Gaussian distribution $ P _ \theta ^ {T} $ is completely defined by the mean value $ m _ \theta ( t) = {\mathsf E} _ \theta x( t) $ and the correlation function $ r _ \theta ( s, t) = {\mathsf E} _ \theta x( s) x( t) $, the likelihood ratio $ dP _ {u} ^ {T} /dP _ {v} ^ {T} $ is expressed in terms of $ m _ {u} $, $ m _ {v} $, $ r _ {u} $, $ r _ {v} $ in a complex way. The case where $ r _ {u} = r _ {v} = r $, with $ r $ a continuous function, is relatively simple. Let $ \Theta = \{ 0, 1 \} $, $ r _ {0} = r _ {1} = r $; let $ \lambda _ {i} $ and $ \phi _ {i} ( t) $ be the eigenvalues and the corresponding normalized eigenfunctions in $ L _ {2} ( 0, T) $ of the integral equation

$$ \lambda \phi ( s) = \int\limits _ { 0 } ^ { T } r ( s, t) \phi ( t) dt; $$

let the means $ m _ {0} ( t) $ and $ m _ {1} ( t) $ be continuous functions; and let

$$ m _ {ij} = \int\limits _ { 0 } ^ { T } m _ {i} ( t) \phi _ {j} ( t) dt. $$

The measures $ P _ {0} , P _ {1} $ are mutually absolutely continuous if and only if

$$ \sum _ {j= 1 } ^ \infty ( m _ {0j} - m _ {1j} ) ^ {2} \lambda _ {j} ^ {- 1} < \infty . $$

Here,

$$ \frac{dP _ {1} ^ {T} }{dP _ {0} ^ {T} } ( x) = \ \mathop{\rm exp} \left \{ \sum _ {j= 1 } ^ \infty \frac{m _ {1j} - m _ {0j} }{\lambda _ {j} } \left ( \int\limits _ { 0 } ^ { T } x( t) \phi _ {j} ( t) dt - \frac{m _ {1j} + m _ {0j} }{2} \right ) \right \} . $$

This equality can be used to devise a test for the hypothesis $ H _ {0} $: $ m = m _ {0} $ against the alternative $ H _ {1} $: $ m = m _ {1} $ under the assumption that the function $ r $ is known to the observer.
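
A rough numerical sketch of this test, obtained by discretizing the covariance operator on a grid so that the eigenvalues and eigenfunctions come from an ordinary eigenvalue problem (the covariance and means below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

T, n = 1.0, 400
t = np.linspace(0.0, T, n)
dt = t[1] - t[0]

r = np.minimum(t[:, None], t[None, :])   # illustrative covariance r(s, t)
m0 = np.zeros(n)
m1 = np.sin(np.pi * t)                   # illustrative means m_0, m_1

# discrete analogue of lambda*phi(s) = int_0^T r(s, t) phi(t) dt
lam, phi = np.linalg.eigh(r * dt)
phi = phi / np.sqrt(dt)                  # approximately normalized in L_2(0, T)
keep = lam > 1e-8                        # truncate the expansion
lam, phi = lam[keep], phi[:, keep]

def log_lr(x):
    xj, m0j, m1j = (phi.T @ v * dt for v in (x, m0, m1))
    return np.sum((m1j - m0j) / lam * (xj - 0.5 * (m1j + m0j)))

x = m1 + rng.multivariate_normal(np.zeros(n), r)   # one observation under H_1
print(log_lr(x))                                   # typically positive under H_1
```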

Statistical problems of stationary processes.

Let the observation $ x( t) $ be a stationary process with mean $ m $ and correlation function $ r( t) $; let $ f( \lambda ) $ and $ F( \lambda ) $ be its spectral density and spectral function, respectively. The basic problems of the statistics of stationary processes relate to hypotheses testing and to estimating the characteristics $ m $, $ r $, $ f $, $ F $. In the case of an ergodic process $ x( t) $, consistent estimators (when $ T \rightarrow \infty $) for $ m $ and $ r( t) $, respectively, are provided by

$$ m ^ \star = \frac{1}{T} \int\limits _ { 0 } ^ { T } x( t) dt, $$

$$ r ^ \star ( t) = \frac{1}{T} \int\limits _ { 0 } ^ { T- t } x( t+ s) x( s) ds. $$

The problem of estimating $ m $ when $ r $ is known is often treated as a linear problem. This group of problems also includes the more general problems of estimating regression coefficients through observations of the form (*) with stationary $ \xi ( t) $.

Let $ x( t) $ have zero mean and spectral density $ f( \lambda ; \theta ) $ depending on a finite-dimensional parameter $ \theta \in \Theta $. If the process $ x( t) $ is Gaussian, formulas can be derived for the likelihood ratio $ dP _ \theta /dP _ {\theta ^ {0} } $ (if the ratio exists), which in a number of cases make it possible to find maximum-likelihood estimators or "good" approximations of them (for large $ T $). Under sufficiently broad assumptions these estimators are asymptotically normal $ ( \theta , c( \theta )/ \sqrt T ) $ and asymptotically efficient.

Example 7.

Let $ x( t) $ be a stationary Gaussian process in continuous time with rational spectral density $ f( \lambda ) = | Q( \lambda )/P( \lambda ) | ^ {2} $, where $ P $ and $ Q $ are polynomials. The measures $ P _ {0} ^ {T} , P _ {1} ^ {T} $ corresponding to the rational spectral densities $ f _ {0} , f _ {1} $ are mutually absolutely continuous if and only if

$$ \lim\limits _ {\lambda \rightarrow \infty } \ \frac{f _ {0} ( \lambda ) }{f _ {1} ( \lambda ) } = 1. $$

Here the parameter $ \theta $ is the set of all coefficients of the polynomials $ P, Q $.

Example 8.

An important class of stationary Gaussian processes consists of the auto-regressive processes (cf. Auto-regressive process) $ x( t) $:

$$ x ^ {( n)} ( t) + \theta _ {n} x ^ {( n- 1)} ( t) + \dots + \theta _ {1} x( t) = \xi ( t), $$

where $ \xi ( t) $ is a Gaussian white noise of unit intensity and $ \theta = ( \theta _ {1} \dots \theta _ {n} ) $ is an unknown parameter. In this case the spectral density is

$$ f( \lambda ; \theta ) = ( 2 \pi ) ^ {- 1} | P( i \lambda ) | ^ {- 2} , $$

where

$$ P( z) = \theta _ {1} + \theta _ {2} z + \dots + \theta _ {n} z ^ {n- 1} + z ^ {n} . $$
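
A short sketch (with illustrative coefficient values) evaluating this spectral density:

```python
import numpy as np

def ar_spectral_density(lam, theta):
    """f(lambda; theta) = (2*pi)^{-1} |P(i*lambda)|^{-2},
    P(z) = theta_1 + theta_2 z + ... + theta_n z^{n-1} + z^n."""
    coeffs = list(theta) + [1.0]          # theta_1, ..., theta_n, 1
    P = np.polynomial.polynomial.polyval(1j * lam, coeffs)
    return 1.0 / (2.0 * np.pi * np.abs(P) ** 2)

lam = np.linspace(-5.0, 5.0, 7)
print(ar_spectral_density(lam, theta=[2.0, 3.0]))   # n = 2, illustrative values
```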

The likelihood function is

$$ \frac{dP _ \theta ^ {T} }{dP _ {\theta ^ {0} } ^ {T} } = \ \sqrt { \frac{K( \theta ) }{K( \theta ^ {0} ) } } \mathop{\rm exp} \left \{ \frac{( \theta _ {n} - \theta _ {n} ^ {0} ) T }{2} - \frac{1}{2} \sum _ {j= 0 } ^ { n- 1 } [ \lambda _ {j} ( \theta ) - \lambda _ {j} ( \theta ^ {0} )] \int\limits _ { 0 } ^ { T } [ x ^ {( j)} ( t)] ^ {2} dt - \frac{1}{2} ( \lambda ( \theta ) - \lambda ( \theta ^ {0} )) \right \} . $$

Here, $ \lambda _ {j} ( \theta ) $ and $ \lambda ( \theta ) $ are quadratic forms in $ \theta $, depending on the values $ x ^ {( j)} ( t) $, $ j = 1 \dots n- 1 $, at the points $ t = 0, T $, and $ K( \theta ) $ is the determinant of the correlation matrix of the vector $ ( x( 0) \dots x ^ {( n- 1)} ( 0)) $.

Maximum-likelihood estimators for the auto-regression parameter $ \theta $ are asymptotically normal and asymptotically efficient. These properties are shared by the solution $ \theta _ {T} ^ \star $ of the approximate likelihood equation

$$ \frac{1}{2T} \sum _ {j= 0 } ^ { n- 1 } \frac{d \lambda _ {j} ( \theta ) }{d \theta _ {i} } \int\limits _ { 0 } ^ { T } [ x ^ {( j)} ( t)] ^ {2} dt = \ \left \{ \begin{array}{ll} 0, & 1 \leq i \leq n- 1, \\ \frac{1}{2} , & i= n. \\ \end{array} \right .$$

An important role in statistical studies on the spectrum of a stationary process is played by the periodogram $ I _ {T} ( \lambda ) $. This statistic is defined as

$$ I _ {T} ( \lambda ) = \ \frac{1}{2 \pi T } \left | \sum _ {t= 0 } ^ { T } e ^ {- it \lambda } x( t) \right | ^ {2} \ \textrm{ (discrete time) } , $$

$$ I _ {T} ( \lambda ) = \frac{1}{2 \pi T } \left | \int\limits _ { 0 } ^ { T } e ^ {- it \lambda } x( t) dt \right | ^ {2} \ \textrm{ (continuous time) } . $$
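
A brief sketch of the discrete-time periodogram on a grid of frequencies (the data-generating model used below is an illustrative assumption):

```python
import numpy as np

def periodogram(x, freqs):
    """I_T(lambda) = (2*pi*T)^{-1} |sum_{t=0}^{T} exp(-i*t*lambda) x(t)|^2."""
    T = len(x)
    t = np.arange(T)
    dft = np.exp(-1j * np.outer(freqs, t)) @ x
    return np.abs(dft) ** 2 / (2.0 * np.pi * T)

rng = np.random.default_rng(4)
x = np.zeros(1024)                        # illustrative discrete-time sequence
for k in range(1, x.size):
    x[k] = 0.6 * x[k - 1] + rng.normal()

freqs = np.linspace(-np.pi, np.pi, 256)
I = periodogram(x, freqs)
print(I.mean(), I.max())
```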

The periodogram is widely used in constructing estimators of various kinds for $ f( \lambda ) $, $ F( \lambda ) $ and criteria for testing hypotheses on these characteristics. Under broad assumptions, the statistics $ \int I _ {T} ( \lambda ) \phi ( \lambda ) d \lambda $ are consistent estimators for $ \int f( \lambda ) \phi ( \lambda ) d \lambda $. In particular, $ \int _ \alpha ^ \beta I _ {T} ( \lambda ) d \lambda $ may serve as an estimator for $ F( \beta ) - F( \alpha ) $. If the sequence $ \phi _ {T} ( \lambda ; \lambda _ {0} ) $ converges in an appropriate way to the $ \delta $-function $ \delta ( \lambda - \lambda _ {0} ) $, then the integrals $ \int \phi _ {T} ( \lambda ; \lambda _ {0} ) I _ {T} ( \lambda ) d \lambda $ are consistent estimators for $ f( \lambda _ {0} ) $. Functions of the form $ a _ {T} \psi ( a _ {T} ( \lambda - \lambda _ {0} )) $, $ a _ {T} \rightarrow \infty $, are often used as the functions $ \phi _ {T} ( \lambda ; \lambda _ {0} ) $. If $ x( t) $ is a process in discrete time, these estimators can be written in the form

$$ \frac{1}{2 \pi } \sum _ {t=- T+ 1 } ^ { T- 1 } e ^ {- it \lambda } r ^ \star ( t) c _ {T} ( t), $$

where the empirical correlation function is

$$ r ^ \star ( t) = \frac{1}{T} \sum _ {u= 0 } ^ { T- t } x( u+ t) x( u), $$

while the non-random coefficients $ c _ {T} ( t) $ are defined by the choice of $ \psi $ and $ a _ {T} $. This choice, in turn, depends on a priori information on $ f( \lambda ) $. A similar representation also holds for processes in continuous time.
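
A minimal sketch of this lag-window form of the estimator, with an illustrative triangular (Bartlett-type) window playing the role of $ c _ {T} ( t) $:

```python
import numpy as np

def empirical_correlation(x, max_lag):
    T = len(x)
    return np.array([np.dot(x[u:], x[:T - u]) / T for u in range(max_lag + 1)])

def lag_window_estimate(x, lam, max_lag):
    """(2*pi)^{-1} sum_t exp(-i*t*lambda) r*(t) c_T(t), truncated at max_lag."""
    r = empirical_correlation(x, max_lag)
    c = 1.0 - np.arange(max_lag + 1) / (max_lag + 1)   # illustrative window c_T
    t = np.arange(max_lag + 1)
    terms = r * c * np.cos(np.outer(lam, t))           # uses r*(-t) = r*(t), c_T(-t) = c_T(t)
    return (terms[:, 0] + 2.0 * terms[:, 1:].sum(axis=1)) / (2.0 * np.pi)

rng = np.random.default_rng(5)
x = np.zeros(2048)
for k in range(1, x.size):
    x[k] = 0.6 * x[k - 1] + rng.normal()

lam = np.linspace(0.0, np.pi, 5)
print(lag_window_estimate(x, lam, max_lag=40))
# spectral density of this illustrative model: (2*pi)^{-1} |1 - 0.6 exp(-i*lam)|^{-2}
print(1.0 / (2.0 * np.pi * np.abs(1.0 - 0.6 * np.exp(-1j * lam)) ** 2))
```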

Problems in the statistical analysis of stationary processes sometimes also include problems of extrapolation, interpolation and filtration of stationary processes.

Statistical problems of Markov processes.

Let the observations $ X _ {0} \dots X _ {T} $ belong to a homogeneous Markov chain. Under sufficiently broad assumptions the likelihood function is

$$ \frac{dP _ \theta ^ {T} }{d \mu ^ {T} } = \ p _ {0} ( X _ {0} ; \theta ) p( X _ {1} | X _ {0} ; \theta ) \dots p( X _ {T} | X _ {T- 1} ; \theta ), $$

where $ p _ {0} $, $ p $ are the initial and transition densities of the distribution. This expression is similar to the likelihood function for a sequence of independent observations, and when the regularity conditions are observed (smoothness in $ \theta \in \Theta \subset \mathbf R ^ {k} $), a theory can be constructed for hypotheses testing and estimation which is analogous to the corresponding theory for independent observations.
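
A minimal illustration of maximizing this likelihood when the state space is finite and no parametric structure is imposed: the transition probabilities are then estimated by row-normalized transition counts (the chain used below is an assumption for the example):

```python
import numpy as np

def estimate_transition_matrix(chain, n_states):
    """Maximum-likelihood estimate of a homogeneous transition matrix from one
    observed trajectory X_0, ..., X_T: transition counts, normalized by row."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(chain[:-1], chain[1:]):
        counts[a, b] += 1
    rows = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)

rng = np.random.default_rng(6)
P = np.array([[0.9, 0.1, 0.0],          # illustrative true transition matrix
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])
chain = [0]
for _ in range(5000):
    chain.append(rng.choice(3, p=P[chain[-1]]))

print(estimate_transition_matrix(chain, 3))   # close to P
```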

A more complex situation arises if $ x( t) $ is a Markov process in continuous time. Let $ x( t) $ be a homogeneous Markov process with a finite number of states $ N $ and differentiable transition probabilities $ P _ {ij} ( t) $. The transition probabilities are determined by the matrix $ Q = \| q _ {ij} \| $, where $ q _ {ij} = P _ {ij} ^ { \prime } ( 0) $, $ q _ {i} = - q _ {ii} $. Let the initial state $ x( 0) = i _ {0} $ be fixed, independent of $ Q $. By choosing any matrix $ Q _ {0} = \| q _ {ij} ^ {0} \| $, one finds

$$ \frac{dP _ {Q} ^ {T} }{dP _ {Q _ {0} } ^ {T} } ( x) = \ \mathop{\rm exp} \{ ( q _ {j _ {n} } ^ {0} - q _ {j _ {n} } ) T \} \prod _ {\nu = 0 } ^ { n- 1 } \frac{q _ {j _ \nu j _ {\nu + 1 } } }{q _ {j _ \nu j _ {\nu + 1 } } ^ {0} } \mathop{\rm exp} \{ t _ \nu ( q _ {j _ {n} } - q _ {j _ \nu } + q _ {j _ \nu } ^ {0} - q _ {j _ {n} } ^ {0} ) \} . $$

Here the statistics $ n( x) $, $ t _ \nu ( x) $, $ j _ \nu ( x) $ are defined in the following way: $ n $ is the number of jumps of $ x( t) $ on the interval $ [ 0, T) $; $ \tau _ \nu $ is the moment of the $ \nu $-th jump, $ t _ \nu = \tau _ {\nu + 1 } - \tau _ \nu $, and $ j _ \nu = x( \tau _ \nu ) $. It follows that the maximum-likelihood estimators for the parameters $ q _ {ij} $ are: $ q _ {ij} ^ \star = m _ {ij} / \mu _ {i} $, where $ m _ {ij} $ is the number of transitions from $ i $ to $ j $ on $ [ 0, T) $, while $ \mu _ {i} $ is the time spent by the process $ x( t) $ in the state $ i $.
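
A short sketch of these estimators, assuming the trajectory is stored as jump times and visited states (the trajectory below is illustrative):

```python
import numpy as np

def estimate_generator(jump_times, states, T, n_states):
    """q*_ij = m_ij / mu_i from a trajectory observed on [0, T):
    m_ij = number of i -> j transitions, mu_i = time spent in state i."""
    m = np.zeros((n_states, n_states))
    mu = np.zeros(n_states)
    times = np.concatenate([[0.0], np.asarray(jump_times, dtype=float), [T]])
    for k, s in enumerate(states):
        mu[s] += times[k + 1] - times[k]
        if k + 1 < len(states):
            m[s, states[k + 1]] += 1
    q = m / mu[:, None]
    np.fill_diagonal(q, -q.sum(axis=1))   # complete the generator: q_ii = -q_i
    return q

# tiny illustrative trajectory: x(0) = 0, three jumps before time T = 3
print(estimate_generator(jump_times=[0.7, 1.9, 2.4], states=[0, 1, 0, 2],
                         T=3.0, n_states=3))
```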

Example 9.

Let $ x( t) $ be a birth-and-death process with constant intensities of birth $ \lambda $ and death $ \mu $ per individual. This means that $ q _ {i,i+ 1 } = i \lambda $, $ q _ {i,i- 1 } = i \mu $, $ q _ {ii} = - i( \lambda + \mu ) $, and $ q _ {ij} = 0 $ if $ | i- j | > 1 $. In this example the number of states is infinite. Let $ x( 0) \equiv 1 $. The likelihood ratio is

$$ \frac{dP _ {\lambda \mu } ^ {T} }{dP _ {\lambda _ {0} , \mu _ {0} } ^ {T} } ( x) = \ \left ( \frac \lambda {\lambda _ {0} } \right ) ^ {B} \left ( \frac \mu {\mu _ {0} } \right ) ^ {D} \mathop{\rm exp} \left \{ -( \lambda + \mu - \lambda _ {0} - \mu _ {0} ) \int\limits _ { 0 } ^ { T } x( s) ds \right \} . $$

Here $ B $ is the total number of births (jumps of size $ + 1 $) and $ D $ is the number of deaths (jumps of size $ - 1 $). Maximum-likelihood estimators for $ \lambda $ and $ \mu $ are

$$ \lambda _ {T} ^ \star = \frac{B}{\int\limits _ { 0 } ^ { T } x( s) ds } ,\ \ \mu _ {T} ^ \star = \frac{D}{\int\limits _ { 0 } ^ { T } x( s) ds } . $$
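
A small simulation sketch (illustrative rates and horizon) checking these estimators on a linear birth-and-death process:

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate_and_estimate(lam, mu, T):
    t, x = 0.0, 1                     # x(0) = 1
    births = deaths = 0
    occupation = 0.0                  # int_0^T x(s) ds
    while t < T and x > 0:
        dt = rng.exponential(1.0 / (x * (lam + mu)))
        dt = min(dt, T - t)
        occupation += x * dt
        t += dt
        if t >= T:
            break
        if rng.random() < lam / (lam + mu):
            x += 1; births += 1
        else:
            x -= 1; deaths += 1
    return births / occupation, deaths / occupation

# estimates of (lambda, mu); accurate when the population survives to time T
print(simulate_and_estimate(lam=1.0, mu=0.7, T=12.0))
```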

Let $ x( t) $ be a diffusion process with drift coefficient $ a $ and diffusion coefficient $ b $, such that $ x( t) $ satisfies the stochastic differential equation

$$ dx( t) = a( t, x( t)) dt + b( t, x( t)) dw( t),\ \ x( 0) = x _ {0} , $$

where $ w $ is a Wiener process. Then, under specific restrictions,

$$ \frac{dP _ {a,b} ^ {T} }{dP _ {a _ {0} ,b } ^ {T} } ( x) = \ \mathop{\rm exp} \left \{ \int\limits _ { 0 } ^ { T } \frac{a( t, x( t)) - a _ {0} ( t, x( t)) }{b ^ {2} ( t, x( t)) } dx( t) - \frac{1}{2} \int\limits _ { 0 } ^ { T } \frac{a ^ {2} ( t, x( t)) - a _ {0} ^ {2} ( t, x( t)) }{b ^ {2} ( t, x( t)) } dt \right \} $$

(here $ a _ {0} $ is a fixed coefficient).
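
A rough sketch of evaluating this likelihood ratio numerically from an Euler-discretized path, with the stochastic integral approximated by a left-point sum (the drifts, diffusion coefficient and step sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(8)

def log_lr(t, x, a, a0, b):
    """log dP_a/dP_{a0} = int (a - a0)/b^2 dx - (1/2) int (a^2 - a0^2)/b^2 dt."""
    dt, dx = np.diff(t), np.diff(x)
    av, a0v, bv = a(t[:-1], x[:-1]), a0(t[:-1], x[:-1]), b(t[:-1], x[:-1])
    return np.sum((av - a0v) / bv ** 2 * dx) - 0.5 * np.sum((av ** 2 - a0v ** 2) / bv ** 2 * dt)

# illustrative drifts and diffusion coefficient
a = lambda t, x: -1.5 * x
a0 = lambda t, x: -0.5 * x
b = lambda t, x: np.ones_like(x)

T, n = 20.0, 20000
t = np.linspace(0.0, T, n + 1)
dt = T / n
x = np.zeros(n + 1)
for k in range(n):                       # Euler scheme under drift a
    x[k + 1] = x[k] + a(t[k], x[k]) * dt + b(t[k], x[k]) * np.sqrt(dt) * rng.normal()

print(log_lr(t, x, a, a0, b))            # usually positive: the path was generated under drift a
```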

Example 10.

Let

$$ dx( t) = a( t, x( t); \theta ) dt + dw, $$

where $ a $ is a known function and $ \theta $ is an unknown real parameter. If Wiener measure is denoted by $ \mu $, then the likelihood function is

$$ \frac{dP _ \theta ^ {T} }{d \mu } = \ \mathop{\rm exp} \left \{ \int\limits _ { 0 } ^ { T } a( t, x( t); \theta ) dx - \frac{1}{2} \int\limits _ { 0 } ^ { T } a ^ {2} ( t, x( t); \theta ) dt \right \} , $$

and, under regularity conditions, the Cramér–Rao inequality is satisfied: For an estimator $ \tau $ with bias $ \Delta ( \theta ) = {\mathsf E} _ \theta \tau - \theta $,

$$ {\mathsf E} _ \theta | \tau - \theta | ^ {2} \geq \frac{( 1 + {d \Delta } / {d \theta } ) ^ {2} }{ {\mathsf E} _ \theta \int\limits _ { 0 } ^ { T } [ ( \partial / {\partial \theta } ) a( t, x( t); \theta )] ^ {2} dt } + \Delta ^ {2} ( \theta ). $$

If the dependence of $ a $ on $ \theta $ is linear, $ a( t, x; \theta ) = \theta a( t, x) $, the maximum-likelihood estimator is

$$ \theta _ {T} ^ \star = \ \frac{\int\limits _ { 0 } ^ { T } a( t, x( t)) dx( t) }{\int\limits _ { 0 } ^ { T } a ^ {2} ( t, x( t)) dt } . $$
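
A brief simulation sketch of this estimator (with the illustrative choice $ a( t, x) = x $ and an Euler discretization):

```python
import numpy as np

rng = np.random.default_rng(9)

def mle_linear_drift(t, x, a):
    """theta* = int a dx / int a^2 dt for dx = theta * a(t, x) dt + dw."""
    dt, dx = np.diff(t), np.diff(x)
    av = a(t[:-1], x[:-1])
    return np.sum(av * dx) / np.sum(av ** 2 * dt)

a = lambda t, x: x                      # illustrative choice of a(t, x)
theta_true, T, n = -0.8, 20.0, 10000
t = np.linspace(0.0, T, n + 1)
dt = T / n
x = np.zeros(n + 1)
x[0] = 1.0
for k in range(n):                      # Euler scheme
    x[k + 1] = x[k] + theta_true * a(t[k], x[k]) * dt + np.sqrt(dt) * rng.normal()

print(mle_linear_drift(t, x, a))        # should be close to -0.8
```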

References

[1] U. Grenander, "Stochastic processes and statistical inference" Ark. Mat. , 1 (1950) pp. 195–277 MR0039202 Zbl 0058.35501 Zbl 0041.45807
[2] E.J. Hannan, "Time series analysis" , Methuen , London (1960) MR0114281 Zbl 0095.13204
[3] U. Grenander, M. Rosenblatt, "Statistical analysis of stationary time series" , Wiley (1957) MR0084975 Zbl 0080.12904
[4] U. Grenander, "Abstract inference" , Wiley (1981) MR0599175 Zbl 0505.62069
[5] Yu.A. Rozanov, "Infinite-dimensional Gaussian distributions" Proc. Steklov Inst. Math. , 108 (1971) Trudy Mat. Inst. Steklov. , 108 (1968) MR0436304
[6] I.A. Ibragimov, Yu.A. Rozanov, "Gaussian random processes" , Springer (1978) (Translated from Russian) MR0543837 Zbl 0392.60037
[7] D.R. Brillinger, "Time series. Data analysis and theory" , Holt, Rinehart & Winston (1975) MR0443257 Zbl 0321.62004
[8] P. Billingsley, "Statistical inference for Markov processes" , Univ. Chicago Press (1961) MR1531450 MR0123419 Zbl 0106.34201
[9] R.S. Liptser, A.N. Shiryaev, "Statistics of random processes" , 1–2 , Springer (1977–1978) (Translated from Russian) MR1800858 MR1800857 MR0608221 MR0488267 MR0474486 Zbl 1008.62073 Zbl 1008.62072 Zbl 0556.60003 Zbl 0369.60001 Zbl 0364.60004
[10] A.M. Yaglom, "Correlation theory of stationary and related random functions" , 1–2 , Springer (1987) (Translated from Russian) MR0915557 MR0893393 Zbl 0685.62078 Zbl 0685.62077
[11] T.M. Anderson, "The statistical analysis of time series" , Wiley (1971) MR0283939 Zbl 0225.62108
How to Cite This Entry:
Statistical problems in the theory of stochastic processes. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Statistical_problems_in_the_theory_of_stochastic_processes&oldid=49444
This article was adapted from an original article by I.A. Ibragimov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article