Statistical problems in the theory of stochastic processes

From Encyclopedia of Mathematics
{{TEX|done}}
 
 
 
A branch of mathematical statistics devoted to statistical inferences on the basis of observations represented as a random process. In the most common formulation, the values of a random function $x(t)$ for $t \in T$ are observed, and on the basis of these observations statistical inferences must be made regarding certain characteristics of $x(t)$. Such a broad definition formally includes all the classical statistics of independent observations. In fact, statistics of random processes is often taken to mean only the statistics of dependent observations, excluding the statistical analysis of a large number of independent realizations of a random process. Here the foundations of the statistical theory, the basic formulations of problems (cf. [[Statistical estimation|Statistical estimation]]; [[Statistical hypotheses, verification of|Statistical hypotheses, verification of]]) and the basic concepts (sufficiency, unbiasedness, consistency, etc.) are the same as in the classical theory. However, when solving concrete problems, significant difficulties and phenomena of a new type arise. These difficulties are caused partly by the dependence and more complex structure of the process under consideration, and partly, in the case of observations in continuous time, by the need to examine distributions in infinite-dimensional spaces.
 
  
 
The structure of the process under consideration is crucial when solving statistical problems in the theory of random processes; accordingly, statistical problems for Gaussian, Markov, stationary, branching, diffusion, and other classes of processes are studied separately. Of these, the most far-reaching is the statistical theory of stationary processes (time-series analysis).
 
The need for a statistical analysis of random processes arose in the 19th century in the form of analysis of meteorological and economic series and studies on cyclic processes (price fluctuation, sun spots). In modern times, the number of problems covered by the statistical analysis of random processes has become extraordinarily large: for example, the statistical analysis of random noise, vibrations, turbulence, wave motion at sea, cardiograms and encephalograms. The theoretical aspects of extracting a signal from a background of noise can be seen to a significant degree as a statistical problem in the theory of random processes.
  
Below it is supposed that a segment $x(t)$, $0 \leq t \leq T$, of the random process $x(t)$ is observed, where the parameter $t$ runs either through the whole interval $[0, T]$ or through the integers in this interval. In statistical problems, the distribution $P^T$ of the process $\{x(t): 0 \leq t \leq T\}$ is usually known only to belong to some family of distributions $\{P^T\}$. This family can always be written in parametric form.
 
  
 
===Example 1.===
The process $x(t)$ is either the sum of a non-random function $s(t)$ (a "signal") and a random function $\xi(t)$ (the "noise"), or is the single random function $\xi(t)$. The hypothesis $H_0$: $x(t) = s(t) + \xi(t)$ must be tested against the alternative $H_1$: $x(t) = \xi(t)$ (the problem of locating a signal in noise). This is an example of testing a statistical hypothesis.
 
  
 
===Example 2.===
The process $x(t) = s(t) + \xi(t)$, where $s(t)$ is an unknown non-random function (the signal) and $\xi(t)$ is a random process (the noise). The function $s$, or its value $s(t_0)$ at a given point $t_0$, has to be estimated. Similarly, it can be supposed that $x(t) = s(t; \theta) + \xi(t)$, where $s$ is a known function depending on an unknown parameter $\theta$, which must be estimated through the observation of $x(t)$ (problems of extracting a signal from a background of noise). These are examples of estimation problems.
 
  
 
==The likelihood ratio for random processes.==
In statistical problems, likelihood ratios and likelihood functions play an important role (see [[Neyman–Pearson lemma|Neyman–Pearson lemma]]; [[Statistical hypotheses, verification of|Statistical hypotheses, verification of]]; [[Statistical estimation|Statistical estimation]]). The likelihood ratio of two distributions $P_u^T$ and $P_v^T$ is the density

$$
p(x(\cdot); u, v) = p(x(\cdot)) = \frac{dP_u^T}{dP_v^T}(x(\cdot)).
$$
 
  
 
The likelihood function is the function

$$
L(\theta) = \frac{dP_\theta^T}{d\mu}(x(\cdot)),
$$
 
  
where $\mu$ is a $\sigma$-finite measure relative to which all measures $P_\theta^T$ are absolutely continuous. In the discrete case, where $t$ runs through the integers of $[0, T]$ and $T < \infty$, the likelihood ratio always exists if the distributions $P_u$ and $P_v$ have positive densities, and it coincides with the ratio of these two densities.
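In the discrete case the computation is elementary. A minimal sketch (assuming, purely for illustration, i.i.d. standard Gaussian noise and a mean shift between the two hypotheses — assumptions not made in the article): the likelihood ratio is the ratio of the joint densities of the observed vector.

```python
import numpy as np

def discrete_likelihood_ratio(x, mean_u, mean_v):
    """Ratio of the joint N(mean, 1) densities of the observations x under
    the two hypothesized means; the Gaussian normalizing constants cancel,
    leaving only the quadratic terms in the exponent."""
    log_ratio = np.sum(-0.5 * (x - mean_u) ** 2 + 0.5 * (x - mean_v) ** 2)
    return np.exp(log_ratio)
```

When every observation equals `mean_u`, each term contributes $(mean_u - mean_v)^2/2$ to the log-ratio, so the ratio grows exponentially in the sample size: the hypotheses separate more and more sharply.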
 
  
If $t$ runs through the entire interval $[0, T]$, then cases may arise in which the measures $P_u^T$ and $P_v^T$ are not absolutely continuous with respect to each other; moreover, situations can arise in which $P_u^T$ and $P_v^T$ are mutually singular, i.e. where for a set $A$ in the space of realizations of $x(t)$,

$$
P_u^T \{ x \in A \} = 0, \qquad P_v^T \{ x \in A \} = 1.
$$
 
  
In this case $p(x; u, v)$ does not exist. The singularity of the measures $P_\theta^T$ leads to important and somewhat paradoxical statistical results, allowing for error-free inferences concerning the parameter $\theta$. For example, let $\Theta = \{0, 1\}$; the singularity of the measures $P_0^T$ and $P_1^T$ means that, using the test "accept $H_0$ if $x \notin A$, reject $H_0$ if $x \in A$", the hypotheses $H_0$: $\theta = 0$ and $H_1$: $\theta = 1$ are distinguished without error. The presence of such perfect tests often demonstrates that the statistical problem is not posed entirely satisfactorily and that certain essential random disturbances have been excluded from it.
 
  
 
===Example 3.===
Let $x(t) = \theta + \xi(t)$, where $\xi(t)$ is a stationary ergodic process with zero mean and $\theta$ is a real parameter. Let the realizations of $\xi(t)$ be, with probability 1, analytic in a strip containing the real axis. According to the ergodic theorem,
 
  
$$
\lim_{T \rightarrow \infty} \frac{1}{T} \int_0^T x(t)\, dt = \theta,
$$
 
  
and all measures $P_\theta^\infty$ are mutually singular. Moreover, since an analytic function $x(t)$ is completely determined by its values in a neighbourhood of zero, the parameter $\theta$ can be estimated without error from the observations $\{x(t): 0 \leq t \leq T\}$ for any $T > 0$.
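The time-average estimator of Example 3 is easy to see numerically. A hedged sketch in discrete time, with a hypothetical stationary ergodic AR(1) noise standing in for $\xi(t)$ (the analyticity that makes the continuous-time problem singular plays no role here):

```python
import numpy as np

def time_average_estimate(theta, n=200_000, phi=0.9, seed=0):
    """Simulate x(t) = theta + xi(t) with xi a zero-mean stationary AR(1)
    process (a stand-in for the stationary ergodic noise of Example 3) and
    return the time average, the discrete analogue of (1/T) * ∫ x(t) dt."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=n)
    xi = np.empty(n)
    xi[0] = eps[0] / np.sqrt(1.0 - phi**2)  # draw from the stationary law
    for i in range(1, n):
        xi[i] = phi * xi[i - 1] + eps[i]
    return theta + xi.mean()
```

By ergodicity the average converges to $\theta$ as the observation window grows, consistently with the limit relation above.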
 
  
 
The calculation of the likelihood ratio in those cases where it exists is a difficult problem. Calculations are often based on the limit relation

$$
p(x(\cdot); u, v) = \lim_{n \rightarrow \infty} \frac{p_u(x(t_1), \dots, x(t_n))}{p_v(x(t_1), \dots, x(t_n))},
$$

where $p_u, p_v$ are the densities of the vector $(x(t_1), \dots, x(t_n))$, while $\{t_1, t_2, \dots\}$ is a dense set in $[0, T]$. Study of the right-hand side of this equality is also useful in investigating the possible singularity of $P_u$ and $P_v$.
 
  
 
===Example 4.===
Suppose one has either the observation $x(t) = w(t)$, where $w(t)$ is a [[Wiener process|Wiener process]] (hypothesis $H_0$), or $x(t) = m(t) + w(t)$, where $m$ is a non-random function (hypothesis $H_1$). The measures $P_0, P_1$ are mutually absolutely continuous if $m' \in L_2(0, T)$, and mutually singular if $m' \notin L_2(0, T)$. The likelihood ratio equals
 
  
$$
\frac{dP_1^T}{dP_0^T}(x) = \mathop{\rm exp} \left\{ - \frac{1}{2} \int_0^T [m'(t)]^2\, dt + \int_0^T m'(t)\, dx(t) \right\}.
$$
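Both integrals in this ratio can be approximated on a grid. A sketch under assumptions not in the article (a uniform grid and a forward-difference approximation of $\int m'\,dx$; the function name is illustrative):

```python
import numpy as np

def girsanov_likelihood_ratio(x, m_prime, dt):
    """Discretized dP_1^T/dP_0^T = exp{-1/2 ∫ [m'(t)]^2 dt + ∫ m'(t) dx(t)}
    for observations x on a uniform grid with spacing dt."""
    energy = np.sum(m_prime[:-1] ** 2) * dt        # ∫ [m'(t)]^2 dt
    ito = np.sum(m_prime[:-1] * np.diff(x))        # ∫ m'(t) dx(t), forward sums
    return np.exp(-0.5 * energy + ito)
```

With $m' \equiv 0$ the two hypotheses coincide and the ratio is identically 1; a ratio far above 1 on observed data favours $H_1$.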
 
  
 
===Example 5.===
Let $x(t) = \theta + \xi(t)$, where $\theta$ is a real parameter and $\xi(t)$ is a stationary Gaussian [[Markov process|Markov process]] with mean zero and known correlation function $r(t) = e^{-\alpha |t|}$, $\alpha > 0$. The measures $P_\theta^T$ are mutually absolutely continuous, with likelihood function
 
  
$$
\frac{dP_\theta^T}{dP_0^T}(x) = \mathop{\rm exp} \left\{ \frac{1}{2} \theta x(0) + \frac{1}{2} \theta x(T) + \frac{1}{2} \theta \alpha \int_0^T x(t)\, dt - \frac{1}{2} \theta^2 - \frac{1}{4} \theta^2 \alpha T \right\}.
$$
 
 
 
In particular, $x(0) + x(T) + \alpha \int_0^T x(t)\, dt$ is a [[Sufficient statistic|sufficient statistic]] for the family $\{P_\theta^T\}$.
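Sufficiency here means that the likelihood depends on the path only through this one number. A small numerical check (discretizing $\int_0^T x(t)\,dt$ by the trapezoidal rule; the value $\alpha = 1$ is an arbitrary illustrative choice):

```python
import numpy as np

def log_likelihood(theta, x, t, alpha=1.0):
    """Log of dP_theta^T/dP_0^T from Example 5.  The path x enters only
    through the sufficient statistic S = x(0) + x(T) + alpha * ∫ x(t) dt."""
    integral = np.sum(0.5 * (x[1:] + x[:-1]) * np.diff(t))  # trapezoidal rule
    S = x[0] + x[-1] + alpha * integral
    T = t[-1] - t[0]
    return 0.5 * theta * S - 0.5 * theta**2 - 0.25 * theta**2 * alpha * T
```

Two quite different paths with the same value of $S$ receive the same likelihood for every $\theta$, which is exactly what sufficiency of $S$ asserts.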
 
  
 
==Linear problems in the statistics of random processes.==
 
Let the function

$$ \tag{* }
x(t) = \sum_{j=1}^{k} \theta_j \phi_j(t) + \xi(t)
$$
 
  
be observed, where $  \xi ( t) $
+
be observed, where $ \xi ( t) $ is a random process with mean zero and known correlation function $ r( t, s) $, the $ \phi _ {j} $ are known non-random functions, $ \theta = ( \theta _ {1} \dots \theta _ {k} ) $ is an unknown parameter (the $ \theta _ {j} $ are the regression coefficients), and the parameter set $ \Theta $ is a subset of $ \mathbf R ^ {k} $. Linear estimators for $ \theta _ {j} $ are estimators of the form $ \sum c _ {j} x( t _ {j} ) $, or their limits in the mean square. The problem of finding optimal unbiased linear estimators in the mean square reduces to the solution of linear algebraic or linear integral equations in $ r $: an optimal estimator $ \widehat \theta $ is defined by the equations $ {\mathsf E} _ \theta ( \widehat \theta _ {j} \xi ) = 0 $ for every $ \xi $ of the form $ \xi = \sum b _ {j} x( t _ {j} ) $ with $ \sum b _ {j} \phi _ {l} ( t _ {j} ) = 0 $. In a number of cases, estimators of $ \theta $ obtained by the method of least squares are, asymptotically as $ T \rightarrow \infty $, not worse than the optimal linear estimators; moreover, least-squares estimators are simpler to compute and do not depend on $ r $.
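In the discrete case this reduction is explicit: with observations on a grid, covariance matrix $ R = \| r( t _ {i} , t _ {l} ) \| $ and regressor matrix $ \Phi = \| \phi _ {j} ( t _ {i} ) \| $, the optimal unbiased linear (generalized least-squares) estimator is $ \widehat \theta = ( \Phi ^ {T} R ^ {-1} \Phi ) ^ {-1} \Phi ^ {T} R ^ {-1} x $. A minimal numerical sketch; the grid, the exponential covariance and the two regressors below are illustrative assumptions, not part of the article:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (assumptions): k = 2 regressors on a time grid,
# noise with known exponential correlation r(t, s) = exp(-|t - s|).
t = np.linspace(0.0, 10.0, 200)
Phi = np.column_stack([np.ones_like(t), t])       # phi_1(t) = 1, phi_2(t) = t
R = np.exp(-np.abs(t[:, None] - t[None, :]))      # known correlation matrix
theta = np.array([1.0, 0.5])                      # true regression coefficients

# Optimal unbiased linear estimator (generalized least squares):
# theta_hat = (Phi^T R^-1 Phi)^-1 Phi^T R^-1 x, i.e. x -> A @ x.
Rinv_Phi = np.linalg.solve(R, Phi)
A = np.linalg.solve(Phi.T @ Rinv_Phi, Rinv_Phi.T)

L = np.linalg.cholesky(R)                         # to sample noise with cov R
estimates = np.array([A @ (Phi @ theta + L @ rng.standard_normal(t.size))
                      for _ in range(500)])
print(estimates.mean(axis=0))                     # close to theta (unbiased)
```

Unbiasedness is exact here ($ A \Phi = I $); optimality among linear unbiased estimators is the Gauss–Markov property extended to correlated noise.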
 
  
 
===Example 6.===
Under the conditions of example 5, $ k= 1 $, $ \phi _ {1} ( t) \equiv 1 $. The optimal unbiased [[Linear estimator|linear estimator]] takes the form
 
  
$$
\widehat \theta = \frac{1}{2 + \alpha T } \left ( x( 0) + x( T) + \alpha \int\limits _ { 0 } ^ { T } x( t) dt \right ) .
$$
 
  
 
The estimator
  
$$
\theta ^ \star = \frac{1}{T} \int\limits _ { 0 } ^ { T } x( t) dt
$$
 
  
 
has asymptotically the same variance.
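This can be checked numerically. Example 5 does not appear in this excerpt; the simulation below assumes its standard form, $ x( t) = \theta + \xi ( t) $ with $ r( t, s) = e ^ {- \alpha | t- s| } $ (Ornstein–Uhlenbeck noise), the case in which the closed-form estimator above arises:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed model: x(t) = theta + xi(t), xi stationary with covariance
# r(t, s) = exp(-alpha*|t - s|) (assumed standard form of example 5).
alpha, T, theta = 1.0, 10.0, 2.0
n, reps = 2000, 2000
dt = T / n
rho = np.exp(-alpha * dt)

# Exact stationary AR(1) discretization of the noise, 'reps' paths at once.
xi = np.empty((reps, n + 1))
xi[:, 0] = rng.standard_normal(reps)
shocks = np.sqrt(1.0 - rho**2) * rng.standard_normal((reps, n))
for k in range(n):
    xi[:, k + 1] = rho * xi[:, k] + shocks[:, k]
x = theta + xi

integral = dt * (x.sum(axis=1) - 0.5 * (x[:, 0] + x[:, -1]))  # trapezoid rule
t_hat = (x[:, 0] + x[:, -1] + alpha * integral) / (2.0 + alpha * T)
t_star = integral / T

print(t_hat.mean(), t_hat.var(), t_star.var())
```

With $ \alpha T = 10 $ both estimators are unbiased and their variances already differ by only a few per cent, illustrating the asymptotic equivalence.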
  
 
==Statistical problems of Gaussian processes.==
Let $ \{ {x( t) } : {0 \leq t \leq T, P _ \theta ^ {T} } \} $ be a [[Gaussian process|Gaussian process]] for all $ \theta \in \Theta $. For Gaussian processes the following alternative holds: any two measures $ P _ {u} ^ {T} , P _ {v} ^ {T} $ are either mutually absolutely continuous or mutually singular. Since the Gaussian distribution $ P _ \theta ^ {T} $ is completely determined by the mean value $ m _ \theta ( t) = {\mathsf E} _ \theta x( t) $ and the correlation function $ r _ \theta ( s, t) = {\mathsf E} _ \theta x( s) x( t) $, the likelihood ratio $ dP _ {u} ^ {T} /dP _ {v} ^ {T} $ is expressed in terms of $ m _ {u} $, $ m _ {v} $, $ r _ {u} $, $ r _ {v} $ in a complex way. The case where $ r _ {u} = r _ {v} = r $, with $ r $ a continuous function, is relatively simple. Let $ \Theta = \{ 0, 1 \} $, $ r _ {0} = r _ {1} = r $; let $ \lambda _ {i} $ and $ \phi _ {i} ( t) $ be the eigenvalues and the corresponding normalized eigenfunctions in $ L _ {2} ( 0, T) $ of the integral equation
 
  
$$
\lambda \phi ( s) = \int\limits _ { 0 } ^ { T } r ( s, t) \phi ( t) dt ;
$$
 
  
let the means $ m _ {0} ( t) $ and $ m _ {1} ( t) $ be continuous functions; and let

$$
m _ {ij} = \int\limits _ { 0 } ^ { T } m _ {i} ( t) \phi _ {j} ( t) dt.
$$
 
  
The measures $ P _ {0} , P _ {1} $ are mutually absolutely continuous if and only if

$$
\sum _ {j= 1} ^ \infty ( m _ {0j} - m _ {1j} ) ^ {2} \lambda _ {j} ^ {-1} < \infty .
$$
 
  
 
Here,

$$
\frac{dP _ {1} ^ {T} }{dP _ {0} ^ {T} } ( x) = \mathop{\rm exp} \left \{ \sum _ {j= 1} ^ \infty \frac{m _ {1j} - m _ {0j} }{\lambda _ {j} } \left ( \int\limits _ { 0 } ^ { T } x( t) \phi _ {j} ( t) dt - \frac{m _ {1j} + m _ {0j} }{2} \right ) \right \} .
$$
 
  
This equality can be used to devise a test for the hypothesis $ H _ {0} $: $ m = m _ {0} $ against the alternative $ H _ {1} $: $ m = m _ {1} $ under the assumption that the function $ r $ is known to the observer.
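On a finite grid this construction becomes finite-dimensional: the eigenfunctions become the eigenvectors of the covariance matrix, and the log-likelihood ratio is $ \sum _ {j} ( m _ {1j} - m _ {0j} ) \lambda _ {j} ^ {-1} ( x _ {j} - ( m _ {0j} + m _ {1j} )/2) $ in that eigenbasis. A sketch of the resulting test; the grid, the covariance $ e ^ {- | s- t| } $ and the two means are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative discretization: n grid points on [0, 1], common covariance
# r(s, t) = exp(-|s - t|), means m0 = 0 and m1 = 2 (all assumptions).
n = 50
s = np.linspace(0.0, 1.0, n)
R = np.exp(-np.abs(s[:, None] - s[None, :]))
m0 = np.zeros(n)
m1 = 2.0 * np.ones(n)

lam, phi = np.linalg.eigh(R)        # eigenvalues / orthonormal eigenvectors

def log_lr(x):
    # Discrete analogue of the likelihood-ratio formula: project the
    # observation and both means onto the eigenbasis of the covariance.
    xj, m0j, m1j = phi.T @ x, phi.T @ m0, phi.T @ m1
    return np.sum((m1j - m0j) / lam * (xj - (m0j + m1j) / 2.0))

L = np.linalg.cholesky(R)
x0 = m0 + (L @ rng.standard_normal((n, 200))).T   # 200 samples under H0
x1 = m1 + (L @ rng.standard_normal((n, 200))).T   # 200 samples under H1

lr0 = np.array([log_lr(x) for x in x0])
lr1 = np.array([log_lr(x) for x in x1])
accuracy = ((lr0 < 0).mean() + (lr1 > 0).mean()) / 2.0
print(accuracy)
```

Rejecting $ H _ {0} $ when the log-likelihood ratio is positive separates the two hypotheses well once the mean shift is large relative to the noise.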
 
  
 
==Statistical problems of stationary processes.==
Let the observation $ x( t) $ be a stationary process with mean $ m $ and correlation function $ r( t) $; let $ f( \lambda ) $ and $ F( \lambda ) $ be its spectral density and spectral function, respectively. The basic problems of the statistics of stationary processes relate to testing hypotheses about, and estimating, the characteristics $ m $, $ r $, $ f $, $ F $. In the case of an ergodic process $ x( t) $, consistent estimators (as $ T \rightarrow \infty $) for $ m $ and $ r( t) $, respectively, are provided by
 
  
$$
m ^ \star = \frac{1}{T} \int\limits _ { 0 } ^ { T } x( t) dt,
$$

$$
r ^ \star ( t) = \frac{1}{T} \int\limits _ { 0 } ^ { T- t } x( t+ s) x( s) ds.
$$
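A discrete-time sketch of these estimators on a simulated ergodic process; the AR(1) test process, chosen here purely because its mean and covariance are known in closed form, is an assumption:

```python
import numpy as np

rng = np.random.default_rng(3)

# Ergodic test process (assumption): x(k) = m + z(k), where z is a
# stationary AR(1) sequence z(k) = phi*z(k-1) + eps(k); then the true
# covariance is r(t) = phi**t / (1 - phi**2).
m_true, phi, T = 1.0, 0.6, 20000
z = np.empty(T)
z[0] = rng.standard_normal() / np.sqrt(1.0 - phi**2)   # stationary start
eps = rng.standard_normal(T)
for k in range(1, T):
    z[k] = phi * z[k - 1] + eps[k]
x = m_true + z

m_star = x.mean()                    # time-average estimator of the mean

def r_star(t):
    # Empirical correlation function; the mean is subtracted because it
    # is unknown here (the article's formula assumes mean zero).
    xc = x - m_star
    return float(np.dot(xc[t:], xc[:T - t]) / T)

print(m_star, [round(r_star(t), 3) for t in range(4)])
```

Both estimators are close to the true values $ m = 1 $ and $ r( t) = \phi ^ {t} /( 1 - \phi ^ {2} ) $, illustrating consistency under ergodicity.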
 
  
The problem of estimating $ m $ when $ r $ is known is often treated as a linear problem. This group of problems also includes the more general problem of estimating regression coefficients from observations of the form (*) with stationary $ \xi ( t) $.
 
  
Let $ x( t) $ have zero mean and spectral density $ f( \lambda ; \theta ) $ depending on a finite-dimensional parameter $ \theta \in \Theta $. If the process $ x( t) $ is Gaussian, formulas can be derived for the likelihood ratio $ dP _ \theta /dP _ {\theta ^ {0} } $ (if the ratio exists), which in a number of cases make it possible to find maximum-likelihood estimators or "good" approximations to them (for large $ T $). Under sufficiently broad assumptions these estimators are asymptotically normal $ ( \theta , c( \theta )/ \sqrt T ) $ and asymptotically efficient.
 
  
 
===Example 7.===
Let $ x( t) $ be a stationary Gaussian process in continuous time with rational spectral density $ f( \lambda ) = | Q( \lambda )/P( \lambda ) | ^ {2} $, where $ P $ and $ Q $ are polynomials. The measures $ P _ {0} ^ {T} , P _ {1} ^ {T} $ corresponding to the rational spectral densities $ f _ {0} , f _ {1} $ are mutually absolutely continuous if and only if

$$
\lim\limits _ {\lambda \rightarrow \infty } \frac{f _ {0} ( \lambda ) }{f _ {1} ( \lambda ) } = 1.
$$
 
  
Here the parameter $ \theta $ is the set of all coefficients of the polynomials $ P, Q $.
 
  
 
===Example 8.===
An important class of stationary Gaussian processes consists of the auto-regressive processes (cf. [[Auto-regressive process|Auto-regressive process]]) $ x( t) $:

$$
x ^ {( n)} ( t) + \theta _ {n} x ^ {( n- 1)} ( t) + \dots + \theta _ {1} x( t) = \xi ( t),
$$
 
  
where $ \xi ( t) $ is a Gaussian white noise of unit intensity and $ \theta = ( \theta _ {1} \dots \theta _ {n} ) $ is an unknown parameter. In this case the spectral density is

$$
f( \lambda ; \theta ) = ( 2 \pi ) ^ {-1} | P( i \lambda ) | ^ {-2} ,
$$
 
  
 
where

$$
P( z) = \theta _ {1} + \theta _ {2} z + \dots + \theta _ {n} z ^ {n- 1} + z ^ {n} .
$$
 
  
 
The likelihood function is

$$
\frac{dP _ \theta ^ {T} }{dP _ {\theta ^ {0} } ^ {T} } = \sqrt {\frac{K( \theta ) }{K( \theta ^ {0} ) } } \mathop{\rm exp} \left \{ \frac{( \theta _ {n} - \theta _ {n} ^ {0} ) T }{2} - \frac{1}{2} \sum _ {j= 0} ^ { n- 1 } [ \lambda _ {j} ( \theta ) - \lambda _ {j} ( \theta ^ {0} )] \int\limits _ { 0 } ^ { T } [ x ^ {( j)} ( t)] ^ {2} dt - \frac{1}{2} ( \lambda ( \theta ) - \lambda ( \theta ^ {0} )) \right \} .
$$

Here, $ \lambda _ {j} ( \theta ) $ and $ \lambda ( \theta ) $ are quadratic forms in $ \theta $, the latter depending on the values $ x ^ {( j)} ( t) $, $ j = 0 \dots n- 1 $, at the points $ t = 0, T $, and $ K( \theta ) $ is the determinant of the correlation matrix of the vector $ ( x( 0) \dots x ^ {( n- 1)} ( 0)) $.
 
  
Maximum-likelihood estimators for the auto-regression parameter $ \theta $ are asymptotically normal and asymptotically efficient. These properties are shared by the solution $ \theta _ {T} ^ \star $ of the approximate likelihood equation

$$
\frac{1}{2T} \sum _ {j= 0} ^ { n- 1 } \frac{d \lambda _ {j} ( \theta ) }{d \theta _ {i} } \int\limits _ { 0 } ^ { T } [ x ^ {( j)} ( t)] ^ {2} dt = \left \{ \begin{array}{ll} 1/2, & i = n, \\ 0, & i \neq n, \end{array} \right . \qquad i = 1 \dots n.
$$
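A discrete-time analogue is easy to check numerically. For the AR(1) sequence $ x( k) = \theta x( k- 1) + \xi ( k) $ (a simplified stand-in for the continuous-time model, assumed here for illustration) the conditional maximum-likelihood estimator coincides with the least-squares estimator and is asymptotically normal with variance $ ( 1- \theta ^ {2} )/T $:

```python
import numpy as np

rng = np.random.default_rng(4)

# Assumed discrete-time stand-in: x(k) = theta*x(k-1) + xi(k), |theta| < 1.
theta, T, reps = 0.5, 2000, 500
x = np.empty((reps, T))
x[:, 0] = rng.standard_normal(reps) / np.sqrt(1.0 - theta**2)  # stationary start
xi = rng.standard_normal((reps, T))
for k in range(1, T):
    x[:, k] = theta * x[:, k - 1] + xi[:, k]

# Conditional MLE = least squares: sum x(k)x(k-1) / sum x(k-1)^2.
est = np.sum(x[:, 1:] * x[:, :-1], axis=1) / np.sum(x[:, :-1] ** 2, axis=1)
print(est.mean(), est.std() * np.sqrt(T))  # near theta and near sqrt(1 - theta**2)
```

The empirical standard deviation scaled by $ \sqrt T $ matches the asymptotic value $ \sqrt {1 - \theta ^ {2} } $, consistent with asymptotic normality.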
 
  
An important role in statistical studies on the spectrum of a stationary process is played by the [[Periodogram|periodogram]] $ I _ {T} ( \lambda ) $. This statistic is defined as

$$
I _ {T} ( \lambda ) = \frac{1}{2 \pi T } \left | \sum _ { t= 0 } ^ { T } e ^ {- it \lambda } x( t) \right | ^ {2} \ \textrm{ (discrete time) } ,
$$

$$
I _ {T} ( \lambda ) = \frac{1}{2 \pi T } \left | \int\limits _ { 0 } ^ { T } e ^ {- it \lambda } x( t) dt \right | ^ {2} \ \textrm{ (continuous time) } .
$$
 
 
 
The periodogram is widely used in constructing different kinds of estimators for $ f( \lambda ) $, $ F( \lambda ) $ and criteria for testing hypotheses on these characteristics. Under broad assumptions, the statistics $ \int I _ {T} ( \lambda ) \phi ( \lambda ) d \lambda $ are consistent estimators for $ \int f( \lambda ) \phi ( \lambda ) d \lambda $. In particular, $ \int _ \alpha ^ \beta I _ {T} ( \lambda ) d \lambda $ may serve as an estimator for $ F( \beta ) - F( \alpha ) $. If the sequence $ \phi _ {T} ( \lambda ; \lambda _ {0} ) $ converges in an appropriate way to the $ \delta $-function $ \delta ( \lambda - \lambda _ {0} ) $, then the integrals $ \int \phi _ {T} ( \lambda ; \lambda _ {0} ) I _ {T} ( \lambda ) d \lambda $ will be consistent estimators for $ f( \lambda _ {0} ) $. Functions of the form $ a _ {T} \psi ( a _ {T} ( \lambda - \lambda _ {0} )) $, $ a _ {T} \rightarrow \infty $, are often used in the capacity of the functions $ \phi _ {T} ( \lambda ; \lambda _ {0} ) $. If $ x( t) $ is a process in discrete time, these estimators can be written in the form
 
 
 
$$
\frac{1}{2 \pi } \sum _ { t=- T+ 1 } ^ { T- 1 } e ^ {- it \lambda } r ^ \star ( t) c _ {T} ( t),
$$
 
  
 
where the empirical correlation function is

$$
r ^ \star ( t) = \frac{1}{T} \sum _ { u= 0 } ^ { T- t } x( u+ t) x( u),
$$
 
  
while the non-random coefficients $ c _ {T} ( t) $ are determined by the choice of $ \psi $ and $ a _ {T} $. This choice, in turn, depends on a priori information about $ f( \lambda ) $. A similar representation also holds for processes in continuous time.
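A sketch of such an estimator in discrete time, using a triangular (Bartlett) lag window for $ c _ {T} ( t) $; the AR(1) test process, the window and the truncation point are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)

# Test process (assumption): AR(1), x(k) = a*x(k-1) + eps(k), whose true
# spectral density is f(l) = (2*pi)**-1 * |1 - a*exp(-1j*l)|**-2.
a, T, M = 0.5, 20000, 50
x = np.empty(T)
x[0] = rng.standard_normal() / np.sqrt(1.0 - a**2)
eps = rng.standard_normal(T)
for k in range(1, T):
    x[k] = a * x[k - 1] + eps[k]

def r_star(t):                       # empirical correlation function
    return np.dot(x[t:], x[:T - t]) / T

def f_hat(l):
    # (1/2pi) * sum_{|t| < M} c(t) e^{-i t l} r*(t), Bartlett window
    # c(t) = 1 - |t|/M; symmetry of r* gives the cosine form below.
    ts = np.arange(1, M)
    c = 1.0 - ts / M
    rs = np.array([r_star(t) for t in ts])
    return (r_star(0) + 2.0 * np.sum(c * rs * np.cos(ts * l))) / (2.0 * np.pi)

f_true = lambda l: 1.0 / (2.0 * np.pi * np.abs(1.0 - a * np.exp(-1j * l))**2)
print(f_hat(0.5), f_true(0.5))
```

Widening the window ($ M \rightarrow \infty $, $ M/T \rightarrow 0 $) trades bias against variance, which is exactly the role of the a priori information on $ f( \lambda ) $ mentioned above.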
 
  
 
Problems in the statistical analysis of stationary processes sometimes also include problems of extrapolation, interpolation and filtration of stationary processes.
  
 
==Statistical problems of Markov processes.==
Let the observations $ X _ {0} \dots X _ {T} $ belong to a homogeneous [[Markov chain|Markov chain]]. Under sufficiently broad assumptions the likelihood function is

$$
\frac{dP _ \theta ^ {T} }{d \mu ^ {T} } = p _ {0} ( X _ {0} ; \theta ) p( X _ {1} | X _ {0} ; \theta ) \dots p( X _ {T} | X _ {T- 1} ; \theta ),
$$

where $ p _ {0} $ and $ p $ are the initial and transition densities of the distribution. This expression is similar to the likelihood function for a sequence of independent observations, and when the regularity conditions hold (smoothness in $ \theta \in \Theta \subset \mathbf R ^ {k} $), a theory of hypotheses testing and estimation can be constructed which is analogous to the corresponding theory for independent observations.
 
  
A more complex situation arises if  $  x( t) $
+
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450232.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450233.png" /> are the initial and transition densities of the distribution. This expression is similar to the likelihood function for a sequence of independent observations, and when the regularity conditions are observed (smoothness in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450234.png" />), a theory can be constructed for hypotheses testing and estimation which is analogous to the corresponding theory for independent observations.
is a [[Markov process|Markov process]] in continuous time. Let  $  x( t) $
 
be a homogeneous Markov process with a finite number of states  $  N $
 
and differentiable transition probabilities  $  P _ {ij} ( t) $.  
 
The transition probability matrix is defined by the matrix  $  Q = \| q _ {ij} \| $,
 
$  q _ {ij} = P _ {ij} ^ { \prime } ( 0) $,
 
$  q _ {i} = - q _ {ii} $.  
 
Let  $  x( 0) = i _ {0} $
 
be independent of  $  Q $
 
at the initial time. By choosing any matrix  $  Q _ {0} = \| q _ {ij}  ^ {0} \| $,
 
one finds
 
  
$$
+
A more complex situation arises if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450235.png" /> is a [[Markov process|Markov process]] in continuous time. Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450236.png" /> be a homogeneous Markov process with a finite number of states <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450237.png" /> and differentiable transition probabilities <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450238.png" />. The transition probability matrix is defined by the matrix <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450239.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450240.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450241.png" />. Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450242.png" /> be independent of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450243.png" /> at the initial time. By choosing any matrix <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450244.png" />, one finds
  
\frac{dP _ {Q}  ^ {T} }{dP _ {Q _ {0}  }  ^ {T} }
+
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450245.png" /></td> </tr></table>
( x)  = \
 
\mathop{\rm exp} \{ ( q _ {i _ {n}  }  ^ {0} - q _ {i _ {n}  } ) T \}
 
\sum _ {\nu = 0 } ^ { n- }  1
 
\frac{q _ {j _  \nu  j _ {\nu + 1 }  } }{q _ {j _  \nu  j _ {\nu + 1 }  }  ^ {0} }
 
\times
 
$$
 
  
$$
+
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s087/s087450/s087450246.png" /></td> </tr></table>
\times
 
\mathop{\rm exp} \{ t _  \nu  ( q _ {i _ {n}  } - q _ {i _  \nu  } - q _ {i _  \nu  }  ^ {0} + q _ {i _ {n}  }  ^ {0} ) \} .
 
$$
 
  
Here the statistics $  n( x) $, $  t _  \nu  ( x) $, $  j _  \nu  ( x) $ are defined in the following way: $  n $ is the number of jumps of $  x( t) $ on the interval $  [ 0, T) $; $  \tau _  \nu  $ is the moment of the $  \nu $-th jump, $  t _  \nu  = \tau _ {\nu + 1 }  - \tau _  \nu  $, and $  j _  \nu  = x( \tau _  \nu  ) $. It follows that the maximum-likelihood estimators for the parameters $  q _ {ij} $ are $  q _ {ij}  ^  \star  = m _ {ij} / \mu _ {i} $, where $  m _ {ij} $ is the number of transitions from $  i $ to $  j $ on $  [ 0, T) $, while $  \mu _ {i} $ is the time spent by the process $  x( t) $ in the state $  i $.
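A minimal simulation sketch (not from the article; the three-state chain, its intensity matrix and the horizon $  T $ are hypothetical choices) showing that the estimators $  q _ {ij}  ^  \star  = m _ {ij} / \mu _ {i} $ recover the transition intensities:

```python
import numpy as np

# Hypothetical intensity matrix Q for a 3-state chain (rows sum to zero, q_i = -q_ii)
Q = np.array([[-1.0, 0.6, 0.4],
              [0.3, -0.8, 0.5],
              [0.7, 0.3, -1.0]])
rng = np.random.default_rng(1)

def simulate(Q, T, i0=0):
    """Simulate the jump process on [0, T); return the transition counts m_ij
    and the occupation times mu_i."""
    N = Q.shape[0]
    m = np.zeros((N, N))
    mu = np.zeros(N)
    t, i = 0.0, i0
    while True:
        q_i = -Q[i, i]
        hold = rng.exponential(1.0 / q_i)      # exponential holding time in state i
        if t + hold >= T:
            mu[i] += T - t
            return m, mu
        mu[i] += hold
        t += hold
        p = Q[i].copy(); p[i] = 0.0; p /= q_i  # embedded-chain jump probabilities
        j = rng.choice(N, p=p)
        m[i, j] += 1
        i = j

m, mu = simulate(Q, T=5000.0)
Q_hat = m / mu[:, None]        # maximum-likelihood estimators q*_ij = m_ij / mu_i
```

For a long observation interval the off-diagonal entries of `Q_hat` approach the true intensities; the diagonal is not estimated directly (it is recovered from the row sums).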
 
===Example 9.===
Let $  x( t) $ be a [[Birth-and-death process|birth-and-death process]] with constant intensities of birth $  \lambda $ and death $  \mu $. This means that $  q _ {i,i+1} = i \lambda $, $  q _ {i,i-1} = i \mu $, $  q _ {ii} = - i( \lambda + \mu ) $, and $  q _ {ij} = 0 $ if $  | i- j | > 1 $. In this example the number of states is infinite. Let $  x( 0) \equiv 1 $. The likelihood ratio is

$$
\frac{dP _ {\lambda \mu }  ^ {T} }{dP _ {\lambda _ {0}  , \mu _ {0} }  ^ {T} }
( x) = \
\left (
\frac \lambda {\lambda _ {0} }
\right )  ^ {B}
\left (
\frac \mu {\mu _ {0} }
\right )  ^ {D}  \mathop{\rm exp} \left \{ -( \lambda +
\mu - \lambda _ {0} - \mu _ {0} ) \int\limits _ { 0 } ^ { T }  x( s)  ds \right \} .
$$

Here $  B $ is the total number of births (jumps of size $  + 1 $) and $  D $ is the number of deaths (jumps of size $  - 1 $). Maximum-likelihood estimators for $  \lambda $ and $  \mu $ are

$$
\lambda _ {T}  ^  \star  = \
\frac{B}{\int\limits _ { 0 } ^ { T }  x( s)  ds }
,\ \
\mu _ {T}  ^  \star  = \
\frac{D}{\int\limits _ { 0 } ^ { T }  x( s)  ds }
.
$$
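These estimators can be checked by a simple event-by-event simulation of the birth-and-death process. The intensities below are hypothetical, and the initial state is taken as $  x( 0) = 100 $ rather than $  1 $ (an illustrative deviation, so that the simulated path stays away from extinction):

```python
import numpy as np

rng = np.random.default_rng(2)
lam_true, mu_true = 1.0, 0.7   # hypothetical birth and death intensities
T, x = 5.0, 100                # observation horizon; x(0) = 100 for a surviving path
t = 0.0
B = D = 0                      # numbers of births and deaths (jumps of size +1 / -1)
occupation = 0.0               # int_0^T x(s) ds

while x > 0:
    rate = x * (lam_true + mu_true)        # total jump intensity in state x
    hold = rng.exponential(1.0 / rate)
    if t + hold >= T:
        occupation += (T - t) * x
        break
    occupation += hold * x
    t += hold
    if rng.random() < lam_true / (lam_true + mu_true):
        B += 1; x += 1                     # a birth
    else:
        D += 1; x -= 1                     # a death

lam_hat = B / occupation                   # lambda* = B / int_0^T x(s) ds
mu_hat = D / occupation                    # mu*     = D / int_0^T x(s) ds
```

The precision of the estimators is governed by the accumulated occupation time $  \int _ {0}  ^ {T} x( s)  ds $, not by $  T $ alone.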
 
  
Let $  x( t) $ be a diffusion process with drift coefficient $  a $ and diffusion coefficient $  b $, such that $  x( t) $ satisfies the [[Stochastic differential equation|stochastic differential equation]]

$$
dx( t) = a( t, x( t))  dt + b( t, x( t))  dw( t),\ \
x( 0) = x _ {0} ,
$$

where $  w $ is a Wiener process. Then, under specific restrictions,

$$
\frac{dP _ {a,b}  ^ {T} }{dP _ {a _ {0}  ,b }  ^ {T} }
( x)  = \
\mathop{\rm exp} \left \{ \int\limits _ { 0 } ^ { T }
\frac{a( t, x( t)) - a _ {0} ( t, x( t)) }{b( t, x( t)) }
 dx( t) -
\frac{1}{2}
\int\limits _ { 0 } ^ { T }
\frac{a  ^ {2} ( t, x( t)) - a _ {0}  ^ {2} ( t, x( t)) }{b( t, x( t)) }
  dt \right \}
$$

(here $  a _ {0} $ is a fixed coefficient).
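From discrete observations of a path, the likelihood ratio can be approximated by replacing the stochastic and ordinary integrals with Euler sums. In the sketch below the drift $  a( t, x) = - x $ (an Ornstein–Uhlenbeck process), the reference drift $  a _ {0} \equiv 0 $ and $  b \equiv 1 $ are illustrative assumptions; with $  b \equiv 1 $ any ambiguity in the normalization of the diffusion coefficient disappears:

```python
import numpy as np

rng = np.random.default_rng(3)
dt, n = 0.01, 20000                  # time grid on [0, T] with T = 200
a  = lambda t, x: -x                 # assumed true drift
a0 = lambda t, x: np.zeros_like(x)   # fixed reference drift (Wiener process)

# Euler-Maruyama path of dx = a dt + dw  (b = 1)
x = np.empty(n + 1); x[0] = 0.0
dw = rng.normal(0.0, np.sqrt(dt), n)
for k in range(n):
    x[k + 1] = x[k] + a(k * dt, x[k]) * dt + dw[k]

# Discretized log-likelihood ratio:
#   log(dP_a / dP_a0) ~ sum (a - a0) dx  -  (1/2) sum (a^2 - a0^2) dt
t = np.arange(n) * dt
av, a0v = a(t, x[:n]), a0(t, x[:n])
log_lr = np.sum((av - a0v) * np.diff(x)) - 0.5 * np.sum((av**2 - a0v**2) * dt)
```

Since the data are generated under the drift $  a $, the log-likelihood ratio should come out positive, favouring the true model over the reference one.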
 
  
 
===Example 10.===
Let

$$
dx( t)  = a( t, x( t); \theta )  dt + dw,
$$
 
 
 
where $  a $ is a known function and $  \theta $ is an unknown real parameter. If Wiener measure is denoted by $  \mu $, then the likelihood function is

$$
\frac{dP _  \theta  ^ {T} }{d \mu }
  = \
\mathop{\rm exp} \left \{ \int\limits _ { 0 } ^ { T }  a( t, x( t);  \theta )  dx -
\frac{1}{2}
\int\limits _ { 0 } ^ { T }  a  ^ {2} ( t, x( t);  \theta )  dt \right \} ,
$$
 
  
and, under regularity conditions, the Cramér–Rao inequality is satisfied: for an estimator $  \tau $ with bias $  \Delta ( \theta ) = {\mathsf E} _  \theta  \tau - \theta $,

$$
{\mathsf E} _  \theta  | \tau - \theta |  ^ {2}  \geq
\frac{( 1 + {d \Delta } / {d \theta } )  ^ {2} }{ {\mathsf E} _  \theta  \int\limits _ { 0 } ^ { T }  [ ( \partial  / {\partial  \theta } )
a( t, x( t);  \theta )]  ^ {2}  dt }
+ \Delta  ^ {2} ( \theta ).
$$
 
  
If the dependence on $  \theta $ is linear, $  a( t, x;  \theta ) = \theta a( t, x) $, the maximum-likelihood estimator is

$$
\theta _ {T}  ^  \star  = \
\frac{\int\limits _ { 0 } ^ { T }  a( t, x( t))  dx( t) }{\int\limits _ { 0 } ^ { T }  a  ^ {2} ( t, x( t))  dt }
.
$$
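In the linear case the estimator $  \theta _ {T}  ^  \star  $ can be approximated by Riemann–Itô sums over a discretely observed path. The choices $  a( t, x) = x $ and $  \theta = - 1 $ below are illustrative assumptions, not taken from the article:

```python
import numpy as np

rng = np.random.default_rng(4)
theta_true = -1.0                    # parameter to be recovered
dt, n = 0.01, 20000                  # grid on [0, T] with T = 200
a = lambda t, x: x                   # known function; the drift is theta * a(t, x)

# Euler-Maruyama path of dx = theta * a(t, x) dt + dw
x = np.empty(n + 1); x[0] = 1.0
dw = rng.normal(0.0, np.sqrt(dt), n)
for k in range(n):
    x[k + 1] = x[k] + theta_true * a(k * dt, x[k]) * dt + dw[k]

# theta* = ( int a dx ) / ( int a^2 dt ), discretized
av = a(np.arange(n) * dt, x[:n])
theta_hat = np.sum(av * np.diff(x)) / np.sum(av**2 * dt)
```

The denominator $  \int _ {0}  ^ {T} a  ^ {2}  dt $ plays the role of the observed Fisher information: the larger it grows with $  T $, the tighter the estimate.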
 
  
 
====References====
<table><TR><TD valign="top">[1]</TD> <TD valign="top"> U. Grenander, "Stochastic processes and statistical inference" ''Ark. Mat.'' , '''1''' (1950) pp. 195–277 {{MR|0039202}} {{ZBL|0058.35501}} {{ZBL|0041.45807}} </TD></TR><TR><TD valign="top">[2]</TD> <TD valign="top"> E.J. Hannan, "Time series analysis" , Methuen , London (1960) {{MR|0114281}} {{ZBL|0095.13204}} </TD></TR><TR><TD valign="top">[3]</TD> <TD valign="top"> U. Grenander, M. Rosenblatt, "Statistical analysis of stationary time series" , Wiley (1957) {{MR|0084975}} {{ZBL|0080.12904}} </TD></TR><TR><TD valign="top">[4]</TD> <TD valign="top"> U. Grenander, "Abstract inference" , Wiley (1981) {{MR|0599175}} {{ZBL|0505.62069}} </TD></TR><TR><TD valign="top">[5]</TD> <TD valign="top"> Yu.A. Rozanov, "Infinite-dimensional Gaussian distributions" ''Proc. Steklov Inst. Math.'' , '''108''' (1971) ''Trudy Mat. Inst. Steklov.'' , '''108''' (1968) {{MR|0436304}} {{ZBL|}} </TD></TR><TR><TD valign="top">[6]</TD> <TD valign="top"> I.A. Ibragimov, Yu.A. Rozanov, "Gaussian random processes" , Springer (1978) (Translated from Russian) {{MR|0543837}} {{ZBL|0392.60037}} </TD></TR><TR><TD valign="top">[7]</TD> <TD valign="top"> D.R. Brillinger, "Time series. Data analysis and theory" , Holt, Rinehart &amp; Winston (1975) {{MR|0443257}} {{ZBL|0321.62004}} </TD></TR><TR><TD valign="top">[8]</TD> <TD valign="top"> P. Billingsley, "Statistical inference for Markov processes" , Univ. Chicago Press (1961) {{MR|1531450}} {{MR|0123419}} {{ZBL|0106.34201}} </TD></TR><TR><TD valign="top">[9]</TD> <TD valign="top"> R.S. Liptser, A.N. Shiryaev, "Statistics of random processes" , '''1–2''' , Springer (1977–1978) (Translated from Russian) {{MR|1800858}} {{MR|1800857}} {{MR|0608221}} {{MR|0488267}} {{MR|0474486}} {{ZBL|1008.62073}} {{ZBL|1008.62072}} {{ZBL|0556.60003}} {{ZBL|0369.60001}} {{ZBL|0364.60004}} </TD></TR><TR><TD valign="top">[10]</TD> <TD valign="top"> A.M. Yaglom, "Correlation theory of stationary and related random functions" , '''1–2''' , Springer (1987) (Translated from Russian) {{MR|0915557}} {{MR|0893393}} {{ZBL|0685.62078}} {{ZBL|0685.62077}} </TD></TR><TR><TD valign="top">[11]</TD> <TD valign="top"> T.W. Anderson, "The statistical analysis of time series" , Wiley (1971) {{MR|0283939}} {{ZBL|0225.62108}} </TD></TR></table>

Revision as of 14:53, 7 June 2020

A branch of mathematical statistics devoted to statistical inferences on the basis of observations represented as a random process. In the most common formulation, the values of a random function for are observed, and on the basis of these observations statistical inferences must be made regarding certain characteristics of . Such a broad definition also formally includes all the classical statistics of independent observations. In fact, statistics of random processes is often taken to mean only the statistics of dependent observations, excluding the statistical analysis of a large number of independent realizations of a random process. Here the foundations of the statistical theory, the basic formulations of problems (cf. Statistical estimation; Statistical hypotheses, verification of), the basic concepts (sufficiency, unbiasedness, consistency, etc.) are the same as in the classical theory. However, when solving concrete problems, significant difficulties and phenomena of a new type arise from time to time. These difficulties are partially caused by the fact of dependence and the more complex structure of the process under consideration, and partially, in the case of observations in continuous time, by the need to examine distributions in infinite-dimensional spaces.

In fact, when solving statistical problems in the theory of random processes, the structure of the process under consideration is crucial, while when classifying random processes, statistical problems of Gaussian, Markov, stationary, branching, diffusion, and other processes are studied. Of these, the most far-reaching is the statistical theory of stationary processes (time-series analysis).

The need for a statistical analysis of random processes arose in the 19th century in the form of analysis of meteorological and economic series and studies on cyclic processes (price fluctuation, sun spots). In modern times, the number of problems covered by the statistical analysis of random processes has become extraordinarily large. To cite but a few examples, statistical analysis of random noise, vibrations, turbulence, wave motion in a sea, cardiograms and encephalograms, etc. The theoretical aspects of extracting a signal from a background of noise can be seen to a significant degree as a statistical problem in the theory of random processes.

Below it is proposed that a segment , , of the random process be observed, whereby the parameter passes either through the whole interval , or through the integers in this interval. In statistical problems, the distribution of the process is usually known only to belong to some family of distributions . This family can always be written in parametric form.

Example 1.

The process is either the sum of a non-random function (a "signal" ) and a random function (the "noise" ), or is a single random function . The hypothesis : must be tested against the alternative : (the problem of locating a signal in noise). This is an example of testing a statistical hypothesis.

Example 2.

The process , where is an unknown non-random function (the signal), while is a random process (the noise). The function , or its value at a given point , has to be estimated. Similarly, it can be proposed that , where is a known function, depending on an unknown parameter , which must also be estimated through the observation of (problems of extracting a signal from a background of noise). These are examples of estimation problems.

The likelihood ratio for random processes.

In statistical problems, likelihood ratios and likelihood functions play an important role (see Neyman–Pearson lemma; Statistical hypotheses, verification of; Statistical estimation). The likelihood ratio of two distributions and is the density

The likelihood function is the function

where is a -finite measure relative to which all measures are absolutely continuous. In the discrete case, where runs through the integers of and , the likelihood ratio always exists if the distributions and have positive densities, and it coincides with the ratio of these two densities.

If runs through the entire interval , then cases may arise in which the measures and are not absolutely continuous with respect to each other; moreover, situations can arise in which and are mutually singular, i.e. where for a set in the space of realizations of ,

In this case does not exist. The singularity of the measures leads to important and somewhat paradoxical statistical results, allowing for error-free inferences concerning the parameter . For example, let ; the singularity of the measures and means that, using the test "accept H0 if x A, reject H0 if x A" , the hypotheses : and : are distinguished error-free. The presence of such perfect tests often demonstrates that the statistical problem is not posed entirely satisfactorily and that certain essential random disturbances are excluded from it.

Example 3.

Let , where is a stationary ergodic process with zero average and is a real parameter. Let the realizations of with probability 1 be analytic in a strip containing the real axis. According to the ergodic theorem,

and all measures are also mutually singular. Since an analytic function is completely defined by its values in a neighbourhood of zero, the parameter is error-free when estimated through the observations for any .

The calculation of the likelihood ratio in those cases where it exists is a difficult problem. Calculations are often based on the limit relation

where are the densities of the vector , while is a dense set in . Study of the right-hand side of the above equality also is useful in investigating the possible singularity of and .

Example 4.

Suppose one has either the observation , where is a Wiener process (hypothesis ), or , where is a non-random function (hypothesis ). The measures are mutually absolutely continuous if , and mutually singular if . The likelihood ratio equals

Example 5.

Let , where is a real parameter and is a stationary Gaussian Markov process with mean zero and known correlation function , . The measures are mutually absolutely continuous with likelihood function

In particular, is a sufficient statistic for the family .

Linear problems in the statistics of random processes.

Let the function

(*)

be observed, where is a random process with mean zero and known correlation function , are known non-random functions, is an unknown parameter ( are the regression coefficients), and the parameter set is a subset of . Linear estimators for are estimators of the form , or their limits in the mean square. The problem of finding optimal unbiased linear estimators in the mean square reduces to the solution of linear algebraic or linear integral equations in . Indeed, an optimal estimator is defined by the equations for any of the form , . In a number of cases, estimators of , obtained asymptotically by the method of least squares, when , are not worse than the optimal linear estimators. Estimators obtained by the method of least squares are calculated more simply and do not depend on .

Example 6.

Under the conditions of example 5, , . The optimal unbiased linear estimator takes the form

The estimator

has asymptotically the same variance.

Statistical problems of Gaussian processes.

Let be a Gaussian process for all . For Gaussian processes one has the alternatives: Any two measures are either mutually absolutely continuous or are singular. Since the Gaussian distribution is completely defined by the mean value and the correlation function , the likelihood ratio is expressed in terms of , , , in a complex way. The case where , and a continuous function, is relatively simple. Let , ; let , and , be the eigenvalues, and the corresponding normalized eigenfunctions in , of the integral equation

let the means and be continuous functions; and let

The measures are absolutely continuous if and only if

Here,

This equality can be used to devise a test for the hypothesis : against the alternative : under the assumption that the function is known to the observer.

Statistical problems of stationary processes.

Let the observation be a stationary process with mean and correlation function ; let and be its spectral density and spectral function, respectively. The basic problems of the statistics of stationary processes relate to hypotheses testing and to estimating the characteristics , , , . In the case of an ergodic process , consistent estimators (when ) for and , respectively, are provided by

The problem of estimating when is known is often treated as a linear problem. This group of problems also includes the more general problems of estimating regression coefficients through observations of the form (*) with stationary .

Let have zero mean and spectral density depending on a finite-dimensional parameter . If the process is Gaussian, formulas can be derived for the likelihood ratio (if the ratio exists), which in a number of cases make it possible to find maximum-likelihood estimators or "good" approximations of them (for large ). Under sufficiently broad assumptions these estimators are asymptotically normal and asymptotically efficient.

Example 7.

Let be a stationary Gaussian process in continuous time with rational spectral density , where and are polynomials. The measures corresponding to the rational spectral densities are absolutely continuous if and only if

Here the parameter is the set of all coefficients of the polynomials .

Example 8.

An important class of stationary Gaussian processes consists of the auto-regressive processes (cf. Auto-regressive process) :

where is a Gaussian white noise of unit intensity and is an unknown parameter. In this case the spectral density is

where

The likelihood function is

Here, and are quadratic forms in , depending on the values , , at the points , and is the determinant of the correlation matrix of the vector .

Maximum-likelihood estimators for the auto-regression parameter are asymptotically normal and asymptotically efficient. These properties are shared by the solution of the approximate likelihood equation

An important role in statistical studies on the spectrum of a stationary process is played by the periodogram $I_T(\lambda)$. This statistic is defined as

$$I_T(\lambda) = \frac{1}{2\pi T}\left|\int_0^T e^{-i\lambda t} x(t)\,dt\right|^2$$

(with the integral replaced by a sum in discrete time). The periodogram is widely used in constructing different kinds of estimators for $f(\lambda)$, $F(\lambda)$, and criteria for testing hypotheses on these characteristics. Under broad assumptions, the statistics $\int \varphi(\lambda) I_T(\lambda)\,d\lambda$ are consistent estimators for $\int \varphi(\lambda) f(\lambda)\,d\lambda$. In particular, $\int_{\lambda_1}^{\lambda_2} I_T(\lambda)\,d\lambda$ may serve as an estimator for $F(\lambda_2) - F(\lambda_1)$. If the sequence of functions $\varphi_T(\lambda)$ converges in an appropriate way to the $\delta$-function $\delta(\lambda - \lambda_0)$, then the integrals $\int \varphi_T(\lambda) I_T(\lambda)\,d\lambda$ will be consistent estimators for $f(\lambda_0)$. Functions of the form $\varphi_T(\lambda) = T^{\alpha}\varphi\bigl(T^{\alpha}(\lambda - \lambda_0)\bigr)$, $0 < \alpha < 1$, are often used in the capacity of the functions $\varphi_T$. If $x(t)$ is a process in discrete time, these estimators can be written in the form

$$\hat f(\lambda_0) = \frac{1}{2\pi} \sum_{|k| \le n} c_k \hat B(k) e^{-ik\lambda_0},$$

where the empirical correlation function $\hat B(k)$ is

$$\hat B(k) = \frac{1}{n}\sum_{t=1}^{n-|k|} x(t+|k|)\,x(t),$$

while the non-random coefficients $c_k$ are defined by the choice of $\varphi$ and $\alpha$. This choice, in turn, depends on a priori information on $f(\lambda)$. A similar representation also holds for processes in continuous time.
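The discrete-time construction can be sketched as follows (the Bartlett weights, white-noise example and all numerical values are illustrative choices, not prescribed by the article):

```python
import cmath
import math
import random

def periodogram(x, lam):
    """Discrete-time periodogram I_n(lam) = |sum_t x(t) e^{-i*lam*t}|^2 / (2*pi*n).
    At a fixed frequency it is not consistent for f(lam) and must be smoothed,
    e.g. by the lag-window estimator below."""
    n = len(x)
    s = sum(x[t] * cmath.exp(-1j * lam * t) for t in range(n))
    return abs(s) ** 2 / (2 * math.pi * n)

def lag_window_estimate(x, lam0, max_lag):
    """f_hat(lam0) = (1/2pi) sum_{|k| <= K} c_k B_hat(k) e^{-i*k*lam0}, with
    triangular (Bartlett) weights c_k = 1 - |k|/K as one concrete choice and
    the empirical correlation function B_hat(k) (zero mean assumed)."""
    n = len(x)
    def b_hat(k):
        return sum(x[t + k] * x[t] for t in range(n - k)) / n
    total = b_hat(0)
    for k in range(1, max_lag + 1):
        total += 2 * (1 - k / max_lag) * b_hat(k) * math.cos(k * lam0)
    return total / (2 * math.pi)

# Gaussian white noise of unit variance has constant density f = 1/(2*pi).
random.seed(2)
x = [random.gauss(0.0, 1.0) for _ in range(50_000)]
f_hat = lag_window_estimate(x, 1.0, 30)
print(f_hat)
```

The truncation point $K$ plays the role of the smoothing choice $\varphi$, $\alpha$ above: larger $K$ reduces bias but increases the variance of the estimator.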

Problems in the statistical analysis of stationary processes sometimes also include the problems of extrapolation, interpolation and filtration of such processes.

Statistical problems of Markov processes.

Let the observations $x_0, x_1, \dots, x_n$ belong to a homogeneous Markov chain. Under sufficiently broad assumptions the likelihood function is

$$p(x_0,\dots,x_n;\theta) = p_0(x_0;\theta) \prod_{t=1}^{n} p(x_{t-1}, x_t;\theta),$$

where $p_0(x;\theta)$, $p(x,y;\theta)$ are the initial and transition densities of the distribution. This expression is similar to the likelihood function for a sequence of independent observations, and when the regularity conditions are observed (smoothness in $\theta$), a theory can be constructed for hypotheses testing and estimation which is analogous to the corresponding theory for independent observations.
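When the chain has finitely many states and the transition probabilities themselves are the parameters, maximizing this likelihood gives the familiar frequency estimators. A minimal sketch (the two-state chain and its transition matrix are illustrative choices):

```python
import random

def transition_mle(chain, n_states):
    """MLE of the transition matrix of a homogeneous Markov chain observed at
    times 0, ..., n: p_hat(i, j) = n_ij / n_i, where n_ij counts one-step
    transitions i -> j and n_i = sum_j n_ij."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(chain, chain[1:]):
        counts[a][b] += 1
    p_hat = []
    for row in counts:
        total = sum(row)
        p_hat.append([c / total if total else 0.0 for c in row])
    return p_hat

# Illustrative two-state chain with transition matrix P.
random.seed(3)
P = [[0.9, 0.1], [0.4, 0.6]]
state, chain = 0, [0]
for _ in range(100_000):
    state = 0 if random.random() < P[state][0] else 1
    chain.append(state)

P_hat = transition_mle(chain, 2)
print(P_hat[0][0], P_hat[1][1])
```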

A more complex situation arises if $x(t)$ is a Markov process in continuous time. Let $x(t)$ be a homogeneous Markov process with a finite number of states $1,\dots,N$ and differentiable transition probabilities $p_{ij}(t)$. The transition probability matrix $\|p_{ij}(t)\|$ is defined by the matrix $\Lambda = \|\lambda_{ij}\|$, $\lambda_{ij} = p_{ij}'(0)$, $i \ne j$. Let the distribution of $x(t)$ at the initial time be independent of $\Lambda$. By choosing any matrix $\Lambda^0 = \|\lambda_{ij}^0\|$, one finds

$$\frac{dP_{\Lambda}}{dP_{\Lambda^0}}(x) = \exp\left\{\sum_{k=1}^{\nu} \ln\frac{\lambda_{x(\tau_{k-1})x(\tau_k)}}{\lambda^0_{x(\tau_{k-1})x(\tau_k)}} - \sum_{k=0}^{\nu}(\tau_{k+1}-\tau_k)\bigl(\lambda_{x(\tau_k)} - \lambda^0_{x(\tau_k)}\bigr)\right\}, \qquad \lambda_i = \sum_{j \ne i}\lambda_{ij}.$$

Here the statistics $\nu$, $\tau_k$ are defined in the following way: $\nu$ is the number of jumps of $x(t)$ on the interval $[0,T]$; $\tau_k$ is the moment of the $k$-th jump, $\tau_0 = 0$, and $\tau_{\nu+1} = T$. It follows that the maximum-likelihood estimators for the parameters $\lambda_{ij}$ are: $\hat\lambda_{ij} = n_{ij}/\tau_i$, where $n_{ij}$ is the number of transitions from $i$ to $j$ on $[0,T]$, while $\tau_i$ is the total time spent by the process in the state $i$.
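The estimator $\hat\lambda_{ij} = n_{ij}/\tau_i$ can be checked on a simulated path. A sketch for a two-state process (the intensities and the horizon $T$ are illustrative choices):

```python
import random

# Two-state continuous-time Markov process with intensities
# lambda_01 = 2 and lambda_10 = 1; the MLE from a path observed
# on [0, T] is lambda_hat_ij = n_ij / tau_i.
random.seed(4)
rates = {0: 2.0, 1: 1.0}
T = 10_000.0
t, state = 0.0, 0
n_jumps = {0: 0, 1: 0}   # n_jumps[i]: transitions out of state i (here i -> 1-i)
tau = {0: 0.0, 1: 0.0}   # tau[i]: total time spent in state i
while True:
    hold = random.expovariate(rates[state])
    if t + hold > T:
        tau[state] += T - t    # censor the last sojourn at T
        break
    tau[state] += hold
    t += hold
    n_jumps[state] += 1
    state = 1 - state

lam01_hat = n_jumps[0] / tau[0]
lam10_hat = n_jumps[1] / tau[1]
print(lam01_hat, lam10_hat)
```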

Example 9.

Let $x(t)$ be a birth-and-death process with constant per-individual intensities of birth and death $\lambda$ and $\mu$. This means that $\lambda_{i,i+1} = i\lambda$, $\lambda_{i,i-1} = i\mu$, $i \ge 1$, and $\lambda_{ij} = 0$ if $|i-j| > 1$. In this example the number of states is infinite. Let $x(0) = x_0$. The likelihood ratio (with respect to the measure corresponding to $\lambda = \mu = 1$) is

$$\frac{dP_{\lambda,\mu}}{dP_{1,1}}(x) = \lambda^{\nu_+} \mu^{\nu_-} \exp\left\{-(\lambda + \mu - 2)\int_0^T x(t)\,dt\right\}.$$

Here $\nu_+$ is the total number of births (jumps of size $+1$) and $\nu_-$ is the number of deaths (jumps of size $-1$). Maximum-likelihood estimators for $\lambda$ and $\mu$ are

$$\hat\lambda = \frac{\nu_+}{\int_0^T x(t)\,dt}, \qquad \hat\mu = \frac{\nu_-}{\int_0^T x(t)\,dt}.$$
Let $x(t)$ be a diffusion process with drift coefficient $a(t,x;\theta)$ and diffusion coefficient $b^2(t,x)$, such that $x(t)$ satisfies the stochastic differential equation

$$dx(t) = a(t,x(t);\theta)\,dt + b(t,x(t))\,dw(t), \qquad 0 \le t \le T,$$

where $w(t)$ is a Wiener process. Then, under specific restrictions, the measures $P_{\theta_1}$, $P_{\theta_2}$ are mutually absolutely continuous and

$$\frac{dP_{\theta_1}}{dP_{\theta_2}}(x) = \exp\left\{\int_0^T \frac{a(t,x(t);\theta_1)-a(t,x(t);\theta_2)}{b^2(t,x(t))}\,dx(t) - \frac12\int_0^T \frac{a^2(t,x(t);\theta_1)-a^2(t,x(t);\theta_2)}{b^2(t,x(t))}\,dt\right\}$$

(here $b$ is a fixed known coefficient).

Example 10.

Let

$$dx(t) = a(t;\theta)\,dt + dw(t), \qquad 0 \le t \le T,$$

where $a(t;\theta)$ is a known function and $\theta$ is an unknown real parameter. If Wiener measure is denoted by $P_w$, then the likelihood function is

$$\frac{dP_\theta}{dP_w}(x) = \exp\left\{\int_0^T a(t;\theta)\,dx(t) - \frac12\int_0^T a^2(t;\theta)\,dt\right\},$$

and, under regularity conditions, the Cramér–Rao inequality is satisfied: for an estimator $\hat\theta$ with bias $b(\theta) = \mathsf{E}_\theta\hat\theta - \theta$,

$$\mathsf{E}_\theta(\hat\theta - \theta)^2 \ge \frac{\bigl(1 + b'(\theta)\bigr)^2}{\displaystyle\int_0^T \left(\frac{\partial a(t;\theta)}{\partial\theta}\right)^2 dt} + b^2(\theta).$$

If the dependence $a(t;\theta) = \theta a(t)$ on $\theta$ is linear, the maximum-likelihood estimator is

$$\hat\theta = \frac{\int_0^T a(t)\,dx(t)}{\int_0^T a^2(t)\,dt}.$$
References

[1] U. Grenander, "Stochastic processes and statistical inference" Ark. Mat. , 1 (1950) pp. 195–277 MR0039202 Zbl 0058.35501 Zbl 0041.45807
[2] E.J. Hannan, "Time series analysis" , Methuen , London (1960) MR0114281 Zbl 0095.13204
[3] U. Grenander, M. Rosenblatt, "Statistical analysis of stationary time series" , Wiley (1957) MR0084975 Zbl 0080.12904
[4] U. Grenander, "Abstract inference" , Wiley (1981) MR0599175 Zbl 0505.62069
[5] Yu.A. Rozanov, "Infinite-dimensional Gaussian distributions" Proc. Steklov Inst. Math. , 108 (1971) Trudy Mat. Inst. Steklov. , 108 (1968) MR0436304
[6] I.A. Ibragimov, Yu.A. Rozanov, "Gaussian random processes" , Springer (1978) (Translated from Russian) MR0543837 Zbl 0392.60037
[7] D.R. Brillinger, "Time series. Data analysis and theory" , Holt, Rinehart & Winston (1975) MR0443257 Zbl 0321.62004
[8] P. Billingsley, "Statistical inference for Markov processes" , Univ. Chicago Press (1961) MR1531450 MR0123419 Zbl 0106.34201
[9] R.S. Liptser, A.N. Shiryaev, "Statistics of random processes" , 1–2 , Springer (1977–1978) (Translated from Russian) MR1800858 MR1800857 MR0608221 MR0488267 MR0474486 Zbl 1008.62073 Zbl 1008.62072 Zbl 0556.60003 Zbl 0369.60001 Zbl 0364.60004
[10] A.M. Yaglom, "Correlation theory of stationary and related random functions" , 1–2 , Springer (1987) (Translated from Russian) MR0915557 MR0893393 Zbl 0685.62078 Zbl 0685.62077
[11] T.W. Anderson, "The statistical analysis of time series" , Wiley (1971) MR0283939 Zbl 0225.62108
How to Cite This Entry:
Statistical problems in the theory of stochastic processes. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Statistical_problems_in_the_theory_of_stochastic_processes&oldid=48818
This article was adapted from an original article by I.A. Ibragimov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article