Statistical problems in the theory of stochastic processes
A branch of mathematical statistics devoted to statistical inference based on observations that form a random process. In the most common formulation, the values of a random function $x(t)$ for $t \in T$ are observed, and on the basis of these observations statistical inferences must be made regarding certain characteristics of $x(t)$. Such a broad definition formally includes all the classical statistics of independent observations. In practice, the statistics of random processes is often taken to mean only the statistics of dependent observations, excluding the statistical analysis of a large number of independent realizations of a random process. The foundations of the statistical theory, the basic formulations of problems (cf. Statistical estimation; Statistical hypotheses, verification of) and the basic concepts (sufficiency, unbiasedness, consistency, etc.) are the same as in the classical theory. However, when solving concrete problems, significant difficulties and phenomena of a new type may arise. These difficulties are caused partly by the dependence and the more complex structure of the process under consideration, and partly, in the case of observations in continuous time, by the need to examine distributions in infinite-dimensional spaces.
When solving statistical problems in the theory of random processes, the structure of the process under consideration is crucial. Accordingly, following the classification of random processes, one studies statistical problems of Gaussian, Markov, stationary, branching, diffusion, and other processes. Of these, the most developed is the statistical theory of stationary processes (time-series analysis).
The need for a statistical analysis of random processes arose in the 19th century in the form of the analysis of meteorological and economic series and studies on cyclic processes (price fluctuations, sunspots). In modern times, the number of problems covered by the statistical analysis of random processes has become extraordinarily large. Examples include the statistical analysis of random noise, vibrations, turbulence, sea waves, cardiograms and encephalograms. The theoretical aspects of extracting a signal from a background of noise can, to a significant degree, be regarded as a statistical problem in the theory of random processes.
Below it is assumed that a segment $x(t)$, $0 \le t \le T$, of the random process $x(t)$ is observed, where the parameter $t$ runs either through the whole interval $[0,T]$ or through the integers in this interval. In statistical problems, the distribution $P^T$ of the process $\{x(t), 0 \le t \le T\}$ is usually known only to belong to some family of distributions $\{P_\theta^T, \theta \in \Theta\}$. This family can always be written in parametric form.
Example 1. The process $x(t)$ is either the sum of a non-random function $s(t)$ (a "signal") and a random function $\xi(t)$ (the "noise"), or is the single random function $\xi(t)$. The hypothesis $H_0$: $x(t) = s(t) + \xi(t)$ must be tested against the alternative $H_1$: $x(t) = \xi(t)$ (the problem of detecting a signal in noise). This is an example of testing a statistical hypothesis.
Example 2. The process $x(t) = s(t) + \xi(t)$ is observed, where $s(t)$ is an unknown non-random function (the signal), while $\xi(t)$ is a random process (the noise). The function $s$, or its value $s(t_0)$ at a given point $t_0$, has to be estimated. Similarly, it can be assumed that $x(t) = s(t;\theta) + \xi(t)$, where $s(t;\theta)$ is a known function depending on an unknown parameter $\theta$, which must be estimated from the observation of $x(t)$ (problems of extracting a signal from a background of noise). These are examples of estimation problems.
The likelihood ratio for random processes.
In statistical problems, likelihood ratios and likelihood functions play an important role (see Neyman–Pearson lemma; Statistical hypotheses, verification of; Statistical estimation). The likelihood ratio of two distributions $P_u^T$ and $P_v^T$ is the density
$$p(x;u,v) = \frac{dP_u^T}{dP_v^T}(x).$$
The likelihood function is the function
$$p(x;\theta) = \frac{dP_\theta^T}{d\mu}(x),$$
where $\mu$ is a $\sigma$-finite measure relative to which all measures $P_\theta^T$ are absolutely continuous. In the discrete case, where $t$ runs through the integers of $[0,T]$ and $T < \infty$, the likelihood ratio always exists if the distributions $P_u^T$ and $P_v^T$ have positive densities, and it coincides with the ratio of these two densities.
If $t$ runs through the entire interval $[0,T]$, then cases may arise in which the measures $P_u^T$ and $P_v^T$ are not absolutely continuous with respect to each other; moreover, situations can arise in which $P_u^T$ and $P_v^T$ are mutually singular, i.e. where for some set $A$ in the space of realizations of $x$,
$$P_u^T(A) = 0, \quad P_v^T(A) = 1.$$
In this case $dP_v^T/dP_u^T$ does not exist. The singularity of the measures leads to important and somewhat paradoxical statistical results, allowing for error-free inferences concerning the parameter $\theta$. For example, let $\Theta = \{0,1\}$; the singularity of the measures $P_0^T$ and $P_1^T$, with $P_0^T(A) = 0$ and $P_1^T(A) = 1$, means that, using the test "accept $H_0$ if $x \notin A$, reject $H_0$ if $x \in A$", the hypotheses $H_0$: $\theta = 0$ and $H_1$: $\theta = 1$ are distinguished error-free. The presence of such perfect tests often indicates that the statistical problem is not posed entirely satisfactorily and that certain essential random disturbances are excluded from it.
Example 3. Let $x(t) = \theta + \xi(t)$, where $\xi(t)$ is a stationary ergodic process with zero mean and $\theta$ is a real parameter. Let the realizations of $\xi(t)$ with probability 1 be analytic in a strip containing the real axis. According to the ergodic theorem,
$$\lim_{T \to \infty} \frac{1}{T} \int_0^T x(t)\,dt = \theta$$
with probability 1, so that all measures $P_\theta^\infty$ are mutually singular. Since an analytic function is completely determined by its values in a neighbourhood of zero, the parameter $\theta$ can be estimated error-free from the observations $\{x(t), 0 \le t \le T\}$ for any $T > 0$.
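The calculation of the likelihood ratio in those cases where it exists is a difficult problem. Calculations are often based on the limit relation
$$\frac{dP_u^T}{dP_v^T}(x) = \lim_{n \to \infty} \frac{p_n(x(t_1),\dots,x(t_n);u)}{p_n(x(t_1),\dots,x(t_n);v)},$$
where $p_n(\cdot\,;\theta)$ are the densities of the vector $(x(t_1),\dots,x(t_n))$, while $\{t_1,t_2,\dots\}$ is a dense set in $[0,T]$. Study of the right-hand side of the above equality is also useful in investigating the possible singularity of $P_u^T$ and $P_v^T$.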
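Example 4. Suppose one has either the observation $x(t) = w(t)$, where $w(t)$ is a Wiener process (hypothesis $H_0$), or $x(t) = m(t) + w(t)$, where $m(t)$ is a non-random function with $m(0) = 0$ (hypothesis $H_1$). The measures $P_0^T$, $P_1^T$ are mutually absolutely continuous if $m$ is absolutely continuous with $m' \in L_2(0,T)$, and mutually singular otherwise. In the absolutely continuous case the likelihood ratio equals
$$\frac{dP_1^T}{dP_0^T}(x) = \exp\left\{\int_0^T m'(t)\,dx(t) - \frac{1}{2}\int_0^T [m'(t)]^2\,dt\right\}.$$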
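The likelihood ratio for the Wiener-plus-signal model can be approximated from a sampled path. The following is a minimal sketch, assuming the classical Cameron–Martin form $\log (dP_1^T/dP_0^T) = \int_0^T m'(t)\,dx(t) - \frac{1}{2}\int_0^T [m'(t)]^2\,dt$; the function name `log_likelihood_ratio`, the finite-difference derivative and the choice of signal $m(t) = 2t$ are illustrative assumptions and do not come from the original text.

```python
import numpy as np

def log_likelihood_ratio(x, m, t):
    """Approximate log dP_1/dP_0 for H0: x = w versus H1: x = m + w.

    x, m, t are arrays of equal length: the observed path, the signal
    and the (increasing) sampling times, with t[0] = 0 and m[0] = 0.
    """
    x = np.asarray(x, dtype=float)
    m = np.asarray(m, dtype=float)
    t = np.asarray(t, dtype=float)
    dt = np.diff(t)
    m_prime = np.diff(m) / dt                    # finite-difference derivative of the signal
    dx = np.diff(x)                              # increments of the observed path
    stochastic_integral = np.sum(m_prime * dx)   # Ito-type sum for int m'(t) dx(t)
    energy = np.sum(m_prime ** 2 * dt)           # Riemann sum for int m'(t)^2 dt
    return stochastic_integral - 0.5 * energy

# Illustration on simulated paths (hypothetical signal m(t) = 2t).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 1001)
w = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(np.diff(t))))])
m = 2.0 * t
print(log_likelihood_ratio(m + w, m, t))   # path generated under H1
print(log_likelihood_ratio(w, m, t))       # path generated under H0
```

A Neyman–Pearson test would compare this statistic with a threshold; the threshold itself depends on the desired error probabilities and is not fixed by the sketch.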
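Example 5. Let $x(t) = \theta + \xi(t)$, where $\theta$ is a real parameter and $\xi(t)$ is a stationary Gaussian Markov process with mean zero and known correlation function $r(t) = e^{-\alpha|t|}$, $\alpha > 0$. The measures $P_\theta^T$ are mutually absolutely continuous with likelihood function
$$\frac{dP_\theta^T}{dP_0^T}(x) = \exp\left\{\frac{\theta}{2}\left[x(0) + x(T) + \alpha\int_0^T x(t)\,dt\right] - \frac{\theta^2}{2}\left(1 + \frac{\alpha T}{2}\right)\right\}.$$
In particular, $x(0) + x(T) + \alpha\int_0^T x(t)\,dt$ is a sufficient statistic for the family $\{P_\theta^T\}$.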
Linear problems in the statistics of random processes.
Let the function
$$x(t) = \sum_{j=1}^{k} \theta_j \varphi_j(t) + \xi(t) \tag{*}$$
be observed, where $\xi(t)$ is a random process with mean zero and known correlation function $r(s,t)$, $\varphi_j$ are known non-random functions, $\theta = (\theta_1,\dots,\theta_k)$ is an unknown parameter ($\theta_j$ are the regression coefficients), and the parameter set $\Theta$ is a subset of $\mathbf{R}^k$. Linear estimators for $\theta_j$ are estimators of the form $\sum_i c_i x(t_i)$, or their limits in the mean square. The problem of finding optimal (in the mean square) unbiased linear estimators reduces to the solution of linear algebraic or linear integral equations involving $r$. Indeed, an optimal estimator $\hat\theta_j$ is characterized by the equations $\mathsf{E}_\theta(\hat\theta_j \eta) = 0$ for any $\eta$ of the form $\eta = \sum_i c_i x(t_i)$ with $\sum_i c_i \varphi_l(t_i) = 0$, $l = 1,\dots,k$, that is, for any linear statistic with zero mean. In a number of cases, least-squares estimators of $\theta$ are asymptotically, as $T \to \infty$, no worse than the optimal linear estimators. Least-squares estimators are simpler to compute and do not depend on $r$.
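For observations at finitely many points, the linear algebraic equations mentioned above are those of generalized least squares. The sketch below assumes observations $x(t_1),\dots,x(t_n)$, a design matrix `Phi` with entries $\varphi_j(t_i)$ and a noise covariance matrix `R` with entries $r(t_i,t_l)$; the function names and the exponential covariance used in the illustration are assumptions made for this example only.

```python
import numpy as np

def blue_estimate(x, Phi, R):
    """Best linear unbiased estimate of theta in x = Phi @ theta + noise.

    x   : (n,) observations x(t_1), ..., x(t_n)
    Phi : (n, k) matrix with entries phi_j(t_i)
    R   : (n, n) known covariance matrix of the noise
    Returns the estimate and its covariance matrix.
    """
    R_inv_Phi = np.linalg.solve(R, Phi)      # R^{-1} Phi
    R_inv_x = np.linalg.solve(R, x)          # R^{-1} x
    gram = Phi.T @ R_inv_Phi                 # Phi^T R^{-1} Phi
    theta_hat = np.linalg.solve(gram, Phi.T @ R_inv_x)
    return theta_hat, np.linalg.inv(gram)

def least_squares_estimate(x, Phi):
    """Ordinary least squares; this estimator does not use R."""
    theta_hat, *_ = np.linalg.lstsq(Phi, x, rcond=None)
    return theta_hat

# Illustration: linear trend observed in noise with exponential covariance.
t = np.linspace(0.0, 5.0, 51)
Phi = np.column_stack([np.ones_like(t), t])
R = np.exp(-np.abs(t[:, None] - t[None, :]))
rng = np.random.default_rng(1)
noise = np.linalg.cholesky(R) @ rng.standard_normal(t.size)
x = Phi @ np.array([1.0, 2.0]) + noise
print(blue_estimate(x, Phi, R)[0], least_squares_estimate(x, Phi))
```

Solving linear systems with `np.linalg.solve` instead of forming $R^{-1}$ explicitly is a numerical design choice; both estimators are unbiased, and they differ only in variance.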
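Example 6. Under the conditions of example 5, $k = 1$, $\varphi_1(t) \equiv 1$. The optimal unbiased linear estimator takes the form
$$\hat\theta = \frac{1}{2 + \alpha T}\left[x(0) + x(T) + \alpha\int_0^T x(t)\,dt\right].$$
The least-squares estimator
$$\theta^* = \frac{1}{T}\int_0^T x(t)\,dt$$
has asymptotically the same variance.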
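The comparison of the two estimators of the mean can be checked numerically. The sketch below assumes the exponential correlation function $r(t) = e^{-\alpha|t|}$, for which the optimal linear estimator is $[x(0) + x(T) + \alpha\int_0^T x(t)\,dt]/(2 + \alpha T)$ with variance $2/(2 + \alpha T)$, while the time average has variance $\frac{2}{\alpha T}\left(1 - \frac{1 - e^{-\alpha T}}{\alpha T}\right)$. The function names and the trapezoidal approximation of the integral are illustrative assumptions.

```python
import numpy as np

def _trapezoid(x, t):
    """Trapezoidal approximation of the integral of x over the grid t."""
    return np.sum(0.5 * (x[1:] + x[:-1]) * np.diff(t))

def blue_mean(x, t, alpha):
    """Optimal linear unbiased estimator of the mean from a sampled path."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    T = t[-1] - t[0]
    return (x[0] + x[-1] + alpha * _trapezoid(x, t)) / (2.0 + alpha * T)

def ls_mean(x, t):
    """Least-squares estimator: the time average of the path."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    return _trapezoid(x, t) / (t[-1] - t[0])

def blue_variance(alpha, T):
    return 2.0 / (2.0 + alpha * T)

def ls_variance(alpha, T):
    aT = alpha * T
    return 2.0 / aT * (1.0 - (1.0 - np.exp(-aT)) / aT)

# The ratio of the two variances approaches 1 as T grows.
for T in (1.0, 10.0, 100.0, 1000.0):
    print(T, ls_variance(1.0, T) / blue_variance(1.0, T))
```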
Statistical problems of Gaussian processes.
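Let $\{x(t), 0 \le t \le T\}$ be a Gaussian process for all $\theta \in \Theta$. For Gaussian processes the following alternative holds: any two measures $P_u^T$, $P_v^T$ are either mutually absolutely continuous or singular. Since a Gaussian distribution $P_\theta^T$ is completely determined by the mean value $m_\theta(t) = \mathsf{E}_\theta x(t)$ and the correlation function $r_\theta(s,t) = \mathsf{E}_\theta [x(s) - m_\theta(s)][x(t) - m_\theta(t)]$, the likelihood ratio $dP_u^T/dP_v^T$ is expressed in terms of $m_u$, $m_v$, $r_u$, $r_v$, in general in a complicated way. The case where $r_u = r_v = r$, with $r$ a continuous function, is relatively simple. Let $\Theta = \{0,1\}$, $r_0 = r_1 = r$; let $\lambda_j$ and $\varphi_j(t)$ be the eigenvalues and the corresponding normalized eigenfunctions in $L_2(0,T)$ of the integral equation
$$\lambda\varphi(s) = \int_0^T r(s,t)\varphi(t)\,dt;$$
let the means $m_0(t)$ and $m_1(t)$ be continuous functions; and let
$$m_{ij} = \int_0^T m_i(t)\varphi_j(t)\,dt, \quad x_j = \int_0^T x(t)\varphi_j(t)\,dt.$$
The measures $P_0^T$, $P_1^T$ are mutually absolutely continuous if and only if
$$\sum_j \frac{(m_{1j} - m_{0j})^2}{\lambda_j} < \infty,$$
and in that case
$$\frac{dP_1^T}{dP_0^T}(x) = \exp\left\{\sum_j \frac{m_{1j} - m_{0j}}{\lambda_j}\left(x_j - \frac{m_{0j} + m_{1j}}{2}\right)\right\}.$$
This equality can be used to devise a test for the hypothesis $H_0$: $m = m_0$ against the alternative $H_1$: $m = m_1$ under the assumption that the function $r$ is known to the observer.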
Statistical problems of stationary processes.
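Let the observation $x(t)$ be a stationary process with mean $m$ and correlation function $r(t)$; let $f(\lambda)$ and $F(\lambda)$ be its spectral density and spectral function, respectively. The basic problems of the statistics of stationary processes relate to hypothesis testing and to estimating the characteristics $m$, $r$, $f$, $F$. In the case of an ergodic process $x(t)$, consistent estimators (as $T \to \infty$) for $m$ and $r(t)$, respectively, are provided by
$$m^* = \frac{1}{T}\int_0^T x(s)\,ds, \quad r^*(t) = \frac{1}{T}\int_0^{T-t} [x(t+s) - m^*][x(s) - m^*]\,ds,$$
with the integrals replaced by sums in the case of discrete time.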
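For a series observed in discrete time, the estimators of the mean and of the correlation function can be sketched as follows. The code assumes the centred form $r^*(k) = \frac{1}{n}\sum_s [x(s+k) - m^*][x(s) - m^*]$ with normalization by the full sample size $n$; the function names are illustrative.

```python
import numpy as np

def sample_mean(x):
    """Time average of the observed series."""
    return float(np.mean(x))

def empirical_covariance(x, max_lag):
    """r*(k) = (1/n) * sum_s (x[s+k] - mean) * (x[s] - mean), k = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    n = x.size
    centered = x - x.mean()
    return np.array([np.dot(centered[k:], centered[:n - k]) / n
                     for k in range(max_lag + 1)])

# Illustration on a simulated first-order autoregressive series.
rng = np.random.default_rng(3)
noise = rng.standard_normal(5000)
x = np.empty_like(noise)
x[0] = noise[0]
for s in range(1, x.size):
    x[s] = 0.6 * x[s - 1] + noise[s]
print(sample_mean(x), empirical_covariance(x, 3))
```

Dividing by $n$ rather than by $n - k$ gives a slightly biased estimator but keeps the estimated sequence non-negative definite, which is convenient when it is later used in spectral estimation.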
The problem of estimating $m$ when $r$ is known is often treated as a linear problem. This group of problems also includes the more general problems of estimating regression coefficients from observations of the form (*) with stationary $\xi(t)$.
Let $x(t)$ have zero mean and spectral density $f(\lambda;\theta)$ depending on a finite-dimensional parameter $\theta \in \Theta$. If the process is Gaussian, formulas can be derived for the likelihood ratio $dP_\theta^T/dP_{\theta_0}^T$ (if the ratio exists), which in a number of cases make it possible to find maximum-likelihood estimators or "good" approximations of them (for large $T$). Under sufficiently broad assumptions these estimators are asymptotically normal and asymptotically efficient.
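Let $x(t)$ be a stationary Gaussian process in continuous time with zero mean and rational spectral density $f(\lambda) = |P(i\lambda)/Q(i\lambda)|^2$, where $P$ and $Q$ are polynomials. The measures $P_0^T$, $P_1^T$ corresponding to the rational spectral densities $f_0$, $f_1$ are absolutely continuous if and only if
$$\lim_{\lambda \to \infty} \frac{f_0(\lambda)}{f_1(\lambda)} = 1.$$
Here the parameter $\theta$ is the set of all coefficients of the polynomials $P$, $Q$.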
An important class of stationary Gaussian processes consists of the auto-regressive processes (cf. Auto-regressive process) :
where is a Gaussian white noise of unit intensity and is an unknown parameter. In this case the spectral density is
The likelihood function is
Here, and are quadratic forms in , depending on the values , , at the points , and is the determinant of the correlation matrix of the vector .
Maximum-likelihood estimators for the auto-regression parameter are asymptotically normal and asymptotically efficient. These properties are shared by the solution of the approximate likelihood equation
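An important role in statistical studies on the spectrum of a stationary process is played by the periodogram $I_T(\lambda)$. This statistic is defined as
$$I_T(\lambda) = \frac{1}{2\pi T}\left|\sum_{t=1}^{T} e^{-it\lambda}x(t)\right|^2 \ \text{(discrete time)}, \quad I_T(\lambda) = \frac{1}{2\pi T}\left|\int_0^T e^{-it\lambda}x(t)\,dt\right|^2 \ \text{(continuous time)}.$$
The periodogram is widely used in constructing different kinds of estimators for $f(\lambda)$, $F(\lambda)$, $r(t)$ and criteria for testing hypotheses on these characteristics. Under broad assumptions, the statistics $\int \varphi(\lambda) I_T(\lambda)\,d\lambda$ are consistent estimators for $\int \varphi(\lambda) f(\lambda)\,d\lambda$. In particular, $\int_\alpha^\beta I_T(\lambda)\,d\lambda$ may serve as an estimator for $F(\beta) - F(\alpha)$. If the sequence $\varphi_T(\lambda;\lambda_0)$ converges in an appropriate way to the $\delta$-function $\delta(\lambda - \lambda_0)$, then the integrals $\int \varphi_T(\lambda;\lambda_0) I_T(\lambda)\,d\lambda$ will be consistent estimators for $f(\lambda_0)$. Functions of the form $a_T\psi(a_T(\lambda - \lambda_0))$, $a_T \to \infty$, are often used as the functions $\varphi_T(\lambda;\lambda_0)$. If $x(t)$ is a process in discrete time, these estimators can be written in the form
$$\frac{1}{2\pi}\sum_{t=-T+1}^{T-1} e^{-it\lambda_0}\, r^*(t)\, c_T(t),$$
where the empirical correlation function is
$$r^*(t) = \frac{1}{T}\sum_{s=1}^{T-|t|} x(s + |t|)\,x(s),$$
while the non-random coefficients $c_T(t)$ are defined by the choice of $\psi$ and $a_T$. This choice, in turn, depends on a priori information on $f(\lambda)$. A similar representation also holds for processes in continuous time.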
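The relation between the periodogram and weighted sums of the empirical correlation function can be illustrated for a zero-mean series in discrete time. The sketch below assumes the normalization $I_T(\lambda) = \frac{1}{2\pi T}\left|\sum_{t=1}^{T} e^{-it\lambda}x(t)\right|^2$ and uses triangular (Bartlett) lag weights as one possible choice of the coefficients $c_T(t)$; the function names and the parameter `bandwidth` are hypothetical.

```python
import numpy as np

def periodogram(x, lam):
    """I_T(lam) = |sum_{t=1}^T x(t) exp(-i t lam)|^2 / (2 pi T)."""
    x = np.asarray(x, dtype=float)
    T = x.size
    t = np.arange(1, T + 1)
    return np.abs(np.sum(x * np.exp(-1j * t * lam))) ** 2 / (2 * np.pi * T)

def empirical_correlation(x, k):
    """r*(k) = (1/T) sum_{s=1}^{T-|k|} x(s+|k|) x(s) for a zero-mean series."""
    x = np.asarray(x, dtype=float)
    T = x.size
    k = abs(k)
    return np.dot(x[k:], x[:T - k]) / T

def lag_window_estimate(x, lam, bandwidth):
    """Spectral density estimate with Bartlett (triangular) lag weights."""
    T = len(x)
    total = 0.0
    for k in range(-T + 1, T):
        weight = max(0.0, 1.0 - abs(k) / bandwidth)   # c_T(k), an assumed choice
        # r* is even in k, so the complex exponential reduces to a cosine.
        total += weight * empirical_correlation(x, k) * np.cos(k * lam)
    return total / (2 * np.pi)

rng = np.random.default_rng(4)
x = rng.standard_normal(512)           # white noise: f(lam) = 1 / (2 pi)
print(periodogram(x, 1.0))             # unsmoothed, highly variable
print(lag_window_estimate(x, 1.0, 20)) # smoothed estimate
```

With all weights equal to 1 (for instance `bandwidth=np.inf`) the weighted sum reproduces the periodogram exactly; shrinking the bandwidth trades variance for bias.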
Problems in the statistical analysis of stationary processes sometimes also include problems of extrapolation, interpolation and filtering of stationary processes.
Statistical problems of Markov processes.
Let the observations $x(0), x(1), \dots, x(T)$ belong to a homogeneous Markov chain. Under sufficiently broad assumptions the likelihood function is
$$p(x;\theta) = p_0(x(0);\theta)\prod_{t=1}^{T} p(x(t) \mid x(t-1);\theta),$$
where $p_0$, $p$ are the initial and transition densities of the distribution. This expression is similar to the likelihood function for a sequence of independent observations, and when regularity conditions hold (smoothness in $\theta$), a theory of hypothesis testing and estimation can be constructed which is analogous to the corresponding theory for independent observations.
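For a chain with finitely many states, the product form of the likelihood leads directly to estimates based on transition counts. The following sketch assumes that the parameter is the transition matrix itself and that the states are coded as integers `0, ..., n_states - 1`; these conventions and the function names are assumptions made for illustration.

```python
import numpy as np

def transition_counts(path, n_states):
    """counts[i, j] = number of observed one-step transitions i -> j."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(path[:-1], path[1:]):
        counts[i, j] += 1
    return counts

def mle_transition_matrix(path, n_states):
    """Maximum-likelihood estimate: row-normalized transition counts."""
    counts = transition_counts(path, n_states)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows of states that are never left are kept as zeros (not identifiable).
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

def log_likelihood(path, p0, P):
    """log p0(x(0)) + sum_t log P[x(t-1), x(t)]."""
    ll = np.log(p0[path[0]])
    for i, j in zip(path[:-1], path[1:]):
        ll += np.log(P[i, j])
    return ll

path = [0, 1, 0, 0, 1, 1, 0]
P_hat = mle_transition_matrix(path, 2)
print(P_hat)
print(log_likelihood(path, np.array([0.5, 0.5]), P_hat))
```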
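A more complex situation arises if $x(t)$ is a Markov process in continuous time. Let $x(t)$ be a homogeneous Markov process with a finite number of states and differentiable transition probabilities $P_{ij}(t)$. The transition probability matrix is determined by the matrix $Q = \|q_{ij}\|$, $q_{ij} = P'_{ij}(0)$, $q_i = -q_{ii}$. Let the distribution of the initial state $x(0)$ be independent of $Q$. Choosing an arbitrary matrix $Q_0 = \|q_{ij}^0\|$ as a reference (with $q_{ij}^0 > 0$ whenever $q_{ij} > 0$, $i \ne j$), one finds
$$\frac{dP_Q^T}{dP_{Q_0}^T}(x) = \exp\left\{-\sum_{k=0}^{n} (q_{j_k} - q_{j_k}^0)(\tau_{k+1} - \tau_k)\right\}\prod_{k=1}^{n} \frac{q_{j_{k-1}j_k}}{q_{j_{k-1}j_k}^0}.$$
Here the statistics $n$, $\tau_k$, $j_k$ are defined in the following way: $n$ is the number of jumps of $x(t)$ on the interval $[0,T)$; $\tau_k$ is the moment of the $k$-th jump, with $\tau_0 = 0$ and $\tau_{n+1} = T$; and $j_k = x(\tau_k)$. It follows that the maximum-likelihood estimators for the parameters $q_{ij}$, $i \ne j$, are $\hat q_{ij} = m_{ij}/\mu_i$, where $m_{ij}$ is the number of transitions from $i$ to $j$ on $[0,T)$, while $\mu_i$ is the time spent by the process in the state $i$.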
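The estimators $\hat q_{ij} = m_{ij}/\mu_i$ require only the transition counts and the occupation times. A minimal sketch, assuming the path is recorded as a list of visited states together with the times at which each state was entered (the first entry time being 0); the function name and the input format are illustrative assumptions.

```python
import numpy as np

def ctmc_mle(states, jump_times, T, n_states):
    """Estimate the generator of a finite-state Markov process on [0, T).

    states[k] is the state occupied on [jump_times[k], jump_times[k+1]),
    with jump_times[0] = 0 and the last state held until time T.
    Returns (q_hat, m, mu): estimated generator, transition counts m[i, j]
    and occupation times mu[i].
    """
    times = np.append(np.asarray(jump_times, dtype=float), T)
    holding = np.diff(times)
    mu = np.zeros(n_states)
    m = np.zeros((n_states, n_states))
    for k, s in enumerate(states):
        mu[s] += holding[k]
    for i, j in zip(states[:-1], states[1:]):
        m[i, j] += 1
    # Off-diagonal rates; states never visited keep zero rates.
    q_hat = np.divide(m, mu[:, None], out=np.zeros_like(m), where=mu[:, None] > 0)
    # Diagonal chosen so that each row of the generator sums to zero.
    np.fill_diagonal(q_hat, -q_hat.sum(axis=1))
    return q_hat, m, mu

q_hat, m, mu = ctmc_mle([0, 1, 0, 2], [0.0, 1.0, 1.5, 3.5], 4.0, 3)
print(q_hat)
```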
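Let $x(t)$ be a birth-and-death process with constant intensities of birth $\lambda$ and death $\mu$. This means that $q_{i,i+1} = i\lambda$, $q_{i,i-1} = i\mu$, $q_{ii} = -i(\lambda + \mu)$, and $q_{ij} = 0$ if $|i - j| > 1$. In this example the number of states is infinite. Let $x(0) = 1$. The likelihood ratio is
$$\frac{dP_{\lambda,\mu}^T}{dP_{\lambda_0,\mu_0}^T}(x) = \left(\frac{\lambda}{\lambda_0}\right)^{B}\left(\frac{\mu}{\mu_0}\right)^{D}\exp\left\{-(\lambda + \mu - \lambda_0 - \mu_0)\int_0^T x(t)\,dt\right\}.$$
Here $B$ is the total number of births (jumps of size $+1$) and $D$ is the number of deaths (jumps of size $-1$). Maximum-likelihood estimators for $\lambda$ and $\mu$ are
$$\hat\lambda = \frac{B}{\int_0^T x(t)\,dt}, \quad \hat\mu = \frac{D}{\int_0^T x(t)\,dt}.$$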
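For a birth-and-death path recorded by its jumps, the estimators reduce to counting events and integrating the population size. The sketch below assumes the linear model in which the rates out of state $i$ are $i\lambda$ and $i\mu$, so that the estimators are $B/\int_0^T x(t)\,dt$ and $D/\int_0^T x(t)\,dt$; the input format and the function name are hypothetical.

```python
def birth_death_mle(states, jump_times, T):
    """Estimate per-individual birth and death rates from one path on [0, T).

    states[k] is the population size on [jump_times[k], jump_times[k+1]),
    with jump_times[0] = 0 and the last size held until time T.
    """
    births = sum(1 for a, b in zip(states[:-1], states[1:]) if b == a + 1)
    deaths = sum(1 for a, b in zip(states[:-1], states[1:]) if b == a - 1)
    times = list(jump_times) + [T]
    # Integral of x(t) over [0, T): total time lived by all individuals.
    exposure = sum(s * (times[k + 1] - times[k]) for k, s in enumerate(states))
    return births / exposure, deaths / exposure

lam_hat, mu_hat = birth_death_mle([1, 2, 3, 2], [0.0, 0.5, 1.0, 2.0], 3.0)
print(lam_hat, mu_hat)
```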
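Let $x(t)$ be a diffusion process with drift coefficient $a$ and diffusion coefficient $b$, such that $x(t)$ satisfies the stochastic differential equation
$$dx(t) = a(t,x(t))\,dt + b(t,x(t))\,dw(t),$$
where $w(t)$ is a Wiener process. Then, under specific restrictions,
$$\frac{dP_{a,b}^T}{dP_{a_0,b}^T}(x) = \exp\left\{\int_0^T \frac{a(t,x(t)) - a_0(t,x(t))}{b^2(t,x(t))}\,dx(t) - \frac{1}{2}\int_0^T \frac{a^2(t,x(t)) - a_0^2(t,x(t))}{b^2(t,x(t))}\,dt\right\}$$
(here $a_0$ is a fixed coefficient).
Let
$$dx(t) = a(t,x(t);\theta)\,dt + dw(t), \quad x(0) = 0,$$
where $a$ is a known function and $\theta$ is an unknown real parameter. If Wiener measure is denoted by $P_0^T$, then the likelihood function is
$$\frac{dP_\theta^T}{dP_0^T}(x) = \exp\left\{\int_0^T a(t,x(t);\theta)\,dx(t) - \frac{1}{2}\int_0^T a^2(t,x(t);\theta)\,dt\right\},$$
and, under regularity conditions, the Cramér–Rao inequality is satisfied: for an estimator $\tau$ with bias $\Delta(\theta) = \mathsf{E}_\theta\tau - \theta$,
$$\mathsf{E}_\theta|\tau - \theta|^2 \ge \frac{[1 + \Delta'(\theta)]^2}{\mathsf{E}_\theta\int_0^T \left[\frac{\partial}{\partial\theta}a(t,x(t);\theta)\right]^2 dt} + \Delta^2(\theta).$$
If the dependence on $\theta$ is linear, $a(t,x;\theta) = \theta a(t,x)$, the maximum-likelihood estimator is
$$\hat\theta = \frac{\int_0^T a(t,x(t))\,dx(t)}{\int_0^T a^2(t,x(t))\,dt}.$$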
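When the drift is linear in the parameter, the maximum-likelihood estimator is a ratio of two integrals that can be approximated from a sampled path. A minimal sketch, assuming the model $dx(t) = \theta a(t,x(t))\,dt + dw(t)$ with unit diffusion coefficient, the estimator $\hat\theta = \int_0^T a\,dx / \int_0^T a^2\,dt$ and left-endpoint (Ito) sums; the drift $a(t,x) = -x$ and the Euler scheme in the illustration are assumptions, not part of the original text.

```python
import numpy as np

def drift_mle(x, t, a):
    """Approximate theta_hat = int a(t, x) dx / int a(t, x)^2 dt on a grid."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    a_left = a(t[:-1], x[:-1])                 # integrand at left endpoints (Ito sums)
    numerator = np.sum(a_left * np.diff(x))
    denominator = np.sum(a_left ** 2 * np.diff(t))
    return numerator / denominator

# Illustration: Euler simulation of dx = theta * (-x) dt + dw with theta = 2.
rng = np.random.default_rng(2)
theta_true, n, T = 2.0, 20000, 50.0
t = np.linspace(0.0, T, n + 1)
dt = T / n
dw = rng.normal(0.0, np.sqrt(dt), size=n)
x = np.empty(n + 1)
x[0] = 0.0
for i in range(n):
    x[i + 1] = x[i] + theta_true * (-x[i]) * dt + dw[i]
print(drift_mle(x, t, lambda s, y: -y))
```

The left-endpoint evaluation matters: using midpoints or right endpoints would approximate a different stochastic integral and bias the estimate.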
Statistical problems in the theory of stochastic processes. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Statistical_problems_in_the_theory_of_stochastic_processes&oldid=13453