# Stochastic process

random process, probability process, random function of time

2010 Mathematics Subject Classification: Primary: 60Gxx

A process (that is, a variation with time of the state of a certain system) whose course depends on chance and for which probabilities for some courses are given. A typical example of this is Brownian motion. Other examples of practical importance are: the fluctuation of current in an electrical circuit in the presence of so-called thermal noise, the random changes in the level of received radio-signals in the presence of random weakening of radio-signals (fading) created by meteorological or other disturbances, and the turbulent flow of a liquid or gas. To these can be added many industrial processes accompanied by random fluctuations, and also certain processes encountered in geophysics (e.g., variations of the Earth's magnetic field, unordered sea-waves and microseisms, that is, high-frequency irregular oscillations of the level of the surface of the Earth), biophysics (for example, variations of the bio-electric potential of the brain registered on an electro-encephalograph), and economics.

The mathematical theory of stochastic processes regards the instantaneous state of the system in question as a point of a certain phase space $R$ (the space of states), so that the stochastic process is a function $X ( t)$ of the time $t$ with values in $R$. It is usually assumed that $R$ is a vector space, the most studied case (and the most important one for applications) being the narrower one where the points of $R$ are given by one or more numerical parameters (a generalized coordinate system). In this narrower case a stochastic process can be regarded either simply as a numerical function $X ( t)$ of time taking various values depending on chance (i.e. admitting various realizations $x ( t)$; a one-dimensional stochastic process), or similarly as a vector function $\mathbf X ( t) = \{ X _ {1} ( t) , \dots , X _ {k} ( t) \}$ (a multi-dimensional or vector stochastic process). The study of multi-dimensional stochastic processes can be reduced to that of one-dimensional stochastic processes by passing from $\mathbf X ( t)$ to an auxiliary process

$$X _ {\mathbf a} ( t) = ( \mathbf X ( t) , \mathbf a ) = \sum _ {j=1} ^ {k} a _ {j} X _ {j} ( t) ,$$

where $\mathbf a = ( a _ {1} , \dots , a _ {k} )$ is an arbitrary $k$-dimensional vector. Therefore the study of one-dimensional processes occupies a central place in the theory of stochastic processes. The parameter $t$ usually takes arbitrary real values or values in an interval on the real axis $\mathbf R ^ {1}$ (when one wishes to stress this, one speaks of a stochastic process in continuous time), but it may take only integral values, in which case $X ( t)$ is called a stochastic process in discrete time (or a random sequence or a time series).

The representation of a probability distribution in the infinite-dimensional space of all variants of the course of $X ( t)$ (that is, in the space of realizations $x ( t)$) does not fall within the scope of the classical methods of probability theory and requires the construction of a special mathematical apparatus. The only exceptions are special classes of stochastic processes whose probabilistic nature is completely determined by the dependence of $X ( t) = X ( t ; \mathbf Y )$ on a certain finite-dimensional random vector $\mathbf Y = ( Y _ {1} , \dots , Y _ {k} )$, since in this case the probability of the course followed by $X ( t)$ depends only on the finite-dimensional probability distribution of $\mathbf Y$. An example of a stochastic process of this type which is of practical importance is a random harmonic oscillation of the form

$$X ( t) = A \cos ( \omega t + \Phi ) ,$$

where $\omega$ is a fixed number and $A$ and $\Phi$ are independent random variables. This process is often used in the investigation of amplitude-phase modulation in radio-technology.
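As an illustration, the following minimal Python sketch draws one realization of such an oscillation on a time grid. The laws chosen for $A$ and $\Phi$ (uniform on $[0.5, 1.5]$ and on $[0, 2\pi)$), the value of $\omega$ and the function name `sample_oscillation` are assumptions made for the example only; the text does not prescribe them.

```python
import math
import random

def sample_oscillation(omega, times, rng):
    """Draw one realization x(t) = A cos(omega t + Phi) on the given time grid."""
    a = rng.uniform(0.5, 1.5)              # assumed law of the amplitude A
    phi = rng.uniform(0.0, 2.0 * math.pi)  # assumed law of the phase Phi
    path = [a * math.cos(omega * t + phi) for t in times]
    return a, phi, path

rng = random.Random(0)
times = [k / 10 for k in range(11)]        # grid on [0, 1]
a, phi, path = sample_oscillation(2.0 * math.pi, times, rng)
```

Once the two-dimensional vector $( A , \Phi )$ has been drawn, the whole course of the realization is fixed, which is exactly why the probabilities of courses of $X ( t)$ reduce here to the distribution of a finite-dimensional random vector.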

A wide class of probability distributions for stochastic processes is characterized by an infinite family of compatible finite-dimensional probability distributions of the random vectors $\{ X ( t _ {1} ) , \dots , X ( t _ {n} ) \}$ corresponding to all finite subsets $( t _ {1} , \dots , t _ {n} )$ of values of $t$ (see Random function). However, knowledge of all these distributions is not sufficient to determine the probabilities of events depending on the values of $X ( t)$ for an uncountable set of values of $t$, that is, it does not determine the stochastic process $X ( t)$ uniquely.

Example. Let $X ( t) = \cos ( \omega t + \Phi )$, $0 \leq t \leq 1$, be a harmonic oscillation with random phase $\Phi$. Let a random variable $Z$ be uniformly distributed on the interval $[ 0 , 1 ]$, and let $X _ {1} ( t)$, $0 \leq t \leq 1$, be the stochastic process given by the equations $X _ {1} ( t) = X ( t)$ when $t \neq Z$, $X _ {1} ( t) = X ( t) + 3$ when $t = Z$. Since ${\mathsf P} \{ Z = t _ {1} \textrm{ or } \dots \textrm{ or } Z = t _ {n} \} = 0$ for any fixed finite set of points $( t _ {1} \dots t _ {n} )$, it follows that all the finite-dimensional distributions of $X ( t)$ and $X _ {1} ( t)$ are identical. At the same time, $X ( t)$ and $X _ {1} ( t)$ are different: in particular, all realizations of $X ( t)$ are continuous (having sinusoidal form), while all realizations of $X _ {1} ( t)$ have a point of discontinuity, and all realizations of $X ( t)$ do not exceed 1, but no realization of $X _ {1} ( t)$ has this property. Hence it follows that a given system of finite-dimensional probability distributions can correspond to distinct modifications of a stochastic process, and one cannot compute, purely from knowledge of this system, either the probability that a realization of the stochastic process will be continuous, or the probability that it will be bounded by some fixed constant.
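The example can be imitated numerically. The sketch below is only an approximation of the argument: it assumes a uniform law for $\Phi$ and the value $\omega = 3$ (neither is fixed by the text), and floating-point numbers stand in for the continuum, so the event $Z = t _ {i}$ has tiny rather than zero probability. The helper name `make_paths` is hypothetical.

```python
import math
import random

def make_paths(omega, rng):
    """Return the pair of realizations x, x1 together with the random point z."""
    phi = rng.uniform(0.0, 2.0 * math.pi)  # assumed law of the phase Phi
    z = rng.random()                        # Z uniform on [0, 1)

    def x(t):
        return math.cos(omega * t + phi)

    def x1(t):
        # x1 differs from x only at the single random time t = z
        return x(t) + 3.0 if t == z else x(t)

    return x, x1, z

rng = random.Random(1)
fixed_times = [0.0, 0.25, 0.5, 0.75, 1.0]
agree_on_fixed_times = True
jump_exceeds_one = True
for _ in range(1000):
    x, x1, z = make_paths(3.0, rng)
    if any(x(t) != x1(t) for t in fixed_times):
        agree_on_fixed_times = False
    if x1(z) <= 1.0:
        jump_exceeds_one = False
```

On every draw the two processes coincide at the fixed observation times, yet each realization of `x1` exceeds 1 at its own random point, which no finite set of fixed times can detect.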

However, from knowledge of all finite-dimensional probability distributions one can often clarify whether or not there exists a stochastic process $X ( t)$ that has these finite-dimensional distributions, and is such that its realizations are continuous (or differentiable or nowhere exceed a given constant $B$) with probability 1. A typical example of a general condition guaranteeing the existence of a stochastic process $X ( t)$ with continuous realizations with probability 1 and given finite-dimensional distributions is Kolmogorov's condition: If the finite-dimensional probability distributions of a stochastic process $X ( t)$, defined on the interval $[ a , b ]$, are such that for some $\alpha > 0$, $\delta > 0$, $C < \infty$, and all sufficiently small $h$, the following inequality holds:

$$\tag{1 } {\mathsf E} | X ( t + h ) - X ( t) | ^ \alpha < C | h | ^ {1 + \delta }$$

(which evidently imposes restrictions only on the two-dimensional distributions of $X ( t)$), then $X ( t)$ has a modification with continuous realizations with probability 1 (see [Sl][We], for example). In the special case of a Gaussian process $X ( t)$, condition (1) can be replaced by the weaker condition

$$\tag{2 } {\mathsf E} | X ( t + h ) - X ( t ) | ^ {\alpha _ {1} } < C _ {1} | h | ^ {\delta _ {1} }$$

for some $\alpha _ {1} > 0$, $\delta _ {1} > 0$, $C _ {1} > 0$. This holds with $\alpha _ {1} = 2$ and $\delta _ {1} = 1$ for the Wiener process and the Ornstein–Uhlenbeck process, for example. In cases where, for given finite-dimensional probability distributions, there is a modification of $X ( t)$ whose realizations are continuous (or differentiable or bounded by a constant $B$) with probability 1, all other modifications of the same process can usually be excluded from consideration by requiring that $X ( t)$ satisfy a certain very general regularity condition, which holds in almost all applications (see Separable process).
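As a short worked check of conditions (1) and (2), consider the standard Wiener process $W ( t)$. The normalization is an assumption made here: the increment $W ( t + h ) - W ( t)$ is taken to be normal with mean 0 and variance $| h |$. The moments of a centred normal variable then give

$${\mathsf E} | W ( t + h ) - W ( t) | ^ {2} = | h | , \qquad {\mathsf E} | W ( t + h ) - W ( t) | ^ {4} = 3 | h | ^ {2} .$$

The second-moment identity yields (2) with $\alpha _ {1} = 2$, $\delta _ {1} = 1$ and any $C _ {1} > 1$, but it does not yield (1), since (1) needs an exponent $1 + \delta$ strictly greater than 1. The fourth-moment identity yields (1) with $\alpha = 4$, $\delta = 1$ and any $C > 3$.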

Instead of specifying the infinite system of finite-dimensional probability distributions of a stochastic process $X ( t)$, one can define the process by the values of the corresponding characteristic functional

$$\tag{3 } \psi [ l ] = {\mathsf E} \mathop{\rm exp} \{ i l [ X ] \} ,$$

where $l$ ranges over a sufficiently wide class of linear functionals depending on $X$. If $X$ is continuous in probability for $a \leq t \leq b$ (that is, ${\mathsf P} \{ | X ( t + h ) - X ( t) | > \epsilon \} \rightarrow 0$ as $h \rightarrow 0$ for any $\epsilon > 0$) and $g$ is a function of bounded variation on $[ a , b ]$, then

$$\int\limits _ { a } ^ { b } X ( t) d g ( t) = l ^ {(g)} [ X ]$$

is a random variable. One may take $l [ X] = l ^ {(g)} [ X]$ in (3), in which case $\psi [ l ^ {(g)} ]$ is denoted by the symbol $\psi [ g]$ for convenience. In many cases it is sufficient to consider only linear functionals $l [ X]$ of the form

$$\int\limits _ { a } ^ { b } X ( t) \phi ( t) d t = l _ \phi [ X] ,$$

where $\phi$ is an infinitely-differentiable function of compact support in $t$ (and the interval $[ a , b ]$ may be taken finite). Under fairly general regularity conditions, the values $\psi [ l _ \phi ] = \psi [ \phi ]$ uniquely determine all finite-dimensional probability distributions of $X ( t)$, since

$$\psi [ \phi ] \rightarrow \psi _ {t _ {1} \dots t _ {n} } ( \theta _ {1} \dots \theta _ {n} ) ,$$

where $\psi _ {t _ {1} \dots t _ {n} } ( \theta _ {1} \dots \theta _ {n} )$ is the characteristic function of the random vector $\{ X ( t _ {1} ) \dots X ( t _ {n} ) \}$, as

$$\phi ( t) \rightarrow \theta _ {1} \delta ( t - t _ {1} ) + \dots + \theta _ {n} \delta ( t - t _ {n} )$$

(here $\delta ( t)$ is the Dirac $\delta$-function, and convergence is understood in the sense of convergence of generalized functions). If $\psi [ \phi ]$ does not tend to a finite limit, then $X$ has no finite values at any fixed point and only smoothed values $l _ \phi [ X]$ have a meaning, that is, the characteristic functional $\psi [ \phi ]$ does not give an ordinary ("classical") stochastic process $X ( t)$, but a generalized stochastic process (cf. Stochastic process, generalized) $X = X ( \phi )$.

The problem of describing all finite-dimensional probability distributions of $X ( t)$ is simplified in those cases when they are all uniquely determined by the distributions of only a few lower orders. The most important class of stochastic processes for which all multi-dimensional distributions are determined by the values of the one-dimensional distributions of $X ( t)$ are sequences of independent random variables (which are special stochastic processes in discrete time). Such processes can be studied within the framework of classical probability theory, and it is significant that some important classes of stochastic processes can be effectively specified as functions of a sequence $Y ( t)$, $t = 0 , \pm 1 , \pm 2 , \dots$, of independent random variables. For example, the following stochastic processes are of significant interest:

$$X ( t) = \sum _ {j=0} ^ \infty b _ {j} Y ( t - j )$$

or

$$X ( t) = \sum _ {j = - \infty } ^ \infty b _ {j} Y ( t - j ) ,\ \ t = 0 , \pm 1 ,\dots$$

(see Moving-average process), and

$$X ( t) = \sum _ {j=1} ^ \infty Y ( j) h _ {j} ( t) ,\ \ a \leq t \leq b ,$$

where $h _ {j}$, $j = 1 , 2 , \dots$, is a prescribed system of functions on the interval $[ a , b ]$ (see Spectral decomposition of a random function).
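The one-sided moving average can be sketched in a few lines of Python. The example assumes that only finitely many coefficients $b _ {j}$ are non-zero and that the $Y ( t)$ are standard normal; both choices, the coefficient values, and the function name `moving_average` are illustrative assumptions rather than part of the definition.

```python
import random

def moving_average(b, y):
    """One-sided moving average X(t) = sum_j b[j] * Y(t - j) with finitely many b[j].

    Values are returned for t = len(b) - 1, ..., len(y) - 1, i.e. only for
    those t at which every term Y(t - j) is available in the sample."""
    q = len(b)
    return [sum(b[j] * y[t - j] for j in range(q)) for t in range(q - 1, len(y))]

rng = random.Random(2)
y = [rng.gauss(0.0, 1.0) for _ in range(200)]  # independent variables Y(0), ..., Y(199)
b = [0.5, 0.3, 0.2]                            # assumed coefficients b_0, b_1, b_2
x = moving_average(b, y)
```

Each value of `x` mixes only three neighbouring terms of the independent sequence, so values of `x` more than two steps apart are independent, while closer values are dependent.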

Three important classes of stochastic processes are described below, for which all finite-dimensional distributions are determined by the one-dimensional distributions of $X ( t)$ and the two-dimensional distributions of $\{ X ( t _ {1} ) , X ( t _ {2} ) \}$.

1) The class of stochastic processes with independent increments (cf. Stochastic process with independent increments) $X ( t)$, for which $X ( t _ {2} ) - X ( t _ {1} )$ and $X ( t _ {4} ) - X ( t _ {3} )$ are independent random variables whenever $t _ {1} < t _ {2} \leq t _ {3} < t _ {4}$. To represent $X ( t)$ on the interval $[ a, b]$ it is convenient to use the distribution functions $F _ {a} ( x)$ and $\Phi _ {t _ {1} , t _ {2} } ( z)$, where $a \leq t _ {1} \leq t _ {2} \leq b$, of the random variables $X ( a)$ and $X ( t _ {2} ) - X ( t _ {1} )$, in which case $\Phi _ {t _ {1} , t _ {2} } ( z)$ must evidently satisfy the functional equation

$$\tag{4 } \int\limits _ {- \infty } ^ \infty \Phi _ {t _ {1} , t _ {2} } ( z - u ) d \Phi _ {t _ {2} , t _ {3} } ( u) = \Phi _ {t _ {1} , t _ {3} } ( z ) ,$$

$$a \leq t _ {1} < t _ {2} < t _ {3} \leq b .$$

Using (4) it is possible to show that if $X ( t)$ is continuous in probability, then its characteristic functional $\psi [ g ]$ can be written in the form

$$\psi [ g ] = \mathop{\rm exp} \left \{ i \int\limits _ { a } ^ { b } \gamma ( t) d g ( t) - \frac{1}{2} \int\limits _ { a } ^ { b } \beta ( t) [ g ( b) - g ( t) ] d g ( t) \right . +$$

$$+ \int\limits _ {- \infty } ^ \infty \int\limits _ { a } ^ { b } \left [ e ^ {i y [ g ( b) - g ( t) ] } - 1 - \frac{i y [ g ( b) - g ( t) ] }{1 + y ^ {2} } \right ] \times$$

$$\times \left . \frac{1 + y ^ {2} }{y ^ {2} } d _ {t} \Pi _ {t} ( d y ) \right \} ,$$

where $\gamma ( t)$ is a continuous function, $\beta ( t)$ is a non-decreasing continuous function such that $\beta ( a)= 0$, and $\Pi _ {t} ( d y )$ is a measure on $\mathbf R$ that is increasing and continuous in $t$.
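The functional equation (4) states that the law of the increment over $[ t _ {1} , t _ {3} ]$ is the convolution of the laws of the increments over $[ t _ {1} , t _ {2} ]$ and $[ t _ {2} , t _ {3} ]$. A numerical sanity check is sketched below for the Wiener process, chosen here as an assumed example in which the increment over an interval of length $\tau$ is normal with mean 0 and variance $\tau$. The helper names, the time points and the trapezoidal quadrature are choices made for the illustration.

```python
import math

def normal_cdf(z, var):
    """Distribution function of N(0, var)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0 * var)))

def normal_pdf(u, var):
    """Density of N(0, var)."""
    return math.exp(-u * u / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def convolved_cdf(z, var12, var23, half_width=12.0, steps=4000):
    """Trapezoidal approximation of the integral of Phi_12(z - u) dPhi_23(u)."""
    s = math.sqrt(var23)
    lo, hi = -half_width * s, half_width * s   # the density is negligible outside
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        u = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * normal_cdf(z - u, var12) * normal_pdf(u, var23)
    return total * h

t1, t2, t3 = 0.0, 0.4, 1.0
lhs = convolved_cdf(0.7, t2 - t1, t3 - t2)   # left-hand side of the convolution identity
rhs = normal_cdf(0.7, t3 - t1)               # law of the increment over [t1, t3]
```

The two numbers agree up to quadrature error, reflecting the fact that the variances of independent normal increments add.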

2) The class of Markov processes $X ( t)$ for which, when $t _ {1} < t _ {2}$, the conditional probability distribution of $X ( t _ {2} )$ given all values of $X ( t)$ for $t \leq t _ {1}$ depends only on $X ( t _ {1} )$. To represent a Markov process $X ( t)$, $a \leq t \leq b$, it is convenient to use the distribution function $F _ {a} ( x)$ of the value $X ( a)$ and the transition function $\Phi _ {t _ {1} , t _ {2} } ( x , z )$, which is defined for $t _ {1} < t _ {2}$ as the conditional probability that $X ( t _ {2} ) < z$ given that $X ( t _ {1} ) = x$. The function $\Phi _ {t _ {1} , t _ {2} } ( x , z )$ must satisfy the Kolmogorov–Chapman equation, similar to (4), and this enables one, under certain conditions, to obtain the simpler forward and backward Kolmogorov equations (for example, the Fokker–Planck equation) for this function.
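The Kolmogorov–Chapman equation is easiest to inspect for a finite state space, where the transition function becomes a matrix and the equation becomes a matrix product. The sketch below assumes a time-homogeneous two-state chain in continuous time with jump rates `a` (from state 0 to 1) and `b` (from state 1 to 0); this discrete setting and the rate values are assumptions for illustration, not the general situation treated in the text.

```python
import math

def transition_matrix(t, a=1.5, b=0.5):
    """Transition probabilities over a time span t for the assumed two-state chain."""
    s = a + b
    e = math.exp(-s * t)
    return [[(b + a * e) / s, (a - a * e) / s],
            [(b - b * e) / s, (a + b * e) / s]]

def matmul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

t1, t2, t3 = 0.0, 0.3, 1.1
two_step = matmul(transition_matrix(t2 - t1), transition_matrix(t3 - t2))
direct = transition_matrix(t3 - t1)
max_gap = max(abs(two_step[i][j] - direct[i][j])
              for i in range(2) for j in range(2))
```

Passing through the intermediate time $t _ {2}$ and summing over the intermediate state gives the same matrix as the direct transition from $t _ {1}$ to $t _ {3}$, so `max_gap` is zero up to rounding.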

3) The class of Gaussian processes $X ( t)$ for which all multi-dimensional probability distributions of the vectors $\{ X ( t _ {1} ) \dots X ( t _ {n} ) \}$ are Gaussian (normal) distributions. Since a normal distribution is uniquely determined by its first and second moments, a Gaussian process $X ( t)$ is determined by the values of the functions

$${\mathsf E} X ( t) = m ( t)$$

and

$${\mathsf E} X ( t) X ( s) = B ( t , s ) ,$$

where $B ( t , s )$ must be a non-negative definite kernel such that

$$b ( t , s ) = B ( t , s ) - m ( t ) m ( s )$$

is a non-negative definite kernel. The characteristic functional $\psi [ g ]$ of a Gaussian process $X ( t)$, where $a \leq t \leq b$, is

$$\psi [ g ] = \mathop{\rm exp} \left \{ i \int\limits _ { a } ^ { b } m ( t) d g ( t ) - \frac{1}{2} \int\limits _ { a } ^ { b } \int\limits _ { a } ^ { b } b ( t , s ) \ d g ( t ) d g ( s) \right \} .$$
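Because $m ( t)$ and $b ( t , s )$ determine every finite-dimensional distribution, a Gaussian process can be sampled at finitely many times from these two functions alone. The following sketch uses a hand-written Cholesky factorization and assumes, purely for illustration, $m ( t) = 0$ and $b ( t , s ) = \min ( t , s )$ (the covariance of the standard Wiener process); the function names are hypothetical.

```python
import math
import random

def cholesky(cov):
    """Lower-triangular factor L with L L^T = cov, for a positive-definite matrix."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low

def sample_gaussian_vector(mean, cov, rng):
    """Draw one vector with the given mean and covariance as mean + L xi."""
    low = cholesky(cov)
    xi = [rng.gauss(0.0, 1.0) for _ in mean]   # independent standard normals
    return [mean[i] + sum(low[i][k] * xi[k] for k in range(i + 1))
            for i in range(len(mean))]

times = [0.25, 0.5, 0.75, 1.0]
mean = [0.0 for _ in times]                          # assumed m(t) = 0
cov = [[min(t, s) for s in times] for t in times]    # assumed b(t, s) = min(t, s)
rng = random.Random(3)
x = sample_gaussian_vector(mean, cov, rng)
```

The factorization requires the covariance matrix to be strictly positive definite, which holds here because the sampling times are distinct and positive.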

4) Another important class of stochastic processes is that of stationary stochastic processes $X ( t)$, where the statistical characteristics do not change in the course of time, that is, they are invariant under the transformation $X ( t) \mapsto X ( t + a )$, for any fixed number $a$. The multi-dimensional probability distributions of a general stationary stochastic process $X ( t)$ cannot be described in a simple manner, but for many problems concerning such processes it is sufficient to know only the values of the first two moments, ${\mathsf E} X ( t) = m$ and ${\mathsf E} X ( t) X ( t + s ) = B ( s)$ (so that here the only necessary assumption is of stationarity in the wide sense, i.e. the moments ${\mathsf E} X ( t)$ and ${\mathsf E} X ( t) X ( t + s)$ are independent of $t$). It is essential that any stationary stochastic process (at least in the wide sense) admits a spectral decomposition of the form

$$\tag{5 } X ( t) = \int\limits _ {- \infty } ^ \infty e ^ {i t \lambda } d Z ( \lambda ) ,$$

where $Z ( \lambda )$ is a stochastic process with non-correlated increments. In particular, it follows that

$$\tag{6 } B ( s) = \int\limits _ {- \infty } ^ \infty e ^ {i s \lambda } d F ( \lambda ) ,$$

where $F ( \lambda )$ is the monotone non-decreasing spectral function of $X ( t)$ (cf. Spectral function of a stationary stochastic process). The spectral decompositions (5) and (6) lie at the heart of the solution of problems of best (in the sense of minimal mean-square error) linear extrapolation, interpolation and filtering of stationary stochastic processes.
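A purely discrete spectrum gives a concrete picture of (5) and (6). The sketch below assumes a real process $X ( t) = \sum _ {k} ( U _ {k} \cos \lambda _ {k} t + V _ {k} \sin \lambda _ {k} t )$ with uncorrelated zero-mean coefficients $U _ {k} , V _ {k}$ of variance $\sigma _ {k} ^ {2}$, which corresponds to a spectral function $F$ with jumps $\sigma _ {k} ^ {2} / 2$ at $\pm \lambda _ {k}$. The frequencies and variances are illustrative assumptions. The code evaluates ${\mathsf E} X ( t) X ( t + s )$ from the coefficient variances and compares it with $\sum _ {k} \sigma _ {k} ^ {2} \cos \lambda _ {k} s$.

```python
import math

freqs = [0.5, 1.3, 2.0]     # assumed spectral points lambda_k
powers = [1.0, 0.5, 0.25]   # assumed variances sigma_k^2 of U_k and V_k

def covariance(t, s):
    """E X(t) X(t + s), computed term by term from the uncorrelated coefficients."""
    return sum(p * (math.cos(l * t) * math.cos(l * (t + s))
                    + math.sin(l * t) * math.sin(l * (t + s)))
               for l, p in zip(freqs, powers))

def spectral_covariance(s):
    """B(s) = sum_k sigma_k^2 cos(lambda_k s), the discrete-spectrum form of (6)."""
    return sum(p * math.cos(l * s) for l, p in zip(freqs, powers))

lag = 0.7
gaps = [abs(covariance(t, lag) - spectral_covariance(lag))
        for t in (0.0, 1.0, 5.5, -3.2)]
```

The covariance does not depend on the starting time `t`, only on the lag, which is wide-sense stationarity; the value at lag 0 equals the total spectral mass $\sum _ {k} \sigma _ {k} ^ {2}$.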

The mathematical theory of stochastic processes also includes a large number of results related to a series of subclasses or, conversely, of extensions, of the above classes of stochastic processes (see Markov chain; Diffusion process; Branching process; Martingale; Stochastic process with stationary increments; etc.).

#### References

[Sl] E.E. Slutskii, "Selected works", Moscow (1980) pp. 269–280 (In Russian)

[Do] J.L. Doob, "Stochastic processes", Wiley (1953) MR1570654 MR0058896 Zbl 0053.26802

[GS] I.I. Gihman, A.V. Skorohod, "Introduction to the theory of stochastic processes", Saunders (1967) (Translated from Russian)

[GS2] I.I. Gihman, A.V. Skorohod, "Theory of stochastic processes", 1–3, Springer (1974–1979) (Translated from Russian) MR0636254 MR0651015 MR0375463 MR0350794 MR0346882 Zbl 0531.60002 Zbl 0531.60001 Zbl 0404.60061 Zbl 0305.60027 Zbl 0291.60019

[CL] H. Cramér, M.R. Leadbetter, "Stationary and related stochastic processes", Wiley (1967) MR0217860 Zbl 0162.21102

[We] A.D. Wentzell, "A course in the theory of stochastic processes", McGraw-Hill (1981) (Translated from Russian) MR0781738 MR0614594 Zbl 0502.60001

[Rz] Yu.A. Rozanov, "Stochastic processes", 1–2, Moscow (1960–1963) (In Russian)

[Sk] A.V. Skorohod, "Random processes with independent increments", Kluwer (1991) (Translated from Russian) MR1155400

[Dy] E.B. Dynkin, "Markov processes", 1–2, Springer (1965) (Translated from Russian) MR0193671 Zbl 0132.37901

[IR] I.A. Ibragimov, Yu.A. Rozanov, "Gaussian stochastic processes", Springer (1978) (Translated from Russian) MR0272040

[Rz2] Yu.A. Rozanov, "Stationary stochastic processes", Holden-Day (1967) (Translated from Russian) MR0159363 MR0114252 Zbl 0721.60040

The state space $E$ of a stochastic process $X$ may be a (good) topological space without algebraic structure, as in Markov process theory; in this case one considers real processes of the form $f \circ X$, where $f$ is a real function on $E$. It can also be a differentiable manifold, as in modern diffusion process theory, etc. Concerning the regularity properties of the paths, it is often not possible to prove that the considered set of regular paths has probability 1, because this set is not measurable; this difficulty can often be circumvented by proving that the outer probability is $1$.