Asymptotic optimality
of estimating functions
Efficient estimation (cf. Efficient estimator) of parameters in stochastic models is most conveniently approached via properties of estimating functions, namely functions of the data and the parameter of interest, rather than estimators derived therefrom. For a detailed explanation see [a1], Chapt. 1.
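For orientation, a minimal standard example of an estimating function (an illustration added here, not part of the original article): if $X_1, \dots, X_T$ are independent observations with common mean $\theta$, then

$$ G_T(\theta) = \sum_{t=1}^{T} (X_t - \theta) $$

is a zero-mean estimating function, and the root of $G_T(\hat{\theta}) = 0$ is the sample mean $\hat{\theta} = T^{-1} \sum_{t=1}^{T} X_t$.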
Let $\{ X_t : 0 \leq t \leq T \}$ be a sample in discrete or continuous time from a stochastic system taking values in an $r$-dimensional Euclidean space. The distribution of $X_t$ depends on a parameter of interest $\theta$ taking values in an open subset of a $p$-dimensional Euclidean space. The possible probability measures (cf. Probability measure) for $X_t$ are $\{ \mathsf{P}_\theta \}$, a union of families of models.
Consider the class $\mathcal{G}$ of zero-mean, square-integrable estimating functions $G_T = G_T( \{ X_t : 0 \leq t \leq T \}, \theta )$, which are vectors of dimension $p$ and for which the matrices used below are non-singular.
Optimality in both the fixed-sample and the asymptotic sense is considered. The former involves choosing an estimating function $G_T$ to maximize, in the partial order of non-negative definite matrices, the information criterion
$$ \mathcal{E}(G_T) = (\mathsf{E}\, \nabla G_T)' \, (\mathsf{E}\, G_T G_T')^{-1} \, (\mathsf{E}\, \nabla G_T), $$
which is a natural generalization of the Fisher amount of information. Here $\nabla G$ is the $(p \times p)$-matrix of derivatives of the elements of $G$ with respect to those of $\theta$, and a prime denotes transposition. If $\mathcal{H} \subset \mathcal{G}$ is a prespecified family of estimating functions, then $G_T^* \in \mathcal{H}$ is said to be fixed-sample optimal in $\mathcal{H}$ if $\mathcal{E}(G_T^*) - \mathcal{E}(G_T)$ is non-negative definite for all $G_T \in \mathcal{H}$, $\theta$ and $\mathsf{P}_\theta$. Then $G_T^*$ is the element of $\mathcal{H}$ whose dispersion distance from the maximum-information estimating function in $\mathcal{G}$ (often the likelihood score) is least.
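To see why this generalizes Fisher information, a standard calculation under the usual regularity conditions (sketched here for concreteness): if $U_T = \nabla \log L_T(\theta)$ is the likelihood score, then $\mathsf{E}\, \nabla U_T = - \mathsf{E}\, U_T U_T' = - \mathcal{F}_T$, the negative of the Fisher information matrix, so

$$ \mathcal{E}(U_T) = (-\mathcal{F}_T)' \, \mathcal{F}_T^{-1} \, (-\mathcal{F}_T) = \mathcal{F}_T . $$

Thus the criterion recovers the Fisher information exactly when the score is available.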
A focus on asymptotic properties can be made by confining attention to the subset $\mathcal{M} \subset \mathcal{G}$ of estimating functions which are martingales (cf. Martingale). Here one considers $T$ ranging over the positive real numbers and, for $\{ G_T \} \in \mathcal{M}$, writes $\{ \langle G \rangle_T \}$ for the quadratic characteristic, the predictable increasing process for which $\{ G_T G_T' - \langle G \rangle_T \}$ is a martingale. Also, write $\{ \overline{G}_T \}$ for the predictable process for which $\{ \nabla G_T - \overline{G}_T \}$ is a martingale. Then $G_T^* \in \mathcal{M}_1 \subset \mathcal{M}$ is asymptotically optimal in $\mathcal{M}_1$ if $\overline{\mathcal{E}}(G_T^*) - \overline{\mathcal{E}}(G_T)$ is almost surely non-negative definite for all $G_T \in \mathcal{M}_1$, $\theta$, $\mathsf{P}_\theta$ and $T > 0$, where
$$ \overline{\mathcal{E}}(G_T) = \overline{G}_T' \, \langle G \rangle_T^{-1} \, \overline{G}_T . $$
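As a concrete check (an added illustration, using the i.i.d. example above with $\mathsf{Var}\, X_t = \sigma^2$): $G_T(\theta) = \sum_{t=1}^{T} (X_t - \theta)$ is a martingale in $T$ with $\nabla G_T = -T$, so $\overline{G}_T = -T$, and $\langle G \rangle_T = T \sigma^2$; hence

$$ \overline{\mathcal{E}}(G_T) = \frac{(-T)^2}{T \sigma^2} = \frac{T}{\sigma^2}, $$

which is the Fisher information for the mean of a normal sample with known variance $\sigma^2$.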
Under suitable regularity conditions, asymptotically optimal estimating functions produce estimators for $\theta$ which are consistent (cf. Consistent estimator), asymptotically unbiased (cf. Unbiased estimator) and asymptotically normally distributed (cf. Normal distribution), with minimum-size asymptotic confidence zones (cf. Confidence estimation). For further details see [a2], [a3].
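The following is a minimal simulation sketch of these asymptotics (an added illustration for a hypothetical model, not from the original article): for the AR(1) model $X_t = \theta X_{t-1} + \varepsilon_t$ with $\varepsilon_t \sim N(0, \sigma^2)$, the quasi-score $G_T(\theta) = \sigma^{-2} \sum_t X_{t-1} (X_t - \theta X_{t-1})$ is a martingale estimating function; its root is the least-squares estimator, and here $\overline{\mathcal{E}}(G_T) = \langle G \rangle_T = \sigma^{-2} \sum_t X_{t-1}^2$.

    # Added illustration (hypothetical AR(1) model, not from the article):
    # the root of the quasi-score G_T(theta) = 0 is the least-squares
    # estimator, and <G>_T = sum_t X_{t-1}^2 / sigma^2 is its information.
    import numpy as np

    rng = np.random.default_rng(0)
    theta_true, sigma, T, n_rep = 0.5, 1.0, 500, 1000

    z_scores = []
    for _ in range(n_rep):
        x = np.zeros(T + 1)
        for t in range(1, T + 1):
            x[t] = theta_true * x[t - 1] + sigma * rng.standard_normal()
        past, now = x[:-1], x[1:]
        theta_hat = np.dot(past, now) / np.dot(past, past)  # root of G_T = 0
        info = np.dot(past, past) / sigma**2                # <G>_T
        z_scores.append((theta_hat - theta_true) * np.sqrt(info))

    # If the asymptotics hold, (theta_hat - theta) * sqrt(info) ~ N(0, 1),
    # so the printed mean and variance should be close to 0 and 1.
    print("mean of standardized errors:", np.mean(z_scores))
    print("variance of standardized errors:", np.var(z_scores))

The standardized errors concentrate around a standard normal, illustrating the minimum-size confidence zones: intervals of the form $\hat{\theta} \pm 1.96 / \sqrt{\langle G \rangle_T}$ have approximate 95% coverage.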
References
[a1] D.L. McLeish, C.G. Small, "The theory and applications of statistical inference functions", Lecture Notes in Statistics, Springer (1988)
[a2] V.P. Godambe, C.C. Heyde, "Quasi-likelihood and optimal estimation", Internat. Statist. Rev., 55 (1987) pp. 231–244
[a3] C.C. Heyde, "Quasi-likelihood and its application. A general approach to optimal parameter estimation", Springer (1997)