Bayesian numerical analysis
The comparison of algorithms and the analysis of numerical problems in a Bayesian setting, cf. also Bayesian approach.
A numerical algorithm is usually developed and applied for inputs sharing some common properties. In a theoretical comparison of algorithms one either selects a class $P$ of inputs and defines the cost and the error of an algorithm in a worst-case sense over the class $P$, or, as in Bayesian numerical analysis, one puts an a priori distribution $\mu$ on the inputs and defines the cost and the error of an algorithm in an average-case sense, i.e., via expectations with respect to $\mu$. For instance, if $\epsilon ( p , m )$ denotes the error of a method $m$ applied to an input $p$, then
\begin{equation*} \mathcal{E} _ { \operatorname{wor} } ( P , m ) = \operatorname { sup } _ { p \in P } | \epsilon ( p , m ) | \end{equation*}
is called the worst-case (maximal) error of $m$ on $P$ while
\begin{equation*} \mathcal{E} _ { \text{avg} } ( \mu , m ) = \int | \epsilon ( p , m ) | d \mu ( p ) \end{equation*}
is called the average error of $m$ with respect to $\mu$. The notions of optimality of algorithms and complexity of numerical problems are defined analogously in the worst-case setting and the Bayesian setting, see Optimization of computational algorithms and Information-based complexity.
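To make the two error notions concrete, here is a minimal sketch (the exponential input family, the midpoint rule, and all names are illustrative assumptions, not taken from the article) that computes both the worst-case error over a finite stand-in for $P$ and the average error with $\mu$ taken as the uniform distribution on the same inputs:

```python
import numpy as np

# Toy input class: f_c(x) = exp(c * x) on [0, 1], parametrized by c.
# The exact integral is (exp(c) - 1) / c.
cs = np.linspace(0.1, 2.0, 50)  # finite stand-in for the class P

def midpoint_rule(f, n=8):
    """Composite midpoint rule with n subintervals on [0, 1]."""
    x = (np.arange(n) + 0.5) / n
    return np.mean(f(x))

def error(c, n=8):
    exact = (np.exp(c) - 1.0) / c
    approx = midpoint_rule(lambda x: np.exp(c * x), n)
    return approx - exact

errs = np.array([error(c) for c in cs])

# Worst-case error: supremum of |error| over the class.
print("worst-case error:", np.max(np.abs(errs)))
# Average error: expectation under mu (here uniform weights on cs).
print("average error:  ", np.mean(np.abs(errs)))
```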
Properties of the inputs, e.g., smoothness of integrands for numerical integration, are expressed by membership in the class $P$ or by the choice of the distribution $\mu$. For problems with inputs from finite-dimensional spaces, the Lebesgue measure is often used in the definition of $\mu$, see [a1] for linear programming and [a7] for solving systems of polynomial equations. For problems with inputs from infinite-dimensional spaces, often a Gaussian measure $\mu$ is used. If the inputs are real-valued functions on some domain, imposing an a priori distribution is essentially equivalent to treating the inputs as random functions (cf. also Random function).
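For example, under the Wiener measure the inputs are Brownian paths, and average errors can be estimated by Monte Carlo simulation over sampled paths. The following sketch (illustrative only; the grid sizes and the fine reference sum are assumptions) estimates the average error of the midpoint rule for integrating Brownian paths:

```python
import numpy as np

rng = np.random.default_rng(0)

def brownian_path(n_grid, rng):
    """Sample a standard Brownian path at n_grid + 1 equidistant points of [0, 1]."""
    dt = 1.0 / n_grid
    increments = rng.normal(0.0, np.sqrt(dt), size=n_grid)
    return np.concatenate([[0.0], np.cumsum(increments)])

n_grid, n_quad, n_paths = 1024, 8, 2000
dt = 1.0 / n_grid
total = 0.0
for _ in range(n_paths):
    w = brownian_path(n_grid, rng)
    # Fine trapezoidal sum as a stand-in for the exact integral of the path.
    exact = np.sum(0.5 * (w[:-1] + w[1:])) * dt
    # n_quad-point midpoint rule applied to the same path.
    idx = ((np.arange(n_quad) + 0.5) / n_quad * n_grid).astype(int)
    total += abs(np.mean(w[idx]) - exact)

print("Monte Carlo estimate of the average error:", total / n_paths)
```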
The Bayesian approach leads to new algorithms and to new insight into numerical problems. A few examples of this are as follows.
Multivariate integration.
Under very mild assumptions on the measure $\mu$, the cost of optimal methods does not depend exponentially on the number of variables, see [a8]. In the worst-case setting, on the other hand, a curse of dimension often occurs.
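A loosely related illustration (plain Monte Carlo in the randomized setting, not the average-case-optimal methods of [a8]): the root-mean-square error of Monte Carlo integration is of order $n^{-1/2}$ in every dimension, so its accuracy on $[0,1]^d$ does not degrade exponentially with $d$. The test integrand below is an arbitrary assumption:

```python
import numpy as np

rng = np.random.default_rng(1)

def mc_integrate(d, n=100_000):
    """Monte Carlo estimate of the integral of prod_j x_j over [0,1]^d (exact value 2^{-d})."""
    x = rng.random((n, d))
    return np.mean(np.prod(x, axis=1))

for d in (2, 5, 10):
    print(f"d={d:2d}: estimate={mc_integrate(d):.6f}, exact={0.5 ** d:.6f}")
```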
Linear programming.
Under natural symmetry properties of $\mu$, the average cost of the simplex method is polynomial, while the worst-case (maximal) cost is exponential, see [a1].
Global optimization.
Suitable adaptive (active) methods are superior to all non-adaptive (passive) ones in a Bayesian setting, see [a2]. In contrast, adaptive methods offer no advantage in the worst-case setting on any convex and symmetric class $P$. See also Optimization of computational algorithms and Adaptive quadrature, and see [a6] for similar results.
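The contrast can be simulated. The sketch below (a simplified two-stage scheme under the Wiener prior, not the algorithm of [a2]; the budget split and window width are assumptions) compares a non-adaptive uniform grid with an adaptive scheme that spends half its budget refining around the current best point, using the same total number of function evaluations:

```python
import numpy as np

rng = np.random.default_rng(2)

def brownian_path(n_grid, rng):
    """Sample a standard Brownian path at n_grid + 1 equidistant points of [0, 1]."""
    dt = 1.0 / n_grid
    return np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n_grid))])

n_grid, budget, n_paths = 4096, 32, 500
gap_passive = gap_active = 0.0
for _ in range(n_paths):
    w = brownian_path(n_grid, rng)
    true_max = w.max()
    # Non-adaptive: one uniform grid of `budget` evaluation points.
    passive_idx = np.linspace(0, n_grid, budget, dtype=int)
    gap_passive += true_max - w[passive_idx].max()
    # Adaptive: coarse grid with half the budget, then uniform
    # refinement around the current best point with the other half.
    coarse_idx = np.linspace(0, n_grid, budget // 2, dtype=int)
    best = coarse_idx[np.argmax(w[coarse_idx])]
    half_width = n_grid // (budget // 2)
    lo, hi = max(best - half_width, 0), min(best + half_width, n_grid)
    fine_idx = np.linspace(lo, hi, budget // 2, dtype=int)
    gap_active += true_max - max(w[coarse_idx].max(), w[fine_idx].max())

print("average optimality gap, non-adaptive:", gap_passive / n_paths)
print("average optimality gap, adaptive:    ", gap_active / n_paths)
```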
Zero finding.
Bisection is worst-case optimal for many classes $P$ of (smooth) functions. Thus, the worst-case $\epsilon$-complexity is of order $\operatorname { log } ( 1 / \epsilon )$. For measures that are derived from the Wiener measure and concentrated on smooth functions, the average-case $\epsilon$-complexity is of order $\operatorname { log } \operatorname { log } ( 1 / \epsilon )$. The upper bound is achieved by a hybrid secant-bisection method, see [a5].
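A simplified sketch of such a hybrid (the stopping rule and the safeguard below are assumptions; the method of [a5] is more refined): take a secant step when it stays inside the current sign-change bracket, and fall back to bisection otherwise.

```python
import math

def hybrid_secant_bisection(f, a, b, tol=1e-12, max_iter=200):
    """Find a zero of f in [a, b], assuming f(a) and f(b) have opposite signs.

    Simplified secant-bisection hybrid: take a secant step when it
    lands inside the current bracket, otherwise bisect.
    """
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f must change sign on [a, b]")
    for _ in range(max_iter):
        # Secant step through the bracket endpoints.
        x = 0.5 * (a + b) if fb == fa else b - fb * (b - a) / (fb - fa)
        # Bisection safeguard: keep the iterate strictly inside the bracket.
        if not (a < x < b):
            x = 0.5 * (a + b)
        fx = f(x)
        if abs(fx) < tol or b - a < tol:
            return x
        # Shrink the bracket while preserving the sign change.
        if fa * fx < 0:
            b, fb = x, fx
        else:
            a, fa = x, fx
    return 0.5 * (a + b)

# Example: the unique zero of cos(x) - x on [0, 2].
print(hybrid_secant_bisection(lambda x: math.cos(x) - x, 0.0, 2.0))
```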
For references, see also Information-based complexity.
References
[a1] K.H. Borgwardt, "The simplex method: A probabilistic approach", Springer (1987)
[a2] J.M. Calvin, "Average performance of a class of adaptive algorithms for global optimization", Ann. Appl. Probab., 7 (1997) pp. 711–730
[a3] P. Diaconis, "Bayesian numerical analysis", in S.S. Gupta (ed.), J.O. Berger (ed.), Statistical Decision Theory and Related Topics IV, 1, Springer (1988) pp. 163–175
[a4] J.B. Kadane, G.W. Wasilkowski, "Average case $\epsilon$-complexity in computer science: a Bayesian view", in J.M. Bernardo (ed.), Bayesian Statistics, North-Holland (1988) pp. 361–374
[a5] E. Novak, K. Ritter, H. Woźniakowski, "Average case optimality of a hybrid secant-bisection method", Math. Comp., 64 (1995) pp. 1517–1539
[a6] K. Ritter, "Average case analysis of numerical problems", Lecture Notes Math., Springer (2000)
[a7] M. Shub, S. Smale, "Complexity of Bezout's theorem V: polynomial time", Theoret. Comput. Sci., 133 (1994) pp. 141–164
[a8] G.W. Wasilkowski, "Average case complexity of multivariate integration and function approximation: an overview", J. Complexity, 12 (1996) pp. 257–272