# Bayesian numerical analysis

The comparison of algorithms and the analysis of numerical problems in a Bayesian setting, cf. also Bayesian approach.

A numerical algorithm is usually developed and applied for inputs sharing some common properties. In a theoretical comparison of algorithms one either selects a class $P$ of inputs and defines the cost and the error of an algorithm in a worst-case sense over the class $P$, or, as in Bayesian numerical analysis, one puts an a priori distribution $\mu$ on the inputs and defines the cost and the error of an algorithm in an average-case sense, i.e., via expectations with respect to $\mu$. For instance, if $\epsilon ( p , m )$ denotes the error of a method $m$, applied to an input $p$, then

\begin{equation*} \mathcal{E} _ { \operatorname{wor} } ( P , m ) = \operatorname { sup } _ { p \in P } | \epsilon ( p , m ) | \end{equation*}

is called the worst-case (maximal) error of $m$ on $P$ while

\begin{equation*} \mathcal{E} _ { \text{avg} } ( \mu , m ) = \int | \epsilon ( p , m ) | d \mu ( p ) \end{equation*}

is called the average error of $m$ with respect to $\mu$. The notions of optimality of algorithms and complexity of numerical problems are defined analogously in the worst-case setting and the Bayesian setting, see Optimization of computational algorithms and Information-based complexity.
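The two error notions can be contrasted on a toy problem. The following is a minimal sketch, with illustrative choices not taken from the references: the method $m$ is the composite midpoint rule for integration over $[0,1]$, the class $P$ consists of the monomials $x^k$, and $\mu$ is the uniform distribution on this finite class.

```python
def midpoint_rule(f, n=10):
    """Composite midpoint rule on [0, 1] with n subintervals: the method m."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

def error(f, exact, n=10):
    """epsilon(p, m): signed integration error of the midpoint rule on input f."""
    return midpoint_rule(f, n) - exact

# A finite class P of inputs: the monomials x^k paired with their exact integrals.
P = [(lambda x, k=k: x ** k, 1.0 / (k + 1)) for k in range(6)]

# Worst-case (maximal) error of m over P ...
e_wor = max(abs(error(f, exact)) for f, exact in P)

# ... and the average error of m with respect to mu = uniform distribution on P.
e_avg = sum(abs(error(f, exact)) for f, exact in P) / len(P)
```

By construction the average error never exceeds the worst-case error; the two settings differ in which algorithms minimize them.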

Properties of the inputs, e.g., smoothness of integrands for numerical integration, are expressed by being a member of $P$ or being distributed according to $\mu$. For problems with inputs from finite-dimensional spaces, the Lebesgue measure is often used in the definition of $\mu$, see [a1] for linear programming and [a7] for solving systems of polynomial equations. For problems with inputs from infinite-dimensional spaces, often a Gaussian measure $\mu$ is used. If the inputs are real-valued functions on some domain, imposing an a priori distribution is essentially equivalent to treating inputs as random functions (cf. also Random function).
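The random-function viewpoint can be illustrated by sampling discretized Brownian paths, i.e. approximate draws from the Wiener measure; in the sketch below the discretization step and the functional that is averaged are illustrative choices, not taken from the references.

```python
import math
import random

def brownian_path(n=1000, seed=0):
    """Sample a discretized Brownian path on [0, 1]: a draw from the
    (discretized) Wiener measure, a Gaussian a priori distribution mu
    on a space of real-valued input functions."""
    rng = random.Random(seed)
    h = 1.0 / n
    w = [0.0]
    for _ in range(n):
        w.append(w[-1] + math.sqrt(h) * rng.gauss(0.0, 1.0))
    return w

# Each fresh seed yields a new random input p ~ mu; averaging a functional
# of such paths approximates an expectation with respect to mu.
paths = [brownian_path(seed=s) for s in range(100)]
mean_endpoint = sum(w[-1] for w in paths) / len(paths)  # approximates E[W(1)] = 0
```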

The Bayesian approach leads to new algorithms and to new insight into numerical problems. A few examples of this are as follows.


### Multivariate integration.

Under very mild assumptions on the measure $\mu$, the cost of optimal methods does not depend exponentially on the number of variables, see [a8]. In the worst-case setting, by contrast, a curse of dimension often occurs.
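Dimension-independent error behaviour can be observed even with plain Monte Carlo, a randomized method that is not among the deterministic average-case algorithms of [a8] but serves as a hedged illustration: its root-mean-square error is of order $n^{-1/2}$ in every dimension. The test integrand below is an arbitrary illustrative choice.

```python
import random

def mc_integrate(f, dim, n=10000, seed=0):
    """Plain Monte Carlo estimate of the integral of f over [0, 1]^dim.
    The root-mean-square error scales like n**-0.5 regardless of dim."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = [rng.random() for _ in range(dim)]
        total += f(x)
    return total / n

# Integrate f(x) = (x_1 + ... + x_d) / d over [0, 1]^d; the exact value
# is 0.5 in every dimension, and the accuracy is comparable across d.
ests = {d: mc_integrate(lambda x: sum(x) / len(x), d, n=20000, seed=d)
        for d in (2, 10, 50)}
```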

### Linear programming.

Under natural symmetry properties of $\mu$, the average cost of the simplex method is polynomial, while the worst-case (maximal) cost is exponential, see [a1].

### Global optimization.

Suitable adaptive (active) methods are superior to all non-adaptive (passive) ones in a Bayesian setting, see [a2]; see [a6] for similar results. In contrast, adaptive methods offer no advantage in the worst-case setting on any convex and symmetric class $P$. See also Optimization of computational algorithms; Adaptive quadrature.

### Zero finding.

Bisection is worst-case optimal for many classes $P$ of (smooth) functions; thus, the worst-case complexity is of order $\operatorname { log } ( 1 / \epsilon )$. For measures that are derived from the Wiener measure and concentrated on smooth functions, the average-case complexity is only of order $\operatorname { log } \operatorname { log } ( 1 / \epsilon )$. The upper bound is achieved by a hybrid secant-bisection method, see [a5].
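The hybrid idea can be sketched as follows: attempt a secant step, and fall back to bisection whenever the step leaves the bracket or the bracket fails to shrink fast enough. This is an illustrative safeguarded scheme in the spirit of, but not identical to, the method of [a5].

```python
def hybrid_secant_bisection(f, a, b, tol=1e-12, max_iter=200):
    """Zero finding on a bracket [a, b] with f(a) * f(b) <= 0: try a secant
    step first and bisect whenever the bracket fails to halve.  A sketch of
    the hybrid secant-bisection idea, not the exact algorithm of [a5]."""
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        width = b - a
        # secant step through the bracket endpoints, safeguarded to stay inside
        x = b - fb * (b - a) / (fb - fa) if fb != fa else 0.5 * (a + b)
        if not (a < x < b):
            x = 0.5 * (a + b)
        fx = f(x)
        if fx == 0:
            return x
        if fa * fx < 0:
            b, fb = x, fx
        else:
            a, fa = x, fx
        if b - a > 0.5 * width:  # too little progress: one bisection step
            m = 0.5 * (a + b)
            fm = f(m)
            if fm == 0:
                return m
            if fa * fm < 0:
                b, fb = m, fm
            else:
                a, fa = m, fm
        if b - a < tol:
            return 0.5 * (a + b)
    return 0.5 * (a + b)

root = hybrid_secant_bisection(lambda x: x ** 3 - 2.0, 0.0, 2.0)  # cube root of 2
```

Since each iteration at least halves the bracket, the scheme retains the worst-case guarantee of bisection while the secant steps provide the fast local convergence exploited in the average-case analysis.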