Bregman function

Let $ C $ be a closed convex subset of $ \mathbf R ^ {n} $ and $ C ^ {o} $ its interior. Consider a real-valued convex function (cf. also convex function (of a real variable)) $ f $ whose effective domain contains $ C $ and let $ {D _ {f} } : {C \times C ^ {o} } \rightarrow \mathbf R $ be defined as

$$ D _ {f} ( x,y ) = f ( x ) - f ( y ) - \left \langle {\nabla f ( y ) ,x - y } \right \rangle . $$
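The definition translates almost literally into code. The following is a minimal sketch, not a reference implementation: the names `bregman_distance`, `squared_norm` and `squared_norm_grad` are hypothetical, the gradient is supplied by the caller, and the squared Euclidean norm is chosen only as an illustration.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def bregman_distance(
    f: Callable[[Vector], float],
    grad_f: Callable[[Vector], Sequence[float]],
    x: Vector,
    y: Vector,
) -> float:
    """Return D_f(x, y) = f(x) - f(y) - <grad f(y), x - y>.

    The caller is responsible for passing x in C and y in the interior of C.
    """
    gradient = grad_f(y)
    inner = sum(g * (xj - yj) for g, xj, yj in zip(gradient, x, y))
    return f(x) - f(y) - inner


# Illustrative choice (an assumption of this sketch): f(x) = ||x||^2 on R^n.
def squared_norm(x: Vector) -> float:
    return sum(v * v for v in x)


def squared_norm_grad(x: Vector) -> list:
    return [2.0 * v for v in x]


d = bregman_distance(squared_norm, squared_norm_grad, [1.0, 2.0], [0.0, 0.5])
print(d)  # 3.25, which equals ||x - y||^2 = 1 + 2.25
```

Note that $ D _ {f} $ is in general not symmetric in its two arguments, so the order of `x` and `y` matters.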

$ f $ is said to be a Bregman function with zone $ C $ (and $ D _ {f} $ the Bregman distance induced by $ f $) if the following conditions hold:

B1) $ f $ is continuously differentiable on $ C ^ {o} $;

B2) $ f $ is strictly convex and continuous on $ C $;

B3) for all $ x \in C $ and all $ \delta \in \mathbf R $, the partial level sets $ \Gamma ( x, \delta ) = \{ {y \in C ^ {o} } : {D _ {f} ( x,y ) \leq \delta } \} $ are bounded;

B4) if $ \{ y ^ {k} \} \subset C ^ {o} $ converges to $ y ^ {*} $, then $ D _ {f} ( y ^ {*} ,y ^ {k} ) $ converges to $ 0 $;

B5) if $ \{ x ^ {k} \} \subset C $ and $ \{ y ^ {k} \} \subset C ^ {o} $ are sequences such that $ \{ x ^ {k} \} $ is bounded, $ {\lim\limits } _ {k \rightarrow \infty } y ^ {k} = y ^ {*} $ and $ {\lim\limits } _ {k \rightarrow \infty } D _ {f} ( x ^ {k} ,y ^ {k} ) = 0 $, then $ {\lim\limits } _ {k \rightarrow \infty } x ^ {k} = y ^ {*} $.

Bregman functions were introduced in [a1]. B4) and B5) hold automatically when $ x ^ {k} ,y ^ {*} $ are in $ C ^ {o} $, as a consequence of B1), B2) and B3), and so they need to be checked only at points on the boundary $ \partial C $ of $ C $. When $ C = \mathbf R ^ {n} $, a sufficient condition for a strictly convex differentiable function $ f $ to be a Bregman function is

$$ {\lim\limits } _ {\left \| x \right \| \rightarrow \infty } { \frac{f ( x ) }{\left \| x \right \| } } = \infty $$

(see [a2]).
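As a sketch of what checking B4) at a boundary point involves, take the one-dimensional negative entropy $ f ( t ) = t { \mathop{\rm log} } t $ on $ C = [ 0, \infty ) $ with the convention $ 0 { \mathop{\rm log} } 0 = 0 $ (a choice made here only for illustration). For a sequence $ y ^ {k} > 0 $ converging to the boundary point $ y ^ {*} = 0 $ one finds

$$ D _ {f} ( 0,y ^ {k} ) = 0 - y ^ {k} { \mathop{\rm log} } y ^ {k} - ( { \mathop{\rm log} } y ^ {k} + 1 ) ( 0 - y ^ {k} ) = y ^ {k} \rightarrow 0, $$

so B4) holds at this boundary point.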

A Bregman function $ f $ is said to be boundary coercive if for all $ \{ y ^ {k} \} \subset C ^ {o} $ such that $ {\lim\limits } _ {k \rightarrow \infty } y ^ {k} = y \in \partial C $ one has $ {\lim\limits } _ {k \rightarrow \infty } D _ {f} ( x,y ^ {k} ) = \infty $ for all $ x \in C ^ {o} $, and zone coercive if the image of $ C ^ {o} $ under $ \nabla f $ is equal to $ \mathbf R ^ {n} $. Zone coerciveness implies boundary coerciveness (see [a3]). These notions are closely related to essential smoothness, as defined in [a5]. For a boundary-coercive Bregman function $ f $ the zone $ C $ is uniquely determined by $ f $, i.e. $ f $ cannot be finitely extended outside $ C $. This property is essential in most applications of Bregman functions.
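Boundary coerciveness can be observed numerically. The sketch below assumes the one-dimensional negative entropy $ f ( t ) = t { \mathop{\rm log} } t $ on $ C = [ 0, \infty ) $, for which $ D _ {f} ( x,y ) = x { \mathop{\rm log} } ( x/y ) + y - x $; the function name `entropy_distance_1d` and the sample points are hypothetical choices made for illustration.

```python
import math


def entropy_distance_1d(x: float, y: float) -> float:
    """D_f(x, y) for f(t) = t log t, with x > 0 and y > 0."""
    return x * math.log(x / y) + y - x


x = 1.0  # fixed point in the interior of C = [0, infinity)
exponents = (1, 3, 6, 9)
# y^k = 10^(-k) approaches the boundary point 0 from inside C.
values = [entropy_distance_1d(x, 10.0 ** (-k)) for k in exponents]

for k, v in zip(exponents, values):
    print(f"y = 1e-{k}: D_f(x, y) = {v:.3f}")
```

The printed values grow without bound, roughly like $ - { \mathop{\rm log} } y ^ {k} $, as $ y ^ {k} $ approaches the boundary.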

Examples.

$ \mathbf R ^ {n} _ {+} $ denotes the non-negative orthant of $ \mathbf R ^ {n} $.

i) $ C = \mathbf R ^ {n} $, $ f ( x ) = \| x \| ^ {2} $. In this case $ D _ {f} ( x,y ) = \| {x - y } \| ^ {2} $. More generally, $ f ( x ) = x ^ {t} Mx $, with $ M \in \mathbf R ^ {n \times n } $ symmetric and positive definite, in which case $ D _ {f} ( x,y ) = ( x - y ) ^ {t} M ( x - y ) $.
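The general formula follows from the definition and the symmetry of $ M $, which gives $ \nabla f ( y ) = 2My $. As a short check:

$$ \begin{aligned} D _ {f} ( x,y ) &= x ^ {t} Mx - y ^ {t} My - \left \langle {2My,x - y } \right \rangle \\ &= x ^ {t} Mx - 2y ^ {t} Mx + y ^ {t} My \\ &= ( x - y ) ^ {t} M ( x - y ) . \end{aligned} $$

Taking $ M $ to be the identity matrix recovers $ D _ {f} ( x,y ) = \| {x - y } \| ^ {2} $.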

ii) $ C = \mathbf R _ {+} ^ {n} $, $ f ( x ) = \sum _ {j = 1 } ^ {n} x _ {j} { \mathop{\rm log} } x _ {j} $, extended by continuity to $ \partial \mathbf R _ {+} ^ {n} $ with the convention that $ 0 { \mathop{\rm log} } 0 = 0 $. In this case

$$ D _ {f} ( x,y ) = \sum _ {j = 1 } ^ { n } \left ( x _ {j} { \mathop{\rm log} } { \frac{x _ {j} }{y _ {j} } } + y _ {j} - x _ {j} \right ) , $$

which is the Kullback–Leibler information divergence, widely used in statistics (see Kullback–Leibler-type distance measures; [a4]).
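A minimal sketch of this distance, including the convention $ 0 { \mathop{\rm log} } 0 = 0 $ for points $ x $ on the boundary of the orthant, might look as follows. The function name `kl_bregman` and the error handling are assumptions of this sketch.

```python
import math
from typing import Sequence


def kl_bregman(x: Sequence[float], y: Sequence[float]) -> float:
    """Sum over j of x_j log(x_j / y_j) + y_j - x_j.

    x may lie on the boundary of the non-negative orthant (some x_j = 0);
    y must lie in its interior (all y_j > 0).
    """
    total = 0.0
    for xj, yj in zip(x, y):
        if xj < 0.0 or yj <= 0.0:
            raise ValueError("need x_j >= 0 and y_j > 0 for every j")
        # Convention 0 log 0 = 0 for coordinates with x_j = 0.
        log_term = xj * math.log(xj / yj) if xj > 0.0 else 0.0
        total += log_term + yj - xj
    return total


print(kl_bregman([0.0, 1.0], [0.5, 1.0]))  # 0.5
```

When both arguments are probability vectors the terms $ y _ {j} - x _ {j} $ sum to zero, leaving the familiar expression $ \sum _ {j} x _ {j} { \mathop{\rm log} } ( x _ {j} /y _ {j} ) $.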

iii) $ C = \mathbf R _ {+} ^ {n} $, $ f ( x ) = \sum _ {j = 1 } ^ {n} ( x _ {j} ^ \alpha - x _ {j} ^ \beta ) $ with $ \alpha \geq 1 $, $ 0 < \beta < 1 $. For $ \alpha = 2 $, $ \beta = {1 / 2 } $ one has

$$ D _ {f} ( x,y ) = \left \| {x - y } \right \| ^ {2} + \sum _ {j = 1 } ^ { n } \left [ { \frac{( \sqrt {x _ {j} } - \sqrt {y _ {j} } ) ^ {2} }{2 \sqrt {y _ {j} } } } \right ] , $$

and for $ \alpha = 1 $, $ \beta = 1/2 $ one has

$$ D _ {f} ( x,y ) = \sum _ {j = 1 } ^ { n } \left [ { \frac{( \sqrt {x _ {j} } - \sqrt {y _ {j} } ) ^ {2} }{2 \sqrt {y _ {j} } } } \right ] . $$
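Both formulas can be verified coordinate-wise. The linear term $ x _ {j} $ (the case $ \alpha = 1 $) contributes nothing to $ D _ {f} $, the term $ x _ {j} ^ {2} $ contributes $ ( x _ {j} - y _ {j} ) ^ {2} $, and for $ \varphi ( t ) = - \sqrt t $, with $ s \geq 0 $ and $ t > 0 $ (symbols introduced only for this check), one has

$$ \begin{aligned} D _ \varphi ( s,t ) &= - \sqrt s + \sqrt t + \frac{s - t }{2 \sqrt t } \\ &= \frac{s - 2 \sqrt {st } + t }{2 \sqrt t } = \frac{( \sqrt s - \sqrt t ) ^ {2} }{2 \sqrt t } . \end{aligned} $$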

iv) $ C $ is a box (i.e., $ C = [ a _ {1} ,b _ {1} ] \times \dots \times [ a _ {n} ,b _ {n} ] $ with $ a _ {j} < b _ {j} $, $ 1 \leq j \leq n $),

$$ f ( x ) = \sum _ {j = 1 } ^ { n } [ ( x _ {j} - a _ {j} ) { \mathop{\rm log} } ( x _ {j} - a _ {j} ) + ( b _ {j} - x _ {j} ) { \mathop{\rm log} } ( b _ {j} - x _ {j} ) ] . $$

In this case

$$ D _ {f} ( x,y ) = \sum _ {j = 1 } ^ { n } \left [ ( x _ {j} - a _ {j} ) { \mathop{\rm log} } \left ( { \frac{x _ {j} - a _ {j} }{y _ {j} - a _ {j} } } \right ) + ( b _ {j} - x _ {j} ) { \mathop{\rm log} } \left ( { \frac{b _ {j} - x _ {j} }{b _ {j} - y _ {j} } } \right ) \right ] . $$
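The closed form for the box can be checked numerically against the general definition of $ D _ {f} $. The sketch below does this for interior points only; all function names and the sample box are hypothetical, and the gradient $ \partial f / \partial y _ {j} = { \mathop{\rm log} } ( ( y _ {j} - a _ {j} ) / ( b _ {j} - y _ {j} ) ) $ is computed by hand.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def box_entropy(x: Vector, a: Vector, b: Vector) -> float:
    """f(x) = sum of (x_j - a_j) log(x_j - a_j) + (b_j - x_j) log(b_j - x_j)."""
    return sum(
        (xj - aj) * math.log(xj - aj) + (bj - xj) * math.log(bj - xj)
        for xj, aj, bj in zip(x, a, b)
    )


def box_entropy_grad(y: Vector, a: Vector, b: Vector) -> List[float]:
    """Gradient of f: the two '+1' and '-1' terms cancel in each coordinate."""
    return [math.log((yj - aj) / (bj - yj)) for yj, aj, bj in zip(y, a, b)]


def distance_from_definition(x: Vector, y: Vector, a: Vector, b: Vector) -> float:
    """D_f(x, y) = f(x) - f(y) - <grad f(y), x - y>."""
    grad = box_entropy_grad(y, a, b)
    inner = sum(g * (xj - yj) for g, xj, yj in zip(grad, x, y))
    return box_entropy(x, a, b) - box_entropy(y, a, b) - inner


def distance_closed_form(x: Vector, y: Vector, a: Vector, b: Vector) -> float:
    """The closed-form expression for D_f on the box."""
    return sum(
        (xj - aj) * math.log((xj - aj) / (yj - aj))
        + (bj - xj) * math.log((bj - xj) / (bj - yj))
        for xj, yj, aj, bj in zip(x, y, a, b)
    )


a, b = [0.0, 0.0], [2.0, 3.0]      # sample box [0, 2] x [0, 3]
x, y = [0.5, 1.5], [1.0, 1.0]      # two interior points
gap = abs(distance_from_definition(x, y, a, b) - distance_closed_form(x, y, a, b))
print(gap)  # difference between the two evaluations, at rounding-error level
```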

v) $ C $ is a polyhedron with non-empty interior (i.e., $ C = \{ {x \in \mathbf R ^ {n} } : {Ax \leq b } \} $ with $ A \in \mathbf R ^ {m \times n } $, $ b \in \mathbf R ^ {m} $ and $ { \mathop{\rm rank} } ( A ) = n $, so that $ m \geq n $), $ f ( x ) = \sum _ {i = 1 } ^ {m} ( b _ {i} - \langle {a ^ {i} ,x } \rangle ) { \mathop{\rm log} } ( b _ {i} - \langle {a ^ {i} ,x } \rangle ) $, where $ a ^ {i} $ ($ 1 \leq i \leq m $) are the rows of $ A $. In this case

$$ D _ {f} ( x,y ) = \sum _ {i = 1 } ^ { m } \left [ ( b _ {i} - \left \langle {a ^ {i} ,x } \right \rangle ) { \mathop{\rm log} } \left ( { \frac{b _ {i} - \left \langle {a ^ {i} ,x } \right \rangle }{b _ {i} - \left \langle {a ^ {i} ,y } \right \rangle } } \right ) + \left \langle {a ^ {i} , x - y } \right \rangle \right ] . $$
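This expression can be derived in a few lines. Write $ s _ {i} ( x ) = b _ {i} - \langle {a ^ {i} ,x } \rangle $ for the slack of the $ i $-th constraint (notation introduced here only for this check). Then $ f ( x ) = \sum _ {i = 1 } ^ {m} s _ {i} ( x ) { \mathop{\rm log} } s _ {i} ( x ) $, $ \nabla f ( y ) = - \sum _ {i = 1 } ^ {m} ( { \mathop{\rm log} } s _ {i} ( y ) + 1 ) a ^ {i} $ and $ \langle {a ^ {i} ,x - y } \rangle = s _ {i} ( y ) - s _ {i} ( x ) $, so that

$$ \begin{aligned} D _ {f} ( x,y ) &= \sum _ {i = 1 } ^ { m } \left [ s _ {i} ( x ) { \mathop{\rm log} } s _ {i} ( x ) - s _ {i} ( y ) { \mathop{\rm log} } s _ {i} ( y ) + ( { \mathop{\rm log} } s _ {i} ( y ) + 1 ) ( s _ {i} ( y ) - s _ {i} ( x ) ) \right ] \\ &= \sum _ {i = 1 } ^ { m } \left [ s _ {i} ( x ) { \mathop{\rm log} } { \frac{s _ {i} ( x ) }{s _ {i} ( y ) } } + s _ {i} ( y ) - s _ {i} ( x ) \right ] , \end{aligned} $$

which is the formula above once $ s _ {i} ( y ) - s _ {i} ( x ) $ is written as $ \langle {a ^ {i} ,x - y } \rangle $.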

All the Bregman functions in the above examples are zone coercive, except for iii) with $ \alpha = 1 $, which is only boundary coercive.
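The exceptional case can be checked coordinate-wise. For $ \alpha = 1 $ the partial derivatives are

$$ \frac{\partial f }{\partial x _ {j} } ( x ) = 1 - \beta x _ {j} ^ {\beta - 1 } , \qquad 0 < \beta < 1, $$

which increase in $ x _ {j} $, tend to $ - \infty $ as $ x _ {j} \rightarrow 0 ^ {+} $ and tend to $ 1 $ as $ x _ {j} \rightarrow \infty $. Hence the image of $ C ^ {o} $ under $ \nabla f $ is $ ( - \infty ,1 ) ^ {n} \neq \mathbf R ^ {n} $, and $ f $ is not zone coercive. Boundary coerciveness is visible in the formula for $ \beta = 1/2 $: if $ y _ {j} ^ {k} \rightarrow 0 $ for some $ j $ while $ x _ {j} > 0 $, the term $ ( \sqrt {x _ {j} } - \sqrt {y _ {j} ^ {k} } ) ^ {2} / ( 2 \sqrt {y _ {j} ^ {k} } ) $ tends to $ \infty $.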

Bregman functions are used in algorithms for convex feasibility problems and linearly constrained convex optimization (cf. Bregman distance), as well as for generalizations of the proximal point method for convex optimization (cf. Proximal point methods in mathematical programming).

References

[a1]	L.M. Bregman, "The relaxation method of finding the common points of convex sets and its application to the solution of problems in convex programming" USSR Comput. Math. Math. Phys. , 7 : 3 (1967) pp. 200–217 (In Russian)
[a2]	A.R. de Pierro, A.N. Iusem, "A relaxed version of Bregman's method for convex programming" J. Optimization Th. Appl. , 51 (1986) pp. 421–440
[a3]	A.N. Iusem, "On some properties of generalized proximal point methods for quadratic and linear programming" J. Optimization Th. Appl. , 85 (1995) pp. 593–612
[a4]	F. Liese, I. Vajda, "Convex statistical distances" , Teubner (1987)
[a5]	R.T. Rockafellar, "Convex analysis" , Princeton Univ. Press (1970)

How to Cite This Entry:
Bregman function. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Bregman_function&oldid=46162

This article was adapted from an original article by A.N. Iusem (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.
