Namespaces
Variants
Actions

Difference between revisions of "Bregman distance"

From Encyclopedia of Mathematics
Jump to: navigation, search
(Importing text file)
 
m (tex encoded by computer)
 
Line 1: Line 1:
Given a convex closed set <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b1108801.png" /> with non-empty interior <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b1108802.png" /> and a [[Bregman function|Bregman function]] <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b1108803.png" /> with zone <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b1108804.png" />, the Bregman distance <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b1108805.png" /> is defined as:
+
<!--
 +
b1108801.png
 +
$#A+1 = 64 n = 0
 +
$#C+1 = 64 : ~/encyclopedia/old_files/data/B110/B.1100880 Bregman distance
 +
Automatically converted into TeX, above some diagnostics.
 +
Please remove this comment and the {{TEX|auto}} line below,
 +
if TeX found to be correct.
 +
-->
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b1108806.png" /></td> </tr></table>
+
{{TEX|auto}}
 +
{{TEX|done}}
  
Bregman distances were introduced in [[#References|[a1]]]. For several examples of Bregman distances for relevant sets <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b1108807.png" />, see [[Bregman function|Bregman function]]. It follows easily from the properties of Bregman functions that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b1108808.png" /> for all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b1108809.png" /> and all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088010.png" />, that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088011.png" /> if and only if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088012.png" /> and that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088013.png" /> is a convex function (cf. also [[Convex function (of a real variable)|convex function (of a real variable)]]) for all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088014.png" />. In general, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088015.png" /> does not satisfy the triangle inequality, it is not symmetric (i.e. it is not true that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088016.png" /> for all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088017.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088018.png" />) and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088019.png" /> is not convex. If <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088020.png" /> and either <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088021.png" /> is symmetric or <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088022.png" /> is convex for all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088023.png" />, then <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088024.png" /> is a quadratic function and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088025.png" /> is the square of an elliptic norm. A basic property of Bregman distances, which follows easily from the definition, is the following:
+
Given a convex closed set  $  C \subset  \mathbf R  ^ {n} $
 +
with non-empty interior  $  C  ^ {o} $
 +
and a [[Bregman function|Bregman function]] $  f $
 +
with zone  $  C $,
 +
the Bregman distance  $  {D _ {f} } : {C \times C  ^ {o} } \rightarrow \mathbf R $
 +
is defined as:
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088026.png" /></td> </tr></table>
+
$$
 +
D _ {f} ( x,y ) = f ( x ) - f ( y ) - \left \langle  {\nabla f ( y ) ,x - y } \right \rangle .
 +
$$
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088027.png" /></td> </tr></table>
+
Bregman distances were introduced in [[#References|[a1]]]. For several examples of Bregman distances for relevant sets  $  C $,
 +
see [[Bregman function|Bregman function]]. It follows easily from the properties of Bregman functions that  $  D _ {f} ( x,y ) \geq 0 $
 +
for all  $  x \in C $
 +
and all  $  y \in C  ^ {o} $,
 +
that  $  D _ {f} ( x,y ) = 0 $
 +
if and only if  $  x = y $
 +
and that  $  D _ {f} ( \cdot,y ) $
 +
is a convex function (cf. also [[Convex function (of a real variable)|convex function (of a real variable)]]) for all  $  y \in C  ^ {o} $.
 +
In general,  $  D _ {f} $
 +
does not satisfy the triangle inequality, it is not symmetric (i.e. it is not true that  $  D _ {f} ( x,y ) = D _ {f} ( y,x ) $
 +
for all  $  x $,
 +
$  y $)
 +
and  $  D _ {f} ( x, \cdot ) $
 +
is not convex. If  $  C = \mathbf R  ^ {n} $
 +
and either  $  D _ {f} $
 +
is symmetric or  $  D _ {f} ( x, \cdot ) $
 +
is convex for all  $  x \in C $,
 +
then  $  f $
 +
is a quadratic function and  $  D _ {f} $
 +
is the square of an elliptic norm. A basic property of Bregman distances, which follows easily from the definition, is the following:
  
for all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088028.png" />, and all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088029.png" />. Given a closed convex set <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088030.png" /> such that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088031.png" />, the Bregman projection onto <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088032.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088033.png" />, is defined as
+
$$
 +
D _ {f} ( x,y ) + D _ {f} ( y,z ) - D _ {f} ( x,z ) =
 +
$$
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088034.png" /></td> </tr></table>
+
$$
 +
=  
 +
\left \langle  {\nabla f ( z ) - \nabla f ( y ) ,x - y } \right \rangle
 +
$$
  
The properties of Bregman distances ensure existence and uniqueness of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088035.png" /> for all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088036.png" />. Given closed convex sets <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088037.png" /> such that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088038.png" /> for all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088039.png" /> and all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088040.png" /> (such sets are said to be zones consistent with <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088041.png" />), it is interesting to consider a sequence of successive Bregman projections onto the convex sets <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088042.png" />, i.e. the sequence <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088043.png" /> with <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088044.png" /> and iterative formula given by
+
for all  $  x \in C $,
 +
and all $  y,z \in C  ^ {o} $.  
 +
Given a closed convex set  $  L \subset  \mathbf R  ^ {n} $
 +
such that $  L \cap C \neq \emptyset $,  
 +
the Bregman projection onto $  L $,
 +
$  {P _ {L}  ^ {f} } : {C  ^ {o} } \rightarrow L $,  
 +
is defined as
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088045.png" /></td> </tr></table>
+
$$
 +
P _ {L}  ^ {f} ( z ) = { \mathop{\rm argmin} } \left \{ {D _ {f} ( x,z ) } : {x \in L \cap C } \right \} .
 +
$$
  
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088046.png" /> is the index of the convex set used in the <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088047.png" />th iteration (for instance cyclically, i.e. <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088048.png" />). This algorithm, called Bregman's method, converges to a point in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088049.png" /> if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088050.png" /> is non-empty (see [[#References|[a1]]]). It has been proved in [[#References|[a1]]] that if all the sets <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088051.png" /> are hyperplanes, then the limit of the sequence <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088052.png" /> is also the unique solution of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088053.png" />, subject to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088054.png" />. This property also holds for an underrelaxed version of the method, of the type
+
The properties of Bregman distances ensure existence and uniqueness of $  P _ {L}  ^ {f} ( z ) $
 +
for all  $  z \in C  ^ {o} $.  
 +
Given closed convex sets  $  L _ {1} \dots L _ {m} $
 +
such that  $  P _ {L _ {i}  }  ^ {f} ( z ) \in C  ^ {o} $
 +
for all  $  z \in C  ^ {o} $
 +
and all $  i $(
 +
such sets are said to be zones consistent with  $  f $),  
 +
it is interesting to consider a sequence of successive Bregman projections onto the convex sets  $  L _ {i} $,
 +
i.e. the sequence  $  \{ x  ^ {k} \} $
 +
with  $  x  ^ {0} \in C  ^ {o} $
 +
and iterative formula given by
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088055.png" /></td> </tr></table>
+
$$
 +
x ^ {k + 1 } = P _ {L _ {i ( k ) }  }  ^ {f} ( x  ^ {k} ) ,
 +
$$
  
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088056.png" /> is a hyperplane parallel to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088057.png" /> and lying between <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088058.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088059.png" /> (see [[#References|[a3]]]). Under suitable modifications in the definition of the hyperplane <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088060.png" />, the method has been extended to the case of minimization of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088061.png" /> subject to linear inequalities and linear interval constraints (see [[#References|[a2]]], [[#References|[a5]]]). The entropy maximization method known as MART (multiplicative algebraic reconstruction technique, see [[#References|[a4]]]) is a particular case of Bregman's method with <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088062.png" /> (the non-negative orthant of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088063.png" />) and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/b/b110/b110880/b11088064.png" />, under a specific underrelaxation strategy.
+
where $  i ( k ) $
 +
is the index of the convex set used in the  $  k $
 +
th iteration (for instance cyclically, i.e.  $  i ( k ) = k { \mathop{\rm mod} } m $).
 +
This algorithm, called Bregman's method, converges to a point in  $  L = \cap _ {i = 1 }  ^ {m} L _ {i} $
 +
if  $  L $
 +
is non-empty (see [[#References|[a1]]]). It has been proved in [[#References|[a1]]] that if all the sets  $  L _ {i} $
 +
are hyperplanes, then the limit of the sequence  $  \{ x  ^ {k} \} $
 +
is also the unique solution of  $  { \mathop{\rm min} } f $,
 +
subject to  $  x \in \cap _ {i = 1 }  ^ {m} L _ {i} $.  
 +
This property also holds for an underrelaxed version of the method, of the type
 +
 
 +
$$
 +
x ^ {k + 1 } = P _ {H _ {k}  }  ^ {f} ( k ) ,
 +
$$
 +
 
 +
where  $  H _ {k} $
 +
is a hyperplane parallel to $  L _ {i ( k ) }  $
 +
and lying between $  x  ^ {k} $
 +
and $  L _ {i ( k ) }  $(
 +
see [[#References|[a3]]]). Under suitable modifications in the definition of the hyperplane $  H _ {k} $,  
 +
the method has been extended to the case of minimization of $  f $
 +
subject to linear inequalities and linear interval constraints (see [[#References|[a2]]], [[#References|[a5]]]). The entropy maximization method known as MART (multiplicative algebraic reconstruction technique, see [[#References|[a4]]]) is a particular case of Bregman's method with $  C = \mathbf R _ {+}  ^ {n} $(
 +
the non-negative orthant of $  \mathbf R  ^ {n} $)  
 +
and $  f ( x ) = \sum _ {j = 1 }  ^ {n} x _ {j} { \mathop{\rm log} } x _ {j} $,  
 +
under a specific underrelaxation strategy.
  
 
Bregman distances have also been used to generate generalized proximal point methods for convex optimization and variational inequalities (cf. [[Proximal point methods in mathematical programming|Proximal point methods in mathematical programming]]).
 
Bregman distances have also been used to generate generalized proximal point methods for convex optimization and variational inequalities (cf. [[Proximal point methods in mathematical programming|Proximal point methods in mathematical programming]]).

Latest revision as of 06:29, 30 May 2020


Given a convex closed set $ C \subset \mathbf R ^ {n} $ with non-empty interior $ C ^ {o} $ and a Bregman function $ f $ with zone $ C $, the Bregman distance $ {D _ {f} } : {C \times C ^ {o} } \rightarrow \mathbf R $ is defined as:

$$ D _ {f} ( x,y ) = f ( x ) - f ( y ) - \left \langle {\nabla f ( y ) ,x - y } \right \rangle . $$

Bregman distances were introduced in [a1]. For several examples of Bregman distances for relevant sets $ C $, see Bregman function. It follows easily from the properties of Bregman functions that $ D _ {f} ( x,y ) \geq 0 $ for all $ x \in C $ and all $ y \in C ^ {o} $, that $ D _ {f} ( x,y ) = 0 $ if and only if $ x = y $ and that $ D _ {f} ( \cdot,y ) $ is a convex function (cf. also convex function (of a real variable)) for all $ y \in C ^ {o} $. In general, $ D _ {f} $ does not satisfy the triangle inequality, it is not symmetric (i.e. it is not true that $ D _ {f} ( x,y ) = D _ {f} ( y,x ) $ for all $ x $, $ y $) and $ D _ {f} ( x, \cdot ) $ is not convex. If $ C = \mathbf R ^ {n} $ and either $ D _ {f} $ is symmetric or $ D _ {f} ( x, \cdot ) $ is convex for all $ x \in C $, then $ f $ is a quadratic function and $ D _ {f} $ is the square of an elliptic norm. A basic property of Bregman distances, which follows easily from the definition, is the following:

$$ D _ {f} ( x,y ) + D _ {f} ( y,z ) - D _ {f} ( x,z ) = $$

$$ = \left \langle {\nabla f ( z ) - \nabla f ( y ) ,x - y } \right \rangle $$

for all $ x \in C $, and all $ y,z \in C ^ {o} $. Given a closed convex set $ L \subset \mathbf R ^ {n} $ such that $ L \cap C \neq \emptyset $, the Bregman projection onto $ L $, $ {P _ {L} ^ {f} } : {C ^ {o} } \rightarrow L $, is defined as

$$ P _ {L} ^ {f} ( z ) = { \mathop{\rm argmin} } \left \{ {D _ {f} ( x,z ) } : {x \in L \cap C } \right \} . $$

The properties of Bregman distances ensure existence and uniqueness of $ P _ {L} ^ {f} ( z ) $ for all $ z \in C ^ {o} $. Given closed convex sets $ L _ {1} \dots L _ {m} $ such that $ P _ {L _ {i} } ^ {f} ( z ) \in C ^ {o} $ for all $ z \in C ^ {o} $ and all $ i $( such sets are said to be zones consistent with $ f $), it is interesting to consider a sequence of successive Bregman projections onto the convex sets $ L _ {i} $, i.e. the sequence $ \{ x ^ {k} \} $ with $ x ^ {0} \in C ^ {o} $ and iterative formula given by

$$ x ^ {k + 1 } = P _ {L _ {i ( k ) } } ^ {f} ( x ^ {k} ) , $$

where $ i ( k ) $ is the index of the convex set used in the $ k $ th iteration (for instance cyclically, i.e. $ i ( k ) = k { \mathop{\rm mod} } m $). This algorithm, called Bregman's method, converges to a point in $ L = \cap _ {i = 1 } ^ {m} L _ {i} $ if $ L $ is non-empty (see [a1]). It has been proved in [a1] that if all the sets $ L _ {i} $ are hyperplanes, then the limit of the sequence $ \{ x ^ {k} \} $ is also the unique solution of $ { \mathop{\rm min} } f $, subject to $ x \in \cap _ {i = 1 } ^ {m} L _ {i} $. This property also holds for an underrelaxed version of the method, of the type

$$ x ^ {k + 1 } = P _ {H _ {k} } ^ {f} ( k ) , $$

where $ H _ {k} $ is a hyperplane parallel to $ L _ {i ( k ) } $ and lying between $ x ^ {k} $ and $ L _ {i ( k ) } $( see [a3]). Under suitable modifications in the definition of the hyperplane $ H _ {k} $, the method has been extended to the case of minimization of $ f $ subject to linear inequalities and linear interval constraints (see [a2], [a5]). The entropy maximization method known as MART (multiplicative algebraic reconstruction technique, see [a4]) is a particular case of Bregman's method with $ C = \mathbf R _ {+} ^ {n} $( the non-negative orthant of $ \mathbf R ^ {n} $) and $ f ( x ) = \sum _ {j = 1 } ^ {n} x _ {j} { \mathop{\rm log} } x _ {j} $, under a specific underrelaxation strategy.

Bregman distances have also been used to generate generalized proximal point methods for convex optimization and variational inequalities (cf. Proximal point methods in mathematical programming).

References

[a1] L.M. Bregman, "The relaxation method of finding the common points of convex sets and its application to the solution of problems in convex programming" USSR Comput. Math. Math. Phys. , 7 : 3 (1967) pp. 200–217 (In Russian)
[a2] Y. Censor, A. Lent, "An iterative row-action method for interval convex programming" J. Optimization Th. Appl. , 34 (1981) pp. 321–353
[a3] A.R. de Pierro, A.N. Iusem, "A relaxed version of Bregman's method for convex programming" J. Optimization Th. Appl. , 51 (1986) pp. 421–440
[a4] R. Gordon, R. Bender, G.T. Herman, "Algebraic reconstruction techniques (art) for three dimensional electron microscopy and x-ray photography" J. Theor. Biology , 29 (1970) pp. 471–481
[a5] A.N. Iusem, S.A. Zenios, "Interval underrelaxed Bregman method with an application" Optimization , 35 (1995) pp. 227–250
How to Cite This Entry:
Bregman distance. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Bregman_distance&oldid=15904
This article was adapted from an original article by A.N. Iusem (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article