Difference between revisions of "Dual functions"
(Importing text file) |
Ulf Rehmann (talk | contribs) m (tex encoded by computer) |
Latest revision as of 19:36, 5 June 2020
Functions complementary in the sense of Young, i.e. strictly convex functions (cf. Convex function (of a real variable)) connected by the Legendre transform.
Comments
For certain real-valued non-decreasing functions defined on the positive half-line (including zero) there is a natural notion of an inverse. If $ \phi $ and $ \psi $ are such inverses to each other, the functions $ \Phi $ and $ \Psi $ defined (on the positive half-line) by
$$ \Phi ( u) = \ \int\limits _ { 0 } ^ { u } \phi ( t) dt \ \ \textrm{ and } \ \ \Psi ( v) = \ \int\limits _ { 0 } ^ { v } \psi ( s) ds $$
are said to be complementary in the sense of Young or Young-conjugate. For them Young's inequality holds:
$$ uv \leq \ \Phi ( u) + \Psi ( v),\ \ u , v \geq 0. $$
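A minimal numerical sketch of Young's inequality, assuming the concrete pair $ \phi ( t) = t ^ {p-1} $, $ \psi ( s) = s ^ {q-1} $ with $ ( 1/p) + ( 1/q) = 1 $ (these two functions are inverse to each other on the positive half-line). The helper names `young_pair` and `young_gap` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def young_pair(p):
    """Return (Phi, Psi) for phi(t) = t**(p-1), whose inverse is psi(s) = s**(q-1)."""
    q = p / (p - 1.0)          # conjugate exponent: 1/p + 1/q = 1
    Phi = lambda u: u**p / p   # integral of t**(p-1) from 0 to u
    Psi = lambda v: v**q / q   # integral of s**(q-1) from 0 to v
    return Phi, Psi

def young_gap(p, u, v):
    """Phi(u) + Psi(v) - u*v, which Young's inequality says is non-negative."""
    Phi, Psi = young_pair(p)
    return Phi(u) + Psi(v) - u * v

# Evaluate the gap on a grid of non-negative (u, v) for the assumed choice p = 3.
u, v = np.meshgrid(np.linspace(0.0, 5.0, 201), np.linspace(0.0, 5.0, 201))
gap = young_gap(3.0, u, v)
print("smallest gap on the grid:", gap.min())
```

On this grid the gap should be non-negative up to rounding, and it vanishes where $ v = \phi ( u) $, for instance at $ u = 2 $, $ v = 4 $ when $ p = 3 $.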
Associated with a pair $ \Phi , \Psi $ of non-vanishing functions complementary in the sense of Young and a $ \sigma $-finite measure $ \mu $, there is a pair $ L _ \Phi $, $ L _ \Psi $ of complete normed spaces. These spaces, consisting of (equivalence classes of) $ \mu $-measurable functions, are called Orlicz spaces (cf. Orlicz space). The Lebesgue spaces $ L _ {p} $ (cf. Lebesgue space) are particular cases of Orlicz spaces, cf. [a4].
In a more abstract setting, the name dual functions is reminiscent of dual pair in duality theory and of dual problems in convex programming and optimal control (cf. Optimal control, mathematical theory of), but this name is rarely used in English: the most common name is (convex) conjugate functions (cf. Conjugate function).
Let $ X $ and $ Y $ be two real vector spaces in separate duality with respect to a bilinear form $ \langle \cdot , \cdot \rangle $ (the usual one if $ X = Y = \mathbf R ^ {n} $), and let $ f $ be a mapping from $ X $ into $ \mathbf R \cup \{ + \infty \} $ (if $ f $ is only defined on a subset $ D $ of $ X $, set $ f = + \infty $ on $ CD $, the complement of $ D $). If $ \{ f < + \infty \} $ is non-empty, the dual, or polar, or adjoint, or, better, conjugate function of $ f $ is the convex function $ f ^ {*} $ defined on $ Y $ by
$$ f ^ {*} ( y) = \ \sup _ {x \in X } \ \{ \langle x, y \rangle - f ( x) \} . $$
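The supremum in this definition can be approximated numerically when $ X = Y = \mathbf R $. The following is only a sketch under stated assumptions: the supremum is taken over a finite grid rather than over all of $ X $, the test function $ f ( x) = x ^ {2} /2 $ (whose conjugate is $ y ^ {2} /2 $) is chosen for illustration, and `conjugate_on_grid` is a hypothetical helper name.

```python
import numpy as np

def conjugate_on_grid(f_vals, xs, ys):
    """Approximate f*(y) = sup_x { x*y - f(x) }, with the sup restricted to the grid xs."""
    # Row i, column j holds ys[i] * xs[j] - f(xs[j]); maximise over j for each y.
    return np.max(np.outer(ys, xs) - f_vals[None, :], axis=1)

xs = np.linspace(-10.0, 10.0, 4001)   # grid standing in for X = R (an assumption)
ys = np.linspace(-2.0, 2.0, 9)        # points of Y = R at which f* is evaluated
f_star = conjugate_on_grid(0.5 * xs**2, xs, ys)

# Compare with the known closed form y**2 / 2.
print("max deviation from y**2/2:", np.max(np.abs(f_star - 0.5 * ys**2)))
```

The grid must be wide enough to contain the maximiser for every $ y $ of interest; otherwise the restricted supremum underestimates $ f ^ {*} ( y) $.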
The following result is a generalization of the geometric Hahn–Banach theorem on the bipolar of a set: the biconjugate function $ f ^ {**} $ of $ f $ is the greatest lower semi-continuous convex function bounded above by $ f $, and so is equal to $ f $ if and only if $ f $ is a lower semi-continuous convex function (in which case $ \{ f, f ^ {*} \} $ is called a pair of conjugate functions). The notion of conjugate function, which was introduced by W. Young in the case $ X = \mathbf R $ and by W. Fenchel in the case $ X = \mathbf R ^ {n} $, is very important in convex analysis; it is closely related to the notion of subdifferential: if $ f $ is convex and $ \partial f $ is its subdifferential, then for $ y \in Y $ and $ x \in X $ one has
$$ y \in \partial f ( x) \iff \ \langle x, y \rangle = f ( x) + f ^ {*} ( y). $$
This can be written, if $ f $ is lower semi-continuous, as
$$ y \in \partial f ( x) \iff \ x \in \partial f ^ {*} ( y). $$
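As an illustration of these two equivalences (this example is an addition, not taken from the references), take $ X = Y = \mathbf R $ with the usual product and $ f ( x) = | x | $. A direct computation of the supremum gives

$$ f ^ {*} ( y) = \sup _ {x \in \mathbf R } \{ xy - | x | \} = \begin{cases} 0, & | y | \leq 1, \\ + \infty , & | y | > 1, \end{cases} $$

so that $ \partial f ( 0) = [ - 1, 1] $ and, for every $ y \in [ - 1, 1] $, $ \langle 0, y \rangle = 0 = f ( 0) + f ^ {*} ( y) $. In the other direction, $ \partial f ^ {*} ( 1) = [ 0, + \infty ) $, which is exactly the set of $ x $ with $ 1 \in \partial f ( x) $.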
The mapping $ f \mapsto f ^ {*} $ is often called the Fenchel transform, sometimes with the name of Young or Legendre, or both, added. When $ X = Y = \mathbf R ^ {n} $ and $ f $ is sufficiently smooth as a convex function, it is a special case of the Legendre transform; on the other hand, it is also a special case of a Galois correspondence; these facts are of secondary importance in convex analysis. The notion of conjugate function plays a fundamental role in convex optimization. It is used to define the Lagrangian of some problem and the associated dual problem.
When $ X = Y = \mathbf R $, a function $ f: \mathbf R _ {+} \rightarrow \mathbf R _ {+} $ is called a Young function if it is a non-decreasing convex function such that $ f ( 0) = 0 $ and $ \lim\limits _ {t \rightarrow + \infty } {f ( t) } /t = + \infty $. The conjugate function $ f ^ {*} $ of a Young function $ f $ is still a Young function on $ \mathbf R _ {+} $; for example, when $ f ( x) = {x ^ {p} } /p $ with $ 1 < p < + \infty $, then $ f ^ {*} ( y) = {y ^ {q} } /q $ where $ q $ is the conjugate exponent of $ p $, i.e. $ ( 1/p) + ( 1/q) = 1 $. Young functions are used to define Orlicz spaces (cf. Orlicz space), and pairs of conjugate Young functions are used to study the duality between them; more generally they help to establish various inequalities in measure theory (Burkholder inequalities in martingale theory, Chernoff's inequality in classical probability theory, Kullback's inequality in statistics, etc.), via the easy-to-prove but fundamental Young inequality
$$ \langle x, y \rangle \leq f ( x) + f ^ {*} ( y), $$
which enabled Young to solve a problem about Fourier transformation.
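The example $ f ( x) = {x ^ {p} } /p $ above can be checked by a short computation, sketched here for $ x, y \geq 0 $. The function $ x \mapsto xy - {x ^ {p} } /p $ is concave, and its derivative $ y - x ^ {p-1} $ vanishes at $ x = y ^ {1/( p-1) } = y ^ {q-1} $, so

$$ f ^ {*} ( y) = y \cdot y ^ {q-1} - \frac{y ^ {( q-1) p } }{p} = y ^ {q} \left ( 1 - \frac{1}{p} \right ) = \frac{y ^ {q} }{q} , $$

where the identity $ ( q-1) p = q $ follows from $ ( 1/p) + ( 1/q) = 1 $.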
References
[a1] R.T. Rockafellar, "Conjugate duality and optimization", Reg. Conf. Ser. Appl. Math., SIAM (1974)
[a2] J. Neveu, "Martingales à temps discret", Masson (1972)
[a3] C. Dellacherie, P.A. Meyer, "Probabilities and potential", 2. Theory of martingales, North-Holland (1978–1988) (Translated from French)
[a4] A.C. Zaanen, "Linear analysis", North-Holland (1956)
Dual functions. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Dual_functions&oldid=12311