Convex function (of a real variable)
A function $f$, defined on some interval, satisfying the condition
$$ f \left( \frac{x_1 + x_2}{2} \right) \le \frac{f(x_1) + f(x_2)}{2} \tag{1} $$
for every two points $x_1$ and $x_2$ from this interval. Geometrically, this condition means that the midpoint of any chord of the graph of $f$ lies either above the graph or on it. If inequality (1) is strict for all $x_1 \ne x_2$, then $f$ is called strictly convex. Examples of convex functions include $x^p$ with $p \ge 1$ for $x \ge 0$, $x \ln x$ for $x > 0$, and $\left| x \right|$ for all $x$. If the sign of inequality (1) is reversed, the function is called concave.
All measurable convex functions on open intervals are continuous. There exist convex functions which are not continuous, but they are very irregular: if a function $f$ is convex on the interval $(a, b)$ and is bounded from above on some interval lying inside $(a, b)$, then it is continuous on $(a, b)$. Thus, a discontinuous convex function is unbounded from above on every interior interval and is not measurable.
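As a numerical illustration of definition (1), the following sketch tests the midpoint inequality on a finite grid for the examples listed above. The helper name `is_midpoint_convex`, the grid and the tolerance are illustrative assumptions; a finite check of this kind can refute convexity but cannot prove it.

```python
import math

def is_midpoint_convex(f, points, tol=1e-12):
    """Spot-check inequality (1) for every pair of sample points.

    Returns False as soon as one pair violates the inequality
    (up to the floating-point tolerance tol).
    """
    for x1 in points:
        for x2 in points:
            mid = 0.5 * (x1 + x2)
            if f(mid) > 0.5 * (f(x1) + f(x2)) + tol:
                return False
    return True

# Sample points in (0, 5]; x * ln(x) requires x > 0.
grid = [0.1 * k for k in range(1, 51)]

print(is_midpoint_convex(lambda x: x ** 3, grid))           # x^p with p = 3
print(is_midpoint_convex(lambda x: x * math.log(x), grid))  # x ln x
print(is_midpoint_convex(abs, grid))                        # |x|
print(is_midpoint_convex(math.sqrt, grid))                  # sqrt is concave, so (1) fails
```

The first three calls print `True` and the last prints `False`, since the square root satisfies the reversed inequality.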
If a function $f$ is continuous on an interval, and if each chord of its graph contains at least one point, other than the end points of the chord, lying above the graph or on it, then $f$ is convex. It follows from condition (1) that, for a continuous function, the centre of gravity of any finite number of material points lying on the graph of the function lies either above the graph or on it: for any points $x_1, \ldots, x_n$ of the interval and any numbers $p_k > 0, k = 1, \ldots, n$ (where $n$ is arbitrary), the Jensen inequality
$$ f \left( \frac{\sum_{k = 1}^{n} p_k x_k}{\sum_{k = 1}^{n} p_k} \right) \le \frac{\sum_{k = 1}^{n} p_k f(x_k)}{\sum_{k = 1}^{n} p_k} \tag{2} $$
is valid.
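A small sketch of inequality (2) for the convex function $x \ln x$ follows. The function name `jensen_sides`, the sample points and the weights are assumptions chosen only for illustration.

```python
import math

def jensen_sides(f, xs, ps):
    """Return (left-hand side, right-hand side) of Jensen's inequality (2)."""
    total = sum(ps)
    weighted_mean = sum(p * x for p, x in zip(ps, xs)) / total
    lhs = f(weighted_mean)
    rhs = sum(p * f(x) for p, x in zip(ps, xs)) / total
    return lhs, rhs

xs = [0.5, 1.0, 2.0, 4.0]   # points x_k > 0
ps = [1.0, 2.0, 3.0, 4.0]   # weights p_k > 0

lhs, rhs = jensen_sides(lambda x: x * math.log(x), xs, ps)
print(lhs, rhs, lhs <= rhs)
```

Here the weighted mean of the points is $2.45$, and the left-hand side $2.45 \ln 2.45$ does not exceed the weighted mean of the function values.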
If, for some function $f$, inequality (2) holds with $n = 2$ for any two points $x_1$ and $x_2$ in some interval and any $p_1 > 0$ and $p_2 > 0$, then the function $f$ is continuous in the interior of this interval and, of course, convex on it. Any chord of the graph of a continuous convex function either coincides with the corresponding part of the graph or lies entirely above the graph except for its end points. This means that if a continuous convex function is not linear on any interval, then strict inequality holds in (1) and (2) for any pairwise distinct values of the argument, i.e. $f$ is a strictly convex function.
A continuous function is convex if and only if the set of points of the plane located on or above its graph, i.e. its epigraph (also called its supergraph), is a convex set. For a continuous function $f$, defined on an interval $(a, b)$, to be convex, it is necessary and sufficient that through each point of the graph there passes at least one straight line (known as a supporting line) which lies under the graph, or partly on it, over the whole interval $(a, b)$; i.e. for any point $x_0 \in (a, b)$ there exists a $k = k(x_0)$ such that
$$ f(x_0) + k(x - x_0) \le f(x) \tag{3} $$
for all $x \in (a, b)$.
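Condition (3) can be spot-checked on sample points, as in the following sketch. The helper `is_supporting_line`, the test functions and the grid are illustrative assumptions, and the check covers only finitely many $x$.

```python
def is_supporting_line(f, x0, k, points, tol=1e-12):
    """Check inequality (3), f(x0) + k*(x - x0) <= f(x), at every sample point."""
    return all(f(x0) + k * (x - x0) <= f(x) + tol for x in points)

grid = [-3.0 + 0.05 * i for i in range(121)]  # sample points in [-3, 3]

def square(x):
    return x * x

print(is_supporting_line(square, 1.0, 2.0, grid))  # tangent slope at x0 = 1
print(is_supporting_line(square, 1.0, 3.0, grid))  # too steep: fails between 1 and 2
print(is_supporting_line(abs, 0.0, 0.5, grid))     # one of many supporting lines of |x| at 0
```

The output is `True`, `False`, `True`: for $x^2$ at $x_0 = 1$ only the slope $k = 2$ works, while $|x|$ admits every slope in $[-1, 1]$ at the origin.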
A continuous function which is convex on an open interval has no strict local maximum. If a function $f$ is continuous and convex on an interval $(a, b)$, then at each point $x_0 \in (a, b)$ it has a finite left derivative $D_{-} f(x_0)$ and a finite right derivative $D_{+} f(x_0)$; moreover, $D_{-} f(x_0) \le D_{+} f(x_0)$ and, in addition, if the number $k = k(x_0)$ satisfies condition (3), then the inequalities $D_{-} f(x_0) \le k(x_0) \le D_{+} f(x_0)$ hold. The functions $D_{-} f(x)$ and $D_{+} f(x)$ are non-decreasing, and at all points except, possibly, an at most countable set of them, $D_{-} f(x) = D_{+} f(x) = f'(x)$, so that $f$ is differentiable at these points. On each closed interval located inside $(a, b)$ the function $f$ satisfies a Lipschitz condition and is thus absolutely continuous. This makes it possible to establish the following convexity criterion: a continuous function is convex if and only if it is the indefinite integral of a non-decreasing function.
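The one-sided derivatives can be approximated by difference quotients, as in the sketch below. The helper `one_sided_quotients` and the step size `h` are assumptions for illustration; the quotients only approximate $D_{-} f$ and $D_{+} f$, although for $|x|$ at the origin they happen to be exact.

```python
def one_sided_quotients(f, x0, h=1e-6):
    """Return difference quotients approximating D_- f(x0) and D_+ f(x0)."""
    left = (f(x0) - f(x0 - h)) / h
    right = (f(x0 + h) - f(x0)) / h
    return left, right

# |x| has a corner at 0: the left and right derivatives differ.
print(one_sided_quotients(abs, 0.0))   # (-1.0, 1.0)

# x^2 is differentiable at 1: both quotients are close to f'(1) = 2.
left, right = one_sided_quotients(lambda x: x * x, 1.0)
print(left <= right, left, right)
```

In both cases the left quotient does not exceed the right one, in line with $D_{-} f(x_0) \le D_{+} f(x_0)$.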
If a function $f$ is differentiable on an interval, it is (strictly) convex on this interval if and only if its derivative is non-decreasing (respectively, increasing). At a point of the graph of a continuous convex function at which the function is differentiable there exists a unique supporting line, namely the tangent at this point. Conversely, if at every point of the graph of a function which is differentiable on an interval the tangent to the graph at that point lies under the graph in some neighbourhood of that point (except at the point of tangency itself), then the function is strictly convex; if the tangent lies under the graph or partly on it, the function is convex.
If the function $f$ is twice-differentiable on the interval, it is convex on this interval if and only if its second derivative is non-negative on this interval (this theorem is valid for the second symmetric derivative, as well as for the ordinary second derivative). If the function has a positive second derivative at each point of some interval, it is strictly convex on that interval.
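The second-derivative criterion can be illustrated with the quotient whose limit defines the second symmetric derivative. The sketch below applies it to $x \ln x$, whose second derivative is $1/x > 0$ for $x > 0$; the helper name and the step size `h` are illustrative assumptions, and the printed values are only approximations.

```python
import math

def second_symmetric_quotient(f, x, h=1e-3):
    """(f(x + h) - 2 f(x) + f(x - h)) / h^2, which tends to the
    second symmetric derivative of f at x as h -> 0."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def f(x):
    return x * math.log(x)

for x in [0.5, 1.0, 2.0, 4.0]:
    approx = second_symmetric_quotient(f, x)
    # Compare with the exact second derivative 1/x.
    print(x, approx, 1.0 / x)
```

Each approximate value is positive and close to $1/x$, consistent with $x \ln x$ being strictly convex on $x > 0$.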
If the functions $f_i$ are convex on an interval $(a, b)$ and $p_i > 0, i = 1, \ldots, n$, then the function
$$ f = \sum\limits_{i = 1}^{n} p_i f_i $$
is also convex on this interval; moreover, if at least one of the functions $f_i$ is strictly convex, then $f$ is strictly convex as well.
There exist various generalizations of the concept of convexity to functions of several variables. For instance, let a function $y = f(x^1, \ldots, x^n)$ be defined on a convex set $M$ of the $n$-dimensional affine space $E^n$. The function $f$ is called convex if inequality (1) is valid for all points $x_1 \in M$ and $x_2 \in M$, where $x_1 + x_2$ denotes the sum of the $n$-dimensional vectors $x_1$ and $x_2$. The properties of a convex function of one variable generalize correspondingly to functions of several variables; for example, Jensen's inequality (2) holds for continuous convex functions of several variables. A continuous function is convex if and only if the set of points $(x^1, \ldots, x^n, y)$ of the space $E^{n + 1}$ lying on or above its graph is convex.
A continuous function $f$ defined on a convex domain $G$ is convex if and only if for each point $x \in G$ there exists a linear function
$$ l(y) = a_1 y^1 + \ldots + a_n y^n + b, $$
such that
$$ f(x) = l(x),\qquad f(y) \ge l(y),\qquad y \in G. \tag{4} $$
The hyperplane in $E^{n + 1}$ defined by the equation $z = l(y)$, where $z$ denotes the last coordinate, i.e. the graph of $l$, is called a supporting hyperplane.
If a function $f$ is continuously differentiable in $G$, condition (4) is equivalent to the condition
$$ f(y) - f(x) - \sum\limits_{i = 1}^{n} \frac{\partial f(x)}{\partial x^i} (y^i - x^i) \ge 0,\qquad x, y \in G. $$
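The gradient inequality above can be spot-checked for a concrete function. The sketch below uses $f(x, y) = x^2 + xy + y^2$, which is an example chosen here and not taken from the text; the helper names, the random sample and the tolerance are likewise assumptions.

```python
import random

def f(x, y):
    return x * x + x * y + y * y

def grad_f(x, y):
    """Gradient of f, computed by hand."""
    return (2.0 * x + y, x + 2.0 * y)

def gradient_gap(p, q):
    """f(q) - f(p) - <grad f(p), q - p>; non-negative for a convex f."""
    gx, gy = grad_f(*p)
    return f(*q) - f(*p) - gx * (q[0] - p[0]) - gy * (q[1] - p[1])

rng = random.Random(0)  # fixed seed so the sample is reproducible
pairs = [((rng.uniform(-5, 5), rng.uniform(-5, 5)),
          (rng.uniform(-5, 5), rng.uniform(-5, 5))) for _ in range(1000)]

# The small negative tolerance absorbs floating-point rounding.
print(all(gradient_gap(p, q) >= -1e-9 for p, q in pairs))  # True
```

For this particular $f$ the gap equals $d_1^2 + d_1 d_2 + d_2^2$ with $d = q - p$, which is non-negative for every $d$.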
If $f$ is twice-differentiable, condition (4) is equivalent to the condition that the second differential of the function, i.e. the quadratic form
$$ \sum\limits_{i = 1}^{n} \sum\limits_{j = 1}^{n} \frac{\partial^2 f(x)}{\partial x^i \partial x^j} \xi^i \xi^j $$
is non-negative for all $x \in G$ and all vectors $\xi = (\xi^1, \ldots, \xi^n)$.
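For two variables, non-negativity of the quadratic form can be tested directly on the matrix of second partial derivatives. The sketch below relies on the standard fact that a symmetric $2 \times 2$ matrix is positive semidefinite exactly when its trace and determinant are non-negative; the helper names and the two example functions are assumptions made for illustration.

```python
def is_psd_2x2(a, b, d, tol=1e-12):
    """True if the symmetric matrix [[a, b], [b, d]] is positive semidefinite,
    i.e. its trace and determinant are both non-negative."""
    return a + d >= -tol and a * d - b * b >= -tol

def quadratic_form(a, b, d, xi1, xi2):
    """Value of the quadratic form with matrix [[a, b], [b, d]] at (xi1, xi2)."""
    return a * xi1 * xi1 + 2.0 * b * xi1 * xi2 + d * xi2 * xi2

# Constant matrix of second derivatives of x^2 + x*y + y^2: convex.
print(is_psd_2x2(2.0, 1.0, 2.0))    # True

# Constant matrix of second derivatives of x^2 - y^2: not convex.
print(is_psd_2x2(2.0, 0.0, -2.0))   # False
print(quadratic_form(2.0, 0.0, -2.0, 0.0, 1.0))  # -2.0, a direction of negative curvature
```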
Another important generalization of the concept of a convex function for functions of several variables is the concept of a subharmonic function. The concept of a convex function can be extended in a natural manner to include functions defined on corresponding subsets of infinite-dimensional linear spaces; cf. Convex functional.
Comments
Convexity of a real-valued function $f$ on an interval $I$ is often defined by the condition that
$$ f((1 - \alpha) x + \alpha y) \le (1 - \alpha) f(x) + \alpha f(y) \tag{c1} $$
whenever $x, y \in I$ and $0 \le \alpha \le 1$. This implies that $f$ is continuous on the interior of $I$. For measurable $f$, (1) and (c1) are equivalent. A function $f$ satisfying (1) is also called midpoint convex.
Convex function (of a real variable). Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Convex_function_(of_a_real_variable)&oldid=29996