Vapnik-Chervonenkis class
Vapnik–Červonenkis class
Let be a set,
a collection of subsets of
and
a finite subset of
. Then
is said to shatter
if for every subset
of
there is a set
in
with
. If there is a largest, finite
such that
shatters at least one set of cardinality
, then
is called a Vapnik–Chervonenkis class, or VC class, of sets and
its Vapnik–Chervonenkis index.
Let be the number of different sets
for
. Let
be the maximum of
over all sets
of cardinality
. Thus,
is a Vapnik–Chervonenkis class if and only if
for some finite
, and then for all
. Sauer's lemma says that
![]() |
Thus, is either always
or, for a Vapnik–Chervonenkis class
, it is bounded above by a polynomial in
of degree
. (This is the so-called Vapnik–Chervonenkis property: if
for large
, then
is bounded by a polynomial.)
Vapnik–Chervonenkis classes have turned out to be useful in computer science (learning theory [a1]), probability theory and mathematical statistics [a6], because certain probability limit theorems hold uniformly over them under suitable measurability conditions. One such sufficient measurability condition is that there exist a -algebra
of subsets of
, including
, and a mapping
from a complete separable metric space
onto
such that the set of pairs
with
is product-measurable in
. A VC class
satisfying this last condition is called a VCM class. While VC, but not VCM, classes can be shown to exist using the axiom of choice, the VC classes usually encountered in applications are VCM.
Let be a probability measure on
and let
be independent coordinates with distribution
, specifically, on a countable Cartesian product of copies of
. Let
be the sum of the point masses
at
for
; it is called an empirical measure for
(cf. also Empirical process). Then the law of large numbers for empirical measures holds uniformly over any VCM class
, meaning that the supremum for
of
approaches zero almost surely as
becomes large [a7]. This can be improved to a uniform law of the iterated logarithm, meaning that for any VCM class
, with probability
,
![]() |
![]() |
Moreover, a central limit theorem holds uniformly: if is any VCM class, and
assigns to sets in
jointly normal (Gaussian) random variables with mean zero and covariances
, then for any
there is a sufficiently large
such that for every
, there exists a
with
![]() |
on an event with probability at least . For the uniform central limit theorem to hold for each probability measure
on
, the VC property is also necessary.
VC classes can be generated as follows. Let be a
-dimensional vector space of real-valued functions on
. For each
, let
be the set where
. Then the class
of all sets
for
is a VC class with
. For example, the set of all ellipsoids in a Euclidean space
is a VCM class for each
. Also, let
be a VC class and
a finite integer. Let
be the union of all Boolean algebras of sets (cf. Boolean algebra), each generated by at most
sets in
. Then
is a VC class. For example, the set of all convex polytopes with at most
faces in
is a VC class for each
and
. Classes of projections of positivity sets of polynomials of bounded degree, and some other related classes, are also VC [a4].
The class of all finite sets in and the class of all closed convex sets are not VC classes.
The notion of VC class extends in different ways to a class of real functions on
. The subgraph of a function
is the set
![]() |
in . Then
is called a VC subgraph class if the collection of all subgraphs of functions in
is a VC class in
; it is called a VC major class if the class of all sets
for
and real
is a VC class in
.
The above probability limit theorems extend to these and larger classes of functions, with suitable measurability and boundedness. Neither the VC subgraph nor VC major property implies the other. For a uniformly bounded, suitably measurable family of functions, the uniform central limit property for all appears not to be equivalent to any VC-type combinatorial property.
For a probability measure and two events
, let
. For a totally bounded metric space
and
, let
be the maximum number of points of
all at distance at least
from each other. For any
there is a
such that for every VCM class
with
and any
,
![]() |
[a3]. There is a universal constant such that for every VCM class
and any
,
![]() |
![]() |
[a5].
Every VC class is included in a maximal class with the same VC index. If is a maximal VC class of index
, then for any
the set of symmetric differences
for
has a tree-like partial ordering by inclusion, and conversely, such an ordering implies
[a2]. For index greater than
no such structure is known (1996).
A general reference on VC classes of sets and functions, also from the viewpoint of probability and statistics, is [a6], Sect. 2.6.
References
[a1] | A. Blumer, A. Ehrenfeucht, D. Haussler, M.K. Warmuth, "Learnability and the Vapnik–Chervonenkis dimension" JACM , 6 (1989) pp. 929–965 |
[a2] | R.M. Dudley, "The structure of some Vapnik–Červonenkis classes" , Proc. Berkeley Conf.in honor of J. Neyman and J. Kiefer , 2 , Wadsworth (1985) pp. 495–508 |
[a3] | D. Haussler, "Sphere packing numbers for subsets of the Boolean ![]() |
[a4] | G. Stengle, J. Yukich, "Some new Vapnik–Chervonenkis classes" Ann. Statist. , 17 (1989) pp. 1441–1446 |
[a5] | M. Talagrand, "Sharper bounds for Gaussian and empirical processes" Ann. Probab. , 22 (1994) pp. 28–76 |
[a6] | A. van der Vaart, J. Wellner, "Weak convergence and empirical processes" , Springer (1996) |
[a7] | V.N. Vapnik, A.Ya. Červonenkis, "On the uniform convergence of frequencies of occurrence of events to their probabilities" Th. Probab. Appl. , 16 (1971) pp. 264–280 |
[a8] | R.M. Dudley, "Central limit theorems for empirical measures" Ann. of Probab. , 6 (1978) pp. 899–929 |
[a9] | R.M. Dudley, "Universal Donsker classes and metric entropy" Ann. of Probab. , 15 (1987) pp. 1306–1326 |
[a10] | D. Pollard, "Convergence of stochastic processes" , Springer (1984) |
Vapnik-Chervonenkis class. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Vapnik-Chervonenkis_class&oldid=49110