
Difference between revisions of "User:Richard Pinch/sandbox-13"

From Encyclopedia of Mathematics
(→‎Monad: move)
 
(17 intermediate revisions by the same user not shown)
Line 1: Line 1:
=Hoeffding inequality=
+
=Mann–Whitney test=
  
A simple inequality giving a bound on the probability of deviation of a sum of independent random variables from a given value.
+
A statistical test for testing the hypothesis  $  H _ {0} $
Let $X_1,\ldots,X_n$ be independent [[Subgaussian distribution|sub-Gaussian random variable]]s with means $\mu_i$ and sub-Gaussian parameters $\sigma_i$. Then
+
of homogeneity of two samples  $  X _ {1} \dots X _ {n} $
 +
and  $ Y _ {1} \dots Y _ {m} $,  
 +
all  $  m + n $
 +
elements of which are mutually independent and have continuous distributions. This test, suggested by H.B. Mann and D.R. Whitney [[#References|[1]]], is based on the statistic
 +
 
 +
$$  
 +
U  =  W -  
 +
\frac{1}{2}
 +
m ( m + 1 )  = \
 +
\sum _ { i=1 } ^ { n } \
 +
\sum _ { j=1 } ^ { m }
 +
\delta _ {ij} ,
 
$$
 
$$
\mathsf{P}\left\{{\sum_{i=1}^n (X_i-\mu_i) > t }\right\} < \exp\left({-\frac{t^2}{2\sum_{i=1}^n \sigma_i^2} }\right) \ .
+
 
 +
where  $  W $
 +
is the statistic of the [[Wilcoxon test|Wilcoxon test]] intended for testing the same hypothesis, equal to the sum of the ranks of the elements of the second sample among the pooled order statistics (cf. [[Order statistic|Order statistic]]), and
 +
 
 +
$$
 +
\delta _ {ij} = \
 +
\left \{  
 +
\begin{array}{ll}
 +
1  & \textrm{ if }  X _ {i} < Y _ {j} ,  \\
 +
0  & \textrm{ otherwise } .  \\
 +
\end{array}
 +
 
 +
\right .$$
 +
 
 +
Thus,  $  U $
 +
counts the number of cases when the elements of the second sample exceed elements of the first sample. It follows from the definition of  $  U $
 +
that if  $  H _ {0} $
 +
is true, then
 +
 
 +
$$ \tag{* }
 +
{\mathsf E} U  = 
 +
\frac{nm}{2}
 +
,\ \  
 +
{\mathsf D} U  =
 +
\frac{n m ( n + m + 1 ) }{12}
 +
,
 
$$
 
$$
In particular, if the $X_i$ are bounded, $|X_i|<b$ then they are sub-Gaussian with parameter $b$ and the
+
 
inequality takes the form
+
and, in addition, this statistic has all the properties of the Wilcoxon statistic  $ W $,  
 +
including asymptotic normality with parameters (*).
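
As a small numerical illustration (a sketch only; the function names and sample values are hypothetical and ties are assumed absent, in line with the continuity assumption), the following Python snippet computes $U$ by counting pairs with $X_i < Y_j$, compares it with $W - m(m+1)/2$, and evaluates the null moments (*).

```python
from itertools import product


def mann_whitney_u(xs, ys):
    """Number of pairs (i, j) with X_i < Y_j, i.e. the double sum of delta_ij."""
    return sum(1 for x, y in product(xs, ys) if x < y)


def wilcoxon_rank_sum(xs, ys):
    """Sum of the ranks of the second sample in the pooled ordering.

    Assumes all pooled values are distinct (no ties).
    """
    pooled = sorted(list(xs) + list(ys))
    rank = {value: r for r, value in enumerate(pooled, start=1)}
    return sum(rank[y] for y in ys)


def null_moments(n, m):
    """Mean and variance of U under the homogeneity hypothesis, formula (*)."""
    return n * m / 2, n * m * (n + m + 1) / 12


xs = [1.1, 2.3, 4.0]        # first sample, n = 3 (illustrative values)
ys = [0.5, 3.2, 5.7, 6.1]   # second sample, m = 4 (illustrative values)

u = mann_whitney_u(xs, ys)
w = wilcoxon_rank_sum(xs, ys)
m = len(ys)
mean_u, var_u = null_moments(len(xs), m)
print(u, w - m * (m + 1) // 2, mean_u, var_u)
```

For these values both ways of computing $U$ agree, as the identity $U = W - m(m+1)/2$ requires.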
 +
 
 +
====References====
 +
<table><TR><TD valign="top">[1]</TD> <TD valign="top">  H.B. Mann,  D.R. Whitney,  "On a test whether one of two random variables is statistically larger than the other"  ''Ann. Math. Stat.'' , '''18'''  (1947)  pp. 50–60</TD></TR></table>
 +
 
 +
====Comments====
 +
The Mann–Whitney test is also known as the $U$-test.
 +
 
 +
==Wilcoxon test==
 +
 
 +
A [[Non-parametric test|non-parametric test]] of the homogeneity of two samples  $  X _ {1} \dots X _ {n} $
 +
and  $  Y _ {1} \dots Y _ {m} $.
 +
The elements of the samples are assumed to be mutually independent, with continuous distribution functions  $ F( x) $
 +
and $  G( x) $,
 +
respectively. The hypothesis to be tested is  $  F( x)= G( x) $.
 +
Wilcoxon's test is based on the [[Rank statistic|rank statistic]]
 +
 
 +
$$ \tag{* }
 +
W  =  s ( r _ {1} ) + \dots + s ( r _ {m} ),
 
$$
 
$$
\mathsf{P}\left\{{\sum_{i=1}^n (X_i-\mu_i) > t }\right\} < \exp\left({-\frac{t^2}{2 n b^2} }\right) \ .
+
 
 +
where  $  r _ {j} $
 +
are the ranks of the random variables  $  Y _ {j} $
 +
in the common series of order statistics of  $  X _ {i} $
 +
and  $  Y _ {j} $,
 +
while the function  $  s( r) $,
 +
$  r = 1 \dots n + m $,
 +
is defined by a given permutation
 +
 
 +
$$
 +
\left(
 +
\begin{array}{cccc}
 +
1 & 2 & \cdots & m+n \\
 +
s(1) & s(2) & \cdots & s(m+n)
 +
\end{array}
 +
\right)\ ,
 
$$
 
$$
 +
where  $  s( 1) \dots s( n+ m) $
 +
is one of the possible rearrangements of the numbers  $  1 \dots n + m $.
 +
The permutation is chosen so that the power of Wilcoxon's test for the given alternative is highest. The statistical distribution of  $  W $
 +
depends only on the size of the samples and not on the chosen permutation (if the homogeneity hypothesis is true). If  $  n \rightarrow \infty $
 +
and  $  m \rightarrow \infty $,
 +
the random variable  $  W $
 +
has an asymptotically-normal distribution. This variant of the test was first proposed by F. Wilcoxon in 1945 for samples of equal sizes and was based on the special case  $  s( r) \equiv r $ (cf. [[Rank sum test|Rank sum test]]; [[Mann–Whitney test|Mann–Whitney test]]). See also [[Van der Waerden test|van der Waerden test]]; [[Rank test|Rank test]].
 +
 +
====References====
 +
<table>
 +
<TR><TD valign="top">[1]</TD> <TD valign="top">  F. Wilcoxon,  "Individual comparison by ranking methods"  ''Biometrics'' , '''1''' :  6  (1945)  pp. 80–83</TD></TR>
 +
<TR><TD valign="top">[2]</TD> <TD valign="top">  L.N. Bol'shev,  N.V. Smirnov,  "Tables of mathematical statistics" , ''Libr. math. tables'' , '''46''' , Nauka  (1983)  (In Russian)  (Processed by L.S. Bark and E.S. Kedrova)</TD></TR>
 +
<TR><TD valign="top">[3]</TD> <TD valign="top">  B.L. van der Waerden,  "Mathematische Statistik" , Springer  (1957)</TD></TR>
 +
</table>
 +
 +
====Comments====
 +
 +
====References====
 +
<table>
 +
<TR><TD valign="top">[a1]</TD> <TD valign="top">  E.L. Lehmann,  "Testing statistical hypotheses" , Wiley  (1986)</TD></TR>
 +
</table>
 +
 +
 +
 +
=Vapnik-Chervonenkis class=
 +
''Vapnik–Červonenkis class''
 +
 +
Let  $  S $
 +
be a set,  $  {\mathcal C} $
 +
a collection of subsets of  $  S $
 +
and  $  F $
 +
a finite subset of  $  S $.
 +
Then  $  {\mathcal C} $
 +
is said to shatter  $  F $
 +
if for every subset  $  A $
 +
of  $  F $
 +
there is a set  $  C $
 +
in  $  {\mathcal C} $
 +
with  $  A = C \cap F $.
 +
If there is a largest, finite  $  k $
 +
such that  $  {\mathcal C} $
 +
shatters at least one set of cardinality  $  k $,
 +
then  $  {\mathcal C} $
 +
is called a Vapnik–Chervonenkis class, or VC class, of sets and  $  S ( {\mathcal C} ) = k $
 +
its Vapnik–Chervonenkis index.
  
==References==
+
Let  $  \Delta  ^  {\mathcal C}  ( F ) $
* Martin J. Wainwright, "High-dimensional Statistics", Cambridge (2019) ISBN 978-1-108-49802-9
+
be the number of different sets  $  C \cap F $
 +
for  $  C \in {\mathcal C} $.
 +
Let  $  m  ^  {\mathcal C}  ( n ) $
 +
be the maximum of  $  \Delta  ^  {\mathcal C}  ( F ) $
 +
over all sets  $  F $
 +
of cardinality  $  n $.  
 +
Thus, $  {\mathcal C} $
 +
is a Vapnik–Chervonenkis class if and only if  $  m  ^  {\mathcal C}  ( n ) < 2  ^ {n} $
 +
for some finite  $  n $,  
 +
and then for all  $  n > S ( {\mathcal C} ) $.
 +
Sauer's lemma says that
  
=Subgaussian distribution=
+
$$
A probability distribution with tails comparable to those of the Gaussian distribution. A random variable $X$ with mean $\mu$ is
+
m  ^  {\mathcal C}  ( n ) \leq  \sum _ {j = 0 } ^ { {S }  ( {\mathcal C} ) } \left ( \begin{array}{c}
sub-Gaussian if there is a positive constant $\sigma$ such that
+
n \\
 +
  j
 +
\end{array}
 +
\right ) .
 
$$
 
$$
\mathsf{E}[\exp(\lambda(X-\mu))] \le \exp(\lambda^2\sigma^2/2)
 
$$
 
for all real $\lambda$.
 
  
Clearly any normal random variable $N(\mu,\sigma^2)$ is sub-Gaussian with parameter $\sigma$.
+
Thus,  $ m  ^  {\mathcal C}  ( n ) $
 +
is either always  $  2  ^ {n} $
 +
or, for a Vapnik–Chervonenkis class  $  {\mathcal C} $,  
 +
it is bounded above by a polynomial in  $  n $
 +
of degree  $  S ( {\mathcal C} ) $.
 +
(This is the so-called Vapnik–Chervonenkis property: if  $  m  ^ {\mathcal C}  ( n ) < 2 ^ {n} $
 +
for large  $  n $,
 +
then  $  m  ^  {\mathcal C}  ( n ) $
 +
is bounded by a polynomial.)
 +
 
 +
Vapnik–Chervonenkis classes have turned out to be useful in computer science (learning theory [[#References|[a1]]]), [[Probability theory|probability theory]] and [[Mathematical statistics|mathematical statistics]] [[#References|[a6]]], because certain probability limit theorems hold uniformly over them under suitable measurability conditions. One such sufficient measurability condition is that there exist a  $  \sigma $-
 +
algebra  $  {\mathcal S} $
 +
of subsets of  $  S $,
 +
including  $  {\mathcal C} $,
 +
and a mapping  $  Y $
 +
from a complete separable [[Metric space|metric space]]  $  U $
 +
onto  $  {\mathcal C} $
 +
such that the set of pairs  $  ( x,u ) $
 +
with $ x \in Y ( u ) $
 +
is product-measurable in  $  S \times U $.
 +
A VC class  $  {\mathcal C} $
 +
satisfying this last condition is called a VCM class. While VC, but not VCM, classes can be shown to exist using the [[Axiom of choice|axiom of choice]], the VC classes usually encountered in applications are VCM.
 +
 
 +
Let  $  {\mathsf P} $
 +
be a probability measure on  $  ( S, {\mathcal S} ) $
 +
and let  $  X _ {1} ,X _ {2} , \dots $
 +
be independent coordinates with distribution  $  {\mathsf P} $,
 +
specifically, on a countable Cartesian product of copies of  $  ( S, {\mathcal S}, {\mathsf P} ) $.
 +
Let  $  {\mathsf P} _ {n} $
 +
be the sum of the point masses  $  {1 / n } $
 +
at  $  X _ {i} $
 +
for  $  i = 1 \dots n $;
 +
it is called an empirical measure for  $  {\mathsf P} $ (cf. also [[Empirical process|Empirical process]]). Then the [[Law of large numbers|law of large numbers]] for empirical measures holds uniformly over any VCM class  $  {\mathcal C} $,
 +
meaning that the supremum for  $  C \in {\mathcal C} $
 +
of  $  | {( {\mathsf P} _ {n} - {\mathsf P} ) ( C ) } | $
 +
approaches zero almost surely as  $  n $
 +
becomes large [[#References|[a7]]]. This can be improved to a uniform [[Law of the iterated logarithm|law of the iterated logarithm]], meaning that for any VCM class  $  {\mathcal C} $,
 +
with probability  $  1 $,
  
A random variable is sub-Gaussian if and only if it has a moment generating function satisfying
+
$$
 +
{\lim\limits }  \sup  _ {n \rightarrow \infty } n ^ {1/2 }  \sup  _ {C \in {\mathcal C} } {
 +
\frac{\left | {( {\mathsf P} _ {n} - {\mathsf P} ) ( C ) } \right | }{( 2 { \mathop{\rm log} } { \mathop{\rm log} } n ) ^ {1/2 } }
 +
} =
 
$$
 
$$
\mathsf{E}[\exp(\lambda X)] \le \exp(\lambda^2\sigma^2/2)
+
 
 +
$$
 +
=
 +
\sup  _ {A \in {\mathcal C} } ( {\mathsf P} ( A ) ( 1 - {\mathsf P} ( A ) ) ) ^ {1/2 } .
 
$$
 
$$
for all real $\lambda$.
 
  
==References==
+
Moreover, a [[Central limit theorem|central limit theorem]] holds uniformly: if  $  {\mathcal C} $
* Martin J. Wainwright, "High-dimensional Statistics", Cambridge (2019) ISBN 978-1-108-49802-9
+
is any VCM class, and  $  G _  {\mathsf P}  $
 +
assigns to sets in  $  {\mathcal C} $
 +
jointly normal (Gaussian) random variables with mean zero and covariances  $  {\mathsf E} G _  {\mathsf P}  ( A ) G _  {\mathsf P}  ( B ) = {\mathsf P} ( A \cap B ) - {\mathsf P} ( A ) {\mathsf P} ( B ) $,
 +
then for any  $  \epsilon > 0 $
 +
there is a sufficiently large  $  m $
 +
such that for every  $  n \geq  m $,
 +
there exists a  $  G _  {\mathsf P}  $
 +
with
  
=Markov inequality in probability theory=
+
$$
A simple inequality giving a bound on the probability of deviation of a random variable with finite expectation from a given value.
+
\sup  _ {A \in {\mathcal C} } \left | {n ^ {1/2 } ( {\mathsf P} _ {n} - {\mathsf P} ) ( A ) - G _  {\mathsf P} ( A ) } \right | < \epsilon
Let $X(\omega)$ be a random variable. Then
 
 
$$
 
$$
\mathsf{P}\{|X| > \epsilon\} \le \frac{\mathsf{E}|X|}{\epsilon} \ .
+
 
 +
on an event with probability at least  $  1 - \epsilon $.
 +
For the uniform central limit theorem to hold for each probability measure  $  {\mathsf P} $
 +
on  $  ( S, {\mathcal S} ) $,
 +
the VC property is also necessary.
 +
 
 +
VC classes can be generated as follows. Let  $  V $
 +
be a  $  k $-
 +
dimensional [[Vector space|vector space]] of real-valued functions on  $  S $.
 +
For each  $  f \in V $,
 +
let  $  { \mathop{\rm pos} } ( f ) $
 +
be the set where  $  f > 0 $.
 +
Then the class  $  {\mathcal C} $
 +
of all sets  $  { \mathop{\rm pos} } ( f ) $
 +
for  $  f \in V $
 +
is a VC class with  $  S ( {\mathcal C} ) = k $.
 +
For example, the set of all ellipsoids in a Euclidean space  $  \mathbf R  ^ {d} $
 +
is a VCM class for each  $  d $.
 +
Also, let  $  {\mathcal C} $
 +
be a VC class and  $  m $
 +
a finite integer. Let  $  {\mathcal D} $
 +
be the union of all Boolean algebras of sets (cf. [[Boolean algebra|Boolean algebra]]), each generated by at most  $  m $
 +
sets in  $  {\mathcal C} $.
 +
Then  $  {\mathcal D} $
 +
is a VC class. For example, the set of all convex polytopes with at most  $  m $
 +
faces in  $  \mathbf R  ^ {d} $
 +
is a VC class for each  $  m $
 +
and  $  d $.
 +
Classes of projections of positivity sets of polynomials of bounded degree, and some other related classes, are also VC [[#References|[a4]]].
 +
 
 +
The class of all finite sets in  $  \mathbf R  ^ {d} $
 +
and the class of all closed convex sets are not VC classes.
 +
 
 +
The notion of VC class extends in different ways to a class  $  {\mathcal F} $
 +
of real functions on  $  S $.  
 +
The subgraph of a function  $  f $
 +
is the set
 +
 
 +
$$
 +
\left \{ {( s,x ) } : {0 \leq  x \leq  f ( s )  \textrm{ or  }  f ( s ) \leq  x \leq  0 } \right \}
 
$$
 
$$
  
The inequality follows from comparing the indicator function of the set $\{\omega : |X(\omega)| > \epsilon\}$ with the random variable $|X|/\epsilon$ and then taking expectations.
+
in  $  S \times \mathbf R $.
 +
Then  $  {\mathcal F} $
 +
is called a VC subgraph class if the collection of all subgraphs of functions in  $  {\mathcal F} $
 +
is a VC class in  $  S \times \mathbf R $;
 +
it is called a VC major class if the class of all sets  $ \{ {s \in S } : {f ( s ) > x } \} $
 +
for  $  f \in {\mathcal F} $
 +
and real  $  x $
 +
is a VC class in  $  S $.
 +
 
 +
The above probability limit theorems extend to these and larger classes of functions, with suitable measurability and boundedness. Neither the VC subgraph nor VC major property implies the other. For a uniformly bounded, suitably measurable family of functions, the uniform central limit property for all  $  {\mathsf P} $
 +
appears not to be equivalent to any VC-type combinatorial property.
 +
 
 +
For a probability measure  $  {\mathsf P} $
 +
and two events  $  A,B $,
 +
let  $  d _ {1, {\mathsf P} }  ( A,B ) = {\mathsf E} | {1 _ {A} - 1 _ {B} } | $.
 +
For a totally bounded metric space  $  ( T,d ) $
 +
and  $  \epsilon > 0 $,
 +
let  $  D ( \epsilon,T,d ) $
 +
be the maximum number of points of  $  T $
 +
all at distance at least  $  \epsilon $
 +
from each other. For any  $  m $
 +
there is a  $  K _ {m} < \infty $
 +
such that for every VCM class  $  {\mathcal C} $
 +
with $ S ( {\mathcal C} ) = m $
 +
and any  $  {\mathsf P} $,
  
Application of Markov's inequality to the random variable $Y = (X-\mu)^2$ yields the [[Chebyshev inequality in probability theory]]. More generally, if $X$ has a central moment of order $k$ about $\mu$, then
+
$$
 +
D ( \epsilon, {\mathcal C},d _ {1, {\mathsf P} }  ) \leq K _ {m} \epsilon ^ {- m } ,
 
$$
 
$$
\mathsf{P}\{|X-\mu| > \epsilon\} \le \frac{\mathsf{E}[|X-\mu|^k]}{\epsilon^k} \ .
+
 
 +
[[#References|[a3]]]. There is a universal constant  $  K $
 +
such that for every VCM class  $  {\mathcal C} $
 +
and any  $  M < \infty $,
 +
 
 +
$$
 +
{ \mathop{\rm Pr} } \left \{ \sup  _ {A \in {\mathcal C} } \left | {( {\mathsf P} _ {n} - {\mathsf P} ) ( A ) } \right | > M \right \} \leq
 
$$
 
$$
  
If $X$ has a [[moment generating function]] in the neighbourhood $[-b,b]$ of zero, then
+
$$  
$$
+
\leq 
\mathsf{P}\{|X-\mu| > \epsilon\} = \mathsf{P}\{\exp(\lambda|X-\mu|) > \exp(\lambda\epsilon)\} \le \frac{\mathsf{E}[\exp(\lambda|X-\mu|)]}{\exp(\lambda\epsilon)}
+
KM ^ {2S ( {\mathcal C} ) - 1 } { \mathop{\rm exp} } ( - 2M  ^ {2} ) ,
$$
 
so that
 
$$
 
\log \mathsf{P}\{|X-\mu| > \epsilon\} \le \inf_{0 < \lambda \le b} \left( \log \mathsf{E}[\exp(\lambda|X-\mu|)] -  \lambda\epsilon \right) \ .
 
 
$$
 
$$
  
 +
[[#References|[a5]]].
 +
 +
Every VC class is included in a maximal class with the same VC index. If  $  {\mathcal C} $
 +
is a maximal VC class of index  $  1 $,
 +
then for any  $  A \in {\mathcal C} $
 +
the set of symmetric differences  $  ( B \setminus  A ) \cup ( A \setminus  B ) $
 +
for  $  B \in {\mathcal C} $
 +
has a tree-like partial ordering by inclusion, and conversely, such an ordering implies  $  S ( {\mathcal C} ) = 1 $ [[#References|[a2]]]. For index greater than  $  1 $
 +
no such structure is known (1996).
 +
 +
A general reference on VC classes of sets and functions, also from the viewpoint of probability and statistics, is [[#References|[a6]]], Sect. 2.6.
 +
 +
==Vapnik-Chervonenkis dimension==
 +
 +
''Vapnik–Červonenkis dimension, VC-dimension''
 +
 +
Let $H = (V_H,E_H)$ be a [[hypergraph]]. The Vapnik–Chervonenkis dimension of $H$ is the largest cardinality of a subset $F$ of $V_H$ that is shattered by $E_H$, i.e. such that for all $A \subseteq F$ there is an $E \in E_H$ with $A = F \cap E$. Thus, it is the same as the index of a [[Vapnik–Chervonenkis class]]. It is usually denoted by $\mathrm{VC}(H)$.
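
For a small finite hypergraph the definition can be checked directly by brute force. The following Python sketch is illustrative only: the function names and the interval example are assumptions made here, the edge set is assumed non-empty, and the search is exponential in the number of vertices.

```python
from itertools import combinations


def is_shattered(F, edges):
    """True if every subset of F arises as F & E for some edge E."""
    F = frozenset(F)
    traces = {F & E for E in edges}
    return len(traces) == 2 ** len(F)


def vc_dimension(vertices, edges):
    """Largest size of a vertex subset shattered by the (non-empty) edge set."""
    edges = [frozenset(E) for E in edges]
    best = 0
    for k in range(1, len(vertices) + 1):
        if any(is_shattered(F, edges) for F in combinations(vertices, k)):
            best = k
        else:
            # Subsets of shattered sets are shattered, so no larger set can work.
            break
    return best


# Hypothetical example: all integer intervals inside {1, 2, 3, 4}, plus the empty set.
vertices = [1, 2, 3, 4]
edges = [set(range(i, j)) for i in range(1, 6) for j in range(i, 6)]
print(vc_dimension(vertices, edges))
```

Intervals shatter any two points but never three, since the two outer points cannot be selected without the middle one, so the sketch reports dimension 2 for this example.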
 +
 +
Computing the Vapnik–Chervonenkis dimension is $\mathcal{NP}$-hard (cf. also [[NP|$\mathcal{NP}$]]) for many classes of hypergraphs, [[#References|[b1]]], [[#References|[b2]]].
  
==References==
+
The Vapnik–Chervonenkis dimension plays an important role in learning theory, especially in probably approximately correct ([[PAC]]) learning. Thus, learnability of classes of $\{0,1\}$-valued functions is equivalent to finiteness of the Vapnik–Chervonenkis dimension, [[#References|[b3]]].
* Geoffrey Grimmett, David Stirzaker, "Probability and Random Processes" (4th ed), Oxford (2020) ISBN 0-19-884759-9
 
* Martin J. Wainwright, "High-dimensional Statistics", Cambridge (2019) ISBN 978-1-108-49802-9
 
  
 +
For the role of the Vapnik–Chervonenkis dimension in [[neural network]]s, see, e.g., [[#References|[b4]]], [[#References|[b5]]].
  
 +
The independence number of a hypergraph $H$ is the maximal cardinality of a subset $A$ of $V_H$ that does not contain any $E \in E_H$ (see also [[Graph, numerical characteristics of a]]). This notion is closely related to $\mathrm{VC}(H)$, [[#References|[b6]]], [[#References|[b7]]].
  
=Steinitz exchange lemma=
 
A lemma on [[Linear independence|linearly independent]] sets and [[spanning set]]s in a vector space from which it is possible to deduce that dimension is well-defined.
 
  
Let $X = \{x_1,\ldots,x_m\}$ be a linearly independent set and $Y = \{y_1,\ldots,y_n\}$ a spanning set in a vector space $V$.  Then $m \le n$ and $Y$ can be re-ordered so that $\{x_1,\ldots,x_m,y_{m+1},\ldots,y_n\}$ is also a spanning set.
 
  
As a corollary, if $X$ and $Y$ are both bases (linearly independent and spanning) for $V$, then $n = m$.  Hence all bases for $V$ have the same size, and this shows that the dimension of $V$ is well-defined.
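
The exchange can be carried out explicitly for vectors with rational coordinates. The Python sketch below is an illustration under stated assumptions, not a canonical algorithm: the names `rank` and `steinitz_exchange` are hypothetical, $V$ is taken to be the span of the given $Y$, and each $x_k$ is swapped in for the first $y$ whose removal keeps the set spanning.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors, by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def steinitz_exchange(xs, ys):
    """Return a spanning list x_1..x_m followed by n - m of the original y's."""
    dim = rank(ys)               # dimension of V = span(ys)
    placed = []                  # x's exchanged in so far
    remaining = list(ys)         # y's not yet exchanged out
    for x in xs:
        for i in range(len(remaining)):
            candidate = placed + [x] + remaining[:i] + remaining[i + 1:]
            if rank(candidate) == dim:   # still spans V after the swap
                placed.append(x)
                del remaining[i]
                break
        else:
            raise ValueError("xs is not independent in the span of ys")
    return placed + remaining


xs = [(1, 1, 0), (0, 1, 1)]                 # linearly independent
ys = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]      # spanning set of Q^3
result = steinitz_exchange(xs, ys)
print(result)
```

The returned list has the same length $n$ as $Y$, begins with the $x_i$, and still spans, which is the statement of the lemma for this example.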
 
  
 
==References==
 
==References==
* P. M. Cohn, "Classic Algebra" Wiley (2000) ISBN 047187731X
+
<table>
 +
<TR><TD valign="top">[a1]</TD> <TD valign="top">  A. Blumer,  A. Ehrenfeucht,  D. Haussler,  M.K. Warmuth,  "Learnability and the Vapnik–Chervonenkis dimension"  ''JACM'' , '''36'''  (1989)  pp. 929–965</TD></TR>
 +
<TR><TD valign="top">[a2]</TD> <TD valign="top">  R.M. Dudley,  "The structure of some Vapnik–Červonenkis classes" , ''Proc. Berkeley Conf.in honor of J. Neyman and J. Kiefer'' , '''2''' , Wadsworth  (1985)  pp. 495–508</TD></TR>
 +
<TR><TD valign="top">[a3]</TD> <TD valign="top">  D. Haussler,  "Sphere packing numbers for subsets of the Boolean $n$-cube with bounded Vapnik–Chervonenkis dimension"  ''J. Combin. Th. A'' , '''69'''  (1995)  pp. 217–232</TD></TR>
 +
<TR><TD valign="top">[a4]</TD> <TD valign="top">  G. Stengle,  J. Yukich,  "Some new Vapnik–Chervonenkis classes"  ''Ann. Statist.'' , '''17'''  (1989)  pp. 1441–1446</TD></TR><TR><TD valign="top">[a5]</TD> <TD valign="top">  M. Talagrand,  "Sharper bounds for Gaussian and empirical processes"  ''Ann. Probab.'' , '''22'''  (1994)  pp. 28–76</TD></TR><TR><TD valign="top">[a6]</TD> <TD valign="top">  A. van der Vaart,  J. Wellner,  "Weak convergence and empirical processes" , Springer  (1996)</TD></TR>
 +
<TR><TD valign="top">[a7]</TD> <TD valign="top">  V.N. Vapnik,  A.Ya. Červonenkis,  "On the uniform convergence of frequencies of occurrence of events to their probabilities"  ''Th. Probab. Appl.'' , '''16'''  (1971)  pp. 264–280</TD></TR>
 +
<TR><TD valign="top">[a8]</TD> <TD valign="top">  R.M. Dudley,  "Central limit theorems for empirical measures"  ''Ann. of Probab.'' , '''6'''  (1978)  pp. 899–929</TD></TR><TR><TD valign="top">[a9]</TD> <TD valign="top">  R.M. Dudley,  "Universal Donsker classes and metric entropy"  ''Ann. of Probab.'' , '''15'''  (1987)  pp. 1306–1326</TD></TR>
 +
<TR><TD valign="top">[a10]</TD> <TD valign="top">  D. Pollard,  "Convergence of stochastic processes" , Springer  (1984)</TD></TR>
 +
<TR><TD valign="top">[b1]</TD> <TD valign="top">  E. Kranakis,  D. Krizanc,  B. Ruf,  J. Urrutia,  G. Wöginger,  "The VC-dimension of set systems defined by graphs"  ''Discr. Appl. Math.'' , '''77''' :  3  (1997)  pp. 237–257</TD></TR>
 +
<TR><TD valign="top">[b2]</TD> <TD valign="top">  C.H. Papadimitriou,  M. Yannakakis,  "On limited nondeterminism and the complexity of VC-dimension"  ''J. Comput. Syst. Sci.'' , '''53''' :  2  (1996)  pp. 161–170</TD></TR>
 +
<TR><TD valign="top">[b3]</TD> <TD valign="top">  S. Ben-David,  N. Cesa-Bianchi,  D. Haussler,  P.M. Long,  "Characterizations of learnability of $\{0,\ldots,n\}$-valued functions"  ''J. Comput. Syst. Sci.'' , '''50''' :  1  (1995)  pp. 74–86  {{DOI|10.1006/jcss.1995.1008}} {{ZBL|0827.68095}}</TD></TR>
 +
<TR><TD valign="top">[b4]</TD> <TD valign="top">  S.B. Holden,  "Neural networks and the VC-dimension"  J.G. McWhirter (ed.) , ''Mathematics in Signal Processing'' , '''III''' , Oxford Univ. Press  (1994)  pp. 73–84</TD></TR>
 +
<TR><TD valign="top">[b5]</TD> <TD valign="top">  W. Maass,  "Perspectives of current research about the complexity of learning on neural nets"  V. Roychowdhury (ed.)  et al. (ed.) , ''Theoretical Advances in Neural Computation and Learning'' , Kluwer Acad. Publ.  (1994)  pp. 295–336</TD></TR>
 +
<TR><TD valign="top">[b6]</TD> <TD valign="top">  D.Q. Naiman,  H.P. Wynn,  "Independence number and the complexity of families of sets"  ''Discr. Math.'' , '''154'''  (1996)  pp. 203–216</TD></TR>
 +
<TR><TD valign="top">[b7]</TD> <TD valign="top">  J. Pach,  P.K. Agarwal,  "Combinatorial geometry" , Wiley/Interscience  (1995) pp. 247–254</TD></TR>
 +
</table>
 +
 
  
 
=Spanning set=
 
=Spanning set=
 
''generating set'', ''for a module $M$ over a ring $R$''
 
''generating set'', ''for a module $M$ over a ring $R$''
  
A subset $S$ of $M$ such that every element of $M$ can be written as a finite linear combination $\sum_{i=1}^k r_i s_i$ with $r_i \in R$ and $s_i \in S$.
+
A subset $S$ of $M$ such that every element of $M$ can be written as a finite linear combination $\sum_{i=1}^k r_i s_i$ with $r_i \in R$ and $s_i \in S$: a set $S$ such that $M$ is the [[linear span]] of $S$.
  
 
==References==
 
==References==
Line 85: Line 357:
 
A measure of dissimilarity between [[word]]s over some alphabet in terms of the number of elementary "edit" operations required to turn one word into another.
 
A measure of dissimilarity between [[word]]s over some alphabet in terms of the number of elementary "edit" operations required to turn one word into another.
  
Example include
+
Examples include
  
 
* ''[[Hamming distance]]'' between words of the same length.  An edit operation consists of ''substitution'': replacing one letter in a given position by another letter in the same position.
 
* ''[[Hamming distance]]'' between words of the same length.  An edit operation consists of ''substitution'': replacing one letter in a given position by another letter in the same position.
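
A minimal Python sketch of the Hamming distance as just described, counting the positions at which a substitution is needed (the function name and the example words are illustrative assumptions):

```python
def hamming_distance(u, v):
    """Number of positions at which two equal-length words differ."""
    if len(u) != len(v):
        raise ValueError("Hamming distance requires words of the same length")
    return sum(a != b for a, b in zip(u, v))


d = hamming_distance("karolin", "kathrin")
print(d)
```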
Line 309: Line 581:
 
<table>
 
<table>
 
<TR><TD valign="top">[1]</TD> <TD valign="top">  S. MacLane,  "Categories for the working mathematician" , Springer  (1971). ISBN 0-387-98403-8</TD></TR>
 
<TR><TD valign="top">[1]</TD> <TD valign="top">  S. MacLane,  "Categories for the working mathematician" , Springer  (1971). ISBN 0-387-98403-8</TD></TR>
</table>
 
 
=Triple=
 
==Triple==
 
''monad, on a category  $  \mathfrak R $''
 
 
A [[Monoid|monoid]] in the [[Category|category]] of all endofunctors on  $  \mathfrak R $.
 
In other words, a triple on a category  $  \mathfrak R $
 
is a covariant functor  $  T:  \mathfrak R \mathop \rightarrow \limits \mathfrak R $
 
endowed with natural transformations  $  \eta :  {\mathop{\rm Id}\nolimits} _ {\mathfrak R} \mathop \rightarrow \limits T $
 
and  $  \mu :  T ^ {2} \mathop \rightarrow \limits T $
 
(here  $  {\mathop{\rm Id}\nolimits} _ {\mathfrak R} $
 
denotes the identity functor on  $  \mathfrak R $)
 
such that the following diagrams are commutative:
 
 
$$  \begin{array}{crclc} T (X)  & \mathop \rightarrow \limits ^ {T ( \eta _ {X} )}  &T ^ {2} (X)  & \mathop \leftarrow \limits ^ {\eta _ {T (X)}}  &T (X)  \\ {}  &{} _ {1 _ {T (X)}} \searrow  &\scriptsize {\mu _ {X}} \downarrow  &\swarrow _ {1 _ {T (X)}}  &{}  \\ {}  &{}  &T (X)  &{}  &{}  \\ \end{array}  $$
 
 
$$  \begin{array}{rcl} T ^ {3} (X)  & \mathop \rightarrow \limits ^ {T ( \mu _ {X} )}  &T ^ {2} (X)  \\ \scriptsize {\mu _ {T (X)}}  \downarrow  &{}  &\downarrow  \scriptsize {\mu _ {X}}  \\ T ^ {2} (X)  & \mathop \rightarrow \limits _ {\mu _ {X}}  &T (X)  \\ \end{array}  $$
 
 
A triple is sometimes called a standard construction, cf. [[#References|[2]]].
 
 
For any pair of adjoint functors  $  F :  \mathfrak R \mathop \rightarrow \limits \mathfrak L $
 
and  $  G:  \mathfrak L \mathop \rightarrow \limits \mathfrak R $
 
(see [[Adjoint functor|Adjoint functor]]) with unit and co-unit of adjunction  $  \eta :  {\mathop{\rm Id}\nolimits} _ {\mathfrak R} \mathop \rightarrow \limits GF $
 
and  $  \epsilon :  FG \mathop \rightarrow \limits {\mathop{\rm Id}\nolimits} _ {\mathfrak L} $,
 
respectively, the functor  $  T = GF:  \mathfrak R \mathop \rightarrow \limits \mathfrak R $
 
endowed with  $  \eta :  {\mathop{\rm Id}\nolimits} _ {\mathfrak R} \mathop \rightarrow \limits T $
 
and  $  \mu = G ( \epsilon _ {F} ):  T ^ {2} \mathop \rightarrow \limits T $
 
is a triple on  $  \mathfrak R $.
 
Conversely, for any triple  $  (T, \eta , \mu ) $
 
there exist pairs of adjoint functors  $  F $
 
and  $  G $
 
such that  $  T = GF $,
 
and the transformations  $  \eta $
 
and  $  \mu $
 
are obtained from the unit and co-unit of the adjunction in the manner described above. The different such decompositions of a triple may form a proper class. In this class there is a smallest element (the Kleisli construction) and a largest element (the Eilenberg–Moore construction).
 
 
===Examples.===
 
 
1) In the category of sets, the functor which sends an arbitrary set to the set of all its subsets has the structure of a triple. Each set  $  X $
 
is naturally imbedded in the set of its subsets via singleton sets, and to each set of subsets of  $  X $
 
one associates the union of these subsets.
 
 
2) In the category of sets, every representable functor  $  H _ {A} (X) = H (A, X) $
 
carries a triple: The mapping  $  \eta _ {X} :  X \mathop \rightarrow \limits H (A, X) $
 
associates to each  $  x \in X $
 
the constant function  $  f _ {x} :  A \mathop \rightarrow \limits X $
 
with value  $  x $;
 
the mapping  $  \mu _ {X} :  H (A, H (A, X)) \simeq H (A \times A, X) \mathop \rightarrow \limits H (A, X) $
 
associates to each function of two variables its restriction to the diagonal.
 
 
3) In the category of topological spaces, each topological group  $  G $,
 
with unit  $  e $,
 
enables one to define a functor  $  T _ {G} (X) = X \times G $
 
that carries a triple: Each element  $  x \in X $
 
is taken to the element  $  (x, e) $
 
and the mapping  $  \mu :  X \times G \times G \mathop \rightarrow \limits X \times G $
 
is defined by  $  \mu _ {X} (x, g, g ^  \prime  ) = (x, gg ^  \prime  ) $.
 
 
4) In the category of modules over a commutative ring  $  R $,
 
each (associative, unital)  $  R $-
 
algebra  $  A $
 
gives rise to a triple structure on the functor  $  T _ {A} (X) = X \otimes _ {R} A $,
 
in a manner similar to Example 3).
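
Example 1) can be made concrete for finite sets. The following Python sketch is illustrative only (the names `unit`, `join` and `fmap` are chosen here and do not come from the article): it models $\eta$ as the singleton map and $\mu$ as union, and checks the two commutative diagrams on a three-element set. The unit laws are checked on every subset, the associativity law on a single sample element of $T^{3}(X)$.

```python
from itertools import chain, combinations


def unit(x):
    """eta_X: send an element to its singleton subset."""
    return frozenset([x])


def join(family):
    """mu_X: send a set of subsets to the union of its members."""
    return frozenset(chain.from_iterable(family))


def fmap(f, s):
    """T(f): send a subset to its image under f."""
    return frozenset(f(x) for x in s)


def powerset(xs):
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


X = {0, 1, 2}
subsets = powerset(X)

# Unit laws: mu after T(eta) and mu after eta_T are both the identity on T(X).
left_unit = all(join(fmap(unit, A)) == A for A in subsets)
right_unit = all(join(unit(A)) == A for A in subsets)

# Associativity: mu after T(mu) equals mu after mu_T, checked on one element of T^3(X).
a, b, c = frozenset({0}), frozenset({1, 2}), frozenset({0, 2})
sample = frozenset({frozenset({a, b}), frozenset({c}), frozenset()})
assoc = join(fmap(join, sample)) == join(join(sample))

print(left_unit, right_unit, assoc)
```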
 
 
====References====
 
<table><TR><TD valign="top">[1]</TD> <TD valign="top">  J.F. Adams,  "Infinite loop spaces" , Princeton Univ. Press  (1978)</TD></TR><TR><TD valign="top">[2]</TD> <TD valign="top">  R. Godement,  "Topologie algébrique et théorie des faisceaux" , Hermann  (1958)</TD></TR><TR><TD valign="top">[3]</TD> <TD valign="top">  M.Sh. Tsalenko,  E.G. Shul'geifer,  "Categories"  ''J. Soviet Math.'' , '''7''' :  4  (1977)  pp. 532–586  ''Itogi Nauk. i Tekhn. Algebra Topol. Geom.'' , '''13'''  (1975)  pp. 51–148</TD></TR><TR><TD valign="top">[4]</TD> <TD valign="top">  S. MacLane,  "Categories for the working mathematician" , Springer  (1971)</TD></TR><TR><TD valign="top">[5]</TD> <TD valign="top">  E.G. Manes,  "Algebraic theories" , Springer  (1976)</TD></TR></table>
 
 
====Comments====
 
The non-descriptive name  "triple"  for this concept has now largely been superseded by  "monad" , although there is an obstinate minority of category-theorists who continue to use it. A comonad (or cotriple) on a category  $  \mathfrak R $
 
is a monad on  $  \mathfrak R ^ {op} $;
 
in other words, it is a functor  $  T:  \mathfrak R \mathop \rightarrow \limits \mathfrak R $
 
equipped with natural transformations  $  \epsilon :  T \mathop \rightarrow \limits {\mathop{\rm Id}\nolimits} _ {\mathfrak R} $
 
and  $  \delta :  T \mathop \rightarrow \limits T ^ {2} $
 
satisfying the duals of the commutative diagrams above. Every adjoint pair of functors ( $  F \dashv G $)
 
gives rise to a comonad structure on the composite  $  FG $,
 
as well as a monad structure on  $  GF $.
 
 
An important example of a functor which carries a comonad structure is  $  \Lambda :  {\mathop{\rm Ring}\nolimits} \mathop \rightarrow \limits {\mathop{\rm Ring}\nolimits} $,
 
$  \Lambda (A)=1+tA[[t]] $,
 
or, equivalently, the functor of big Witt vectors, cf. [[Lambda-ring| $  \lambda $-
 
ring]]; [[Witt vector|Witt vector]]. A special case of the natural transformation  $  W(A) \mathop \rightarrow \limits \Lambda (W(A)) $
 
occurs in algebraic number theory as the [[Artin–Hasse exponential]], [[#References|[a5]]].
 
 
Monads in the category of sets can be equivalently described by sets  $  T(n) $
 
of  $  n $-
 
ary operations for each cardinal number (or set)  $  n $;
 
$  \eta _ {n} :  n \mathop \rightarrow \limits T(n) $
 
gives the projection operations  $  (x _ {1} , x _ {2} ,\dots) \mapsto x _ {i} $,
 
and  $  \mu $
 
gives the rules for composing operations. See [[#References|[5]]] or [[#References|[a1]]]. This approach extends to monads in arbitrary categories, but it has not proved useful in general, as it has in or near sets.
 
 
Of the two canonical ways of constructing an adjunction from a given monad, mentioned in the main article above, the Eilenberg–Moore construction (or category of  $  T $-
 
algebras) is by far the more important. Given a monad  $  (T, \eta , \mu ) $
 
on a category  $  \mathfrak R $,
 
a  $  T $-
 
algebra in  $  \mathfrak R $
 
is a pair  $  (A, \alpha ) $
 
where  $  \alpha :  TA \mathop \rightarrow \limits A $
 
is a morphism such that
 
 
$$  \begin{array}{lcr} A  \mathop \rightarrow \limits ^ {\eta _ {A}}  &TA  & \mathop \leftarrow \limits ^ {\mu _ {A}}  T ^ {2} A  \\ {} _ {1 _ {A}} \searrow  &\scriptsize \alpha  \downarrow  &\downarrow  \scriptsize {T \alpha}  \\ {}  & A  & \mathop \leftarrow \limits _  \alpha  TA  \\ \end{array}  $$
 
 
commutes. A homomorphism of  $  T $-
 
algebras  $  (A, \alpha ) \mathop \rightarrow \limits (B, \beta ) $
 
is a morphism  $  f:  A \mathop \rightarrow \limits B $
 
in  $  \mathfrak R $
 
such that
 
 
$$  \begin{array}{rcl} TA  & \mathop \rightarrow \limits ^ {Tf}  &TB  \\ \scriptsize \alpha  \downarrow  &{}  &\downarrow  \scriptsize \beta  \\  A  &\mathop \rightarrow \limits _ {f}  & B  \\ \end{array}  $$
 
 
commutes; thus, one has a category  $  \mathfrak R ^ {T} $
 
of  $  T $-
 
algebras, with an evident forgetful functor  $  G ^ {T} :  \mathfrak R ^ {T} \mathop \rightarrow \limits \mathfrak R $.
 
The functor  $  G ^ {T} $
 
has a left adjoint  $  F ^ { T} $,
 
which sends an object  $  A $
 
of  $  \mathfrak R $
 
to the  $  T $-
 
algebra  $  (TA, \mu _ {A} ) $,
 
and the monad induced by the adjunction ( $  F ^ { T} \dashv G ^ {T} $)
 
is the one originally given.
 
 
Now the Kleisli category of  $  (T, \eta , \mu ) $
 
is just the full subcategory of  $  \mathfrak R ^ {T} $
 
on the objects  $  F ^ { T} (A) $:
 
the category of free algebras (cf. also [[Category|Category]]).
 
 
For a monad  $  (T, \eta , \mu ) $
 
on  $  \mathfrak R $,
 
in the Kleisli construction the category  $  \mathfrak L $
 
has as objects the objects of  $  \mathfrak R $,
 
and as hom-sets the sets
 
 
$$  \mathfrak L (A, B)  =  \mathfrak R (A, TB).  $$
 
 
The composition rule for  $  \mathfrak L $
 
assigns to  $  f \in \mathfrak L (A, B) $
 
and  $  g \in \mathfrak L (B, C) $
 
the  $  \mathfrak R $-
 
composite:
 
 
$$  [A  \mathop \rightarrow \limits ^ {f}  TB  \mathop \rightarrow \limits ^ {T(g)}  TTC  \mathop \rightarrow \limits ^ {\mu _ {C}}  TC ]  \in  \mathfrak L (A, C);  $$
 
 
as identity mapping  $  1 _ {A} \in \mathfrak L (A, A) = \mathfrak R (A, TA) $ one uses the  $  \mathfrak R $-morphism  $  \eta _ {A} :  A \mathop \rightarrow \limits TA $.
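The composition rule and identity of the Kleisli category can be sketched in Python for one concrete monad. The choice of the list monad (with $T A$ the lists over $A$, $\eta$ the singleton map and $\mu$ concatenation) and the names `unit`, `join`, `fmap`, `kleisli`, `divisors` and `neighbours` are illustrative assumptions, not part of the article.

```python
# A minimal sketch of Kleisli composition for the list monad (an assumed example).

def unit(x):
    """eta_A : A -> TA, sending x to the one-element list [x]."""
    return [x]

def join(xss):
    """mu_A : TTA -> TA, flattening a list of lists."""
    return [x for xs in xss for x in xs]

def fmap(g):
    """T(g) : TA -> TB, applying g to every entry of a list."""
    return lambda xs: [g(x) for x in xs]

def kleisli(g, f):
    """Kleisli composite of f : A -> TB and g : B -> TC, i.e. mu_C . T(g) . f."""
    return lambda a: join(fmap(g)(f(a)))

# Two hypothetical Kleisli arrows on the integers.
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def neighbours(n):
    return [n - 1, n + 1]

if __name__ == "__main__":
    print(kleisli(neighbours, divisors)(6))
```

In this sketch `unit` acts as a two-sided identity for `kleisli`, and `kleisli` is associative; both facts follow from the monad axioms for $(T, \eta, \mu)$.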
 
 
An adjoint pair  $  F:  \mathfrak R \mathop \rightarrow \limits \mathfrak L $,
 
$  U:  \mathfrak L \mathop \rightarrow \limits \mathfrak R $
 
is obtained by setting  $  F(A)=A $
 
for  $  A \in \mathfrak R $,
 
 
$$  F(f)  =  \eta _ {B} \circ f :  A  \mathop \rightarrow \limits  B  \mathop \rightarrow \limits  TB  \in  \mathfrak R (A, TB)  =  \mathfrak L (A, B)  $$
 
 
for  $  f \in \mathfrak R (A, B) $,
 
$  U(B)=TB $
 
for  $  B \in \mathfrak L $,
 
and  $  U(g ) = \mu _ {C} \circ T(g) $
 
for  $  g \in \mathfrak L (B, C)= \mathfrak R (B, TC) $.
 
 
Then  $  \eta $
 
will serve as unit for the adjunction, while the co-unit  $  \epsilon :  FU \mathop \rightarrow \limits {\mathop{\rm Id}\nolimits} _ {\mathfrak L} $
 
is given by
 
 
$$  \epsilon _ {B}  =  \mathop{\rm Id} _ {T(B)}  \in  \mathfrak R (TB, TB)  =  \mathfrak L (FUB, B).  $$
 
 
Co-algebras are defined in the same manner. In practice, co-algebras very often occur superposed on algebras; a comonad  $  G $
 
will be constructed on a category of algebras of some sort,  $  \mathfrak R $,
 
leading to the category  $  {} ^ {G} \mathfrak R $
 
of bi-algebras. An important class of cases involves a monad  $  T $
 
and a cotriple  $  G $
 
on the same category  $  \mathfrak R $.
 
There is a standard lifting of  $  G $
 
to a cotriple  $  G ^ {*} $
 
on  $  \mathfrak R ^ {T} $.
 
A  "TG-bi-algebra"  means an object of  $  {} ^ {G ^ {*}} ( \mathfrak R ^ {T} ) $;
 
the reverse order is also possible, but rarely occurs, and the objects would not be called bi-algebras.
 
 
For the role of comonads in (algebraic) cohomology theories see [[Cohomology of algebras|Cohomology of algebras]] and [[#References|[a2]]], [[#References|[a3]]]; particularly [[#References|[a2]]] for explicit interpretation.
 
 
An adjunction is said to be monadic (or monadable) if the Eilenberg–Moore construction applied to the monad it induces yields an adjunction equivalent to the original one. Many important examples of adjunctions are monadic; for example, for any [[Variety of universal algebras|variety of universal algebras]], the forgetful functor from the variety to the category of sets and its left adjoint (the free algebra functor) form a monadic adjunction.
 
 
A monad  $  (T, \eta , \mu ) $
 
is said to be idempotent if  $  \mu $
 
is an isomorphism. In this case it can be shown that any  $  T $-
 
algebra structure  $  \alpha $
 
on an object  $  A $
 
is necessarily a two-sided inverse for  $  \eta _ {A} $,
 
and hence that  $  \mathfrak R ^ {T} $
 
is isomorphic to the full subcategory  $  {\mathop{\rm Fix}\nolimits} (T) \subset  \mathfrak R $
 
consisting of all objects  $  A $
 
such that  $  \eta _ {A} $
 
is an isomorphism.  $  {\mathop{\rm Fix}\nolimits} (T) $
 
is a [[Reflective subcategory|reflective subcategory]] of  $  \mathfrak R $,
 
the left adjoint to the inclusion being given by  $  T $
 
itself. Conversely, for any reflective subcategory of  $  \mathfrak R $,
 
the monad on  $  \mathfrak R $
 
induced by the inclusion and its left adjoint is idempotent; thus, the adjunctions corresponding to reflective subcategories are always monadic.
 
 
====References====
 
<table>
 
<TR><TD valign="top">[a1]</TD> <TD valign="top">  M. Barr,  C. Wells,  "Toposes, triples and theories" , Springer  (1985)</TD></TR>
 
<TR><TD valign="top">[a2]</TD> <TD valign="top">  J.W. Duskin,  "$K(\pi,n)$-torsors and the interpretation of  "triple"  cohomology"  ''Proc. Nat. Acad. Sci. USA'' , '''71'''  (1974)  pp. 2554–2557</TD></TR>
 
<TR><TD valign="top">[a3]</TD> <TD valign="top">  J.W. Duskin,  "Simplicial methods and the interpretation of  "triple"  cohomology"  ''Mem. Amer. Math. Soc.'' , '''3'''  (1975)</TD></TR>
 
<TR><TD valign="top">[a4]</TD> <TD valign="top">  J. Adamek,  H. Herrlich,  G.E. Strecker,  "Abstract and concrete categories" , Wiley (Interscience)  (1990)</TD></TR>
 
<TR><TD valign="top">[a5]</TD> <TD valign="top">  M. Hazewinkel,  "Formal groups" , Acad. Press  (1978)  pp. Sects. 14.5; 14.6, E2</TD></TR>
 
<TR><TD valign="top">[a6]</TD> <TD valign="top">  H. Appelgate (ed.)  et al. (ed.) , ''Seminar on triples and categorical homology theory ETH 1966/7'' , ''Lect. notes in math.'' , '''80''' , Springer  (1969)</TD></TR>
 
<TR><TD valign="top">[a7]</TD> <TD valign="top">  S. Eilenberg,  J.C. Moore,  "Adjoint functors and triples"  ''Ill. J. Math.'' , '''9'''  (1965)  pp. 381–398</TD></TR>
 
<TR><TD valign="top">[a8]</TD> <TD valign="top">  S. Eilenberg (ed.)  et al. (ed.) , ''Proc. conf. categorical algebra (La Jolla, 1965)'' , Springer  (1966)</TD></TR>
 
</table>
 
 
==Standard construction==
 
'''Standard construction''' is a concept in [[category theory]]. Other names are [[triple]], monad and functor-algebra.
 
 
Let $\mathfrak{S}$ be a [[category]]. A standard construction is a [[functor]] $T:\mathfrak{S} \rightarrow \mathfrak{S}$ equipped with natural transformations $\eta:1\rightarrow T$ and $\mu:T^2\rightarrow T$ such that the following diagrams commute:
 
$$

\begin{array}{ccc}

T^3 Y & \stackrel{T\mu_Y}{\rightarrow} & T^2 Y \\

\mu_{TY}\downarrow & & \downarrow\mu_Y \\

T^2 Y & \stackrel{\mu_Y}{\rightarrow} & TY

\end{array}

$$

$$

\begin{array}{ccccc}

TY & \stackrel{\eta_{TY}}{\rightarrow} & T^2Y & \stackrel{T\eta_Y}{\leftarrow} & TY \\

& 1\searrow & \downarrow\mu_Y & \swarrow1 & \\

& & TY & & \\

\end{array}

$$
 
 
The basic use of standard constructions in topology is in the construction of various classifying spaces and their algebraic analogues, the so-called bar-constructions.
 
 
====References====
 
<table>
 
<TR><TD valign="top">[b1]</TD> <TD valign="top">  J.M. Boardman,  R.M. Vogt,  "Homotopy invariant algebraic structures on topological spaces" , Springer  (1973)</TD></TR>
 
<TR><TD valign="top">[b2]</TD> <TD valign="top">  J.F. Adams,  "Infinite loop spaces" , Princeton Univ. Press  (1978)</TD></TR>
 
<TR><TD valign="top">[b3]</TD> <TD valign="top">  J.P. May,  "The geometry of iterated loop spaces" , ''Lect. notes in math.'' , '''271''' , Springer  (1972)</TD></TR>
 
<TR><TD valign="top">[b4]</TD> <TD valign="top">  S. MacLane,  "Categories for the working mathematician" , Springer  (1971)</TD></TR>
 
</table>
 
 
 
 
====Comments====
 
The term  "standard construction"  was introduced by R. Godement [[#References|[c1]]] for want of a better name for this concept. It is now entirely obsolete, having been generally superseded by  "monad"  (although a minority of authors still use the term  "triple" ). Monads have many other uses besides the one mentioned above, for example in the categorical approach to [[universal algebra]] (see [[#References|[c2]]], [[#References|[c3]]]).
 
 
====References====
 
<table>
 
<TR><TD valign="top">[c1]</TD> <TD valign="top">  R. Godement,  "Théorie des faisceaux" , Hermann  (1958)</TD></TR>
 
<TR><TD valign="top">[c2]</TD> <TD valign="top">  E.G. Manes,  "Algebraic theories" , Springer  (1976)</TD></TR>
 
<TR><TD valign="top">[c3]</TD> <TD valign="top">  M. Barr,  C. Wells,  "Toposes, triples and theories" , Springer  (1985)</TD></TR>
 
 
</table>
 
</table>
  

Latest revision as of 18:44, 3 July 2021

Mann–Whitney test

A statistical test for testing the hypothesis $ H _ {0} $ of homogeneity of two samples $ X _ {1} \dots X _ {n} $ and $ Y _ {1} \dots Y _ {m} $, all $ m + n $ elements of which are mutually independent and have continuous distributions. This test, suggested by H.B. Mann and D.R. Whitney [1], is based on the statistic

$$ U = W - \frac{1}{2} m ( m + 1 ) = \ \sum _ { i=1 } ^ { n } \ \sum _ { j=1 } ^ { m } \delta _ {ij} , $$

where $ W $ is the statistic of the Wilcoxon test intended for testing the same hypothesis, equal to the sum of the ranks of the elements of the second sample among the pooled order statistics (cf. Order statistic), and

$$ \delta _ {ij} = \ \left \{ \begin{array}{ll} 1 & \textrm{ if } X _ {i} < Y _ {j} , \\ 0 & \textrm{ otherwise } . \\ \end{array} \right .$$

Thus, $ U $ counts the number of cases when the elements of the second sample exceed elements of the first sample. It follows from the definition of $ U $ that if $ H _ {0} $ is true, then

$$ \tag{* } {\mathsf E} U = \frac{nm}{2} ,\ \ {\mathsf D} U = \frac{n m ( n + m + 1 ) }{12} , $$

and, in addition, this statistic has all the properties of the Wilcoxon statistic $ W $, including asymptotic normality with parameters (*).
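The relation between $U$ and $W$ and the moments (*) can be checked numerically. The following is a minimal sketch, assuming samples without ties (consistent with the continuity assumption); the function names `mann_whitney_u`, `wilcoxon_w` and `null_moments` and the sample values are illustrative choices, not taken from the article.

```python
from fractions import Fraction
from itertools import combinations

def mann_whitney_u(xs, ys):
    """U = number of pairs (i, j) with X_i < Y_j."""
    return sum(1 for x in xs for y in ys if x < y)

def wilcoxon_w(xs, ys):
    """W = sum of the ranks of the Y's in the pooled ordered sample (no ties assumed)."""
    pooled = sorted(list(xs) + list(ys))
    rank = {value: r for r, value in enumerate(pooled, start=1)}
    return sum(rank[y] for y in ys)

def null_moments(n, m):
    """Exact mean and variance of U under H_0.

    Under H_0 every choice of m ranks out of n + m for the second
    sample is equally likely, so the moments are obtained by enumeration.
    """
    values = [sum(ranks) - m * (m + 1) // 2
              for ranks in combinations(range(1, n + m + 1), m)]
    mean = Fraction(sum(values), len(values))
    var = Fraction(sum(v * v for v in values), len(values)) - mean ** 2
    return mean, var

if __name__ == "__main__":
    xs = [1.2, 3.4, 2.2]          # first sample, n = 3
    ys = [0.5, 4.1, 2.9, 5.0]     # second sample, m = 4
    print(mann_whitney_u(xs, ys), wilcoxon_w(xs, ys), null_moments(3, 4))
```

For $n = 3$, $m = 4$ the enumeration gives mean $6 = nm/2$ and variance $8 = nm(n+m+1)/12$, in agreement with (*).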

References

[1] H.B. Mann, D.R. Whitney, "On a test whether one of two random variables is statistically larger than the other" Ann. Math. Stat. , 18 (1947) pp. 50–60

Comments

Instead of Mann–Whitney test, the phrase $U$-test is also used.

Wilcoxon test

A non-parametric test of the homogeneity of two samples $ X _ {1} \dots X _ {n} $ and $ Y _ {1} \dots Y _ {m} $. The elements of the samples are assumed to be mutually independent, with continuous distribution functions $ F( x) $ and $ G( x) $, respectively. The hypothesis to be tested is $ F( x)= G( x) $. Wilcoxon's test is based on the rank statistic

$$ \tag{* } W = s ( r _ {1} ) + \dots + s ( r _ {m} ), $$

where $ r _ {j} $ are the ranks of the random variables $ Y _ {j} $ in the common series of order statistics of $ X _ {i} $ and $ Y _ {j} $, while the function $ s( r) $, $ r = 1 \dots n + m $, is defined by a given permutation

$$ \left( \begin{array}{cccc} 1 & 2 & \cdots & m+n \\ s(1) & s(2) & \cdots & s(m+n) \end{array} \right)\ , $$

where $ s( 1) , \dots, s( n+ m) $ is one of the possible rearrangements of the numbers $ 1 , \dots, n + m $. The permutation is chosen so that the power of Wilcoxon's test for the given alternative is highest. The statistical distribution of $ W $ depends only on the size of the samples and not on the chosen permutation (if the homogeneity hypothesis is true). If $ n \rightarrow \infty $ and $ m \rightarrow \infty $, the random variable $ W $ has an asymptotically-normal distribution. This variant of the test was first proposed by F. Wilcoxon in 1945 for samples of equal sizes and was based on the special case $ s( r) \equiv r $ (cf. Rank sum test; Mann–Whitney test). See also van der Waerden test; Rank test.

References

[1] F. Wilcoxon, "Individual comparison by ranking methods" Biometrics , 1 : 6 (1945) pp. 80–83
[2] L.N. Bol'shev, N.V. Smirnov, "Tables of mathematical statistics" , Libr. math. tables , 46 , Nauka (1983) (In Russian) (Processed by L.S. Bark and E.S. Kedrova)
[3] B.L. van der Waerden, "Mathematische Statistik" , Springer (1957)

Comments

References

[a1] E.L. Lehmann, "Testing statistical hypotheses" , Wiley (1986)


Vapnik-Chervonenkis class

Vapnik–Červonenkis class

Let $ S $ be a set, $ {\mathcal C} $ a collection of subsets of $ S $ and $ F $ a finite subset of $ S $. Then $ {\mathcal C} $ is said to shatter $ F $ if for every subset $ A $ of $ F $ there is a set $ C $ in $ {\mathcal C} $ with $ A = C \cap F $. If there is a largest, finite $ k $ such that $ {\mathcal C} $ shatters at least one set of cardinality $ k $, then $ {\mathcal C} $ is called a Vapnik–Chervonenkis class, or VC class, of sets and $ S ( {\mathcal C} ) = k $ its Vapnik–Chervonenkis index.

Let $ \Delta ^ {\mathcal C} ( F ) $ be the number of different sets $ C \cap F $ for $ C \in {\mathcal C} $. Let $ m ^ {\mathcal C} ( n ) $ be the maximum of $ \Delta ^ {\mathcal C} ( F ) $ over all sets $ F $ of cardinality $ n $. Thus, $ {\mathcal C} $ is a Vapnik–Chervonenkis class if and only if $ m ^ {\mathcal C} ( n ) < 2 ^ {n} $ for some finite $ n $, and then for all $ n > S ( {\mathcal C} ) $. Sauer's lemma says that

$$ m ^ {\mathcal C} ( n ) \leq \sum _ {j = 0 } ^ { {S } ( {\mathcal C} ) } \left ( \begin{array}{c} n \\ j \end{array} \right ) . $$

Thus, $ m ^ {\mathcal C} ( n ) $ is either always $ 2 ^ {n} $ or, for a Vapnik–Chervonenkis class $ {\mathcal C} $, it is bounded above by a polynomial in $ n $ of degree $ S ( {\mathcal C} ) $. (This is the so-called Vapnik–Chervonenkis property: if $ m ^ {\mathcal C} ( n ) < 2 ^ {n} $ for large $ n $, then $ m ^ {\mathcal C} ( n ) $ is bounded by a polynomial.)
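For a finite class the quantities $ \Delta ^ {\mathcal C} ( F ) $, $ m ^ {\mathcal C} ( n ) $ and $ S ( {\mathcal C} ) $ can be computed by brute force. The sketch below does this for the class of intervals of a six-point ordered set; the choice of example and the names `traces`, `shatters`, `vc_index`, `growth` and `sauer_bound` are assumptions made for illustration.

```python
from itertools import combinations
from math import comb

def traces(sets, F):
    """The distinct sets C ∩ F for C in the class; its size is Delta^C(F)."""
    F = frozenset(F)
    return {frozenset(C) & F for C in sets}

def shatters(sets, F):
    """True if every subset of F arises as C ∩ F."""
    return len(traces(sets, F)) == 2 ** len(F)

def vc_index(sets, ground):
    """Largest k such that some k-element subset of `ground` is shattered."""
    k = 0
    for size in range(1, len(ground) + 1):
        if any(shatters(sets, F) for F in combinations(ground, size)):
            k = size
        else:
            break  # subsets of shattered sets are shattered, so larger sizes fail too
    return k

def growth(sets, ground, n):
    """m^C(n): maximum of Delta^C(F) over n-element subsets F of `ground`."""
    return max(len(traces(sets, F)) for F in combinations(ground, n))

def sauer_bound(n, k):
    """Right-hand side of Sauer's lemma."""
    return sum(comb(n, j) for j in range(k + 1))

if __name__ == "__main__":
    ground = list(range(6))
    # All intervals {i, ..., j-1} of the ground set, including the empty one.
    intervals = {frozenset(range(i, j)) for i in range(7) for j in range(i, 7)}
    k = vc_index(intervals, ground)
    for n in range(1, 7):
        print(n, growth(intervals, ground, n), sauer_bound(n, k))
```

For intervals the index is $2$ and the growth function equals the Sauer bound $1 + n + n(n-1)/2$, which shows that the bound cannot be improved in general.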

Vapnik–Chervonenkis classes have turned out to be useful in computer science (learning theory [a1]), probability theory and mathematical statistics [a6], because certain probability limit theorems hold uniformly over them under suitable measurability conditions. One such sufficient measurability condition is that there exist a $ \sigma $- algebra $ {\mathcal S} $ of subsets of $ S $, including $ {\mathcal C} $, and a mapping $ Y $ from a complete separable metric space $ U $ onto $ {\mathcal C} $ such that the set of pairs $ ( x,u ) $ with $ x \in Y ( u ) $ is product-measurable in $ S \times U $. A VC class $ {\mathcal C} $ satisfying this last condition is called a VCM class. While VC, but not VCM, classes can be shown to exist using the axiom of choice, the VC classes usually encountered in applications are VCM.

Let $ {\mathsf P} $ be a probability measure on $ ( S, {\mathcal S} ) $ and let $ X _ {1} ,X _ {2} , \dots $ be independent coordinates with distribution $ {\mathsf P} $, specifically, on a countable Cartesian product of copies of $ ( S, {\mathcal S}, {\mathsf P} ) $. Let $ {\mathsf P} _ {n} $ be the sum of the point masses $ {1 / n } $ at $ X _ {i} $ for $ i = 1 \dots n $; it is called an empirical measure for $ {\mathsf P} $( cf. also Empirical process). Then the law of large numbers for empirical measures holds uniformly over any VCM class $ {\mathcal C} $, meaning that the supremum for $ C \in {\mathcal C} $ of $ | {( {\mathsf P} _ {n} - {\mathsf P} ) ( C ) } | $ approaches zero almost surely as $ n $ becomes large [a7]. This can be improved to a uniform law of the iterated logarithm, meaning that for any VCM class $ {\mathcal C} $, with probability $ 1 $,

$$ \limsup _ {n \rightarrow \infty } n ^ {1/2 } \sup _ {C \in {\mathcal C} } \frac{\left | ( {\mathsf P} _ {n} - {\mathsf P} ) ( C ) \right | }{( 2 \log \log n ) ^ {1/2 } } = \sup _ {A \in {\mathcal C} } ( {\mathsf P} ( A ) ( 1 - {\mathsf P} ( A ) ) ) ^ {1/2 } . $$

Moreover, a central limit theorem holds uniformly: if $ {\mathcal C} $ is any VCM class, and $ G _ {\mathsf P} $ assigns to sets in $ {\mathcal C} $ jointly normal (Gaussian) random variables with mean zero and covariances $ {\mathsf E} G _ {\mathsf P} ( A ) G _ {\mathsf P} ( B ) = {\mathsf P} ( A \cap B ) - {\mathsf P} ( A ) {\mathsf P} ( B ) $, then for any $ \epsilon > 0 $ there is a sufficiently large $ m $ such that for every $ n \geq m $, there exists a $ G _ {\mathsf P} $ with

$$ \sup _ {A \in {\mathcal C} } \left | {n ^ {1/2 } ( {\mathsf P} _ {n} - {\mathsf P} ) ( A ) - G _ {\mathsf P} ( A ) } \right | < \epsilon $$

on an event with probability at least $ 1 - \epsilon $. For the uniform central limit theorem to hold for each probability measure $ {\mathsf P} $ on $ ( S, {\mathcal S} ) $, the VC property is also necessary.

VC classes can be generated as follows. Let $ V $ be a $ k $- dimensional vector space of real-valued functions on $ S $. For each $ f \in V $, let $ { \mathop{\rm pos} } ( f ) $ be the set where $ f > 0 $. Then the class $ {\mathcal C} $ of all sets $ { \mathop{\rm pos} } ( f ) $ for $ f \in V $ is a VC class with $ S ( {\mathcal C} ) = k $. For example, the set of all ellipsoids in a Euclidean space $ \mathbf R ^ {d} $ is a VCM class for each $ d $. Also, let $ {\mathcal C} $ be a VC class and $ m $ a finite integer. Let $ {\mathcal D} $ be the union of all Boolean algebras of sets (cf. Boolean algebra), each generated by at most $ m $ sets in $ {\mathcal C} $. Then $ {\mathcal D} $ is a VC class. For example, the set of all convex polytopes with at most $ m $ faces in $ \mathbf R ^ {d} $ is a VC class for each $ m $ and $ d $. Classes of projections of positivity sets of polynomials of bounded degree, and some other related classes, are also VC [a4].

The class of all finite sets in $ \mathbf R ^ {d} $ and the class of all closed convex sets are not VC classes.

The notion of VC class extends in different ways to a class $ {\mathcal F} $ of real functions on $ S $. The subgraph of a function $ f $ is the set

$$ \left \{ {( s,x ) } : {0 \leq x \leq f ( s ) \textrm{ or } f ( s ) \leq x \leq 0 } \right \} $$

in $ S \times \mathbf R $. Then $ {\mathcal F} $ is called a VC subgraph class if the collection of all subgraphs of functions in $ {\mathcal F} $ is a VC class in $ S \times \mathbf R $; it is called a VC major class if the class of all sets $ \{ {s \in S } : {f ( s ) > x } \} $ for $ f \in {\mathcal F} $ and real $ x $ is a VC class in $ S $.

The above probability limit theorems extend to these and larger classes of functions, with suitable measurability and boundedness. Neither the VC subgraph nor VC major property implies the other. For a uniformly bounded, suitably measurable family of functions, the uniform central limit property for all $ {\mathsf P} $ appears not to be equivalent to any VC-type combinatorial property.

For a probability measure $ {\mathsf P} $ and two events $ A,B $, let $ d _ {1, {\mathsf P} } ( A,B ) = {\mathsf E} | {1 _ {A} - 1 _ {B} } | $. For a totally bounded metric space $ ( T,d ) $ and $ \epsilon > 0 $, let $ D ( \epsilon,T,d ) $ be the maximum number of points of $ T $ all at distance at least $ \epsilon $ from each other. For any $ m $ there is a $ K _ {m} < \infty $ such that for every VCM class $ {\mathcal C} $ with $ S ( {\mathcal C} ) = m $ and any $ {\mathsf P} $,

$$ D ( \epsilon, {\mathcal C},d _ {1, {\mathsf P} } ) \leq K _ {m} \epsilon ^ {- m } , $$

[a3]. There is a universal constant $ K $ such that for every VCM class $ {\mathcal C} $ and any $ M < \infty $,

$$ { \mathop{\rm Pr} } \left \{ \sup _ {A \in {\mathcal C} } \left | n ^ {1/2 } ( {\mathsf P} _ {n} - {\mathsf P} ) ( A ) \right | > M \right \} \leq KM ^ {2S ( {\mathcal C} ) - 1 } \exp ( - 2M ^ {2} ) , $$

[a5].

Every VC class is included in a maximal class with the same VC index. If $ {\mathcal C} $ is a maximal VC class of index $ 1 $, then for any $ A \in {\mathcal C} $ the set of symmetric differences $ ( B \setminus A ) \cup ( A \setminus B ) $ for $ B \in {\mathcal C} $ has a tree-like partial ordering by inclusion, and conversely, such an ordering implies $ S ( {\mathcal C} ) = 1 $[a2]. For index greater than $ 1 $ no such structure is known (1996).

A general reference on VC classes of sets and functions, also from the viewpoint of probability and statistics, is [a6], Sect. 2.6.

Vapnik-Chervonenkis dimension

Vapnik–Červonenkis dimension, VC-dimension

Let $H = (V_H,E_H)$ be a hypergraph. The Vapnik–Chervonenkis dimension of $H$ is the largest cardinality of a subset $F$ of $V_H$ that is shattered by $E_H$, i.e. such that for all $A \subseteq F$ there is an $E \in E_H$ with $A = F \cap E$. Thus, it is the same as the index of a Vapnik–Chervonenkis class. It is usually denoted by $\mathrm{VC}(H)$.

Computing the Vapnik–Chervonenkis dimension is $\mathcal{NP}$-hard (cf. also $\mathcal{NP}$) for many classes of hypergraphs, [b1], [b2].

The Vapnik–Chervonenkis dimension plays an important role in learning theory, especially in probably approximately correct (PAC) learning. Thus, learnability of classes of $\{0,1\}$-valued functions is equivalent to finiteness of the Vapnik–Chervonenkis dimension, [b3].

For the role of the Vapnik–Chervonenkis dimension in neural networks, see, e.g., [b4], [b5].

The independence number of a hypergraph $H$ is the maximal cardinality of a subset $A$ of $V_H$ that does not contain any $E \in E_H$ (see also Graph, numerical characteristics of a). This notion is closely related with $\mathrm{VC}(H)$, [b6], [b7].



References

[a1] A. Blumer, A. Ehrenfeucht, D. Haussler, M.K. Warmuth, "Learnability and the Vapnik–Chervonenkis dimension" J. ACM , 36 (1989) pp. 929–965
[a2] R.M. Dudley, "The structure of some Vapnik–Červonenkis classes" , Proc. Berkeley Conf. in honor of J. Neyman and J. Kiefer , 2 , Wadsworth (1985) pp. 495–508
[a3] D. Haussler, "Sphere packing numbers for subsets of the Boolean $n$-cube with bounded Vapnik–Chervonenkis dimension" J. Combin. Th. A , 69 (1995) pp. 217–232
[a4] G. Stengle, J. Yukich, "Some new Vapnik–Chervonenkis classes" Ann. Statist. , 17 (1989) pp. 1441–1446
[a5] M. Talagrand, "Sharper bounds for Gaussian and empirical processes" Ann. Probab. , 22 (1994) pp. 28–76
[a6] A. van der Vaart, J. Wellner, "Weak convergence and empirical processes" , Springer (1996)
[a7] V.N. Vapnik, A.Ya. Červonenkis, "On the uniform convergence of frequencies of occurrence of events to their probabilities" Th. Probab. Appl. , 16 (1971) pp. 264–280
[a8] R.M. Dudley, "Central limit theorems for empirical measures" Ann. of Probab. , 6 (1978) pp. 899–929
[a9] R.M. Dudley, "Universal Donsker classes and metric entropy" Ann. of Probab. , 15 (1987) pp. 1306–1326
[a10] D. Pollard, "Convergence of stochastic processes" , Springer (1984)
[b1] E. Kranakis, D. Krizanc, B. Ruf, J. Urrutia, G. Wöginger, "The VC-dimension of set systems defined by graphs" Discr. Appl. Math. , 77 : 3 (1997) pp. 237–257
[b2] C.H. Papadimitriou, M. Yannakakis, "On limited nondeterminism and the complexity of VC-dimension" J. Comput. Syst. Sci. , 53 : 2 (1996) pp. 161–170
[b3] S. Ben-David, N. Cesa-Bianchi, D. Haussler, P.M. Long, "Characterizations of learnability of $\{0,\ldots,n\}$-valued functions" J. Comput. Syst. Sci. , 50 : 1 (1995) pp. 74–86 DOI 10.1006/jcss.1995.1008 Zbl 0827.68095
[b4] S.B. Holden, "Neural networks and the VC-dimension" J.G. McWhirter (ed.) , Mathematics in Signal Processing , III , Oxford Univ. Press (1994) pp. 73–84
[b5] W. Maass, "Perspectives of current research about the complexity of learning on neural nets" V. Roychowdhury (ed.) et al. (ed.) , Theoretical Advances in Neural Computation and Learning , Kluwer Acad. Publ. (1994) pp. 295–336
[b6] D.Q. Naiman, H.P. Wynn, "Independence number and the complexity of families of sets" Discr. Math. , 154 (1996) pp. 203–216
[b7] J. Pach, P.K. Agarwal, "Combinatorial geometry" , Wiley/Interscience (1995) pp. 247–254


Spanning set

generating set, for a module $M$ over a ring $R$

A subset $S$ of $M$ such that every element of $M$ can be written as a finite linear combination $\sum_{i=1}^k r_i s_i$ with $r_i \in R$ and $s_i \in S$: a set $S$ such that $M$ is the linear span of $S$.

References

  • P. M. Cohn, "Classic Algebra" Wiley (2000) ISBN 047187731X

Edit distance

A measure of dissimilarity between words over some alphabet in terms of the number of elementary "edit" operations required to turn one word into another.

Examples include

  • Hamming distance between words of the same length. An edit operation consists of substitution: replacing one letter in a given position by another letter in the same position.
  • Lee distance between words of the same length over the alphabet $\mathbf{Z}/m$. An edit operation consists of replacing one letter $i \pmod m$ by $i\pm 1 \pmod m$.
  • Levenshtein distance between strings. An edit operation consists of deleting one character or inserting one character. A substitution can be obtained as a deletion followed by an insertion, but may be considered as another elementary operation.
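The three distances listed above can be sketched as follows. This is a minimal illustration: the function names, the flag `allow_substitution` (which switches between counting a substitution as one operation or as a deletion plus an insertion) and the representation of words over $\mathbf{Z}/m$ as sequences of integers are assumptions of the sketch.

```python
def hamming(u, v):
    """Number of positions at which two words of equal length differ."""
    if len(u) != len(v):
        raise ValueError("Hamming distance needs words of the same length")
    return sum(1 for a, b in zip(u, v) if a != b)

def lee(u, v, m):
    """Lee distance between words over Z/m, given as sequences of integers."""
    if len(u) != len(v):
        raise ValueError("Lee distance needs words of the same length")
    return sum(min((a - b) % m, (b - a) % m) for a, b in zip(u, v))

def levenshtein(u, v, allow_substitution=True):
    """Minimal number of edit operations turning u into v.

    With allow_substitution=False only insertions and deletions are
    elementary, so a substitution costs 2 (a deletion and an insertion).
    """
    sub_cost = 1 if allow_substitution else 2
    prev = list(range(len(v) + 1))            # distances from the empty prefix of u
    for i, a in enumerate(u, start=1):
        cur = [i]                             # delete the first i letters of u
        for j, b in enumerate(v, start=1):
            cur.append(min(prev[j] + 1,       # delete a
                           cur[j - 1] + 1,    # insert b
                           prev[j - 1] + (0 if a == b else sub_cost)))
        prev = cur
    return prev[-1]

if __name__ == "__main__":
    print(hamming("karolin", "kathrin"))
    print(lee([3, 1, 4, 0], [2, 5, 4, 3], 6))
    print(levenshtein("kitten", "sitting"),
          levenshtein("kitten", "sitting", allow_substitution=False))
```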


References

Star-semiring

$*$-semiring

A semiring with a unary operator $*$.

A Conway semiring satisfies the properties $$ (x+y)^* = (x^*y)^*x^* $$ and $$ (xy)^* = 1 + x(yx)^*y \ . $$

In a Conway semiring $C$ the $*$ operator extends to the matrix semiring over $C$.
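As a small illustration, the two Conway identities can be checked exhaustively for $2 \times 2$ matrices over the Boolean semiring. The choice of this semiring (addition "or", multiplication "and", $x^* = 1$) and of the matrix star as reflexive-transitive closure are assumptions of this sketch, as are all function names.

```python
from itertools import product

def identity(n):
    return tuple(tuple(i == j for j in range(n)) for i in range(n))

def add(x, y):
    """Entrywise Boolean sum."""
    return tuple(tuple(a or b for a, b in zip(r, s)) for r, s in zip(x, y))

def mul(x, y):
    """Boolean matrix product."""
    n = len(x)
    return tuple(tuple(any(x[i][k] and y[k][j] for k in range(n))
                       for j in range(n)) for i in range(n))

def star(x):
    """x* = 1 + x + x^2 + ... ; over the Booleans the sum stabilises after n terms."""
    n = len(x)
    result = power = identity(n)
    for _ in range(n):
        power = mul(power, x)
        result = add(result, power)
    return result

def conway_sum(x, y):
    """(x + y)* == (x* y)* x*"""
    return star(add(x, y)) == mul(star(mul(star(x), y)), star(x))

def conway_product(x, y):
    """(x y)* == 1 + x (y x)* y"""
    return star(mul(x, y)) == add(identity(len(x)), mul(mul(x, star(mul(y, x))), y))

def all_matrices(n=2):
    for bits in product([False, True], repeat=n * n):
        yield tuple(tuple(bits[i * n:(i + 1) * n]) for i in range(n))

if __name__ == "__main__":
    ok = all(conway_sum(x, y) and conway_product(x, y)
             for x in all_matrices() for y in all_matrices())
    print(ok)
```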

References

Jean Berstel, Christophe Reutenauer, "Noncommutative rational series with applications", Encyclopedia of Mathematics and its Applications 137, Cambridge (2011) ISBN 978-0-521-19022-0 Zbl 1250.68007

Club

Let $\gamma$ be a limit ordinal. A subset $a \subset \gamma$ is unbounded in $\gamma$ if $\sup a = \gamma$; it is closed in $\gamma$ if it is closed in the order topology on $\gamma$: that is, for any limit $\mu < \gamma$ with $\sup(a\cap\mu) = \mu$ then $\mu \in a$. A club (closed unbounded) subset $a$ is one which is both closed and unbounded in $\gamma$.

If the cofinality $\kappa$ of $\gamma$ is greater than $\omega$, then the intersection of any family of fewer than $\kappa$ clubs is again a club.

References

  • Kenneth Kunen, "Set Theory", Studies in Logic 34, College Publications (2013) ISBN 978-1-84890-050-9. Zbl 1262.03001

Weakly compact cardinal

An uncountable cardinal $\mathfrak{k}$ is weakly compact if for any 2-colouring of the edges of a complete graph on a vertex set of cardinality $\mathfrak{k}$ there is a homogeneous subgraph of cardinality $\mathfrak{k}$.

Weakly compact cardinals are inaccessible, so their existence cannot be proved in ZFC (provided ZFC is consistent).

Every weakly compact cardinal is a Mahlo cardinal.

References

  • Erdős, Paul; Tarski, Alfred "On some problems involving inaccessible cardinals", Essays on the foundations of mathematics, Magnes Press (1961) pp. 50–82, MR0167422. Zbl 0212.32502

Weakly compact operator

An operator $T : X \rightarrow Y$ between Banach spaces is weakly compact if $T$ sends bounded subsets of $X$ into relatively weakly compact subsets of $Y$.

See: Grothendieck space, Dunford–Pettis property, Dunford–Pettis operator.

Grothendieck universe

A type of set in which all of mathematics can be performed. Formally, a set $U$ with the properties:

  • $x \in U$ and $y \in x$ implies $y \in U$: that is, $U$ is a transitive set;
  • If $x, y \in U$ then the doubleton $\{x,y\} \in U$;
  • If $x \in U$ then the power set $\mathcal{P}(x) \in U$;
  • If $\{x_\alpha\}_{\alpha \in y}$ is a family of elements of $U$ indexed by $y$, and $y \in U$, then the union $\cup_{\alpha \in y} x_\alpha \in U$.

The empty set is a Grothendieck universe, as is the set $V_\omega$ of hereditarily finite sets.

The existence of a non-trivial Grothendieck universe is equivalent to a large cardinal axiom: the existence of strongly inaccessible cardinals.

References

  • Thomas Streicher, "Universes in Toposes" in Crosilla, Laura and Schuster, Peter (edd.) From sets and types to topology and analysis. Towards practicable foundations for constructive mathematics. Clarendon Press (2005). pp. 78–90. ISBN 978-0-19-856651-9 Zbl 1092.03038

Hereditarily finite set

A finite set all of whose elements are themselves hereditarily finite sets; equivalently, a set whose transitive closure is finite.

Large cardinal axiom

Strongly inaccessible cardinal

Inaccessible cardinal

Finite type

For an algebraic structure, an alternative term for finitely generated.

A sheaf of modules $\mathcal{F}$ over a sheaf of rings $\mathcal{O}$ is a sheaf of finite type if it is locally generated over $\mathcal{O}$ by a finite number of sections.

A Riemann surface $M$ is of finite type if it can be imbedded in a compact Riemann surface $\tilde M$ such that $\tilde M \setminus M$ consists of finitely many points. Cf. also Riemann surfaces, classification of and (the references to) Double of a Riemann surface.


Local properties of a group

In group theory, if $\mathcal{P}$ is a property of groups, then a group $G$ is said to be locally $\mathcal{P}$ if every finitely-generated subgroup of $G$ has the property $\mathcal{P}$. The term was introduced by A.G. Kurosh.

See

Locally cyclic group
Locally finite group
Locally free group
Locally nilpotent group
Locally solvable group

The term "Locally normal group" does not fit this paradigm.

References

  • B. Chandler, W. Magnus, "The History of Combinatorial Group Theory: A Case Study in the History of Ideas", Springer (2012) ISBN 1-4613-9487-2
  • A.G. Kurosh, Group theory. (Теория групп) (Russian) OGIZ, Moskva-Leningrad (1944) Zbl 0061.02101


Separable partial order

A partially ordered set $(X,{<})$ is Cantor separable if no strictly increasing linearly ordered subset has a least upper bound: $$ a_1 < a_2 < \cdots < b $$ implies there exists $c$ with $$ a_1 < a_2 < \cdots < c < b \ . $$ It is duBois–Reymond separable if a strictly increasing sequence can be separated from a decreasing sequence of upper bounds: $$ a_1 < a_2 < \cdots < \cdots < b_2 < b_1 $$ implies there exists $c$ with $$ a_1 < a_2 < \cdots < c < \cdots < b_2 < b_1 \ . $$

References

  • R.C. Walker, "The Stone–Čech compactification", Springer (1974)

Heron triangle

A triangle for which the lengths of the sides and the area are expressible by integers. Named after Heron (1st century A.D.), who studied triangles with side lengths $13,14,15$ and $5,12,13$, the areas of which are 84 and 30, respectively.

The Pythagorean triangles are special cases (cf. Pythagorean numbers). In this case the area is a congruent number.

The Heron formula for the area $S$ of a triangle in terms of its sides $a$, $b$ and $c$ and semi-perimeter $p=(a+b+c)/2$ is $$S=\sqrt{p(p-a)(p-b)(p-c)},$$ so Heron triangles correspond to integer solutions to $$ S^2 = p(p-a)(p-b)(p-c) \ . $$
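A test for Heron triangles can be sketched in exact integer arithmetic by clearing the denominators in the Heron formula: $16 S^2 = (a+b+c)(-a+b+c)(a-b+c)(a+b-c)$. The function name `heron_area` is an illustrative choice.

```python
from math import isqrt

def heron_area(a, b, c):
    """Return the integer area of the triangle with integer sides a, b, c,
    or None if the sides do not form a Heron triangle."""
    sixteen_s2 = (a + b + c) * (-a + b + c) * (a - b + c) * (a + b - c)
    if sixteen_s2 <= 0:
        return None                     # degenerate, or the triangle inequality fails
    q = isqrt(sixteen_s2)               # candidate for 4 * S
    if q * q != sixteen_s2 or q % 4 != 0:
        return None                     # the area is not an integer
    return q // 4

if __name__ == "__main__":
    for sides in [(13, 14, 15), (5, 12, 13), (3, 4, 5), (1, 1, 1)]:
        print(sides, heron_area(*sides))
```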

References

Trace monoid

Let $A$ be an alphabet with an irreflexive symmetric relation $I$ called independence. The complementary relation $D = A \times A \setminus I$ is the "dependence" relation. Such an alphabet is a concurrence or dependency alphabet. The free monoid on $A$ modulo the relations $ab=ba$ when $(a,b) \in I$ is the trace monoid on $(A,D)$. The elements of a trace monoid are "traces" and the subsets are the "trace languages".
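Equality of traces can be tested without enumerating commutations. The sketch below relies on the projection lemma, a standard fact that is not stated in this article: two words represent the same trace if and only if their projections onto $\{a,b\}$ agree for every pair $(a,b)$ in the dependence relation $D$ (the complement of $I$, which contains all pairs $(a,a)$). The function names are illustrative, and the words are assumed to use only letters of $A$.

```python
def dependence(alphabet, independence):
    """D = A x A minus I, with I symmetrised first."""
    sym = set(independence) | {(b, a) for (a, b) in independence}
    return {(a, b) for a in alphabet for b in alphabet if (a, b) not in sym}

def project(word, letters):
    """Erase from `word` every letter not in `letters`."""
    return [x for x in word if x in letters]

def trace_equivalent(u, v, alphabet, independence):
    """True if u and v are equal in the trace monoid (via the projection lemma)."""
    D = dependence(alphabet, independence)
    return all(project(u, {a, b}) == project(v, {a, b}) for (a, b) in D)

if __name__ == "__main__":
    A = "abc"
    I = {("a", "c")}                    # a and c commute; b commutes with nothing
    print(trace_equivalent("bac", "bca", A, I))
    print(trace_equivalent("abc", "acb", A, I))
```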

Trace monoids are used to model concurrency in computer languages.

References

  • Diekert, Volker; Rozenberg, Grzegorz (edd) "The Book Of Traces" (World Scientific, 1995) ISBN 981-02-2058-8

Trace-class operator

An operator $T$ on a Hilbert space $H$ with complete orthonormal set $(e_n)$ for which the sum $\sum_n \langle Te_n , e_n \rangle$ is finite. For such operators, the trace is defined to be the value of this sum. The set of trace-class operators on $H$ coincides with the set of products of pairs of Hilbert–Schmidt operators. The trace-class operators are precisely the Schatten class for $p=1$.

References

  • Retherford, J. R. "Hilbert space: Compact operators and the trace theorem" London Mathematical Society Student Texts 27. (Cambridge University Press, 1993) ISBN 0-521-42933-1. Zbl 0783.47031

Schatten class

Schatten ideal

A class of operators on a Hilbert space. Let $T$ be an operator with singular values $\sigma_n$. For $1 \le p < \infty$ we say that $T$ is in the Schatten $p$-class if the sequence $(\sigma_n)$ is in $\ell_p$: that is, if $\sum_n |\sigma_n|^p$ converges, and then the $p$-root of the value is the Schatten $p$-norm of $T$. The Schatten classes form ideals of the operator algebra.

The Schatten $2$-class is precisely the Hilbert–Schmidt operators. The Schatten $1$-class is the trace-class operators.

References

  • Retherford, J. R. "Hilbert space: Compact operators and the trace theorem" London Mathematical Society Student Texts 27. (Cambridge University Press, 1993) ISBN 0-521-42933-1. Zbl 0783.47031
  • Schatten, Robert. "Norm ideals of completely continuous operators" Ergebnisse der Mathematik und ihrer Grenzgebiete. Neue Folge, 27. (Springer-Verlag, 1960) Zbl 0090.09402

Singular value

The singular values of a complex $(m\times n)$-matrix $A$ are the square roots of the eigenvalues of $A^*A$; the non-zero singular values are equally the square roots of the non-zero eigenvalues of $AA^*$. The singular value decomposition of $A$ is the expression $A=U\Sigma V$, with $U$ a unitary $(m\times m)$-matrix, $V$ a unitary $(n\times n)$-matrix and $\Sigma$ of the form $$ \Sigma = \begin{pmatrix} {\mathcal D} & 0\\ 0 & 0\end{pmatrix}, $$ where ${\mathcal D}$ is diagonal with entries the non-zero singular values $s_1,\dots,s_k$ of $A$ and $k$ the rank of $A$.

In the case of a closed operator $A$ on a Hilbert space, $A^*A$ is a positive operator and the singular values of $A$ are the square roots of the elements of the spectrum of $A^*A$.
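In the matrix case the definition can be checked by hand for small examples. The sketch below is restricted to real $2\times 2$ matrices, an assumption made so that the eigenvalues of $A^{T}A$ have a closed form. It forms $A^{T}A$ and takes square roots of its two eigenvalues. The function name is illustrative.

```python
from math import sqrt


def singular_values_2x2(a, b, c, d):
    """Singular values, largest first, of the real matrix [[a, b], [c, d]]."""
    # Entries of the symmetric matrix A^T A = [[p, q], [q, r]].
    p = a * a + c * c
    q = a * b + c * d
    r = b * b + d * d
    # Eigenvalues of a symmetric 2x2 matrix: mean +/- radius.
    mean = (p + r) / 2
    radius = sqrt(((p - r) / 2) ** 2 + q * q)
    # Clamp at zero to guard against tiny negative rounding errors.
    return sqrt(mean + radius), sqrt(max(mean - radius, 0.0))
```

For the matrix with rows $(3,0)$ and $(4,5)$ one finds $A^{T}A$ with rows $(25,20)$ and $(20,25)$, whose eigenvalues are $45$ and $5$, so the singular values are $3\sqrt5$ and $\sqrt5$; their product is $15 = |\det A|$.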

Span

Span may refer to

Span (category theory)

A diagram in a category of the form $$ \begin{array}{ccccc} & & C & & \\ & f \swarrow & & \searrow g & \\ A & & & & B \end{array} $$

Two spans with arrows $(f,g)$ and $(f',g')$ are equivalent if for all $D,p,q$ the diagrams $$ \begin{array}{ccccc} & & C & & \\ & f \swarrow & & \searrow g & \\ A & & & & B \\ & p \searrow & & \swarrow q \\ & & D & & \\ \end{array} \ \ \text{and}\ \ \begin{array}{ccccc} & & C & & \\ & f' \swarrow & & \searrow g' & \\ A & & & & B \\ & p \searrow & & \swarrow q \\ & & D & & \\ \end{array} $$ either both commute or both do not commute.

A pushout is the colimit of a span.

References

[1] S. MacLane, "Categories for the working mathematician" , Springer (1971). ISBN 0-387-98403-8

Ostrowski representation

MSC 11A67

Let $[a_1,a_2,\ldots]$ be the partial quotients of an infinite continued fraction and $(c_n)$ the corresponding continuants, $c_0 = 1$, $c_1 = a_1$ and $c_{n+1} = a_{n+1} c_n + c_{n-1}$. An Ostrowski representation of a positive integer $N$ is $$ N = \sum_{k=0}^n x_{k+1} c_k $$ where $0 \le x_1 < a_1$, $0 \le x_k \le a_k$ for $k \ge 2$, and if $x_k = a_k$ then $x_{k-1} = 0$. Every positive integer $N$ has a unique Ostrowski representation.

When the $a_n$ are all equal to $1$, the $c_n$ are the Fibonacci numbers, and the Ostrowski representation is just the Zeckendorf representation. The addition of two $n$-digit numbers in Ostrowski representation based on a given continued fraction can be computed by three linear passes over the input.
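The representation can be found by a greedy procedure: repeatedly divide by the largest continuant not exceeding what remains. The sketch below assumes the partial quotients are supplied as a finite list of positive integers long enough for the given $N$. The function name and the returned format are illustrative. It does not implement the linear-pass addition algorithm mentioned above.

```python
def ostrowski_digits(n, partial_quotients):
    """Greedy Ostrowski representation of the integer n >= 0.

    Returns (digits, continuants) where digits[k] is the coefficient
    x_{k+1} of the continuant c_k, so n == sum(d * c for d, c in zip(...)).
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    continuants = [1]          # c_0 = 1
    prev, cur = 0, 1           # c_{-1} = 0 makes c_1 = a_1 fit the recurrence
    for a in partial_quotients:
        prev, cur = cur, a * cur + prev
        if cur > n:
            break              # the first continuant exceeding n is not needed
        continuants.append(cur)
    else:
        raise ValueError("not enough partial quotients supplied for this n")
    digits = [0] * len(continuants)
    for k in range(len(continuants) - 1, -1, -1):
        digits[k], n = divmod(n, continuants[k])
    return digits, continuants
```

With all partial quotients equal to $1$ the continuants are $1,1,2,3,5,8,\ldots$ and the digits of $10$ select $8$ and $2$, which is its Zeckendorf representation. With all partial quotients equal to $2$ the continuants are $1,2,5,12,\ldots$ and $9 = 1\cdot 5 + 2 \cdot 2$.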

References

  • Philipp Hieronymi; Alonza Terry jun. "Ostrowski numeration systems, addition, and finite automata" Notre Dame J. Formal Logic 59 (2018) 215-232 Zbl 06870290

Dirac comb

A sum of Dirac delta-functions supported on a locally finite point set.

References

  • Michael Baake, Uwe Grimm; "Aperiodic order", vol.1, Encyclopedia of Mathematics and its Applications 149 (Cambridge, 2013) ISBN 978-0-521-86991-1 Zbl 1295.37001
  • Marjorie Senechal; "Quasicrystals and geometry" (Cambridge, 1995) ISBN 0-521-57541-9 Zbl 0828.52007

Delone set

Delaunay set, $(r,R)$-set

A subset $D$ of $\mathbf{R}^n$ which is both uniformly discrete: there exists $r>0$ such that the balls of radius $r$ centred on points of $D$ are disjoint; and relatively dense: there exists $R>0$ such that the balls of radius $R$ centred on points of $D$ cover $\mathbf{R}^n$.

See also Covering and packing.

The spectrum of a Delone set $D$ is defined as the Fourier transform $$ \hat\gamma(s) = \lim_{T\to\infty} \frac{1}{(2T)^n} \sum_{d \in D_T} \exp(-2\pi i\, d\cdot s) $$ where $D_T = D \cap [-T,T]^n$. The spectrum $\hat\gamma$ is a positive measure and has a Lebesgue decomposition into a sum of discrete and continuous measures. If the discrete measure is supported on a countably infinite set $S$, then $D$ is said to satisfy the diffraction condition.

References

  • Marjorie Senechal; "Quasicrystals and geometry" (Cambridge, 1995) ISBN 0-521-57541-9 Zbl 0828.52007

SIS model

A simple model in mathematical epidemiology which reduces to the logistic equation. Assume that the population falls into two subgroups, "susceptible" ($S$) and "infected" ($I$), with susceptible members being infected at a rate proportional to the number of infected, and infected members recovering and returning to the susceptible subgroup at a constant rate. We therefore have $$ S' = - \beta S \cdot I + \alpha I $$ and $$ I' = \beta S \cdot I - \alpha I \ . $$ Since $S+I = N$ is constant, we have $I' = r I (1- k^{-1} I)$ where $r = \beta N - \alpha$ is the growth rate and $k = r / \beta$. The basic reproduction number is $R_0 = \beta N / \alpha$. If $R_0 < 1$, so that $r<0$, then $I$ decreases to zero. If $R_0 > 1$ we have the explicit solution $$ I(t) = \frac{ k B e^{rt} }{ 1 + B e^{rt} } $$ where $B= I(0) / (k - I(0))$, and $I(t)$ tends to $k$ as $t \to \infty$.
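The explicit solution can be evaluated directly. The sketch below assumes $r \ne 0$ and $I(0) \ne k$, so that $k$ and $B$ are defined. The function and parameter names are illustrative only.

```python
from math import exp


def sis_infected(t, i0, beta, alpha, n):
    """Number of infected at time t in the SIS model with total population n,
    infection parameter beta, recovery rate alpha and I(0) = i0."""
    r = beta * n - alpha       # growth rate
    k = r / beta               # limiting value of I when r > 0
    b = i0 / (k - i0)          # constant B fixed by the initial condition
    e = exp(r * t)             # note: overflows for very large r * t
    return k * b * e / (1 + b * e)
```

For instance, with $\beta = 0.001$, $\alpha = 0.1$ and $N = 1000$ one has $r = 0.9$, $k = 900$ and $R_0 = 10$, and `sis_infected(t, 10, 0.001, 0.1, 1000)` increases from $10$ at $t=0$ towards $900$.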

References

  • Maia Martcheva, "An Introduction to Mathematical Epidemiology" Texts in Applied Mathematics 41 (Springer, 2015) ISBN 978-1-4899-7611-6 Zbl 1333.92006

Pregroup

A pregroup generalises the notion of a free group with amalgamation. A pregroup is a partially ordered monoid $(M,{\cdot},1,{\ge})$ with left and right adjoint maps $L$ and $R$ satisfying $$ x^L \cdot x \ge 1 \ge x \cdot x^L $$ $$ x \cdot x^R \ge 1 \ge x^R \cdot x \ . $$ It follows that $x^{LR} = x = x^{RL}$.

A partially ordered group is a pregroup, with both adjoint maps being group inversion.

References

  • J. Stallings, "Group theory and three-dimensional manifolds" Yale Univ. Monogr. 4 (1971) Zbl 0241.57001

Freiman homomorphism

A map $\phi$ defined on a subset $A$ of an additive group $G$ to a group $H$ such that for $a_1,a_2,a_3,a_4 \in A$ $$ a_1 + a_2 = a_3 + a_4 \ \Rightarrow\ \phi(a_1) + \phi(a_2) = \phi(a_3) + \phi(a_4) \ . $$

Clearly an affine map $x \mapsto \psi(x) + b$ with $\psi$ a group homomorphism and $b \in H$ is a Freiman homomorphism. A notable problem in additive combinatorics is to find conditions on $A$ that require every Freiman homomorphism to be affine.

More generally, a Freiman homomorphism of order $k$ satisfies the corresponding property for $k$-tuples with equal sums.
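For a finite set the defining property can be checked by brute force. The sketch below assumes that both groups are the integers under addition and that the map is given as a dictionary on the finite set $A$. These choices, and the function name, are for illustration only.

```python
from itertools import product


def is_freiman_homomorphism(phi, subset, k=2):
    """Check that the map phi (a dict on the finite set of integers subset)
    is a Freiman homomorphism of order k: k-tuples from subset with equal
    sums must have images with equal sums."""
    image_sum_for = {}
    for tup in product(subset, repeat=k):
        total = sum(tup)
        image_total = sum(phi[a] for a in tup)
        # The first tuple seen with this sum fixes the required image sum.
        if image_sum_for.setdefault(total, image_total) != image_total:
            return False
    return True
```

On $A = \{0,1,2,3\}$ the affine map $x \mapsto 5x+7$ passes the test, while $x \mapsto x^2$ fails because $0+2 = 1+1$ but $0+4 \ne 1+1$. On $A = \{0,1,3\}$, where all sums of two elements are distinct, every map passes for $k=2$, which shows why some condition on $A$ is needed to force Freiman homomorphisms to be affine.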

References

  • Melvyn B. Nathanson, "Additive Number Theory: Inverse Problems and the Geometry of Sumsets", Graduate Texts in Mathematics 165 (Springer, 1996) ISBN 0-387-94655-1
  • Terence Tao, Van H. Vu; "Additive Combinatorics", Cambridge Studies in Advanced Mathematics 105 (Cambridge University Press, 2006) ISBN 1-1394-5834-5
  • David J. Grynkiewicz, "Structural Additive Theory", Developments in Mathematics 30 (Springer, 2013) ISBN 3-319-00416-6
How to Cite This Entry:
Richard Pinch/sandbox-13. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Richard_Pinch/sandbox-13&oldid=51573