Namespaces
Variants
Actions

Difference between revisions of "Inclusion-exclusion formula"

From Encyclopedia of Mathematics
Jump to: navigation, search
(Importing text file)
 
m (texed)
 
(3 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 +
 +
<!--This article has been texified automatically. Since there was no Nroff source code for this article,
 +
the semi-automatic procedure described at https://encyclopediaofmath.org/wiki/User:Maximilian_Janisch/latexlist
 +
was used.
 +
If the TeX and formula formatting is correct and if all png images have been replaced by TeX code, please remove this message and the {{TEX|semi-auto}} category.
 +
 +
Out of 78 formulas, 69 were replaced by TEX code.-->
 +
 +
{{TEX|semi-auto}}{{TEX|done}}
 
''inclusion-exclusion principle, inclusion-exclusion method''
 
''inclusion-exclusion principle, inclusion-exclusion method''
  
The inclusion-exclusion principle is used in many branches of pure and applied mathematics. In [[Probability theory|probability theory]] it means the following theorem: Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i1300201.png" /> be events in a [[Probability space|probability space]] and
+
The inclusion-exclusion principle is used in many branches of pure and applied mathematics. In [[Probability theory|probability theory]] it means the following theorem: Let $A _ { 1 } , \ldots , A _ { n }$ be events in a [[Probability space|probability space]] and
 
 
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i1300202.png" /></td> <td valign="top" style="width:5%;text-align:right;">(a1)</td></tr></table>
 
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i1300203.png" /></td> </tr></table>
+
\begin{equation}
 +
\tag{a1}
 +
S_k = \sum_{1 \le i_1 \le \dots \le i_k \le n} \mathsf{P}(A_{i_1} \cap \dots \cap A_{i_k})
 +
\end{equation}
 +
\begin{equation*} k = 1 , \dots , n. \end{equation*}
  
 
Then one has the relation
 
Then one has the relation
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i1300204.png" /></td> <td valign="top" style="width:5%;text-align:right;">(a2)</td></tr></table>
+
\begin{equation}
 +
\tag{a2}
 +
\mathsf{P} ( A _ { 1 } \bigcup \ldots \bigcup A _ { n } ) = S _ { 1 } - S _ { 2 } + \ldots + ( - 1 ) ^ { n - 1 } S _ { n }.
 +
\end{equation}
  
This theorem can easily be proved by induction on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i1300205.png" />. Another proof can be given by the use of indicator variables, as follows. In connection with an event <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i1300206.png" /> one defines <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i1300207.png" /> if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i1300208.png" /> occurs, and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i1300209.png" /> otherwise. Then the equation
+
This theorem can easily be proved by induction on $n$. Another proof can be given by the use of indicator variables, as follows. In connection with an event $A$ one defines $I _ { A } = 1$ if $A$ occurs, and $I _ { A } = 0$ otherwise. Then the equation
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002010.png" /></td> <td valign="top" style="width:5%;text-align:right;">(a3)</td></tr></table>
+
\begin{equation}
 +
\tag{a3}
 +
I_{A_1 \cup \dots \cup A_n}
 +
= \sum_{k=1}^n \sum_{1 \le i_1 \le \dots \le i_k \le n} I_{A_{i_1}} \dots I_{A_{i_k}}
 +
\end{equation}
  
 
holds. Taking expectations on both sides, the theorem follows.
 
holds. Taking expectations on both sides, the theorem follows.
  
If one wants to find the probability of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002011.png" />, then first one observes that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002012.png" /> and then uses (a2) for the events <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002013.png" />.
+
If one wants to find the probability of $A _ { 1 } \cap \ldots \cap A _ { n }$, then first one observes that $\mathsf{P} ( A _ { 1 } \cap \ldots \cap A _ { n } ) = 1 - \mathsf{P} ( \overline { A } _ { 1 } \cup \ldots \cup \overline { A } _ { n } )$ and then uses (a2) for the events $\overline { A } _ { 1 } , \dots , \overline { A } _ { n }$.
  
The theorem is frequently attributed to H. Poincaré [[#References|[a8]]]. However, it was already known to A. De Moivre [[#References|[a2]]] and the even more general formula for the probability that at least <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002014.png" /> events occur has been found by r equation','../p/p072950.htm','Positive variation of a function','../p/p073970.htm','Rectifiable curve','../r/r080130.htm','Variation of a function','../v/v096110.htm')" style="background-color:yellow;">C. Jordan [[#References|[a4]]]. See also [[#References|[a5]]] for further references. All these formulas concern Boolean functions of random events.
+
The theorem is frequently attributed to H. Poincaré [[#References|[a8]]]. However, it was already known to A. De Moivre [[#References|[a2]]] and the even more general formula for the probability that at least $r$ events occur has been found by r equation','../p/p072950.htm','Positive variation of a function','../p/p073970.htm','Rectifiable curve','../r/r080130.htm','Variation of a function','../v/v096110.htm')" style="background-color:yellow;"&gt;C. Jordan [[#References|[a4]]]. See also [[#References|[a5]]] for further references. All these formulas concern Boolean functions of random events.
  
An example for the application of formula (a2) is the solution of the classical problem of coincidences proposed by P.R. Montmort [[#References|[a7]]]: A box contains <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002015.png" /> cards marked <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002016.png" /> and all cards are drawn, one by one, without replacement, where each sequence has the same probability; one says that a coincidence occurs at the <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002017.png" />th drawing, or event <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002018.png" /> occurs, if the card drawn is marked <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002019.png" />; the problem is to find the probability that at least one coincidence occurs (cf. also [[Montmort matching problem|Montmort matching problem]]). One easily sees that for any <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002020.png" /> one has
+
An example for the application of formula (a2) is the solution of the classical problem of coincidences proposed by P.R. Montmort [[#References|[a7]]]: A box contains $n$ cards marked $1 , \dots , n$ and all cards are drawn, one by one, without replacement, where each sequence has the same probability; one says that a coincidence occurs at the $i$th drawing, or event $A_i$ occurs, if the card drawn is marked $i$; the problem is to find the probability that at least one coincidence occurs (cf. also [[Montmort matching problem|Montmort matching problem]]). One easily sees that for any $1 \leq i_{1} < \ldots < i_{k} \leq n$ one has
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002021.png" /></td> </tr></table>
+
\begin{equation*} \mathsf{P} ( A _ { i_{1} } \bigcap \ldots \bigcap A _ { i_{k} } ) = \frac { ( n - k ) ! } { n ! }, \end{equation*}
  
 
hence
 
hence
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002022.png" /></td> </tr></table>
+
\begin{equation*} S _ { k } = \left( \begin{array} { c } { n } \\ { k } \end{array} \right) \frac { ( n - k ) ! } { n ! } \end{equation*}
  
 
and
 
and
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002023.png" /></td> </tr></table>
+
\begin{equation*} \mathsf{P} ( A _ { 1 } \bigcap \ldots \bigcap A _ { n } ) = \sum _ { k = 1 } ^ { n } ( - 1 ) ^ { k - 1 } \frac { 1 } { k ! }. \end{equation*}
  
The numbers <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002024.png" /> have another probabilistic interpretation: <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002025.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002026.png" />, i.e., <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002027.png" /> is the number of events which occur. One says that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002028.png" /> is the <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002029.png" />th binomial moment of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002030.png" />. For a proof of the above equation, see, e.g., [[#References|[a11]]].
+
The numbers $S _ { k }$ have another probabilistic interpretation: $S _ { k } = \mathsf{E} \left[ \left( \begin{array} { l } { X } \\ { k } \end{array} \right) \right]$, where $X = I _ { A _ { 1 } } + \ldots + I _ { A _ { n } }$, i.e., $X$ is the number of events which occur. One says that $S _ { k }$ is the $k$th binomial moment of $X$. For a proof of the above equation, see, e.g., [[#References|[a11]]].
  
 
There are many practical applications where one needs to compute the probability of a union, or other Boolean function of events. Prominent are those in reliability theory. For example, in a communication network, where the links randomly work or fail, one may want to find a two-terminal reliability or the all-terminal reliability. In the former case one has to find the probability that all links in at least one path connecting the two terminal nodes work, and in the latter case the probability that all links in at least one spanning tree work. In both cases the number of random events is too large to apply inclusion-exclusion. In fact, one may be able to compute only a few of the binomial moments involved. In such cases one looks for bounds. The simplest ones, for the probability of the union, are the Bonferroni bounds:
 
There are many practical applications where one needs to compute the probability of a union, or other Boolean function of events. Prominent are those in reliability theory. For example, in a communication network, where the links randomly work or fail, one may want to find a two-terminal reliability or the all-terminal reliability. In the former case one has to find the probability that all links in at least one path connecting the two terminal nodes work, and in the latter case the probability that all links in at least one spanning tree work. In both cases the number of random events is too large to apply inclusion-exclusion. In fact, one may be able to compute only a few of the binomial moments involved. In such cases one looks for bounds. The simplest ones, for the probability of the union, are the Bonferroni bounds:
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002031.png" /></td> </tr></table>
+
\begin{equation*} \mathsf{P} ( A _ { 1 } \bigcup \ldots \bigcup A _ { n } ) \geq S _ { 1 } - S _ { 2 } + \ldots + S _ { m - 1 } - S _ { m } \end{equation*}
  
if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002032.png" /> is even, and
+
if $m$ is even, and
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002033.png" /></td> </tr></table>
+
$$
 +
\mathsf{P} ( A_1 \cup \dots \cup A_n) \le S_1 - S_2 + \dots + S_{m-2} - S_{m-1} + S_m
 +
$$
  
if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002034.png" /> is odd, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002035.png" /> (cf. also [[Bonferroni inequalities|Bonferroni inequalities]]). However, in many cases these bounds are not sufficiently tight.
+
if $m$ is odd, where $m < n$ (cf. also [[Bonferroni inequalities|Bonferroni inequalities]]). However, in many cases these bounds are not sufficiently tight.
  
A general, efficient method for bounding <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002036.png" /> and other Boolean functions of events has been worked out by A. Prékopa (see [[#References|[a10]]] and the references therein). In the case of bounding the probability of the union one observes that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002037.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002038.png" /> is the probability that exactly <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002039.png" /> events occur. Then, if one knows <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002040.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002041.png" />, but the <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002042.png" /> are unknown, one solves the linear programs:
+
A general, efficient method for bounding $\mathsf{P} ( A _ { 1 } \cup \ldots \cup A _ { n } )$ and other Boolean functions of events has been worked out by A. Prékopa (see [[#References|[a10]]] and the references therein). In the case of bounding the probability of the union one observes that $S _ { k } = \mathsf{E} \left[ \left( \begin{array} { l } { X } \\ { k } \end{array} \right) \right] = \sum _ { i = 1 } ^ { n } \left( \begin{array} { l } { i } \\ { k } \end{array} \right) p_{i}$, where $p _ { i }$ is the probability that exactly $i$ events occur. Then, if one knows $S _ { 1 } , \ldots , S _ { m }$, where $m < n$, but the $p_{i}$ are unknown, one solves the linear programs:
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002043.png" /></td> <td valign="top" style="width:5%;text-align:right;">(a4)</td></tr></table>
+
\begin{equation}
 +
\tag{a4}
 +
\left\{
 +
\begin{array}{l}
 +
\text{min(max)} \qquad \displaystyle \sum_{i=1}^n p_i \\
 +
\text{subject to} \qquad \displaystyle \sum_{i=1}^n \binom ik p_i = S_k, \quad k=1,\ldots m, \\
 +
p_i \ge 0,\ i=1,\ldots, n.
 +
\end{array}
 +
\right.
 +
\end{equation}
  
If <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002044.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002045.png" /> are the optimum values corresponding to the min and max problems, respectively, then these are sharp bounds for the probability of the union, i.e.
+
If $P_{\text{min}}$ and $P _ { \text { max } }$ are the optimum values corresponding to the min and max problems, respectively, then these are sharp bounds for the probability of the union, i.e.
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002046.png" /></td> </tr></table>
+
\begin{equation*} P _ { \operatorname { min } } \leq \mathsf{P} ( A _ { 1 } \bigcup \dots  \bigcup A _ { n } ) \leq P _ { \text{max} } \end{equation*}
  
and no better bounds can be given based on the knowledge of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002047.png" />. There are simple and efficient algorithms to solve problems (a4) which are variants of the dual algorithm of linear programming. The sharp bounding formulas can also be written as
+
and no better bounds can be given based on the knowledge of $S _ { 1 } , \ldots , S _ { m }$. There are simple and efficient algorithms to solve problems (a4) which are variants of the dual algorithm of linear programming. The sharp bounding formulas can also be written as
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002048.png" /></td> </tr></table>
+
\begin{equation*} \sum _ { k = 1 } ^ { m } x _ { k } S _ { k } \leq \mathsf{P} ( A _ { 1 } \bigcup \ldots \bigcup A _ { n } ) \leq \sum _ { k = 1 } ^ { m } y _ { k } S _ { k }, \end{equation*}
  
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002049.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002050.png" /> are optimal solutions of the duals of the min and max problems, respectively. It has been shown that the components of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002051.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002052.png" /> have alternating signs, starting with <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002053.png" />, and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002054.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002055.png" />. The optimum values can be expressed by formulas if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002056.png" /> (see the references in [[#References|[a10]]]).
+
where $\mathbf x = ( x _ { 1 } , \dots , x _ { m } ) ^ { T }$ and $\mathbf{y} = ( y _ { 1 } , \dots , y _ { m } ) ^ { T }$ are optimal solutions of the duals of the min and max problems, respectively. It has been shown that the components of $\mathbf{x}$ and $\mathbf{y}$ have alternating signs, starting with $+$, and $| x _ { 1 } | \geq \ldots \geq | x _ { m } |$, $| y _ { 1 } | \geq \ldots \geq | y _ { m } |$. The optimum values can be expressed by formulas if $m \leq 4$ (see the references in [[#References|[a10]]]).
  
Formula (a2) holds for any finitely additive set function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002057.png" /> defined on an algebra <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002058.png" /> of some subsets of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002059.png" /> (if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002060.png" />, then <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002061.png" />, and if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002062.png" />, then <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002063.png" />). In particular, one may take <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002064.png" /> an arbitrary finite set, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002065.png" /> all subsets of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002066.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002067.png" />, the cardinality of the set <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002068.png" />. The latter case has many applications in combinatorics, especially in enumeration problems. A good sample of combinatorial problems, where inclusion-exclusion is used, is presented in [[#References|[a6]]].
+
Formula (a2) holds for any finitely additive set function $\mu$ defined on an algebra $\mathcal{S}$ of some subsets of $\Omega$ (if $A \in \mathcal S$, then $\overline{A} \in \mathcal{S}$, and if $A , B \in \mathcal S$, then $A \cup B \in \mathcal{S}$). In particular, one may take $\Omega$ an arbitrary finite set, $\mathcal{S}$ all subsets of $\Omega$ and $\mu ( A ) = | A |$, the cardinality of the set $A$. The latter case has many applications in combinatorics, especially in enumeration problems. A good sample of combinatorial problems, where inclusion-exclusion is used, is presented in [[#References|[a6]]].
  
Inclusion-exclusion plays also an important role in number theory. Here one calls it the sieve formula or sieve method. In this respect, V. Brun did pioneering work [[#References|[a1]]] (cf. also [[Sieve method|Sieve method]]; [[Brun sieve|Brun sieve]]). For a summary of the sieve methods, see [[#References|[a9]]] and [[#References|[a3]]]. A simple but revealing example is the formula for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002069.png" />, the number of those positive integers that are relative prime to, and smaller than, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002070.png" />. If <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002071.png" /> has prime divisors <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002072.png" />, then the inclusion-exclusion principle gives:
+
Inclusion-exclusion plays also an important role in number theory. Here one calls it the sieve formula or sieve method. In this respect, V. Brun did pioneering work [[#References|[a1]]] (cf. also [[Sieve method|Sieve method]]; [[Brun sieve|Brun sieve]]). For a summary of the sieve methods, see [[#References|[a9]]] and [[#References|[a3]]]. A simple but revealing example is the formula for $\varphi ( n )$, the number of those positive integers that are relative prime to, and smaller than, $n$. If $n$ has prime divisors $p _ { 1 } , \dots , p _ { k }$, then the inclusion-exclusion principle gives:
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002073.png" /></td> </tr></table>
+
\begin{equation*} \varphi ( n ) = n - \frac { n } { p _ { 1 } } - \ldots - \frac { n } { p _ { k } } + \end{equation*}
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/i/i130/i130020/i13002074.png" /></td> </tr></table>
+
\begin{equation*} + \frac { n } { p _ { 1 } p _ { 2 } } + \ldots + \frac { n } { p _ { k - 1 } p _ { k } } + - \frac { n } { p _ { 1 } p _ { 2 } p _ { 3 } } - \ldots + ( - 1 ) ^ { k } \frac { n } { p _ { 1 } \ldots p _ { k } }. \end{equation*}
  
 
====References====
 
====References====
<table><TR><TD valign="top">[a1]</TD> <TD valign="top">  V. Brun,  "Le crible d'Eratosthène et le théorème de Goldbach"  ''Skrifter Norske Vid.-Akad. Kristiana I'' , '''3'''  (1920)</TD></TR><TR><TD valign="top">[a2]</TD> <TD valign="top">  A. De Moivre,  "Doctrine of chances" , London  (1718)</TD></TR><TR><TD valign="top">[a3]</TD> <TD valign="top">  H. Halberstam,  H.E. Richert,  "Sieve methods" , Acad. Press  (1974)</TD></TR><TR><TD valign="top">[a4]</TD> <TD valign="top">  C. Jordan,  "De quelques formules de probabilité"  ''C.R. Acad. Sci. Paris'' , '''65'''  (1867)  pp. 993–994</TD></TR><TR><TD valign="top">[a5]</TD> <TD valign="top">  K. Jordan,  "Chapters on the classical calculus of probability" , Akad. Kiadó  (1972)</TD></TR><TR><TD valign="top">[a6]</TD> <TD valign="top">  L. Lovász,  "Combinatorial problems and exercises" , Akad. Kiadó  (1993)</TD></TR><TR><TD valign="top">[a7]</TD> <TD valign="top">  P.R. Montmort,  "Essai d'analyse sur les jeux de hazard" , Paris  (1708)</TD></TR><TR><TD valign="top">[a8]</TD> <TD valign="top">  H. Poincaré,  "Calcul des probabilités" , Gauthier-Villars  (1896)</TD></TR><TR><TD valign="top">[a9]</TD> <TD valign="top">  K. Prachar,  "Primzahlverteilung" , Springer  (1957)</TD></TR><TR><TD valign="top">[a10]</TD> <TD valign="top">  A. Prékopa,  "Stochastic programming" , Kluwer Acad. Publ.  (1995)</TD></TR><TR><TD valign="top">[a11]</TD> <TD valign="top">  L. Takács,  "On the method of inclusion and exclusion"  ''J. Amer. Statist. Assoc.'' , '''62'''  (1967)  pp. 102–113</TD></TR></table>
+
<table
 +
><tr><td valign="top">[a1]</td> <td valign="top">  V. Brun,  "Le crible d'Eratosthène et le théorème de Goldbach"  ''Skrifter Norske Vid.-Akad. Kristiana I'' , '''3'''  (1920)</td></tr><tr><td valign="top">[a2]</td> <td valign="top">  A. De Moivre,  "Doctrine of chances" , London  (1718)</td></tr><tr><td valign="top">[a3]</td> <td valign="top">  H. Halberstam,  H.E. Richert,  "Sieve methods" , Acad. Press  (1974)</td></tr><tr><td valign="top">[a4]</td> <td valign="top">  C. Jordan,  "De quelques formules de probabilité"  ''C.R. Acad. Sci. Paris'' , '''65'''  (1867)  pp. 993–994</td></tr><tr><td valign="top">[a5]</td> <td valign="top">  K. Jordan,  "Chapters on the classical calculus of probability" , Akad. Kiadó  (1972)</td></tr><tr><td valign="top">[a6]</td> <td valign="top">  L. Lovász,  "Combinatorial problems and exercises" , Akad. Kiadó  (1993)</td></tr><tr><td valign="top">[a7]</td> <td valign="top">  P.R. Montmort,  "Essai d'analyse sur les jeux de hazard" , Paris  (1708)</td></tr><tr><td valign="top">[a8]</td> <td valign="top">  H. Poincaré,  "Calcul des probabilités" , Gauthier-Villars  (1896)</td></tr><tr><td valign="top">[a9]</td> <td valign="top">  K. Prachar,  "Primzahlverteilung" , Springer  (1957)</td></tr><tr><td valign="top">[a10]</td> <td valign="top">  A. Prékopa,  "Stochastic programming" , Kluwer Acad. Publ.  (1995)</td></tr><tr><td valign="top">[a11]</td> <td valign="top">  L. Takács,  "On the method of inclusion and exclusion"  ''J. Amer. Statist. Assoc.'' , '''62'''  (1967)  pp. 102–113</td></tr>
 +
</table>

Latest revision as of 06:06, 15 February 2024

inclusion-exclusion principle, inclusion-exclusion method

The inclusion-exclusion principle is used in many branches of pure and applied mathematics. In probability theory it means the following theorem: Let $A _ { 1 } , \ldots , A _ { n }$ be events in a probability space and

\begin{equation} \tag{a1} S_k = \sum_{1 \le i_1 \le \dots \le i_k \le n} \mathsf{P}(A_{i_1} \cap \dots \cap A_{i_k}) \end{equation} \begin{equation*} k = 1 , \dots , n. \end{equation*}

Then one has the relation

\begin{equation} \tag{a2} \mathsf{P} ( A _ { 1 } \bigcup \ldots \bigcup A _ { n } ) = S _ { 1 } - S _ { 2 } + \ldots + ( - 1 ) ^ { n - 1 } S _ { n }. \end{equation}

This theorem can easily be proved by induction on $n$. Another proof can be given by the use of indicator variables, as follows. In connection with an event $A$ one defines $I _ { A } = 1$ if $A$ occurs, and $I _ { A } = 0$ otherwise. Then the equation

\begin{equation} \tag{a3} I_{A_1 \cup \dots \cup A_n} = \sum_{k=1}^n \sum_{1 \le i_1 \le \dots \le i_k \le n} I_{A_{i_1}} \dots I_{A_{i_k}} \end{equation}

holds. Taking expectations on both sides, the theorem follows.

If one wants to find the probability of $A _ { 1 } \cap \ldots \cap A _ { n }$, then first one observes that $\mathsf{P} ( A _ { 1 } \cap \ldots \cap A _ { n } ) = 1 - \mathsf{P} ( \overline { A } _ { 1 } \cup \ldots \cup \overline { A } _ { n } )$ and then uses (a2) for the events $\overline { A } _ { 1 } , \dots , \overline { A } _ { n }$.

The theorem is frequently attributed to H. Poincaré [a8]. However, it was already known to A. De Moivre [a2] and the even more general formula for the probability that at least $r$ events occur has been found by r equation','../p/p072950.htm','Positive variation of a function','../p/p073970.htm','Rectifiable curve','../r/r080130.htm','Variation of a function','../v/v096110.htm')" style="background-color:yellow;">C. Jordan [a4]. See also [a5] for further references. All these formulas concern Boolean functions of random events.

An example for the application of formula (a2) is the solution of the classical problem of coincidences proposed by P.R. Montmort [a7]: A box contains $n$ cards marked $1 , \dots , n$ and all cards are drawn, one by one, without replacement, where each sequence has the same probability; one says that a coincidence occurs at the $i$th drawing, or event $A_i$ occurs, if the card drawn is marked $i$; the problem is to find the probability that at least one coincidence occurs (cf. also Montmort matching problem). One easily sees that for any $1 \leq i_{1} < \ldots < i_{k} \leq n$ one has

\begin{equation*} \mathsf{P} ( A _ { i_{1} } \bigcap \ldots \bigcap A _ { i_{k} } ) = \frac { ( n - k ) ! } { n ! }, \end{equation*}

hence

\begin{equation*} S _ { k } = \left( \begin{array} { c } { n } \\ { k } \end{array} \right) \frac { ( n - k ) ! } { n ! } \end{equation*}

and

\begin{equation*} \mathsf{P} ( A _ { 1 } \bigcap \ldots \bigcap A _ { n } ) = \sum _ { k = 1 } ^ { n } ( - 1 ) ^ { k - 1 } \frac { 1 } { k ! }. \end{equation*}

The numbers $S _ { k }$ have another probabilistic interpretation: $S _ { k } = \mathsf{E} \left[ \left( \begin{array} { l } { X } \\ { k } \end{array} \right) \right]$, where $X = I _ { A _ { 1 } } + \ldots + I _ { A _ { n } }$, i.e., $X$ is the number of events which occur. One says that $S _ { k }$ is the $k$th binomial moment of $X$. For a proof of the above equation, see, e.g., [a11].

There are many practical applications where one needs to compute the probability of a union, or other Boolean function of events. Prominent are those in reliability theory. For example, in a communication network, where the links randomly work or fail, one may want to find a two-terminal reliability or the all-terminal reliability. In the former case one has to find the probability that all links in at least one path connecting the two terminal nodes work, and in the latter case the probability that all links in at least one spanning tree work. In both cases the number of random events is too large to apply inclusion-exclusion. In fact, one may be able to compute only a few of the binomial moments involved. In such cases one looks for bounds. The simplest ones, for the probability of the union, are the Bonferroni bounds:

\begin{equation*} \mathsf{P} ( A _ { 1 } \bigcup \ldots \bigcup A _ { n } ) \geq S _ { 1 } - S _ { 2 } + \ldots + S _ { m - 1 } - S _ { m } \end{equation*}

if $m$ is even, and

$$ \mathsf{P} ( A_1 \cup \dots \cup A_n) \le S_1 - S_2 + \dots + S_{m-2} - S_{m-1} + S_m $$

if $m$ is odd, where $m < n$ (cf. also Bonferroni inequalities). However, in many cases these bounds are not sufficiently tight.

A general, efficient method for bounding $\mathsf{P} ( A _ { 1 } \cup \ldots \cup A _ { n } )$ and other Boolean functions of events has been worked out by A. Prékopa (see [a10] and the references therein). In the case of bounding the probability of the union one observes that $S _ { k } = \mathsf{E} \left[ \left( \begin{array} { l } { X } \\ { k } \end{array} \right) \right] = \sum _ { i = 1 } ^ { n } \left( \begin{array} { l } { i } \\ { k } \end{array} \right) p_{i}$, where $p _ { i }$ is the probability that exactly $i$ events occur. Then, if one knows $S _ { 1 } , \ldots , S _ { m }$, where $m < n$, but the $p_{i}$ are unknown, one solves the linear programs:

\begin{equation} \tag{a4} \left\{ \begin{array}{l} \text{min(max)} \qquad \displaystyle \sum_{i=1}^n p_i \\ \text{subject to} \qquad \displaystyle \sum_{i=1}^n \binom ik p_i = S_k, \quad k=1,\ldots m, \\ p_i \ge 0,\ i=1,\ldots, n. \end{array} \right. \end{equation}

If $P_{\text{min}}$ and $P _ { \text { max } }$ are the optimum values corresponding to the min and max problems, respectively, then these are sharp bounds for the probability of the union, i.e.

\begin{equation*} P _ { \operatorname { min } } \leq \mathsf{P} ( A _ { 1 } \bigcup \dots \bigcup A _ { n } ) \leq P _ { \text{max} } \end{equation*}

and no better bounds can be given based on the knowledge of $S _ { 1 } , \ldots , S _ { m }$. There are simple and efficient algorithms to solve problems (a4) which are variants of the dual algorithm of linear programming. The sharp bounding formulas can also be written as

\begin{equation*} \sum _ { k = 1 } ^ { m } x _ { k } S _ { k } \leq \mathsf{P} ( A _ { 1 } \bigcup \ldots \bigcup A _ { n } ) \leq \sum _ { k = 1 } ^ { m } y _ { k } S _ { k }, \end{equation*}

where $\mathbf x = ( x _ { 1 } , \dots , x _ { m } ) ^ { T }$ and $\mathbf{y} = ( y _ { 1 } , \dots , y _ { m } ) ^ { T }$ are optimal solutions of the duals of the min and max problems, respectively. It has been shown that the components of $\mathbf{x}$ and $\mathbf{y}$ have alternating signs, starting with $+$, and $| x _ { 1 } | \geq \ldots \geq | x _ { m } |$, $| y _ { 1 } | \geq \ldots \geq | y _ { m } |$. The optimum values can be expressed by formulas if $m \leq 4$ (see the references in [a10]).

Formula (a2) holds for any finitely additive set function $\mu$ defined on an algebra $\mathcal{S}$ of some subsets of $\Omega$ (if $A \in \mathcal S$, then $\overline{A} \in \mathcal{S}$, and if $A , B \in \mathcal S$, then $A \cup B \in \mathcal{S}$). In particular, one may take $\Omega$ an arbitrary finite set, $\mathcal{S}$ all subsets of $\Omega$ and $\mu ( A ) = | A |$, the cardinality of the set $A$. The latter case has many applications in combinatorics, especially in enumeration problems. A good sample of combinatorial problems, where inclusion-exclusion is used, is presented in [a6].

Inclusion-exclusion plays also an important role in number theory. Here one calls it the sieve formula or sieve method. In this respect, V. Brun did pioneering work [a1] (cf. also Sieve method; Brun sieve). For a summary of the sieve methods, see [a9] and [a3]. A simple but revealing example is the formula for $\varphi ( n )$, the number of those positive integers that are relative prime to, and smaller than, $n$. If $n$ has prime divisors $p _ { 1 } , \dots , p _ { k }$, then the inclusion-exclusion principle gives:

\begin{equation*} \varphi ( n ) = n - \frac { n } { p _ { 1 } } - \ldots - \frac { n } { p _ { k } } + \end{equation*}

\begin{equation*} + \frac { n } { p _ { 1 } p _ { 2 } } + \ldots + \frac { n } { p _ { k - 1 } p _ { k } } + - \frac { n } { p _ { 1 } p _ { 2 } p _ { 3 } } - \ldots + ( - 1 ) ^ { k } \frac { n } { p _ { 1 } \ldots p _ { k } }. \end{equation*}

References

[a1] V. Brun, "Le crible d'Eratosthène et le théorème de Goldbach" Skrifter Norske Vid.-Akad. Kristiana I , 3 (1920)
[a2] A. De Moivre, "Doctrine of chances" , London (1718)
[a3] H. Halberstam, H.E. Richert, "Sieve methods" , Acad. Press (1974)
[a4] C. Jordan, "De quelques formules de probabilité" C.R. Acad. Sci. Paris , 65 (1867) pp. 993–994
[a5] K. Jordan, "Chapters on the classical calculus of probability" , Akad. Kiadó (1972)
[a6] L. Lovász, "Combinatorial problems and exercises" , Akad. Kiadó (1993)
[a7] P.R. Montmort, "Essai d'analyse sur les jeux de hazard" , Paris (1708)
[a8] H. Poincaré, "Calcul des probabilités" , Gauthier-Villars (1896)
[a9] K. Prachar, "Primzahlverteilung" , Springer (1957)
[a10] A. Prékopa, "Stochastic programming" , Kluwer Acad. Publ. (1995)
[a11] L. Takács, "On the method of inclusion and exclusion" J. Amer. Statist. Assoc. , 62 (1967) pp. 102–113
How to Cite This Entry:
Inclusion-exclusion formula. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Inclusion-exclusion_formula&oldid=12560
This article was adapted from an original article by András Prékopa (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article