# Benford law

significant-digit law, first-digit law

A probability distribution on the significant digits of real numbers, named after one of its early investigators [a1]. Letting $\{ D _ {n} \} _ {n = 1 } ^ \infty$ denote the (base-$10$) significant digit functions (on $\mathbf R \backslash \{ 0 \}$), i.e.,

$$D _ {n} ( x ) = n \textrm{ th significant digit of } x$$

(so, e.g., $D _ {1} ( 0.0304 ) = D _ {1} ( 304 ) = 3$, $D _ {2} ( 0.0304 ) = 0$, etc.), Benford's law is the logarithmic probability distribution ${\mathsf P}$ given by

1) (first digit law)

$${\mathsf P} ( D _ {1} = d ) = { \mathop{\rm log} } _ {10 } ( 1 + d ^ {- 1 } ) , d = 1 \dots 9;$$

2) (second digit law)

$${\mathsf P} ( D _ {2} = d ) = \sum _ {k = 1 } ^ { 9 } { \mathop{\rm log} } _ {10 } \left ( 1 + ( 10k + d ) ^ {- 1 } \right ) , \quad d = 0 \dots 9;$$

3) (general digit law)

$${\mathsf P} ( D _ {1} = d _ {1} \dots D _ {k} = d _ {k} ) = { \mathop{\rm log} } _ {10 } \left [ 1 + \left ( \sum _ {i = 1 } ^ { k } d _ {i} \cdot 10 ^ {k - i } \right ) ^ {- 1 } \right ]$$

for all $k \in \mathbf N$, $d _ {1} \in \{ 1 \dots 9 \}$ and $d _ {j} \in \{ 0 \dots 9 \}$, $j = 2 \dots k$.
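The digit laws 1)–3) are easy to evaluate numerically. The following Python sketch is illustrative only: the function names are hypothetical and do not come from the original article. It implements the general law 3) and obtains 1) and 2) from it, the second-digit law being the sum of the general law over the nine possible first digits.

```python
import math

def general_digit_prob(digits):
    """P(D_1 = d_1, ..., D_k = d_k) according to the general digit law 3)."""
    digits = list(digits)
    if not digits or not 1 <= digits[0] <= 9:
        raise ValueError("the first significant digit must lie in 1..9")
    if any(not 0 <= d <= 9 for d in digits[1:]):
        raise ValueError("later significant digits must lie in 0..9")
    # Read d_1 ... d_k as a single integer, i.e. the sum of d_i * 10**(k - i).
    n = 0
    for d in digits:
        n = 10 * n + d
    return math.log10(1 + 1 / n)

def first_digit_prob(d):
    """First digit law 1): the case k = 1 of the general law."""
    return general_digit_prob([d])

def second_digit_prob(d):
    """Second digit law 2): sum the general law over first digits 1..9."""
    return sum(general_digit_prob([k, d]) for k in range(1, 10))
```

For instance, `first_digit_prob(1)` equals $\log_{10} 2 \approx 0.301$, while `first_digit_prob(9)` is about $0.046$. Both the first-digit and the second-digit probabilities sum to $1$, because the underlying sums of logarithms telescope.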

An alternate form of the general law 3) is

4) ${\mathsf P} ( { \mathop{\rm mantissa} } \leq {t / {10 } } ) = { \mathop{\rm log} } _ {10 } t$ for all $t \in [ 1,10 )$. Here, the mantissa (base $10$) of a positive real number $x$ is the real number $r \in [ {1 / {10 } } ,1 )$ with $x = r \cdot 10 ^ {n}$ for some $n \in \mathbf Z$; e.g., the mantissas of both $304$ and $0.0304$ are $0.304$.
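As a concrete companion to the definitions of $D _ {n}$ and of the mantissa, here is a minimal Python sketch. The helper names are hypothetical, and the use of the standard `decimal` module (to read digits exactly rather than through binary floating point) is a choice made for this illustration. The last function evaluates form 4), from which the first digit law follows by differencing.

```python
import math
from decimal import Decimal

def _coefficient_digits(x):
    """Decimal digits of a nonzero finite number, without leading zeros."""
    value = Decimal(str(x))
    if not value.is_finite() or value.is_zero():
        raise ValueError("x must be a nonzero finite number")
    return value.as_tuple().digits

def significant_digit(x, n):
    """D_n(x): the n-th significant digit of x (n = 1, 2, ...)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    digits = _coefficient_digits(x)
    # Past the last written digit, every further significant digit is 0.
    return digits[n - 1] if n <= len(digits) else 0

def mantissa(x):
    """The r in [1/10, 1) with x = r * 10**n for some integer n (x > 0)."""
    if Decimal(str(x)) <= 0:
        raise ValueError("the mantissa is defined here for positive x only")
    digits = _coefficient_digits(x)
    return Decimal((0, digits, -len(digits)))

def mantissa_cdf(t):
    """Form 4): P(mantissa <= t/10) = log10(t) for t in [1, 10).

    t = 10 is also accepted, as the limiting value 1.
    """
    if not 1 <= t <= 10:
        raise ValueError("t must lie in [1, 10]")
    return math.log10(t)
```

With these helpers, `significant_digit("0.0304", 1)` and `significant_digit("304", 1)` both give `3`, `significant_digit("0.0304", 2)` gives `0`, and `mantissa("304")` equals `mantissa("0.0304")`, namely `Decimal("0.304")`. The event $\{ D _ {1} = d \}$ is the event that the mantissa lies in $[ d/10, (d+1)/10 )$, so `mantissa_cdf(d + 1) - mantissa_cdf(d)` reproduces $\log_{10} ( 1 + d^{-1} )$.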

More formally, the logarithmic probability measure ${\mathsf P}$ in 1)–4) is defined on the measurable space $( \mathbf R ^ {+} , {\mathcal M} )$, where $\mathbf R ^ {+}$ is the set of positive real numbers and ${\mathcal M}$ is the (base-$10$) mantissa $\sigma$-algebra, i.e., the sub-$\sigma$-algebra of the Borel $\sigma$-algebra generated by the significant digit functions $\{ D _ {n} \} _ {n =1 } ^ \infty$ (or, equivalently, generated by the single function $x \mapsto { \mathop{\rm mantissa} } ( x )$). In some combinatorial and number-theoretic treatments of Benford's law, $\mathbf R ^ {+}$ is replaced by $\mathbf N$, and ${\mathsf P}$ by a finitely-additive probability measure defined on all subsets of $\mathbf N$.

Empirical evidence of Benford's law in numerical data has appeared in a wide variety of contexts, including tables of physical constants, newspaper articles and almanacs, scientific computations, and many areas of accounting and demographic data (see [a1], [a5], [a6], [a7]). These observations have led to many mathematical derivations based on combinatorial (e.g., [a2]), analytic ([a3], [a8]), and various urn-scheme arguments, among others (see [a7] for a review of these ideas).

Benford's law ${\mathsf P}$ can also be characterized by several invariance properties, such as the following two. Say that a probability measure ${\widehat {\mathsf P} }$ on the mantissa space $( \mathbf R ^ {+} , {\mathcal M} )$ is scale-invariant if ${\widehat {\mathsf P} } ( sS ) = {\widehat {\mathsf P} } ( S )$ for every $S \in {\mathcal M}$ and $s > 0$, and is base-invariant if ${\widehat {\mathsf P} } ( S ^ { {1 / n } } ) = {\widehat {\mathsf P} } ( S )$ for every $S \in {\mathcal M}$ and $n \in \mathbf N$. With ${\mathsf P}$ denoting the logarithmic probability distribution given in 1)–4), the following hold (see [a4]):

${\mathsf P}$ is the unique probability measure on $( \mathbf R ^ {+} , {\mathcal M} )$ which is scale-invariant;

${\mathsf P}$ is the unique atomless probability measure on $( \mathbf R ^ {+} , {\mathcal M} )$ which is base-invariant.
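The invariance (though not the uniqueness) asserted in these two statements can be checked by simulation. The Python sketch below is an illustration under stated assumptions, not part of the original article: it draws $X = 10^{U}$ with $U$ uniform on $[0,1)$, which has the mantissa distribution 4), and compares the first-digit frequencies of $sX$ and of $X^{n}$ with the first digit law 1). It uses the facts that ${\mathsf P} ( sX \in S ) = {\mathsf P} ( X \in s^{-1} S )$ and that $X^{n} \in S$ exactly when $X \in S^{1/n}$. The sample size, the seed, and the particular values of $s$ and $n$ are arbitrary choices.

```python
import math
import random

BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def first_digit_of_power_of_ten(u):
    """First significant digit of 10**u; only the fractional part of u matters."""
    frac = u - math.floor(u)
    # min() guards against frac rounding up to 1.0 in floating point.
    return min(9, int(10 ** frac))

def first_digit_frequencies(log_values):
    """Relative frequencies of the first digits 1..9 of the numbers 10**u."""
    counts = [0] * 10
    for u in log_values:
        counts[first_digit_of_power_of_ten(u)] += 1
    return [c / len(log_values) for c in counts[1:]]

def max_deviation(freqs):
    """Largest absolute gap between observed frequencies and law 1)."""
    return max(abs(f - p) for f, p in zip(freqs, BENFORD))

rng = random.Random(0)  # fixed seed, so that the run is reproducible
log_x = [rng.random() for _ in range(100_000)]  # log10 of X = 10**U

# Scale invariance: log10(s * X) = U + log10(s).
scaled = {
    s: max_deviation(first_digit_frequencies([u + math.log10(s) for u in log_x]))
    for s in (2.0, 3.7, 0.05)
}

# Base invariance: log10(X**n) = n * U.
powered = {
    n: max_deviation(first_digit_frequencies([n * u for u in log_x]))
    for n in (2, 3)
}
```

In both dictionaries the deviations should be of the order of the sampling error, roughly $10^{-3}$ for this sample size. The reason is visible in the code: on the logarithmic scale ${\mathsf P}$ is the uniform distribution on $[0,1)$, rescaling is a translation modulo $1$, and taking powers is multiplication by an integer modulo $1$, and both maps preserve the uniform distribution.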

A statistical derivation of Benford's law in the form of a central limit-like theorem (cf., e.g., Central limit theorem) characterizes ${\mathsf P}$ as the unique limit of the significant-digit frequencies of a sequence of random variables generated as follows. First, pick probability distributions at random, and then take random samples (independent, identically distributed random variables) from each of these distributions. If the overall process is scale- or base-neutral (see [a5]), the frequencies of occurrence of the significant digits approach the Benford frequencies 1)–4) in the limit almost surely (i.e., with probability one; cf. also Convergence, almost-certain).
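The two-stage sampling scheme just described can be imitated in a toy simulation. The Python sketch below is only an illustration under assumptions introduced here: the family of distributions (lognormal), the ranges of its parameters, and the function name are all hypothetical, and the choice of a location spread uniformly over whole decades is a simple way of giving the pooled sample no preferred scale, not the precise neutrality hypothesis of [a5].

```python
import math
import random

BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def pooled_first_digit_frequencies(num_distributions, samples_each, seed=0):
    """First-digit frequencies of samples pooled from randomly chosen distributions."""
    rng = random.Random(seed)
    counts = [0] * 10
    for _ in range(num_distributions):
        # Stage 1: pick a distribution at random (assumed lognormal family).
        location = rng.uniform(0.0, 3.0)  # mean of log10(X), spread over 3 decades
        spread = rng.uniform(0.1, 1.0)    # standard deviation of log10(X)
        # Stage 2: draw an i.i.d. sample from the chosen distribution.
        for _ in range(samples_each):
            log_x = rng.gauss(location, spread)
            frac = log_x - math.floor(log_x)
            # min() guards against frac rounding up to 1.0 in floating point.
            counts[min(9, int(10 ** frac))] += 1
    total = num_distributions * samples_each
    return [c / total for c in counts[1:]]

freqs = pooled_first_digit_frequencies(num_distributions=4000, samples_each=25)
worst_gap = max(abs(f - p) for f, p in zip(freqs, BENFORD))
```

Under these assumptions the pooled frequencies `freqs` should lie close to the Benford frequencies in `BENFORD`, even though an individual distribution with a small `spread` is far from logarithmic on its own.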

There is nothing special about the decimal base in 1)–4), and the analogue of Benford's law 4) for general bases $b > 1$ is simply

$${ \mathop{\rm Prob} } \left ( { \mathop{\rm mantissa} } ( { \mathop{\rm base} } b ) \leq { \frac{t}{b} } \right ) = { \mathop{\rm log} } _ {b} t$$

for all $t \in [ 1,b )$.
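For an integer base $b \geq 2$, the event that the first base-$b$ digit equals $d$ is the event that the base-$b$ mantissa lies in $[ d/b, (d+1)/b )$, so the formula above gives the first-digit probabilities $\log_{b} ( d+1 ) - \log_{b} d = \log_{b} ( 1 + d^{-1} )$ for $d = 1 \dots b-1$. The short Python sketch below evaluates them. The function name is hypothetical, and the restriction to integer bases is an assumption made here so that digits are well defined.

```python
import math

def first_digit_prob_base(d, base=10):
    """Probability that the first base-`base` significant digit equals d."""
    if not isinstance(base, int) or base < 2:
        raise ValueError("base must be an integer >= 2")
    if not 1 <= d <= base - 1:
        raise ValueError("the first digit must lie in 1..base-1")
    # Difference of the mantissa law at t = d + 1 and t = d.
    return math.log(1 + 1 / d, base)
```

In base $2$ the only possible first digit is $1$, and the function returns probability $1$ for it. In base $10$ it reduces to the first digit law 1).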

Benford's law has been applied to the design of computers, mathematical modelling, and the detection of fraud in accounting data (see [a5], [a7]).

How to Cite This Entry:
Benford law. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Benford_law&oldid=46011
This article was adapted from an original article by T. Hill (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.