Benford law
significant-digit law, first-digit law
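A probability distribution on the significant digits of real numbers, named after one of the early researchers, F. Benford [a1]. Letting $D_1, D_2, \ldots$ denote the (base-$10$) significant digit functions (on $\mathbf{R} \setminus \{0\}$), i.e., $D_i(x)$ is the $i$-th significant digit of $x$
(so, e.g., $D_1(0.0304) = D_1(304) = 3$, $D_2(0.0304) = D_2(304) = 0$, etc.), Benford's law is the logarithmic probability distribution $P$ given by
1) $P(D_1 = d) = \log_{10}(1 + d^{-1})$, $d = 1, \ldots, 9$ (first digit law)
2) $P(D_2 = d) = \sum_{k=1}^{9} \log_{10}(1 + (10k + d)^{-1})$, $d = 0, \ldots, 9$ (second digit law)
3) $P(D_1 = d_1, \ldots, D_k = d_k) = \log_{10}\left[1 + \left(\sum_{i=1}^{k} d_i \cdot 10^{k-i}\right)^{-1}\right]$ (general digit law)
for all $k \in \mathbf{N}$, $d_1 \in \{1, \ldots, 9\}$ and $d_j \in \{0, \ldots, 9\}$, $j = 2, \ldots, k$.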
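An alternate form of the general law 3) is
4) $P(\text{mantissa} \le t) = \log_{10} t$ for all $t \in [1, 10)$. Here, the mantissa (base $10$) of a positive real number $x$ is the real number $r \in [1, 10)$ with $x = r \cdot 10^{n}$ for some $n \in \mathbf{Z}$; e.g., the mantissas of both $304$ and $0.0304$ are $3.04$.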
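The digit laws 1)–3) and the mantissa in 4) are easy to evaluate numerically. The following is a minimal Python sketch; the function names are illustrative choices made here, not notation from the references, and the mantissa helper relies on floating-point logarithms, so inputs extremely close to an exact power of ten may need extra care.

```python
import math

def first_digit_prob(d):
    """P(D_1 = d) from law 1), for d in 1..9."""
    return math.log10(1 + 1 / d)

def second_digit_prob(d):
    """P(D_2 = d) from law 2), for d in 0..9."""
    return sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))

def leading_digits_prob(digits):
    """P(D_1 = d_1, ..., D_k = d_k) from law 3); digits[0] must be nonzero."""
    # The sum d_1*10^(k-1) + ... + d_k is just the integer spelled by the digits.
    n = int("".join(str(d) for d in digits))
    return math.log10(1 + 1 / n)

def mantissa(x):
    """Base-10 mantissa r in [1, 10) with x = r * 10**n, for x > 0."""
    n = math.floor(math.log10(x))
    return x / 10 ** n

first = [first_digit_prob(d) for d in range(1, 10)]
second = [second_digit_prob(d) for d in range(0, 10)]

print("first digit :", [round(p, 4) for p in first])
print("second digit:", [round(p, 4) for p in second])
print("P(D_1=3, D_2=0, D_3=4) =", leading_digits_prob([3, 0, 4]))
print("mantissa(304), mantissa(0.0304) =", mantissa(304), mantissa(0.0304))
```

Both lists of probabilities sum to 1, and the first digit is far from uniform: a leading 1 has probability $\log_{10} 2 \approx 0.301$, while a leading 9 has probability about $0.046$. Summing law 3) over the second digit recovers law 1), which gives a quick consistency check.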
More formally, the logarithmic probability measure $P$ in 1)–4) is defined on the measurable space $(\mathbf{R}^{+}, \mathcal{M})$, where $\mathbf{R}^{+}$ is the set of positive real numbers and $\mathcal{M}$ is the (base-$10$) mantissa sigma-algebra, i.e., the sub-sigma-algebra of the Borel $\sigma$-algebra generated by the significant digit functions $\{D_i\}$ (or, equivalently, generated by the single function $x \mapsto \operatorname{mantissa}(x)$). In some combinatorial and number-theoretic treatises of Benford's law, $\mathbf{R}^{+}$ is replaced by $\mathbf{N}$, and $P$ by a finitely-additive probability measure defined on all subsets of $\mathbf{N}$.
Empirical evidence of Benford's law in numerical data has appeared in a wide variety of contexts, including tables of physical constants, newspaper articles and almanacs, scientific computations, and many areas of accounting and demographic data (see [a1], [a5], [a6], [a7]), and these observations have led to many mathematical derivations based on combinatorial (e.g., [a2]), analytic ([a3], [a8]), and various urn-scheme arguments, among others (see [a7] for a review of these ideas).
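Benford's law can also be characterized by several invariance properties, such as the following two. Say that a probability measure $\hat{P}$ on the mantissa space $(\mathbf{R}^{+}, \mathcal{M})$ is scale-invariant if $\hat{P}(sS) = \hat{P}(S)$ for every $s > 0$ and $S \in \mathcal{M}$, and is base-invariant if $\hat{P}(S^{1/n}) = \hat{P}(S)$ for every $n \in \mathbf{N}$ and $S \in \mathcal{M}$. Letting $P$ denote the logarithmic probability distribution given in 1)–4), then (see [a4])
$P$ is the unique probability on $(\mathbf{R}^{+}, \mathcal{M})$ which is scale-invariant;
$P$ is the unique atomless probability on $(\mathbf{R}^{+}, \mathcal{M})$ which is base-invariant.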
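Scale-invariance can be illustrated numerically. The sketch below is only an illustration under stated assumptions, not a proof and not the argument of [a4]: it represents the logarithmic distribution by a deterministic grid whose base-10 logarithms are evenly spaced in $[0, 1)$, rescales it by a few arbitrarily chosen factors, and compares first-digit frequencies with law 1). For contrast it does the same with a grid whose mantissa is uniform on $[1, 10)$, which is not scale-invariant. All names and the grid size are choices made here.

```python
import math

BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def first_digit(x):
    """First significant base-10 digit of x > 0, read off scientific notation."""
    return int(f"{x:.17e}"[0])

def first_digit_freqs(values):
    """Relative frequencies of first digits 1..9 (index 0 is digit 1)."""
    counts = [0] * 10
    for v in values:
        counts[first_digit(v)] += 1
    return [c / len(values) for c in counts[1:]]

def max_gap_from_benford(values):
    """Largest absolute deviation of first-digit frequencies from law 1)."""
    freqs = first_digit_freqs(values)
    return max(abs(f - b) for f, b in zip(freqs, BENFORD))

N = 100_000
# Grid whose log10 is evenly spaced on [0, 1): a stand-in for the logarithmic law.
log_sample = [10 ** ((i + 0.5) / N) for i in range(N)]
# Grid whose mantissa is evenly spaced on [1, 10): a non-invariant contrast.
uniform_sample = [1 + 9 * (i + 0.5) / N for i in range(N)]

for s in (1.0, 2.0, math.pi, 0.037):
    gap_log = max_gap_from_benford([s * x for x in log_sample])
    p1_uniform = first_digit_freqs([s * x for x in uniform_sample])[0]
    print(f"s = {s:<8.5g} gap from Benford = {gap_log:.1e}   "
          f"P(first digit 1) under rescaled uniform mantissa = {p1_uniform:.3f}")
```

For the logarithmic grid the gap stays at the level of the grid resolution for every scale factor, whereas for the uniform-mantissa grid the frequency of a leading 1 moves from about $1/9$ at $s = 1$ to about $5/9$ at $s = 2$.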
A statistical derivation of Benford's law in the form of a central limit-like theorem (cf., e.g., Central limit theorem) characterizes $P$ as the unique limit of the significant-digit frequencies of a sequence of random variables generated as follows. First, pick probability distributions at random, and then take random samples (independent, identically distributed random variables) from each of these distributions. If the overall process is scale- or base-neutral (see [a5]), the frequencies of occurrence of the significant digits approach the Benford frequencies 1)–4) in the limit almost surely (i.e., with probability one; cf. also Convergence, almost-certain).
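The two-stage sampling scheme can be imitated in a small simulation. The sketch below is a toy illustration only: the three distribution families, the way a scale is drawn (log-uniformly over whole decades, chosen here so that the pooled process does not favour any particular scale), the sample sizes and the function names are all assumptions made for this example, and the precise neutrality conditions are those stated in [a5], not the ones used here.

```python
import math
import random

def first_digit(x):
    """First significant base-10 digit of x > 0, read off scientific notation."""
    return int(f"{x:.17e}"[0])

def first_digit_freqs(values):
    """Relative frequencies of first digits 1..9 (index 0 is digit 1)."""
    counts = [0] * 10
    for v in values:
        counts[first_digit(v)] += 1
    return [c / len(values) for c in counts[1:]]

def sample_from_random_distributions(num_distributions, samples_each, rng):
    """Stage 1: pick a distribution at random. Stage 2: draw i.i.d. samples from it."""
    data = []
    for _ in range(num_distributions):
        # Assumed toy choice: a random family with a random scale, where
        # log10(scale) is uniform over six whole decades.
        family = rng.choice(["exponential", "uniform", "lognormal"])
        scale = 10 ** rng.uniform(-3, 3)
        for _ in range(samples_each):
            if family == "exponential":
                x = rng.expovariate(1.0)
            elif family == "uniform":
                x = rng.uniform(0.0, 1.0)
            else:
                x = rng.lognormvariate(0.0, 1.0)
            if x > 0:  # skip the (practically impossible) exact zero
                data.append(scale * x)
    return data

rng = random.Random(2024)  # fixed seed so the run is reproducible
pooled = sample_from_random_distributions(2000, 20, rng)
freqs = first_digit_freqs(pooled)

for d, f in zip(range(1, 10), freqs):
    print(f"digit {d}: observed {f:.3f}   Benford {math.log10(1 + 1 / d):.3f}")
```

None of the three families follows Benford's law on its own at a fixed scale; in this toy setup it is the pooling over randomly chosen distributions that brings the observed frequencies close to 1).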
There is nothing special about the decimal base in 1)–4), and the analogue of Benford's law 4) for general bases $b > 1$ is simply
$P(\text{mantissa (base } b) \le t) = \log_b t$ for all $t \in [1, b)$.
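For an integer base $b \ge 2$, the base-$b$ form of the law yields a first-digit law in the same way as in base 10: the first base-$b$ digit equals $d$ exactly when the base-$b$ mantissa lies in $[d, d+1)$, which has probability $\log_b (d+1) - \log_b d = \log_b (1 + d^{-1})$ for $d = 1, \ldots, b-1$. The short Python sketch below evaluates this; the function names are illustrative, and the mantissa helper includes a small correction step because floating-point logarithms can be off by one unit near exact powers of $b$.

```python
import math

def first_digit_prob_base(d, b):
    """P(first base-b digit = d) = log_b(1 + 1/d), for d in 1..b-1."""
    return math.log(1 + 1 / d, b)

def mantissa_base(x, b):
    """Base-b mantissa r in [1, b) with x = r * b**n, for x > 0."""
    n = math.floor(math.log(x, b))
    r = x / b ** n
    # Correct for rounding in the logarithm when x is very close to a power of b.
    if r < 1:
        r *= b
    elif r >= b:
        r /= b
    return r

for b in (2, 8, 10, 16):
    probs = [first_digit_prob_base(d, b) for d in range(1, b)]
    print(f"base {b:>2}: P(first digit = 1) = {probs[0]:.4f}, total = {sum(probs):.4f}")

print("base-2 mantissa of 40:", mantissa_base(40, 2))
```

In base 2 the first significant digit is always 1, and the formula indeed gives probability $\log_2 2 = 1$; in every base the probabilities telescope to a total of 1.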
Applications of Benford's law have been given to the design of computers, mathematical modelling, and the detection of fraud in accounting data (see [a5], [a7]).
References
[a1] F. Benford, "The law of anomalous numbers" Proc. Amer. Philos. Soc., 78 (1938) pp. 551–572
[a2] D. Cohen, "An explanation of the first digit phenomenon" J. Combinatorial Th. A, 20 (1976) pp. 367–370
[a3] P. Diaconis, "The distribution of leading digits and uniform distribution mod 1" Ann. of Probab., 5 (1977) pp. 72–81
[a4] T. Hill, "Base-invariance implies Benford's law" Proc. Amer. Math. Soc., 123 (1995) pp. 887–895
[a5] T. Hill, "A statistical derivation of the significant-digit law" Statistical Sci., 10 (1995) pp. 354–363
[a6] S. Newcomb, "Note on the frequency of use of different digits in natural numbers" Amer. J. Math., 4 (1881) pp. 39–40
[a7] R. Raimi, "The first digit problem" Amer. Math. Monthly, 83 (1976) pp. 521–538
[a8] P. Schatte, "On mantissa distributions in computing and Benford's law" J. Inform. Process. Cybern., 24 (1988) pp. 443–455
Benford law. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Benford_law&oldid=46011