Huffman code
An algorithm for the construction of optimal (i.e., minimum-redundancy) variable-length $D$-ary codes ($D \geq 2$) for finite memoryless information sources with known statistics was proposed by D.A. Huffman in 1952 (see [a1], [a2]). (The algorithm refines the idea behind the Morse alphabet, as it associates the shortest codewords to the most probable source letters.)
Let $S$ be such a source with $n$ letters, $n \geq 2$, and let $p_i$ be the probability that the source emits letter $s_i$ ($1 \leq i \leq n$). Let $C$ be a uniquely decipherable $D$-ary code for $S$ (i.e., a set of $n$ sequences $c_1, \ldots, c_n$ from an alphabet of $D$ symbols) and let $\ell_i$ be the length (or number of $D$-ary symbols) of codeword $c_i$ ($1 \leq i \leq n$) associated to the letter $s_i$. The difference

$$ R(C, S) = \sum_{i=1}^{n} p_i \ell_i - H(S) $$

between the average length of $C$ and the source entropy (cf. also Entropy) $H(S) = -\sum_{i=1}^{n} p_i \log_D p_i$ is called the redundancy of $C$. A Huffman code $C_H$ for $S$ minimizes $R(C, S)$ (hence the average length $\sum_{i=1}^{n} p_i \ell_i$) over the set of all uniquely decipherable codes for $S$.
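For instance, for a binary source with probability distribution $P = (0.4, 0.3, 0.2, 0.1)$, a Huffman code (constructed by the algorithm described below) has codeword lengths $(1, 2, 3, 3)$, and

$$ \sum_{i=1}^{4} p_i \ell_i = 1.9, \qquad H(S) = -\sum_{i=1}^{4} p_i \log_2 p_i \approx 1.846, \qquad R(C_H, S) \approx 0.054 $$

binary symbols per source letter.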
In the binary case ($D = 2$), the algorithm consists in the construction of a sequence of "reduced" sources $S \equiv S_n, S_{n-1}, \ldots, S_1$, where $S_{j-1}$ is obtained from $S_j$ by merging the two least probable letters of $S_j$ into one letter of $S_{j-1}$ ($2 \leq j \leq n$) and leaving all other letters unchanged (ties are broken arbitrarily). The probability of the new letter is the sum of the probabilities of the merged letters. The $n - 1$ mergings can be represented by a binary code tree in which each letter of $S$ corresponds to a leaf and each codeword is the sequence of binary labels of the path from root to leaf. (If the code alphabet has $D > 2$ letters, each reduced source is obtained by merging the $D$ least probable letters of the previous source, except possibly in the first step, where $2 + r$ letters have to be grouped together, $r$ being the remainder of the division of $n - 2$ by $D - 1$. The resulting $D$-ary code tree is labeled and again the codewords are the sequences of labels from root to leaves.)
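A minimal sketch of the binary case in Python may clarify the merging procedure; the function name huffman_lengths and its interface are illustrative (not taken from [a1]), and the sketch returns only the codeword lengths $\ell_i$, which is all that is needed to compute the redundancy:

```python
import heapq
from itertools import count

def huffman_lengths(probs):
    """Binary Huffman algorithm (D = 2): repeatedly merge the two least
    probable letters of the current reduced source and return the codeword
    lengths l_i, one per input letter.  Ties are broken arbitrarily
    (here: by insertion order).  For D > 2 the first step would instead
    group 2 + r letters, as explained above."""
    assert len(probs) >= 2          # the source is assumed to have n >= 2 letters
    tie = count()                   # tie-breaker, so tuples never compare the index lists
    # each heap entry: (probability, tie-breaker, indices of the original letters merged so far)
    heap = [(p, next(tie), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, a = heapq.heappop(heap)      # least probable letter
        p2, _, b = heapq.heappop(heap)      # second least probable letter
        for i in a + b:                     # the merge adds one binary symbol
            lengths[i] += 1                 # to every codeword below the new node
        heapq.heappush(heap, (p1 + p2, next(tie), a + b))
    return lengths

print(huffman_lengths([0.4, 0.3, 0.2, 0.1]))    # -> [1, 2, 3, 3]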
Owing to the arbitrary resolution of possible ties in grouping the least probable letters at each step, for a given source there may be several distinct Huffman codes, but two such codes are considered (essentially) different only if the corresponding sets $\{\ell_1, \ldots, \ell_n\}$ of codeword lengths do not coincide. However, the average length of all Huffman codes for $S$ is the same.
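For example, for the probability distribution $(0.4, 0.2, 0.2, 0.1, 0.1)$, different resolutions of the ties lead either to the codeword lengths $(2, 2, 2, 3, 3)$ or to $(1, 2, 3, 4, 4)$; the two Huffman codes are essentially different, but both have average length

$$ 0.4 \cdot 2 + 0.2 \cdot 2 + 0.2 \cdot 2 + 0.1 \cdot 3 + 0.1 \cdot 3 \;=\; 0.4 \cdot 1 + 0.2 \cdot 2 + 0.2 \cdot 3 + 0.1 \cdot 4 + 0.1 \cdot 4 \;=\; 2.2 . $$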
More formally, consider the set $\mathcal{P}_n$ of all $n$-order probability distributions (cf. Probability distribution) $Q = (q_1, \ldots, q_n)$, $q_i \geq 0$, $\sum_{i=1}^{n} q_i = 1$ ($1 \leq i \leq n$). The informational divergence (also called the Kullback–Leibler information) between two probability distributions $P$, $Q$ in $\mathcal{P}_n$ is the non-negative quantity

$$ D(P \| Q) = \sum_{i=1}^{n} p_i \log \frac{p_i}{q_i} . $$

$D(P \| Q)$ vanishes if and only if $P = Q$, hence it can be considered as an oriented pseudo-distance (cf. also Pseudo-metric) in $\mathcal{P}_n$. Now, if $\{\ell_1, \ldots, \ell_n\}$ is the set of codeword lengths of a Huffman code $C_H$ for $S$, the redundancy $R(C_H, S)$ is equal to the informational divergence $D(P \| Q_H)$ between the source probability distribution $P = (p_1, \ldots, p_n)$ and the "dyadic" probability distribution $Q_H = (2^{-\ell_1}, \ldots, 2^{-\ell_n})$.
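Indeed, a binary Huffman code tree is complete, so that $\sum_{i=1}^{n} 2^{-\ell_i} = 1$ and $Q_H$ is a genuine probability distribution; taking logarithms to the base $2$,

$$ D(P \| Q_H) = \sum_{i=1}^{n} p_i \log_2 \frac{p_i}{2^{-\ell_i}} = \sum_{i=1}^{n} p_i \ell_i + \sum_{i=1}^{n} p_i \log_2 p_i = \sum_{i=1}^{n} p_i \ell_i - H(S) = R(C_H, S) . $$

For the source $(0.4, 0.3, 0.2, 0.1)$ used above, $Q_H = (1/2, 1/4, 1/8, 1/8)$ and $D(P \| Q_H) \approx 0.054$, the redundancy computed earlier.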
The optimality of Huffman codes can now be restated by saying that to any $P$ (i.e., to any source $S$), the algorithm associates (one of) the dyadic probability distribution(s) "closest" to $P$ in the sense of the pseudo-metric $D(\cdot \| \cdot)$.
In several practical situations the statistics of the source are not perfectly known or vary over time, and the necessity arises of updating the Huffman code when the new(ly known) $P$ becomes "closer" to a dyadic probability distribution different from the previous one (see [a3], [a4]).
See also Coding and decoding; Information theory.
References
[a1] D.A. Huffman, "A method for the construction of minimum redundancy codes", Proc. I.R.E. 40 (1952), pp. 1098–1101
[a2] R. Gallager, "Variations on a theme by Huffman", IEEE Trans. Inform. Theory IT-24 (1978), pp. 668–674
[a3] G. Longo, G. Galasso, "An application of informational divergence to Huffman codes", IEEE Trans. Inform. Theory IT-28 (1982), pp. 36–43
[a4] D.A. Lelewer, D.S. Hirschberg, "Data compression", ACM Comput. Surv. 19 (1987), pp. 261–296
Huffman code. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Huffman_code&oldid=18372