# Histogram

A method for representing experimental data. A histogram is constructed as follows. The entire range of the observed values $X_1, \dots, X_n$ of some random variable $X$ is subdivided into $k$ grouping intervals (usually all of equal length) by points $x_1 < \dots < x_{k+1}$; the number of observations $m_i$ in each interval $[x_i, x_{i+1})$ and the frequency $h_i = m_i/n$ are computed. The points $x_1, \dots, x_{k+1}$ are marked on the abscissa, and the segments $[x_i, x_{i+1}]$, $i = 1, \dots, k$, are taken as the bases of rectangles with heights $h_i/(x_{i+1}-x_i)$, so that the area of the $i$-th rectangle equals $h_i$. If the intervals $[x_i, x_{i+1})$ all have equal length, the heights of the rectangles may instead be taken as $h_i$ or as $m_i$.

For example, suppose that measuring the trunk diameters of 1000 firs gives the following results:

| diameter in cm | 22–27 | 27–32 | 32–37 | 37–42 | 42–52 |
|---|---|---|---|---|---|
| number of trunks | 100 | 130 | 500 | 170 | 100 |

The histogram for this example is shown in the figure.

Figure: h047450a (horizontal axis: diameter in cm; vertical axis: number of trunks)
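The construction can be checked numerically on the fir data. The following is a minimal sketch, not part of the original article: the function name `histogram_heights` and the variable names are illustrative assumptions. It computes the frequencies $h_i = m_i/n$ and the rectangle heights $h_i/(x_{i+1}-x_i)$, and confirms that the rectangles have total area 1.

```python
# Interval endpoints (cm) and trunk counts, taken from the fir data above.
edges = [22, 27, 32, 37, 42, 52]
counts = [100, 130, 500, 170, 100]


def histogram_heights(edges, counts):
    """Return the frequencies h_i = m_i / n and the heights h_i / (x_{i+1} - x_i).

    `histogram_heights` is a hypothetical helper name used only for this sketch.
    """
    n = sum(counts)
    freqs = [m / n for m in counts]
    widths = [b - a for a, b in zip(edges, edges[1:])]
    heights = [h / w for h, w in zip(freqs, widths)]
    return freqs, heights


freqs, heights = histogram_heights(edges, counts)

# With heights h_i / (x_{i+1} - x_i), the rectangle areas sum to 1.
area = sum(h * (b - a) for h, a, b in zip(heights, edges, edges[1:]))

for a, b, h, height in zip(edges, edges[1:], freqs, heights):
    print(f"[{a}, {b}): frequency {h:.3f}, height {height:.4f}")
print(f"total area: {area:.6f}")
```

Note that the last interval (42–52) is twice as long as the others, so dividing by the interval length matters here: its frequency equals that of the first interval, but its rectangle is half as tall.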

The histogram can be considered as a technique of density estimation (cf. also Density of a probability distribution), and there is an extensive literature on its properties as a statistical estimator of an unknown probability density as $n\to\infty$ and the grouping intervals are made smaller (grouping intervals of length of order $n^{-1/3}$ appear to be optimal).
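As an illustration of the histogram used as a density estimator, the sketch below builds equal-length grouping intervals of length $c\,n^{-1/3}$ for a simulated sample. It rests on several assumptions that do not come from the text: the helper name `histogram_density` is hypothetical, the scale constant $c$ defaults to the sample standard deviation purely for convenience, and the sample is drawn from a standard normal distribution.

```python
import math
import random
import statistics


def histogram_density(sample, c=None):
    """Piecewise-constant density estimate with interval length c * n**(-1/3).

    Hypothetical helper for illustration; the default choice of c (the sample
    standard deviation) is an assumption, not something the text prescribes.
    """
    n = len(sample)
    if c is None:
        c = statistics.stdev(sample)
    width = c * n ** (-1 / 3)
    lo = min(sample)
    k = max(1, math.ceil((max(sample) - lo) / width))
    counts = [0] * k
    for x in sample:
        # Clamp so that the largest observation falls in the last interval.
        i = min(int((x - lo) / width), k - 1)
        counts[i] += 1
    edges = [lo + j * width for j in range(k + 1)]
    # Height = frequency / interval length, so the estimate integrates to 1.
    heights = [m / (n * width) for m in counts]
    return edges, heights


random.seed(0)  # fixed seed so that the run is reproducible
sample = [random.gauss(0.0, 1.0) for _ in range(1000)]
edges, heights = histogram_density(sample)
width = edges[1] - edges[0]
total_area = sum(h * width for h in heights)
print(f"{len(heights)} intervals of length {width:.4f}, total area {total_area:.6f}")
```

Because the interval length shrinks like $n^{-1/3}$, the number of intervals grows roughly like $n^{1/3}$ as the sample size increases, while the number of observations per interval still grows.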