# Pattern recognition

A branch of mathematical cybernetics devising principles and methods for the classification and identification of objects, phenomena, processes, signals, and situations, i.e. of all those objects that can be described by a finite set of features or properties characterizing the object.

A description of an object (a sample) is an \$n\$-dimensional vector, where \$n\$ is the number of features used to characterize the object and the \$i\$-th coordinate of the vector is equal to the value of the \$i\$-th feature, \$i=1,\ldots,n\$. It is permissible for a description to contain no information about the values of some of the features. If it is necessary to classify given objects into several classes (patterns) solely on the basis of their descriptions, where the number of classes need not be specified, then the problem of recognition is called a taxonomy problem (cluster analysis, learning without a teacher, self-education). For the proper problems of pattern recognition (learning with a teacher), apart from the descriptions of the objects additional information is needed about the inclusion of some of these objects into one class (pattern) or another. The number of classes is finite and specified. They may overlap.

A set of descriptions of objects for which the patterns to which they belong are known constitutes a so-called training sequence (design set). The fundamental problem of pattern recognition is to determine, on the basis of a training sequence, the class which the sample to be classified or identified belongs to. Any problem of decision making can be reduced to such a scheme as long as the process of decision making is based mainly on the analysis of previously-gained experience.

The applied problems that can be solved by methods of pattern recognition arise in the identification of typed- and handwritten texts, the identification of photographic images, in automatic speech recognition, in medical diagnostics, in geological prediction, in the prediction of properties of chemical compounds, in the evaluation of situations in economics, politics and commercial production, in the classification of sociological data, etc. For solving such problems a great number of so-called heuristic recognition algorithms have been accumulated, each oriented to specific properties of a problem. Moreover, models of recognition algorithms have been constructed, based on certain intuitive principles, i.e. families of algorithms for solving classification problems. The most widely used are: models based on the potential principle; models based on the partition principle, specifying a class of surfaces discriminating the patterns; models for calculating estimates (voting models); as well as structural and statistical models.

At the model level the problem consists of finding a recognition algorithm (a model for calculating estimates) of extremal quality. The quality of a recognition algorithm is generally judged by the results it produces on a certain test set of objects (a test sequence) for which the true classification is a priori known. In developing the general theory of recognition algorithms the most complete results have been obtained within an algebraic framework. A recognition algorithm is represented by a composition of a recognition operator and a decision rule. Introducing operations of addition, multiplication and multiplication by a scalar for recognition operators allows one to prove that a recognition algorithm of extremal quality on any test sequence exists within some algebraic extension of the initial set of recognition operators.

The problems of pattern recognition also include problems of data reduction in the descriptions of the initial objects and the selection of relevant informative features.

How to Cite This Entry:
Pattern recognition. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Pattern_recognition&oldid=31812
This article was adapted from an original article by P.P. Kol'tsov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article