A neural network is a network of many simple processors (units), each possibly having a small amount of local memory. The units are connected by communication channels which usually carry numeric (as opposed to symbolic) data, encoded by any of various means (cf. also Communication channel; Network). The units operate only on their local data and on the inputs they receive via the connections.
Some neural networks are models of biological neural networks and some are not. Historically, however, much of the inspiration for the field came from the desire to produce artificial systems capable of sophisticated, perhaps intelligent, computations similar to those that the human brain routinely performs, and thereby possibly to enhance our understanding of the human brain.
Most neural networks have some sort of training rule whereby the weights of connections are adjusted on the basis of data. In other words, neural networks learn from examples and exhibit some capability for generalization beyond the training data. The restriction to local operations is often relaxed during training.
There are two main ways of training a neural network: supervised and unsupervised. In supervised learning, the output of the network is compared directly with known correct answers. In unsupervised learning, the only available information is in the correlations of the input data.
Two of the most popular neural networks are the feedforward network and the self-organizing feature map (see below). For good general introductions to neural networks, see [a1], [a3].
Feedforward network.
The simple processors in a feedforward network apply a transfer function to a weighted sum of their inputs. If one views the incoming signals and the weights on the incoming connections as vectors, then the weighted sum is just the inner product of these two vectors. The transfer function is often a sigmoidal function like $f(x)=\tanh(x)$ or a hard-limiting transfer function like the sign-function. Transfer functions are needed to introduce non-linearity into the network. With at least one layer of hidden units, and sufficiently many such units, the resulting networks are universal function approximators.
The parameters of the feedforward network are the weights on the connections. These are fitted to training examples given as input/output pairs. As error criterion one can take the sum of squared errors over all the examples and all the outputs, where an error is the difference between the output of the network and the correct output specified by the corresponding input/output pair. The so-called backpropagation learning rule is obtained by applying the gradient-descent method to this error criterion; this requires a differentiable transfer function, such as a sigmoidal one.
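The description above can be made concrete with a minimal sketch in Python, using only the standard library. It is an illustration under stated assumptions, not a reference implementation: the network has one hidden layer, every unit uses $f(x)=\tanh(x)$, and the weights are fitted by plain batch gradient descent on the sum of squared errors. The function names (`forward`, `error`, `gradients`, `train`), the toy data set, the learning rate and the number of steps are all hypothetical choices made for this example.

```python
import math
import random

def forward(x, W1, W2):
    """Return (hidden activations, outputs) for the input vector x."""
    # Each unit applies tanh to the inner product of its weights and its inputs.
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    y = [math.tanh(sum(w * hj for w, hj in zip(row, h))) for row in W2]
    return h, y

def error(data, W1, W2):
    """Sum of squared errors over all examples and all outputs."""
    total = 0.0
    for x, t in data:
        _, y = forward(x, W1, W2)
        total += sum((yk - tk) ** 2 for yk, tk in zip(y, t))
    return total

def gradients(data, W1, W2):
    """Gradient of the error with respect to W1 and W2 (backpropagation)."""
    g1 = [[0.0] * len(row) for row in W1]
    g2 = [[0.0] * len(row) for row in W2]
    for x, t in data:
        h, y = forward(x, W1, W2)
        # Output deltas: 2 (y_k - t_k) (1 - y_k^2), because tanh' = 1 - tanh^2.
        d_out = [2.0 * (yk - tk) * (1.0 - yk * yk) for yk, tk in zip(y, t)]
        # Hidden deltas: the output deltas propagated back through W2.
        d_hid = [(1.0 - h[j] * h[j]) *
                 sum(d_out[k] * W2[k][j] for k in range(len(W2)))
                 for j in range(len(h))]
        for k, dk in enumerate(d_out):
            for j, hj in enumerate(h):
                g2[k][j] += dk * hj
        for j, dj in enumerate(d_hid):
            for i, xi in enumerate(x):
                g1[j][i] += dj * xi
    return g1, g2

def train(data, W1, W2, rate=0.02, steps=3000):
    """Batch gradient descent; modifies W1 and W2 in place."""
    for _ in range(steps):
        g1, g2 = gradients(data, W1, W2)
        for W, g in ((W1, g1), (W2, g2)):
            for r, row in enumerate(W):
                for c in range(len(row)):
                    row[c] -= rate * g[r][c]

# Toy input/output pairs (an assumption of this sketch): an XOR-like target
# in {-1, +1}; the constant third input plays the role of a bias.
DATA = [([-1.0, -1.0, 1.0], [-1.0]), ([-1.0, 1.0, 1.0], [1.0]),
        ([1.0, -1.0, 1.0], [1.0]), ([1.0, 1.0, 1.0], [-1.0])]

rng = random.Random(0)
W1 = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]  # 3 hidden units
W2 = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(1)]  # 1 output unit

before = error(DATA, W1, W2)
train(DATA, W1, W2)
after = error(DATA, W1, W2)
print(f"error before training: {before:.3f}, after: {after:.3f}")
```

The hard-limiting sign-function is deliberately not used here, since its derivative vanishes almost everywhere and gradient descent would make no progress. How far the error falls depends on the initial weights, the learning rate and the number of steps.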
Self-organizing feature map.
The self-organizing feature map [a2] takes as inputs elements from a high-dimensional vector space. The learning algorithm of the self-organizing feature map finds a balance between two goals:
1) for each processing element it finds a prototype vector in the input space such that these vectors together model the input space;
2) there is a distance function on the processing elements, and if there is a small distance between two processing elements, then there should also be a small distance between their prototype vectors.
One of the main applications of the self-organizing feature map is to visualize high-dimensional data. The map projects the data by sending each input to the processing element whose prototype vector is closest to that input.
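A minimal sketch of such a map in Python (standard library only) may help to fix ideas. It makes several assumptions that are not part of the description above: the processing elements form a one-dimensional chain, the distance between two elements is the difference of their positions in the chain, the neighbourhood is Gaussian, and the learning rate and neighbourhood radius decay linearly. The names `closest_unit` and `train_som`, the parameter values and the toy data are hypothetical.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def closest_unit(prototypes, x):
    """Index of the processing element whose prototype vector is closest to x."""
    return min(range(len(prototypes)), key=lambda i: sq_dist(prototypes[i], x))

def train_som(data, n_units, steps=2000, rate0=0.5, seed=0):
    """Train a chain of n_units elements; returns their prototype vectors."""
    rng = random.Random(seed)
    radius0 = n_units / 2.0
    # Start every prototype at a randomly chosen data point.
    prototypes = [list(rng.choice(data)) for _ in range(n_units)]
    for step in range(steps):
        frac = 1.0 - step / steps            # decays linearly from 1 towards 0
        rate = rate0 * frac
        radius = max(radius0 * frac, 0.5)
        x = rng.choice(data)
        winner = closest_unit(prototypes, x)
        for i, p in enumerate(prototypes):
            # Distance function on the processing elements: position in the chain.
            grid_dist = abs(i - winner)
            influence = math.exp(-(grid_dist ** 2) / (2.0 * radius ** 2))
            # Goal 1: the winner's prototype moves towards the input.
            # Goal 2: elements near the winner move too, so that nearby
            # elements end up with nearby prototype vectors.
            for d in range(len(p)):
                p[d] += rate * influence * (x[d] - p[d])
    return prototypes

# Toy data (an assumption of this sketch): three clusters in 3-dimensional space.
data_rng = random.Random(1)
DATA = [[data_rng.gauss(c, 0.1) for _ in range(3)]
        for c in (0.0, 1.0, 2.0) for _ in range(20)]

protos = train_som(DATA, n_units=5)
# Projection: each input is mapped to the index of its closest prototype.
projection = [closest_unit(protos, x) for x in DATA]
print(projection)
```

Each update is a convex combination of the old prototype and the input, so the prototypes never leave the region spanned by the data. A two-dimensional grid of elements, which is more common for visualization, would only change how `grid_dist` is computed.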
See also Hebb rule.
[a1] J. Hertz, A. Krogh, R. Palmer, "Introduction to the theory of neural computation", Addison-Wesley (1991)
[a2] T. Kohonen, "Self-organization and associative memory", Springer (1989)
[a3] S. Haykin, "Neural networks, a comprehensive foundation", Macmillan (1994)
Neural network. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Neural_network&oldid=31774