# Hebb rule

Hebbian learning

A learning rule dating back to D.O. Hebb's classic [a1], which appeared in 1949. The idea behind it is simple. Neurons of vertebrates consist of three parts: a dendritic tree, which collects the input, a soma, which can be considered as a central processing unit, and an axon, which transmits the output. Neurons communicate via action potentials or spikes, pulses of a duration of about one millisecond. If neuron $j$ emits a spike, it travels along the axon to a so-called synapse on the dendritic tree of neuron $i$, say. This takes $\tau _ {ij }$ milliseconds. The synapse has a synaptic strength, to be denoted by $J _ {ij }$. Its value, which encodes the information to be stored, is to be governed by the Hebb rule.

In [a1], p. 62, one can find the "neurophysiological postulate" that is the Hebb rule in its original form: When an axon of cell $A$ is near enough to excite a cell $B$ and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that the efficiency of $A$, as one of the cells firing $B$, is increased.

Hebb's postulate has been formulated in plain English (but not more than that) and the main question is how to implement it mathematically. The key ideas are that:

i) only the pre- and post-synaptic neuron determine the change of a synapse;

ii) learning means evaluating correlations. If both $A$ and $B$ are active, then the synaptic efficacy should be strengthened. Efficient learning also requires, however, that the synaptic strength be decreased every now and then [a2].

In the present context, one usually wants to store a number of activity patterns in a network with a fairly high connectivity (about $10 ^ {4}$ connections per neuron in biological nets). Most of the information presented to a network varies in space and time. So what is needed is a common representation of both the spatial and the temporal aspects. As a pattern changes, the system should be able to measure and store this change. How can it do that?

For unbiased random patterns in a network with synchronous updating this can be done as follows. The neuronal dynamics in its simplest form is supposed to be given by $S _ {i} ( t + \Delta t ) = { \mathop{\rm sign} } ( h _ {i} ( t ) )$, where $h _ {i} ( t ) = \sum _ {j} J _ {ij } S _ {j} ( t )$. Let $J _ {ij }$ be the synaptic strength before the learning session, whose duration is denoted by $T$. After the learning session, $J _ {ij }$ is to be changed into $J _ {ij } + \Delta J _ {ij }$ with

$$\Delta J _ {ij } = \epsilon _ {ij } { \frac{1}{T} } \sum _ { t = 0 } ^ { T } S _ {i} ( t + \Delta t ) S _ {j} ( t - \tau _ {ij } )$$

(cf. [a3], [a4]). The above equation provides a local encoding of the data at the synapse $j \rightarrow i$. The factor $\epsilon _ {ij }$ is a known constant. Since the learning session has duration $T$, the multiplier $T ^ {- 1 }$ in front of the sum takes saturation into account. The neuronal activity $S _ {i} ( t )$ equals $1$ if neuron $i$ is active at time $t$ and $- 1$ if it is not. At time $t + \Delta t$ it is combined with the signal that arrives at $i$ at time $t$, i.e., $S _ {j} ( t - \tau _ {ij } )$, where $\tau _ {ij }$ is the axonal delay. Here, $\{ {S _ {i} ( t ) } : {1 \leq i \leq N } \}$ denotes the pattern as it is taught to the network of size $N$ during the learning session, $0 \leq t \leq T$. The time unit is $\Delta t = 1$ millisecond. In the case of asynchronous dynamics, where at each time step a single, randomly chosen neuron is updated, one has to rescale $\Delta t \propto {1 / N }$, and the above sum becomes an integral as $N \rightarrow \infty$. In passing one notes that for constant, purely spatial, patterns one recovers the Hopfield model [a5].
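The learning rule and the neuronal dynamics above can be illustrated by a minimal Python sketch. It is not taken from [a3] or [a4]; the function names `hebb_delta` and `step`, the integer delays, the convention ${ \mathop{\rm sign} } ( 0 ) = + 1$, and the restriction of the sum to those time steps for which all required activities are available are assumptions made for illustration only.

```python
def hebb_delta(S, tau, eps=1.0):
    """Delta J_ij = eps * (1/T) * sum_t S_i(t+1) * S_j(t - tau_ij), with Delta t = 1.

    S[t][i] is the activity (+1 or -1) of neuron i at time step t;
    tau[i][j] is the (integer) delay of the connection j -> i.
    """
    n = len(S[0])
    max_tau = max(max(row) for row in tau)
    # Assumption: sum only over those t for which both S(t+1) and
    # S(t - tau_ij) are known; T is the number of such time steps.
    times = range(max_tau, len(S) - 1)
    T = len(times)
    dJ = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = sum(S[t + 1][i] * S[t - tau[i][j]][j] for t in times)
            dJ[i][j] = eps * total / T
    return dJ


def step(J, s):
    """Synchronous update S_i(t+1) = sign(h_i(t)), h_i(t) = sum_j J_ij S_j(t)."""
    # Assumption: sign(0) is taken to be +1.
    return [1 if sum(w * x for w, x in zip(row, s)) >= 0 else -1 for row in J]


n = 4
no_delay = [[0] * n for _ in range(n)]

# Static pattern: Delta J_ij reduces to xi_i * xi_j (Hopfield prescription).
xi = [1, -1, -1, 1]
J_static = hebb_delta([xi] * 11, no_delay)
noisy = [-1, -1, -1, 1]          # xi with its first component flipped
print(step(J_static, noisy))     # one update restores xi

# Dynamic pattern: a cycle A -> B -> A -> ... of two orthogonal patterns.
A = [1, 1, 1, 1]
B = [1, -1, 1, -1]
J_cycle = hebb_delta([A, B] * 5 + [A], no_delay)
print(step(J_cycle, A), step(J_cycle, B))   # A is mapped to B, B back to A
```

In this toy example the same rule stores a fixed point when the taught pattern is constant and a two-step cycle when it alternates, which is the sense in which spatial and temporal aspects are treated on the same footing.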

Suppose now that the activity $a$ in the network is low, as is usually the case in biological nets, i.e., $a \approx - 1$. Then the appropriate modification of the above learning rule reads

$$\Delta J _ {ij } = \epsilon _ {ij } { \frac{1}{T} } \sum _ { t = 0 } ^ { T } S _ {i} ( t + \Delta t ) [ S _ {j} ( t - \tau _ {ij } ) - a ]$$

(cf. [a4]). Since $S _ {j} - a \approx 0$ when the pre-synaptic neuron is not active, one sees that the pre-synaptic neuron is gating: the synapse changes appreciably only if the pre-synaptic neuron has fired. One then gets a long-term depression (LTD) if the post-synaptic neuron is inactive and a long-term potentiation (LTP) if it is active. So it is advantageous to have a time window [a6]: the pre-synaptic neuron should fire slightly before the post-synaptic one. The above Hebbian learning rule can also be adapted so as to be fully integrated in biological contexts [a6]. The biology of Hebbian learning has meanwhile been confirmed; see the review [a7].
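The gating effect can be checked numerically for a single synapse $j \rightarrow i$. The following Python sketch is only an illustration under stated assumptions: the function name `low_activity_delta`, the value $a = - 0.9$, the integer delay `tau`, and the constant activity traces are all hypothetical choices, not part of [a4].

```python
def low_activity_delta(post, pre, a, tau=0, eps=1.0):
    """Delta J for one synapse j -> i under the low-activity rule.

    post[t] = S_i(t) and pre[t] = S_j(t) take the values +1 or -1;
    a is the mean activity and tau the (integer) axonal delay.
    """
    # Assumption: sum only over t for which post[t+1] and pre[t-tau] exist.
    times = range(tau, len(post) - 1)
    total = sum(post[t + 1] * (pre[t - tau] - a) for t in times)
    return eps * total / len(times)


a = -0.9                      # hypothetical low mean activity, a close to -1
active = [1] * 21
silent = [-1] * 21

ltp = low_activity_delta(active, active, a)    # pre active, post active
ltd = low_activity_delta(silent, active, a)    # pre active, post inactive
gated = low_activity_delta(active, silent, a)  # pre inactive

print(ltp, ltd, gated)
```

With these traces the change is close to $1 - a$ (potentiation) when both neurons are active, close to $- ( 1 - a )$ (depression) when only the pre-synaptic neuron is active, and small in magnitude, of order $1 + a$, when the pre-synaptic neuron is silent.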

G. Palm [a8] has advocated an extremely low activity for efficient storage of stationary data. Out of $N$ neurons, only ${ \mathop{\rm ln} } N$ should be active. This seems to be advantageous for hardware realizations.

In summary, Hebbian learning is efficient since it is local, and it is a powerful algorithm for storing spatial or spatio-temporal patterns. Why does it work so well? The succinct answer [a3] is that synaptic representations are selected according to their resonance with the input data; the stronger the resonance, the larger $\Delta J _ {ij }$. In other words, the algorithm "picks" and strengthens only those synapses that match the input pattern.

#### References

[a1] D.O. Hebb, "The organization of behavior: A neurophysiological theory", Wiley (1949)

[a2] T.J. Sejnowski, "Statistical constraints on synaptic plasticity", J. Theor. Biol., 69 (1977) pp. 385–389

[a3] A.V.M. Herz, B. Sulzer, R. Kühn, J.L. van Hemmen, "The Hebb rule: Storing static and dynamic objects in an associative neural network", Europhys. Lett., 7 (1988) pp. 663–669 (see also: "Hebbian learning reconsidered: Representation of static and dynamic objects in associative neural nets", Biol. Cybern., 60 (1989) pp. 457–467)

[a4] J.L. van Hemmen, W. Gerstner, A.V.M. Herz, R. Kühn, M. Vaas, "Encoding and decoding of patterns which are correlated in space and time", G. Dorffner (ed.), Konnektionismus in Artificial Intelligence und Kognitionsforschung, Springer (1990) pp. 153–162

[a5] J.J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities", Proc. Nat. Acad. Sci. USA, 79 (1982) pp. 2554–2558

[a6] W. Gerstner, R. Ritz, J.L. van Hemmen, "Why spikes? Hebbian learning and retrieval of time-resolved excitation patterns", Biol. Cybern., 69 (1993) pp. 503–515 (see also: W. Gerstner, R. Kempter, J.L. van Hemmen, H. Wagner, "A neuronal learning rule for sub-millisecond temporal coding", Nature, 383 (1996) pp. 76–78)

[a7] T.H. Brown, S. Chattarji, "Hebbian synaptic plasticity: Evolution of the contemporary concept", E. Domany (ed.), J.L. van Hemmen (ed.), K. Schulten (ed.), Models of neural networks II, Springer (1994) pp. 287–314

[a8] G. Palm, "Neural assemblies: An alternative approach to artificial intelligence", Springer (1982)

How to Cite This Entry:
Hebb rule. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Hebb_rule&oldid=47201
This article was adapted from an original article by J.L. van Hemmen (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.