Hebb rule

From Encyclopedia of Mathematics
Latest revision as of 22:10, 5 June 2020


Hebbian learning

A learning rule dating back to D.O. Hebb's classic [a1], which appeared in 1949. The idea behind it is simple. Neurons of vertebrates consist of three parts: a dendritic tree, which collects the input, a soma, which can be considered as a central processing unit, and an axon, which transmits the output. Neurons communicate via action potentials or spikes, pulses of a duration of about one millisecond. If neuron $ j $ emits a spike, it travels along the axon to a so-called synapse on the dendritic tree of neuron $ i $, say. This takes $ \tau _ {ij } $ milliseconds. The synapse has a synaptic strength, to be denoted by $ J _ {ij } $. Its value, which encodes the information to be stored, is to be governed by the Hebb rule.

In [a1], p. 62, one can find the "neurophysiological postulate" that is the Hebb rule in its original form: When an axon of cell $ A $ is near enough to excite a cell $ B $ and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that the efficiency of $ A $, as one of the cells firing $ B $, is increased.

Hebb's postulate has been formulated in plain English (but not more than that) and the main question is how to implement it mathematically. The key ideas are that:

i) only the pre- and post-synaptic neuron determine the change of a synapse;

ii) learning means evaluating correlations. If both $ A $ and $ B $ are active, then the synaptic efficacy should be strengthened. Efficient learning also requires, however, that the synaptic strength be decreased every now and then [a2].

In the present context, one usually wants to store a number of activity patterns in a network with a fairly high connectivity (about $ 10 ^ {4} $ synapses per neuron in biological nets). Most of the information presented to a network varies in space and time. So what is needed is a common representation of both the spatial and the temporal aspects. As a pattern changes, the system should be able to measure and store this change. How can it do that?

For unbiased random patterns in a network with synchronous updating this can be done as follows. The neuronal dynamics in its simplest form is supposed to be given by $ S _ {i} ( t + \Delta t ) = { \mathop{\rm sign} } ( h _ {i} ( t ) ) $, where $ h _ {i} ( t ) = \sum _ {j} J _ {ij } S _ {j} ( t ) $. Let $ J _ {ij } $ be the synaptic strength before the learning session, whose duration is denoted by $ T $. After the learning session, $ J _ {ij } $ is to be changed into $ J _ {ij } + \Delta J _ {ij } $ with

$$ \Delta J _ {ij } = \epsilon _ {ij } { \frac{1}{T} } \sum _ { 0 } ^ { T } S _ {i} ( t + \Delta t ) S _ {j} ( t - \tau _ {ij } ) $$

(cf. [a3], [a4]). The above equation provides a local encoding of the data at the synapse $ j \rightarrow i $. The factor $ \epsilon _ {ij } $ is a known constant. Since the learning session has duration $ T $, the multiplier $ T ^ {- 1 } $ in front of the sum takes saturation into account. The neuronal activity $ S _ {i} ( t ) $ equals $ 1 $ if neuron $ i $ is active at time $ t $ and $ - 1 $ if it is not. At time $ t + \Delta t $ it is combined with the signal that arrives at $ i $ at time $ t $, i.e., $ S _ {j} ( t - \tau _ {ij } ) $, where $ \tau _ {ij } $ is the axonal delay. Here, $ \{ {S _ {i} ( t ) } : {1 \leq i \leq N } \} $ denotes the pattern as it is taught to the network of size $ N $ during the learning session, $ 0 \leq t \leq T $. The time unit is $ \Delta t = 1 $ millisecond. In the case of asynchronous dynamics, where at each step a single, randomly chosen neuron is updated, one has to rescale $ \Delta t \propto {1 / N } $, and the above sum reduces to an integral as $ N \rightarrow \infty $. In passing one notes that for constant (purely spatial) patterns one recovers the Hopfield model [a5].
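The learning rule and the synchronous dynamics can be made concrete in a few lines of Python. The following is only a minimal sketch under stated assumptions, not the implementation used in [a3] or [a4]: the names `hebb_update` and `sync_step` are hypothetical, $ \epsilon _ {ij } $ is taken to be the same scalar `eps` for all synapses, the sum is taken over the $ T $ time steps $ t = 0 \dots T - 1 $ with $ \Delta t = 1 $, the pattern is assumed to be defined for all integer times (so that delayed values exist), and $ \mathop{\rm sign} ( 0 ) $ is arbitrarily set to $ + 1 $. For a constant pattern $ \xi $ the sketch returns $ \Delta J _ {ij } = \epsilon \xi _ {i} \xi _ {j} $, the Hopfield form, whatever the delays are.

```python
from typing import Callable, List

# S(i, t) returns +1 if neuron i is active at integer time t, else -1.
Pattern = Callable[[int, int], int]


def hebb_update(S: Pattern, n: int, T: int, tau: List[List[int]],
                eps: float = 1.0) -> List[List[float]]:
    """dJ[i][j] = eps/T * sum_{t=0}^{T-1} S(i, t+1) * S(j, t - tau[i][j])."""
    dJ = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Post-synaptic activity one step ahead times the delayed
            # pre-synaptic signal arriving at the synapse j -> i.
            total = sum(S(i, t + 1) * S(j, t - tau[i][j]) for t in range(T))
            dJ[i][j] = eps * total / T
    return dJ


def sync_step(J: List[List[float]], state: List[int]) -> List[int]:
    """One synchronous update S_i <- sign(sum_j J_ij S_j), with sign(0) = +1."""
    n = len(state)
    new_state = []
    for i in range(n):
        h = sum(J[i][j] * state[j] for j in range(n))
        new_state.append(1 if h >= 0 else -1)
    return new_state


# Illustration with a constant (purely spatial) pattern and arbitrary delays.
xi = [1, -1, 1, 1, -1, -1]
n = len(xi)
tau = [[(i + j) % 3 for j in range(n)] for i in range(n)]


def static_pattern(i: int, t: int) -> int:
    return xi[i]


dJ = hebb_update(static_pattern, n, T=50, tau=tau)
retrieved = sync_step(dJ, xi)  # the stored pattern is a fixed point
```

Starting from $ J _ {ij } = 0 $, the learned couplings make $ \xi $ a fixed point of the dynamics, since $ h _ {i} = \sum _ {j} \xi _ {i} \xi _ {j} \xi _ {j} = N \xi _ {i} $.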

Suppose now that the activity $ a $ in the network is low, as is usually the case in biological nets, i.e., $ a \approx - 1 $. Then the appropriate modification of the above learning rule reads

$$ \Delta J _ {ij } = \epsilon _ {ij } { \frac{1}{T} } \sum _ { 0 } ^ { T } S _ {i} ( t + \Delta t ) [ S _ {j} ( t - \tau _ {ij } ) - a ] $$

(cf. [a4]). Since $ S _ {j} - a \approx 0 $ when the pre-synaptic neuron is not active, one sees that the pre-synaptic neuron acts as a gate: the synapse changes appreciably only when the pre-synaptic neuron is active. In that case one gets a depression (LTD) if the post-synaptic neuron is inactive and a potentiation (LTP) if it is active. So it is advantageous to have a time window [a6]: the pre-synaptic neuron should fire slightly before the post-synaptic one. The above Hebbian learning rule can also be adapted so as to be fully integrated in biological contexts [a6]. The biology of Hebbian learning has meanwhile been confirmed; see the review [a7].
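The gating effect can be checked numerically for a single synapse. The sketch below is illustrative only: the function name `low_activity_update` is hypothetical, $ \epsilon _ {ij } $ is set to $ 1 $, the sum runs over $ T $ discrete steps, and the activity is idealized as exactly $ a = - 1 $ rather than $ a \approx - 1 $, so that the silent pre-synaptic case gives exactly zero.

```python
from typing import Sequence


def low_activity_update(post_next: Sequence[int], pre_delayed: Sequence[int],
                        a: float, eps: float = 1.0) -> float:
    """eps/T * sum_t S_i(t+1) * (S_j(t - tau_ij) - a) for one synapse j -> i.

    post_next[t]   holds S_i(t + 1),
    pre_delayed[t] holds S_j(t - tau_ij), for t = 0 .. T-1.
    """
    T = len(post_next)
    total = sum(p * (q - a) for p, q in zip(post_next, pre_delayed))
    return eps * total / T


a = -1.0  # idealized low activity (assumption: exactly -1)
T = 4
active = [1] * T
silent = [-1] * T

gated = low_activity_update(active, silent, a)  # pre-synaptic neuron silent
ltp = low_activity_update(active, active, a)    # pre and post both active
ltd = low_activity_update(silent, active, a)    # pre active, post inactive
```

With these inputs the change vanishes when the pre-synaptic neuron is silent, is positive (LTP) when both neurons are active, and is negative (LTD) when only the pre-synaptic neuron is active.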

G. Palm [a8] has advocated an extremely low activity for efficient storage of stationary data. Out of $ N $ neurons, only $ \ln N $ should be active. This seems to be advantageous for hardware realizations.
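To get a feeling for how sparse such a code is, the following sketch builds one pattern with $ \ln N $ active neurons and computes its activity. Several assumptions are made here that are not in the text above: $ \ln N $ is rounded to the nearest integer, the active neurons are simply the first ones, and the activity $ a $ is interpreted as the mean of $ S _ {i} $ over the network. The helper name `sparse_pattern` is hypothetical.

```python
import math
from typing import List


def sparse_pattern(n: int) -> List[int]:
    """Return a +/-1 pattern of size n with round(ln n) active (+1) neurons."""
    k = round(math.log(n))          # number of active neurons (assumption)
    return [1] * k + [-1] * (n - k)


n = 1000
pattern = sparse_pattern(n)
k = pattern.count(1)                # 7 active neurons, since ln(1000) is about 6.9
a = sum(pattern) / n                # mean activity (2k - n) / n
```

For $ N = 1000 $ this gives $ 7 $ active neurons and $ a = ( 2 \cdot 7 - 1000 ) / 1000 = - 0.986 $, which is consistent with the low-activity regime $ a \approx - 1 $.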

In summary, Hebbian learning is efficient since it is local, and it is a powerful algorithm to store spatial or spatio-temporal patterns. Why does it work so well? The succinct answer [a3] is that synaptic representations are selected according to their resonance with the input data; the stronger the resonance, the larger $ \Delta J _ {ij } $. In other words, the algorithm "picks" and strengthens only those synapses that match the input pattern.
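The resonance picture can be illustrated with a toy computation. The setup is an assumption made for this sketch and is not taken from [a3]: a single pre-synaptic neuron $ j $ is connected to neuron $ i $ by several synapses with candidate delays $ \tau = 0 \dots 4 $, the pre-synaptic activity is a pseudo-random $ \pm 1 $ sequence, and the taught post-synaptic activity repeats it with a true lag `tau_star`, i.e. $ S _ {i} ( t + 1 ) = S _ {j} ( t - \tau ^ {*} ) $. All names are hypothetical and $ \epsilon _ {ij } = 1 $.

```python
import random

rng = random.Random(0)          # fixed seed for reproducibility
T, max_tau, tau_star = 200, 4, 2

# Pre-synaptic record covering times -max_tau .. T-1.
pre = [rng.choice((-1, 1)) for _ in range(T + max_tau)]


def pre_at(t: int) -> int:
    """S_j(t) for -max_tau <= t <= T-1."""
    return pre[t + max_tau]


# Taught post-synaptic activity: post_next[t] = S_i(t + 1) = S_j(t - tau_star).
post_next = [pre_at(t - tau_star) for t in range(T)]


def delta_J(tau: int, eps: float = 1.0) -> float:
    """Hebbian change for the synapse with delay tau, summed over t = 0 .. T-1."""
    return eps * sum(post_next[t] * pre_at(t - tau) for t in range(T)) / T


changes = {tau: delta_J(tau) for tau in range(max_tau + 1)}
best = max(changes, key=changes.get)   # delay with the largest change
```

The synapse whose delay matches the temporal structure of the input receives the full change $ \Delta J = 1 $, while the mismatched delays only pick up the much smaller chance correlations of the random sequence.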

References

[a1] D.O. Hebb, "The organization of behavior: A neurophysiological theory", Wiley (1949)
[a2] T.J. Sejnowski, "Statistical constraints on synaptic plasticity", J. Theor. Biol., 69 (1977) pp. 385–389
[a3] A.V.M. Herz, B. Sulzer, R. Kühn, J.L. van Hemmen, "The Hebb rule: Storing static and dynamic objects in an associative neural network", Europhys. Lett., 7 (1988) pp. 663–669 (See also: "Hebbian learning reconsidered: Representation of static and dynamic objects in associative neural nets", Biol. Cybern., 60 (1989) pp. 457–467)
[a4] J.L. van Hemmen, W. Gerstner, A.V.M. Herz, R. Kühn, M. Vaas, "Encoding and decoding of patterns which are correlated in space and time", G. Dorffner (ed.), Konnektionismus in Artificial Intelligence und Kognitionsforschung, Springer (1990) pp. 153–162
[a5] J.J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities", Proc. Nat. Acad. Sci. USA, 79 (1982) pp. 2554–2558
[a6] W. Gerstner, R. Ritz, J.L. van Hemmen, "Why spikes? Hebbian learning and retrieval of time-resolved excitation patterns", Biol. Cybern., 69 (1993) pp. 503–515 (See also: W. Gerstner, R. Kempter, J.L. van Hemmen, H. Wagner, "A neuronal learning rule for sub-millisecond temporal coding", Nature, 383 (1996) pp. 76–78)
[a7] T.H. Brown, S. Chattarji, "Hebbian synaptic plasticity: Evolution of the contemporary concept", E. Domany (ed.), J.L. van Hemmen (ed.), K. Schulten (ed.), Models of neural networks, II, Springer (1994) pp. 287–314
[a8] G. Palm, "Neural assemblies: An alternative approach to artificial intelligence", Springer (1982)
How to Cite This Entry:
Hebb rule. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Hebb_rule&oldid=16900
This article was adapted from an original article by J.L. van Hemmen (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.