de Gruyter Series in Nonlinear Analysis and Applications 6
Editors
A. Bensoussan (Paris)
R. Conti (Florence)
A. Friedman (Minneapolis)
K.-H. Hoffmann (Munich)
L. Nirenberg (New York)
A. Vignoli (Rome)
Managing Editors
J. Appell (Würzburg)
V. Lakshmikantham (Melbourne, USA)
Jianhong Wu

Introduction to Neural Dynamics and Signal Transmission Delay

Walter de Gruyter · Berlin · New York 2001
Preface
In the design of a neural network, either for biological modeling, cognitive simulation,
numerical computation or engineering applications, it is important to describe the
dynamics (also known as evolution) of the network. The success in this area in the
early 1980s was one of the main sources for the resurgence of interest in neural
networks, and the current progress towards understanding neural dynamics has been
part of exhaustive efforts to lay down a solid theoretical foundation for this fast-growing
theory and for the applications of neural networks.
Unfortunately, the highly interdisciplinary nature of the research in neural net-
works makes it very difficult for a newcomer to enter this important and fascinating
area of modern science. The purpose of this book is to give an introduction to the
mathematical modeling and analysis of networks of neurons from the viewpoint of
dynamical systems. It is hoped that this book will give an introduction to the basic
knowledge in neurobiology and physiology which is necessary in order to understand
several popular mathematical models of neural networks, as well as to some well-
known results and recent developments in the theoretical study of the dynamics of the
mathematical models.
This book is written as a textbook for senior undergraduate and graduate students
in applied mathematics. The level of exposition assumes a basic knowledge of matrix
algebra and ordinary differential equations, and an effort has been made to make the
book as self-contained as possible.
Many neural network models were originally proposed to explain observations in
neurobiology or physiology. Also, understanding human behavior and brain function
is still one of the main motivations for neural network modeling and analysis. It is
therefore of prime importance to understand the basic structure of a single neuron as
well as a network of neurons and to understand the main mechanisms of the neural
signal transmission. This necessary basic knowledge in neuroscience will be collected
in Chapter 1.
Chapter 2 will start with the derivation of general models of biological neural net-
works. In particular, the additive and shunting equations will be presented and a brief
description of the popular signal functions will be given. An abstract formulation,
which is suitable for both biological and artificial networks and treats a network as a
labeled graph, will be provided as well. Two types of network architectures: feedfor-
ward and feedback networks will be identified and various connection topologies will
be described.
In Chapter 3, several simple networks will be presented that perform some elemen-
tary functions such as storing, recalling and recognizing neuron activation patterns.
The important on-center off-surround connection topology and its connection to the
solution of the noise-saturation dilemma will be discussed. Two important issues
related to applications: the choice of signal functions and the determination of the
synaptic coupling coefficients will also be addressed.
A central subject of this book is the long-term behaviors of the network. In Chap-
ter 4, the connection between the convergence and the global attractor and the im-
portant property of content-addressable memory of many networks will be discussed.
The convergence theorem due to Cohen, Grossberg and Hopfield based on LaSalle's
Invariance Principle and its various extensions and modifications will be presented.
Other techniques for establishing the convergence of almost every trajectory of a given
network will also be provided, including the theory of monotone dynamical systems
and the combinatorial matrix theory.
A special feature of this book is its emphasis on the effect of signal delays on the
long-term performance of the networks under consideration. Such time lags exist due
to the finite propagation speeds of neural signals along axons and the finite speeds
of the neurotransmitters across synaptic gaps in a biological neural network and due
to the finite switching speeds of amplifiers (neurons) in artificial neural networks.
In the final chapter, various phenomena associated with the signal delays: delay-
induced instability, nonlinear periodic solutions, transient oscillations, phase-locked
oscillations and changes of the basins of attraction will be demonstrated.
This book grew from the lecture notes given by the author during a summer
graduate course in Neural Dynamics at York University in 1999. I am deeply indebted
to all the students from this class. I especially want to thank Yuming Chen who
typed the whole text, struggled with evolving versions of it and offered many critical
comments and suggestions.
Professor Hugh R. Wilson read several chapters of the manuscript and offered
many helpful comments which are greatly appreciated. I thank my editor, Manfred
Karbe, for his patience, understanding, encouragement and assistance.
It is also a pleasure to acknowledge the financial support from Natural Sciences
and Engineering Research Council of Canada, and from the Network of Centers of
Excellence: Mathematics for Information Technology and Complex Systems.
Toronto, February 2001 Jianhong Wu
Contents
Preface

1 The structure of neural networks
1.1 The structure of a single neuron
1.2 Transmission of neural signals
1.3 Neural circuits, CNS and ANN

2 Dynamic models of networks
2.1 Biological models
2.2 Signal functions
2.3 General models and network architectures

3 Simple networks
3.1 Outstars: pattern learning
3.2 Instars: pattern recognition
3.3 Lateral inhibition: noise-saturation dilemma
3.4 Recurrent ON-CTR OFF-SUR networks: signal enhancement and noise suppression
3.5 Determining synaptic weights
Appendix. Grossberg's Learning Theorem

4 Content-addressable memory storage
4.1 Parallel memory storage by competitive networks
4.2 Convergence in networks with a nonsymmetric interconnection matrix
4.3 Implementation of CAM: Hopfield networks
4.4 Generic convergence in monotone networks

5 Signal transmission delays
5.1 Neural networks with delay and basic theory
5.2 Global stability analysis
5.3 Delay-induced instability
5.4 Hopf bifurcation of periodic solutions
5.5 A network of two neurons: McCulloch-Pitts nonlinearity
5.6 Delay-induced transient oscillations
5.7 Effect of delay on the basin of attraction
Chapter 1
The structure of neural networks
The human central nervous system is composed of a vast number of interconnected
cellular units called nerve cells or neurons. While neurons have a wide variety of
shapes, sizes and location, most neurons are of rather uniform structures and neural
signals are transmitted on the same basic electrical and chemical principles.
The purpose of this chapter is to give a brief description of the structure of a
single neuron and a neural circuit, and of the electrical and chemical mechanisms
of neural signal transmission. Most materials in this chapter are taken from Harvey
[1994] and Purves, Augustine, Fitzpatrick, Katz, LaMantia and McNamara [1997];
related references are Grossberg [1982, 1988], Haykin [1994], Levine [1991], and Müller
and Reinhardt [1991].
Figure 1.1.1 shows a prototypical neuron. The central part is called the cell body or
soma which contains nucleus and other organelles that are essential to the function of
all cells. From the cell body project many root-like extensions (the dendrites) as well
as a single tubular fiber (the axon) which ramifies at its end into a number of small
branches.
1.1 The structure of a single neuron
Figure 1.1.1. Schematic structure of a typical neuron.
Dendrites are branch-like protrusions from the neuron cell body. A typical cell has
many dendrites that are highly branched. The receiving zones of signals (or impulses or
information), called synapses, are on the cell body and dendrites. Some neurons have
spines on the dendrites, thus creating specialized receiving sites. Since a fundamental
purpose of neurons is to integrate information from other neurons, the number of
inputs received by each neuron is an important determinant of neural function.
The axon is a long fiber-like extension of the cell body. The axon's purpose is
signal conduction, i.e., transmission of the impulses to other neurons or muscle fibers.
The axonal mechanism that carries signals over the axon is called the action potential,
a self-regenerating electrical wave that propagates from its point of initiation at the cell
body (called the axon hillock) to the terminals of the axon. Axon terminals are highly
specialized to convey the signal to target neurons. These terminal specializations
are called synaptic endings, and the contacts they make with the target neurons are
called chemical synapses. Each synaptic ending contains secretory organelles called
synaptic vesicles.
The axon fans out to other neurons. The fan-out is typically 1 : 10,000 and
more. The same signal encoded by action potentials propagates along each branch
with varying time delays. As mentioned above, each branch end has a synapse which
is typically of a bulb-like structure. The synapses are on the cell bodies, dendrites
and spines of target neurons. Between the synapses and the target neurons is a narrow
gap (called synaptic gap), typically 20 nanometers wide. Structures are spoken of
in relation to the synapses as presynaptic and postsynaptic. Special molecules called
neurotransmitters are released from synaptic vesicles and cross the synaptic gap to
receptor sites on the target neuron. The transmission of signals between synapses and
target neurons is the flow of neurotransmitter molecules.
1.2 Transmission of neural signals
Neuron signals are transmitted either electrically or chemically. Electrical transmis-
sion prevails in the interior of a neuron, whereas chemical mechanisms operate at
the synapses. These two types of signaling mechanisms are the basis for all the
information-processing capabilities of the brain.
An electrochemical mechanism produces and propagates signals along the axon.
In the state of inactivity (equilibrium) the interior (cytoplasm) of the neuron and the
axon is negatively charged relative to the surrounding extra-cellular fluid due to the
differences in the concentrations of specific ions across the neuron membrane. For
example, in an exceptionally large neuron found in the nervous system of the squid,
ion concentrations of potassium (K+), sodium (Na+), chloride (Cl−) and calcium
(Ca2+) are 400, 50, 40–150 and 0.0001 mM inside the neuron, and 20, 440, 560
and 10 mM outside the neuron. Such measurements are the basis for the claim that
there is much more K+ inside the neuron than out, and much more Na+ outside than
in. Similar concentration gradients occur in the neurons of most animals, including
humans. The equilibrium ion concentration gradient, called the resting potential, of
about −70 mV (typically −50 to −80 mV, depending on the type of neuron being
examined) is caused by a biological ion pump powered by mitochondria (structures
in the neuron which use and transfer the energy produced by burning fuel (sugar) and
molecular oxygen).
The excess of positive and negative charges generates an electric field across the
neuron surface and the plasma membrane of the neuron holds the charges apart. See
Figure 1.2.1.
Figure 1.2.1. Biological ion pump powered by mitochondria.
The membrane is selective to diffusion of particular molecules, and its diffusion
selectivity varies with time and length along the axon. This selective permeability
of membranes is due largely to ion channels which allow only certain kinds of ions
to cross the membrane in the direction of their concentration and electro-chemical
gradients. Thus, channels and pumps basically work against each other.
The axon signal pulses are the fundamental electrical signals of neurons. They
are called action potentials (or spikes or impulses) and are described electrically by
current-voltage characteristics. Injecting a current pulse into the axon causes the po-
tential across the membrane to vary. As mentioned above, the equilibrium potential of the
axon is about −70 mV. When an inhibitory (hyperpolarizing) or excitatory
(depolarizing) signal pulse is injected, the response is an RC-type (exponential) change followed by
relaxation. When the injected current causes the voltage to exceed the threshold (typically
−55 mV), a special electrochemical process described below generates a rapid
increase in the potential, and this injected current then produces a single pulse with a
peak-to-peak amplitude of about 100–120 mV and a duration of 0.5–1 ms. This gen-
erated pulse propagates without weakening along all branches of the axon. A complex
sequence involving the ions maintains the pulse shape and strength. A brief descrip-
tion of the production and propagation of a pulse by a step-by-step electrochemical
process is as follows (see Figure 1.2.2):
1. Chemical signals at excitatory synapses inject current in the axon;
2. Rising positive charge in the cell leads to the sudden loss of the membrane's impermeability to Na+ ions and triggers Na+ gates along the membrane to open (the threshold requires a depolarization of about 10 mV from rest);
3. Na+ ions enter the interior until the Na+ equilibrium potential is almost reached. As the Na+ equilibrium potential is typically near +50 mV, the interior of the neuron even acquires a positive potential against its surroundings (overshooting);
4. The positive interior causes adjacent Na+ gates to open, so that an impulse propagates down the axon;
5. The Na+ gates inactivate¹ and the outflow of K+ from the interior causes the axon to hyperpolarize.

Figure 1.2.2. Production and propagation of a pulse.

As a result of Na+ channel inactivation and K+ outflow, the frequency of impulse
generation is limited. For one or a few milliseconds after an impulse, no additional
impulses can be generated. This is called the absolute refractory period. For several
more milliseconds afterwards, there is a relative refractory period in which an action
potential can be initiated, but the strength of current required to initiate an impulse is
larger than normal (Levine [1991]).
The pulse normally propagates from the cell body to the terminals of the neuron.
The speed of propagation of the pulse signal along the axon varies greatly. In the cells
of the human brain the signal travels with a velocity of about 0.2–2 m/s. For other
cells, the transmission velocity can be up to 100 m/s due to the so-called saltatory
conduction between Ranvier nodes. This is because many axons are covered by an
electrically insulating layer known as the myelin sheath, which is interrupted from time
to time at the so-called Ranvier nodes, and the signal jumps along the axon from one
Ranvier node to the next.

¹ In fact, there are two factors that cause the membrane potential to return to the resting potential after
an action potential. The most important is that after a Na+ channel opens, it closes spontaneously, even
though the membrane potential is still depolarized. This process, called Na+ inactivation, is about ten
times slower than Na+ channel opening. In addition to Na+ inactivation, depolarization opens voltage-gated
K+ channels, increasing the K+ permeability above that in the resting cell. Opening voltage-gated
K+ channels favors more rapid repolarization. Voltage-gated K+ channels activate at about the same rate
that Na+ channels inactivate. See Raven and Johnson [1995].
The above discussion describes the production and propagation of a single pulse.
It should be emphasized that the amplitude of the single pulse is independent of
the magnitude of the current injected into the axon which generates the pulse. In
other words, larger injected currents do not elicit larger action potentials. The action
potentials of a given neuron are therefore said to be all-or-none. On the other hand, if
the amplitude or duration of the injected current is increased significantly, multiple
action potentials occur. Therefore, a single pulse does not carry information, and pulses are
produced randomly due to the massive number of receiving sites of a single neuron. The intensity
of a pulse train is not coded in the amplitude, rather it is coded in the frequency of
succession of the invariant pulses, which can range from about 1 to 1000 per second.
The interval between two spikes can take any value (larger than the absolute refractory
period), and the combination of analog and digital signal processing is utilized to obtain
security, quality and simplicity of information processing.
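The all-or-none and frequency-coding properties just described can be illustrated with a small sketch; the linear current-to-rate map and the specific numbers below are illustrative assumptions, not a biophysical model:

```python
# Illustrative sketch: all-or-none pulses whose rate, not amplitude, encodes intensity.
# The threshold, the ~1 ms absolute refractory period and the linear current-to-rate
# map are rough illustrative assumptions.

def pulse_train(injected_current, duration_ms, refractory_ms=1.0, threshold_current=1.0):
    """Return spike times (ms) for a constant injected current."""
    if injected_current <= threshold_current:
        return []                      # subthreshold input: no pulses at all
    # assumed rate law: rate grows with current but is capped by the refractory period
    rate_per_ms = min((injected_current - threshold_current) * 0.1, 1.0 / refractory_ms)
    interval = 1.0 / rate_per_ms
    times, t = [], 0.0
    while t < duration_ms:
        times.append(round(t, 3))      # every pulse has the same (invariant) amplitude
        t += interval
    return times

if __name__ == "__main__":
    for current in (0.5, 2.0, 10.0, 100.0):
        spikes = pulse_train(current, duration_ms=50)
        print(f"I = {current:6.1f}: {len(spikes)} pulses in 50 ms")
```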
The pulse signal traveling along the axon comes to a halt at the synapse due to the
synaptic gap. The signal is transferred to the target neuron across the synaptic gap
mainly via a special chemical mechanism, called synaptic transmission. In synaptic
transmission, when a pulse train signal arrives at the presynaptic site, special sub-
stances called neurotransmitters are released. The neurotransmitter molecules diffuse
across the synaptic gap, as shown in Figure 1.2.3, reaching the postsynaptic neuron (or
muscle fiber) within about 0.5 ms.

Figure 1.2.3. Signal transmission at the synaptic gap.

Upon their arrival at special receptors, these sub-
stances modify the permeability of the postsynaptic membrane for certain ions. These
ions then flow in or out of the neurons, causing a hyperpolarization or depolarization
of the local postsynaptic potential. If the induced polarization potential is positive
(resp. negative), then the synapse is termed excitatory (resp. inhibitory), because the
influence of the synapse tends to activate (resp. to inhibit) the postsynaptic neuron.
The differences in excitation and inhibition are primarily membrane permeability dif-
ferences. Table 1.2.4 gives a comparison of the sequences for synaptic excitation and
inhibition.
Excitation:
1. Presynaptic sites release excitatory neurotransmitter (such as acetylcholine (ACh) or glutamate).
2. The neurotransmitter diffuses across the synaptic gap to the postsynaptic membrane.
3. The postsynaptic membrane permeability to Na+ and K+ greatly increases.
4. Na+ and K+ ion currents across the membrane drive the potential of the target neuron to the −20 mV to 0 mV range (usually near 0 mV).

Inhibition:
1. Presynaptic sites release inhibitory neurotransmitter (such as γ-aminobutyric acid (GABA)).
2. The neurotransmitter diffuses across the synaptic gap to the postsynaptic membrane.
3. The postsynaptic membrane permeability to Cl− and K+ greatly increases.
4. Cl− and K+ ion currents across the membrane drive the potential of the target neuron below −70 mV.

Table 1.2.4. Sequences for synaptic excitation and inhibition.
Note that inhibitory synapses sometimes terminate at the presynaptic sites of other
axons, inhibiting their ability to send neurotransmitters across the synaptic gap. This
presynaptic inhibition is helpful when many pathways converge because the system
can selectively suppress inputs.
The polarization potential caused by a single synapse might or might not be large
enough to depolarize the postsynaptic neuron to its firing threshold. In fact, the post-
synaptic neuron typically has many dendrites receiving synapses from thousands of
different presynaptic cells (neurons). Hence its firing depends on the sum of depolar-
izing effects from these different dendrites. These effects decay with a characteristic
time of 5-10 ms, but if signals arrive at the same synapse over such a period, then
excitatory effects accumulate. A high rate of repetition of firing of a neuron therefore
expresses a large intensity of the signal. When the total magnitude of the depolariza-
tion potential in the cell body exceeds the critical threshold (about 10 mV), the neuron
fires.
The neurotransmitter is randomly emitted at every synaptic ending in quanta of
a few thousand molecules at a low rate. The rate of release increases enormously
upon arrival of an impulse, with a single action potential causing the emission of 100–300
quanta within a very short time.
1.3 Neural circuits, CNS and ANN
The complexity of the human nervous system and its subsystems rests on not only the
complicated structures of single nerve cells and the complicated mechanisms of nerve
signal transmission, but also the vast number of neurons and their mutual connections.
Neurons do not function in isolation. They are organized into ensembles called
circuits that process specific kinds of information. Although the arrangement of neural
circuitry is extremely varied, the information processing of neural circuits is typically carried out
in a dense matrix of axons, dendrites and their connections. Processing circuits are
combined in turn into systems that serve broad functions such as the visual system and
the auditory system.
The nervous system is structurally divided into central and peripheral components.
The central nervous system (CNS) comprises the brain (cerebrum, cerebellum and
the brainstem) and the spinal cord. The peripheral nervous system includes sensory
neurons, which connect the brain and the spinal cord to sensory receptors, as well as
motor neurons, which connect brain and the spinal cord to muscles and glands.
Connectivity of neurons is essential for the brain to perform complex tasks. In the
human cortex, every neuron is estimated to receive converging input on the average
from about $10^4$ synapses. On the other hand, each neuron feeds its output into many
hundreds of other neurons, often through a large number of synapses touching a single
nerve cell. It is estimated that there are on the order of $10^{11}$ neurons in the human
cortex, and $10^{15}$ synapses.
Plasticity is another essential feature of neural networks. There is a great deal of
evidence that the strength of a synaptic coupling between two given neurons is not fixed
once and for all. As originally postulated by Hebb [1949], the strength of a synaptic
connection can be adjusted if its level of activity changes. An active synapse, which
repeatedly triggers the activation of its postsynaptic neuron, will grow in strength,
while others will gradually weaken. This plasticity permits the modification of synap-
tic coupling and connection topology, which is important in the network's ability to
learn from, and adapt to, its environment.
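A minimal sketch of this Hebbian adjustment (an active synapse that repeatedly helps activate its postsynaptic neuron grows, others gradually weaken); the growth and decay rates are illustrative assumptions:

```python
# Minimal Hebbian update sketch: strengthen a synapse when pre- and postsynaptic
# activity coincide, let it decay passively otherwise. Rates are illustrative.

def hebbian_step(weight, pre_active, post_active, growth=0.1, decay=0.01):
    if pre_active and post_active:
        return weight + growth * (1.0 - weight)   # grow toward a saturation value of 1
    return weight * (1.0 - decay)                 # gradual weakening

w = 0.2
for step in range(100):
    w = hebbian_step(w, pre_active=True, post_active=True)
print(f"repeatedly paired synapse: w = {w:.3f}")   # close to 1

w = 0.2
for step in range(100):
    w = hebbian_step(w, pre_active=False, post_active=True)
print(f"unpaired synapse:          w = {w:.3f}")   # decayed toward 0
```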
The brain's capability of organizing neurons to perform certain complex tasks
comes from its massively parallel distributed structure and its plasticity. This complex
and plastic parallel computer has the capability of organizing neurons so as to perform
certain computations such as pattern recognition and associative memory many times
faster than the fastest digital computer. It is therefore of great importance and interest
to understand how the biological neural networks perform these computations and
how to build a biologically motivated machine to perform these computations.
An artificial neural network (ANN) is a machine that is designed to model the way
in which the brain performs a particular task or function of interest. Such a network is
usually implemented using electronic components or simulated in software on a digital
computer. In the last two decades, neurobiology and experimental psychology have
developed rapidly, and great progress has been made in expanding the cognitive
or adaptive capabilities of industrial applications of computers. Many designers of
intelligent machines have borrowed ideas from experimental results on the brain's
analog response, and machines have been built that simulate brain regions or functions,
with nodes corresponding to neurons or populations of neurons, and connections
between the nodes corresponding to the synapses of neurons. As these fields develop
further and interact, it is natural to look for common organizing principles, quantitative
foundations and mathematical modeling.
We conclude this chapter with the following definition, from the DARPA study
[1988], which seems to be suitable for both biological and artificial neural networks:
A neural network is a system composed of many simple processing el-
ements operating in parallel whose function is determined by network
structure, connection strengths, and the processing performed at com-
puting elements, .... Neural network architectures are inspired by the
architecture of biological nervous systems operating in parallel to obtain
high computation rates.
Note that, in the above definition and throughout the remaining part of this book, we
will alternatively call the functional units in neural networks "nodes", "units", "cells",
"neurons" (and even "populations" if a group of neurons is represented by a node, see
Section 2.1).
Chapter 2
Dynamic models of networks
In this chapter, we start with a general model to describe the dynamics of a biological
network of neurons. As often in the development of science, the derivation of the
model equations based on well-accepted laws and principles is the starting point to
understand human brain-mind functioning and to design a machine (artificial neural
network) to mimic the way in which the brain performs a particular task of interest.
We will also give an abstract formulation of a network as a labeled graph and discuss
two important network architectures.
2.1 Biological models
Although it is well known that a neuron transmits voltage spikes along its axon, many
studies show that the effects on the receiving neurons can be usefully summarized by
voltage potentials in the neuron interior that vary slowly and continuously during the
time scale of a single spike. That is, the time scale of the interior neuron potential is long
compared to the spike duration. The biological model of neural networks introduced
below, following the presentation of Harvey [1994], reflects this viewpoint.
Assume that the network, shown in Figure 2.1.1, consists of $n$ neurons, denoted
by $v_1, \ldots, v_n$. We introduce a variable $x_i$ to describe the neuron's state and a variable
$Z_{ij}$ to describe the coupling between two neurons $v_i$ and $v_j$. More precisely, let

$x_i(t)$ = deviation of the $i$th neuron's potential from its equilibrium.

This variable describes the activation level of the $i$th neuron. It is called the action
potential, or the short-term memory (STM) trace.

The variable $Z_{ij}$ associated with $v_i$'s interaction with another neuron $v_j$ is defined as

$Z_{ij}$ = neurotransmitter average release rate per unit axon signal frequency.

This is called the synaptic coupling coefficient (or weight) or the long-term memory (LTM) trace.
Figure 2.1.1. Schematic diagram of a neuron and a network: Neuron $v_i$ with potential $x_i$ relative to equilibrium sends a signal $S_{ij}$ along the axon to a target neuron $v_j$. The signal affects the target neuron with a coupling strength $Z_{ij}$. The signal may be excitatory ($Z_{ij} > 0$) or inhibitory ($Z_{ij} < 0$). Other inputs $I_i$ to the neuron model external stimuli.

Let us first derive the so-called additive STM equation. Assume a change in neuron potential from equilibrium (−70 mV) occurs. In general, the change is caused by internal and external processes. Thus, we have the following form for the STM trace:
$$\frac{dx_i}{dt} = \Bigl(\frac{dx_i}{dt}\Bigr)_{\text{internal}} + \Bigl(\frac{dx_i}{dt}\Bigr)_{\text{external}}. \qquad (2.1.1)$$
Assume inputs from other neurons and stimuli are additive. Then
$$\Bigl(\frac{dx_i}{dt}\Bigr)_{\text{external}} = \Bigl(\frac{dx_i}{dt}\Bigr)_{\text{excitatory}} + \Bigl(\frac{dx_i}{dt}\Bigr)_{\text{inhibitory}} + \Bigl(\frac{dx_i}{dt}\Bigr)_{\text{stimuli}}. \qquad (2.1.2)$$
Assuming further that the internal neuron processes are stable (that is, the neuron's potential decays exponentially to its equilibrium without external processes), we have
$$\Bigl(\frac{dx_i}{dt}\Bigr)_{\text{internal}} = -A_i(x_i)\, x_i, \qquad A_i(x_i) > 0. \qquad (2.1.3)$$
In most models discussed in this book and in much of the literature, $A_i$ is assumed to be a constant. But, in general, $A_i$ can vary as a function of $x_i$. See the example provided at the end of this section.
Assume additive synaptic excitation is proportional to the pulse train frequency. Then
$$\Bigl(\frac{dx_i}{dt}\Bigr)_{\text{excitatory}} = \sum_{k \ne i} S_{ki} Z_{ki}, \qquad (2.1.4)$$
where $S_{ki}$ is the average frequency of the signal evaluated at $v_i$ in the axon from $v_k$ to $v_i$. This is called the signal function. In general, $S_{ki}$ depends on the propagation time delay $\tau_{ki}$ from $v_k$ to $v_i$ and the threshold $\Gamma_k$ for firing of $v_k$ in the following manner:
$$S_{ki}(t) = f_k\bigl(x_k(t - \tau_{ki}) - \Gamma_k\bigr) \qquad (2.1.5)$$
for a given nonnegative function $f_k : \mathbb{R} \to [0, \infty)$. Commonly used forms of the signal function will be described in the next section. Here and in what follows, the propagation time delay is the time needed for the signal sent by $v_k$ to reach the receiving site of neuron $v_i$, and the threshold $\Gamma_k$ is the depolarization at which the neuron $v_k$ fires (i.e., transmits an impulse).
Assume hardwiring of the inhibitory inputs from other neurons, that is, their coupling strength is constant. Then
$$\Bigl(\frac{dx_i}{dt}\Bigr)_{\text{inhibitory}} = -\sum_{k \ne i} C_{ki}(t) \qquad (2.1.6)$$
with
$$C_{ki}(t) = c_{ki} f_k\bigl(x_k(t - \tau_{ki}) - \Gamma_k\bigr), \qquad (2.1.7)$$
where $c_{ki} > 0$ are constants.
We now turn to the LTM trace. Assume the excitatory coupling strength varies with time. A common model based on Hebb's law is
$$\frac{dZ_{ij}}{dt} = -B_{ij}(Z_{ij}) Z_{ij} + P_{ij}[x_j]^+, \qquad B_{ij}(Z_{ij}) > 0, \qquad (2.1.8)$$
where
$$P_{ij}(t) = d_{ij} f_i\bigl(x_i(t - \tau_{ij}) - \Gamma_i\bigr); \qquad (2.1.9)$$
here $d_{ij} > 0$ are constants and $[z]^+ = z$ if $z > 0$, and $[z]^+ = 0$ if $z \le 0$. The term $P_{ij}[x_j]^+$ shows that to increase $Z_{ij}$, $v_i$ must send a signal $P_{ij}$ to $v_j$ and, at the same time, $v_j$ must be activated ($x_j > 0$).
Putting the above formulations and the external stimuli together, we then get the following additive STM and passive decay LTM system, for $1 \le i \le n$, $1 \le j \le n$,
$$\begin{aligned}
\dot{x}_i(t) &= -A_i(x_i(t))\, x_i(t) + \sum_{k \ne i} f_k\bigl(x_k(t - \tau_{ki}) - \Gamma_k\bigr) Z_{ki}(t) - \sum_{k \ne i} c_{ki} f_k\bigl(x_k(t - \tau_{ki}) - \Gamma_k\bigr) + I_i(t), \\
\dot{Z}_{ij}(t) &= -B_{ij}(Z_{ij}(t))\, Z_{ij}(t) + d_{ij} f_i\bigl(x_i(t - \tau_{ij}) - \Gamma_i\bigr)[x_j(t)]^+.
\end{aligned} \qquad (2.1.10)$$
Note that we use $\dot{x}_i(t)$ for $\frac{d}{dt}x_i(t)$ and $\dot{Z}_{ij}(t)$ for $\frac{d}{dt}Z_{ij}(t)$.
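For concreteness, here is a minimal forward-Euler sketch of system (2.1.10) with constant $A_i$ and $B_{ij}$ and a sigmoid signal function; all numerical values (decay rates, delays, weights, step size) are arbitrary illustrative choices, not values from the text:

```python
import math

# Forward-Euler sketch of the additive STM / passive-decay LTM system (2.1.10)
# with constant A_i, B_ij. Delays are handled with a history buffer of x.
# All numerical values are illustrative assumptions.

n, dt, steps = 2, 0.01, 2000
A = [1.0, 1.0]                        # STM decay rates A_i
B = [[0.5] * n for _ in range(n)]     # LTM decay rates B_ij
c = [[0.2] * n for _ in range(n)]     # hardwired inhibitory coefficients c_ki
d = [[0.3] * n for _ in range(n)]     # Hebbian gains d_ij
Gamma = [0.0, 0.0]                    # firing thresholds
tau = [[0.5] * n for _ in range(n)]   # propagation delays tau_ki
I = [1.0, 0.2]                        # constant external inputs
f = lambda v: 1.0 / (1.0 + math.exp(-4.0 * v))   # sigmoid signal function

max_lag = int(max(max(row) for row in tau) / dt)
hist = [[0.0] * n for _ in range(max_lag + 1)]   # past values of x
x = [0.0] * n
Z = [[0.1] * n for _ in range(n)]

for _ in range(steps):
    def delayed(k, i):
        return hist[-1 - int(tau[k][i] / dt)][k]   # x_k(t - tau_ki)
    dx = []
    for i in range(n):
        exc = sum(f(delayed(k, i) - Gamma[k]) * Z[k][i] for k in range(n) if k != i)
        inh = sum(c[k][i] * f(delayed(k, i) - Gamma[k]) for k in range(n) if k != i)
        dx.append(-A[i] * x[i] + exc - inh + I[i])
    for i in range(n):
        for j in range(n):
            if i != j:
                Z[i][j] += dt * (-B[i][j] * Z[i][j]
                                 + d[i][j] * f(delayed(i, j) - Gamma[i]) * max(x[j], 0.0))
    x = [x[i] + dt * dx[i] for i in range(n)]
    hist.append(list(x))
    hist.pop(0)

print("x =", [round(v, 3) for v in x])
```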
We next derive the STM shunting equations. To motivate the shunting equation,
we start with the so-called membrane equation on which cellular neurophysiology is
based. The membrane equation, which appeared in the Hodgkin–Huxley model (see, for
example, Hodgkin and Huxley [1952]), describes the voltage $V(t)$ of a neuron by
$$C \frac{dV}{dt} = (V^+ - V)\, g^+ + (V^- - V)\, g^- + (V^p - V)\, g^p, \qquad (2.1.11)$$
as shown in Figure 2.1.2, where $C$ is the capacitance, $V^+$, $V^-$ and $V^p$ are constant excitatory, inhibitory and passive saturation points, respectively, and $g^+$, $g^-$ and $g^p$ are excitatory, inhibitory and passive conductances, respectively. Often $V^+$ represents the saturation point of a Na+ channel and $V^-$ represents the saturation point of a K+ channel.

Figure 2.1.2. Circuit analogy for a neuron membrane. $V^+$, $V^-$, $V^p$ are the excitatory, inhibitory and passive saturation points, respectively, inside a neuron. The voltages act like batteries in an electrical circuit. They produce a fluctuating output voltage $V(t)$ representing the action potential inside the neuron.
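Setting $dV/dt = 0$ in (2.1.11) shows that the voltage relaxes to a conductance-weighted average of the saturation points, $V_\infty = (V^+ g^+ + V^- g^- + V^p g^p)/(g^+ + g^- + g^p)$; the short sketch below simply evaluates this expression (the numerical values are illustrative assumptions):

```python
# Steady state of the membrane equation (2.1.11): with dV/dt = 0 the voltage is a
# conductance-weighted average of the saturation points. Values are illustrative.

def steady_state_voltage(v_plus, v_minus, v_passive, g_plus, g_minus, g_passive):
    total = g_plus + g_minus + g_passive
    return (v_plus * g_plus + v_minus * g_minus + v_passive * g_passive) / total

# e.g. an excitatory saturation near +50 mV, an inhibitory one near -80 mV,
# and a passive point near the resting level (assumed values)
print(steady_state_voltage(50.0, -80.0, -70.0, g_plus=0.5, g_minus=1.0, g_passive=1.0))
```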
At each neuron $v_i$, we let
$$V^+ = B_i, \qquad V^- = -D_i, \qquad V^p = 0,$$
$$g^+ = I_i + \sum_{k \ne i} S_{ki}^{(+)} Z_{ki}^{(+)}, \qquad g^- = J_i + \sum_{k \ne i} S_{ki}^{(-)} Z_{ki}^{(-)}, \qquad g^p = A_i, \qquad C = 1 \ \text{(rescale time)}. \qquad (2.1.12)$$
Then we obtain the following shunting STM equation
$$\dot{x}_i = -A_i x_i + (B_i - x_i)\Bigl[\sum_{k \ne i} S_{ki}^{(+)} Z_{ki}^{(+)} + I_i\Bigr] - (x_i + D_i)\Bigl[\sum_{k \ne i} S_{ki}^{(-)} Z_{ki}^{(-)} + J_i\Bigr], \qquad 1 \le i \le n, \qquad (2.1.13)$$
where
$$S_{ij}^{(+)}(t) = f_i\bigl(x_i(t - \tau_{ij}) - \Gamma_i\bigr), \qquad S_{ij}^{(-)}(t) = f_i\bigl(x_i(t - \tau_{ij}) - \Gamma_i\bigr), \qquad 1 \le j \ne i \le n,$$
and $Z_{ki}^{(+)}$ and $Z_{ki}^{(-)}$ are the corresponding LTM traces of the respective excitatory and inhibitory synaptic coupling. In the above shunting equations, $B_i$ and $-D_i$ are the maximal and minimal action potentials of neuron $v_i$. As we will see in subsequent chapters, in networks modeled by shunting equations, there exists some global control on the dynamics of the network which prevents too high or too low activity. Another derivation of shunting equations based on the mass action law will be given later.
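One way to see the global control just mentioned: in (2.1.13) the excitatory term vanishes at $x_i = B_i$ and the inhibitory term vanishes at $x_i = -D_i$, so an activity starting in $[-D_i, B_i]$ cannot leave that interval however strong the bombardment. A minimal sketch of a single shunting unit, with all parameter values chosen arbitrarily for illustration:

```python
# Sketch: a single shunting unit (cf. (2.1.13)) with constant excitatory and
# inhibitory bombardment E and J. Whatever E, J >= 0 are, x stays in [-D, B].
# Parameter values are illustrative.

def simulate_shunting(x0, A=1.0, B=1.0, D=0.5, E=50.0, J=30.0, dt=0.001, steps=10000):
    x = x0
    for _ in range(steps):
        dx = -A * x + (B - x) * E - (x + D) * J
        x += dt * dx
    return x

for E, J in [(50.0, 0.0), (0.0, 50.0), (100.0, 100.0)]:
    x_final = simulate_shunting(x0=0.0, E=E, J=J)
    print(f"E={E:6.1f} J={J:6.1f} -> x = {x_final:.3f} (always within [-0.5, 1.0])")
```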
Finally, we point out that it is often convenient to use a single node to represent a pool of interacting neurons of similar structures. Figure 2.1.3 shows a typical situation consisting of a group of interacting neurons and one output representing node.

Figure 2.1.3. If the time scales of the input and neuron dynamics are properly chosen, a single neuron $v_0$ may represent a group of neurons. In this interpretation, the activation level of $v_0$ is proportional to the fraction of excited neurons in the group.

For each neuron in the group, we have
$$\dot{x}_i = -A_i x_i + \sum_{k \ne i} S_{ki} Z_{ki} - \sum_{l \ne i} C_{li} + I_i, \qquad 1 \le i \le n, \qquad (2.1.14)$$
and
$$\dot{Z}_{ij} = -B_{ij} Z_{ij} + P_{ij}[x_j]^+, \qquad 1 \le i, j \le n, \qquad (2.1.15)$$
where the summation indices reflect the interconnections. For the output node, we have
$$\dot{x}_0 = -A_0 x_0 + \sum_{i=1}^{n} S_{i0} Z_{i0} \qquad (2.1.16)$$
and
$$\dot{Z}_{i0} = -B_{i0} Z_{i0} + P_{i0}[x_0]^+, \qquad 1 \le i \le n. \qquad (2.1.17)$$
Assume the following holds:

(H1) $Z_{i0} \approx$ constant $Z_0$ (slowly varying) for $1 \le i \le n$.

(H2) $S_{i0} \approx S_\infty(x_i)$ (short delays, low thresholds and step signal function) for $1 \le i \le n$, where $S_\infty(x_i) = 1$ if $x_i > 0$ and $S_\infty(x_i) = 0$ if $x_i \le 0$.

Then (2.1.16) becomes
$$\dot{x}_0 = -A_0 x_0 + \sum_{i=1}^{n} S_\infty(x_i) Z_0.$$
Consequently, if $A_0$ is large and the time scales for integrating inputs in (2.1.14) are short, then $x_0$ is proportional to the number of excited neurons in the pool.
This discussion leads to the following formulation for networks of groups of neurons: Consider a network consisting of groups $v_1, \ldots, v_n$ of interacting neurons. Let $x_i$ be defined as follows:

$x_i$ = action potentials of neurons in group $v_i$.

Assume $v_i$ has $B_i$ excitable sites. In a binary model (that is, neurons are on or off), we can then regard $x_i$ as the number of sites excited and $B_i - x_i$ the number of sites unexcited. However, due to the average random effects over short time intervals in the large number of neurons in group $v_i$, we may regard $x_i$ as the (continuous) potentials of neurons in $v_i$. Assuming also that $x_i$ decays exponentially to an equilibrium point (arbitrarily set equal to zero), the mass action law leads to the shunting equation
$$\dot{x}_i = -A_i x_i + (B_i - x_i)\Bigl[\sum_{k \ne i} S_{ki}^{(+)} Z_{ki}^{(+)} + I_i\Bigr] - (x_i + D_i)\Bigl[\sum_{l \ne i} S_{li}^{(-)} Z_{li}^{(-)} + J_i\Bigr]. \qquad (2.1.18)$$
Thus, the state equations for a single neuron have the same form as those for a group of interacting neurons, where $x_i$ is the excited state in the group and thus $A_i$ actually depends on $x_i$.
Throughout the remaining part of this book, we shall use $x_i$ to denote the activation level of a neuron or the excited states in a group, depending on the context and applications. Consequently, it is more appropriate to regard a neuron as a functional unit/node. In general, we will regard the functional units/nodes as collections of neurons sharing some common response properties. This idea has a particular physiological basis in the organization of the visual cortex (Hubel and Wiesel [1962,
1965]) and somatosensory cortex (Mountcastle [1957]) into columns of cells with the
same preferred stimuli. Moreover, columns that are close together also tend to have
preferred stimuli that are close together. Existence of such columnar organization has
also appeared in multimodality association areas of the cortex (Rosenkilde, Bauer and
Fuster [1981], Fuster, Bauer and Jerrey [1982], Goldman-Rakic [1984]).
2.2 Signal functions
Recall that
$$S_{ki} = f_k\bigl(x_k(t - \tau_{ki}) - \Gamma_k\bigr)$$
is the average frequency of the signal evaluated at $v_i$ in the axon from $v_k$ to $v_i$. Typical forms of the signal function $f_k$ include
• step function,
• piecewise linear function,
• sigmoid function.
The step function is defined as
$$f(v) = \begin{cases} 1 & \text{if } v > 0, \\ 0 & \text{if } v \le 0, \end{cases}$$
shown in Figure 2.2.1 (left). Models involving such a signal function are referred to as the McCulloch–Pitts models, in recognition of the pioneering work of McCulloch and Pitts [1943] (the function describes the all-or-none property of a neuron in the McCulloch–Pitts model).

Figure 2.2.1. A step signal function (left) and a piecewise linear signal function (right).
A piecewise linear function is given by
$$f(v) = \begin{cases} 0 & \text{if } v \le 0, \\ \beta v & \text{if } 0 < v \le 1/\beta, \\ 1 & \text{if } v > 1/\beta, \end{cases}$$
as shown in Figure 2.2.1 (right). This describes the nonlinear off–on characteristic of neurons; $\beta$ is called the neural gain. Note that a piecewise linear function reduces to the step function as $\beta \to \infty$. Such a function has been widely used in cellular neural network models. See, for example, Chua and Yang [1988a, 1988b] and Roska and Vandewalle [1993].
The sigmoid function is by far the most common form of a signal function. It is
defined as a strictly increasing smooth bounded function satisfying certain concavity
and asymptotic properties. An example of a sigmoid function is the logistic function
given by
$$f(v) = \frac{1}{1 + e^{-4\beta v}}, \qquad v \in \mathbb{R},$$
shown in Figure 2.2.2, where $\beta = f'(0) > 0$ is the neuron gain. Other examples include the inverse tangent function and the hyperbolic tangent function.

Figure 2.2.2. A sigmoid function $f(v) = \frac{1}{1 + e^{-4\beta v}}$.
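The three signal functions above transcribe directly into code (a short sketch; the convention $f(0) = 0$ for the step function follows the text's $S_\infty$):

```python
import math

# The three signal functions of Section 2.2; beta is the neural gain.

def step(v):
    return 1.0 if v > 0 else 0.0

def piecewise_linear(v, beta=1.0):
    if v <= 0:
        return 0.0
    return beta * v if v <= 1.0 / beta else 1.0

def logistic(v, beta=1.0):
    return 1.0 / (1.0 + math.exp(-4.0 * beta * v))   # f'(0) = beta

for v in (-1.0, -0.1, 0.0, 0.1, 1.0):
    print(f"{v:5.1f}  step={step(v):.0f}  pwl={piecewise_linear(v, 2.0):.2f}  "
          f"logistic={logistic(v, 2.0):.3f}")
```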
As $\beta \to \infty$, the sigmoid function becomes the step function. Whereas a step function assumes the values of 0 or 1, a sigmoid function assumes a continuous range of values in $(0, 1)$. The smoothness of the sigmoid function is important: it allows analog signal processing and it makes many mathematical theories applicable. Another important feature of the bounded sigmoid function is that it limits the magnitude of a neural signal's impact on its receiving neurons. It was noted, see Levine [1991], that if the firing threshold of an all-or-none neuron is described by a random variable with a Gaussian (normal) distribution, then the expected value of its output signal is a sigmoid function of activity. For this and other reasons, sigmoids have become increasingly popular in recent neural network models. Also, there has been some physiological verification of sigmoid signal functions at the neuron level; see Rall [1955] and Kernell [1965].
Other signal functions are also possible. As will be seen in subsequent chapters,
many global neural network properties are not sensitive to the choice of particular
signal functions, but some are. Choice of a signal function also depends on the
applications considered.
Signal functions are also referred to as activation functions, amplification functions,
input-output functions, etc., in the literature. Such a signal function gives the important
nonlinearity of a neural network.
2.3 General models and network architectures
With the neurobiological analog as the source of inspiration, the theory and applica-
tions of ANNs have been advanced rapidly. Recall that by an artificial neural network
we mean a machine that is designed to model the way in which a biological neural
network performs a certain task.
Figure 2.3.1. A general model of five basic elements: activity variable, signal function, threshold/bias, synaptic weight and propagation delay, and external inputs.

In biological and artificial neural networks, we can regard a neuron as an information-processing unit. Based on our discussions, for a network of $n$ neurons $v_1, \ldots, v_n$ shown in Figure 2.3.1, we may identify five basic elements associated with neuron $v_i$ (Haykin [1994] and Müller and Reinhardt [1991]):
(a) a variable $x_i$ describing the activity of $v_i$;
(b) a signal function $f_i$ transforming $x_i$ into an output variable $y_i$;
(c) an externally applied threshold $\Gamma_i$ that has the effect of lowering the input of the signal function, or an externally applied bias that has the opposite effect of a threshold;
(d) a set of synapses connecting to other neurons $v_j$ ($1 \le j \le n$), each of which is characterized by a weight or strength $w_{ji}$ and a propagation delay $\tau_{ji}$ (the propagation delay is always positive, the weight is positive/negative if the associated synapse is excitatory/inhibitory);
(e) an externally applied input $I_i$.

Viewing the above figure dynamically, we have
$$y_i(t) = f_i\bigl(x_i(t) - \Gamma_i\bigr), \qquad u_{ji}(t) = w_{ji}\, y_i(t - \tau_{ji}),$$
and thus the input arising from the signal $x_i$ through the synapse $(w_{ji}, \tau_{ji})$ to the neuron $v_j$ is given by
$$u_{ji}(t) = w_{ji}\, f_i\bigl(x_i(t - \tau_{ji}) - \Gamma_i\bigr).$$
We then can define a neural network as a labeled graph (see Figure 2.3.2) which
has the following properties:
Figure 2.3.2. A network regarded as a labeled graph.
(1) an activity variable $x_i$ is associated with each node $i$;
(2) a real-valued weight $w_{ij}$ and a nonnegative delay $\tau_{ij}$ are associated with the link $(ji)$ (called a synapse) from node $j$ to node $i$;
(3) a real-valued threshold/bias $\Gamma_i$ is associated with node $i$;
(4) a real-valued input function $I_i$ of time $t$ is associated with node $i$;
(5) a signal function $f_i$ is defined for node $i$, which determines the output of the node as a function of its threshold/bias and its activity $x_i$;
(6) the net input to node $i$ is given by the weighted sum $\sum_{j \ne i} w_{ij} f_j\bigl(x_j(t - \tau_{ij}) - \Gamma_j\bigr)$ of the outputs of all other nodes connected to $i$, with corresponding delays.¹
Recall that nodes are called neurons in the standard terminology. Nodes without links
towards them are called input neurons and nodes with no link leading away from them
are called output neurons.
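As a concrete reading of properties (1)–(6), the sketch below stores, for each node, its threshold, external input and signal function, and, for each link, its weight and delay; the particular data structure and the `history` interface are our own illustrative assumptions, not the book's:

```python
import math

# A neural network as a labeled graph, following properties (1)-(6):
# nodes carry an activity x_i, a threshold Gamma_i, an input I_i(t) and a signal
# function f_i; each link (j -> i) carries a weight w_ij and a delay tau_ij.

class LabeledGraphNetwork:
    def __init__(self):
        self.thresholds = {}        # i -> Gamma_i
        self.inputs = {}            # i -> I_i(t), a function of time
        self.signal = {}            # i -> f_i
        self.links = {}             # (j, i) -> (w_ij, tau_ij)

    def add_node(self, i, threshold=0.0, input_fn=lambda t: 0.0, signal_fn=None):
        self.thresholds[i] = threshold
        self.inputs[i] = input_fn
        self.signal[i] = signal_fn or (lambda v: 1.0 / (1.0 + math.exp(-v)))

    def add_link(self, j, i, weight, delay=0.0):
        self.links[(j, i)] = (weight, delay)

    def net_input(self, i, t, history):
        """Weighted, delayed sum of the outputs of all nodes linked to i (property (6)).
        `history(j, s)` must return the activity x_j at time s (an assumed interface)."""
        total = self.inputs[i](t)
        for (j, target), (w, tau) in self.links.items():
            if target == i:
                total += w * self.signal[j](history(j, t - tau) - self.thresholds[j])
        return total

net = LabeledGraphNetwork()
net.add_node(1, threshold=0.2)
net.add_node(2, threshold=0.0, input_fn=lambda t: 0.5)
net.add_link(1, 2, weight=0.8, delay=0.3)   # excitatory synapse from node 1 to node 2
print(net.net_input(2, t=1.0, history=lambda j, s: 1.0))
```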
In general, we may identify two types of network architectures: feedforward net-
works and feedback networks. Figure 2.3.3 shows a feedforward network which
consists of one layer of 3 neurons (called input layer) that projects onto another layer
of 2 neurons (called output layer), but not vice versa.
Such a network is called a single-layer network (the layer of input neurons is not
counted since no information processing is performed here). A more complicated
feedforward network is given in Figure 2.3.4, which consists of a layer of 4 input
neurons, a layer of so-called hidden neurons and a layer of 2 output neurons.

Figure 2.3.3. A feedforward network consisting of 3 input neurons and 2 output neurons.

Figure 2.3.4. A network consisting of 3 layers: the input layer, the hidden layer and the output layer.

¹ There is another method to calculate the net input to node $i$, given by $f_i\bigl(\sum_{j \ne i} w_{ij}\, x_j(t - \tau_{ij}) - \Gamma_i\bigr)$. Models based on such a calculation are usually referred to as Wilson–Cowan equations. See Wilson and Cowan [1972], Grossberg [1972] and Pineda [1987].

The
function of the hidden neurons is to intervene between the external input and the
network output. By adding one or more hidden layers, the network is able to extract
higher-order statistics, for the network acquires a global perspective despite its local
connectivity by virtue of the extra set of synaptic connections and the extra dimension
of neural interactions (Churchland and Sejnowski [1992]). A feedforward network is
a special example of the so-called cascades of neurons.
In a general multilayer feedforward network, the neurons in the input layer of
the network supply respective elements of the activation pattern (input vector), which
constitute the input signals applied to the neurons in the second layer (i.e., the first
hidden layer). The output signals of the second layer are used as inputs to the third
layer, and so on for the rest of the network.
Typically, the neurons in each layer of the network have as their inputs the output
signals of the preceding layer only. The set of output signals of the neurons in the
output (final) layer of the network constitutes the overall response of the network to the
activation pattern supplied by the neurons in the input (first) layer. The architectural
graph of Figure 2.3.5 illustrates the layout of a multilayer feedforward neural network
for the case of a single hidden layer. For brevity, the network of Figure 2.3.5 is referred
to as a 4-3-2 network since it has 4 input neurons, 3 hidden neurons, and 2 output
neurons. As another example, a feedforward network with ρ neurons in the input
layer (source nodes), h neurons in the first layer, hi neurons in the second layer, and
q neurons in the output layer, is referred to as a ρ — h — /12 — q network.
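A minimal sketch of how such a multilayer feedforward network computes: each layer applies its signal function to weighted sums of the previous layer's outputs. The random weights and the logistic signal function are arbitrary illustrative choices:

```python
import math, random

# Sketch of a fully connected p - h1 - h2 - q feedforward pass.
# Each layer applies a sigmoid signal function to weighted sums of the previous
# layer's outputs. Random weights are used purely for illustration.

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def make_layer(n_in, n_out):
    return [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]

def forward(layers, pattern):
    activity = pattern
    for weights in layers:                       # input -> hidden1 -> hidden2 -> output
        activity = [sigmoid(sum(w * a for w, a in zip(row, activity))) for row in weights]
    return activity

random.seed(0)
p, h1, h2, q = 4, 3, 3, 2                        # a 4-3-3-2 network
layers = [make_layer(p, h1), make_layer(h1, h2), make_layer(h2, q)]
print(forward(layers, pattern=[1.0, 0.0, 0.5, 0.25]))
```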
Figure 2.3.5. A fully connected feedforward network with one hidden layer.

The neural network of Figure 2.3.5 is said to be fully connected in the sense that
every neuron in each layer of the network is connected to every other neuron in the
adjacent forward layer. If, however, some of the synaptic connections are missing
from the network, we say that the network is partially connected. A form of partially
connected multilayer feedforward network of particular interest is a locally connected
network. An example of such a network with a single hidden layer is presented in
Figure 2.3.6, where each neuron in the hidden layer is connected to a local (partial) set
of input neurons that lies in its immediate neighborhood (such a set of localized neurons
feeding a neuron is said to constitute the receptive field of the neuron). Likewise, each
neuron in the output layer is connected to a local set of hidden neurons.
Figure 2.3.6. A partially connected feedforward network.
Another type of network contains feedback, and thus is called a feedback network
or a recurrent network. For example, a recurrent network may consist of a single layer
of neurons with each neuron feeding its output signal back to the inputs of all the other
neurons, as illustrated in Figure 2.3.7.
Figure 2.3.7. A recurrent network with no self-feedback loops and no hidden
neurons.
In the structure depicted in the figure there is no self-feedback in the network
(self-feedback refers to a situation where the output of a neuron is fed back to its own
input). The recurrent network illustrated in Figure 2.3.7 also has no hidden neurons.
In Figure 2.3.8 we illustrate another class of recurrent networks with hidden neurons.
The feedback connections shown in Figure 2.3.8 originate from the hidden neurons as
well as the output neurons. The presence of feedback loops, be it as in the recurrent
structure of Figure 2.3.7 or that of Figure 2.3.8, has a profound impact on the learning
capability of the network, and on its performance.
To conclude this chapter, we note that the performance of a network is determined
by many factors including
(1) the internal decay rates;
(2) the external inputs;
(3) the synaptic weights;
(4) the propagation delays;
(5) the signal functions and the thresholds/biases;
(6) the connection topology.
In the deterministic models we discussed, a network with a given connection topology
is a dynamical system in the sense that given initial activation levels and initial synaptic
coupling coefficients and when all other factors are considered as parameters, the future
activation levels and synaptic coupling coefficients can be calculated. This is called
the joint activation-weight dynamics. In applications, however, one often separates the
activation dynamics from the weight dynamics. There are many schemes for adaptively
determining the synaptic weights of a network in order to achieve particular kinds
of tasks such as pattern classification, or to obtain some desired network outputs from
a given set of inputs. Such a scheme usually determines a discrete dynamical system
(a system of difference equations) or a continuous dynamical system (a system of
differential equations) in the space of matrices of synaptic coupling coefficients.

Figure 2.3.8. A recurrent network with hidden neurons.

This
is the weight dynamics. Once the connection topology and the synaptic weights are
determined and when other factors are regarded as parameters, the future activation
levels can be predicted by specifying the initial activation levels. This is the activation
dynamics. How various factors affect this activation dynamics will be the central
subject of the remaining part of this book.
Chapter 3
Simple networks
In this chapter, we present several simple networks that perform some elementary
functions such as storing, recalling and recognizing neuron activation patterns. We will
discuss a special connection topology which seems to be quite effective in solving the
noise-saturation dilemma. Assuming this connection topology, we will also indicate
how the choice of signal functions affects the limiting patterns of a network and how
to determine the synaptic weights.
3.1 Outstars: pattern learning
An outstar is a neural network which is designed for learning and recalling external
inputs impressed on an array of neurons. The network consists of $(n+1)$ neurons: the
command neuron $v_0$ and the input neurons $v_1, \ldots, v_n$. The axon of the command neuron
connects with the input neurons. Axons of the input neurons are not considered because
they do not affect the outstar's functioning.
The name "outstar" comes from the geometry when arranging the input neurons
in a circle with the command neuron in the center. Figure 3.1.1 gives a schematic
picture of an outstar.

Figure 3.1.1. An outstar neural network.

The inputs $(I_1, \ldots, I_n)$ activate the input neurons, and the input $I_0$ turns on the command neuron, producing axon signals to the input neurons. The external inputs and axon signals modify the LTM traces, storing the input spatial pattern (defined later) in the LTM trace. After removing the inputs, the spatial pattern can be restored and recalled across the input neurons by turning on $v_0$.
Following Harvey [1994], we can formulate a simple mathematical model of the
outstar, by using additive STM and passive decay LTM, as follows:
$$\begin{aligned}
\dot{x}_0(t) &= -\alpha x_0(t) + I_0(t), \\
\dot{x}_i(t) &= -\alpha x_i(t) + S_i(t) Z_i(t) + I_i(t), \\
\dot{Z}_i(t) &= -\beta Z_i(t) + U_i(t)[x_i(t)]^+, \qquad 1 \le i \le n,
\end{aligned} \qquad (3.1.1)$$
where $\alpha$ and $\beta$ are positive constants,
$$S_i(t) = f\bigl(x_0(t - \tau) - \Gamma\bigr), \qquad U_i(t) = g\bigl(x_0(t - \tau) - \Gamma\bigr), \qquad 1 \le i \le n, \qquad (3.1.2)$$
and $f, g : \mathbb{R} \to \mathbb{R}$ are signal functions with certain positive constant coefficients. Here, for the sake of simplicity, we have assumed that this is a "regular" outstar, so the synaptic coupling from the command neuron to the input neurons is uniform in strength, resulting in the same signal functions, the same transmission delay and the same threshold for each input neuron. We will also consider only the case where $f$ and $g$ are smooth functions and $(I_0, I_1, \ldots, I_n)$ are nonnegative constants. More general situations are discussed in the appendix to this chapter.
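The outstar equations (3.1.1)–(3.1.2) can be integrated directly to watch a pattern being stored and then recalled; the sketch below uses arbitrary illustrative parameters chosen so that the convergence condition $\alpha\beta > f^* g^*$ derived below holds:

```python
import math

# Euler sketch of the outstar (3.1.1)-(3.1.2): learn a spatial pattern, then recall it.
# alpha, beta, tau, Gamma and the signal functions are illustrative choices with
# alpha*beta > f*g*, so the convergence condition (3.1.7) below is satisfied.

alpha, beta, tau, Gamma, dt = 1.0, 1.0, 0.1, 0.1, 0.001
f = lambda v: 0.5 / (1.0 + math.exp(-4.0 * v))       # bounded by 0.5
g = lambda v: 0.5 / (1.0 + math.exp(-4.0 * v))       # so f* g* < alpha*beta = 1

def run(I0, I, T, state):
    x0, x, Z, hist = state
    lag = int(tau / dt)
    for _ in range(int(T / dt)):
        x0_del = hist[-1 - lag]                      # x_0(t - tau)
        S, U = f(x0_del - Gamma), g(x0_del - Gamma)
        x0 += dt * (-alpha * x0 + I0)
        x = [xi + dt * (-alpha * xi + S * Zi + Ii) for xi, Zi, Ii in zip(x, Z, I)]
        Z = [Zi + dt * (-beta * Zi + U * max(xi, 0.0)) for Zi, xi in zip(Z, x)]
        hist.append(x0)
        hist.pop(0)
    return x0, x, Z, hist

n, theta = 3, [0.5, 0.3, 0.2]                        # reflectance coefficients
state = (0.0, [0.0] * n, [0.0] * n, [0.0] * (int(tau / dt) + 1))
state = run(I0=2.0, I=[4.0 * th for th in theta], T=40.0, state=state)   # learning
state = run(I0=1.0, I=[0.0] * n, T=10.0, state=state)                    # recall
x = state[1]
print("recalled pattern:", [round(xi / sum(x), 3) for xi in x], "target:", theta)
```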
The first equation can be easily solved by the variation-of-constants formula
$$x_0(t) = e^{-\alpha t} x_0(0) + \frac{I_0}{\alpha}\bigl(1 - e^{-\alpha t}\bigr), \qquad t \ge 0, \qquad (3.1.3)$$
from which it follows that
$$x_0(t) \to \frac{I_0}{\alpha} \quad \text{as } t \to \infty. \qquad (3.1.4)$$
Assume $\frac{I_0}{\alpha} > \Gamma$. Substituting this into the equations for $x_i$ and $Z_i$, we then obtain the following linear nonhomogeneous system
$$\begin{aligned}
\dot{x}_i(t) &= -\alpha x_i(t) + f^* Z_i(t) + I_i(t), \\
\dot{Z}_i(t) &= -\beta Z_i(t) + g^* x_i(t),
\end{aligned} \qquad 1 \le i \le n, \qquad (3.1.5)$$
as the system describing the limiting behaviors of nonnegative solutions $(x_1(t), \ldots, x_n(t))$ and $(Z_1(t), \ldots, Z_n(t))$ as $t \to \infty$, where
$$f^* = f\Bigl(\frac{I_0}{\alpha} - \Gamma\Bigr), \qquad g^* = g\Bigl(\frac{I_0}{\alpha} - \Gamma\Bigr). \qquad (3.1.6)$$
It is interesting to note that (3.1.5) decouples into $n$ systems of planar equations, and it is easy to show that if
$$\alpha\beta > f^* g^*, \qquad (3.1.7)$$
then
$$x_i(t) \to x_i^* = \frac{\beta I_i}{\alpha\beta - f^* g^*}, \qquad Z_i(t) \to Z_i^* = \frac{g^* I_i}{\alpha\beta - f^* g^*} \quad \text{as } t \to \infty. \qquad (3.1.8)$$
We also note that for the total activity, total LTM trace and total input given respectively by
$$X(t) = \sum_{i=1}^{n} x_i(t), \qquad Z(t) = \sum_{i=1}^{n} Z_i(t), \qquad I(t) = \sum_{i=1}^{n} I_i(t), \qquad (3.1.9)$$
we have
$$\begin{cases} \dot{X}(t) = -\alpha X(t) + f^* Z(t) + I(t), \\ \dot{Z}(t) = -\beta Z(t) + g^* X(t), \end{cases} \qquad (3.1.10)$$
from which it follows that
$$X(t) \to \frac{\beta I}{\alpha\beta - f^* g^*}, \qquad Z(t) \to \frac{g^* I}{\alpha\beta - f^* g^*} \quad \text{as } t \to \infty. \qquad (3.1.11)$$
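The fixed point in (3.1.8) and condition (3.1.7) are easy to verify numerically for a single planar subsystem of (3.1.5); the parameter values below are arbitrary illustrations:

```python
# Check (3.1.7)-(3.1.8) for one planar subsystem of (3.1.5):
#   x' = -alpha*x + fstar*Z + I_i,   Z' = -beta*Z + gstar*x.
# Parameter values are arbitrary illustrations with alpha*beta > fstar*gstar.

alpha, beta, fstar, gstar, I_i = 1.0, 1.0, 0.5, 0.4, 2.0
assert alpha * beta > fstar * gstar            # condition (3.1.7)

denom = alpha * beta - fstar * gstar
x_star = beta * I_i / denom                    # limits in (3.1.8)
Z_star = gstar * I_i / denom

# Euler integration from an arbitrary initial state converges to (x_star, Z_star).
x, Z, dt = 0.0, 0.0, 0.001
for _ in range(100000):
    x, Z = x + dt * (-alpha * x + fstar * Z + I_i), Z + dt * (-beta * Z + gstar * x)

print(f"predicted ({x_star:.4f}, {Z_star:.4f}), simulated ({x:.4f}, {Z:.4f})")
```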
For further discussions, it is convenient to introduce the so-called reflectance coefficients $(\theta_1, \ldots, \theta_n)$. These are nonnegative constants such that
$$I_i(t) = \theta_i I(t) = \theta_i \sum_{j=1}^{n} I_j. \qquad (3.1.12)$$
We say that these coefficients define a spatial pattern. Such a spatial pattern is important in describing neural activities. For example, in vision the identification of a picture is invariant under fluctuations in total intensity $I(t)$ over a broad physiological range, and the relative intensity $\theta_i$ at each spatial point characterizes the picture. Convergence
and the relative intensity 0,· at each spatial point characterizes the picture. Convergence
(3.1.8) shows that the steady-state STM trace and LTM trace are both proportional to
the input spatial pattern. Equation (3.1.10) and convergence (3.1.11), however, show
that the total STM trace X(t) and the total LTM trace Z(t) are completely determined
by the total intensity of the external inputs to input neurons.
In summary, in outstar learning, one applies /o and (/i , . . . , / „ ) to the network: 7o
excites vo causing a (sampling) signal to the input neurons. The impact of this sample
signal and the applied spatial pattern cause the synaptic coupling relax towards its
steady state, storing the pattern. When the sampling signal and external inputs are
removed, the LTM trace retains the pattern.
To recall the stored pattern, one activates the command neuron v_0 with a new input \tilde{I}_0. Without external inputs to the input neurons, the STM for v_i becomes
\dot{x}_i(t) = -\alpha x_i(t) + f(x_0(t-\tau) - \Gamma)\, Z_i^*
and thus v_i relaxes to
x_i(t) \to \frac{1}{\alpha} f\Big(\frac{\tilde{I}_0}{\alpha} - \Gamma\Big) Z_i^* = \frac{f\big(\frac{\tilde{I}_0}{\alpha} - \Gamma\big)\, g^*}{\alpha(\alpha\beta - f^* g^*)}\, I_i \quad \text{as } t \to \infty.
In particular, \lim_{t\to\infty} x_j(t)/X(t) = \theta_j for all j = 1, \ldots, n. That is, the readout signal recalls the spatial pattern on the input neurons, regardless of the activating new input to the command neuron and regardless of the initial state of the input neurons.
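The recall computation above can be mirrored numerically. In the sketch below the stored traces Z_i^* are computed from (3.1.8), a new command input is applied (its value, like all parameters here, is an illustrative assumption), and the relaxed STM values reproduce the stored pattern after normalization.

import numpy as np

alpha, beta = 1.0, 1.0
f_star, g_star = 0.54, 0.54            # assumed signal values during learning
theta = np.array([0.5, 0.3, 0.2])      # learned reflectance pattern
I_total = 1.0
Z_star = g_star * theta * I_total / (alpha * beta - f_star * g_star)   # stored LTM, (3.1.8)

# recall phase: a new command input, no external inputs to the input neurons
I0_new, Gamma = 5.0, 0.2
f = lambda u: 0.3 * max(u, 0.0)
readout = f(I0_new / alpha - Gamma) * Z_star / alpha   # limiting value of x_i(t)

print("readout ratios:", readout / readout.sum())
print("stored pattern:", theta)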
Figure 3.1.2. A simple outstar simulating Pavlovian conditioning. (Labels in the figure: conditional stimulus (bell), unconditional stimulus (food), unconditional response (salivation), and the synaptic trace Z_{CS}.)
The above discussion gives insights into the learning mechanism of Pavlovian conditioning. A famous example of Pavlovian conditioning is as follows: a hungry dog, presented with food (the unconditional stimulus, or UCS), will salivate (the unconditional response, or UCR). A bell (the conditional stimulus, or CS) does not initially elicit salivation, but will do so after the CS and UCS have been paired several times. To apply the above discussion, assume a neuron v_0 is activated by the CS and a neuron v_1 is activated by the UCS. Then the synaptic coupling Z_{CS} between v_0 and v_1 increases and eventually the CS alone produces the UCR.
The remaining part of this section provides a rigorous proof, taken from Huang
and Wu [1999a], of the convergence result (3.1.8) for system (3.1.1). For more general
results, called Grossberg's Learning Theorem, we refer to the appendix.
We first recall Gronwall's inequality, stated as follows.
Lemma 3.1.1. Suppose that t_0 \in \mathbb{R} and that \varphi, \psi, w, \xi : [t_0, \infty) \to \mathbb{R} are continuous functions with w(t), \xi(t) \ge 0 for all t \ge t_0. If
\varphi(t) \le \psi(t) + \xi(t) \int_{t_0}^{t} w(s)\,\varphi(s)\,ds \quad \text{for } t \ge t_0, \qquad (3.1.13)
then
\varphi(t) \le \psi(t) + \xi(t) \int_{t_0}^{t} \psi(s)\,w(s)\, e^{\int_{s}^{t} \xi(u)w(u)\,du}\,ds \quad \text{for } t \ge t_0. \qquad (3.1.14)
To prove Lemma 3.1.1, we multiply (3.1.13) by w(t) and obtain, for t \ge t_0,
\frac{d}{dt}\Big[ e^{-\int_{t_0}^{t} \xi(u)w(u)\,du} \int_{t_0}^{t} \varphi(s)w(s)\,ds \Big] \le \psi(t)w(t)\, e^{-\int_{t_0}^{t} \xi(u)w(u)\,du}.
Thus integration yields
e^{-\int_{t_0}^{t} \xi(u)w(u)\,du} \int_{t_0}^{t} \varphi(s)w(s)\,ds \le \int_{t_0}^{t} \psi(s)w(s)\, e^{-\int_{t_0}^{s} \xi(u)w(u)\,du}\,ds,
or equivalently
\int_{t_0}^{t} \varphi(s)w(s)\,ds \le \int_{t_0}^{t} \psi(s)w(s)\, e^{\int_{s}^{t} \xi(u)w(u)\,du}\,ds.
Substituting this into (3.1.13), we obtain (3.1.14).
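Gronwall-type bounds such as (3.1.14) are easy to sanity-check numerically. In the sketch below (an illustrative experiment, not part of the text) \psi, \xi and w are constants, \varphi is chosen to satisfy (3.1.13) with equality, and the right-hand side of (3.1.14), evaluated by the trapezoidal rule, reproduces \varphi up to quadrature error, so the bound is attained in this case.

import numpy as np

t0, psi, xi, w = 0.0, 2.0, 0.5, 1.0        # constant data (illustrative)
t = np.linspace(t0, 3.0, 2001)
dt = t[1] - t[0]

phi = psi * np.exp(xi * w * (t - t0))      # satisfies (3.1.13) with equality

# right-hand side of (3.1.14): psi + xi * integral over [t0, t] of psi*w*exp(xi*w*(t - s)) ds
bound = np.empty_like(t)
for k, tk in enumerate(t):
    integrand = psi * w * np.exp(xi * w * (tk - t[: k + 1]))
    bound[k] = psi + xi * np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dt

print(np.max(np.abs(bound - phi)))         # small: bound and phi agree here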
Lemma 3.1.2. Let t_0 \in \mathbb{R} and let P, Q, F, G, I : [t_0, \infty) \to \mathbb{R} be continuous, with I bounded and
P(t), \ Q(t), \ F(t), \ G(t) \to 0 \quad \text{as } t \to \infty.
Assume that \alpha, \beta, \delta and \gamma are nonnegative constants such that the real parts of all eigenvalues of A = \begin{pmatrix} -\alpha & \delta \\ \gamma & -\beta \end{pmatrix} are negative. Then every solution of
\dot{x} = -\alpha x + \delta z + P(t)x + F(t)z + I(t),
\dot{z} = \gamma x - \beta z + G(t)x + Q(t)z \qquad (3.1.15)
is bounded on [t_0, \infty).
Proof. Let y(t) = \binom{x(t)}{z(t)} be a given solution of (3.1.15) and let X(t) = \begin{pmatrix} P(t) & F(t) \\ G(t) & Q(t) \end{pmatrix} for t \ge t_0. Denote by \|\cdot\| the Euclidean norm in \mathbb{R}^2, and for a 2 \times 2 real matrix B let \|B\| be the matrix norm, so that \|B\binom{x}{z}\| \le \|B\|\,\|\binom{x}{z}\| for all \binom{x}{z} \in \mathbb{R}^2. Applying the variation-of-constants formula to (3.1.15), we get
y(t) = e^{A(t-t_0)} \binom{x(t_0)}{z(t_0)} + \int_{t_0}^{t} e^{A(t-s)} \binom{I(s)}{0}\,ds + \int_{t_0}^{t} e^{A(t-s)} X(s)\,y(s)\,ds, \quad t \ge t_0.
We can find constants K, \varepsilon > 0 so that
\|e^{At}\| \le K e^{-\varepsilon t} \quad \text{for } t \ge 0.
Therefore,
\|y(t)\| \le K e^{-\varepsilon(t-t_0)} \Big\|\binom{x(t_0)}{z(t_0)}\Big\| + K \int_{t_0}^{t} e^{-\varepsilon(t-s)} |I(s)|\,ds + K \int_{t_0}^{t} e^{-\varepsilon(t-s)} \|X(s)\|\,\|y(s)\|\,ds, \quad t \ge t_0.
Let
C = \sup_{t \ge t_0} \Big\{ K e^{-\varepsilon(t-t_0)} \Big\|\binom{x(t_0)}{z(t_0)}\Big\| + K \int_{t_0}^{t} e^{-\varepsilon(t-s)} |I(s)|\,ds \Big\}.
Clearly, C < \infty and Lemma 3.1.1 implies
\|y(t)\| \le C + KC \int_{t_0}^{t} e^{-\varepsilon(t-s)} \|X(s)\|\, e^{K \int_{s}^{t} \|X(u)\|\,du}\,ds
= C + KC \int_{t_0}^{t} e^{\int_{s}^{t} [-\varepsilon + K\|X(u)\|]\,du}\, \|X(s)\|\,ds.
Since \|X(t)\| \to 0 as t \to \infty, we can find T \ge t_0 so that
\|X(u)\| \le 1 \quad \text{and} \quad -\varepsilon + K\|X(u)\| \le -\frac{\varepsilon}{2} \quad \text{for } u \ge T.
Therefore, for t \ge T, we have
\|y(t)\| \le C + KC \Big[ \int_{t_0}^{T} e^{\int_{s}^{t} [-\varepsilon + K\|X(u)\|]\,du}\, \|X(s)\|\,ds + \int_{T}^{t} e^{\int_{s}^{t} [-\varepsilon + K\|X(u)\|]\,du}\, \|X(s)\|\,ds \Big]
\le C + KC\, e^{\int_{T}^{t} [-\varepsilon + K\|X(u)\|]\,du} \int_{t_0}^{T} e^{\int_{s}^{T} [-\varepsilon + K\|X(u)\|]\,du}\, \|X(s)\|\,ds + KC \int_{T}^{t} e^{-\frac{\varepsilon}{2}(t-s)}\,ds
\le C + KC \int_{t_0}^{T} e^{\int_{s}^{T} [-\varepsilon + K\|X(u)\|]\,du}\, \|X(s)\|\,ds + \frac{2KC}{\varepsilon}.
This proves the boundedness of y on [t_0, \infty).
Lemma 3.1.3. Assume that all the conditions of Lemma 3.1.2 are satisfied. Let \binom{X}{Z} : [t_0, \infty) \to \mathbb{R}^2 be the solution of
\dot{X} = -\alpha X + \delta Z + I(t),
\dot{Z} = \gamma X - \beta Z \qquad (3.1.16)
with X(t_0) = x(t_0) and Z(t_0) = z(t_0). Then
\lim_{t\to\infty} [X(t) - x(t)] = \lim_{t\to\infty} [Z(t) - z(t)] = 0.
Proof. Using the variation-of-constants formula for (3.1.15) and (3.1.16) and the fact that X(t_0) - x(t_0) = Z(t_0) - z(t_0) = 0, we obtain
\binom{X(t) - x(t)}{Z(t) - z(t)} = -\int_{t_0}^{t} e^{A(t-s)} \begin{pmatrix} P(s) & F(s) \\ G(s) & Q(s) \end{pmatrix} \binom{x(s)}{z(s)}\,ds.
By Lemma 3.1.2, we can find a constant K_1 > 0 such that \big\|\binom{x(s)}{z(s)}\big\| \le K_1 for all s \ge t_0. Therefore,
\Big\|\binom{X(t) - x(t)}{Z(t) - z(t)}\Big\| \le K K_1 \int_{t_0}^{t} e^{-\varepsilon(t-s)} \Big\| \begin{pmatrix} P(s) & F(s) \\ G(s) & Q(s) \end{pmatrix} \Big\|\,ds \to 0 \quad \text{as } t \to \infty,
since
\begin{pmatrix} P(s) & F(s) \\ G(s) & Q(s) \end{pmatrix} \to 0 \quad \text{as } s \to \infty.
This completes the proof.
To apply the above results to (3.1.1), we note that x_0(t) \to \frac{I_0}{\alpha} as t \to \infty, and hence
S_i(t) \to f^*, \quad U_i(t) \to g^* \quad \text{as } t \to \infty.
We can then write (3.1.1) as (3.1.15) with
x(t) = x_i(t), \quad z(t) = Z_i(t), \quad \delta = f^*, \quad \gamma = g^*,
P(t) = 0, \quad F(t) = S_i(t) - f^*, \quad G(t) = U_i(t) - g^*, \quad Q(t) = 0.
Moreover, (3.1.5) is exactly the associated "limiting" equation (3.1.16) of (3.1.15), and condition (3.1.7) guarantees that both eigenvalues of A = \begin{pmatrix} -\alpha & f^* \\ g^* & -\beta \end{pmatrix} have negative real parts. Lemma 3.1.3 then ensures that
\lim_{t\to\infty} x_i(t) = x_i^*, \quad \lim_{t\to\infty} Z_i(t) = Z_i^* \quad \text{for } 1 \le i \le n.
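As a quick check (with the illustrative values used in the earlier sketches), condition (3.1.7) indeed makes the matrix A of Lemma 3.1.2, with \delta = f^* and \gamma = g^*, a stable matrix:

import numpy as np

alpha, beta, f_star, g_star = 1.0, 1.0, 0.54, 0.54   # satisfies alpha*beta > f*g* (3.1.7)
A = np.array([[-alpha, f_star], [g_star, -beta]])
print(np.linalg.eigvals(A))   # both eigenvalues negative, so Lemma 3.1.2 applies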
3.2 Instars: pattern recognition
An instar is a neural network for recognizing spatial patterns. As with outstars, the name comes from the geometry. The network consists of n input neurons (v_1, \ldots, v_n) and one output neuron v_0. When inputs (I_1, \ldots, I_n) are imposed, the input neurons are activated and send signals to the output neuron. The model for the network shown in Figure 3.2.1 during the learning process takes the following form:
\dot{x}_i(t) = -\alpha x_i(t) + I_i(t), \quad 1 \le i \le n,
\dot{x}_0(t) = -\alpha x_0(t) + \sum_{i=1}^{n} f(x_i(t - \tau_i) - \Gamma_i) Z_i(t) + I_0(t), \qquad (3.2.1)
\dot{Z}_i(t) = -\beta Z_i(t) + g(x_i(t - \tau_i) - \Gamma_i)\,[x_0]^+,
where f, g : \mathbb{R} \to [0, \infty) are given signal functions.
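System (3.2.1) can be explored numerically in the same way as the outstar. The sketch below uses an explicit Euler scheme with uniform delays \tau_i = \tau and thresholds \Gamma_i = \Gamma (an assumption made only for this example) and illustrative parameter values; the normalized LTM vector should approach the modified reflectance coefficients introduced in (3.2.10) below.

import numpy as np

alpha, beta = 1.0, 1.0
tau, Gamma = 0.5, 0.1                   # uniform delay and threshold (assumption)
I0 = 1.0
Ii = np.array([0.5, 0.3, 0.2])          # constant inputs to the input neurons

f = g = lambda u: 0.3 * np.maximum(u, 0.0)

dt, T = 0.01, 80.0
steps = int(T / dt)
d = int(tau / dt)
n = len(Ii)

xi_hist = np.zeros((steps + 1, n))      # history of x_i (zero initial history)
x0 = 0.0
Zi = np.zeros(n)

for k in range(steps):
    xi_del = xi_hist[k - d] if k >= d else np.zeros(n)
    xi_hist[k + 1] = xi_hist[k] + dt * (-alpha * xi_hist[k] + Ii)
    x0 = x0 + dt * (-alpha * x0 + np.sum(f(xi_del - Gamma) * Zi) + I0)
    Zi = Zi + dt * (-beta * Zi + g(xi_del - Gamma) * max(x0, 0.0))

print("normalized LTM vector        :", Zi / Zi.sum())
print("g(I_i/alpha - Gamma), normed :",
      g(Ii / alpha - Gamma) / g(Ii / alpha - Gamma).sum())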
As for outstars, we consider the simplest case where (I_1, \ldots, I_n) and I_0 are constant functions and define
\theta_i = \frac{I_i}{I}, \quad 1 \le i \le n, \qquad (3.2.2)
as the spatial pattern (reflectance coefficients), where I = \sum_{i=1}^{n} I_i. The system which determines the limiting behaviors of (x_0, Z_1, \ldots, Z_n) as t \to \infty is, assuming nonnegative x_0 and large inputs such that
\frac{I_i}{\alpha} > \Gamma_i, \quad 1 \le i \le n, \qquad (3.2.3)
of the form
\dot{x}_0 = -\alpha x_0 + \sum_{i=1}^{n} f\big(\frac{I_i}{\alpha} - \Gamma_i\big) Z_i + I_0,
\dot{Z}_i = -\beta Z_i + g\big(\frac{I_i}{\alpha} - \Gamma_i\big) x_0. \qquad (3.2.4)
In particular, for the weighted sum of the LTM traces Z(t) = \sum_{i=1}^{n} f\big(\frac{I_i}{\alpha} - \Gamma_i\big) Z_i(t) we have
\dot{x}_0 = -\alpha x_0 + Z(t) + I_0,
\dot{Z} = -\beta Z + \Big[\sum_{i=1}^{n} f\big(\frac{I_i}{\alpha} - \Gamma_i\big) g\big(\frac{I_i}{\alpha} - \Gamma_i\big)\Big] x_0, \qquad (3.2.5)
from which it follows that, if
\alpha\beta > \sum_{i=1}^{n} f\big(\frac{I_i}{\alpha} - \Gamma_i\big) g\big(\frac{I_i}{\alpha} - \Gamma_i\big), \qquad (3.2.6)
then
x_0(t) \to \frac{\beta I_0}{\alpha\beta - \sum_{i=1}^{n} f\big(\frac{I_i}{\alpha} - \Gamma_i\big) g\big(\frac{I_i}{\alpha} - \Gamma_i\big)}, \quad Z(t) \to \frac{\Big[\sum_{i=1}^{n} f\big(\frac{I_i}{\alpha} - \Gamma_i\big) g\big(\frac{I_i}{\alpha} - \Gamma_i\big)\Big] I_0}{\alpha\beta - \sum_{i=1}^{n} f\big(\frac{I_i}{\alpha} - \Gamma_i\big) g\big(\frac{I_i}{\alpha} - \Gamma_i\big)} \quad \text{as } t \to \infty. \qquad (3.2.7)
Substituting this into (3.2.4), we then obtain
Z_i(t) \to \frac{g\big(\frac{I_i}{\alpha} - \Gamma_i\big)\, I_0}{\alpha\beta - \sum_{j=1}^{n} f\big(\frac{I_j}{\alpha} - \Gamma_j\big) g\big(\frac{I_j}{\alpha} - \Gamma_j\big)} = Z_i^* \quad \text{as } t \to \infty. \qquad (3.2.8)
In particular,
Z_i(t) \propto g\Big(\frac{I_i}{\alpha} - \Gamma_i\Big) \quad \text{as } t \to \infty. \qquad (3.2.9)
If we define the modified reflectance coefficients as
\theta_i^* = \frac{g\big(\frac{I_i}{\alpha} - \Gamma_i\big)}{\sum_{j=1}^{n} g\big(\frac{I_j}{\alpha} - \Gamma_j\big)}, \quad 1 \le i \le n, \qquad (3.2.10)
then the steady-state LTM vector is proportional to the modified input pattern expressed as the modified reflectance coefficient vector (\theta_1^*, \ldots, \theta_n^*). That is, the instar learns the pattern and stores it in the LTM trace, as in outstars. Note that \theta^* coincides with the reflectance coefficient vector \theta = (\theta_1, \ldots, \theta_n) if \Gamma_i = 0 for 1 \le i \le n and if g is a linear function in a neighborhood of \frac{I_i}{\alpha} for 1 \le i \le n.
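A small numerical illustration of (3.2.10), with assumed values: for a linear g and zero thresholds the modified coefficients coincide with \theta, while a saturating g with nonzero thresholds stores a distorted copy of the pattern.

import numpy as np

alpha = 1.0
Ii = np.array([0.5, 0.3, 0.2])
theta = Ii / Ii.sum()

def theta_star(g, Gammas):
    # modified reflectance coefficients (3.2.10)
    w = g(Ii / alpha - Gammas)
    return w / w.sum()

print(theta_star(lambda u: 2.0 * u, np.zeros(3)))            # linear g, zero thresholds: equals theta
print(theta)
print(theta_star(lambda u: np.tanh(np.maximum(u, 0.0)),      # saturating g with thresholds: distorted
                 np.array([0.1, 0.1, 0.1])))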
An important function of an instar is pattern recognition. Instar recognition proceeds by comparing an input pattern with the stored LTM vector. To be specific, we assume a new input pattern
P = (p_1, \ldots, p_n) \qquad (3.2.11)
is imposed on the input neurons. Then the limiting behavior of the STM of the output neuron with the aforementioned stored pattern (Z_1^*, \ldots, Z_n^*) is governed by
\dot{x}_0 = -\alpha x_0 + \sum_{i=1}^{n} f\Big(\frac{p_i}{\alpha} - \Gamma_i\Big) Z_i^* \qquad (3.2.12)
and thus the output is given by
x_0(t) \to x_0^* = \sum_{i=1}^{n} \frac{f\big(\frac{p_i}{\alpha} - \Gamma_i\big)\, g\big(\frac{I_i}{\alpha} - \Gamma_i\big)\, I_0}{\alpha\Big(\alpha\beta - \sum_{j=1}^{n} f\big(\frac{I_j}{\alpha} - \Gamma_j\big) g\big(\frac{I_j}{\alpha} - \Gamma_j\big)\Big)} \quad \text{as } t \to \infty. \qquad (3.2.13)
Assume this output is large enough to exceed the threshold, namely,
x_0^* > \Gamma_0; \qquad (3.2.14)
then the output neuron fires, thus recognizing the pattern. We thus say that P belongs to the pattern class represented by (Z_1^*, \ldots, Z_n^*) if (3.2.14) is satisfied. Note that if all \Gamma_i are zero and if both f and g are linear, this is equivalent to requiring the inner product of the input pattern P and the stored pattern \theta,
\theta \cdot P = \sum_{i=1}^{n} \theta_i p_i, \qquad (3.2.15)
to be sufficiently large.
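The recognition rule (3.2.12)-(3.2.14) can be packaged as a small function; the parameter values, the signal functions and the threshold \Gamma_0 below are illustrative assumptions only.

import numpy as np

alpha, beta, I0 = 1.0, 1.0, 1.0
Ii = np.array([0.5, 0.3, 0.2])          # training pattern
Gammas = np.zeros(3)
f = g = lambda u: 0.3 * np.maximum(u, 0.0)

# stored LTM trace (3.2.8)
den = alpha * beta - np.sum(f(Ii / alpha - Gammas) * g(Ii / alpha - Gammas))
Z_star = g(Ii / alpha - Gammas) * I0 / den

def recognizes(P, Gamma0):
    # True if the limiting output (3.2.13) for input pattern P exceeds the threshold Gamma0
    x0_star = np.sum(f(P / alpha - Gammas) * Z_star) / alpha
    return x0_star > Gamma0

print(recognizes(np.array([0.5, 0.3, 0.2]), Gamma0=0.03))   # pattern close to the stored one: recognized
print(recognizes(np.array([0.0, 0.0, 0.1]), Gamma0=0.03))   # weak, dissimilar pattern: not recognized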
Figure 3.2.2 gives a schematic description of instar pattern recognition.
Figure 3.2.2. Recognition of a pattern by an instar with stored LTM trace (Z_1^*, \ldots, Z_n^*): the inputs cause input neurons to send signals along the axons
to the output neuron. The signals are multiplied by the LTM trace, summed
and compared with a threshold. If the output neuron fires, the input pattern is
recognized as belonging to the pattern class represented by the LTM trace.
In summary, instars and outstars are dual to one another. An outstar can recall but
cannot recognize a pattern, while an instar can recognize but cannot recall. In other
words, the outstar is blind and the instar is dumb.
Admittedly, the models for outstars and instars discussed in the last two sections represent an oversimplification of reality, even for idealized networks. Interested
readers are referred to the appendix where general results of Grossberg for unbiased
pattern learning are described. See also Carpenter [1994] for more complicated spatial
pattern learning by a distributed outstar where the source node (command neuron) can
be a source field of arbitrarily many nodes whose activity pattern may be arbitrarily
distributed or compressed.
3.3 Lateral inhibition: noise-saturation dilemma
One of the main motivations behind the development of shunting networks is to prevent
too high or too low activity of either a single neuron or the whole network. This
is related to the so-called noise-saturation dilemma: Suppose that STM traces at a
network level fluctuate within fixed finite limits at their respective network nodes. If
a large number of intermittent input sources converge to the nodes through time, then
a serious design problem arises due to the fact that the total input converging to each
node can vary widely through time. If the STM traces are sensitive to large inputs,
then why do not small inputs get lost in internal system noise? If the STM traces are
sensitive to small inputs, then why do they not all saturate at their maximum values in
response to large inputs?
Shunting networks possess automatic gain control properties which are capable of
generating an infinite dynamic range within which input patterns can be effectively
processed, thereby solving the noise-saturation dilemma. In this and the next section, we follow Grossberg [1988, 1982] to describe the simplest feedforward
and feedback networks to convey some of the main ideas behind shunting networks.
The main focus of this section is on the simplest feedforward network in order
to illustrate how it offers a solution to the sensitivity problem raised by the noise-
saturation dilemma. Let a spatial pattern I_i = \theta_i I of inputs be processed by the neurons v_i, i = 1, \ldots, n. Each \theta_i is the (constant) reflectance of its input I_i and I is the (variable) total input size. How can each neuron v_i maintain its sensitivity to \theta_i when I is increased? How is saturation avoided?
Note that to compute \theta_i = \frac{I_i}{\sum_{k=1}^{n} I_k}, each neuron v_i must have information about all the inputs I_k, k = 1, \ldots, n. Rewriting
\theta_i = \frac{I_i}{I_i + \sum_{k \ne i} I_k},
we observe that increasing I_i increases \theta_i, whereas increasing any I_k with k \ne i decreases \theta_i. Translating this into the connection topology for delivering feedforward inputs to the neuron v_i suggests that I_i excites v_i and that all I_k with k \ne i inhibit v_i. This rule represents the simplest feedforward on-center off-surround network shown in Figure 3.3.1.
Experimental studies of the hippocampus (Andersen, Gross, Lomo and Sveen
[1969]) have suggested that its cells are arranged in a recurrent on-center off-surround
anatomy. The main cell type, the pyramidal cell, emits axon collaterals to interneurons.
Some of these interneurons feed back excitatory signals to nearby pyramidal cells.
Other interneurons scatter inhibitory feedback signals over a broad area.
How does the on-center off-surround network activate and inhibit the neuron v_i via mass action? Assume that for each v_i the STM trace, or activation, at time t is x_i(t), whose maximum is B. As discussed in Chapter 2, one may think of B as the total number of excitable sites, among which x_i(t) are excited and B - x_i(t) are unexcited at time t. Then at v_i, I_i excites the B - x_i unexcited sites by mass action, and the total inhibitory
Figure 3.3.1. An on-center off-surround network.
input \sum_{k \ne i} I_k inhibits the x_i excited sites by mass action. Assume also that the excitation decays at a fixed rate A. Then one gets
\frac{d}{dt} x_i = -A x_i + (B - x_i) I_i - x_i \sum_{k \ne i} I_k, \quad 1 \le i \le n. \qquad (3.3.1)
If a fixed spatial pattern I_i = \theta_i I, 1 \le i \le n, is presented and the background input is held constant for a while, then from the equivalent form
\frac{d}{dt} x_i = -(A + I) x_i + B I_i, \quad 1 \le i \le n, \qquad (3.3.2)
of (3.3.1), we conclude that each x_i approaches an equilibrium given by
x_i^* = \theta_i \frac{B I}{A + I}, \quad 1 \le i \le n. \qquad (3.3.3)
Consequently, the relative activity X_i^* = x_i^* / \sum_{k=1}^{n} x_k^* equals the reflectance \theta_i no matter how large I is chosen, and the total activity
x^* = \sum_{i=1}^{n} x_i^* = \frac{B I}{A + I} \qquad (3.3.4)
is bounded by B, so there is no saturation. This regulation of the network's total activation, called total activity normalization, is due to the automatic control by the inhibitory inputs (-x_i \sum_{k \ne i} I_k).
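The normalization property can be read off directly from (3.3.3)-(3.3.4). The short computation below (with assumed values for A, B and \theta) shows that the relative activities stay locked to \theta_i while the total activity remains below B as I grows.

import numpy as np

A, B = 1.0, 10.0                        # decay rate and number of excitable sites (illustrative)
theta = np.array([0.5, 0.3, 0.2])

for I in (1.0, 10.0, 1000.0):
    x_star = theta * B * I / (A + I)    # equilibria (3.3.3)
    # relative pattern stays equal to theta; total activity (3.3.4) stays below B
    print(I, x_star / x_star.sum(), x_star.sum())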
The steady state in (3.3.3) combines two types of information: information about the spatial pattern \theta_i and information about the background activity or luminance I.