An Introduction to
Artificial Neural Network
Dr Iman Ardekani
Biological Neurons
Modeling Neurons
McCulloch-Pitts Neuron
Network Architecture
Learning Process
Perceptron
Linear Neuron
Multilayer Perceptron
Content
Ramón-y-Cajal (Spanish Scientist, 1852~1934):
1. The brain is composed of individual cells called
neurons.
2. Neurons are connected to each other by
synapses.
Biological Neurons
Neuron Structure (Biology)
Biological Neurons
[Diagram: dendrites, cell body, nucleus, axon, axon terminals]
Synaptic Junction (Biology)
Biological Neurons
Synapse
Neuron Function (Biology)
1. Dendrites receive signals from other neurons.
2. Neurons can process (or transfer) the received signals.
3. Axon terminals deliver the processed signals to other
tissues.
Biological Neurons
What kind of signals? Electrical Impulse Signals
Modeling Neurons (Computer Science)
Biological Neurons
[Diagram: Input → Process → Outputs]
Modeling Neurons
The net input signal is a linear combination of the input signals xi.
Each output is a function of the net input signal.
Biological Neurons
[Diagram: a neuron with ten inputs x1 … x10 and one output y]
- McCulloch and Pitts (1943) for introducing the
idea of neural networks as computing machines
- Hebb (1949) for inventing the first rule for self-
organized learning
- Rosenblatt (1958) for proposing the perceptron
as the first model for learning with a teacher
Modelling Neurons
Net input signal received through synaptic junctions is
net = b + Σwixi = b + WTX
Weight vector: W =[w1 w2 … wm]T
Input vector: X = [x1 x2 … xm]T
Each output is a function of the net stimulus signal (f is called the
activation function)
y = f (net) = f(b + WTX)
Modelling Neurons
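The general neuron model above can be written in a few lines of code. A minimal sketch, assuming NumPy; the helper name `neuron_output` is ours, not from the slides:

```python
import numpy as np

def neuron_output(x, w, b, f):
    """General neuron model: y = f(b + W^T X)."""
    net = b + np.dot(w, x)          # net input: weighted sum of inputs plus bias
    return f(net)

# Example with three inputs and an identity activation
w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 0.0, 1.0])
print(neuron_output(x, w, b=0.1, f=lambda net: net))   # 2.6
```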
General Model for Neurons
Modelling Neurons
y = f(b + Σwixi)
[Diagram: inputs x1 … xm weighted by w1 … wm, summed with bias b, and passed through activation f to produce y]
Activation functions
Modelling Neurons
[Plots: three activation functions]
Threshold function / hard limiter: good for classification
Linear function: simple computation
Sigmoid function: continuous and differentiable
Sigmoid Function
Modelling Neurons
f(x) = 1 / (1 + e^(-ax))
As a goes to infinity, the sigmoid approaches the threshold function.
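A quick numerical check of this limit, sketched in Python; the function names and the value a = 50 are illustrative choices:

```python
import numpy as np

def sigmoid(x, a=1.0):
    """Sigmoid activation f(x) = 1 / (1 + exp(-a x))."""
    return 1.0 / (1.0 + np.exp(-a * x))

def threshold(x):
    """Hard limiter: 1 if x >= 0, else 0."""
    return (x >= 0).astype(float)

x = np.array([-2.0, -0.1, 0.1, 2.0])
print(sigmoid(x, a=1))    # smooth values strictly between 0 and 1
print(sigmoid(x, a=50))   # already very close to the hard limiter
print(threshold(x))
```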
McCulloch-Pitts Neuron
Modelling Neurons
y ∈ {0, 1}
[Diagram: inputs x1 … xm, weights w1 … wm, bias b, a summing junction and a hard limiter producing y]
Modelling Neurons
y ∈ [0, 1]
[Diagram: inputs x1 … xm, weights w1 … wm, bias b, a summing junction and a sigmoid producing y]
Single-input McCulloch-Pitts neurode with b=0, w1=-1 for binary
inputs:
Conclusion?
McCulloch-Pitts Neuron
x1  net  y
0    0   1
1   -1   0
(It implements logical NOT.)
[Diagram: single-input neuron with weight w1, bias b, and hard limiter]
Two-input McCulloch-Pitts neurode with b=-1,
w1=w2=1 for binary inputs:
McCulloch-Pitts Neuron
x1 x2 net y
0 0 ? ?
0 1 ? ?
1 0 ? ?
1 1 ? ?
[Diagram: two-input neuron with weights w1, w2, bias b, and hard limiter]
Two-input McCulloch-Pitts neurode with b=-1,
w1=w2=1 for binary inputs:
McCulloch-Pitts Neuron
x1 x2 net y
0 0 -1 0
0 1 0 1
1 0 0 1
1 1 1 1
(It implements logical OR.)
[Diagram: two-input neuron with weights w1, w2, bias b, and hard limiter]
Two-input McCulloch-Pitts neurode with b=-2,
w1=w2=1 for binary inputs :
McCulloch-Pitts Neuron
x1 x2 net y
0 0 ? ?
0 1 ? ?
1 0 ? ?
1 1 ? ?
[Diagram: two-input neuron with weights w1, w2, bias b, and hard limiter]
Two-input McCulloch-Pitts neurode with b=-2,
w1=w2=1 for binary inputs :
McCulloch-Pitts Neuron
x1 x2 net y
0 0 -2 0
0 1 -1 0
1 0 -1 0
1 1 0 1
(It implements logical AND.)
[Diagram: two-input neuron with weights w1, w2, bias b, and hard limiter]
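The three neurodes above (NOT, OR, AND) can be reproduced with a short script. A minimal sketch; the helper name `mp_neuron` is our own choice:

```python
import numpy as np

def mp_neuron(x, w, b):
    """McCulloch-Pitts neuron: hard limiter applied to b + w^T x."""
    return 1 if b + np.dot(w, x) >= 0 else 0

# NOT: b = 0, w1 = -1
print([mp_neuron([x1], [-1], 0) for x1 in (0, 1)])                       # [1, 0]
# OR:  b = -1, w1 = w2 = 1
print([mp_neuron([a, c], [1, 1], -1) for a in (0, 1) for c in (0, 1)])   # [0, 1, 1, 1]
# AND: b = -2, w1 = w2 = 1
print([mp_neuron([a, c], [1, 1], -2) for a in (0, 1) for c in (0, 1)])   # [0, 0, 0, 1]
```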
Every basic Boolean function can be
implemented using combinations of McCulloch-
Pitts Neurons.
McCulloch-Pitts Neuron
The McCulloch-Pitts neuron can be used as a
classifier that separates the input signals into two
classes (the perceptron):
McCulloch-Pitts Neuron
[Plot: points in the (x1, x2) plane, Class A (y=1) and Class B (y=0), separated by a straight line]
Class A ⇒ y = 1 ⇒ net ≥ 0 ⇒ b + w1x1 + w2x2 ≥ 0
Class B ⇒ y = 0 ⇒ net < 0 ⇒ b + w1x1 + w2x2 < 0
Class A ⇒ b + w1x1 + w2x2 ≥ 0
Class B ⇒ b + w1x1 + w2x2 < 0
Therefore, the decision boundary is a line (in general, a hyperplane) given by
b + w1x1 + w2x2 = 0
Where do w1 and w2 come from?
McCulloch-Pitts Neuron
Solution: More Neurons Required
McCulloch-Pitts Neuron
[Plots: in the (x1, x2) plane, x1 AND x2 and x1 OR x2 are each separable by a single line; x1 XOR x2 is not — ?]
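One possible two-layer combination that solves XOR is sketched below; the particular weights and the three-neuron layout are illustrative assumptions, not taken from the slides:

```python
import numpy as np

def mp_neuron(x, w, b):
    """McCulloch-Pitts neuron: hard limiter applied to b + w^T x."""
    return 1 if b + np.dot(w, x) >= 0 else 0

def xor(x1, x2):
    """XOR from three McCulloch-Pitts neurons: (x1 OR x2) AND NOT (x1 AND x2)."""
    h_or  = mp_neuron([x1, x2], [1, 1], -1)    # x1 OR x2
    h_and = mp_neuron([x1, x2], [1, 1], -2)    # x1 AND x2
    return mp_neuron([h_or, h_and], [1, -1], -1)

print([xor(a, b) for a in (0, 1) for b in (0, 1)])   # [0, 1, 1, 0]
```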
Nonlinear Classification
McCulloch-Pitts Neuron
[Plot: Class A (y=1) and Class B (y=0) points in the (x1, x2) plane that cannot be separated by a single straight line]
Single Layer Feed-forward Network
Network Architecture
Single Layer:
There is only one
computational
layer.
Feed-forward:
Input layer projects
to the output layer
not vice versa.
[Diagram: input layer (sources) fully connected to an output layer of neurons]
Multi Layer Feed-forward Network
Network Architecture
[Diagram: input layer (sources) → hidden layer (neurons) → output layer (neurons)]
Single Layer Recurrent Network
Network Architecture
[Diagram: a single layer of neurons with feedback connections among them]
Multi Layer Recurrent Network
Network Architecture
[Diagram: multiple layers of neurons with feedback connections between layers]
The mechanism by which a neural network
can adjust its weights (synaptic junction weights):
- Supervised learning: with a teacher
- Unsupervised learning: without a teacher
Learning Processes
Supervised Learning
Learning Processes
[Diagram: the environment provides input data to both the teacher and the learner; the error signal is the teacher's desired response minus the learner's actual response]
Unsupervised Learning
Learning Processes
[Diagram: the environment provides input data directly to the learner; there is no teacher]
Neurons learn based on a competitive task.
A competition rule is required (competitive-learning rule).
- The goal of the perceptron is to classify input data into two classes, A and B
- This works only when the two classes can be separated by a linear boundary
- The perceptron is built around the McCulloch-Pitts neuron model
- A linear combiner followed by a hard limiter
- Accordingly, the neuron can produce outputs of +1 or 0
Perceptron
[Diagram: inputs x1 … xm with their weights, bias b, a summing junction and a hard limiter producing y]
Equivalent Representation
Perceptron
[Diagram: the same neuron with the bias represented as weight w0 applied to a constant input of 1]
net = WTX
Weight vector: W =[w0 w1 … wm]T
Input vector: X = [1 x1 x2 … xm]T
There exists a weight vector w such that we may state
WTx > 0 for every input vector x belonging to A
WTx ≤ 0 for every input vector x belonging to B
Perceptron
Elementary Perceptron Learning Algorithm
1) Initialization: w(0) = a random weight vector
2) At time index n, form the input vector x(n)
3) IF (wTx(n) > 0 and x(n) belongs to A) or (wTx(n) ≤ 0 and x(n)
belongs to B) THEN w(n) = w(n-1);
OTHERWISE w(n) = w(n-1) + ηx(n) if x(n) belongs to A,
and w(n) = w(n-1) - ηx(n) if x(n) belongs to B
4) Repeat from 2 until w(n) converges
Perceptron
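A minimal sketch of this learning rule in Python; the function name, the epoch loop, and the toy AND-gate data are assumptions for illustration:

```python
import numpy as np

def train_perceptron(X_A, X_B, eta=0.1, epochs=100, seed=0):
    """Elementary perceptron learning for two linearly separable classes A and B."""
    rng = np.random.default_rng(seed)
    # Absorb the bias as w0 by prepending a constant input of 1
    A = np.hstack([np.ones((len(X_A), 1)), X_A])
    B = np.hstack([np.ones((len(X_B), 1)), X_B])
    w = rng.standard_normal(A.shape[1])        # 1) random initial weights
    for _ in range(epochs):
        changed = False
        for x in A:                            # class A should satisfy w^T x > 0
            if w @ x <= 0:
                w += eta * x
                changed = True
        for x in B:                            # class B should satisfy w^T x <= 0
            if w @ x > 0:
                w -= eta * x
                changed = True
        if not changed:                        # no misclassifications: converged
            break
    return w

# Toy example: AND, with class A = {(1,1)} and class B = the other three points
w = train_perceptron(np.array([[1, 1]]), np.array([[0, 0], [0, 1], [1, 0]]))
print(w)   # a separating boundary of the form w0 + w1*x1 + w2*x2 = 0
```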
- When the activation function is simply f(x) = x, the
neuron acts like an adaptive filter.
- In this case: y=net=wTx
- w=[w1 w2 … wm]T
- x=[x1 x2 … xm]T
Linear Neuron
[Diagram: inputs x1 … xm, weights w1 … wm, bias b, and a summing junction producing y]
Linear Neuron Learning (Adaptive Filtering)
Linear Neuron
[Diagram: the environment provides input data to the teacher and to the learner (linear neuron); the error signal is the desired response d(n) minus the actual response y(n)]
LMS Algorithm
To minimize the value of the cost function defined as
E(w) = 0.5 e²(n)
where e(n) is the error signal
e(n) = d(n) - y(n) = d(n) - wT(n)x(n)
In this case, the weight vector can be updated as follows:
wi(n+1) = wi(n) - μ (dE/dwi)
Linear Neuron
LMS Algorithm (continued)
dE/dwi = e(n) de(n)/dwi = e(n) d/dwi {d(n) - wT(n)x(n)} = -e(n) xi(n)
Therefore:
wi(n+1) = wi(n) + μ e(n) xi(n)
or, in vector form,
w(n+1) = w(n) + μ e(n) x(n)
Linear Neuron
Summary of LMS Algorithm
1) Initialization: w(0) = a random weight vector
2) At time index n, form the input vector x(n)
3) y(n)=wT(n)x(n)
4) e(n)=d(n)-y(n)
5) w(n+1)=w(n)+μe(n)x(n)
6) Repeat 2 until w(n) converges
Linear Neuron
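A minimal sketch of the LMS steps above; the function name, the step size, and the toy target d = 2·x1 − x2 are illustrative assumptions:

```python
import numpy as np

def lms(X, d, mu=0.05, epochs=50, seed=0):
    """LMS learning for a linear neuron y(n) = w^T(n) x(n)."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[1])    # 1) random initial weights
    for _ in range(epochs):
        for x, dn in zip(X, d):
            y = w @ x                      # 3) actual response
            e = dn - y                     # 4) error signal
            w = w + mu * e * x             # 5) weight update
    return w

# Toy example: learn d = 2*x1 - x2 from noisy samples
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 2))
d = 2 * X[:, 0] - X[:, 1] + 0.01 * rng.standard_normal(200)
print(lms(X, d))   # approximately [ 2. -1.]
```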
- To solve the XOR problem
- To solve nonlinear classification problems
- To deal with more complex problems
Multilayer Perceptron
- The activation function in a multilayer perceptron is
usually a sigmoid function.
- This is because the sigmoid is differentiable, unlike
the hard limiter used in the elementary
perceptron.
Multilayer Perceptron
Architecture
Multilayer Perceptron
[Diagram: input layer (sources) → hidden layer (neurons) → output layer (neurons)]
Architecture
Multilayer Perceptron
[Diagram: layer k of neurons, receiving inputs from layer k-1 and sending outputs to layer k+1]
Consider a single neuron in a multilayer perceptron (neuron k)
Multilayer Perceptron
[Diagram: neuron k with inputs y0 = 1, y1, …, ym and weights wk,0, wk,1, …, wk,m]
netk = Σ wk,i yi
yk = f(netk)
Multilayer perceptron Learning (Back Propagation Algorithm)
Multilayer Perceptron
[Diagram: layer k-1 feeds neuron j of layer k; the error signal ek(n) is the desired response dk(n) minus the actual response yk(n)]
Back Propagation Algorithm:
Cost function of neuron k:
Ek = 0.5 ek²
Cost function of network:
E = Σ Ej = 0.5 Σ ej²
ek = dk – yk
ek = dk – f(netk)
ek = dk – f(Σ wk,i yi)
Multilayer Perceptron
Back Propagation Algorithm:
Cost function of neuron k:
Ek = 0.5 ek²
ek = dk – yk
ek = dk – f(netk)
ek = dk – f(Σwk,iyi)
Multilayer Perceptron
Cost function of network:
E = Σ Ej = 0.5 Σ ej²
Multilayer Perceptron
Back Propagation Algorithm:
To minimize the value of the cost function E(wk,i), the
weight vector can be updated using a gradient based
algorithm as follows
wk,i(n+1) = wk,i(n) - μ (dE/dwk,i)
dE/dwk,i = ?
Multilayer Perceptron
Back Propagation Algorithm:
dE/dwk,i = (dE/dek)(dek/dyk)(dyk/dnetk)(dnetk/dwk,i) = (dE/dnetk)(dnetk/dwk,i) = δk yi
δk = (dE/dek)(dek/dyk)(dyk/dnetk) = -ek f’(netk)   (the local gradient)
where
dE/dek = ek
dek/dyk = -1
dyk/dnetk = f’(netk)
dnetk/dwk,i = yi
Multilayer Perceptron
Back Propagation Algorithm:
Substituting dE/dwk,i = δk yi into the gradient-based algorithm:
wk,i(n+1) = wk,i(n) – μ δk yi
δk = -ek f’(netk) = -{dk – yk} f’(netk)
If k is an output neuron, we have all the terms of δk.
What happens when k is a hidden neuron?
Multilayer Perceptron
Back Propagation Algorithm:
When k is hidden:
δk = (dE/dyk)(dyk/dnetk) = (dE/dyk) f’(netk)
dE/dyk = ?
Multilayer Perceptron
E = 0.5Σe2
j
Multilayer Perceptron
dE
dyk
dE
dyk
dej
dyk
dej
dnetj
dnetj
dyk
wjk
-f’(netj)
= -Σ ej f’(netj) wjk = -Σ δj wjk
= Σ ej
= Σ ej
We had δk = (dE/dyk) f’(netk)
Substituting dE/dyk = Σ δj wjk into δk results in
δk = f’(netk) Σ δj wjk
which gives the local gradient for the hidden neuron k
Multilayer Perceptron
Summary of Back Propagation Algorithm
1) Initialization: w(0) = a random weight vector
At time index n, get the input data and:
2) Calculate netk and yk for all the neurons
3) For output neurons, calculate ek = dk - yk
4) For output neurons, δk = -ek f’(netk)
5) For hidden neurons, δk = f’(netk) Σ δj wjk
6) Update every neuron's weights by
wk,i(NEW) = wk,i(OLD) – μ δk yi
7) Repeat steps 2~6 for the next data set (next time index)
Multilayer Perceptron
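To make these steps concrete, here is a sketch of a one-hidden-layer network trained on XOR. Only the numbered steps come from the slide; the network size, step size, number of epochs, and the use of the sigmoid derivative f'(net) = y(1 − y) are our own choices:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_mlp_xor(mu=0.5, epochs=10000, seed=0):
    """One-hidden-layer perceptron trained with the back-propagation steps above."""
    rng = np.random.default_rng(seed)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    D = np.array([0, 1, 1, 0], dtype=float)            # XOR targets
    W1 = rng.standard_normal((4, 3))                    # hidden layer: 4 neurons, 2 inputs + bias
    W2 = rng.standard_normal((1, 5))                    # output layer: 1 neuron, 4 hidden + bias
    for _ in range(epochs):
        for x, d in zip(X, D):
            # 2) forward pass: net_k and y_k for all neurons (y0 = 1 is the bias input)
            y0 = np.concatenate(([1.0], x))
            y1 = sigmoid(W1 @ y0)
            z1 = np.concatenate(([1.0], y1))
            y2 = sigmoid(W2 @ z1)
            # 3)-4) output neuron: error e_k and local gradient delta_k = -e_k f'(net_k)
            e = d - y2
            delta2 = -e * y2 * (1 - y2)
            # 5) hidden neurons: delta_k = f'(net_k) * sum_j delta_j w_jk (bias column excluded)
            delta1 = y1 * (1 - y1) * (W2[:, 1:].T @ delta2)
            # 6) weight updates: w_k,i <- w_k,i - mu * delta_k * y_i
            W2 -= mu * np.outer(delta2, z1)
            W1 -= mu * np.outer(delta1, y0)
    return W1, W2

W1, W2 = train_mlp_xor()
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    y1 = sigmoid(W1 @ np.concatenate(([1.0], x)))
    y2 = sigmoid(W2 @ np.concatenate(([1.0], y1)))
    print(x, round(float(y2[0]), 3))   # usually close to 0, 1, 1, 0 for this setup
```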
THE END