Neural Network - Forward and Back Propagation, Gradient Descent
Visualize Forward and Back Propagation in a simple Neural Network, using the IBM Rational
Rhapsody MDA tool to design the Neural Network and execute the model. Minimize the
Cost (Loss) using Gradient Descent.
The Problem: wX + b = Y
•Question: Given a set of X inputs, a set of
Y targets (expected, desired), and initial
values for the Weights, can the Neural
Network model compute outputs,
Yhat (actual, computed, predicted), that
match the Y targets within a predefined
error (i.e. Y – Yhat <= 10E-6)?
The Problem: wX + b = Y (cont.)
•Answer: Yes! Use the gradient descent
algorithm to find the minimum of a
function (a minimal sketch follows below).
•Use the Forward Propagation (FP) and Back
Propagation (BP) capability of the NN
model to minimize the Cost (Loss) of the
NN and thus find the right weights.
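To make the idea concrete, here is a minimal, self-contained sketch of gradient descent on a one-variable quadratic cost. The function, learning rate, and tolerance are illustrative and are not part of the Rhapsody model.

// Illustrative only: gradient descent on Cost(w) = (w - 3)^2, whose minimum is at w = 3.
#include <cstdio>
#include <cmath>

int main() {
    double w = 5.0;              // initial guess
    const double lr = 0.1;       // learning rate (step size)
    // dCost/dw = 2 * (w - 3); step against the gradient until it is nearly zero
    for (int i = 0; i < 100 && std::fabs(2.0 * (w - 3.0)) > 1e-6; ++i) {
        w -= lr * 2.0 * (w - 3.0);
    }
    std::printf("w converged to %f\n", w);
    return 0;
}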
The Problem: wX + b = Y (cont.)
•In FP, compute the Outputs for each
neuron in each Hidden Layer
•the output of the Input Layer is the
transferred input, X, and does not
need to be computed
•compute the Cost at the Output Layer
The Problem: wX + b = Y (cont.)
•In BP, compute the Gradients and update the
Weights (coefficients) to minimize the Cost
•Iterate again (FP → BP) until the Cost
approaches the predefined error.
• Xs and Ys are randomly generated within (-1, 1)
• Weights are randomly initialized within (-0.1, 0.1) (a sketch of this random generation follows below)
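A small sketch of the random generation described above; the vector sizes are illustrative assumptions (the real values are produced in the model's Configuration state).

// Configuration sketch: Xs and Ys drawn from (-1, 1), weights from (-0.1, 0.1).
#include <random>
#include <vector>

int main() {
    std::mt19937 gen(std::random_device{}());
    std::uniform_real_distribution<float> data(-1.0f, 1.0f);   // Xs and Ys in (-1, 1)
    std::uniform_real_distribution<float> init(-0.1f, 0.1f);   // weights in (-0.1, 0.1)

    std::vector<float> X(3), Y(3), W(24);  // sizes are illustrative
    for (float& x : X) x = data(gen);      // random inputs
    for (float& y : Y) y = data(gen);      // random targets
    for (float& w : W) w = init(gen);      // small random initial weights
    return 0;
}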
An Artificial Neural Network is well suited to an
Object-Oriented design:
•ANNModel (“layer manager”) has Layers
• Maintains a list of Layers
• Delegates operations to the Layers
•Layer (“neuron manager”) has Neurons
• Maintains a list of Neurons
• Delegates operations to the Neurons
• ANNModel does not access the Neurons DIRECTLY;
it reaches them through its Layers (see the delegation sketch below)
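A minimal sketch of this delegation, assuming plain C++ containers; the class and method names are illustrative, since the real classes are generated by Rhapsody.

#include <vector>

class Neuron { public: void Activate() { /* compute Z and the output */ } };

class Layer {
public:
    void Activate() {                       // "neuron manager": delegates to its Neurons
        for (Neuron& n : itsNeuronList) n.Activate();
    }
private:
    std::vector<Neuron> itsNeuronList;
};

class ANNModel {
public:
    void ForwardPropagate() {               // "layer manager": delegates to its Layers,
        for (Layer& l : itsLayerList)       // never touches the Neurons directly
            l.Activate();
    }
private:
    std::vector<Layer> itsLayerList;
};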
An Artificial Neural Network is well suited to an
Object-Oriented design (cont.):
•Neuron maintains a list of Weights that is
indexed by the Ids of the Previous Layer
Neurons
• This indexing constitutes the connection between
the current layer and the previous layer in a NN (see the sketch below)
• A Neuron encapsulates data and operations to:
• Activate a neuron (i.e. compute its output)
• Compute its Gradient
• Update its Weights (compute the momentum and
delta weight)
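A minimal sketch (not the generated code) of a weight list keyed by the previous layer's neuron Ids, which is what ties one layer to the one before it.

#include <map>

struct Weight { float W; float DeltaW; };

struct NeuronSketch {
    int itsId;
    float itsOutput;
    // One entry per neuron of the PREVIOUS layer, keyed by that neuron's Id:
    // itsWeightList[prevId] is the weight on the connection prevId -> this neuron.
    std::map<int, Weight> itsWeightList;
};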
An Artificial Neural Network is well suited to an
Object-Oriented design (cont.):
•Activator (“singleton”)
•Encapsulates the activation functions and
their corresponding derivatives
•The Neuron interacts with the Activator
when it needs an activation function
or its derivative
•It is instantiated by the ANNModel at
startup (a sketch of the pattern follows below)
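A minimal sketch of the singleton access pattern, using the S_GetInstance() name that appears in the code later in this deck; the real Activator is part of the Rhapsody model.

class Activator {
public:
    static Activator* S_GetInstance() {
        if (s_instance == 0) s_instance = new Activator();  // created once, at startup
        return s_instance;
    }
private:
    Activator() {}                   // no public construction: one shared instance
    static Activator* s_instance;
};
Activator* Activator::s_instance = 0;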
Use the Rational Rhapsody MDA tool to design a NN
and visualize the two major states of execution in
a NN:
•Forward Propagation state
• Compute the Outputs
• Compute the Error (Cost, Loss)
•Back Propagation state
• Minimize the Error (Cost) by using gradient descent
• Compute gradients - for the Output layer, for the Hidden layer
• Update weights - for the Output layer, for the Hidden layer
The NN Model has two major states of execution:
1. Forward_Propagation state:
•Activation sub-state - compute the Outputs
•Compute_Cost sub-state - compute the Error
2. Back_Propagation state:
•Compute_Gradients sub-state
•Update_Weights sub-state
In addition, a Configuration state generates the inputs, target outputs, and the
weights before training (a skeleton of the execution cycle follows below).
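A rough, compilable skeleton of that execution cycle. The Rhapsody model drives it with a statechart and events rather than an explicit loop, and the cost update here is only a stand-in.

#include <cstdio>

enum TrainState { eConfigure, eForwardPropagation, eBackPropagation, eSuccess };

int main() {
    TrainState state = eConfigure;
    float cost = 1.0f;
    int iteration = 0;
    while (state != eSuccess && iteration < 1000) {
        switch (state) {
        case eConfigure:            // generate X, Y and the initial weights
            state = eForwardPropagation;
            break;
        case eForwardPropagation:   // Activation + Compute_Cost sub-states
            cost *= 0.9f;           // stand-in for the real cost computation
            state = (cost <= 1e-6f) ? eSuccess : eBackPropagation;
            break;
        case eBackPropagation:      // Compute_Gradients + Update_Weights sub-states
            ++iteration;
            state = eForwardPropagation;
            break;
        default:
            break;
        }
    }
    std::printf("iterations: %d, final cost: %g\n", iteration, cost);
    return 0;
}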
Here is the Logical View – Class diagram
Dynamic View – IDLE state
Dynamic View – CONFIGURE state
Dynamic View – FP->ACTIVATE state
Dynamic View – FP->COMPUTE_COST state
Dynamic View – BP --> COMPUTE_GRADIENTS state
Dynamic View – BP -->Update_Weights state
Dynamic View – SUCCESS state
Panel GUI interface
Panel GUI interface - Success
FORWARD PROPAGATION
[Network diagram: inputs X1, X2, X3 plus a bias input (1.0); hidden layer Hidden1 with neurons h1, h2, h3 and a bias neuron h0 (Ah0 = 1); output layer with neurons O1, O2, O3. Each hidden neuron hj carries a weight list hj_Wlist = {WB_hj, WX1_hj, WX2_hj, WX3_hj}; each output neuron Oi carries Oi_Wlist = {WB_Oi, Wh1_Oi, Wh2_Oi, Wh3_Oi}.]
Zi = Sum of (w * previous neuron output)
Zh1 = (h1_Wlist[0]*1.0 + h1_Wlist[1]*X1 + h1_Wlist[2]*X2 + h1_Wlist[3]*X3) =
(WB_h1*1.0 + WX1_h1*X1 + WX2_h1*X2 + WX3_h1*X3)
Ah1 = tanh(Zh1)
ZO1 = (O1_Wlist[0]*1.0 + O1_Wlist[1]*Ah1 + O1_Wlist[2]*Ah2 + O1_Wlist[3]*Ah3) =
(WB_O1*1.0 + Wh1_O1*Ah1 + Wh2_O1*Ah2 + Wh3_O1*Ah3)
Y1hat = tanh(ZO1); Err1 = Y1 – Y1hat
Y2hat = tanh(ZO2); Err2 = Y2 – Y2hat
Y3hat = tanh(ZO3); Err3 = Y3 – Y3hat
Cost = (Err1^2 + Err2^2 + Err3^2) / (2*3)
FORWARD PROPAGATION – the Cost function
Cost(w, b) = (1 / (2N)) * Σ (i = 1 .. N) (yi – yihat)^2
where: yi – target, expected, desired, ideal
yihat – predicted, computed, actual
N – number of outputs in the Output layer
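A small sketch of this cost computation for N = 3 outputs; the target and predicted values are illustrative.

#include <cstdio>

int main() {
    const int N = 3;
    float Y[N]    = { 0.5f, -0.2f, 0.1f };   // targets (illustrative)
    float Yhat[N] = { 0.4f, -0.1f, 0.3f };   // predictions (illustrative)
    float cost = 0.0f;
    for (int i = 0; i < N; ++i) {
        float err = Y[i] - Yhat[i];
        cost += err * err;
    }
    cost /= (2.0f * N);                      // Cost = sum(err^2) / (2N)
    std::printf("Cost = %f\n", cost);
    return 0;
}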
COMPUTE GRADIENTS
Output Layer Gradient – Go(i)
For each output neuron Oi (AF is the activation function, AF’ its derivative):
Y1hat = AF(Zo1); Err1 = Y1 – Y1hat; Go1 = Err1 * AF’(Zo1)
Y2hat = AF(Zo2); Err2 = Y2 – Y2hat; Go2 = Err2 * AF’(Zo2)
Y3hat = AF(Zo3); Err3 = Y3 – Y3hat; Go3 = Err3 * AF’(Zo3)
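With tanh as the activation function, AF’(z) = tanhD(z) = 1 - tanh(z)^2. A one-neuron sketch with illustrative values:

#include <cmath>
#include <cstdio>

int main() {
    float ZO1 = 0.3f, Y1 = 0.8f;              // illustrative values
    float Y1hat = std::tanh(ZO1);             // forward-pass output
    float Err1  = Y1 - Y1hat;
    float tanhD = 1.0f - std::tanh(ZO1) * std::tanh(ZO1);
    float Go1   = Err1 * tanhD;               // output-layer gradient
    std::printf("Go1 = %f\n", Go1);
    return 0;
}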
COMPUTE GRADIENTS
Hidden Layer Gradient – Gh(i)
[Same network diagram: each hidden neuron collects the gradients of the next layer through the weights that connect it to that layer.]
For hidden neuron h1:
SumG = (O1_Wlist[1]*Go1 + O2_Wlist[1]*Go2 + O3_Wlist[1]*Go3) =
Wh1_o1*Go1 + Wh1_o2*Go2 + Wh1_o3*Go3
Gh1 = SumG * tanhD(Zh1)
UPDATE WEIGHTS
General rule: W += (LearnRate * PrevOut * CurrentG); LearnRate = 0.5
Hidden layer weights for h1 (PrevOut is the corresponding input, or 1.0 for the bias):
WB_h1 += (0.5*1.0*Gh1); WX1_h1 += (0.5*X1*Gh1); WX2_h1 += (0.5*X2*Gh1); WX3_h1 += (0.5*X3*Gh1)
Output layer weights (PrevOut is the hidden activation, or 1.0 for the bias):
WB_o1 += (0.5*1.0*Go1); WB_o2 += (0.5*1.0*Go2); WB_o3 += (0.5*1.0*Go3)
Wh1_o1 += (0.5*Ah1*Go1); Wh1_o2 += (0.5*Ah1*Go2); Wh1_o3 += (0.5*Ah1*Go3)
Cost = (Err1^2 + Err2^2 + Err3^2) / (2*3)
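As a quick worked example with illustrative numbers (not taken from an actual run): with LearnRate = 0.5, Ah1 = 0.6 and Go1 = 0.1, the update is Wh1_o1 += 0.5 * 0.6 * 0.1 = 0.03, so that weight moves by 0.03 in the direction that reduces the error.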
UPDATE WEIGHTS (with momentum)
General rule: W += ΔW(t) + momt, where momt = α * ΔW(t-1) and ΔW(t) = LR * PrevOut * CurrentG
Output layer weights from h1 (PrevOut = Ah1):
Wh1_o1 += ΔW(t) + momt; momt = α * ΔW(t-1); ΔW = LR * Ah1 * Go1
Wh1_o2 += ΔW(t) + momt; momt = α * ΔW(t-1); ΔW = LR * Ah1 * Go2
Wh1_o3 += ΔW(t) + momt; momt = α * ΔW(t-1); ΔW = LR * Ah1 * Go3
Hidden layer weights for h1 (PrevOut = the input Xi, or 1.0 for the bias):
WB_h1 += ΔW(t) + momt; momt = α * ΔW(t-1); ΔW = LR * 1.0 * Gh1
WX1_h1 += ΔW(t) + momt; momt = α * ΔW(t-1); ΔW = LR * X1 * Gh1
WX2_h1 += ΔW(t) + momt; momt = α * ΔW(t-1); ΔW = LR * X2 * Gh1
WX3_h1 += ΔW(t) + momt; momt = α * ΔW(t-1); ΔW = LR * X3 * Gh1
float Neuron::Activate(Layer* prevLayer, const Activation_Type& activation) {
    assert(prevLayer != NULL);
    // Iterate through the previous layer neurons to compute the product sum (Zi).
    // This is Fan-In: from many (previous) to one (current).
    itsInputSum = ZERO_FLOAT;
    OMIterator<Neuron*> iPrevNeuron = prevLayer->getItsNeuronList();
    for (iPrevNeuron.reset(); *iPrevNeuron != NULL; ++iPrevNeuron) {
        float w = itsWeightList[(*iPrevNeuron)->itsId]->W;
        float output = (*iPrevNeuron)->itsOutput;
        itsInputSum += w * output;
    }
    itsOutput = Activator::S_GetInstance()->Run(itsInputSum, activation);
    return itsOutput;
}
// Output layer
void Neuron::ComputeOutputGradient(float expectedOutput,
                                   const Activation_Type& activation, bool useOutput) {
    itsError = expectedOutput - itsOutput;
    float x = itsInputSum;                    // default
    if (useOutput == true) { x = itsOutput; } // experiment
    float deriv = Activator::S_GetInstance()->RunDeriv(x, activation);
    itsGradient = itsError * deriv;
}
void Neuron::ComputeHiddenGradient(Layer* nextLayer,
                                   const Activation_Type& activation, bool useOutput) {
    // Compute the contribution of each current neuron to the network Error
    itsGradientSum = ZERO_FLOAT;
    OMIterator<Neuron*> iNextNeuron = nextLayer->getItsNeuronList();
    for (iNextNeuron.reset(); *iNextNeuron != NULL; ++iNextNeuron) {
        if ((*iNextNeuron)->itsBiasFlag == false) { // skip the bias neuron
            float w = (*iNextNeuron)->itsWeightList[itsId]->W;
            float gradient = (*iNextNeuron)->itsGradient;
            itsGradientSum += w * gradient;
        }
    } // for()
    float x = itsInputSum;                    // default
    if (useOutput == true) { x = itsOutput; } // experiment
    float deriv = Activator::S_GetInstance()->RunDeriv(x, activation);
    itsGradient = itsGradientSum * deriv;
}
void Neuron::UpdateWeights(
        Layer* prevLayer, float learningRate, float alpha) {
    OMIterator<Neuron*> iPrevNeuron = prevLayer->getItsNeuronList();
    for (iPrevNeuron.reset(); *iPrevNeuron != NULL; ++iPrevNeuron) {
        int id = (*iPrevNeuron)->itsId;
        float output = (*iPrevNeuron)->itsOutput;
        float momentum = alpha * itsWeightList[id]->DeltaW;          // momt = alpha * DeltaW(t-1)
        float deltaW = learningRate * output * itsGradient + momentum; // LR * PrevOut * CurrentG + momt
        itsWeightList[id]->DeltaW = deltaW;
        itsWeightList[id]->W += deltaW;
    }
}
float Activator::Run(float x, const Activation_Type& activation) {
    float output = ZERO_FLOAT;
    switch (activation) {
    case eSigmoid:
        output = Sigmoid(x); break;
    case eReLu:
        output = ReLu(x); break;
    case eLeakyReLu:
        output = LeakyReLu(x); break;
    case eTanh:
    default:
        output = Tanh(x); break;
    }
    return output;
}
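The activation helpers called by Activator::Run() are not shown in the deck. A minimal sketch of what they could look like (written as free functions for brevity; the leaky slope 0.01 is an assumption):

#include <cmath>

float Tanh(float x)       { return std::tanh(x); }
float Sigmoid(float x)    { return 1.0f / (1.0f + std::exp(-x)); }
float ReLu(float x)       { return x > 0.0f ? x : 0.0f; }
float LeakyReLu(float x)  { return x > 0.0f ? x : 0.01f * x; }

// Matching derivatives, as would be used by RunDeriv() when computing gradients:
float TanhD(float x)      { float t = std::tanh(x); return 1.0f - t * t; }
float SigmoidD(float x)   { float s = Sigmoid(x);   return s * (1.0f - s); }
float ReLuD(float x)      { return x > 0.0f ? 1.0f : 0.0f; }
float LeakyReLuD(float x) { return x > 0.0f ? 1.0f : 0.01f; }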
[Charts: weight values per connection (LId, NId, PrevNId, Weight) for a Fully Connected Network – 11 Input neurons, 2 Hidden layers of 3 neurons each, 10 Outputs; and Cost vs. training iteration (1 to 721) for different LearnRate (LRate) and Alpha settings.]