Advanced Artificial Neural
Networks
Perceptron & Beyond
Lecture 2 & 3
Dr. Tehseen Zia
Biological Neuron
Perceptron
• Rosenblatt proposed a binary classification method
• Key idea:
• One weight per input
• Multiply each input by its weight, sum the results, and add a bias
• If the result is larger than a threshold, return 1; otherwise return 0 (see the sketch below)
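A minimal sketch of this rule in Python; the function name, example weights, and threshold are illustrative, not taken from the slides:

```python
def perceptron(inputs, weights, bias, threshold=0.0):
    """Weighted sum of inputs plus a bias, thresholded to 0 or 1."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > threshold else 0

# Two inputs, both weighted 1, no bias, threshold 10: 4 + 5 = 9 <= 10 -> 0
print(perceptron([4, 5], [1, 1], bias=0, threshold=10))
```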
Example Problem
• Will I pass this course?
• Let's start with a simple two-feature model.
[Diagram: inputs x₁ and x₂, weighted by w₁ = 1 and w₂ = 1, summed at Σ and compared against a threshold τ = 10 to produce the output y.]
Which variables of the model are known and which are not?
• Known: the inputs x₁ and x₂.
• Unknown (marked X on the slide): the weights w₁, w₂ and the threshold τ.
Learning objective: to find values of the unknown variables (the weights and the threshold).
Example Problem: How to Fix the Threshold?
[Diagram: the same model with an extra constant input x₀ = 1 weighted by w₀ = −10, and the threshold reduced to τ = 0.]
• The threshold can be folded into the model as a bias: add a constant input x₀ = 1 with weight w₀ = −10, so "w₁x₁ + w₂x₂ ≥ 10" becomes "w₀x₀ + w₁x₁ + w₂x₂ ≥ 0".
Example Problem: A Worked Prediction
[Diagram: x₁ = 4, x₂ = 5, w₁ = 1, w₂ = 1.]
• Threshold form: Σ = 1·4 + 1·5 = 9, which is below τ = 10, so y = 0.
• Bias form: Σ = 1·4 + 1·5 + (−10)·1 = −1, which is below τ = 0, so y = 0. The two formulations agree.
Example Problem
• How do we implement the model? See the sketch below.
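A minimal Python sketch of the bias form of the model, using the numbers from the worked example (the function and variable names are mine):

```python
def predict(x1, x2, w1=1.0, w2=1.0, b=-10.0):
    """Bias form of the two-feature model: output 1 when w1*x1 + w2*x2 + b >= 0."""
    z = w1 * x1 + w2 * x2 + b
    return 1 if z >= 0 else 0

print(predict(4, 5))  # -> 0, since 4 + 5 - 10 = -1 < 0
print(predict(6, 5))  # -> 1, since 6 + 5 - 10 =  1 >= 0
```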
Training a Perceptron
• Learning algorithm (sketched in code below):
• Initialize the weights randomly
• Take one sample and predict its label
• For wrong predictions, update the weights:
• If the correct output was 1 (we predicted 0), increase the weights
• If the correct output was 0 (we predicted 1), decrease the weights
• Repeat until no errors are made
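A compact sketch of this loop in Python. The learning rate, epoch cap, and the update w += lr·(y − ŷ)·x are standard choices for the perceptron rule, assumed here rather than read off the slides:

```python
import random

def train_perceptron(samples, n_features, lr=0.1, max_epochs=100):
    """Perceptron learning rule: nudge each weight by lr * (target - prediction) * input."""
    w = [random.uniform(-1, 1) for _ in range(n_features)]
    b = random.uniform(-1, 1)
    for _ in range(max_epochs):
        errors = 0
        for x, y in samples:
            y_hat = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0
            if y_hat != y:
                errors += 1
                # (y - y_hat) is +1 when weights should grow, -1 when they should shrink
                w = [wi + lr * (y - y_hat) * xi for wi, xi in zip(w, x)]
                b += lr * (y - y_hat)
        if errors == 0:  # a full pass with no mistakes: stop
            break
    return w, b

# Example: learn AND from its truth table
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
print(train_perceptron(data, n_features=2))
```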
But how does the algorithm know that a prediction is wrong?
The cost function (also called the loss function) tells it.
What is the cost function?
L(W, b) = ∑ᵢ₌₁ᵐ (y − ŷ)²
where y is the actual class and ŷ is the predicted class.
Learning Problem
• Problem: learn the parameters W, b from data
• Loss function: L(W, b)
• Objective: minimize L(W, b) with respect to W and b
• Algorithm:
• Start with some W, b
• Keep changing W, b to reduce L(W, b)
• Until we (hopefully) end up at a minimum
How much should the weights be increased or decreased?
The gradient descent algorithm tells us.
Hill-Descent
[Figure: the loss L(W) plotted against the weight W; inset: a single neuron x → Σ (weight w, threshold τ) → y, whose loss is L(W).]
"Climbing down from Everest in thick fog with amnesia"
Hill-Descent
[Figure: the loss curve L(W) vs. W.]
Question 1: What do we need to take a step downwards? Direction and step size.
From Hill-Descent to Gradient-Descent
[Figure: the loss curve L(W) vs. W.]
Question 2: How do we find the direction in which L(W) decreases? Follow the negative gradient, −∂L(W)/∂W.
From Hill-Descent to Gradient-Descent
[Figure: the loss curve L(W) vs. W.]
Knowing the direction is not enough; we take a step of size α in that direction:
W := W − α · ∂L(W)/∂W
Gradient-Descent Algorithm
repeat until convergence {
    wⱼ := wⱼ − α · ∂L(W)/∂wⱼ   (update all weights j simultaneously)
}
A numeric sketch follows below.
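A minimal numeric sketch of this update rule, on the one-dimensional loss L(w) = (w − 3)² whose gradient is 2(w − 3); the example loss is mine, chosen only so the answer is easy to check:

```python
def gradient_descent(grad, w0, lr=0.1, steps=50):
    """Repeat the update w := w - lr * dL/dw for a fixed number of steps."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

# L(w) = (w - 3)^2  =>  dL/dw = 2 * (w - 3); the minimum is at w = 3
print(gradient_descent(lambda w: 2 * (w - 3), w0=0.0))  # -> approximately 3.0
```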
Gradient-Descent Algorithm: Why Does It Work?
[Figure: L(wⱼ) plotted against wⱼ at a point on each side of the minimum.]
• Where the slope ∂L/∂wⱼ is positive, the update subtracts a positive quantity, so wⱼ(new) < wⱼ(old): we move left, downhill.
• Where the slope is negative, the update subtracts a negative quantity, so wⱼ(new) > wⱼ(old): we move right, downhill.
Gradient-Descent Algorithm: Why Does It Work?
[Figure: L(wⱼ) at its minimum, where the slope is zero.]
At a minimum, wⱼ(new) = wⱼ(old). Why? Because ∂L/∂wⱼ = 0 there, the update wⱼ − α · 0 leaves the weight unchanged: the algorithm stops moving exactly where we want it to stop.
Gradient Descent Algorithm: Semantics of the Learning Rate
[Figure: two loss curves L(wⱼ) vs. wⱼ, one traced with small steps, the other with large overshooting steps.]
• If α is small, gradient descent converges slowly.
• If α is large, gradient descent converges quickly, but it can overshoot the minimum or even diverge.
Both regimes are illustrated numerically below.
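A quick numeric illustration of both regimes on the loss L(w) = w², whose gradient is 2w; the loss and the particular α values are mine:

```python
def run(lr, steps=20, w=1.0):
    """Iterate w := w - lr * 2w on L(w) = w^2, whose minimum is at w = 0."""
    for _ in range(steps):
        w -= lr * 2 * w
    return w

print(run(lr=0.01))  # small alpha: still far from 0 after 20 steps (slow)
print(run(lr=0.4))   # moderate alpha: essentially 0 (fast convergence)
print(run(lr=1.1))   # too-large alpha: |w| grows every step (divergence)
```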
Convergence
[Figure: successive gradient-descent steps on L(wⱼ), with the gradient ∂L(W)/∂w shrinking at each step.]
As it approaches a local minimum, gradient descent automatically takes smaller steps: each step is α · ∂L(W)/∂w, and the gradient's magnitude shrinks near the minimum even though α stays fixed.
Derivation of ∂L/∂wⱼ
• Loss function: L(W, b) = ∑ᵢ₌₁ᵐ (y − ŷ)²
• ∂L/∂wⱼ = ∑ᵢ₌₁ᵐ ∂/∂wⱼ (y − ŷ)² = −2 ∑ᵢ₌₁ᵐ (y − ŷ) · ∂ŷ/∂wⱼ = …
Gradient-Descent Algorithm (recap)
repeat until convergence {
    wⱼ := wⱼ − α · ∂L(W)/∂wⱼ
}
Perceptron Decision Boundary: Linear
[Figure: a single-input neuron x₁ → Σ → ŷ with weight w₁ = 0.5, and the resulting linear relationship between x₁ and ŷ.]
Linearly and Nonlinearly Separable Problems
Implementing AND with a Perceptron
Σ = w₁x₁ + w₂x₂, with w₁ = 1, w₂ = 1
ŷ = 1 if Σ ≥ 2, otherwise ŷ = 0
Implementing OR with a Perceptron
Σ = w₁x₁ + w₂x₂, with w₁ = 1, w₂ = 1
ŷ = 1 if Σ ≥ 1, otherwise ŷ = 0
Both gates are checked in the sketch below.
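A short check of both gates over their truth tables, with w₁ = w₂ = 1 and the thresholds 2 and 1 from the slides:

```python
def gate(x1, x2, threshold):
    """Perceptron with w1 = w2 = 1: fires when x1 + x2 >= threshold."""
    return 1 if 1 * x1 + 1 * x2 >= threshold else 0

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, "AND:", gate(x1, x2, 2), "OR:", gate(x1, x2, 1))
```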
Implementing XOR with a Perceptron
We cannot, because XOR is not linearly separable: no single straight line separates the inputs where XOR is 1 from those where it is 0.
Implementing XOR with Multiple Perceptrons
[Figure: two linear boundaries, one from Perceptron #1 and one from Perceptron #2, jointly carving out the XOR regions.]
Nonlinearly Separable Problems and the Multi-Perceptron
Decision rule:
if Σ of P1 < 0 → black
else if Σ of P2 > 0 → black
else → white
Multi-Perceptron Architecture
[Figure: inputs x₁ and x₂ feed Perceptron #1 and Perceptron #2 through weights w₁₁, w₁₂, w₂₁, w₂₂; the two hidden outputs feed Perceptron #3, which produces ŷ. A code sketch of such a network follows below.]
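A minimal sketch of a two-layer network of this shape solving XOR. The particular weights and thresholds are my assumptions (one standard choice), not values from the slides:

```python
def step(z):
    """Hard-threshold activation: 1 if z >= 0, else 0."""
    return 1 if z >= 0 else 0

def xor_net(x1, x2):
    h1 = step(x1 + x2 - 0.5)    # Perceptron #1 acts as OR
    h2 = step(x1 + x2 - 1.5)    # Perceptron #2 acts as AND
    return step(h1 - h2 - 0.5)  # Perceptron #3: OR and not AND, i.e. XOR

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, "->", xor_net(x1, x2))  # prints 0, 1, 1, 0
```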
Nonlinearity is Important
• Perceptron 1: z₁ = w₁₁x₁ + w₂₁x₂
• Perceptron 2: z₂ = w₁₂x₁ + w₂₂x₂
• Perceptron 3: ŷ = v₁z₁ + v₂z₂ (writing the output weights as v₁, v₂)
Substituting: ŷ = v₁(w₁₁x₁ + w₂₁x₂) + v₂(w₁₂x₁ + w₂₂x₂) = (v₁w₁₁ + v₂w₁₂)x₁ + (v₁w₂₁ + v₂w₂₂)x₂, which is still a linear function of x₁ and x₂.
Without a nonlinear activation between layers, stacked perceptrons collapse into a single linear model (verified numerically below).
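A quick numeric check of this collapse; the weight values are arbitrary, chosen only for illustration:

```python
import numpy as np

W1 = np.array([[1.0, -2.0],   # first-layer weights (arbitrary)
               [0.5,  3.0]])
v = np.array([2.0, -1.0])     # output-layer weights (arbitrary)
x = np.array([0.7, -1.3])

two_layer = v @ (W1 @ x)      # Perceptron 3 applied to Perceptrons 1 and 2
one_layer = (v @ W1) @ x      # the single equivalent linear model

print(two_layer, one_layer)   # identical: stacking linear layers stays linear
```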
Computational Neuron
[Figure: inputs x₁, x₂ with weights w₁, w₂; pre-activation z = w₁x₁ + w₂x₂; activation a = σ(z); output a = ŷ.]
σ(z) = 1 / (1 + e⁻ᶻ)
Interpretation
a = σ(z) is interpreted as a probability: a = p(y = 1 | x, w)
Decision rule:
ŷ = 1 if a ≥ 0.5
ŷ = 0 if a < 0.5
Decision Boundary
z = w₀ + w₁x₁ + w₂x₂, with a = σ(z)
Suppose w₀ = −3, w₁ = 1, w₂ = 1.
Then z ≥ 0 exactly when x₁ + x₂ ≥ 3, so the line x₁ + x₂ = 3 is the decision boundary (see the sketch below).
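A small sketch of this neuron in Python, using the weights from the slide:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x1, x2, w0=-3.0, w1=1.0, w2=1.0):
    """a = sigmoid(w0 + w1*x1 + w2*x2), read as p(y = 1 | x, w)."""
    a = sigmoid(w0 + w1 * x1 + w2 * x2)
    return a, 1 if a >= 0.5 else 0

print(neuron(1, 1))      # x1 + x2 = 2 < 3: a < 0.5, predict 0
print(neuron(1.5, 1.5))  # exactly on the boundary: a = 0.5, predict 1
print(neuron(3, 2))      # x1 + x2 = 5 > 3: a > 0.5, predict 1
```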
Learning Task
• Dataset: m labeled examples (x⁽ⁱ⁾, y⁽ⁱ⁾)
• Input: x = (x₁, …, xₙ)
• Classifier: ŷ = σ(z), where z = w₁x₁ + w₂x₂ + … + wₙxₙ
• How do we choose the parameters w?
Cost Function
Mean square loss function: L(W, b) = ∑ᵢ₌₁ᵐ (y − ŷ)²
For a probabilistic classifier we want a per-example cost with the following behavior:
[Figure: per-example cost plotted against the activation a, for y = 1 and for y = 0.]
• Cost = 0 if y = 1 and a = 1; but as a → 0, the cost → ∞
• Cost = 0 if y = 0 and a = 0; but as a → 1, the cost → ∞
The cross-entropy loss function has exactly this behavior (transcribed into code below):
J(W) = −(1/m) ∑ᵢ₌₁ᵐ [y·log(ŷ) + (1 − y)·log(1 − ŷ)]
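A direct transcription of this loss into Python; the small epsilon clamp is my addition to keep the logarithms finite:

```python
import math

def cross_entropy(y_true, y_pred, eps=1e-12):
    """J(W) = -(1/m) * sum over examples of [y*log(a) + (1-y)*log(1-a)]."""
    m = len(y_true)
    total = 0.0
    for y, a in zip(y_true, y_pred):
        a = min(max(a, eps), 1 - eps)  # clamp to avoid log(0)
        total += y * math.log(a) + (1 - y) * math.log(1 - a)
    return -total / m

print(cross_entropy([1, 0], [0.9, 0.1]))  # confident and correct: small loss
print(cross_entropy([1, 0], [0.1, 0.9]))  # confident and wrong: large loss
```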
Learning Parameters: Gradient Descent
Algorithm
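The slide stops at its title, so here is a hedged end-to-end sketch: batch gradient descent on the cross-entropy loss for a single sigmoid neuron. The toy dataset and hyperparameters are mine; the gradient ∂J/∂wⱼ = (1/m) ∑ (a − y) xⱼ is the standard result for this loss:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, lr=0.5, epochs=1000):
    """Batch gradient descent for a sigmoid neuron under cross-entropy loss.

    dJ/dw_j = (1/m) * sum_i (a_i - y_i) * x_ij
    dJ/db   = (1/m) * sum_i (a_i - y_i)
    """
    n, m = len(X[0]), len(X)
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for xi, yi in zip(X, y):
            a = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            for j in range(n):
                grad_w[j] += (a - yi) * xi[j] / m
            grad_b += (a - yi) / m
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Toy run: learn OR, which is linearly separable
X, y = [(0, 0), (0, 1), (1, 0), (1, 1)], [0, 1, 1, 1]
w, b = train(X, y)
print([round(sigmoid(w[0] * x1 + w[1] * x2 + b)) for x1, x2 in X])  # -> [0, 1, 1, 1]
```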