Welcome to my presentation on
Regression and Classification: An Artificial
Neural Network Approach
Presented by
Md. Menhazul Abedin
Research student
Dept. of Statistics
University of Rajshahi
Rajshahi-6205
Dedication
• This presentation is dedicated to my
honorable supervisor
12/12/2016 2
Three pioneers of ANN
Warren McCulloch Walter Pitts
Frank Rosenblatt
Outlines
Motivation/Why this study?
Objectives
Methodology
Findings
Conclusion
Limitation
Area of further research
Motivation/Why this study?
• Data come as vectors, matrices, sound, images, waves, strings, text, etc.
• How to analyze them? This has been a stumbling block for several decades.
Objectives
• To study neural networks as a technique for regression and classification.
• To compare neural networks with classical regression and classification techniques.
• To study the limitations of neural networks.
• Structure of neuron
What is ANN?
Biological neural network
Artificial neural network
• How many hidden layers should be considered?
→ More hidden layers can approximate more nonlinearity.
• More hidden layers → more time needed to converge.
• Weights are adjusted by an iterative method (backpropagation).
• Analogy between biological and artificial neural networks
Historical Background of Artificial
Neural Network
• 1943: Neurophysiologist Warren McCulloch and mathematician Walter Pitts wrote a paper on how neurons might work.
• 1949: Donald Hebb wrote The Organization of Behavior (on the ways in which humans learn).
• 1951: M. Minsky built a reinforcement-based network learning system.
• 1958: F. Rosenblatt introduced the first practical Artificial Neural Network (ANN), the perceptron.
• 1960: B. Widrow & M.E. Hoff introduced an adaptive perceptron-like network using the Least Mean Square (LMS) error algorithm.
• 1969: Marvin Minsky and Seymour Papert showed that the perceptron model is not capable of representing many important problems.
• 1973: Christoph von der Malsburg used a neuron model that was nonlinear and biologically more motivated.
• 1974: Paul Werbos developed a learning procedure called backpropagation of error.
Historical Background of Artificial
Neural Network
• 1986: The application area of MLP networks remained rather limited until the breakthrough when a general backpropagation algorithm for a multi-layered perceptron was introduced by Rumelhart and McClelland.
• 1988: Radial Basis Function (RBF) networks were first introduced by Broomhead & Lowe. Although the basic idea of RBF had been developed about 30 years earlier under the name "method of potential functions", the work by Broomhead & Lowe opened a new frontier in the neural network community.
ANN regression
• Linear activation function on the output → gives continuous values.
ANN classification
• For two classes → sigmoid function
(output > 0.5 → one class; output < 0.5 → the other class)
• For more than two classes → softmax function
(gives a probability for each class)
• The tanh function may also be used as an activation function.
Activation functions
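• Linear function: $\varphi(\eta) = \eta$
• Sigmoid function: $\varphi(\eta) = \dfrac{1}{1 + e^{-\eta}}$, where $\eta = x\theta$.
• Softmax function: $\varphi(\eta) = \left( \dfrac{\exp(\eta_1)}{\sum_{i=1}^{k} \exp(\eta_i)}, \ldots, \dfrac{\exp(\eta_k)}{\sum_{i=1}^{k} \exp(\eta_i)} \right)$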
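The three activation functions listed on this slide can be sketched in plain Python. This is an illustrative sketch only: the function names are assumptions, and the max-subtraction in the softmax is a common numerical-stability trick that the slides do not mention.

```python
import math

def linear(eta):
    """Identity activation: returns the input unchanged (used for regression outputs)."""
    return eta

def sigmoid(eta):
    """Logistic sigmoid 1 / (1 + exp(-eta)), branched to avoid overflow for large |eta|."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)

def softmax(etas):
    """Softmax over a list of scores; subtracting the max keeps exp() from overflowing."""
    m = max(etas)
    exps = [math.exp(e - m) for e in etas]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
print(sigmoid(0.0))  # 0.5
print(sum(probs))    # approximately 1.0
```

The softmax output is a vector of class probabilities that sums to one, which is why it suits problems with more than two classes.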
Construction of cost function: sigmoid formulation
The perceptron learning model specifies the probability of a binary output $y_i \in \{0,1\}$ given the input $x_i$ as follows:
$$p(y_i \mid x_i, w) = \mathrm{Ber}\big(y_i \mid \mathrm{sigm}(x_i, w)\big)$$
$$p(y \mid X, w) = \prod_{i=1}^{n} \mathrm{Ber}\big(y_i \mid \mathrm{sigm}(x_i, w)\big)$$
$$p(y \mid X, w) = \prod_{i=1}^{n} \left[\frac{1}{1 + e^{-x_i w}}\right]^{y_i} \left[1 - \frac{1}{1 + e^{-x_i w}}\right]^{1 - y_i}$$
where $\pi_i = p(y_i = 1 \mid x_i, w) = \dfrac{1}{1 + e^{-x_i w}}$.
Cost function (cross entropy):
$$c(w) = -\log p(y \mid X, w) = -\sum_{i=1}^{n} \big[\, y_i \log \pi_i + (1 - y_i) \log(1 - \pi_i) \,\big]$$
Sigmoid: $\mathrm{sigm}(x_i, w) = \dfrac{1}{1 + e^{-x_i w}}$, which equals 0.5 at $x_i w = 0$.
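The cross-entropy cost built from the sigmoid can be evaluated directly. The sketch below follows the formula $c(w) = -\sum_i [y_i \log \pi_i + (1 - y_i)\log(1 - \pi_i)]$; the toy data and the function names are hypothetical and serve only to make the example runnable.

```python
import math

def sigm(x, w):
    """sigm(x_i, w) = 1 / (1 + exp(-x_i w)) for one input row x and weight vector w."""
    eta = sum(xj * wj for xj, wj in zip(x, w))
    return 1.0 / (1.0 + math.exp(-eta))

def cross_entropy(X, y, w):
    """c(w) = -sum_i [ y_i log(pi_i) + (1 - y_i) log(1 - pi_i) ]."""
    cost = 0.0
    for x_i, y_i in zip(X, y):
        pi_i = sigm(x_i, w)
        cost -= y_i * math.log(pi_i) + (1 - y_i) * math.log(1.0 - pi_i)
    return cost

# Hypothetical toy data: the first column is a constant 1 for the bias term.
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, -0.3]]
y = [1, 0, 1, 0]

print(cross_entropy(X, y, [0.0, 0.0]))  # with w = 0 every pi_i is 0.5, so the cost is 4 * log(2)
print(cross_entropy(X, y, [0.0, 2.0]))  # weights that separate the classes give a lower cost
```

Lower cost means the model assigns higher probability to the observed labels, which is what training tries to achieve.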
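Softmax formulation
$$\mathrm{sigm}(x_i, w) = \frac{1}{1 + e^{-x_i w}}$$
[Figure: two-input network. Labels: inputs $+1$, $x_{i1}$, $x_{i2}$; weights $b_1 = w_{10}$, $w_{11}$, $w_{21}$, $w_{12}$, $w_{22}$, $b_2 = w_{20}$; two summation units (Σ) producing $u_{11}$ and $u_{12}$; softmax layer]
$$\pi_{i1} = \frac{e^{x_i w_1}}{e^{x_i w_1} + e^{x_i w_2}}$$
$$\pi_{i2} = \frac{e^{x_i w_2}}{e^{x_i w_1} + e^{x_i w_2}}$$
$$\pi_{i1} + \pi_{i2} = 1$$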
Construction of cost function: softmax formulation
Indicator:
$$I_c(y_i) = \begin{cases} 1 & \text{if } y_i = c \\ 0 & \text{otherwise} \end{cases}$$
$$p(y_i \mid x_i, w) = \pi_{i1}^{I_0(y_i)} \, \pi_{i2}^{I_1(y_i)}$$
$$p(y \mid X, w) = \prod_{i=1}^{n} \pi_{i1}^{I_0(y_i)} \, \pi_{i2}^{I_1(y_i)}$$
$$p(y_i \mid x_i, w) = \begin{cases} \dfrac{e^{x_i w_1}}{e^{x_i w_1} + e^{x_i w_2}} & \text{if } y_i = 0 \\[2ex] \dfrac{e^{x_i w_2}}{e^{x_i w_1} + e^{x_i w_2}} & \text{if } y_i = 1 \end{cases}$$
$$c(w) = -\log p(y \mid X, w) = -\sum_{i=1}^{n} \big( I_0(y_i) \log \pi_{i1} + I_1(y_i) \log \pi_{i2} \big)$$
Pipeline: X → linear layer → log-softmax layer → NLL → $c(w)$
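The pipeline on this slide (linear layer, log-softmax layer, negative log-likelihood) can be sketched as three small functions. The names, the toy data and the weight values below are assumptions for illustration; the log-sum-exp step is a standard stability trick and is not taken from the slides.

```python
import math

def linear_layer(x, W):
    """Scores eta_c = x . w_c for each class c; W holds one weight vector per class."""
    return [sum(xj * wj for xj, wj in zip(x, w_c)) for w_c in W]

def log_softmax(etas):
    """log(softmax(eta)) computed with the log-sum-exp trick for numerical stability."""
    m = max(etas)
    log_norm = m + math.log(sum(math.exp(e - m) for e in etas))
    return [e - log_norm for e in etas]

def nll(X, y, W):
    """c(w) = -sum_i sum_c I_c(y_i) log(pi_ic): only the true class's log-probability counts."""
    return -sum(log_softmax(linear_layer(x_i, W))[y_i] for x_i, y_i in zip(X, y))

# Hypothetical toy data with two classes (labels 0 and 1) and a leading bias column.
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, -0.3]]
y = [1, 0, 1, 0]
W = [[0.0, -1.0], [0.0, 1.0]]  # w_1 scores class 0, w_2 scores class 1

log_pi = log_softmax(linear_layer(X[0], W))
print(sum(math.exp(v) for v in log_pi))  # pi_i1 + pi_i2, approximately 1
print(nll(X, y, W))
```

The indicator $I_c(y_i)$ appears in the code as list indexing by `y_i`: it simply picks out the log-probability of the observed class.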
Weight update (Backpropagation)
• Take the derivative of the cost with respect to each layer's inputs (layer-wise).
• Forward message: information flows from $z_1(x)$ to $z_4(x) = c$.
• Backward message: the error propagates backward and each layer updates its weights.
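The forward and backward messages can be illustrated on a tiny scalar network. The four-stage layout below (linear, sigmoid, linear, squared-error cost) is an assumption chosen to mirror the $z_1, \ldots, z_4$ notation; the slides do not specify the actual architecture, and all numeric values are hypothetical.

```python
import math

def forward(x, y, w1, w2):
    """Forward message: compute and keep every layer output z1..z4, where z4 is the cost."""
    z1 = w1 * x                       # linear layer
    z2 = 1.0 / (1.0 + math.exp(-z1))  # sigmoid hidden unit
    z3 = w2 * z2                      # linear output layer (regression)
    z4 = 0.5 * (z3 - y) ** 2          # squared-error cost c
    return z1, z2, z3, z4

def backward(x, y, w1, w2):
    """Backward message: apply the chain rule layer by layer, from the cost back to the weights."""
    z1, z2, z3, z4 = forward(x, y, w1, w2)
    dc_dz3 = z3 - y                    # derivative of the cost w.r.t. its input
    dc_dw2 = dc_dz3 * z2               # gradient for the output weight
    dc_dz2 = dc_dz3 * w2               # error passed back to the hidden unit
    dc_dz1 = dc_dz2 * z2 * (1.0 - z2)  # through the sigmoid
    dc_dw1 = dc_dz1 * x                # gradient for the hidden weight
    return dc_dw1, dc_dw2

# Hypothetical values, chosen only for illustration.
x, y, w1, w2, lr = 1.5, 0.8, 0.3, -0.4, 0.1
g1, g2 = backward(x, y, w1, w2)
cost_before = forward(x, y, w1, w2)[3]
w1, w2 = w1 - lr * g1, w2 - lr * g2   # one weight update
cost_after = forward(x, y, w1, w2)[3]
print(cost_before, cost_after)
```

A quick way to check a hand-written backward pass is to compare its gradients against finite differences of the forward pass.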
Optimization
Our goal is to minimize the cost function.
Different optimization techniques:
• Gradient descent algorithm
• Newton's algorithm
• Stochastic gradient descent (SGD)
• Online learning, batch & mini-batch optimization
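The difference between batch gradient descent and its stochastic or mini-batch variants can be sketched on a one-parameter least-squares problem. The data, learning rate and function names are hypothetical; Newton's algorithm is not shown here.

```python
import random

def gradient(w, batch):
    """Gradient of 0.5 * mean((w*x - y)^2) over a batch of (x, y) pairs."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)

def gradient_descent(data, lr=0.5, steps=200):
    """Batch gradient descent: every step uses the full data set."""
    w = 0.0
    for _ in range(steps):
        w -= lr * gradient(w, data)
    return w

def sgd(data, lr=0.5, epochs=100, batch_size=1, seed=0):
    """Stochastic / mini-batch gradient descent; batch_size=1 corresponds to online learning."""
    rng = random.Random(seed)
    data = list(data)
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            w -= lr * gradient(w, data[start:start + batch_size])
    return w

# Hypothetical noise-free data generated from y = 2x, so the minimizer is w = 2.
data = [(x / 10.0, 2.0 * x / 10.0) for x in range(1, 11)]
print(gradient_descent(data))
print(sgd(data, batch_size=1))
print(sgd(data, batch_size=5))
```

All three runs should end close to w = 2; they differ only in how many observations contribute to each update.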
Regression (Findings)
• Data sets used: 7 (regression: 4, classification: 3)
• Pharmaceuticals data:
  Size               26
  No. of variables   4 (one dependent and three independent)
  Outliers           Present (6th, 10th, and 26th observations)
  Autocorrelation    Absent
  Multicollinearity  Absent
  Normality          Present
  Data type          Real
  Cross validation   LOOCV
  Applied methods    Linear model, Polynomial & ANN
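Leave-one-out cross validation (LOOCV), used for the small data sets in this study, holds out each observation once and fits the model on the rest. The sketch below applies it to a one-predictor linear model; the data are hypothetical and are not the pharmaceuticals data, and the function names are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def loocv_mse(xs, ys):
    """LOOCV: hold out each observation once, fit on the rest, average the squared errors."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        errors.append((ys[i] - (a + b * xs[i])) ** 2)
    return sum(errors) / len(errors)

# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
print(loocv_mse(xs, ys))
```

With n observations the model is refitted n times, which is affordable for a data set of size 26 but costly for large ones.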
Regression (cont…)
ANN is the best regression model
Regression (cont…)
• Yacht Hydrodynamics data:
  Size               308
  No. of variables   7 (one dependent and six independent)
  Outliers           Absent
  Autocorrelation    Absent
  Multicollinearity  Absent
  Normality          Absent (clustered)
  Data type          Real
  Cross validation   Training set and test set
  Applied methods    Linear model, Polynomial & ANN
• Results for the Yacht Hydrodynamics data
• Repeated 100 times with different training and test sets
• Box plot of test error → gives a sense of the error variation
• ANN is the best regression model
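The repeated train/test procedure described above can be sketched as follows. This is a hedged illustration with a one-predictor linear model and hypothetical simulated data; the 30% test fraction, the seeds and the function names are assumptions, since the slides do not state how the splits were made.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def repeated_split_errors(xs, ys, repeats=100, test_fraction=0.3, seed=0):
    """Draw a new random train/test split each time and record the test mean squared error."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    n_test = max(1, int(len(xs) * test_fraction))
    errors = []
    for _ in range(repeats):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors.append(sum((ys[i] - (a + b * xs[i])) ** 2 for i in test) / n_test)
    return errors

# Hypothetical noisy data from y = 1 + 2x; not one of the data sets in the slides.
rng = random.Random(1)
xs = [i / 10.0 for i in range(50)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 0.5) for x in xs]

errors = repeated_split_errors(xs, ys)
q1, median, q3 = statistics.quantiles(errors, n=4)
print(min(errors), q1, median, q3, max(errors))  # five-number summary that a box plot visualizes
```

Comparing such summaries across models shows not only which model has the lower typical test error but also how much that error varies between splits.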
Regression (cont…)
• Simulated data-1:
  Size               1000
  No. of variables   10 (one dependent and nine independent)
  Outliers           Absent
  Autocorrelation    Absent
  Multicollinearity  Absent
  Normality          Present
  Data type          Real
  Cross validation   Training set and test set
  Applied methods    Linear model & ANN
• Results of Simulated data-1
• Repeated 100 times with different training and test sets
• Box plot of test error → gives a sense of the error variation
• ANN is the best regression model
Regression (cont…)
• Simulated data-2:
  Size               20000
  No. of variables   20 (one dependent and nine independent)
  Outliers           Absent
  Autocorrelation    Absent
  Multicollinearity  Strong
  Normality          Present
  Data type          Real
  Cross validation   Training set and test set
  Applied methods    Linear model & ANN
• Results of Simulated data-2
• Repeated 100 times with different training and test sets
• Box plot of test error → gives a sense of the error variation
• ANN is the best regression model
Classification
• IRIS data:
  Size               150
  No. of variables   5 (one dependent and four independent)
  No. of classes     Three (Setosa, Versicolor, Virginica)
  Type               Balanced
  Data type          Real
  Cross validation   LOOCV
  Applied methods    Logistic, LDA, QDA, KNN, NB & ANN
Classification (cont…)
• Results
• ANN is the best classifier
  Methods    Classification rate   Misclassification rate
  Logistic   0.98                  0.02
  LDA        0.98                  0.02
  QDA        0.98                  0.02
  KNN        0.95                  0.05
  NB         0.95                  0.05
  ANN        0.99                  0.01
Classification (cont…)
• Fertility data:
  Size               100
  No. of variables   5 (one dependent and four independent)
  No. of classes     Two (Normal & Altered)
  Type               Imbalanced
  Data type          Real
  Cross validation   LOOCV
  Applied methods    Logistic, LDA, KNN, NB & ANN
Classification (cont…)
• Results
• ANN is the best classifier
  Methods    Accuracy   Sensitivity   Specificity   PPV    NPV
  Logistic   0.84       0.87          0.00          0.96   0.00
  LDA        0.83       0.95          0.00          0.87   0.00
  KNN        0.81       0.90          0.16          0.88   0.20
  NB         0.82       0.94          0.00          0.87   0.00
  ANN        0.88       0.95          0.34          0.91   0.50
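The five measures in this table all derive from the four cells of a binary confusion matrix. The sketch below uses hypothetical counts (not the study's actual confusion matrix) to show how a classifier that always predicts the majority class can reach high accuracy with zero specificity on imbalanced data.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    def ratio(num, den):
        # A ratio with an empty denominator is undefined; this sketch reports it as None.
        return num / den if den else None
    return {
        "accuracy": ratio(tp + tn, tp + fn + fp + tn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }

# Hypothetical counts: 88 positives and 12 negatives, with every case predicted positive.
always_positive = binary_metrics(tp=88, fn=0, fp=12, tn=0)
print(always_positive)
```

This is why sensitivity, specificity, PPV and NPV are reported alongside accuracy for the imbalanced data set.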
Classification (cont…)
• Leukemia data:
  Size               72
  No. of variables   7130 (one dependent and 7129 independent)
  No. of classes     Two (ALL & AML)
  Type               Balanced
  Data type          Real
  Cross validation   LOOCV
  Applied methods    Logistic, LDA, QDA, KNN, NB & ANN
Classification (cont…)
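• Results
• ANN is the best classifier overall: QDA and NB have marginally higher accuracy (0.65 vs. 0.64) but zero specificity, while ANN has the highest specificity (0.56).
  Methods    Accuracy   Sensitivity   Specificity
  Logistic   0.47       0.62          0.31
  LDA        0.62       0.68          0.52
  QDA        0.65       1.00          0.00
  KNN        0.54       0.65          0.32
  NB         0.65       1.00          0.00
  ANN        0.64       0.68          0.56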
Conclusion
• In all cases ANN is the best.
  Data                  Problems                   ANN status
  Pharmaceuticals       Outliers                   Best regression model
  Yacht hydrodynamics   Clustered                  Best regression model
  Simulated data-1      None (clean data)          Best regression model
  Simulated data-2      Strong multicollinearity   Best regression model
  IRIS                  Balanced                   Best classifier
  Fertility             Imbalanced                 Best classifier
  Leukemia              Large (7129 variables)     Best classifier
Limitations
• Backpropagation → no guarantee of reaching the absolute (global) minimum.
• VC dimension → unclear.
• Random weight initialization → the result is not unique.
• Some weights are zero → the network may fail to converge.
• Computing confidence intervals is hard.
• t-tests and F-tests are not available.
Areas of further research
• Robust, generalized ridge, principal component, latent root, lasso and stepwise regression
• Multivariate regression, time series analysis
• Application of artificial neural networks to unsupervised learning
• Study of semi-supervised learning
• Comparative study with other machine learning and data mining techniques
• Improvement of the backpropagation algorithm
THANK YOU
ALL
