Scientific Review
ISSN: 2412-2599
Vol. 1, No. 4, pp: 74-78, 2015
URL: http://arpgweb.com/?ic=journal&journal=10&info=aims
*Corresponding Author
Academic Research Publishing Group
Classification of Iris Data using Kernel Radial Basis Probabilistic
Neural Network
Lim Eng Aik*, Institute of Engineering Mathematic, Universiti Malaysia Perlis, 02600 Ulu Pauh, Perlis
Mohd. Syafarudy Abu, Institute of Engineering Mathematic, Universiti Malaysia Perlis, 02600 Ulu Pauh, Perlis
Abstract: The Radial Basis Probabilistic Neural Network (RBPNN) has a broad generalization capability and has been successfully applied in multiple fields. In this paper, the Euclidean distance of each data point in RBPNN is replaced by its kernel-induced distance instead of the conventional sum-of-squares distance. The kernel function generalizes the distance metric: it measures the distance between two data points after they are mapped into a high-dimensional space. Four classification models were constructed and compared: Kernel RBPNN (the proposed model), Radial Basis Function networks, RBPNN and Back-Propagation networks. The results show that Kernel RBPNN gives the best classification performance on the Iris data.
Keywords: Kernel function; Radial Basis Probabilistic Neural Network; Iris Data; Classification.
1. Introduction
In recent years, data classification and assessment models have become a popular and important research topic. Classification and assessment here means using the related class information to evaluate the level of accuracy on a classification problem. Classification accuracy is usually the most important factor in predicting the right subject [1]. Therefore, if classification accuracy can be improved, classifiers can be used in many fields with greatly reduced error.
In the past literature, some researchers used conventional statistical models for model construction, for example the Discriminant Analysis of [2] and the Probit model of Ohlson [3]. More recently, some researchers found that models constructed with data mining techniques have better prediction accuracy than conventional statistical models; examples are the Back-Propagation Networks model of Odom and Sharda [4] and the decision tree classification model of [5]. Zarita, et al. [6] used the backpropagation method to classify river water quality. That approach gave good classification results, but training the network took a long time. In this paper, Radial Basis Probabilistic Neural Networks (RBPNN), which are known to have good generalization properties and to train faster than the backpropagation network (Ganchev, et al. [7]), are adopted to construct the classification prediction model.
Our proposed method uses a kernel function to increase the separability of the data by working in a high-dimensional space. As a result, the proposed method is characterized by higher classification accuracy than the original RBPNN. Kernel-based classification in the feature space not only preserves the inherent structure of groups in the input space, but also simplifies the associated structure of the data [8]. Since Girolami first developed the kernel k-means clustering algorithm for unsupervised classification [9], several studies have demonstrated the superiority of kernel classification algorithms over other approaches to classification [10-14].
In this paper, we evaluate the performance of our proposed method, Kernel RBPNN, by comparing it with three well-known classification methods: Back-Propagation (BP), Radial Basis Function Network (RBFN), and RBPNN. The classification power of these methods is compared and analyzed on the Iris data set.
2. Radial Basis Probabilistic Neural Networks
The Radial Basis Probabilistic Neural Network (RBPNN) architecture is based on Bayesian decision theory and a nonparametric technique for estimating the Probability Density Function (PDF). The PDF takes the form of a Gaussian distribution. Specht [15] proposed the following function [16]:
$$f(X_k) = \frac{1}{(2\pi)^{m/2}\sigma^{m}} \cdot \frac{1}{N_k} \sum_{j=1}^{N_k} \exp\left(-\frac{\lVert X_k - X_j \rVert^{2}}{2\sigma^{2}}\right) \qquad (1)$$
Since RBPNN is applied to general classification problems, and the feature vector to be classified is assumed to belong to one of the known classes, the absolute probability value of each class is not important and only the relative values need to be considered. Hence, in equation (1), the factor
$$\frac{1}{(2\pi)^{m/2}\sigma^{m}}$$
can be neglected, and equation (1) simplifies to
$$f(X_k) = \frac{1}{N_k} \sum_{j=1}^{N_k} \exp\left(-\frac{\lVert X_k - X_j \rVert^{2}}{2\sigma^{2}}\right) \qquad (2)$$
In equation (2), $\sigma$ is the smoothing parameter of RBPNN. After network training is completed, the prediction accuracy can be improved by adjusting the smoothing parameter $\sigma$: the larger its value, the smoother the approximating function. If $\sigma$ is chosen inappropriately, the network design will have too many or too few neural units, the function approximation will overfit or underfit, and the prediction power will be reduced.
Let
$$d_{kj} = \lVert X_k - X_j \rVert^{2}$$
be the square of the Euclidean distance between two points $X_k$ and $X_j$ in the sample space. Equation (2) can then be rewritten as
$$f(X_k) = \frac{1}{N_k} \sum_{j=1}^{N_k} \exp\left(-\frac{d_{kj}}{2\sigma^{2}}\right) \qquad (3)$$
In equation (3), when the smoothing parameter $\sigma$ approaches zero,
$$f(X_k) = \frac{1}{N_k}$$
if $X_k = X_j$ for exactly one stored sample $X_j$; otherwise, if $X_k$ matches no stored sample,
$$f(X_k) = 0$$
In this case, RBPNN depends entirely on the classified sample closest to the non-classified sample to decide its classification. When the smoothing parameter $\sigma$ approaches infinity,
$$f(X_k) = 1$$
In this case, RBPNN is close to blind classification. Usually, researchers need to try different values of $\sigma$ within a certain range to find the one that gives the optimum accuracy. Specht [15] proposed a method for adjusting the smoothing parameter $\sigma$: assign each input neural unit its own $\sigma$; during the test stage, the values of $\sigma$ that give the optimum classification result are obtained through fine adjustment of each $\sigma$.
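As a rough illustration of equation (3) and its two limiting cases, the following Python sketch evaluates the density for a handful of made-up two-dimensional samples. The function name `rbpnn_density`, the sample values and the chosen values of the smoothing parameter are assumptions for illustration only; they are not taken from the authors' program.

```python
import math

def rbpnn_density(x, class_samples, sigma):
    """Equation (3): mean Gaussian response of x to the stored samples of one class."""
    total = 0.0
    for xj in class_samples:
        # d_kj: squared Euclidean distance between x and the stored sample xj
        d = sum((a - b) ** 2 for a, b in zip(x, xj))
        total += math.exp(-d / (2.0 * sigma ** 2))
    return total / len(class_samples)

# Hypothetical stored samples of a single class (N_k = 3)
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
x_stored = (1.0, 0.0)   # coincides with one stored sample
x_far = (5.0, 5.0)      # coincides with none

near_zero_match = rbpnn_density(x_stored, samples, sigma=0.01)  # tends to 1/N_k
near_zero_miss = rbpnn_density(x_far, samples, sigma=0.01)      # tends to 0
very_large = rbpnn_density(x_far, samples, sigma=1e6)           # tends to 1
```

With a very small smoothing parameter only an exact match contributes, which is why the decision collapses onto the nearest stored sample; with a very large one every class scores close to 1 and the classes can no longer be told apart.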
RBPNN is a feed-forward neural network with four layers (as in Figure 1). The first layer is the input layer; it receives the input data, and its number of neural units equals the number of independent variables. The second layer, the hidden layer in the middle, is the Pattern Layer, which stores each training sample. The data sent out by the Pattern Layer pass through the neural units of the third layer, the Summation Layer, each of which corresponds to one possible category; the calculation of equation (3) is performed in this layer. The fourth layer is the Competitive Layer; its competitive transfer function picks the maximum of the probabilities output by the previous layer and generates the output value. An output value of 1 indicates the selected category; an output value of 0 indicates any other category.
Figure-1. RBPNN architecture
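The flow through the layers described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names `summation_layer` and `competitive_layer`, the class labels "A" and "B", the stored patterns and the smoothing parameter value are all hypothetical and serve only to show the data flow.

```python
import math

def summation_layer(x, patterns_by_class, sigma):
    """Apply equation (3) once per class over that class's stored patterns."""
    scores = {}
    for label, patterns in patterns_by_class.items():
        acc = 0.0
        for p in patterns:
            d = sum((a - b) ** 2 for a, b in zip(x, p))
            acc += math.exp(-d / (2.0 * sigma ** 2))
        scores[label] = acc / len(patterns)
    return scores

def competitive_layer(scores):
    """Output 1 for the class with the largest score and 0 for every other class."""
    winner = max(scores, key=scores.get)
    return {label: int(label == winner) for label in scores}

# Pattern layer: hypothetical training samples stored per class
patterns = {
    "A": [(0.0, 0.0), (0.2, 0.1)],
    "B": [(3.0, 3.0), (3.1, 2.8)],
}

scores = summation_layer((2.9, 3.0), patterns, sigma=0.5)
output = competitive_layer(scores)
```

Here the input lies next to the stored patterns of class "B", so that class receives the output 1 and class "A" receives 0.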
3. Kernelized Radial Basis Probabilistic Neural Networks
3.1. Kernel-Based Approach
Given an unlabeled data set $X = \{x_1, \ldots, x_n\}$ in the d-dimensional space $R^d$, let $\Phi: R^d \to H$ be a non-linear mapping function from this input space to a high-dimensional feature space H. By applying the non-linear mapping function $\Phi$, the dot product $x_i \cdot x_j$ in the input space is mapped to $\Phi(x_i) \cdot \Phi(x_j)$ in the feature space. The key notion in kernel-based learning is that the mapping function $\Phi$ need not be explicitly specified. The dot product $\Phi(x_i) \cdot \Phi(x_j)$ in the high-dimensional feature space can be calculated through the kernel function $K(x_i, x_j)$ in the input space $R^d$ [17]:
$$K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j) \qquad (4)$$
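To make equation (4) concrete, the sketch below checks it numerically for one kernel whose feature map is known in closed form: for two-dimensional inputs, the kernel given by the squared dot product corresponds to the explicit map used in `phi`. The choice of this kernel, the function names and the test vectors are assumptions made for illustration; the paper itself does not work through this case.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    """Explicit feature map for K(x, y) = (x . y)^2 on two-dimensional inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def kernel(x, y):
    """The same quantity computed in the input space, without building phi."""
    return dot(x, y) ** 2

xi, xj = (1.0, 2.0), (3.0, -1.0)
in_input_space = kernel(xi, xj)
in_feature_space = dot(phi(xi), phi(xj))
```

Both values agree up to rounding, which is the point of the kernel approach: the feature-space dot product is obtained without ever forming the mapped vectors.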
Three commonly used kernel functions (Scholkopf and Smola [17]) are the polynomial kernel function,
$$K(x_i, x_j) = (x_i \cdot x_j + c)^{d} \qquad (5)$$
where $c \ge 0$, $d \in N$; the Gaussian kernel function,
$$K(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^{2}}{2\sigma^{2}}\right) \qquad (6)$$
where $\sigma > 0$; and the sigmoidal kernel function,
$$K(x_i, x_j) = \tanh(\kappa (x_i \cdot x_j) + \theta) \qquad (7)$$
where $\kappa > 0$ and $\theta < 0$.
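The three kernel functions of equations (5) to (7) might be written in Python as follows. This is a sketch, not the authors' code: the argument names (`c`, `degree`, `sigma`, `kappa`, `theta`) and their default values are placeholders chosen for illustration.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, c=1.0, degree=2):
    """Equation (5): shifted dot product raised to a natural-number power."""
    return (dot(x, y) + c) ** degree

def gaussian_kernel(x, y, sigma=1.0):
    """Equation (6): decays with the squared Euclidean distance between x and y."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def sigmoid_kernel(x, y, kappa=1.0, theta=-1.0):
    """Equation (7): hyperbolic tangent of a scaled and shifted dot product."""
    return math.tanh(kappa * dot(x, y) + theta)

a, b = (1.0, 2.0), (3.0, 4.0)
k_poly = polynomial_kernel(a, b)                  # (11 + 1)^2
k_gauss_self = gaussian_kernel(a, a)              # a point compared with itself
k_sigmoid = sigmoid_kernel((1.0, 0.0), (1.0, 0.0))
```

One property worth noting for what follows: the Gaussian kernel of a point with itself is always 1, whatever the value of its width parameter.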
3.2. Formulation
Given a data point $x_i \in R^d$ $(1 \le i \le n)$ and a non-linear mapping $\Phi: R^d \to H$, the Probability Density Function at data point $x_i$ is defined as
$$f(x_i) = \frac{1}{N_i} \sum_{j=1}^{N_i} \exp\left(-\frac{\lVert \Phi(x_i) - \Phi(x_j) \rVert^{2}}{2\sigma^{2}}\right) \qquad (8)$$
where $\lVert \Phi(x_i) - \Phi(x_j) \rVert^{2}$ is the square of the distance between $\Phi(x_i)$ and $\Phi(x_j)$. Thus a higher value of $f(x_i)$ indicates that $x_i$ has more data points $x_j$ near it in the feature space. The distance in the feature space is calculated through the kernel in the input space as follows:
$$\begin{aligned}
\lVert \Phi(x_i) - \Phi(x_j) \rVert^{2} &= (\Phi(x_i) - \Phi(x_j)) \cdot (\Phi(x_i) - \Phi(x_j)) \\
&= \Phi(x_i) \cdot \Phi(x_i) - 2\,\Phi(x_i) \cdot \Phi(x_j) + \Phi(x_j) \cdot \Phi(x_j) \\
&= K(x_i, x_i) - 2K(x_i, x_j) + K(x_j, x_j)
\end{aligned} \qquad (9)$$
Therefore, Equation (8) can be rewritten as
$$f(x_i) = \frac{1}{N_i} \sum_{j=1}^{N_i} \exp\left(-\frac{K(x_i, x_i) - 2K(x_i, x_j) + K(x_j, x_j)}{2\sigma^{2}}\right) \qquad (10)$$
The rest of the computation procedure is similar to that of the RBPNN method.
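The kernel-induced distance of equation (9) and the density of equation (10) can be sketched as below. The function names, the test points and the separate `width` argument of the Gaussian kernel are assumptions for illustration; the paper does not state whether the kernel width and the smoothing parameter share one value, so they are kept apart here.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    return dot(x, y)

def gaussian_kernel(x, y, width=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * width ** 2))

def kernel_distance_sq(x, y, kernel):
    """Equation (9): squared feature-space distance from kernel values only."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)

def kernel_rbpnn_density(x, class_samples, kernel, sigma):
    """Equation (10): equation (3) with the kernel-induced distance."""
    total = 0.0
    for xj in class_samples:
        total += math.exp(-kernel_distance_sq(x, xj, kernel) / (2.0 * sigma ** 2))
    return total / len(class_samples)

x, y = (1.0, 2.0), (4.0, 6.0)
d_linear = kernel_distance_sq(x, y, linear_kernel)   # 25.0, the squared Euclidean distance
d_gauss = kernel_distance_sq(x, y, gaussian_kernel)  # equals 2 - 2 * K(x, y)
density = kernel_rbpnn_density(x, [x, y], gaussian_kernel, sigma=0.1)
```

With the plain dot product as kernel, the distance reduces to the ordinary squared Euclidean distance, so the conventional RBPNN is recovered as a special case. With the Gaussian kernel, the two self-terms are each 1 and the distance is bounded above by 2.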
4. Construction of Classification Prediction Model and ROC Curve Analysis
The dataset consists of 150 rows, each giving sepal length, sepal width, petal length and petal width (all in cm). The data are taken from the Machine Learning Repository [18]. All four parameters were Z-transformed for classification purposes, and the Gaussian kernel was used in training the Kernelized RBPNN. The Z-transform is a preprocessing step that reduces the gap between the minimum and maximum data values. The 150 rows were divided into two separate parts: 100 rows for training and the remaining 50 rows for testing, a train-to-test ratio of 2:1. All training was done on a desktop PC with a 2 GHz Intel dual-core processor and 1 GB of RAM.
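The preprocessing and split described above might look like the following Python sketch. Several points are assumptions: the Z-transform is taken to mean standardizing each column to zero mean and unit standard deviation (population form), the rows are shuffled before splitting, and randomly generated numbers stand in for the 150 real measurements. The paper does not specify any of these details.

```python
import random
import statistics

def z_transform(rows):
    """Standardize each column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, stds))
        for row in rows
    ]

rng = random.Random(0)

# Stand-in data: 150 rows with four features each (NOT the real Iris measurements)
rows = [tuple(rng.uniform(0.1, 8.0) for _ in range(4)) for _ in range(150)]

scaled = z_transform(rows)

# Assumed shuffled split: 100 rows for training, 50 for testing
indices = list(range(len(scaled)))
rng.shuffle(indices)
train = [scaled[i] for i in indices[:100]]
test = [scaled[i] for i in indices[100:]]
```

In practice the scaling statistics are often computed on the training rows only and then reused for the test rows; the sketch follows the text, which describes transforming the whole data set.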
For the Kernel RBPNN, the model was implemented in a self-written MATLAB program, and the smoothing parameter σ was fixed at 0.1, following Sansone and Vento [19]. In the test phase, each classifier was tested on the data set described above. The test-phase performance of the classifiers is given in Table 1. The proposed classifier, referred to as Kernel RBPNN, achieves the best result among all the methods: Table 1 shows that it performs better than RBPNN, RBFN and BP (listed as GD in Tables 1 and 2). In terms of time, Kernel RBPNN is about 82 times faster than RBFN and about 1931 times faster than BP. Its classification accuracy is about 6.7 percentage points higher than that of RBPNN (89.12% against 82.44%), and it also consumes less time than RBPNN, RBFN and BP. RBPNN outperforms RBFN in classification accuracy because its network architecture implements Bayesian decision theory and a nonparametric technique, which makes it more applicable to general classification problems; RBFN has the worst performance among all the models. Although the classification accuracy of the well-known BP method is high, its long processing time makes it undesirable for many on-line recognition applications.
In Figure 2, the true positive rate is the number of samples with a prediction result of 0 as a percentage of the number of samples with a real value of 0; the false positive rate is the number of samples with a prediction result of 1 as a percentage of the number of samples with a real value of 1. In this paper, the data are mutually verified among the models. The Kernel RBPNN is verified as the best-performing model on the 50 rows of test data. The results are drawn as a Receiver Operating Characteristic (ROC) curve in Figure 2. The farther the ROC curve lies above the reference line, the larger the area under the ROC curve (AUC), and the higher the classification prediction power of the model [20]. Table 2 shows the mutual verification results for the Iris dataset. Figure 2 shows that the area under the ROC curve of the Kernel RBPNN model is larger than that of the other three models. From Table 2, the specificity of the Kernel RBPNN model appears to be lower than that of the other three models, but its AUC value is higher than those of the BP, RBPNN and RBFN models; it can therefore be concluded that the Kernel RBPNN model has very good classification prediction capability.
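For readers who want to compute an AUC value themselves, one common route is the rank-based form: the AUC equals the fraction of positive-negative pairs in which the positive sample receives the higher score, with ties counted as one half. The sketch below uses that form; the function name, the scores and the labels are hypothetical, and the paper does not state how its own AUC values were calculated.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))

perfect = auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])    # every positive outranks every negative
partial = auc([0.9, 0.3, 0.8, 0.1], [1, 1, 0, 0])    # three of the four pairs are ordered correctly
chance = auc([0.5, 0.5, 0.5, 0.5], [1, 1, 0, 0])     # uninformative scores
```

A value of 0.5 corresponds to the reference line of the ROC plot, and values closer to 1 correspond to curves lying farther above it.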
Table-1. Classification prediction accuracy on the Iris dataset
Method          Classification Accuracy (%)    CPU Time (s)    Epoch
Kernel RBPNN    89.12                          0.057           9
RBPNN           82.44                          0.068           11
RBFN            45.15                          4.688           200
GD              85.66                          110.082         30000
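The ratios quoted in the text can be checked directly against the figures in Table 1. The short Python sketch below does only that arithmetic; the dictionary layout and variable names are illustrative choices.

```python
# Values copied from Table 1: (accuracy in %, CPU time in s, epochs)
table1 = {
    "Kernel RBPNN": (89.12, 0.057, 9),
    "RBPNN": (82.44, 0.068, 11),
    "RBFN": (45.15, 4.688, 200),
    "GD": (85.66, 110.082, 30000),
}

base_accuracy, base_time, _ = table1["Kernel RBPNN"]

# How many times longer each method takes than Kernel RBPNN
time_ratio = {name: values[1] / base_time for name, values in table1.items()}

# Accuracy difference in percentage points relative to RBPNN
accuracy_gap = base_accuracy - table1["RBPNN"][0]

for name, ratio in time_ratio.items():
    print(f"{name}: {ratio:.1f}x the CPU time of Kernel RBPNN")
print(f"Accuracy gap to RBPNN: {accuracy_gap:.2f} percentage points")
```

The time ratios for RBFN and GD round to 82 and 1931, and the accuracy gap to RBPNN is 6.68 percentage points.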
Table-2. Mutual Verification with AUC.
Method               AUC
Kernelized RBPNN     0.7296
RBPNN                0.6554
RBFN                 0.4827
GD                   0.6349
Figure-2. The mutually verified ROC curves of the test data of the Iris dataset (curves shown: Kernel RBPNN, RBPNN, RBFN, GD and the reference line)
5. Conclusion
In this paper, a new RBPNN based on a kernel approach has been proposed. Unlike the conventional RBPNN, which relies only on Euclidean distance computation, the Kernel RBPNN uses a kernel-induced distance, which implicitly maps the input data to a high-dimensional space in which data classification is easier. The network was applied to the Iris data classification problem and showed increased classification accuracy in comparison with other well-known classifiers. The metrics used to assess classifier performance in this work were classification accuracy and processing time. In future work, we plan to improve the classification rate by employing different kernel functions, such as a wavelet function.
References
[1] Yamazaki, K., 2015. "Accuracy analysis of semi-supervised classification when the class balance changes."
Neurocomputing, vol. 160, pp. 132-140.
[2] Altman, E. I., 1968. "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy."
Journal of Finance, vol. 23, pp. 589-609.
[3] Ohlson, J. A., 1980. "Financial ratios and the probabilistic prediction of bankruptcy." Journal of Accounting
Research, vol. 18, pp. 109-131.
[4] Odom, M. D. and Sharda, R., 1990. "A neural network model for bankruptcy prediction." IEEE INNS
IJCNN, vol. 2, pp. 163-168.
[5] Breiman, L., 1996. "Bagging predictors." Machine Learning, vol. 24, pp. 123-140.
[6] Zarita, Z., Sharifuddin, M. Z., Lim, E. A., and Hafizan, J., 2004. "Application of neural network in river
water quality classification." Ecological and Environmental Modeling, vol. 2, pp. 24-31.
[7] Ganchev, T., Tasoulis, D. K., and Vrahatis, M. N., 2003. "Locally Recurrent Probabilistic Neural Network
for Text-Independent Speaker Verification." Proc. of the Euro Speech, vol. 3, pp. 762-766.
[8] Muller, K. R., 2001. "An introduction to kernel-based learning algorithms." IEEE Trans. Neural Networks,
vol. 12, pp. 181-202.
[9] Girolami, M., 2002. "Mercer kernel-based clustering in feature space." IEEE Trans. Neural Networks, vol.
13, pp. 780-784.
[10] Zhang, R. and Rudnicky, A. I., 2002. "A large scale clustering scheme for kernel k-means," In The 16th
International Conference on Pattern Recognition. pp. 289-292.
[11] Wu, Z. D. and Xie, W. X., 2003. "Fuzzy c-means clustering algorithm based on kernel method," In The
Fifth International Conference on Computational Intelligence and Multimedia Applications. pp. 1-6.
[12] Kosic, D., 2015. "Fast clustered radial basis function network as an adaptive predictive controller." Neural
Networks, vol. 63, pp. 79-86.
[13] Ha, Q., Wahid, H., Duc, H., and Azzi, M., 2015. "Enhanced radial basis function neural networks for ozone
level estimation." Neurocomputing, vol. 155, pp. 62-70.
[14] Lee, Y. J., Micchelli, C. A., and Yoon, J., 2015. "A study on multivariate interpolation by increasingly flat
kernel functions." Journal of Mathematical Analysis and Applications, vol. 427, pp. 74-87.
[15] Specht, D. F., 1990. "Probabilistic neural network and the polynomial adaline as complementary techniques
for classification." IEEE Trans. Neural Networks, vol. 1, pp. 111-121.
[16] Yeh, Y. C., 1998. Application of neural network. Taiwan: Scholars Books
[17] Scholkopf, B. and Smola, A. J., 2002. Learning with Kernels. Cambridge: MIT Press
[18] Machine Learning Repository, 2015. https://archive.ics.uci.edu/ml/datasets/Iris
[19] Sansone, C. and Vento, M., 2001. "A classification reliability driven reject rule for multi-expert systems."
International Journal of Pattern Recognition and Artificial Intelligence, vol. 15, pp. 1-19.
[20] Bradley, A. P., 1997. "The use of the area under the ROC curve in the evaluation of machine learning
algorithms." Pattern Recognition, vol. 30, pp. 1145-1159.

More Related Content

PDF
PSF_Introduction_to_R_Package_for_Pattern_Sequence (1)
PPTX
Clustering on database systems rkm
PDF
Training and Inference for Deep Gaussian Processes
PDF
The Gaussian Process Latent Variable Model (GPLVM)
PDF
Premeditated Initial Points for K-Means Clustering
PDF
50120140505013
PDF
Optics ordering points to identify the clustering structure
PDF
CLUSTERING HYPERSPECTRAL DATA
PSF_Introduction_to_R_Package_for_Pattern_Sequence (1)
Clustering on database systems rkm
Training and Inference for Deep Gaussian Processes
The Gaussian Process Latent Variable Model (GPLVM)
Premeditated Initial Points for K-Means Clustering
50120140505013
Optics ordering points to identify the clustering structure
CLUSTERING HYPERSPECTRAL DATA

What's hot (20)

DOCX
K means report
PDF
New Approach for K-mean and K-medoids Algorithm
PDF
A PSO-Based Subtractive Data Clustering Algorithm
PDF
Experimental study of Data clustering using k- Means and modified algorithms
PDF
Big data Clustering Algorithms And Strategies
PDF
The International Journal of Engineering and Science (The IJES)
PDF
Birch
PDF
Clustering Using Shared Reference Points Algorithm Based On a Sound Data Model
PDF
A HYBRID CLUSTERING ALGORITHM FOR DATA MINING
PDF
Incorporating Kalman Filter in the Optimization of Quantum Neural Network Par...
PDF
Clustering Algorithms for Data Stream
PDF
50120140501016
PPTX
Grid based method & model based clustering method
PPTX
Dimension Reduction And Visualization Of Large High Dimensional Data Via Inte...
PPT
3.4 density and grid methods
PDF
Fuzzy c-means
PDF
COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...
PDF
Parallel KNN for Big Data using Adaptive Indexing
PPT
PPT
3.2 partitioning methods
K means report
New Approach for K-mean and K-medoids Algorithm
A PSO-Based Subtractive Data Clustering Algorithm
Experimental study of Data clustering using k- Means and modified algorithms
Big data Clustering Algorithms And Strategies
The International Journal of Engineering and Science (The IJES)
Birch
Clustering Using Shared Reference Points Algorithm Based On a Sound Data Model
A HYBRID CLUSTERING ALGORITHM FOR DATA MINING
Incorporating Kalman Filter in the Optimization of Quantum Neural Network Par...
Clustering Algorithms for Data Stream
50120140501016
Grid based method & model based clustering method
Dimension Reduction And Visualization Of Large High Dimensional Data Via Inte...
3.4 density and grid methods
Fuzzy c-means
COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...
Parallel KNN for Big Data using Adaptive Indexing
3.2 partitioning methods
Ad

Viewers also liked (10)

PDF
An Application of Genetic Programming for Power System Planning and Operation
PDF
Mixed Language Based Offline Handwritten Character Recognition Using First St...
PDF
Current clustering techniques
PPT
Cluster spss week7
PPTX
artiicial intelligence in power system
PPT
K mean-clustering algorithm
DOC
List Of Post Graduate Thesis in engineering project management
PPTX
Artificial intelligence in power plants
PPT
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
PPT
Electrical Engineering Presentation
An Application of Genetic Programming for Power System Planning and Operation
Mixed Language Based Offline Handwritten Character Recognition Using First St...
Current clustering techniques
Cluster spss week7
artiicial intelligence in power system
K mean-clustering algorithm
List Of Post Graduate Thesis in engineering project management
Artificial intelligence in power plants
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
Electrical Engineering Presentation
Ad

Similar to Classification of Iris Data using Kernel Radial Basis Probabilistic Neural Network (20)

PDF
Path loss prediction
PDF
F017533540
PDF
DESIGN AND IMPLEMENTATION OF BINARY NEURAL NETWORK LEARNING WITH FUZZY CLUSTE...
PDF
Clustering using kernel entropy principal component analysis and variable ker...
PDF
Machine Learning Algorithms for Image Classification of Hand Digits and Face ...
PDF
Adaptive Training of Radial Basis Function Networks Based on Cooperative
PDF
Large Scale Kernel Learning using Block Coordinate Descent
PDF
Expert system design for elastic scattering neutrons optical model using bpnn
PDF
AN IMPROVED METHOD FOR IDENTIFYING WELL-TEST INTERPRETATION MODEL BASED ON AG...
PDF
Predicting rainfall using ensemble of ensembles
PPTX
Machine Learning Algorithms (Part 1)
DOCX
Neural nw k means
PDF
Black-box modeling of nonlinear system using evolutionary neural NARX model
PPT
Zoooooohaib
PPTX
A general multiobjective clustering approach based on multiple distance measures
PDF
Recognition of handwritten digits using rbf neural network
PDF
Recognition of handwritten digits using rbf neural network
PDF
ANN Based POS Tagging For Nepali Text
PDF
Hidden Layer Leraning Vector Quantizatio
PPTX
Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks, arXiv e-...
Path loss prediction
F017533540
DESIGN AND IMPLEMENTATION OF BINARY NEURAL NETWORK LEARNING WITH FUZZY CLUSTE...
Clustering using kernel entropy principal component analysis and variable ker...
Machine Learning Algorithms for Image Classification of Hand Digits and Face ...
Adaptive Training of Radial Basis Function Networks Based on Cooperative
Large Scale Kernel Learning using Block Coordinate Descent
Expert system design for elastic scattering neutrons optical model using bpnn
AN IMPROVED METHOD FOR IDENTIFYING WELL-TEST INTERPRETATION MODEL BASED ON AG...
Predicting rainfall using ensemble of ensembles
Machine Learning Algorithms (Part 1)
Neural nw k means
Black-box modeling of nonlinear system using evolutionary neural NARX model
Zoooooohaib
A general multiobjective clustering approach based on multiple distance measures
Recognition of handwritten digits using rbf neural network
Recognition of handwritten digits using rbf neural network
ANN Based POS Tagging For Nepali Text
Hidden Layer Leraning Vector Quantizatio
Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks, arXiv e-...

More from Scientific Review (7)

PDF
Computation of Dielectric Constant and Loss Factor of Water and Dimethylsulph...
PDF
Assessment of Maternal Health Seeking Behavior and Service Utilization among ...
PDF
A Dynamic Cellular Automaton Model for Large-Scale Pedestrian Evacuation
PDF
Extending TCP the Major Protocol of Transport Layer
PDF
Multidrug Resistance Pattern of Staphylococcus Aureus Isolates in Maiduguri M...
PDF
Mechanical Behaviour of Agricultural Residue Reinforced Composites
PDF
Performance Evaluation of a Three Phase Nine Level Inverter with Reduced Swit...
Computation of Dielectric Constant and Loss Factor of Water and Dimethylsulph...
Assessment of Maternal Health Seeking Behavior and Service Utilization among ...
A Dynamic Cellular Automaton Model for Large-Scale Pedestrian Evacuation
Extending TCP the Major Protocol of Transport Layer
Multidrug Resistance Pattern of Staphylococcus Aureus Isolates in Maiduguri M...
Mechanical Behaviour of Agricultural Residue Reinforced Composites
Performance Evaluation of a Three Phase Nine Level Inverter with Reduced Swit...

Recently uploaded (20)

PPT
Project quality management in manufacturing
PDF
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
PPTX
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
DOCX
573137875-Attendance-Management-System-original
PDF
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
PDF
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
PDF
Well-logging-methods_new................
PPTX
MCN 401 KTU-2019-PPE KITS-MODULE 2.pptx
PDF
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
PPT
CRASH COURSE IN ALTERNATIVE PLUMBING CLASS
PDF
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
PPTX
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
PPTX
Sustainable Sites - Green Building Construction
PPTX
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
PPTX
Construction Project Organization Group 2.pptx
PDF
Digital Logic Computer Design lecture notes
PPTX
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
PDF
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
PDF
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
PPTX
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
Project quality management in manufacturing
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
573137875-Attendance-Management-System-original
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
Well-logging-methods_new................
MCN 401 KTU-2019-PPE KITS-MODULE 2.pptx
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
CRASH COURSE IN ALTERNATIVE PLUMBING CLASS
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
Sustainable Sites - Green Building Construction
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
Construction Project Organization Group 2.pptx
Digital Logic Computer Design lecture notes
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...

Classification of Iris Data using Kernel Radial Basis Probabilistic Neural Network

  • 1. Scientific Review ISSN: 2412-2599 Vol. 1, No. 4, pp: 74-78, 2015 URL: http://arpgweb.com/?ic=journal&journal=10&info=aims *Corresponding Author 74 Academic Research Publishing Group Classification of Iris Data using Kernel Radial Basis Probabilistic Neural Network Lim Eng Aik* Institute of Engineering Mathematic Universiti Malaysia Perlis, 02600 Ulu Pauh, Perlis Mohd. Syafarudy Abu Institute of Engineering Mathematic Universiti Malaysia Perlis, 02600 Ulu Pauh, Perlis 1. Introduction In recent years, data classification and assessment model is a very popular and important research topic. The so- called classification and assessment means to use the related class information to evaluate the level of accuracy for classification problem. However, the data classification accuracy is usually the most important key for predicting the right subject [1]. Therefore, if the data classification accuracy can be improve, then public can use it for classification in many fields with its error greatly reduced. In the past literature, some researchers use conventional statistical model for model construction, for example, the Discriminant Analysis of [2], the Probit of Ohlson [3]. However, in recent years, some researchers found that the model constructed by data mining technique has better prediction accuracy than that of the conventional statistical model; some of them are, for example, the Back-Propagation Networks model of Odom and Sharda [4] and the decision tree classification model of [5]. Another recent approach done by Zarita, et al. [6] using backpropagation method in classification of river water quality. The approach provided a good result in classification, but, it takes a long times for training the network. In this paper, Radial Basis Probabilistic Neural Networks (RBPNN) which is known to have good generalization properties and is trained faster than the backpropagation network Ganchev, et al. 
[7] is adopted for the construction of classification prediction model. Our proposed method, used kernel-function to increase the separability of data by working in a high dimensional space. Thus, the proposed method is characterized by higher classification accuracy than the original RBPNN. The kernel-based classification in the feature space not only preserves the inherent structure of groups in the input space, but also simplifies the associated structure of the data [8]. Since Girolami first developed the kernel k-means clustering algorithm for unsupervised classification [9], several studies have demonstrated the superiority of kernel classification algorithms over other approaches to classification [10-14]. In this paper, we evaluate the performance of our proposed method, kernel RBPNN, with a comparison to three well-known classification methods: the Back-Propagation (BP), Radial Basis Function Network (RBFN), and RBPNN; meanwhile, comparison and analysis on the classification power is done to these three methods. These methods performance are compared using Iris data sets. 2. Radial Basis Probabilistic Neural Networks Radial Basis Probabilistic Neural Network (RBPNN) architecture is based on based on Bayesian decision theory and nonparametric technique to estimate Probability Density Function (PDF). The form of PDF is a Gaussian distribution. Specht [15] had proposed this function [16]: Abstract: Radial Basis Probabilistic Neural Network (RBPNN) has a broader generalized capability that been successfully applied to multiple fields. In this paper, the Euclidean distance of each data point in RBPNN is extended by calculating its kernel-induced distance instead of the conventional sum-of squares distance. The kernel function is a generalization of the distance metric that measures the distance between two data points as the data points are mapped into a high dimensional space. 
During the comparing of the four constructed classification models with Kernel RBPNN, Radial Basis Function networks, RBPNN and Back-Propagation networks as proposed, results showed that, model classification on Iris Data with Kernel RBPNN display an outstanding performance in this regard. Keywords: Kernel function; Radial Basis Probabilistic Neural Network; Iris Data; Classification.
  • 2. Scientific Review, 2015, 1(4): 74-78 75   2 2 1 2 1 1 1 ( ) exp 22 k N k j k m m jk X X f X N                          (1) Since RBPNN is applicable to general classification problem, and assume that the eigenvector to be classified must belong to one of the known classifications, then the absolute probabilistic value of each classification is not important and only relative value needs to be considered, hence, in equation (1),   2 1 1 2 m m           Can be neglected and equation (1) can be simplified as 2 1 2 1 ( ) exp 2 k N k j k jk X X f X N                 (2) In equation (2),  is the smoothing parameter of RBPNN. After network training is completed, the prediction accuracy can be enhanced through the adjustment of the smoothing parameter  , that is, the larger the value, the smoother the approaching function. If the smoothing parameter  is inappropriately selected, it will lead to an excessive or insufficient neural units in the network design, and over fitting or inappropriate fitting will be the result in the function approaching attempt; finally, the prediction power will be reduced. Let 2 kj k j d X X  Be the square of the Euclidean distance of two points Xk and Xj in the sample space, and equation (2) can be re- written as 2 1 1 ( ) exp 2 k N k jk kjd f X N                (3) In equation (3), when smoothing parameter  approaches zero, 1 ( )k k f X N  If Xk = Xj, then ( ) 0k f X  At this moment, RBPNN will depend fully on the non-classified sample which is closest to the classified sample to decide its classification. When smoothing parameter  approaches infinity, ( ) 1k f X  At this moment, RBPNN is close to blind classification. Usually, the researchers need to try different  in certain range to obtain one that can reach the optimum accuracy. 
Specht [15] proposed a method for adjusting the smoothing parameter $\sigma$: each input neural unit is assigned its own $\sigma$, and during the test stage each $\sigma$ is finely adjusted and the values that give the optimum classification result are kept.

RBPNN is a four-layer feed-forward neural network (as in Figure 1). The first layer is the input layer; it receives the input data, and its number of neural units equals the number of independent variables. The second layer, the hidden layer in the middle, is the Pattern Layer, which stores each training datum. The data sent out by the Pattern Layer pass through the neural units of the third layer, the Summation Layer, each of which corresponds to one possible category; the calculation of equation (3) is performed in this layer. The fourth layer is the Competitive Layer; its competitive transfer function picks the maximum of the probabilities output by the previous layer and generates the output value. An output value of 1 indicates the selected category, whereas an output value of 0 indicates any other category.

Figure-1. RBPNN architecture
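The layer-by-layer description above can be sketched as three small functions. This is an illustrative sketch under stated assumptions: a single shared smoothing parameter is used rather than the per-unit values of Specht's method, and the function names, class labels and training points are hypothetical.

```python
import math

def pattern_layer(x, samples, sigma):
    """Pattern layer: one Gaussian activation per stored training sample."""
    return [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, s)) / (2.0 * sigma ** 2))
        for s in samples
    ]

def summation_layer(x, training, sigma):
    """Summation layer: average the pattern activations of each category (equation (3))."""
    scores = {}
    for label, samples in training.items():
        activations = pattern_layer(x, samples, sigma)
        scores[label] = sum(activations) / len(activations)
    return scores

def competitive_layer(scores):
    """Competitive layer: output 1 for the category with the largest score, 0 otherwise."""
    winner = max(scores, key=scores.get)
    return {label: int(label == winner) for label in scores}

# Hypothetical two-class training set; the input layer is simply the tuple x.
training = {
    "A": [(0.0, 0.0), (0.2, 0.1)],
    "B": [(1.0, 1.0), (0.9, 1.2)],
}
x = (0.1, 0.0)
scores = summation_layer(x, training, sigma=0.1)
output = competitive_layer(scores)
print(scores)
print(output)
```

Because the Pattern Layer only stores the training samples, there is no iterative weight update in this sketch; the cost of classifying one input grows with the number of stored samples.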
3. Kernelized Radial Basis Probabilistic Neural Networks

3.1. Kernel-Based Approach

Given an unlabeled data set $X = \{x_1, \ldots, x_n\}$ in the d-dimensional space $R^d$, let $\Phi: R^d \to H$ be a non-linear mapping function from this input space to a high dimensional feature space H. By applying the non-linear mapping function $\Phi$, the dot product $x_i \cdot x_j$ in the input space is mapped to $\Phi(x_i) \cdot \Phi(x_j)$ in the feature space. The key notion in kernel-based learning is that the mapping function $\Phi$ need not be explicitly specified. The dot product $\Phi(x_i) \cdot \Phi(x_j)$ in the high dimensional feature space can be calculated through the kernel function $K(x_i, x_j)$ in the input space $R^d$ [17]:

$$K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j) \tag{4}$$

Three commonly used kernel functions [17] are the polynomial kernel function,

$$K(x_i, x_j) = (x_i \cdot x_j + c)^{d} \tag{5}$$

where $c \ge 0$, $d \in N$; the Gaussian kernel function,

$$K(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^{2}}{2\sigma^{2}}\right) \tag{6}$$

where $\sigma > 0$; and the sigmoidal kernel function,

$$K(x_i, x_j) = \tanh\left(\kappa (x_i \cdot x_j) + \theta\right) \tag{7}$$

where $\kappa > 0$ and $\theta < 0$.

3.2. Formulation

Given a data point $x_i \in R^d$ $(1 \le i \le n)$ and a non-linear mapping $\Phi: R^d \to H$, the Probability Density Function at data point $x_i$ is defined as

$$f(x_i) = \frac{1}{N_i} \sum_{j=1}^{N_i} \exp\left(-\frac{\lVert \Phi(x_i) - \Phi(x_j) \rVert^{2}}{2\sigma^{2}}\right) \tag{8}$$

where $\lVert \Phi(x_i) - \Phi(x_j) \rVert^{2}$ is the square of the distance between $\Phi(x_i)$ and $\Phi(x_j)$. Thus a higher value of $f(x_i)$ indicates that $x_i$ has more data points $x_j$ near it in the feature space.
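The kernels of equations (5) to (7) and the density of equation (8) can be sketched as follows. Because equation (4) replaces every feature-space dot product by a kernel evaluation, expanding the squared norm in equation (8) gives K(a, a) - 2K(a, b) + K(b, b), and the sketch checks this against an explicit feature map. The parameter defaults, the quadratic feature map (valid only for two-dimensional inputs with c = 0 and degree 2) and the sample points are assumptions made for illustration, not values from the paper.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def polynomial_kernel(a, b, c=0.0, degree=2):
    """Equation (5): (a . b + c)^degree with c >= 0 and a natural-number degree."""
    return (dot(a, b) + c) ** degree

def gaussian_kernel(a, b, sigma=1.0):
    """Equation (6): exp(-||a - b||^2 / (2 sigma^2)) with sigma > 0."""
    return math.exp(-sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / (2.0 * sigma ** 2))

def sigmoid_kernel(a, b, kappa=1.0, theta=-1.0):
    """Equation (7): tanh(kappa * (a . b) + theta)."""
    return math.tanh(kappa * dot(a, b) + theta)

def phi_quadratic(x):
    """Explicit feature map of the 2-D polynomial kernel with c = 0, degree = 2."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def kernel_sq_dist(a, b, kernel):
    """Squared feature-space distance expressed through kernel evaluations only."""
    return kernel(a, a) - 2.0 * kernel(a, b) + kernel(b, b)

def kernel_density(x, samples, kernel, sigma):
    """Equation (8), with the feature-space distance obtained from the kernel."""
    total = sum(math.exp(-kernel_sq_dist(x, s, kernel) / (2.0 * sigma ** 2)) for s in samples)
    return total / len(samples)

a, b = (1.0, 2.0), (3.0, 1.0)
explicit = sum((p - q) ** 2 for p, q in zip(phi_quadratic(a), phi_quadratic(b)))
print(polynomial_kernel(a, b), dot(phi_quadratic(a), phi_quadratic(b)))
print(kernel_sq_dist(a, b, polynomial_kernel), explicit)
print(kernel_density(a, [a, b], gaussian_kernel, sigma=1.0))
```

For the Gaussian kernel K(x, x) = 1, so the kernel-induced squared distance reduces to 2 - 2K(a, b) and never exceeds 2, whatever the distance in the input space.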
The distance in feature space is calculated through the kernel in the input space as follows:

$$\begin{aligned}
\lVert \Phi(x_i) - \Phi(x_j) \rVert^{2} &= (\Phi(x_i) - \Phi(x_j)) \cdot (\Phi(x_i) - \Phi(x_j)) \\
&= \Phi(x_i) \cdot \Phi(x_i) - 2\,\Phi(x_i) \cdot \Phi(x_j) + \Phi(x_j) \cdot \Phi(x_j) \\
&= K(x_i, x_i) - 2K(x_i, x_j) + K(x_j, x_j)
\end{aligned} \tag{9}$$

Therefore, equation (8) can be rewritten as

$$f(x_i) = \frac{1}{N_i} \sum_{j=1}^{N_i} \exp\left(-\frac{K(x_i, x_i) - 2K(x_i, x_j) + K(x_j, x_j)}{2\sigma^{2}}\right) \tag{10}$$

The rest of the computation procedure is similar to that of the RBPNN method.

4. Construction of Classification Prediction Model and ROC Curve Analysis

The dataset consists of 150 rows of data, each recording sepal length (in cm), sepal width (in cm), petal length (in cm) and petal width (in cm). The data are taken from the Machine Learning Repository [18]. The values of all four parameters were Z-transformed for classification purposes, and the Gaussian kernel was used in training the Kernelized RBPNN. The Z-transform is performed as a preprocessing step to reduce the gap between the minimum and maximum data values. We divided the 150 rows of data into two separate parts, 100 rows for training and the remaining 50 rows for testing, so the ratio of training to test data is 2:1. All training was done on a desktop PC with an Intel duo core 2 GHz processor and 1 GB of RAM. For the Kernel RBPNN, the program for model construction was written in MATLAB, and the smoothing parameter σ was fixed at 0.1 according to Sansone and Vento [19].

In the test phase, each classifier was tested with the data sets mentioned above. The performance of the different classifiers in the test phase can be observed in Table 1. The proposed classifier, referred to as Kernel RBPNN, achieves the best result amongst all other
methods. It can be concluded from Table 1 that the performance of Kernel RBPNN is better than that of RBPNN, RBFN and BP. In terms of time, Kernel RBPNN is about 82 times faster than RBFN and 1931 times faster than GD. The classification accuracy obtained using Kernel RBPNN is about 6.7 percentage points higher than that of RBPNN (89.12% versus 82.44%). It is also evident that Kernel RBPNN consumed less time than RBPNN, RBFN and BP. RBPNN outperforms RBFN in classification accuracy because its network architecture implements Bayesian decision and a nonparametric technique, which make it more applicable to general classification problems; RBFN has the worst performance among all models. Although the classification accuracy of the well-known classification method, BP, is high, its high processing time makes it undesirable for many on-line recognition applications.

In Figure 2, the true positive rate is the number of samples with a prediction result of 0 as a percentage of the number of samples with a real value of 0; the false positive rate is the number of samples with a prediction result of 1 as a percentage of the number of samples with a real value of 1. In this paper, mutual verification of the data among the models is performed. The Kernel RBPNN is verified as the best-performing model, and there are 50 rows of data in the test results. The result is drawn as a Receiver Operating Characteristic (ROC) curve in Figure 2. In the figure, the farther the ROC curve lies above the reference line, the larger the area under the ROC curve (AUC), and the higher the classification prediction power of the model [20]. Table 2 shows the mutual verification results of the Iris dataset.
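How an ROC curve and its AUC are obtained from classifier scores can be sketched as follows. The sketch uses the conventional definitions (true positive rate = true positives / actual positives, false positive rate = false positives / actual negatives) and the trapezoidal rule; whether the paper's curves were produced in exactly this way is an assumption, and the labels and scores are hypothetical.

```python
def roc_points(labels, scores):
    """ROC points (FPR, TPR) obtained by lowering a threshold over the scores."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

# Hypothetical test labels (1 = positive) and classifier scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(labels, scores)
print(points)
print(auc(points))
```

A classifier that gives every sample the same score produces the single segment from (0, 0) to (1, 1), which is the reference line with an AUC of 0.5; curves farther above that line enclose a larger area.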
Figure 2 shows that the area under the ROC curve of the Kernel RBPNN model for the Iris dataset is larger than that of the other three models. From Table 2, it seems that the specificity of the Kernel RBPNN model is lower than that of the other three models, but its AUC value is higher than those of the BP, RBPNN and RBFN models; therefore, it can be concluded that the Kernel RBPNN model has very good classification prediction capability.

Table-1. Classification prediction accuracy of Iris dataset

| Method | Classification Accuracy (%) | CPU Time (s) | Epoch |
|---|---|---|---|
| Kernel RBPNN | 89.12 | 0.057 | 9 |
| RBPNN | 82.44 | 0.068 | 11 |
| RBFN | 45.15 | 4.688 | 200 |
| GD | 85.66 | 110.082 | 30000 |

Table-2. Mutual verification with AUC

| Method | AUC |
|---|---|
| Kernelized RBPNN | 0.7296 |
| RBPNN | 0.6554 |
| RBFN | 0.4827 |
| GD | 0.6349 |

Figure-2. The mutually verified ROC curve of the test data of the Iris dataset (curves shown: Kernel RBPNN, RBPNN, RBFN, GD and the reference line)

5. Conclusion

In this paper, a new RBPNN that incorporates a kernel-based approach has been proposed. Unlike the conventional RBPNN, which is based only on Euclidean distance computation, the Kernel RBPNN applies a kernel-induced distance computation that implicitly maps the input data to a high dimensional space in which data classification is easier. The network was applied to the Iris data classification problem and showed an increase in classification accuracy in comparison with other well-known classifiers. The metrics for considering the performance of
a classifier in this work were the classification accuracy and the processing time of the classifier. In future work, we plan to improve the classification rate by employing different kernel functions, such as the wavelet function.

References

[1] Yamazaki, K., 2015. "Accuracy analysis of semi-supervised classification when the class balance changes." Neurocomputing, vol. 160, pp. 132-140.
[2] Altman, E. I., 1968. "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy." Journal of Finance, vol. 23, pp. 589-609.
[3] Ohlson, J. A., 1980. "Financial ratios and the probabilistic prediction of bankruptcy." Journal of Accounting Research, vol. 18, pp. 109-131.
[4] Odom, M. D. and Sharda, R., 1990. "A neural network model for bankruptcy prediction." IEEE INNS IJCNN, vol. 2, pp. 163-168.
[5] Breiman, L., 1996. "Bagging predictors." Machine Learning, vol. 24, pp. 123-140.
[6] Zarita, Z., Sharifuddin, M. Z., Lim, E. A., and Hafizan, J., 2004. "Application of neural network in river water quality classification." Ecological and Environmental Modeling, vol. 2, pp. 24-31.
[7] Ganchev, T., Tasoulis, D. K., and Vrahatis, M. N., 2003. "Locally Recurrent Probabilistic Neural Network for Text-Independent Speaker Verification." Proc. of the Euro Speech, vol. 3, pp. 762-766.
[8] Muller, K. R., 2001. "An introduction to kernel-based learning algorithms." IEEE Trans. Neural Networks, vol. 12, pp. 181-202.
[9] Girolami, M., 2002. "Mercer kernel-based clustering in feature space." IEEE Trans. Neural Networks, vol. 13, pp. 780-784.
[10] Zhang, R. and Rudnicky, A. I., 2002. "A large scale clustering scheme for kernel k-means." In The 16th International Conference on Pattern Recognition, pp. 289-292.
[11] Wu, Z. D. and Xie, W. X., 2003. "Fuzzy c-means clustering algorithm based on kernel method." In The Fifth International Conference on Computational Intelligence and Multimedia Applications, pp. 1-6.
[12] Kosic, D., 2015. "Fast clustered radial basis function network as an adaptive predictive controller." Neural Networks, vol. 63, pp. 79-86.
[13] Ha, Q., Wahid, H., Duc, H., and Azzi, M., 2015. "Enhanced radial basis function neural networks for ozone level estimation." Neurocomputing, vol. 155, pp. 62-70.
[14] Lee, Y. J., Micchelli, C. A., and Yoon, J., 2015. "A study on multivariate interpolation by increasingly flat kernel functions." Journal of Mathematical Analysis and Applications, vol. 427, pp. 74-87.
[15] Specht, D. F., 1990. "Probabilistic neural network and the polynomial adaline as complementary techniques for classification." IEEE Trans. Neural Networks, vol. 1, pp. 111-121.
[16] Yeh, Y. C., 1998. Application of neural network. Taiwan: Scholars Books.
[17] Scholkopf, B. and Smola, A. J., 2002. Learning with Kernels. Cambridge: MIT Press.
[18] Machine Learning Repository, 2015. https://archive.ics.uci.edu/ml/datasets/Iris
[19] Sansone, C. and Vento, M., 2001. "A classification reliability drive reject rule for multi-expert system." International Journal of Pattern Recognition and Artificial Intelligence, vol. 15, pp. 1-19.
[20] Bradley, A. P., 1997. "The use of the area under ROC curve in the evaluation of machine learning algorithms." Pattern Recognition, vol. 30, pp. 1145-1159.