International Journal of Fuzzy Logic Systems (IJFLS) Vol.5, No.4, October 2015
DOI: 10.5121/ijfls.2015.5405
ADAPTIVE FUZZY KERNEL CLUSTERING
ALGORITHM
Weijun Xu 1
1 Department of Electrical and Information Engineering, Northeast Petroleum University at Qinhuangdao, Qinhuangdao, P.R. China
ABSTRACT
Fuzzy clustering algorithms cannot obtain good clustering results when the sample features are not distinctive, and they require the number of clusters to be determined in advance. For this reason, this paper proposes an adaptive fuzzy kernel clustering algorithm. The algorithm first uses an adaptive clustering-number function to calculate the optimal clustering number; the input-space samples are then mapped into a high-dimensional feature space using a Gaussian kernel and clustered in that space. Matlab simulation results confirm that the algorithm greatly outperforms classical clustering algorithms, converging faster and clustering more accurately.
KEYWORDS
Fuzzy clustering; Gaussian kernel; Adaptive clustering number; Fuzzy kernel clustering
1. INTRODUCTION
Clustering is an unsupervised learning process. The hard C-means method (HCM) and the fuzzy C-means method (FCM) [1, 2] cluster directly on the sample features, so the clustering result largely depends on the distribution of the samples. These methods are not suitable for finding nonconvex clusters or clusters of very different sizes, and noisy data points can greatly affect the final clustering result. This paper introduces the Gaussian kernel [3] into FCM: the input-space samples are mapped into a high-dimensional feature space and clustered there, in order to obtain better clustering results. In addition, HCM and FCM require the number of clusters to be fixed in advance. This paper therefore puts forward an adaptive fuzzy kernel clustering algorithm. The algorithm first uses an adaptive clustering-number function to calculate the optimal clustering number; the input-space samples are then mapped into a high-dimensional feature space and clustered there. Matlab simulation results confirmed that the algorithm converges faster and clusters more accurately than classical clustering algorithms.
2. FUZZY C-MEANS ALGORITHM
Many clustering algorithms, such as HCM and FCM, are based on a distance measure. FCM was generalized from HCM by J. C. Bezdek [1, 2] and has become one of the most commonly used and most discussed clustering algorithms. Its principle is described as follows: $X = \{x_j,\ j = 1, 2, \dots, n\}$ is a sample set with $X \subseteq R^p$, $c$ is the expected clustering number, $v_i\ (i = 1, 2, \dots, c)$ is the $i$-th cluster center, and $u_{ij}\ (i = 1, 2, \dots, c;\ j = 1, 2, \dots, n)$ is the membership of the $j$-th sample in the $i$-th class, with $0 \le u_{ij} \le 1$ and $0 < \sum_{j=1}^{n} u_{ij} < n$. The objective function of FCM is:
$$J_m(U, V) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^m \,\| x_j - v_i \|^2 \qquad (1)$$
where $U = \{u_{ij}\}$, $V = (v_1, v_2, \dots, v_c)$, $\|\cdot\|$ is generally the Euclidean distance, and $m\ (m > 1)$ is the fuzzy weighting exponent. The constraint condition is:
$$\sum_{i=1}^{c} u_{ij} = 1, \quad \forall j = 1, 2, \dots, n \qquad (2)$$
J. C. Bezdek gave an iterative algorithm for the optimal solution of the above mathematical programming problem. The iterative formulas in the algorithm are:
$$u_{ij} = \frac{\left( 1 / \| x_j - v_i \|^2 \right)^{1/(m-1)}}{\sum_{k=1}^{c} \left( 1 / \| x_j - v_k \|^2 \right)^{1/(m-1)}}, \quad \forall i = 1, 2, \dots, c;\ j = 1, 2, \dots, n \qquad (3)$$
$$v_i = \frac{\sum_{j=1}^{n} u_{ij}^m x_j}{\sum_{j=1}^{n} u_{ij}^m}, \quad \forall i = 1, 2, \dots, c \qquad (4)$$
FCM is suitable for spherical or ellipsoidal clusters and is very sensitive to noise and outliers. Introducing the kernel method into fuzzy clustering can effectively solve these problems [3, 4]. In the HCM and FCM clustering algorithms the number of clusters must be given in advance; to solve this problem, this paper proposes an adaptive clustering-number function that calculates the optimal clustering number, making the clustering parameters adaptive.
3. THE ADAPTIVE FUNCTION OF THE OPTIMAL CLUSTERING NUMBER
In this section, we present a new adaptive function of the optimal clustering number and the
corresponding algorithm. At last, we use four groups of synthetic data to test the algorithm.
3.1. The Adaptive Function
Geometrically, a good clustering should satisfy two requirements:
(a) divergence: the distance between classes should be as large as possible;
(b) compactness: the distance between data points within the same class should be as small as possible.
The ratio of divergence to compactness can serve as a clustering validity function, so we construct the following adaptive function of the optimal clustering number, $C_{opt}(c)$:
$$C_{opt}(c) = \frac{\sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^m \,\| v_i - \bar{x} \|^2 \,/\, (c - 1)}{\sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^m \,\| x_j - v_i \|^2 \,/\, (n - c)} \qquad (5)$$
where the definitions of $u_{ij}$, $x_j$, $v_i$, and $\|\cdot\|$ are the same as in formula (1), and $\bar{x}$ is the central vector of the overall data:
$$\bar{x} = \frac{\sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^m x_j}{n} \qquad (6)$$
The optimal clustering number $c$ is the one at which $C_{opt}(c)$ reaches its maximum value.
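For illustration, formulas (5) and (6) can be computed directly from a fuzzy partition. The sketch below uses our own naming (c_opt, x_bar) and assumes the samples are stored row-wise:

```python
import numpy as np

def c_opt(X, U, V, m=2.0):
    """Adaptive validity function C_opt(c) of formula (5); a sketch under
    our own naming. X: (n, p) samples, U: (c, n) memberships, V: (c, p)."""
    c, n = U.shape
    Um = U ** m
    x_bar = (Um[:, :, None] * X[None, :, :]).sum(axis=(0, 1)) / n   # formula (6)
    d2_between = ((V - x_bar) ** 2).sum(axis=1)                     # ||v_i - x_bar||^2
    d2_within = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # ||x_j - v_i||^2
    divergence = (Um.sum(axis=1) * d2_between).sum() / (c - 1)
    compactness = (Um * d2_within).sum() / (n - c)
    return divergence / compactness
```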
3.2. Algorithm
Now, we outline the $C_{opt}(c)$ algorithm process. For simplicity, we assume $m = 2$ in the study below:

Step 1. Initialization: set the termination condition $\varepsilon > 0$, the cluster number $c = 2$, $L(1) = 0$, initial cluster centers $V^{(0)}$, and $k = 0$. (The partition matrix $U^{(0)}$ can also be used as the initial condition.)

Step 2. Calculate the partition matrix:

$$u_{ij}^{(k)} = 1 \Big/ \sum_{r=1}^{c} \left( d_{ij}^{(k)} / d_{rj}^{(k)} \right)^{2/(m-1)}$$

If there exist $j$ and $r$ such that $d_{rj}^{(k)} = 0$, then $u_{rj}^{(k)} = 1$ and $u_{ij}^{(k)} = 0$ for $i \neq r$.

Step 3. Calculate the prototypes:

$$v_i^{(k+1)} = \sum_{j=1}^{n} \left( u_{ij}^{(k)} \right)^m x_j \Big/ \sum_{j=1}^{n} \left( u_{ij}^{(k)} \right)^m$$

Step 4. Calculate the variation of the prototypes, $\| V^{(k+1)} - V^{(k)} \|$, where $\| \cdot \|$ is some matrix norm. If $\| V^{(k+1)} - V^{(k)} \| < \varepsilon$, go to Step 5; otherwise let $k = k + 1$ and go to Step 2.

Step 5. Calculate $L(c) = C_{opt}(c)$ for $2 \le c < n$: if $C_{opt}(c-1) > C_{opt}(c-2)$ and $C_{opt}(c-1) > C_{opt}(c)$, then stop the iteration; otherwise go to Step 2 with $c = c + 1$.
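A minimal sketch of this search loop, reusing the fcm_step and c_opt helpers sketched above; the random center initialization and iteration cap are our assumptions, since the paper leaves them open:

```python
import numpy as np

def find_optimal_c(X, eps=1e-5, max_iter=100, m=2.0, seed=0):
    """Search c = 2, 3, ... for the first local maximum of C_opt(c),
    following Steps 1-5 above."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    L = {1: 0.0}                                        # Step 1: L(1) = 0
    c = 2
    while c < n:
        V = X[rng.choice(n, size=c, replace=False)]     # initial V^(0)
        for _ in range(max_iter):                       # Steps 2-4
            U, V_new = fcm_step(X, V, m)
            converged = np.linalg.norm(V_new - V) < eps
            V = V_new
            if converged:
                break
        U, _ = fcm_step(X, V, m)                        # memberships for final V
        L[c] = c_opt(X, U, V, m)                        # Step 5
        if c >= 3 and L[c - 1] > L[c - 2] and L[c - 1] > L[c]:
            return c - 1, L                             # first local maximum
        c += 1
    return max(L, key=L.get), L
```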
3.3. Examples
In this example, we use four groups of synthetic data to test the algorithm presented above. The four data sets all consist of 2-dimensional points; see Fig. 1. The optimal clustering number (OCN) of each group, evaluated by $C_{opt}(c)$, is also shown there.
Figure 1. Four groups of synthetic data
Using the algorithm given in Section 3.2, we get the following OCNs for the four data sets (the maximal $L(c)$ value in each list determines the OCN):
a: OCN = 2, (L(2), L(3)) = (116.1173, 98.0299)
b: OCN = 3, (L(2), L(3), L(4)) = (52.1941, 119.8551, 112.3083)
c: OCN = 4, (L(2), L(3), L(4), L(5)) = (74.0647, 114.8352, 155.9011, 145.8195)
d: OCN = 5, (L(2), L(3), L(4), L(5), L(6)) = (74.0877, 81.1528, 129.8848, 165.9403, 161.8719)
The four panels show that the adaptive function correctly identifies the optimal clustering number.
4. FUZZY KERNEL CLUSTERING ALGORITHM
Kernel clustering enhances the sample features and can effectively improve the classification results of the algorithm. According to statistical learning theory, any function satisfying the Mercer [5] conditions can be regarded as a kernel function. Using a Mercer kernel, the input-space samples are mapped into a high-dimensional feature space and clustered there. The kernel mapping makes originally indistinct features stand out, allowing better clustering [6].

We replace $\| x_j - v_i \|$ in formula (1) with $\| \phi(x_j) - \phi(v_i) \|$, where $\phi(\cdot)$ is the nonlinear transformation function. Formula (1) then becomes:
$$J_m(U, V) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^m \,\| \phi(x_j) - \phi(v_i) \|^2 \qquad (7)$$
where $\phi(x_j)$ and $\phi(v_i)$ denote the images of the sample and the cluster center in the feature space $H$, respectively. The distance $\| \phi(x_j) - \phi(v_i) \|^2$ is calculated as follows:
$$\| \phi(x_j) - \phi(v_i) \|^2 = K(x_j, x_j) + K(v_i, v_i) - 2K(x_j, v_i) \qquad (8)$$
In this paper, the Gaussian kernel is selected:
$$K(x, y) = \exp\!\left( -\frac{\| x - y \|^2}{2\sigma^2} \right) \qquad (9)$$
where $\sigma$ is the width of the Gaussian kernel. For convenience, we denote the FCM algorithm that minimizes formula (7) as KFCM. Under the constraint condition of formula (2), the iterative formulas of KFCM are:
$$u_{ij} = \frac{\big( 1 / (K(x_j, x_j) + K(v_i, v_i) - 2K(x_j, v_i)) \big)^{1/(m-1)}}{\sum_{k=1}^{c} \big( 1 / (K(x_j, x_j) + K(v_k, v_k) - 2K(x_j, v_k)) \big)^{1/(m-1)}} \qquad (10)$$
$$v_i = \frac{\sum_{j=1}^{n} u_{ij}^m K(x_j, v_i)\, x_j}{\sum_{j=1}^{n} u_{ij}^m K(x_j, v_i)}, \quad \forall i = 1, 2, \dots, c \qquad (11)$$
Formula (11) shows that $v_i$ still belongs to the input space, but the additional weighting coefficient $K(x_j, v_i)$ (especially with the Gaussian kernel) assigns different weights to noise points and outliers, which greatly reduces their influence on the clustering result.
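The two updates can be sketched as one iteration step. Since $K(x, x) = K(v, v) = 1$ for the Gaussian kernel, the distance in (10) reduces to $2 - 2K(x_j, v_i)$. The code below is an illustrative NumPy sketch under our own naming, not the author's Matlab implementation:

```python
import numpy as np

def kfcm_step(X, V, m=2.0, sigma=150.0, eps=1e-10):
    """One KFCM iteration with the Gaussian kernel: memberships via (10),
    centers via (11). X: (n, p) samples, V: (c, p) centers."""
    # K[i, j] = K(x_j, v_i); for the Gaussian kernel K(x, x) = K(v, v) = 1,
    # so formula (8) gives ||phi(x_j) - phi(v_i)||^2 = 2 - 2K(x_j, v_i).
    d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    feat_d2 = 2.0 - 2.0 * K + eps                  # eps guards the division below
    w = feat_d2 ** (-1.0 / (m - 1.0))              # formula (10), per-cluster terms
    U = w / w.sum(axis=0, keepdims=True)           # normalize so columns sum to 1
    wk = (U ** m) * K                              # formula (11): kernel weighting
    V_new = (wk @ X) / wk.sum(axis=1, keepdims=True)
    return U, V_new
```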
5. KFCMA ALGORITHM PROCESS
For convenience of description, the proposed adaptive fuzzy kernel clustering algorithm is abbreviated as KFCMA. We now outline the KFCMA algorithm process.

STEP 1. Use the adaptive clustering-number function to calculate the optimal clustering number of the sample-space data.

STEP 2. Initialization: the clustering number $c$ is the result of STEP 1; the termination condition $\varepsilon > 0$; $m = 2$; initial centers $V^{(0)}$; the iteration counter $l = 0$ and $l_{\max} = 50$; $\sigma = 150$.

STEP 3. Update the membership matrix from the current cluster centers according to formula (10).

STEP 4. Update the cluster centers from the current centers and the membership matrix obtained in STEP 3 according to formula (11).

STEP 5. Judgement and termination: if $\| V^{(l)} - V^{(l-1)} \| < \varepsilon$ or $l > l_{\max}$, then stop the algorithm and output the membership matrix and cluster-center matrix; otherwise go to STEP 3 with $l = l + 1$.
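Putting STEPs 1 to 5 together, a minimal driver might look as follows; it reuses the find_optimal_c and kfcm_step sketches above, and the random initialization of $V^{(0)}$ is our assumption:

```python
import numpy as np

def kfcma(X, eps=1e-5, m=2.0, sigma=150.0, l_max=50, seed=0):
    """KFCMA sketch: STEP 1 finds c via the adaptive function;
    STEPs 2-5 run kernel FCM until the centers stabilize."""
    c, _ = find_optimal_c(X, eps=eps, m=m, seed=seed)       # STEP 1
    rng = np.random.default_rng(seed)
    V = X[rng.choice(X.shape[0], size=c, replace=False)]    # STEP 2
    for _ in range(l_max):                                  # STEPs 3-5
        U, V_new = kfcm_step(X, V, m=m, sigma=sigma)
        if np.linalg.norm(V_new - V) < eps:                 # termination test
            return U, V_new
        V = V_new
    return U, V                                             # hit l_max
```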
6. COMPARISON OF SIMULATION RESULTS
In order to verify the effectiveness and feasibility of KFCMA, we carried out experiments in Matlab on artificial data sets and real data sets, respectively.
6.1. Artificial data
The first sample set contains 200 samples in two-dimensional space belonging to 2 different categories (100 samples each). The first category consists of points on a circle of radius 10 centered at (10, 10). The second category consists of points inside a square centered at (10, 10) with side length 6. The two subsets have no points in common.

The second sample set also contains 200 samples in two-dimensional space belonging to 2 different categories (100 samples each). The first category consists of points inside a circle of radius 8 centered at (8, 8). The second category consists of points inside a circle of radius 4 centered at (16, 16). The two categories have 12 points in common.
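For reproducibility, the two artificial sets might be generated as below; the paper does not state its sampling procedure, so the uniform sampling is our assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

# First set: 100 points ON a circle of radius 10 centered at (10, 10),
# plus 100 points inside a square of side 6 centered at (10, 10).
theta = rng.uniform(0.0, 2.0 * np.pi, 100)
set1 = np.vstack([
    np.column_stack([10 + 10 * np.cos(theta), 10 + 10 * np.sin(theta)]),
    rng.uniform(10 - 3, 10 + 3, size=(100, 2)),
])

def disk(center, radius, n):
    """n points uniformly distributed inside a disk."""
    a = rng.uniform(0.0, 2.0 * np.pi, n)
    r = radius * np.sqrt(rng.uniform(0.0, 1.0, n))
    return np.column_stack([center[0] + r * np.cos(a), center[1] + r * np.sin(a)])

# Second set: 100 points inside a circle of radius 8 at (8, 8) and
# 100 points inside a circle of radius 4 at (16, 16); the disks overlap.
set2 = np.vstack([disk((8, 8), 8, 100), disk((16, 16), 4, 100)])
```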
We clustered the two data sets with the FCM and KFCMA algorithms respectively on the same computer. Comparison results are shown in Table 1. From Table 1 we can see that, under the same termination condition $\varepsilon$, KFCMA converges faster, needs fewer iterations, and clusters more accurately than FCM.
Table 1. Experimental results on the artificial data (100 randomized trials)

Data set     Algorithm   Average Misclassification (%)   Average Iterations   Average Time (s)
First set    FCM         0.7                             8                    0.06
             KFCMA       0.0                             2                    0.02
Second set   FCM         10.5                            10                   0.08
             KFCMA       0.0                             4                    0.03
6.2. Real data
The real data comprise three data sets from the UCI machine learning repository [7]: Iris, Wine, and Wisc (Wisconsin breast cancer data).

Iris is the best-known database in the pattern recognition literature. The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other 2; the latter are not linearly separable from each other.

The Wine data set is the result of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wines.

The breast cancer database (Wisc) was obtained from the University of Wisconsin Hospitals, Madison, from Dr. William H. Wolberg.
International Journal of Fuzzy Logic Systems (IJFLS) Vol.5, No.4, October 2015
57
Table 2 gives the optimal clustering numbers of the three data sets as calculated by $C_{opt}(c)$. The results show that the algorithm correctly determines the cluster numbers of the real data sets.

We clustered the three data sets with the FCM and KFCMA algorithms respectively on the same computer. Comparison results are shown in Table 3. From Table 3 we can see that, under the same termination condition $\varepsilon$, KFCMA converges faster, needs fewer iterations, and clusters more accurately than FCM.
Table 2. The optimal clustering number of the data sets

Data set   C_opt(c) values                                        Optimal clustering number   Actual clustering number
Iris       C_opt(1) = 0, C_opt(2) = 10.63, C_opt(3) = 17.31       3                           3
Wine       C_opt(1) = 0, C_opt(2) = 11.81, C_opt(3) = 10.93,      2                           2
           C_opt(4~12) = 0
Wisc       C_opt(1) = 0, C_opt(2) = 7.91, C_opt(3) = 6.02,        2                           2
           C_opt(4~8) = 0
Table 3. Results of the KFCMA and FCM algorithms on the real data sets (100 randomized trials)

Algorithm   Data set   Average Misclassification (%)   Average Iterations   Average Time (s)
FCM         Iris       10.667                          17                   0.1683
            Wine       5.056                           18                   0.2114
            Wisc       3.367                           13                   0.5058
KFCMA       Iris       10.0                            2                    0.0877
            Wine       3.371                           2                    0.1087
            Wisc       2.343                           11                   0.4520
7. CONCLUSION

Fuzzy clustering algorithms cannot obtain good clustering results when the sample features are not distinctive, are very sensitive to noise, and require the number of clusters to be determined in advance. To address these problems, this paper proposed an adaptive fuzzy kernel clustering algorithm. The algorithm first uses the adaptive clustering-number function to calculate the optimal clustering number; the input-space samples are then mapped into a high-dimensional feature space and clustered there. Matlab simulation results confirmed that KFCMA converges faster, needs fewer iterations, and clusters more accurately than classical clustering algorithms.
REFERENCES
[1] MacQueen J. Some methods for classification and analysis of multivariate observations. Proc. 5th Berkeley Symposium on Mathematical Statistics and Probability. California, 1967: 281-297.
[2] Bezdek J C. Pattern Recognition with Fuzzy Objective Function Algorithms. New York: Plenum Press, 1981.
[3] Zhang D Q, Chen S C. Clustering incomplete data using kernel-based fuzzy C-means algorithm. Neural Processing Letters, 2003, 18(3): 155-162.
[4] Scholkopf B, Mika S, Burges C. Input space versus feature space in kernel-based methods. IEEE Transactions on Neural Networks, 1999, 10(5): 1000-1017.
[5] Li Guozheng, Wang Meng, Zeng Huajun. An Introduction to Support Vector Machines. Beijing: China Machine Press, 2004: 1-123.
[6] Kamel M S. New algorithms for solving the fuzzy c-means clustering problem. Pattern Recognition, 1994, 27(3): 421-428.
[7] Blake C, Merz C J. UCI Repository of Machine Learning Databases. University of California, Irvine. http://www.ics.uci.edu/~mlearn
[8] Kannan S R, Ramathilagam S, Chung P C. Effective fuzzy c-means clustering algorithms for data clustering problems. Expert Systems with Applications, 2012: 6292-6300.
[9] Mok P Y, Huang H Q, Kwok Y L, et al. A robust adaptive clustering analysis method for automatic identification of clusters. Pattern Recognition, 2012: 3017-3033.
[10] Ramathilagam S, Devi R, Kannan S R. Extended fuzzy c-means: an analyzing data clustering problems. Cluster Computing, 2013: 389-406.
[11] Ghosh S, Mitra S, Dattagupta R. Fuzzy clustering with biological knowledge for gene selection. Applied Soft Computing, 2014: 102-111.
[12] Bijalwan V, et al. Machine learning approach for text and document mining. arXiv preprint arXiv:1406.1580, 2014.
Author
Weijun Xu (1981-), male, from Fuping, Shaanxi Province; Lecturer. E-mail: xwjsm@163.com. I obtained my B.S. degree at Yanshan University, China, in 2004, and completed my M.S. at Beijing Information and Technology University, China, in 2011. I am now a teacher at Northeast Petroleum University at Qinhuangdao, China. My research specializes in fuzzy control and pattern recognition.
  • 8. International Journal of Fuzzy Logic Systems (IJFLS) Vol.5, No.4, October 2015 58 REFERENCES [1] MacQueen J. Some methods for classification and analysis of multivariate observations[A]. Proc5th Berkeley Symposium in Mathematics, Statistics, Probbability[C]. California,1967. 281-297. [2] Bezdek JC.Pattern Recognition with Fuzzy Objective Function Algorithms[M]. New York: Plenum Press, 1981. [3] Zhang, D.Q., Chen, S.C. Clustering incomplete data using kernelbased fuzzy C-means algorithm[J]. Neural Process. Lett. 18(3), 155–162 (2003) [4] Scholkopf B, Mika S, Burges C. Input space versus feature space in kernelbased methods[J]. IEEE Trans on Neural Networks, 1999, 10(5): 1000-1017. [5] Liguozheng, Wangmeng, Zenghuajun. An introduction to support vector machine[M].Beijing: China Machine PRESS, 2004:1-123 [6] Kamel S M ohamed. New algorithms for solving the fuzzy c-means clustering problem [J].Pattern Recognition, 1994, 27(3): 421-428. [7] Blake C, Merz C J. UCI repository of machine learning databases, University of California Irvine. http://www.ics.uci.edu/~mlearn [8] S.R. Kannan, S. Ramathilagam and P.C. Chung, Effective fuzzy c-means clustering algorithms for data clustering problems, Expert Systems with Applications (2012), 6292–6300. [9] P.Y. Mok, H.Q. Huang, Y.L. Kwok, et al., A robust adaptive clustering analysis method for automatic identification of clusters, Pattern Recognition (2012), 3017–3033. [10] S. Ramathilagam, R. Devi and S.R. Kannan, Extended fuzzy c-means: an analyzing data clustering problems, Cluster Computing (2013), 389–406. [11] S. Ghosha, S. Mitraa and R. Dattagupta, Fuzzy clustering with biological knowledge for gene selection, Applied Soft Computing (2014), 102–111. [12] Bijalwan, Vishwanath, et al. "Machine learning approach for text and document mining." arXiv preprint arXiv:1406.1580 (2014). Author Weijun Xu(1981 -) ,male, native place:Fuping- Shaanxi Province. Lecturer. E- mail:xwjsm@163.com.I obtained my B.S. Degree at Yanshan University in China in 2004, and completed my M.S.at Beijing Information and Technology University in China in 2011. Now I'm a teacher in Northeast Petroleum University at Qinhuangdao,China. My research area specializes in fuzzy control and pattern recognition.