Improved k-means

Improving the accuracy of the K-means clustering algorithm
Kasun Ranga Wijeweera
(krw19870829@gmail.com)

This presentation is based on the
following research paper

K. A. Abdul Nazeer, M. P. Sebastian, Improving
the Accuracy and Efficiency of the k-means
Clustering Algorithm, Proceedings of the World
Congress on Engineering 2009 Vol I, WCE
2009, July 1 – 3, 2009, London, U. K.

Consider a set of n data points, X = {x1, x2, ..., xn},

and a set of K clusters into which these points are to be partitioned.

Algorithm k-means
1. Randomly choose K data items from X as initial centroids.
2. Repeat
   - Assign each data point to the cluster which has the closest centroid.
   - Calculate new cluster centroids.
   Until the convergence criterion is met.
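
The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name kmeans, the max_iter and seed parameters, the Euclidean distance, and the choice of "assignments stop changing" as the convergence criterion are all assumptions made for the example.

```python
import math
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means on a list of equal-length tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Step 1: randomly choose k data items as the initial centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 2a: assign each point to the cluster with the closest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Assumed convergence criterion: the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 2b: recompute each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels

data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(data, k=2)
```

On this small, well-separated data set the first three points end up in one cluster and the last three in the other.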

K-means can get stuck in a local optimum
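
A small hand-made example may help to see this. The sketch below runs the same assign/update loop from two different sets of initial centroids on the four corners of a 4 x 1 rectangle. The helper name lloyd, the data, and the use of the sum of squared errors (SSE) as the quality measure are assumptions chosen for illustration.

```python
import math

def lloyd(points, centroids, max_iter=100):
    """Run k-means from the given initial centroids; return (centroids, sse)."""
    centroids = list(centroids)
    k = len(centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments are stable: stop
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # Sum of squared distances from each point to its own centroid.
    sse = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return centroids, sse

corners = [(0, 0), (0, 1), (4, 0), (4, 1)]
_, sse_bad = lloyd(corners, [(0, 0), (0, 1)])   # both seeds on the left edge
_, sse_good = lloyd(corners, [(0, 0), (4, 0)])  # one seed on each short edge
print(sse_bad, sse_good)  # 16.0 1.0
```

With both seeds on the left edge the loop settles on a top/bottom split (SSE 16) and never leaves it, although the left/right split (SSE 1) is far better. The final result depends on the initial centroids.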

Algorithm selection of initial centroids
1. Set m = 1;
2. Compute the distance between each data point and all
other data points in the set;
3. Find the closest pair of data points from the set X and
form a data point set A[m] (1 <= m <= K) which
contains these two data points. Delete these two data
points from X;
4. Find the data point in X that is closest to the data
point set A[m]. Add it to A[m] and delete it from X;
5. Repeat step 4 until the number of data points in A[m]
reaches 0.75*(n/K);

Algorithm selection of initial centroids (continued)
6. If m < K then set m = m + 1, find the pair of data
points in X between which the distance is the
shortest, form another data point set A[m] from them and delete
them from X. Go to step 4;
7. For each data point set A[m] (1 <= m <= K) find the
arithmetic mean of the vectors of data points in A[m].
These means will be the initial centroids.
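
Steps 1 to 7 can be sketched as follows. This is an illustrative reading of the slides, not the paper's reference implementation. The function name initial_centroids and the fraction parameter are hypothetical, Euclidean distance is assumed, and the distance from a point to the set A[m] is assumed to be the distance to its nearest member, since the slides do not define it.

```python
import math
from itertools import combinations

def initial_centroids(points, k, fraction=0.75):
    """Pick k initial centroids by growing k groups of mutually close points."""
    remaining = list(points)               # working copy of X
    target = fraction * len(points) / k    # step 5 threshold: 0.75 * (n / K)
    centroids = []
    for _ in range(k):                     # one pass per set A[m], m = 1..K
        if len(remaining) < 2:
            raise ValueError("not enough points left to form another pair")
        # Steps 3 and 6: start A[m] from the closest pair still in X.
        a, b = min(combinations(remaining, 2), key=lambda pair: math.dist(*pair))
        group = [a, b]
        remaining.remove(a)
        remaining.remove(b)
        # Steps 4 and 5: move the point nearest to A[m] from X into A[m]
        # until A[m] reaches the target size (or X runs out).
        while len(group) < target and remaining:
            nearest = min(
                remaining,
                key=lambda p: min(math.dist(p, q) for q in group),  # assumed set distance
            )
            group.append(nearest)
            remaining.remove(nearest)
        # Step 7: the arithmetic mean of A[m] becomes an initial centroid.
        centroids.append(tuple(sum(c) / len(group) for c in zip(*group)))
    return centroids

data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
seeds = initial_centroids(data, k=2)
```

For this data set each group grows to three points (0.75 * 8 / 2 = 3), so one seed lands inside each of the two visible groups. The search for the closest pair compares all pairs, so this sketch takes quadratic time per group and is meant for small inputs only.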
