2. WHY EVALUATE THE “GOODNESS” OF THE RESULTING CLUSTERS?
To avoid finding patterns in noise
To compare clustering algorithms
To compare two sets of clusters
To compare two clusters
3. DIFFERENT ASPECTS OF CLUSTER
VALIDATION
1. Determining the clustering tendency of a set of data, i.e., distinguishing whether
non-random structure actually exists in the data.
2. Comparing the results of a cluster analysis to externally known results, e.g., to
externally given class labels.
3. Evaluating how well the results of a cluster analysis fit the data without reference
to external information.
- Use only the data
4. Comparing the results of two different sets of cluster analyses to determine
which is better.
5. Determining the ‘correct’ number of clusters.
For 2, 3, and 4, we can further distinguish whether we want to evaluate the entire
clustering or just individual clusters.
4. FRAMEWORK FOR CLUSTER VALIDITY
Need a framework to interpret any measure.
For example, if our measure of evaluation has the value 10, is that good,
fair, or poor?
Statistics provide a framework for cluster validity
The more “atypical” a clustering result is, the more likely it represents valid
structure in the data
Compare the value of an index obtained on the actual clustering to the
values it takes on random data or random clusterings.
If the observed value of the index is unlikely under this random baseline, then the cluster results are likely valid
These approaches are more complicated and harder to understand.
For comparing the results of two different sets of cluster analyses, a
framework is less necessary.
However, there is the question of whether the difference between two index
values is significant
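The random-baseline idea above can be sketched in code. The following is a minimal, illustrative example (the function names, the median-split "clustering", and the uniform noise model are all assumptions made for the sketch, not part of the original material): compute an index (here SSE) on the actual data, then estimate how often random data produces an index value at least as good.

```python
import random
import statistics

def sse_of_median_split(points):
    """SSE of the 1-D clustering that splits the sorted points in half:
    a deliberately simple stand-in for a real clustering algorithm."""
    pts = sorted(points)
    half = len(pts) // 2
    sse = 0.0
    for cluster in (pts[:half], pts[half:]):
        m = statistics.fmean(cluster)
        sse += sum((x - m) ** 2 for x in cluster)
    return sse

def empirical_p_value(observed_sse, n_points, low, high, trials=200, seed=0):
    """Fraction of random-data runs whose SSE is at most the observed one.
    A small value means the observed SSE is atypically low, i.e. the
    structure is unlikely to be an artifact of noise."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        noise = [rng.uniform(low, high) for _ in range(n_points)]
        if sse_of_median_split(noise) <= observed_sse:
            hits += 1
    return hits / trials

# Two well-separated 1-D groups vs. uniform noise on the same interval.
data = [0.1, 0.15, 0.2, 0.25, 0.8, 0.85, 0.9, 0.95]
observed = sse_of_median_split(data)
p = empirical_p_value(observed, len(data), 0.0, 1.0)
```

The small empirical p-value for the well-separated data illustrates the "atypical" argument: uniform noise almost never yields an SSE that low.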
5. MEASURES OF CLUSTER VALIDITY
Numerical measures that are applied to judge various aspects of
cluster validity are classified into the following three types.
External Index: Used to measure the extent to which cluster labels match
externally supplied class labels.
Entropy
Internal Index: Used to measure the goodness of a clustering structure
without respect to external information.
Sum of Squared Error (SSE)
Relative Index: Used to compare two different clusterings or clusters.
Often an external or internal index is used for this function, e.g., SSE or entropy
Sometimes these are referred to as criteria instead of indices
However, sometimes criterion is the general strategy and index is the numerical
measure that implements the criterion.
6. EXTERNAL MEASURES
The correct or ground truth clustering is known a priori.
Given a clustering partition C and ground truth partitioning T, we
redefine TP, TN, FP, FN in the context of clustering.
For n points, the total number of pairs is N = n(n−1)/2, and
N = TP + FP + FN + TN
7. EXTERNAL MEASURES …
True Positives (TP): Xi and Xj are a true positive pair if they belong to the
same partition in T, and they are also in the same cluster in C. TP is
defined as the number of true positive pairs.
False Negatives (FN): Xi and Xj are a false negative pair if they belong to
the same partition in T, but they do not belong to the same cluster in C.
FN is defined as the number of false negative pairs.
False Positives (FP): Xi and Xj are a false positive pair if they do not
belong to the same partition in T, but belong to the same cluster in C.
FP is the number of false positive pairs.
True Negatives (TN): Xi and Xj are a true negative pair if they do not
belong to the same partition in T, nor to the same cluster in C. TN is the
number of true negative pairs.
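The four pair counts above can be computed directly from two label assignments. The following is a small sketch (the function name and the example labelings are illustrative, not from the original): iterate over all point pairs and classify each pair by whether T and C agree on it.

```python
from itertools import combinations

def pair_confusion(truth, clustering):
    """Count (TP, FN, FP, TN) over all point pairs, given ground-truth
    labels T and cluster labels C for the same points."""
    tp = fn = fp = tn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_t = truth[i] == truth[j]
        same_c = clustering[i] == clustering[j]
        if same_t and same_c:
            tp += 1          # together in both T and C
        elif same_t:
            fn += 1          # together in T, split in C
        elif same_c:
            fp += 1          # split in T, together in C
        else:
            tn += 1          # split in both
    return tp, fn, fp, tn

T = [0, 0, 0, 1, 1, 1]   # ground truth: two partitions of three points
C = [0, 0, 1, 1, 1, 1]   # a clustering that misplaces one point
tp, fn, fp, tn = pair_confusion(T, C)
```

For these six points there are N = 15 pairs, and the four counts always sum to N.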
8. JACCARD COEFFICIENT
Measures the fraction of true positive point pairs, ignoring the
true negatives:
Jaccard = TP / (TP + FP + FN)
For a perfect clustering C, the coefficient is one, that is, there are no
false positives or false negatives.
Note that the Jaccard coefficient is asymmetric in that it ignores the
true negatives
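The Jaccard formula above is a one-liner given the pair counts. A minimal sketch (the example counts are illustrative assumptions):

```python
def jaccard(tp, fp, fn):
    """Jaccard coefficient over point pairs: TN is deliberately ignored."""
    return tp / (tp + fp + fn)

# Illustrative pair counts, e.g. from a small labeled example.
score = jaccard(tp=4, fp=3, fn=2)   # 4 / 9
```

With no false positives and no false negatives the coefficient is exactly 1, matching the perfect-clustering case described above.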
9. RAND STATISTIC
Measures the fraction of true positives and true negatives over all pairs
as
Rand = (TP + TN)/ N
The Rand statistic measures the fraction of point pairs where both the
clustering C and the ground truth T agree.
A perfect clustering has a value of 1 for the statistic.
The adjusted Rand index is the extension of the Rand statistic corrected
for chance.
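The Rand statistic is likewise direct to compute from the pair counts. A minimal sketch (function name and example values are illustrative):

```python
def rand_statistic(tp, tn, n_pairs):
    """Fraction of point pairs on which clustering C and ground truth T
    agree (pairs together in both, or apart in both)."""
    return (tp + tn) / n_pairs

# Illustrative counts over N = 15 pairs.
score = rand_statistic(tp=4, tn=6, n_pairs=15)   # 10 / 15
```

For a chance-corrected version, scikit-learn's `sklearn.metrics.adjusted_rand_score` implements the adjusted Rand index mentioned above.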
10. FOWLKES-MALLOWS MEASURE
Define precision and recall analogously to how they are defined for classification:
Prec = TP / (TP + FP) and Recall = TP / (TP + FN)
The Fowlkes–Mallows (FM) measure is defined as the geometric mean
of the pairwise precision and recall:
FM = √(Prec ∙ Recall)
FM is also asymmetric in terms of the true positives and negatives
because it ignores the true negatives. Its highest value is also 1,
achieved when there are no false positives or negatives.
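The geometric-mean definition can be sketched as follows (function name and example counts are illustrative):

```python
import math

def fowlkes_mallows(tp, fp, fn):
    """FM = geometric mean of pairwise precision and recall.
    Like Jaccard, it ignores the true negatives."""
    prec = tp / (tp + fp)
    recall = tp / (tp + fn)
    return math.sqrt(prec * recall)

# Illustrative pair counts.
score = fowlkes_mallows(tp=4, fp=3, fn=2)
```

When FP = FN = 0, precision and recall are both 1 and FM reaches its maximum of 1, as stated above.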
11. INTERNAL MEASURES: COHESION
AND SEPARATION
Cluster Cohesion (Compactness): Measures how closely related the
objects in a cluster are.
Cluster Separation: Measures how distinct or well-
separated a cluster is from other clusters.
Example: Squared Error
Cohesion is measured by the within-cluster sum of squares (SSE):
WSS = Σ_i Σ_{x ∈ C_i} (x − m_i)²
Separation is measured by the between-cluster sum of squares:
BSS = Σ_i |C_i| ∙ (m − m_i)²
where |C_i| is the size of cluster i, m_i is the centroid (mean) of
cluster i, and m is the overall mean of the data.
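The WSS/BSS decomposition can be sketched for 1-D data as follows (function names and the example clusters are illustrative). A useful check is that WSS + BSS equals the total sum of squares about the grand mean, so a lower WSS necessarily means a higher BSS for the same data.

```python
def mean(xs):
    return sum(xs) / len(xs)

def cohesion_separation(clusters):
    """WSS and BSS for 1-D clusters (lists of floats).
    WSS + BSS always equals the total sum of squares about the grand mean."""
    everything = [x for c in clusters for x in c]
    m = mean(everything)                      # overall mean
    wss = 0.0
    bss = 0.0
    for c in clusters:
        mi = mean(c)                          # cluster centroid
        wss += sum((x - mi) ** 2 for x in c)  # within-cluster spread
        bss += len(c) * (m - mi) ** 2         # weighted centroid distance
    return wss, bss

clusters = [[1.0, 2.0, 3.0], [8.0, 9.0, 10.0]]
wss, bss = cohesion_separation(clusters)
tss = sum((x - 5.5) ** 2 for x in [1, 2, 3, 8, 9, 10])  # grand mean is 5.5
```

Here the two tight, well-separated groups give a small WSS (4.0) and a large BSS, and the two add up to the total sum of squares.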
18. XIE-BENI INDEX
In the definition of the XB-index, the numerator indicates the compactness
of the obtained clusters, while the denominator indicates the strength of
the separation between clusters.
The objective is to minimize the XB-index in order to achieve a proper
clustering.
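As a sketch, the crisp (hard-clustering) variant of the Xie-Beni index divides the total within-cluster squared error by the number of points times the minimum squared distance between centroids; this is an assumption about the exact form intended here, since the original XB-index was defined for fuzzy memberships. Function names and example clusters are illustrative.

```python
def xie_beni(clusters):
    """Crisp Xie-Beni index for 1-D clusters (lists of floats).
    Numerator: total within-cluster squared error (compactness).
    Denominator: n * minimum squared centroid separation.
    Lower values indicate a better clustering."""
    def mean(xs):
        return sum(xs) / len(xs)
    centroids = [mean(c) for c in clusters]
    n = sum(len(c) for c in clusters)
    compactness = sum(sum((x - m) ** 2 for x in c)
                      for c, m in zip(clusters, centroids))
    min_sep = min((a - b) ** 2
                  for i, a in enumerate(centroids)
                  for b in centroids[i + 1:])
    return compactness / (n * min_sep)

good = xie_beni([[1.0, 2.0, 3.0], [8.0, 9.0, 10.0]])   # compact, far apart
bad = xie_beni([[1.0, 2.0, 8.0], [3.0, 9.0, 10.0]])    # mixed-up clusters
```

The well-separated clustering yields a much smaller index than the mixed-up one, consistent with the minimization objective stated above.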