Vishal Krishna, Ayush Kumar & Kishore Bhamidipati
International Journal of Image Processing (IJIP), Volume (9) : Issue (6) : 2015 320
Textural Feature Extraction of Natural Objects for Image
Classification
Vishal Krishna vkrishna7@gatech.edu
Computer Science
Georgia Institute of Technology
Atlanta – 30332, US
Ayush Kumar f2010029@goa.bits-pilani.ac.in
Computer Science
BITS Pilani, Goa Campus
Goa – 403726, India
Kishore Bhamidipati Kishore.b@manipal.edu
Computer Science Engineering
Manipal Institute of Technology
Manipal – 576104, India
Abstract
The field of digital image processing has grown in scope in recent years. A digital
image is represented as a two-dimensional array of pixels, where each pixel carries intensity and
location information. Analysis of digital images involves the extraction of meaningful information
from them, based on certain requirements. Digital image analysis requires the extraction of
features, which transforms the data from a high-dimensional space to a space of fewer dimensions.
Feature vectors are n-dimensional vectors of numerical features used to represent an object. We
have used Haralick features to classify various images using different classification algorithms,
namely Support Vector Machines (SVM), Logistic Classifier, Random Forests, Multilayer Perceptron
and Naïve Bayes Classifier. We then used cross-validation to assess how well a classifier works for
a generalized data set, as compared to the classifications obtained during training.
Keywords: Feature Extraction, Haralick, Classifiers, Cross-Validation.
1. INTRODUCTION
Texture is an important feature for many types of analysis of images and identification of regions
of interest. Texture analysis has a wide array of applications, including industrial and biomedical
monitoring, classification and segmentation of satellite or aerial photos, identification of ground
relief, and many others. [1] Various methods have been proposed via research over the years for
identifying and discriminating the textures. Measures like angular second moment, contrast,
mean, correlation, entropy, inverse difference moment, etc. have been typically used by
researchers for obtaining feature vectors, which are then manipulated to obtain textural features.
One of the most popular approaches to texture analysis is based on the co-occurrence matrix
obtained from images, proposed by Robert M. Haralick in 1973, which forms the basis of this
paper.
Image classification is one of the most important parts of digital image analysis. Classification is a
computational procedure that sorts images into subsets according to their similarities. [4]
Contextual image classification, as the name suggests, is a method of classification based on the
contextual information in images, i.e. the relationship amongst neighbouring pixels. [2]
For classification, we used the WEKA (“Waikato Environment for Knowledge Analysis”) tool,
which is an open source machine-learning software suite developed using Java, by the University
of Waikato, New Zealand. [6] It contains a set of tools for different data analysis and modelling
techniques such as pre-processing, classification, clustering, segmentation, association rules
and visualization. It implements many artificial intelligence algorithms, such as decision trees,
neural networks and Particle Swarm Optimization. [5]
2. LITERATURE SURVEY
The classification of images can be done either on the basis of a single resolution cell or on a
collection of resolution cells. When a block of cells is used, the challenge is to define a set of
features to represent the information given by those cells, which can be used for classification of
the images.
Human perception of images is based on three major classes of features: spectral, textural and
contextual. Spectral features are obtained as the average variation of tone across various bands
of the electromagnetic spectrum. Textural features, on the other hand, provide information about
the variation of tone within a single band. Information from the portions of the image surrounding
the part under analysis constitutes the contextual features. In gray-scale photographs, tone
represents the varying gray levels in resolution cells, while the statistical distribution of the
gray levels is interpreted as texture. Tone and texture form an intrinsic part of any image, though
one can take precedence over the other according to the nature of the image. Simply stated, the
relation between the two is: tone is dominant when the sample under consideration shows only a
small range of variation of gray levels, while gray levels spread over a wide range in a similar
sample indicate the dominance of texture.
Haralick’s work is based on the assumption that information regarding the texture of any image
can be obtained from calculating the average spatial relation of the gray tones of the image with
each other. The procedure for calculating the Haralick textural features is based on a set of gray-
tone spatial-dependence probability distribution matrices (also termed as Gray-Level Co-
occurrence Matrices or GLCM, or gray-level spatial dependence matrix), computed for various
angles at fixed distances. From each such matrix, fourteen features can be calculated, which
provide information in terms of homogeneity, contrast, linear variation of gray tone, nature and
number of boundaries etc.
Co-occurrence Matrix: A co-occurrence matrix, P, is used to describe the relationships between
neighbouring (at a distance, d) pixels in an image. Four co-occurrence matrices, each calculated for
a different angle, can be defined. A co-occurrence matrix, termed P_0, describes pixels that are
adjacent to one another horizontally (at angle 0°). Similarly, co-occurrence matrices are defined
for the vertical direction (90°) and both diagonals (45° and 135°). These matrices are called P_90,
P_45 and P_135 respectively. [3]
\[
x = \begin{pmatrix} 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 2 & 2 & 2 \\ 2 & 2 & 3 & 3 \end{pmatrix}
\qquad
P_{0} = \begin{pmatrix} 4 & 2 & 1 & 0 \\ 2 & 4 & 0 & 0 \\ 1 & 0 & 6 & 1 \\ 0 & 0 & 1 & 2 \end{pmatrix}
\]
\[
P_{45} = \begin{pmatrix} 4 & 1 & 0 & 0 \\ 1 & 2 & 2 & 0 \\ 0 & 2 & 4 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}
\qquad
P_{90} = \begin{pmatrix} 6 & 0 & 2 & 0 \\ 0 & 4 & 2 & 0 \\ 2 & 2 & 2 & 2 \\ 0 & 0 & 2 & 0 \end{pmatrix}
\]
\[
P_{135} = \begin{pmatrix} 2 & 1 & 3 & 0 \\ 1 & 2 & 1 & 0 \\ 3 & 1 & 0 & 2 \\ 0 & 0 & 2 & 0 \end{pmatrix}
\]
For example, the image x contains four horizontally adjacent (0,0) pairs when each pair is counted
in both directions, so P_0(0,0) = 4; likewise there are two (0,1) pairs, so P_0(0,1) = 2.
All four matrices are computed in the same way.
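The counting rule described above can be checked with a short script. The following is a minimal sketch in plain Python, assuming a distance of d = 1 and symmetric counting (each neighbouring pair is counted once in each direction); the function name `glcm` and the `OFFSETS` table are illustrative choices, not part of the original work.

```python
def glcm(image, d_row, d_col, levels):
    """Symmetric co-occurrence counts for the pixel offset (d_row, d_col)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1  # pair read in one direction
                counts[b][a] += 1  # same pair read in the opposite direction
    return counts

# The 4 x 4 example image x from the text, with gray levels 0..3.
x = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

# Assumed (row, column) offsets for distance 1 at each angle.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

matrices = {angle: glcm(x, dr, dc, 4) for angle, (dr, dc) in OFFSETS.items()}

for angle, matrix in matrices.items():
    print(f"P_{angle}")
    for row in matrix:
        print("  ", row)
```

Under these assumptions the script reproduces the four matrices listed above for the example image.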
Based on the co-occurrence matrices calculated as above, the thirteen texture features as
proposed by Haralick are defined below:
Notation:
Ng : Number of distinct gray levels in quantized image
a) Angular Second Moment
b) Contrast
c) Correlation
where µx, µy, σx and σy are the means and standard deviations of px and py respectively.
d) Sum of Squares: Variance
e) Inverse Difference Moment
f) Sum Average
g) Sum Variance
h) Sum Entropy
i) Entropy
j) Difference Variance
k) Difference Entropy
l) Information measures of correlation
where HX and HY are entropies of px and py.
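For reference, the features named in (a) to (l) are usually written as follows. These are the standard definitions from Haralick, Shanmugam and Dinstein (1973), given here as an aid to the reader rather than as a transcription of this paper's own equations. Here p(i,j) is assumed to be the (i,j)-th entry of a co-occurrence matrix divided by the sum of all its entries, and sums over i and j run from 1 to Ng. The sum variance is written about the sum average f_6; the 1973 paper prints f_8 in that position, which many implementations treat as a misprint.

```latex
\begin{align*}
p_x(i) &= \sum_{j} p(i,j), \qquad p_y(j) = \sum_{i} p(i,j) \\
p_{x+y}(k) &= \sum_{i+j=k} p(i,j), \quad k = 2, \dots, 2N_g \\
p_{x-y}(k) &= \sum_{|i-j|=k} p(i,j), \quad k = 0, \dots, N_g - 1 \\[6pt]
f_1 &= \sum_{i}\sum_{j} p(i,j)^2 && \text{(a) Angular Second Moment} \\
f_2 &= \sum_{n=0}^{N_g-1} n^2\, p_{x-y}(n) && \text{(b) Contrast} \\
f_3 &= \frac{\sum_{i}\sum_{j} (ij)\, p(i,j) - \mu_x \mu_y}{\sigma_x \sigma_y} && \text{(c) Correlation} \\
f_4 &= \sum_{i}\sum_{j} (i-\mu)^2\, p(i,j) && \text{(d) Sum of Squares: Variance} \\
f_5 &= \sum_{i}\sum_{j} \frac{p(i,j)}{1+(i-j)^2} && \text{(e) Inverse Difference Moment} \\
f_6 &= \sum_{k=2}^{2N_g} k\, p_{x+y}(k) && \text{(f) Sum Average} \\
f_7 &= \sum_{k=2}^{2N_g} (k-f_6)^2\, p_{x+y}(k) && \text{(g) Sum Variance} \\
f_8 &= -\sum_{k=2}^{2N_g} p_{x+y}(k) \log p_{x+y}(k) && \text{(h) Sum Entropy} \\
f_9 &= -\sum_{i}\sum_{j} p(i,j) \log p(i,j) && \text{(i) Entropy} \\
f_{10} &= \text{variance of } p_{x-y} && \text{(j) Difference Variance} \\
f_{11} &= -\sum_{k=0}^{N_g-1} p_{x-y}(k) \log p_{x-y}(k) && \text{(k) Difference Entropy} \\
f_{12} &= \frac{HXY - HXY1}{\max(HX, HY)}, \qquad
f_{13} = \sqrt{1 - \exp\bigl[-2\,(HXY2 - HXY)\bigr]} && \text{(l) Information measures of correlation} \\[6pt]
HXY &= -\sum_{i}\sum_{j} p(i,j) \log p(i,j) \\
HXY1 &= -\sum_{i}\sum_{j} p(i,j) \log\bigl(p_x(i)\, p_y(j)\bigr) \\
HXY2 &= -\sum_{i}\sum_{j} p_x(i)\, p_y(j) \log\bigl(p_x(i)\, p_y(j)\bigr)
\end{align*}
```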
3. METHODOLOGY
For any value of d, as mentioned before, four co-occurrence matrices are calculated, and each of
the thirteen features detailed above is computed from each matrix. The mean and range of each set
of four values give a set of 28 values, which are then passed to the classifier. Some of the input
features are strongly correlated, so a feature-selection procedure can be used to identify a subset
of features that gives good classification results.
The test data has a total of 25 classes, which are known a priori. We use this knowledge to
evaluate the effectiveness of various classification algorithms on the Haralick features.
The classification algorithms used are:
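The mean-and-range step can be sketched as follows. This is a minimal illustration in plain Python that uses the four example matrices from Section 2 and only two of the features (angular second moment and contrast); the function names are hypothetical, and the paper's own pipeline is not shown in the source. Under this scheme each feature contributes two values, so thirteen features would give 26 values, while the count of 28 corresponds to Haralick's full set of fourteen features.

```python
# Co-occurrence count matrices for the 4 x 4 example image (angles in degrees).
P = {
    0:   [[4, 2, 1, 0], [2, 4, 0, 0], [1, 0, 6, 1], [0, 0, 1, 2]],
    45:  [[4, 1, 0, 0], [1, 2, 2, 0], [0, 2, 4, 1], [0, 0, 1, 0]],
    90:  [[6, 0, 2, 0], [0, 4, 2, 0], [2, 2, 2, 2], [0, 0, 2, 0]],
    135: [[2, 1, 3, 0], [1, 2, 1, 0], [3, 1, 0, 2], [0, 0, 2, 0]],
}

def normalise(counts):
    """Divide every entry by the total count so the entries sum to 1."""
    total = sum(sum(row) for row in counts)
    return [[value / total for value in row] for row in counts]

def angular_second_moment(p):
    """Sum of squared entries of the normalised matrix."""
    return sum(value * value for row in p for value in row)

def contrast(p):
    """Entries weighted by the squared gray-level difference (i - j)^2."""
    return sum((i - j) ** 2 * value
               for i, row in enumerate(p)
               for j, value in enumerate(row))

def mean_and_range(matrices, features):
    """For each feature, append its mean and its range over the given matrices."""
    vector = []
    for feature in features:
        values = [feature(normalise(m)) for m in matrices]
        vector.append(sum(values) / len(values))   # mean over the four angles
        vector.append(max(values) - min(values))   # range over the four angles
    return vector

vector = mean_and_range(list(P.values()), [angular_second_moment, contrast])
print(vector)  # [ASM mean, ASM range, contrast mean, contrast range]
```

Averaging over the four angles makes the resulting values less sensitive to the orientation of the texture, while the range retains some information about how directional it is.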
1. Naïve Bayes Classifier (NB) - A Bayes classifier is a simple probabilistic classifier based
on applying Bayes' theorem (from Bayesian statistics) with strong (naive) independence
assumptions. [7]
2. Logistic Classifier (Log) - Logistic regression is a probabilistic statistical classification
model. It measures the relationship between the categorical dependent variable and one
or more independent variables, which are usually (but not necessarily) continuous, by
using probability scores as the predicted values of the dependent variable.[8]
3. Multilayer Perceptron Classifier (MP) – In a conventional MLP, components of feature
vectors are made to take crisp binary values, and the pattern is classified according to
the highest activation reached. [9]
4. Random Forest Classifier (RF) - Random forests operate by constructing a number of
decision trees on the training data and classifying data according to the mode of the
classes obtained from the individual trees. [10]
5. Sequential Minimal Optimization (SMO) – The algorithm is used to train support vector machines
for classification. [11]
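The "mode of the classes" rule used by a random forest can be illustrated in a few lines. This is only a sketch of the voting step, assuming the per-tree predictions are already available; the function name `forest_vote` and the example labels are hypothetical, and training of the trees is not shown.

```python
from collections import Counter

def forest_vote(tree_predictions):
    """Return the class predicted by the largest number of trees (the mode)."""
    return Counter(tree_predictions).most_common(1)[0][0]

# Hypothetical predictions of five trees for a single image.
votes = ["class 3", "class 7", "class 3", "class 3", "class 21"]
print(forest_vote(votes))
```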
The parameters on which the effectiveness of each of the above algorithms is measured are:
1. True Positive Rate (TP) – the number of items correctly labelled as belonging to the
particular class divided by the total number of items that actually belong to that
class
2. False Positive Rate (FP) – the number of items incorrectly labelled as belonging to
the particular class divided by the total number of items that do not belong to that
class
3. Precision - the fraction of retrieved instances that are relevant
4. Recall - the fraction of relevant instances that are retrieved (equal to the TP Rate)
5. F-Measure – a measure that combines precision and recall, calculated as the
harmonic mean of precision and recall
6. ROC Area - the receiver operating characteristic (ROC) is a plot of the true positive rate
against the false positive rate of a binary classifier as its decision threshold is varied.
The area under the curve is treated as a measure of accuracy of the classifier.
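The first five measures can all be derived from a confusion matrix by treating one class at a time as "positive" and all other classes as "negative". The sketch below assumes that `confusion[a][b]` counts items of true class `a` that were labelled as class `b`; the three-class matrix is made-up illustration data, not a result from this paper.

```python
def per_class_metrics(confusion, k):
    """One-vs-rest metrics for class k of a square confusion matrix."""
    n = len(confusion)
    tp = confusion[k][k]                                 # correctly labelled as k
    fn = sum(confusion[k]) - tp                          # class k labelled as something else
    fp = sum(confusion[r][k] for r in range(n)) - tp     # other classes labelled as k
    tn = sum(sum(row) for row in confusion) - tp - fn - fp

    tp_rate = tp / (tp + fn) if tp + fn else 0.0
    fp_rate = fp / (fp + tn) if fp + tn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp_rate
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"tp_rate": tp_rate, "fp_rate": fp_rate, "precision": precision,
            "recall": recall, "f_measure": f_measure}

# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted class).
confusion = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 0, 10],
]

metrics = per_class_metrics(confusion, 0)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The ROC Area is omitted because it needs the classifier's scores for each instance, not only its final labels.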
A second set of experiments is carried out, using the same test data, algorithms and
parameters, but with the added constraint of using 10-fold cross validation.
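In 10-fold cross validation the data is divided into ten parts; each part is used once for testing while the remaining nine are used for training, and the measures are combined over the ten runs. The following sketch shows only the index bookkeeping, with a simple round-robin assignment of samples to folds. The function name and the sample count of 1000 are assumptions for illustration, and WEKA's own fold assignment may differ (for example, by randomising and stratifying the data).

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# Hypothetical data set of 1000 samples split into 10 folds.
for fold_number, (train, test) in enumerate(k_fold_splits(1000, 10), start=1):
    print(f"fold {fold_number}: {len(train)} training samples, {len(test)} test samples")
```

Because every sample is tested by a model that never saw it during training, the resulting figures are a better estimate of performance on unseen data than figures measured on the training set itself.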
4. RESULTS AND ANALYSIS
Each algorithm is first run on the data set and all six parameters are measured and compared.
The results obtained are given below.
Class  NB     Log    MP     SMO    RF
1      0.525  1.000  0.950  0.675  1
2      0.750  1.000  0.975  0.675  1
3      0.850  1.000  1.000  0.875  0.975
4      0.775  0.975  0.975  0.800  1
5      0.900  1.000  1.000  0.975  1
6      0.825  1.000  0.975  0.800  1
7      0.900  1.000  1.000  0.950  1
8      0.775  1.000  0.925  0.700  1
9      0.850  0.975  0.825  0.675  1
10     0.800  1.000  1.000  0.925  1
11     0.725  1.000  0.975  0.775  1
12     0.750  1.000  0.950  0.825  1
13     0.800  1.000  1.000  0.900  1
14     0.650  0.975  0.975  0.850  1
15     0.725  0.975  1.000  0.850  0.975
16     0.850  1.000  0.975  0.800  0.975
17     0.975  0.975  1.000  0.800  1
18     0.900  0.975  1.000  0.975  1
19     0.600  0.975  0.975  0.800  1
20     1.000  1.000  1.000  1.000  1
21     0.250  1.000  0.925  0.725  0.975
22     0.750  1.000  0.975  0.900  1
23     0.525  1.000  0.950  0.775  0.9
24     1.000  1.000  1.000  0.975  1
25     0.850  1.000  1.000  0.975  1
FIGURE 1.1: Values of TP Rate of each class for different classification methods.
FIGURE 1.2: Graphical representation of TP Rate values.

Class  NB     Log    MP     SMO    RF
1      0.013  0.000  0.000  0.01   0.001
2      0.019  0.000  0.002  0.01   0
3      0.000  0.000  0.001  0      0.001
4      0.013  0.001  0.003  0.01   0
5      0.010  0.000  0.000  0      0.002
6      0.006  0.000  0.003  0      0
7      0.002  0.000  0.000  0      0
8      0.004  0.000  0.001  0.01   0
9      0.054  0.001  0.001  0.02   0
10     0.006  0.000  0.002  0.01   0.002
11     0.007  0.000  0.004  0.01   0.001
12     0.013  0.000  0.001  0.01   0
13     0.003  0.000  0.001  0      0
14     0.015  0.000  0.000  0.02   0
15     0.011  0.000  0.001  0      0.001
16     0.018  0.000  0.000  0      0
17     0.002  0.001  0.001  0.01   0
18     0.000  0.001  0.000  0      0
19     0.004  0.000  0.002  0.02   0
20     0.000  0.000  0.000  0      0
21     0.017  0.000  0.003  0.02   0
22     0.013  0.000  0.000  0.01   0
23     0.008  0.000  0.001  0.01   0
24     0.000  0.000  0.000  0      0
25     0.000  0.000  0.000  0      0
FIGURE 2.1: Values of FP Rate of each class for different classification methods.
FIGURE 2.2: Graphical representation of FP Rate values.

Class  NB     Log    MP     SMO    RF
1      0.636  1.000  1.000  0.82   0.976
2      0.625  1.000  0.951  0.73   1
3      1.000  1.000  0.976  1      0.975
4      0.721  0.974  0.929  0.82   1
5      0.783  1.000  1.000  0.91   0.952
6      0.846  1.000  0.929  0.97   1
7      0.947  1.000  1.000  0.97   1
8      0.886  1.000  0.974  0.85   1
9      0.395  0.983  0.971  0.54   1
10     0.842  1.000  0.952  0.76   0.952
11     0.806  1.000  0.907  0.76   0.976
12     0.714  1.000  0.974  0.83   1
13     0.914  1.000  0.976  0.95   1
14     0.650  0.994  1.000  0.68   1
15     0.725  0.992  0.976  0.97   0.975
16     0.667  1.000  1.000  0.91   1
17     0.951  0.978  0.976  0.82   1
18     1.000  0.984  1.000  0.95   1
19     0.857  0.993  0.951  0.7    1
20     1.000  1.000  1.000  1      1
21     0.385  1.000  0.925  0.66   1
22     0.714  1.000  1.000  0.86   1
23     0.724  1.000  0.974  0.82   1
24     1.000  1.000  1.000  0.98   1
25     1.000  1.000  1.000  0.98   1
FIGURE 3.1: Values of Precision of each class for different classification methods.
FIGURE 3.2: Graphical representation of Precision values.

Class  NB     Log    MP     SMO    RF
1      0.525  1.000  0.950  0.68   1
2      0.750  1.000  0.975  0.68   1
3      0.850  1.000  1.000  0.88   0.975
4      0.775  0.975  0.975  0.8    1
5      0.900  1.000  1.000  0.98   1
6      0.825  1.000  0.975  0.8    1
7      0.900  1.000  1.000  0.95   1
8      0.775  1.000  0.925  0.7    1
9      0.850  0.975  0.825  0.68   1
10     0.800  1.000  1.000  0.93   1
11     0.725  1.000  0.975  0.78   1
12     0.750  1.000  0.950  0.83   1
13     0.800  1.000  1.000  0.9    1
14     0.650  0.975  0.975  0.85   1
15     0.725  0.975  1.000  0.85   0.975
16     0.850  1.000  0.975  0.8    0.975
17     0.975  0.975  1.000  0.8    1
18     0.900  0.975  1.000  0.98   1
19     0.600  0.975  0.975  0.8    1
20     1.000  1.000  1.000  1      1
21     0.250  1.000  0.925  0.73   0.975
22     0.750  1.000  0.975  0.9    1
23     0.525  1.000  0.950  0.78   0.9
24     1.000  1.000  1.000  0.98   1
25     0.850  1.000  1.000  0.98   1
FIGURE 4.1: Values of Recall of each class for different classification methods.
FIGURE 4.2: Graphical representation of Recall values.

Class  NB     Log    MP     SMO    RF
1      0.575  1.000  0.974  0.74   0.988
2      0.682  1.000  0.963  0.7    1
3      0.919  1.000  0.988  0.93   0.975
4      0.747  0.976  0.951  0.81   1
5      0.837  1.000  1.000  0.94   0.976
6      0.835  1.000  0.951  0.88   1
7      0.923  1.000  1.000  0.96   1
8      0.827  1.000  0.949  0.77   1
9      0.540  0.979  0.892  0.6    1
10     0.821  1.000  0.976  0.83   0.976
11     0.763  1.000  0.940  0.77   0.988
12     0.732  1.000  0.962  0.83   1
13     0.853  1.000  0.988  0.92   1
14     0.650  0.982  0.987  0.76   1
15     0.725  0.986  0.988  0.91   0.975
16     0.747  1.000  0.987  0.85   0.987
17     0.963  0.976  0.988  0.81   1
18     0.947  0.993  1.000  0.96   1
19     0.706  0.981  0.963  0.74   1
20     1.000  1.000  1.000  1      1
21     0.303  1.000  0.925  0.69   0.987
22     0.732  1.000  0.987  0.88   1
23     0.609  1.000  0.962  0.8    0.947
24     1.000  1.000  1.000  0.98   1
25     0.919  1.000  1.000  0.98   1
FIGURE 5.1: Values of F-measure of each class for different classification methods.
FIGURE 5.2: Graphical representation of F-measure values.

Class  NB     Log    MP     SMO    RF
1      0.970  1.000  0.974  0.97   1
2      0.979  1.000  0.996  0.98   1
3      0.997  1.000  1.000  1      1
4      0.984  1.000  0.999  0.99   1
5      0.995  1.000  1.000  1      1
6      0.982  1.000  0.996  0.98   1
7      0.999  1.000  1.000  1      1
8      0.991  1.000  0.985  0.98   1
9      0.977  1.000  0.964  0.97   1
10     0.995  1.000  1.000  0.99   1
11     0.983  1.000  0.999  0.98   1
12     0.986  1.000  0.995  0.99   1
13     0.996  1.000  1.000  1      1
14     0.985  1.000  0.999  0.98   1
15     0.984  1.000  1.000  0.99   1
16     0.990  1.000  0.997  0.98   1
17     0.998  1.000  1.000  0.99   1
18     1.000  1.000  1.000  1      1
19     0.978  1.000  0.998  0.98   1
20     1.000  1.000  1.000  1      1
21     0.930  1.000  0.993  0.98   1
22     0.974  1.000  0.997  0.99   1
23     0.964  1.000  0.978  0.98   1
24     1.000  1.000  1.000  1      1
25     1.000  1.000  1.000  1      1
FIGURE 6.1: Values of ROC Area of each class for different classification methods.
FIGURE 6.2: Graphical representation of ROC Area values.

The following tables and diagrams pertain to the second set of experiments, i.e. with a cross
validation factor of 10 in each case.

Class  MP CV10  NB CV10  Log CV10  RF CV10  SMO CV10
1      0.725    0.5      0.725     0.675    0.525
2      0.825    0.675    0.775     0.675    0.625
3      0.925    0.775    0.975     0.875    0.825
4      0.825    0.775    0.875     0.7      0.725
5      0.900    0.9      0.925     0.8      0.9
6      0.750    0.825    0.8       0.725    0.775
7      0.950    0.85     0.975     0.925    0.925
8      0.825    0.725    0.9       0.75     0.575
9      0.625    0.85     0.725     0.65     0.675
10     0.925    0.775    0.925     0.9      0.825
11     0.875    0.7      0.9       0.85     0.75
12     0.825    0.7      0.825     0.75     0.775
13     0.975    0.775    0.975     0.825    0.85
14     0.800    0.575    0.85      0.7      0.8
15     0.925    0.675    0.9       0.825    0.85
16     0.825    0.825    0.85      0.7      0.75
17     0.850    0.95     0.975     0.925    0.7
18     0.975    0.9      0.975     0.975    0.95
19     0.875    0.575    0.9       0.675    0.725
20     1.000    1        1         1        1
21     0.675    0.225    0.75      0.45     0.575
22     0.850    0.75     0.9       0.775    0.825
23     0.775    0.425    0.8       0.625    0.7
24     0.975    1        1         0.95     0.975
25     0.925    0.825    1         0.925    0.975
FIGURE 7.1: Values of TP Rate of each class for different classification methods with cross validation 10.
FIGURE 7.2: Graphical representation of TP Rate values with Cross Validation.
FIGURE 8.2: Graphical Representation of FP Rate values with cross validation 10.
FIGURE 9.2: Graphical Representation of Precision values with cross validation 10.
FIGURE 10.2: Graphical Representation of Recall values with cross validation 10.
FIGURE 11.2: Graphical Representation of F-measure values with cross validation 10.
FIGURE 12.2: Graphical Representation of ROC Area values with cross validation 10.
The overall accuracy of each algorithm, considering all classes, is given below.
Classifier    Log   MP    NB    RF    SMO
Accuracy (%)  99.7  97.3  77.2  99.2  83.9
FIGURE 13.1: Overall accuracy values of all classes.
5. CONCLUSION
Our comparative study provides a comprehensive analysis of Haralick features and their use in
well-known classification models. From the analysis, we can see that the Logistic classifier
performs extremely well under all parameters, which is reflected in the combined accuracy values.
It has an accuracy of 99.7 percent across all the classes when evaluated on the training data. The
Random Forest classifier comes second, with 99.2 percent; it correctly predicted all the instances
for most of the classes. Naïve Bayes performs the worst, with 77.2 percent, especially on certain
classes, which brings down the total accuracy achieved.
On applying cross validation with a factor of 10, we see that the accuracy decreases across all
the classifiers. The different classifiers perform similarly with respect to each other as they did
without cross validation. However, it can be seen that the Multilayer Perceptron classifier performs
slightly better than the Random Forest classifier in this case.
Apart from Naïve Bayes, all methods had an accuracy of over 80 percent, and the Logistic and
Random Forest classifiers scored above 99 percent. This demonstrates the power of
Haralick features and their effectiveness in image classification using standard classification models.
6. REFERENCES
[1] Timo Ojala, Matti Pietikainen and David Harwood, A comparative study of texture measures
with classification based on feature distributions.
[2] M. Pietikainen, T. Ojala, Z. Xu; Rotation-invariant texture classification using feature
distributions
[3] Eizan Miyamoto and Thomas Merryman Jr; Fast Calculation of Haralick Texture
Features.
[4] Frank, J. (1990) Quart. Rev. Biophys. 23, 281-329.
[5] Baharak Goli and Geetha Govindan; WEKA – A powerful free software for implementing
Bio-inspired Algorithms; State Inter University Centre of Excellence in Bioinformatics, University of
Kerala.
[6] Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, Ian H.
Witten (2009); The WEKA Data Mining Software: An Update; SIGKDD Explorations, Volume 11,
Issue 1.
[7] Tom M. Mitchell; Machine Learning; McGraw Hill,2010.
[8] David A. Freedman; Statistical Models: Theory and Practice; Cambridge University Press,
2009, p. 128.
[9] Shankar Pal, Shushmita Mitra; Multilayer Perceptron, Fuzzy Sets and Classification; IEEE
Transactions on Neural Networks, Vol 3, September 1992.
[10] Leo Breiman ; "Random Forests". Machine Learning 45 (1); 2001.
[11] John Platt; Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector
Machines; 1998.

More Related Content

PDF
An Automatic Medical Image Segmentation using Teaching Learning Based Optimiz...
PDF
AUTOMATIC THRESHOLDING TECHNIQUES FOR OPTICAL IMAGES
PDF
Object-Oriented Approach of Information Extraction from High Resolution Satel...
PDF
4 image segmentation through clustering
PPT
Evaluation of Texture in CBIR
PDF
A010210106
PDF
I1803026164
PDF
An Experiment with Sparse Field and Localized Region Based Active Contour Int...
An Automatic Medical Image Segmentation using Teaching Learning Based Optimiz...
AUTOMATIC THRESHOLDING TECHNIQUES FOR OPTICAL IMAGES
Object-Oriented Approach of Information Extraction from High Resolution Satel...
4 image segmentation through clustering
Evaluation of Texture in CBIR
A010210106
I1803026164
An Experiment with Sparse Field and Localized Region Based Active Contour Int...

What's hot (16)

PDF
Content-based Image Retrieval Using The knowledge of Color, Texture in Binary...
PDF
DOMAIN SPECIFIC CBIR FOR HIGHLY TEXTURED IMAGES
PDF
OTSU Thresholding Method for Flower Image Segmentation
PDF
Web image annotation by diffusion maps manifold learning algorithm
PDF
Image similarity using fourier transform
PDF
FULL PAPER.PDF
PDF
Combining Generative And Discriminative Classifiers For Semantic Automatic Im...
PDF
Content Based Image Retrieval Using Gray Level Co-Occurance Matrix with SVD a...
PDF
Comparison on PCA ICA and LDA in Face Recognition
PDF
SEGMENTATION USING ‘NEW’ TEXTURE FEATURE
PDF
D010332630
PDF
A comparative analysis of retrieval techniques in content based image retrieval
PDF
PDE BASED FEATURES FOR TEXTURE ANALYSIS USING WAVELET TRANSFORM
PDF
Segmentation of medical images using metric topology – a region growing approach
PDF
A Novel Algorithm for Design Tree Classification with PCA
PDF
Performance Evaluation of Basic Segmented Algorithms for Brain Tumor Detection
Content-based Image Retrieval Using The knowledge of Color, Texture in Binary...
DOMAIN SPECIFIC CBIR FOR HIGHLY TEXTURED IMAGES
OTSU Thresholding Method for Flower Image Segmentation
Web image annotation by diffusion maps manifold learning algorithm
Image similarity using fourier transform
FULL PAPER.PDF
Combining Generative And Discriminative Classifiers For Semantic Automatic Im...
Content Based Image Retrieval Using Gray Level Co-Occurance Matrix with SVD a...
Comparison on PCA ICA and LDA in Face Recognition
SEGMENTATION USING ‘NEW’ TEXTURE FEATURE
D010332630
A comparative analysis of retrieval techniques in content based image retrieval
PDE BASED FEATURES FOR TEXTURE ANALYSIS USING WAVELET TRANSFORM
Segmentation of medical images using metric topology – a region growing approach
A Novel Algorithm for Design Tree Classification with PCA
Performance Evaluation of Basic Segmented Algorithms for Brain Tumor Detection
Ad

Similar to Textural Feature Extraction of Natural Objects for Image Classification (20)

PDF
A comparative study on content based image retrieval methods
PDF
Content Based Image Retrieval : Classification Using Neural Networks
PDF
Query Image Searching With Integrated Textual and Visual Relevance Feedback f...
PDF
Color vs texture feature extraction and matching in visual content retrieval ...
PDF
Q UANTUM C LUSTERING -B ASED F EATURE SUBSET S ELECTION FOR MAMMOGRAPHIC I...
PDF
C1803011419
PDF
IMAGE RETRIEVAL USING QUADRATIC DISTANCE BASED ON COLOR FEATURE AND PYRAMID S...
PDF
B0343011014
PDF
International Journal of Engineering Research and Development
PDF
Medical Image segmentation using Image Mining concepts
PDF
A Novel Feature Extraction Scheme for Medical X-Ray Images
PDF
Feature selection approach in animal classification
PDF
Component-Based Ethnicity Identification from Facial Images.pdf
PDF
A COMPARATIVE ANALYSIS OF RETRIEVAL TECHNIQUES IN CONTENT BASED IMAGE RETRIEVAL
PDF
COLOUR IMAGE REPRESENTION OF MULTISPECTRAL IMAGE FUSION
PDF
COLOUR IMAGE REPRESENTION OF MULTISPECTRAL IMAGE FUSION
PDF
C OMPARATIVE S TUDY OF D IMENSIONALITY R EDUCTION T ECHNIQUES U SING PCA AND ...
PDF
International Journal of Engineering and Science Invention (IJESI)
PDF
Computer Assisted and Contour Detectioin in Medical Imaging Using Fuzzy Logic
PDF
H017344752
A comparative study on content based image retrieval methods
Content Based Image Retrieval : Classification Using Neural Networks
Query Image Searching With Integrated Textual and Visual Relevance Feedback f...
Color vs texture feature extraction and matching in visual content retrieval ...
Q UANTUM C LUSTERING -B ASED F EATURE SUBSET S ELECTION FOR MAMMOGRAPHIC I...
C1803011419
IMAGE RETRIEVAL USING QUADRATIC DISTANCE BASED ON COLOR FEATURE AND PYRAMID S...
B0343011014
International Journal of Engineering Research and Development
Medical Image segmentation using Image Mining concepts
A Novel Feature Extraction Scheme for Medical X-Ray Images
Feature selection approach in animal classification
Component-Based Ethnicity Identification from Facial Images.pdf
A COMPARATIVE ANALYSIS OF RETRIEVAL TECHNIQUES IN CONTENT BASED IMAGE RETRIEVAL
COLOUR IMAGE REPRESENTION OF MULTISPECTRAL IMAGE FUSION
COLOUR IMAGE REPRESENTION OF MULTISPECTRAL IMAGE FUSION
C OMPARATIVE S TUDY OF D IMENSIONALITY R EDUCTION T ECHNIQUES U SING PCA AND ...
International Journal of Engineering and Science Invention (IJESI)
Computer Assisted and Contour Detectioin in Medical Imaging Using Fuzzy Logic
H017344752
Ad

Recently uploaded (20)

PDF
Pre independence Education in Inndia.pdf
PPTX
PPH.pptx obstetrics and gynecology in nursing
PPTX
Pharmacology of Heart Failure /Pharmacotherapy of CHF
PDF
01-Introduction-to-Information-Management.pdf
PDF
Module 4: Burden of Disease Tutorial Slides S2 2025
PDF
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
PPTX
human mycosis Human fungal infections are called human mycosis..pptx
PDF
Basic Mud Logging Guide for educational purpose
PDF
2.FourierTransform-ShortQuestionswithAnswers.pdf
PDF
Classroom Observation Tools for Teachers
PDF
FourierSeries-QuestionsWithAnswers(Part-A).pdf
PDF
Computing-Curriculum for Schools in Ghana
PDF
Chapter 2 Heredity, Prenatal Development, and Birth.pdf
PPTX
1st Inaugural Professorial Lecture held on 19th February 2020 (Governance and...
PDF
STATICS OF THE RIGID BODIES Hibbelers.pdf
PDF
Black Hat USA 2025 - Micro ICS Summit - ICS/OT Threat Landscape
PDF
TR - Agricultural Crops Production NC III.pdf
PPTX
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
PPTX
Renaissance Architecture: A Journey from Faith to Humanism
PPTX
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
Pre independence Education in Inndia.pdf
PPH.pptx obstetrics and gynecology in nursing
Pharmacology of Heart Failure /Pharmacotherapy of CHF
01-Introduction-to-Information-Management.pdf
Module 4: Burden of Disease Tutorial Slides S2 2025
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
human mycosis Human fungal infections are called human mycosis..pptx
Basic Mud Logging Guide for educational purpose
2.FourierTransform-ShortQuestionswithAnswers.pdf
Classroom Observation Tools for Teachers
FourierSeries-QuestionsWithAnswers(Part-A).pdf
Computing-Curriculum for Schools in Ghana
Chapter 2 Heredity, Prenatal Development, and Birth.pdf
1st Inaugural Professorial Lecture held on 19th February 2020 (Governance and...
STATICS OF THE RIGID BODIES Hibbelers.pdf
Black Hat USA 2025 - Micro ICS Summit - ICS/OT Threat Landscape
TR - Agricultural Crops Production NC III.pdf
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
Renaissance Architecture: A Journey from Faith to Humanism
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx

Textural Feature Extraction of Natural Objects for Image Classification

  • 1. Vishal Krishna, Ayush Kumar & Kishore Bhamidipati. International Journal of Image Processing (IJIP), Volume (9) : Issue (6) : 2015

Vishal Krishna, vkrishna7@gatech.edu, Computer Science, Georgia Institute of Technology, Atlanta – 30332, US
Ayush Kumar, f2010029@goa.bits-pilani.ac.in, Computer Science, BITS Pilani, Goa Campus, Goa – 403726, India
Kishore Bhamidipati, Kishore.b@manipal.edu, Computer Science Engineering, Manipal Institute of Technology, Manipal – 576104, India

Abstract
The field of digital image processing has been growing in scope in recent years. A digital image is represented as a two-dimensional array of pixels, where each pixel carries intensity and location information. Analysis of digital images involves the extraction of meaningful information from them, based on certain requirements. Digital image analysis requires the extraction of features, which transforms the data from a high-dimensional space to a space of fewer dimensions. Feature vectors are n-dimensional vectors of numerical features used to represent an object. We have used Haralick features to classify various images using different classification algorithms: Support Vector Machines (SVM), Logistic Classifier, Random Forests, Multilayer Perceptron and Naïve Bayes Classifier. We then used cross-validation to assess how well each classifier works for a generalized data set, as compared to the classifications obtained during training.

Keywords: Feature Extraction, Haralick, Classifiers, Cross-Validation.

1. INTRODUCTION
Texture is an important feature for many types of image analysis and for the identification of regions of interest. Texture analysis has a wide array of applications, including industrial and biomedical monitoring, classification and segmentation of satellite or aerial photos, identification of ground relief, and many others [1]. Various methods have been proposed over the years for identifying and discriminating textures. Measures like angular second moment, contrast, mean, correlation, entropy and inverse difference moment have typically been used by researchers for obtaining feature vectors, which are then manipulated to obtain textural features. One of the most popular approaches to texture analysis is based on the co-occurrence matrix obtained from images, proposed by Robert M. Haralick in 1973, which forms the basis of this paper.

Image classification is one of the most important parts of digital image analysis. Classification is a computational procedure that sorts images into subsets according to their similarities [4]. Contextual image classification, as the name suggests, is a method of classification based on the contextual information in images, i.e. the relationship amongst neighbouring pixels [2]. For classification, we used the WEKA (“Waikato Environment for Knowledge Analysis”) tool, which is an open source machine-learning software suite developed using Java, by the University
  • 2. of Waikato, New Zealand [6]. It contains a set of tools for different data analysis and modelling techniques such as pre-processing, classification, clustering, segmentation, association rules and visualization. It implements many artificial intelligence algorithms such as decision trees, neural networks and Particle Swarm Optimization [5].

2. LITERATURE SURVEY
The classification of images can be done either on the basis of a single resolution cell or on a collection of resolution cells. When a block of cells is used, the challenge is to define a set of features that represents the information given by those cells and can be used for classification of the images. Human perception of images is based on three major classes of features: spectral, textural and contextual. Spectral features are obtained as the average variation of tone across various bands of the electromagnetic spectrum. Textural features, on the other hand, provide information about the variation of tone within a single band. Information from the portions of the image surrounding the part under analysis constitutes the contextual features.

In gray-scale photographs, tone represents the varying gray levels in resolution cells, while the statistical distribution of the gray levels is interpreted as texture. Tone and texture form an intrinsic part of any image, though one can take precedence over the other according to the nature of the image. Simply stated, the relation between the two is: tone is dominant when the sample under consideration shows only a small range of variation of gray levels, while gray levels spread over a wide range in a similar sample indicate the dominance of texture.
Haralick’s work is based on the assumption that information regarding the texture of any image can be obtained by calculating the average spatial relation of the gray tones of the image with each other. The procedure for calculating the Haralick textural features is based on a set of gray-tone spatial-dependence probability distribution matrices (also termed Gray-Level Co-occurrence Matrices or GLCM, or gray-level spatial dependence matrices), computed for various angles at fixed distances. From each such matrix, fourteen features can be calculated, which provide information in terms of homogeneity, contrast, linear variation of gray tone, nature and number of boundaries etc.

Co-occurrence Matrix: A co-occurrence matrix, P, is used to describe the relationships between neighbouring pixels (at a distance d) in an image. Four co-occurrence matrices, each calculated for a different angle, can be defined. A co-occurrence matrix termed $P_0$ describes pixels that are adjacent to one another horizontally (at angle 0°). Similarly, co-occurrence matrices are defined for the vertical direction (90°) and both diagonals (45° and 135°). These matrices are called $P_{90}$, $P_{45}$ and $P_{135}$ respectively. [3]

\[
x = \begin{pmatrix} 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 2 & 2 & 2 \\ 2 & 2 & 3 & 3 \end{pmatrix}
\]

\[
P_{0} = \begin{pmatrix} 4 & 2 & 1 & 0 \\ 2 & 4 & 0 & 0 \\ 1 & 0 & 6 & 1 \\ 0 & 0 & 1 & 2 \end{pmatrix}
\qquad
P_{45} = \begin{pmatrix} 4 & 1 & 0 & 0 \\ 1 & 2 & 2 & 0 \\ 0 & 2 & 4 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}
\]

\[
P_{90} = \begin{pmatrix} 6 & 0 & 2 & 0 \\ 0 & 4 & 2 & 0 \\ 2 & 2 & 2 & 2 \\ 0 & 0 & 2 & 0 \end{pmatrix}
\qquad
P_{135} = \begin{pmatrix} 2 & 1 & 3 & 0 \\ 1 & 2 & 1 & 0 \\ 3 & 1 & 0 & 2 \\ 0 & 0 & 2 & 0 \end{pmatrix}
\]
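The four matrices for the 4 × 4 example image can be reproduced with a few lines of code. The following is a minimal sketch in plain Python, not the authors' implementation: the function name `glcm`, the `OFFSETS` table and the choice to count each neighbouring pair in both directions (which makes every matrix symmetric) are assumptions made for illustration, with distance d = 1.

```python
def glcm(image, offset, levels):
    """Symmetric gray-level co-occurrence matrix for one (row, col) offset."""
    d_row, d_col = offset
    rows, cols = len(image), len(image[0])
    matrix = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            # Skip neighbours that fall outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                # Count the pair in both directions so the matrix is symmetric.
                matrix[a][b] += 1
                matrix[b][a] += 1
    return matrix


# The 4 x 4 example image with gray levels 0..3.
IMAGE = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

# (row, column) offsets for distance d = 1 at each of the four angles.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

matrices = {angle: glcm(IMAGE, off, levels=4) for angle, off in OFFSETS.items()}

for angle, matrix in matrices.items():
    print(f"P_{angle}:")
    for row in matrix:
        print("   ", row)
```

Under these assumptions, a horizontally adjacent pair of zeros contributes 2 to the (0,0) entry, so the two such pairs in the example image give an entry of 4 in the 0° matrix.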
  • 3. There are 4 pairs of (0,0) at angle 0°, thus $P_0(0,0) = 4$; there are 2 pairs of (0,1), thus $P_0(0,1) = 2$. All four matrices are computed in the same way. Based on the co-occurrence matrices calculated as above, the thirteen texture features proposed by Haralick are listed below.

Notation: $N_g$: number of distinct gray levels in the quantized image

a) Angular Second Moment
b) Contrast
c) Correlation, where µx, µy, σx and σy are the means and standard deviations of x and y respectively
d) Sum of Squares: Variance
e) Inverse Difference Moment
f) Sum Average
g) Sum Variance
  • 4. h) Sum Entropy
i) Entropy
j) Difference Variance
k) Difference Entropy
l) Information measures of correlation, where HX and HY are the entropies of px and py

3. METHODOLOGY
For any value of d, as mentioned before, four matrices are calculated, and each of the thirteen features detailed above is computed from each matrix. The mean and range of each set of four values give a set of 28 values which are then passed to the classifier. Some of the input features are strongly correlated, so a feature-selection procedure can identify a subset of features that still gives good classification results.

The test data has a total of 25 classes, which are known a priori. We use this knowledge to calculate the effectiveness of various classification algorithms on the Haralick features. The classification algorithms used are:

1. Naïve Bayes Classifier (NB) - A Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem (from Bayesian statistics) with strong (naive) independence assumptions. [7]
2. Logistic Classifier (Log) - Logistic regression is a probabilistic statistical classification model. It measures the relationship between the categorical dependent variable and one or more independent variables, which are usually (but not necessarily) continuous, by using probability scores as the predicted values of the dependent variable. [8]
3. Multilayer Perceptron Classifier (MP) – In a conventional MLP, components of feature vectors are made to take crisp binary values, and the pattern is classified according to the highest activation reached. [9]
  • 5. 4. Random Forest Classifier (RF) - Random forests operate by constructing a number of decision trees on the training data and classifying data according to the mode of the classes obtained from the individual trees. [10]
5. Sequential Minimal Optimization (SMO) – The algorithm is used to train support vector machines for classification. [11]

The parameters on which the effectiveness of each of the above algorithms is measured are:

1. True Positive Rate (TP) – the number of items correctly labelled as belonging to the particular class divided by the total number of items that actually belong to that class
2. False Positive Rate (FP) – the number of items incorrectly labelled as belonging to the particular class divided by the total number of items that do not belong to that class
3. Precision - the fraction of retrieved instances that are relevant
4. Recall - the fraction of relevant instances that are retrieved
5. F-Measure – a measure that combines precision and recall, calculated as the harmonic mean of precision and recall
6. ROC Area - the receiver operating characteristic (ROC) is a plot of the performance of a binary classifier system. The area under the curve is treated as a measure of accuracy of the classifier.

A second set of experiments is carried out, using the same test data, algorithms and parameters, but with the added constraint of using a cross-validation factor of 10.

4. RESULTS AND ANALYSIS
Each algorithm is first run on the data set and all six parameters are measured and compared. The results obtained are given below.
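The first five measures listed above can all be derived from a confusion matrix by treating one class at a time as "positive" and all other classes as "negative". The following is a minimal sketch in plain Python under that one-vs-rest assumption; the function name `per_class_metrics` and the 3-class `CONFUSION` matrix are hypothetical and are not taken from the paper's data. ROC area needs ranked classifier scores rather than hard labels, so it is not computed here.

```python
def per_class_metrics(confusion, cls):
    """One-vs-rest metrics for class index `cls`.

    confusion[i][j] counts items of true class i that were labelled as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    tp = confusion[cls][cls]
    fn = sum(confusion[cls]) - tp                               # class items labelled otherwise
    fp = sum(confusion[i][cls] for i in range(n_classes)) - tp  # other items labelled as cls
    tn = total - tp - fn - fp

    def ratio(numerator, denominator):
        # Guard against empty classes or classes that are never predicted.
        return numerator / denominator if denominator else 0.0

    recall = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    return {
        "tp_rate": recall,                 # TP rate and recall are the same quantity
        "fp_rate": ratio(fp, fp + tn),
        "precision": precision,
        "recall": recall,
        "f_measure": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted).
CONFUSION = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]

metrics = per_class_metrics(CONFUSION, 0)
print(metrics)
```

Because TP rate and recall are the same ratio, the TP Rate and Recall values reported for each class coincide up to rounding. The harmonic-mean definition can also be checked against the reported numbers: for class 23 with Random Forest, precision 1 and recall 0.9 give 2 × 1 × 0.9 / 1.9 ≈ 0.947, which matches the tabulated F-measure.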
TP Rate
Class   NB      Log     MP      SMO     RF
1       0.525   1.000   0.950   0.675   1
2       0.750   1.000   0.975   0.675   1
3       0.850   1.000   1.000   0.875   0.975
4       0.775   0.975   0.975   0.800   1
5       0.900   1.000   1.000   0.975   1
6       0.825   1.000   0.975   0.800   1
7       0.900   1.000   1.000   0.950   1
8       0.775   1.000   0.925   0.700   1
9       0.850   0.975   0.825   0.675   1
10      0.800   1.000   1.000   0.925   1
11      0.725   1.000   0.975   0.775   1
12      0.750   1.000   0.950   0.825   1
13      0.800   1.000   1.000   0.900   1
14      0.650   0.975   0.975   0.850   1
15      0.725   0.975   1.000   0.850   0.975
16      0.850   1.000   0.975   0.800   0.975
17      0.975   0.975   1.000   0.800   1
18      0.900   0.975   1.000   0.975   1
19      0.600   0.975   0.975   0.800   1
20      1.000   1.000   1.000   1.000   1
21      0.250   1.000   0.925   0.725   0.975
22      0.750   1.000   0.975   0.900   1
23      0.525   1.000   0.950   0.775   0.9
24      1.000   1.000   1.000   0.975   1
25      0.850   1.000   1.000   0.975   1
FIGURE 1.1: Values of TP Rate of each class for different classification methods.
FIGURE 1.2: Graphical representation of TP Rate values.

FP Rate
Class   NB      Log     MP      SMO     RF
1       0.013   0.000   0.000   0.01    0.001
2       0.019   0.000   0.002   0.01    0
3       0.000   0.000   0.001   0       0.001
4       0.013   0.001   0.003   0.01    0
5       0.010   0.000   0.000   0       0.002
6       0.006   0.000   0.003   0       0
7       0.002   0.000   0.000   0       0
8       0.004   0.000   0.001   0.01    0
9       0.054   0.001   0.001   0.02    0
10      0.006   0.000   0.002   0.01    0.002
11      0.007   0.000   0.004   0.01    0.001
12      0.013   0.000   0.001   0.01    0
13      0.003   0.000   0.001   0       0
14      0.015   0.000   0.000   0.02    0
15      0.011   0.000   0.001   0       0.001
16      0.018   0.000   0.000   0       0
17      0.002   0.001   0.001   0.01    0
18      0.000   0.001   0.000   0       0
19      0.004   0.000   0.002   0.02    0
20      0.000   0.000   0.000   0       0
21      0.017   0.000   0.003   0.02    0
22      0.013   0.000   0.000   0.01    0
23      0.008   0.000   0.001   0.01    0
24      0.000   0.000   0.000   0       0
25      0.000   0.000   0.000   0       0
FIGURE 2.1: Values of FP Rate of each class for different classification methods.
FIGURE 2.2: Graphical representation of FP Rate values.

Precision
Class   NB      Log     MP      SMO     RF
1       0.636   1.000   1.000   0.82    0.976
2       0.625   1.000   0.951   0.73    1
3       1.000   1.000   0.976   1       0.975
4       0.721   0.974   0.929   0.82    1
5       0.783   1.000   1.000   0.91    0.952
6       0.846   1.000   0.929   0.97    1
7       0.947   1.000   1.000   0.97    1
8       0.886   1.000   0.974   0.85    1
9       0.395   0.983   0.971   0.54    1
10      0.842   1.000   0.952   0.76    0.952
11      0.806   1.000   0.907   0.76    0.976
12      0.714   1.000   0.974   0.83    1
13      0.914   1.000   0.976   0.95    1
14      0.650   0.994   1.000   0.68    1
15      0.725   0.992   0.976   0.97    0.975
16      0.667   1.000   1.000   0.91    1
17      0.951   0.978   0.976   0.82    1
18      1.000   0.984   1.000   0.95    1
19      0.857   0.993   0.951   0.7     1
20      1.000   1.000   1.000   1       1
21      0.385   1.000   0.925   0.66    1
22      0.714   1.000   1.000   0.86    1
23      0.724   1.000   0.974   0.82    1
24      1.000   1.000   1.000   0.98    1
25      1.000   1.000   1.000   0.98    1
FIGURE 3.1: Values of Precision of each class for different classification methods.
FIGURE 3.2: Graphical representation of Precision values.

Recall
Class   NB      Log     MP      SMO     RF
1       0.525   1.000   0.950   0.68    1
2       0.750   1.000   0.975   0.68    1
3       0.850   1.000   1.000   0.88    0.975
4       0.775   0.975   0.975   0.8     1
5       0.900   1.000   1.000   0.98    1
6       0.825   1.000   0.975   0.8     1
7       0.900   1.000   1.000   0.95    1
8       0.775   1.000   0.925   0.7     1
9       0.850   0.975   0.825   0.68    1
10      0.800   1.000   1.000   0.93    1
11      0.725   1.000   0.975   0.78    1
12      0.750   1.000   0.950   0.83    1
13      0.800   1.000   1.000   0.9     1
14      0.650   0.975   0.975   0.85    1
15      0.725   0.975   1.000   0.85    0.975
16      0.850   1.000   0.975   0.8     0.975
17      0.975   0.975   1.000   0.8     1
18      0.900   0.975   1.000   0.98    1
19      0.600   0.975   0.975   0.8     1
20      1.000   1.000   1.000   1       1
21      0.250   1.000   0.925   0.73    0.975
22      0.750   1.000   0.975   0.9     1
23      0.525   1.000   0.950   0.78    0.9
24      1.000   1.000   1.000   0.98    1
25      0.850   1.000   1.000   0.98    1
FIGURE 4.1: Values of Recall of each class for different classification methods.
FIGURE 4.2: Graphical representation of values of Recall.

F-measure
Class   NB      Log     MP      SMO     RF
1       0.575   1.000   0.974   0.74    0.988
2       0.682   1.000   0.963   0.7     1
3       0.919   1.000   0.988   0.93    0.975
4       0.747   0.976   0.951   0.81    1
5       0.837   1.000   1.000   0.94    0.976
6       0.835   1.000   0.951   0.88    1
7       0.923   1.000   1.000   0.96    1
8       0.827   1.000   0.949   0.77    1
9       0.540   0.979   0.892   0.6     1
10      0.821   1.000   0.976   0.83    0.976
11      0.763   1.000   0.940   0.77    0.988
12      0.732   1.000   0.962   0.83    1
13      0.853   1.000   0.988   0.92    1
14      0.650   0.982   0.987   0.76    1
15      0.725   0.986   0.988   0.91    0.975
16      0.747   1.000   0.987   0.85    0.987
17      0.963   0.976   0.988   0.81    1
18      0.947   0.993   1.000   0.96    1
19      0.706   0.981   0.963   0.74    1
20      1.000   1.000   1.000   1       1
21      0.303   1.000   0.925   0.69    0.987
22      0.732   1.000   0.987   0.88    1
23      0.609   1.000   0.962   0.8     0.947
24      1.000   1.000   1.000   0.98    1
25      0.919   1.000   1.000   0.98    1
FIGURE 5.1: Values of F-measure of each class for different classification methods.
FIGURE 5.2: Graphical representation of F-measure values.

ROC Area
Class   NB      Log     MP      SMO     RF
1       0.970   1.000   0.974   0.97    1
2       0.979   1.000   0.996   0.98    1
3       0.997   1.000   1.000   1       1
4       0.984   1.000   0.999   0.99    1
5       0.995   1.000   1.000   1       1
6       0.982   1.000   0.996   0.98    1
7       0.999   1.000   1.000   1       1
8       0.991   1.000   0.985   0.98    1
9       0.977   1.000   0.964   0.97    1
10      0.995   1.000   1.000   0.99    1
11      0.983   1.000   0.999   0.98    1
12      0.986   1.000   0.995   0.99    1
13      0.996   1.000   1.000   1       1
14      0.985   1.000   0.999   0.98    1
15      0.984   1.000   1.000   0.99    1
16      0.990   1.000   0.997   0.98    1
17      0.998   1.000   1.000   0.99    1
18      1.000   1.000   1.000   1       1
19      0.978   1.000   0.998   0.98    1
20      1.000   1.000   1.000   1       1
21      0.930   1.000   0.993   0.98    1
22      0.974   1.000   0.997   0.99    1
23      0.964   1.000   0.978   0.98    1
24      1.000   1.000   1.000   1       1
25      1.000   1.000   1.000   1       1
FIGURE 6.1: Values of ROC Area of each class for different classification methods.
FIGURE 6.2: Graphical representation of ROC Area values.

The following tables and diagrams pertain to the second set of experiments, i.e. with a cross validation factor of 10 in each case.

TP Rate with cross validation 10
Class   MP CV10   NB CV10   Log CV10   RF CV10   SMO CV10
1       0.725     0.5       0.725      0.675     0.525
2       0.825     0.675     0.775      0.675     0.625
3       0.925     0.775     0.975      0.875     0.825
4       0.825     0.775     0.875      0.7       0.725
5       0.900     0.9       0.925      0.8       0.9
6       0.750     0.825     0.8        0.725     0.775
7       0.950     0.85      0.975      0.925     0.925
8       0.825     0.725     0.9        0.75      0.575
9       0.625     0.85      0.725      0.65      0.675
10      0.925     0.775     0.925      0.9       0.825
11      0.875     0.7       0.9        0.85      0.75
12      0.825     0.7       0.825      0.75      0.775
13      0.975     0.775     0.975      0.825     0.85
14      0.800     0.575     0.85       0.7       0.8
15      0.925     0.675     0.9        0.825     0.85
16      0.825     0.825     0.85       0.7       0.75
17      0.850     0.95      0.975      0.925     0.7
18      0.975     0.9       0.975      0.975     0.95
19      0.875     0.575     0.9        0.675     0.725
20      1.000     1         1          1         1
21      0.675     0.225     0.75       0.45      0.575
22      0.850     0.75      0.9        0.775     0.825
23      0.775     0.425     0.8        0.625     0.7
24      0.975     1         1          0.95      0.975
25      0.925     0.825     1          0.925     0.975
FIGURE 7.1: Values of TP Rate of each class for different classification methods with cross validation 10.
FIGURE 7.2: Graphical representation of TP Rate values with Cross Validation.
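The cross-validation protocol behind the second set of experiments can be sketched as follows. This is an illustrative plain-Python version rather than the WEKA procedure the authors ran: the helpers `k_fold_indices` and `cross_validate`, the nearest-class-mean stand-in classifier and the one-dimensional toy data are all assumptions, and the folds are formed by a simple shuffle without stratification by class.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(samples, labels, fit, predict, k=10, seed=0):
    """Fraction of samples classified correctly while held out of training."""
    folds = k_fold_indices(len(samples), k, seed)
    correct = 0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(samples)) if i not in held]
        # Train on k - 1 folds, then score only the held-out fold.
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        correct += sum(predict(model, samples[i]) == labels[i] for i in held_out)
    return correct / len(samples)


# Hypothetical stand-in classifier: nearest class mean on a single feature.
def fit_means(xs, ys):
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_nearest(means, x):
    return min(means, key=lambda y: abs(means[y] - x))


# Toy data: two well-separated classes with 20 samples each.
xs = [i * 0.1 for i in range(20)] + [10 + i * 0.1 for i in range(20)]
ys = ["a"] * 20 + ["b"] * 20

folds = k_fold_indices(len(xs), k=10)
accuracy = cross_validate(xs, ys, fit_means, predict_nearest, k=10)
print(f"10-fold accuracy on the toy data: {accuracy:.3f}")
```

Every sample is scored exactly once, by a model that never saw it during training, which is why cross-validated figures are usually lower than figures measured on the data used for training.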
  • 13. FIGURE 8.2: Graphical Representation of FP Rate values with cross validation 10.
FIGURE 9.2: Graphical Representation of Precision values with cross validation 10.
FIGURE 10.2: Graphical Representation of Recall values with cross validation 10.
  • 14. FIGURE 11.2: Graphical Representation of F-measure values with cross validation 10.
FIGURE 12.2: Graphical Representation of ROC Area values with cross validation 10.

The overall accuracy of each algorithm, considering all classes, is depicted below.

Classifier     Log     MP      NB      RF      SMO
Accuracy (%)   99.7    97.3    77.2    99.2    83.9
FIGURE 13.1: Overall accuracy values of all classes.

5. CONCLUSION
Our comparative study provides a comprehensive analysis of Haralick features and their use in well-known classification models. From the analysis, we can see that the Logistic classifier performs extremely well under all parameters, which is reflected in the combined accuracy values. It has 99.7 percent accuracy for the trained parameters across all the classes. The Random Forest Classifier performs second best among the classifiers. It successfully predicted all the values for most of the classes. Naïve Bayes performs the worst, especially with certain classes, which brings down the total accuracy achieved.

On applying cross validation with a factor of 10, we see that the accuracy decreases across all the classifiers. The different classifiers perform similarly with respect to each other as they did without cross validation. However, it can be seen that the Multilayer Perceptron Classifier performs slightly better than the Random Forest Classifier in this case.
  • 15. Apart from Naïve Bayes, all other methods had an accuracy of over 80 percent. The Logistic and Random Forest classifiers scored above 99 percent in accuracy. This demonstrates the power of Haralick features and their efficiency in image classification using standard classification models.

6. REFERENCES
[1] Timo Ojala, Matti Pietikainen and David Harwood; A comparative study of texture measures with classification based on feature distributions.
[2] M. Pietikainen, T. Ojala, Z. Xu; Rotation-invariant texture classification using feature distributions.
[3] Eizan Miyamoto and Thomas Merryman Jr; Fast Calculation of Haralick Texture Features.
[4] Frank, J. (1990); Quart. Rev. Biophys. 23, 281-329.
[5] Baharak Goli and Geetha Govindan; WEKA – A powerful free software for implementing Bio-inspired Algorithms; State Inter University Centre of Excellence in Bioinformatics, University of Kerala.
[6] Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, Ian H. Witten (2009); The WEKA Data Mining Software: An Update; SIGKDD Explorations, Volume 11, Issue 1.
[7] Tom M. Mitchell; Machine Learning; McGraw Hill, 2010.
[8] David A. Freedman; Statistical Models: Theory and Practice; Cambridge University Press, 2009, p. 128.
[9] Sankar K. Pal, Sushmita Mitra; Multilayer Perceptron, Fuzzy Sets and Classification; IEEE Transactions on Neural Networks, Vol 3, September 1992.
[10] Leo Breiman; "Random Forests"; Machine Learning 45 (1); 2001.
[11] John Platt; Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines; 1998.