IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163
__________________________________________________________________________________________
Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 75
IMPLEMENTATION OF CONTENT-BASED IMAGE RETRIEVAL
USING THE CFSD ALGORITHM
Nagaraja. G. S¹, Samir Sheriff², Raunaq Kumar³
¹Associate Professor, ²,³VII Semester B.E., Dept of Computer Sci. Eng., R.V. College of Engineering, Bangalore, India
nagarajags@rvce.edu.in, samiriff@gmail.com, devilronny@gmail.com
Abstract
Content-based image retrieval (CBIR) is the application of multimedia-processing techniques to the image retrieval problem, that is,
the problem of searching for digital images in large databases. There is growing interest in CBIR because of the limitations inherent
in metadata-based systems. The most common method for comparing two images in CBIR is to use an image distance measure based
on color, texture, shape, or other features. In this paper, we describe an object-oriented graphical implementation of a system, backed by a
MySQL database, that computes distance measures using the CFSD algorithm, based on color similarity.
--------------------------------------------------------------------*****-----------------------------------------------------------------------
1. INTRODUCTION
"Content-based" means that the search will analyze the actual
contents of the image rather than metadata such as
keywords, tags, or descriptions associated with the image.
The term "content" in this context might refer to colors,
shapes, textures, or any other information that can be derived
from the image itself.
The system we have implemented uses color-based feature
extraction. Computing distance measures based on color
similarity is achieved by computing a color histogram for each
image that identifies the proportion of pixels within an image
holding specific values. Examining images based on the colors
they contain is one of the most widely used techniques
because it does not depend on image size or orientation.
Our program allows the user to choose between two algorithms:
- Color Histogram Method: In image processing and photography, a color histogram is a representation of the distribution of colors in an image. For digital images, a color histogram represents the number of pixels that have colors in each of a fixed list of color ranges that span the image's color space, the set of all possible colors.
- Color Frequency Sequence Difference (CFSD) Method: The high dimensionality of the feature vectors of the Color Histogram method results in high computation and space costs. CFSD expresses the color image in terms of numerical values for each color channel, greatly reducing comparison time and computation cost.
Fig1. Content-Based Image Retrieval System
2. THE COLOR HISTOGRAM METHOD
1) Color Quantization: For every image in the database, colors in the RGB model are quantized to make later computations easier. Color quantization reduces the number of distinct colors used in an image. In our application, the user can select the quantization bucket size for red, green and blue separately.
2) Compute Histogram: For every quantized image, calculate a color histogram, which is a frequency distribution of the quantized RGB values of the pixels of the image. In our application, we have used hashmaps to store histogram values as <string, double> pairs.
3) Query Image Feature Extraction: Compute the quantized color values and the resulting histogram of the query image.
4) Compute Distance Measure: Compare the histogram of the query image with the histograms of every image of the retrieval system. Let h and g represent two color histograms. If $n = 2^B - 1$, where B is the maximum number of bits used for representing the quantized color components a, b and c, then the Euclidean distance between the color histograms h and g can be computed as:

$$\sum_{a=0}^{n} \sum_{b=0}^{n} \sum_{c=0}^{n} \left| h(a, b, c) - g(a, b, c) \right| \qquad (1)$$

In this distance formula, only identical bins in the respective histograms are compared. Two different bins may represent perceptually similar colors but are not compared cross-wise. All bins contribute equally to the distance.
5) Find Similar Images: The similarity between two images is inversely proportional to the (color histogram) distance between them.
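The five steps above can be sketched in Java roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the class and method names are hypothetical, the user setting is interpreted as a number of bins per channel, histogram values are stored as pixel proportions, and the distance follows the printed form of (1), that is, a sum of absolute differences over identical bins.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch of the Color Histogram method (hypothetical names, not the authors' code). */
final class ColorHistogramSketch {

    /** Step 1: maps a channel value in 0..255 to a bin index in 0..bins-1 (assumed uniform bins). */
    static int quantize(int value, int bins) {
        return value * bins / 256;
    }

    /**
     * Steps 2 and 3: builds a histogram from packed 0xRRGGBB pixels.
     * Keys are strings such as "3,0,1"; values are proportions of pixels (an assumption).
     */
    static Map<String, Double> histogram(int[] rgbPixels, int rBins, int gBins, int bBins) {
        Map<String, Double> hist = new HashMap<>();
        if (rgbPixels.length == 0) {
            return hist;
        }
        double weight = 1.0 / rgbPixels.length;
        for (int p : rgbPixels) {
            int r = quantize((p >> 16) & 0xFF, rBins);
            int g = quantize((p >> 8) & 0xFF, gBins);
            int b = quantize(p & 0xFF, bBins);
            hist.merge(r + "," + g + "," + b, weight, Double::sum);
        }
        return hist;
    }

    /** Step 4: sum of absolute differences over identical bins; a missing bin counts as 0. */
    static double distance(Map<String, Double> h, Map<String, Double> g) {
        Set<String> bins = new HashSet<>(h.keySet());
        bins.addAll(g.keySet());
        double sum = 0.0;
        for (String bin : bins) {
            sum += Math.abs(h.getOrDefault(bin, 0.0) - g.getOrDefault(bin, 0.0));
        }
        return sum;
    }

    public static void main(String[] args) {
        // Two tiny "images": red and blue pixels in different proportions.
        int[] query = {0xFF0000, 0xFF0000, 0x0000FF, 0x0000FF};
        int[] candidate = {0xFF0000, 0xFF0000, 0xFF0000, 0x0000FF};
        Map<String, Double> hq = histogram(query, 4, 4, 4);
        Map<String, Double> hc = histogram(candidate, 4, 4, 4);
        System.out.println("distance = " + distance(hq, hc)); // prints "distance = 0.5"
    }
}
```

Step 5 then amounts to sorting the database images by increasing distance to the query. Because only identical bins are compared, two perceptually close colors that fall into neighbouring bins contribute as much to the distance as two unrelated colors.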
3. THE COLOR FREQUENCY SEQUENCE DIFFERENCE METHOD
1) Color Space Conversion: Translate the representation of all colors in each image from the RGB space to the HSV space. The HSV color space has two distinct characteristics: the lightness (value) component is independent of the color information of the image, and the hue and saturation components correlate with the way humans perceive color.
2) Color Quantization: For every image in the database, colors in the HSV model are quantized to make later computations easier. Color quantization reduces the number of distinct colors used in an image. In our application, the user can select the quantization bucket size for each color component separately.
3) Compute Histogram (Sequence): For every quantized image, a color histogram is calculated, which is a frequency distribution of the quantized HSV values of the pixels of the image. In our application, we have used treemaps to store histogram values as <string, double> pairs. The values are ordered alphabetically by their HSV strings. For instance, the (H, S, V) string (0, 0, 0) is followed by (0, 0, 1).
4) Compute Sorted Histogram (Frequency): Take the color histogram of each image and sort the keys in descending order of frequency values. Each image now has two histograms:
- One sorted by color sequence. Call this P.
- One sorted by color frequency. Call this Q.
However, in our application, we maintain only the sorted histogram, as the other histogram can be derived implicitly.
5) Query Image Feature Extraction: Convert and quantize the color space of the query image. Then, compute the sequence histogram as well as the frequency histogram of the query image.
6) Compute SFD: In this algorithm, SFD is an alias for the distance measure, which is calculated as follows for each component of the HSV model:
For every color combination x in histogram P,

$$sfd = \sum_{i=1}^{N} w(i)\, h(i) \qquad (2)$$

where
- i is the index of color combination x in histogram Q and i' is the index of x in histogram P
- $w(i) = \frac{1}{|i - i'| + 1}$; w is a function of the difference between the indices i and i'
- h is a function of the frequency of color combination x in the image
7) Compute Distance with SFD (Difference): Let a and b represent two images which have SFD values for each component of the HSV model. The SFD distance between a and b can be computed as:

$$D_{sfd}(a, b) = \frac{sfd_H(a, b) + sfd_S(a, b) + sfd_V(a, b)}{3} \qquad (3)$$

where
- $sfd_T(a, b) = a_{sfd_T} - b_{sfd_T}$, which is the distance between the corresponding SFD values of component T of the HSV model of the two images a and b
- $T \in \{H, S, V\}$
- $a_{sfd_T}$ is the SFD value of component T of the HSV model representing image a
Fig2. SFD Calculation for one component of a given image
8) Find Similar Images: Compare the SFDs of each HSV component of the query image with the corresponding SFDs of every image of the retrieval system. The similarity between two images is inversely proportional to the SFD distance between them.
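The CFSD steps above can be sketched in Java along the following lines. This is a hedged illustration, not the authors' implementation: the class and method names are hypothetical, per-channel histograms are keyed by integer bin rather than by string, h(i) in (2) is assumed to be the pixel proportion of the bin, and the per-channel difference in (3) is taken in absolute value so that the distance cannot be negative. The RGB to HSV conversion uses the standard `java.awt.Color.RGBtoHSB` method, which returns each component in the range 0 to 1.

```java
import java.awt.Color;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Illustrative sketch of the CFSD steps (hypothetical names, not the authors' code). */
final class CfsdSketch {

    /** Steps 1 to 3: one histogram per channel (H, S, V) of quantized values, as pixel proportions. */
    static List<TreeMap<Integer, Double>> channelHistograms(int[] rgbPixels, int[] bins) {
        List<TreeMap<Integer, Double>> hists = new ArrayList<>();
        for (int t = 0; t < 3; t++) {
            hists.add(new TreeMap<>());
        }
        float[] hsv = new float[3];
        double weight = 1.0 / rgbPixels.length;
        for (int p : rgbPixels) {
            Color.RGBtoHSB((p >> 16) & 0xFF, (p >> 8) & 0xFF, p & 0xFF, hsv);
            for (int t = 0; t < 3; t++) {
                // A component equal to 1.0 would index one past the last bin, so clamp it.
                int bin = Math.min((int) (hsv[t] * bins[t]), bins[t] - 1);
                hists.get(t).merge(bin, weight, Double::sum);
            }
        }
        return hists;
    }

    /** Steps 4 and 6: SFD of one channel as in (2), with h taken as the bin frequency (assumption). */
    static double sfd(TreeMap<Integer, Double> p) {
        // P: keys in sequence order (the TreeMap iteration order).
        List<Integer> sequence = new ArrayList<>(p.keySet());
        // Q: the same keys in descending order of frequency; the sort is stable, so ties keep sequence order.
        List<Integer> byFrequency = new ArrayList<>(sequence);
        byFrequency.sort((x, y) -> Double.compare(p.get(y), p.get(x)));
        double sum = 0.0;
        for (int i = 0; i < byFrequency.size(); i++) {
            int x = byFrequency.get(i);
            int iPrime = sequence.indexOf(x);
            double w = 1.0 / (Math.abs(i - iPrime) + 1);
            sum += w * p.get(x);
        }
        return sum;
    }

    /** The three SFD values (H, S, V) that represent one image. */
    static double[] features(int[] rgbPixels, int[] bins) {
        List<TreeMap<Integer, Double>> hists = channelHistograms(rgbPixels, bins);
        return new double[] {sfd(hists.get(0)), sfd(hists.get(1)), sfd(hists.get(2))};
    }

    /** Step 7: mean of the per-channel SFD differences as in (3), using absolute values (assumption). */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int t = 0; t < 3; t++) {
            sum += Math.abs(a[t] - b[t]);
        }
        return sum / 3.0;
    }
}
```

A caller would compute `features` once per image and then rank database images by `distance` to the query's features. Each comparison touches three numbers rather than a full histogram, which is where the speed advantage described in the text comes from.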
4. OBJECT-ORIENTED DESIGN
The Java programming language has been used to create our Content-Based Image Retrieval System, with MySQL for the back-end. For displaying graphical objects, we have used the ACM package (created by the Java Task Force (JTF)), which provides a set of classes that support the creation of simple, object-oriented graphical displays. The object-oriented design of our program is described below.
Fig3. Finding Similar Images based on SFD Distance. The two images on top are similar since their sfd values are quite close. This similarity cannot be detected by the normal color histogram method.
A. Class Design
We have segregated the code into two packages for easy management:
- retriever - This package contains classes that provide methods to read a database of images and store status information. The implementation of the algorithms mentioned in the previous sections can be found in this package.
- gui - As the name suggests, this package contains all the code necessary for the graphical user interface of our retrieval system. It makes use of methods provided by the ACM library as well as the Swing library.
Now that you know the organization of the code, let us dive deeper into the actual object-oriented design. The retriever package consists of the following classes:
1) ColorFrequency - This class is used while storing color histograms in hashmaps/treemaps, to store <value, frequency> tuples for different color combinations. It implements the Comparable interface so that the Collections.sort() method provided by the java.util package can be used to sort histograms in decreasing order of frequency, as required by the CFSD algorithm.
2) ImageData - This class keeps track of the status of a single image of the database. It stores information and provides methods to:
- store the two-dimensional representation of an image using the GImage class of the ACM package
- store the path of this image
- quantize every color model component of a given pixel of this image
- create color histograms with ColorFrequency mappings for individual components as well as a combination of all components of the color space of this image
- calculate SFD values for every component of the color space of this image
- establish or disable a connection with a database and retrieve status information if feature values have already been calculated for this image during a previous run of the program. This is done to prevent unnecessary computation of SFD values whenever the program is run repeatedly
3) ImageCollection - This class keeps track of the ImageData of a number of images in the database. In essence, an ImageCollection object is a collection of ImageData objects. It provides methods to:
- initialize an ImageData collection with images from a specified directory, which is traversed to a folder depth of 1
- find images that are most similar to a query image, computed based on the Color Histogram method and Euclidean distance
formula. A list of images, sorted in decreasing order of similarity, is returned
- find images that are most similar to a query image, computed based on the CFSD method and SFD distance formula. A list of images, sorted in decreasing order of similarity, is returned
4) UserVariables - This class only maintains the values of all the variables that the user can set in the GUI window. The ImageData and ImageCollection objects read these values during computation. It provides methods to get and set:
- the number of red, green and blue bins for color quantization
- the number of similar images to be returned by ImageCollection
- a boolean variable denoting whether the Color Histogram algorithm or the CFSD algorithm should be used
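To make the role of the ColorFrequency class more concrete, the following is a minimal sketch of a <value, frequency> tuple whose natural ordering is decreasing frequency, so that `Collections.sort()` yields the frequency-sorted histogram. The class name `ColorFrequencySketch`, its fields, and the tie-breaking rule are assumptions made for illustration; the actual class may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

/** Hypothetical <value, frequency> tuple ordered by decreasing frequency. */
final class ColorFrequencySketch implements Comparable<ColorFrequencySketch> {
    final String value;      // color combination, for example "0,0,1"
    final double frequency;  // proportion (or count) of pixels with that color

    ColorFrequencySketch(String value, double frequency) {
        this.value = value;
        this.frequency = frequency;
    }

    /** Higher frequency sorts first; ties fall back to the color string (an assumption). */
    @Override
    public int compareTo(ColorFrequencySketch other) {
        int byFrequency = Double.compare(other.frequency, this.frequency);
        return byFrequency != 0 ? byFrequency : this.value.compareTo(other.value);
    }

    /** Kept consistent with compareTo, as the Comparable contract recommends. */
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ColorFrequencySketch)) {
            return false;
        }
        ColorFrequencySketch c = (ColorFrequencySketch) o;
        return value.equals(c.value) && Double.compare(frequency, c.frequency) == 0;
    }

    @Override
    public int hashCode() {
        return Objects.hash(value, frequency);
    }

    /** Returns a frequency-sorted copy, leaving the sequence-ordered input untouched. */
    static List<ColorFrequencySketch> sortedByFrequency(List<ColorFrequencySketch> histogram) {
        List<ColorFrequencySketch> copy = new ArrayList<>(histogram);
        Collections.sort(copy);
        return copy;
    }
}
```

Sorting a copy keeps both orderings available at once, which is convenient when the index of a color in the sequence histogram has to be compared with its index in the frequency histogram.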
The gui package consists of only one class, namely:
1) MainGraphics - This class uses the widgets provided by the ACM graphics package and the Java Swing library to create a user-friendly interface for our Image Retrieval System. It creates a window consisting of a white area where the query image selected by the user and the three most similar images are displayed, in their normal forms as well as their quantized forms.
The northern portion of the window consists of:
- Two buttons - One to open a query image and another to start the execution of the selected algorithm, which is the Color Histogram algorithm by default.
- Three sliders - Each can be varied from 1 to 60. The sliders are used to choose the bin sizes of each component of the RGB (HSV) model for the Color Histogram (CFSD) algorithm.
- One checkbox - If checked, the CFSD algorithm is run to find similar images. Otherwise, the Color Histogram method is used.
For each of these widgets, the corresponding variables of the UserVariables class are set.
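The way widget values could flow into such a settings holder is sketched below. This is only an assumption-laden illustration: the class name `UserVariablesSketch`, its method names, and the initial bin values are hypothetical, while the 1 to 60 slider range, the three displayed results, and the Color Histogram default come from the description above.

```java
/** Hypothetical settings holder in the spirit of the UserVariables class described above. */
final class UserVariablesSketch {
    static final int MIN_BINS = 1;   // slider minimum stated in the text
    static final int MAX_BINS = 60;  // slider maximum stated in the text

    // One entry per component: R, G, B for the Color Histogram method, H, S, V for CFSD.
    // The initial value is a placeholder, not taken from the application.
    private final int[] bins = {MIN_BINS, MIN_BINS, MIN_BINS};
    private int resultCount = 3;      // three similar images are displayed
    private boolean useCfsd = false;  // Color Histogram is the default algorithm

    /** Stores a slider value for component 0..2, clamped to the slider range. */
    void setBins(int component, int value) {
        bins[component] = Math.max(MIN_BINS, Math.min(MAX_BINS, value));
    }

    int getBins(int component) {
        return bins[component];
    }

    /** Mirrors the checkbox: true selects CFSD, false selects the Color Histogram method. */
    void setUseCfsd(boolean checked) {
        useCfsd = checked;
    }

    boolean isUseCfsd() {
        return useCfsd;
    }

    void setResultCount(int count) {
        resultCount = Math.max(1, count);
    }

    int getResultCount() {
        return resultCount;
    }
}
```

In a Swing front end, a slider's change listener would typically call `setBins` with the slider's current value, and the checkbox's listener would call `setUseCfsd`, so the retrieval classes only ever read from this one object.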
B. Schema Design
Since comparisons in the CFSD algorithm are done using a few SFD values instead of comparing entire histograms as in the normal color histogram method, the CFSD algorithm can be made to operate more efficiently by calculating the SFD values of an image only once (when it is encountered for the very first time) and storing them in a database which can be accessed for all future references. Please note that whenever we say red, green or blue in the CFSD algorithm, we actually refer to the H, S or V values of the image respectively. This was done for compatibility.
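The compute-once idea can be illustrated with a small cache. In this sketch a `HashMap` stands in for the MySQL database, and the class name, method names, and the choice to key entries by image path plus quantization settings are all assumptions made for the example rather than details taken from the application.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical compute-once store for SFD values; a HashMap stands in for the database. */
final class SfdCacheSketch {
    private final Map<String, double[]> store = new HashMap<>();
    private int computations = 0;

    /**
     * Returns the SFD triple for an image, computing it only the first time the image is
     * seen with the given bin settings. The compute function is supplied by the caller.
     */
    double[] sfdFor(String imagePath, int[] bins, Function<String, double[]> compute) {
        // Assumption: values computed under one quantization setting are not reused for another.
        String key = imagePath + "|" + bins[0] + "," + bins[1] + "," + bins[2];
        return store.computeIfAbsent(key, k -> {
            computations++;
            return compute.apply(imagePath);
        });
    }

    /** Number of times the SFD values actually had to be computed. */
    int computations() {
        return computations;
    }
}
```

With this arrangement, a second lookup of the same image under the same settings returns the stored values without touching the pixels again, which is the saving the paragraph above describes.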
The database that we constructed for our system consists of two tables:
1) imagedata - It consists of the following attributes:
- Image ID - The primary key of this table. It is an auto-incremented variable
- Image Path - A string with a variable number of characters that stores the path of the given image, relative to the directory of the application
- redSFD - The SFD calculated using the red (H) histogram of the image
- greenSFD - The SFD calculated using the green (S) histogram of the image
- blueSFD - The SFD calculated using the blue (V) histogram of the image
2) quantizedvalues - It consists of the following attributes:
- Image ID - The foreign key of this table that references the primary key of the imagedata table
- qRed - The value of the bin used to quantize red (H)
- qGreen - The value of the bin used to quantize green (S)
- qBlue - The value of the bin used to quantize blue (V)
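One possible rendering of these two tables as MySQL statements, held as Java constants and created through JDBC, is sketched below. The table names and the SFD and bin column names follow the description above; the column names `imageId` and `imagePath`, every column type, and the class and method names are assumptions for illustration only.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical DDL for the two tables described above; all column types are assumptions. */
final class SchemaSketch {
    static final String CREATE_IMAGEDATA =
        "CREATE TABLE IF NOT EXISTS imagedata ("
        + " imageId INT AUTO_INCREMENT PRIMARY KEY,"
        + " imagePath VARCHAR(255) NOT NULL,"
        + " redSFD DOUBLE,"     // SFD of the H histogram
        + " greenSFD DOUBLE,"   // SFD of the S histogram
        + " blueSFD DOUBLE"     // SFD of the V histogram
        + ")";

    static final String CREATE_QUANTIZEDVALUES =
        "CREATE TABLE IF NOT EXISTS quantizedvalues ("
        + " imageId INT NOT NULL,"
        + " qRed INT,"          // bin value used for H
        + " qGreen INT,"        // bin value used for S
        + " qBlue INT,"         // bin value used for V
        + " FOREIGN KEY (imageId) REFERENCES imagedata(imageId)"
        + ")";

    /** Creates both tables on an already opened connection; imagedata must come first. */
    static void createTables(Connection connection) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.executeUpdate(CREATE_IMAGEDATA);
            statement.executeUpdate(CREATE_QUANTIZEDVALUES);
        }
    }
}
```

Storing the bin values next to the SFD values makes it possible to tell whether the stored SFDs were computed under the quantization settings currently selected in the GUI.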
C. Summary and Results
In order to test our image retrieval system, we have provided a sample set of images, organized in seven folders with five images each. Feel free to add more folders if necessary, but ensure that each folder contains exactly five images at any given instant of time. Using our sample database, it was found that this application could retrieve at least three images similar to a supplied query image chosen from the database itself. The speed of retrieval varied depending on the algorithm selected, with the CFSD algorithm running faster than the naive Color Histogram algorithm.
5. FUNCTIONALITY
First ensure that all the above requirements are met. Then, to start the application, go to the installation folder and click on the file named CBIR.jar. In the new window that pops up, select the appropriate query image using the open button. After selecting the appropriate number of bins for quantization of each component of the color space, and checking/unchecking the CFSD box, click on the Retrieve Similar Images button to retrieve the three most similar images, which will be displayed in the lower portion of the window. Figures 4, 5 and 6 show the state of the window at different instances of time.
Fig4. Application after the query image has been selected by the user
Fig5 Application after images similar to the query image have
been retrieved using the CFSD algorithm
Fig6 Application after images similar to the query image have
been retrieved using the Color Histogram algorithm
CONCLUSIONS
This paper briefly discussed the software implementation of a
Content-Based Image Retrieval System using the CFSD
algorithm as well as a naive Color Histogram method. This
software tool was implemented in Java, relying heavily on
object-oriented concepts, without which code management
would have been considerably harder. We also found that the
naive Color Histogram algorithm is not as fast
as the CFSD algorithm. In fact, the CFSD algorithm, coupled
with a database, forms a highly efficient system which
retrieves up to three similar images quite accurately.
However, it was found that the accuracy diminished when
more than three similar images were sought. This leads us to
conclude that we should also consider other features, such as
shape, while comparing images in a content-based
retrieval system.
ACKNOWLEDGMENTS
The authors wish to express their gratitude to Dr.
Nagaraja G. S., who offered invaluable assistance, support and
guidance throughout the development of this tool.
REFERENCES
[1] RishavChakravarti, XiannongMeng, ”A Study of Color
HistogramBased Image Retrieval”, Sixth International
Conference on InformationTechnology: New Generations,
2009. ITNG ’09., pp. 1323�1328, 2009
[2] Zhenhua ZHANG, Yina LU, Wenhui LI and Wei LIU,
”Novel ColorFeature Representation and Matching Technique
for Content-based ImageRetrieval”, International conference
on Multimedia Computing and Systems
(ICMCS 09), pp. 118�122, 2009
[3] M. J. Swain and D. H. Ballard, ”Color indexing”,
International Journalof Computer Vision, vol. 7, no. 1, pp.
11�32, 1991.
[4] H. Lewkovitz and G. H. Herman, ”A generalized lightness
hue andsaturation color model”, Graphical Models and Image
Processing, pp.271�278, 1993
[5] Smeulders A.W.M., Worring, M., Santini, S., Gupta, A.,
and Jain, R.,”Content-based image retrieval at the end of the
early years”, IEEE transactionson Pattern Analysis and
Machine Intelligence, pp. 1349�1380,2000
[6] M. Stricker and M. Orengo, Similarity of color images,
Proceeding ofSPIE Storage and Retrieval for Image and Video
Databases III, vol. 2420,pp. 381�392, 1995.
[7] JaumeAmores, NicuSebe, and PetiaRadeva, ”Context-
Based Object-Class Recognition and Retrieval by Generalized
Correlograms”, Transactionson Pattern Analysis and Machine
Intelligence, vol. 29, no. 10, pp.1818�1833, 2008

More Related Content

PDF
Ba34321326
PDF
Performance Anaysis for Imaging System
PDF
International Journal of Engineering Research and Development
PDF
Comparison of thresholding methods
PDF
Ax31139148
PDF
Ijetcas14 466
PDF
COMPARISON OF SECURE AND HIGH CAPACITY COLOR IMAGE STEGANOGRAPHY TECHNIQUES I...
PDF
Image Steganography Techniques
Ba34321326
Performance Anaysis for Imaging System
International Journal of Engineering Research and Development
Comparison of thresholding methods
Ax31139148
Ijetcas14 466
COMPARISON OF SECURE AND HIGH CAPACITY COLOR IMAGE STEGANOGRAPHY TECHNIQUES I...
Image Steganography Techniques

What's hot (15)

PDF
Different Steganography Methods and Performance Analysis
PDF
H3602056060
PDF
Iaetsd degraded document image enhancing in
PDF
International Journal of Engineering Research and Development (IJERD)
PDF
CONTRAST ENHANCEMENT TECHNIQUES USING HISTOGRAM EQUALIZATION METHODS ON COLOR...
PDF
Basics of image processing using MATLAB
PDF
Ac03401600163.
PDF
A Review of Feature Extraction Techniques for CBIR based on SVM
PPTX
Chapter 06 eng
PDF
Data Hiding and Retrieval using Visual Cryptography
PDF
An unsupervised method for real time video shot segmentation
PDF
Color image steganography in YCbCr space
PDF
Comparative between global threshold and adaptative threshold concepts in ima...
PDF
RECOGNITION OF CHEISING IYEK/EEYEK-MANIPURI DIGITS USING SUPPORT VECTOR MACHINES
PDF
Gesture Recognition Based Mouse Events
Different Steganography Methods and Performance Analysis
H3602056060
Iaetsd degraded document image enhancing in
International Journal of Engineering Research and Development (IJERD)
CONTRAST ENHANCEMENT TECHNIQUES USING HISTOGRAM EQUALIZATION METHODS ON COLOR...
Basics of image processing using MATLAB
Ac03401600163.
A Review of Feature Extraction Techniques for CBIR based on SVM
Chapter 06 eng
Data Hiding and Retrieval using Visual Cryptography
An unsupervised method for real time video shot segmentation
Color image steganography in YCbCr space
Comparative between global threshold and adaptative threshold concepts in ima...
RECOGNITION OF CHEISING IYEK/EEYEK-MANIPURI DIGITS USING SUPPORT VECTOR MACHINES
Gesture Recognition Based Mouse Events
Ad

Viewers also liked (18)

PDF
Emotional telugu speech signals classification based on k nn classifier
PDF
Effect of process factors on surface roughness in dip cryogenic machining of ...
PDF
Visible light solar photocatalytic degradation of pulp and paper wastewater u...
PDF
Survey on cloud backup services of personal storage
PDF
Effect of fiber distance on various sac ocdma detection techniques
PDF
Beam steering in smart antennas by using low complex adaptive algorithms
PDF
Numerical analysis of geothermal tunnels
PDF
A new approach for generalised unsharp masking alogorithm
PDF
National culture impact on lean leadership and lean manufacturing maturity ...
PDF
Unravelling the molecular linkage of co morbid diseases
PDF
Tools description for product development process management in food industries
PDF
Fatigue analysis of rail joint using finite element method
PDF
Android mobile platform security and malware survey
PDF
Utilization of stonedust with plastic waste for improving the subgrade in hig...
PDF
Effect of packing material and its geometry on the performance of packed bed ...
PDF
Effect of free surface wave on free vibration of a floating platform
PDF
Optical power debugging in dwdm system having fixed gain amplifiers
PDF
Optimization of friction stir welding process parameter using taguchi method ...
Emotional telugu speech signals classification based on k nn classifier
Effect of process factors on surface roughness in dip cryogenic machining of ...
Visible light solar photocatalytic degradation of pulp and paper wastewater u...
Survey on cloud backup services of personal storage
Effect of fiber distance on various sac ocdma detection techniques
Beam steering in smart antennas by using low complex adaptive algorithms
Numerical analysis of geothermal tunnels
A new approach for generalised unsharp masking alogorithm
National culture impact on lean leadership and lean manufacturing maturity ...
Unravelling the molecular linkage of co morbid diseases
Tools description for product development process management in food industries
Fatigue analysis of rail joint using finite element method
Android mobile platform security and malware survey
Utilization of stonedust with plastic waste for improving the subgrade in hig...
Effect of packing material and its geometry on the performance of packed bed ...
Effect of free surface wave on free vibration of a floating platform
Optical power debugging in dwdm system having fixed gain amplifiers
Optimization of friction stir welding process parameter using taguchi method ...
Ad

Similar to Implementation of content based image retrieval using the cfsd algorithm (20)

PDF
IMAGE RETRIEVAL USING QUADRATIC DISTANCE BASED ON COLOR FEATURE AND PYRAMID S...
PDF
PDF
IRJET-Feature based Image Retrieval based on Color
PDF
Color and texture based image retrieval a proposed
PPT
CBIR_white.ppt
PDF
Color and texture based image retrieval
PPTX
Multimedia content based retrieval in digital libraries
PDF
Week06 bme429-cbir
PDF
IRJET- Content Based Image Retrieval (CBIR)
PDF
Content Based Image Retrieval
PDF
Content-Based Image Retrieval Using Modified Human Colour Perception Histogram
PDF
Color vs texture feature extraction and matching in visual content retrieval ...
PDF
Ijaems apr-2016-16 Active Learning Method for Interactive Image Retrieval
PDF
Performance analysis is basis on color based image retrieval technique
PDF
Performance analysis is basis on color based image retrieval technique
PDF
B0310408
PDF
Et35839844
PDF
Dynamic hand gesture recognition using cbir
PDF
WEB IMAGE RETRIEVAL USING CLUSTERING APPROACHES
PDF
Av4301248253
IMAGE RETRIEVAL USING QUADRATIC DISTANCE BASED ON COLOR FEATURE AND PYRAMID S...
IRJET-Feature based Image Retrieval based on Color
Color and texture based image retrieval a proposed
CBIR_white.ppt
Color and texture based image retrieval
Multimedia content based retrieval in digital libraries
Week06 bme429-cbir
IRJET- Content Based Image Retrieval (CBIR)
Content Based Image Retrieval
Content-Based Image Retrieval Using Modified Human Colour Perception Histogram
Color vs texture feature extraction and matching in visual content retrieval ...
Ijaems apr-2016-16 Active Learning Method for Interactive Image Retrieval
Performance analysis is basis on color based image retrieval technique
Performance analysis is basis on color based image retrieval technique
B0310408
Et35839844
Dynamic hand gesture recognition using cbir
WEB IMAGE RETRIEVAL USING CLUSTERING APPROACHES
Av4301248253

More from eSAT Journals (20)

PDF
Mechanical properties of hybrid fiber reinforced concrete for pavements
PDF
Material management in construction – a case study
PDF
Managing drought short term strategies in semi arid regions a case study
PDF
Life cycle cost analysis of overlay for an urban road in bangalore
PDF
Laboratory studies of dense bituminous mixes ii with reclaimed asphalt materials
PDF
Laboratory investigation of expansive soil stabilized with natural inorganic ...
PDF
Influence of reinforcement on the behavior of hollow concrete block masonry p...
PDF
Influence of compaction energy on soil stabilized with chemical stabilizer
PDF
Geographical information system (gis) for water resources management
PDF
Forest type mapping of bidar forest division, karnataka using geoinformatics ...
PDF
Factors influencing compressive strength of geopolymer concrete
PDF
Experimental investigation on circular hollow steel columns in filled with li...
PDF
Experimental behavior of circular hsscfrc filled steel tubular columns under ...
PDF
Evaluation of punching shear in flat slabs
PDF
Evaluation of performance of intake tower dam for recent earthquake in india
PDF
Evaluation of operational efficiency of urban road network using travel time ...
PDF
Estimation of surface runoff in nallur amanikere watershed using scs cn method
PDF
Estimation of morphometric parameters and runoff using rs &amp; gis techniques
PDF
Effect of variation of plastic hinge length on the results of non linear anal...
PDF
Effect of use of recycled materials on indirect tensile strength of asphalt c...
Mechanical properties of hybrid fiber reinforced concrete for pavements
Material management in construction – a case study
Managing drought short term strategies in semi arid regions a case study
Life cycle cost analysis of overlay for an urban road in bangalore
Laboratory studies of dense bituminous mixes ii with reclaimed asphalt materials
Laboratory investigation of expansive soil stabilized with natural inorganic ...
Influence of reinforcement on the behavior of hollow concrete block masonry p...
Influence of compaction energy on soil stabilized with chemical stabilizer
Geographical information system (gis) for water resources management
Forest type mapping of bidar forest division, karnataka using geoinformatics ...
Factors influencing compressive strength of geopolymer concrete
Experimental investigation on circular hollow steel columns in filled with li...
Experimental behavior of circular hsscfrc filled steel tubular columns under ...
Evaluation of punching shear in flat slabs
Evaluation of performance of intake tower dam for recent earthquake in india
Evaluation of operational efficiency of urban road network using travel time ...
Estimation of surface runoff in nallur amanikere watershed using scs cn method
Estimation of morphometric parameters and runoff using rs &amp; gis techniques
Effect of variation of plastic hinge length on the results of non linear anal...
Effect of use of recycled materials on indirect tensile strength of asphalt c...

Recently uploaded (20)

PPTX
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
PDF
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
PPTX
Sustainable Sites - Green Building Construction
PPT
Mechanical Engineering MATERIALS Selection
PPTX
UNIT-1 - COAL BASED THERMAL POWER PLANTS
PPT
Project quality management in manufacturing
PPTX
MCN 401 KTU-2019-PPE KITS-MODULE 2.pptx
PPTX
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
PDF
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
PDF
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
PPTX
Lecture Notes Electrical Wiring System Components
PDF
TFEC-4-2020-Design-Guide-for-Timber-Roof-Trusses.pdf
PDF
Embodied AI: Ushering in the Next Era of Intelligent Systems
PPTX
Welding lecture in detail for understanding
PPTX
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
PPTX
additive manufacturing of ss316l using mig welding
PPTX
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
PDF
Automation-in-Manufacturing-Chapter-Introduction.pdf
PDF
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
PPTX
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
Sustainable Sites - Green Building Construction
Mechanical Engineering MATERIALS Selection
UNIT-1 - COAL BASED THERMAL POWER PLANTS
Project quality management in manufacturing
MCN 401 KTU-2019-PPE KITS-MODULE 2.pptx
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
Lecture Notes Electrical Wiring System Components
TFEC-4-2020-Design-Guide-for-Timber-Roof-Trusses.pdf
Embodied AI: Ushering in the Next Era of Intelligent Systems
Welding lecture in detail for understanding
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
additive manufacturing of ss316l using mig welding
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
Automation-in-Manufacturing-Chapter-Introduction.pdf
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx

Implementation of content based image retrieval using the cfsd algorithm

  • 1. IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163 __________________________________________________________________________________________ Volume: 02 Issue: 01 | Jan-2013, Available @ http://www.ijret.org 75 IMPLEMENTATION OF CONTENT-BASED IMAGE RETRIEVAL USING THE CFSD ALGORITHM Nagaraja. G. S1 , Samir Sheriff2 , Raunaq Kumar3 1 Associate Professor, 2,3 VII Semester B.E., Dept of Computer Sci. Eng., R.V. College of Engineering, Bangalore, India nagarajags@rvce.edu.in, samiriff@gmail.com, devilronny@gmail.com Abstract Content-based image retrieval (CBIR) is the application of multimedia -processing techniques to the image retrieval problem, that is, the problem of searching for digital images in large databases. There is a growing interest in CBIR because of the limitations inherent in metadata-based systems. The most common method for comparing two images in CBIR is using an image distance measure based on color, texture, shape and others. In this paper, we describe an object-oriented graphical implementation of a system, backed by a My SQL database, that computes distance measures using the CFSD algorithm, based on color similarity --------------------------------------------------------------------*****----------------------------------------------------------------------- 1. INTRODUCTION ”Content-based” means that the search will analyze the actual contents of the image rather than the metadata such as keywords, tags, and/or descriptions associated with the image. The term ’content’ in this context might refer to colors, shapes, textures, or any other information that can be derived from the image itself. The system we have implemented uses color-based feature extraction. Computing distance measures based on color similarity is achieved by computing a color histogram for each image that identifies the proportion of pixels within an image holding specific values. 
Examining images based on the colors they contain is one of the most widely used techniques because it does not depend on image size or orientation. Our program allows the user to choose between two algorithms:

- Color Histogram Method: In image processing and photography, a color histogram is a representation of the distribution of colors in an image. For digital images, a color histogram represents the number of pixels that have colors in each of a fixed list of color ranges that span the image's color space, the set of all possible colors.
- Color Frequency Sequence Difference Method: The high dimensionality of the feature vectors of the Color Histogram method results in high computation cost and space cost. CFSD expresses the color image in terms of numerical values for each color channel, greatly improving comparison time and computation costs.

Fig1. Content-Based Image Retrieval System

2. THE COLOR HISTOGRAM METHOD

1) Color Quantization: For every image in the database, colors in the RGB model are quantized to make later computations easier. Color quantization reduces the number of distinct colors used in an image. In our application, the user can select the quantization bucket size for red, green and blue separately.
2) Compute Histogram: For every quantized image, calculate a color histogram, which is a frequency distribution of the quantized RGB values of each pixel of the image. In our application, we have used hashmaps to store histogram values as <string, double> pairs.
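Steps 1 and 2 can be sketched in Java as follows. This is a minimal illustration, not the code of our application: it assumes the standard `BufferedImage` class in place of the ACM `GImage` class, assumes that quantization is an integer division of each 0-255 channel value by its bucket size, and assumes that histogram values are normalized to proportions. The class name, method names and key format are hypothetical.

```java
import java.awt.image.BufferedImage;
import java.util.HashMap;
import java.util.Map;

class RgbHistogramSketch {

    /** Maps a 0-255 channel value to a bucket index (assumed: integer division). */
    static int quantize(int channelValue, int bucketSize) {
        return channelValue / bucketSize;
    }

    /** Builds a histogram keyed by the quantized "(r, g, b)" string of each pixel. */
    static Map<String, Double> histogram(BufferedImage image,
                                         int rSize, int gSize, int bSize) {
        Map<String, Double> counts = new HashMap<>();
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                int rgb = image.getRGB(x, y);
                int r = quantize((rgb >> 16) & 0xFF, rSize);
                int g = quantize((rgb >> 8) & 0xFF, gSize);
                int b = quantize(rgb & 0xFF, bSize);
                // Count one more pixel in this quantized color's bin.
                counts.merge("(" + r + ", " + g + ", " + b + ")", 1.0, Double::sum);
            }
        }
        // Assumed normalization: store the proportion of pixels per bin.
        double total = (double) image.getWidth() * image.getHeight();
        counts.replaceAll((key, count) -> count / total);
        return counts;
    }
}
```

Normalizing by the pixel count keeps histograms of differently sized images comparable, which matches the size independence noted in the introduction.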
3) Query Image Feature Extraction: Compute the quantized color values and the resulting histogram of the query image.
4) Compute Distance Measure: Compare the histogram of the query image with the histograms of every image of the retrieval system. Let h and g represent two color histograms. If $n = 2^B - 1$, where B is the maximum number of bits used for representing the quantized color components a, b and c, then the distance between the color histograms h and g can be computed as:

$$d(h, g) = \sum_{a=0}^{n} \sum_{b=0}^{n} \sum_{c=0}^{n} \left| h(a, b, c) - g(a, b, c) \right| \qquad (1)$$

We refer to this measure as the Euclidean distance, although Eq. (1) as written sums absolute bin-wise differences, which is the L1 (city-block) form. The formula compares only identical bins of the two histograms. Two different bins may represent perceptually similar colors, but they are not compared cross-wise. All bins contribute equally to the distance.
5) Find Similar Images: The similarity between two images is inversely related to the (color histogram) distance between them: the smaller the distance, the more similar the images.

3. THE COLOR FREQUENCY SEQUENCE DIFFERENCE METHOD

1) Color Space Conversion: Translate the representation of all colors in each image from the RGB space to the HSV space. The HSV color space has two distinct characteristics: the lightness component is independent of the color information of the image, and the hue and saturation components correlate with the manner of human visual perception.
2) Color Quantization: For every image in the database, colors in the HSV model are quantized to make later computations easier. Color quantization reduces the number of distinct colors used in an image. In our application, the user can select the quantization bucket size for each color component separately.
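Two of the steps above lend themselves to a short Java sketch: the bin-wise comparison of Eq. (1) and the RGB-to-HSV conversion with quantization. The sketch is illustrative only. It assumes histograms stored as maps in which a missing bin has frequency 0, and it assumes that each HSV component is divided into a given number of equal-width bins. The conversion uses the standard `java.awt.Color.RGBtoHSB` method (HSB is another name for HSV); the class and method names are hypothetical.

```java
import java.awt.Color;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class HistogramCompareSketch {

    /** Sum of absolute bin-wise differences, following Eq. (1) as printed. */
    static double histogramDistance(Map<String, Double> h, Map<String, Double> g) {
        Set<String> bins = new HashSet<>(h.keySet());
        bins.addAll(g.keySet());
        double distance = 0.0;
        for (String bin : bins) {
            // Assumption: a bin absent from one histogram has frequency 0.
            distance += Math.abs(h.getOrDefault(bin, 0.0) - g.getOrDefault(bin, 0.0));
        }
        return distance;
    }

    /** Converts an RGB pixel to HSV and returns the quantized {h, s, v} bin indices. */
    static int[] quantizeHsv(int r, int g, int b, int hBins, int sBins, int vBins) {
        float[] hsv = Color.RGBtoHSB(r, g, b, null); // each component lies in [0, 1]
        return new int[] {
            // Clamp so that a component equal to 1.0 falls into the last bin.
            Math.min((int) (hsv[0] * hBins), hBins - 1),
            Math.min((int) (hsv[1] * sBins), sBins - 1),
            Math.min((int) (hsv[2] * vBins), vBins - 1)
        };
    }
}
```

Only bins with the same key are compared, so two perceptually close colors that fall into neighboring bins still count as entirely different, as noted for Eq. (1).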
3) Compute Histogram (Sequence): For every quantized image, a color histogram is calculated, which is a frequency distribution of the quantized HSV values of each pixel of the image. In our application, we have used treemaps to store histogram values as <string, double> pairs. The values are ordered alphabetically by their HSV strings. For instance, the (H, S, V) string (0, 0, 0) is followed by (0, 0, 1).
4) Compute Sorted Histogram (Frequency): Take the color histogram of each image and sort the keys in descending order of frequency values. Each image now has two histograms:
   - One sorted by color sequence. Call this P.
   - One sorted by color frequency. Call this Q.
   However, in our application, we maintain only the frequency-sorted histogram, as the other histogram can be derived implicitly.
5) Query Image Feature Extraction: Convert and quantize the color space of the query image. Then, compute the sequence histogram as well as the frequency histogram of the query image.
6) Compute SFD: In this algorithm, SFD is the value on which the distance measure is based. It is calculated as follows, for each component of the HSV model. For every color combination x in histogram P,

$$\mathrm{sfd} = \sum_{i=1}^{N} w(i)\, h(i) \qquad (2)$$

   where
   - $i'$ is the index of color combination x in histogram Q and $i$ is the index of x in histogram P
   - $w(i) = \dfrac{1}{|i - i'| + 1}$, so that w is a function of the difference between the indices $i$ and $i'$
   - h is a function of the frequency of color combination x in the image
7) Compute Distance with SFD (Difference): Let a and b represent two images which have SFD values for each component of the HSV model. The SFD distance between a and b can be computed as:

$$D_{sfd}(a, b) = \frac{sfd_H(a, b) + sfd_S(a, b) + sfd_V(a, b)}{3} \qquad (3)$$

   where
   - $sfd_T(a, b) = a_{sfd_T} - b_{sfd_T}$, which is the distance between the corresponding SFD values of the component T of the HSV model of the two images a and b
   - $T \in \{H, S, V\}$
   - $a_{sfd_T}$ is the corresponding SFD value of component T of the HSV model representing image a
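A minimal Java sketch of Eq. (2) and Eq. (3) for a single color component is given here. It rests on several assumptions that are not fixed by the text: the histogram of one component is keyed by an integer bin index rather than a string, h(i) is taken to be the stored frequency itself, ties in frequency keep their color-sequence order, and the per-component difference in Eq. (3) is taken in absolute value so that the result behaves as a distance. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

class SfdSketch {

    /** SFD of one color component, following Eq. (2). */
    static double sfd(TreeMap<Integer, Double> sequenceHistogram) {
        // P: bins in color-sequence order (the TreeMap's key order).
        List<Integer> p = new ArrayList<>(sequenceHistogram.keySet());
        // Q: the same bins in descending order of frequency (stable sort).
        List<Integer> q = new ArrayList<>(p);
        q.sort(Comparator
                .comparingDouble((Integer bin) -> sequenceHistogram.get(bin))
                .reversed());

        double sum = 0.0;
        for (int i = 0; i < p.size(); i++) {
            int bin = p.get(i);
            int iPrime = q.indexOf(bin);                 // index of the same bin in Q
            double w = 1.0 / (Math.abs(i - iPrime) + 1); // weight from the index gap
            sum += w * sequenceHistogram.get(bin);       // assumed: h(i) = frequency
        }
        return sum;
    }

    /** Mean per-component SFD gap between two images, following Eq. (3). */
    static double sfdDistance(double[] a, double[] b) {
        double total = 0.0;
        for (int t = 0; t < 3; t++) {        // t indexes the H, S and V components
            total += Math.abs(a[t] - b[t]);  // assumption: absolute difference
        }
        return total / 3.0;
    }
}
```

Because each image is reduced to three numbers, comparing a query against a stored image costs three subtractions instead of a pass over every histogram bin.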
Fig2. SFD Calculation for one component of a given image

8) Find Similar Images: Compare the SFDs of each HSV component of the query image with the corresponding SFDs of every image of the retrieval system. The similarity between two images is inversely related to the SFD distance between them: the smaller the distance, the more similar the images.

4. OBJECT-ORIENTED DESIGN

The Java programming language has been used to create our Content-Based Image Retrieval System, with MySQL for the back end. For displaying graphical objects, we have used the ACM package (created by the Java Task Force (JTF)), which provides a set of classes that support the creation of simple, object-oriented graphical displays. The object-oriented design of our program is described below.

Fig3. Finding Similar Images based on SFD Distance. The two images on top are similar since their sfd values are quite close. This similarity cannot be detected by the normal color histogram method.

A. Class Design

We have segregated the code into two packages for easy management:

- retriever - This package contains classes that provide methods to read a database of images and store status information. The implementation of the algorithms described in the previous sections can be found in this package.
- gui - As the name suggests, this package contains all the code necessary for the graphical user interface of our retrieval system. It makes use of methods provided by the ACM library as well as the Swing library.

With the organization of the code established, let us dive deeper into the actual object-oriented design.
The retriever package consists of the following classes:

1) ColorFrequency - This class is used while storing color histograms in hashmaps/treemaps, to store <value, frequency> tuples for different color combinations. It implements the Comparable interface so that the Collections.sort() method provided by the java.util package can be used to sort histograms in decreasing order of frequency, as required by the CFSD algorithm.
2) ImageData - This class keeps track of the status of a single image of the database. It stores information and provides methods to:
   - store the two-dimensional representation of an image using the GImage class of the ACM package
   - store the path of this image
   - quantize every color model component of a given pixel of this image
   - create color histograms with ColorFrequency mappings for individual components as well as for a combination of all components of the color space of this image
   - calculate SFD values for every component of the color space of this image
   - open or close a connection with a database and retrieve status information if feature values have already been calculated for this image during a previous run of the program. This prevents unnecessary recomputation of SFD values when the program is run repeatedly
3) ImageCollection - This class keeps track of the ImageData of a number of images in the database. In essence, an ImageCollection object is a collection of ImageData objects. It provides methods to:
   - initialize an ImageData collection with images from a specified directory, which is traversed to a folder depth of 1
   - find the images that are most similar to a query image, computed using the Color Histogram method and the Euclidean distance formula. A list of images, sorted in decreasing order of similarity, is returned
   - find the images that are most similar to a query image, computed using the CFSD method and the SFD distance formula. A list of images, sorted in decreasing order of similarity, is returned
4) UserVariables - This class only maintains the values of all the variables that the user can set in the GUI window. The ImageData and ImageCollection objects read these values during computation. It provides methods to get and set:
   - the number of red, green and blue bins for color quantization
   - the number of similar images to be returned by ImageCollection
   - a boolean variable denoting whether the Color Histogram algorithm or the CFSD algorithm should be used

The gui package consists of only one class, namely:

1) MainGraphics - This class uses the widgets provided by the ACM graphics package and the Java Swing library to create a user-friendly interface for our Image Retrieval System. It creates a window consisting of a white area in which the query image selected by the user and the three most similar images are displayed, in their normal forms as well as their quantized forms. The northern portion of the window consists of:
   - Two buttons - One to open a query image and another to start the execution of the selected algorithm, which is the Color Histogram algorithm by default.
   - Three sliders - Each can be varied from 1 to 60. The sliders are used to choose the bin sizes of each component of the RGB (HSV) model for the Color Histogram (CFSD) algorithm.
   - One checkbox - If checked, then the CFSD algorithm is run to find similar images.
     Otherwise, the Color Histogram method is used.
   For each of these widgets, the corresponding variables of the UserVariables class are set.

B. Schema Design

Since comparisons in the CFSD algorithm are done using a few SFD values instead of entire histograms, as in the normal color histogram method, the CFSD algorithm can be made to operate more efficiently by calculating the SFD values of an image only once (when it is encountered for the very first time) and storing them in a database that can be accessed for all future references. Note that whenever we say red, green or blue in the CFSD algorithm, we actually refer to the H, S or V values of the image, respectively. This was done for compatibility.

The database that we constructed for our system consists of two tables:

1) imagedata - It consists of the following attributes:
   - Image ID - The primary key of this table. It is an auto-incremented variable
   - Image Path - A string with a variable number of characters that stores the path of the given image, relative to the directory of the application
   - redSFD - The SFD calculated using the red (H) histogram of the image
   - greenSFD - The SFD calculated using the green (S) histogram of the image
   - blueSFD - The SFD calculated using the blue (V) histogram of the image
2) quantizedvalues - It consists of the following attributes:
   - Image ID - The foreign key of this table, which references the primary key of the imagedata table
   - qRed - The value of the bin used to quantize red (H)
   - qGreen - The value of the bin used to quantize green (S)
   - qBlue - The value of the bin used to quantize blue (V)

C. Summary and Results

In order to test our image retrieval system, we have provided a sample set of images, organized in seven folders with five images each. More folders can be added if necessary, provided that each folder contains exactly five images at all times.
Using our sample database, it was found that this application could retrieve at least three images similar to a supplied query image chosen from the database itself. The speed of retrieval varied depending on the algorithm selected, with the CFSD algorithm running faster than the naive Color Histogram algorithm.

5. FUNCTIONALITY

First ensure that all the above requirements are met. Then, to start the application, go to the installation folder and click on the file named CBIR.jar. In the new window that pops up, select the appropriate query image using the open button. After selecting the appropriate number of bins for quantization of each component of the color space, and checking or unchecking the CFSD box, click on the Retrieve Similar Images button to retrieve the three most similar images, which are displayed in the lower portion of the window. Figures 4, 5 and 6 show the state of the window at different instances of time.

Fig4. Application after the query image has been selected by the user
Fig5. Application after images similar to the query image have been retrieved using the CFSD algorithm

Fig6. Application after images similar to the query image have been retrieved using the Color Histogram algorithm

CONCLUSIONS

This paper briefly discussed the software implementation of a Content-Based Image Retrieval System using the CFSD algorithm as well as a naive Color Histogram method. This software tool was implemented in Java, relying heavily on object-oriented concepts, without which code management would have been considerably harder. We also found that the naive Color Histogram algorithm is not as fast as the CFSD algorithm. In fact, the CFSD algorithm, coupled with a database, forms a highly efficient system which retrieves at most three similar images quite accurately. However, it was found that the accuracy diminished when more than three similar images were sought. This leads us to conclude that other features, such as shape, should also be considered while comparing images in a content-based retrieval system.

ACKNOWLEDGMENTS

The authors wish to express their gratitude to Dr. Nagaraja G. S., who offered invaluable assistance, support and guidance throughout the development of this tool.