CONTACT: PRAVEEN KUMAR. L (, +91 – 9791938249)
MAIL ID: sunsid1989@gmail.com, praveen@nexgenproject.com
Web: www.nexgenproject.com, www.finalyear-ieeeprojects.com
K NEAREST NEIGHBOUR JOINS FOR BIG DATA ON MAPREDUCE: A
THEORETICAL AND EXPERIMENTAL ANALYSIS
ABSTRACT:
Given a point p and a set of points S, the kNN operation finds the k closest
points to p in S. It is a computationally intensive task with a wide range of
applications, such as knowledge discovery and data mining. However, as the
volume and the dimension of data increase, only distributed approaches can
perform such a costly operation in a reasonable time. Recent works have
focused on implementing efficient solutions using the MapReduce
programming model because it is suitable for distributed large-scale data
processing. Although these works provide different solutions to the same
problem, each one has particular constraints and properties. In this paper, we
compare the different existing approaches for computing kNN on MapReduce,
first theoretically, and then by performing an extensive experimental
evaluation. To compare the solutions, we identify three generic steps
for kNN computation on MapReduce: data pre-processing, data partitioning,
and computation. We then analyze each step in terms of load balancing,
accuracy, and complexity. Experiments in this paper use a variety of datasets
and analyze the impact of data volume, data dimension, and the value of k
from several perspectives, such as time and space complexity and accuracy. The
experiments reveal new advantages and shortcomings, which are discussed
for each algorithm. To the best of our knowledge, this is the first paper that
compares kNN computing methods on MapReduce both theoretically and
experimentally in the same setting. Overall, this paper can be used as a
guide to tackle kNN-based practical problems in the context of big data.
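
As a point of reference for the kNN operation defined in the abstract, the following is a minimal single-machine sketch in Python. It assumes points are tuples of floats and that closeness is measured by Euclidean distance; the function name `knn` and the sample data are illustrative and do not come from the paper.

```python
import heapq
import math


def knn(p, S, k):
    """Return the k points of S closest to p (Euclidean distance assumed)."""
    # nsmallest scans S once and keeps only the k best candidates.
    return heapq.nsmallest(k, S, key=lambda s: math.dist(p, s))


# Illustrative data, not taken from the paper.
S = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (5.0, 5.0)]
nearest = knn((0.9, 0.9), S, k=2)
```

This brute-force version compares p against every point of S, which is the cost that motivates the distributed approaches discussed in the abstract.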
CONCLUSION
In this paper, we have studied existing solutions for performing the kNN
operation in the context of MapReduce. We first approached the problem from a
workflow point of view and pointed out that all solutions follow three
main steps to compute kNN over MapReduce: preprocessing of data,
partitioning, and the actual computation. We listed and explained the
different algorithms that can be chosen for each step and discussed their
pros and cons in terms of load balancing, accuracy of results, and overall
complexity. Second, we performed extensive experiments to
compare the performance, disk usage, and accuracy of all these algorithms in
the same environment. We mainly used two real datasets: one of geographic
coordinates (2 dimensions) and one of image features (SURF descriptors,
128 dimensions). For all algorithms, this was the first published experiment on
data of such high dimensionality. Moreover, we performed a detailed analysis,
outlining, for each algorithm, the importance and difficulty of fine-tuning
some parameters to obtain the best performance.
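
The three-step workflow summarized above can be illustrated with a small in-memory simulation in Python. This is a sketch under stated assumptions, not one of the algorithms evaluated in the paper: it skips preprocessing, uses a simple round-robin split for partitioning, and imitates the map, shuffle, and reduce phases with ordinary Python functions. All function names (`partition`, `map_local_knn`, `reduce_merge`, `knn_join`) are hypothetical.

```python
import heapq
import math
from collections import defaultdict


def partition(points, n_blocks):
    """Partitioning step: round-robin split (an assumed, simplistic choice)."""
    blocks = [[] for _ in range(n_blocks)]
    for i, pt in enumerate(points):
        blocks[i % n_blocks].append(pt)
    return blocks


def map_local_knn(r_block, s_block, k):
    """Map phase: for each r, emit its k nearest candidates within one S block."""
    for r in r_block:
        yield r, heapq.nsmallest(k, ((math.dist(r, s), s) for s in s_block))


def reduce_merge(candidate_lists, k):
    """Reduce phase: merge the per-block candidates of one r into a global top k."""
    merged = [c for lst in candidate_lists for c in lst]
    return [s for _, s in heapq.nsmallest(k, merged)]


def knn_join(R, S, k, n_blocks=2):
    """For every point of R, find its k nearest neighbours in S."""
    grouped = defaultdict(list)  # stands in for the shuffle: group by key r
    for r_block in partition(R, n_blocks):
        for s_block in partition(S, n_blocks):
            for r, cands in map_local_knn(r_block, s_block, k):
                grouped[r].append(cands)
    return {r: reduce_merge(lists, k) for r, lists in grouped.items()}


# Illustrative data, not taken from the paper.
R = [(0.0, 0.0), (10.0, 10.0)]
S = [(1.0, 1.0), (2.0, 2.0), (9.0, 9.0), (8.0, 8.0), (5.0, 5.0)]
result = knn_join(R, S, k=2)
```

The result stays exact because every true neighbour of r is among the k best candidates of the S block that contains it. The price is that every block of R is paired with every block of S, which is the kind of trade-off between accuracy, load balancing, and complexity that the conclusion refers to.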
