LEC-03
INTRODUCTION TO CLASSIFICATION
K-NEAREST NEIGHBOUR
Nearest Neighbour
• Mainly used when all attribute values are continuous
• It can be modified to deal with categorical attributes
• The idea is to estimate the classification of an unseen instance from the
classification of the instance or instances closest to it, in some sense that
we need to define; in other words, it classifies new cases based on a
similarity measure
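To make the idea concrete, here is a minimal 1-nearest-neighbour sketch in Python. The two-instance training set and the choice of Euclidean distance as the similarity measure are illustrative assumptions (distance measures are defined later in the lecture):

```python
import math

# Illustrative two-instance training set: (attribute values, class label).
# The values are invented; any continuous attributes would do.
training_set = [
    ((2.0, 3.0), "positive"),
    ((8.0, 1.0), "negative"),
]

def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbour(unseen):
    """Assign the unseen instance the label of its closest training instance."""
    _, label = min(training_set, key=lambda inst: euclidean(inst[0], unseen))
    return label

print(nearest_neighbour((2.5, 2.8)))  # closest to (2.0, 3.0) -> "positive"
```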
Nearest Neighbour
• Consider a small training set of just two classified instances, each
described by six attribute values, together with an unseen instance. What
should its classification be?
• Even without knowing what the six attributes represent, it seems
intuitively obvious that the unseen instance is nearer to the first instance
than to the second.
K - Nearest Neighbour (KNN)
• In practice there are likely to be many more instances in the training set
but the same principle applies.
• It is usual to base the classification on the classifications of the k
nearest neighbours, not just the nearest one.
• The method is then known as k-Nearest Neighbour, or just k-NN,
classification.
KNN
• We can illustrate k-NN classification diagrammatically when the
dimension (i.e. the number of attributes) is small.
• Next we will see an example which illustrates the case where the
dimension is just 2.
• In real-world data mining applications the dimension can of course be
considerably larger.
KNN
• A training set with 20 instances, each giving the
values of two attributes and an associated
classification
• How can we estimate the classification for an
‘unseen’ instance where the first and second
attributes are 9.1 and 11.0, respectively?
[Figure-only slides: the 20-instance training set and its two-dimensional plot]
KNN
• For this small number of attributes we can represent
the training set as 20 points on a two-dimensional
graph with values of the first and second attributes
measured along the horizontal and vertical axes,
respectively.
• Each point is labelled with a + or − symbol to
indicate that the classification is positive or
negative, respectively.
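A plot of this kind can be reproduced in a few lines of matplotlib. The coordinates below are invented stand-ins, since the slide's 20-instance table is not reproduced in the text; only the plotting idea matters:

```python
import matplotlib.pyplot as plt

# Hypothetical stand-ins for the 20 training points; the real values
# come from the training-set table on the earlier slide.
positive = [(0.8, 6.3), (9.2, 11.6), (12.8, 1.1), (7.0, 14.0)]
negative = [(1.4, 8.1), (4.8, 1.1), (14.0, 19.0), (16.4, 8.4)]
unseen = (9.1, 11.0)

# Mark positives with '+', negatives with '_' (rendered as a dash),
# and the unseen instance with a small open circle.
plt.scatter(*zip(*positive), marker="+", label="positive")
plt.scatter(*zip(*negative), marker="_", label="negative")
plt.scatter(*unseen, marker="o", facecolors="none", edgecolors="k",
            label="unseen instance")
plt.xlabel("first attribute")
plt.ylabel("second attribute")
plt.legend()
plt.show()
```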
KNN
• A circle has been added to enclose the five nearest
neighbours of the unseen instance, which is shown
as a small circle close to the centre of the larger
one.
KNN
• The five nearest neighbours are labelled with three
+ signs and two − signs
• So a basic 5-NN classifier would classify the unseen
instance as ‘positive’ by a form of majority voting.
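Putting the pieces together, basic k-NN classification is "sort by distance, keep the k closest, take a majority vote". A minimal sketch, with an invented training set standing in for the slide's 20 instances:

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(training_set, unseen, k=5):
    """Majority vote among the k training instances nearest to `unseen`."""
    neighbours = sorted(training_set,
                        key=lambda inst: euclidean(inst[0], unseen))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Invented stand-in data: three positives and two negatives close to
# (9.1, 11.0), mirroring the 3-to-2 vote described on the slide.
training_set = [
    ((8.8, 9.8), "+"), ((9.2, 11.6), "+"), ((10.8, 10.4), "+"),
    ((7.8, 12.3), "-"), ((8.1, 9.0), "-"),
    ((1.0, 2.0), "-"), ((15.0, 16.0), "+"),
]
print(knn_classify(training_set, (9.1, 11.0), k=5))  # -> '+'
```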
KNN
• We can represent two points in two dimensions (‘in two-dimensional
space’ is the usual term) as (a1, a2) and (b1, b2)
• When there are three attributes we can represent the points by (a1, a2, a3)
and (b1, b2, b3)
• When there are n attributes, we can represent the instances by the points
(a1, a2, . . . , an) and (b1, b2, . . . , bn) in ‘n-dimensional space’
Distance Measures: Euclidean Distance
• If we denote an instance in the training set by (a1, a2) and the unseen
instance by (b1, b2), the length of the straight line joining the points is
$\sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2}$
• If there are two points (a1, a2, a3) and (b1, b2, b3) in three-dimensional
space, the corresponding formula is
$\sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + (a_3 - b_3)^2}$
• The formula for the Euclidean distance between points (a1, a2, . . . , an)
and (b1, b2, . . . , bn) in n-dimensional space is a generalisation of these
two results. The Euclidean distance is given by
$\sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}$
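A direct transcription of the n-dimensional formula (a sketch using numpy; the pure-Python version shown earlier is equivalent):

```python
import numpy as np

def euclidean_distance(a, b):
    """sqrt of the sum over i of (a_i - b_i)^2, for points in n dimensions."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))

# 2-D check: distance between (1, 2) and (4, 6) is sqrt(9 + 16) = 5.
print(euclidean_distance((1, 2), (4, 6)))  # 5.0
```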
Distance Measures: Manhattan Distance
• The Manhattan (City Block) distance is the sum of the absolute differences
of the attribute values: $\sum_{i=1}^{n} |a_i - b_i|$
• For example, the City Block distance between the points (4, 2) and (12, 9)
is (12 − 4) + (9 − 2) = 8 + 7 = 15
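The same worked example in code:

```python
def manhattan_distance(a, b):
    """City Block distance: the sum of absolute per-attribute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

print(manhattan_distance((4, 2), (12, 9)))  # (12 - 4) + (9 - 2) = 15
```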
