Data mining
Assignment week 6




BARRY KOLLEE

10349863


Exercise 1: Lazy Learning
How is lazy learning different from the other machine learning approaches we have covered
so far?

In the previous lectures and exercises we have been working on how to predict various types of data. The
approach we have used so far is to split the data into a training set and a test set, and to use the training
set to derive a model (for example a set of conditions or a formula) for handling the data we process later.

The distinction between a lazy learning system and these previous systems is that a lazy system simply
stores its training data and uses it directly when new data has to be processed. In the previous approaches
the training data was only needed up front: once we had derived the conditions or formula with which we
predict new values/instances, the training data on which we based them was no longer necessary. A lazy
learning system instead defers all generalization to prediction time and consults the stored training data for
every new instance, so it always predicts from the most up-to-date training data. A disadvantage of this
'lazy learning' approach is that it takes a lot of storage to keep the full training set, plus extra computation
at prediction time to compare each new instance against it.
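
As a minimal sketch of this difference (in Python, with made-up example data): the fit step below builds no
model at all, it only stores the training set, and every prediction consults that stored data directly.

    # A minimal lazy (instance-based) learner: "training" is just storing the data;
    # all computation is deferred to prediction time.
    class Lazy1NN:

        def fit(self, X, y):
            # Store the training set verbatim - this is the storage cost of lazy learning.
            self.X, self.y = X, y

        def predict(self, point):
            # Generalization happens here, at query time: find the stored instance
            # closest to the query (Manhattan distance) and copy its label.
            distances = [sum(abs(a - b) for a, b in zip(point, x)) for x in self.X]
            return self.y[distances.index(min(distances))]

    model = Lazy1NN()
    model.fit([(0.25, 0.25), (0.25, 0.75)], ['+', '-'])
    print(model.predict((0.30, 0.70)))  # prints '-': the nearest stored instance wins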







Exercise 2: k Nearest Neighbor Classification
2.1 How does a form of overfitting affect kNN classification, and what can be done
to overcome this?

The kNN classifier classifies new instances using an existing dataset: to classify a new instance
we search for the nearest point(s)/neighbor(s) of that instance.

The kNN algorithm takes a parameter k into account: the number of nearest points we consider when
classifying a new instance. If we only look at one (or a few) nearest point(s), overfitting can easily occur.

To illustrate how overfitting arises in the kNN algorithm, I use the three examples listed below.



    Example 1:
    In this example we look at which point, a plus or a minus, lies closest to the red dot.
    We use a k (the number of points we take into account) of 1. Because the red
    dot is closest to a plus, we 'generalize' the red dot to a plus value.




    Example 2:
    Again we look at which point, a plus or a minus, lies closest to the red dot,
    with a k of 1. Because this red dot is closest to a minus, we 'generalize' the
    red dot to a minus value.




    Example 3:
    In this example we use a k of 3, so we check the three nearest neighbors,
    and a single point can no longer decide the value of the red dot on its own.
    If we draw a circle around the three nearest neighbors, as in the screenshot,
    we see that the same dot from example 2 now gets a plus value instead of a
    minus, because there are two plusses and only one minus in the closest
    region of the red dot.



What you can see in examples 2 and 3 is that the red dot was first classified as a minus, but when we
increased the k value its class flipped to a plus. In conclusion: a very low k value makes the classifier
highly sensitive to individual (possibly noisy or atypical) training points, which is exactly how overfitting
shows up in kNN. Increasing the k value bases the decision on a larger neighborhood, which smooths
the classification and makes it more robust, so choosing a larger (but not too large) k is the way to
overcome this form of overfitting.
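
The Python sketch below mimics the situation of examples 2 and 3; the coordinates are made up for
illustration and are not taken from the figures. With k = 1 the single nearest (here: atypical) point decides
the class, while with k = 3 the majority of the neighborhood decides, and the prediction flips.

    from collections import Counter

    def knn_vote(points, labels, query, k):
        # Rank the training points by Manhattan distance to the query,
        # then take a simple majority vote among the k nearest labels.
        ranked = sorted(zip(points, labels),
                        key=lambda pl: sum(abs(a - b) for a, b in zip(query, pl[0])))
        votes = Counter(label for _, label in ranked[:k])
        return votes.most_common(1)[0][0]

    # Made-up layout: two plusses near the query, one stray minus right next to it.
    points = [(0.9, 1.0), (1.2, 0.8), (1.1, 1.1)]
    labels = ['-', '+', '+']
    query = (1.0, 1.0)

    print(knn_vote(points, labels, query, k=1))  # '-': the single stray point wins
    print(knn_vote(points, labels, query, k=3))  # '+': the neighborhood majority wins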








2.2 Given the following data: how does the kNN algorithm
classify instances 7 and 8, with k = 1 and k = 3? You can use
simple majority voting and the Manhattan distance.

The dataset used in the calculations below is:

    Instance 1: x1 = 0.25, x2 = 0.25, class = +
    Instance 2: x1 = 0.25, x2 = 0.75, class = +
    Instance 3: x1 = 0.50, x2 = 0.25, class = -
    Instance 4: x1 = 0.50, x2 = 0.75, class = -
    Instance 5: x1 = 0.75, x2 = 0.50, class = -
    Instance 6: x1 = 0.75, x2 = 1.00, class = +
    Instance 7: x1 = 0.25, x2 = 0.55, class = ?
    Instance 8: x1 = 0.75, x2 = 0.80, class = ?

I have chosen the Manhattan distance measure to classify instances 7 and 8:

    distance(a, b) = |a.x1 – b.x1| + |a.x2 – b.x2|

First we compare instance 7 (0.25, 0.55) with all other instances. We note all
distances.

Instance 1

    0.25 – 0.25 = 0.00
    0.55 – 0.25 = 0.30
Distance = 0.00 + 0.30 = 0.30


Instance 2

    0.25 – 0.25 = 0.00
    0.55 – 0.75 = -0.20
    Distance = |-0.20| = 0.20


Instance 3

    0.25 – 0.50 = -0.25
    0.55 – 0.25 = 0.30
    Distance = |-0.25| + 0.30 = 0.55


Instance 4

    0.25 – 0.50 = -0.25
    0.55 – 0.75 = -0.20
    Distance = |-0.25| + |-0.20| = 0.45


Instance 5

    0.25 – 0.75 = -0.50
    0.55 – 0.50 = 0.05
    Distance = |-0.50| + 0.05 = 0.55


Instance 6

    0.25 – 0.75 = -0.50
    0.55 – 1.00 = -0.45
    Distance = |-0.50| + |-0.45| = 0.95



     1.   With a k-value of 1 we look for the single instance closest to instance 7, i.e. the instance
          with the shortest distance. This is instance 2, so we classify instance 7 as a plus (+).
     2.   With a k-value of 3 we look for the three instances with the shortest distance to instance
          7. These are instances 1, 2 and 4. We classify instance 7 as a plus (+) because 2/3 of
          these nearest neighbors are classified as a plus.






Now we compare instance 8 (0.75, 0.80) with all other instances. We note all distances.

Instance 1

    0.75 – 0.25 = 0.50
    0.80 – 0.25 = 0.55
Distance = 0.50 + 0.55 = 1.05


Instance 2

    0.75 – 0.25 = 0.50
    0.80 – 0.75 = 0.05
Distance = 0.50 + 0.05 = 0.55


Instance 3

    0.75 – 0.50 = 0.25
    0.80 – 0.25 = 0.55
Distance = 0.25 + 0.55 = 0.80


Instance 4

    0.75 – 0.50 = 0.25
    0.80 – 0.75 = 0.05
Distance = 0.25 + 0.05 = 0.30


Instance 5

    0.75 – 0.75 = 0.00
    0.80 – 0.50 = 0.30
Distance = 0.00 + 0.30 = 0.30


Instance 6

    0.75 – 0.75 = 0.00
0.80 – 1.00 = -0.20
    Distance = |-0.20| = 0.20



     1.   With a k-value of 1 we look for the single instance closest to instance 8, i.e. the instance
          with the shortest distance. This is instance 6, so we classify instance 8 as a plus (+).
     2.   With a k-value of 3 we look for the three instances with the shortest distance to instance
          8. These are instances 4, 5 and 6. We classify instance 8 as a minus (-) because 2/3 of
          these nearest neighbors are classified as a minus.
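
As a check of the hand calculations for both instances, the same procedure in a short Python sketch,
using the dataset listed in 2.2:

    from collections import Counter

    # Training instances 1-6: (x1, x2) -> class, as listed in 2.2.
    train = {(0.25, 0.25): '+', (0.25, 0.75): '+', (0.50, 0.25): '-',
             (0.50, 0.75): '-', (0.75, 0.50): '-', (0.75, 1.00): '+'}

    def manhattan(a, b):
        return sum(abs(p - q) for p, q in zip(a, b))

    def knn(query, k):
        # Sort the training instances by Manhattan distance to the query
        # and let the k nearest ones vote by simple majority.
        ranked = sorted(train, key=lambda point: manhattan(point, query))
        return Counter(train[point] for point in ranked[:k]).most_common(1)[0][0]

    for name, query in [('instance 7', (0.25, 0.55)), ('instance 8', (0.75, 0.80))]:
        print(name, 'k=1 ->', knn(query, 1), ' k=3 ->', knn(query, 3))
    # instance 7: '+' for k=1 and k=3; instance 8: '+' for k=1, '-' for k=3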






2.3 Given the dataset from question 2.2, how are instances 7
and 8 classified when using the prototype classifier, and what
are the coordinates of the prototypes (i.e., their x1 and x2 values)?

For classifying instances 7 and 8 with the prototype classifier we need to
define a 'super'-plus and a 'super'-minus: two new 'average' instances whose
coordinates are the mean of all plus-classified instances (1, 2 and 6) and of
all minus-classified instances (3, 4 and 5), respectively.

First we calculate a ‘super’-plus:


    Super-plus x1 value = (0.25 + 0.25 + 0.75) / 3 ≈ 0.42
    Super-plus x2 value = (0.25 + 0.75 + 1.00) / 3 ≈ 0.67


Now we calculate a ‘super’-minus:


    Super-minus x1 value = (0.50 + 0.50 + 0.75) / 3 ≈ 0.58
    Super-minus x2 value = (0.25 + 0.75 + 0.50) / 3 = 0.50


Classifying instance 7

Now we use the Manhattan distance to compute the distance between the super-plus and
instance 7:


    Distance x1 = |super-plus x1 – instance 7 x1| = |0.42 – 0.25| = 0.17
    Distance x2 = |super-plus x2 – instance 7 x2| = |0.67 – 0.55| = 0.12

    Distance = 0.17 + 0.12 = 0.29



And then we compute the distance between the super-minus and instance 7:


    Distance x1 = |super-minus x1 – instance 7 x1| = |0.58 – 0.25| = 0.33
    Distance x2 = |super-minus x2 – instance 7 x2| = |0.50 – 0.55| = 0.05

    Distance = 0.33 + 0.05 = 0.38




We see that the smallest distance (0.29 vs 0.38) is between the super-plus and instance 7. So we
classify instance 7 as a plus (+).




6
Assignment 6


Classifying instance 8

Now we use the Manhattan distance to compute the distance between the super-plus and
instance 8:


    Distance x1 = |super-plus x1 – instance 8 x1| = |0.42 – 0.75| = 0.33
    Distance x2 = |super-plus x2 – instance 8 x2| = |0.67 – 0.80| = 0.13

    Distance = 0.33 + 0.13 = 0.46



And then we compute the distance between the super-minus and instance 8:


    Distance x1 = |super-minus x1 – instance 8 x1| = |0.58 – 0.75| = 0.17
    Distance x2 = |super-minus x2 – instance 8 x2| = |0.50 – 0.80| = 0.30

    Distance = 0.17 + 0.30 = 0.47




We see that the smallest distance (0.46 vs 0.47) is again to the super-plus, so we classify instance 8
as a plus (+). Note that this outcome hinges on the rounding of the prototype coordinates: with
unrounded prototypes both distances are exactly 7/15 ≈ 0.47, an exact tie.
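
The prototype classifier as a short Python sketch, again as a check of the calculations above; the
prototype coordinates are rounded to two decimals here to match the hand calculation:

    # Training instances 1-6 from 2.2: (x1, x2) -> class.
    train = {(0.25, 0.25): '+', (0.25, 0.75): '+', (0.50, 0.25): '-',
             (0.50, 0.75): '-', (0.75, 0.50): '-', (0.75, 1.00): '+'}

    def prototype(label):
        # The prototype ('super' instance) is the coordinate-wise mean of all
        # training points carrying this label.
        points = [p for p, c in train.items() if c == label]
        return tuple(round(sum(coords) / len(points), 2) for coords in zip(*points))

    protos = {label: prototype(label) for label in ('+', '-')}
    print(protos)  # {'+': (0.42, 0.67), '-': (0.58, 0.5)}

    def classify(query):
        # Assign the class of the prototype with the smallest Manhattan distance.
        return min(protos,
                   key=lambda lb: sum(abs(p - q) for p, q in zip(protos[lb], query)))

    print(classify((0.25, 0.55)))  # instance 7 -> '+' (0.29 vs 0.38)
    print(classify((0.75, 0.80)))  # instance 8 -> '+' (0.46 vs 0.47)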




