Classification and prediction




          Prepared By - Mr. Nilesh Magar
•What Is Classification?
•Example
•Two-Step Process:
       Learning step: a training set made up of DB tuples and their
                       associated class labels is analyzed to derive a
classification rule, decision tree, or mathematical formula
       Classification step: the learned model is used to assign class labels to new tuples

•Supervised Learning:
•Accuracy of the classifier:




Classifier (figure): a tuple's attribute values are mapped to a class label
Decision Tree

•Between 1970 and 1980, J. Ross Quinlan, a researcher in machine
learning, developed a decision tree algorithm known as ID3
(Iterative Dichotomiser); C4.5 is the successor of ID3.
•CART (Classification & Regression Trees) was also developed during
the same period; it describes the generation of binary trees.
•Flowchart-like tree structure: root node, internal node, branch, leaf node.




•How are decision trees used for classification?
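A short sketch answers the question above: given a tuple X whose class label is unknown, the attribute values of X are tested against the tree, tracing a path from the root down to a leaf, which holds the class prediction. The tree, attribute names, and values below are invented purely for illustration.

```python
# Hypothetical example tree: each internal node is (attribute, {value: subtree}),
# each leaf is a class label. Names and thresholds are made up.
tree = ("age", {
    "youth":       ("student", {"yes": "buys", "no": "does_not_buy"}),
    "middle_aged": "buys",
    "senior":      ("credit", {"fair": "buys", "excellent": "does_not_buy"}),
})

def classify(node, tuple_):
    """Follow the branch matching the tuple's attribute value until a leaf."""
    while isinstance(node, tuple):            # internal node: keep descending
        attribute, branches = node
        node = branches[tuple_[attribute]]    # take the matching branch
    return node                               # leaf = predicted class label

print(classify(tree, {"age": "youth", "student": "yes"}))  # buys
```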
3 Attribute Selection Methods
3 Termination Conditions
3 Splitting Scenarios
Splitting Scenarios
1) A is discrete-valued                    2) A is continuous-valued
3) A is discrete-valued, but a binary tree must be produced
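A minimal sketch of the three scenarios, assuming tuples are represented as dicts; the attribute names and example data are invented:

```python
def split_discrete(D, attr):
    """Scenario 1: one partition per distinct value of a discrete attribute A."""
    parts = {}
    for row in D:
        parts.setdefault(row[attr], []).append(row)
    return parts

def split_continuous(D, attr, split_point):
    """Scenario 2: two partitions, A <= split_point and A > split_point."""
    return ([r for r in D if r[attr] <= split_point],
            [r for r in D if r[attr] > split_point])

def split_discrete_binary(D, attr, subset):
    """Scenario 3: binary test 'A in S_A?' on a discrete attribute."""
    return ([r for r in D if r[attr] in subset],
            [r for r in D if r[attr] not in subset])

D = [{"color": "red", "size": 3},
     {"color": "blue", "size": 7},
     {"color": "red", "size": 5}]
print(len(split_discrete(D, "color")))  # 2 partitions: red, blue
```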




Termination Condition: the recursive partitioning stops only when one of the following conditions holds:

1. All of the tuples in partition D (represented at node N) belong to the same class (steps 2 and 3), or

2. There are no remaining attributes on which the tuples may be further partitioned (step 4). In this case, majority voting is employed (step 5): node N is converted into a leaf and labeled with the most common class in D. Alternatively, the class distribution of the node's tuples may be stored.

3. There are no tuples for a given branch, that is, a partition Dj is empty (step 12). In this case, a leaf is created with the majority class in D (step 13).
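The three terminating conditions can be seen in a recursive builder. This is an illustrative skeleton, not the textbook algorithm verbatim: attribute selection is stubbed out (the first remaining attribute is used), and `domains` lists the possible values of each attribute so that an empty partition Dj can actually arise.

```python
from collections import Counter

def build_tree(D, attributes, domains):
    classes = [row["class"] for row in D]
    if len(set(classes)) == 1:                 # 1. all tuples in same class -> leaf
        return classes[0]
    if not attributes:                         # 2. no attributes left ->
        return Counter(classes).most_common(1)[0][0]  # majority voting
    attr, rest = attributes[0], attributes[1:]  # stand-in for attribute selection
    majority = Counter(classes).most_common(1)[0][0]
    branches = {}
    for value in domains[attr]:
        Dj = [row for row in D if row[attr] == value]
        if not Dj:                             # 3. empty partition Dj ->
            branches[value] = majority         # leaf labeled with majority of D
        else:
            branches[value] = build_tree(Dj, rest, domains)
    return (attr, branches)

# Tiny invented dataset: "overcast" never occurs, exercising condition 3.
D = [{"outlook": "sunny", "class": "no"},
     {"outlook": "rain",  "class": "yes"},
     {"outlook": "sunny", "class": "no"}]
domains = {"outlook": ["sunny", "overcast", "rain"]}
```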


Attribute Selection Measures:

 1. Information Gain:
 2. Gain Ratio:
 3. Gini Index
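As a hedged sketch of the first measure: information gain is Gain(A) = Info(D) − Info_A(D), where Info(D) is the entropy of the class distribution and Info_A(D) weights the entropy of each partition by its size.

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Expected information Info(D) = -sum(p_i * log2(p_i))."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(D, attr):
    """Gain(A) = Info(D) - Info_A(D)."""
    labels = [row["class"] for row in D]
    parts = {}
    for row in D:
        parts.setdefault(row[attr], []).append(row["class"])
    info_a = sum(len(p) / len(D) * entropy(p) for p in parts.values())
    return entropy(labels) - info_a

# Invented data: "wind" splits a 50/50 class mix perfectly.
D = [{"wind": "strong", "class": "no"},  {"wind": "weak", "class": "yes"},
     {"wind": "strong", "class": "no"},  {"wind": "weak", "class": "yes"}]
print(info_gain(D, "wind"))  # 1.0 (a perfect split recovers all the information)
```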




Performance:
 •Quite simple; suitable for relatively small data sets
 •Large real-world databases?
 •Training tuples should reside in main memory
 Issues:

 •Overfitting

   Tree pruning:

1. Pre-pruning
2. Post-pruning




Bayes Classification Method

•Statistical classifiers
•They are used to predict class membership probabilities
•Based on Bayes’ theorem
•Naïve
•It assumes that the “effect of an attribute value on a given class is independent
of the values of the other attributes” – class conditional independence
•The name Bayes comes from Thomas Bayes, who did early work in
probability and decision theory during the 18th century.



Bayesian Theorem


•Let X be a data tuple (“evidence”) and let H be a hypothesis that X belongs to a specific
class.
•Determine P(H|X):
•Posterior probability: P(H|X); e.g., tuple X describes a customer with age = 35
and salary = 40,000, and H is the hypothesis that the customer will buy a computer.
•Prior probability: P(H)
•P(X|H): the posterior probability of X conditioned on H
•P(X): the prior probability of X
•Bayes’ theorem:
                P(H|X) = P(X|H) P(H) / P(X)
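A quick numeric illustration of the theorem for the buys-computer example (all probability values below are invented):

```python
p_h = 0.3          # prior P(H): fraction of customers who buy a computer
p_x_given_h = 0.4  # likelihood P(X|H): P(age=35, salary=40,000 | buys)
p_x = 0.2          # evidence P(X): P(age=35, salary=40,000) overall

# Bayes' theorem: P(H|X) = P(X|H) P(H) / P(X)
p_h_given_x = p_x_given_h * p_h / p_x
print(round(p_h_given_x, 4))  # 0.6
```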
Naïve Bayesian classifier:


Suppose there are m classes, C1, C2, ..., Cm. Given a tuple X, the classifier will predict that X
belongs to the class having the highest posterior probability conditioned on X. X belongs to Ci if
and only if
          P(Ci|X) > P(Cj|X)               for 1 <= j <= m, j != i

By Bayes’ theorem,

           P(Ci|X) = P(X|Ci) P(Ci) / P(X)
Since P(X) is constant for all classes, only P(X|Ci) P(Ci) needs to be maximized. If the class prior
probabilities are not known, it is commonly assumed that P(C1) = P(C2) = ...... = P(Cm), so only
P(X|Ci) needs to be maximized. Because evaluating P(X|Ci) directly is computationally expensive,
the assumption of class conditional independence is applied.
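Under class conditional independence, P(X|Ci) factors into a product of per-attribute probabilities. A minimal sketch over categorical attributes (no smoothing, dict-based tuples, invented data), not a production implementation:

```python
from collections import Counter, defaultdict

def train(D, attrs):
    """Estimate class priors P(Ci) and per-attribute counts from data."""
    n = len(D)
    prior = {c: cnt / n for c, cnt in Counter(r["class"] for r in D).items()}
    cond = defaultdict(lambda: defaultdict(Counter))  # cond[class][attr][value]
    for r in D:
        for a in attrs:
            cond[r["class"]][a][r[a]] += 1
    return prior, cond

def predict(x, prior, cond, attrs):
    """Pick the class maximizing P(Ci) * product of P(x_a | Ci)."""
    best, best_p = None, -1.0
    for c, p in prior.items():
        n_c = sum(cond[c][attrs[0]].values())   # number of tuples in class c
        for a in attrs:
            p *= cond[c][a][x[a]] / n_c         # naive factor P(x_a | c)
        if p > best_p:
            best, best_p = c, p
    return best

# Invented toy data.
D = [{"weather": "sunny", "class": "play"}, {"weather": "sunny", "class": "play"},
     {"weather": "rain",  "class": "stay"}, {"weather": "rain",  "class": "stay"},
     {"weather": "sunny", "class": "stay"}]
attrs = ["weather"]
prior, cond = train(D, attrs)
print(predict({"weather": "sunny"}, prior, cond, attrs))  # play
```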




Example:




Prediction
Regression analysis can be used to model the relationship between two variables.
         Predictor variable: the values of the predictor variables are known.
         Response variable: the response variable is what we want to predict.
Linear regression:
       y = b + wx, also written as

         y = w0 + w1x




Example
Animal        height (feet)        weight (lbs)
Animal1       9                    300
Animal2       8.78                 295
Animal3       9.6                  312
Animal4       8.09                 280
Animal5       5                    200
Animal6       5.5                  250
Animal7       5.42                 230
Animal8       5.75                 250

Given the above data, we compute x̄ = 7.15 and ȳ = 264.7

W1 = [(9 − 7.15)(300 − 264.7) + (8.78 − 7.15)(295 − 264.7) + (9.6 − 7.15)(312 − 264.7) + ……… + (5.75 − 7.15)(250 − 264.7)]
     / [(9 − 7.15)² + (8.78 − 7.15)² + ……… + (5.75 − 7.15)²]
   = 19.35337
Let w0 = 264.7 − (19.35337)(7.15)
       = 126.3234
y = 126.3234 + 19.35337x. Using this equation, we can predict that an animal with a height of
8 feet will weigh about 281.1504 lbs (126.3234 + 19.35337(8)).
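The computation above can be reproduced directly. This sketch uses exact rather than rounded means, so its figures agree with the slide's to about two decimal places:

```python
heights = [9, 8.78, 9.6, 8.09, 5, 5.5, 5.42, 5.75]
weights = [300, 295, 312, 280, 200, 250, 230, 250]

x_bar = sum(heights) / len(heights)   # 7.1425 (slide rounds to 7.15)
y_bar = sum(weights) / len(weights)   # 264.625 (slide rounds to 264.7)

# Least-squares slope and intercept, as in the slide's formulas.
num = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))
den = sum((x - x_bar) ** 2 for x in heights)
w1 = num / den                        # ~19.35
w0 = y_bar - w1 * x_bar               # ~126.4

print(round(w0 + w1 * 8, 1))          # predicted weight at 8 ft, ~281.2 lbs
```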

Subjects
1)   U.M.L.
2)   P.P.L.
3)   D.M.D.W.
4)   O.S.
5)   Programming Languages
6)   RDBMS
                                    Mr. Nilesh Magar
                                    Lecturer at MIT, Kothrud, Pune.
                                    9975155310.
Thank You




