Classification and prediction




          Prepared By - Mr. Nilesh Magar
•What Is Classification?
•Example
•Two-Step Process:
       Learning step: a training set made up of DB tuples and their
                       associated class labels is analyzed to derive a
classification rule, decision tree, or mathematical formula
       Classification step: the learned model is used to assign class labels to new tuples

•Supervised Learning:
•Accuracy of the classifier:




Classifier (figure): a tuple's attribute values are mapped to a class label
Decision Tree

•Between 1970 and 1980, J. Ross Quinlan, a researcher in machine
learning, developed a decision tree algorithm known as ID3
(Iterative Dichotomiser); C4.5 is the successor of ID3.
•CART (Classification & Regression Trees) was also developed during
the same period; it describes the generation of binary trees.
•Flowchart-like tree structure: root node, internal node, branch, leaf node.




•How are decision trees used for classification?
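A short sketch answers the question above: given a tuple X whose class label is unknown, the attribute values of X are tested against the tree, tracing a path from the root down to a leaf, which holds the class prediction. The tree, attribute names, and values below are invented purely for illustration.

```python
# Hypothetical example tree: each internal node is (attribute, {value: subtree}),
# each leaf is a class label. Names and thresholds are made up.
tree = ("age", {
    "youth":       ("student", {"yes": "buys", "no": "does_not_buy"}),
    "middle_aged": "buys",
    "senior":      ("credit", {"fair": "buys", "excellent": "does_not_buy"}),
})

def classify(node, tuple_):
    """Follow the branch matching the tuple's attribute value until a leaf."""
    while isinstance(node, tuple):            # internal node: keep descending
        attribute, branches = node
        node = branches[tuple_[attribute]]    # take the matching branch
    return node                               # leaf = predicted class label

print(classify(tree, {"age": "youth", "student": "yes"}))  # buys
```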
3 Attribute Selection Methods
3 Termination Conditions
3 Splitting Scenarios
Splitting Scenarios
1) A is discrete-valued                    2) A is continuous-valued
3) A is discrete-valued, but a binary tree must be produced
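A minimal sketch of the three scenarios, assuming tuples are represented as dicts; the attribute names and example data are invented:

```python
def split_discrete(D, attr):
    """Scenario 1: one partition per distinct value of a discrete attribute A."""
    parts = {}
    for row in D:
        parts.setdefault(row[attr], []).append(row)
    return parts

def split_continuous(D, attr, split_point):
    """Scenario 2: two partitions, A <= split_point and A > split_point."""
    return ([r for r in D if r[attr] <= split_point],
            [r for r in D if r[attr] > split_point])

def split_discrete_binary(D, attr, subset):
    """Scenario 3: binary test 'A in S_A?' on a discrete attribute."""
    return ([r for r in D if r[attr] in subset],
            [r for r in D if r[attr] not in subset])

D = [{"color": "red", "size": 3},
     {"color": "blue", "size": 7},
     {"color": "red", "size": 5}]
print(len(split_discrete(D, "color")))  # 2 partitions: red, blue
```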




Termination Condition: the recursive partitioning stops only when one of the following conditions holds:

1. All of the tuples in partition D (represented at node N) belong to the same class (steps 2 and 3), or

2. There are no remaining attributes on which the tuples may be further partitioned (step 4). In this case, majority voting is employed (step 5): node N is converted into a leaf and labeled with the most common class in D. Alternatively, the class distribution of the node's tuples may be stored.

3. There are no tuples for a given branch, that is, a partition Dj is empty (step 12). In this case, a leaf is created with the majority class in D (step 13).
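The three terminating conditions can be seen in a recursive builder. This is an illustrative skeleton, not the textbook algorithm verbatim: attribute selection is stubbed out (the first remaining attribute is used), and `domains` lists the possible values of each attribute so that an empty partition Dj can actually arise.

```python
from collections import Counter

def build_tree(D, attributes, domains):
    classes = [row["class"] for row in D]
    if len(set(classes)) == 1:                 # 1. all tuples in same class -> leaf
        return classes[0]
    if not attributes:                         # 2. no attributes left ->
        return Counter(classes).most_common(1)[0][0]  # majority voting
    attr, rest = attributes[0], attributes[1:]  # stand-in for attribute selection
    majority = Counter(classes).most_common(1)[0][0]
    branches = {}
    for value in domains[attr]:
        Dj = [row for row in D if row[attr] == value]
        if not Dj:                             # 3. empty partition Dj ->
            branches[value] = majority         # leaf labeled with majority of D
        else:
            branches[value] = build_tree(Dj, rest, domains)
    return (attr, branches)

# Tiny invented dataset: "overcast" never occurs, exercising condition 3.
D = [{"outlook": "sunny", "class": "no"},
     {"outlook": "rain",  "class": "yes"},
     {"outlook": "sunny", "class": "no"}]
domains = {"outlook": ["sunny", "overcast", "rain"]}
```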


Attribute Selection Measures:

 1. Information Gain:
 2. Gain Ratio:
 3. Gini Index
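As a hedged sketch of the first measure: information gain is Gain(A) = Info(D) − Info_A(D), where Info(D) is the entropy of the class distribution and Info_A(D) weights the entropy of each partition by its size.

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Expected information Info(D) = -sum(p_i * log2(p_i))."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(D, attr):
    """Gain(A) = Info(D) - Info_A(D)."""
    labels = [row["class"] for row in D]
    parts = {}
    for row in D:
        parts.setdefault(row[attr], []).append(row["class"])
    info_a = sum(len(p) / len(D) * entropy(p) for p in parts.values())
    return entropy(labels) - info_a

# Invented data: "wind" splits a 50/50 class mix perfectly.
D = [{"wind": "strong", "class": "no"},  {"wind": "weak", "class": "yes"},
     {"wind": "strong", "class": "no"},  {"wind": "weak", "class": "yes"}]
print(info_gain(D, "wind"))  # 1.0 (a perfect split recovers all the information)
```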




Performance:
 •Quite simple; suitable for relatively small data sets
 •Large real-world databases?
 •Training tuples should reside in main memory
 Issues:

 •Overfitting

   Tree pruning:

1. Pre-pruning
2. Post-pruning




Bayes Classification Method

•Statistical classifiers
•They are used to predict class membership probabilities
•Based on Bayes’ theorem
•Naïve
•It assumes that the “effect of an attribute value on a given class is independent
of the values of the other attributes” – class conditional independence
•The name Bayes comes from Thomas Bayes, who did early work in
probability and decision theory during the 18th century.



Bayesian Theorem


•Let X be a data tuple (“evidence”) and let H be a hypothesis that X belongs to a specific
class.
•Determine P(H|X):
•Posterior probability: P(H|X); e.g., tuple X describes a customer with age = 35
and salary = 40,000, and H is the hypothesis that the customer will buy a computer.
•Prior probability: P(H)
•P(X|H): the posterior probability of X conditioned on H
•P(X): the prior probability of X
•Bayes’ theorem:
                P(H|X) = P(X|H) P(H) / P(X)
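A quick numeric illustration of the theorem for the buys-computer example (all probability values below are invented):

```python
p_h = 0.3          # prior P(H): fraction of customers who buy a computer
p_x_given_h = 0.4  # likelihood P(X|H): P(age=35, salary=40,000 | buys)
p_x = 0.2          # evidence P(X): P(age=35, salary=40,000) overall

# Bayes' theorem: P(H|X) = P(X|H) P(H) / P(X)
p_h_given_x = p_x_given_h * p_h / p_x
print(round(p_h_given_x, 4))  # 0.6
```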
Naïve Bayesian classifier:


Suppose there are m classes, C1, C2, ..., Cm. Given a tuple X, the classifier will predict that X
belongs to the class having the highest posterior probability conditioned on X. X belongs to Ci if
and only if
          P(Ci|X) > P(Cj|X)               for 1 <= j <= m, j != i

By Bayes’ theorem,

           P(Ci|X) = P(X|Ci) P(Ci) / P(X)
Since P(X) is constant for all classes, only P(X|Ci) P(Ci) needs to be maximized. If the class prior
probabilities are not known, it is commonly assumed that P(C1) = P(C2) = ...... = P(Cm), so only
P(X|Ci) needs to be maximized. Because evaluating P(X|Ci) directly is computationally expensive,
the assumption of class conditional independence is applied.
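Under class conditional independence, P(X|Ci) factors into a product of per-attribute probabilities. A minimal sketch over categorical attributes (no smoothing, dict-based tuples, invented data), not a production implementation:

```python
from collections import Counter, defaultdict

def train(D, attrs):
    """Estimate class priors P(Ci) and per-attribute counts from data."""
    n = len(D)
    prior = {c: cnt / n for c, cnt in Counter(r["class"] for r in D).items()}
    cond = defaultdict(lambda: defaultdict(Counter))  # cond[class][attr][value]
    for r in D:
        for a in attrs:
            cond[r["class"]][a][r[a]] += 1
    return prior, cond

def predict(x, prior, cond, attrs):
    """Pick the class maximizing P(Ci) * product of P(x_a | Ci)."""
    best, best_p = None, -1.0
    for c, p in prior.items():
        n_c = sum(cond[c][attrs[0]].values())   # number of tuples in class c
        for a in attrs:
            p *= cond[c][a][x[a]] / n_c         # naive factor P(x_a | c)
        if p > best_p:
            best, best_p = c, p
    return best

# Invented toy data.
D = [{"weather": "sunny", "class": "play"}, {"weather": "sunny", "class": "play"},
     {"weather": "rain",  "class": "stay"}, {"weather": "rain",  "class": "stay"},
     {"weather": "sunny", "class": "stay"}]
attrs = ["weather"]
prior, cond = train(D, attrs)
print(predict({"weather": "sunny"}, prior, cond, attrs))  # play
```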




Example:




Prediction
Regression analysis can be used to model the relationship between two variables.
         Predictor variable: the values of the predictor variables are known.
         Response variable: the response variable is what we want to predict.
Linear regression:
       y = b + wx, also written as

         y = w0 + w1x




Example
Animal        height (feet)        weight (lbs)
Animal1       9                    300
Animal2       8.78                 295
Animal3       9.6                  312
Animal4       8.09                 280
Animal5       5                    200
Animal6       5.5                  250
Animal7       5.42                 230
Animal8       5.75                 250

Given the above data, we compute x̄ = 7.15 and ȳ = 264.7

W1 = [(9 − 7.15)(300 − 264.7) + (8.78 − 7.15)(295 − 264.7) + (9.6 − 7.15)(312 − 264.7) + ……… + (5.75 − 7.15)(250 − 264.7)]
     / [(9 − 7.15)² + (8.78 − 7.15)² + ……… + (5.75 − 7.15)²]
   = 19.35337
Let w0 = 264.7 − (19.35337)(7.15)
       = 126.3234
y = 126.3234 + 19.35337x. Using this equation, we can predict that an animal with a height of
8 feet will weigh about 281.1504 lbs (126.3234 + 19.35337(8)).
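The computation above can be reproduced directly. This sketch uses exact rather than rounded means, so its figures agree with the slide's to about two decimal places:

```python
heights = [9, 8.78, 9.6, 8.09, 5, 5.5, 5.42, 5.75]
weights = [300, 295, 312, 280, 200, 250, 230, 250]

x_bar = sum(heights) / len(heights)   # 7.1425 (slide rounds to 7.15)
y_bar = sum(weights) / len(weights)   # 264.625 (slide rounds to 264.7)

# Least-squares slope and intercept, as in the slide's formulas.
num = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))
den = sum((x - x_bar) ** 2 for x in heights)
w1 = num / den                        # ~19.35
w0 = y_bar - w1 * x_bar               # ~126.4

print(round(w0 + w1 * 8, 1))          # predicted weight at 8 ft, ~281.2 lbs
```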

Subjects
1)   U.M.L.
2)   P.P.L.
3)   D.M.D.W.
4)   O.S.
5)   Programming Languages
6)   RDBMS
                                    Mr. Nilesh Magar
                                    Lecturer at MIT, Kothrud, Pune.
                                    9975155310.
Thank You




