Classification Techniques
and Regression
Lazy vs. Eager Learning
 Lazy learning (e.g., instance-based learning): simply stores the training data (or does only minor processing) and waits until it is given a test tuple
 Eager learning (the methods discussed earlier): given a training set, constructs a classification model before receiving new (e.g., test) data to classify
 Time: lazy learners spend less time in training but more time in prediction
 Accuracy
 A lazy method effectively uses a richer hypothesis space
 An eager method must commit to a single hypothesis
Lazy Learner: Instance-Based Methods
 Instance-based learning:
 Store training examples and delay the processing (“lazy evaluation”) until a new instance must be classified
 Typical approaches
 k-nearest neighbor approach
 Instances represented as points in a Euclidean space
 Case-based reasoning
 Uses symbolic representations and knowledge-based inference
The k-Nearest Neighbor Algorithm
 Labor-intensive at classification time; widely used for pattern recognition
 Learning by analogy
 All instances correspond to points in n-D space
 Nearest neighbors are defined in terms of Euclidean distance, dist(X1, X2)
 For discrete-valued target functions, k-NN returns the most common value among the k training examples nearest to xq
 Assigns equal weight to every attribute, so attributes have to be normalized
 Can also be used for (numeric) prediction
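As a concrete illustration of the bullets above, here is a minimal k-NN classifier for the discrete-valued case; the toy data set and the choice of k = 3 are illustrative only:

```python
from collections import Counter
import math

def euclidean(x1, x2):
    """dist(X1, X2): Euclidean distance between two points in n-D space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))

def knn_classify(training, query, k=3):
    """Return the most common label among the k training points nearest
    to the query. `training` is a list of (point, label) pairs; every
    attribute gets equal weight, so normalize attributes beforehand."""
    neighbors = sorted(training, key=lambda pl: euclidean(pl[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy data set: two well-separated clusters in 2-D.
data = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
        ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B")]
print(knn_classify(data, (1.1, 1.0), k=3))  # near cluster A -> "A"
```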
k-NN Algorithm
 k-NN for real-valued prediction of a given unknown tuple
 Returns the mean value of the k nearest neighbors
 Distance-weighted nearest neighbor algorithm
 Weight the contribution of each of the k neighbors according to its distance to the query xq
 Give greater weight to closer neighbors
 Robust to noisy data, since averaging over the k nearest neighbors smooths out noise
 Curse of dimensionality: the distance between neighbors can be dominated by irrelevant attributes
 To overcome it, stretch the axes or eliminate the least relevant attributes
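The real-valued, distance-weighted variant described above can be sketched as follows; the inverse-square weighting is one common choice, not the only one:

```python
import math

def distance_weighted_predict(training, query, k=3):
    """Real-valued k-NN prediction: weight each of the k nearest neighbors
    by the inverse square of its distance to the query, so closer neighbors
    count more; an exact match returns its value directly."""
    def dist(x1, x2):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))
    neighbors = sorted(training, key=lambda py: dist(py[0], query))[:k]
    weights, values = [], []
    for point, y in neighbors:
        d = dist(point, query)
        if d == 0:                 # query coincides with a training point
            return y
        weights.append(1.0 / d ** 2)
        values.append(y)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

points = [((0.0,), 0.0), ((1.0,), 1.0), ((2.0,), 2.0), ((3.0,), 3.0)]
print(distance_weighted_predict(points, (1.5,), k=2))  # midway -> 1.5
```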
k-NN Algorithm
 Categorical attributes
 If the two values are identical, the difference is 0; otherwise it is 1
 Missing values
 For categorical attributes, the difference is 1 if either one or both values are missing
 For numeric attributes (normalized to [0, 1])
 If both are missing, the difference is 1
 If one value is missing and the other is v′, the difference is the greater of |1 − v′| and v′
 Complexity
 Classifying a tuple requires O(|D|) comparisons, where |D| is the number of training tuples
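These per-attribute difference rules can be captured in a small helper (a sketch; `None` stands for a missing value, and numeric attributes are assumed normalized to [0, 1]):

```python
def attribute_difference(a, b, numeric=True):
    """Per-attribute difference for the k-NN distance, following the rules
    above; None represents a missing value."""
    if a is None and b is None:          # both missing -> difference 1
        return 1.0
    if not numeric:
        if a is None or b is None:       # either categorical value missing
            return 1.0
        return 0.0 if a == b else 1.0    # identical -> 0, otherwise 1
    if a is None or b is None:           # one numeric value missing: take
        v = b if a is None else a        # the greater of |1 - v'| and v'
        return max(abs(1.0 - v), abs(v))
    return abs(a - b)                    # both present: plain difference

print(attribute_difference("red", "red", numeric=False))  # 0.0
print(attribute_difference(None, 0.9))                    # 0.9
```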
Case-Based Reasoning (CBR)
 CBR: uses a database of problem solutions to solve new problems
 Stores symbolic descriptions (tuples or cases), not points in a Euclidean space
 Applications: customer service (product-related diagnosis), legal rulings
 Methodology
 Instances are represented by rich symbolic descriptions (e.g., function graphs)
 Search for similar cases; multiple retrieved cases may be combined
 Tight coupling between case retrieval, knowledge-based reasoning, and problem solving
 Challenges
 Finding a good similarity metric
 Indexing based on a syntactic similarity measure and, on failure, backtracking and adapting to additional cases
Genetic Algorithms (GA)
 Genetic algorithm: based on an analogy to biological evolution
 An initial population is created, consisting of randomly generated rules
 Each rule is represented by a string of bits
 E.g., if A1 and ¬A2 then C2 can be encoded as 100
 If an attribute has k > 2 values, k bits can be used
 Based on the notion of survival of the fittest, a new population is formed consisting of the fittest rules and their offspring
 The fitness of a rule is measured by its classification accuracy on a set of training examples
 Offspring are generated by crossover and mutation
 The process continues until a population P evolves in which each rule satisfies a prespecified fitness threshold
 Slow, but easily parallelizable
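A minimal sketch of the evolutionary loop, with bit-string rules, single-point crossover, and bit-flip mutation. Fitness here is simply the fraction of bits matching a fixed target string, a stand-in for the classification accuracy a real system would measure on training examples:

```python
import random

random.seed(42)

TARGET = "110100111010"  # stand-in for an ideal rule encoding

def fitness(rule):
    """Fraction of bits matching the target; in the classification setting
    this would be the rule's accuracy on the training examples."""
    return sum(a == b for a, b in zip(rule, TARGET)) / len(TARGET)

def crossover(p1, p2):
    cut = random.randrange(1, len(p1))   # single-point crossover
    return p1[:cut] + p2[cut:]

def mutate(rule, rate=0.05):
    return "".join(b if random.random() > rate else str(1 - int(b))
                   for b in rule)

# Initial population of randomly generated bit-string rules.
population = ["".join(random.choice("01") for _ in TARGET) for _ in range(30)]
for _ in range(100):
    population.sort(key=fitness, reverse=True)
    if fitness(population[0]) == 1.0:    # prespecified threshold reached
        break
    survivors = population[:15]          # survival of the fittest
    offspring = [mutate(crossover(*random.sample(survivors, 2)))
                 for _ in range(15)]
    population = survivors + offspring

best = max(population, key=fitness)
print(fitness(best))
```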
Rough Set Approach
 Rough sets are used to approximately or “roughly” define equivalence classes
 A rough set for a given class C is approximated by two sets: a lower approximation (tuples certain to be in C) and an upper approximation (tuples that cannot be described as not belonging to C)
 Rough sets can be used for feature reduction and relevance analysis
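The two approximations can be computed directly from a partition of the tuples into equivalence (indiscernibility) classes; the blocks and target class below are hypothetical:

```python
def rough_approximations(equivalence_classes, C):
    """Approximate class C by its lower approximation (equivalence classes
    entirely inside C, i.e. certain members) and its upper approximation
    (classes that overlap C, i.e. cannot be ruled out of C)."""
    lower, upper = set(), set()
    for block in equivalence_classes:
        if block <= C:    # block fully contained in C -> certain
            lower |= block
        if block & C:     # block intersects C -> possible
            upper |= block
    return lower, upper

# Hypothetical partition of tuples 1..6 into indiscernible blocks.
blocks = [{1, 2}, {3, 4}, {5, 6}]
C = {1, 2, 3}             # the target class
lower, upper = rough_approximations(blocks, C)
print(lower)  # {1, 2}
print(upper)  # {1, 2, 3, 4}
```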
Fuzzy Set Approaches
 Fuzzy logic uses truth values between 0.0 and 1.0 to represent the degree of membership (e.g., using a fuzzy membership graph)
 Attribute values are converted to fuzzy values
 E.g., income is mapped into the discrete categories {low, medium, high}, with fuzzy membership values calculated for each
 For a given new sample, more than one fuzzy value may apply
 Each applicable rule contributes a vote for membership in the categories
 Typically, the truth values for each predicted category are summed, and these sums are combined
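A sketch of how an income value could be fuzzified into {low, medium, high}; the triangular membership functions and the breakpoints (20, 40, 60, in thousands) are assumed for illustration and do not come from the slides:

```python
def fuzzy_income_memberships(income):
    """Map a numeric income (in thousands) to fuzzy degrees of membership
    in {low, medium, high}; breakpoints are illustrative only."""
    def low(x):
        return 1.0 if x <= 20 else max(0.0, (40 - x) / 20)
    def high(x):
        return 1.0 if x >= 60 else max(0.0, (x - 40) / 20)
    def medium(x):
        # Triangular: peak at 40, falling to 0 at 20 and 60.
        return max(0.0, 1.0 - abs(x - 40) / 20)
    return {"low": low(income), "medium": medium(income),
            "high": high(income)}

# For income 30, more than one fuzzy value applies (low and medium).
print(fuzzy_income_memberships(30))
```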
Fuzzy-Set Approaches
 Fuzzy measures
 AND operation
 m_(high_income AND senior_employee)(x) = min(m_high_income(x), m_senior_employee(x))
 OR operation
 m_(high_income OR senior_employee)(x) = max(m_high_income(x), m_senior_employee(x))
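These min/max combination rules are straightforward to express in code; the membership degrees 0.7 and 0.4 are made-up example values:

```python
def fuzzy_and(m1, m2):
    """m_(A AND B)(x) = min(m_A(x), m_B(x))"""
    return min(m1, m2)

def fuzzy_or(m1, m2):
    """m_(A OR B)(x) = max(m_A(x), m_B(x))"""
    return max(m1, m2)

# Suppose for some employee x: m_high_income(x) = 0.7,
# m_senior_employee(x) = 0.4.
print(fuzzy_and(0.7, 0.4))  # 0.4
print(fuzzy_or(0.7, 0.4))   # 0.7
```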
Prediction
 (Numerical) prediction is similar to classification
 Construct a model
 Use the model to predict a continuous or ordered value for a given input
 Prediction is different from classification
 Classification predicts categorical class labels
 Prediction models continuous-valued functions
 Major method for prediction: regression
 Models the relationship between one or more independent (predictor) variables and a dependent (response) variable
 Regression analysis
 Linear and multiple regression
 Non-linear regression
 Other regression methods: generalized linear model, Poisson regression, log-linear models, regression trees
Linear Regression
 Linear regression: involves a response variable y and a single predictor variable x
y = w0 + w1 x
where w0 (y-intercept) and w1 (slope) are the regression coefficients
 Method of least squares: estimates the best-fitting straight line
 Multiple linear regression: involves more than one predictor variable
 E.g., for 2-D data we may have: y = w0 + w1 x1 + w2 x2
 Solvable by an extension of the least-squares method
 Least-squares estimates:
w1 = Σ_{i=1..|D|} (xi − x̄)(yi − ȳ) / Σ_{i=1..|D|} (xi − x̄)²
w0 = ȳ − w1 x̄
where x̄ and ȳ are the means of the x and y values in the training data D
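The closed-form estimates above translate directly into code; the sample points, which lie exactly on y = 2x + 1, are illustrative:

```python
def least_squares_fit(xs, ys):
    """Fit y = w0 + w1*x by the method of least squares:
    w1 = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2),
    w0 = y_bar - w1 * x_bar."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    w1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    w0 = y_bar - w1 * x_bar
    return w0, w1

# Points lying exactly on y = 2x + 1.
xs, ys = [1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0]
w0, w1 = least_squares_fit(xs, ys)
print(w0, w1)  # ~1.0 ~2.0
```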
Nonlinear Regression
 Some nonlinear models can be modeled by a polynomial function
 A polynomial regression model can be transformed into a linear regression model. For example,
y = w0 + w1 x + w2 x² + w3 x³
is convertible to linear form by introducing the new variables x1 = x, x2 = x², x3 = x³
 Other functions, such as the power function, can also be transformed to a linear model
 Some models are intractably nonlinear (e.g., a sum of exponential terms)
 It is still possible to obtain least-squares estimates through extensive calculation on more complex formulae
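The variable substitution that makes the cubic model linear can be sketched as a feature transform; the example rows are arbitrary:

```python
def polynomial_features(x, degree=3):
    """Transform a single predictor x into the new variables
    x1 = x, x2 = x^2, x3 = x^3, turning the polynomial model
    y = w0 + w1*x + w2*x^2 + w3*x^3 into an ordinary multiple
    linear regression in (x1, x2, x3)."""
    return [x ** d for d in range(1, degree + 1)]

# Each original (x, y) row becomes a multi-predictor row ([x, x^2, x^3], y)
# that a multiple-linear-regression solver can consume unchanged.
rows = [(2.0, 25.0), (3.0, 76.0)]
transformed = [(polynomial_features(x), y) for x, y in rows]
print(transformed[0][0])  # [2.0, 4.0, 8.0]
```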
Other Regression-Based Models
 Generalized linear model:
 Foundation on which linear regression can be applied to the modeling of categorical response variables
 Variance of y is a function of the mean value of y, not a constant
 Logistic regression: models the probability of some event occurring as a linear function of a set of predictor variables
 Poisson regression: models data that exhibit a Poisson distribution
 Log-linear models (for categorical data)
 Data cubes
 Also useful for data compression and smoothing
 Decision trees
 Regression trees
 Model trees
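As one example from the list above, logistic regression with a single predictor can be fit by gradient descent on the log-loss. This is a sketch only, with assumed hyperparameters (learning rate, epoch count); statistical packages typically fit GLMs with iteratively reweighted least squares instead:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Model the probability of the event as sigmoid(w0 + w1*x) and fit
    w0, w1 by batch gradient descent on the log-loss."""
    w0, w1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w0 + w1 * x) - y  # log-loss gradient term
            g0 += err
            g1 += err * x
        w0 -= lr * g0 / n
        w1 -= lr * g1 / n
    return w0, w1

# The event (y = 1) becomes likely as x grows past ~2.5.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 0, 1, 1, 1]
w0, w1 = fit_logistic(xs, ys)
print(sigmoid(w0 + w1 * 0.0), sigmoid(w0 + w1 * 5.0))
```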