Prof. Pier Luca Lanzi
Classification: Introduction
Data Mining and Text Mining (UIC 583 @ Politecnico di Milano)
What is an Apple? 2
Are These Apples?
Contact Lenses Data 5
Age              Spectacle prescription   Astigmatism   Tear production rate   Recommended lenses
Pre-presbyopic   Hypermetrope             Yes           Reduced                None
Pre-presbyopic   Hypermetrope             Yes           Normal                 None
Presbyopic       Myope                    No            Reduced                None
Presbyopic       Myope                    No            Normal                 None
Presbyopic       Myope                    Yes           Reduced                None
Presbyopic       Myope                    Yes           Normal                 Hard
Presbyopic       Hypermetrope             No            Reduced                None
Presbyopic       Hypermetrope             No            Normal                 Soft
Presbyopic       Hypermetrope             Yes           Reduced                None
Presbyopic       Hypermetrope             Yes           Normal                 None
Pre-presbyopic   Hypermetrope             No            Normal                 Soft
Pre-presbyopic   Hypermetrope             No            Reduced                None
Pre-presbyopic   Myope                    Yes           Normal                 Hard
Pre-presbyopic   Myope                    Yes           Reduced                None
Pre-presbyopic   Myope                    No            Normal                 Soft
Pre-presbyopic   Myope                    No            Reduced                None
Young            Hypermetrope             Yes           Normal                 Hard
Young            Hypermetrope             Yes           Reduced                None
Young            Hypermetrope             No            Normal                 Soft
Young            Hypermetrope             No            Reduced                None
Young            Myope                    Yes           Normal                 Hard
Young            Myope                    Yes           Reduced                None
Young            Myope                    No            Normal                 Soft
Young            Myope                    No            Reduced                None
A Model for the Contact Lenses Data 6
If tear production rate = reduced then recommendation = none
If age = young and astigmatic = no
and tear production rate = normal then recommendation = soft
If age = pre-presbyopic and astigmatic = no
and tear production rate = normal then recommendation = soft
If age = presbyopic and spectacle prescription = myope
and astigmatic = no then recommendation = none
If spectacle prescription = hypermetrope and astigmatic = no
and tear production rate = normal then recommendation = soft
If spectacle prescription = myope and astigmatic = yes
and tear production rate = normal then recommendation = hard
If age = young and astigmatic = yes
and tear production rate = normal then recommendation = hard
If age = pre-presbyopic
and spectacle prescription = hypermetrope
and astigmatic = yes then recommendation = none
If age = presbyopic and spectacle prescription = hypermetrope
and astigmatic = yes then recommendation = none
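The rule set above can be sketched in code as an ordered list of condition/recommendation pairs, where the first matching rule wins. This is an illustrative sketch, not part of the original slides; the function name `recommend` and the lowercase value encoding are assumptions.

```python
# Sketch: the contact-lenses rule set as an ordered rule list.
# First matching rule wins; names and value encoding are illustrative.

def recommend(age, prescription, astigmatic, tear_rate):
    """Apply the slide's rules in order and return the recommendation."""
    rules = [
        (lambda: tear_rate == "reduced", "none"),
        (lambda: age == "young" and astigmatic == "no"
                 and tear_rate == "normal", "soft"),
        (lambda: age == "pre-presbyopic" and astigmatic == "no"
                 and tear_rate == "normal", "soft"),
        (lambda: age == "presbyopic" and prescription == "myope"
                 and astigmatic == "no", "none"),
        (lambda: prescription == "hypermetrope" and astigmatic == "no"
                 and tear_rate == "normal", "soft"),
        (lambda: prescription == "myope" and astigmatic == "yes"
                 and tear_rate == "normal", "hard"),
        (lambda: age == "young" and astigmatic == "yes"
                 and tear_rate == "normal", "hard"),
        (lambda: age == "pre-presbyopic" and prescription == "hypermetrope"
                 and astigmatic == "yes", "none"),
        (lambda: age == "presbyopic" and prescription == "hypermetrope"
                 and astigmatic == "yes", "none"),
    ]
    for condition, recommendation in rules:
        if condition():
            return recommendation
    return None  # no rule fired (does not happen on this dataset)
```

For example, `recommend("presbyopic", "myope", "yes", "normal")` returns `"hard"`, matching the corresponding row of the contact lenses data.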
CPU Performance Data 7
MYCT: cycle time (ns) / MMIN, MMAX: main memory (Kb) / CACH: cache (Kb) / CHMIN, CHMAX: channels / PRP: performance

Example   MYCT   MMIN   MMAX    CACH   CHMIN   CHMAX   PRP
209       480    1000   4000    0      0       0       45
208       480    512    8000    32     0       0       67
…
2         29     8000   32000   32     8       32      269
1         125    256    6000    256    16      128     198
PRP = -55.9 + 0.0489 MYCT + 0.0153 MMIN + 0.0056 MMAX
+ 0.6410 CACH - 0.2700 CHMIN + 1.480 CHMAX
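The regression model above can be evaluated directly on a row of the table. A minimal sketch, where the coefficients come from the slide and the function name `predict_prp` is illustrative:

```python
# Sketch: evaluating the slide's linear regression model on one machine.

def predict_prp(myct, mmin, mmax, cach, chmin, chmax):
    """Linear model for published relative performance (PRP)."""
    return (-55.9 + 0.0489 * myct + 0.0153 * mmin + 0.0056 * mmax
            + 0.6410 * cach - 0.2700 * chmin + 1.480 * chmax)

# Example 1 from the table: MYCT=125, MMIN=256, MMAX=6000,
# CACH=256, CHMIN=16, CHMAX=128
prediction = predict_prp(125, 256, 6000, 256, 16, 128)  # about 336.9
```

The table lists PRP = 198 for this machine, so the linear fit is only a rough approximation on individual examples even though it summarizes the overall trend.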
Classification vs. Prediction
•  Classification
§ Predicts categorical class labels (discrete or nominal)
§ Classifies data (constructs a model) based on the training set
and the values (class labels) in a classifying attribute and uses it
in classifying new data
•  Prediction
§ Models continuous-valued functions, i.e., predicts unknown or
missing values
•  Applications
§ Credit approval
§ Target marketing
§ Medical diagnosis
§ Fraud detection
8
classification = model building + model usage
What is classification?
•  Classification is a two-step process
•  Model construction
§ Given a set of data representing examples of 
a target concept, build a model to “explain” the concept
•  Model usage
§ The classification model is used for classifying 
future or unknown cases
§ Estimate accuracy of the model
10
Classification: Model Construction 11
Classification
Algorithm
IF rank = ‘professor’
OR years > 6
THEN tenured = ‘yes’
name rank years tenured
Mike Assistant Prof 3 no
Mary Assistant Prof 7 yes
Bill Professor 2 yes
Jim Associate Prof 7 yes
Dave Assistant Prof 6 no
Anne Associate Prof 3 no
Training
Data
Classifier
(Model)
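Model construction can be sketched by checking the learned rule against the training data above. The comparison operator is read here as "years > 6", which is consistent with the records (Mary with 7 years is tenured, Dave with 6 is not); the code itself is illustrative.

```python
# Sketch: the learned classifier checked on the training data.
# Records are from the slide; function names are illustrative.

training_data = [
    ("Mike", "Assistant Prof", 3, "no"),
    ("Mary", "Assistant Prof", 7, "yes"),
    ("Bill", "Professor",      2, "yes"),
    ("Jim",  "Associate Prof", 7, "yes"),
    ("Dave", "Assistant Prof", 6, "no"),
    ("Anne", "Associate Prof", 3, "no"),
]

def tenured(rank, years):
    """IF rank = 'Professor' OR years > 6 THEN tenured = 'yes'."""
    return "yes" if rank == "Professor" or years > 6 else "no"

# The rule classifies every training record correctly
correct = sum(tenured(rank, years) == label
              for _, rank, years, label in training_data)  # 6 of 6
```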
Classification: Model Usage 12
tenured = yes
name rank years tenured
Tom Assistant Prof 2 no
Merlisa Associate Prof 7 no
George Professor 5 yes
Joseph Assistant Prof 7 yes
Test
Data
Classifier
(Model)
Unseen Data
Jeff, Professor, 4
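The usage step above, scoring the model on test data and then classifying an unseen case, can be sketched as follows; the rule and records come from the slides, the code is illustrative.

```python
# Sketch: using the learned model on the test data and an unseen case.

def tenured(rank, years):
    """Model from the construction step: Professor OR years > 6."""
    return "yes" if rank == "Professor" or years > 6 else "no"

test_data = [
    ("Tom",     "Assistant Prof", 2, "no"),
    ("Merlisa", "Associate Prof", 7, "no"),
    ("George",  "Professor",      5, "yes"),
    ("Joseph",  "Assistant Prof", 7, "yes"),
]

# Estimate accuracy on the test set (Merlisa is misclassified)
accuracy = sum(tenured(r, y) == label
               for _, r, y, label in test_data) / len(test_data)  # 0.75

# Unseen case from the slide: Jeff, Professor, 4 years
jeff = tenured("Professor", 4)  # "yes"
```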
Evaluating Classification Methods
•  Accuracy
§ classifier accuracy: predicting class label
§ predictor accuracy: guessing value of predicted attributes
•  Speed
§ time to construct the model (training time)
§ time to use the model (classification/prediction time)
•  Other Criteria
§ Robustness: handling noise and missing values
§ Scalability: efficiency in disk-resident databases
§ Interpretability: understanding and insight provided
§ Other measures, e.g., goodness of rules, such as decision tree size
or compactness of classification rules
13
Example
The Weather Dataset:
Building the Model
Outlook Temp Humidity Windy Play
Sunny Hot High True No
Overcast Hot High False Yes
Rainy Cool Normal False Yes
Rainy Cool Normal True No
Overcast Cool Normal True Yes
Sunny Cool Normal False Yes
Sunny Mild Normal True Yes
Overcast Mild High True Yes
Overcast Hot Normal False Yes
Rainy Mild High True No
15
•  Write one rule like “if A=v1 then X, else if A=v2 then Y, …” to
predict whether the player is going to play or not
•  A is an attribute; vi are attribute values; X and Y are class labels
The Weather Dataset: 
Testing the Model
Outlook Temp Humidity Windy Play
Sunny Hot High False No
Rainy Mild High False Yes
Sunny Mild High False No
Rainy Mild Normal False Yes
16
Examples of Models
•  if outlook = sunny then no (3 / 2) 
if outlook = overcast then yes (0 / 4) 
if outlook = rainy then yes (2 / 3) 

correct: 10 out of 14 training examples
•  if outlook = sunny then yes (1 / 2)
if outlook = overcast then yes (0 / 4) 
if outlook = rainy then no (2 / 1) 

correct: 8 out of 10 training examples
17
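The "10 out of 14" count for the first model can be reproduced by scoring the rule over all 14 weather examples (the 10 building rows plus the 4 testing rows). A sketch, with only the outlook and play columns kept since the rule ignores the others:

```python
# Sketch: scoring the first one-rule model on all 14 weather examples.
# (outlook, play) pairs transcribed from the two slides above.

weather = [
    # building rows
    ("Sunny", "No"), ("Overcast", "Yes"), ("Rainy", "Yes"),
    ("Rainy", "No"), ("Overcast", "Yes"), ("Sunny", "Yes"),
    ("Sunny", "Yes"), ("Overcast", "Yes"), ("Overcast", "Yes"),
    ("Rainy", "No"),
    # testing rows
    ("Sunny", "No"), ("Rainy", "Yes"), ("Sunny", "No"), ("Rainy", "Yes"),
]

# Model 1: sunny -> no, overcast -> yes, rainy -> yes
model = {"Sunny": "No", "Overcast": "Yes", "Rainy": "Yes"}

correct = sum(model[outlook] == play for outlook, play in weather)  # 10
```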
The Machine Learning Perspective
The Machine Learning Perspective 
•  Classification algorithms are methods of supervised learning
•  The experience E consists of a set of examples of a target
concept that have been prepared by a supervisor
•  The task T consists of finding a hypothesis that accurately
explains the target concept
•  The performance P depends on how accurately the hypothesis h
explains the examples in E
19
The Machine Learning Perspective
•  Let us define the problem domain as the set of instances X
(for instance, X contains different fruits)
•  We define a concept over X as a function c which maps
elements of X into a range D, that is, c: X → D
•  The range D represents the type of concept analyzed
•  For instance, c: X → {isApple, notAnApple}
20
The Machine Learning Perspective
•  Experience E is a set of (x, d) pairs, with x∈X and d∈D
•  The task T consists of finding a hypothesis h to explain E:
•  ∀x∈X, h(x) = c(x)
•  The set H of all the possible hypotheses h that can be used to
explain c is called the hypothesis space
•  The goodness of a hypothesis h can be evaluated as the
percentage of examples that are correctly explained by h

P(h) = |{x ∈ X : h(x) = c(x)}| / |X|
21
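The measure P(h) can be computed directly once X, c, and h are concrete. A toy sketch, where the instance space, target concept, and hypothesis are all illustrative:

```python
# Sketch: computing P(h) = |{x in X : h(x) = c(x)}| / |X| on a toy domain.

X = list(range(10))                  # toy instance space
c = lambda x: x % 2 == 0             # target concept: "x is even"
h = lambda x: x in (0, 2, 4, 5, 6)   # a hypothesis, wrong on 5 and 8

p_h = len([x for x in X if h(x) == c(x)]) / len(X)  # 0.8
```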
Examples
•  Concept Learning
when D={0,1}
•  Supervised classification 
when D consists of a finite number of labels
•  Prediction
when D is a subset of R^n
22
The Machine Learning Perspective 
on Classification
•  Supervised learning algorithms, given the examples in E, search
the hypothesis space H for the hypothesis h that best explains
the examples in E
•  Learning is viewed as a search in the hypothesis space
23
Searching for Hypotheses
•  The type of hypothesis required influences the search algorithm
•  The more complex the representation 
the more complex the search algorithm
•  Many algorithms assume that it is possible to define a partial
ordering over the hypothesis space
•  The hypothesis space can be searched using either a general-to-
specific or a specific-to-general strategy
24
Exploring the Hypothesis Space
•  General to Specific
§ Start with the most general hypothesis and then go on
through specialization steps
•  Specific to General
§ Start with the set of the most specific hypotheses and
then go on through generalization steps
25
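One concrete specific-to-general strategy is Find-S-style generalization over attribute-value hypotheses: start from the most specific hypothesis (the first positive example itself) and minimally generalize it on every further positive example. A minimal sketch; the fruit data and the "?" wildcard convention are illustrative:

```python
# Sketch of a specific-to-general search (Find-S style): generalize the
# hypothesis just enough to cover each positive example.
# "?" means "any value"; the example data are illustrative.

positives = [
    ("red",   "round", "medium"),   # positive examples of "apple"
    ("green", "round", "medium"),
]

h = positives[0]  # most specific hypothesis covering the first positive
for example in positives[1:]:
    # replace every attribute that disagrees with the example by "?"
    h = tuple(a if a == e else "?" for a, e in zip(h, example))
# h is now ("?", "round", "medium")
```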
Inductive Bias
•  Set of assumptions that together with the training data deductively justify the
classification assigned by the learner to future instances
•  There can be a number of hypotheses consistent with training data
•  Each learning algorithm has an inductive bias that imposes a preference on the
space of all possible hypotheses
26
Types of Inductive Bias
•  Syntactic Bias
§ Depends on the language used to represent hypotheses
•  Semantic Bias
§ Depends on the heuristics used to filter hypotheses
•  Preference Bias
§ Depends on the ability to rank and compare hypotheses
•  Restriction Bias
§ Depends on the ability to restrict the search space
27
Why Are We Looking for h?
Inductive Learning Hypothesis
•  Any hypothesis (h) found to approximate the target function (c) over a
sufficiently large set of training examples will also approximate the
target function (c) well over other unobserved examples.
•  Training
§ The hypothesis h is developed to explain the examples in ETrain
•  Testing
§ The hypothesis h is evaluated (verified) with respect to the
previously unseen examples in ETest
•  The underlying hypothesis
§ If h explains ETrain then it can also be used to explain other unseen
examples in ETest (not previously used to develop h)
29
Generalization and Overfitting
•  Generalization
§ When h explains “well” both ETrain and ETest we say that h is
general and that the method used to develop h has
adequately generalized
•  Overfitting
§ When h explains ETrain but not ETest we say that the method
used to develop h has overfitted
§ We have overfitting when the hypothesis h explains ETrain too
accurately so that h is not general enough to be applied
outside ETrain
30
What are the general issues
for classification in Machine Learning?
•  Type of training experience
§ Direct or indirect?
§ Supervised or not?
•  Type of target function and performance
•  Type of search algorithm
•  Type of representation of the solution
•  Type of inductive bias
31
