ETHEM ALPAYDIN
© The MIT Press, 2010
alpaydin@boun.edu.tr
http://www.cmpe.boun.edu.tr/~ethem/i2ml2e
Lecture Slides for Introduction to Machine Learning, 2nd Edition
Learning a Class from Examples
 Class C of a “family car”
 Prediction: Is car x a family car?
 Knowledge extraction: What do people expect from a
family car?
 Output:
Positive (+) and negative (–) examples
 Input representation:
x1: price, x2 : engine power
Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)
Training set X

$$\mathcal{X} = \{\mathbf{x}^t, r^t\}_{t=1}^{N}$$

$$r = \begin{cases} 1 & \text{if } \mathbf{x} \text{ is positive} \\ 0 & \text{if } \mathbf{x} \text{ is negative} \end{cases}$$

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$
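A minimal sketch of how such a training set might be represented in code: each instance is a 2-D feature vector (x1 = price, x2 = engine power) paired with a binary label r. All numbers below are made up purely for illustration.

```python
# Hypothetical training set X = {x^t, r^t}, t = 1..N.
# Each x^t = (price, engine power); r^t = 1 (family car) or 0 (not).
X = [
    (15000.0, 110.0),  # x^1
    (22000.0, 150.0),  # x^2
    (60000.0, 300.0),  # x^3
    (8000.0,   60.0),  # x^4
]
r = [1, 1, 0, 0]  # labels: positive (+) and negative (-) examples

N = len(X)  # number of training instances
print(N)    # 4
```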
Class C

$$(p_1 \le \text{price} \le p_2) \text{ AND } (e_1 \le \text{engine power} \le e_2)$$
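The class C is an axis-aligned rectangle in the (price, engine power) plane. A sketch of the membership test, with illustrative (assumed) bounds p1, p2, e1, e2 — the true bounds are exactly what the learner does not know:

```python
# Assumed, purely illustrative bounds for the rectangle defining class C.
p1, p2 = 10000.0, 30000.0   # price interval [p1, p2]
e1, e2 = 90.0, 200.0        # engine-power interval [e1, e2]

def in_class_C(price: float, power: float) -> bool:
    """True iff (p1 <= price <= p2) AND (e1 <= power <= e2)."""
    return p1 <= price <= p2 and e1 <= power <= e2

print(in_class_C(15000.0, 110.0))  # inside both intervals -> True
print(in_class_C(60000.0, 300.0))  # outside both intervals -> False
```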
Hypothesis class H

$$h(\mathbf{x}) = \begin{cases} 1 & \text{if } h \text{ says } \mathbf{x} \text{ is positive} \\ 0 & \text{if } h \text{ says } \mathbf{x} \text{ is negative} \end{cases}$$

Empirical error of h ∈ H on X:

$$E(h \mid \mathcal{X}) = \sum_{t=1}^{N} \mathbf{1}\big(h(\mathbf{x}^t) \ne r^t\big)$$
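The empirical error simply counts the training instances that h misclassifies. A sketch, where h is an illustrative axis-aligned rectangle (not the true class C) and the data is the same hypothetical toy set as before:

```python
def h(x):
    """An illustrative rectangle hypothesis over (price, engine power)."""
    x1, x2 = x
    return 1 if (10000 <= x1 <= 30000 and 90 <= x2 <= 200) else 0

def empirical_error(h, X, r):
    """E(h | X) = sum_t 1(h(x^t) != r^t): number of misclassified instances."""
    return sum(1 for x_t, r_t in zip(X, r) if h(x_t) != r_t)

X = [(15000, 110), (22000, 150), (60000, 300), (8000, 60)]
r = [1, 1, 0, 0]
print(empirical_error(h, X, r))  # 0: h is consistent with this toy set
```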
S, G, and the Version Space
 most specific hypothesis, S
 most general hypothesis, G
 Any h ∈ H between S and G is consistent with the training set, and together these make up the version space (Mitchell, 1997)
Margin
 Choose h with largest margin
VC Dimension
 N points can be labeled in 2^N ways as +/–
 H shatters the N points if, for every one of these labelings, there exists an h ∈ H consistent with it; VC(H) is the maximum number of points that H can shatter
An axis-aligned rectangle can shatter at most 4 points!
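The shattering claim can be checked mechanically. The sketch below relies on a convenient fact: if any axis-aligned rectangle realizes a labeling, then the bounding box of the positive points also realizes it (every rectangle containing all positives contains that box), so it suffices to test bounding boxes. The diamond configuration is one standard choice of 4 points that an axis-aligned rectangle shatters:

```python
from itertools import product

def shatters(points):
    """True iff some axis-aligned rectangle realizes every +/- labeling."""
    for labels in product([0, 1], repeat=len(points)):
        pos = [p for p, l in zip(points, labels) if l == 1]
        neg = [p for p, l in zip(points, labels) if l == 0]
        if not pos:
            continue  # an empty rectangle handles the all-negative labeling
        # Bounding box of the positive points:
        x_lo = min(p[0] for p in pos); x_hi = max(p[0] for p in pos)
        y_lo = min(p[1] for p in pos); y_hi = max(p[1] for p in pos)
        if any(x_lo <= x <= x_hi and y_lo <= y <= y_hi for x, y in neg):
            return False  # no rectangle can realize this labeling
    return True

diamond = [(0, 1), (2, 1), (1, 0), (1, 2)]  # 4 points in a diamond
print(shatters(diamond))                    # True: all 16 labelings work
print(shatters(diamond + [(1, 1)]))         # False: 5 points fail
```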
Probably Approximately Correct (PAC) Learning
 How many training examples N should we have, such that with probability at least 1 − δ, h has error at most ε? (Blumer et al., 1989)
 Each strip has probability at most ε/4
 Pr that a random instance misses a strip: 1 − ε/4
 Pr that N instances all miss a strip: (1 − ε/4)^N
 Pr that the N instances miss any of the 4 strips: at most 4(1 − ε/4)^N
 Require 4(1 − ε/4)^N ≤ δ; using (1 − x) ≤ exp(−x),
 4 exp(−εN/4) ≤ δ, and hence N ≥ (4/ε) log(4/δ)
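The final bound is easy to evaluate. A small sketch that computes the smallest integer N satisfying N ≥ (4/ε) log(4/δ) (natural logarithm, since the derivation uses exp):

```python
import math

def pac_sample_size(epsilon: float, delta: float) -> int:
    """Smallest integer N with N >= (4/epsilon) * ln(4/delta)."""
    return math.ceil((4.0 / epsilon) * math.log(4.0 / delta))

# e.g. error at most 0.1 with probability at least 0.95:
print(pac_sample_size(0.1, 0.05))  # 176
```

Note how the bound grows only logarithmically in 1/δ but linearly in 1/ε: demanding higher confidence is cheap, demanding lower error is not.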
Noise and Model Complexity
When several models fit the data comparably well, use the simpler one because it is
 Simpler to use
(lower computational
complexity)
 Easier to train (lower
space complexity)
 Easier to explain
(more interpretable)
 Generalizes better (lower
variance - Occam’s razor)
Multiple Classes, C_i, i = 1,...,K

$$\mathcal{X} = \{\mathbf{x}^t, r^t\}_{t=1}^{N}$$

$$r_i^t = \begin{cases} 1 & \text{if } \mathbf{x}^t \in C_i \\ 0 & \text{if } \mathbf{x}^t \in C_j,\ j \ne i \end{cases}$$

Train K hypotheses h_i(x), i = 1,...,K:

$$h_i(\mathbf{x}^t) = \begin{cases} 1 & \text{if } \mathbf{x}^t \in C_i \\ 0 & \text{if } \mathbf{x}^t \in C_j,\ j \ne i \end{cases}$$
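A sketch of how the K-class labels r_i^t are derived from class indices (one-vs-all), so that K separate two-class hypotheses h_i can be trained. The data here is an invented toy example with K = 3:

```python
K = 3
class_of = [0, 2, 1, 0, 1]  # hypothetical class index of each of N = 5 instances

def one_vs_all_labels(class_of, i):
    """r_i^t = 1 if x^t in C_i, 0 if x^t in C_j, j != i."""
    return [1 if c == i else 0 for c in class_of]

# One binary label vector per class; each would train its own h_i.
R = [one_vs_all_labels(class_of, i) for i in range(K)]
print(R[0])  # [1, 0, 0, 1, 0]
print(R[2])  # [0, 1, 0, 0, 0]
```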
Regression

$$\mathcal{X} = \{x^t, r^t\}_{t=1}^{N}, \quad r^t \in \mathbb{R}, \quad r^t = f(x^t) + \varepsilon$$

Linear model:
$$g(x) = w_1 x + w_0$$

Quadratic model:
$$g(x) = w_2 x^2 + w_1 x + w_0$$

Empirical error:
$$E(g \mid \mathcal{X}) = \frac{1}{N} \sum_{t=1}^{N} \left[r^t - g(x^t)\right]^2$$

$$E(w_1, w_0 \mid \mathcal{X}) = \frac{1}{N} \sum_{t=1}^{N} \left[r^t - (w_1 x^t + w_0)\right]^2$$
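Minimizing E(w1, w0 | X) for the linear model has the well-known closed-form least-squares solution w1 = cov(x, r)/var(x), w0 = mean(r) − w1·mean(x). A sketch on noiseless toy data generated from r = 2x + 1, so the fit recovers the true coefficients exactly:

```python
def fit_linear(xs, rs):
    """Least-squares fit of g(x) = w1*x + w0 minimizing E(w1, w0 | X)."""
    N = len(xs)
    mx = sum(xs) / N
    mr = sum(rs) / N
    w1 = (sum((x - mx) * (r - mr) for x, r in zip(xs, rs))
          / sum((x - mx) ** 2 for x in xs))
    w0 = mr - w1 * mx
    return w1, w0

xs = [0.0, 1.0, 2.0, 3.0]
rs = [1.0, 3.0, 5.0, 7.0]   # r = 2x + 1, no noise
w1, w0 = fit_linear(xs, rs)
print(w1, w0)  # 2.0 1.0
```

With noisy r^t = f(x^t) + ε the recovered coefficients would only approximate f, which is exactly the point of minimizing the average squared error.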
Model Selection & Generalization
 Learning is an ill-posed problem; data is not sufficient to
find a unique solution
 The need for inductive bias, assumptions about H
 Generalization: How well a model performs on new data
 Overfitting: H more complex than C or f
 Underfitting: H less complex than C or f
Triple Trade-Off
 There is a trade-off between three factors (Dietterich,
2003):
1. Complexity of H, c (H),
2. Training set size, N,
3. Generalization error, E, on new data
 As N increases, E decreases
 As c(H) increases, E first decreases and then increases
Cross-Validation
 To estimate generalization error, we need data unseen
during training. We split the data as
 Training set (50%)
 Validation set (25%)
 Test (publication) set (25%)
 Resampling when there is little data
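A sketch of the 50/25/25 split described above, shuffling first so the three subsets are representative; indices 0..N−1 stand in for the instances:

```python
import random

def split(indices, seed=0):
    """Shuffle, then split 50% train / 25% validation / 25% test."""
    idx = list(indices)
    random.Random(seed).shuffle(idx)  # fixed seed for reproducibility
    n = len(idx)
    n_train, n_val = n // 2, n // 4
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test

train, val, test = split(range(100))
print(len(train), len(val), len(test))  # 50 25 25
```

The test set is held out until the very end ("publication" set): touching it during model selection would leak information and bias the generalization estimate.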
Dimensions of a Supervised Learner
1. Model:
$$g(\mathbf{x} \mid \theta)$$
2. Loss function:
$$E(\theta \mid \mathcal{X}) = \sum_{t} L\big(r^t, g(\mathbf{x}^t \mid \theta)\big)$$
3. Optimization procedure:
$$\theta^* = \arg\min_{\theta} E(\theta \mid \mathcal{X})$$
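A sketch tying the three dimensions together for a toy one-parameter model g(x | θ) = θx with squared loss; a crude grid search stands in for the optimization procedure (real learners use closed forms or gradient-based methods):

```python
def g(x, theta):            # 1. model g(x | theta)
    return theta * x

def E(theta, X, R):         # 2. loss: sum of per-instance squared errors
    return sum((r - g(x, theta)) ** 2 for x, r in zip(X, R))

X = [1.0, 2.0, 3.0]
R = [2.0, 4.0, 6.0]         # toy data generated from theta = 2

grid = [i / 10.0 for i in range(0, 41)]           # theta in [0, 4]
theta_star = min(grid, key=lambda t: E(t, X, R))  # 3. optimization: arg min
print(theta_star)  # 2.0
```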