• PS: This file is for reference only. Do not depend solely on it for the content; it supplements your textbook. It is recommended to go through the suggested readings/textbook for detailed knowledge of the content.
1. Introduction
Definition
• In 1959, Arthur Samuel, a pioneer in the field
of machine learning (ML) defined it as the
“field of study that gives computers the
ability to learn without being explicitly
programmed”
Definition (Tom Mitchell, 1997)
“A computer program is said to learn from experience
with respect to some class of tasks and performance
measure, if the performance at the tasks, as measured by
the performance measure, improves with experience”
Features of a well-defined learning problem:
• The learning task
• The measure of performance
• The task experience
• Types of learning tasks
What is the Learning Problem?
• Learning = Improving with experience at some
task
• Improve over task T,
• with respect to performance measure P,
• based on experience E.
What is the Learning Problem?
• E.g., Learn to play checkers
T : Play checkers
P : % of games won in world tournament
E: opportunity to play against self
Learning to Play Checkers
• E.g., Learn to play checkers
T : Play checkers
P : % of games won in world tournament
• What experience should the system learn from?
• What exactly should be learned?
• How shall it be represented?
• What specific algorithm should be used to learn it?
Designing a Learning System
• Consider designing a program to learn to play
checkers, with the goal of entering it in the world
checkers tournament
• Performance measure: the percentage of games it
wins in this tournament.
• Requires the following design choices:
– Choosing Training Experience
– Choosing the Target Function
– Choosing the Representation of the Target Function
– Choosing the Function Approximation Algorithm
Choosing the Training Experience
1. What training experience should the system have?
– A design choice with great impact on the outcome.
2. What amount of interaction should there be
between the system and the supervisor?
3. Which training examples?
Choosing the Training Experience
1. What training experience should the system have?
– A design choice with great impact on the outcome.
• Will the training experience provide direct or indirect
feedback?
– Direct Feedback: the system learns from examples of individual checkers board states, each paired with its correct move.
– Indirect Feedback: a set of recorded games, where the correctness of the moves is inferred from the result of each game.
• Credit assignment problem: the value of early states must be inferred from the final outcome.
• Direct feedback is easier to learn from.
Choosing the Training Experience
2. What amount of interaction should there be between the
system and the supervisor?
– Choice #1: No freedom. Supervisor provides all training
examples.
– Choice #2: Semi-free. Supervisor provides training
examples, system constructs its own examples too, and
asks questions to the supervisor in cases of doubt.
– Choice #3: Total-freedom. System learns to play
completely unsupervised
• How “daring” should the system be in exploring new board states?
Choosing the Training Experience
3. Which training examples?
– There is a huge number of possible games.
– No time to try all possible games.
– The system should learn from examples like those it will encounter in the future.
– For example, if the goal is to beat humans, it
should be able to do well in situations that
humans encounter when they play (this is hard to
achieve in practice).
Choosing the Training Experience
– If the training experience consists only of games played against itself, the program may never encounter
crucial board states that are likely to be played by the
human checkers champion
– Most theory of machine learning rests on the assumption
that the distribution of training examples is identical to the
distribution of test examples
Partial Design of Checkers Learning
Program
• A checkers learning problem:
– Task T: playing checkers
– Performance measure P: percent of games won in the
world tournament
– Training experience E: games played against itself
• Remaining choices
– The exact type of knowledge to be learned
– A representation for this target knowledge
– A learning mechanism
Choosing the Target Function
What should be learned exactly?
• The computer program already knows the legal moves; it needs to learn how to choose the best move from among them.
• The computer should learn a ‘hidden’ function.
– target function: ChooseMove : B → M
– B: the set of legal board states; M: the set of legal moves
• ChooseMove is difficult to learn from indirect training experience
Choosing the Target Function
• So, an alternative target function:
– An evaluation function that assigns a numerical score to any given board state
– V : B → ℝ (where ℝ is the set of real numbers)
• V(b) for an arbitrary board state b in B
– if b is a final board state that is won, then V(b) = 100
– if b is a final board state that is lost, then V(b) = -100
– if b is a final board state that is drawn, then V(b) = 0
– if b is not a final state, then V(b) = V(b′), where b′ is the best final board state that can be achieved starting from b and playing optimally until the end of the game
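To make the recursive definition concrete, here is a minimal Python sketch. It assumes hypothetical game-logic helpers is_final(b), winner(b), and legal_successors(b); it is an illustration, not an actual checkers implementation. Evaluating the nonterminal case requires searching the entire game tree under optimal play by both sides, which is exactly why the definition is nonoperational.

    # Sketch of the ideal target function V, from black's perspective.
    # is_final, winner, and legal_successors are hypothetical helpers.
    def V(b, black_to_move=True):
        """Ideal (nonoperational) target value of board state b."""
        if is_final(b):
            return {'black': 100, 'red': -100, 'draw': 0}[winner(b)]
        # Optimal play by both sides: black maximizes, red minimizes.
        values = [V(s, not black_to_move) for s in legal_successors(b)]
        return max(values) if black_to_move else min(values)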
Choosing the Target Function
• This gives a recursive definition of V(b)
– Not usable because it is not efficient to compute, except in the first three trivial cases
– A nonoperational definition
• Goal of learning is to discover an operational
description of V
• Learning the target function is often called function
approximation
– The learned approximation is referred to as V̂
Choosing a Representation for the Target
Function
• The choice of representation involves trade-offs
– Pick a very expressive representation to allow a close approximation to the ideal target function V
– But the more expressive the representation, the more training data is required to choose among the alternative hypotheses
• Use a linear combination of the following board features:
– x1: the number of black pieces on the board
– x2: the number of red pieces on the board
– x3: the number of black kings on the board
– x4: the number of red kings on the board
– x5: the number of black pieces threatened by red (i.e. which can be
captured on red's next turn)
– x6: the number of red pieces threatened by black
• The learned evaluation function:
V̂(b) = w0 + w1·x1 + w2·x2 + w3·x3 + w4·x4 + w5·x5 + w6·x6
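As a minimal sketch, the linear evaluation function is a one-liner in Python; board_features below is a hypothetical helper that would count pieces, kings, and threats on an actual board and return the tuple (x1, ..., x6).

    # Sketch of the linear evaluation function V-hat.
    # board_features(b) is a hypothetical helper returning (x1, ..., x6).
    def v_hat(b, w):
        """Evaluate board b with weights w = (w0, w1, ..., w6)."""
        xs = board_features(b)
        return w[0] + sum(w_i * x_i for w_i, x_i in zip(w[1:], xs))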
Partial Design of Checkers Learning
Program
• A checkers learning problem:
– Task T: playing checkers
– Performance measure P: percent of games won in the
world tournament
– Training experience E: games played against itself
– Target function: V : Board → ℝ
– Target function representation: V̂(b) = w0 + w1·x1 + w2·x2 + w3·x3 + w4·x4 + w5·x5 + w6·x6
Choosing a Function Approximation
Algorithm
• To learn V̂ we require a set of training examples, each describing a board state b together with a training value Vtrain(b)
– i.e., ordered pairs ⟨b, Vtrain(b)⟩
– Example: ⟨⟨x1 = 3, x2 = 0, x3 = 1, x4 = 0, x5 = 0, x6 = 0⟩, +100⟩
(x1: black pieces; x2: red pieces; x3: black kings; x4: red kings; x5: black pieces threatened by red, i.e., capturable on red's next turn; x6: red pieces threatened by black)
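In code, this training example is simply a (features, target-value) pair; black has won here because red has no pieces left (x2 = 0), so the target value is +100.

    # The training example above as a (features, target-value) pair:
    # 3 black pieces, 0 red pieces, 1 black king, no red kings,
    # no threatened pieces -- a won final state, hence the value +100.
    example = ((3, 0, 1, 0, 0, 0), +100)
    features, v_train = example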
Choosing a Function Approximation
Algorithm
• Need a procedure that first derives such training examples from the indirect training experience, and then adjusts the weights wi to best fit these training examples.
Estimating Training Values
• Need to assign specific scores to intermediate
board states
• Estimate the value of an intermediate board state b using the learner's current approximation V̂ applied to the board state that follows b:
Vtrain(b) ← V̂(Successor(b))
– A simple and surprisingly successful approach
– More accurate for states closer to the end of the game
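A minimal sketch of deriving training pairs from one recorded game, reusing the hypothetical v_hat from above and assuming game_states lists the board states at the program's successive turns: each intermediate state is scored with the current V̂ of its successor, and the final state receives its true outcome value.

    # Sketch: derive <b, Vtrain(b)> pairs from one recorded game.
    # game_states: boards at the program's successive turns to move.
    # final_value: +100, -100, or 0 according to the game's outcome.
    def training_values(game_states, final_value, w):
        examples = []
        for b, successor in zip(game_states, game_states[1:]):
            # Vtrain(b) <- V_hat(Successor(b)), with the current weights w
            examples.append((b, v_hat(successor, w)))
        examples.append((game_states[-1], final_value))  # true terminal value
        return examples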
Adjusting the Weights
• Choose the weights wi to best fit the set of training examples
• Minimize the squared error E between the training values and the values predicted by the hypothesis:
• Require an algorithm that
– will incrementally refine weights as new training examples become
available
– will be robust to errors in these estimated training values
• Least Mean Squares (LMS) is one such algorithm
E ≡ Σ⟨b, Vtrain(b)⟩ ∈ training examples (Vtrain(b) − V̂(b))²
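Stated as code (reusing the hypothetical v_hat sketch above), the error is the sum of squared residuals over the training set:

    def squared_error(examples, w):
        """E = sum over <b, Vtrain(b)> of (Vtrain(b) - V_hat(b))^2."""
        return sum((v_train - v_hat(b, w)) ** 2 for b, v_train in examples)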
LMS Weight Update Rule
• For each training example ⟨b, Vtrain(b)⟩
– Use the current weights to calculate V̂(b)
– For each weight wi, update it as
wi ← wi + η · (Vtrain(b) − V̂(b)) · xi
– where η is a small constant (e.g., 0.1)
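A minimal sketch of one pass of the LMS rule, under the same assumptions as the earlier sketches (hypothetical board_features and v_hat); note the constant feature x0 = 1 that pairs with the bias weight w0.

    def lms_update(examples, w, eta=0.1):
        """One LMS pass over the training examples; returns updated weights."""
        w = list(w)
        for b, v_train in examples:
            error = v_train - v_hat(b, w)          # Vtrain(b) - V_hat(b)
            xs = (1,) + tuple(board_features(b))   # x0 = 1 for the bias w0
            for i, x_i in enumerate(xs):
                w[i] += eta * error * x_i          # wi <- wi + eta*error*xi
        return w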
Summary of Design Choices
• The choice of training experience, target function, target-function representation, and learning algorithm together determines the design of the learning system.
Suggested Readings
• “Machine Learning” by Tom Mitchell, McGraw-Hill, Chapter 1