Dr. M. Bindhu
Associate Professor / ECE
Saveetha Engineering College
OUTLINE
 Machine Learning (AI) – Learning Rules
 Sequential Covering Algorithm
 Learn One Rule
 Why Learn First Order Rules?
 First Order Logic: Terminology
 The FOIL Algorithm
 Why Combine Inductive and Analytical Learning?
 KBANN: Prior Knowledge to Initialize the Hypothesis
 TangentProp, EBNN: Prior Knowledge Alters Search Objective
 FOCL Algorithm: Prior Knowledge Alters Search Operators
Machine Learning (AI)
A key aspect of intelligence is the ability to learn knowledge over time.
The vast majority of machine learning algorithms have been developed to learn
knowledge that is inherently propositional, where the domain of interest is
encoded in terms of a fixed set of variables.
 Inductive logic programming is the subfield of symbolic artificial
intelligence which uses logic programming as a uniform representation for
examples, background knowledge and hypotheses.
MACHINE LEARNING-LEARNING RULE
 https://youtu.be/ad79nYk2keg
LEARNING RULES
We can learn sets of rules by first learning a decision tree and then
converting the tree to rules.
We can also use a genetic algorithm that encodes the rules as bit strings.
But these approaches only work with propositional rules (no variables).
They also consider the set of rules as a whole, not one rule at a time.
SEQUENTIAL COVERING ALGORITHM
1. Learn one rule with high accuracy, any coverage
2. Remove positive examples covered by this rule
3. Repeat
Example:
Wind(d1) = Strong, Humidity(d1) = Low, Outlook(d1) = Sunny, PlayTennis(d1) = No
Wind(d2) = Weak, Humidity(d2) = Med, Outlook(d2) = Sunny, PlayTennis(d2) = Yes
Wind(d3) = Med, Humidity(d3) = Med, Outlook(d3) = Rain, PlayTennis(d3) = No
Target_attribute is the attribute we wish to learn, e.g. PlayTennis(x).
Attributes is the set of all possible attributes, e.g. Wind(x), Humidity(x), Outlook(x).
Threshold is the desired minimum performance of a learned rule.
SEQUENTIAL COVERING ALGORITHM
Sequential_covering(Target_attribute, Attributes, Examples, Threshold):
  Learned_rules = {}
  Rule = Learn-One-Rule(Target_attribute, Attributes, Examples)
  while Performance(Rule, Examples) > Threshold:
    Learned_rules = Learned_rules + Rule
    Examples = Examples - {examples correctly classified by Rule}
    Rule = Learn-One-Rule(Target_attribute, Attributes, Examples)
  Learned_rules = sort Learned_rules according to performance over Examples
  return Learned_rules
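A minimal Python sketch of this loop, assuming the caller supplies Learn-One-Rule,
Performance and a coverage test as functions (learn_one_rule, performance and
correctly_classified are hypothetical names, not part of the original slides):

    def sequential_covering(target_attribute, attributes, examples, threshold,
                            learn_one_rule, performance, correctly_classified):
        """Sketch of the sequential covering loop shown above (hypothetical helpers)."""
        learned_rules = []
        rule = learn_one_rule(target_attribute, attributes, examples)
        while performance(rule, examples) > threshold:
            learned_rules.append(rule)
            # Remove the examples this rule already classifies correctly.
            examples = [ex for ex in examples
                        if not correctly_classified(rule, ex)]
            rule = learn_one_rule(target_attribute, attributes, examples)
        # Sort the learned rules by their performance over the remaining examples.
        learned_rules.sort(key=lambda r: performance(r, examples), reverse=True)
        return learned_rules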
Drawback of the Sequential Covering Algorithm
We require Learn-One-Rule to have high (ideally perfect) accuracy but not
necessarily high coverage (i.e., when it makes a prediction, that prediction
should be correct).
Since it performs a greedy search, it is not guaranteed to find the best or
smallest set of rules that cover the training examples.
Why Learn First Order Rules?
Propositional logic allows the expression of individual propositions and their
truth-functional combination.
1. Propositions like "Tom is a man" or "All men are mortal" may be represented
by single proposition letters such as P or Q.
2. Truth-functional combinations are built up using connectives such as ∧, ∨, ¬, →.
Example: P ∧ Q
 Inference rules are defined over propositional forms, e.g. P → Q.
If P is "Tom is a man" and Q is "All men are mortal", the inference that Tom is
mortal does not follow in propositional logic.
Why Learn First Order Rules? (contd.)
First order logic allows the expression of propositions and their
truth-functional combination, but it also allows us to represent propositions
as assertions of predicates about individuals or sets of individuals.
Example:
Propositions like "Tom is a man" or "All men are mortal" may be represented by
predicate-argument representations such as man(tom) or ∀x (man(x) → mortal(x))
(so variables range over individuals).
Inference rules permit conclusions to be drawn about sets/individuals, e.g.
mortal(tom).
Why Learn First Order Rules? (contd.)
Day Outlook Temp Humid Wind PlayTennis
D1 Sunny Hot High Weak No
D2 Sunny Hot High Strong No
D3 Overcast Hot High Weak Yes
D4 Rain Mild High Weak Yes
D5 Rain Cool Low Weak Yes
D6 Rain Cool Low Strong No
D7 Overcast Cool Low Strong Yes
D8 Sunny Mild High Weak No
D9 Sunny Cool Low Weak Yes
D10 Rain Mild Low Weak Yes
D11 Sunny Mild Low Strong Yes
D12 Overcast Mild High Strong Yes
D13 Overcast Hot Low Weak Yes
D14 Rain Mild High Strong No
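The same table can also be written down as data; the encoding below (a list of
dicts keyed by the column names) is a hypothetical choice, useful only for
experimenting with the rule-learning sketches in this deck:

    # PlayTennis training data from the table above, one dict per day.
    play_tennis = [
        {"Day": "D1",  "Outlook": "Sunny",    "Temp": "Hot",  "Humid": "High", "Wind": "Weak",   "PlayTennis": "No"},
        {"Day": "D2",  "Outlook": "Sunny",    "Temp": "Hot",  "Humid": "High", "Wind": "Strong", "PlayTennis": "No"},
        {"Day": "D3",  "Outlook": "Overcast", "Temp": "Hot",  "Humid": "High", "Wind": "Weak",   "PlayTennis": "Yes"},
        {"Day": "D4",  "Outlook": "Rain",     "Temp": "Mild", "Humid": "High", "Wind": "Weak",   "PlayTennis": "Yes"},
        {"Day": "D5",  "Outlook": "Rain",     "Temp": "Cool", "Humid": "Low",  "Wind": "Weak",   "PlayTennis": "Yes"},
        {"Day": "D6",  "Outlook": "Rain",     "Temp": "Cool", "Humid": "Low",  "Wind": "Strong", "PlayTennis": "No"},
        {"Day": "D7",  "Outlook": "Overcast", "Temp": "Cool", "Humid": "Low",  "Wind": "Strong", "PlayTennis": "Yes"},
        {"Day": "D8",  "Outlook": "Sunny",    "Temp": "Mild", "Humid": "High", "Wind": "Weak",   "PlayTennis": "No"},
        {"Day": "D9",  "Outlook": "Sunny",    "Temp": "Cool", "Humid": "Low",  "Wind": "Weak",   "PlayTennis": "Yes"},
        {"Day": "D10", "Outlook": "Rain",     "Temp": "Mild", "Humid": "Low",  "Wind": "Weak",   "PlayTennis": "Yes"},
        {"Day": "D11", "Outlook": "Sunny",    "Temp": "Mild", "Humid": "Low",  "Wind": "Strong", "PlayTennis": "Yes"},
        {"Day": "D12", "Outlook": "Overcast", "Temp": "Mild", "Humid": "High", "Wind": "Strong", "PlayTennis": "Yes"},
        {"Day": "D13", "Outlook": "Overcast", "Temp": "Hot",  "Humid": "Low",  "Wind": "Weak",   "PlayTennis": "Yes"},
        {"Day": "D14", "Outlook": "Rain",     "Temp": "Mild", "Humid": "High", "Wind": "Strong", "PlayTennis": "No"},
    ]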
Learn-One-Rule
Pos ← positive Examples
Neg ← negative Examples
while Pos is not empty, do
  Learn a NewRule:
  - NewRule ← most general rule possible
  - NewRuleNeg ← Neg
  - while NewRuleNeg is not empty, do
      Add a new literal to specialize NewRule:
      1. Candidate_literals ← generate candidates
      2. Best_literal ← argmax over L in Candidate_literals of
         Performance(SpecializeRule(NewRule, L))
      3. Add Best_literal to the preconditions of NewRule
      4. NewRuleNeg ← subset of NewRuleNeg that satisfies the NewRule preconditions
  - Learned_rules ← Learned_rules + NewRule
  - Pos ← Pos − {members of Pos covered by NewRule}
Return Learned_rules
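A minimal Python sketch of the greedy general-to-specific inner loop above,
assuming a rule is represented as a dict of attribute = value preconditions and
Performance is supplied by the caller; the names and the binary Yes/No target
are assumptions for illustration, not from the slides:

    def learn_one_rule(target_attribute, attributes, examples, performance):
        """Greedy specialization of a single rule (sketch, hypothetical names)."""
        rule = {}                      # most general rule: empty precondition set

        def covers(r, ex):
            return all(ex.get(a) == v for a, v in r.items())

        # Assuming a binary Yes/No target, as in the PlayTennis table.
        rule_neg = [ex for ex in examples if ex[target_attribute] == "No"]
        while rule_neg:
            # Candidate literals: every attribute=value test not already in the rule.
            candidates = [(a, v) for a in attributes if a not in rule
                          for v in {ex[a] for ex in examples}]
            if not candidates:
                break
            # Pick the literal whose specialized rule scores best.
            best_attr, best_val = max(
                candidates,
                key=lambda av: performance(dict(rule, **{av[0]: av[1]}), examples))
            rule[best_attr] = best_val
            # Keep only the negatives still covered by the specialized rule.
            rule_neg = [ex for ex in rule_neg if covers(rule, ex)]
        return rule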
Autonomous Car Technology (application example)
(Slide images: laser terrain mapping, Stanley, learning from human drivers,
Sebastian, adaptive vision, path planning. Images and movies taken from
Sebastian Thrun's multimedia website.)
Learn-One-Rule Summary
Idea: organize the hypothesis space search in general-to-specific fashion.
Start with the most general rule precondition, then greedily add the attribute
that most improves performance measured over the training examples.
Can be generalized to multi-valued target functions.
There are other ways to define Performance(), besides using entropy.
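Since the slide mentions entropy as one choice of Performance(), here is a
hedged sketch of an entropy-based score for the dict-based rules used in the
earlier sketches (entropy_performance and play_tennis_score are hypothetical
names):

    import functools
    import math

    def entropy_performance(rule, examples, target_attribute):
        """Score a rule by minus the entropy of the target values it covers (sketch)."""
        covered = [ex for ex in examples
                   if all(ex.get(a) == v for a, v in rule.items())]
        if not covered:
            return float("-inf")        # a rule that covers nothing is useless
        counts = {}
        for ex in covered:
            counts[ex[target_attribute]] = counts.get(ex[target_attribute], 0) + 1
        total = len(covered)
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        return -entropy                 # higher is better (purer coverage)

    # Fix the target attribute so the score matches the two-argument
    # performance(rule, examples) used in the earlier sketches.
    play_tennis_score = functools.partial(entropy_performance,
                                          target_attribute="PlayTennis")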
MACHINE LEARNING-LEARNING RULE
First-Order Logic Definitions
 Every expression is composed of constants, variables, predicates, and functions.
 A term is any constant, or variable, or any function applied to any term.
 A literal is any predicate (or its negation) applied to any set of terms,
e.g. Female(Mary), ¬Female(x).
 A ground literal is a literal that does not contain any variables.
 A clause is any disjunction of literals whose variables are universally
quantified, e.g. ∀x : Female(x) ∨ Male(x).
 A Horn clause is an expression of the form H ← L1 ∧ L2 ∧ ... ∧ Ln.
First Order Logic: Terminology
 constants – e.g. bob, 23, a
 variables – e.g. X, Y, Z
 predicate symbols – e.g. female, father
   (predicates take on the values True or False only)
 function symbols – e.g. age
   (functions can take on any constant as a value)
 connectives – e.g. ∧, ∨, ¬, → (or ←)
 quantifiers – e.g. ∀, ∃
 A term is
   any constant – e.g. bob
   any variable – e.g. X
   any function applied to any term – e.g. age(bob)
 A literal is any predicate or negated predicate applied to any terms –
e.g. female(sue), ¬father(X,Y)
 A ground literal is a literal that contains no variables – e.g. female(sue)
 A positive literal is a literal that does not contain a negated predicate –
e.g. female(sue)
 A negative literal is a literal that contains a negated predicate –
e.g. ¬father(X,Y)
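For concreteness, one hypothetical way to encode these notions in Python (the
slides do not prescribe any representation) is with nested tuples:

    # Hypothetical encoding: a constant is a lower-case string, a variable an
    # upper-case string, and a function or predicate application is a tuple
    # (symbol, arg1, arg2, ...).

    bob = "bob"                              # constant
    X, Y = "X", "Y"                          # variables
    age_of_bob = ("age", bob)                # function applied to a term

    female_sue = ("female", "sue")           # positive ground literal
    not_father = ("not", ("father", X, Y))   # negative literal with variables

    def is_ground(expr):
        """A literal/term is ground if it contains no variables (upper-case names)."""
        if isinstance(expr, str):
            return not expr[0].isupper()
        return all(is_ground(arg) for arg in expr[1:])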
Learning First-Order Horn Clauses
 Say we are trying to learn the concept Daughter(x,y) from examples.
 Each person is described by the attributes Name, Mother, Father, Male, Female.
 Each example is a pair of instances, say a and b:
 Name(a) = Sharon, Mother(a) = Louise, Father(a) = Bob,
Male(a) = False, Female(a) = True
Name(b) = Bob, Mother(b) = Nora, Father(b) = Victor,
Male(b) = True, Female(b) = False, Daughter(a,b) = True
 If we give a bunch of these examples to CN2 or C4.5 they will output a set of
rules like: IF Father(a) = Bob ∧ Name(b) = Bob ∧ Female(a) THEN Daughter(a,b)
 A first-order learner would output more general rules like:
IF Father(x) = y ∧ Female(x) THEN Daughter(x,y)
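As a small illustration, the first-order rule above can be checked directly
against the pair of instances a and b from this slide (the daughter function
and the dict encoding are hypothetical):

    def daughter(x, y):
        """IF Father(x) = y and Female(x) THEN Daughter(x, y)  (first-order rule)."""
        return x["Father"] == y["Name"] and x["Female"]

    a = {"Name": "Sharon", "Mother": "Louise", "Father": "Bob",
         "Male": False, "Female": True}
    b = {"Name": "Bob", "Mother": "Nora", "Father": "Victor",
         "Male": True, "Female": False}

    print(daughter(a, b))   # True: Sharon is Bob's daughter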
FOIL ALGORITHM
FOIL extends the SEQUENTIAL-COVERING and LEARN-ONE-RULE algorithms for
propositional rule learning to first order rule learning.
 FOIL learns in two phases:
 an outer loop which acquires a disjunction of Horn clause-like rules which
together cover the positive examples;
 an inner loop which constructs individual rules by progressively specialising
a rule, adding new literals until no negative examples are covered.
FOIL(Target_predicate, Predicates, Examples)
  Pos ← positive Examples
  Neg ← negative Examples
  while Pos is not empty, do
    Learn a NewRule:
      NewRule ← most general rule possible
      NewRuleNeg ← Neg
      while NewRuleNeg is not empty, do
        Add a new literal to specialize NewRule:
        1. Candidate_literals ← generate candidates
        2. Best_literal ← argmax over L in Candidate_literals of Foil_Gain(L, NewRule)
        3. Add Best_literal to the preconditions of NewRule
        4. NewRuleNeg ← subset of NewRuleNeg that satisfies the NewRule preconditions
      Learned_rules ← Learned_rules + NewRule
      Pos ← Pos − {members of Pos covered by NewRule}
  Return Learned_rules
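FOIL scores candidate literals with Foil_Gain. A sketch of that computation,
following the formula in Mitchell's book, where p0, n0 are the positive and
negative bindings of the rule before adding the literal, p1, n1 after adding
it, and t is the number of positive bindings still covered afterwards:

    import math

    def foil_gain(p0, n0, p1, n1, t):
        """Foil_Gain(L, R) = t * (log2(p1/(p1+n1)) - log2(p0/(p0+n0)))  (sketch)."""
        if p1 == 0 or p0 == 0:
            return float("-inf")    # avoid log of zero; such literals are rejected
        return t * (math.log2(p1 / (p1 + n1)) - math.log2(p0 / (p0 + n0)))

    # Example: a literal that keeps all 4 positives and removes both negatives.
    print(foil_gain(p0=4, n0=2, p1=4, n1=0, t=4))   # about 2.34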
Inductive and Analytical Learning
INDUCTIVE LEARNING               ANALYTICAL LEARNING
Hypothesis fits data             Hypothesis fits domain theory
Statistical inference            Deductive inference
Requires little prior knowledge  Learns from scarce data
Syntactic inductive bias         Bias is the domain theory
Plentiful data                   Scarce data
No prior knowledge               Perfect prior knowledge

Inductive logic programming is particularly useful in bioinformatics and
natural language processing.
Domain Theory
Cup ← Stable, Liftable, OpenVessel
Stable ← BottomIsFlat
Liftable ← Graspable, Light
Graspable ← HasHandle
OpenVessel ← HasConcavity, ConcavityPointsUp
(Figure: the same domain theory drawn as a tree, with Cup at the root, Stable,
Liftable and OpenVessel below it, Graspable below Liftable, and the primitive
attributes BottomIsFlat, Light, HasConcavity, ConcavityPointsUp and HasHandle
at the leaves.)
KBANN
Knowledge-Based Artificial Neural Networks
KBANN(data D, domain theory B):
1. Create a feedforward network h equivalent to B
2. Use BACKPROPAGATION to tune h to fit D
Neural Net Equivalent to Domain Theory
(Figure: a feedforward network whose inputs are the instance attributes
Expensive, BottomIsFlat, MadeOfCeramic, MadeOfStyrofoam, MadeOfPaper,
HasHandle, HandleOnTop, HandleOnSide, Light, HasConcavity, ConcavityPointsUp
and Fragile; hidden units Stable, Liftable, OpenVessel and Graspable; and
output Cup. Legend: large positive weight, large negative weight, negligible
weight.)
Creating a Network Equivalent to the Domain Theory
Create one unit per Horn clause rule (an AND unit).
Connect the unit inputs to the corresponding clause antecedents.
For each non-negated antecedent, set the corresponding input weight w ← W,
where W is some constant.
For each negated antecedent, set the weight w ← −W.
Set the threshold weight w0 ← −(n − 0.5)·W, where n is the number of
non-negated antecedents.
Finally, add additional connections with near-zero weights.
Example: Liftable ← Graspable, ¬Heavy
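A minimal sketch of this weight-setting rule for a single Horn clause, assuming
a clause is encoded as (head, list of antecedents) with negated antecedents
prefixed by "¬" (the encoding and the function name are hypothetical):

    def clause_to_unit(clause, all_features, W=4.0, eps=0.0):
        """Build sigmoid-unit weights for one Horn clause, KBANN style (sketch).
        clause: (head, antecedents), e.g. ("Liftable", ["Graspable", "¬Heavy"])
        Returns (weights over all_features, threshold weight w0)."""
        head, antecedents = clause
        weights = {f: eps for f in all_features}   # near-zero links elsewhere
        n_pos = 0
        for ant in antecedents:
            if ant.startswith("¬"):
                weights[ant[1:]] = -W              # negated antecedent: large negative
            else:
                weights[ant] = W                   # non-negated antecedent: large positive
                n_pos += 1
        w0 = -(n_pos - 0.5) * W                    # unit fires only if all antecedents hold
        return weights, w0

    # Example from the slide: Liftable <- Graspable, ¬Heavy
    w, w0 = clause_to_unit(("Liftable", ["Graspable", "¬Heavy"]),
                           all_features=["Graspable", "Heavy", "Light", "HasHandle"])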
Result of Refining the Network
(Figure: the same network after it has been tuned to fit the data, with the
same attribute inputs and units as the network above. Legend: large positive
weight, large negative weight, negligible weight.)
Hypothesis Space Search in KBANN
(Figure: in the hypothesis space, the domain theory gives KBANN an initial
hypothesis near the region of hypotheses that fit the training data equally
well, whereas plain Backpropagation starts from an unrelated initial
hypothesis.)
EBNN
Explanation-Based Neural Network
Key idea:
Start from a previously learned, approximate domain theory.
The domain theory is represented by a collection of neural networks.
Learn the target function as another neural network.
Explanation in Terms of Domain Theory
Prior learned networks for useful concepts are combined into a single target
network.
(Figure: previously learned networks for Graspable, Stable, OpenVessel and
Liftable, over the same attribute inputs as before, are composed into a
network for Cup.)
Hypothesis Space Search in TangentProp
(Figure: in the hypothesis space, Backpropagation search moves toward
hypotheses that maximize fit to the data, while TangentProp search moves toward
hypotheses that maximize fit to both the data and the prior knowledge.)
FOCL Algorithm (First Order Combined Learner)
An adaptation of FOIL that uses a domain theory.
When adding a literal to a rule, FOCL considers not only single literals but
also literals derived from the domain theory.
Most importantly, a potentially incorrect hypothesis is allowed as an initial
approximation to the predicate to be learned. The main goal of FOCL is to
incorporate the methods of explanation-based learning (EBL) into inductive
rule learning.
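A very rough sketch of the extra candidate-generation step FOCL adds on top of
FOIL: besides the usual single literals, it also proposes the body of a
domain-theory clause whose head matches the target predicate (real FOCL
additionally operationalizes non-operational predicates; all names below are
hypothetical):

    def focl_candidates(attribute_tests, domain_theory, target):
        """Candidate specializations considered by FOCL (sketch).
        attribute_tests: list of single literals, as in FOIL
        domain_theory:   dict mapping a clause head to its antecedent literals
        target:          predicate currently being learned"""
        candidates = list(attribute_tests)          # FOIL's usual candidates
        for head, body in domain_theory.items():
            if head == target:
                # Add the whole clause body as one (possibly multi-literal) candidate.
                candidates.append(tuple(body))
        return candidates

    theory = {"Cup": ["Stable", "Liftable", "OpenVessel"]}
    print(focl_candidates(["HasHandle", "¬HasHandle", "Fragile"], theory, "Cup"))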
Search in FOCL
Cup ←                                    (most general rule)
  Cup ← HasHandle                                      [2+, 3-]
  Cup ← ¬HasHandle                                     [2+, 3-]
  Cup ← Fragile                                        [2+, 4-]
  Cup ← BottomIsFlat, Light, HasConcavity,
        ConcavityPointsUp                              [4+, 2-]
  ...
    Cup ← BottomIsFlat, Light, HasConcavity,
          ConcavityPointsUp, HandleOnTop               [0+, 2-]
    Cup ← BottomIsFlat, Light, HasConcavity,
          ConcavityPointsUp, ¬HandleOnTop              [4+, 0-]
    Cup ← BottomIsFlat, Light, HasConcavity,
          ConcavityPointsUp, HandleOnSide              [2+, 0-]
    ...
FOCL Results
Recognizing legal chess endgame positions (30 positive, 30 negative examples):
  FOIL: 86%
  FOCL: 94% (using a domain theory with 76% accuracy)
NYNEX telephone network diagnosis (500 training examples):
  FOIL: 90%
  FOCL: 98% (using a domain theory with 95% accuracy)
VIDEO LINK
 https://www.youtube.com/watch?v=rVlN0ZCYmtI
 https://www.youtube.com/watch?v=SSrD02pdE78
RECAP
Sequential Covering Algorithm
Learn One Rule
The FOIL Algorithm
 FOCL Algorithm
REFERENCES
Tom M. Mitchell, Machine Learning, McGraw-Hill, 1997.
http://jmvidal.cse.sc.edu/talks/learningrules/allslides.xml
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/mlbook/ch10.pdf
https://bcssp10.files.wordpress.com/2013/02/lecture151.pdf
HEARTFELT THANK YOU