Introduction to Pattern
Recognition
Vojtěch Franc
xfrancv@cmp.felk.cvut.cz
Center for Machine Perception
Czech Technical University in Prague
What is pattern recognition?
A pattern is an object, process or event that can be
given a name.
A pattern class (or category) is a set of patterns
sharing common attributes and usually originating from
the same source.
During recognition (or classification), given objects
are assigned to prescribed classes.
A classifier is a machine which performs classification.
“The assignment of a physical object or event to one of
several prespecified categories” -- Duda & Hart
Examples of applications
• Optical Character Recognition (OCR)
  • Handwritten: sorting letters by postal code, input device for PDAs.
  • Printed texts: reading machines for blind people, digitization of text documents.
• Biometrics
  • Face recognition, verification, retrieval.
  • Fingerprint recognition.
  • Speech recognition.
• Diagnostic systems
  • Medical diagnosis: X-ray, EKG analysis.
  • Machine diagnostics, waster detection.
• Military applications
  • Automated Target Recognition (ATR).
  • Image segmentation and analysis (recognition from aerial or satellite photographs).
Approaches
Statistical PR: based on an underlying statistical model
of patterns and pattern classes.
Structural (or syntactic) PR: pattern classes are
represented by means of formal structures such as
grammars, automata, strings, etc.
Neural networks: the classifier is represented as a
network of cells modeling the neurons of the human brain
(connectionist approach).
Basic concepts

Pattern:
Feature vector $\mathbf{x} = (x_1, x_2, \ldots, x_n)^T$
- A vector of observations (measurements).
- $\mathbf{x}$ is a point in the feature space $\mathcal{X}$, i.e., $\mathbf{x} \in \mathcal{X}$.
Hidden state $y \in \mathcal{Y}$
- Cannot be directly measured.
- Patterns with equal hidden state belong to the same class.
Task
- To design a classifier (decision rule) $q: \mathcal{X} \to \mathcal{Y}$
  which decides about the hidden state based on an observation.
Example

Pattern: $\mathbf{x} = (x_1, x_2)^T$, where $x_1$ = height and $x_2$ = weight.

Task: jockey-hoopster recognition.
The set of hidden states is $\mathcal{Y} = \{H, J\}$.
The feature space is $\mathcal{X} = \mathbb{R}^2$.

Training examples: $\{(\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_\ell, y_\ell)\}$, with labels $y = H$ (hoopsters) and $y = J$ (jockeys).

Linear classifier:
$$q(\mathbf{x}) = \begin{cases} H & \text{if } \mathbf{w} \cdot \mathbf{x} + b \ge 0, \\ J & \text{if } \mathbf{w} \cdot \mathbf{x} + b < 0, \end{cases}$$
with the decision boundary $\mathbf{w} \cdot \mathbf{x} + b = 0$.
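A minimal Python (NumPy) sketch of this linear rule. The values of $\mathbf{w}$ and $b$ below are hand-picked assumptions for illustration; the slides obtain them from the training examples.

```python
import numpy as np

# Hypothetical parameters; in practice w and b are learned from
# the training examples {(x_1, y_1), ..., (x_l, y_l)}.
w = np.array([1.0, 1.0])   # weights for (height, weight)
b = -250.0                 # bias

def q(x):
    """Linear classifier: H (hoopster) if w.x + b >= 0, else J (jockey)."""
    return "H" if np.dot(w, x) + b >= 0 else "J"

print(q(np.array([200.0, 95.0])))  # tall, heavy  -> "H"
print(q(np.array([155.0, 50.0])))  # short, light -> "J"
```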
Components of PR system

[Diagram: Pattern → Sensors and preprocessing → Feature extraction → Classifier → Class assignment, with a Teacher and a Learning algorithm adapting the classifier.]

• Sensors and preprocessing.
• Feature extraction aims to create discriminative features that are good for classification.
• A classifier.
• A teacher provides information about the hidden state -- supervised learning.
• A learning algorithm sets the parameters of the PR system from training examples.
Feature extraction
Task: to extract features which are good for classification.
Good features: • Objects from the same class have similar feature values.
• Objects from different classes have different feature values.
[Figure: scatter plots contrasting “good” (well-separated) and “bad” (overlapping) features.]
Feature extraction methods

Feature extraction: a mapping $\phi = (\phi_1, \ldots, \phi_k)$ computes each new feature from all measurements,
$$\begin{pmatrix} m_1 \\ m_2 \\ \vdots \\ m_k \end{pmatrix} = \begin{pmatrix} \phi_1(x_1, \ldots, x_n) \\ \phi_2(x_1, \ldots, x_n) \\ \vdots \\ \phi_k(x_1, \ldots, x_n) \end{pmatrix}.$$
Feature selection: a subset of $k$ of the original $n$ measurements is kept,
$$(x_1, x_2, \ldots, x_n) \mapsto (m_1, m_2, \ldots, m_k), \quad k \le n.$$

The problem can be expressed as optimization of the parameters $\theta$ of the feature extractor $\phi(\theta)$.
Supervised methods: the objective function is a criterion of separability
(discriminability) of labeled examples, e.g., linear discriminant analysis (LDA).
Unsupervised methods: a lower-dimensional representation which preserves important
characteristics of the input data is sought, e.g., principal component analysis (PCA).
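A minimal sketch of unsupervised feature extraction with PCA, as named above; pure NumPy, with synthetic data standing in for real measurements.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))      # 100 examples, n = 5 measurements each
k = 2                              # target (lower) dimension

Xc = X - X.mean(axis=0)            # center the data
# Principal directions = right singular vectors of the centered data matrix.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
W = Vt[:k].T                       # n x k projection (the extractor phi)
M = Xc @ W                         # extracted features m, one row per example
print(M.shape)                     # (100, 2)
```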
Classifier
A classifier partitions the feature space $\mathcal{X}$ into class-labeled regions such that
$$\mathcal{X} = \mathcal{X}_1 \cup \mathcal{X}_2 \cup \ldots \cup \mathcal{X}_{|\mathcal{Y}|} \quad \text{and} \quad \mathcal{X}_1 \cap \mathcal{X}_2 \cap \ldots \cap \mathcal{X}_{|\mathcal{Y}|} = \emptyset.$$
[Figure: a feature space partitioned into regions $\mathcal{X}_1$, $\mathcal{X}_2$, $\mathcal{X}_3$; the regions of one class need not be contiguous.]
Classification consists of determining to which region a feature vector $\mathbf{x}$ belongs.
The borders between the decision regions are called decision boundaries.
Representation of classifier
A classifier is typically represented as a set of discriminant functions
$$f_i(\mathbf{x}): \mathcal{X} \to \mathbb{R}, \quad i = 1, \ldots, |\mathcal{Y}|.$$
The classifier assigns a feature vector $\mathbf{x}$ to the $i$-th class if
$$f_i(\mathbf{x}) > f_j(\mathbf{x}) \quad \forall j \ne i.$$
[Diagram: feature vector $\mathbf{x}$ → discriminant functions $f_1(\mathbf{x}), \ldots, f_{|\mathcal{Y}|}(\mathbf{x})$ → max → class identifier $y$.]
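A sketch of this representation in code. The linear form of $f_i$ and all parameter values are assumptions, since the slide leaves $f_i$ generic.

```python
import numpy as np

# One hypothetical (w_i, b_i) pair per class, so f_i(x) = w_i . x + b_i.
W = np.array([[ 1.0,  0.0],
              [ 0.0,  1.0],
              [-1.0, -1.0]])
b = np.array([0.0, 0.5, 1.0])

def classify(x):
    """Assign x to the class i whose discriminant f_i(x) is largest."""
    scores = W @ x + b             # f_1(x), ..., f_|Y|(x)
    return int(np.argmax(scores))  # i such that f_i(x) > f_j(x) for all j != i

print(classify(np.array([2.0, 0.1])))  # -> 0
```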
Bayesian decision making
• Bayesian decision making is a fundamental statistical approach which
allows one to design the optimal classifier if the complete statistical model is known.

Definition:
Observations: a set $\mathcal{X}$
Hidden states: a set $\mathcal{Y}$
Decisions: a set $\mathcal{D}$
A loss function: $W: \mathcal{Y} \times \mathcal{D} \to \mathbb{R}$
A decision rule: $q: \mathcal{X} \to \mathcal{D}$
A joint probability: $p(\mathbf{x}, y)$

Task: to design a decision rule q which minimizes the Bayesian risk
$$R(q) = \sum_{y \in \mathcal{Y}} \sum_{\mathbf{x} \in \mathcal{X}} p(\mathbf{x}, y)\, W(q(\mathbf{x}), y).$$
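A tiny numeric check of the risk formula on a made-up discrete model: two observations, two hidden states, an asymmetric loss, and one fixed rule q. All numbers are illustrative assumptions.

```python
import numpy as np

p = np.array([[0.4, 0.1],   # joint p(x, y): rows index x, columns index y
              [0.1, 0.4]])
W = np.array([[0.0, 1.0],   # loss W(y, d): W[1, 0] = 5 makes missing
              [5.0, 0.0]])  # state y = 1 expensive
q = np.array([0, 1])        # a fixed decision rule: d = q[x]

R = sum(p[x, y] * W[y, q[x]] for x in range(2) for y in range(2))
print(R)                    # 0.4*0 + 0.1*5 + 0.1*1 + 0.4*0 = 0.6
```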
Example of Bayesian task
Task: minimization of classification error.
The set of decisions $\mathcal{D}$ is the same as the set of hidden states $\mathcal{Y}$.
The 0/1 loss function is used:
$$W(q(\mathbf{x}), y) = \begin{cases} 0 & \text{if } q(\mathbf{x}) = y, \\ 1 & \text{if } q(\mathbf{x}) \ne y. \end{cases}$$
The Bayesian risk R(q) then corresponds to the probability of
misclassification.
The solution of the Bayesian task is
$$q^* = \arg\min_q R(q) \;\;\Rightarrow\;\; q^*(\mathbf{x}) = \arg\max_y p(y \mid \mathbf{x}) = \arg\max_y \frac{p(\mathbf{x} \mid y)\, p(y)}{p(\mathbf{x})}.$$
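Continuing the style of toy model above, now with the 0/1 loss: $q^*(\mathbf{x}) = \arg\max_y p(y \mid \mathbf{x})$, and the resulting risk is exactly the misclassification probability. The joint table is made up for illustration.

```python
import numpy as np

p = np.array([[0.30, 0.05],   # joint p(x, y) for 3 observations, 2 states
              [0.10, 0.20],
              [0.05, 0.30]])

q_star = p.argmax(axis=1)     # argmax_y p(y|x) == argmax_y p(x, y)
risk = sum(p[x, y] for x in range(3) for y in range(2) if y != q_star[x])
print(q_star)                 # [0 1 1]
print(risk)                   # 0.05 + 0.10 + 0.05 = 0.20
```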
Limitations of Bayesian approach
• The statistical model p(x,y) is mostly not known, therefore
learning must be employed to estimate p(x,y) from training
examples $\{(\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_\ell, y_\ell)\}$ -- plug-in Bayes.
• Non-Bayesian methods offer further task formulations:
  • Only a partial statistical model is available:
    • p(y) is not known or does not exist.
    • p(x|y,θ) is influenced by a non-random intervention θ.
  • The loss function is not defined.
  • Examples: Neyman-Pearson's task, minimax task, etc.
Discriminative approaches
Given a class of classification rules q(x;θ) parametrized by θ ∈ Ξ,
the task is to find the “best” parameter θ* based on a set of
training examples $\{(\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_\ell, y_\ell)\}$ -- supervised learning.
The task of learning: to recognize which classification rule
should be used.
How the learning is performed is determined by the
selected inductive principle.
Empirical risk minimization
principle
The true expected risk R(q) is approximated by the empirical risk
$$R_{\mathrm{emp}}(q(\mathbf{x}; \theta)) = \frac{1}{\ell} \sum_{i=1}^{\ell} W(q(\mathbf{x}_i; \theta), y_i)$$
with respect to a given labeled training set $\{(\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_\ell, y_\ell)\}$.
Learning based on the empirical risk minimization principle is
defined as
$$\theta^* = \arg\min_{\theta} R_{\mathrm{emp}}(q(\mathbf{x}; \theta)).$$
Examples of algorithms: Perceptron, back-propagation, etc.
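A sketch of ERM with the perceptron algorithm named above, for a linear rule and 0/1 loss, on a synthetic linearly separable toy set (all values illustrative).

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y = np.array([-1] * 20 + [+1] * 20)

w, b = np.zeros(2), 0.0
for _ in range(100):                       # epochs
    errors = 0
    for xi, yi in zip(X, y):
        if yi * (np.dot(w, xi) + b) <= 0:  # misclassified example
            w, b = w + yi * xi, b + yi     # perceptron update
            errors += 1
    if errors == 0:                        # empirical risk driven to zero
        break

R_emp = np.mean(np.sign(X @ w + b) != y)   # empirical 0/1 risk
print(w, b, R_emp)
```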
Overfitting and underfitting
Problem: how rich a class of classification rules q(x;θ) to use.
[Figure: three fits of the same data, illustrating underfitting, a good fit, and overfitting.]
Problem of generalization: a small empirical risk R_emp does not imply a
small true expected risk R.
Structural risk
minimization principle
An upper bound on the expected risk of a classification rule $q \in Q$:
$$R(q) \le R_{\mathrm{emp}}(q) + R_{\mathrm{str}}\!\left(\frac{1}{\ell},\, h,\, \log\frac{1}{\sigma}\right)$$
where $\ell$ is the number of training examples, $h$ is the VC-dimension of the class
of functions $Q$, and $1 - \sigma$ is the confidence of the upper bound.
SRM principle: from given nested function classes $Q_1, Q_2, \ldots, Q_m$
such that
$$h_1 \le h_2 \le \ldots \le h_m,$$
select the rule $q^*$ which minimizes the upper bound on the expected risk.
Statistical learning theory -- Vapnik & Chervonenkis.
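A sketch of the SRM selection rule. The slide leaves $R_{\mathrm{str}}$ abstract; the concrete complexity term below is one commonly quoted Vapnik-style form and should be read as an assumption, as should the example numbers.

```python
import numpy as np

def complexity(h, l, sigma=0.05):
    # One common form of the VC confidence term (an assumption here).
    return np.sqrt((h * (np.log(2 * l / h) + 1) + np.log(4 / sigma)) / l)

l = 200                                    # number of training examples
h_values = [2, 5, 10, 50, 100]             # VC-dimensions of Q_1, ..., Q_m
R_emp = [0.30, 0.18, 0.12, 0.10, 0.09]     # hypothetical empirical risks

bounds = [r + complexity(h, l) for r, h in zip(R_emp, h_values)]
best = int(np.argmin(bounds))
print(best, round(bounds[best], 3))        # SRM picks the smallest bound
```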
Unsupervised learning
Input: training examples $\{\mathbf{x}_1, \ldots, \mathbf{x}_\ell\}$ without information about the
hidden state.
Clustering: the goal is to find clusters of data sharing similar properties.
A broad class of unsupervised learning algorithms alternates two components:
a (supervised) learning algorithm $L: (\mathcal{X} \times \mathcal{Y})^\ell \to \Theta$ producing parameters $\theta$,
and a classifier $q: \mathcal{X} \times \Theta \to \mathcal{Y}$ producing labels $\{y_1, \ldots, y_\ell\}$
for $\{\mathbf{x}_1, \ldots, \mathbf{x}_\ell\}$, each feeding the other.
Example of unsupervised
learning algorithm
k-Means clustering:
Goal: minimize
$$\sum_{i=1}^{\ell} \|\mathbf{x}_i - \mathbf{m}_{q(\mathbf{x}_i)}\|^2$$
over the means $\theta = \{\mathbf{m}_1, \ldots, \mathbf{m}_k\}$.
Classifier:
$$q(\mathbf{x}) = \arg\min_{i = 1, \ldots, k} \|\mathbf{x} - \mathbf{m}_i\|$$
Learning algorithm:
$$\mathbf{m}_i = \frac{1}{|\mathcal{I}_i|} \sum_{j \in \mathcal{I}_i} \mathbf{x}_j, \qquad \mathcal{I}_i = \{j : q(\mathbf{x}_j) = i\}$$
The two steps alternate: the classifier labels $\{\mathbf{x}_1, \ldots, \mathbf{x}_\ell\}$ with
$\{y_1, \ldots, y_\ell\}$, and the learning algorithm re-estimates the means.
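A direct NumPy sketch of the two alternating steps above; the data generation is synthetic and purely illustrative.

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    m = X[rng.choice(len(X), k, replace=False)]   # initial means theta
    for _ in range(iters):
        # Classifier step: q(x) = index of the nearest mean.
        y = np.argmin(((X[:, None] - m[None]) ** 2).sum(-1), axis=1)
        # Learning step: each mean = average of the points assigned to it.
        m_new = np.array([X[y == i].mean(axis=0) if np.any(y == i) else m[i]
                          for i in range(k)])
        if np.allclose(m_new, m):                 # converged
            break
        m = m_new
    return m, y

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.5, (30, 2)) for c in (-3, 0, 3)])
means, labels = kmeans(X, k=3)
print(means)
```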
References
Books
Duda, Hart: Pattern Classification and Scene Analysis. J. Wiley & Sons, New
York, 1982 (2nd edition 2000).
Fukunaga: Introduction to Statistical Pattern Recognition. Academic Press, 1990.
Bishop: Neural Networks for Pattern Recognition. Clarendon Press, Oxford, 1997.
Schlesinger, Hlaváč: Ten Lectures on Statistical and Structural Pattern Recognition.
Kluwer Academic Publishers, 2002.
Journals
Journal of the Pattern Recognition Society.
IEEE Transactions on Neural Networks.
Pattern Recognition and Machine Learning.