Part 1: Graphical Models
Machine Learning Techniques for Computer Vision
Christopher M. Bishop, Microsoft Research Cambridge
ECCV 2004, Prague
About this Tutorial
- Learning is the new frontier in computer vision
- Focus on concepts:
  - not lists of algorithms
  - not technical details
- Graduate level
- Please ask questions!
Overview
- Part 1: Graphical models
  - directed and undirected graphs
  - inference and learning
- Part 2: Unsupervised learning
  - mixture models, EM
  - variational inference, model complexity
  - continuous latent variables
- Part 3: Supervised learning
  - decision theory
  - linear models, neural networks, boosting, sparse kernel machines
Probability Theory
- Sum rule: p(x) = Σ_y p(x, y)
- Product rule: p(x, y) = p(y | x) p(x)
- From these we have Bayes' theorem
    p(y | x) = p(x | y) p(y) / p(x)
  with normalization p(x) = Σ_y p(x | y) p(y)
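A minimal numerical sketch of these rules (added for this write-up; the joint table is illustrative, not from the slides):

```python
# Joint distribution p(x, y) over two binary variables as a 2x2 table.
# Rows index x, columns index y; entries sum to 1.
import numpy as np

p_xy = np.array([[0.3, 0.1],
                 [0.2, 0.4]])

# Sum rule: marginalize out y (or x).
p_x = p_xy.sum(axis=1)             # p(x)
p_y = p_xy.sum(axis=0)             # p(y)

# Product rule: conditional from joint and marginal.
p_y_given_x = p_xy / p_x[:, None]  # p(y | x)

# Bayes' theorem: recover p(x | y) from p(y | x), p(x), and p(y).
p_x_given_y = p_y_given_x * p_x[:, None] / p_y[None, :]

# Check against the direct definition p(x | y) = p(x, y) / p(y).
assert np.allclose(p_x_given_y, p_xy / p_y[None, :])
```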
Role of the Graphs
- New insights into existing models
- Motivation for new models
- Graph-based algorithms for calculation and computation
- c.f. Feynman diagrams in physics
Decomposition
- Consider an arbitrary joint distribution p(x, y, z)
- By successive application of the product rule:
    p(x, y, z) = p(z | x, y) p(y | x) p(x)
Directed Acyclic Graphs
- Joint distribution
    p(x) = Π_i p(x_i | pa_i)
  where pa_i denotes the parents of node i
- No directed cycles
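A sketch of this factorization on a hypothetical three-node network z → x, z → y; the conditional probability tables are invented for illustration:

```python
# DAG factorization p(x, y, z) = p(z) p(x | z) p(y | z) for a
# hypothetical network z -> x, z -> y with binary variables.
import numpy as np

p_z = np.array([0.6, 0.4])                  # p(z)
p_x_given_z = np.array([[0.9, 0.1],         # p(x | z), rows index z
                        [0.3, 0.7]])
p_y_given_z = np.array([[0.8, 0.2],         # p(y | z), rows index z
                        [0.5, 0.5]])

# Build the joint from the factorization; result is indexed [x, y, z].
p_xyz = (p_z[None, None, :]
         * p_x_given_z.T[:, None, :]
         * p_y_given_z.T[None, :, :])

assert np.isclose(p_xyz.sum(), 1.0)         # a valid joint distribution
```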
Undirected Graphs
- Provided p(x) > 0, the joint distribution is a product of non-negative functions over the cliques of the graph:
    p(x) = (1/Z) Π_C ψ_C(x_C)
  where the ψ_C(x_C) are the clique potentials, and Z is a normalization constant
    Z = Σ_x Π_C ψ_C(x_C)
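A sketch of the undirected factorization, with Z computed by brute-force enumeration on a small chain; the potential values are assumed:

```python
# Undirected model: a 3-node chain x1 - x2 - x3 of binary variables
# with one pairwise potential per edge (the cliques of the chain).
import itertools
import numpy as np

psi = np.array([[2.0, 0.5],   # psi(x_i, x_j): favours equal neighbours
                [0.5, 2.0]])

def unnormalized(x1, x2, x3):
    # Product of clique potentials over the two edges.
    return psi[x1, x2] * psi[x2, x3]

# Normalization constant Z: sum over all joint configurations.
Z = sum(unnormalized(*x) for x in itertools.product([0, 1], repeat=3))

def joint(x1, x2, x3):
    return unnormalized(x1, x2, x3) / Z

assert np.isclose(sum(joint(*x) for x in itertools.product([0, 1], repeat=3)), 1.0)
```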
Conditioning on Evidence
- Variables may be hidden (latent) or visible (observed)
- Latent variables may have a specific interpretation, or may be introduced to permit a richer class of distributions
Conditional Independences
- x is independent of y given z if, for all values of z,
    p(x, y | z) = p(x | z) p(y | z)
- For undirected graphs this is given by graph separation!
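A numerical check of this definition, reusing the hypothetical z → x, z → y network from the sketch above, where the conditional independence holds by construction:

```python
# Verify numerically that x and y are conditionally independent given z
# in the hypothetical z -> x, z -> y network (binary variables).
import numpy as np

p_z = np.array([0.6, 0.4])
p_x_given_z = np.array([[0.9, 0.1], [0.3, 0.7]])
p_y_given_z = np.array([[0.8, 0.2], [0.5, 0.5]])

# Joint p(x, y, z) from the factorization, indexed [x, y, z].
p_xyz = p_z * p_x_given_z.T[:, None, :] * p_y_given_z.T[None, :, :]

# Condition on z: p(x, y | z) = p(x, y, z) / p(z).
p_xy_given_z = p_xyz / p_z

# Compare with the factorized form p(x | z) p(y | z).
factorized = p_x_given_z.T[:, None, :] * p_y_given_z.T[None, :, :]

assert np.allclose(p_xy_given_z, factorized)   # CI holds for every z
```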
“Explaining Away”
- Conditional independence for directed graphs is similar, but with one subtlety
- Illustration: pixel colour in an image, where the observed image colour depends on both surface colour and lighting colour; observing the image makes the two causes dependent
Directed versus Undirected
Example: State Space Models
- Hidden Markov model
- Kalman filter
Example: Bayesian SSM
Example: Factorial SSM
- Multiple hidden sequences
- Avoids an exponentially large hidden space
Example: Markov Random Field
- Typical application: image region labelling
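A sketch of what such a labelling model can look like: a hypothetical pairwise MRF with per-pixel unary evidence and an Ising-style smoothness potential (all values illustrative, not from the tutorial):

```python
# Pairwise MRF for binary image labelling on a tiny grid: per-pixel
# unary potentials plus smoothness potentials on the 4-connected grid.
import numpy as np

H, W = 3, 3
rng = np.random.default_rng(0)
unary = rng.random((H, W, 2)) + 0.1    # hypothetical evidence psi_i(x_i)
beta = 1.5                             # smoothness strength (assumed)

def log_potential(labels):
    """Unnormalized log-probability of a full labelling (H x W array of 0/1)."""
    score = np.log(unary[np.arange(H)[:, None], np.arange(W)[None, :], labels]).sum()
    # Pairwise potentials: reward equal neighbouring labels.
    score += beta * (labels[:, :-1] == labels[:, 1:]).sum()   # horizontal edges
    score += beta * (labels[:-1, :] == labels[1:, :]).sum()   # vertical edges
    return score

labels = np.zeros((H, W), dtype=int)   # an example labelling
print(log_potential(labels))
```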
Example: Conditional Random Field
Inference
- Simple example: Bayes' theorem
Message Passing Example
- Find the marginal for a particular node x_n:
    p(x_n) = Σ_{x_1} … Σ_{x_{n-1}} Σ_{x_{n+1}} … Σ_{x_N} p(x_1, …, x_N)
- For M-state nodes, the naive cost is O(M^N): exponential in the length N of the chain
- But we can exploit the graphical structure (conditional independences)
Message Passing
- Joint distribution of a chain of nodes:
    p(x) = (1/Z) ψ_{1,2}(x_1, x_2) ψ_{2,3}(x_2, x_3) ⋯ ψ_{N-1,N}(x_{N-1}, x_N)
- Exchange sums and products: push each summation inside as far as it will go, reducing the cost from O(M^N) to O(N M²)
Message Passing
- Express the marginal as a product of messages:
    p(x_n) ∝ μ_α(x_n) μ_β(x_n)
- Recursive evaluation of messages:
    μ_α(x_n) = Σ_{x_{n-1}} ψ_{n-1,n}(x_{n-1}, x_n) μ_α(x_{n-1})
    μ_β(x_n) = Σ_{x_{n+1}} ψ_{n,n+1}(x_n, x_{n+1}) μ_β(x_{n+1})
- Find Z by normalizing
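A sketch of this chain algorithm; the potentials are random, and the result is checked against brute-force enumeration:

```python
# Message passing on a chain of N discrete nodes with M states and
# pairwise potentials psi[n] linking x_n and x_{n+1} (values illustrative).
import numpy as np

N, M = 5, 3
rng = np.random.default_rng(0)
psi = rng.random((N - 1, M, M)) + 0.1   # psi[n][i, j] = psi(x_n=i, x_{n+1}=j)

# Forward messages mu_alpha and backward messages mu_beta.
alpha = np.ones((N, M))
for n in range(1, N):
    alpha[n] = alpha[n - 1] @ psi[n - 1]          # sum over x_{n-1}

beta = np.ones((N, M))
for n in range(N - 2, -1, -1):
    beta[n] = psi[n] @ beta[n + 1]                # sum over x_{n+1}

# Marginal at each node: product of incoming messages, then normalize
# (this normalization is exactly where Z is absorbed).
marginals = alpha * beta
marginals /= marginals.sum(axis=1, keepdims=True)

# Check node 2 against brute-force enumeration of the joint.
joint = np.einsum('ab,bc,cd,de->abcde', psi[0], psi[1], psi[2], psi[3])
brute = joint.sum(axis=(0, 1, 3, 4))
brute /= brute.sum()
assert np.allclose(marginals[2], brute)
```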
Belief Propagation
- Extension to general tree-structured graphs
- At each node:
  - form product of incoming messages and local evidence
  - marginalize to give outgoing message
  - one message in each direction across every link
- Fails if there are loops
Junction Tree Algorithm
- An efficient exact algorithm for a general graph
  - applies to both directed and undirected graphs
  - compile the original graph into a tree of cliques
  - then perform message passing on this tree
- Problem: cost is exponential in the size of the largest clique
  - many vision models have intractably large cliques
Loopy Belief Propagation
- Apply belief propagation directly to a general graph
  - need to keep iterating
  - might not converge
- State-of-the-art performance in error-correcting codes
Max-product Algorithm
- Goal: find the most probable configuration
    x* = argmax_x p(x)
- Define p(x*) = max_x p(x); evaluate it recursively, as for the marginals
- Message-passing algorithm with “sum” replaced by “max”
- Example: Viterbi algorithm for HMMs
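A minimal Viterbi sketch for a two-state HMM, worked in log space (max-sum); the transition and emission values are invented:

```python
# Viterbi: max-product message passing for an HMM, in log space.
import numpy as np

log_pi = np.log([0.6, 0.4])                  # initial state probabilities
log_A = np.log([[0.7, 0.3], [0.4, 0.6]])     # transitions A[i, j]: i -> j
log_B = np.log([[0.9, 0.1], [0.2, 0.8]])     # emissions B[state, obs]
obs = [0, 1, 1, 0]                           # an observation sequence

T, K = len(obs), 2
delta = np.zeros((T, K))        # best log-prob of any path ending in state k
back = np.zeros((T, K), dtype=int)

delta[0] = log_pi + log_B[:, obs[0]]
for t in range(1, T):
    scores = delta[t - 1][:, None] + log_A   # scores[i, j]: from i to j
    back[t] = scores.argmax(axis=0)          # best predecessor of each j
    delta[t] = scores.max(axis=0) + log_B[:, obs[t]]

# Backtrack to recover the most probable state sequence x*.
path = [int(delta[-1].argmax())]
for t in range(T - 1, 0, -1):
    path.append(int(back[t][path[-1]]))
path.reverse()
print(path)
```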
Inference and Learning
- Data set D = {x_1, …, x_N}
- Likelihood function (independent observations):
    p(D | w) = Π_n p(x_n | w)
- Maximize the (log) likelihood ln p(D | w)
- Predictive distribution p(x | w_ML)
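A minimal sketch of maximum likelihood on synthetic data, for a one-dimensional Gaussian where the maximizer is available in closed form:

```python
# With independent observations, the Gaussian log-likelihood is
# maximized in closed form by the sample mean and variance.
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(loc=2.0, scale=0.5, size=1000)   # synthetic data set D

mu_ml = data.mean()                     # argmax of the log-likelihood in mu
var_ml = ((data - mu_ml) ** 2).mean()   # ML variance (biased: divides by N)

print(mu_ml, var_ml)
```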
Regularized Maximum Likelihood
- Prior p(w), posterior p(w | D) ∝ p(D | w) p(w)
- MAP (maximum a posteriori):
    w_MAP = argmax_w p(D | w) p(w)
- Predictive distribution p(x | w_MAP)
- Not really Bayesian
Bayesian Learning
- Key idea is to marginalize over unknown parameters, rather than make point estimates:
    p(x | D) = ∫ p(x | w) p(w | D) dw
  - avoids severe over-fitting of ML and MAP
  - allows direct model comparison
- Parameters are now latent variables
- Bayesian learning is an inference problem!
And Finally … the Exponential Family
- Many distributions can be written in the form
    p(x | η) = h(x) g(η) exp{ ηᵀ u(x) }
- Includes: Gaussian, Dirichlet, Gamma, multinomial, Wishart, Bernoulli, …
- Building blocks in graphs to give rich probabilistic models
Illustration: the Gaussian
- Use precision (inverse variance) λ = 1/σ²:
    p(x | μ, λ) = (λ/2π)^{1/2} exp{ −(λ/2)(x − μ)² }
- In standard exponential-family form (see the worked rewrite below)
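A worked rewrite into the exponential-family form above; this is the standard derivation, reconstructed here rather than copied from the slide:

```latex
% Expanding the square gives the natural parameters and sufficient
% statistics of the Gaussian (here h(x) = 1).
\begin{align}
p(x \mid \mu, \lambda)
  &= \left(\frac{\lambda}{2\pi}\right)^{1/2}
     \exp\!\left\{-\frac{\lambda}{2}(x-\mu)^2\right\} \\
  &= \underbrace{\left(\frac{\lambda}{2\pi}\right)^{1/2}
     \exp\!\left\{-\frac{\lambda\mu^2}{2}\right\}}_{g(\boldsymbol\eta)}
     \exp\!\left\{
       \underbrace{\begin{pmatrix}\lambda\mu \\ -\lambda/2\end{pmatrix}^{\mathsf T}}_{\boldsymbol\eta^{\mathsf T}}
       \underbrace{\begin{pmatrix}x \\ x^2\end{pmatrix}}_{\mathbf{u}(x)}
     \right\}
\end{align}
```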
Maximum Likelihood
- Likelihood function (independent observations):
    p(D | η) = [Π_n h(x_n)] g(η)^N exp{ ηᵀ Σ_n u(x_n) }
- Depends on the data only via the sufficient statistics Σ_n u(x_n), which have fixed dimension
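The same point in code: for a one-dimensional Gaussian the sufficient statistics are Σ x_n and Σ x_n², and the ML estimates depend on the data only through them (a sketch on synthetic data):

```python
# Sufficient statistics for a 1-D Gaussian: sum(x) and sum(x^2).
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(3.0, 1.5, size=10_000)

N = len(data)
s1, s2 = data.sum(), (data ** 2).sum()   # sufficient statistics

# ML estimates from the sufficient statistics alone.
mu_ml = s1 / N
var_ml = s2 / N - mu_ml ** 2             # E[x^2] - E[x]^2

assert np.isclose(var_ml, data.var())
```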
Conjugate Priors
- Prior has the same functional form as the likelihood:
    p(η | χ, ν) ∝ g(η)^ν exp{ ν ηᵀ χ }
- Hence the posterior is of the form
    p(η | D, χ, ν) ∝ g(η)^{ν+N} exp{ ηᵀ ( ν χ + Σ_n u(x_n) ) }
- Can interpret the prior as ν effective observations of value χ
- Examples:
  - Gaussian for the mean of a Gaussian
  - Gaussian–Wishart for the mean and precision of a Gaussian
  - Dirichlet for the parameters of a discrete distribution
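A sketch of the first example, a Gaussian prior on the mean of a Gaussian with known precision; the prior settings and data are illustrative:

```python
# Conjugate update: Gaussian prior on the mean of a Gaussian likelihood
# with known precision. The posterior stays Gaussian, as conjugacy
# guarantees, so the update is two lines of algebra.
import numpy as np

lam = 1.0                 # known precision of the likelihood
mu0, lam0 = 0.0, 0.1      # prior mean and prior precision (assumed)

rng = np.random.default_rng(3)
data = rng.normal(2.0, lam ** -0.5, size=50)

# Standard Gaussian-Gaussian posterior update for the mean.
N = len(data)
lam_post = lam0 + N * lam
mu_post = (lam0 * mu0 + lam * data.sum()) / lam_post

print(mu_post, lam_post)  # posterior concentrates near the true mean
```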
Summary of Part 1
- Directed graphs
- Undirected graphs
- Inference by message passing: belief propagation
