2. Graphical Models
• A graphical model is a probabilistic model for which a graph expresses
the conditional dependence structure between random variables.
• It provides a language that facilitates communication between a domain
expert and a statistician, supports flexible and modular definitions of
families of probability distributions, and is amenable to scalable
computational techniques
• Graphical models in machine learning are a powerful framework used to
represent and reason about the dependencies between variables.
• These models provide a structured way to visualize and compute joint
probabilities for a set of variables in complex systems, which is useful for
tasks like prediction, decision making, and inference.
3. Graphical Models
• The graphical model (GM) is a branch of ML that uses a graph to
represent a domain problem
• Probabilistic graphical modeling combines both probability and graph
theory
• Also called Bayesian networks, belief networks, or probabilistic networks
• Consists of a graph structure: nodes and arcs
• Two categories — Bayesian networks and Markov networks
4. Graphical Models
• Each node corresponds to a random variable, X, and has a value
corresponding to the probability of the random variable, P(X).
• If there is a directed arc from node X to node Y, this indicates that X has a
direct influence on Y.
• This influence is specified by the conditional probability P(Y|X).
• The network is a directed acyclic graph (DAG); namely, there are no cycles.
• The nodes and the arcs between the nodes define the structure of the
network, and the conditional probabilities are the parameters given the
structure.
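• In general, the graph structure lets the joint probability of all the variables
factorize into local conditional probabilities, one factor per node:
P(X1, X2, ..., Xd) = Π_i P(Xi | parents(Xi))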
5. Example
• This example models that rain causes the grass to get wet
• It rains on 40 percent of the days and when it rains, there is a 90 percent
chance that the grass gets wet; maybe 10 percent of the time it does not
rain long enough for us to really consider the grass wet enough.
• The random variables in this example are binary; they are either true or
false.
• There is a 20 percent probability that the grass gets wet without it actually
raining, for example, when a sprinkler is used
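• Using these numbers, the probability that the grass is wet on a random day
follows by summing over rain:
P(W) = P(W|R) P(R) + P(W|~R) P(~R) = 0.9 x 0.4 + 0.2 x 0.6 = 0.48
and, by Bayes' rule, the probability that it rained given that the grass is wet is
P(R|W) = 0.36 / 0.48 = 0.75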
7. Conditional Independence
• In a graphical model, not all nodes are connected; actually, in general, a node is
connected to only a small number of other nodes.
• Certain subgraphs imply conditional independence statements, and these allow
us to break down a complex graph into smaller subsets in which inferences can
be done locally and whose results are later propagated over the graph
9. Canonical Cases for Conditional
Independence
Case 1: Head-to-tail Connection
• Three events may be connected serially, as seen in the figure. We see
here that X and Z are independent given Y: knowing Y tells Z
everything; knowing the state of X does not add any extra knowledge
for Z; we write P(Z|Y,X) = P(Z|Y). We say that Y blocks the path from X
to Z, or in other words, it separates them in the sense that if Y is
removed, there is no path between X and Z. In this case, the joint is
written as P(X,Y,Z) = P(X) P(Y|X) P(Z|Y)
12. Case 2: Tail-to-tail
X may be the parent of two nodes Y and Z. The joint density is
written as P(X,Y,Z) = P(X) P(Y|X) P(Z|X)
Normally Y and Z are dependent through X; given X, they
become independent: P(Y,Z|X) = P(Y|X) P(Z|X)
19. BAYESIAN NETWORKS
Directed graphs that do not contain cycles, that is, that cannot have any loops
(DAGs: directed, acyclic graphs), are called Bayesian networks when they are
paired with conditional probability tables.
Bayesian Networks help us to effectively visualize the probabilistic model for
each domain and to study the relationship between random variables in the
form of a user-friendly graph.
20. Why Bayes Network?
Bayes optimal classifier is too costly to apply
Naïve Bayes makes overly restrictive assumptions: in practice, variables are
rarely completely independent.
Bayes network represents conditional independence relations among the
features.
Representation of causal relations makes the representation and
inference efficient.
21. Bayes Network
Two different ways to calculate the conditional probability.
If A and B are dependent events, the conditional probability is
calculated as P(A|B) = P(A and B) / P(B)
If A and B are independent events, the conditional probability reduces to
P(A|B) = P(A)
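For illustration with made-up numbers: if P(A and B) = 0.12 and P(B) = 0.3,
then P(A|B) = 0.12 / 0.3 = 0.4.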
22. Bayesian Network – example 1
o The probability of a random variable depends on its parents.
o Bayesian network models capture both conditionally dependent and conditionally
independent relationships between random variables.
Create a Bayesian network that models the marks of a student in an
examination
23. Bayesian Network- example
The marks (m) will depend on:
Exam level (e): (difficult, easy)
IQ of the student (i): (high, low)
Marks determine whether the student is admitted (a) to a university
The IQ also determines the aptitude score (s) of the student
Each node has a probability table
24. Bayesian Network- example
Exam level and IQ level are parent nodes, each represented by a prior probability
Marks depend on Exam level and IQ level, represented by a conditional probability
The conditional probability table for Marks contains an entry for each combination of
Exam level and IQ level
The conditional probability table for Admission contains an entry for each value of Marks
The conditional probability table for Aptitude score contains an entry for each value of IQ level
25. Bayesian Network- example
Calculate Joint probability
p(a,m,i,e,s)=p(a|m) p(m|i,e) p(e) p(i) p(s|i)
p(a|m): CP of the student's admission given marks
p(m|i,e): CP of the student's marks given IQ and exam level
p(i): probability of the IQ level
p(e): probability of the exam level
p(s|i): CP of the aptitude score given the IQ level
26. Bayesian Network- example
Calculate the probability that, in spite of the exam level being difficult,
a student with a low IQ level and a low aptitude score manages to pass
the exam and secure admission to the university.
Joint Probability Distribution can be written as
P[a=1, m=1, i=0, e=1, s=0]
From the above conditional probability tables, the values for the
given conditions are substituted into the formula and the result is calculated as below.
P[a=1, m=1, i=0, e=1, s=0] = P(a=1 | m=1) . P(m=1 | i=0, e=1) . P(i=0) .
P(e=1) . P(s=0 | i=0)
= 0.1 * 0.1 * 0.8 * 0.3 * 0.75
= 0.0018
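A minimal Python sketch of this calculation; the CPT values below are the ones
used on this slide, and the full tables come from the network's probability tables.

p_a1_given_m1 = 0.1      # P(a=1 | m=1): admission given good marks
p_m1_given_i0_e1 = 0.1   # P(m=1 | i=0, e=1): good marks given low IQ, difficult exam
p_i0 = 0.8               # P(i=0): low IQ
p_e1 = 0.3               # P(e=1): difficult exam
p_s0_given_i0 = 0.75     # P(s=0 | i=0): low aptitude score given low IQ

joint = p_a1_given_m1 * p_m1_given_i0_e1 * p_i0 * p_e1 * p_s0_given_i0
print(joint)             # ~0.0018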
27. Bayesian Networks – Example 2
You have a new burglar alarm installed at home
It is reliable at detecting a burglary, but also sometimes responds to minor
earthquakes.
You have two neighbors, John and Mary, who promised to call you at work
when they hear the alarm.
John always calls when he hears the alarm, but sometimes confuses the
telephone ringing with the alarm and calls then too.
Mary likes loud music and sometimes misses the alarm.
Given the evidence of who has or has not called, we would like to estimate the
probability of a burglary
28. Probability of no burglary = 1 - 0.01 = 0.99
Probability of no earthquake = 1 - 0.02 = 0.98
Probability of no alarm given burglary and earthquake = 1 - 0.95 = 0.05
Probability that Mary will not call given no alarm = 1 - 0.01 = 0.99
29. 1. What is the probability that the alarm has sounded but neither a burglary nor
an earthquake has occurred, and both John and Mary call?
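Using the chain rule over the network (the values of P(J=1|A=1), P(M=1|A=1), and
P(A=1|B=0,E=0) come from the conditional probability tables, which are not
reproduced here):
P(J=1, M=1, A=1, B=0, E=0) = P(J=1|A=1) . P(M=1|A=1) . P(A=1|B=0,E=0) . P(B=0) . P(E=0)
with P(B=0) = 0.99 and P(E=0) = 0.98 taken from the previous slide.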
31. Naive Bayes’ Classifier
If the inputs are independent, we have the graph
which is called the naive Bayes’ classifier, because
it ignores possible dependencies, namely,
correlations, among the inputs and reduces a
multivariate problem to a group of univariate
problems
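A minimal Python sketch (with made-up probabilities) of the naive Bayes'
factorization P(C | x1, ..., xd) ∝ P(C) Π_i P(xi | C):

from math import prod

priors = {"spam": 0.4, "ham": 0.6}               # P(C), illustrative numbers
likelihoods = {                                  # P(word present | C), illustrative numbers
    "spam": {"offer": 0.7, "meeting": 0.1},
    "ham":  {"offer": 0.2, "meeting": 0.6},
}

def posterior(words):
    # P(C | words) is proportional to P(C) times the product of P(word | C)
    scores = {c: priors[c] * prod(likelihoods[c][w] for w in words) for c in priors}
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

print(posterior(["offer"]))   # spam: 0.28 / 0.40 = 0.7, ham: 0.3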
34. The Hidden Markov
model (HMM)
• The Hidden Markov model (HMM) is a statistical model based on a
Markov process whose states are hidden (not directly observed).
• In this model, the observed outputs are used to infer the hidden
states and parameters, which are then used for further analysis.
• It is a probabilistic graphical model that is commonly used in
statistical pattern recognition and classification.
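A minimal Python sketch (with illustrative parameters) of the forward algorithm,
which computes how likely an observed sequence is when the underlying states are hidden:

import numpy as np

pi = np.array([0.6, 0.4])          # initial distribution over the two hidden states
A  = np.array([[0.7, 0.3],         # state-transition probabilities P(state_t | state_t-1)
               [0.4, 0.6]])
B  = np.array([[0.9, 0.1],         # emission probabilities P(observation | state)
               [0.2, 0.8]])
obs = [0, 1, 0]                    # an observed symbol sequence

alpha = pi * B[:, obs[0]]          # forward variable at t = 0
for o in obs[1:]:
    alpha = (alpha @ A) * B[:, o]  # predict the next state, then weight by the emission
print(alpha.sum())                 # likelihood of the observed sequence under the model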