Naive Bayes Classifier
Tushar B. Kute,
http://tusharkute.com
Joint Probability
• The joint probability is the probability of two (or
more) simultaneous events, often described in
terms of events A and B from two dependent
random variables, e.g. X and Y.
• The joint probability is often summarized as just
the outcomes, e.g. A and B.
– Joint Probability: Probability of two (or more)
simultaneous events, e.g. P(A and B) or P(A,
B).
Conditional Probability
• The conditional probability is the probability of
one event given the occurrence of another
event, often described in terms of events A and
B from two dependent random variables e.g. X
and Y.
– Conditional Probability: Probability of one (or
more) event given the occurrence of another
event, e.g. P(A given B) or P(A | B).
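A minimal Python sketch of the two ideas, using a small made-up sample of (weather, play) observations; the data and variable names are illustrative only:

# Minimal sketch: estimating joint and conditional probabilities from
# a small made-up sample of (weather, play) observations.
observations = [
    ("sunny", "yes"), ("sunny", "no"), ("rainy", "no"),
    ("sunny", "yes"), ("rainy", "yes"), ("rainy", "no"),
]

n = len(observations)
p_sunny_and_yes = sum(1 for w, p in observations if w == "sunny" and p == "yes") / n
p_sunny = sum(1 for w, p in observations if w == "sunny") / n

# Conditional probability: P(play=yes | weather=sunny) = P(sunny, yes) / P(sunny)
p_yes_given_sunny = p_sunny_and_yes / p_sunny

print(p_sunny_and_yes)    # joint P(sunny, yes) = 2/6
print(p_yes_given_sunny)  # conditional = (2/6) / (3/6) = 2/3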
Summary
• The joint probability can be calculated using the
conditional probability; for example:
– P(A, B) = P(A | B) * P(B)
• This is called the product rule. Importantly, the joint
probability is symmetrical, meaning that:
– P(A, B) = P(B, A)
• The conditional probability can be calculated using the
joint probability; for example:
– P(A | B) = P(A, B) / P(B)
• The conditional probability is not symmetrical; for example:
– P(A | B) != P(B | A)
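A short sketch of the product rule and the symmetry properties above, with assumed values for P(A), P(B) and P(A | B):

# Sketch checking symmetry of the joint and asymmetry of the conditional,
# using assumed numbers for P(A), P(B), and P(A | B).
p_a, p_b = 0.3, 0.4
p_a_given_b = 0.25

p_ab = p_a_given_b * p_b      # product rule: P(A, B) = P(A | B) * P(B) = 0.10
p_b_given_a = p_ab / p_a      # P(B | A) = P(A, B) / P(A) = 0.333...

# P(A, B) = P(B, A), but P(A | B) != P(B | A) in general
print(p_a_given_b, p_b_given_a)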
Alternate way for conditional probability
• Specifically, one conditional probability can be
calculated using the other conditional probability; for
example:
– P(A|B) = P(B|A) * P(A) / P(B)
• The reverse is also true; for example:
– P(B|A) = P(A|B) * P(B) / P(A)
• This alternate approach of calculating the conditional
probability is useful either when the joint probability is
challenging to calculate (which is most of the time), or
when the reverse conditional probability is available or
easy to calculate.
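A quick numeric check of this relationship, again with assumed values:

# Sketch: computing one conditional probability from the other (assumed values).
p_a, p_b = 0.3, 0.4
p_b_given_a = 1.0 / 3.0

# P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(p_a_given_b)              # 0.25

# Reverse: P(B|A) = P(A|B) * P(B) / P(A)
print(p_a_given_b * p_b / p_a)  # recovers 1/3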
Bayes Theorem
• Bayes Theorem: Principled way of calculating a
conditional probability without the joint probability. It
is often the case that we do not have access to the
denominator directly, e.g. P(B).
• We can calculate it in an alternative way; for example:
– P(B) = P(B|A) * P(A) + P(B|not A) * P(not A)
• This gives a formulation of Bayes Theorem that uses the alternate calculation of P(B), described below:
– P(A|B) = P(B|A) * P(A) / (P(B|A) * P(A) + P(B|not A) * P(not A))
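A minimal sketch of this formulation in Python; the function name bayes_theorem and the input numbers (a diagnostic-test style scenario) are assumptions for illustration:

# Sketch of Bayes Theorem where P(B) is computed from the two likelihoods,
# using assumed numbers for illustration.
def bayes_theorem(p_a, p_b_given_a, p_b_given_not_a):
    # Evidence: P(B) = P(B|A) * P(A) + P(B|not A) * P(not A)
    p_not_a = 1.0 - p_a
    p_b = p_b_given_a * p_a + p_b_given_not_a * p_not_a
    # Posterior: P(A|B) = P(B|A) * P(A) / P(B)
    return p_b_given_a * p_a / p_b

# Assumed inputs: P(A) = 0.01, P(B|A) = 0.95, P(B|not A) = 0.05
print(bayes_theorem(0.01, 0.95, 0.05))  # ~0.161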
Bayes Theorem
• Firstly, in general, the result P(A|B) is referred to as the
posterior probability and P(A) is referred to as the prior
probability.
– P(A|B): Posterior probability.
– P(A): Prior probability.
• Sometimes P(B|A) is referred to as the likelihood and P(B)
is referred to as the evidence.
– P(B|A): Likelihood.
– P(B): Evidence.
• This allows Bayes Theorem to be restated as:
– Posterior = Likelihood * Prior / Evidence
Naive Bayes Classifier
• Naive Bayes classifiers are a collection of
classification algorithms based on Bayes’
Theorem.
• It is not a single algorithm but a family of
algorithms that all share a common principle,
i.e. every pair of features being classified is
independent of each other given the class.
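A minimal sketch of the independence assumption at work: the class score is the prior times the product of per-feature likelihoods. All probabilities and class/feature names below are made up for illustration:

# Sketch of the "naive" independence assumption: the class-conditional
# probability of a feature vector is taken as the product of per-feature
# probabilities. All numbers here are assumed.
import math

# Assumed per-feature likelihoods P(x_i | class) and priors P(class)
likelihoods = {
    "spam":     {"offer": 0.8, "meeting": 0.1},
    "not_spam": {"offer": 0.2, "meeting": 0.7},
}
priors = {"spam": 0.4, "not_spam": 0.6}

def score(cls, features):
    # P(class | x) is proportional to P(class) * product of P(x_i | class)
    return priors[cls] * math.prod(likelihoods[cls][f] for f in features)

features = ["offer", "meeting"]
scores = {c: score(c, features) for c in priors}
total = sum(scores.values())
print({c: s / total for c, s in scores.items()})  # normalized posteriors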
Bayes Theorem – Worked Example
Example Reference: Super Data Science
(Figure-only slides: the "Defective Spanners" example, an exercise worked through Steps 1–3, Naive Bayes Steps 1–5, combining all terms, types of model, final classification, and the probability distribution.)
Types of Naive Bayes Classifier
• Multinomial Naive Bayes:
– This is mostly used for document classification
problems, i.e. whether a document belongs to the
category of sports, politics, technology, etc.
– The features/predictors used by the classifier are
the frequencies of the words present in the
document.
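A small sketch using scikit-learn's MultinomialNB with word-count features; the tiny corpus and labels are made up for illustration:

# Sketch of Multinomial Naive Bayes for document classification with scikit-learn.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

docs = ["the team won the match", "election results announced",
        "new phone released today", "the player scored a goal"]
labels = ["sports", "politics", "technology", "sports"]

vectorizer = CountVectorizer()          # features = word frequencies
X = vectorizer.fit_transform(docs)

model = MultinomialNB().fit(X, labels)
print(model.predict(vectorizer.transform(["the match results"])))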
Types of Naive Bayes Classifier
• Bernoulli Naive Bayes:
– This is similar to multinomial naive Bayes, but
the predictors are boolean variables.
– The parameters used to predict the class variable
take only the values yes or no, for example
whether a word occurs in the text or not.
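A small sketch using scikit-learn's BernoulliNB on an assumed 0/1 (word present / absent) feature matrix:

# Sketch of Bernoulli Naive Bayes: predictors are boolean (word present / absent).
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Each row: [contains "free", contains "offer", contains "meeting"] (assumed data)
X = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 0, 1],
              [0, 1, 1]])
y = ["spam", "spam", "ham", "ham"]

model = BernoulliNB().fit(X, y)
print(model.predict([[1, 1, 1]]))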
Types of Naive Bayes Classifier
• Gaussian Naive Bayes:
– When the predictors take continuous values
and are not discrete, we assume that these values
are sampled from a Gaussian distribution.
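A small sketch using scikit-learn's GaussianNB; the Iris dataset is used here only as a convenient continuous-feature example:

# Sketch of Gaussian Naive Bayes for continuous features.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GaussianNB().fit(X_train, y_train)
print(model.score(X_test, y_test))  # accuracy on held-out data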
Advantages
• When the assumption of independent predictors
holds true, a Naive Bayes classifier performs
better than other models.
• Naive Bayes requires only a small amount of
training data to estimate its parameters, so the
training period is short.
• Naive Bayes is also easy to implement.
Disadvantages
• The main limitation of Naive Bayes is the assumption of
independent predictors. Naive Bayes implicitly assumes
that all the attributes are mutually independent. In real life,
it is almost impossible to get a set of predictors that
are completely independent.
• If a categorical variable has a category in the test data set that
was not observed in the training data set, the model will
assign it a 0 (zero) probability and will be unable to make a
prediction. This is often known as the zero-frequency problem. To solve
it, we can use a smoothing technique. One of the
simplest smoothing techniques is Laplace estimation.
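A minimal sketch of Laplace (add-one) smoothing for the zero-frequency problem; the counts and vocabulary size are assumed for illustration:

# Sketch of Laplace (add-one) smoothing. Counts below are assumed;
# k is the number of distinct feature values (vocabulary size).
count_word_in_class = 0     # word never seen with this class in training
total_words_in_class = 100
vocabulary_size = 50        # k

# Unsmoothed estimate would be 0 and would zero out the whole product.
p_unsmoothed = count_word_in_class / total_words_in_class
# Laplace estimate: (count + 1) / (total + k)
p_laplace = (count_word_in_class + 1) / (total_words_in_class + vocabulary_size)
print(p_unsmoothed, p_laplace)

# In scikit-learn, MultinomialNB(alpha=1.0) applies this smoothing by default.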
tushar@tusharkute.com
Thank you
This presentation was created using LibreOffice Impress 5.1.6.2 and can be used freely as per the GNU General Public License
Web Resources
https://mitu.co.in
http://tusharkute.com
/mITuSkillologies @mitu_group
contact@mitu.co.in
/company/mitu-skillologies
MITUSkillologies