A Semi-naive Bayes Classifier with Grouping of Cases
J. Abellán, A. Cano, A. R. Masegosa, S. Moral
Department of Computer Science and A.I.
University of Granada
Spain
Outline
1. Introduction.
2. Semi-Naive Bayes Classifier with Grouping of Cases.
 General Description
 The Joining Criteria
 The Grouping Criteria
3. Experimental Evaluation.
4. Conclusions and Future Work.
Introduction
Information from a database
Attribute variables: Calcium, Tumor, Coma, Migraine. Class variable: Cancer.
Database
Calcium  Tumor  Coma     Migraine  Cancer
normal   a1     absent   absent    absent
high     a1     present  absent    present
normal   a1     absent   absent    absent
normal   a1     absent   absent    absent
high     a0     present  present   absent
...      ...    ...      ...       ...
Introduction
Naive Bayes (Duda & Hart, 1973)
 Attribute variables {Xi | i=1,..,r}.
 Class variable C={c1,..,ck}.
 New observation z=(z1,..,zr), i.e. (X1=z1,..,Xr=zr).
 Select the state of C: arg max_ci P(ci | z).
 Assuming independence of the attributes given the class variable:
arg max_ci ( P(ci) ∏j=1..r P(zj | ci) )
Graphical structure: C is the parent of every attribute X1, X2, …, Xr.
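As a minimal sketch of the decision rule above (the probabilities are toy values for illustration, not taken from the paper):

```python
from math import prod

def nb_predict(priors, cond, z):
    """Return arg max_c P(c) * prod_j P(z_j | c).

    priors: dict class -> P(c)
    cond:   dict class -> list of dicts, cond[c][j][v] = P(X_j = v | c)
    z:      observed attribute values (z_1, .., z_r)
    """
    def score(c):
        return priors[c] * prod(cond[c][j].get(v, 0.0) for j, v in enumerate(z))
    return max(priors, key=score)

# Toy model with two attributes (Calcium, Coma) and class Cancer.
priors = {"absent": 0.7, "present": 0.3}
cond = {
    "absent":  [{"normal": 0.8, "high": 0.2}, {"absent": 0.9, "present": 0.1}],
    "present": [{"normal": 0.3, "high": 0.7}, {"absent": 0.2, "present": 0.8}],
}
print(nb_predict(priors, cond, ("high", "present")))  # -> "present"
```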
Introduction
Naive Bayes Classifiers
 Naive Bayesian Classifiers:
NB's performance is comparable with some state-of-the-art classifiers, even though its independence assumption does not usually hold in real data.
 Question:
“Can the performance be improved when the conditional independence assumption of NB is relaxed?”
Introduction
Semi-Naive Bayes Classifiers
 Semi-Naive Bayesian Classifiers (SNB):
 A looser assumption than NB.
 Independence holds among the joined variables given the class variable C.
Introduction
Semi-Naive Bayes Classifiers
 Main problems of the Semi-NB approach:
 When to join two variables? The Joining Criterion:
 Kononenko's criterion is entropy based (class entropy reduction).
 Pazzani's criterion is accuracy based (wrapper estimation).
 Very high complexity with a high number of variables.
A SNB with Grouping of Cases
Joining Method
 Three new proposals for Joining Criteria:
 BDe: Bayesian Dirichlet equivalent.
 L1O: expected log-likelihood under leave-one-out.
 LRT: log-likelihood ratio test.
A SNB with Grouping of Cases
Grouping Method
 Joining variables increases the number of parameters to estimate.
 Solution: “group the cases of the new variable that convey similar information”.
Independent: P(Xi | C) P(Xj | C)
No. of parameters: #(C) (#(Xi) + #(Xj))
Dependent (joined): P(Xi, Xj | C)
No. of parameters: #(C) #(Xi) #(Xj)
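The parameter blow-up that motivates grouping can be checked directly:

```python
def params_independent(k_c, k_i, k_j):
    # P(Xi | C) and P(Xj | C) kept separate: #(C) * (#(Xi) + #(Xj)) parameters
    return k_c * (k_i + k_j)

def params_joined(k_c, k_i, k_j):
    # P(Xi, Xj | C) after joining: #(C) * #(Xi) * #(Xj) parameters
    return k_c * k_i * k_j

# Two 4-state attributes and a 3-state class:
print(params_independent(3, 4, 4))  # 3 * (4 + 4) = 24
print(params_joined(3, 4, 4))       # 3 * 4 * 4   = 48
```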
A SNB with Grouping of Cases
Example
Joining phase: each pair of variables is evaluated using a Joining Criterion (JC); e.g. X5 and X9 are merged into a single compound node X5 × X9 under C.
Grouping phase: each pair of cases of the new compound variable is evaluated using a Grouping Criterion (GC), and cases with similar information are merged.
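A minimal sketch of the greedy joining phase; `greedy_join` and `toy_jc` are illustrative names, with a toy criterion standing in for BDe, L1O or LRT:

```python
def greedy_join(variables, jc):
    """While some pair of variables scores positively under the joining
    criterion jc, merge the best-scoring pair into one compound variable.
    Compound variables are represented as tuples of the originals."""
    variables = list(variables)
    while len(variables) > 1:
        score, i, j = max(((jc(x, y), i, j)
                           for i, x in enumerate(variables)
                           for j, y in enumerate(variables) if i < j),
                          key=lambda t: t[0])
        if score <= 0:          # no pair improves the model: stop
            break
        joined = (variables[i], variables[j])
        variables = [v for k, v in enumerate(variables) if k not in (i, j)]
        variables.append(joined)
    return variables

# Toy criterion: only X5 and X9 are worth joining.
def toy_jc(x, y):
    return 1.0 if {x, y} == {"X5", "X9"} else -1.0

print(greedy_join(["X1", "X5", "X9", "Xr"], toy_jc))
```

The grouping phase follows the same greedy pattern, but over pairs of cases of the new compound variable.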
Joining Criteria
BDe criterion
 Bayesian Dirichlet equivalent metric (BDe):
“Bayesian scores measure the quality of a model M as the posterior probability of the model given the learning data D.”
JC(BDe) = Score(M1 : D) − Score(M2 : D)
M1: X and Y independent given C. M2: X and Y merged into X × Y under C.
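A sketch of the criterion using the BDeu metric as a concrete instance (the slide does not fix the prior; the column names, the `ess` parameter, and the restriction to states observed in the data are assumptions of this sketch):

```python
from math import lgamma
from collections import Counter

def bdeu_local(data, child_cols, parent_cols, ess=1.0):
    """BDeu local score log P(D | structure) for P(child | parents), from a
    list of dict records. child_cols holds one column (a plain variable) or
    two (the joined X x Y); ess is the Dirichlet equivalent sample size."""
    child_states = sorted({tuple(rec[c] for c in child_cols) for rec in data})
    parent_states = sorted({tuple(rec[c] for c in parent_cols) for rec in data})
    r, q = len(child_states), len(parent_states)
    n_jk = Counter((tuple(rec[c] for c in parent_cols),
                    tuple(rec[c] for c in child_cols)) for rec in data)
    n_j = Counter(tuple(rec[c] for c in parent_cols) for rec in data)
    score = 0.0
    for pj in parent_states:
        score += lgamma(ess / q) - lgamma(ess / q + n_j[pj])
        for ck in child_states:
            score += lgamma(ess / (q * r) + n_jk[(pj, ck)]) - lgamma(ess / (q * r))
    return score

def jc_bde(data, x, y, c, ess=1.0):
    # Score(M1: X and Y independent given C) - Score(M2: X x Y joined under C)
    m1 = bdeu_local(data, [x], [c], ess) + bdeu_local(data, [y], [c], ess)
    m2 = bdeu_local(data, [x, y], [c], ess)
    return m1 - m2

# Toy data where X and Y are perfectly correlated: joining is favoured,
# so the difference M1 - M2 comes out negative here.
data = [{"X": v, "Y": v, "C": cl} for v in "ab" for cl in "01" for _ in range(5)]
print(jc_bde(data, "X", "Y", "C"))
```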
Joining Criteria
L1O criterion
 Expected log-likelihood under leave-one-out (L1O).
Laplace estimation vs. leave-one-out estimation.
“The estimation of the log-likelihood of the class is carried out with a leave-one-out scheme computed with a closed-form equation.”
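As an illustration of the closed-form idea in the simplest case, the class marginal under Laplace smoothing (the slide's actual equations cover the full model):

```python
from math import log
from collections import Counter

def loo_loglik(labels):
    """Closed-form leave-one-out log-likelihood of a Laplace-smoothed
    categorical estimate. Leaving out one case of class c turns the Laplace
    estimate (N_c + 1)/(N + k) into N_c/(N + k - 1), so the whole LOO
    log-likelihood reduces to a sum over classes -- no explicit folds needed."""
    n = len(labels)
    counts = Counter(labels)
    k = len(counts)
    return sum(n_c * log(n_c / (n + k - 1)) for n_c in counts.values())

print(loo_loglik(["a"] * 6 + ["b"] * 4))
```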
Joining Criteria
LRT criterion
 Log-likelihood ratio test (LRT):
“Comparison of two nested models: M1, where the variables are merged, and M2, where the variables are independent.”
Correction factor: accounts for the total number of pairwise comparisons over the n active variables.
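The test can be sketched with the G statistic for conditional independence of X and Y given C; the correction factor over the n active variables is left out of this sketch:

```python
from math import log
from collections import Counter

def g_statistic(records):
    """G = 2 * (log-likelihood of the merged model  -  log-likelihood of the
    model where X and Y are independent given C). records: list of (c, x, y)
    triples. Returns (G, degrees of freedom); G is then compared against a
    chi-square quantile, adjusted by the correction factor described above."""
    n_cxy = Counter(records)
    n_cx = Counter((c, x) for c, x, _ in records)
    n_cy = Counter((c, y) for c, _, y in records)
    n_c = Counter(c for c, _, _ in records)
    g = 2.0 * sum(n * log(n * n_c[c] / (n_cx[(c, x)] * n_cy[(c, y)]))
                  for (c, x, y), n in n_cxy.items())
    kx = len({x for _, x, _ in records})
    ky = len({y for _, _, y in records})
    df = len(n_c) * (kx - 1) * (ky - 1)
    return g, df

# X and Y identical given C: strong evidence for merging (large G).
dependent = [("c0", 0, 0), ("c0", 1, 1)] * 10
print(g_statistic(dependent))
```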
Grouping Method
Hypotheses
 Hypotheses: a model selection problem.
 The sample data D is restricted to X=xi or X=xj.
 Consider xi and xj the only possible cases of X.
 Grouping xi and xj implies that X has only one case.
 Cases are grouped when they convey similar information.
Grouping Method
Criteria
 BDe score.
 L1O score.
 LRT score.
Experimental Evaluation
Details
 SNB-G was implemented in Elvira.
 Integrated in Weka for evaluation.
 Tested on 13 databases without missing values from the UCI repository.
 10-fold cross-validation repeated 10 times.
 Comparison with a corrected paired t-test at the 5% level.
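Weka's corrected paired t-test uses the Nadeau and Bengio variance correction for repeated cross-validation; a sketch with hypothetical difference values:

```python
from math import sqrt
from statistics import mean, variance

def corrected_paired_t(diffs, test_frac):
    """Corrected resampled t-statistic (Nadeau & Bengio's variance
    correction, as used by Weka's Experimenter).
    diffs: per-fold score differences between the two classifiers, e.g. the
    100 values from 10x10-fold CV; test_frac = n_test/n_train (1/9 here)."""
    n = len(diffs)
    return mean(diffs) / sqrt((1.0 / n + test_frac) * variance(diffs))

# Hypothetical accuracy differences over 10x10-fold cross-validation:
diffs = [0.01] * 50 + [0.03] * 50
print(corrected_paired_t(diffs, test_frac=1.0 / 9.0))
```

The statistic is compared against a Student t quantile with n − 1 degrees of freedom at the 5% level.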
Evaluating Joining Criteria
Naive Bayes Comparison
 The trade-off between accuracy and log-likelihood is better for LRT.
 L1O works badly as a joining criterion.
Evaluating Joining Criteria
Pazzani's Semi-NB Comparison
 LRT works slightly better than BDe.
 Similar performance with a lower time complexity.
 LRT is the best joining criterion.
Evaluating Grouping Criteria
Naive Bayes Comparison
 LRT joining + grouping method:
 No strong differences among the criteria.
 L1O slightly better.
 L1O is the best grouping criterion.
Pazzani's Semi-NB Comparison
SNB-G = LRT joining + L1O grouping
 Similar performance.
 Dramatic reduction in building time.
State-of-the-art Classifiers
AODE, TAN and LBR comparison
 Three wins against NB.
 1 W vs. 1 D against AODE.
 No significant difference against TAN and LBR.
 One win against Pazzani's Semi-NB.
Conclusions and Future Work
 A preprocessing step for Naive Bayes:
 A method for joining variables.
 A combined method for grouping cases.
 Very efficient, with similar performance with respect to Pazzani's Semi-NB classifier.
 Future work: application to high-dimensionality data sets.
 Generalization of the methodology to other models: decision trees and the TAN model.