HYBRID TECHNIQUE FOR ASSOCIATIVE
CLASSIFICATION: A NOVEL APPROACH

Jagdeep Singh
Table of Contents
- Introduction
  - Data Mining Process
  - Classification
  - Association
- Motivation
- Literature Survey
- Problem Formulation
- Objectives
- Methodology
- Facilities Required
- References
Data Mining
Data mining is the computational process of finding patterns
in large data sets, using methods at the intersection
of machine learning, artificial intelligence, statistics,
and database systems. The main goal of the data mining
process is to extract information from the data and
convert it into an understandable structure
for further use.
Data Mining Process

Figure 1: The Data Mining Process [10]
Classification
Classification is the problem of identifying to which of
a set of categories a new observation belongs, on the
basis of a training set of data containing observations
(or instances) whose category membership is known.
Association
Association rule learning is a method for discovering interesting
relations between variables in large databases. It is
intended to identify strong rules in
databases using different measures of interestingness.

For example, the rule:
{onions, potatoes} => {burger}.
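
The slides do not name the measures of interestingness. The two most commonly used ones are support and confidence; the definitions below are a sketch in assumed notation, where D is the set of transactions and X, Y are itemsets (none of these symbols appear in the original slides).

```latex
\[
\operatorname{supp}(X) = \frac{\lvert \{\, t \in D : X \subseteq t \,\} \rvert}{\lvert D \rvert},
\qquad
\operatorname{conf}(X \Rightarrow Y) = \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)}
\]
```

For the rule above, X = {onions, potatoes} and Y = {burger}: support measures how often all three items occur together, and confidence measures how often a burger appears in transactions that already contain onions and potatoes.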
Example: The Weather Problem

ID  outlook   temperature  humidity  windy  play
1   sunny     hot          high      false  no
2   sunny     hot          high      true   no
3   overcast  hot          high      false  yes
4   rainy     mild         high      false  yes
5   rainy     cool         normal    false  yes
6   rainy     cool         normal    true   no
7   overcast  cool         normal    true   yes
8   sunny     mild         high      false  no
9   sunny     cool         normal    false  yes
10  rainy     mild         normal    false  yes
11  sunny     mild         normal    true   yes
12  overcast  mild         high      true   yes
13  overcast  hot          normal    false  yes
14  rainy     mild         high      true   no
Association Rules for the Weather Problem
 1. humidity=normal windy=FALSE (4) ==> play=yes (4)
 2. temperature=cool (4) ==> humidity=normal (4)
 3. outlook=overcast (4) ==> play=yes (4)
 4. temperature=cool play=yes (3) ==> humidity=normal (3)
 5. outlook=rainy windy=FALSE (3) ==> play=yes (3)
 6. outlook=rainy play=yes (3) ==> windy=FALSE (3)
 7. outlook=sunny humidity=high (3) ==> play=no (3)
 8. outlook=sunny play=no (3) ==> humidity=high (3)
 9. temperature=cool windy=FALSE (2) ==> humidity=normal play=yes (2)
10. temperature=cool humidity=normal windy=FALSE (2) ==> play=yes (2)
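
The counts in parentheses can be checked directly against the weather table. The following is a minimal sketch, not part of the original slides: the helper names `count` and `rule_stats` are hypothetical, and the number before `==>` is read as the count of rows matching the antecedent while the number after it is the count of rows matching the whole rule.

```python
# Weather data from the table above: outlook, temperature, humidity, windy, play
ROWS = """
sunny hot high false no
sunny hot high true no
overcast hot high false yes
rainy mild high false yes
rainy cool normal false yes
rainy cool normal true no
overcast cool normal true yes
sunny mild high false no
sunny cool normal false yes
rainy mild normal false yes
sunny mild normal true yes
overcast mild high true yes
overcast hot normal false yes
rainy mild high true no
""".strip().splitlines()
ATTRS = ("outlook", "temperature", "humidity", "windy", "play")
DATA = [dict(zip(ATTRS, line.split())) for line in ROWS]


def count(items):
    """Number of rows that contain every attribute=value pair in `items`."""
    return sum(all(row[a] == v for a, v in items.items()) for row in DATA)


def rule_stats(antecedent, consequent):
    """Return (antecedent count, rule count, confidence) for antecedent ==> consequent."""
    n_ante = count(antecedent)
    n_rule = count({**antecedent, **consequent})
    return n_ante, n_rule, (n_rule / n_ante if n_ante else 0.0)


# Rule 1: humidity=normal windy=FALSE ==> play=yes
print(rule_stats({"humidity": "normal", "windy": "false"}, {"play": "yes"}))
# Rule 7: outlook=sunny humidity=high ==> play=no
print(rule_stats({"outlook": "sunny", "humidity": "high"}, {"play": "no"}))
```

Both rules match as many rows as their antecedents do, so their confidence is 1.0.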
Result: New Prediction?

Outlook  Temp.  Humidity  Wind  Play
Sunny    Cool   High      True  ?
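
One simple way to answer this with the rules listed earlier is to keep only the rules whose consequent is the class attribute `play` and return the class of the first rule that matches the new instance. The sketch below assumes that scheme; ordering rules by their count and the `default` fallback are illustrative choices, not something the slides specify.

```python
# Rules from the weather-problem list whose consequent is only `play`
# (rules 1, 3, 5, 7 and 10), written as (antecedent, predicted class, count).
RULES = [
    ({"humidity": "normal", "windy": "false"}, "yes", 4),
    ({"outlook": "overcast"}, "yes", 4),
    ({"outlook": "rainy", "windy": "false"}, "yes", 3),
    ({"outlook": "sunny", "humidity": "high"}, "no", 3),
    ({"temperature": "cool", "humidity": "normal", "windy": "false"}, "yes", 2),
]


def predict(instance, rules, default=None):
    """Return the class of the first rule whose antecedent matches `instance`.

    Rules are tried in descending order of count; `default` is returned
    when no rule fires (a hypothetical fallback, not from the slides).
    """
    for antecedent, label, _ in sorted(rules, key=lambda r: -r[2]):
        if all(instance.get(a) == v for a, v in antecedent.items()):
            return label
    return default


new_day = {"outlook": "sunny", "temperature": "cool", "humidity": "high", "windy": "true"}
print(predict(new_day, RULES))
```

Under this scheme the only rule that fires for the new instance is rule 7 (outlook=sunny humidity=high ==> play=no), so the sketch predicts `no`. A different rule ranking or a different classifier could decide otherwise.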
Literature Survey
- Liao et al. [8] review data mining techniques and their applications
  through a survey of the literature from 2000 to 2011. The paper surveys
  three areas of data mining research: knowledge types, analysis types, and
  architecture types. Its discussion considers future directions: social science and
  engineering methodologies that could implement data mining techniques, and the
  development of applications in problem-oriented domains.

- One of the earliest association rule mining algorithms is the Apriori algorithm [3], developed
  by Agrawal and Srikant. In each pass, Apriori generates the candidate itemsets
  using only the itemsets found to have large support in the previous pass, without
  considering the transactions in the database.
- Kwon and Sim [9] evaluated which data set features most affect the
  performance of classification algorithms. It is difficult to determine
  which algorithm is most effective for which data set. The authors'
  research experimentally examines how data set characteristics affect
  algorithm performance in terms of elapsed time and accuracy.

- Liu et al. [2] presented associative classification, which integrates
  classification rule mining and association rule mining. The integration is done by
  focusing on mining a special subset of association rules whose consequents
  are restricted to the class labels, called Class Association
  Rules (CARs).
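
To make the idea of CARs concrete, the sketch below enumerates rules on the weather data whose consequent is restricted to the class attribute `play`. It is a brute-force illustration only, not the algorithm of [2]; the function name `mine_cars` and the thresholds (`min_count=3`, `min_conf=1.0`, antecedents of at most two attributes) are assumptions chosen for this example.

```python
from itertools import combinations

# Weather data from the example table: outlook, temperature, humidity, windy, play
ROWS = """
sunny hot high false no
sunny hot high true no
overcast hot high false yes
rainy mild high false yes
rainy cool normal false yes
rainy cool normal true no
overcast cool normal true yes
sunny mild high false no
sunny cool normal false yes
rainy mild normal false yes
sunny mild normal true yes
overcast mild high true yes
overcast hot normal false yes
rainy mild high true no
""".strip().splitlines()
ATTRS = ("outlook", "temperature", "humidity", "windy", "play")
DATA = [dict(zip(ATTRS, line.split())) for line in ROWS]
CLASS = "play"


def mine_cars(data, class_attr, min_count=3, min_conf=1.0, max_len=2):
    """Enumerate rules `antecedent ==> class_attr=label` by brute force."""
    features = [a for a in data[0] if a != class_attr]
    cars = []
    for k in range(1, max_len + 1):
        for attrs in combinations(features, k):
            # Group the class labels of all rows by their values on `attrs`.
            groups = {}
            for row in data:
                key = tuple(row[a] for a in attrs)
                groups.setdefault(key, []).append(row[class_attr])
            for values, labels in groups.items():
                for label in set(labels):
                    hits = labels.count(label)
                    if hits >= min_count and hits / len(labels) >= min_conf:
                        cars.append((dict(zip(attrs, values)), label, hits))
    return cars


for antecedent, label, hits in mine_cars(DATA, CLASS):
    print(antecedent, "==>", f"{CLASS}={label}", f"({hits})")
```

With these thresholds the sketch yields four rules, which correspond to rules 1, 3, 5 and 7 of the weather-problem list: exactly the rules there with `play` as the only consequent and a count of at least 3.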
Problem Formulation
- Associative classification suffers from inefficiency because
  association rule mining often generates a very large number of rules.
  Many of these rules are insignificant, while
  good rules with relatively low support are not produced.
  Selecting high-quality rules from among them takes effort.

- Most associative classification algorithms adopt the exhaustive search
  method of the well-known Apriori algorithm to discover rules, which
  requires multiple passes over the database. Furthermore, they find frequent
  itemsets in one phase and generate the rules in a separate phase, consuming
  more resources such as storage and processing time.
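
The cost of the level-wise search is easier to see in code. The following is a minimal sketch of Apriori-style frequent itemset mining on the weather data, assuming a minimum count of 3 (a value chosen here for illustration); it covers only the first phase, and rule generation would be a separate step.

```python
from itertools import combinations

# Weather data from the example table: outlook, temperature, humidity, windy, play
ROWS = """
sunny hot high false no
sunny hot high true no
overcast hot high false yes
rainy mild high false yes
rainy cool normal false yes
rainy cool normal true no
overcast cool normal true yes
sunny mild high false no
sunny cool normal false yes
rainy mild normal false yes
sunny mild normal true yes
overcast mild high true yes
overcast hot normal false yes
rainy mild high true no
""".strip().splitlines()
ATTRS = ("outlook", "temperature", "humidity", "windy", "play")
# Each row becomes a transaction: a set of "attribute=value" items.
TRANSACTIONS = [
    frozenset(f"{a}={v}" for a, v in zip(ATTRS, line.split())) for line in ROWS
]


def apriori(transactions, min_count):
    """Level-wise frequent itemset search; returns {itemset: count}."""
    items = {item for t in transactions for item in t}
    candidates = {frozenset([i]) for i in items}
    frequent = {}
    k = 1
    while candidates:
        # One pass over the database per level to count the candidates.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        # Join step: build (k+1)-item candidates from frequent k-itemsets only.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


frequent = apriori(TRANSACTIONS, min_count=3)
print(frequent[frozenset({"outlook=overcast", "play=yes"})])
```

Each iteration of the `while` loop is one full pass over the transactions, which is the repeated database scanning described above.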
Objectives
- Propose a framework that can generate
  Classification Association Rules (CARs) efficiently.
- Evaluate the proposed approach.
- Compare the proposed algorithm with
  other state-of-the-art techniques.
Methodology
- Review the classification and association rule
  generation methods.
- Understand the existing associative
  classification models.
- Implement a classification system based on
  association rules and compare the performance of
  several model construction methods or algorithms in
  the Weka environment.
- Compare the proposed approach with existing
  methods.
Facilities Required
- Data mining tools such as Weka are used for the
  implementation of the proposed project
  work.
References
[1] Tom M. Mitchell, “Machine Learning”, 1st ed. U.K.: McGraw-Hill, 1997.
[2] Bing Liu, Wynne Hsu, and Yiming Ma, “Integrating classification and association rule
    mining”, In Knowledge Discovery and Data Mining, New York, vol. 2, pp. 80–86,
    1998.
[3] R. Agrawal and R. Srikant, “Fast algorithms for mining association rules”, In VLDB,
    pp. 487–499, Santiago, Chile, September 12-15, 1994.
[4] Wenmin Li, Jiawei Han, and Jian Pei, “CMAR: Accurate and efficient classification
    based on multiple class-association rules”, In ICDM'01 Proc. of the 2001 IEEE
    International Conference on Data Mining, pp. 369–376, IEEE Computer Society,
    Washington, DC, USA, 2001.
[5] X. Yin and J. Han, “CPAR: Classification based on Predictive Association Rules”, Proc.
    SIAM Int. Conf. on Data Mining, pp. 331–335, San Francisco, CA, May 2003.
[6] Fadi Abdeljaber Thabtah, “A review of associative classification mining”, Knowledge
    Engineering Review, vol. 22, no. 1, pp. 37–65, 2007.
[7] T. V. Mahendra, N. Deepika, and N. Keasava Rao, “Data Mining for High Performance
    Data Cloud using Association Rule Mining”, International Journal of Advanced
    Research in Computer Science and Software Engineering, vol. 2, Issue 1, 2012.
[8] S. H. Liao, P. H. Chu, and P. Y. Hsiao, “Data mining techniques and applications – A
    decade review from 2000 to 2011”, Elsevier Expert Systems with Applications, vol.
    39, pp. 11303–11311, 2012.
[9] Ohbyung Kwon and Jae Mun Sim, “Effects of data set features on the performances
    of classification algorithms”, Expert Systems with Applications, vol. 40, pp. 1847–1857,
    2013.
[10] http://www.infovis-wiki.net/index.php?title=File:Fayyad96kdd-process.png