Association Rule Mining
Association Rule Mining (ARM)
• The classical problem of association pattern mining is defined in the context of
supermarket data containing sets of items bought by customers, which are
referred to as transactions.
• The goal is to determine associations between groups of items bought by
customers.
• The most popular model for association pattern mining uses the frequencies of
sets.
• The discovered sets of items are referred to as large itemsets, frequent itemsets, or
frequent patterns.
• Applications of ARM: supermarket data, market basket analysis, text mining, and generalization to dependency-oriented data types.
Association Rule Mining (ARM)
• Frequent itemsets can be used to generate association rules of the form X ⇒ Y ,
where X and Y are sets of items.
• Such a rule suggests that buying the items in X makes it more likely that the items in Y will also be
bought.
• Given a set of transactions, find rules that will predict the occurrence of an item
based on the occurrences of other items in the transaction.
• Example of Association Rules
{Diaper} → {Beer},
{Milk, Bread} → {Eggs, Coke},
{Beer, Bread} → {Milk}
Association Rule Mining (ARM)
• Itemset
• A collection of one or more items, e.g., {Milk, Bread, Diaper}
• k-itemset - An itemset that contains k items.
• Support count (σ)
• Frequency of occurrence of an itemset
• E.g., σ({Milk, Bread, Diaper}) = 2
• Support
• Fraction of transactions that contain an itemset
• E.g. s({Milk, Bread, Diaper}) = 2/5
• Frequent Itemset
• An itemset whose support is greater than or equal to a minsup threshold (a worked sketch follows this list)
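These definitions can be made concrete with a small sketch. The snippet below is illustrative only; the five transactions are an assumption (they are not given on this slide) chosen so that they reproduce the slide's numbers, σ({Milk, Bread, Diaper}) = 2 and s({Milk, Bread, Diaper}) = 2/5.

```python
# Illustrative sketch of support count and support; the transaction
# database below is an assumption made for this example.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support_count(itemset, transactions):
    """sigma(X): number of transactions that contain X as a subset."""
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """s(X): fraction of transactions that contain X."""
    return support_count(itemset, transactions) / len(transactions)

X = {"Milk", "Bread", "Diaper"}
print(support_count(X, transactions))      # 2
print(support(X, transactions))            # 0.4, i.e. 2/5

minsup = 0.4
print(support(X, transactions) >= minsup)  # True -> X is a frequent itemset
```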
Association Rule Mining (ARM)
• Rule Evaluation Metrics
• Support (s)
• Fraction of transactions that contain both X and Y
• The support of an itemset I is defined as the fraction of the
transactions in the database T = {T1, ..., Tn} that contain I as a subset.
• Confidence (c)
• Measures how often items in Y appear in transactions that contain X.
• The confidence conf(X ⇒ Y) of the rule X ⇒ Y is the conditional
probability of X ∪ Y occurring in a transaction, given that the
transaction contains X. Therefore, the confidence conf(X ⇒ Y) is
defined as follows (a worked example follows this slide's bullets):
conf(X ⇒ Y) = s(X ∪ Y) / s(X) = σ(X ∪ Y) / σ(X)
• Association Rule Mining Task
• Given a set of transactions T, the goal of association rule
mining is to find all rules having
• support ≥ minsup threshold
• confidence ≥ minconf threshold
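Building on the sketch above (same assumed transactions and support helper), the rule metrics and the mining task's thresholds could be checked along these lines; the rule {Milk, Diaper} ⇒ {Beer} and the threshold values are illustrative assumptions, not taken from the slides.

```python
# Support and confidence of a candidate rule X => Y, reusing the
# assumed `transactions` list and `support` helper defined earlier.
def rule_support(X, Y, transactions):
    """s(X => Y): fraction of transactions containing both X and Y."""
    return support(X | Y, transactions)

def rule_confidence(X, Y, transactions):
    """conf(X => Y) = s(X ∪ Y) / s(X)."""
    return support(X | Y, transactions) / support(X, transactions)

X, Y = {"Milk", "Diaper"}, {"Beer"}
s = rule_support(X, Y, transactions)     # sigma({Milk, Diaper, Beer}) = 2 -> 0.4
c = rule_confidence(X, Y, transactions)  # 0.4 / 0.6 ~= 0.67

minsup, minconf = 0.3, 0.6               # illustrative thresholds
print(s >= minsup and c >= minconf)      # True -> the rule would be reported
```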
Association Rule Mining Task
• Brute-force approach:
• List all possible association rules
• Compute the support and confidence for each rule
• Prune rules that fail the minsup and minconf thresholds.
• Computationally expensive (a rough count of candidate rules is sketched after this list)
• Strategies to reduce Computational Complexity:
• Reduce the number of candidates (M)
• Reduce the number of transactions (N)
• Reduce the number of comparisons (NM)
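To make the expense concrete: with d distinct items there are 2^d − 1 non-empty itemsets, and the number of possible rules X ⇒ Y (with X and Y non-empty and disjoint) is 3^d − 2^(d+1) + 1, so exhaustive enumeration grows explosively. A quick back-of-the-envelope count:

```python
# Number of candidate rules the brute-force approach would have to evaluate
# for d distinct items: 3^d - 2^(d+1) + 1.
def num_candidate_rules(d):
    return 3**d - 2**(d + 1) + 1

for d in (6, 10, 20):
    print(d, num_candidate_rules(d))
# 6  602
# 10 57002
# 20 3484687250
```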
Association Rule Mining Task
• Apriori Principle: If an itemset is frequent, then all of its subsets must also be
frequent.
• The Apriori principle holds due to the following property of the support measure:
• The support of an itemset never exceeds the support of any of its subsets.
• This is known as the anti-monotone property of support.
• Equivalently, every subset of a frequent itemset must itself be frequent; Apriori exploits this downward closure property to prune candidates.
• The Apriori algorithm generates candidates of smaller length k first and counts their
supports before generating candidates of length (k + 1).
• The resulting frequent k-itemsets are then used, via the downward closure property, to
restrict the number of (k + 1)-candidates (see the sketch below).
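The candidate-generation-and-pruning loop described above can be sketched as follows; this reuses the assumed transactions from the earlier snippets and is a minimal illustration of the join, downward-closure prune, and support-counting steps, not an optimized Apriori implementation.

```python
from itertools import combinations

def apriori(transactions, minsup):
    """Minimal level-wise Apriori sketch: returns all frequent itemsets."""
    n = len(transactions)
    # Level 1: frequent individual items.
    items = {i for t in transactions for i in t}
    Lk = {frozenset([i]) for i in items
          if sum(i in t for t in transactions) / n >= minsup}
    frequent = set(Lk)
    k = 2
    while Lk:
        # Join step: combine frequent (k-1)-itemsets into length-k candidates.
        candidates = {a | b for a in Lk for b in Lk if len(a | b) == k}
        # Prune step (downward closure): drop any candidate that has an
        # infrequent (k-1)-subset.
        candidates = {c for c in candidates
                      if all(frozenset(s) in Lk for s in combinations(c, k - 1))}
        # Count supports and keep only the frequent k-itemsets.
        Lk = {c for c in candidates
              if sum(c <= t for t in transactions) / n >= minsup}
        frequent |= Lk
        k += 1
    return frequent

# With the assumed transactions and minsup = 0.6 this yields, e.g.,
# {Beer}, {Bread, Milk}, {Milk, Diaper}, {Diaper, Beer}, ...
for itemset in sorted(apriori(transactions, minsup=0.6), key=len):
    print(set(itemset))
```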