Master the Art of Analytics
A Simplistic Explainer Series For Citizen Data Scientists
Journey Towards Augmented Analytics
Frequent Pattern Mining
What Is Covered
Introduction
Basic terminologies with example
Standard input/tuning parameters & sample UI
Sample output UI
Interpretation of output
Limitations
Business use cases
INTRODUCTION
• Association rule mining is a procedure that finds frequent patterns, associations, or causal structures in data sets held in relational databases, transactional databases, and other forms of data repositories
• Given a set of transactions, association rule mining aims to find rules that predict the occurrence of a specific item based on the occurrence of the other items in the same transaction
BASIC TERMINOLOGIES
• ANTECEDENT:
• The left-hand side of a rule is called the antecedent
• For instance, in the rule milk -> bread, milk is the antecedent
• CONSEQUENT:
• The right-hand side of a rule is called the consequent
• For instance, in the rule milk -> bread, bread is the consequent
• SUPPORT:
• The support of a rule x -> y (where x and y are items, events, or item sets) is the proportion of transactions in the data set that contain both x and y
• Thus, support (x -> y) = (no. of transactions that contain both x and y) / (total no. of transactions)
• CONFIDENCE:
• The confidence of a rule x -> y is defined as:
• confidence (x -> y) = support (x -> y) / support (x)
• Thus it is the ratio of the number of transactions that include all items in both the antecedent (x) and the consequent (y) to the number of transactions that include all items in the antecedent (x)
• LIFT:
• The lift of a rule x -> y is defined as:
• lift (x -> y) = support (x -> y) / [support (x) * support (y)]
Example
TID  Milk  Bread  Butter  Beer
1    1     0      1       1
2    1     1      1       0
3    0     1      1       0
4    1     0      0       1
5    1     1      1       1
Support (Milk -> Bread):
= number of transactions containing milk & bread / total transactions
= 2/5 = 0.4
Confidence (Milk -> Bread):
= support (milk -> bread) / support (milk)
= 0.4 / [4/5]
= 0.4 / 0.8
= 0.5
Lift (Milk -> Bread):
= support (milk -> bread) / [support (milk) * support (bread)]
= 0.4 / [(4/5) * (3/5)]
= 0.4 / [0.8 * 0.6] = 0.4 / 0.48
= 0.83
• Support (milk -> bread) = 0.4 means milk & bread occur together in 40% of all transactions
• Confidence (milk -> bread) = 0.5 means that out of every 100 transactions containing milk, 50 also contain bread
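The worked example for Milk -> Bread can be checked with a few lines of Python. This is a minimal sketch, not the tool's implementation: the function names `support`, `confidence` and `lift` are illustrative choices, and each transaction is written as the set of items marked 1 in the example table.

```python
# Transactions from the example table; each set holds the items marked 1 for that TID.
transactions = [
    {"Milk", "Butter", "Beer"},           # TID 1
    {"Milk", "Bread", "Butter"},          # TID 2
    {"Bread", "Butter"},                  # TID 3
    {"Milk", "Beer"},                     # TID 4
    {"Milk", "Bread", "Butter", "Beer"},  # TID 5
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(x -> y) / support(x)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """support(x -> y) / (support(x) * support(y))."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / (
        support(antecedent, transactions) * support(consequent, transactions)
    )


x, y = {"Milk"}, {"Bread"}
print(f"support    = {support(x | y, transactions):.2f}")    # 0.40
print(f"confidence = {confidence(x, y, transactions):.2f}")  # 0.50
print(f"lift       = {lift(x, y, transactions):.2f}")        # 0.83
```

Representing each transaction as a set keeps the "contains both x and y" check to a single subset test.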
Similarly, the support, confidence and lift values of all item combinations are computed, and the rules that meet the user-defined thresholds for support and confidence are displayed in the final output.
For instance, with minimum support = 0.3 and minimum confidence = 0.3, sample rules (illustrative values) representing the frequent or best-performing item combinations would look like this:
Rule             Support  Confidence  Lift
Bread -> Butter  0.5      0.6         0.86
Milk -> Bread    0.4      0.5         0.83
Milk -> Butter   0.3      0.5         0.78
Bread -> Beer    0.3      0.4         0.68
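The thresholding step can be sketched as follows. This is a simplified illustration under stated assumptions: it only considers single-item rules (x -> y), enumerates them by brute force rather than with FP-Growth, and the function name `mine_pair_rules` is hypothetical. It runs on the five-transaction example table, so the figures it prints are computed from that data and will not match the illustrative values in the sample table.

```python
from itertools import permutations

# Five-transaction example table, one set of items per TID.
transactions = [
    {"Milk", "Butter", "Beer"},
    {"Milk", "Bread", "Butter"},
    {"Bread", "Butter"},
    {"Milk", "Beer"},
    {"Milk", "Bread", "Butter", "Beer"},
]


def support(itemset):
    """Fraction of transactions containing every item in `itemset`."""
    hits = sum(1 for t in transactions if set(itemset) <= t)
    return hits / len(transactions)


def mine_pair_rules(min_support, min_confidence):
    """Return single-item rules x -> y that pass both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for x, y in permutations(items, 2):  # ordered pairs: x -> y and y -> x
        sup_xy = support({x, y})
        if sup_xy < min_support:
            continue
        conf = sup_xy / support({x})
        if conf < min_confidence:
            continue
        lift = sup_xy / (support({x}) * support({y}))
        rules.append((f"{x} -> {y}", sup_xy, conf, lift))
    # Highest support first, ties broken by confidence.
    rules.sort(key=lambda r: (r[1], r[2]), reverse=True)
    return rules


rules = mine_pair_rules(min_support=0.3, min_confidence=0.3)
for name, sup, conf, lift in rules:
    print(f"{name:<18} {sup:.2f}  {conf:.2f}  {lift:.2f}")
```

Raising either threshold shrinks the rule list, which is why choosing the thresholds matters so much in practice.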
INTERPRETATION:
• In this case, the rule Bread -> Butter identifies the item pair most likely to be bought together, as it has the highest support as well as the highest confidence, followed by Milk -> Bread, Milk -> Butter and Bread -> Beer
• As support (Bread -> Butter) = 0.5, 50% of all transactions contain both bread and butter
• As confidence (Bread -> Butter) = 0.6, out of every 100 transactions containing bread, 60 also contain butter
• A lift larger than 1.0 implies that the antecedent and the consequent occur together more often than would be expected if the two were independent. The larger the lift, the stronger the association. A lift of 1.0 corresponds to independence, and a lift below 1.0 means the two occur together less often than expected
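The reading of lift described above can be written as a small helper. This is a sketch only: the function name `interpret_lift` and the labels it returns are illustrative wording, not terms from the slides. The lift values are those of the sample rules table.

```python
import math


def interpret_lift(lift, tolerance=1e-9):
    """Classify a lift value relative to 1.0 (the independence baseline)."""
    if math.isclose(lift, 1.0, abs_tol=tolerance):
        return "independent"
    if lift > 1.0:
        return "positive association"
    return "negative association"


# Lift values taken from the sample rules table.
sample_lifts = {
    "Bread -> Butter": 0.86,
    "Milk -> Bread": 0.83,
    "Milk -> Butter": 0.78,
    "Bread -> Beer": 0.68,
}

for rule, value in sample_lifts.items():
    print(f"{rule:<16} lift={value:.2f}  {interpret_lift(value)}")
```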
Standard Input Parameters & Sample UI
Standard Output 1: Model Summary
Rule                  Support  Confidence  Lift
Shampoo -> Soap       0.5      0.5         0.86
Cold drink -> Snacks  0.4      0.4         0.83
Fruit -> Vegetables   0.3      0.35        0.78
Milk -> Egg           0.3      0.30        0.68
INTERPRETATION:
Rules that have both high confidence and high support are called strong rules.
Even if confidence reaches high values, a rule is not useful unless its support value is high as well.
In the earlier example, the rule Bread -> Butter identifies the item pair most likely to be bought together, as it has the highest support as well as the highest confidence, followed by Milk -> Bread, Milk -> Butter and Bread -> Beer.
50% of all transactions contain both bread and butter, and out of every 100 transactions containing bread, 60 also contain butter.
Sample Output 2: Plot: Confidence By Rules
INTERPRETATION:
The confidence value of each rule is shown in the plot.
Because confidence depends on the direction of the rule (the confidence of x -> y generally differs from that of y -> x), whereas support and lift are the same in both directions, this plot is built on confidence values instead of support or lift.
The product combinations shown in dark green in the plot are those for which the consequent is most likely to be bought when the antecedent is bought.
The darker the color, the higher that likelihood.
Sample Output 2: Plot: Support & Confidence By Rule
• Ideally, both support and confidence should be taken into account to identify the best rules, because support only indicates how frequently both items are sold together, whereas confidence also reflects the direction of the rule
• Hence, alternatively, a bubble scatter plot of support against confidence can be used to focus on rules that are high on both measures
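Focusing on rules that are high on both measures amounts to keeping the upper-right region of a support/confidence plot. The sketch below shows a tabular counterpart of that idea using the sample rules from the earlier example. The `Rule` class, the `strong_rules` function and the cut-off values 0.4 and 0.5 are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    support: float
    confidence: float


# Sample rules from the earlier example table.
sample_rules = [
    Rule("Bread -> Butter", 0.5, 0.6),
    Rule("Milk -> Bread", 0.4, 0.5),
    Rule("Milk -> Butter", 0.3, 0.5),
    Rule("Bread -> Beer", 0.3, 0.4),
]


def strong_rules(rules, min_support, min_confidence):
    """Keep rules that meet both cut-offs, strongest first."""
    kept = [
        r for r in rules
        if r.support >= min_support and r.confidence >= min_confidence
    ]
    return sorted(kept, key=lambda r: (r.support, r.confidence), reverse=True)


# Hypothetical cut-offs chosen only to illustrate the filtering.
top = strong_rules(sample_rules, min_support=0.4, min_confidence=0.5)
print([r.name for r in top])  # ['Bread -> Butter', 'Milk -> Bread']
```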
LIMITATIONS:
• Processing time for this algorithm is relatively high compared to other algorithms, because the input typically consists of millions of transaction-level records
• The user needs a certain amount of expertise to find the right support and confidence settings to obtain the best association rules
GENERAL APPLICATIONS
• BASKET DATA ANALYSIS
• To analyze the association of purchased items in a single basket or single purchase
• CROSS MARKETING/SELLING
• To work with other businesses that complement your own, rather than with competitors
• For example, vehicle dealerships and manufacturers run cross-marketing campaigns with oil and gas companies, since vehicle owners also need fuel and lubricants
• CATALOG DESIGN
• The items in a business's catalog are often selected to complement each other, so that buying one item leads to buying another. These items are therefore often complements or closely related
• MEDICAL TREATMENTS
• Each patient is represented as a transaction containing the ordered set of diseases, so that diseases likely to occur simultaneously or sequentially can be predicted
USE CASE 1
Business problem:
• A retail store manager wants to conduct market basket analysis to devise a better strategy for product placement and product bundling
Business benefit:
• Based on the association rules generated, the store manager can strategically place products together or in sequence, leading to growth in sales and, in turn, store revenue
• Offers such as “Buy this and get this free” or “Buy this and get % off on this” can be designed based on the association rules generated
Use Case 1: Sample Input Dataset
Transaction ID  Product 1   Product 2          Product 3
1039153         Milk        Bread              Jam
1069697         Shampoo     Soap               Tooth paste
1068120         Cold drink  Snacks             Ear ring
563175          Hand wash   Antiseptic liquid  Hand sanitizer
562842          Fruit       Vegetables         Ketchup
562681          Cold drink  Snacks             Ear ring
562404          Shampoo     Soap               -
700159          Bread       Jam                -
696484          Milk        Fruit              Vegetables
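Before rules can be mined, a wide table like this one has to be turned into one item set per transaction. The sketch below shows one way to do that and to derive the 0/1 layout used in the earlier example. The function names `to_transactions` and `one_hot` are illustrative, and treating "-" as "no product" is an assumption about how empty cells are encoded.

```python
# Rows of the sample input dataset: (transaction ID, product 1, product 2, product 3).
rows = [
    ("1039153", "Milk", "Bread", "Jam"),
    ("1069697", "Shampoo", "Soap", "Tooth paste"),
    ("1068120", "Cold drink", "Snacks", "Ear ring"),
    ("563175", "Hand wash", "Antiseptic liquid", "Hand sanitizer"),
    ("562842", "Fruit", "Vegetables", "Ketchup"),
    ("562681", "Cold drink", "Snacks", "Ear ring"),
    ("562404", "Shampoo", "Soap", "-"),
    ("700159", "Bread", "Jam", "-"),
    ("696484", "Milk", "Fruit", "Vegetables"),
]


def to_transactions(rows, missing="-"):
    """Map each transaction ID to the set of products it contains."""
    return {
        tid: {p for p in products if p != missing}
        for tid, *products in rows
    }


def one_hot(transactions):
    """Return the sorted item list and a 0/1 row per transaction."""
    items = sorted(set().union(*transactions.values()))
    table = {
        tid: [int(item in basket) for item in items]
        for tid, basket in transactions.items()
    }
    return items, table


transactions = to_transactions(rows)
items, table = one_hot(transactions)
print(len(transactions), "transactions,", len(items), "distinct items")
print(transactions["562404"])
```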
Use Case 1: Sample Output 1: Association Rules
Output: Based on the threshold set by the user or selected automatically by the algorithm, the best product combinations are shown as association rules, along with their support, confidence and lift values:
Rule                  Support  Confidence  Lift
Shampoo -> Soap       0.5      0.6         0.86
Cold drink -> Snacks  0.4      0.5         0.83
Fruit -> Vegetables   0.3      0.5         0.78
Bread -> Jam          0.3      0.3         0.67
Use Case 1: Sample Output 2: Plot: Confidence By Rule
• Based on the association rules generated, a heat map can be plotted in which rules with high confidence or support appear in a darker shade of a color and rules with lower support or confidence appear in a lighter shade of the same color
• The end user can simply focus on the rules shown in dark color to devise better product bundling or placement and so increase cross-selling
USE CASE 2
Business problem:
• A bank marketing manager may want to analyze which products are frequently bought together or in sequence
• Each customer is represented as a transaction containing the ordered set of products, so that products likely to be purchased simultaneously or sequentially can be predicted
Business benefit:
• Based on the rules generated, banking products can be cross-sold to each existing or prospective customer to drive sales and bank revenue
• For instance, if a savings account, a personal loan and a credit card are frequently bought in sequence, then a new savings account customer can be offered a personal loan and a credit card
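Once rules are available, cross-selling can be sketched as a lookup: find the rules whose antecedent the customer already holds and suggest the consequents. Everything in the example below is hypothetical: the rules, their confidence values, the `recommend` function and the 0.4 cut-off are made up for illustration and do not come from real banking data.

```python
# Hypothetical rules: (antecedent products, consequent product, confidence).
rules = [
    (frozenset({"Savings account"}), "Personal loan", 0.45),
    (frozenset({"Savings account"}), "Credit card", 0.60),
    (frozenset({"Savings account", "Personal loan"}), "Credit card", 0.70),
    (frozenset({"Credit card"}), "Personal loan", 0.30),
]


def recommend(owned, rules, min_confidence=0.4):
    """Suggest products the customer lacks, ranked by rule confidence."""
    owned = set(owned)
    best = {}
    for antecedent, consequent, conf in rules:
        applies = antecedent <= owned and consequent not in owned
        if applies and conf >= min_confidence:
            # Keep the highest confidence seen for each suggested product.
            best[consequent] = max(conf, best.get(consequent, 0.0))
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)


print(recommend({"Savings account"}, rules))
# [('Credit card', 0.6), ('Personal loan', 0.45)]
```

The same lookup applies to any domain where customers hold a set of products, with the rules swapped for those mined from that domain's data.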
USE CASE 3
Business problem:
• A telecom marketing manager may want to analyze which value-added services are frequently bought together or in sequence
• Each customer is represented as a transaction containing the ordered set of value-added services, so that services likely to be purchased simultaneously or sequentially can be predicted
Business benefit:
• Based on the rules generated, value-added services can be cross-sold to each existing or prospective customer to drive sales and revenue of a telecom service provider
• For instance, if a group calling SIM, a 1 GB internet plan and a 500 minutes plan are generally opted for together by most customers, then whenever a new prospective customer comes in for a group calling SIM card, he or she can be targeted with the 1 GB internet plan and the 500 minutes plan
Want to Learn More?
Get in touch with us at support@Smarten.com
And do check out the Learning section on Smarten.com
June 2018

What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining to Analyze Data?
