CONFIDENTIAL: The information in this document belongs to Boston Institute of Analytics LLC. Any unauthorized sharing of this
material is prohibited and subject to legal action under breach of IP and confidentiality clauses.
E-commerce Customer Segmentation and Prediction
Agenda
• Understanding the problem statement and the data
• Data Cleaning and Preprocessing
• Exploratory Data Analysis (EDA)
• Feature Engineering
• Customer Segmentation
• Predictive Modeling and Training
• Results and Performance Evaluation
• Conclusion
Understanding the problem statement
• E-commerce Customer Segmentation and Prediction is a data-driven
approach to understanding and predicting customer behavior in online
retail. It involves dividing customers into distinct groups based on shared
characteristics, such as demographics, purchase history, and browsing
behavior. This segmentation allows businesses to tailor marketing
campaigns, personalize product recommendations, and optimize customer
experiences.
• By leveraging predictive modeling techniques, businesses can forecast
future customer behavior, such as purchase likelihood, churn probability,
or product preferences. This enables them to proactively identify at-risk
customers, implement targeted retention strategies, and drive sales growth.
• Customer Segmentation: Dividing customers into distinct groups based on shared
characteristics.
• Predictive Modeling: Using data to forecast future customer behavior.
• Benefits:
• Personalized Marketing
• Improved Customer Retention
• Enhanced Customer Experience
• Optimized Inventory Management
• Increased Sales and Revenue
Data Cleaning and Preprocessing
• Handle missing values.
• Outlier detection and treatment.
Z-scores are used to detect and remove outliers.
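The Z-score step can be sketched as follows. This is a minimal illustration, not the author's code: the helper name `remove_outliers_zscore`, the cutoff of 3 standard deviations, and the toy `Monetary` column are all assumptions.

```python
import pandas as pd


def remove_outliers_zscore(df: pd.DataFrame, columns, threshold: float = 3.0) -> pd.DataFrame:
    """Drop rows where any listed column lies more than `threshold` std devs from its mean."""
    values = df[columns].astype(float)
    # Per-column z-score: (x - mean) / std, using the population std (ddof=0).
    z = (values - values.mean()) / values.std(ddof=0)
    # A constant column gives 0/0 = NaN; treat those entries as non-outliers.
    z = z.fillna(0.0)
    keep = (z.abs() <= threshold).all(axis=1)
    return df[keep]


# Hypothetical toy data: 19 ordinary spenders and one extreme value.
toy = pd.DataFrame({"Monetary": [100.0] * 19 + [10_000.0]})
cleaned = remove_outliers_zscore(toy, ["Monetary"])
print(len(toy), "->", len(cleaned))
```

A cutoff of 3 is a common convention, but the right value depends on how heavy-tailed the spending data is.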
• Data normalization and standardization.
StandardScaler is used to standardize the RFM values.
• Feature engineering (creating new features from existing ones).
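As a small sketch of the scaling step, assuming scikit-learn's `StandardScaler` is the scaler meant here: each RFM column is shifted to zero mean and rescaled to unit variance, so that Monetary (often in the hundreds or thousands) does not dominate the distance calculations used later in clustering. The three rows below are hypothetical values.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical RFM rows: [Recency (days), Frequency (orders), Monetary (spend)].
rfm_values = np.array([
    [29.0, 5.0, 250.0],
    [1.0, 6.0, 900.0],
    [10.0, 4.0, 1500.0],
])

scaler = StandardScaler()
rfm_scaled = scaler.fit_transform(rfm_values)

# After scaling, every column has mean 0 and standard deviation 1.
print(rfm_scaled.mean(axis=0).round(6), rfm_scaled.std(axis=0).round(6))
```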
Exploratory Data Analysis (EDA)
Feature Engineering
Extracting RFM Values
Recency (R)
• Definition: Measures how recently a customer made a purchase.
• Inference: Customers with low recency values (i.e., recent
purchases) are more engaged and likely to respond to marketing
efforts. High recency values indicate customers who haven't
purchased in a while and may need re-engagement strategies.
Frequency (F)
• Definition: Measures how often a customer makes a purchase.
• Inference: High frequency indicates loyal customers who make
repeated purchases. These customers are valuable and should be
nurtured. Low frequency suggests customers who may need
incentives to increase their purchase frequency.
Monetary (M)
• Definition: Measures the total amount of money a customer has
spent.
• Inference: High monetary values indicate high-value customers
who contribute significantly to revenue. These customers are
crucial for profitability and should receive special attention. Low
monetary values suggest customers who might need targeted
promotions to increase their spending.
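The three definitions above can be turned into a per-customer table with pandas. The sketch below assumes a transaction log with hypothetical column names (`CustomerID`, `InvoiceNo`, `InvoiceDate`, `Amount`) and measures Recency in days from a snapshot date set one day after the last transaction; the original dataset's schema is not shown in the slides.

```python
import pandas as pd

# Hypothetical transaction log; the column names and values are assumptions.
transactions = pd.DataFrame({
    "CustomerID": [1, 1, 2, 2, 2, 3],
    "InvoiceNo": ["A1", "A2", "B1", "B2", "B3", "C1"],
    "InvoiceDate": pd.to_datetime([
        "2024-01-05", "2024-02-01", "2024-02-10",
        "2024-02-20", "2024-02-29", "2024-02-20",
    ]),
    "Amount": [50.0, 70.0, 20.0, 30.0, 40.0, 500.0],
})

# Reference date: one day after the most recent transaction in the data.
snapshot = transactions["InvoiceDate"].max() + pd.Timedelta(days=1)

grouped = transactions.groupby("CustomerID")
rfm = pd.DataFrame({
    # Recency: days since the customer's latest purchase (lower = more recent).
    "Recency": (snapshot - grouped["InvoiceDate"].max()).dt.days,
    # Frequency: number of distinct invoices.
    "Frequency": grouped["InvoiceNo"].nunique(),
    # Monetary: total amount spent.
    "Monetary": grouped["Amount"].sum(),
})
print(rfm)
```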
Customer Lifetime Value (CLV)
• Definition: Estimates the total revenue a business can expect from
a customer over the entire relationship.
• Inference: High CLV indicates customers who are expected to
generate substantial revenue over time. These customers are the
most valuable and should be prioritized for retention efforts. Low
CLV suggests customers who may not be as profitable, and
businesses might need to evaluate the cost of retaining them
versus acquiring new customers.
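The slides do not state which CLV formula was used. One common simplified estimate, shown here purely as an assumption, multiplies average order value by purchase frequency and expected customer lifespan. The function name and inputs are hypothetical.

```python
def simple_clv(avg_order_value: float, orders_per_year: float, expected_years: float) -> float:
    """Simplified CLV estimate: revenue per order x orders per year x years retained."""
    return avg_order_value * orders_per_year * expected_years


# Hypothetical customer: 50 per order, 4 orders a year, retained for 3 years.
print(simple_clv(50.0, 4.0, 3.0))
```

This ignores margins, discounting, and churn probability, so it is best read as a rough ranking signal rather than a forecast.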
Pair plots of customer ID with respect to RFM
What do we infer from RFM?
• Customer 1: Hasn't purchased recently (Recency = 29), but has a moderate
frequency and monetary value. They might need re-engagement strategies.
• Customer 2: Very recent purchase (Recency = 1), moderate frequency, and
high monetary value. They are highly engaged and valuable.
• Customer 3: Recent purchase (Recency = 10), moderate frequency, and
very high monetary value. They are also highly valuable and should be
prioritized for retention.
Customer Segmentation
• Several clustering methods are
available, such as K-means,
hierarchical clustering, and DBSCAN.
• Of these, I chose K-means
for segmenting the customers
because it is simple, efficient, and
scales well to large datasets.
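A minimal sketch of K-means on standardized RFM features, assuming scikit-learn's `KMeans`. The data is synthetic (four well-separated groups of 50 hypothetical customers), and the choice of 4 clusters and the `random_state` values are assumptions for reproducibility.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical group centers as [Recency, Frequency, Monetary].
centers = np.array([
    [5.0, 20.0, 2000.0],
    [10.0, 3.0, 150.0],
    [15.0, 15.0, 200.0],
    [300.0, 2.0, 50.0],
])
# 50 synthetic customers scattered around each center.
rfm = np.vstack([
    c + rng.normal(scale=[1.0, 0.5, 10.0], size=(50, 3)) for c in centers
])

# Standardize first so no single feature dominates the Euclidean distance.
rfm_scaled = StandardScaler().fit_transform(rfm)

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(rfm_scaled)

# Cluster sizes; cluster ids are arbitrary and carry no ordering.
print(np.bincount(labels))
```

Because cluster ids are arbitrary, segment names should be assigned only after inspecting each cluster's average Recency, Frequency, and Monetary values.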
• To find the optimal number of
clusters (the K value), I used
the elbow method.
• A function was written to return
the K value.
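The author's function is not shown, so the following is only one possible way to automate the elbow reading. It fits K-means for k = 1..k_max, rescales the inertia curve to the unit square, and returns the k that lies farthest below the straight line joining the first and last points. The function name `elbow_k` and the synthetic three-blob data are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans


def elbow_k(X, k_max: int = 8, random_state: int = 0) -> int:
    """Return the k in 1..k_max where the inertia curve bends most sharply."""
    ks = np.arange(1, k_max + 1)
    inertias = np.array([
        KMeans(n_clusters=int(k), n_init=10, random_state=random_state).fit(X).inertia_
        for k in ks
    ])
    # Rescale both axes to [0, 1]; assuming inertia falls as k grows,
    # the chord then runs from (0, 1) down to (1, 0).
    x = (ks - ks[0]) / (ks[-1] - ks[0])
    y = (inertias - inertias.min()) / (inertias.max() - inertias.min())
    # Vertical gap between the chord (1 - x) and the curve; the elbow maximizes it.
    gap = (1.0 - x) - y
    return int(ks[np.argmax(gap)])


# Synthetic check: three tight, well-separated 2-D blobs.
rng = np.random.default_rng(0)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 8.66]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in blob_centers])
best_k = elbow_k(X)
print(best_k)
```

On real RFM data the bend is often less pronounced, so it is worth plotting inertia against k and confirming the returned value by eye.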
Model Prediction
• Several models can be used for prediction;
here I used Logistic Regression,
Decision Tree, Random Forest, and Support
Vector Machine (SVM).
• The accuracy of each model:
• Logistic Regression = 0.998476 (99.85%)
• Decision Tree = 0.997713 (99.77%)
• Random Forest = 0.994665 (99.47%)
• Support Vector Machine (SVM) = 0.813262 (81.33%)
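The comparison above can be reproduced in outline as follows. This is a sketch on synthetic data, not the author's pipeline: the blob data, the 75/25 split, and all hyperparameters are assumptions, and the accuracies it prints will differ from the figures on the slide.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for the RFM table: 400 synthetic customers, 3 features.
X, _ = make_blobs(n_samples=400, centers=4, n_features=3, random_state=0)

# The K-means cluster labels serve as the classification target.
y = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(
    StandardScaler().fit_transform(X)
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Logistic regression and SVC are sensitive to feature scale, so they get a scaler.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC()),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in accuracies.items():
    print(f"{name}: {acc:.4f}")
```

Since the target labels are themselves computed from the same RFM features by K-means, very high accuracy is expected for any model that can represent the cluster boundaries.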
Result
K-Means clustering produced 4
clusters:
• [0] - High-value customers
• [1] - Recent but low-frequency customers
• [2] - Frequent but low-monetary customers
• [3] - Inactive customers
Here the model predicts the cluster from the values of
Recency, Frequency, and Monetary.
Input from the user:
Recency = 310
Frequency = 17
Monetary = 394
The output cluster is mapped to [1], which is the
"Recent but low-frequency customers" segment.
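A sketch of how such a prediction could be wired up, assuming a scikit-learn pipeline so that the user's raw input passes through the same scaling as the training data. The training table is randomly generated and hypothetical, so the cluster id printed here will not necessarily match the result on the slide.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Hypothetical training table of [Recency, Frequency, Monetary]; not the author's data.
rfm_train = np.column_stack([
    rng.integers(1, 365, size=300),
    rng.integers(1, 40, size=300),
    rng.uniform(10, 2000, size=300),
]).astype(float)

# Segment labels come from K-means on the standardized features.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(
    StandardScaler().fit_transform(rfm_train)
)

# Bundling the scaler with the classifier keeps preprocessing consistent at predict time.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(rfm_train, labels)

# The user input from the slide, passed in raw (unscaled) units.
new_customer = np.array([[310.0, 17.0, 394.0]])
predicted_cluster = int(model.predict(new_customer)[0])
print(predicted_cluster)
```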
Insights for each cluster to improve the business:
[0] - High-value customers - Offer loyalty programs and exclusive discounts.
[1] - Recent but low-frequency customers - Send personalized recommendations.
[2] - Frequent but low-monetary customers - Upsell and cross-sell products.
[3] - Inactive customers - Re-engage with special offers and reminders.
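The mapping above can be kept as a small lookup table so that a predicted cluster id translates directly into a segment name and a recommended action. The names `SEGMENTS` and `recommend` are hypothetical; the labels and actions are taken from the list above.

```python
# Cluster id -> (segment name, recommended action), as listed on the slide.
SEGMENTS = {
    0: ("High-value customers", "Offer loyalty programs and exclusive discounts."),
    1: ("Recent but low-frequency customers", "Send personalized recommendations."),
    2: ("Frequent but low-monetary customers", "Upsell and cross-sell products."),
    3: ("Inactive customers", "Re-engage with special offers and reminders."),
}


def recommend(cluster_id: int) -> str:
    """Return a one-line recommendation for a predicted cluster id."""
    name, action = SEGMENTS[cluster_id]
    return f"{name}: {action}"


print(recommend(1))
```

If the clustering is re-run, the integer ids may be assigned differently, so the table should be rechecked against the new cluster profiles.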
Metrics
Confusion Matrix and Classification Report
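Both metrics are available in scikit-learn. The sketch below uses a tiny hand-made set of true and predicted cluster ids (hypothetical values, with one customer from cluster 1 misclassified as cluster 2) to show how the two outputs are produced and read.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Hypothetical true and predicted cluster ids for eight customers.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 2, 2, 2, 3, 3]

# Rows are true clusters, columns are predicted clusters.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2, 3])
print(cm)

# Per-cluster precision, recall, and F1, plus overall accuracy.
report = classification_report(y_true, y_pred, labels=[0, 1, 2, 3], zero_division=0)
print(report)
```

Off-diagonal entries in the matrix show which segments get confused with each other, which overall accuracy alone does not reveal.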
Visualizing the decision tree
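The tree figure itself is not reproduced here. As a text-only stand-in, a fitted scikit-learn tree can be printed with `export_text` (`sklearn.tree.plot_tree` draws the graphical version when matplotlib is available). The four training rows and the two feature names are hypothetical.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical [Recency, Frequency] rows for two clearly different segments.
X = [[5, 20], [8, 18], [300, 1], [280, 2]]
y = [0, 0, 3, 3]  # 0 = high-value, 3 = inactive

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Render the learned split rules as indented text.
rules = export_text(tree, feature_names=["Recency", "Frequency"])
print(rules)
```

Reading the rules from the root down shows which RFM thresholds the model uses to assign a customer to a segment.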
Conclusion
1. Customer Value Segmentation
• We can identify high-value customers who contribute significantly to
revenue and should be targeted with loyalty programs and exclusive offers.
• Medium-value customers can be encouraged to increase their spending
through personalized recommendations and upselling strategies.
• Low-value customers may need re-engagement campaigns to boost their
purchasing frequency.
• Additionally, understanding purchasing patterns helps in optimizing
inventory management and planning targeted marketing campaigns,
ultimately enhancing customer satisfaction and retention.
2. Behavioral Patterns: Understanding the recency, frequency, and monetary
value of purchases helps in identifying active, loyal, and high-spending
customers. This allows for more effective and targeted marketing efforts.
3. Predictive Insights: The predictive model can forecast customer segments
for new or existing customers based on their purchasing behavior,
enabling proactive marketing and retention strategies.
4. Inventory Optimization: Insights into top-selling products and
purchasing patterns over time help in optimizing inventory management,
ensuring that popular products are always in stock and reducing overstock
of less popular items.
5. Marketing Efficiency: Tailoring marketing campaigns based on customer
segments improves the efficiency of marketing spend, ensuring that
resources are allocated to the most promising customer groups.
Questions?
Thank You!
