Names – Sanket V. Butoliya, Dhavalchandra Panchal
Major - Business Analytics & Information Systems
Group Project – Data Mining
Telecom Customer Churn Prediction
Introduction:
Sections: Introduction, Data Analysis, Classification, Cost Benefit Analysis, What drives Churn, Implementation, Conclusion
Problem Statement :
A cellular service provider wants to analyze customer data to predict whether a customer
is going to churn, and to identify the critical factors that cause customers to churn so that
preventive action can be taken based on these factors. The project also performs a cost benefit
analysis to reduce the cost of critical errors.
Flow of Presentation :
• Introduction
• Problem Statement
• Dataset Overview
• Data Analysis and Baseline deduction
• Predictive models and Misclassification rate
• Cost sensitive Analysis
• Cost sensitive Classification
• Reduced Cost calculation
• What Drives Churn (Critical variables analysis)
• Visualization in Excel and Tableau
• Implementation in Python
• Conclusion
Predicted attribute = Customer Churned?
Number of Instances = 7043
Number of Attributes = 21
Missing Attribute Values = None

No. Attributes Information
1 CustomerID
2 Gender
3 SeniorCitizen
4 Partner
5 Dependents
6 tenure
7 PhoneService
8 MultipleLines
9 InternetService
10 OnlineSecurity
11 OnlineBackup
12 DeviceProtection
13 TechSupport
14 StreamingTV
15 StreamingMovies
16 Contract
17 PaperlessBilling
18 PaymentMethod
19 MonthlyCharges
20 TotalCharges
21 Churn
Baseline Accuracy = 73.4630%
Yes Count = 1869
No Count = 5174
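The baseline figure appears to be the accuracy of always predicting the majority class (NO). A minimal sketch of that arithmetic, using only the two class counts from the slide; the function name `baseline_accuracy` is illustrative and not taken from the deck:

```python
# Class counts taken from the slide.
YES_COUNT = 1869  # customers who churned
NO_COUNT = 5174   # customers who did not churn


def baseline_accuracy(yes: int, no: int) -> float:
    """Accuracy (in percent) of always predicting the majority class."""
    return 100.0 * max(yes, no) / (yes + no)


baseline = baseline_accuracy(YES_COUNT, NO_COUNT)
print(f"Baseline accuracy: {baseline:.4f}%")
```

Any model in the later slides is worth using only if it beats this number or, as the cost analysis argues, lowers the cost of the errors it makes.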
| Model Description | Role | False Negative (Yes=>No, c) | True Negative (No, a) | False Positive (No=>Yes, b) | True Positive (Yes, d) | Accuracy | False Positive Rate (%) | False Negative Rate (%) | Misclassification Rate |
|---|---|---|---|---|---|---|---|---|---|
| Decision Tree | T | 503 | 2377 | 210 | 431 | 79.75% | 8.118 | 53.854 | 20.25% |
| Decision Tree | V | 297 | 1419 | 132 | 264 | 79.69% | 8.511 | 52.941 | 20.31% |
| GradientBoosting | V | 291 | 1441 | 110 | 270 | 81.01% | 7.092 | 51.872 | 18.99% |
| GradientBoosting | T | 482 | 2378 | 209 | 452 | 80.37% | 8.079 | 51.606 | 19.63% |
| GradientBoosting 2 | V | 271 | 1412 | 139 | 290 | 80.59% | 8.962 | 48.307 | 19.41% |
| GradientBoosting 2 | T | 529 | 2823 | 281 | 591 | 80.82% | 9.053 | 47.232 | 19.18% |
| Regression(2) | T | 435 | 2320 | 267 | 499 | 80.06% | 10.321 | 46.574 | 19.94% |
| AutoNeural | T | 434 | 2319 | 268 | 500 | 80.06% | 10.359 | 46.467 | 19.94% |
| AutoNeural2 | T | 517 | 2796 | 308 | 603 | 80.47% | 9.923 | 46.161 | 19.53% |
| AutoNeural | V | 255 | 1406 | 145 | 306 | 81.06% | 9.349 | 45.455 | 18.94% |
| Regression(2) | V | 255 | 1406 | 145 | 306 | 81.06% | 9.349 | 45.455 | 18.94% |
| AutoNeural2 | V | 253 | 1410 | 141 | 308 | 81.34% | 9.091 | 45.098 | 18.66% |
| Decision Tree2 | V | 253 | 1375 | 176 | 308 | 79.69% | 11.348 | 45.098 | 20.31% |
| Regression | T | 505 | 2790 | 314 | 615 | 80.61% | 10.116 | 45.089 | 19.39% |
| Regression | V | 252 | 1406 | 145 | 309 | 81.20% | 9.349 | 44.920 | 18.80% |
| Decision Tree2 | T | 489 | 2755 | 349 | 631 | 80.16% | 11.244 | 43.661 | 19.84% |

Role : Train = T, Validate = V
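The rate columns in the table follow from the four confusion counts. The sketch below shows one way to derive them, checked against the "Decision Tree, T" row; the `Confusion` class and its method names are assumptions made for illustration, not something the deck defines:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion counts in the table's column order (c, a, b, d)."""
    fn: int  # actual YES, predicted NO
    tn: int  # actual NO, predicted NO
    fp: int  # actual NO, predicted YES
    tp: int  # actual YES, predicted YES

    @property
    def total(self) -> int:
        return self.fn + self.tn + self.fp + self.tp

    def accuracy(self) -> float:
        return 100.0 * (self.tn + self.tp) / self.total

    def false_positive_rate(self) -> float:
        # Share of actual NO customers that were flagged as churners.
        return 100.0 * self.fp / (self.fp + self.tn)

    def false_negative_rate(self) -> float:
        # Share of actual churners that the model missed.
        return 100.0 * self.fn / (self.fn + self.tp)

    def misclassification_rate(self) -> float:
        return 100.0 - self.accuracy()


decision_tree_train = Confusion(fn=503, tn=2377, fp=210, tp=431)
print(round(decision_tree_train.accuracy(), 2))
print(round(decision_tree_train.false_positive_rate(), 3))
print(round(decision_tree_train.false_negative_rate(), 3))
```

Read this way, every model misses roughly 44 to 54 percent of actual churners even though accuracy sits near 80 percent, which is the gap the cost analysis addresses.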
Cost Sensitive Analysis :
• Different kinds of errors have different costs.
• It is worse to classify a customer as NO when they are YES than to classify a customer as
YES when they are NO.
• Take classification cost into consideration and minimize total cost.
• Cost Matrix (rows = actual class, columns = classified as) :

| Actual | Classified as NO | Classified as YES |
|---|---|---|
| NO | 0 | 2 |
| YES | 5 | 0 |
ZeroR without cost evaluation :
ZeroR with cost evaluation :
ZeroR with cost evaluation :
Cost = 547 (false negatives) x 5 = 2735
J48 without cost evaluation :
J48 with cost evaluation :
Cost = 260 (false negatives) x 5 + 183 (false positives) x 2 = 1666
Cost Sensitive Classification :
• Provide the cost matrix before building the model.
• Reduce the number of false negatives, accepting more false positives in exchange.
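The slides do not say how the cost matrix was applied during training. One common approach, shown here purely as an illustration and not as the authors' method, is to predict the class with the lower expected cost. With a false-negative cost of 5 and a false-positive cost of 2, that moves the decision threshold on the churn probability from 0.5 down to 2/7. The function names below are hypothetical:

```python
def min_cost_threshold(fn_cost: float, fp_cost: float) -> float:
    """Churn probability above which predicting YES is the cheaper choice.

    Predicting NO has expected cost p * fn_cost; predicting YES has expected
    cost (1 - p) * fp_cost. YES is cheaper when p > fp_cost / (fp_cost + fn_cost).
    """
    return fp_cost / (fp_cost + fn_cost)


def predict(p_churn: float, fn_cost: float = 5, fp_cost: float = 2) -> str:
    """Return the class label with the lower expected cost."""
    return "YES" if p_churn > min_cost_threshold(fn_cost, fp_cost) else "NO"


threshold = min_cost_threshold(fn_cost=5, fp_cost=2)
print(f"Predict churn when P(churn) > {threshold:.4f}")
```

A lower threshold flags more customers as churners, which is consistent with the trade the slide describes: fewer false negatives at the price of more false positives.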
J48 with cost sensitive classification :
J48 with cost sensitive classification :
[Screenshots: Old Matrix, New Matrix]
Cost = 116 x 5 + 451 x 2 = 1482
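The three cost figures (ZeroR, J48, and J48 with cost sensitive classification) all come from weighting the error counts by the cost matrix. A small sketch that reproduces them from the counts reported on the slides; the helper name `total_cost` is illustrative:

```python
# Costs from the slide's cost matrix: a missed churner (actual YES, predicted NO)
# costs 5, and a false alarm (actual NO, predicted YES) costs 2.
FN_COST = 5
FP_COST = 2


def total_cost(false_negatives: int, false_positives: int) -> int:
    """Total misclassification cost; correct predictions cost nothing."""
    return false_negatives * FN_COST + false_positives * FP_COST


# Error counts as reported on the slides. ZeroR always predicts the majority
# class (NO), so all of its errors are false negatives.
results = {
    "ZeroR": total_cost(false_negatives=547, false_positives=0),
    "J48": total_cost(false_negatives=260, false_positives=183),
    "J48 cost sensitive": total_cost(false_negatives=116, false_positives=451),
}

for model, cost in results.items():
    print(f"{model}: {cost}")
```

The cost sensitive model makes more errors in total (567 against 443 for plain J48) but fewer of the expensive kind, which is why its total cost is lower.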
| Model | Accuracy (%) | Total Cost | Average Cost | Reduced Cost % |
|---|---|---|---|---|
| ZeroR | 74.11 | 2735 | 1.2944 | - |
| J48 | 79.03 | 1666 | 0.7885 | 39.085 |
| J48 with cost sensitive classifier | 73.16 | 1482 | 0.7014 | 45.813 |

Reduction in Cost on account of J48 = ( 2735 – 1666 ) / 2735 = 39.085 %
Reduction in Cost on account of J48 with cost sensitive classifier = ( 2735 – 1482 ) / 2735 = 45.813 %
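The "Average Cost" and "Reduced Cost %" columns can be recomputed from the total costs. The sketch below assumes a test set of 2113 instances; that number is not stated on the slides and is inferred from total cost divided by average cost (2735 / 1.2944), so treat it as an assumption:

```python
TEST_INSTANCES = 2113  # assumption: inferred from 2735 / 1.2944, not stated in the deck
ZEROR_COST = 2735      # baseline total cost from the table


def average_cost(total_cost: float, instances: int = TEST_INSTANCES) -> float:
    """Misclassification cost per scored customer."""
    return total_cost / instances


def cost_reduction_pct(model_cost: float, baseline_cost: float = ZEROR_COST) -> float:
    """Percentage drop in total cost relative to the ZeroR baseline."""
    return 100.0 * (baseline_cost - model_cost) / baseline_cost


for name, cost in [("J48", 1666), ("J48 with cost sensitive classifier", 1482)]:
    print(f"{name}: average cost {average_cost(cost):.4f}, "
          f"reduction {cost_reduction_pct(cost):.3f}%")
```

The recomputed values agree with the table to within rounding in the last digit.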
What drives Churn :
• Contract

| Contract | Total instances | YES | YES % (out of total YES count) | NO | NO % (out of total NO count) | YES % (out of total instances) | NO % (out of total instances) |
|---|---|---|---|---|---|---|---|
| Month-to-month | 3875 | 1655 | 88.55% | 2220 | 42.91% | 42.71% | 57.29% |
| One year | 1473 | 166 | 8.88% | 1307 | 25.26% | 11.27% | 88.73% |
| Two year | 1695 | 48 | 2.57% | 1647 | 31.83% | 2.83% | 97.17% |
| Total Count | 7043 | 1869 | 100.00% | 5174 | 100.00% | | |
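The contract table reports two different percentages for each contract type, and they answer different questions. The sketch below recomputes both from the YES/NO counts on the slide; the dictionary layout and function names are illustrative choices, not the authors' code:

```python
# (YES count, NO count) per contract type, taken from the slide.
contract_counts = {
    "Month-to-month": (1655, 2220),
    "One year": (166, 1307),
    "Two year": (48, 1647),
}

total_yes = sum(yes for yes, _ in contract_counts.values())
total_no = sum(no for _, no in contract_counts.values())


def share_of_all_churners(contract: str) -> float:
    """YES % out of total YES count: how much of all churn this group holds."""
    yes, _ = contract_counts[contract]
    return 100.0 * yes / total_yes


def churn_rate_within_group(contract: str) -> float:
    """YES % out of total instances: how likely a customer in this group is to churn."""
    yes, no = contract_counts[contract]
    return 100.0 * yes / (yes + no)


for contract in contract_counts:
    print(f"{contract}: {share_of_all_churners(contract):.2f}% of churners, "
          f"{churn_rate_within_group(contract):.2f}% churn rate")
```

The same two calculations apply to the Online Security breakdown later in the deck.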
[Chart, Contract: "Percentage Chart". YES Percentage and No Percentage (out of total YES and total NO counts) for Month-to-month, One year and Two year; values as in the Contract table.]
[Chart, Contract: "Percentage Split". YES % and NO % (out of total instances) for Month-to-month, One year and Two year; values as in the Contract table.]
What drives Churn ? :
• Tenure

| Tenure in Months | YES Count | Percentage |
|---|---|---|
| 0-12 mth | 1037 | 55.48% |
| 13-24 mth | 294 | 15.73% |
| 25-36 mth | 180 | 9.63% |
| 37-48 mth | 145 | 7.76% |
| 49-60 mth | 120 | 6.42% |
| 61-72 mth | 93 | 4.98% |
| Total | 1869 | 100.00% |
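The tenure table groups customers into 12-month bands. A minimal sketch of that binning and of the percentage column, assuming tenure is a whole number of months between 0 and 72 as the band labels suggest; `tenure_bin` is a hypothetical helper name:

```python
def tenure_bin(months: int) -> str:
    """Map tenure in months to the band labels used on the slide."""
    if not 0 <= months <= 72:
        raise ValueError("tenure outside the 0-72 month range shown on the slide")
    if months <= 12:
        return "0-12 mth"
    lower = ((months - 1) // 12) * 12 + 1
    return f"{lower}-{lower + 11} mth"


# Churned (YES) customers per band, taken from the slide.
yes_by_band = {
    "0-12 mth": 1037,
    "13-24 mth": 294,
    "25-36 mth": 180,
    "37-48 mth": 145,
    "49-60 mth": 120,
    "61-72 mth": 93,
}
total_yes = sum(yes_by_band.values())
share_by_band = {band: 100.0 * n / total_yes for band, n in yes_by_band.items()}

for band, share in share_by_band.items():
    print(f"{band}: {share:.2f}%")
```

For example, the two customers on the Implementation slide (tenure 34 and 5 months) fall into the "25-36 mth" and "0-12 mth" bands respectively.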
What drives Churn :
• Online Security

| Online Security | Total instances | YES | YES Percentage | NO | No Percentage | YES % (out of total instances) | NO % (out of total instances) |
|---|---|---|---|---|---|---|---|
| No | 3498 | 1461 | 78.17% | 2037 | 39.37% | 41.77% | 58.23% |
| No internet service | 1526 | 113 | 6.05% | 1413 | 27.31% | 7.40% | 92.60% |
| Yes | 2019 | 295 | 15.78% | 1724 | 33.32% | 14.61% | 85.39% |
| Total Count | 7043 | 1869 | 100.00% | 5174 | 100.00% | | |
[Chart, Online Security: "Total % Captured". YES Percentage and No Percentage for No, No internet service and Yes; values as in the Online Security table.]
[Chart, Online Security: "% out of total Instance". YES % and NO % (out of total instances) for No, No internet service and Yes; values as in the Online Security table.]
Implementation :
| | Tenure | Internet Service | Online Security | Online Backup | Tech Support | Contract | Paperless Billing | Monthly Charges | Predicted Value | Actual Value |
|---|---|---|---|---|---|---|---|---|---|---|
| Instance 1 | 34 | DSL | Yes | No | No | One year | No | 56.95 | No | No |
| Instance 2 | 5 | Fiber optic | No | No | No | Month-to-month | Yes | 69.7 | Yes | Yes |
Implementation of Decision Tree :
[Figures: Instance 1, Instance 2]
Conclusion :
• With predictive modelling, accuracy rises from the baseline of 73.46% to 79 – 81%, which is
significant for such a critical prediction.
• Measured against the ZeroR baseline, J48 evaluated with the cost matrix reduces the total
misclassification cost by 39.08%, and J48 with cost sensitive classification reduces it by 45.81%.
• Of the 3875 customers on a Month-to-Month contract, 1655 (42.71%) churn; they account for
88.55% of all customers who churn.
• Of the 2185 customers with tenure in the 1-12 month range, 1037 churn, which is 55.48% of all
YES instances.
• Of the 3498 customers with no online security, 1461 (41.77%) churn; they account for 78.17% of
all customers who churn.
Thank You…