MIS 637 Final Project
Predicting Churners in a Telecom Company
By
Rahul Bhatia
Student ID: 10398302
ABSTRACT
• "Churn Rate" is a business term describing the rate at which
customers leave or cease paying for a product or service. It's a
critical figure in many businesses, as it's often the case that
acquiring new customers is a lot more costly than retaining existing
ones (in some cases, 5 to 20 times more expensive).
• Understanding what keeps customers engaged, therefore, is
incredibly valuable, as it is a logical foundation from which to
develop retention strategies and roll out operational practices
aimed at keeping customers from walking out the door. Consequently,
there is growing interest among companies in developing better
churn-detection techniques, leading many to look to data mining
and machine learning for new and creative approaches.
CROSS-INDUSTRY STANDARD PROCESS FOR DATA MINING (CRISP-DM): 6 Phases
• Business understanding phase
• Data understanding phase
• Data preparation phase
• Modeling phase
• Evaluation phase
• Deployment phase
Business Understanding
Business Question:
For this project, I obtained a longstanding customer dataset from a
telecom (mobile) company that wants to predict whether its customers
will churn. The objective of this project is to build a model, trained
on historical data, that identifies churners in the telecom company.
Objective:
The classification goal is to derive rules and predict whether a
customer will churn (target variable: churn) using the KNN and C4.5
algorithms, and to compare the accuracy of the two models.
Accomplishments:
By using this model, we can make churn prediction more efficient by
identifying the main variables associated with churning, and obtain a
more rational estimate of which customers are potential churners
that we should contact first.
Data Understanding
Data Source: This dataset was used in the yhat blog post “Predicting
customer churn with scikit-learn” by Eric Chiang.
Data set details:
• The data is straightforward. Each row represents a subscribing
telephone customer. Each column contains a customer attribute, such
as phone number, call minutes used during different times of day,
charges incurred for services, lifetime account duration, and whether
or not the customer is still a customer. The original dataset contains a
total of 3333 rows with 1 dependent variable and 20 independent
variables.
Data Understanding
Sample Data (table shown on the slide; not reproduced in this transcript)
Attribute Descriptions (tables shown on two slides; not reproduced in this transcript)
Data Preparation
Data Cleaning and Transformations:
Handling missing values and identifying outliers:
No missing values or outliers were found in the original data.
Normalization:
Z-score normalization was performed on the input variables Account Length,
Number of Voice Mail Messages, Total Day Minutes, Total Day Calls, Total
Evening Minutes, Total Evening Calls, Total Night Minutes, Total Night Calls,
Total International Minutes, Total International Calls and Customer Service Calls.
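The z-score step can be sketched in a few lines of Python. This is only an illustration: the project itself used SPSS Modeler, the function name `z_score` and the sample values are hypothetical, and the use of the population standard deviation is an assumption, since the slides do not say which variant was applied.

```python
from statistics import mean, pstdev

def z_score(values):
    """Rescale values to mean 0 and standard deviation 1: (x - mean) / std."""
    mu = mean(values)
    sigma = pstdev(values)  # population std; an assumption, not stated in the slides
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]

# Hypothetical "Total Day Minutes" values, not taken from the dataset.
day_minutes = [120.0, 180.0, 240.0]
scaled = z_score(day_minutes)
```

Putting all numeric inputs on the same scale matters most for KNN, where distances would otherwise be dominated by the variables with the largest raw ranges.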
Data Preparation
• Attribute Selection:
  - The attributes State, Area Code and Phone Number were dropped from
    the model because these columns are not needed for churn prediction.
  - The attributes Total Day Charge, Total Evening Charge, Total Night
    Charge and Total International Charge were also dropped from the
    model because each is highly correlated with Total Day Minutes,
    Total Evening Minutes, Total Night Minutes and Total International
    Minutes, respectively.
Data Preparation
• Correlation (one plot per slide; plots not reproduced in this transcript):
  - Strong correlation between Day Minutes and Day Charge.
  - Strong correlation between Evening Minutes and Evening Charge.
  - Strong correlation between Night Minutes and Night Charge.
  - Strong correlation between International Minutes and International Charge.
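A charge that is billed at a fixed rate per minute is a straight-line function of the minutes, which is why such pairs show a correlation close to 1 and why one column of each pair can be dropped. The sketch below illustrates this with a hand-written Pearson correlation; the rate `RATE` and the minute values are hypothetical and are not taken from the dataset.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical values: charge is minutes times an assumed flat rate.
minutes = [100.0, 150.0, 200.0, 250.0]
RATE = 0.17  # assumed price per minute, for illustration only
charge = [m * RATE for m in minutes]

r = pearson(minutes, charge)  # very close to 1 for a linear relationship
```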
TRANSFORMED DATASET
The transformed dataset has 13 independent variables and 1 dependent variable (churn).
Sample Data (table shown on the slide; not reproduced in this transcript)
Data Preparation
Data Division:
After data cleaning, the data set of 3333 records is divided into 2 sets.
Training data set: 80% of the data (2666 records) is used to develop the model.
Testing data set: 20% of the data (667 records) is used to evaluate the model.
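The 80/20 division can be reproduced outside SPSS Modeler with a simple shuffled split. This is a minimal sketch: the helper name `train_test_split`, the seed and the stand-in record IDs are assumptions for illustration, and the slides do not say whether the original partition was random or stratified.

```python
import random

def train_test_split(records, train_fraction=0.8, seed=42):
    """Shuffle the records reproducibly and cut them into train and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Stand-in for the 3333 customer rows (just row indices here).
customer_ids = range(3333)
train, test = train_test_split(customer_ids)
```

With 3333 records this yields 2666 training rows and 667 test rows, matching the counts on the slide.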
Modeling
Which algorithm?
The target variable is categorical (true, false) rather than continuous, so
classification is the right choice.
Classification: predicts categorical class labels. A model is built from the
training set and the values of the classifying attribute, and is then used to
classify new data.
Modeling
K-Nearest Neighbors algorithm:
The output is a class membership. An object is classified by a majority vote of its
neighbors, with the object being assigned to the class most common among
its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the
object is simply assigned to the class of that single nearest neighbor.
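The majority-vote rule described above can be sketched as follows. This is not the SPSS Modeler implementation: the function name `knn_predict`, the Euclidean distance and the two-feature toy points are assumptions made for illustration.

```python
from collections import Counter
from math import dist

def knn_predict(train_points, train_labels, query, k=5):
    """Classify `query` by majority vote among its k nearest training points."""
    # Rank training points by Euclidean distance to the query and keep the k closest.
    neighbors = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical, already normalized points: (day minutes, customer service calls).
points = [(0.1, 0.2), (0.0, -0.1), (-0.2, 0.1), (2.0, 2.1), (2.2, 1.9), (1.9, 2.3)]
labels = ["false", "false", "false", "true", "true", "true"]

prediction = knn_predict(points, labels, (2.0, 2.0), k=3)
```

An odd k avoids tied votes in a two-class problem such as churn ("true" / "false").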
C4.5 algorithm: An extension of the ID3 algorithm. C4.5 recursively visits each
decision node, selecting the optimal split, until no further splits are possible. It
uses the concept of information gain (entropy reduction) to select the
optimal split.
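Entropy reduction can be made concrete with a small calculation. The sketch below shows plain information gain only; C4.5 itself normalizes this quantity into a gain ratio, which is omitted here. The function names and the toy labels are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(parent_labels, child_groups):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    total = len(parent_labels)
    remainder = sum(len(g) / total * entropy(g) for g in child_groups)
    return entropy(parent_labels) - remainder

# Toy node with 4 churners and 4 non-churners.
parent = ["true"] * 4 + ["false"] * 4

# A split that separates the classes perfectly removes all uncertainty.
perfect_gain = information_gain(parent, [["true"] * 4, ["false"] * 4])

# A split that leaves both children as mixed as the parent gains nothing.
useless_gain = information_gain(
    parent, [["true", "true", "false", "false"], ["true", "true", "false", "false"]]
)
```

At each node the tree builder would evaluate every candidate split this way and keep the one with the highest score.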
Modeling
Software:
SPSS Modeler 17.0 is a data mining and text analytics software application built by
IBM.
It is an extensive predictive analytics platform designed to build predictive
models, conduct analytic tasks and bring predictive intelligence to decisions by
providing a range of advanced algorithms and techniques.
Modeling
Training Data (screenshot on the slide; not reproduced in this transcript)
Input & Output Variables (screenshot on the slide; not reproduced in this transcript)
K-nearest neighbor Model
K = 5 is selected because it gives the minimum error.
K-nearest neighbor Model Summary (screenshot on the slide; not reproduced in this transcript)
K-nearest neighbor: Test Dataset Applied to the Model Built on Training Data (Evaluation)
K-nearest neighbor Accuracy
87.71% accuracy was achieved.
Modeling (C4.5 algorithm)
Configure the model and execute it with cross-validation on the training
dataset; 95.1% accuracy was achieved.
C4.5: Test Dataset Applied to the Model Built on Training Data
94.9% accuracy was achieved.
Evaluation
The C4.5 algorithm (94.9%) is preferred over the K-nearest neighbor algorithm
(87.71%) because its accuracy is higher.
C4.5 Algorithm:
Coincidence Matrix
The matrix shows high accuracy when predicting “false” but low accuracy when predicting “true”.
This is because a model often yields misleading results when the data set is unbalanced: in
this project the test set has 558 “false” and 109 “true” records, so the classifier could easily
be biased toward classifying all the samples as “false”.
However, we can still use this model to predict “true”, as the lift and gains charts show.
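To see why accuracy alone can mislead on unbalanced data, it helps to compute what a classifier that always answers “false” would score on this test set. The class counts below come from the slide; the function name is hypothetical.

```python
def majority_baseline_accuracy(class_counts):
    """Accuracy of a classifier that always predicts the most frequent class."""
    return max(class_counts.values()) / sum(class_counts.values())

# Test-set class counts reported on the slide.
test_counts = {"false": 558, "true": 109}

baseline = majority_baseline_accuracy(test_counts)  # 558 / 667, roughly 0.837
```

Any reported accuracy should therefore be read against this baseline of roughly 83.7% rather than against 50%.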
Evaluation
Lift: 6 times better than using no model
Lift is a measure of the effectiveness of a predictive model, calculated as
the ratio between the results obtained with and without the predictive
model.
When contacting 10% of customers, we would expect to reach 10% of the
churners using no model and 60% of the churners using the given model.
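The lift figure follows directly from the two percentages on the slide. A minimal sketch of the arithmetic (the function name is hypothetical):

```python
def lift(share_of_churners_captured, share_of_customers_contacted):
    """Ratio of churners reached with the model to churners reached at random."""
    return share_of_churners_captured / share_of_customers_contacted

# Slide figures: contacting 10% of customers reaches 60% of churners with the
# model, versus the 10% expected from random selection.
lift_at_10_percent = lift(0.60, 0.10)  # about 6, up to floating-point rounding
```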
Evaluation
Gains Chart
The y-axis shows the percentage of the total possible positive churners (“true”) captured.
The x-axis shows the percentage of customers contacted.
By using this model, we need to contact only 50% of customers to reach
90% of the “true” churners.
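One point on a gains chart can be computed by ranking customers by predicted churn score and counting how many actual churners fall in the contacted top slice. The sketch below assumes the model outputs a churn score per customer; the function name, the scores and the labels are hypothetical toy data, not results from the project.

```python
def cumulative_gain(scores, is_churner, fraction_contacted):
    """Share of all churners found in the top-scored fraction of customers."""
    # Rank customers from highest to lowest predicted churn score.
    ranked = sorted(zip(scores, is_churner), key=lambda pair: pair[0], reverse=True)
    n_contacted = round(len(ranked) * fraction_contacted)
    captured = sum(1 for _, churned in ranked[:n_contacted] if churned)
    return captured / sum(is_churner)

# Toy data: 10 customers, 3 of whom actually churned.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
churned = [True, True, False, True, False, False, False, False, False, False]

gain_at_20 = cumulative_gain(scores, churned, 0.2)  # 2 of 3 churners
gain_at_50 = cumulative_gain(scores, churned, 0.5)  # all 3 churners
```

Plotting this value for every contacted fraction from 0% to 100% gives the gains curve; dividing each value by the contacted fraction gives the lift.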
Evaluation (Variable Importance)
Day Minutes, Customer Service
Calls and International Plan are
the most important variables.
Evaluation
Conclusion:
We can conclude that Day Minutes, Number of Customer Service Calls,
International Plan, Evening Minutes, Number of International Calls and Voice Mail
Plan are the most important variables in predicting churners.
Deployment
• Predicting churn is particularly important for businesses with subscription models,
such as cell phone, cable, or merchant credit card processing plans.
• Since the model achieved high predictive performance, it can be used to
predict churners in a telecom company and help the company keep its
customers from churning by improving on the most important variables
discussed earlier, while also saving campaign cost.
References
Data source:
Link to the dataset:
https://raw.githubusercontent.com/EricChiang/churn/master/data/churn.csv
Software:
http://www-01.ibm.com/software/analytics/spss/
Other references:
http://blog.yhathq.com/posts/predicting-customer-churn-with-sklearn.html
Thank you


Editor's Notes

  • #25: To test the model built on the training data, we use the test dataset.