International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 04 Issue: 09 | Sep -2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 1531
A COMPARATIVE STUDY OF TECHNIQUES TO PREDICT CUSTOMER
CHURN IN TELECOMMUNICATION INDUSTRY
Maninderjeet Kaur1, Priyanka2
1,2 Computer Science and Engineering Swami Vivekanand Institute of Engineering and Technology
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Competition among companies in the industry is
intense today. As a result, companies pay more attention to
their customers than to their products, and they have become
aware of the customer churn problem. When a customer ends
his or her relationship with a company, this loss is known as
customer churn. Various data mining approaches are used to
predict customers' churn behavior, and many algorithms have
been proposed for this purpose. In this paper, we discuss
various methods used to predict customer churn in the
telecommunication industry and propose a technique using
correlation-based symmetric uncertainty feature selection
and ensemble learning for customer churn prediction in the
telecommunication industry.
Key Words: customer churn, data mining, algorithm,
telecommunication, feature selection
1. INTRODUCTION
One of the main concerns of telecommunications companies
is customer retention. To predict customer churn, many
companies in the telecommunications sector now use data
mining techniques [1]. The term churn refers to a change of
service provider, triggered by better rates or services or by
the benefits a competitor offers at signup [2]. It is measured
by the churn rate, which is an important indicator for
organizations. In the telecommunications industry, the mobile
market is the segment that has grown fastest and is almost
saturated. To keep their customers, telecommunications
companies use a defensive marketing strategy. Such a company
must identify customers who are at risk of churning before
they actually leave, so that it can run proactive retention
campaigns [3]. To identify only the customers who are going
to churn, and to avoid contacting customers who would keep
using the services anyway, the predictive model has to be
very accurate. This task is neither easy nor well defined,
because pre-paid customers do not have a contract. A
predictive model can accelerate the retention process and
help mobile telecommunications companies achieve positive
results in this competitive market. The prediction process
depends strongly on data mining techniques, mainly because
of the increased performance obtained by machine learning
algorithms [4]. To extract knowledge from data, the data
mining process uses machine learning algorithms, statistics,
pattern recognition, and visualization techniques [5]. This
paper is organized as follows. Section 2 describes data
mining and its techniques. Section 3 discusses customer
behavior analysis for customer churn. Section 4 surveys the
literature related to this work. Section 5 discusses the
drawbacks of the current system and the proposed work.
Finally, Section 6 concludes the review.
2. DATA MINING
Originally, “data mining” was a statistician's term for
overusing data to draw invalid inferences. Today it refers to
the discovery of useful summaries of data [1,2]. Data mining
(DM) [1] is a process that discovers knowledge or hidden
patterns in large databases. DM is known as one of the core
processes of Knowledge Discovery in Databases (KDD). It lies
at the intersection of artificial intelligence, machine
learning, statistics, and database systems, and it is commonly
used by business intelligence organizations and financial
analysts to derive patterns and trends from large data sets
or databases. The goal is to find accurate patterns that were
not previously known. The overall goal of the DM process is
thus to extract information from a data set and transform it
into an understandable structure for further use.
Many DM techniques and systems have been developed and
designed. These techniques can be classified based on the
database, the knowledge to be discovered, and the
techniques to be utilized.
Based on the database - Many database systems are used in
organizations, such as relational, transaction,
object-oriented, spatial, multimedia, legacy, and Web
databases. A DM system can be classified by the type of
database it is designed for. For example, it is a relational
DM system if it discovers knowledge from a relational
database, and an object-oriented DM system if it finds
knowledge from an object-oriented database [5].
Based on the techniques - DM systems can also be
categorized by DM techniques. For example, a DM system
can be categorized according to the driven method, such as
autonomous knowledge mining, data-driven mining,
query-driven mining, and interactive DM techniques.
Alternatively, it can be classified according to its
underlying mining approach, such as generalization-based
mining, pattern-based mining, statistical- or
mathematical-based mining, and integrated approaches [1].
Based on the knowledge - DM systems can discover various
types of knowledge, including association, classification,
clustering, prediction, sequential patterns, and decision
trees. DM systems can also be classified according to the
abstraction level of the discovered knowledge, which may be
general knowledge, primitive-level knowledge, or multilevel
knowledge. We briefly examine these DM techniques below:
Association: Association is one of the best-known DM
techniques. In association, a pattern is discovered based on
a relationship between items in the same transaction.
Classification: Classification is a classic DM technique
based on machine learning. It assigns each item in a data set
to one of a predefined set of classes or groups.
Classification methods make use of mathematical techniques
such as decision trees, linear programming, neural networks,
and statistics.
Clustering: Clustering is a DM technique that automatically
forms meaningful (i.e. useful) clusters of objects with
similar characteristics. Clustering defines the classes and
puts objects in each class, whereas classification assigns
objects to predefined classes.
Prediction: Prediction, as its name implies, is a DM
technique that discovers relationships among independent
variables and relationships between dependent and
independent variables.
Sequential Patterns: Sequential pattern analysis is a DM
technique that seeks to discover or identify similar
patterns, regular events, or trends in transaction data over
a business period.
Decision trees: The decision tree is one of the most widely
used DM techniques because it is easy for users to
understand. The root of a decision tree is a simple question
or condition that has multiple answers. Each answer then
leads to a further set of questions or conditions that narrow
down the data until a decision can be made [7].
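The decision tree description above can be made concrete with
a small sketch. The tree below is hand-built for illustration
only: the questions (contract_type, support_calls), the
answers, and the labels are assumptions, not rules taken from
the paper or learned from any dataset.

```python
# Hypothetical hand-built tree; questions, answers, and labels are
# illustrative assumptions, not results from the paper.
TREE = {
    "question": "contract_type",
    "answers": {
        "prepaid": {
            "question": "support_calls",
            "answers": {"few": "stay", "many": "churn"},
        },
        "postpaid": "stay",
    },
}


def decide(node, record):
    """Follow the record's answers from the root until a leaf label is reached."""
    while isinstance(node, dict):
        answer = record[node["question"]]
        node = node["answers"][answer]
    return node


customer = {"contract_type": "prepaid", "support_calls": "many"}
print(decide(TREE, customer))  # prints "churn"
```

In practice the questions and split points would be learned
from labeled data rather than written by hand; the traversal
logic stays the same.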
3. CUSTOMER BEHAVIOUR ANALYSIS
In recent years, the management of organizations has been
moving from “product-centric” to “customer-centric” [13].
Organizations not only provide products to meet customers'
needs but also improve their services to increase customer
loyalty and satisfaction. Intense competition in the market
has increased the need for retailers to use strategies
focused on retaining the right customers. Acquiring new
customers is more expensive than retaining existing ones. To
retain customers, organizations are therefore increasingly
concerned with customer behavior analysis. The major factors
of success include learning customers' purchase behavior and
developing marketing strategies to discover latent loyal
customers [14]. However, a strategy that is effective in
acquiring new customers may not be the most effective in
retaining existing ones, so retention activities need a
strategy of their own. Different marketing strategies can
therefore be devised to target different sets of customers.
Predicting which customers are profitable is important to
inform and guide decision making and to keep products and
services competitive. Consumer behavior is the study of how
individuals or groups select and use products, services,
ideas, or experiences to satisfy their needs. It covers ideas
and services as well as tangible products.
Data mining techniques show how business solutions can be
made effectively and easily in order to beat the competition.
New data mining technologies can be used for Customer
Relationship Management (CRM), with which different
marketing strategies are devised for different sets of
customers [15]. Organizations need to understand customer
behavior to improve their marketing strategies. They must
understand a few things about their customers: the psychology
of the customer while purchasing products; how the customer
thinks, feels, and selects between alternatives; how the
customer is influenced by the environment; and how customers'
decision strategies differ between products that differ in
their level of importance or interest.
Customer behavior is analyzed to shape marketing strategies
and public policy. The stored data contains information about
customers' spending behavior: how much they buy, on which day
and at what time they shop, what they buy most often, in
which locality, and so on. Customers' purchasing sequences
are stored in the database, so it is easy to fetch that data
and determine which customers have made repeat purchases
[14]. These sequences can reveal changes in customers'
preferences over time.
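As a rough illustration of fetching repeat purchasers and
their purchase sequences from stored data, the sketch below
works on an in-memory list. The record layout (customer id,
ISO date string, item) and the sample rows are assumptions
made for this example; a real system would query a database.

```python
from collections import Counter

# Hypothetical purchase log: (customer_id, ISO purchase date, item).
purchases = [
    ("c1", "2017-01-03", "sim card"),
    ("c2", "2017-01-04", "handset"),
    ("c1", "2017-02-10", "data pack"),
    ("c3", "2017-02-11", "data pack"),
    ("c1", "2017-03-01", "data pack"),
    ("c3", "2017-03-15", "handset"),
]


def repeat_customers(log, min_purchases=2):
    """Return the ids of customers with at least min_purchases purchases."""
    counts = Counter(customer for customer, _date, _item in log)
    return sorted(c for c, n in counts.items() if n >= min_purchases)


def purchase_sequence(log, customer_id):
    """Return one customer's items in chronological order."""
    rows = sorted((day, item) for c, day, item in log if c == customer_id)
    return [item for _day, item in rows]


print(repeat_customers(purchases))         # ['c1', 'c3']
print(purchase_sequence(purchases, "c1"))  # ['sim card', 'data pack', 'data pack']
```

Sorting on ISO date strings gives chronological order without
parsing, which keeps the sketch short.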
4. LITERATURE REVIEW
Berry and Linoff (1997) described how data mining
techniques can be used to retain loyal customers, find the
right prospects, identify new markets for products and
services, and recognize cross-selling opportunities on and
off the web. Data mining techniques are an effective tool for
analyzing consumer behavior. Seven powerful techniques are
useful for this purpose [11]: Cluster Detection,
Memory-Based Reasoning, Market Basket
Analysis, Genetic Algorithms, Link Analysis, Decision Trees,
and Neural Nets.
Vadakattu R. et al. (2015) described the process of
building a churn prediction platform for large-scale
subscription-based businesses and products. They proposed a
novel technique that uses data segmentation and the past
predictions for a customer to further increase the precision
and recall of the model. Running such a model at large scale
poses several challenges, which the authors cover in their
description of the extract, transform, and load process and
the architecture diagram of the platform. The authors
developed novel model-tuning tools to generate three lists of
“potential churn customers”, categorized as high risk, medium
risk, and low risk. Such a classification enables business
units to tailor retention strategies, since each strategy has
an associated marketing cost. Churn prediction is a
continuous process, so it becomes imperative to track
customers. The authors describe an index/score that they use
to track and monitor customer receptiveness to retention
schemes and performance over a period of time. The platform
is deployed on several eBay sites and has resulted in an
increase in key business metrics [12].
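One simple way to obtain three risk lists from model output
is to threshold each customer's predicted churn probability.
The sketch below assumes such probabilities are available;
the cut-offs 0.7 and 0.4 and the sample scores are
hypothetical and are not the tuning method of the cited
work.

```python
def risk_tier(churn_probability, high=0.7, medium=0.4):
    """Map a churn probability to a tier; the cut-offs are illustrative."""
    if not 0.0 <= churn_probability <= 1.0:
        raise ValueError("churn_probability must lie in [0, 1]")
    if churn_probability >= high:
        return "high"
    if churn_probability >= medium:
        return "medium"
    return "low"


def build_risk_lists(scores):
    """Group customer ids into high, medium, and low risk lists."""
    lists = {"high": [], "medium": [], "low": []}
    for customer_id, probability in scores.items():
        lists[risk_tier(probability)].append(customer_id)
    return lists


# Hypothetical model output: customer id -> predicted churn probability.
scores = {"c1": 0.91, "c2": 0.55, "c3": 0.12, "c4": 0.70}
print(build_risk_lists(scores))
```

Because each retention strategy has its own marketing cost,
the thresholds would normally be chosen per campaign budget
rather than fixed in code.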
Amin A. et al. (2014) discussed customer churn in the
telecommunication industry. It is important to forecast
customer churn behavior in order to retain customers who will
or may churn. The study uses rough set theory as a one-class
classifier and as a multi-class classifier to reveal the
trade-off in selecting an effective classification model for
customer churn prediction. Four rule generation algorithms
(Exhaustive, Genetic, Covering, and LEM2) are analyzed; of
these, the rough set one-class and multi-class classifiers
based on the genetic algorithm yield the most suitable
performance [13].
Mestre M.R. et al. (2013) analyzed customer behavior to
find churning customers. They argue that, from the profiles
of a variety of customers and their changing behavior over
time, organizations can devise marketing strategies, identify
groups of customers, and decide whether those customers are
profitable. They proposed a hybrid algorithm that combines
hierarchical clustering with a hidden Markov model (HMM).
They compared the augmented method with the non-augmented
method on real and synthetic data to show that their proposed
model performs better in predicting customer behavior. They
used different clustering algorithms to segment the customers
[14], and then used decision theory to check whether the
proposed model is financially beneficial for an organization.
Nabavi S. et al. (2013) described data mining capabilities
and the design and implementation of a customer churn
prediction model with CRISP-DM, based on RFM and the random
forest technique. Their customer behavior analysis indicates
that the length of the relationship, average purchase time,
and relative frequency are the best predictors [15]. To
segment churn customers, they use random forest and boosted
trees as a hybrid technique.
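To show what RFM features look like, the sketch below assumes
the usual reading of RFM as recency, frequency, and monetary
value, and computes them from a small transaction log. The
log layout, the sample rows, and the reference date are
assumptions; the exact features used in the cited study are
not given here.

```python
from datetime import date

# Hypothetical transaction log: (customer_id, purchase_date, amount).
transactions = [
    ("c1", date(2017, 6, 1), 20.0),
    ("c1", date(2017, 8, 15), 35.0),
    ("c2", date(2017, 3, 10), 50.0),
]


def rfm_table(log, today):
    """Compute recency (days), frequency (count), and monetary (sum) per customer."""
    summary = {}
    for customer, day, amount in log:
        last, count, total = summary.get(customer, (day, 0, 0.0))
        summary[customer] = (max(last, day), count + 1, total + amount)
    return {
        customer: {
            "recency_days": (today - last).days,
            "frequency": count,
            "monetary": total,
        }
        for customer, (last, count, total) in summary.items()
    }


table = rfm_table(transactions, today=date(2017, 9, 1))
print(table["c1"])  # {'recency_days': 17, 'frequency': 2, 'monetary': 55.0}
```

A table like this could then be passed, one row per customer,
to a classifier such as a random forest.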
Wang C. et al. (2012) proposed a new methodology to predict
customers' purchasing behavior from their purchase sequences.
Using customers' purchase transaction records, a profile is
built for each customer that describes his or her likes and
dislikes [16]. Groups of customers with similar purchasing
behavior are then detected by calculating correlations among
customers. Transaction clustering is used to cluster all the
customers' transactions, and the SOM technique is then used
to detect customer purchase sequences. Sequential purchase
patterns are extracted using association rule mining.
Customer behavior is predicted from each customer's purchase
sequence, based on transaction data.
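Association rule mining scores a candidate rule by its
support and confidence. The sketch below computes both for
plain (unordered) baskets; it is a simplified illustration
and not the sequential procedure of the cited work. The
basket contents are made up for the example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    wanted = set(itemset)
    hits = sum(1 for basket in transactions if wanted <= set(basket))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) as support(both) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base


# Hypothetical baskets, one per transaction.
baskets = [
    {"voice plan", "data pack"},
    {"voice plan", "data pack", "roaming"},
    {"voice plan"},
    {"data pack", "roaming"},
]

print(support(baskets, {"voice plan", "data pack"}))       # 0.5
print(confidence(baskets, {"voice plan"}, {"data pack"}))  # 2 of the 3 voice-plan baskets
```

Rules whose support and confidence exceed chosen minimums are
kept; those minimums are tuning parameters, not values from
the paper.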
Basiri J. et al. (2010) discussed a new approach, the
ordered weighted averaging (OWA) technique, to improve the
prediction accuracy of existing churn management systems.
They drew on the strengths of the bagging, boosting, and
LOLIMOT algorithms and proposed an OWA approach to combine
them [17].
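The OWA operator itself is simple: the inputs are sorted in
descending order and the weights are applied by rank, not by
source. The sketch below shows only the operator; the three
classifier scores and the weight vector are hypothetical, and
how the cited work chooses its weights is not covered here.

```python
def owa(values, weights):
    """Ordered weighted average: weights apply to values sorted descending."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


# Hypothetical churn scores for one customer from three classifiers.
scores = [0.2, 0.9, 0.6]

print(owa(scores, [1.0, 0.0, 0.0]))  # 0.9, behaves like max
print(owa(scores, [0.0, 0.0, 1.0]))  # 0.2, behaves like min
print(owa(scores, [0.5, 0.3, 0.2]))  # weights the largest score most
```

Because the weights attach to ranks, the same weight vector
can move the combined score anywhere between the minimum and
maximum of the individual classifier outputs.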
5. PROPOSED WORK
In the existing work, PCA is used for feature selection. PCA
rests on a Gaussian assumption: it measures the variance and
sorts the eigenvalues, which are proportional to the
variances, in descending order. The assumption is that the
main eigenvalues (EVs) contain most of the information, so
the corresponding main components are used for data
reduction. This approach to feature reduction is risky
because it assumes that each feature is stable (invariant)
and that the feature's variance contains all the information
needed for classification. To eliminate such drawbacks, we
propose a technique using correlation-based symmetric
uncertainty feature selection and ensemble learning for
customer churn prediction in the telecommunication industry.
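Symmetric uncertainty between a feature X and the class Y is
commonly defined as SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)),
where H is entropy and I is mutual information. The sketch
below computes it for discrete values; it is a minimal
illustration under that standard definition, not the proposed
system. The sample columns are made up, and continuous
features would need to be discretized first.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * (H(X) + H(Y) - H(X, Y)) / (H(X) + H(Y)), in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both columns are constant, so nothing is shared
    hxy = entropy(list(zip(x, y)))
    return 2.0 * (hx + hy - hxy) / (hx + hy)


# Hypothetical discrete columns for four customers.
churned = ["yes", "no", "yes", "no"]
intl_plan = ["yes", "no", "yes", "no"]    # moves exactly with the class
region = ["north", "north", "south", "south"]  # independent of the class

print(symmetric_uncertainty(intl_plan, churned))  # 1.0
print(symmetric_uncertainty(region, churned))     # 0.0
```

A feature selector built on this score would rank features by
SU with the class and drop features that are highly
redundant with one already selected.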
6. CONCLUSION
In this paper, we have discussed the present scenario of
customer churn, followed by data mining and its techniques.
Customer behavior analysis was studied to understand customer
behavior related to churn, and a literature survey was
conducted to study various techniques for predicting customer
churn. Section 5 explains that the present work uses PCA for
feature selection, an approach that has some drawbacks. To
remove these drawbacks, a technique is proposed using
correlation-based symmetric uncertainty
feature selection and ensemble learning for customer churn
prediction.
REFERENCES
[1] M. Shaw, C. Subramaniam, G. W. Tan, and M. E. Welge,
“Knowledge management and data mining for marketing,”
Decision Support Systems, Vol. 31, no. 1, pp. 127-137, 2001.
[2] C. P. Wei and I. T. Chiu, “Turning telecommunications call
details to churn prediction: A data mining approach,” Expert
Systems with Applications, Vol. 23, pp. 103-112, 2002.
[3] J. H. Ahn, S. P. Han, and Y. S. Lee, “Customer churn
analysis: Churn determinants and mediation effects of partial
defection in the Korean mobile telecommunications service
industry,” Telecommunications Policy, Vol. 30, Issues 10–11,
pp. 552-568, 2006.
[4] V. García, A. I. Marqués, and J. S. Sánchez, “Non-
parametric statistical analysis of machine learning methods
for credit scoring,” Advances in Intelligent Systems and
Computing, Volume 171, pp. 263-272, 2012.
[5] S. Chakrabarti, M. Ester, U. Fayyad, J. Gehrke, J. Han, S.
Morishita, G. Piatetsky-Shapiro, and W. Wang, “Data mining
curriculum: A proposal”, Version 1.0, 2006.
[6] C. L. Blake and C. J. Merz, “Churn dataset”, UCI Repository
of Machine Learning Databases, University of California,
Department of Information and Computer Science, Irvine,
CA, 1998.
[7] I. B. Brânduşoiu and G. Toderean, “Churn prediction in
the telecommunications sector using support vector
machines”, Annals of the Oradea University Fascicle of
Management and Technological Engineering, Vol. 22, Nr. 1,
pp. 19-22, 2013.
[8] I. B. Brânduşoiu and G. Toderean, “Applying principal
component analysis on call detail records”, ACTA Technica
Napocensis Electronics and Telecommunications, Vol. 55, Nr.
4, pp. 25-28, 2014.
[9] I. B. Brânduşoiu and G. Toderean, “Churn prediction in
the telecommunications sector using neural networks”,
ACTA Technica Napocensis Electronics and
Telecommunications, Vol. 57, Nr. 1, 2016, in press.
[10] I. B. Brânduşoiu and G. Toderean, “Churn prediction in
the telecommunications sector using Bayesian networks”,
University of Oradea Journal of Computer Science and
Control Systems, Vol. 8, Nr. 2, pp. 11-18, 2015.
[11] Berry M. J. and Linoff G. “Data Mining Techniques: For
Marketing, Sales, and Customer Support”. John Wiley & Sons,
New York, NY, USA, 1997.
[12] Ramakrishna Vadakattu, Bibek Panda, Swarnim Narayan,
and Harshal Godhia, “Enterprise subscription churn
prediction”, IEEE International Conference on Big Data (Big
Data), 2015.
[13] Adnan Amin, Changez Khan, Imtiaz Ali, and Sajid Anwar,
“Customer Churn Prediction in Telecommunication
Industry: With and without Counter-Example”, Network
Intelligence Conference (ENIC), IEEE, 2014.
[14] Mestre Maria Rosario and Victoria Pedro. “Tracking of
consumer behavior in e-commerce”. 16th International
Conference on Information Fusion, Istanbul, Turkey, pp.
1214-1221, 2013.
[15] Nabavi Sadaf and Jafari Shahram. (2013). “Providing a
Customer Churn Prediction Model using Random Forest
Technique”. 5th IEEE-Conference on Information and
Knowledge Technology (IKT), pp. 202-207.
[16] Wang Chong and Wang Yanqing. “Discovering
Consumer’s Behavior Changes Based on Purchase
Sequences”. 9th IEEE-International Conference on Fuzzy
Systems and Knowledge Discovery (FSKD 2012), pp. 642-645,
2012.
[17] Javad Basiri, Fattaneh Taghiyareh and Behzad Moshiri.
(2010). “A Hybrid Approach to Predict Churn”, IEEE Asia-
Pacific Services Computing Conference, pp. 485-491.

A Comparative Study of Techniques to Predict Customer Churn in Telecommunication Industry

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 04 Issue: 09 | Sep -2017 www.irjet.net p-ISSN: 2395-0072 © 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 1531 A COMPARATIVE STUDY OF TECHNIQUES TO PREDICT CUSTOMER CHURN IN TELECOMMUNICATION INDUSTRY Maninderjeet Kaur1, Priyanka2 1,2 Computer Science and Engineering Swami Vivekanand Institute of Engineering and Technology ---------------------------------------------------------------------***--------------------------------------------------------------------- Abstract - In present days there is huge competition between various companies in the industry. Due to this companies pay more attentiontowardstheircustomersrather than their product. They become aware of customer churn issue. Basically when a customerceasesone’srelationship with the company, this misfortune of relationship is known as customer churn. Various data mining approaches are used to predict customer’s churnattitude. Manyalgorithmshavebeen proposed to predict these results. In this paper, we have discussed about various methods used to predict customer churn in telecommunication industryand proposeatechnique using Correlation based Symmetric uncertainty feature selection and ensemblelearningforcustomerchurnprediction in telecommunication industry. Key Words: customer churn, data mining, algorithm, telecommunication, feature selection 1.INTRODUCTION One of the main concerns of telecommunications companies is the customer retention. These days, in order to predict customer churn,manycompaniesinthetelecommunications sector make use of the data mining techniques [1]. The term churn refers to the change of the service provider, triggered by better rates or services or by the benefits offered at signup by a competitor company [2]. It is measured by the rate of churn and is an important indicatorfororganizations. 
In the telecommunications industry, the mobile market is the segment that has grown fastest and is almost saturated. To keep their customers, telecommunications companies use a defensive marketing strategy. Such a company must identify customers who are at risk of churning before they actually leave, so that it can run proactive retention campaigns [3]. To identify only the customers who are going to churn, the predictive model has to be very accurate, so that the company avoids contacting customers who would keep using the services anyway. This task is neither easy nor well defined, because pre-paid customers do not have a contract. An accurate predictive model will accelerate the retention process and help mobile telecommunications companies achieve positive results in this competitive market. The prediction process depends strongly on data mining techniques, mainly because of the increased performance obtained by machine learning algorithms [4]. To extract knowledge from data, the data mining process makes use of machine learning algorithms, statistics, pattern recognition, and visualization techniques [5].

This paper is organized as follows. Section 2 describes data mining and its techniques. Section 3 discusses customer behavior analysis for customer churn. Section 4 reviews the literature related to this work. Section 5 discusses the drawback of the current system and the proposed work. Finally, Section 6 concludes the review.

2. DATA MINING

Originally, "data mining" was a statistician's term for overusing data to draw invalid inferences. Today it refers to the discovery of useful summaries of data [1,2]. Data mining (DM) [1] is a process that discovers knowledge or hidden patterns in large databases. DM is known as one of the core processes of Knowledge Discovery in Databases (KDD). It is the process that results in the discovery of new patterns in large data sets.
It is a method at the intersection of artificial intelligence, machine learning, statistics, and database systems, and its principle is to pick out relevant information from data. It is commonly used by business intelligence organizations and financial analysts to extract useful information, patterns, and trends from large data sets or databases. The goal of the technique is to find accurate patterns that were not previously known. The overall goal of the DM process is thus to extract information from a data set and transform it into an understandable structure for further use. Many DM techniques and systems have been developed and designed. They can be classified based on the database, the knowledge to be discovered, and the techniques to be utilized.

Based on the database - Many database systems are used in organizations, such as relational, transaction, object-oriented, spatial, multimedia, legacy, and Web databases. A DM system can be classified by the type of database it is designed for. For example, it is a relational DM system if it discovers knowledge from a relational database, and an object-oriented DM system if it finds knowledge in an object-oriented database [5].

Based on the techniques - DM systems can also be categorized by DM techniques. For example, a DM system
  • 2. can be categorized according to what drives the mining, such as autonomous knowledge mining, data-driven mining, query-driven mining, and interactive DM techniques. Alternatively, it can be classified according to its underlying mining approach, such as generalization-based mining, pattern-based mining, statistical or mathematical mining, and integrated approaches [1].

Based on the knowledge - DM systems can discover various types of knowledge, including association, classification, clustering, prediction, sequential patterns, and decision trees. DM systems can also be classified according to the abstraction level of the discovered knowledge, which may be general knowledge, primitive-level knowledge, or multilevel knowledge. We briefly examine these DM techniques below:

Association: Association is one of the best-known DM techniques. In association, a pattern is discovered based on a relationship between items in the same transaction.

Classification: Classification is a classic DM technique based on machine learning. It is used to assign each item in a data set to one of a predefined set of classes or groups. Classification methods make use of mathematical techniques such as decision trees, linear programming, neural networks, and statistics.

Clustering: Clustering is a DM technique that automatically forms meaningful (i.e. useful) clusters of objects with similar characteristics. The clustering technique defines the classes and puts objects in each class, whereas in classification objects are assigned to predefined classes.
Prediction: Prediction, as its name implies, is a DM technique that discovers relationships among independent variables and relationships between dependent and independent variables.

Sequential Patterns: Sequential pattern analysis is a DM technique that seeks to discover or identify similar patterns, regular events, or trends in transaction data over a business period.

Decision trees: The decision tree is one of the most widely used DM techniques because it is easy for users to understand. The root of a decision tree is a simple question or condition that has multiple answers. Each answer then leads to a further set of questions or conditions that help determine the data needed to make the decision [7].

3. CUSTOMER BEHAVIOUR ANALYSIS

In recent years, the management of organizations has been moving from "Product-Centric" to "Customer-Centric" [13]. Organizations not only provide products to meet the needs of customers but also improve their services to increase customer loyalty and satisfaction. Intense competition in the market has increased the need for retailers to use strategies focused on retaining the right customers. Acquiring new customers is more expensive than retaining existing ones. To retain customers, organizations are increasingly concerned with customer behavior analysis. The major success factors include learning customers' purchase behavior and developing marketing strategies to discover latent loyal customers [14]. However, a strategy that is effective in acquiring new customers may not be the most effective in retaining existing ones, so retention activities need a strategy designed for that purpose. Different marketing strategies can therefore be devised to target different sets of customers. Predicting which customers are profitable is important to inform and guide decision making and to keep products and services competitive.
Consumer behavior is the study of how individuals or groups select and use products, services, ideas, or experiences to satisfy their needs. It involves ideas, services, and tangible products. Data mining techniques show how business solutions can be built effectively and easily to beat the competition. New data mining technologies can be used for Customer Relationship Management (CRM), and with them different marketing strategies are devised for different sets of customers [15]. Organizations need to understand customer behavior to improve their marketing strategies. They must understand a few things about their customers: the psychology of the customer while purchasing products; how the customer thinks, feels, and selects between alternatives; how the customer is influenced by the environment; and how customers' decision strategies differ between products that differ in their level of importance or interest. Customer behavior is analyzed to shape marketing strategies and public policy. The stored data contains information about the spending behavior of customers: how much they buy, on which day and at what time they shop, what they buy most often in that locality, and so on. The purchasing sequences of customers are stored in the database, so it is easy to fetch that data and determine which customers have made repeat purchases [14]. These sequences can reveal changes in customers' preferences over time.

4. LITERATURE REVIEW

Berry and Linoff (1997) described how data mining techniques can be used to retain loyal customers, find the right prospects, identify new markets for products and services, and recognize cross-selling opportunities on and off the web. Data mining techniques are an effective tool for analyzing consumer behavior. Seven powerful techniques are useful for this purpose [11]: Cluster Detection, Memory-Based Reasoning, Market Basket
  • 3. Analysis, Genetic Algorithms, Link Analysis, Decision Trees, and Neural Nets.

Vadakattu R. et al. (2015) described the process of building a churn prediction platform for large-scale subscription-based businesses and products. They propose a novel technique that uses data segmentation and the customer's past predictions to further increase the precision and recall of the model. Running such a model at large scale poses several challenges, which the authors cover in their description of the extract, transform, and load process and the architecture diagram of the platform. The authors developed novel model-tuning tools to generate three lists of "potential churn customers", categorized as high risk, medium risk, and low risk. Such a classification enables business units to tailor retention strategies, since each strategy has an associated marketing cost. Churn prediction is a continuous process, so it becomes imperative to track customers. The authors describe an index/score used to track and monitor customer receptiveness to retention schemes and performance over time. The platform is deployed on several eBay sites and has resulted in an increase in key business metrics [12].

Amin A. et al. (2014) discussed customer churn in the telecommunication industry. It is important to forecast customer churn behavior in order to retain customers who will or may churn. The study uses rough set theory as a one-class classifier and as a multi-class classifier to reveal the trade-off in selecting an effective classification model for customer churn prediction.
Four rule generation algorithms (exhaustive, genetic, covering, and LEM2) are analyzed; of these, rough set classification (one-class and multi-class) based on the genetic algorithm yields the most suitable performance [13].

Mestre M.R. et al. (2013) analyzed customer behavior to find churn customers. They state that, from the profiles of a variety of customers and their changing behavior over time, organizations can devise marketing strategies, identify groups of customers, and decide whether those customers are profitable. They proposed a hybrid algorithm that combines hierarchical clustering with a hidden Markov model (HMM). They compared the augmented method with the non-augmented method on real and synthetic data to show that their proposed model performs better in predicting customer behavior. They used different clustering algorithms to segment the customers [14], and then used decision theory to check whether their proposed model is financially beneficial for an organization.

Nabavi S. et al. (2013) described data mining capabilities and the design and implementation of a customer churn prediction model with CRISP-DM, based on RFM and the random forest technique. Their customer behavior analysis indicates that length of relationship, average purchase time, and relative frequency are the best predictors [15]. To segment churn customers they used random forest and boosted trees as a hybrid technique.

Wang C. et al. (2012) proposed a new methodology to predict customers' purchasing behavior using customers' purchase sequences. From customers' purchase transaction records, a customer profile is built that describes the customers' likes and dislikes [16]. Groups of customers with similar purchasing behavior are then detected by calculating correlations among customers.
Transaction clustering is used to cluster all customer transactions, and the SOM technique is then used to detect customer purchase sequences. Sequential purchase patterns are extracted using association rule mining. The customer's behavior is predicted from the customer's purchase sequence, based on transaction data.

Basiri J. et al. (2010) discussed a new approach, the ordered weighted averaging (OWA) technique, to improve the prediction accuracy of existing churn management systems. They used the strengths of bagging, boosting, and the LOLIMOT algorithm, and proposed the OWA approach to combine these algorithms [17].

5. PROPOSED WORK

In the existing work, PCA is used for feature selection. It rests on a Gaussian assumption: the variance is measured and the eigenvalues, which are proportional to the variances, are sorted in descending order. The assumption is that the largest eigenvalues (EVs) carry most of the information, so the corresponding principal components are used for data reduction. This approach to feature reduction is risky because it assumes both that each feature is stable (invariant) and that the feature's variance contains all the information needed for classification. To eliminate these drawbacks, we propose a technique using correlation-based symmetric uncertainty feature selection and ensemble learning for customer churn prediction in the telecommunication industry.

6. CONCLUSION

In this paper, we have discussed the present scenario of customer churn, as well as data mining and its techniques. Customer behavior analysis was studied to understand customer behavior related to churn. A literature survey was carried out to study various techniques for predicting customer churn. Section 5 explains that existing work uses PCA for feature selection, but this approach has some drawbacks. To remove these drawbacks, a technique is proposed using correlation-based symmetric uncertainty
  • 4. feature selection and ensemble learning for customer churn prediction.

REFERENCES

[1] M. Shaw, C. Subramaniam, G. W. Tan, and M. E. Welge, “Knowledge management and data mining for marketing,” Decision Support Systems, Vol. 31, no. 1, pp. 127-137, 2001.
[2] C. P. Wei and I. T. Chiu, “Turning telecommunications call details to churn prediction: A data mining approach,” Expert Systems with Applications, Vol. 23, pp. 103-112, 2002.
[3] J. H. Ahn, S. P. Han, and Y. S. Lee, “Customer churn analysis: Churn determinants and mediation effects of partial defection in the Korean mobile telecommunications service industry,” Telecommunications Policy, Vol. 30, Issues 10–11, pp. 552-568, 2006.
[4] V. García, A. I. Marqués, and J. S. Sánchez, “Non-parametric statistical analysis of machine learning methods for credit scoring,” Advances in Intelligent Systems and Computing, Volume 171, pp. 263-272, 2012.
[5] S. Chakrabarti, M. Ester, U. Fayyad, J. Gehrke, J. Han, S. Morishita, G. Piatetsky-Shapiro, and W. Wang, “Data mining curriculum: A proposal,” Version 1.0, 2006.
[6] C. L. Blake and C. J. Merz, “Churn dataset,” UCI Repository of Machine Learning Databases, University of California, Department of Information and Computer Science, Irvine, CA, 1998.
[7] I. B. Brândușoiu and G. Toderean, “Churn prediction in the telecommunications sector using support vector machines,” Annals of the Oradea University Fascicle of Management and Technological Engineering, Vol. 22, Nr. 1, pp. 19-22, 2013.
[8] I. B. Brândușoiu and G. Toderean, “Applying principal component analysis on call detail records,” ACTA Technica Napocensis Electronics and Telecommunications, Vol. 55, Nr. 4, pp. 25-28, 2014.
[9] I. B. Brândușoiu and G. Toderean, “Churn prediction in the telecommunications sector using neural networks,” ACTA Technica Napocensis Electronics and Telecommunications, Vol. 57, Nr. 1, 2016, in press.
[10] I. B. Brândușoiu and G. Toderean, “Churn prediction in the telecommunications sector using Bayesian networks,” University of Oradea Journal of Computer Science and Control Systems, Vol. 8, Nr. 2, pp. 11-18, 2015.
[11] M. J. Berry and G. Linoff, “Data Mining Techniques: For Marketing, Sales, and Customer Support,” John Wiley & Sons, New York, NY, USA, 1997.
[12] R. Vadakattu, B. Panda, S. Narayan, and H. Godhia, “Enterprise subscription churn prediction,” IEEE International Conference on Big Data (Big Data), 2015.
[13] A. Amin, C. Khan, I. Ali, and S. Anwar, “Customer Churn Prediction in Telecommunication Industry: With and without Counter-Example,” Network Intelligence Conference (ENIC), IEEE, 2014.
[14] M. R. Mestre and P. Victoria, “Tracking of consumer behavior in e-commerce,” 16th International Conference on Information Fusion, Istanbul, Turkey, pp. 1214-1221, 2013.
[15] S. Nabavi and S. Jafari, “Providing a Customer Churn Prediction Model using Random Forest Technique,” 5th IEEE Conference on Information and Knowledge Technology (IKT), pp. 202-207, 2013.
[16] C. Wang and Y. Wang, “Discovering Consumer’s Behavior Changes Based on Purchase Sequences,” 9th IEEE International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2012), pp. 642-645, 2012.
[17] J. Basiri, F. Taghiyareh, and B. Moshiri, “A Hybrid Approach to Predict Churn,” IEEE Asia-Pacific Services Computing Conference, pp. 485-491, 2010.