International Journal of Database Management Systems (IJDMS) Vol.15, No.1/2/3/4/5, October 2024
DOI: 10.5121/ijdms.2024.16501
RESEARCH ON INTEGRATED LEARNING
ALGORITHM MODEL OF BANK CUSTOMER
CHURN PREDICTION
Shang Xinping, Wang Yi
Artificial Intelligence, Dongguan City University, Dongguan, Guangdong, China
ABSTRACT
With the rapid growth of Internet finance, competition within the banking industry has intensified
significantly. To better understand customer needs and enhance customer loyalty, it has become crucial to
develop a customer churn prediction model. Such a model enables banks to identify customers at risk of
leaving, support data-driven business decisions, and implement strategies to retain valuable clients,
thereby safeguarding the bank's interests. In this context, this paper presents a customer churn prediction
model based on an ensemble learning algorithm. Experimental results demonstrate that the model
effectively predicts and analyzes potential customer churn, providing valuable insights for retention efforts.
KEYWORDS
customer churn; data preprocessing; XGBoost
1. INTRODUCTION
With the ongoing expansion of financial markets and heightened competition, customer churn has
emerged as a critical factor influencing banks' operational efficiency and remains a top concern
for businesses [1]. To effectively reduce customer attrition, enhance satisfaction and loyalty, refine
customer segmentation, attract potential clients, and improve service quality, banks must leverage
advanced prediction models to identify customers at risk of leaving and boost their competitive
advantage [2]. Customer churn, also known as customer attrition, refers to the process where a
customer ends their relationship with a company or service provider. In the banking sector, this
issue is especially significant, as it directly impacts revenue, profitability, and market share. As
market competition intensifies, banks are under growing pressure to retain existing customers and
attract new ones.
Ensemble learning, a robust machine learning approach, offers a solution by improving the
overall accuracy and stability of predictive models through the combination of multiple
algorithms, making it ideal for customer churn prediction in banks.
Historically, customer churn prediction has progressed from traditional statistical methods to
more advanced machine learning techniques. Early models, such as logistic regression, survival
analysis, and decision trees, provided a basic understanding of churn but were often limited in
their predictive capabilities. As machine learning evolved, more sophisticated algorithms like
random forests, support vector machines (SVMs), and neural networks were introduced, offering
better accuracy but often suffering from overfitting and instability issues. Ensemble learning,
which integrates the predictions of multiple models, has emerged as a more reliable and accurate
solution. Methods like Gradient Boosting Machines (GBM), Random Forest with feature
selection, and Stacking have shown marked improvements in predictive performance and
stability. Among these, XGBoost (Extreme Gradient Boosting) stands out for its efficiency,
scalability, and flexibility.
This research focuses on a bank's dataset, consisting of 14 variables and 10,000 samples. The
study begins by analysing and pre-processing the data, which includes tasks like data cleaning,
feature engineering, and feature selection. XGBoost, an ensemble learning algorithm, is then
utilized to predict and model customer churn. The resulting prediction model enables banks to
accurately forecast customer attrition, increase user engagement, improve retention strategies,
and reduce the costs associated with retaining customers.
2. CONSTRUCTION OF BANK CUSTOMER CHURN FORECASTING MODEL
2.1. Data Exploration and Preprocessing
At this stage, it is crucial to systematically clean and transform the data for each feature to
enhance the performance of the predictive models. This process involves several important steps:
Step 1: Handling Missing Values
Begin by thoroughly examining the dataset for any missing values, as these can weaken the
predictive power of the model. Different strategies should be employed depending on the type of
data (numeric or categorical). For numerical data, methods such as filling in missing values with
the mean, median, or mode can be applied. For categorical data, the most frequent category may
be used as a replacement. However, if a feature has an excessive proportion of missing values
(e.g., more than 50%), it may lose its significance due to the large amount of unknown
information and should be considered for removal.
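The imputation rules in Step 1 can be sketched as follows. This is a minimal illustration, not the paper's actual code: the column names (`Age`, `Geography`, `MostlyEmpty`), the helper `impute_missing`, and the choice of the median for numeric columns are assumptions made for the example.

```python
import numpy as np
import pandas as pd


def impute_missing(df: pd.DataFrame, max_missing: float = 0.5) -> pd.DataFrame:
    """Drop columns with too many gaps, then fill the rest by column type."""
    # Keep only features whose share of missing values is at most the threshold.
    out = df.loc[:, df.isna().mean() <= max_missing].copy()
    for col in out.columns:
        if not out[col].isna().any():
            continue
        if pd.api.types.is_numeric_dtype(out[col]):
            # Numeric feature: fill with the median (mean or mode are alternatives).
            out[col] = out[col].fillna(out[col].median())
        else:
            # Categorical feature: fill with the most frequent category.
            out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out


# Toy frame with hypothetical column names.
toy = pd.DataFrame({
    "Age": [40.0, np.nan, 35.0, 52.0],
    "Geography": ["France", "Germany", None, "France"],
    "MostlyEmpty": [np.nan, np.nan, np.nan, 1.0],
})
clean = impute_missing(toy)
```

In this toy frame, `MostlyEmpty` is 75% missing and is dropped, while the remaining gaps are filled from the observed values.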
Step 2: Identifying and Removing Duplicate Values
Duplicate entries can occur due to data entry errors or accidental data merging, which can distort
the true distribution and negatively affect the accuracy of the model. Identifying and removing
these duplicates using appropriate techniques ensures the dataset’s integrity and reliability.
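As a small illustration of Step 2, exact duplicate rows can be counted and removed with pandas. The frame and its column names are hypothetical.

```python
import pandas as pd

# Toy data: the second and third rows are exact duplicates.
toy = pd.DataFrame({
    "CustomerId": [1, 2, 2, 3],
    "Balance": [0.0, 150.5, 150.5, 99.0],
})

# Count rows that repeat an earlier row, then keep the first occurrence of each.
n_dupes = int(toy.duplicated().sum())
deduped = toy.drop_duplicates().reset_index(drop=True)
```

If duplicates should be judged on an identifier only, `drop_duplicates(subset=[...])` restricts the comparison to the chosen columns.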
Step 3: Eliminating Irrelevant or Low-Variance Features
Certain features may have little relevance to the target variable or exhibit extremely low variance
(e.g., nearly identical values across the dataset), adding unnecessary complexity to the model.
These irrelevant or low-variance features should be identified through correlation analysis or
variance testing and removed to enhance the model’s efficiency and accuracy while reducing
computational overhead.
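A variance filter of the kind described in Step 3 might look like the sketch below. The helper `drop_low_variance`, the threshold value, and the column names are assumptions for illustration; a sensible threshold depends on the scale of each feature.

```python
import pandas as pd


def drop_low_variance(df: pd.DataFrame, threshold: float = 1e-3):
    """Remove numeric columns whose variance is at or below the threshold."""
    variances = df.var(numeric_only=True)
    low = variances[variances <= threshold].index.tolist()
    return df.drop(columns=low), low


# Toy frame: "HasAccount" is constant and carries no information.
toy = pd.DataFrame({
    "HasAccount": [1, 1, 1, 1],
    "Age": [30, 40, 50, 60],
    "Geography": ["France", "Spain", "Germany", "France"],
})
reduced, dropped = drop_low_variance(toy)
```

Non-numeric columns are left untouched here; they would need a separate check, for example on the share of the most frequent category.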
Step 4: Detecting and Handling Outliers
Outliers can distort statistical models and reduce predictive accuracy. Statistical techniques, such
as the interquartile range (IQR), combined with visual tools like box plots, can be used to detect
these outliers. Depending on the context, outliers can either be removed or transformed (e.g.,
through logarithmic transformation or binning) to minimize their impact on the model. In some
cases, data standardization or normalization is necessary to address the differences in scale
between features.
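The IQR rule mentioned in Step 4 can be sketched with NumPy. The multiplier `k = 1.5` is the conventional box-plot choice and the sample values are invented; neither is taken from the paper.

```python
import numpy as np


def iqr_bounds(values, k: float = 1.5):
    """Return the lower and upper fences used by the IQR outlier rule."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Hypothetical ages with one implausible entry.
ages = np.array([25, 31, 38, 42, 47, 55, 120])
low, high = iqr_bounds(ages)

# Flag points outside the fences, or clip them to the fences instead of dropping.
is_outlier = (ages < low) | (ages > high)
clipped = np.clip(ages, low, high)
```

Clipping keeps the row while limiting its influence, which can be preferable when the dataset is small.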
Step 5: Balancing the Dataset
Data imbalance, where certain classes significantly outnumber others, is a common challenge in
classification tasks. In this dataset, the churn distribution is imbalanced, with non-churn
customers outnumbering churn customers by approximately 4:1 (as shown in Figure 1). This
imbalance can cause the model to favor the majority class (non-churn customers). To address this,
techniques like oversampling (e.g., SMOTE) or undersampling can be applied to adjust the class
proportions, ensuring a more balanced dataset and improving model performance for minority
classes.
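To make the idea of oversampling concrete, the sketch below uses plain random oversampling, which duplicates minority-class rows until the classes are equal in size. This is a simpler stand-in for SMOTE, which synthesizes new minority samples by interpolation; the function name and toy data are assumptions.

```python
import numpy as np


def random_oversample(X, y, seed: int = 0):
    """Resample every class with replacement up to the size of the largest class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    parts = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        # Draw the missing number of rows from this class (zero for the majority).
        extra = rng.choice(idx, size=target - count, replace=True)
        parts.append(np.concatenate([idx, extra]))
    order = rng.permutation(np.concatenate(parts))
    return X[order], y[order]


# Toy data with the roughly 4:1 ratio described above.
y = np.array([0] * 8 + [1] * 2)
X = np.arange(20).reshape(10, 2)
X_bal, y_bal = random_oversample(X, y)
```

Resampling is normally applied to the training split only, so that the test set keeps the original class ratio.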
Figure 1. Pie Chart of Loss Rate
Further analysis of the relationship between the target variable and other variables:
Figure 2. Diagram1 of the Relationship Between the Target Variable and Other Variables
The following observations can be made from Figure 2:
1) Germany has the fewest customers and France the most, but the proportion of lost customers
is reversed. This suggests that the bank may not allocate enough customer service resources in
areas with fewer customers.
2) The total number of male customers is higher than that of female customers, but the churn
ratio of male customers is lower, suggesting that the bank's service strategy is not
comprehensive enough.
3) Customers with credit cards churn more than customers without credit cards.
4) Inactive customers have a higher churn rate. Because the overall proportion of inactive
customers is quite high, the bank should offer inactive customers preferential policies that
turn them into active customers and thereby reduce churn.
Figure 3. Diagram2 of the Relationship Between the Target Variable and Other Variables
In Figure 3, it can be seen that:
1) There is no significant difference in the distribution of credit scores between churn and non-
churn customers.
2) Older customers churn more than younger ones, so banks need to adjust retention strategies
for customers of different age groups.
3) In terms of tenure, clients at the extremes are more likely to churn.
4) The bank is losing customers with large balances, which may leave it short of funds for
lending and compress its profit margin.
5) Product and salary have no significant effect on the likelihood of churn.
Step 6: Data Transformation and Normalization
Features with varying scales can impact machine learning algorithms differently. To ensure that
all features contribute equally to the model’s performance, it is essential to apply scaling through
normalization or standardization. Normalization typically scales the data to a specific range (such
as [0,1]), while standardization rescales the data to a mean of 0 and a variance of 1 without
changing the shape of its distribution. These transformations help improve the model's
convergence speed and predictive accuracy.
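The two scaling schemes can be written out directly. This is a minimal NumPy sketch; the balance values and function names are made up for illustration.

```python
import numpy as np


def min_max_scale(x):
    """Normalization: map values linearly onto the range [0, 1]."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)


def standardize(x):
    """Standardization: subtract the mean and divide by the standard deviation."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    return (x - x.mean()) / std if std > 0 else np.zeros_like(x)


# Hypothetical account balances.
balance = np.array([0.0, 50000.0, 100000.0, 250000.0])
balance_01 = min_max_scale(balance)
balance_z = standardize(balance)
```

In practice the scaling parameters (minimum, maximum, mean, standard deviation) are estimated on the training split and then reused on the test split.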
Once the data cleaning process is complete, the dataset should be reviewed again to confirm that
all missing values, duplicates, and outliers have been appropriately handled. The final dataset
should be well-prepared for model training, with balanced classes and carefully selected, relevant
features.
2.2. Feature Construction and Selection
After completing data preprocessing, it is standard practice to examine the correlations between
feature variables using a correlation coefficient matrix. Displaying this matrix as a heatmap
provides a clear, visual representation of the strength of the correlations between different
features.
As shown in the heatmap in Figure 4, the correlations between the feature variables are relatively
weak. This indicates that the features are largely independent of each other, making them suitable
for inclusion in the model-building process without concerns about multicollinearity.
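A correlation check of this kind can be sketched with pandas. The data below is synthetic, and the column `AgeInMonths` is deliberately constructed as a redundant copy of `Age` so that the check has something to find; the 0.8 cutoff is an assumed rule of thumb.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 500
age = rng.normal(40, 10, n)

# Synthetic features with hypothetical names.
frame = pd.DataFrame({
    "Age": age,
    "Balance": rng.normal(75000, 30000, n),
    "AgeInMonths": age * 12,  # redundant by construction
})

# Pearson correlation matrix (the pandas default).
corr = frame.corr()


def correlated_pairs(corr: pd.DataFrame, threshold: float = 0.8):
    """List feature pairs whose absolute correlation reaches the threshold."""
    cols = corr.columns
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            if abs(corr.iloc[i, j]) >= threshold:
                pairs.append((cols[i], cols[j], float(corr.iloc[i, j])))
    return pairs


pairs = correlated_pairs(corr)
```

The matrix `corr` is what a heatmap would display; any pair returned by `correlated_pairs` is a candidate for dropping one of the two features.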
Figure 4. Relationships Among Features
According to the Pearson correlation coefficient [3], further analysis of the degree of correlation
between customer churn and each dimension is shown in Figure 5, from which we can see that
age has the greatest impact on customer churn. The impact also differs across geographies:
the churn rate of users in Germany is significantly higher than that in
other countries. In terms of gender, the churn rate of women is higher than that of men. The churn
rate of active users is significantly lower than that of inactive users, which also indicates that
active customers have higher loyalty than inactive customers.
Figure 5. Relationship Between User Churn and Various Dimensions
However, a feature construction and selection process is still needed to optimize model
performance.
Feature construction involves generating new features from the original data to better capture its
intrinsic characteristics, thereby enhancing the model's predictive performance or interpretability.
This process requires a deep understanding of the business context, data analysis objectives, and
domain expertise to manually create relevant features. By deriving new variables from the
original ones, categorical variables with multiple classes can be combined into fewer categories,
reducing the dimensionality and complexity of the dataset. Additionally, interactions between
two or more features can be considered to create interaction features that contain extra
information, potentially improving the model's performance.
Figure 6. The Influence of Balance to Wage Ratio on Attrition Rate
As shown in Figure 6, while estimated wages have minimal impact on customer churn, the
balance-to-wage ratio significantly influences churn rates. Customers with higher balance-to-
wage ratios exhibit a greater likelihood of churn, which could deter banks from lending to such
individuals.
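A constructed feature such as the balance-to-wage ratio could be derived as in the sketch below. The column names `Balance`, `EstimatedSalary`, and `BalanceSalaryRatio` are assumptions, as is the choice to leave the ratio undefined when the salary is zero.

```python
import numpy as np
import pandas as pd

# Toy customers with hypothetical column names.
toy = pd.DataFrame({
    "Balance": [0.0, 120000.0, 80000.0],
    "EstimatedSalary": [50000.0, 60000.0, 0.0],
})

# Treat a zero salary as missing so the division does not produce infinity.
salary = toy["EstimatedSalary"].replace(0, np.nan)
toy["BalanceSalaryRatio"] = toy["Balance"] / salary
```

Rows with an undefined ratio can then be handled by the same missing-value strategy used during preprocessing.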
Feature selection involves choosing the most relevant subset of features from the original dataset
that are most effective for predicting the target variable. This process helps reduce model
complexity, enhance the model’s generalization ability, and lower the risk of overfitting.
While feature construction enriches the dataset by creating new features, feature selection
simplifies the model by removing redundant or unimportant ones. These two processes work
together to improve both the predictive accuracy and interpretability of the model.
2.3. Model Construction and Evaluation
Ensemble learning algorithms are a powerful class of machine learning frameworks that make
final predictions by aggregating the outputs of multiple base learners, such as decision trees or
neural networks. The core principle behind this approach is akin to "brainstorming"—the idea
that combining the strengths of multiple models typically results in better generalization and
higher accuracy than relying on a single model. In the realm of customer churn prediction,
ensemble learning algorithms, particularly XGBoost, are highly regarded for their superior
performance and flexibility. This study employs the XGBoost ensemble learning algorithm to
model customer churn prediction, as illustrated in Figure 7 below.
Figure 7. Integrated Learning Neural Networks
XGBoost (Extreme Gradient Boosting) is a highly efficient and flexible gradient boosting library
designed for various tasks, including classification, regression, and ranking [4]. Built on the
gradient boosting framework, it enhances model performance by iteratively adding weak learners,
typically decision trees. XGBoost offers several improvements over traditional gradient boosting
algorithms, including:
• Second-Order Taylor Expansion of the Loss Function: Unlike conventional methods that
consider only the first derivative of the loss function (the gradient), XGBoost also
incorporates the second derivative (the Hessian matrix) at each iteration. This enables a
more precise approximation of the optimal solution, resulting in faster convergence and
improved model accuracy.
• Regularization Terms: To manage model complexity and prevent overfitting, XGBoost
includes regularization terms in the objective function. These terms account for the number
of leaf nodes in the tree and the L1 and L2 norms of the leaf node weights, contributing to
greater model stability and generalization.
• Parallel and Distributed Computing: XGBoost supports column sampling and parallel
processing, making it well-suited for handling large datasets efficiently. It also facilitates
distributed computing, allowing model training on large-scale systems.
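The second-order approximation and the regularization terms can be written out as follows. This follows the standard formulation of the XGBoost objective rather than anything stated in this paper; the symbols ($g_i$, $h_i$, $f_t$, $T$, $w_j$, $\gamma$, $\lambda$, $\alpha$, $I_j$) are introduced here for illustration.

```latex
% Objective at boosting round t, after a second-order Taylor expansion of the
% loss l around the previous prediction \hat{y}_i^{(t-1)} (constants dropped):
\mathcal{L}^{(t)} \approx \sum_{i=1}^{n} \left[ g_i f_t(x_i) + \tfrac{1}{2} h_i f_t(x_i)^2 \right] + \Omega(f_t),
\qquad
g_i = \frac{\partial\, l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \hat{y}_i^{(t-1)}},
\quad
h_i = \frac{\partial^2 l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \big(\hat{y}_i^{(t-1)}\big)^2}

% Regularization on a tree f with T leaves and leaf weights w_1, ..., w_T:
\Omega(f) = \gamma T + \tfrac{1}{2} \lambda \sum_{j=1}^{T} w_j^2 + \alpha \sum_{j=1}^{T} |w_j|

% With \alpha = 0, the optimal weight of leaf j, where I_j is the set of
% samples that fall into that leaf:
w_j^{*} = -\frac{G_j}{H_j + \lambda},
\qquad
G_j = \sum_{i \in I_j} g_i,
\quad
H_j = \sum_{i \in I_j} h_i
```

The closed-form leaf weight shows how the L2 penalty $\lambda$ shrinks leaf values toward zero, while $\gamma$ penalizes each additional leaf.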
In the context of customer churn prediction, XGBoost enhances model accuracy and stability
through several mechanisms:
• Feature Importance Assessment: XGBoost automatically evaluates feature importance,
helping to identify the factors most significantly impacting customer churn. This insight
enables business teams to better understand customer behaviour and develop targeted
marketing strategies.
• Automatic Handling of Missing Values: XGBoost can learn and manage missing values in
the training data without the need for manual preprocessing. This streamlines the data
cleaning process and reduces the chance that human error affects model performance.
• Overfitting Prevention: By incorporating regularization terms and employing techniques
such as early stopping, XGBoost effectively mitigates the risk of overfitting. This helps
the model perform well on both training and test datasets.
• Efficient Model Training: XGBoost employs various optimization strategies, including
caching mechanisms, feature pre-sorting, and parallel computing, to accelerate the model
training process. This allows for quicker completion of model training on large datasets.
• Flexible Model Tuning: With a wide range of parameter settings, XGBoost allows users to
adjust the model flexibly according to specific tasks and dataset characteristics. Fine-tuning
these parameters can further enhance model performance.
In conclusion, XGBoost is an advanced ensemble learning algorithm that demonstrates
exceptional performance and stability in predicting customer churn. Its efficient model training,
accurate predictive capabilities, and flexible parameter adjustment options make it an invaluable
tool for businesses aiming to forecast customer attrition.
The model is trained on 80% of the samples, and the remaining 20% are held out as a test
set to verify the model's learning ability.
To comprehensively evaluate the performance of the model, this paper uses several metrics
computed on the test set, including precision, recall, and F1 score [5]. These metrics are key measures of
model performance, helping to understand and evaluate the model's performance in different
aspects, so as to select the most appropriate model or adjust model parameters to optimize
performance.
• Precision: The proportion of samples predicted as positive that are actually positive. With
TP the number of positive samples predicted as positive (true positives) and FP the number
of negative samples predicted as positive (false positives),
$$\text{Precision} = \frac{TP}{TP + FP}$$
Precision reflects how reliable the model's positive predictions are. High precision
means that the majority of the samples predicted to be positive are indeed positive, but a
model tuned for precision can become too conservative and miss some samples that are
actually positive.
• Recall: The proportion of all positive samples that are correctly predicted by the model to be
positive. With FN the number of positive samples predicted as negative (false negatives),
$$\text{Recall} = \frac{TP}{TP + FN}$$
The recall rate reflects the ability of the model to find all positive samples. A high recall
rate means that the model can find most samples that are actually positive, but a model
tuned for recall may incorrectly predict some negative samples as positive.
• F1 Score: The harmonic mean of precision and recall, used for a comprehensive
evaluation of model performance,
$$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$
The F1 score is a single metric that balances the reliability of positive predictions against
their coverage. In scenarios where both precision and recall matter, the F1 score is
a good choice.
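The three metrics can be computed directly from the counts of true positives, false positives, and false negatives. The labels below are invented for illustration, with 1 standing for a churned customer.

```python
def classification_metrics(y_true, y_pred):
    """Return (precision, recall, f1) for binary labels where 1 is positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # churners missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical test labels: 4 churners among 10 customers.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
precision, recall, f1 = classification_metrics(y_true, y_pred)
```

Here two of four churners are caught (recall 0.5) and two of three alarms are correct (precision 2/3), giving an F1 score of 4/7.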
The test results are evaluated based on precision, recall, F1 score, and accuracy, as shown in the
following Figure 8, all of which are above 0.85, indicating good performance.
Figure 8. Test Results
The accuracy of the learner can be easily and intuitively assessed by examining the ROC curve [6],
which provides insights into the model's generalization performance. The ROC curve is plotted
with the True Positive Rate (TPR) on the vertical axis and the False Positive Rate (FPR) on the
horizontal axis for various threshold settings. TPR indicates the proportion of actual positive
samples correctly predicted by the model, while FPR represents the proportion of actual negative
samples incorrectly classified as positive. A ROC curve that approaches the upper left corner of
the plot (indicating high TPR and low FPR) signifies better classification performance.
The Area Under the Curve (AUC) quantifies the overall performance of the ROC curve and
serves as a key metric for evaluating the learner's quality. The AUC value ranges from 0 to 1,
with a value closer to 1 indicating superior predictive performance. An AUC of 0.5 suggests that
the model's performance is equivalent to random guessing, while an AUC below 0.5 implies that
the model ranks samples worse than random guessing, so its predictions tend to be inverted.
The AUC provides a standardized
metric for assessing model predictive power, allowing for objective comparisons between
different models. It considers the model's performance across all classification thresholds,
offering a comprehensive view of its predictive capabilities. Because TPR and FPR are each
computed within a single class, ROC curves and AUC values do not depend on the class ratio
itself, which makes them convenient for imbalanced datasets.
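One way to see what the AUC measures is its pairwise interpretation: it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes the AUC that way; the labels and scores are invented, and the all-pairs comparison is only practical for small samples.

```python
import numpy as np


def roc_auc(y_true, scores) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    # Compare every positive score with every negative score; ties count half.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, scores)

# Reversing the scores inverts the ranking, so the AUC becomes 1 - auc.
auc_inverted = roc_auc(y_true, [1.0 - s for s in scores])
```

Three of the four positive-negative pairs are ordered correctly, so the AUC is 0.75, and the inverted scores give 0.25.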
In conclusion, ROC curves and AUC values are essential tools for evaluating the predictive
power of classification models. They offer an intuitive graphical representation as well as a
quantitative assessment, enabling researchers and developers to evaluate model performance
comprehensively and objectively, thereby facilitating more informed decision-making. As
illustrated in Figure 9, the model achieved an AUC value of 0.913, demonstrating strong
predictive effectiveness and making it well-suited for related prediction tasks.
Figure 9. ROC Curve
Based on the performance feedback of the test sets, it is necessary to constantly adjust the model
structure and hyperparameters, as well as try different optimization methods. Through continuous
evaluation and optimization, the performance of the model can be gradually improved to better
adapt to the actual application scenario [7].
The following are common methods and strategies for hyperparameter tuning and model
optimization:
1) Hyperparameter Tuning Methods:
• Grid Search: Exhaustively tries all combinations of hyperparameters but can be
computationally expensive.
• Random Search: Randomly samples hyperparameters, often more efficient than Grid
Search for large parameter spaces.
• Bayesian Optimization: Uses historical performance to predict the next best
hyperparameters, often more efficient than Grid or Random Search.
• Hyperband: Based on Successive Halving, reduces computational costs by discarding
poor-performing hyperparameters early.
2) Optimization Methods:
• Learning Rate Scheduling: Adjusts the learning rate dynamically to improve convergence
and performance. Methods include Step Decay, Cosine Annealing, and Warm Restarts.
• Weight Regularization: Adds L2 or L1 regularization to prevent overfitting.
• Batch Normalization: Normalizes inputs at each layer to improve training speed and
generalization.
• Gradient Clipping: Limits gradient values to prevent gradient explosion.
3) Early Stopping: Prevents overfitting by stopping training early when validation performance
stops improving.
4) Loss Function Selection: Choose appropriate loss functions based on the task, like Cross-
Entropy Loss for classification or Mean Squared Error for regression.
5) Optimizer Selection: Adaptive optimizers like Adam, RMSprop, and Adagrad dynamically
adjust learning rates, suitable for various training scenarios.
6) Model Architecture Fine-tuning:
• Activation Function: Experiment with different activation functions (e.g., ReLU, Leaky
ReLU, ELU) to find the best for training speed and accuracy.
By applying these methods, model performance can be significantly improved, better fitting the
needs of real-world applications.
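Of the tuning methods listed above, random search is the simplest to sketch. The search space, the helper `random_search`, and the scoring function `toy_score` are hypothetical; in a real run the scoring function would train a model with the sampled parameters and return a validation metric such as AUC.

```python
import random

# Hypothetical search space over common gradient-boosting hyperparameters.
search_space = {
    "max_depth": [3, 4, 5, 6, 8],
    "learning_rate": [0.01, 0.05, 0.1, 0.2],
    "n_estimators": [100, 200, 400],
    "subsample": [0.6, 0.8, 1.0],
}


def random_search(score_fn, space, n_trials: int = 20, seed: int = 0):
    """Sample parameter sets at random and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Stand-in for a validation metric; peaks at depth 5 and rate 0.1."""
    return -abs(params["max_depth"] - 5) - abs(params["learning_rate"] - 0.1)


best_params, best_score = random_search(toy_score, search_space, n_trials=50)
```

Scoring should be done on a validation split or by cross-validation on the training data, so that the test set remains an unbiased check of the final model.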
3. RETENTION STRATEGY
Even with a prediction accuracy of 88%, banks may still lose customers, as evidenced by a recall
rate of 0.52. This means that the model identifies only about 52% of the customers who actually
churn, while the remaining 48% go undetected. Targeted retention strategies for at-risk customers
can be implemented as follows:
1) Early Identification of Potential Lost Customers
Using churn prediction models, banks can proactively identify customers at risk of leaving
by analyzing their behavior and transaction data. This allows banks to prioritize
communication and offer personalized services.
• Offer tailored incentives, such as customized loan rates or financial advice, to at-risk
customers.
• Establish proactive service plans to regularly engage high-risk customers and address their
needs.
2) Customer Segmentation and Targeted Retention Plans
Different customer segments have varying reasons for churn. By analyzing customer
characteristics, banks can implement more targeted retention strategies.
• For younger customers, introduce digital services like mobile payments to meet their
convenience needs.
• For high-net-worth individuals, offer advanced financial services and personalized wealth
management.
3) Improve Customer Satisfaction and Experience
The model can identify key factors affecting customer satisfaction and churn. Banks should
optimize services and processes based on these insights.
• Enhance customer service efficiency and response times.
• Use feedback to predict future needs and provide personalized product recommendations.
4) Timely Intervention
Continuously monitor customer behavior to detect churn risks and respond quickly.
• Set up automated alerts for abnormal customer behavior, prompting timely interventions.
• Utilize account managers to communicate with customers showing signs of leaving.
5) Optimize Marketing and Cross-Selling
Churn prediction models can help identify opportunities for cross-selling other financial
products.
• Create customized product packages for at-risk customers to encourage re-engagement.
• Launch attractive promotions based on insights from predictive models and customer data.
6) Increase Customer Lifetime Value (CLV)
By anticipating churn, banks can enhance customer lifecycle management and maximize
long-term revenue.
• Regularly assess customer CLV and implement retention strategies for high-value clients.
• Boost engagement through loyalty programs to enhance overall customer value.
By integrating churn prediction results with practical strategies, banks can better understand the
factors contributing to customer attrition and develop effective retention plans. This approach not
only reduces churn but also enhances customer satisfaction and loyalty, ultimately improving the
bank's performance.
4. CONCLUSIONS
This study analysed customer data from a specific bank using descriptive statistics and feature
importance analysis and built a customer churn prediction model based on the XGBoost
ensemble learning algorithm. The model helps analyse churn patterns, identify potential reasons
for churn, and enables bank staff to take timely action for customer retention, leading to more
precise marketing and improved efficiency [8].
However, the dataset is limited to one bank, with specific geographical, economic, and cultural
factors, which may affect the model's generalizability to other banks or customer groups.
Additionally, the use of static historical data without time-series information on customer
behaviour may limit the model's ability to capture dynamic behaviour patterns. To improve
prediction accuracy, future research could incorporate more external data sources (e.g., social
media, mobile payment, and market data) and use time-series data to capture behavioural
changes. Addressing class imbalance through techniques like SMOTE or weighted loss functions
could also enhance model performance.
Future work could explore deep learning approaches, which are better suited for time-series and
complex behavioural data, or use graph neural networks to incorporate customer relationships into
churn prediction. Improving model interpretability would also help bank staff better understand
and apply the model's insights, supporting the ongoing development of churn prediction models
in the banking sector.
ACKNOWLEDGEMENTS
I would like to extend my heartfelt gratitude to everyone who contributed to this study. Your
valuable insights and suggestions during our academic discussions have greatly inspired me and
expanded my perspective. Your assistance in data collection, experimental design, and paper
revision has been immensely helpful. This research would not have been possible without your
collaboration and support. I am deeply thankful to all who have offered their guidance and
assistance throughout this process.
REFERENCES
[1] Shi Danlei, Du Baojun. Prediction of bank customer churn based on BP neural network [J]. Science
and Technology Innovation, 2021 (27): 104-106.
[2] Zhao J. Research on key technologies of bank customer analysis management based on data mining
[D]. Zhejiang University, 2005.
[3] Shi Yixuan. Research on bank customer churn prediction based on data mining [D]. Inner Mongolia
University, 2022.
[4] Fu Lei. Bank customer churn early warning and model interpretability analysis [D]. Huazhong
Agricultural University, 2022.
[5] Zhang Wen, Zhang Lili. Prediction and analysis of bank customer churn based on GA-SVM [J].
Computer and Digital Engineering, 2010, 38 (04): 55-58.
[6] Chen Chenli. Research on the bank customer churn model based on data mining technology [D]. North
China Institute of Aerospace Industry, 2023. DOI: 10.27836/d.cnki.gbhht.2023.000085.
[7] Xie Bin. Bank N customer churn analysis and marketing strategy research based on big data mining
[D]. Zhejiang University of Technology, 2020.
[8] Shang Xinping, Wang Yi. Research on Bank Customer Churn Prediction Model based on Ensemble
Learning Algorithm. 13th International Conference on Artificial Intelligence and Machine Learning
(CAIML 2024), Toronto, Canada, 2024.
AUTHORS
Shang Xinping holds a master's degree and is a full-time teacher in the School of Artificial
Intelligence, Dongguan City University, with research interests in artificial intelligence and
machine learning. Shang is currently pursuing a doctorate in information technology at St. Paul
University Philippines and has published several high-quality research papers.

PDF
OPTIMIZING DATA INTEROPERABILITY IN AGILE ORGANIZATIONS: INTEGRATING NONAKA’S...
PDF
Architecting Intelligent Decentralized Data Systems to Enable Analytics with ...
PDF
A Xgboost Risk Model Via Feature Selection and Bayesian Hyper-Parameter Optim...
Performance Evaluation of SQL and NoSQL Database Management Systems in a Cluster
July 2025 - Top 10 Read Articles in International Journal of Database Managem...
INFLUENCE OF THE EVENT RATE ON DISCRIMINATION ABILITIES OF BANKRUPTCY PREDICT...
6th International Conference on Cloud, Big Data and IoT (CBIoT 2025)
Top NewSQL Databases and Features Classification
16th International Conference on Database Management Systems (DMS 2025)
NoSQL Implementation of a Conceptual Data Model : UML Class Diagram to a Docu...
In Search of Actionable Patterns of Lowest Cost - A Scalable Graph Method
11th International Conference on Data Mining (DaMi 2025)
Evaluation Criteria for Selecting NoSQL Databases in a Single Box Environment
Affinity Clusters for Business Process Intelligence
A Semantic Resource Based Approach for Star Schemas Matching
Database Systems Performance Evaluation for IoT Applications
Hybrid Encryption Algorithms for Medical Data Storage Security in Cloud Database
A Theoretical Exploration of Data Management and Integration in Organisation ...
An Infectious Disease Prediction Method Based on K-Nearest Neighbor Improved ...
ARCHITECTING INTELLIGENT DECENTRALIZED DATA SYSTEMS TO ENABLE ANALYTICS WITH ...
OPTIMIZING DATA INTEROPERABILITY IN AGILE ORGANIZATIONS: INTEGRATING NONAKA’S...
Architecting Intelligent Decentralized Data Systems to Enable Analytics with ...
A Xgboost Risk Model Via Feature Selection and Bayesian Hyper-Parameter Optim...
Ad

Recently uploaded (20)

PPT
Project quality management in manufacturing
PDF
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
PPTX
Artificial Intelligence
PDF
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
PPTX
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
PPT
Mechanical Engineering MATERIALS Selection
DOCX
573137875-Attendance-Management-System-original
PPTX
UNIT 4 Total Quality Management .pptx
PPTX
additive manufacturing of ss316l using mig welding
PPTX
Construction Project Organization Group 2.pptx
PPTX
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
PPT
introduction to datamining and warehousing
PPTX
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
PPTX
Safety Seminar civil to be ensured for safe working.
PPTX
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
PDF
Well-logging-methods_new................
PPTX
Geodesy 1.pptx...............................................
PDF
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
PDF
Unit I ESSENTIAL OF DIGITAL MARKETING.pdf
PDF
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
Project quality management in manufacturing
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
Artificial Intelligence
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
Mechanical Engineering MATERIALS Selection
573137875-Attendance-Management-System-original
UNIT 4 Total Quality Management .pptx
additive manufacturing of ss316l using mig welding
Construction Project Organization Group 2.pptx
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
introduction to datamining and warehousing
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
Safety Seminar civil to be ensured for safe working.
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
Well-logging-methods_new................
Geodesy 1.pptx...............................................
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
Unit I ESSENTIAL OF DIGITAL MARKETING.pdf
Mitigating Risks through Effective Management for Enhancing Organizational Pe...

Research on Integrated Learning Algorithm Model of Bank Customer Churn Prediction

International Journal of Database Management Systems (IJDMS) Vol.15, No.1/2/3/4/5, October 2024
DOI: 10.5121/ijdms.2024.16501
RESEARCH ON INTEGRATED LEARNING ALGORITHM MODEL OF BANK CUSTOMER CHURN PREDICTION
Shang Xinping, Wang Yi
Artificial Intelligence, Dongguan City University, Dongguan, Guangdong, China
ABSTRACT
With the rapid growth of Internet finance, competition within the banking industry has intensified significantly. To better understand customer needs and enhance customer loyalty, it has become crucial to develop a customer churn prediction model. Such a model enables banks to identify customers at risk of leaving, support data-driven business decisions, and implement strategies to retain valuable clients, thereby safeguarding the bank's interests. In this context, this paper presents a customer churn prediction model based on an ensemble learning algorithm. Experimental results demonstrate that the model effectively predicts and analyzes potential customer churn, providing valuable insights for retention efforts.
KEYWORDS
customer churn; data preprocessing; XGBoost
1. INTRODUCTION
With the ongoing expansion of financial markets and heightened competition, customer churn has emerged as a critical factor influencing banks' operational efficiency and remains a top concern for businesses [1]. To effectively reduce customer attrition, enhance satisfaction and loyalty, refine customer segmentation, attract potential clients, and improve service quality, banks must leverage advanced prediction models to identify customers at risk of leaving and boost their competitive advantage [2]. Customer churn, also known as customer attrition, refers to the process where a customer ends their relationship with a company or service provider. In the banking sector, this issue is especially significant, as it directly impacts revenue, profitability, and market share.
As market competition intensifies, banks are under growing pressure to retain existing customers and attract new ones. Ensemble learning, a robust machine learning approach, offers a solution by improving the overall accuracy and stability of predictive models through the combination of multiple algorithms, making it ideal for customer churn prediction in banks. Historically, customer churn prediction has progressed from traditional statistical methods to more advanced machine learning techniques. Early models, such as logistic regression, survival analysis, and decision trees, provided a basic understanding of churn but were often limited in their predictive capabilities. As machine learning evolved, more sophisticated algorithms like random forests, support vector machines (SVMs), and neural networks were introduced, offering better accuracy but often suffering from overfitting and instability issues. Ensemble learning, which integrates the predictions of multiple models, has emerged as a more reliable and accurate solution. Methods like Gradient Boosting Machines (GBM), Random Forest with feature
selection, and Stacking have shown marked improvements in predictive performance and stability. Among these, XGBoost (Extreme Gradient Boosting) stands out for its efficiency, scalability, and flexibility.
This research focuses on a bank's dataset, consisting of 14 variables and 10,000 samples. The study begins by analysing and pre-processing the data, which includes tasks like data cleaning, feature engineering, and feature selection. XGBoost, an ensemble learning algorithm, is then utilized to predict and model customer churn. The resulting prediction model enables banks to accurately forecast customer attrition, increase user engagement, improve retention strategies, and reduce the costs associated with retaining customers.
2. CONSTRUCTION OF BANK CUSTOMER CHURN FORECASTING MODEL
2.1. Data Exploration and Preprocessing
At this stage, it is crucial to systematically clean and transform the data for each feature to enhance the performance of the predictive models. This process involves several important steps:
Step 1: Handling Missing Values
Begin by thoroughly examining the dataset for any missing values, as these can weaken the predictive power of the model. Different strategies should be employed depending on the type of data (numeric or categorical). For numerical data, missing values can be filled with the mean, median, or mode. For categorical data, the most frequent category may be used as a replacement. However, if a feature has an excessive proportion of missing values (e.g., more than 50%), it may lose its significance due to the large amount of unknown information and should be considered for removal.
Step 2: Identifying and Removing Duplicate Values
Duplicate entries can occur due to data entry errors or accidental data merging, which can distort the true distribution and negatively affect the accuracy of the model.
Identifying and removing these duplicates using appropriate techniques ensures the dataset's integrity and reliability.
Step 3: Eliminating Irrelevant or Low-Variance Features
Certain features may have little relevance to the target variable or exhibit extremely low variance (e.g., nearly identical values across the dataset), adding unnecessary complexity to the model. These irrelevant or low-variance features should be identified through correlation analysis or variance testing and removed to enhance the model's efficiency and accuracy while reducing computational overhead.
Step 4: Detecting and Handling Outliers
Outliers can distort statistical models and reduce predictive accuracy. Statistical techniques, such as the interquartile range (IQR), combined with visual tools like box plots, can be used to detect these outliers. Depending on the context, outliers can either be removed or transformed (e.g., through logarithmic transformation or binning) to minimize their impact on the model. In some cases, data standardization or normalization is necessary to address the differences in scale between features.
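Steps 1, 2, and 4 above can be sketched in a few lines. The following is a minimal illustration only, written in plain Python so it runs without third-party packages; the helper names, the 1.5 x IQR fence, and the toy values are assumptions made for this sketch and are not taken from the paper's dataset or code.

```python
from statistics import median, quantiles


def impute_median(values):
    """Step 1: replace missing (None) entries with the median of observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def drop_duplicates(rows):
    """Step 2: keep the first occurrence of each row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def iqr_bounds(values, k=1.5):
    """Step 4: return (lower, upper) fences placed k * IQR beyond the quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Mark each value that falls outside the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    return [v < lower or v > upper for v in values]


# Toy data (invented for illustration): one missing age and one extreme age.
ages = [25, 31, None, 42, 38, 29, 35, 92]
ages_filled = impute_median(ages)
outlier_mask = flag_outliers(ages_filled)
rows = drop_duplicates([(1, "France"), (2, "Spain"), (1, "France")])
```

Whether a flagged value is dropped, capped, or transformed remains a judgment call that depends on the feature, as the text notes.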
Step 5: Balancing the Dataset
Data imbalance, where certain classes significantly outnumber others, is a common challenge in classification tasks. In this dataset, the churn distribution is imbalanced, with non-churn customers outnumbering churn customers by approximately 4:1 (as shown in Figure 1). This imbalance can cause the model to favor the majority class (non-churn customers). To address this, techniques like oversampling (e.g., SMOTE) or undersampling can be applied to adjust the class proportions, ensuring a more balanced dataset and improving model performance for minority classes.
Figure 1. Pie Chart of Loss Rate
Further analysis of the relationship between the target variable and the other variables:
Figure 2. Diagram 1 of the Relationship Between the Target Variable and Other Variables
The following observations can be drawn from Figure 2:
1) Germany has the fewest customers and France the most, but the ranking by proportion of churned customers is reversed. This suggests that the bank may not allocate enough customer service resources in areas with fewer customers.
2) Male customers outnumber female customers, but their churn rate is lower, suggesting that the bank's service strategy is not comprehensive enough.
3) Customers with credit cards churn more than customers without credit cards.
4) Inactive customers have a higher churn rate. Because the overall proportion of inactive customers is also quite high, banks should offer preferential policies that turn inactive customers into active ones in order to reduce churn.
Figure 3. Diagram 2 of the Relationship Between the Target Variable and Other Variables
In Figure 3, it can be seen that:
1) There is no significant difference in the distribution of credit scores between churn and non-churn customers.
2) Older customers churn more than younger ones, so banks need to adjust retention strategies for customers of different age groups.
3) In terms of tenure, clients at the extremes are more likely to churn.
4) The bank is losing customers with large balances, which may leave it short of funds to lend and compress its profit margin.
5) Product and salary have no significant effect on the likelihood of churn.
Step 6: Data Transformation and Normalization
Features with varying scales can impact machine learning algorithms differently. To ensure that all features contribute equally to the model's performance, it is essential to apply scaling through normalization or standardization. Normalization typically scales the data to a specific range (such as [0,1]), while standardization rescales the data to a mean of 0 and a variance of 1 (it does not by itself make the data normally distributed). These transformations help improve the model's convergence speed and predictive accuracy.
Once the data cleaning process is complete, the dataset should be reviewed again to confirm that all missing values, duplicates, and outliers have been appropriately handled. The final dataset should be well-prepared for model training, with balanced classes and carefully selected, relevant features.
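The two scaling options in Step 6 differ only in which statistics they use. The sketch below shows both on a toy column; the function names and the balance values are assumptions for illustration and are not drawn from the paper.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Normalization: map values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no scale information.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Standardization: subtract the mean and divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Toy account balances (invented values).
balances = [0.0, 50_000.0, 100_000.0, 150_000.0, 200_000.0]
balances_01 = min_max_scale(balances)
balances_z = standardize(balances)
```

In practice the scaling statistics would be computed on the training split only and then reused on the test split, so that no information from the test data leaks into training.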
2.2. Feature Construction and Selection
After completing data preprocessing, it is standard practice to examine the correlations between feature variables using a correlation coefficient matrix. Displaying this matrix as a heatmap provides a clear, visual representation of the strength of the correlations between different features. As shown in the heatmap in Figure 4, the correlations between the feature variables are relatively weak. This indicates that the features are largely independent of each other, making them suitable for inclusion in the model-building process without concerns about multicollinearity.
Figure 4. Relationships Among Features
Based on the Pearson correlation coefficient [3], Figure 5 shows the degree of correlation between customer churn and each dimension. Age has the greatest impact on customer churn. Geography also matters: the churn rate of users in Germany is significantly higher than that in other countries. In terms of gender, the churn rate of women is higher than that of men. The churn rate of active users is significantly lower than that of inactive users, which also indicates that active customers are more loyal than inactive customers.
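For reference, the Pearson coefficient used in this analysis can be computed directly from its definition. The sketch below is illustrative only; the `age` and `exited` lists are invented toy values, not the paper's data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy example: older customers in this made-up sample churn more often.
age = [22, 35, 47, 51, 63, 29]
exited = [0, 0, 1, 1, 1, 0]
r = pearson(age, exited)
```

Applying the same function to every pair of columns yields the correlation matrix that a heatmap such as Figure 4 visualizes.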
Figure 5. Relationship Between User Churn and Various Dimensions
However, a feature construction and selection process is still needed to optimize model performance. Feature construction involves generating new features from the original data to better capture its intrinsic characteristics, thereby enhancing the model's predictive performance or interpretability. This process requires a deep understanding of the business context, data analysis objectives, and domain expertise to manually create relevant features. By deriving new variables from the original ones, categorical variables with multiple classes can be combined into fewer categories, reducing the dimensionality and complexity of the dataset. Additionally, interactions between two or more features can be considered to create interaction features that contain extra information, potentially improving the model's performance.
Figure 6. The Influence of Balance to Wage Ratio on Attrition Rate
As shown in Figure 6, while estimated wages have minimal impact on customer churn, the balance-to-wage ratio significantly influences churn rates. Customers with higher balance-to-wage ratios exhibit a greater likelihood of churn, which could deter banks from lending to such individuals.
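A constructed feature such as the balance-to-wage ratio can be derived and inspected as sketched below. The field names `Balance`, `EstimatedSalary`, and `Exited`, the bucket edges, and the customer records are all hypothetical and chosen only to make the example self-contained.

```python
from bisect import bisect_right
from collections import defaultdict


def balance_salary_ratio(balance, salary):
    """Constructed feature: account balance relative to estimated salary."""
    return balance / salary if salary > 0 else 0.0


def churn_rate_by_ratio_bucket(customers, edges):
    """Group customers into ratio buckets and return the churn rate per bucket."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for c in customers:
        ratio = balance_salary_ratio(c["Balance"], c["EstimatedSalary"])
        bucket = bisect_right(edges, ratio)  # 0 = below the first edge
        totals[bucket] += 1
        churned[bucket] += c["Exited"]
    return {b: churned[b] / totals[b] for b in sorted(totals)}


# Invented records for illustration only.
customers = [
    {"Balance": 0.0, "EstimatedSalary": 80_000.0, "Exited": 0},
    {"Balance": 60_000.0, "EstimatedSalary": 90_000.0, "Exited": 0},
    {"Balance": 120_000.0, "EstimatedSalary": 100_000.0, "Exited": 0},
    {"Balance": 150_000.0, "EstimatedSalary": 100_000.0, "Exited": 1},
    {"Balance": 200_000.0, "EstimatedSalary": 50_000.0, "Exited": 1},
]
rates = churn_rate_by_ratio_bucket(customers, edges=[1.0, 2.0])
```

A per-bucket churn rate of this kind is the quantity a chart like Figure 6 would plot against the ratio.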
Feature selection involves choosing the most relevant subset of features from the original dataset that are most effective for predicting the target variable. This process helps reduce model complexity, enhance the model's generalization ability, and lower the risk of overfitting. While feature construction enriches the dataset by creating new features, feature selection simplifies the model by removing redundant or unimportant ones. These two processes work together to improve both the predictive accuracy and interpretability of the model.
2.3. Model Construction and Evaluation
Ensemble learning algorithms are a powerful class of machine learning frameworks that make final predictions by aggregating the outputs of multiple base learners, such as decision trees or neural networks. The core principle behind this approach is akin to "brainstorming": combining the strengths of multiple models typically results in better generalization and higher accuracy than relying on a single model. In the realm of customer churn prediction, ensemble learning algorithms, particularly XGBoost, are highly regarded for their superior performance and flexibility. This study employs the XGBoost ensemble learning algorithm to model customer churn prediction, as illustrated in Figure 7 below.
Figure 7. Integrated Learning Neural Networks
XGBoost (Extreme Gradient Boosting) is a highly efficient and flexible gradient boosting library designed for various tasks, including classification, regression, and ranking [4]. Built on the gradient boosting framework, it enhances model performance by iteratively adding weak learners, typically decision trees.
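The iterative scheme can be written compactly. The notation below follows the commonly published XGBoost formulation rather than anything stated in this paper, so the symbols are assumptions: $f_t$ is the tree added at round $t$, $\eta$ the learning rate, $l$ the loss, $T$ the number of leaves, $w_j$ the leaf weights, $\gamma$, $\lambda$, $\alpha$ the regularization strengths, and $g_i$, $h_i$ the first and second derivatives of the loss with respect to the previous prediction.

```latex
% Additive update: each round adds one tree to the running prediction
\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + \eta\, f_t(x_i)

% Regularized objective minimized when choosing the tree f_t
\mathcal{L}^{(t)} = \sum_{i=1}^{n} l\!\left(y_i,\ \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\lambda \sum_{j=1}^{T} w_j^{2} + \alpha \sum_{j=1}^{T} \lvert w_j \rvert

% Second-order Taylor approximation around the previous prediction
\mathcal{L}^{(t)} \approx \sum_{i=1}^{n} \left[ l\!\left(y_i, \hat{y}_i^{(t-1)}\right) + g_i\, f_t(x_i) + \tfrac{1}{2} h_i\, f_t(x_i)^{2} \right] + \Omega(f_t),
\qquad
g_i = \frac{\partial\, l\!\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial\, \hat{y}_i^{(t-1)}},\quad
h_i = \frac{\partial^{2} l\!\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial \left(\hat{y}_i^{(t-1)}\right)^{2}}
```

The first term inside the bracket is constant with respect to $f_t$, so only the $g_i$ and $h_i$ terms and the penalty $\Omega$ influence which tree is chosen at each round.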
XGBoost offers several improvements over traditional gradient boosting algorithms, including:
• Second-Order Taylor Expansion of the Loss Function: Unlike conventional methods that consider only the first derivative of the loss function (the gradient), XGBoost also incorporates the second derivative (the Hessian matrix) at each iteration. This enables a more precise approximation of the optimal solution, resulting in faster convergence and improved model accuracy.
• Regularization Terms: To manage model complexity and prevent overfitting, XGBoost includes regularization terms in the objective function. These terms account for the number of leaf nodes in the tree and the L1 and L2 norms of the leaf node weights, contributing to greater model stability and generalization.
• Parallel and Distributed Computing: XGBoost supports column sampling and parallel processing, making it well-suited for handling large datasets efficiently. It also facilitates distributed computing, allowing model training on large-scale systems.
In the context of customer churn prediction, XGBoost enhances model accuracy and stability through several mechanisms:
• Feature Importance Assessment: XGBoost automatically evaluates feature importance, helping to identify the factors most significantly impacting customer churn. This insight enables business teams to better understand customer behaviour and develop targeted marketing strategies.
• Automatic Handling of Missing Values: XGBoost can learn how to route missing values in the training data without the need for manual preprocessing. This streamlines the data cleaning process and reduces the risk that human error during preprocessing degrades model performance.
• Overfitting Prevention: By incorporating regularization terms and employing techniques such as early stopping, XGBoost effectively mitigates the risk of overfitting. This helps the model perform well on both training and test datasets.
• Efficient Model Training: XGBoost employs various optimization strategies, including caching mechanisms, feature pre-sorting, and parallel computing, to accelerate the model training process. This allows for quicker completion of model training on large datasets.
• Flexible Model Tuning: With a wide range of parameter settings, XGBoost allows users to adjust the model flexibly according to specific tasks and dataset characteristics. Fine-tuning these parameters can further enhance model performance.
In conclusion, XGBoost is an advanced ensemble learning algorithm that demonstrates exceptional performance and stability in predicting customer churn.
Its efficient model training, accurate predictive capabilities, and flexible parameter adjustment options make it an invaluable tool for businesses aiming to forecast customer attrition.
The model is trained on 80% of the samples, and the remaining 20% are held out as a test set to verify how well the model has learned. To comprehensively evaluate the performance of the model, this paper uses several metrics computed on the test set, including accuracy, precision, recall, and F1 score [5]. These metrics are key measures of model performance, helping to understand and evaluate the model from different aspects, so as to select the most appropriate model or adjust model parameters to optimize performance.
• Precision: The proportion of samples predicted by the model to be positive that are actually positive. With TP denoting positive samples predicted as positive and FP denoting negative samples predicted as positive, i.e.
$$\text{Precision} = \frac{TP}{TP + FP}$$
Precision reflects how reliable the model's positive predictions are. High precision means that the majority of the samples predicted to be positive are indeed positive, but optimizing for precision alone can make the model too conservative, so that it misses some samples that are actually positive.
• Recall: The proportion of all positive samples that are correctly predicted by the model to be positive. With TP denoting positive samples predicted as positive and FN denoting positive samples predicted as negative, i.e.
$$\text{Recall} = \frac{TP}{TP + FN}$$
Recall reflects the ability of the model to find all positive samples. High recall means that the model finds most of the samples that are actually positive, but optimizing for recall alone can cause the model to incorrectly predict some negative samples as positive.
• F1 Score: The harmonic mean of precision and recall, used for a comprehensive evaluation of model performance, i.e.
$$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$
The F1 score is a single metric that considers both the precision and the recall of the model. In scenarios where both precision and recall matter, the F1 score is a good choice.
The test results are evaluated based on precision, recall, F1 score, and accuracy, as shown in Figure 8; all are above 0.85, indicating good performance.
Figure 8. Test Results
The quality of the learner can be assessed intuitively by examining the ROC curve [6], which provides insights into the model's generalization performance. The ROC curve is plotted with the True Positive Rate (TPR) on the vertical axis and the False Positive Rate (FPR) on the horizontal axis for various threshold settings. TPR indicates the proportion of actual positive samples correctly predicted by the model, while FPR represents the proportion of actual negative samples incorrectly classified as positive. A ROC curve that approaches the upper left corner of the plot (indicating high TPR and low FPR) signifies better classification performance. The Area Under the Curve (AUC) quantifies the overall performance of the ROC curve and serves as a key metric for evaluating the learner's quality. The AUC value ranges from 0 to 1, with a value closer to 1 indicating superior predictive performance.
An AUC of 0.5 suggests that the model's performance is equivalent to random guessing, while an AUC below 0.5 implies that the model ranks samples worse than random, meaning its predictions tend to be inverted. The AUC provides a standardized metric for assessing model predictive power, allowing for objective comparisons between different models. It considers the model's performance across all classification thresholds, offering a comprehensive view of its predictive capabilities. Because TPR and FPR are each computed within a single class, ROC curves and AUC values are relatively insensitive to class imbalance.
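The evaluation metrics discussed above can be computed from first principles, as in the following sketch. It is illustrative only: the labels and scores are invented, the 0.5 decision threshold is an assumption, and the AUC is computed through its rank interpretation (the probability that a randomly chosen positive is scored above a randomly chosen negative) rather than by integrating a plotted curve.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and their harmonic mean (F1)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented labels (1 = churned) and model scores.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Note that precision, recall, and F1 depend on the chosen threshold, whereas the AUC is computed from the scores alone and therefore summarizes performance across all thresholds.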
In conclusion, ROC curves and AUC values are essential tools for evaluating the predictive power of classification models. They offer an intuitive graphical representation as well as a quantitative assessment, enabling researchers and developers to evaluate model performance comprehensively and objectively, thereby facilitating more informed decision-making.
As illustrated in Figure 9, the model achieved an AUC value of 0.913, demonstrating strong predictive effectiveness and making it well-suited for related prediction tasks.
Figure 9. ROC Curve
Based on the performance feedback of the test sets, it is necessary to constantly adjust the model structure and hyperparameters, as well as try different optimization methods. Through continuous evaluation and optimization, the performance of the model can be gradually improved to better adapt to the actual application scenario [7]. The following are common methods and strategies for hyperparameter tuning and model optimization:
1) Hyperparameter Tuning Methods:
• Grid Search: Exhaustively tries all combinations of hyperparameters but can be computationally expensive.
• Random Search: Randomly samples hyperparameters, often more efficient than Grid Search for large parameter spaces.
• Bayesian Optimization: Uses historical performance to predict the next best hyperparameters, often more efficient than Grid or Random Search.
• Hyperband: Based on Successive Halving, reduces computational costs by discarding poor-performing hyperparameters early.
2) Optimization Methods:
• Learning Rate Scheduling: Adjusts the learning rate dynamically to improve convergence and performance. Methods include Step Decay, Cosine Annealing, and Warm Restarts.
• Weight Regularization: Adds L2 or L1 regularization to prevent overfitting.
• Batch Normalization: Normalizes inputs at each layer to improve training speed and generalization.
• Gradient Clipping: Limits gradient values to prevent gradient explosion.
3) Early Stopping: Prevents overfitting by stopping training early when validation performance stops improving.
4) Loss Function Selection: Choose appropriate loss functions based on the task, like Cross-Entropy Loss for classification or Mean Squared Error for regression.
5) Optimizer Selection: Adaptive optimizers like Adam, RMSprop, and Adagrad dynamically adjust learning rates, suitable for various training scenarios.
6) Model Architecture Fine-tuning:
• Activation Function: Experiment with different activation functions (e.g., ReLU, Leaky ReLU, ELU) to find the best for training speed and accuracy.
By applying these methods, model performance can be significantly improved, better fitting the needs of real-world applications.
3. RETENTION STRATEGY
Even with a prediction accuracy of 88%, banks may still lose customers, as evidenced by a recall of 0.52: the model flags only about 52% of the customers who actually churn, and the rest go undetected. Targeted retention strategies are therefore needed, and can be implemented as follows:
1) Early Identification of Potential Lost Customers
Using churn prediction models, banks can proactively identify customers at risk of leaving by analyzing their behavior and transaction data. This allows banks to prioritize communication and offer personalized services.
• Offer tailored incentives, such as customized loan rates or financial advice, to at-risk customers.
• Establish proactive service plans to regularly engage high-risk customers and address their needs.
2) Customer Segmentation and Targeted Retention Plans
Different customer segments have varying reasons for churn. By analyzing customer characteristics, banks can implement more targeted retention strategies.
• For younger customers, introduce digital services like mobile payments to meet their convenience needs.
• For high-net-worth individuals, offer advanced financial services and personalized wealth management.
3) Improve Customer Satisfaction and Experience
The model can identify key factors affecting customer satisfaction and churn. Banks should optimize services and processes based on these insights.
• Enhance customer service efficiency and response times.
• Use feedback to predict future needs and provide personalized product recommendations.
4) Timely Intervention
Continuously monitor customer behavior to detect churn risks and respond quickly.
• Set up automated alerts for abnormal customer behavior, prompting timely interventions.
• Utilize account managers to communicate with customers showing signs of leaving.
5) Optimize Marketing and Cross-Selling
Churn prediction models can help identify opportunities for cross-selling other financial products.
• Create customized product packages for at-risk customers to encourage re-engagement.
• Launch attractive promotions based on insights from predictive models and customer data.
6) Increase Customer Lifetime Value (CLV)
By anticipating churn, banks can enhance customer lifecycle management and maximize long-term revenue.
• Regularly assess customer CLV and implement retention strategies for high-value clients.
• Boost engagement through loyalty programs to enhance overall customer value.
By integrating churn prediction results with practical strategies, banks can better understand the factors contributing to customer attrition and develop effective retention plans. This approach not only reduces churn but also enhances customer satisfaction and loyalty, ultimately improving the bank's performance.
4. CONCLUSIONS
This study analysed customer data from a specific bank using descriptive statistics and feature importance analysis and built a customer churn prediction model based on the XGBoost ensemble learning algorithm. The model helps analyse churn patterns, identify potential reasons for churn, and enables bank staff to take timely action for customer retention, leading to more precise marketing and improved efficiency [8]. However, the dataset is limited to one bank, with specific geographical, economic, and cultural factors, which may affect the model's generalizability to other banks or customer groups.
Additionally, the use of static historical data without time-series information on customer behavior may limit the model's ability to capture dynamic behavior patterns. To improve prediction accuracy, future research could incorporate more external data sources (e.g., social media, mobile payment, and market data) and use time-series data to capture behavioral changes. Addressing class imbalance through techniques like SMOTE or weighted loss functions could also enhance model performance. Future work could explore deep learning approaches, which are better suited to time-series and complex behavioral data, or use graph neural networks to incorporate customer relationships into churn prediction. Improving model interpretability would also help bank staff better understand and apply the model's insights, supporting the ongoing development of churn prediction models in the banking sector.

ACKNOWLEDGEMENTS
I would like to extend my heartfelt gratitude to everyone who contributed to this study. Your valuable insights and suggestions during our academic discussions have greatly inspired me and expanded my perspective. Your assistance in data collection, experimental design, and paper revision has been immensely helpful. This research would not have been possible without your collaboration and support. I am deeply thankful to all who have offered their guidance and assistance throughout this process.

REFERENCES
[1] Shi Danlei, Du Baojun. Prediction of bank customer churn based on BP neural network [J]. Science and Technology Innovation, 2021(27): 104-106.
[2] Zhao J. Research on key technologies of bank customer analysis management based on data mining [D]. Zhejiang University, 2005.
[3] Shi Yixuan. Research on bank customer churn prediction based on data mining [D]. Inner Mongolia University, 2022.
[4] Fu Lei. Bank customer churn early warning and model interpretability analysis [D]. Huazhong Agricultural University, 2022.
[5] Zhang Wen, Zhang Lili. Prediction and analysis of bank customer churn based on GA-SVM [J]. Computer and Digital Engineering, 2010, 38(04): 55-58.
[6] Chen Chenli. Research on bank customer churn model based on data mining technology [D]. North China Institute of Aerospace Industry, 2023. DOI: 10.27836/d.cnki.gbhht.2023.000085.
[7] Xie Bin. Bank N customer churn analysis and marketing strategy research based on big data mining [D]. Zhejiang University of Technology, 2020.
[8] Shang Xinping, Wang Yi. Research on Bank Customer Churn Prediction Model based on Ensemble Learning Algorithm. 13th International Conference on Artificial Intelligence and Machine Learning (CAIML 2024), Toronto, Canada, 2024.

AUTHORS
Shang Xinping holds a master's degree and is a full-time teacher in the School of Artificial Intelligence, Dongguan City University, with research interests in artificial intelligence and machine learning. Shang is currently pursuing a Doctor in Information Technology degree at St. Paul University Philippines and has published several high-quality research papers.