Bank Marketing Project
Group 7:
Zhaodi Liu
Preete Dixit
Nandini Naik
Rashmi Nadubeedi Ramesh
Pravin Kumar Prem Kumar
Agenda
• Project Motivation
• Data Description
• Our BI Models
• Experimental Results
• Association Rule Mining
• Managerial Implications
• Challenges
• Conclusion
Project Motivation
• Direct marketing targets customers directly
with a personalized message, as opposed to
mass marketing
• The primary benefits to businesses:
– Increased lead generation
– Increased sales volume
– Increased customer base
– Minimized losses
• Focus on generating more "qualified" leads
Impact of Data Mining
• Data mining can be very effective for direct marketing
• Sophisticated algorithms generate rules,
determine the most useful attributes,
and predict future outcomes
• Our goal is to predict the probability of a
client subscribing to the term deposit
• This serves to:
– Boost sales to existing customers
– Increase customer loyalty
– Recapture old customers and generate new
business
Data Description
• The data relates to the direct marketing
campaigns (phone calls) of a Portuguese
banking institution
Data Set Characteristics: Multivariate
Attribute Characteristics: Real
Number of Instances: 45211
Number of Attributes: 17
Area: Business
Date Donated: 2012-02-14
BANK CLIENT DATA
Serial No Name Description Data Type
1. age Client’s age numeric
2. job type of job categorical
3. marital marital status categorical
4. education level of education categorical
5. default has credit in default? binary
6. balance average yearly balance in euros numeric
7. housing has housing loan? binary
8. loan has personal loan? binary
DATA RELATED WITH THE LAST CONTACT OF THE CURRENT CAMPAIGN
Serial No Name Description Data Type
9. contact contact communication type categorical
10. day last contact day of the month numeric
11. month last contact month of year categorical
12. duration last contact duration, in seconds numeric
13. campaign number of contacts performed during this campaign and for this client numeric
14. pdays number of days that passed after the client was last contacted from a previous campaign numeric
15. previous number of contacts performed before this campaign and for this client numeric
16. poutcome outcome of the previous marketing campaign categorical
OUTPUT VARIABLE (DESIRED TARGET)
Serial No Name Description Data Type
17. y has the client subscribed to a term deposit? binary
BI Model - With Target Profile
Profit/Loss Matrix
Decision Tree
Business Intelligence Using SAS Final Presentation
Regression
Regression Node:
• Categorical values are converted to interval values using dummy
variables.
• Logically related categories are grouped to reduce the number of
independent variables in the regression equation.
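The two preparation steps above can be sketched in plain Python. This is only an illustration of the idea, not the SAS node's implementation: the `JOB_GROUPS` mapping, the group names, and the sample job values are hypothetical, since the slides do not list the groups that were actually used.

```python
# Hypothetical grouping of job categories (the slides do not give the real groups).
JOB_GROUPS = {
    "management": "white_collar",
    "admin.": "white_collar",
    "blue-collar": "manual",
    "housemaid": "manual",
    "retired": "not_working",
    "unemployed": "not_working",
}


def group_category(value, groups, other="other"):
    """Map a raw category to its group; unlisted values fall into `other`."""
    return groups.get(value, other)


def dummy_encode(values, reference=None):
    """Return (column_names, rows) of 0/1 indicators.

    One level is dropped as the reference level, so k levels
    produce k - 1 dummy variables.
    """
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]
    kept = [level for level in levels if level != reference]
    rows = [[1 if value == level else 0 for level in kept] for value in values]
    return kept, rows


jobs = ["management", "blue-collar", "retired", "student", "admin."]
grouped = [group_category(job, JOB_GROUPS) for job in jobs]
columns, encoded = dummy_encode(grouped, reference="other")
```

Grouping first means the regression sees three dummy variables here instead of one per raw job title.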
Neural Network
• The model converged after 70 iterations and contains 124 weights.
Neural Network with Input Selection
• Reducing the number of modelling inputs reduces the number of modelling
weights as well as computational costs and possibly improves the model
performance.
• The useful inputs are selected by connecting the Neural Network Node to
the Regression Node.
• The model converged after 42 iterations and contains 46 weights.
BI Model - Without Target Profile
Decision Tree
Regression
Neural Network
• The model converged after 70 iterations and contains 124 weights.
Neural Network with Input Selection
• The model converged after 76 iterations and contains 85 weights.
Model Assessment and Scoring results
with Target Profile Model
– The performance of the four models is compared based on the average
profit using the Model Comparison node.
Fit Statistics
ROC Plots
Confusion Matrix
Scoring
– Scoring applies the model deemed best by the Model Comparison node
to predict the outcome for a new case/observation
whose outcome is unknown.
Replaced Variables:
– Job – management
– Education – Secondary
– Contact – Cellular
Rejected Variables:
– Poutcome
– Target y
Scoring
Actual Data Scores:
– Percentage No = 88.476%
– Percentage Yes = 11.524%
There is a slight difference of 1.725% between the prediction model
outcome and the actual outcome.
Model Assessment and Scoring results
without Target Profile Model
– The performance of the four models is compared based on the
misclassification rate using the Model Comparison node.
Fit Statistics
ROC Plots
Confusion Matrix
Scoring
Actual Data Scores:
– Percentage No = 88.476%
– Percentage Yes = 11.524%
There is a slight difference of 1.725% between the prediction model
outcome and the actual outcome.
Association Rule Mining
Data Pre-processing
• Default: D(Yes), D(No)
• Housing: H(Yes), H(No)
• Personal Loan: PL(Yes), PL(No)
• Age: 20-40, 40-60, 60-90, and 90-100
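A minimal sketch of how one client record could be turned into the item labels listed above before rule mining. The function names, the `Age(...)` label format, and the handling of band boundaries (each band includes its lower bound and excludes its upper bound, except that 100 is included in the last band) are assumptions; the slide does not say how the shared boundaries 40, 60, and 90 were assigned.

```python
AGE_BANDS = [(20, 40), (40, 60), (60, 90), (90, 100)]  # bands from the slide


def age_band(age):
    """Return the age-band item for an age.

    Assumption: a boundary age belongs to the higher band (40 -> 40-60).
    Ages outside 20-100 are labelled Age(other).
    """
    for low, high in AGE_BANDS:
        if low <= age < high or (high == 100 and age == 100):
            return f"Age({low}-{high})"
    return "Age(other)"


def yes_no(flag):
    """Normalise a yes/no field to the Yes/No spelling used in the item labels."""
    return "Yes" if str(flag).lower() == "yes" else "No"


def to_items(record):
    """Convert one client record into the set of items used for rule mining."""
    return {
        f"D({yes_no(record['default'])})",
        f"H({yes_no(record['housing'])})",
        f"PL({yes_no(record['loan'])})",
        age_band(record["age"]),
    }


example = to_items({"age": 35, "default": "no", "housing": "yes", "loan": "no"})
```

Each record becomes a small "basket" of items, which is the input shape that association rule algorithms expect.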
Results & Interpretation
Managerial Implications
if
pdays < 19.5 or MISSING
AND month IS ONE OF: MAY, JUN, JUL, AUG, NOV, JAN or MISSING
AND duration < 348.5 or MISSING
AND age < 60.5 or MISSING
then
Predicted: y=YES = 0.02
Predicted: y=NO = 0.98
If 100 customers fall under this rule and calling one customer costs $12, the bank
saves $1,200 simply by not contacting this set of customers.
if
pdays < 19.5 or MISSING
AND month IS ONE OF: FEB
AND housing IS ONE OF: NO
AND duration < 466.5 or MISSING
AND day < 20.5 AND day >= 9.5
AND age < 60.5 or MISSING
then
Predicted: y=YES = 0.75
Predicted: y=NO= 0.25
If 100 customers fall under this rule, calling one customer costs $12, and the profit per subscription is $100, the bank
could generate revenue of up to $10,000 (100 × $100).
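The two decision rules and the arithmetic behind the $1,200 and $10,000 figures can be sketched as follows. The function names and the dictionary layout of a customer record are assumptions made for illustration. `expected_revenue` is an addition that is not on the slides: it weights the revenue by the rule's predicted probability of y=YES.

```python
CALL_COST = 12          # dollars per call, from the slide
PROFIT_PER_SALE = 100   # dollars per subscription, from the slide


def below(value, threshold):
    """Mirror '< threshold or MISSING': a missing value (None) passes the test."""
    return value is None or value < threshold


def low_response_rule(c):
    """First rule: predicted y=YES = 0.02."""
    return (below(c.get("pdays"), 19.5)
            and c.get("month") in {"MAY", "JUN", "JUL", "AUG", "NOV", "JAN", None}
            and below(c.get("duration"), 348.5)
            and below(c.get("age"), 60.5))


def high_response_rule(c):
    """Second rule: predicted y=YES = 0.75."""
    day = c.get("day")
    return (below(c.get("pdays"), 19.5)
            and c.get("month") == "FEB"
            and c.get("housing") == "NO"
            and below(c.get("duration"), 466.5)
            and day is not None and 9.5 <= day < 20.5
            and below(c.get("age"), 60.5))


def avoided_call_cost(n_customers):
    """Cost saved by not calling n customers."""
    return n_customers * CALL_COST


def max_revenue(n_customers):
    """Revenue if every contacted customer subscribes."""
    return n_customers * PROFIT_PER_SALE


def expected_revenue(n_customers, p_yes):
    """Revenue weighted by the predicted subscription probability (not on the slides)."""
    return n_customers * p_yes * PROFIT_PER_SALE


savings = avoided_call_cost(100)             # 100 x $12
best_case = max_revenue(100)                 # 100 x $100
expected = expected_revenue(100, 0.75)       # 100 x 0.75 x $100
```

Under the rule's own 0.75 probability the expected revenue is lower than the best case, and neither figure subtracts the cost of making the calls.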
Decision Tree Model – Our best-fit model
for maximizing profits
Decision Tree confusion matrix (counts)
                   Predicted Positive   Predicted Negative
Actual Positive    1314                 1330
Actual Negative    863                  19098

Decision Tree profit/loss matrix
                   Predicted Positive   Predicted Negative
Actual Positive    $15,768              $0
Actual Negative    -$10,356             $0
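The dollar figures follow from the counts if each true positive is worth $12 and each false positive costs $12 (15,768 / 1,314 = 12 and 10,356 / 863 = 12). Those per-case values are inferred from the two matrices rather than stated on the slide, so treat them as an assumption. A short sketch that reproduces the matrix and derives the net profit and misclassification rate:

```python
# Keys are (actual, predicted); counts are taken from the slide.
confusion = {
    ("positive", "positive"): 1314,    # true positives
    ("positive", "negative"): 1330,    # false negatives
    ("negative", "positive"): 863,     # false positives
    ("negative", "negative"): 19098,   # true negatives
}

# Per-case dollar values inferred from the slide's totals (an assumption).
unit_value = {
    ("positive", "positive"): 12,
    ("positive", "negative"): 0,
    ("negative", "positive"): -12,
    ("negative", "negative"): 0,
}

# Dollar value of each cell, then the net over all cells.
profit = {cell: confusion[cell] * unit_value[cell] for cell in confusion}
net_profit = sum(profit.values())

# Misclassification rate = (false negatives + false positives) / all cases.
total_cases = sum(confusion.values())
errors = confusion[("positive", "negative")] + confusion[("negative", "positive")]
misclassification_rate = errors / total_cases
```

The net of the two non-zero cells is $5,412, and the misclassification rate computed from these counts is about 0.097.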
Regression Model
• If pdays increases by 1 unit, it has practically
no impact on the odds of not
subscribing to the term deposit, because the odds ratio is approximately 1:
exp(β_pdays) ≈ 1, where β_pdays is the estimated pdays coefficient
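In logistic regression, the effect of a one-unit increase in an input on the odds is the exponential of its coefficient. A coefficient close to zero therefore gives an odds ratio close to 1, which means almost no change in the odds. The coefficient value below is hypothetical and chosen only to illustrate this; it is not the estimate from the deck.

```python
import math


def odds_ratio(coefficient, delta=1.0):
    """Multiplicative change in the odds when the input increases by `delta`."""
    return math.exp(coefficient * delta)


# Hypothetical near-zero coefficient for pdays (illustration only).
beta_pdays = 0.0001

one_day = odds_ratio(beta_pdays)            # effect of 1 extra day
thirty_days = odds_ratio(beta_pdays, 30)    # effect of 30 extra days
```

Even over 30 days, a coefficient this small changes the odds by well under one percent.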
Challenges
• Implementing the Profit/Loss matrix
• Absence of the ROC plot in the results of
Model Assessment
• Non-convergence of the Neural Network
• Scoring
Conclusion
• Successfully implemented 2 predictive analysis
models to predict the outcome of term deposit
subscription
• Decision Tree is the best-fit model based on Profit/Loss
• Decision Tree is the best-fit model based on
Misclassification Rate
• Using the Decision rules
– Results in a saving of $1200
– Generates a revenue of $10,000
• Using Profit/Loss Matrix
– Profit of $15,768 from true positives
– Loss of $10,356 from false positives
Q&A


Editor's Notes

  • #9: The Replacement Node was used to replace the unknown and other values by “Missing” values in order to improve the performance of the predictive models. The Impute Node was used to replace the missing values (unknown values replaced by missing in the above step) for class variables by the most frequently occurring value. The Data Partition node was used to split the dataset into 50% Training Dataset and 50% Validation Dataset.
  • #11: The optimal tree has 34 leaves, whereas the maximal tree has 57 leaves.
  • #12: The optimal tree has 34 leaves, whereas the maximal tree has 57 leaves.
  • #17: The model converged after 70 iterations and contains 124 weights. The iteration plot is displayed below. As per the plot, the model obtained after 47 iterations has the highest average profit.
  • #20: The model obtained after 11 iterations has the highest average profit.
  • #22: Maximum Branch: at 3 and above, the average square error and misclassification rate increase. Maximum Depth: at 9 and above, the average square error and misclassification rate are constant, whereas below 9 they increase. Leaf Size: the least average square error and misclassification rate are obtained with a value of 2. The optimal tree has 34 leaves, whereas the maximal tree has 57 leaves. Misclassification rate: train 0.1006, validation 0.102. Misclassification rate: train 0.0941, validation 0.097014.
  • #23: The misclassification rate for the training data is lowest for the model trained in step 12, so that model is selected. Iteration plot: minimum validation misclassification rate = 0.105042.
  • #24: The model converged after 70 iterations and contains 124 weights. The model converged after 76 iterations and contains 85 weights. The model obtained after 44 iterations has the minimum validation misclassification rate, 0.100863.
  • #25: The model obtained after 44 iterations has the minimum misclassification rate.
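
The preparation steps described in note #9 (replace unknown values with missing, impute class variables with the most frequent value, split 50/50) can be sketched in plain Python. This is a rough analogue of the Replacement, Impute, and Data Partition nodes, not their actual behaviour. The function names, the `"unknown"` label, the sample rows, and the fixed random seed are assumptions.

```python
import random
from collections import Counter


def replace_unknown(rows, column, unknown_values=("unknown",)):
    """Replace placeholder labels in a column with None (missing)."""
    return [{**row, column: None if row[column] in unknown_values else row[column]}
            for row in rows]


def impute_mode(rows, column):
    """Fill missing values in a class column with its most frequent value."""
    counts = Counter(row[column] for row in rows if row[column] is not None)
    mode = counts.most_common(1)[0][0]
    return [{**row, column: mode if row[column] is None else row[column]}
            for row in rows]


def split_half(rows, seed=0):
    """Shuffle and split the rows into 50% training and 50% validation."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    middle = len(shuffled) // 2
    return shuffled[:middle], shuffled[middle:]


# Tiny illustrative sample, not real data.
rows = [
    {"job": "management"},
    {"job": "unknown"},
    {"job": "management"},
    {"job": "technician"},
]
cleaned = impute_mode(replace_unknown(rows, "job"), "job")
train, valid = split_half(cleaned)
```

In this sample the unknown job is replaced by `management`, the most frequent value, which matches the replaced value listed for Job on the Scoring slide.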