Database Marketing and CRM: Analyzing DONOR data set
Akanksha Jain
  
Project Goals
•  Goal: Using the historical data set DONOR_RAW, develop a model that can predict whether a prospect will donate or not donate
•  Scope: DONOR_RAW data set
    •  50 variables
    •  19,372 observations
•  Dependent variable: TARGET_B (binary)
    •  Responder: 1
    •  Non-Responder: 0
  
TOOLS
•  SAS Enterprise Miner 4.3
•  SAS 9.3_M1
  
Diagram

Diagram (cont'd)
  
Data Source
•  Reject variables:
    •  TARGET_D (using TARGET_B as target)
    •  ID (an ID number)
    •  WEALTH_RATING (large number of missing values)
•  Variable TARGET_B:
    •  Change Role to TARGET
    •  Change Order to DESCENDING
•  Select complete data set as Sample
•  Set prior probabilities:
    •  Responder: 0.05
    •  Non-Responder: 0.95
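
The prior-probability setting can be illustrated with the standard rescaling of a predicted probability from the sample's class mix to the stated priors (0.05 / 0.95). This is a minimal Python sketch, not Enterprise Miner's internal code; the function name and the sample responder rate of 0.25 are hypothetical placeholders that do not come from the slides.

```python
def adjust_for_priors(p_sample, sample_rate, prior_rate):
    """Rescale a probability estimated on the sample to the stated population prior."""
    # Weight each class by (population prior / proportion in the sample).
    w1 = prior_rate / sample_rate
    w0 = (1.0 - prior_rate) / (1.0 - sample_rate)
    numerator = p_sample * w1
    return numerator / (numerator + (1.0 - p_sample) * w0)


SAMPLE_RESPONDER_RATE = 0.25  # hypothetical: share of responders in the modeling sample
PRIOR_RESPONDER_RATE = 0.05   # prior for Responder, as set on the slide

# A case scored at the sample's base rate maps back to the prior.
adjusted = adjust_for_priors(0.25, SAMPLE_RESPONDER_RATE, PRIOR_RESPONDER_RATE)
print(round(adjusted, 4))
```

The rescaling keeps the ranking of cases unchanged; only the probability scale moves toward the population rate.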
  
Data Partition
•  Train: 60%
•  Validate: 25%
•  Test: 15%
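
As a rough illustration of the 60/25/15 split, the sketch below shuffles row indices and cuts them into three parts. It assumes a simple random partition with a hypothetical seed; the slides do not say whether the Enterprise Miner node used simple random or stratified sampling.

```python
import random


def partition(ids, fractions=(0.60, 0.25, 0.15), seed=12345):
    """Shuffle ids and split them into train / validate / test parts."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seed is a hypothetical choice
    n = len(shuffled)
    n_train = round(n * fractions[0])
    n_valid = round(n * fractions[1])
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # remainder, about 15%
    return train, valid, test


# 19,372 is the observation count stated for DONOR_RAW.
train, valid, test = partition(range(19372))
print(len(train), len(valid), len(test))
```

Taking the test set as the remainder guarantees that every observation lands in exactly one partition.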
  
Variable Transformation
Log transformation applied to reduce skewness:
•  LIFETIME_GIFT_RANGE
•  LIFETIME_MAX_GIFT_AMT
•  LIFETIME_MIN_GIFT_AMT
•  MOR_HIT_RATE
•  FILE_AVG_GIFT
•  LIFETIME_AVG_GIFT_AMT
•  PCT_ATTRIBUTE1
•  LAST_GIFT_AMT
•  RECENT_AVG_GIFT_AMT

Keep all variables, original and log transformations
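
To show why a log transformation helps, the sketch below compares the skewness of a right-skewed set of amounts before and after the transform. The gift amounts are made up for illustration, and log(x + 1) is an assumption used here so that zero values stay defined; the slides only say "log".

```python
import math


def skewness(values):
    """Population skewness: third central moment over variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def log_transform(values):
    """Apply log(x + 1); the +1 offset is an assumption, not from the slides."""
    return [math.log1p(v) for v in values]


gifts = [5, 5, 10, 10, 15, 20, 25, 50, 100, 500]  # made-up gift amounts
raw_skew = skewness(gifts)
log_skew = skewness(log_transform(gifts))
print(round(raw_skew, 2), round(log_skew, 2))
```

The transformed values remain right-skewed in this toy sample, but far less so than the raw amounts.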
  
Model: CHAID
•  Nominal criterion: Chi-square
•  Significance level: 0.1
•  Minimum number of observations in a leaf = 25
•  Observations required for a split search = 55
•  Model assessment measure: Total Leaf Impurity (Gini Index)
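
The settings above can be made concrete with a small sketch: a Pearson chi-square statistic for a candidate two-way split, its p-value compared against the 0.1 significance level, the leaf-size and split-search limits, and the Gini index of a node. The counts are made up, the function names are hypothetical, and the p-value shortcut shown is valid only for one degree of freedom (a 2x2 table). It is not the Enterprise Miner implementation.

```python
import math

SIGNIFICANCE_LEVEL = 0.1  # from the slide
MIN_LEAF = 25             # minimum observations in a leaf
MIN_SPLIT = 55            # observations required for a split search


def chi_square(table):
    """Pearson chi-square statistic for a contingency table (rows of counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def gini(counts):
    """Gini impurity of a node given its class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def split_allowed(table):
    """Check the split-search and leaf-size limits for a candidate split."""
    n = sum(sum(row) for row in table)
    return n >= MIN_SPLIT and all(sum(row) >= MIN_LEAF for row in table)


# Rows are child nodes; columns are (non-responders, responders). Made-up counts.
split = [[30, 70], [70, 30]]
stat = chi_square(split)
significant = p_value_df1(stat) < SIGNIFICANCE_LEVEL
print(stat, significant, split_allowed(split), gini(split[0]))
```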
  
Model: CHAID (cont'd)

Model: CHAID (cont'd)
Inference:

FREQUENCY_STATUS_97NK = 3 or 4; MONTHS_SINCE_LAST_GIFT < 8.5
Responders (1) = 56%
Less marketing effort is needed, as they will most likely donate anyway.

FREQUENCY_STATUS_97NK = 3 or 4; MONTHS_SINCE_LAST_GIFT >= 8.5; NUMBER_PROM_12 < 11.5
Responders (1) = 43%
They will also donate, but the company should be careful not to send them too many promotions.

FREQUENCY_STATUS_97NK = 3 or 4; MONTHS_SINCE_LAST_GIFT >= 8.5; NUMBER_PROM_12 >= 11.5
Responders (1) = 30%
They are getting too many promotions, so the company should cut back on sending them promotions.

FREQUENCY_STATUS_97NK = 1, 2 or Missing
Responders (1) = 21%
Study them more closely to find out why they are not donating and what other factors are responsible, then decide how to design a marketing campaign for them.
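
The four leaves above can be written as a plain rule function. This is a sketch under two assumptions: the percentage on each leaf is read as the share of responders (TARGET_B = 1) in that leaf, and MONTHS_SINCE_LAST_GIFT and NUMBER_PROM_12 are non-missing. The function name and the segment labels are hypothetical.

```python
def chaid_segment(frequency_status, months_since_last_gift, number_prom_12):
    """Return (segment label, responder share) for one donor.

    frequency_status is FREQUENCY_STATUS_97NK; pass None when it is missing.
    """
    if frequency_status in (3, 4):
        if months_since_last_gift < 8.5:
            return ("likely donor: less marketing effort", 0.56)
        if number_prom_12 < 11.5:
            return ("donor: avoid sending too many promotions", 0.43)
        return ("over-promoted: cut back on promotions", 0.30)
    # FREQUENCY_STATUS_97NK is 1, 2 or missing.
    return ("study more closely", 0.21)


label, share = chaid_segment(4, 10, 8)
print(label, share)
```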
  
	
  
Variable Selection
•  Target Associations: Select Chi-square
  
Model: Forward Regression
•  MODEL OPTIONS -> INPUT CODING -> DEVIATION
•  SELECTION METHOD -> FORWARD
•  CRITERIA -> CROSS VALIDATION MISCLASSIFICATION
•  ADVANCED -> OPTIMIZATION METHOD -> NEWTON-RAPHSON w/ LINE SEARCH
•  SL Entry: 0.05
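
The DEVIATION input-coding option refers to deviation (effect) coding of class inputs: a variable with k levels becomes k-1 columns, and the reference level is coded -1 in every column. The sketch below illustrates the idea for the levels 1 to 4 of FREQUENCY_STATUS_97NK; treating the last level as the reference is an assumption, and the function name is hypothetical.

```python
def deviation_code(levels):
    """Map each level to its k-1 deviation-coded columns.

    The last level in `levels` is used as the reference (an assumption).
    """
    *non_reference, reference = levels
    coding = {}
    for level in non_reference:
        # 1 in the level's own column, 0 elsewhere.
        coding[level] = [1 if level == other else 0 for other in non_reference]
    # The reference level gets -1 in every column.
    coding[reference] = [-1] * len(non_reference)
    return coding


codes = deviation_code([1, 2, 3, 4])
for level, columns in codes.items():
    print(level, columns)
```

With this coding, each coefficient measures a level's deviation from the average effect across levels rather than from a single baseline level.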
Model: Forward Regression (cont'd)

Model: Forward Regression (cont'd)
  
Model: Backward Regression
•  MODEL OPTIONS -> INPUT CODING -> DEVIATION
•  SELECTION METHOD -> BACKWARD
•  CRITERIA -> CROSS VALIDATION MISCLASSIFICATION
•  ADVANCED -> OPTIMIZATION METHOD -> NEWTON-RAPHSON w/ LINE SEARCH
•  SL Stay: 0.05

Model: Backward Regression (cont'd)

Model: Backward Regression (cont'd)
  
Model: Stepwise Regression
•  MODEL OPTIONS -> INPUT CODING -> DEVIATION
•  SELECTION METHOD -> STEPWISE
•  CRITERIA -> CROSS VALIDATION MISCLASSIFICATION
•  ADVANCED -> OPTIMIZATION METHOD -> NEWTON-RAPHSON w/ LINE SEARCH
•  SL Entry: 0.15
•  SL Stay: 0.05

Model: Stepwise Regression (cont'd)

Model: Stepwise Regression (cont'd)
  
Variable Comparison

Forward                   Backward                  Stepwise
FILE_CARD_GIFT            FILE_CARD_GIFT            FILE_CARD_GIFT
FREQUENCY_STATUS_97NK     FREQUENCY_STATUS_97NK     FREQUENCY_STATUS_97NK
INCOME_GROUP              INCOME_GROUP              INCOME_GROUP
LIFE_AV9*                 LIFE_AV9*                 LIFE_AV9*
MONTHS_SINCE_LAST_GIFT    MONTHS_SINCE_LAST_GIFT    MONTHS_SINCE_LAST_GIFT
PEP_STAR                  PEP_STAR                  PEP_STAR

Additional variables: LIFETIME_GIFT_AMOUNT, MEDIAN_HOUSEHOLD_INCOME, RECENT_RESPONSE_PROP

*LIFE_AV9 is log(LIFETIME_AVG_GIFT_AMT)
  
Model Comparison (Validation): Cumulative LIFT

Model Comparison (Validation): Cumulative LIFT
Inference:
•  Capture top 20% of the market -> FORWARD
•  Capture top 30% of the market -> BACKWARD
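
For readers unfamiliar with the metric, cumulative lift at a given depth is the response rate among the top-scored cases divided by the overall response rate. The sketch below computes it for depths of 20% and 30%; the scores and targets are made up and do not come from any of the models in the slides.

```python
def cumulative_lift(scores, targets, depth):
    """Lift in the top `depth` fraction of cases, ranked by descending score."""
    ranked = sorted(zip(scores, targets), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * depth))
    top_rate = sum(target for _, target in ranked[:n_top]) / n_top
    overall_rate = sum(targets) / len(targets)
    return top_rate / overall_rate


# Made-up model scores and actual responses (1 = responder).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
targets = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

lift_20 = cumulative_lift(scores, targets, 0.20)
lift_30 = cumulative_lift(scores, targets, 0.30)
print(round(lift_20, 2), round(lift_30, 2))
```

Comparing models at a fixed depth, as the inference bullets do, amounts to comparing these ratios for the same fraction of the ranked list.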
  
Model Comparison (TEST): Cumulative LIFT

Model Comparison (TEST): Cumulative LIFT
Inference:
•  Capture top 20% of the market -> FORWARD
•  Capture top 30% of the market -> ANY
  
Forward versus Backward
•  Variables:
    •  LIFETIME_GIFT_AMOUNT
    •  MEDIAN_HOUSEHOLD_INCOME
    •  RECENT_RESPONSE_PROP
•  Correlations:
    •  MEDIAN_HOUSEHOLD_INCOME and INCOME_GROUP = 43%
    •  LIFE_AV9 and LIFETIME_AVG_GIFT_AMT = 83%
    •  FILE_CARD_GIFT and RECENT_RESPONSE_PROP = 30%
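
The slides do not say which correlation measure was used. Assuming the Pearson coefficient, a pairwise correlation such as the one between a variable and its log transform could be computed as in the sketch below; the amounts are made up and the function name is hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


amounts = [5, 10, 15, 20, 50, 100]            # made-up average gift amounts
log_amounts = [math.log(a) for a in amounts]  # analogous to LIFE_AV9

r = pearson(amounts, log_amounts)
print(round(r, 2))
```

A variable and its log are strongly but not perfectly correlated, because the relationship between them is monotonic but not linear.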
  
Model: Forward + RECENT_RESPONSE_PROP
•  Variable Selection (call it Variable_1Extra):
    •  FILE_CARD_GIFT
    •  FREQUENCY_STATUS_97NK
    •  INCOME_GROUP
    •  LIFE_AV9
    •  MONTHS_SINCE_LAST_GIFT
    •  PEP_STAR
    •  RECENT_RESPONSE_PROP
•  Reject other variables manually
•  Call this model For_1Extra
  
Model Comparison (Validation): Cumulative LIFT

Model Comparison (Validation): Cumulative LIFT
Inference:
•  Capture top 20% of the market -> For_1Extra
•  Capture top 30% of the market -> BACKWARD

Model Comparison (TEST): Cumulative LIFT

Model Comparison (TEST): Cumulative LIFT
Inference:
•  Capture top 20% of the market -> For_1Extra
•  Capture top 30% of the market -> ANY
  
	
  
Model: Decision
Final Model: FOR_1EXTRA

Variables:
•  FILE_CARD_GIFT
•  FREQUENCY_STATUS_97NK
•  INCOME_GROUP
•  LIFE_AV9
•  MONTHS_SINCE_LAST_GIFT
•  PEP_STAR
•  RECENT_RESPONSE_PROP
  
Interaction Terms
•  FREQ_PEP = FREQUENCY_STATUS_97NK * PEP_STAR
•  FREQ_MONTH = FREQUENCY_STATUS_97NK * MONTHS_SINCE_LAST_GIFT
•  FREQ_INCOME = FREQUENCY_STATUS_97NK * INCOME_GROUP
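
The three interaction terms are plain products of existing inputs. A minimal sketch of deriving them for one donor record follows; it assumes all four inputs are numeric and non-missing, and the function name and the example record are made up.

```python
def add_interactions(row):
    """Return a copy of the record with the three interaction terms added."""
    out = dict(row)
    freq = row["FREQUENCY_STATUS_97NK"]
    out["FREQ_PEP"] = freq * row["PEP_STAR"]
    out["FREQ_MONTH"] = freq * row["MONTHS_SINCE_LAST_GIFT"]
    out["FREQ_INCOME"] = freq * row["INCOME_GROUP"]
    return out


# Made-up donor record for illustration.
donor = {
    "FREQUENCY_STATUS_97NK": 3,
    "PEP_STAR": 1,
    "MONTHS_SINCE_LAST_GIFT": 6,
    "INCOME_GROUP": 4,
}
with_terms = add_interactions(donor)
print(with_terms["FREQ_PEP"], with_terms["FREQ_MONTH"], with_terms["FREQ_INCOME"])
```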
  
Model: Forward Regression with Interaction Terms
•  Rename model as FOR_1E_INT
•  MODEL OPTIONS -> INPUT CODING -> DEVIATION
•  SELECTION METHOD -> FORWARD
•  CRITERIA -> CROSS VALIDATION MISCLASSIFICATION
•  ADVANCED -> OPTIMIZATION METHOD -> NEWTON-RAPHSON w/ LINE SEARCH
•  SL Entry: 0.05
Model FOR_1E_INT: Cumulative LIFT

Model FOR_1E_INT: Variable List
  
Model Comparison (Validation): Cumulative LIFT

Model Comparison (Validation): Cumulative LIFT
Inference:
•  Capture top 20% of the market -> FOR_1E_INT
•  Capture top 30% of the market -> FOR_1EXTRA

Model Comparison (TEST): Cumulative LIFT

Model Comparison (TEST): Cumulative LIFT
Inference:
•  Capture top 20% of the market -> ANY
•  Capture top 30% of the market -> FOR_1EXTRA
  
Model: For_1EXTRA + Interaction terms
•  Variable Selection (call it Variable_UNION):
    •  FILE_CARD_GIFT
    •  FREQUENCY_STATUS_97NK
    •  INCOME_GROUP
    •  LIFE_AV9
    •  MONTHS_SINCE_LAST_GIFT
    •  PEP_STAR
    •  RECENT_RESPONSE_PROP
    •  FREQ_PEP
    •  FREQ_MONTH
    •  FREQ_INCOME
•  Reject other variables manually
•  Call this model For_Union
  
Model Comparison (Validation): Cumulative LIFT

Model Comparison (Validation): Cumulative LIFT
Inference:
•  Capture top 20% of the market -> FOR_1E_INT
•  Capture top 30% of the market -> FOR_UNION

Model Comparison (Test): Cumulative LIFT

Model Comparison (Test): Cumulative LIFT
Inference:
•  Capture top 20% of the market -> ANY
•  Capture top 30% of the market -> FOR_UNION
  
Model: Decision
Final Model: FOR_1EXTRA, because:
•  The other models show no significant improvement
•  Interaction terms add complexity

Variables:
•  FILE_CARD_GIFT
•  FREQUENCY_STATUS_97NK
•  INCOME_GROUP
•  LIFE_AV9
•  MONTHS_SINCE_LAST_GIFT
•  PEP_STAR
•  RECENT_RESPONSE_PROP
  
Score: On Donor_Raw_Data

THANK YOU
  


Predictive Model for Customer Segmentation using Database Marketing Techniques
