Exploratory Data Analysis
Loan Defaulter Segmentation
ASSIGNMENT
Presented by --- AMOL KORE
BUSINESS OBJECTIVE
• This case study aims to identify patterns that indicate whether
a client will have difficulty paying their instalments. These
patterns can be used for taking actions such as denying the loan,
reducing the amount of the loan, or lending (to risky applicants)
at a higher interest rate. This will ensure that
consumers capable of repaying the loan are not rejected.
Identifying such applicants using EDA is the aim of
this case study.
• The company wants to understand the driving factors (or
driver variables) behind loan default, i.e. the variables
which are strong indicators of default. The company can
utilize this knowledge for its portfolio and risk
assessment.
Problem Statement
The data we analyzed contains information about
the loan application at the time of applying for the loan. It contains
two types of scenarios:
• The client with payment difficulties: he/she had a
late payment of more than X days on at least one of
the first Y instalments of the loan in our sample (in
our analysis, this is labelled target = 1).
• All other cases: all other cases, in which the payment
is made on time (in our analysis, this is labelled
target = 0).
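As a minimal sketch, the share of each scenario can be checked before any further analysis. It assumes the label is stored in a pandas column named TARGET (the exact column name is an assumption), and the small frame below is made-up illustration data, not the assignment data.

```python
import pandas as pd

# Made-up illustration data; the real application data is not reproduced here.
applications = pd.DataFrame({"TARGET": [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]})

# Share of each class: 1 = payment difficulties, 0 = all other cases.
target_share = applications["TARGET"].value_counts(normalize=True)
print(target_share)
```

A strongly unequal split is worth knowing early, because every default rate quoted later is measured against this overall baseline.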
ANALYSIS & STEPS TAKEN
1. Data Sourcing (already provided in assignment)
2. Data Loading and Data Cleaning:
• Fixing the rows and columns
• Imputing missing values and removing columns with too many missing values
• Handling outliers
3. Univariate Analysis:
• Categorical-Numerical Analysis
• Numerical-Numerical Analysis
• Categorical-Categorical Analysis
4. Bivariate and Multivariate Analysis:
• Numeric-Numeric Analysis
• Correlation
• Numerical-Categorical Analysis
• Categorical-Categorical Analysis
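The cleaning step could look roughly like the sketch below. It is not the code used in the assignment: the 40% missing-value threshold, the median/mode imputation and the 1.5*IQR capping rule are assumptions chosen for illustration, and the data (including the MOSTLY_EMPTY column) is made up.

```python
import numpy as np
import pandas as pd

def clean(df, cap_cols=(), max_missing=0.4):
    """Drop sparse columns, impute the rest, and cap outliers in `cap_cols`."""
    # Remove columns whose share of missing values exceeds the threshold.
    out = df.loc[:, df.isna().mean() <= max_missing].copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            # Numeric columns: impute with the median.
            out[col] = out[col].fillna(out[col].median())
        else:
            # Categorical columns: impute with the most frequent value.
            out[col] = out[col].fillna(out[col].mode().iloc[0])
    for col in cap_cols:
        if col in out.columns:
            # Cap values that fall outside the 1.5*IQR fences.
            q1, q3 = out[col].quantile([0.25, 0.75])
            iqr = q3 - q1
            out[col] = out[col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out

# Made-up illustration data.
raw = pd.DataFrame({
    "AMT_CREDIT": [100.0, 120.0, np.nan, 110.0, 130.0, 5000.0],
    "NAME_HOUSING_TYPE": ["house/apartment", None, "house/apartment",
                          "other", "house/apartment", "other"],
    "MOSTLY_EMPTY": [np.nan, np.nan, np.nan, np.nan, 1.0, np.nan],
})
cleaned = clean(raw, cap_cols=["AMT_CREDIT"])
print(cleaned)
```

Capping is applied only to the columns passed in cap_cols, so that binary columns such as the target are never altered by the outlier rule.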
• Most of the loans have
been taken by females
• The default rate for females
is only ~7%, which is
lower (safer) than for
males
CODE_GENDER
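Default rates per category, like the ones quoted on this and the following slides, can be computed with a group-by on the target. The sketch below assumes the target column is named TARGET; the data is made up, so the printed rates do not match the figures on the slides.

```python
import pandas as pd

def default_rate_by(df, column, target="TARGET"):
    """Return loan count and default rate (%) for each category of `column`."""
    summary = df.groupby(column)[target].agg(loans="count", default_rate="mean")
    summary["default_rate"] = (summary["default_rate"] * 100).round(1)
    return summary.sort_values("default_rate")

# Made-up illustration data.
sample = pd.DataFrame({
    "CODE_GENDER": ["F", "F", "F", "F", "F", "M", "M", "M"],
    "TARGET":      [0,   0,   0,   0,   1,   0,   1,   0],
})
gender_summary = default_rate_by(sample, "CODE_GENDER")
print(gender_summary)
```

The same helper can be reused for the other categorical columns, for example NAME_CONTRACT_TYPE or NAME_INCOME_TYPE.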
• Most customers
have taken cash loans
• Customers who have
taken cash loans are
less likely to default
NAME_CONTRACT_TYPE
• The safest segments are
working, commercial
associates and
pensioners
NAME_INCOME_TYPE
• Married people are safe
to target; their default rate is
8%
NAME_FAMILY_STATUS
• People with a
house/apartment are
safe to lend to,
with a default rate of ~8%
NAME_HOUSING_TYPE
Correlation matrix heatmap: FLAG columns
Correlation matrix heatmap: EXT columns
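A minimal sketch of how the matrix behind such a heatmap could be computed is shown below. The column names EXT_A, EXT_B, EXT_C and FLAG_X are hypothetical stand-ins, and the data is made up; the resulting matrix could then be passed to a plotting function such as seaborn's heatmap.

```python
import pandas as pd

def prefix_correlation(df, prefix):
    """Pearson correlation matrix of the columns whose name starts with `prefix`."""
    cols = [c for c in df.columns if c.startswith(prefix)]
    return df[cols].corr()

# Made-up illustration data with hypothetical column names.
scores = pd.DataFrame({
    "EXT_A": [1.0, 2.0, 3.0, 4.0],
    "EXT_B": [2.0, 4.0, 6.0, 8.0],   # perfectly increasing with EXT_A
    "EXT_C": [4.0, 3.0, 2.0, 1.0],   # perfectly decreasing with EXT_A
    "FLAG_X": [0, 1, 0, 1],
})
ext_corr = prefix_correlation(scores, "EXT")
print(ext_corr)
```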
• Applicants with higher education are the
safest segment to lend
to, with a default
rate of less than 5%
NAME_EDUCATION_TYPE
• Transport type 3 has the highest
default rate
• Others, Business Entity
Type 3 and Self Employed
are good to go, with a
default rate of around 10%
ORGANIZATION_TYPE
• Unaccompanied people
have taken most of the
loans, and their default
rate is ~8.5%, which is
still acceptable
NAME_TYPE_SUITE
• Low-Skill Laborers and
drivers have the highest
default rates
• Accountants have a low
default rate
• Core staff, Managers
and Laborers are safer
to target, with default
rates of about 7.5% to 10%
OCCUPATION_TYPE
• Most of the loans were given
for a goods price ranging
between 0 and 1 million
• Most of the loans were given
for a credit amount of 0 to 1
million
• Most customers are
paying an annuity of 0 to 50K
• Most customers have an
income between 0 and 1 million
Univariate analysis of
numeric variables
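Statements such as "most loans fall between 0 and 1 million" can be checked by binning the amounts and counting each band. The sketch below uses made-up values, and the band edges are assumptions chosen for illustration.

```python
import pandas as pd

# Made-up credit amounts for illustration.
amounts = pd.Series(
    [150_000, 450_000, 900_000, 1_200_000, 2_500_000, 700_000],
    name="AMT_CREDIT",
)

# Band edges are an assumption; each band includes its upper edge.
bins = [0, 1_000_000, 2_000_000, 3_000_000]
labels = ["0-1M", "1M-2M", "2M-3M"]
bands = pd.cut(amounts, bins=bins, labels=labels)

# Number of loans per band, in band order.
band_counts = bands.value_counts().sort_index()
print(band_counts)
```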
• AMT_CREDIT and
AMT_GOODS_PRICE are
linearly correlated. As
AMT_CREDIT increases, the number of
defaulters decreases
Bivariate analysis
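Both observations can be checked numerically, as in the sketch below: the Pearson correlation between the two amounts, and the number of defaulters per credit band. The data is made up, and the TARGET column name and the band edges are assumptions.

```python
import pandas as pd

# Made-up illustration data.
loans = pd.DataFrame({
    "AMT_CREDIT":      [200_000, 300_000, 400_000, 600_000,
                        800_000, 1_200_000, 1_500_000, 1_800_000],
    "AMT_GOODS_PRICE": [180_000, 270_000, 350_000, 540_000,
                        700_000, 1_100_000, 1_350_000, 1_600_000],
    "TARGET":          [1, 1, 0, 1, 0, 0, 0, 0],
})

# Pearson correlation between credit amount and goods price.
r = loans["AMT_CREDIT"].corr(loans["AMT_GOODS_PRICE"])

# Number of defaulters (TARGET = 1) in each credit band.
credit_band = pd.cut(loans["AMT_CREDIT"], bins=[0, 500_000, 1_000_000, 2_000_000])
defaulters_per_band = loans.groupby(credit_band, observed=False)["TARGET"].sum()

print(round(r, 4))
print(defaulters_per_band)
```

Counting defaulters per band shows raw numbers only; dividing by the band size would give a default rate, which is the fairer comparison when bands differ in size.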
• People with an income of 1 million
or less are more likely
to take loans; among them,
those taking a loan of
less than 1.5 million could
turn out to be defaulters.
• We can target income below 1
million and loan amount
greater than 1.5 million
Bivariate analysis
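A segment like this one can be isolated with a boolean mask and its default rate compared with the rest of the portfolio. This is a sketch under stated assumptions: AMT_INCOME and TARGET are hypothetical column names, and the data is made up.

```python
import pandas as pd

def segment_default_rate(df, max_income, min_credit):
    """Default rate (%) of applicants with income <= max_income and credit > min_credit."""
    mask = (df["AMT_INCOME"] <= max_income) & (df["AMT_CREDIT"] > min_credit)
    segment = df.loc[mask, "TARGET"]
    # An empty segment has no defined default rate.
    return float(segment.mean() * 100) if len(segment) else float("nan")

# Made-up illustration data with hypothetical column names.
portfolio = pd.DataFrame({
    "AMT_INCOME": [300_000, 500_000, 800_000, 900_000, 1_200_000, 2_000_000],
    "AMT_CREDIT": [1_600_000, 1_800_000, 500_000, 2_000_000, 1_700_000, 400_000],
    "TARGET":     [0, 0, 1, 1, 0, 0],
})
rate = segment_default_rate(portfolio, max_income=1_000_000, min_credit=1_500_000)
print(round(rate, 1))
```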
• People with 1 to fewer
than 5 children are safer to
lend to
Bivariate analysis
• People who can pay an
annuity of 100K are more likely
to get a loan, for amounts up to
just under 2 million (safer segment)
Bivariate analysis
• Most previous applications
were for repairs, and the
same purpose has the
highest number of
cancellations
Analysis on merged data
• Of the applications that were
previously cancelled or
refused, 80-90% are
repayers in the current data
Analysis on merged data
• Previously unused offers
now have the highest
number of defaulters, despite
belonging to customers in the
high income band
Analysis on merged data
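The merged-data slides combine previous applications with the current ones. A minimal sketch of that join and of the repayer share per previous status follows; APPLICANT_ID and PREV_STATUS are hypothetical column names, TARGET is an assumed label name, and the data is made up.

```python
import pandas as pd

# Made-up illustration data with hypothetical column names.
current = pd.DataFrame({
    "APPLICANT_ID": [1, 2, 3, 4, 5],
    "TARGET":       [0, 0, 1, 0, 0],
})
previous = pd.DataFrame({
    "APPLICANT_ID": [1, 2, 3, 4, 5, 5],
    "PREV_STATUS":  ["Refused", "Canceled", "Refused",
                     "Approved", "Unused offer", "Refused"],
})

# One row per previous application, carrying the current outcome.
merged = previous.merge(current, on="APPLICANT_ID", how="inner")

# Row percentages: share of repayers (0) and defaulters (1) per previous status.
outcome_share = pd.crosstab(
    merged["PREV_STATUS"], merged["TARGET"], normalize="index"
) * 100
print(outcome_share.round(1))
```

An applicant with several previous applications appears once per application after the merge, so the shares describe previous applications rather than unique customers.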
Final
Conclusion/Insights
The bank should target customers
• with a low income, i.e. below 1 million
• working in the Others, Business Entity Type 3 or Self
Employed organization types
• working as Accountants, Core staff, Managers or
Laborers
• with a house/apartment, who are married and
have no more than 5 children
• who are highly educated
• who are preferably female
• Unaccompanied people can be safer: their default rate
is ~8.5%
Amount segment
recommended
• The credit amount should not be more than 1 million
• The annuity can be set at 50K (depending on
eligibility)
• The income bracket could be below 1 million
• 80-90% of the customers who were previously
cancelled/refused are repayers. The bank can
analyse these segments and consider lending to
them
Precautions
• Organization type Transport type 3 should be
avoided
• Low-Skill Laborers and drivers
should be avoided
• High-income customers with previously
unused offers should be avoided
THANK YOU FOR
LISTENING!
Email: amolkore957@gmail.com
Phone: 8080100957
Social media: @mr.amolkore
