BANK LOAN EDA CASE STUDY
PREPARED BY:
MR. ABHISHEK LAL
MS. LUMBINI SARDARE
PROBLEM STATEMENT
● Lending companies find it hard to give loans to people with an
insufficient or non-existent credit history. Some consumers take
advantage of this and become defaulters.
● This case study aims to identify patterns that indicate whether a client
has difficulty paying their installments. These patterns can support
actions such as denying the loan, reducing the loan amount, or lending to
risky applicants at a higher interest rate, while ensuring that consumers
capable of repaying the loan are not rejected. Identifying such
applicants using EDA is the aim of this case study.
ANALYSIS DONE
Steps
● Data understanding & preparation
● Data cleaning & manipulation
● Data analysis: univariate and bivariate
● Presentation & recommendations
Data Understanding and Preparation
Step 1: Data sourcing
a: Imported the CSV files and read the data.
b: Data inspection: checked the shape and dimensions of the dataframe,
the data type of each column, the null values, and the summary of the
numeric columns.
c: Inspecting null values greater than 50%:
- Checked the columns with missing values and dropped the columns with
more than 50% missing values.
- Imputed values for the columns with approx. 13% missing values. These
columns have few unique values and hold counts per day, week, month,
year, etc., so their null values can be replaced with 0.
- Dropped unwanted columns.
- Changed ID to object and checked the datatypes of the columns.
- Converted negative values to positive.
- Divided income into credit groups.
- The variables that indicate a number of days, hours, months, etc. take
few distinct values, so they can be treated as categorical columns. The
mode is 0 for all of these variables, so it is safe to impute their null
values with 0.
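The preparation steps above can be sketched in pandas roughly as follows. This is a minimal illustration on a toy dataframe, not the authors' notebook: every column name and value here is hypothetical, chosen only so that one column is more than 50% missing and another is about 13% missing.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the loan application data (hypothetical columns).
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7, 8],
    "MOSTLY_MISSING": [np.nan, np.nan, np.nan, np.nan, np.nan, 1.0, 2.0, np.nan],
    "ENQUIRIES_LAST_WEEK": [0.0, 0.0, np.nan, 1.0, 0.0, 0.0, 0.0, 0.0],
    "DAYS_EMPLOYED": [-120, -3400, -15, -800, -2200, -60, -990, -4100],
})

# Share of missing values per column, as a percentage.
missing_pct = df.isnull().mean() * 100

# Drop columns where more than 50% of the values are missing.
cols_to_drop = missing_pct[missing_pct > 50].index
df = df.drop(columns=cols_to_drop)

# Impute the low-cardinality count column with 0, which is also its mode here.
df["ENQUIRIES_LAST_WEEK"] = df["ENQUIRIES_LAST_WEEK"].fillna(0)

# Treat the identifier as a label rather than a number.
df["ID"] = df["ID"].astype("object")

# Day counts stored as negative offsets become positive magnitudes.
df["DAYS_EMPLOYED"] = df["DAYS_EMPLOYED"].abs()

print(df.dtypes)
```

Computing the missing percentage once and filtering on it keeps the 50% threshold in a single place, so it is easy to adjust.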
DATA CLEANING & MANIPULATION
After dropping the unwanted columns we have the final dataframe. We now
check the data type of each column and change it based on the values the
column contains.
Checking for data quality issues and binning of continuous variables:
- Checked all data types and converted them into an appropriate standard
format.
- Performed datatype conversion, changed negative values to positive,
and rounded off values where required.
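Binning a continuous variable can be illustrated with `pd.cut`. The income values, bin edges, and labels below are assumptions made for the example; the slides do not state which edges were used.

```python
import pandas as pd

# Hypothetical income values; bin edges and labels are illustrative choices.
income = pd.Series([45000, 90000, 135000, 202500, 450000], name="INCOME")

bins = [0, 100000, 200000, 300000, float("inf")]
labels = ["low", "medium", "high", "very high"]

# pd.cut assigns each value to the interval (left, right] it falls in.
income_group = pd.cut(income, bins=bins, labels=labels)

print(income_group.tolist())
```

Fixed edges keep the groups interpretable. `pd.qcut` would instead give groups of roughly equal size, which may suit a heavily skewed income column better.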
Found outliers using boxplots.
Changed the credit amount, checked the imbalance %, and plotted the distribution of Target values,
where the credit amount and income amount imbalance is shown.
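As a rough numeric counterpart to the boxplot and imbalance plots, the sketch below flags outliers with the usual 1.5 × IQR whisker rule and computes the class imbalance as a percentage. The data is a toy example with hypothetical column names, and it assumes `TARGET = 1` marks a defaulter.

```python
import pandas as pd

# Toy data (hypothetical); TARGET = 1 is assumed to mark a defaulter.
df = pd.DataFrame({
    "TARGET": [0, 0, 0, 0, 0, 0, 0, 0, 1, 1],
    "CREDIT": [200, 250, 300, 280, 260, 240, 310, 2000, 230, 270],
})

# Boxplot whiskers conventionally sit 1.5 * IQR beyond the quartiles.
q1 = df["CREDIT"].quantile(0.25)
q3 = df["CREDIT"].quantile(0.75)
iqr = q3 - q1
is_outlier = (df["CREDIT"] < q1 - 1.5 * iqr) | (df["CREDIT"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]

# Class imbalance: percentage of rows per target value.
imbalance_pct = df["TARGET"].value_counts(normalize=True) * 100

print(outliers)
print(imbalance_pct)
```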
HEAT MAP FOR CORRELATION MATRIX
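The matrix behind such a heat map can be computed with `DataFrame.corr`. The sketch below only computes the matrix, on toy numeric columns with hypothetical names; the result could then be passed to a plotting call such as seaborn's `heatmap`.

```python
import pandas as pd

# Toy numeric columns with hypothetical names.
df = pd.DataFrame({
    "CREDIT": [100, 200, 300, 400, 500],
    "ANNUITY": [10, 20, 30, 40, 50],
    "INCOME": [50, 40, 45, 60, 55],
})

# Pearson correlation between every pair of numeric columns.
corr = df.corr()

print(corr.round(2))
```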
DATA ANALYSIS
Data imbalance check
Split the data into Target = 0 and Target = 1.
Univariate analysis
Compared defaulters and non-defaulters using seaborn (sns) boxplots.
Defaulters fall within the age group of 31-50 years.
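A minimal sketch of the split and of a per-group age summary follows. The column names and ages are hypothetical toy values, so the printed numbers illustrate the mechanics only and do not reproduce the deck's findings.

```python
import pandas as pd

# Toy data (hypothetical); TARGET = 1 is assumed to mark a defaulter.
df = pd.DataFrame({
    "TARGET": [0, 1, 0, 1, 0, 1],
    "AGE_YEARS": [25, 34, 58, 47, 41, 39],
})

# Separate frames for the two target classes.
non_defaulters = df[df["TARGET"] == 0]
defaulters = df[df["TARGET"] == 1]

# describe() reports the quartiles that a boxplot draws as the box and median line.
age_summary = df.groupby("TARGET")["AGE_YEARS"].describe()

print(age_summary)
```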
CUSTOMERS WITH SECONDARY & HIGHER SECONDARY EDUCATION, AS WELL AS LABOURERS,
ARE HIGHEST IN DEFAULTERS
CUSTOMERS LIVING IN HOUSES & APARTMENTS
ARE HIGHEST IN DEFAULTERS
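For categorical variables such as education, occupation, or housing type, the share of each category within each target class can be tabulated with `pd.crosstab`. The column name and values below are hypothetical toy data.

```python
import pandas as pd

# Toy data (hypothetical); TARGET = 1 is assumed to mark a defaulter.
df = pd.DataFrame({
    "TARGET": [1, 1, 1, 0, 0, 0, 0, 1],
    "HOUSING": [
        "House / apartment", "House / apartment", "Rented", "House / apartment",
        "With parents", "House / apartment", "Rented", "House / apartment",
    ],
})

# Share of each housing category within each target class (each row sums to 1).
share = pd.crosstab(df["TARGET"], df["HOUSING"], normalize="index")

# Largest category among defaulters.
top_for_defaulters = share.loc[1].idxmax()

print(share.round(2))
print(top_for_defaulters)
```

Looking at the row for Target = 0 alongside the row for Target = 1 helps show whether a category is large among defaulters specifically or simply large in both groups.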
BIVARIATE ANALYSIS
Income for defaulters is
very low irrespective of
the number of years employed
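One way to check a bivariate statement like this numerically is to compare median income per employment band for each target class. The sketch below uses toy data; the column names, band edges, and values are all hypothetical.

```python
import pandas as pd

# Toy data (hypothetical); TARGET = 1 is assumed to mark a defaulter.
df = pd.DataFrame({
    "TARGET": [0, 0, 0, 0, 1, 1, 1, 1],
    "YEARS_EMPLOYED": [1, 3, 8, 12, 2, 4, 9, 15],
    "INCOME": [150, 180, 210, 240, 90, 95, 100, 92],
})

# Bucket years employed into bands (illustrative edges).
df["EMPLOYED_GROUP"] = pd.cut(
    df["YEARS_EMPLOYED"], bins=[0, 5, 10, 40], labels=["0-5", "5-10", "10+"]
)

# Median income per employment band (rows) and target class (columns).
median_income = df.pivot_table(
    index="EMPLOYED_GROUP",
    columns="TARGET",
    values="INCOME",
    aggfunc="median",
    observed=True,
)

print(median_income)
```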
SCATTER PLOT
NON-DEFAULTERS GET MORE CREDIT FOR THEIR ANNUITIES
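The relationship shown in the scatter plot can also be summarised as a credit-to-annuity ratio per target class. This is a sketch on toy numbers with hypothetical column names, not the deck's data.

```python
import pandas as pd

# Toy data (hypothetical); TARGET = 1 is assumed to mark a defaulter.
df = pd.DataFrame({
    "TARGET": [0, 0, 0, 1, 1, 1],
    "ANNUITY": [10, 20, 30, 10, 20, 30],
    "CREDIT": [250, 500, 750, 200, 400, 600],
})

# Credit received per unit of annuity, averaged within each target class.
df["CREDIT_PER_ANNUITY"] = df["CREDIT"] / df["ANNUITY"]
ratio_by_target = df.groupby("TARGET")["CREDIT_PER_ANNUITY"].mean()

print(ratio_by_target)
```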
RANGE OF CUSTOMERS WITH HIGHER
DEGREE IS HIGHER IN NON-DEFAULTERS
RECOMMENDATIONS & CONCLUSIONS
● The range for customers with a higher degree is
higher among non-defaulters
● Non-defaulters get more credit for their
annuities
● Income for defaulters is very low
irrespective of the number of years employed
● Customers living in houses/apartments
are the largest category among defaulters
● Labourers are the largest category among
defaulters
● Customers with secondary/secondary
special education are the largest category
among both defaulters and non-defaulters
● The purpose was to find defaulters and
their details. The steps involved were data
cleaning, data analysis, and drawing
inferences; the graphs for these have
been attached.
THANKS!

Exploratory Data Analysis Bank Fraud Case Study
