2nd edition
#MLSEV
My First BigML Model
Mercè Martín
VP of Applications, BigML
• Lots of decisions
• Lots of potentially related variables
• Uncertain correlations
ML CAN HELP
Do I really need a model?
New data arrives
The model labels it
We decide the action
Maybe I could use a model…
The challenge
Credit delinquency
I WANT TO MINIMIZE RISK BY PREDICTING DEFAULTS
https://www.kaggle.com/c/GiveMeSomeCredit
First step
Defining the question
Defining the real question
When do I consider a customer to be in default?
When the customer misses payments?
What if the customer pays late?
What is the maximum delinquency that you allow?
Defining the contest goal
Predicting who will be 90 days past due or worse to act only on them
And now…
The First Decision
https://bigml.com/accounts/register
MLSEV
1-month FREE BOOSTED
The Data Dictionary
Variable Name | Description | Type
SeriousDlqin2yrs | Person experienced 90 days past due delinquency or worse | Y/N
RevolvingUtilizationOfUnsecuredLines | Total balance on credit cards and personal lines of credit except real estate and no installment debt like car loans divided by the sum of credit limits | percentage
age | Age of borrower in years | integer
NumberOfTime30-59DaysPastDueNotWorse | Number of times borrower has been 30-59 days past due but no worse in the last 2 years | integer
DebtRatio | Monthly debt payments, alimony, living costs divided by monthly gross income | percentage
MonthlyIncome | Monthly income | real
NumberOfOpenCreditLinesAndLoans | Number of open loans (installment like car loan or mortgage) and lines of credit (e.g. credit cards) | integer
NumberOfTimes90DaysLate | Number of times borrower has been 90 days or more past due | integer
NumberRealEstateLoansOrLines | Number of mortgage and real estate loans including home equity lines of credit | integer
NumberOfTime60-89DaysPastDueNotWorse | Number of times borrower has been 60-89 days past due but no worse in the last 2 years | integer
NumberOfDependents | Number of dependents in family excluding themselves (spouse, children etc.) | integer
10 predictors
The Data
The Source
How to interpret your data?
• Field types
• Locale (decimals)
• Missing tokens
• Text / Items parsing
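The interpretation choices listed above can be sketched in plain Python. This is only an illustration: the sample rows are made up, and the set of missing tokens, the "." decimal separator and the helper name `parse_field` are assumptions, not BigML's actual parsing rules.

```python
import csv
import io

# Hypothetical sample in the spirit of the data dictionary (values are made up).
RAW = """SeriousDlqin2yrs,age,MonthlyIncome,NumberOfDependents
1,45,9120,2
0,40,NA,1
0,38,3042.5,NA
"""

# Assumed tokens that should be read as "no value".
MISSING_TOKENS = {"", "NA", "N/A", "-"}


def parse_field(text):
    """Return None for a missing token, a number when possible, else the text."""
    text = text.strip()
    if text in MISSING_TOKENS:
        return None
    try:
        return int(text)
    except ValueError:
        pass
    try:
        # Assumes "." is the decimal separator (an en-US style locale).
        return float(text)
    except ValueError:
        return text


rows = [
    {name: parse_field(value) for name, value in record.items()}
    for record in csv.DictReader(io.StringIO(RAW))
]

for row in rows:
    print(row)
```

With a different locale (for example "," as the decimal separator) or a different missing token, the same file would be read as text or as wrong numbers, which is why these settings are worth checking before building a dataset.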
The Dataset
How is data distributed?
• Histograms
• Statistics
• Number of missings
• Number of errors
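As a rough sketch of what such a dataset summary contains, the snippet below computes basic statistics, the number of missing values and a coarse histogram for one numeric field. The values and the bin width are invented for illustration; BigML computes its own summaries when the dataset is created.

```python
import statistics
from collections import Counter

# Hypothetical values of one numeric field; None marks a missing value.
monthly_income = [9120, None, 3042.5, 3300, 63588, None, 2600, 3500]

present = [v for v in monthly_income if v is not None]
summary = {
    "count": len(present),
    "missing": len(monthly_income) - len(present),
    "min": min(present),
    "max": max(present),
    "mean": statistics.mean(present),
    "median": statistics.median(present),
}

# A coarse histogram: count the values that fall in each fixed-width bin.
BIN_WIDTH = 5000  # assumed bin width, chosen only for this example
histogram = Counter(int(v // BIN_WIDTH) * BIN_WIDTH for v in present)

print(summary)
for start in sorted(histogram):
    print(f"{start:>6} to {start + BIN_WIDTH - 1:>6}: {'#' * histogram[start]}")
```

Even on this tiny sample the histogram exposes one value far from the rest, the kind of outlier that is easy to miss in the raw file.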
And now… The Model
The Model
What insights will the model extract?
• Patterns
• Importance
• and…
The Prediction
What label corresponds to this loan?
• Predictions (labels)
• Confidence
• Explanations
Are predictions correct?
The Evaluation
[Diagram: TRAINING data builds the MODEL; the MODEL's PREDICTION and CONFIDENCE (%) on the TEST data produce the EVALUATION (%)]
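The training/test idea can be sketched as a random hold-out split. The 80/20 proportion, the fixed seed and the function name `split_train_test` are assumptions made for this example; the slides do not state which split was used.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and hold out a fraction of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    # The first n_test shuffled rows become the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(100))  # stand-ins for 100 dataset rows
training, test = split_train_test(rows)
print(len(training), len(test))
```

The model only ever sees `training`; the evaluation then compares its predictions on `test` with the real labels that were held back.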
And now… The Evaluation
The Evaluation
Do predictions match the real values?
Hey! Great accuracy!!! right?
I wish to make
a complaint!
The Evaluation
Do predictions match the real values?
• Positive class: 1
Predicted / Actual
TP: 1 / 1
FN: 0 / 1
FP: 1 / 0
TN: 0 / 0
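A minimal sketch of why a great accuracy can hide a poor model: the snippet counts TP, FN, FP and TN for a made-up, unbalanced test set and compares accuracy with recall. The labels and predictions are invented for illustration and are not the results shown in the slides.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, FN, FP, TN for the given positive class."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1
        elif a == positive:
            fn += 1  # a real positive that the model missed
        elif p == positive:
            fp += 1  # a false alarm
        else:
            tn += 1
    return tp, fn, fp, tn


# Hypothetical test set: 100 loans, 7 of them real defaults (class 1).
actual = [1] * 7 + [0] * 93
# Hypothetical model: finds 1 of the 7 defaults and raises 1 false alarm.
predicted = [1] + [0] * 6 + [1] + [0] * 92

tp, fn, fp, tn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / len(actual)
recall = tp / (tp + fn)      # share of real defaults that were caught
precision = tp / (tp + fp)   # share of predicted defaults that were real

print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
print(f"accuracy={accuracy:.2f} recall={recall:.2f} precision={precision:.2f}")
```

Here the accuracy is 93% although six of the seven defaults go undetected, because the negative class dominates the count.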
The Costs
• Always remember the goal
Predicting who will be 90 days past due or worse to act only on them
• And the costs of failing!!!
TO MINIMIZE COST WE SHOULD MAXIMIZE THE RECALL
[Evaluation panel: Unbalanced]
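To make the cost argument concrete, here is a small sketch that prices each kind of error. The cost values and both confusion matrices are hypothetical; the only assumption that matters is that a missed default (FN) costs much more than acting on a customer who would have paid (FP).

```python
# Hypothetical costs, in arbitrary units.
COST_FN = 10  # a default that was not flagged
COST_FP = 1   # acting on a customer who would have paid anyway


def total_cost(fn, fp, cost_fn=COST_FN, cost_fp=COST_FP):
    """Total cost of the errors in an evaluation."""
    return fn * cost_fn + fp * cost_fp


# Two made-up evaluations on the same test set of 100 loans with 7 defaults.
high_accuracy = {"tp": 1, "fn": 6, "fp": 1, "tn": 92}
high_recall = {"tp": 6, "fn": 1, "fp": 15, "tn": 78}

for name, m in [("high accuracy", high_accuracy), ("high recall", high_recall)]:
    accuracy = (m["tp"] + m["tn"]) / sum(m.values())
    recall = m["tp"] / (m["tp"] + m["fn"])
    cost = total_cost(m["fn"], m["fp"])
    print(f"{name}: accuracy={accuracy:.2f} recall={recall:.2f} cost={cost}")
```

Under these assumed costs the model with lower accuracy but higher recall is the cheaper one (25 versus 61 units). With a different cost ratio the ranking could change, so the costs have to come from the business problem.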
And now… Model Tuning
Compensating unbalance
The percentage of examples of the class we are interested in is very low.
Increasing their frequency could help the model learn better.
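Two common ways to increase the frequency of the rare class are sketched below: weighting each class inversely to its frequency, and oversampling the minority rows. Both helpers (`balance_weights`, `oversample`) and the 7-in-100 class ratio are assumptions for illustration; they are not a description of how BigML balances a model internally.

```python
import random
from collections import Counter


def balance_weights(labels):
    """Weight each class inversely to its frequency so all classes count equally."""
    counts = Counter(labels)
    largest = max(counts.values())
    return {label: largest / count for label, count in counts.items()}


def oversample(rows, labels, seed=0):
    """Repeat random minority rows until every class is as frequent as the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    largest = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        pool = [row for row, lab in zip(rows, labels) if lab == label]
        extra = rng.choices(pool, k=largest - count)  # empty for the largest class
        out_rows.extend(extra)
        out_labels.extend([label] * len(extra))
    return out_rows, out_labels


labels = [1] * 7 + [0] * 93   # hypothetical training labels: 7 defaults in 100
rows = list(range(100))       # stand-ins for the training rows

weights = balance_weights(labels)
balanced_rows, balanced_labels = oversample(rows, labels)

print(weights)
print(Counter(balanced_labels))
```

Either technique should be applied to the training data only, so that the test set keeps the real class proportions and the evaluation stays honest.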
Choosing according to Costs
Unbalanced vs. Balanced
THE BALANCED MODEL WORKS BETTER
And now… Automating
The OptiML
Automating tuning
Smart search for the best performing configuration
And the winner is…
A simple decision tree!!!
• 15-node
• balanced
• pruned
MLSEV Virtual. My first BigML Project
