Data mining for credit card fraud: A comparative study
Siddhartha Bhattacharyya, Sanjeev Jha, Kurian Tharakunnel,
Christopher Westland (2011)
DOR BAHDUR BUDHATHOKI SID:45057
NARENDRA SHARMA SID:45040
Abstract
• This paper evaluates two advanced data mining approaches, support
vector machines and random forests, together with the well-known
logistic regression, as part of an attempt to better detect (and thus control
and prosecute) credit card fraud. The study is based on real-life data of
transactions from an international credit card operation.
Introduction
• Data mining: the practice of examining large, pre-existing databases in
order to generate new information. It helps turn raw data into useful
information and supports knowledge discovery and predictive analysis
(applying past outcomes to predict future ones).
• Credit card fraud: there are two types of credit card fraud:
1) Application fraud: obtaining a new card from an issuing company using
false information.
2) Behavioral fraud: includes mail theft, stolen cards, counterfeit cards, and
card-holder-not-present fraud.
Methods
• Three data mining techniques are used to predict credit card fraud:
1) Logistic regression (LR): appropriate when the dependent variable is
categorical; here the dependent variable, fraud, is binary.
2) Support vector machines (SVM): statistical learning techniques that have
been found very successful in a variety of tasks. SVMs are linear classifiers
that work in a high-dimensional feature space without incurring any
additional computational complexity.
3) Random forests (RF): an ensemble of classification tree models.
Ensembles work well when the individual members are dissimilar, and random
forests obtain variation among the individual trees.
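To make the logistic regression item above concrete, here is a minimal sketch of a binary fraud classifier fitted by plain gradient descent. It is an illustration only, not the paper's implementation: the single feature (a scaled transaction amount), the toy data, and the function names `fit_logistic` and `predict_proba` are all assumptions made for this example.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit weights w and bias b by batch gradient descent on the log-loss."""
    n_features = len(xs[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(xs, ys):
            # Prediction error for this row: predicted probability minus label.
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / len(xs) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(xs)
    return w, b

def predict_proba(w, b, x):
    """Estimated probability that transaction x is fraud (label 1)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Toy, made-up data: one feature per transaction, label 1 = fraud.
xs = [[0.1], [0.2], [0.3], [0.4], [1.6], [1.8], [2.0], [2.2]]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = fit_logistic(xs, ys)
```

Because the dependent variable is binary, the model outputs a probability of fraud, and a threshold (0.5 is a common default) turns it into a fraud / not-fraud decision.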
Results
• This section presents results from the experiments comparing the
performance of logistic regression (LR), random forest (RF), and support
vector machine (SVM) models developed from training data with varying
proportions of fraud cases.
Results (contd.)
Discussion
• This paper examined the performance of two advanced data mining
techniques, random forests and support vector machines, together with
logistic regression, for credit card fraud detection.
• A real-life dataset of credit card transactions from the January 2006–January
2007 period was used in the evaluation.
• Random forests and SVMs are two approaches that have gained prominence
in recent years, with noted superior performance across a range of
applications. To date, their use for credit card fraud prediction has been
limited.
Discussion
• The authors use data undersampling, a simple approach that has been noted
to perform well, and examine the performance of the three techniques at
varying levels of undersampling. For performance assessment, they use a
test dataset with a much lower fraud rate (0.5%) than the training datasets,
which were built with different levels of undersampling.
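The undersampling idea described above can be sketched as follows. This is a hedged illustration, not the authors' procedure: the `is_fraud` field, the `undersample` and `fraud_rate` helpers, the synthetic data, and the 10% target rate are all assumptions chosen for the example. Only the 0.5% rate of the untouched data mirrors the test-set figure quoted on the slide.

```python
import random

def undersample(transactions, target_fraud_rate, seed=0):
    """Keep every fraud case and randomly drop legitimate ones until fraud
    makes up roughly `target_fraud_rate` of the returned training set."""
    if not 0 < target_fraud_rate < 1:
        raise ValueError("target_fraud_rate must be strictly between 0 and 1")
    fraud = [t for t in transactions if t["is_fraud"]]
    legit = [t for t in transactions if not t["is_fraud"]]
    # Legitimate rows needed so that fraud / (fraud + legit) equals the target.
    n_legit = round(len(fraud) * (1 - target_fraud_rate) / target_fraud_rate)
    n_legit = min(n_legit, len(legit))  # cannot keep more than we have
    rng = random.Random(seed)
    sample = fraud + rng.sample(legit, n_legit)
    rng.shuffle(sample)
    return sample

def fraud_rate(rows):
    """Fraction of rows labelled as fraud."""
    return sum(1 for r in rows if r["is_fraud"]) / len(rows)

# Synthetic data: 20 fraud cases among 4000 transactions (0.5% fraud).
data = [{"id": i, "is_fraud": i < 20} for i in range(4000)]

# Illustrative training set with fraud raised to about 10% by undersampling.
train_10 = undersample(data, 0.10)
```

In this setup the training set is rebalanced while evaluation would still be done on data at the original, much lower fraud rate, so that measured performance reflects realistic conditions.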
Thank You
