SlideShare a Scribd company logo
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 04 | Apr 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1786
Fraud Detection in Credit Card using Machine Learning
Techniques
Mr.Manohar.s1, Arvind Bedi2, Shashank kumar3, Shounak kr Singh4
1(Professor/ CSE, SRM Institute of Science and Technology)
2(B.Tech - Student/ CSE, SRM Institute of Science and Technology)
3(B.Tech - Student/ CSE, SRM Institute of Science and Technology)
4(B.Tech - Student/ CSE, SRM Institute of Science and Technology)
-----------------------------------------------------------------------------***-------------------------------------------------------------------------
Abstract: Credit card fraud happens frequently and leads
to massive financial losses .Online transaction have
increased drastically significant no of online transaction
are done by online credit cards. Therefore, banks and
other financial institutions support the progress of credit
card fraud detection applications. Fraudulent transactions
can happen in different ways and they can be placed into
various categories. Identification of fraud credit card
transactions is important to credit card companies for the
prevention of being charged for items transaction of items
which the customer did not purchase. Data science along
with machine leaving helps in tackling these issues. The
fraudulent transactions are mixed up with legitimate
transactions and the simple recognition techniques which
include comparison of both the fraud and the legitimate
data are never sufficient to detect the fraud transactions
accurately. This project intends to illustrate the modelling
of a knowledge set using machine learning with Credit
Card Fraud Detection. The Credit Card Fraud Detection
Problem includes modelling of credit card transactions
which has happened earlier with the data of fraud
transactions. Our model will determine whether a new
transaction tends to be fraud or legitimate. We have an
objective to detect 100% of the fraud transactions while
reducing invalid fraud classifications.
I. INTRODUCTION
Fraud credit card transaction is the unbidden use of
someone’s account without the owner being aware of it
.Prevention measures need to be taken against such
fraudulent practices by analysing and studying these fraud
transactions to avoid similar situations in the upcoming
transactions.
Briefly, Credit Card transaction Frauds can be explained as
a scenario in which a fraudster utilizes a credit card of
another person for personal means without seeking
permission or authorization of the owner of the credit
card and the credit card issuing officials or institutions are
unknown of the fraud. In order to detect fraud, there is a
need to monitor the activities of the users to avoid
abnormal behaviour which include intrusion, Fraud and
defaulting. Machine learning and data science are the
communities that focus on problems like these since the
solution has greater possibility and feasibility for being
automated .From the perspective of learning this is a very
challenging problem as it is distinguished by many factors
such as class disparity. Usually the legitimate transactions
are more than the fraudulent ones. Furthermore, the
statistical properties of transaction arrangement changes
frequently over a time period .Real -word execution of
credit card fraud detection faces many more obstacles.
However In real world instances, automatic tools scan the
vast stream of transaction requests that tells which
transaction to legitimise. Machine learning algorithm are
used to inspect all the ligament transaction and list out
transaction which are suspicious.
The reported suspicious transaction one investigated by
professionals. They contact the cardholder to identify
whether the transaction was authentic or fraudulent. The
automated systems are the updated by the investigators
as a feedback which helps the system to further train and
improve the effectiveness of the fraud detection over time
.Fraud detection needs to be constantly updated to
defends against against merging fraudulent strategies by
the criminals.
II. LITERATURE SURVEY
Fraudulent transactions act as the illegal activities which
are meant to generate personal or financial profit. It is an
intentional act that is criminal with an intention to make
financial profit.
There are a number of literature or research papers
available on this domain in the public platform. A detailed
survey study conducted by Clifton Phua and his interns
suggested methodologies used in this domain are data
mining applications, Fraud detection, adversarial
detection. In another research paper Suman threw lights
that techniques like supervised and unsupervised learning
for fraud detection. Indeed these methods were very
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 04 | Apr 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1787
effective and efficient in some areas of the domain but
they failed to give a permanent solution to the fraud
detection in credit cards. In a similar research paper by
Wen-Fang Yu and Na Wang, in which they used outlier
mining, outlier detection mining and Distance sum
algorithm to predict fraud in an experiment conducted on
the credit card data of some particular commercial banks.
Outlier mining is a technique of data mining which is
mostly applied in the fields related to finance or internet.
It detects the fields which are not genuine. In this
technique we take fields of customers behavior and on the
basis of that we determine the distance between the
observed value of the field and the predetermined value.
There is some literature which suggests a completely new
perspective to detect fraud. In case of fraudulent
transactions, there have been some studies to make the
alert feedback interaction more efficient. Whenever a
fraudulent transaction is encountered an alert will be
generated and sent to the authorised server system which
in return will deny the transaction. One of the many other
aspects of Artificial Genetic Algorithm is an advancement
to deal with the problem in a totally different aspect. This
method resulted in more accurate fraud detection and less
number of false alerts.
III. SYSTEM FRAMEWORK
Fig.1 demonstrates the process involved in developing the
model. The diagram represents the key steps involved in
the development of the proposed model. The sequence of
operations like data processing, data cleaning and feature
extraction takes place and in the end the classification is
performed.
Fig.2
IV. METHODOLOGY
Our paper suggests an aspect of latest machine learning
algorithms to detect fraudulent or anomalous events
commonly called outliers. Above (Fig.2) is the architecture
diagram of our proposed system.
We have used a dataset provided by Kaggle, this dataset
comprises the transaction records of the european card
holders in the year 2013. Inside the dataset there are 31
columns out of which 30 are used as features and the
remaining 1 column is used as class. Our features include
Time, Amount and Number of transactions.
We have plotted a graph which shows the inconsistency in
the dataset. Kindly refer to Fig.3 for the same.
This graph shows that the number of fraud transactions is
very less as compared to authorised transactions.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 04 | Apr 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1788
Fig. 3
To determine the pattern of the fraudulent transactions
we have plotted three graphs between: Number of
transactions vs Time, Number of transactions vs amount,
Amount vs Time.
This graph shows the relation between Number of
transactions and Amount.
This graph shows the relation between Number of
transactions and Time.
This graph shows the relation between Amount and Time.
These graphs give us a correlation between all the
features of the dataset which will help us to detect an
anomaly. After this step we plot a Heatmap in order to
obtain a correlation between the predicting variable and
the class variable.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 04 | Apr 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1789
In succession to generating the Heatmap we will train our
system with different Machine Learning algorithms so that
we can extract the data and find a pattern between the
data i.e. Fraudulent transactions. We will use the pattern
determined between the normal transactions and the
fraudulent transactions which are already in the dataset to
predict a fraudulent transaction in future.
The following algorithms are used to build the model and
train the model for detection of frauds in credit card
transactions:
Random Forest Classifier:
Random Forest can be used to classify the data into one of
the many classes. In Spark, the Random Forest model can
be achieved using the RandomForest class which is part of
pyspark.mllib.classification module. The data needs to be
made available in the required format for analysis to be
performed on the same. RandomForest takes multiple
parameters. The parameters are: data, numClasses,
categorical Features Info, num Trees, feature Subset
Strategy, impurity, max Depth, max Bins, seed.
Decision Tree:
A decision tree is a tree-like structure in which the root
node and each internal node represent a "test" on an
attribute of an instance in the dataset, the outcome of each
test is represented by the corresponding branches and the
node that does not branch further is called a leaf node and
represents the class labels. It takes three parameters:
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 04 | Apr 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1790
Instances – the set of instances for which class label is
already known. Target_Attribute – the class label attribute.
Attributes_List – the list of predictor attributes.
Support Vector Machine:
Support Vector Machines (SVM) is another classification
algorithm that classifies data into one category or the
other by using hyperplanes based on the training data.
SVM essentially creates a model such that it finds the
widest possible margins and thus the optimal hyperplane.
SVM creates a separating hyperplane by transforming the
data into higher dimensions where the data is separable
using the kernel trick.
VI. CONCLUSION
Detection of credit card fraud is an intent part of testing
for the researchers over a long time and will be an
interesting part of testing in the coming time. We are
introducing a fraud detection system for credit-cards by
applying three different algorithms and training our
machine using these algorithms with the transaction
records we have. The model that we built helps the
authorities to get notified of the fraud in credit-cards and
take the further necessary steps over the transaction and
label the transaction as fraud or legitimate transaction.
These algorithms show us that the given transaction
tends to be a type of fraud or not, these algorithms were
selected using experimentation, discussion and feature
importance techniques as shown in methodology. It is a
real-time transaction data from European credit-card
holders which explains the skewness of data. Therefore,
we can infer that there is a requirement of applying
feature selection technique. We used a PCA algorithm to
select the features from our transaction dataset which
uses correlation and variance as parameters to select the
features. We have set the summation of variance as 95%
for selecting the features using the PCA algorithm. We also
applied the PCA feature selection algorithm as there are
no features which have high variance and correlation with
the class column which determines the transaction as
fraud or not. We have applied the three machine learning
algorithms as stated in methodology and the models
indicate a high accuracy score for each one of them. The
scores of each model were 99.7%, 99.8%, 99.7% for the
decision tree, support vector machine and random forest
classifier algorithms respectively. As these models have
high accuracy but the predicted values have a low
precision, so with the upcoming time we will be focusing
on improvement in our model and get the best results
with high precision in determining the fraud detections in
credit card transaction records.
VII. FUTURE ENHANCEMENTS
Reaching a Goal of 100% should be the target of our
accuracy score for our machine model which detects the
fraud detection of credit card transactions. But reaching a
100% accuracy score we can infer that our model is being
over fitted with data which is giving us the output which is
being already trained for it. So, for future enhancements
we can convey that the precision and confusion matrix
values can be improved with a high range. We can
furthermore implement new algorithms to our European
credit-card holders transaction dataset and combine the
results for these algorithms for precision and confusion
matrix to get more legitimate values. The data set can be
improved too with replacing the highly skewed values to
normalized values and bringing a pattern to it which helps
us in building a more accurate model. The outliers can be
minimized in the present data set as they confuse the
model while training it. The correlation and variance is
also less in the dataset which can be improved to by
minimizing the skewness of the data. These will increase
the modularity and versatility of our project and make it
more accurate to predict the transactions are tending to
fraud or not. These improvements require the knowledge
and support of experts from different sectors like:
machine learning and artificial intelligence experts, data
scientists and also bankers who will help us to provide the
better form of data.
REFERENCES
1. Credit Card Fraud Detection Based on
TransactionBehaviour-by John Richard D. Kho,
Larry A. Vea” published by Proc. of the 2017 IEEE
Region 10 Conference (TENCON), Malaysia,
November 5-8, 20107
2. CLIFTON PHUA1, VINCENT LEE1, KATE SMITH1 &
ROSS GAYLER2 “ A Comprehensive Survey of Data
Mining-based Fraud Detection Research” published
by School of Business Systems, Faculty of
Information Technology, MonashUniversity,
Wellington Road, Clayton, Victoria 3800, Australia
3. “Survey Paper on Credit Card FraudDetection
bySuman”, Research Scholar,GJUS&T Hisar
HCE,Sonepatpublished by International Journal of
Advanced Research in Computer Engineering &
Technology (IJARCET) Volume 3 Issue 3, March
2014
4. “Research on Credit Card Fraud Detection Model
Based on Distance Sum –by Wen-Fang YU and Na
Wang” published by 2009 International Joint
Conference on Artificial Intelligence
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 04 | Apr 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1791
5. “Credit Card Fraud Detection through Parenclitic
Network Analysis-By Massimiliano Zanin, Miguel
Romance, Regino Criado, and SantiagoMoral”
published by Hindawi Complexity Volume 2018,
Article ID 5764370, 9 pages
6. “Credit Card Fraud Detection: A Realistic Modeling
and a Novel Learning Strategy” published by IEEE
TRANSACTIONS ON NEURAL NETWORKS AND
LEARNING SYSTEMS, VOL. 29, NO. 8, AUGUST 2018
7. “Credit Card Fraud Detection-by Ishu Trivedi,
Monika, Mrigya,Mridushi” published by
International Journal of Advanced Research in
Computer and Communication Engineering .

More Related Content

PDF
IRJET- Survey on Credit Card Fraud Detection
PPTX
Online Payment Fraud Detection with Azure Machine Learning
PDF
An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...
PDF
Credit Card Fraud Detection
PDF
PPT
Telecom Fraud Detection - Naive Bayes Classification
PDF
IRJET- Credit Card Fraud Detection using Random Forest
PDF
Problem Reduction in Online Payment System Using Hybrid Model
IRJET- Survey on Credit Card Fraud Detection
Online Payment Fraud Detection with Azure Machine Learning
An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...
Credit Card Fraud Detection
Telecom Fraud Detection - Naive Bayes Classification
IRJET- Credit Card Fraud Detection using Random Forest
Problem Reduction in Online Payment System Using Hybrid Model

What's hot (19)

PDF
Fraud Detection presentation
PPT
Audit,fraud detection Using Picalo
PDF
Uses of analytics in the field of Banking
PDF
Enterprise Fraud Management: How Banks Need to Adapt
PDF
Automated anti money laundering using artificial intelligence and machine lea...
PPTX
Emerging technologies enabling in fraud detection
PDF
IRJET- Credit Card Transaction using Fingerprint Recognisation and Two St...
PPT
Ibm financial crime management solution 3
PDF
IoT Big Data Analytics Insights from Patents
PDF
IRJET- E-Commerce Recommender System using Data Mining Algorithms
PDF
Review on 3rd-party Cyber Risk Assessment and Scoring Tools
PDF
Predictive analytics solution for claims fraud detection
PDF
IRJET- Analysis on Credit Card Fraud Detection using Capsule Network
PDF
Use of Automated Teller Machine (ATM) card in Dhaka City: A Survey to Reveal ...
PPTX
Analystics in banking and financial services
PDF
Graphs in Government, Jim Webber
PPTX
Fraud detection analysis
PDF
Streamlining aml processes through automation with ai
PPT
Detecting Opportunities and Threats with Complex Event Processing: Case St...
Fraud Detection presentation
Audit,fraud detection Using Picalo
Uses of analytics in the field of Banking
Enterprise Fraud Management: How Banks Need to Adapt
Automated anti money laundering using artificial intelligence and machine lea...
Emerging technologies enabling in fraud detection
IRJET- Credit Card Transaction using Fingerprint Recognisation and Two St...
Ibm financial crime management solution 3
IoT Big Data Analytics Insights from Patents
IRJET- E-Commerce Recommender System using Data Mining Algorithms
Review on 3rd-party Cyber Risk Assessment and Scoring Tools
Predictive analytics solution for claims fraud detection
IRJET- Analysis on Credit Card Fraud Detection using Capsule Network
Use of Automated Teller Machine (ATM) card in Dhaka City: A Survey to Reveal ...
Analystics in banking and financial services
Graphs in Government, Jim Webber
Fraud detection analysis
Streamlining aml processes through automation with ai
Detecting Opportunities and Threats with Complex Event Processing: Case St...
Ad

Similar to IRJET - Fraud Detection in Credit Card using Machine Learning Techniques (20)

PDF
A Comparative Study on Credit Card Fraud Detection
PDF
IRJET - Online Credit Card Fraud Detection and Prevention System
PDF
Machine Learning-Based Approaches for Fraud Detection in Credit Card Transact...
PDF
CREDIT CARD FRAUD DETECTION IN R
PDF
A Research Paper on Credit Card Fraud Detection
PDF
CREDIT CARD FRAUD DETECTION AND AUTHENTICATION SYSTEM USING MACHINE LEARNING
PPTX
credit card fruad detection from the fake users.pptx
PDF
FRAUD DETECTION IN CREDIT CARD TRANSACTIONS
PDF
Online Transaction Fraud Detection System Based on Machine Learning
PDF
IRJET- Credit Card Fraud Detection using Machine Learning
PDF
A Comparative Study on Online Transaction Fraud Detection by using Machine Le...
PDF
A Review of deep learning techniques in detection of anomaly incredit card tr...
PDF
IRJET- Finalize Attributes and using Specific Way to Find Fraudulent Transaction
PPTX
Fraud detection in financial transaction using advanced analvtics technique
PDF
CREDIT CARD FRAUD DETECTION USING MACHINE LEARNING
PPTX
Credit card fraud dection
PDF
Tanvi_Sharma_Shruti_Garg_pre.pdf.pdf
PPTX
Credit Card Fraud Detection_ Mansi_Choudhary.pptx
PDF
A Study on Credit Card Fraud Detection using Machine Learning
PDF
Tools & Technique Project for machine learning
A Comparative Study on Credit Card Fraud Detection
IRJET - Online Credit Card Fraud Detection and Prevention System
Machine Learning-Based Approaches for Fraud Detection in Credit Card Transact...
CREDIT CARD FRAUD DETECTION IN R
A Research Paper on Credit Card Fraud Detection
CREDIT CARD FRAUD DETECTION AND AUTHENTICATION SYSTEM USING MACHINE LEARNING
credit card fruad detection from the fake users.pptx
FRAUD DETECTION IN CREDIT CARD TRANSACTIONS
Online Transaction Fraud Detection System Based on Machine Learning
IRJET- Credit Card Fraud Detection using Machine Learning
A Comparative Study on Online Transaction Fraud Detection by using Machine Le...
A Review of deep learning techniques in detection of anomaly incredit card tr...
IRJET- Finalize Attributes and using Specific Way to Find Fraudulent Transaction
Fraud detection in financial transaction using advanced analvtics technique
CREDIT CARD FRAUD DETECTION USING MACHINE LEARNING
Credit card fraud dection
Tanvi_Sharma_Shruti_Garg_pre.pdf.pdf
Credit Card Fraud Detection_ Mansi_Choudhary.pptx
A Study on Credit Card Fraud Detection using Machine Learning
Tools & Technique Project for machine learning
Ad

More from IRJET Journal (20)

PDF
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
PDF
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
PDF
Kiona – A Smart Society Automation Project
PDF
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
PDF
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
PDF
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
PDF
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
PDF
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
PDF
BRAIN TUMOUR DETECTION AND CLASSIFICATION
PDF
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
PDF
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
PDF
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
PDF
Breast Cancer Detection using Computer Vision
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Kiona – A Smart Society Automation Project
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
BRAIN TUMOUR DETECTION AND CLASSIFICATION
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
Breast Cancer Detection using Computer Vision
Auto-Charging E-Vehicle with its battery Management.
Analysis of high energy charge particle in the Heliosphere
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
Auto-Charging E-Vehicle with its battery Management.
Analysis of high energy charge particle in the Heliosphere
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...

Recently uploaded (20)

PPTX
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
PDF
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
PPTX
Construction Project Organization Group 2.pptx
PDF
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
PPT
CRASH COURSE IN ALTERNATIVE PLUMBING CLASS
PPTX
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
PDF
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
PDF
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
PPTX
Foundation to blockchain - A guide to Blockchain Tech
PDF
Automation-in-Manufacturing-Chapter-Introduction.pdf
PDF
R24 SURVEYING LAB MANUAL for civil enggi
PPTX
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
PDF
PPT on Performance Review to get promotions
PPTX
additive manufacturing of ss316l using mig welding
DOCX
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
PDF
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
PDF
Model Code of Practice - Construction Work - 21102022 .pdf
PDF
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
PDF
composite construction of structures.pdf
PPTX
UNIT 4 Total Quality Management .pptx
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
Construction Project Organization Group 2.pptx
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
CRASH COURSE IN ALTERNATIVE PLUMBING CLASS
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
Foundation to blockchain - A guide to Blockchain Tech
Automation-in-Manufacturing-Chapter-Introduction.pdf
R24 SURVEYING LAB MANUAL for civil enggi
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
PPT on Performance Review to get promotions
additive manufacturing of ss316l using mig welding
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
Model Code of Practice - Construction Work - 21102022 .pdf
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
composite construction of structures.pdf
UNIT 4 Total Quality Management .pptx

IRJET - Fraud Detection in Credit Card using Machine Learning Techniques

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 04 | Apr 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1786 Fraud Detection in Credit Card using Machine Learning Techniques Mr.Manohar.s1, Arvind Bedi2, Shashank kumar3, Shounak kr Singh4 1(Professor/ CSE, SRM Institute of Science and Technology) 2(B.Tech - Student/ CSE, SRM Institute of Science and Technology) 3(B.Tech - Student/ CSE, SRM Institute of Science and Technology) 4(B.Tech - Student/ CSE, SRM Institute of Science and Technology) -----------------------------------------------------------------------------***------------------------------------------------------------------------- Abstract: Credit card fraud happens frequently and leads to massive financial losses .Online transaction have increased drastically significant no of online transaction are done by online credit cards. Therefore, banks and other financial institutions support the progress of credit card fraud detection applications. Fraudulent transactions can happen in different ways and they can be placed into various categories. Identification of fraud credit card transactions is important to credit card companies for the prevention of being charged for items transaction of items which the customer did not purchase. Data science along with machine leaving helps in tackling these issues. The fraudulent transactions are mixed up with legitimate transactions and the simple recognition techniques which include comparison of both the fraud and the legitimate data are never sufficient to detect the fraud transactions accurately. This project intends to illustrate the modelling of a knowledge set using machine learning with Credit Card Fraud Detection. The Credit Card Fraud Detection Problem includes modelling of credit card transactions which has happened earlier with the data of fraud transactions. Our model will determine whether a new transaction tends to be fraud or legitimate. We have an objective to detect 100% of the fraud transactions while reducing invalid fraud classifications. I. INTRODUCTION Fraud credit card transaction is the unbidden use of someone’s account without the owner being aware of it .Prevention measures need to be taken against such fraudulent practices by analysing and studying these fraud transactions to avoid similar situations in the upcoming transactions. Briefly, Credit Card transaction Frauds can be explained as a scenario in which a fraudster utilizes a credit card of another person for personal means without seeking permission or authorization of the owner of the credit card and the credit card issuing officials or institutions are unknown of the fraud. In order to detect fraud, there is a need to monitor the activities of the users to avoid abnormal behaviour which include intrusion, Fraud and defaulting. Machine learning and data science are the communities that focus on problems like these since the solution has greater possibility and feasibility for being automated .From the perspective of learning this is a very challenging problem as it is distinguished by many factors such as class disparity. Usually the legitimate transactions are more than the fraudulent ones. Furthermore, the statistical properties of transaction arrangement changes frequently over a time period .Real -word execution of credit card fraud detection faces many more obstacles. However In real world instances, automatic tools scan the vast stream of transaction requests that tells which transaction to legitimise. Machine learning algorithm are used to inspect all the ligament transaction and list out transaction which are suspicious. The reported suspicious transaction one investigated by professionals. They contact the cardholder to identify whether the transaction was authentic or fraudulent. The automated systems are the updated by the investigators as a feedback which helps the system to further train and improve the effectiveness of the fraud detection over time .Fraud detection needs to be constantly updated to defends against against merging fraudulent strategies by the criminals. II. LITERATURE SURVEY Fraudulent transactions act as the illegal activities which are meant to generate personal or financial profit. It is an intentional act that is criminal with an intention to make financial profit. There are a number of literature or research papers available on this domain in the public platform. A detailed survey study conducted by Clifton Phua and his interns suggested methodologies used in this domain are data mining applications, Fraud detection, adversarial detection. In another research paper Suman threw lights that techniques like supervised and unsupervised learning for fraud detection. Indeed these methods were very
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 04 | Apr 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1787 effective and efficient in some areas of the domain but they failed to give a permanent solution to the fraud detection in credit cards. In a similar research paper by Wen-Fang Yu and Na Wang, in which they used outlier mining, outlier detection mining and Distance sum algorithm to predict fraud in an experiment conducted on the credit card data of some particular commercial banks. Outlier mining is a technique of data mining which is mostly applied in the fields related to finance or internet. It detects the fields which are not genuine. In this technique we take fields of customers behavior and on the basis of that we determine the distance between the observed value of the field and the predetermined value. There is some literature which suggests a completely new perspective to detect fraud. In case of fraudulent transactions, there have been some studies to make the alert feedback interaction more efficient. Whenever a fraudulent transaction is encountered an alert will be generated and sent to the authorised server system which in return will deny the transaction. One of the many other aspects of Artificial Genetic Algorithm is an advancement to deal with the problem in a totally different aspect. This method resulted in more accurate fraud detection and less number of false alerts. III. SYSTEM FRAMEWORK Fig.1 demonstrates the process involved in developing the model. The diagram represents the key steps involved in the development of the proposed model. The sequence of operations like data processing, data cleaning and feature extraction takes place and in the end the classification is performed. Fig.2 IV. METHODOLOGY Our paper suggests an aspect of latest machine learning algorithms to detect fraudulent or anomalous events commonly called outliers. Above (Fig.2) is the architecture diagram of our proposed system. We have used a dataset provided by Kaggle, this dataset comprises the transaction records of the european card holders in the year 2013. Inside the dataset there are 31 columns out of which 30 are used as features and the remaining 1 column is used as class. Our features include Time, Amount and Number of transactions. We have plotted a graph which shows the inconsistency in the dataset. Kindly refer to Fig.3 for the same. This graph shows that the number of fraud transactions is very less as compared to authorised transactions.
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 04 | Apr 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1788 Fig. 3 To determine the pattern of the fraudulent transactions we have plotted three graphs between: Number of transactions vs Time, Number of transactions vs amount, Amount vs Time. This graph shows the relation between Number of transactions and Amount. This graph shows the relation between Number of transactions and Time. This graph shows the relation between Amount and Time. These graphs give us a correlation between all the features of the dataset which will help us to detect an anomaly. After this step we plot a Heatmap in order to obtain a correlation between the predicting variable and the class variable.
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 04 | Apr 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1789 In succession to generating the Heatmap we will train our system with different Machine Learning algorithms so that we can extract the data and find a pattern between the data i.e. Fraudulent transactions. We will use the pattern determined between the normal transactions and the fraudulent transactions which are already in the dataset to predict a fraudulent transaction in future. The following algorithms are used to build the model and train the model for detection of frauds in credit card transactions: Random Forest Classifier: Random Forest can be used to classify the data into one of the many classes. In Spark, the Random Forest model can be achieved using the RandomForest class which is part of pyspark.mllib.classification module. The data needs to be made available in the required format for analysis to be performed on the same. RandomForest takes multiple parameters. The parameters are: data, numClasses, categorical Features Info, num Trees, feature Subset Strategy, impurity, max Depth, max Bins, seed. Decision Tree: A decision tree is a tree-like structure in which the root node and each internal node represent a "test" on an attribute of an instance in the dataset, the outcome of each test is represented by the corresponding branches and the node that does not branch further is called a leaf node and represents the class labels. It takes three parameters:
  • 5. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 04 | Apr 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1790 Instances – the set of instances for which class label is already known. Target_Attribute – the class label attribute. Attributes_List – the list of predictor attributes. Support Vector Machine: Support Vector Machines (SVM) is another classification algorithm that classifies data into one category or the other by using hyperplanes based on the training data. SVM essentially creates a model such that it finds the widest possible margins and thus the optimal hyperplane. SVM creates a separating hyperplane by transforming the data into higher dimensions where the data is separable using the kernel trick. VI. CONCLUSION Detection of credit card fraud is an intent part of testing for the researchers over a long time and will be an interesting part of testing in the coming time. We are introducing a fraud detection system for credit-cards by applying three different algorithms and training our machine using these algorithms with the transaction records we have. The model that we built helps the authorities to get notified of the fraud in credit-cards and take the further necessary steps over the transaction and label the transaction as fraud or legitimate transaction. These algorithms show us that the given transaction tends to be a type of fraud or not, these algorithms were selected using experimentation, discussion and feature importance techniques as shown in methodology. It is a real-time transaction data from European credit-card holders which explains the skewness of data. Therefore, we can infer that there is a requirement of applying feature selection technique. We used a PCA algorithm to select the features from our transaction dataset which uses correlation and variance as parameters to select the features. We have set the summation of variance as 95% for selecting the features using the PCA algorithm. We also applied the PCA feature selection algorithm as there are no features which have high variance and correlation with the class column which determines the transaction as fraud or not. We have applied the three machine learning algorithms as stated in methodology and the models indicate a high accuracy score for each one of them. The scores of each model were 99.7%, 99.8%, 99.7% for the decision tree, support vector machine and random forest classifier algorithms respectively. As these models have high accuracy but the predicted values have a low precision, so with the upcoming time we will be focusing on improvement in our model and get the best results with high precision in determining the fraud detections in credit card transaction records. VII. FUTURE ENHANCEMENTS Reaching a Goal of 100% should be the target of our accuracy score for our machine model which detects the fraud detection of credit card transactions. But reaching a 100% accuracy score we can infer that our model is being over fitted with data which is giving us the output which is being already trained for it. So, for future enhancements we can convey that the precision and confusion matrix values can be improved with a high range. We can furthermore implement new algorithms to our European credit-card holders transaction dataset and combine the results for these algorithms for precision and confusion matrix to get more legitimate values. The data set can be improved too with replacing the highly skewed values to normalized values and bringing a pattern to it which helps us in building a more accurate model. The outliers can be minimized in the present data set as they confuse the model while training it. The correlation and variance is also less in the dataset which can be improved to by minimizing the skewness of the data. These will increase the modularity and versatility of our project and make it more accurate to predict the transactions are tending to fraud or not. These improvements require the knowledge and support of experts from different sectors like: machine learning and artificial intelligence experts, data scientists and also bankers who will help us to provide the better form of data. REFERENCES 1. Credit Card Fraud Detection Based on TransactionBehaviour-by John Richard D. Kho, Larry A. Vea” published by Proc. of the 2017 IEEE Region 10 Conference (TENCON), Malaysia, November 5-8, 20107 2. CLIFTON PHUA1, VINCENT LEE1, KATE SMITH1 & ROSS GAYLER2 “ A Comprehensive Survey of Data Mining-based Fraud Detection Research” published by School of Business Systems, Faculty of Information Technology, MonashUniversity, Wellington Road, Clayton, Victoria 3800, Australia 3. “Survey Paper on Credit Card FraudDetection bySuman”, Research Scholar,GJUS&T Hisar HCE,Sonepatpublished by International Journal of Advanced Research in Computer Engineering & Technology (IJARCET) Volume 3 Issue 3, March 2014 4. “Research on Credit Card Fraud Detection Model Based on Distance Sum –by Wen-Fang YU and Na Wang” published by 2009 International Joint Conference on Artificial Intelligence
  • 6. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 04 | Apr 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1791 5. “Credit Card Fraud Detection through Parenclitic Network Analysis-By Massimiliano Zanin, Miguel Romance, Regino Criado, and SantiagoMoral” published by Hindawi Complexity Volume 2018, Article ID 5764370, 9 pages 6. “Credit Card Fraud Detection: A Realistic Modeling and a Novel Learning Strategy” published by IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 29, NO. 8, AUGUST 2018 7. “Credit Card Fraud Detection-by Ishu Trivedi, Monika, Mrigya,Mridushi” published by International Journal of Advanced Research in Computer and Communication Engineering .