Machine Intelligence for Fraud Prediction

#paymentsecurity
Dmitry Petukhov,
ML/DS Preacher,
Machine Intelligence Researcher @ OpenWay &&
Coffee Addicted

Terminology
A computer program is said to learn from experience E with respect to some class of tasks T
and performance measure P if its performance at tasks in T, as measured by P,
improves with experience E.
T.M. Mitchell. Machine Learning, 1997.
Machine learning is the process by which a machine (computer) becomes able to
exhibit behavior that was not explicitly programmed into it.
A.L. Samuel. Some Studies in Machine Learning Using the Game of Checkers, 1959.

Machine Learning is the Future
Thesis #1

Machine Intelligence Cases for Retail Banking
[Chart. Axes: Processing Speed (Batch Processing to Real-time); Log(Volume) (Gbytes, Tbytes, Pbytes); data type (Structured, Semi-structured, Unstructured)]
Cases shown:
Personalized Product Offering
Customer Loyalty
Operational Efficiencies
Fraud Detection
Compliance and Regulatory Reporting
Voice Identity, Chat-bots
Customer Segmentation
Credit Scoring
Credit Card Fraud
Web-/Mobile Bank Fraud
Insider Threats
Information Attacks

Card-not-present Fraud Volume == Big Data case
Volume
Variety
Velocity

Machine Intelligence + Big Data
New Paradigm

Old School vs Big Data Paradigm

Old School vs AI Paradigm
Old School: static* threshold
AI: dynamic threshold
* ∆t attack ≪ ∆t reaction
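The slide does not say how a dynamic threshold is computed. As a minimal sketch, assuming the threshold is the mean plus k standard deviations over a sliding window of recent amounts, the contrast with a fixed rule could look like this (STATIC_THRESHOLD, DynamicThreshold, window and k are all hypothetical names and values chosen for illustration):

```python
from collections import deque
from statistics import mean, pstdev

STATIC_THRESHOLD = 1000.0  # hypothetical fixed limit set by an analyst


def static_flag(amount):
    """Old-school rule: flag only amounts above a fixed limit."""
    return amount > STATIC_THRESHOLD


class DynamicThreshold:
    """Assumed variant: limit = mean + k * std of the last `window` amounts."""

    def __init__(self, window=50, k=3.0):
        self.recent = deque(maxlen=window)
        self.k = k

    def flag(self, amount):
        # Until two observations are seen there is no spread to compare with.
        if len(self.recent) >= 2:
            limit = mean(self.recent) + self.k * pstdev(self.recent)
            flagged = amount > limit
        else:
            flagged = False
        # Every amount updates the window, so the limit follows recent behavior.
        self.recent.append(amount)
        return flagged


detector = DynamicThreshold(window=5, k=3.0)
for amount in [20.0, 22.0, 19.0, 21.0, 20.0]:
    detector.flag(amount)

suspicious = 300.0
print(static_flag(suspicious), detector.flag(suspicious))
```

In this toy run the 300.00 payment stays under the fixed limit but exceeds the limit derived from recent behavior, which is the gap the footnote points at: a static rule changes only when someone reacts.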

Evolution and (not "or") Revolution
1. FALSE
2. FALSE
3. TRUE

Machine Intelligence Stack
Data
Infrastructure
Intelligence

Machine Intelligence Stack
Machine / Human
Private cloud / Hybrid cloud / Public cloud
Forget or Secure / Store and share
Cost
Law? Ethics?
Black box?

Architecture: Data Flow Online
Real-time processing
Transactions stream
Risk score
Internal data
Transactions Log (WAY4),
customers/merchants CRMs,
black/white lists
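External data
NBKI (National Bureau of Credit Histories), FNS (Federal Tax Service),
PFR (Pension Fund of Russia), FSSP (Federal Bailiff Service),
location & devices definition, social
graph, mobile provider score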
0. Retrieve data
1. Preprocessing data
2. Calculate statistics
3. Train model
4. Evaluate model
Details / Raw / Aggregates / Model
Private data (Federal Law 152-FZ)
Payment data (PCI DSS)

Step 1: Preprocessing Data
Transaction Amount Challenge

Step 1: Preprocessing Data
Customer Clustering Challenge

Step 2: Calculate Statistics
1% of women aged 40 who take part in routine screening have breast cancer. 80% of women with breast
cancer get a positive mammogram. 9.6% of healthy women also get a positive
result (mammography, like any measurement, is not 100% accurate).
A woman in this age group got a positive result at a routine screening.
What is the probability that she actually has breast cancer?
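The mammography question is a direct application of Bayes' rule. A small sketch of the arithmetic, using only the three numbers from the problem statement (the function and parameter names are illustrative):

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) by Bayes' rule."""
    # Total probability of a positive test: true positives plus false positives.
    evidence = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / evidence


# 1% prevalence, 80% of sick women test positive, 9.6% of healthy women test positive.
p = posterior(prior=0.01, sensitivity=0.80, false_positive_rate=0.096)
print(f"{p:.3f}")  # 0.078
```

The answer is about 7.8%, far below the 80% many people guess. The same base-rate effect applies to fraud: when fraud is rare, most alerts from even a good detector are false positives.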

Step 3: Train Model
Algorithm Selection Challenge
Algorithm                 Accuracy      Speed     Specifics
1. Logistic regression    low           fast      linearly separable
2. Decision Tree          low           medium    human-readable
3. Boosted Decision Tree  high          medium    generalization ability
4. Neural Networks        medium-high   low       pattern recognition
5. Deep Learning          high          very low  magic AI

Step 4: Evaluate Model
$$\text{Accuracy} = \frac{TP + TN}{P + N}$$
$$\text{Recall}^{*} = \frac{TP}{TP + FN}$$
$$\text{Precision} = \frac{TP}{TP + FP}$$
$$F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$
Challenges:
- Imbalanced classes;
- False Positive Penalty != False Negative Penalty;
- Calculate business metrics:
  - Direct and indirect losses;
- Bonus: if you change the threshold, you change everything…
Wikipedia
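The challenges listed on this slide can be made concrete with a toy example. The sketch below assumes a model that outputs a risk score per transaction. The scores, labels and penalty values are invented for illustration, and f_beta is the general F-beta score (beta = 1 gives the harmonic mean of precision and recall):

```python
def confusion(scores, labels, threshold):
    """Count TP, FP, TN, FN when flagging every score >= threshold."""
    tp = fp = tn = fn = 0
    for score, is_fraud in zip(scores, labels):
        flagged = score >= threshold
        if flagged and is_fraud:
            tp += 1
        elif flagged and not is_fraud:
            fp += 1
        elif is_fraud:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn, beta=1.0):
    """Accuracy, precision, recall and F-beta from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    b2 = beta * beta
    denom = b2 * precision + recall
    f_beta = (1 + b2) * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f_beta


def expected_loss(fp, fn, fp_cost, fn_cost):
    """Business metric: unequal penalties for false alarms and missed fraud."""
    return fp * fp_cost + fn * fn_cost


# Toy data: 100 transactions, only 2 of them fraudulent (imbalanced classes).
labels = [1, 1] + [0] * 98
scores = [0.9, 0.4, 0.6] + [0.1] * 97

for threshold in (1.0, 0.5, 0.3):
    tp, fp, tn, fn = confusion(scores, labels, threshold)
    acc, prec, rec, f1 = metrics(tp, fp, tn, fn)
    # Hypothetical penalties: a missed fraud costs 50x a false alarm.
    loss = expected_loss(fp, fn, fp_cost=10.0, fn_cost=500.0)
    print(threshold, round(acc, 2), round(prec, 2), round(rec, 2), round(f1, 2), loss)
```

At threshold 1.0 nothing is flagged, yet accuracy is still 0.98, which is why accuracy alone is misleading on imbalanced data. Lowering the threshold moves precision, recall and the expected loss together, so the threshold is best chosen against the business metric rather than a single statistical score.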


© 2017, Dmitry Petukhov. CC BY-SA 4.0 license. OpenWay and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
Thank you!

Q&A
Now or later (see contacts below)
Stay connected
Facebook: @code.zombi
Habr: @codezombie
All contacts: http://0xCode.in/author
Download presentation from
http://0xCode.in/2017/paymentsecurity
