MACHINE INTELLIGENCE FOR
FRAUD PREDICTION
#paymentsecurity
Dmitry Petukhov,
ML/DS Preacher,
Machine Intelligence Researcher @ OpenWay &&
Coffee Addict
A computer program is said to learn from experience E with respect to some class of tasks T and
performance measure P if its performance at tasks in T, as measured by P, improves with
experience E.
T.M. Mitchell. Machine Learning, 1997.
Machine learning is the process by which a machine (computer) becomes able to
exhibit behavior that was not explicitly programmed into it.
A.L. Samuel. Some Studies in Machine Learning Using the Game of Checkers, 1959.
Terminology
Machine Learning is the Future
Thesis #1
Machine Intelligence Cases for Retail Banking
[Chart: use cases positioned by Processing Speed (Batch Processing to Real-time), Log(Volume) (Gbytes, Tbytes, Pbytes), and data type (Structured, Semi-structured, Unstructured)]
Personalized Product Offering
Customer Loyalty
Operational Efficiencies
Fraud Detection
Compliance and Regulatory Reporting
Voice Identity, Chat-bots
Customer Segmentation
Credit Scoring
Credit Card Fraud
Web-/Mobile Bank Fraud
Insider Threats
Information Attacks
Data are everywhere
Thesis #2
Card-not-present Fraud Volume == Big Data case
Volume
Variety
Velocity
Machine Intelligence + Big Data
New Paradigm
Old School vs Big Data Paradigm
Old School vs AI Paradigm
Old School: static* threshold
AI: dynamic threshold
* ∆t attack ≪ ∆t reaction
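To make the static versus dynamic threshold contrast concrete, here is a minimal sketch. The rule used for the dynamic case (rolling mean plus k standard deviations), the window size, the fixed limit, and the amounts are all illustrative assumptions, not the method from the talk.

```python
from collections import deque
from statistics import mean, pstdev

STATIC_LIMIT = 1000.0  # hypothetical fixed rule: flag anything above this amount


def static_flag(amount: float) -> bool:
    """Old-school rule: one fixed limit for every transaction."""
    return amount > STATIC_LIMIT


class DynamicThreshold:
    """Flags amounts far above the recent history (mean + k * std)."""

    def __init__(self, window: int = 5, k: float = 3.0) -> None:
        self.history = deque(maxlen=window)
        self.k = k

    def flag(self, amount: float) -> bool:
        # Wait for a few observations before trusting the statistics.
        if len(self.history) < 3:
            self.history.append(amount)
            return False
        limit = mean(self.history) + self.k * pstdev(self.history)
        suspicious = amount > limit
        if not suspicious:
            # Only normal-looking amounts update the baseline.
            self.history.append(amount)
        return suspicious


amounts = [20.0, 25.0, 22.0, 24.0, 400.0, 23.0]  # made-up card history
detector = DynamicThreshold()
dynamic_flags = [detector.flag(a) for a in amounts]
static_flags = [static_flag(a) for a in amounts]
```

In this toy history the 400.0 payment stays under the fixed limit, so the static rule misses it, while the limit derived from recent behavior flags it without anyone retuning a rule by hand.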
Evolution and Revolution
[Slide graphic: three numbered statements marked FALSE, FALSE, TRUE; statement text not recoverable]
Machine Intelligence Stack
Data
Infrastructure
Intelligence
Machine Intelligence Stack
Machine, Human
Private cloud, Public cloud, Hybrid cloud
Forget or Secure, Store and share
Cost
Law? Ethics?
Black box?
Architecture: Data Flow Online
Real-time processing
Transactions stream
Risk score
Internal data
Transactions Log (WAY4),
customers/merchants CRMs,
black/white lists
External data
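NBKI (National Bureau of Credit Histories), FNS (Federal Tax Service), PFR (Pension Fund of Russia), FSSP (Federal Bailiff Service),
location & device detection, social graph, mobile provider score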
0. Retrieve data
1. Preprocess data
2. Calculate statistics
3. Train model
4. Evaluate model
Details, Raw, Aggregates, Model
Private data (Federal Law 152-FZ)
Payment data (PCI DSS)
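The online part of this data flow (transaction stream in, risk score out) can be sketched roughly as follows. Everything here is hypothetical: the `Transaction` fields, the `OnlineScorer` class, and the placeholder scoring rule stand in for the trained model and the real data sources, which the slide does not detail.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Transaction:
    card_id: str
    amount: float
    merchant_id: str


class OnlineScorer:
    """Keeps running per-card aggregates and returns a risk score in [0, 1]."""

    def __init__(self, blacklist: set) -> None:
        self.blacklist = blacklist        # stand-in for black/white lists
        self.count = defaultdict(int)     # transactions seen per card
        self.total = defaultdict(float)   # total amount seen per card

    def score(self, tx: Transaction) -> float:
        if tx.merchant_id in self.blacklist:
            risk = 1.0                    # blacklisted merchant: maximum score
        elif self.count[tx.card_id] == 0:
            risk = 0.5                    # no history yet: neutral placeholder
        else:
            avg = self.total[tx.card_id] / self.count[tx.card_id]
            # Placeholder "model": how far the amount exceeds the card's average.
            ratio = tx.amount / avg
            risk = min(1.0, max(0.0, (ratio - 1.0) / 10.0))
        # Update aggregates after scoring, so the score uses past data only.
        self.count[tx.card_id] += 1
        self.total[tx.card_id] += tx.amount
        return risk


scorer = OnlineScorer(blacklist={"m-bad"})
stream = [
    Transaction("c1", 50.0, "m1"),
    Transaction("c1", 50.0, "m2"),
    Transaction("c1", 600.0, "m3"),
    Transaction("c1", 10.0, "m-bad"),
]
scores = [scorer.score(tx) for tx in stream]
```

The point of the sketch is the ordering: each transaction is scored against aggregates built from earlier transactions, and only then folded into those aggregates.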
Step 1: Preprocessing Data
Transaction Amount Challenge
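The slide's illustration is not recoverable, so the following is only one common reading of the challenge: transaction amounts are heavy-tailed, and a log transform before standardization keeps a single huge payment from squeezing all ordinary amounts together. The amounts and the choice of `log1p` are assumptions.

```python
import math
from statistics import mean, pstdev

amounts = [5.0, 12.0, 30.0, 45.0, 80.0, 15000.0]  # made-up, heavy-tailed

# log1p compresses the long right tail and is defined for zero amounts.
log_amounts = [math.log1p(a) for a in amounts]


def standardize(values):
    """Rescale to zero mean and unit variance (z-scores)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


z_raw = standardize(amounts)
z_log = standardize(log_amounts)

# Spread between the smallest and the largest "ordinary" amount (5.0 vs 80.0).
spread_raw = z_raw[4] - z_raw[0]
spread_log = z_log[4] - z_log[0]
```

On the raw scale the five ordinary amounts become nearly indistinguishable after standardization; on the log scale they stay clearly separated.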
Step 1: Preprocessing Data
Customer Clustering Challenge
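The slide's content is not recoverable; as one possible illustration, customers can be grouped by behavior so that later statistics are computed per group. The sketch below uses plain k-means (Lloyd's algorithm), which is an assumption, as are the two features and the sample values. In practice the features would be scaled first.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for c, members in zip(centroids, clusters):
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(c)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    labels = [min(range(len(centroids)),
                  key=lambda i: math.dist(p, centroids[i])) for p in points]
    return centroids, labels


# Hypothetical customers: (average transaction amount, transactions per month)
customers = [(20.0, 30.0), (25.0, 28.0), (22.0, 35.0),
             (900.0, 2.0), (950.0, 3.0), (1000.0, 1.0)]
centroids, labels = kmeans(customers, centroids=[customers[0], customers[3]])
```

Here the frequent small spenders and the rare large spenders end up in separate clusters, so a 900.0 payment is unusual for the first group but ordinary for the second.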
Step 2: Calculate Statistics
1% of women at age 40 who participate in routine screening have breast cancer. 80% of women with breast
cancer get a positive mammography result. 9.6% of healthy women also get a positive
result (mammography, like any measurement, is not 100% accurate).
A woman in this age group got a positive result in a routine screening.
What is the probability that she actually has breast cancer?
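The question on this slide can be answered with Bayes' theorem using only the three numbers given (1% prevalence, 80% true-positive rate, 9.6% false-positive rate). A short worked computation, with variable names chosen here for illustration:

```python
p_cancer = 0.01              # prior: share of screened women with cancer
p_pos_given_cancer = 0.80    # positive result when cancer is present
p_pos_given_healthy = 0.096  # positive result when cancer is absent

# Total probability of a positive result.
p_pos = p_pos_given_cancer * p_cancer + p_pos_given_healthy * (1 - p_cancer)

# Bayes' theorem: P(cancer | positive).
p_cancer_given_pos = p_pos_given_cancer * p_cancer / p_pos

print(f"{p_cancer_given_pos:.1%}")  # prints 7.8%
```

The answer is about 7.8%, far below the intuitive 80%. The same base-rate effect plausibly applies to fraud, where the positive class is rare: most alerts from a seemingly accurate detector can still be false alarms.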
Step 2: Calculate Statistics
Step 3: Train Model
Algorithm Selection Challenge
Algorithm                  Accuracy     Speed     Specifics
1. Logistic regression     low          fast      linearly separable
2. Decision Tree           low          medium    human-readable
3. Boosted Decision Tree   high         medium    generalization ability
4. Neural Networks         medium-high  low       pattern recognition
5. Deep Learning           high         very low  magic AI
Step 4: Evaluate Model
$$\mathrm{Accuracy} = \frac{TP + TN}{P + N}$$
$$\mathrm{Recall}^{*} = \frac{TP}{TP + FN}$$
Challenges:
- Imbalanced classes;
- False Positive Penalty != False Negative Penalty;
- Calculate business metrics:
  - Direct and indirect losses;
- Bonus: if you change the threshold, you change everything…
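A small sketch of the last two challenges: when a false negative costs more than a false positive, the threshold that minimizes total loss shifts. The scores, labels, and both cost values below are made up for illustration.

```python
# (risk score from a model, true label: 1 = fraud, 0 = legitimate); made-up data
scored = [(0.05, 0), (0.10, 0), (0.20, 0), (0.35, 0), (0.40, 1),
          (0.55, 0), (0.60, 1), (0.80, 1), (0.90, 0), (0.95, 1)]

COST_FP = 1.0    # hypothetical cost of blocking a legitimate payment
COST_FN = 10.0   # hypothetical cost of missing a fraudulent one


def confusion(threshold):
    """Count TP, FP, FN, TN when scores >= threshold are flagged as fraud."""
    tp = sum(1 for s, y in scored if s >= threshold and y == 1)
    fp = sum(1 for s, y in scored if s >= threshold and y == 0)
    fn = sum(1 for s, y in scored if s < threshold and y == 1)
    tn = sum(1 for s, y in scored if s < threshold and y == 0)
    return tp, fp, fn, tn


def total_cost(threshold):
    """Business loss at a given threshold under the assumed penalties."""
    _, fp, fn, _ = confusion(threshold)
    return COST_FP * fp + COST_FN * fn


thresholds = [0.3, 0.5, 0.7]
costs = {t: total_cost(t) for t in thresholds}
best = min(thresholds, key=total_cost)
```

With these penalties the lowest threshold wins even though it raises the most false alarms; swapping the two cost values would change the choice, which is the sense in which changing the threshold changes everything.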
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
$$F_1 = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$$
Wikipedia
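The metrics on this slide can be computed directly from confusion-matrix counts. The sketch below uses the general F-beta form (beta = 1 gives the usual F-measure); the two sets of counts are made up to show why accuracy misleads on imbalanced classes.

```python
def metrics(tp, fp, fn, tn, beta=1.0):
    """Accuracy, precision, recall and F-beta from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    b2 = beta ** 2
    denom = b2 * precision + recall
    f_beta = (1 + b2) * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f_beta


# Made-up imbalanced sample: 10 fraud cases among 10,000 transactions.
# A "model" that never flags anything:
never_flag = metrics(tp=0, fp=0, fn=10, tn=9990)
# A model that catches 8 of 10 frauds at the price of 40 false alarms:
useful = metrics(tp=8, fp=40, fn=2, tn=9950)
```

The model that never flags anything scores higher on accuracy (0.999 against 0.9958) while catching no fraud at all, which is why recall, precision, and the F-measure are reported alongside it.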
Rule-based or AI-based?
© 2017, Dmitry Petukhov. CC BY-SA 4.0 license. OpenWay and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
Thank you!
Q&A
Now or later (see contacts below)
Stay connected
Facebook: @code.zombi
Habr: @codezombie
All contacts: http://0xCode.in/author
Download presentation from
http://0xCode.in/2017/paymentsecurity
