DISCOVERING FIRE
OR WHY YOUR DATA QUALITY PROBLEM IS NOT THE USER’S FAULT
www.suade.org/fire
HOW DO WE PREVENT THE NEXT FINANCIAL CRISIS?
REGTECH...
THE REGULATORY EQUATION
R(x1, x2, ..., xn) = y
where:
y = metrics, reports, analytics, insights
R = rules, regulations
x = data
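One way to read the equation is as a plain function from data records to a reported figure. The sketch below assumes a hypothetical rule `total_deposits` and hypothetical record fields (`type`, `balance`); neither comes from the slides or from any real schema.

```python
from typing import Callable, Dict, Iterable, Union

# A data record x_i. The field names are hypothetical, for illustration only.
Record = Dict[str, Union[float, str]]

# A rule R maps the data x_1..x_n to an output y.
Rule = Callable[[Iterable[Record]], float]


def total_deposits(records: Iterable[Record]) -> float:
    """Hypothetical rule R: sum the balances of records typed as deposits."""
    return sum(float(r["balance"]) for r in records if r["type"] == "deposit")


data = [
    {"type": "deposit", "balance": 30.0},
    {"type": "loan", "balance": 50.0},
    {"type": "deposit", "balance": 15.0},
]

rule: Rule = total_deposits
y = rule(data)  # y = R(x1, x2, x3)
print(y)
```

The output y can only be as good as the records passed in, which is why the rest of the deck concentrates on x rather than on R.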
TECHNOLOGY SAYS GARBAGE IN, GARBAGE OUT
- We think garbage is acceptable
- Garbage input is not our problem
- Garbage output is also not our problem
Leverage your data model to prevent garbage in!
FINANCIAL REGULATION (FIRE) DATA FORMAT
Objectives:
- Quality, control and governance for Financial Institutions
- Comparability and supervision of underlying data for Regulators
- Seamless integration for FinTechs and third parties
GitHub:
- https://github.com/suadelabs/fire
WHAT IS BAD DATA?
Leverage your data model to prevent garbage in!
Bad data is data with:
- No definitions
- No validations
- Missing values
- Wrong values
- Flags
- Derived values
https://xkcd.com/1838/
BAD DATA: NO DEFINITIONS
Data:
ACT_DEP_BAL  R10_USD_AMT  GTEE_AMT  LIQ14_TOT_BAL  R12_AMT
90           60           60        30             120
Output:
                Amount in USD
Total Deposits  ?
Rule 1: No attribute without definition
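Rule 1 can be enforced mechanically. The following is a minimal sketch, assuming a hypothetical data dictionary `DEFINITIONS` that maps column names to definitions; the one definition shown is invented for illustration and is not the real meaning of that column.

```python
# Hypothetical data dictionary: only one of the slide's columns is defined.
# The definition text is made up for this example.
DEFINITIONS = {
    "ACT_DEP_BAL": "Hypothetical: current balance of deposit accounts, in USD.",
}


def undefined_attributes(columns, definitions):
    """Return the columns that have no non-empty definition."""
    return [c for c in columns if not definitions.get(c, "").strip()]


columns = ["ACT_DEP_BAL", "R10_USD_AMT", "GTEE_AMT", "LIQ14_TOT_BAL", "R12_AMT"]
missing = undefined_attributes(columns, DEFINITIONS)
print(missing)
```

A check like this could run whenever a column is added, so an undefined attribute is rejected before anyone has to guess which of the five columns is "Total Deposits".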
BAD DATA: NO VALIDATIONS
Rule 2: No attribute without limitation
Data:
CCY  START_DT    CATEGORY       BALANCE  USDRATE  USD_BAL
GBP  2019-03-08  Retail         30       1.20     36
ALL  18/11/20    International  -20      0.01     0
USD  18/11/19    Govt           15       100      15
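A sketch of what "limitations" on attributes might look like for the rows above. The two constraints used here (dates must be ISO 8601, and a USD row must carry a USD rate of 1) are assumptions chosen for illustration; the slides do not state which limits apply.

```python
from datetime import date

# Three of the slide's columns, copied from the table above.
ROWS = [
    {"CCY": "GBP", "START_DT": "2019-03-08", "USDRATE": 1.20},
    {"CCY": "ALL", "START_DT": "18/11/20", "USDRATE": 0.01},
    {"CCY": "USD", "START_DT": "18/11/19", "USDRATE": 100.0},
]


def validate(row):
    """Return a list of violations for one row (an empty list means valid)."""
    errors = []
    # Assumed limitation: dates must be ISO 8601 (YYYY-MM-DD).
    try:
        date.fromisoformat(row["START_DT"])
    except ValueError:
        errors.append("START_DT is not an ISO 8601 date")
    # Assumed limitation: rates are strictly positive, and USD converts to USD at 1.
    if row["USDRATE"] <= 0:
        errors.append("USDRATE must be positive")
    elif row["CCY"] == "USD" and row["USDRATE"] != 1.0:
        errors.append("USDRATE must be 1 when CCY is USD")
    return errors


report = [validate(r) for r in ROWS]
for row, errors in zip(ROWS, report):
    print(row["CCY"], errors)
```

Without a stated date format, "18/11/20" cannot even be read unambiguously, which is the kind of problem a limitation on the attribute removes at the point of entry.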
BAD DATA: MISSING/WRONG VALUES
OK. Missing values tell you what you don’t know
Data:
CCY  START_DT    CATEGORY       BALANCE  USDRATE  USD_BAL
GBP  2019-03-08  Retail         30       1.20     36
ALL              International  -20      0.01
USD              Govt           15                15
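The point that a missing value is more honest than a wrong one can be sketched as an aggregation that reports what it does not know. The helper name `sum_known` is hypothetical; the values are the USD_BAL column above, with the blank cell represented as `None`.

```python
from typing import Optional, Sequence, Tuple


def sum_known(values: Sequence[Optional[float]]) -> Tuple[float, int]:
    """Sum the known values and count the missing ones instead of guessing them."""
    known = [v for v in values if v is not None]
    return sum(known), len(values) - len(known)


# USD_BAL column from the table above; None marks the blank cell.
usd_bal = [36.0, None, 15.0]
total, n_missing = sum_known(usd_bal)
print(f"total of known values: {total}, missing values: {n_missing}")
```

Had the blank been filled with a placeholder 0, the total would look complete and the gap would be invisible to whoever reads the report.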
BAD DATA: FLAGS
Data:
O_REP  R10_TYPE        CATEGORY       USD_AMT  SME?   GOVT?
GB     Central Bank    Retail         30       True   True
UK     Small Business  International  20       False  True
DE     Financial       Govt           15       False  False
Output:
                               Total USD
Deposits from SMEs             ?
Deposits from Govts            ?
Deposits from other customers  15
Rule 3: Flags are not your friend
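One common alternative to a pair of boolean flags is a single enumerated attribute, so that a contradictory state such as "SME and government at once" cannot be recorded. The sketch below is an illustration, not the FIRE schema: the `CustomerType` categories are hypothetical, and assigning the three rows to them from their R10_TYPE values is an assumption.

```python
from collections import defaultdict
from enum import Enum


class CustomerType(Enum):
    # Hypothetical categories, for illustration only.
    SME = "sme"
    GOVERNMENT = "government"
    OTHER = "other"


# The three rows from the table, each given exactly one type (assumed mapping:
# Central Bank -> GOVERNMENT, Small Business -> SME, Financial -> OTHER).
deposits = [
    (CustomerType.GOVERNMENT, 30.0),
    (CustomerType.SME, 20.0),
    (CustomerType.OTHER, 15.0),
]

totals = defaultdict(float)
for customer_type, usd_amt in deposits:
    totals[customer_type] += usd_amt

for customer_type in CustomerType:
    print(customer_type.value, totals[customer_type])
```

With one value per record, every row lands in exactly one bucket of the output, which the two flags in the table above cannot guarantee.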
BAD DATA: DERIVED VALUES
                Amount in EUR
Total Deposits  ?
EURUSD  EURGBP  GBP_AMT  TYPE  USD_AMT
1.00    1.16    100      DEP   120
Rule 4: Avoid attributes that can easily be deduced from others
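The risk with stored derived values is that they can disagree with their sources. The sketch below recomputes the EUR amount from both stored amounts in the row above. It assumes that EURGBP and EURUSD each mean "EUR per one unit of the other currency"; the slide does not state the quote convention, and under a different convention the numbers change but the two routes still need not agree.

```python
from decimal import Decimal

# The row from the slide, held as Decimal to avoid binary floating-point noise.
row = {
    "EURUSD": Decimal("1.00"),
    "EURGBP": Decimal("1.16"),
    "GBP_AMT": Decimal("100"),
    "USD_AMT": Decimal("120"),
}

# Assumed convention: rate = EUR per one unit of the foreign currency.
eur_from_gbp = row["GBP_AMT"] * row["EURGBP"]
eur_from_usd = row["USD_AMT"] * row["EURUSD"]

consistent = eur_from_gbp == eur_from_usd
print(eur_from_gbp, eur_from_usd, consistent)
```

Storing only one amount plus the rates, and deriving the rest on demand, leaves a single answer to "Amount in EUR" instead of two competing ones.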
ADHERE TO PRINCIPLES, NOT YOUR BOSS
Good data has:
- Definitions
- Validations
- No flags
- No derivable values
FIRE GUIDING PRINCIPLES
1. Data attributes should always be true:
- “...every pull request requires a corresponding, documented, legal reference, preferably to a currently in-force financial regulation.”
2. Data attributes should be atomic:
- “One property should not be derivable from other properties.”
3. Data attributes should be consistent:
- “Schema properties should try to avoid logical inconsistencies.”
TO CONCLUDE
Death List
- O-REN ISHII (added columns without definitions)
- VERNITA GREEN (added columns without validations)
- BUDD (added a whole bunch of flags)
- ELLE DRIVER (added redundant information)
- BILL (allowed this to happen)
GET INVOLVED! www.suade.org/FIRE

Shift Money 2019 - Why your data quality problem is not your user's fault - Murat Abur (Suade)
