Advanced Data Quality Management
Basic Overview
Khaled Mosharraf, M.Sc.
mosharrafkhaled@gmx.de
A.K.M Bhalul Haque, M.Sc.
b.haque@gmx.de
FH Kiel, Germany
2016
Agenda
• Motivation / Introduction
• Data Quality Definitions
• Foundation of Data Quality
• Data Quality Assessments
• Measuring Data Quality
• DQ-Organisation
• Data Policies
• Data Governance
• DQ Policies
• Data Profiling
Kiel University of Applied Sciences
Introduction
Today's world is one of heterogeneity.
We have different technologies.
We operate on different platforms.
Large amounts of data are generated every day
in all sorts of organizations and enterprises.
And we do have problems with data.
The previous slides covered the introduction
and the data quality definitions.
If you missed them, please check those slides.
Foundation of Data Quality
I. Data Production System
II. IQ-Dimensions
III. IQ-Categories / IQ-Pattern
Maintenance of data quality
• Data quality results from going through the data:
scrubbing it, standardizing it, deduplicating records,
and enriching it.
• Maintain complete data.
• Clean up your data by standardizing it using rules.
• Use matching algorithms to detect duplicates, e.g. "ICS"
and "Informatics Computer System".
• Avoid entry of duplicate leads and contacts.
• Merge existing duplicate records.
• Use roles for security.
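Rule-based standardization and duplicate detection can be sketched as follows. This is a minimal illustration, not a method taken from the slides: the rule list, the function names, and the 0.9 similarity threshold are all assumptions. It treats two names as probable duplicates if they are equal after standardization, if one is the acronym of the other (the "ICS" case), or if they are nearly identical strings.

```python
import re
from difflib import SequenceMatcher

# Hypothetical standardization rules: (regex pattern, replacement) pairs,
# applied in order after lower-casing.
RULES = [
    (r"[.,]", ""),           # drop punctuation
    (r"\s+", " "),           # collapse repeated whitespace
    (r"\bsys\b", "system"),  # expand one example abbreviation
]


def standardize(name: str) -> str:
    """Lower-case a name and apply each standardization rule in order."""
    result = name.lower().strip()
    for pattern, replacement in RULES:
        result = re.sub(pattern, replacement, result)
    return result.strip()


def acronym(name: str) -> str:
    """Initials of a multi-word name; empty string for single-word names."""
    words = standardize(name).split()
    return "".join(word[0] for word in words) if len(words) > 1 else ""


def matches_acronym(short: str, long: str) -> bool:
    """True if `short` (already standardized) equals the initials of `long`."""
    initials = acronym(long)
    return bool(initials) and short.replace(" ", "") == initials


def is_probable_duplicate(a: str, b: str, threshold: float = 0.9) -> bool:
    """Heuristic duplicate check; the threshold is an assumed value."""
    sa, sb = standardize(a), standardize(b)
    if sa == sb:
        return True
    if matches_acronym(sa, b) or matches_acronym(sb, a):
        return True
    return SequenceMatcher(None, sa, sb).ratio() >= threshold


print(is_probable_duplicate("ICS", "Informatics Computer System"))
print(is_probable_duplicate("Informatics Computer Sys.", "informatics  computer system"))
```

In practice such a check would run before a new lead or contact is saved, so that duplicates are avoided at entry rather than merged later.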
Data Production System
• Data collector
• Data custodian
• Data consumer
Data Production System (figure)
IQ Dimensions
• Relevance
• Accuracy
• Timeliness
• Completeness
• Coherence
• Format
• Accessibility
• Compatibility
• Security
• Validity
• Accessibility
• Appropriate Amount of Data
• Believability
• Concise Representation
• Consistent Representation
• Ease of Manipulation
• Free of Error
• Interpretability
• Objectivity
• Relevancy
• Understandability
• Value-Added
Information Quality Dimensions
• Accessibility
The extent to which data is available, or easily and quickly
retrievable
• Appropriate Amount of Data
The extent to which the volume of data is appropriate for
the task at hand
• Believability
The extent to which data is regarded as true and credible
• Completeness
The extent to which data is not missing and is of sufficient
breadth and depth for the task at hand
• Concise Representation
The extent to which data is compactly represented
• Consistent Representation
The extent to which data is presented in the same
format
• Ease of Manipulation
The extent to which data is easy to manipulate and
apply to different tasks
• Free of Error
The extent to which data is correct and reliable
• Interpretability
The extent to which data is in appropriate languages,
symbols, and units, and the definitions are clear
• Objectivity
The extent to which data is unbiased, unprejudiced, and
impartial
• Relevancy
The extent to which data is applicable and helpful for the
task at hand
• Security
The extent to which access to data is restricted
appropriately to maintain its security
• Timeliness
The extent to which data is sufficiently up-to-date
for the task at hand
• Understandability
The extent to which data is easily comprehended
• Value-Added
The extent to which data is beneficial and
provides advantages from its use
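Some of these dimensions can be scored directly from the data. The sketch below assumes a simple-ratio convention (1 minus the share of undesirable cases) for Completeness and Timeliness. The record layout, the `updated` field name, and the 90-day age limit are hypothetical and not part of the slides.

```python
from datetime import date


def completeness(records, required_fields):
    """Share of required field values that are present (not None or empty)."""
    total = len(records) * len(required_fields)
    if total == 0:
        return 1.0
    missing = sum(
        1
        for record in records
        for field in required_fields
        if record.get(field) in (None, "")
    )
    return 1 - missing / total


def timeliness(records, today, max_age_days, date_field="updated"):
    """Share of records updated within the last `max_age_days` days."""
    if not records:
        return 1.0
    stale = sum(
        1 for record in records
        if (today - record[date_field]).days > max_age_days
    )
    return 1 - stale / len(records)


# Hypothetical sample data.
contacts = [
    {"name": "A", "email": "a@example.org", "updated": date(2016, 1, 10)},
    {"name": "B", "email": "", "updated": date(2015, 1, 10)},
]

print(completeness(contacts, ["name", "email"]))    # 0.75
print(timeliness(contacts, date(2016, 2, 1), 90))   # 0.5
```

Both scores depend on "the task at hand": the list of required fields and the acceptable age are choices made by the data consumer, not properties of the data itself.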
Questions
• How do organizations define data quality?
• What data quality problems arise in
organizations?
• How do organizations identify, analyze, and
resolve data quality problems?
• How do organizations encourage employees to
manage DQ / IQ proactively?
• Are there common data quality patterns?
• Across organizations
• Across DQ projects
IQ Categories / Patterns
Intrinsic IQ
• Information has quality in its own right
Contextual IQ
• Information quality must be considered within
the context of the task
Accessibility IQ / Representational IQ
• Emphasizes the importance of the role of
systems
Intrinsic IQ
• Mismatch between several sources of the
“same” data
• “consistency” vs. “accuracy”
• Believability issues
• Poor reputation of sources
• Poor reputation for quality
• Subjective production of data
• Human judgment / knowledge in coding
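A mismatch between several sources of the "same" data can be made visible by comparing records field by field. The following is a minimal sketch; the two sources (`crm`, `billing`) and their fields are hypothetical. It only reports inconsistency: two sources that agree are consistent, but that alone does not show that either is accurate.

```python
def find_mismatches(source_a, source_b):
    """Compare records that exist in both sources.

    Returns {key: {field: (value_in_a, value_in_b)}} for every shared
    record in which at least one field differs.
    """
    mismatches = {}
    for key in source_a.keys() & source_b.keys():
        rec_a, rec_b = source_a[key], source_b[key]
        diffs = {
            field: (rec_a.get(field), rec_b.get(field))
            for field in rec_a.keys() | rec_b.keys()
            if rec_a.get(field) != rec_b.get(field)
        }
        if diffs:
            mismatches[key] = diffs
    return mismatches


# Hypothetical customer data held in two systems, keyed by customer id.
crm = {
    1: {"city": "Kiel", "zip": "24149"},
    2: {"city": "Hamburg", "zip": "20095"},
}
billing = {
    1: {"city": "Kiel", "zip": "24103"},
    2: {"city": "Hamburg", "zip": "20095"},
    3: {"city": "Berlin", "zip": "10115"},
}

print(find_mismatches(crm, billing))
```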
Intrinsic IQ (figure)
Contextual IQ
Mismatch between the information available and the
information that is relevant to information consumers
• Missing data: the easy case
• Data bundling and analyzability: the hard case
The issue is aggregation
• Cross-record (transaction) analysis of data
• e.g. corporate actions in banking
• Often across distributed systems
Incompatible, distributed systems (HMO)
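The aggregation problem can be illustrated with two systems that store the same kind of transaction under incompatible schemas. This is only a sketch under assumed names: `system_a`, `system_b`, and their field names are hypothetical. Records are first mapped to a common schema and only then aggregated per customer.

```python
from collections import defaultdict

# Hypothetical mappings from each system's field names to a common schema.
FIELD_MAPS = {
    "system_a": {"cust_id": "customer", "amt": "amount"},
    "system_b": {"customerNo": "customer", "value": "amount"},
}


def to_common(record, system):
    """Rename a record's fields to the common schema for its system."""
    mapping = FIELD_MAPS[system]
    return {common: record[local] for local, common in mapping.items()}


def total_per_customer(batches):
    """Sum amounts per customer over (system_name, records) batches.

    Customer ids are normalized to strings and amounts to floats, because
    the systems are assumed to disagree on types as well as on names.
    """
    totals = defaultdict(float)
    for system, records in batches:
        for record in records:
            row = to_common(record, system)
            totals[str(row["customer"])] += float(row["amount"])
    return dict(totals)


batches = [
    ("system_a", [{"cust_id": 7, "amt": 100.0}, {"cust_id": 8, "amt": 5}]),
    ("system_b", [{"customerNo": "7", "value": "50.5"}]),
]

print(total_per_customer(batches))  # {'7': 150.5, '8': 5.0}
```

Without the mapping step, customer 7 would appear as two unrelated records and no per-customer total could be computed.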
Contextual IQ (figure)
Accessibility IQ / Representational IQ
Technical Accessibility:
• Physical access
• Computing resources
Time to Access / Ease of Access:
• Amount of data
• Privacy, confidentiality
Interpretability and Understandability:
• Coding
Representation and its Analyzability:
• Image and text data
Accessibility IQ / Representational IQ (figure)
Data Quality Problems
Thank You
