Data Quality: Three Perspectives, One Result
I like Robert S. Seiner's thoughts on Data Governance and the method he suggests. In practice, though, a seemingly simple question can bring implementation to a standstill, so today I would like to share my thoughts on data quality.
How to Link Data to Results, Which Quality Aspects Matter, and How Modern Technologies Can Become Allies Rather Than Sources of New Challenges
Let us examine an approach to managing data quality that will help organizations not drown in the digital flood but harness it to achieve their goals.
Information has always been a key resource—in personal life, in managing organizations, and in governmental activities. It has aided decision-making, planning for the future, and understanding the world around us. However, the term "data" in its modern sense only entered widespread use in the second half of the 20th century, when information technologies began reshaping our reality. With the development of computers, databases, and networks, data evolved from an abstract concept into a concrete asset of the digital economy. Today, data is called "the new oil," underscoring its value. But, just like oil, the quality of this resource determines whether it will bring benefits or become a source of problems. It is not enough to simply collect data; ensuring its quality, security, and usability for creating value is essential. In the digital economy, data quality has become a critical factor for success—for both businesses and public administration.
The scale of the problem is staggering. According to Gartner’s research, companies lose up to 30% of their annual revenue due to poor data quality—errors, gaps, or inconsistencies. Experian reports that 91% of organizations face financial consequences caused by "dirty" data, ranging from missed opportunities to direct losses. One might assume that the increasing adoption of digital technologies would address these issues by automating processes and improving accuracy. In reality, it often exacerbates them: the more data there is, the harder it becomes to monitor its quality.
At the same time, the digital economy increasingly relies on IT, which depends on high-quality information. Artificial intelligence (AI), machine learning (ML), big data, and real-time analytics are tools transforming business and management. For instance, AI can predict demand, optimize logistics, or detect fraud—but only if the data is accurate and up-to-date. Machine learning builds models based on historical data; if that data contains errors or duplicates, the model will produce incorrect results. Big data analytics helps identify patterns in vast datasets, but without standardization and reliability, it loses its purpose. Even basic automation systems like CRM or ERP rely on data quality: an incorrect customer address in a CRM system will disrupt delivery, while errors in an ERP system will distort financial reports. Thus, technologies do not merely solve problems—they require high-quality "raw material" to function effectively.
So why doesn’t the growth of technology eliminate the problem, and why does it sometimes even make it more acute? The more data we collect, the harder it is to control. The broader the use of AI and analytics, the higher the cost of errors. In this article, we will explore an approach to managing data quality that enables organizations to avoid drowning in the digital flood and instead use it to achieve their objectives. We will break down how to connect data to results, which aspects of quality are critical, and how modern technologies can become allies rather than sources of new challenges.
Approach to Analyzing Data Quality
To analyze data quality, we will use the DIKAR model, which links data to final outcomes, alongside the ten data quality aspects outlined in DMBOK. The DIKAR model describes the journey from data to results: Data → Information → Knowledge → Action → Results. It illustrates how data quality influences decisions and success and is used to analyze information management. We propose examining the data quality issue through three perspectives: that of the data owner, that of IT, and that of the analysts and executives who use data to make decisions.
Three Perspectives on Data Quality: A Role-Based Model
Data management is an interdisciplinary endeavor. To understand data quality, we must view it from different angles. Imagine you are assembling a puzzle, but you only have some of the pieces. How do you complete the picture? We suggest dividing the task into three perspectives that reflect the roles of process participants. These roles may overlap, but each contributes a unique piece to the overall understanding of data quality.
Who knows best which data is needed for work? Naturally, it’s the one who owns it and is responsible for the activities tied to it. The data owner—whether an individual or a department—creates the information and has a vested interest in controlling it. This role is recognized in various standards (e.g., ISO 27001), emphasizing the importance of managing information as a resource.
For the data owner, quality is defined above all by completeness and timeliness: the data must cover everything the owner’s activities depend on, and it must stay up to date.
If the data is incomplete or outdated, the owner loses control over their resource. Would you entrust business management to information you’re not confident in?
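To make the owner’s view concrete, here is a minimal sketch of how completeness and timeliness could be spot-checked with pandas. The table layout (customer_id, email, updated_at) and the 30-day freshness window are assumptions made purely for illustration, not part of the original method.

```python
# Illustrative owner-side checks for completeness and timeliness.
# Column names and the freshness threshold are hypothetical assumptions.
import pandas as pd

def owner_quality_report(customers: pd.DataFrame, max_age_days: int = 30) -> dict:
    """Report the share of incomplete and stale customer records."""
    required = ["customer_id", "email", "updated_at"]
    missing_cols = [c for c in required if c not in customers.columns]
    if missing_cols:
        raise ValueError(f"Missing required columns: {missing_cols}")

    # Completeness: rows where any required field is empty.
    incomplete = customers[required].isna().any(axis=1)

    # Timeliness: rows not updated within the agreed window.
    age = pd.Timestamp.now() - pd.to_datetime(customers["updated_at"])
    stale = age > pd.Timedelta(days=max_age_days)

    return {
        "rows": len(customers),
        "incomplete_share": round(float(incomplete.mean()), 3),
        "stale_share": round(float(stale.mean()), 3),
    }
```

Calling owner_quality_report on a customer extract returns the share of incomplete and stale rows, a number the owner can track over time instead of trusting the data blindly.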
Now, let’s step into the world of IT. Here, data transforms from an abstract resource into a practical tool ready for use in the digital economy. IT departments handle processing, standardization, and accessibility of data, setting the pace and scale of decision-making within organizations. Their role is not just technical support but a critical stage in the data lifecycle, where a chaotic stream of information becomes a structured asset. Without quality processing, data remains raw and useless—like unrefined ore that cannot immediately be turned into a tool. From the IT perspective, quality is determined above all by consistency, integrity, and accessibility: data from different systems must agree with one another, must not be lost or corrupted in processing, and must be available to those who need it.
If a system fails due to inconsistent data—say, a customer order is duplicated in the database due to a synchronization glitch—the entire process grinds to a halt: deliveries are delayed, reports are skewed, and trust in the system erodes. IT is responsible for cleaning and delivering data, turning it into a reliable tool. Consider this example: a company aiming to boost sales by 10% relies on IT to integrate order data from various branches, eliminate duplicates, and make it available to analysts in real-time. Without this, the data owners’ efforts to gather customer information would be in vain, and analysts couldn’t produce accurate forecasts.
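As a hedged illustration of that IT-side work, the sketch below merges order extracts from several branches, removes duplicate orders, and enforces one simple consistency rule: every order must reference a known customer. All table and column names are hypothetical.

```python
# Illustrative IT-side consolidation: merge branch extracts, remove
# duplicate orders, and verify a basic consistency rule.
# Table and column names are hypothetical assumptions.
import pandas as pd

def consolidate_orders(branch_frames: list[pd.DataFrame],
                       customers: pd.DataFrame) -> pd.DataFrame:
    """Merge branch order extracts into one deduplicated, consistent table."""
    orders = pd.concat(branch_frames, ignore_index=True)

    # Deduplicate: the same order can arrive from several branch systems.
    orders = orders.drop_duplicates(subset=["order_id"], keep="first")

    # Consistency: every order must reference an existing customer.
    known = set(customers["customer_id"])
    orphans = orders[~orders["customer_id"].isin(known)]
    if not orphans.empty:
        # In a real pipeline these rows would go to a quarantine table
        # and be reported back to the data owner for correction.
        print(f"{len(orphans)} orders reference unknown customers")
        orders = orders[orders["customer_id"].isin(known)]

    return orders
```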
Finally, data reaches those who use it for decision-making—analysts and executives applying a data-driven approach. This is the final stage where data turns into business value: strategies, forecasts, and operational improvements. Quality here is defined above all by relevance, interpretability, and reliability: the data must answer the question being asked, be understandable to those who read it, and be trustworthy enough to act on.
Unreliable data leads to disasters: an incorrect demand forecast might result in overproduction or stock shortages. Is it worth trusting a strategy built on "dirty" data?
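The cost of unreliability is easy to see even in a toy calculation. In the sketch below, with entirely invented numbers, a single duplicated order inflates the demand an analyst would plan production against.

```python
# Toy illustration: a duplicated order inflates apparent demand.
# All numbers are invented purely for demonstration.
import pandas as pd

# Order 103 arrived twice from two branch systems.
sales = pd.DataFrame({
    "order_id": [101, 102, 103, 103, 104],
    "units":    [10, 12, 11, 11, 13],
})

dirty_demand = sales["units"].sum()                               # 57 units
clean_demand = sales.drop_duplicates("order_id")["units"].sum()   # 46 units

print(f"demand seen in dirty data: {dirty_demand} units")
print(f"actual demand after deduplication: {clean_demand} units")
```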
These three perspectives are interconnected: problems at one stage—like incomplete data from the owner—trigger errors in IT processing and failures in analysis. This division allows us to structure the approach and pinpoint weak links in the data management chain.
The Role of the CDO: Strategic Coordinator of Balance
In the data economy, managing quality is an increasingly complex task. Information volumes grow daily, processes become more intricate due to new technologies and systems, and responsibility for data is distributed across numerous participants—from operational departments to IT and analytical teams. Who should keep everything under control? The Chief Data Officer (CDO) serves as a strategic coordinator, not just a performer of isolated tasks. Their mission is not to get bogged down in operational routines but to build a system ensuring data quality at all levels. How can management be made effective without becoming unwieldy or chaotic? Here, Ashby’s principle ("the controlled system must not be more complex than the controlling one") helps, advocating a balance between structure and flexibility.
An organization is like an orchestra: data owners, IT, and analysts each play their parts. Without coordination, it turns into chaos, with everyone pulling in different directions. The CDO is the conductor, setting the harmony so disparate efforts form a unified system. The data system is complex: multiple sources (CRM, ERP, external APIs), formats (structured tables, unstructured texts), and issues (duplicates, inconsistencies, outdated records). If management mirrors this chaos—imposing endless rules, excessive checks, and sparking role conflicts—it becomes inefficient and turns into a bureaucratic nightmare. Ashby’s principle demands that the CDO find balance: the management system must be structured, not as chaotic as the data itself, yet flexible enough to keep key aspects under control.
Structure is achieved through several approaches. First, unified standards: frameworks like DMBOK, covering completeness, accuracy, and other quality aspects, establish common rules for all participants. Second, automation: monitoring consistency and integrity with tools (e.g., ETL processes, MDM systems, data catalogs) reduces the burden on people and speeds up processes. Finally, role distribution: data owners ensure timeliness, IT handles integrity and accessibility, and analysts focus on relevance and interpretability.
For example, a company implements a new CRM system to manage customer data. The CDO coordinates efforts: data owners (the sales team) regularly update customer records, IT synchronizes the CRM with other systems (e.g., warehouse software), and analysts use the processed data to forecast demand. Without such coordination, customer data could become outdated, synchronization could fail, and forecasts could prove useless. Chaos is tamed by a clear system where everyone knows their role.
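One way to picture how standards, automation, and role distribution can be wired together is a small registry of quality rules, each tagged with the role responsible for fixing violations. The rules, columns, and thresholds below are illustrative assumptions, not a prescribed implementation.

```python
# Sketch of a tiny quality-rule registry: each automated check is tagged
# with the role that must act when it fails. Rules are illustrative only.
from dataclasses import dataclass
from typing import Callable

import pandas as pd

@dataclass
class QualityRule:
    name: str
    owner_role: str                       # who is responsible for fixing failures
    check: Callable[[pd.DataFrame], bool]

RULES = [
    QualityRule("customer email is filled in", "data owner",
                lambda df: bool(df["email"].notna().all())),
    QualityRule("order_id is unique", "IT",
                lambda df: bool(df["order_id"].is_unique)),
    QualityRule("order amount within agreed range", "analysts",
                lambda df: bool(df["amount"].between(0, 100_000).all())),
]

def run_rules(df: pd.DataFrame) -> None:
    """Run every registered check and report failures to the responsible role."""
    for rule in RULES:
        try:
            ok = rule.check(df)
        except KeyError:
            ok = False                    # a missing column also counts as a failure
        status = "OK" if ok else f"FAILED, escalate to {rule.owner_role}"
        print(f"{rule.name}: {status}")
```

The point of the sketch is the shape, not the specific checks: standards say what to verify, automation runs the checks, and role distribution decides who acts on the result.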
The CDO’s primary deliverable is not fixed data but a system of agreements on quality criteria and the cost of meeting them. How much is real-time data timeliness worth? The CDO helps the business understand: a 1% increase in forecast accuracy might justify the cost of implementing a monitoring system—like real-time data streaming for analyzing customer behavior. A data contract becomes the tool to lock in these expectations: owners specify which data is critical (e.g., order records), IT ensures processing and accessibility, and analysts confirm suitability for strategic decisions. The CDO introduces data protection and access control policies, safeguarding security and reputation, and organizes employee training in new analytical methods.
This is a "roadmap" outlining roles, quality standards, and areas of responsibility, simplifying coordination and reducing the risk of misunderstandings among participants. Management remains structured (clear roles, standards, automation), effective (covering all key quality aspects from completeness to reliability), and flexible (adapting to changes like new data sources or business requirements).
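To show what such a data contract might look like when written down, here is a minimal sketch; the fields, thresholds, and SLAs are hypothetical and would be negotiated by the owner, IT, and the analysts in practice.

```python
# A minimal sketch of how the data contract described above might be
# recorded. Field names, thresholds, and roles are hypothetical.
from dataclasses import dataclass

@dataclass
class DataContract:
    dataset: str                         # what is covered
    owner: str                           # who creates and maintains the data
    consumers: list[str]                 # who depends on it
    critical_fields: list[str]           # data the result cannot do without
    quality_targets: dict[str, float]    # agreed thresholds per quality aspect
    refresh_sla: str                     # how fresh the data must be
    used_for: str                        # the business result it supports

orders_contract = DataContract(
    dataset="sales.orders",
    owner="Sales department",
    consumers=["IT data platform", "Demand analytics team"],
    critical_fields=["order_id", "customer_id", "amount", "ordered_at"],
    quality_targets={"completeness": 0.99, "duplicate_share": 0.0},
    refresh_sla="every 15 minutes",
    used_for="10% sales growth programme: demand forecasting",
)
```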
A question for you: If you were a CDO, how would you organize data management in your organization?
From Results to Data
Every organizational activity aims for a specific result: increasing profits, improving process efficiency, meeting regulatory requirements, or enhancing customer experience. This requires resources—time, money, and human effort. In the digital economy, data plays a pivotal role in this process, forming the foundation for decisions at all levels. Yet, the link between data and results is often underestimated: many organizations either fail to recognize its importance or take it for granted, overlooking the need for quality control. According to the DIKAR model, achieving a result requires a journey: from data to information, from information to knowledge, from knowledge to action, and only then to the result.
In practice, companies often try to "skip" stages: decisions are made intuitively, without a clear tie to information and knowledge, and data quality issues only surface during result analysis, when fixing them is costly and complex. For example, if sales data contains duplicates, reports will show inflated figures, knowledge will be false, and a decision to ramp up production will lead to excess inventory and losses.
Reverse engineering flips this approach, starting from the end. First, the desired result is defined—say, a 10% sales increase. Then, key metrics characterizing it are identified: average order value, purchase conversion rate, and repeat order volume. Next, it’s determined which data is needed to calculate these metrics: transaction records, customer behavior, and purchase history. This method allows quality requirements to be derived from the desired result rather than discovered after the fact, so effort goes only into the data that actually feeds the metrics.
For instance, the goal is to improve process efficiency. Metrics are operation execution speed and demand forecast accuracy. Data includes operation logs and historical sales. The reverse approach ties data to the goal, minimizing redundant effort and focusing on what truly matters for success.
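One way to make the reverse path tangible is to write the chain down explicitly, from the goal to its metrics to the data each metric needs. The mapping below is a hypothetical sketch for the sales example, not a prescribed template; the table and field names are assumptions.

```python
# Hypothetical sketch of reverse engineering: start from the goal, list the
# metrics that express it, then derive the data each metric requires.
GOAL = "Increase sales by 10% over the next year"

METRICS = {
    "average_order_value": ["transactions.amount", "transactions.order_id"],
    "purchase_conversion_rate": ["web_sessions.visitor_id", "transactions.customer_id"],
    "repeat_order_volume": ["transactions.customer_id", "transactions.ordered_at"],
}

def required_data(metrics: dict[str, list[str]]) -> set[str]:
    """Collect the minimal set of data fields needed to measure the goal."""
    return {f for fields in metrics.values() for f in fields}

print(f"Goal: {GOAL}")
print("Data to secure and quality-control first:")
for item in sorted(required_data(METRICS)):
    print(f"  - {item}")
```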
Modeling Results as a Measure of Data Quality
In the data economy, data quality is not just an abstract trait measurable by percentages of accuracy or completeness. The ultimate indicator of data quality is the outcome of decisions, which we can and should model. A result such as profit growth or process optimization is not a random accident—it’s proof that data enabled meaningful, effective action. If a data-based decision fails—for example, an ad campaign doesn’t attract customers due to demand analysis errors—can we call that data high-quality, even if it’s technically flawless?
Modeling results is the key to understanding which data we need and what quality level it must achieve. It involves defining the desired result, identifying the metrics that characterize it, and specifying which data, at what level of quality, is needed to calculate those metrics.
A data contract solidifies these requirements, ensuring decision predictability. It specifies which data must be collected, how it should be processed, and what it’s used for, forming the basis for collaboration between owners, IT, and analysts.
Example: the goal is a 10% sales increase. The DIKAR path, read backwards from the goal: actions (launching an ad campaign), knowledge (customer behavior trends), information (customer data analysis), data (transaction records and preferences). Data quality becomes a tool, not an end in itself—it matters only insofar as it supports the result.
The three perspectives structure this approach: owners set basic requirements, ensuring data reflects reality; IT ensures usability through consistency and accessibility; analysts verify applicability, assessing relevance and interpretability. The CDO acts as coordinator, helping parties agree on what constitutes quality and its cost. Quality isn’t free: a 1% accuracy boost might cost millions (e.g., implementing real-time analytics), but if it pays off in revenue, it’s justified. Ashby’s principle reminds us: data quality management must be structured for control, avoiding bureaucracy or loss of oversight. Modeling results strikes a balance—focusing on critical quality aspects, saving resources where perfection isn’t needed.
Ultimately, data quality is measured not just by metrics but by its contribution to organizational success. The data economy demands we don’t just collect and store information but turn it into working solutions. By modeling results, we ask: "Which data will lead us to our goal?" The answer is the true measure of quality.
Conclusion
Data quality is not merely the cleanliness of database records but their ability to drive success. Data serves as fuel: clean fuel accelerates business, dirty fuel slows it down. The three perspectives—owners, IT, analysts—and the CDO as coordinator transform chaos into a system. Reverse engineering and data contracts focus efforts on results, whether sales growth or efficiency. The data economy requires not just gathering information but making it useful. Leveraging data for competition demands not only technology but a culture of working with data.
How do you evaluate your data—by how tidy it looks in spreadsheets or by the outcomes it delivers? Your answer will show whether you’re ready for the digital future. Let’s discuss how data can work better—for business, government, and each of us.