5 Reasons Data Democratization May Fail

There is no doubt that we live in the Golden Age of Data. We have gone from data being an uninteresting byproduct of automation to it being the new gold, or new oil – the most valuable resource in the world. Of course, value does not exist in data as a mere resource. The value has to be unlocked, chiefly by analytics. In recent years, this last thought has been seized upon and it has been proposed that if as many people as possible in the enterprise can utilize data for analytics needs, then a massive amount of business value can be generated.

From this idea it follows that as many people as possible in the enterprise should be able to get the data they need. This is called data democratization.

It all sounds wonderful, and who can argue against more democratization? But is it realistic? Or is it just a combination of shallow thinking, vendor hype, and consultant-speak, all driven by the current Zeitgeist? Personally, I do think there is a long-term trend toward data democratization, but there are also real risks that it may fail. Let’s look at them.

Reason 1: Access to Data is Not Like Access to Electricity or Water

One of the basic goals for data democratization is to “give people access to the data they need”. This goal is usually discussed in terms of providing a capability that allows the data resources of the enterprise to be searched – in other words a data catalog. If the required data can be found through the data catalog, then the business user can request access to it.

A major issue with this is that data is not an undifferentiated resource, like electricity or water, where we are dealing with access to a commodity. To be useful to a business user, the data has to be “shaped” in a way that is sufficiently close to what the user needs for the user to work with it effectively. Data stored in databases can be very complex in terms of the relations between the various data elements, the processes by which these databases get populated, semantics, timing, historical workarounds, and more. A user may find a relevant database table, but it will almost certainly not look like the information requirements the user has. A clue to this reality is the great amount of End User Computing (EUC) that goes on in many enterprises. EUC builds the transformations needed to get the data into a form that can meet a requirement, and is typically carried out by business users with a very deep knowledge of a very restricted set of data. 
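To make the "shaping" problem concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the three tables, their column names, and the rule that cancelled orders do not count as revenue are assumptions made up for illustration, not taken from any real system. It shows that even a simple question like "revenue by region" cannot be read off any single table a user might find in a catalog.

```python
from collections import defaultdict

# Hypothetical normalized tables, roughly as a user might find them
# through a data catalog. All names and values are illustrative.
customers = [
    {"customer_id": 1, "region_code": "NE"},
    {"customer_id": 2, "region_code": "SW"},
]
regions = [
    {"region_code": "NE", "region_name": "Northeast"},
    {"region_code": "SW", "region_name": "Southwest"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 120.0, "status": "SHIPPED"},
    {"order_id": 11, "customer_id": 1, "amount": 80.0, "status": "CANCELLED"},
    {"order_id": 12, "customer_id": 2, "amount": 50.0, "status": "SHIPPED"},
]


def revenue_by_region(customers, regions, orders):
    """Shape three raw tables into the one view the user actually asked for."""
    # Lookups that stand in for two joins.
    region_of_customer = {c["customer_id"]: c["region_code"] for c in customers}
    region_name = {r["region_code"]: r["region_name"] for r in regions}

    totals = defaultdict(float)
    for order in orders:
        # Assumed business rule: cancelled orders are not revenue.
        # Rules like this are rarely visible in the table itself.
        if order["status"] == "CANCELLED":
            continue
        code = region_of_customer[order["customer_id"]]
        totals[region_name[code]] += order["amount"]
    return dict(totals)


report = revenue_by_region(customers, regions, orders)
print(report)
```

Even in this toy case the user has to know which tables to join, on which keys, and which rows to exclude. That knowledge is exactly what EUC developers carry around.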

If the data is not shaped to be close to the needs of a business user, the business user may not be able to work with it, even if they have access to it.

Reason 2: End User Computing is Frowned Upon

If we accept that data democratization is something that we need to achieve, what will happen when it is achieved? Surely it will lead to many more individuals in the business working directly with data outside of IT-supported applications. We have this today, and it is EUC, as we have just noted.

EUC is disliked in nearly all enterprises. It is tolerated as a necessity, usually with the notion that the developer of an EUC and the consumers of its outputs bear all of the risk involved. Regulators are increasingly interested in tightening controls around EUCs, which in turn makes them more burdensome for business users to develop and maintain.

This is not to say that the regulators’ viewpoints lack merit. There is genuine risk in development processes that lack software engineering discipline. What the solution to all this might be is not clear, but what is clear is that the current attitude towards EUC practices runs counter to the idea of data democratization.

If EUC continues to be restricted via top-heavy governance, then it is difficult to see how data democratization can be operationalized.

It is also quite possible that the worries about EUC are actually justified and if data democratization becomes a reality in enterprises with few or no controls around EUC, then we may get anarchy. That would be a bonus reason for the failure of data democratization. 

Reason 3: Many Business Users Want Another Fish, Not to Learn How to Fish 

It seems to be assumed that if end users in the enterprise are given access to data, then they will start to use it for the enterprise’s benefit. But what assurance do we have that this is so?

There are certainly some users in the business who can use data manipulation and reporting tools to process data they access. Other business users are prepared to learn these tools. At the other end of the spectrum are business users who are not prepared to do any of this. Their minimum expectation is that someone will provide them with a dataset that is exactly in the form that they can use for a particular requirement. Unfortunately, I do not know of any sociological study of the data management propensities of business users, so it is not possible to know how the relevant skills, and the desire to do data management, are distributed. Personally, I think the distribution is skewed towards end users who want the datasets provided to them. This is not a criticism: most business users are already overburdened with their business duties and just do not have time to do data management tasks. But it does not align with the promise of data democratization.

Data democratization may fail if it is provided, but too few business users are interested. 

Reason 4: Metadata Content is too Poor in Quality to Discover and Use Data

If business users are to easily find data, and successfully use data after they have got access to it, then they need to understand it. Once again this is where the data catalog comes in. There is a lot of information that modern data catalogs can harvest automatically. However, there is also a lot of important metadata that only humans can contribute to the data catalog. This includes semantics and facts of business significance that cannot be gleaned from technical metadata. Subject matter experts (SMEs) may know these details in their heads, but they are often not able to articulate them well in written form. The result is that data catalogs can be filled with information that is not understandable, and which varies widely from one entry to the next because different people wrote them. Of course, there are also the problems of metadata that is missing because someone did not have time to write it, or outdated because nobody kept it updated.
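The two simplest problems named above, missing and outdated metadata, are at least mechanically detectable. The following is a minimal sketch in Python of such a check. The entry structure, the field names `description` and `last_reviewed`, and the one-year staleness threshold are all assumptions made for illustration; they do not come from any particular data catalog product.

```python
from datetime import date

# Hypothetical catalog entries. Field names are illustrative only.
entries = [
    {
        "name": "cust_mstr",
        "description": "Customer master; one row per legal entity.",
        "last_reviewed": date(2024, 5, 1),
    },
    {
        "name": "ord_hdr",
        "description": "",
        "last_reviewed": date(2024, 3, 15),
    },
    {
        "name": "gl_bal",
        "description": "General ledger balances by period.",
        "last_reviewed": date(2021, 1, 10),
    },
]


def audit_metadata(entries, today, max_age_days=365):
    """Return {entry name: [problems]} for entries with detectable gaps."""
    issues = {}
    for entry in entries:
        problems = []
        if not entry["description"].strip():
            problems.append("missing description")
        # Assumed threshold: anything not reviewed within a year is stale.
        if (today - entry["last_reviewed"]).days > max_age_days:
            problems.append("stale")
        if problems:
            issues[entry["name"]] = problems
    return issues


# A fixed date keeps the example reproducible.
issues = audit_metadata(entries, today=date(2024, 6, 1))
print(issues)
```

Note what a check like this cannot do: it says nothing about whether a description that exists is understandable, correct, or consistent with its neighbors. Those harder problems still depend on people.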

None of these problems can be solved by simply labeling people as “data stewards” and telling them that they now have to contribute high quality metadata. Data catalogs are containers, and they need good semantic context to be useful.

Without high quality business metadata that has wide coverage, data democratization is going to be difficult.     

Reason 5: Lack of Community

Our last reason comes from the tendency of EUC developers to work in isolation. EUC developers, insofar as they are understood, tend to work by themselves to develop the solutions they need. This would almost certainly be replicated with data democratization. It is true that data catalogs provide mechanisms for social engagement, but they do not solve certain problems that come from working in isolation. These problems include replication of work that has already been done, and creating inconsistencies relative to EUC work done by other business users. Democratization does not have to imply lack of coordination, but right now we do not have adequate processes for how to achieve this coordination. 

Data catalogs can certainly be used to capture details of what business users have developed. This should in theory allow business users to discover these products and use them instead of redeveloping them. But this does not address the ways in which business users develop their solutions. Reuse of shared components, SQL style, commenting code, version management, production deployment, and more are all areas where some commonality is required. 

Business users need to be part of a wider community for data democratization to be successful, and not work in isolation.   

Conclusion

Data democratization is coming and should be welcomed by everyone. However, we need to be realistic about what could cause it to fail and take action to prevent that failure. We need more thought leadership around how business users can do data management successfully. Without that, data democratization may turn out to be a data oligarchy that only benefits BI analysts and data scientists.

John O'Gorman

Disambiguation Specialist

4y

Malcolm - Great points, all indicating that the problem is not with the concept of democratization, but the delivery mechanism. A Data Catalog is a really poor vehicle for business users to get at the information they want for all the reasons you listed. There has to be a more accessible way to 'shop' for information than a DC.

Richard Robinson

Chief Strategist, Open Data and Standards | Author "Cracking the Data Code" | Author "Understanding the Financial Industry through Linguistics"

4y

Super interesting, and highlights some of the key operational/implementation issues we face in achieving that goal. But also I ask - data democratization for what purpose? And I'll point to your nod to community - who is the community that certain data should be democratized for? I explored an aspect of this in my recent book (sorry, shameless plug), too. Especially around Langefor's Infology explorations around data and decentralization; that all data isn't necessarily for everyone (nor is it useful for everyone). These are all pieces to the larger puzzle - no single answer/approach is going to move us forward, but a better understanding of the blend of skills and approaches needed, while understanding that one size does not fit all. Great thoughtful stuff, Malcolm Chisholm!

Carol McGrath

Data Governance Professional | Certified Information Management Professional (CIMP)| Certified Data Steward (CDS)

4y

Thanks Malcolm! I’ve always been a big fan of the concept of data democratisation but reading this has made me realise it’s got a way to go. Point 4- have experienced through implementing a catalog and point 5 has always been something I’ve tried hard to build.

Richard Crawford

Senior Director, Head of Enterprise Data & Analytics at Ansys

4y

Great post Malcolm! Your statement that "Business users need to be part of a wider community for data democratization to be successful" is so true. #datagovernance #cdos #analytics
