Data Sharing – Industry Utility
Data sharing is the new normal. Over the last couple of decades we have seen this in our personal lives, where people use social media tools to share information that may be useful to humanity as a whole. But not every industry has caught up, and in many places we still work in silos. This article looks at how firms within an industry can come together to create utilities for data sharing, an approach that helps everyone scale while keeping costs under control.
(1) Participants
Every industry has different players and firms that interact with each other to create value for their Customers. Even today, when no such utility exists, firms work with their chosen set of partners, subsidiaries, data vendors, credit bureaus and other stakeholders to build value for their end Customers. Each of these interactions is a 1-1 setup, and building data pipelines between firms this way probably takes a lot of effort and cost.
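To make the cost of 1-1 setups concrete, here is a rough sketch that counts pipelines. It assumes the worst case, where every firm in a group exchanges data with every other firm, which is an illustration rather than something every industry will hit. The function names are made up for this example.

```python
def pairwise_pipelines(n_firms: int) -> int:
    """Pipelines needed when every pair of firms builds its own 1-1 link."""
    return n_firms * (n_firms - 1) // 2


def utility_pipelines(n_firms: int) -> int:
    """Pipelines needed when each firm connects once to a shared utility."""
    return n_firms


# Compare the two setups for a few group sizes.
for n in (5, 10, 20):
    print(n, pairwise_pipelines(n), utility_pipelines(n))
```

Under this assumption the pairwise count grows with the square of the number of firms, while the utility count grows in step with it.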
(2) Ingestion framework
What is being proposed here is an industry utility for a set of firms that would like to share data with each other to create value for their Customers. One of the firms can host this utility and provide the platform for ingesting data from individual firms, along with the modelling infrastructure to use the combined data and generate insights. Each participating firm should ensure that their Customer PII data is secure and specific consent
The data from different firms will be stored in one data lake. A tight access control mechanism is needed so that one firm cannot see another firm’s data unless it is authorized to do so. We never had the luxury of such inter-firm data warehouses in the past, but in recent years options have emerged that let you store and control data from different firms with built-in features. For example, traditional databases like Oracle tend to bind you to one firm, and the way they sell their licenses is also very firm-specific, while newer platforms like Snowflake offer data sharing as a capability and are made for such inter-firm use cases.
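The access control idea can be sketched in a few lines. This is a toy in-memory model, not how Snowflake or any other product implements sharing. The class `SharedLake`, its methods and the firm names are all hypothetical. The rule it enforces is the one described above: a firm can read its own data, and another firm's data only after the owner has granted access.

```python
from dataclasses import dataclass, field


@dataclass
class SharedLake:
    # datasets[owner][name] holds the list of records for that dataset
    datasets: dict = field(default_factory=dict)
    # grants holds (owner, dataset name, grantee) triples
    grants: set = field(default_factory=set)

    def ingest(self, owner, name, rows):
        self.datasets.setdefault(owner, {})[name] = list(rows)

    def grant(self, owner, name, grantee):
        self.grants.add((owner, name, grantee))

    def revoke(self, owner, name, grantee):
        self.grants.discard((owner, name, grantee))

    def read(self, reader, owner, name):
        # Owners always see their own data; everyone else needs a grant.
        if reader != owner and (owner, name, reader) not in self.grants:
            raise PermissionError(f"{reader} may not read {owner}/{name}")
        return list(self.datasets[owner][name])


lake = SharedLake()
lake.ingest("firm_a", "balances", [{"customer": "c1", "balance": 120}])
lake.grant("firm_a", "balances", "firm_b")
rows_for_b = lake.read("firm_b", "firm_a", "balances")  # allowed by the grant
```

A firm without a grant, say `firm_c`, would get a `PermissionError` on the same call, and `revoke` lets the owner withdraw access later.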
(4) Models & Scores
With a powerful data lake that can store data across firms and ensure privacy at the same time, you can combine data, generate more insights, and build models and scores that add more value for your Customer. This data was always available, and in some cases you were already getting it through dedicated pipelines built for each partner. But setting that up for each partner took a lot of time, so you could not scale exponentially without a corresponding increase in cost. We’ve been talking about big data and AI/ML for a long time now, and many vendors have advanced tools and tech around it, but unless all the data can be brought together in one place, such tools will remain irrelevant for many firms.
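One possible way to combine records without exposing raw Customer identifiers is for each firm to replace the identifier with a keyed hash before upload, and for the utility to join on that hash. This is only a sketch of one option, not a method the article prescribes. The shared key, the field names and the sample values are all hypothetical, and a real setup would need proper key management and consent checks.

```python
import hashlib
import hmac

# Hypothetical secret agreed between participating firms (assumption).
SHARED_KEY = b"hypothetical-key-agreed-by-participants"


def pseudonymize(customer_id: str) -> str:
    """Replace a raw customer identifier with a keyed SHA-256 digest."""
    return hmac.new(SHARED_KEY, customer_id.encode("utf-8"), hashlib.sha256).hexdigest()


def combine(firm_a_rows, firm_b_rows):
    """Inner-join two firms' records on the pseudonymized customer key."""
    b_by_key = {row["key"]: row for row in firm_b_rows}
    combined = []
    for row in firm_a_rows:
        match = b_by_key.get(row["key"])
        if match is not None:
            combined.append({**row, **match})
    return combined


# Each firm pseudonymizes on its own side before sending data to the utility.
firm_a = [{"key": pseudonymize("cust-1"), "monthly_spend": 400}]
firm_b = [
    {"key": pseudonymize("cust-1"), "late_payments": 0},
    {"key": pseudonymize("cust-2"), "late_payments": 3},
]
combined = combine(firm_a, firm_b)
```

Here `combined` holds a single record carrying both `monthly_spend` and `late_payments` for the one Customer the two firms have in common, which is the kind of joined view a model or score could be built on.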
(5) Test rules & policies
This utility can act as a sandbox environment for your firm, where you can play around with new data, build new insights and scores, and test your use cases. Once an idea has been qualified with a sample population, you can build and deploy the feature or functionality in your own environment. You also don’t need to store data in the utility environment for long and can purge it once your idea has been tested. This is effectively your innovation platform as well: you invest once and keep exploring new ideas at a much lower cost, instead of struggling every time you want to try something new.
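The purge step could be as simple as dropping sandbox datasets past an agreed age. The sketch below assumes each dataset records when it was loaded. The 30-day window, the dataset names and the `loaded_at` field are invented for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention window; the real value would be agreed by the participants.
MAX_AGE = timedelta(days=30)


def purge_expired(datasets, now, max_age=MAX_AGE):
    """Return only the sandbox datasets loaded within max_age of now."""
    return {
        name: meta
        for name, meta in datasets.items()
        if now - meta["loaded_at"] <= max_age
    }


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
sandbox = {
    "pilot_scores": {"loaded_at": now - timedelta(days=3)},
    "old_experiment": {"loaded_at": now - timedelta(days=90)},
}
sandbox = purge_expired(sandbox, now)
print(sorted(sandbox))
```

After the purge only `pilot_scores` remains, since `old_experiment` is older than the assumed window.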
The contours of the arrangement can be discussed and agreed upon between the stakeholders: who would manage the utility platform, what role each partner will play, whether each firm can opt in and opt out at any point in time, how much the CAPEX and OPEX will be, and so on. You will need one anchor firm to make the initial investment to set this up, and other firms can pay according to their usage. In every industry there are platform players, services firms, data providers or even data consumers who could step up and play the role of the hosting firm, and this is a big opportunity today.