Tristan Baker - linkedin.com/in/tristanbaker
Suresh Raman - linkedin.com/in/ramansuresh
Allison Bellah (in absentia) - linkedin.com/in/allisonbellah
Data Mesh at Intuit
May 13, 2021
©2021 Intuit Inc. All rights reserved. 2
Agenda
Arriving at data mesh
Our vision and four part strategy
Now with 25% more parts!
Q&A
Fire away!
Arriving at data mesh
A brief history of data infrastructure
Today
What “we cannot scale” sounds like from our users
Discovering Data
● Where can I find data about a particular thing (customer,
company, etc)?
● Where can I find the data sourced from a particular product
or service?
Understanding Data
● Who can approve my access so that I can see samples of the
data?
● What is the schema of the data?
● What is the business meaning and context of the data?
● Is this data related to other concepts? Is it joinable to other
data? What is the meaning of the relationship?
Trusting Data
● What system produces this data and at what latency?
● What other systems use this data?
● What is the quality of this data? Is it ‘clean’?
● Which team supports this data if it breaks?
Publishing Data
● How do I describe my data so that others understand what it
means and how to use it?
● Where do I host my data so that other systems can access it?
● Data systems are complicated. How can I build and operate
my process on top of one?
● What are my operational responsibilities once my
process/data is in production?
● How do I meet my compliance requirements for
processing/storing/publishing data?
● Am I duplicating processing/data that already exists?
Consuming Data
● How is this table/topic partitioned?
● Who can approve my production system to access it?
● Will I get alerted if the schema changes?
The future of data infrastructure
● Data treated as code
● Data service as a facet of a product
● Data responsibility decentralized
● Producers take responsibility for data
● Producers serve consumers
● Data platform provides the ecosystem to
govern and manage the lifecycle of data and
machine learning
The provocation
Data Mesh is born
Our vision and four part strategy
Enable more Intuit teams
to more easily use and
create data
Four part strategy
• Stewardship
– Ensures that teams are accountable for a set of defined responsibilities in building and managing their solutions, including
adherence to a set of defined best practices to produce only high quality data.
• Organizing people, code and data
– A systematic approach to organizing the people, code and data which clearly identifies the owners of a business problem and its
solution.
• Self serve products
– A rich suite of self serve products that enable teams to more easily author, deploy, govern and operate their own solutions, aided
by automation and processes that support best practices and high quality as a precondition for deployment.
• Rationalizing data definitions
– A process for rationalizing all critical data definitions at the company so that data concepts like Customer, Product and
Entitlement are unique, re-usable and non-conflicting.
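One way to picture "unique, re-usable and non-conflicting" is a check that flags concepts which different teams have defined with different fields. The sketch below is only an illustration under stated assumptions: the team names, field sets and the `find_conflicts` helper are hypothetical and are not taken from Intuit's process.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical record: (defining team, concept name, set of field names).
Definition = Tuple[str, str, FrozenSet[str]]


def find_conflicts(definitions: List[Definition]) -> Dict[str, List[str]]:
    """Return each concept that has more than one distinct field set,
    mapped to the sorted list of teams that define it."""
    variants = defaultdict(dict)  # concept -> {field set -> [teams]}
    for team, concept, fields in definitions:
        variants[concept].setdefault(fields, []).append(team)
    return {
        concept: sorted(team for teams in by_fields.values() for team in teams)
        for concept, by_fields in variants.items()
        if len(by_fields) > 1  # more than one shape means a conflict
    }


# Hypothetical sample definitions, for illustration only.
definitions = [
    ("billing", "Customer", frozenset({"id", "email"})),
    ("marketing", "Customer", frozenset({"id", "email", "segment"})),
    ("identity", "Product", frozenset({"sku", "name"})),
    ("billing", "Product", frozenset({"sku", "name"})),
]
conflicts = find_conflicts(definitions)
```

Here `Customer` is reported because two teams describe it with different fields, while `Product` is defined twice but identically, so it is a candidate for re-use rather than a conflict.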
Stewardship
Stewardship goals for next year
Organizing People, Code, and Data
Raw information about physical systems describes where data is stored and where code is
executing, so that the data can be located and accessed.
Basic dependency, ownership and classification information provides additional context about physical
data and code locations so that data can be better governed, secured and operated by the owning
teams.
Why organizing people, code and data matters
Private vs Public
~50% of tables are either
temp/sandbox/staging/test/backup tables
- Messes up search & discovery
- Teams consume data not meant
for external use
Data Ownership
~50% of tables don’t have clearly
identified owners
- Erodes Trust
- Copies proliferate
- Operational, Governance risk
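A first pass at measuring these two problems could look like the sketch below. It is a rough illustration, not Intuit's tooling: the `Table` record, the name markers and the sample catalog are all hypothetical, and a real catalog would classify tables from recorded metadata rather than from naming patterns alone.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical name fragments suggesting a table is private (not meant for consumers).
PRIVATE_MARKERS = ("temp", "tmp", "sandbox", "staging", "test", "backup")


@dataclass
class Table:
    name: str
    owner: Optional[str] = None  # owning team, if one is recorded


def is_private(table: Table) -> bool:
    """Guess from the name whether a table is temp/sandbox/staging/test/backup."""
    lowered = table.name.lower()
    return any(marker in lowered for marker in PRIVATE_MARKERS)


def audit(tables: List[Table]) -> dict:
    """Summarize which tables look private and which have no recorded owner."""
    private = [t.name for t in tables if is_private(t)]
    unowned = [t.name for t in tables if not t.owner]
    total = len(tables)
    return {
        "private": private,
        "unowned": unowned,
        "private_share": len(private) / total if total else 0.0,
        "unowned_share": len(unowned) / total if total else 0.0,
    }


# Hypothetical sample catalog, for illustration only.
catalog = [
    Table("customer_profile", owner="identity-team"),
    Table("customer_profile_backup"),
    Table("sandbox_orders", owner="analytics"),
    Table("entitlements"),
]
report = audit(catalog)
```

Substring matching is deliberately naive here (a name such as `latest_orders` would match `test`), which is one reason an explicit private/public flag set by the owning team is a more reliable signal than a naming convention.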
Self Serve Products
Data Processing Capabilities Data Serving Capabilities
Self Serve goals for next year
100% of Top 20 tasks in the Data lifecycle are Self Serve
Infra Provisioning
● Transactional Persistence
● Compute for stream, batch
processing
● Monitor, Debug Infra
● Cost
Data Authoring
● Events, Schemas
● Ingestion
● Transformations
● Entities
● ML Features
● Data Quality,
Observability
● Orchestration
Data Governance
● Access Management
● Key management
● Compliance Controls &
Audit
● Privacy
Rationalizing Data Definitions
Clean entity information with formally defined meaning and relationships enables better data understanding. This
is the purpose of entity definitions. They ensure that data is clean, organized, connected, discoverable and
documented in a formal way.
When you bring it all together, you get Intuit’s Data Mesh
Capturing meaning,
relationship, ownership, and
system dependencies builds a
full, rich picture for everyone.
No tribal knowledge needed!
In this example, the clean
information describes entities
OII Account and Intuit Product
and the Entitled To relationship
between them.
The basic information describes
how the data for these entities
is sourced from the Identity
Universal Service and the
Entitlement Reference Service.
The raw information describes
which Event Bus topic and Data
Lake table the data for these
entities can be found in.
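The three layers in this example can be sketched as a small metadata model. This is a minimal illustration in Python, not Intuit's actual schema. Only the entity names, the Entitled To relationship and the two source services come from the slide; the class and field names, the topic and table names, the owning teams, the meanings, and the pairing of each entity with one particular service are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PhysicalLocation:
    """Raw layer: where the data physically lives."""
    event_bus_topic: str
    data_lake_table: str


@dataclass(frozen=True)
class SourceSystem:
    """Basic layer: which service produces the data and who owns it."""
    service: str
    owner_team: str


@dataclass(frozen=True)
class Entity:
    """Clean layer: a formally defined business concept."""
    name: str
    meaning: str
    source: SourceSystem
    location: PhysicalLocation


@dataclass(frozen=True)
class Relationship:
    """Clean layer: a named, directed link between two entities."""
    subject: Entity
    name: str
    target: Entity


def describe(rel: Relationship) -> List[str]:
    """Render the relationship plus each entity's source and location."""
    lines = [f"{rel.subject.name} --{rel.name}--> {rel.target.name}"]
    for e in (rel.subject, rel.target):
        lines.append(
            f"{e.name}: sourced from {e.source.service} "
            f"(owner: {e.source.owner_team}); "
            f"topic={e.location.event_bus_topic}, "
            f"table={e.location.data_lake_table}"
        )
    return lines


# All topic, table, team and meaning values below are hypothetical placeholders.
account = Entity(
    name="OII Account",
    meaning="An account in the identity system (placeholder meaning)",
    source=SourceSystem("Identity Universal Service", "identity-team"),
    location=PhysicalLocation("example.account.events", "example_lake.account"),
)
product = Entity(
    name="Intuit Product",
    meaning="A product an account can be entitled to (placeholder meaning)",
    source=SourceSystem("Entitlement Reference Service", "entitlement-team"),
    location=PhysicalLocation("example.product.events", "example_lake.product"),
)
entitled_to = Relationship(account, "Entitled To", product)
summary = describe(entitled_to)
```

Keeping the layers as separate types mirrors the slide's point: a consumer starts from the clean entity and relationship, then follows references down to the owning service and the physical topic or table without needing tribal knowledge.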
Q&A
Tristan Baker - linkedin.com/in/tristanbaker
Suresh Raman - linkedin.com/in/ramansuresh
Allison Bellah (in absentia) - linkedin.com/in/allisonbellah


Intuit's Data Mesh - Data Mesh Learning Community meetup 5.13.2021
