Implementing a Data Mesh with
Apache Kafka
Adam Bellemare
What’s Covered
1 A Brief Intro to Data Mesh
2 Why Event Streams?
3 Example: Applying Data Mesh
4 Demo: Confluent’s Proof-of-Concept
A Brief Intro to Data Mesh
Influences
DDD Microservices Event Streaming
Data Marts
DATA MESH
Domain
Inventory
Orders
Shipments
Data Product
...
Domain
Ownership
Local Autonomy
(Organizational Concerns)
Data as a
First-class Product
Product thinking,
“Microservice for Data”
Federated
Governance
Interoperability,
Network Effects
(Organizational Concerns)
Self-serve
Data Platform
Infra Tooling,
Across Domains
1 2 3 4
Principle 1: Domain Ownership
Objective: Data is owned by those who truly understand it
Pattern: Data belongs to the team
who understands it best
Centralized
Data Ownership
Decentralized
Data Ownership
Anti-pattern: Centralized team owns all data
Data Lake: Ownership rests with a centralized data team
Domain Foo
Domain Bar Data Domain
Connector
Connector
Clean Up &
Remodel
Clean Up &
Remodel
ksqlDB
Kafka Streams
Renegotiate Domain Ownership
Renata
Alice
Joe
Centralized Data
Data Mesh: Ownership rests with the domain
Domain Foo
Domain Bar
Connector
Connector
Clean Up &
Remodel
Clean Up &
Remodel
Alice
Joe
Self-Service Platform
Self-Service Platform
ksqlDB
Kafka Streams
End-to-End Ownership
Platform
Support
Renata
Decentralized Data
Principle 2: Data as a First-Class Product
Objective: Make shared data discoverable, addressable, trustworthy, secure, so
other teams can make good use of it.
● Data is treated as a true product, not a by-product.
Domain
Data Product
Data Product, a “Microservice for the Data World”
● Data product is a node on the data mesh, situated within a domain.
● Produces—and possibly consumes—high-quality data within the mesh.
Infra
Code
Data
Creates, manipulates,
serves, etc. that data
Powers the data (e.g., storage) and the
code (e.g., run, deploy, monitor)
“Items About to Expire”
Data Product
Data and metadata,
including history
Principle 3: Self-Serve Data Platform
Provide discovery, access, and self-service compute and publish tools
Objective: Make it easy to both create and use the data products
Domain Domain Domain
Domain
Principle 4: Federated Governance
Objective: Standards of Interoperability, Policies, and Support
● Global standards, data product support. “Paved Roads”.
Self-Serve Data Platform
What is decided
locally by a domain?
What is decided globally?
(implemented and
enforced by platform)
Must balance decentralization against centralization. No silver bullet!
Why Event Streams for Data Mesh?
Data Products Base Requirements
● Immutable
○ Consumers across time provided with the same data
● Time-Stamped
○ Support time-bounded queries and operations
● Well-defined (Schemas)
○ Clarity as to what the data means
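The three base requirements can be illustrated with a minimal sketch in plain Python (no Kafka client). The `LessonEvent` type and its field names are hypothetical, chosen only to mirror the lesson events used later in the deck; this is one possible way to model the idea, not the talk's implementation.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen=True makes instances read-only after creation
class LessonEvent:
    """Hypothetical event type; field names are illustrative only."""
    user_id: str
    lesson_id: str
    status: str
    timestamp: datetime  # event time, required to be timezone-aware

    def __post_init__(self):
        # Well-defined: reject values outside the agreed schema.
        if self.status not in ("Completed", "Failed"):
            raise ValueError(f"unknown status: {self.status}")
        if self.timestamp.tzinfo is None:
            raise ValueError("timestamp must be timezone-aware")


event = LessonEvent(
    user_id="UserId-6384291",
    lesson_id="AID-2729",
    status="Completed",
    timestamp=datetime(2022, 4, 7, 14, 51, 44, tzinfo=timezone.utc),
)

try:
    event.status = "Failed"  # attempt to mutate a published event
    mutated = True
except FrozenInstanceError:
    mutated = False  # immutability held: the assignment was rejected
```

Every consumer that reads `event`, now or later, sees the same values, which is the property the "Immutable" bullet asks for.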
Event Streams Provide An Immutable History
Consumer
Application
Data
Product
0 1 2 3 4 5 6 7 8 9
Bug? Error?
New Aggregate?
Rewind to start of
stream, then
reprocess.
Event Streams let your
consumers replay data
as needed.
Kappa Architecture
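The rewind-and-reprocess idea can be sketched with an in-memory append-only log standing in for a single topic partition. `EventLog` and both counting functions are hypothetical helpers written for this illustration; a real consumer would seek to the earliest offset through its Kafka client instead.

```python
class EventLog:
    """Append-only in-memory log; a stand-in for one partition of a topic."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)
        return len(self._events) - 1  # offset of the appended event

    def read_from(self, offset):
        # Yield (offset, event) pairs starting at the requested offset.
        for i in range(offset, len(self._events)):
            yield i, self._events[i]


log = EventLog()
for status in ["Completed", "Failed", "Failed", "Completed"]:
    log.append({"status": status})


def count_failed_buggy(log):
    # First version of the consumer: a bug counts every event as failed.
    return sum(1 for _offset, _event in log.read_from(0))


def count_failed_fixed(log):
    # Fixed version: rewind to offset 0 and reprocess the same history.
    return sum(1 for _offset, e in log.read_from(0) if e["status"] == "Failed")


buggy = count_failed_buggy(log)
fixed = count_failed_fixed(log)
```

Because the log is never modified, the fixed consumer recovers the correct result from the same data without any help from the producer.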
● Store all the data you need, for as long as you need it.
● Cheap disk! Compaction!
● Confluent Cloud’s Infinite Storage
● OSS: KIP-405: Kafka Tiered Storage (Targeting Kafka 3.3)
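Compaction, mentioned in the list above, keeps at least the latest record for each key. The sketch below shows only that core idea on a small list of (key, value) pairs. The `compact` function and the sample records are hypothetical, and the sketch ignores details of the real broker behavior such as background cleaning of closed segments and delete markers.

```python
def compact(log):
    """Keep only the most recent record for each key, preserving log order."""
    latest_offset = {}
    for offset, (key, _value) in enumerate(log):
        latest_offset[key] = offset  # later records overwrite earlier ones
    return [rec for offset, rec in enumerate(log) if latest_offset[rec[0]] == offset]


# Hypothetical records: two price updates for ID-1, one for ID-2.
log = [
    ("ID-1", {"Price": 29.99}),
    ("ID-2", {"Price": 10.00}),
    ("ID-1", {"Price": 24.99}),
]
compacted = compact(log)
```

After compaction a new consumer still learns the current state of every key, while the stored history shrinks.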
Event Streams: Massively Scalable
Events are Well-Defined
0 1 2 3 4 5 6 7 8 9
Key String ID-2910312
Value String itemName
String Brand
String Construction
Float Price
Baseball Bat
ACME
Wood
29.99
Time String 2022-04-07T14:51:44Z
The stream API:
Kafka Topic + Schema = REST + JSON
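A schema acts as the contract for a topic in the same way a documented JSON body does for a REST endpoint. The following sketch checks a value against the field list shown on the slide. The `validate` helper is hypothetical and uses plain dictionaries; in practice a schema format such as Avro, Protobuf, or JSON Schema with a schema registry would play this role.

```python
# Field names and types taken from the slide's example value.
SCHEMA = {"itemName": str, "Brand": str, "Construction": str, "Price": float}


def validate(value, schema=SCHEMA):
    """Return a list of schema violations; an empty list means the value conforms."""
    errors = []
    for field, expected in schema.items():
        if field not in value:
            errors.append(f"missing field: {field}")
        elif not isinstance(value[field], expected):
            errors.append(f"{field}: expected {expected.__name__}")
    for field in value:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors


good = {"itemName": "Baseball Bat", "Brand": "ACME", "Construction": "Wood", "Price": 29.99}
bad = {"itemName": "Baseball Bat", "Brand": "ACME", "Price": "29.99"}

good_errors = validate(good)
bad_errors = validate(bad)
```

Rejecting the `bad` record at write time keeps the ambiguity (missing field, price as text) from reaching downstream consumers.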
Events are Time-Stamped
0 1 2 3 4 5 6 7 8 9
Time-Stamped Data and
incremental offsets
enable deterministic
reprocessing
Key String ID-2910312
Value String itemName
String Brand
String Construction
Float Price
Baseball Bat
ACME
Wood
29.99
Time String 2022-04-07T14:51:44Z
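A time-bounded read over an ordered log can be sketched as follows. The log is a plain list whose index stands in for the offset; the keys other than `ID-2910312` and the `events_between` helper are hypothetical. Running the same query twice over the unchanged log gives the same result in the same order, which is what makes reprocessing deterministic.

```python
from datetime import datetime, timezone


def parse(ts):
    # The slide's timestamps end in "Z"; replace it so fromisoformat also
    # accepts the string on Python versions before 3.11.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def events_between(log, start, end):
    """Return (offset, event) pairs with start <= Time < end, in offset order."""
    return [(o, e) for o, e in enumerate(log) if start <= parse(e["Time"]) < end]


log = [
    {"key": "ID-2910312", "Time": "2022-04-07T14:51:44Z"},
    {"key": "ID-2910313", "Time": "2022-04-07T15:19:47Z"},  # hypothetical
    {"key": "ID-2910314", "Time": "2022-04-08T09:00:00Z"},  # hypothetical
]
start = datetime(2022, 4, 7, tzinfo=timezone.utc)
end = datetime(2022, 4, 8, tzinfo=timezone.utc)

first_run = events_between(log, start, end)
second_run = events_between(log, start, end)
```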
Event Streams Power Realtime & Batch Processing
All Data
(current and historic)
Streaming
Operational App
Streaming
Analytics
Connector
Connector
Batch-Computed
Analytics
Traditional R/R
Operational App
Millisecond
end-to-end latency
Both operational and
analytical workloads!
Example:
Learn a Language Application
Learn a Language Application
● Lesson types: Written, Audio, Video, Stories, Flashcards
1. What lessons do students fail to complete (24h)? (Analytical)
2. Can we push them lessons based on what they’ve failed? (Operational)
3. Expand the domains to account for paid users (Both)
● Serves content to users
● Collects metrics on users
completing and failing lessons
● User Accounts
● Includes private details, PII,
Payment Info
USERS
SERVING
● Lessons, including written,
video, audio, and flashcards
LESSONS
Alice
Joe
Maria
Simplification!
Could have many more domains
Masters of Their Domains
USERS
User Account Data
Alice
Users
DP key: UserId-6384291
Name: Adam Bellemare
Address: Canada
Email: k2hd9@9fd9s.com
Timestamp: 2022-04-07T15:19:47Z
Event Schema API
Isolates internal model
Format-preserving Encryption
User Accounts Maintained Within a Single Domain
Joe
SERVING
key: UserId-6384291
LessonId: AID-2729
Type: Audio
Status: Completed
Content Serving Domain - Source-Aligned Data Product
DP
Source-Aligned
Data Product
Joe
SERVING
key: UserId-6384291
LessonId: AID-2729
Type: Audio
Status: Completed
Content Serving Domain - Source and Aggregate
DP
Source-Aligned
Data Product
key: UserId-6384291
Completed: <List of Lessons>
Failed: <List of Lessons>
StartDate: 2022-02-02 UTC-0
EndDate: 2022-02-03 UTC-0
Aggregate-Aligned
Data Product
DP
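One way the aggregate-aligned data product could be derived from the source-aligned lesson events is a per-user roll-up over a 24-hour window. This is a sketch under assumptions: the `Timestamp` field on the source event and the `aggregate` function are hypothetical (the slide shows only key, LessonId, Type, and Status), and a real implementation would more likely use a windowed aggregation in ksqlDB or Kafka Streams.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def aggregate(events, window_start, window=timedelta(hours=24)):
    """Group lesson IDs per user into Completed / Failed lists for one window."""
    window_end = window_start + window
    result = defaultdict(lambda: {"Completed": [], "Failed": []})
    for e in events:
        if window_start <= e["Timestamp"] < window_end:
            result[e["UserId"]][e["Status"]].append(e["LessonId"])
    return {
        user: {**lists, "StartDate": window_start, "EndDate": window_end}
        for user, lists in result.items()
    }


utc = timezone.utc
events = [
    {"UserId": "UserId-6384291", "LessonId": "AID-2729", "Status": "Completed",
     "Timestamp": datetime(2022, 2, 2, 9, 0, tzinfo=utc)},
    {"UserId": "UserId-6384291", "LessonId": "LessonID-623", "Status": "Failed",
     "Timestamp": datetime(2022, 2, 2, 18, 30, tzinfo=utc)},
    {"UserId": "UserId-6384291", "LessonId": "AID-2729", "Status": "Failed",
     "Timestamp": datetime(2022, 2, 3, 8, 0, tzinfo=utc)},  # outside the window
]
daily = aggregate(events, datetime(2022, 2, 2, tzinfo=utc))
```

Publishing this roll-up once in the Serving domain saves every consumer from rebuilding the same window logic.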
Lessons Domain - Source-Aligned Data Product
LESSONS
Maria
Lessons
DP
key: LessonID-623
assets: S3://….
medium: Written
subject: Verbs
difficulty: Intermediate
Source-Aligned
Data Product
1) What Lessons do Students Fail to Complete?
Compute 24h course completion and failure rates:
- Could create our own aggregate using:
OR
- Could use the pre-built aggregate-aligned data product
key: UserId-6384291
Completed: <List of Lessons>
Failed: <List of Lessons>
StartDate: 2022-02-02 UTC-0
EndDate: 2022-02-03 UTC-0
Aggregate-Aligned
Data Product
key: UserId-6384291
LessonId: AID-2729
Type: Audio
Status: Failed
Source-Aligned
Data Product
SERVING
SERVING
1) Select the Data Products
Joe
LESSONS
Maria
key: UserId-6384291
Completed: <List of Lessons>
Failed: <List of Lessons>
StartDate: 2022-02-02 UTC-0
EndDate: 2022-02-03 UTC-0
Aggregate-Aligned
Data Product
key: LessonID-623
assets: S3://….
medium: Written
subject: Verbs
difficulty: Intermediate
Source-Aligned
Data Product
Join the List of Failed Lessons
with the Lesson Content
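The join described here can be sketched as a lookup of each failed lesson ID in a table of lesson records. The `join_failed_with_lessons` function and the `LessonID-999` entry are hypothetical; the lesson table is assumed to be materialized from the Lessons data product and keyed by lesson ID.

```python
def join_failed_with_lessons(aggregate, lessons):
    """Enrich each failed lesson ID with its lesson record (inner join on ID)."""
    return [
        {"UserId": aggregate["key"], "LessonId": lesson_id, **lessons[lesson_id]}
        for lesson_id in aggregate["Failed"]
        if lesson_id in lessons  # skip IDs with no matching lesson record
    ]


lessons = {
    "LessonID-623": {"medium": "Written", "subject": "Verbs", "difficulty": "Intermediate"},
}
aggregate = {
    "key": "UserId-6384291",
    "Completed": [],
    "Failed": ["LessonID-623", "LessonID-999"],  # LessonID-999 is hypothetical
}
joined = join_failed_with_lessons(aggregate, lessons)
```

Each joined row now carries the medium, subject, and difficulty needed to report failure rates by lesson attribute.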
SERVING
1) Create a New Processor and Emit Results
ksqlDB
ANALYTICS
BI Tool
Joe
Content Serving
Domain
LESSONS
Maria
key: UserId-6384291
Completed: <List of Lessons>
Failed: <List of Lessons>
StartDate: 2022-02-02 UTC-0
EndDate: 2022-02-03 UTC-0
key: LessonID-623
assets: S3://….
medium: Written
subject: Verbs
difficulty: Intermediate
SERVING
1) OR use Connectors to Integrate with Batch Data
Joe
Content Serving
Domain
LESSONS
Maria
Batch
Analytics
Engine
ANALYTICS
BI Tool
Connect
Connect
Cloud
Storage
Cloud
Storage
SERVING
2) Push New Lessons to User Based on Failures
Joe
Content Serving
Domain
LESSONS
Maria
key: LessonID-623
assets: S3://….
medium: Written
subject: Verbs
difficulty: Intermediate
key: UserId-6384291
LessonId: LessonID-623
Type: Audio
Status: Failed
User failed lesson?
- Find them a new one based on
subject, medium, and difficulty
User passed lesson?
- Offer them a more challenging one
Source-Aligned
Data Products
Operational Use-Case
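The operational rule above could be sketched as a small function over a lesson table. Several points are assumptions made for illustration: the `recommend` function, the difficulty ladder (only "Intermediate" appears in the deck), the extra lesson IDs, and the reading of "based on subject, medium, and difficulty" as "same subject and medium, same difficulty after a failure, one level harder after a pass".

```python
LEVELS = ["Beginner", "Intermediate", "Advanced"]  # assumed ordering


def recommend(event, lessons):
    """Pick the next lesson ID for a user from one lesson-status event."""
    done = lessons[event["LessonId"]]
    if event["Status"] == "Failed":
        target = done["difficulty"]  # try another lesson at the same level
    else:
        # Passed: step up one level, capped at the hardest.
        idx = min(LEVELS.index(done["difficulty"]) + 1, len(LEVELS) - 1)
        target = LEVELS[idx]
    for lesson_id, lesson in lessons.items():
        if (lesson_id != event["LessonId"]
                and lesson["subject"] == done["subject"]
                and lesson["medium"] == done["medium"]
                and lesson["difficulty"] == target):
            return lesson_id
    return None  # nothing suitable in the catalogue


lessons = {
    "LessonID-623": {"medium": "Written", "subject": "Verbs", "difficulty": "Intermediate"},
    "LessonID-624": {"medium": "Written", "subject": "Verbs", "difficulty": "Intermediate"},  # hypothetical
    "LessonID-700": {"medium": "Written", "subject": "Verbs", "difficulty": "Advanced"},  # hypothetical
}
after_fail = recommend({"LessonId": "LessonID-623", "Status": "Failed"}, lessons)
after_pass = recommend({"LessonId": "LessonID-623", "Status": "Completed"}, lessons)
```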
SERVING
Materialize
Data to
Tables
Handle Client
REST Requests
2) Push New Lessons to User Based on Failures
Joe
Content Serving
Domain
LESSONS
Maria
key: LessonID-623
assets: S3://….
medium: Written
subject: Verbs
difficulty: Intermediate
key: UserId-6384291
LessonId: LessonID-623
Type: Audio
Status: Failed
Operational Use-Case Using lesson-completion events for both
operational and analytical use-cases
SERVING
SERVING
3) Expanding the Business Domain: Premium Content
USERS
Alice
LESSONS
Maria
key: LessonID-623
Assets: S3://….
Medium: Written
Subject: Verbs
Difficulty: Intermediate
Status: Premium
key: UserId-6384291
Name: Adam Bellemare
Address: Canada
Email: k2hd9@9fd9s.com
Status: Premium
Add special content that is only
available for premium (paid) users
a) Evolve the User event to
contain a status enum:
(Premium / Normal)
b) Add new content that is only
available for premium users
c) Governance requirement: a
standard definition of
premium across the whole
business.
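Step (a), evolving the User event with a status enum, can be sketched as a reader that accepts both the old and the new event shape. The `Status` enum values come from the slide; the `read_user` helper and the choice of Normal as the default for older events are assumptions for illustration.

```python
from enum import Enum


class Status(Enum):
    NORMAL = "Normal"
    PREMIUM = "Premium"


def read_user(event):
    """Read a user event written with either the old or the new schema."""
    return {
        "key": event["key"],
        "Name": event["Name"],
        # Old events have no Status field; this sketch defaults them to Normal.
        "Status": Status(event.get("Status", Status.NORMAL.value)),
    }


old_event = {"key": "UserId-6384291", "Name": "Adam Bellemare"}
new_event = {"key": "UserId-6384291", "Name": "Adam Bellemare", "Status": "Premium"}

old_user = read_user(old_event)
new_user = read_user(new_event)
```

Adding the field with a default lets consumers replay the full history of the stream, including events published before the change. Defining the enum in one shared place is one way to meet the governance requirement in (c).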
3) Expanding the Business Domain: Premium Content
USERS
Alice
LESSONS
Maria
key: LessonID-623
Assets: S3://….
Medium: Written
Subject: Verbs
Difficulty: Intermediate
Status: Premium
key: UserId-6384291
Name: Adam Bellemare
Address: Canada
Email: k2hd9@9fd9s.com
Status: Premium
SERVING
Materialize
Data to
Tables
Handle Client
REST Requests
Update Business Logic to show
paid users the Premium Content
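The updated business logic might look like the filter below, applied to the tables the Serving domain has materialized from the Users and Lessons data products. The `visible_lessons` function and the second lesson ID are hypothetical; lessons without a Status field are assumed to be available to everyone.

```python
def visible_lessons(user, lessons):
    """Return lesson IDs the user may see; premium lessons need a premium user."""
    is_premium = user.get("Status") == "Premium"
    return [
        lesson_id
        for lesson_id, lesson in lessons.items()
        if is_premium or lesson.get("Status") != "Premium"
    ]


lessons = {
    "LessonID-623": {"Subject": "Verbs", "Status": "Premium"},
    "LessonID-100": {"Subject": "Nouns"},  # hypothetical, no Status field
}
premium_user = {"key": "UserId-6384291", "Status": "Premium"}
normal_user = {"key": "UserId-0000001", "Status": "Normal"}  # hypothetical

for_premium = visible_lessons(premium_user, lessons)
for_normal = visible_lessons(normal_user, lessons)
```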
ksqlDB
ANALYTICS
BI Tool
3) Build new Analytics off Premium
LESSONS
Maria
key: LessonID-623
Assets: S3://….
Medium: Written
Subject: Verbs
Difficulty: Intermediate
Status: Premium
Joe
Content Serving
Domain
key: UserId-6384291
LessonId: LessonID-623
Type: Audio
Status: Completed
Source-Aligned Data Products
CONTENT
Confluent’s
Proof of Concept
(not a product)
Data Mesh Demo
github.com/confluentinc/data-mesh-demo
Implementing a Data Mesh with Apache Kafka with Adam Bellemare | Kafka Summit London 2022
Your Apache Kafka®
journey begins here
developer.confluent.io
@AdamBellemare
