Event Sourcing, Stream Processing & Serverless
Ben Stopford
Office of the CTO, Confluent
What we’re going to talk about
• Event Sourcing
• What is it and how does it relate to Event Streaming?
• Stream Processing as a kind of “Database”
• What does this mean?
• Serverless Functions
• How does this relate?
Can you do event sourcing
with Kafka?
Traditional Event
Sourcing
Popular example: Shopping Cart
DB
Apps
Search
Apps Apps
Database Table matches
what the user sees.
12.42
12.44
12.49
12.50
12.59
Event Sourcing stores events, then derives the
‘current state view’
Apps Apps
DERIVE
Chronological Reduce
Event
Timeseries
of user
activity
Traditional Event Sourcing
(Store immutable events in a database in time order)
[Diagram: Apps persist events into a table of events. Other labels: Search, NoSQL, Monitoring, Security, Streaming Platform]
Traditional Event Sourcing (Read)
[Diagram: Apps, Search, NoSQL, Monitoring and Security around a Streaming Platform]
Chronological
Reduce on read
(done inside the app)
Query by
customer Id
(+session?)
- No schema migration
- Similar to ‘schema on read’
3 Benefits
Evidentiary
Accountants don’t use erasers
(e.g. audit, ledger, git)
Replayability
Recover corrupted data after a
programmatic bug
Analytics
Keep the data needed to
extract trends and behaviors
i.e. non-lossy
(e.g. insight, metrics, ML)
Traditional Event Sourcing
• Use a database (any one will do)
• Create a table and insert events as they occur
• Query all the events associated with your problem*
• Reduce them chronologically to get the current state
*Aggregate ID in DDD parlance
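The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `CartEvent` fields, the "add"/"remove" actions and the sample data are assumptions made up for the example, and an in-memory list stands in for the events table.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class CartEvent:
    timestamp: str    # e.g. "12.42", sortable as text in this toy example
    customer_id: str  # plays the role of the aggregate ID
    action: str       # "add" or "remove" (hypothetical event types)
    item: str


def apply_event(cart: dict, event: CartEvent) -> dict:
    """Fold one event into the cart state (item -> quantity)."""
    updated = dict(cart)
    if event.action == "add":
        updated[event.item] = updated.get(event.item, 0) + 1
    elif event.action == "remove" and updated.get(event.item, 0) > 0:
        updated[event.item] -= 1
        if updated[event.item] == 0:
            del updated[event.item]
    return updated


def current_cart(events, customer_id: str) -> dict:
    """Query the events for one customer, then reduce them chronologically."""
    mine = sorted(
        (e for e in events if e.customer_id == customer_id),
        key=lambda e: e.timestamp,
    )
    return reduce(apply_event, mine, {})


# Stand-in for the table of immutable events.
EVENTS = [
    CartEvent("12.42", "c1", "add", "trousers"),
    CartEvent("12.44", "c1", "add", "jumper"),
    CartEvent("12.49", "c2", "add", "socks"),
    CartEvent("12.50", "c1", "remove", "trousers"),
    CartEvent("12.59", "c1", "add", "hat"),
]

print(current_cart(EVENTS, "c1"))
```

The events are never updated in place; the current state is recomputed from them on every read, which is why the read side gets expensive as the history grows.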
Traditional Event Sourcing with Kafka
• Use Kafka (instead of a database)
• Create a topic (instead of a table) and insert events as they occur
• Query all the events associated with your problem*
• Reduce them chronologically to get the current state
*Aggregate ID in DDD parlance
Confusion: You can’t query Kafka by, say, Customer Id*
*Aggregate ID in DDD parlance
Events are a good write model,
but make a tricky read model
CQRS is a tonic: Cache the projection in a ‘View’
[Diagram: Events/Commands accumulate in the log; a Stream Processor builds a View (Cache/DB/KTable etc.); Apps query the View by customer Id]
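A rough sketch of the idea, in plain Python rather than any Kafka API: a list stands in for the log, and a hypothetical `CartView` class plays the part of the stream processor plus its cached projection. The event fields are assumptions for illustration only.

```python
from collections import defaultdict


class CartView:
    """Read model: cart contents keyed by customer id, built from the log."""

    def __init__(self):
        self._carts = defaultdict(dict)
        self._offset = 0  # how far into the log this view has read

    def catch_up(self, log):
        """Apply any events appended since the last call (the 'stream processor')."""
        for event in log[self._offset:]:
            cart = self._carts[event["customer_id"]]
            delta = 1 if event["action"] == "add" else -1
            quantity = cart.get(event["item"], 0) + delta
            if quantity > 0:
                cart[event["item"]] = quantity
            else:
                cart.pop(event["item"], None)
        self._offset = len(log)

    def query(self, customer_id):
        """Cheap read by customer id; no reduce over history is needed."""
        return dict(self._carts.get(customer_id, {}))


log = []  # write side: events only ever get appended
view = CartView()

log.append({"customer_id": "c1", "action": "add", "item": "book"})
stale = view.query("c1")   # the view has not caught up yet
view.catch_up(log)
fresh = view.query("c1")
print(stale, fresh)
```

The gap between `stale` and `fresh` is the eventual consistency that comes with separating the write model from the read model.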
Even with CQRS, Event Sourcing is Hard
CQRS helps, but it’s still quite hard if you’re a CRUD app
What’s the problem?
Harder:
• Eventually Consistent
• Multi-model (Complexity ∝ #Schemas in the log)
• More moving parts
[Diagram: a CRUD System compared with CQRS; labels: Apps, Search, NoSQL, Monitoring, Security, Streaming Platform]
Eventual Consistency is often good for serving layers
Source of Truth
Every article since
1851
https://www.confluent.io/blog/publishing-apache-kafka-new-york-times/
Normalized assets
(images, articles, bylines, tags
all separate messages)
Denormalized into
“Content View”
If your system is both simple and transactional:
stick with CRUD and an audit/history table
Trigger
Evidentiary Yes
Replayable N/A to web app
Analytics Yes
CDC
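One way the trigger variant might look, sketched with Python's built-in `sqlite3` module. The `cart` and `cart_history` tables and their columns are hypothetical; the point is only that the application keeps doing plain CRUD while triggers record every change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cart (
    customer_id TEXT, item TEXT, quantity INTEGER,
    PRIMARY KEY (customer_id, item)
);
CREATE TABLE cart_history (
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP,
    customer_id TEXT, item TEXT,
    old_quantity INTEGER, new_quantity INTEGER
);

-- Each trigger copies the change into the history table.
CREATE TRIGGER cart_insert AFTER INSERT ON cart BEGIN
    INSERT INTO cart_history (customer_id, item, old_quantity, new_quantity)
    VALUES (NEW.customer_id, NEW.item, NULL, NEW.quantity);
END;
CREATE TRIGGER cart_update AFTER UPDATE ON cart BEGIN
    INSERT INTO cart_history (customer_id, item, old_quantity, new_quantity)
    VALUES (NEW.customer_id, NEW.item, OLD.quantity, NEW.quantity);
END;
CREATE TRIGGER cart_delete AFTER DELETE ON cart BEGIN
    INSERT INTO cart_history (customer_id, item, old_quantity, new_quantity)
    VALUES (OLD.customer_id, OLD.item, OLD.quantity, NULL);
END;
""")

# Ordinary CRUD from the application's point of view.
conn.execute("INSERT INTO cart VALUES ('c1', 'book', 1)")
conn.execute("UPDATE cart SET quantity = 2 WHERE customer_id = 'c1' AND item = 'book'")
conn.execute("DELETE FROM cart WHERE customer_id = 'c1' AND item = 'book'")

history = conn.execute(
    "SELECT old_quantity, new_quantity FROM cart_history ORDER BY rowid"
).fetchall()
remaining = conn.execute("SELECT COUNT(*) FROM cart").fetchone()[0]
print(history, remaining)
```

The current-state table stays the source of truth for reads, and the history table supplies the evidentiary and analytics benefits without a separate read model.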
More advanced: Use a Bi-Temporal Database
Use Traditional Event
Sourcing judiciously,
where it makes sense
CQRS comes into its own
when the events move data
Online Transaction Processing: e.g. a Flight Booking System
- Flight price served 10,000 x #bookings
- Consistency required only at booking time
CQRS with event movement
[Diagram: a “Book Flight” write goes to a streaming platform, where events accumulate in the log; several Views serve “Get Flights” reads. Global Read, Central Write]
The exact same logic applies
to microservices
Microservices
Orders Service
Fraud Service
Billing Service
Email Service
Orders
Fraud service doesn’t have to be consistent with the Orders
service because it just creates new data (new events)
Orders Service
Fraud Service
Billing Service
Email Service
Orders
Consistent?
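As a small illustration of "it just creates new data", here is a sketch of a fraud check written as a pure function over order events. The event shapes, the `FraudCheck` type name and the `limit` threshold are all assumptions for the example.

```python
def fraud_service(order_events, limit=1000.0):
    """Consume order events and emit new fraud-check events.

    The incoming orders are only read, never modified, so there is no
    shared mutable state that would need to be kept consistent.
    """
    return [
        {
            "type": "FraudCheck",
            "order_id": order["order_id"],
            "result": "FAIL" if order["amount"] > limit else "PASS",
        }
        for order in order_events
        if order["type"] == "OrderCreated"
    ]


orders = [
    {"type": "OrderCreated", "order_id": "o1", "amount": 20.0},
    {"type": "OrderCreated", "order_id": "o2", "amount": 5000.0},
    {"type": "OrderCancelled", "order_id": "o3", "amount": 0.0},
]
checks = fraud_service(orders)
print(checks)
```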
Microservices
Orders Service
Fraud Service
Billing Service
Email Service
Orders
Start to build things
“Event Driven”
Event Streaming
Event Streaming is a more general form of Event Sourcing/CQRS
Event Streaming
• Events as shared data model
• Many microservices
• Polyglot persistence
• Data-in-flight
Traditional Event Sourcing
• Events as a storage model
• Single microservice
• Single DB
• data-at-rest
Benefits of Event Streaming
stand out where there are
multiple data sources.
Join, Filter, Transform and Summarize Events from
Different Sources
Fraud Service
Orders
Service
Payment
Service
Customer
Service
Event Log
Projection created in
Kafka Streams API
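The four operations can be shown on toy data. The sketch below is plain Python, not the Kafka Streams API, and the record fields (`order_id`, `status`, `country` and so on) are invented for the example.

```python
from collections import defaultdict


def paid_revenue_by_country(orders, payments, customers):
    """Build a projection from three sources of events."""
    # Filter: keep only successful payments.
    paid = {p["order_id"] for p in payments if p["status"] == "PAID"}

    totals = defaultdict(float)
    for order in orders:
        if order["order_id"] not in paid:
            continue
        # Join: look the customer up in the customers "table".
        customer = customers.get(order["customer_id"])
        if customer is None:
            continue
        # Transform + summarize: re-key by country and add up the amounts.
        totals[customer["country"]] += order["amount"]
    return dict(totals)


customers = {"c1": {"country": "UK"}, "c2": {"country": "DE"}}
orders = [
    {"order_id": "o1", "customer_id": "c1", "amount": 40.0},
    {"order_id": "o2", "customer_id": "c2", "amount": 15.0},
    {"order_id": "o3", "customer_id": "c1", "amount": 10.0},
    {"order_id": "o4", "customer_id": "c2", "amount": 5.0},
]
payments = [
    {"order_id": "o1", "status": "PAID"},
    {"order_id": "o2", "status": "FAILED"},
    {"order_id": "o3", "status": "PAID"},
    {"order_id": "o4", "status": "PAID"},
]
projection = paid_revenue_by_country(orders, payments, customers)
print(projection)
```

A stream processor does the same kind of work continuously as events arrive, rather than over finished lists.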
KStreams & KSQL have different positioning
• KStreams is a library for Dataflow programming:
• App logic lives in stream processor and can use state stores
• Statefulness limited by operational constraints.
• KSQL is a ‘database’ for event preparation:
• App logic is a separate process (can’t use state stores)
• Statefulness unlimited, like a DB.
• App uses consumer in any language
This difference makes most
sense if we look to the
future.
Cloud & Serverless
Thesis
• Serverless provides real-time, event-driven infrastructure and
compute.
• A stream processor provides the corollary: a database-equivalent
for real-time, event-driven data.
Using FaaS
• Write a function
• Upload
• Configure a trigger (HTTP, Event, Object Store, Database, Timer etc.)
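For the "write a function" step, a function can be as small as the sketch below. The `handler(event, context)` shape follows the convention AWS Lambda uses for Python functions; the payload fields (`order`, `lines`, `price`, `quantity`) are assumptions for illustration, and the upload and trigger steps are platform-specific and not shown.

```python
import json


def handler(event, context=None):
    """Stateless function: everything it needs arrives in the event."""
    order = event["order"]
    total = sum(line["price"] * line["quantity"] for line in order["lines"])
    return {
        "statusCode": 200,
        "body": json.dumps({"order_id": order["id"], "total": total}),
    }


# Local invocation with a hand-written event, as a trigger would deliver it.
response = handler(
    {"order": {"id": "o1", "lines": [
        {"price": 2.5, "quantity": 2},
        {"price": 1.0, "quantity": 1},
    ]}}
)
print(response)
```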
FaaS in a Nutshell
• Fully managed (Runs in a container pool)
• Cold starts can be (very) slow: 100ms – 45s (AWS 250ms-7s)
• Pay for execution time (not resources used)
• Auto-scales with load
• 0-1000+ concurrent functions
• Event driven
• Stateless
• Short lived (limit 5-15 mins)
• Weak ordering guarantees
Where is FaaS useful?
• Spiky workloads
• Use cases that don’t typically warrant massive parallelism
e.g. CI systems.
• General purpose programming paradigm?
But there are open questions
Serverless Developer Ecosystem
• Runtime diagnostics
• Monitoring
• Deploy loop
• Testing
• IDE integration
Currently quite poor
[Chart: Amazon, Google and Microsoft plotted between “Harder than current approaches” and “Easier than current approaches”]
Serverless programming will likely become prevalent
In the future it seems
unlikely we’ll manage our
own infrastructure.
Event Sourcing, Stream Processing and Serverless (Benjamin Stopford, Confluent) Kafka Summit London 2019
Event-Streaming approaches this
from a different angle
FaaS is event-driven
But it isn’t streaming
Complex, Timing issues, Scaling limits
Customers
Event Source
Orders
Event Source
Payments
Event Source
Serverless functions handle only one event source
FaaS/μS
FaaS/μS
FaaS/μS
Send SQL
Process
boundary
Orders
Payments
KSQL
Customers
Table
Customers
KSQL simplifies these issues by pre-preparing events
from different sources into one event stream
App
Logic
CREATE STREAM order-payments AS
SELECT * FROM orders, payments, customers
LEFT JOIN…
Order
Payment
Customer
KSQL prepares data so,
when a function is called,
a single event has all the
data that function needs.
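To make the division of labour concrete, here is a plain-Python sketch in which `enrich` stands in for the preparation step and `send_receipt` for the function. Both names and all field names are hypothetical, and `enrich` only imitates a left join; it is not KSQL.

```python
def enrich(order, payments_by_order, customers_by_id):
    """Left join: the order always survives; a missing side becomes None."""
    return {
        "order": order,
        "payment": payments_by_order.get(order["order_id"]),
        "customer": customers_by_id.get(order["customer_id"]),
    }


def send_receipt(event):
    """Stateless function: reads only the enriched event, does no lookups."""
    if event["payment"] is None or event["customer"] is None:
        return None  # not enough data yet; nothing to send
    return (
        f"To {event['customer']['email']}: "
        f"order {event['order']['order_id']} paid ({event['payment']['amount']})"
    )


payments_by_order = {"o1": {"amount": 40.0}}
customers_by_id = {"c1": {"email": "ada@example.com"}}

complete = enrich({"order_id": "o1", "customer_id": "c1"}, payments_by_order, customers_by_id)
unpaid = enrich({"order_id": "o2", "customer_id": "c1"}, payments_by_order, customers_by_id)
print(send_receipt(complete))
print(send_receipt(unpaid))
```

Because the joined state lives in the preparation step, the function itself can stay stateless and be scaled up or down freely.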
KSQL also separates
stateful operations
from event-driven
application logic
KSQL as a “Data Layer” for Serverless Functions
[Diagram: Orders, Payments, Customers and a Customers table flow through KSQL into several FaaS instances. Labels: STATEFUL; STATELESS, Fully elastic; Autoscale with load; Filter, transform, join, summarizations]
Familiar
[Diagram: Apps, Search and Monitoring around a Streaming Platform]
Stateful
Stateless
FaaS
Traditional
Application
Event-Driven
Application
Application
Database
KSQL
Stateful
Data Layer
FaaS
FaaS
FaaS
FaaS
FaaS
Streaming
Stateless
Stateless
Stateless
Compute Layer
Massive linear scalability with elasticity
Use stream processors to
make the consumption of
events both simple and
scalable
Think
Event-
Driven
Summary
• Events underpin the storage models of truthful/factful architectures.
• Event sourcing is most useful when it embraces events as data-in-flight
• A stream processor provides a database-like equivalent for real-time,
event-driven data
• Serverless provides the corollary: real-time, event-driven infrastructure
and compute
Things I didn’t tell you 1/2
• Tools like KSQL provide data provisioning, not state mutation.
• Good for offline services & data pipelines
• Not good for CRUD (but it’s ok to mix and match)
• Kafka’s serverless integration is in its early stages.
• Existing connector for Kafka (Limited functionality).
• Confluent connector coming.
• Can KSQL handle large state?
• Unintended rebalance can stall processing
• Static membership (KIP-345) – name the list of stream processors
• Increase the timeout for rebalance after node removal (group.max.session.timeout.ms)
• Worst case reload: RocksDB ~GbE speed
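As a hedged sketch of the rebalance settings mentioned above: on the client side, static membership is enabled by giving each instance a stable `group.instance.id`, and the window before a missing member triggers a rebalance is governed by `session.timeout.ms`, which the broker caps at `group.max.session.timeout.ms`. The helper name, the group id, the instance name and the numeric values below are assumptions for illustration, not recommendations.

```python
def static_member_config(
    instance_name: str,
    session_timeout_ms: int = 120_000,
    broker_max_session_timeout_ms: int = 1_800_000,  # assumed broker setting
) -> dict:
    """Build consumer settings for one statically named stream processor."""
    if session_timeout_ms > broker_max_session_timeout_ms:
        raise ValueError(
            "session.timeout.ms must not exceed the broker's "
            "group.max.session.timeout.ms"
        )
    return {
        "group.id": "orders-view-builder",   # hypothetical group name
        "group.instance.id": instance_name,  # static membership (KIP-345)
        "session.timeout.ms": session_timeout_ms,
    }


config = static_member_config("view-builder-1")
print(config)
```

With a stable instance id, a processor that restarts within the session timeout can rejoin under the same identity instead of forcing the group to rebalance.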
Things I didn’t tell you 2/2
• Can Kafka be used for long term storage?
• Log files are immutable once they roll (unless compacted)
• Jun spent a decade working on DB2
• Careful:
• Historical reads can stall real-time requests (cached)
• ZFS has several page cache optimizations
• Tiered storage will help
Find out More
• Peeking Behind the Curtains of Serverless Platforms, Wang et al.
• Cloud Programming Simplified: A Berkeley View on Serverless Computing
• Neil Avery’s Journey to Event Driven Part 3. The Affinity Between Events, Streams and Serverless.
• Designing Event Driven Systems, Ben Stopford
Thank you
@benstopford
Book:
https://www.confluent.io/designing-event-driven-systems
Github:
http://bit.ly/kafka-microservice-examples
Example ecosystem built with streams.
Includes KSQL, Control Centre, Elastic etc.
