The Rise of Data in Motion in the Healthcare Industry
Use Cases, Architectures and Examples powered by Apache Kafka
Kai Waehner
Field CTO
contact@kai-waehner.de
linkedin.com/in/kaiwaehner
@KaiWaehner
www.confluent.io
www.kai-waehner.de
Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Healthcare includes many topics…
https://isilanguagesolutions.com/2019/02/25/what-are-the-differences-between-health-care-medical-life-science-and-pharmaceutical-translations/
Healthcare Value Chain
https://www.researchgate.net/publication/265654743_The_business_of_healthcare_innovation_in_the_Wharton_School_curriculum
The world is changing.
“Pandemic drives digital
adoption forward 5 years
in a span of 8 weeks.”
Digital adoption through COVID and beyond, McKinsey
Covid Increases the Pressure
Digital health
ecosystems: A payer
perspective
- McKinsey Article August
2019
Digital
Health
Ecosystem
Disruption
This transformation is
happening everywhere
Doctors become Software
Medical Research becomes Software
Patient Data becomes Software
Security becomes Software
Healthcare Companies and Organizations
What enables this
transformation?
Real-time Data beats Slow Data.
Emergency
Real-time sensor
diagnostics
Intelligent routing
ETA updates
Patient Care
Diagnosis
Treatment
Connected Health
Insurance
Member Enrollment
Claim processing
Omnichannel
patient experience
Cybersecurity
Threat detection
Incident response
Data privacy
protection
This is a fundamental paradigm shift...
Infrastructure
as code
Data in Motion
as continuous
streams of events
Future of the
datacenter
Future of data
Cloud
Event
Streaming
What is Data in Motion?
‘Event’ is what happens in your business
Transportation
GPS in the ambulance sends ETA to the hospital at 5:11am.
Kafka
Insurance Claim
Alice filed a healthcare insurance claim Friday at 7:34pm.
Kafka
Patient Interaction
The doctor updates Sabine’s case status at 9:10am.
Kafka
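A minimal sketch of how one of these events could be represented as a Kafka-style record (a key plus a serialized value). The class name, field names, topic name, JSON encoding, and the concrete date are illustrative assumptions, not taken from the slides.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical event record: what happened, to whom, and when."""
    topic: str        # assumed topic name
    key: str          # entity the event is about (typically the partitioning key)
    what: str         # what happened
    occurred_at: str  # ISO-8601 timestamp

    def to_record(self) -> Tuple[bytes, bytes]:
        """Encode as (key, value) bytes, the shape a producer would send."""
        value = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return self.key.encode("utf-8"), value


# "Alice filed a healthcare insurance claim Friday at 7:34pm."
# The calendar date and UTC time zone are made up for the example.
claim = Event(
    topic="insurance-claims",
    key="alice",
    what="claim_filed",
    occurred_at=datetime(2021, 6, 4, 19, 34, tzinfo=timezone.utc).isoformat(),
)
key_bytes, value_bytes = claim.to_record()
print(key_bytes, value_bytes)
```

Keying by the entity (here the member) is a common choice because it keeps all events for that entity in order on one partition.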
Data in Motion in the Healthcare Industry
Your Business as Streams of Events, powered by Kafka
Insurance Claim
Processing
Contact
Relatives
Patient
Diagnosis
Surgery
Ambulance
Emergency
Situation
An Event Streaming Platform is the
Underpinning of an Event-driven Architecture
MES
ERP
Sensors
Mobile
Customer 360
Real-time
Alerting System
Data
warehouse
Producers
Consumers
Streams of real time events
Stream processing
apps
Connectors
Connectors
Stream processing
apps
Supplier
Alert
Forecast
Inventory Customer
Order
With Confluent
Hadoop ... Device
Logs ... App ...
Microservice
Mainframes
Data
Warehouse Splunk ...
Data Stores Logs 3rd Party Apps Custom Apps / Microservices
Supply Chain
Management
Medical Fraud
Detection
Patient &
Beneficiary 360
Disease Spread
Modeling
HL Data
Transformation ...
Contextual Event-Driven Applications
Universal Event Pipeline
Public Health Data Automation in Confluent
Connectors:
CDC
MQ
REST Proxy
EDI / Batch Input
Processing
Legacy Data
Storage and
Processing
Claims Clinical
Schema
Registry
ksqlDB / Streams
HL7-FHIR
MicroServices
Analytics
Sink Connector
Sinks
Use Cases for Data in Motion in the Healthcare Industry
Know Your Patient (= “Customer 360”)
● Digital Transformation
● eCommerce Optimization
● Product Catalog Optimization
● Product-Inventory Profiling and
Filtering by Customer or Persona
● Real-time Pricing Models
● Next Best Offer/Cross-Sell/
Recommendations
● Omni-Channel Experience
● Customer Profile Updates
● …
Operations (Healthcare 4.0 including
Drug R&D, Patient Care, etc.)
● Supply Chain Optimization
● Shipment Notifications/Delays
● Inventory Processing and
Oversight
● Predictive Inventory Management
● Connected Health
● Improved Care
● Proactive Patient Care
● Patient Notifications
● Pharma Modernization
● M&A Rapid Integration
● …
IT Perspective
● Cybersecurity/
SIEM Optimization
● Mainframe Offload
● Hybrid Cloud Integration/ Bridge
to Cloud
● Middleware/
Messaging Modernization
● Streaming ETL & Analytics
● …
Example: Benefits application process
Software-using
1 3 5
4 6
2
BENEFICIARY FORM
INTAKE
CASE
MANAGER
APPLICATION
REVIEW
BENEFITS
APPLICATION
APPROVE
DENY
Software-defined
1
BENEFICIARY BENEFITS
APP UI
3
APPROVE
DENY
$
BENEFITS
SERVICE
RISK/FRAUD
SERVICE
!
EXTERNAL
AGENCY
SERVICE
2
Weeks
Seconds
Real World Deployments
Covid-19 Electronic Lab Reporting
(CDC) - CELR (COVID Electronic Lab Reporting)
Track the threat of COVID-19 virus to provide comprehensive data for local, state, and federal response
Better understand locations with an increase in incidence
Rapidly aggregate, validate, transform, and distribute laboratory testing data submitted by public health departments and other partners
https://www.confluent.io/resources/kafka-summit-2020/flattening-the-curve-with-kafka/
Cerner – Sepsis Alerting
Supplier of health information technology services, devices, and hardware
~30% of all US Healthcare Data in a Cerner Solution
Central event streaming platform for sepsis alerting in real-time to save lives
Optum – Self-Service Kafka
American pharmacy benefit manager and health care provider (subsidiary of UnitedHealth Group)
Kafka as a Service within UnitedHealth Group
Centrally managed and utilized by over 200 internal application teams
Repeatable, scalable, cost-efficient way to standardize data
From mainframe via CDC into modern data processing and analytics tools
Centene
Integration and Data Processing at Scale in Real-Time
Healthcare Insurer acts as intermediary for both government-sponsored and privately insured health care programs
Largest Medicaid and Medicare Managed Care Provider in the US
https://www.confluent.io/online-talks/building-an-enterprise-eventing-framework-on-demand/
Centene – “CentEvent” Claims System Consolidation
Humana – Real-Time Integration and Analytics
Interoperability platform to transition from an insurance company with elements of health to a health company with elements of insurance.
Consumer-centric, health-plan agnostic, provider agnostic. Cloud resilient and elastic. Event-driven and real-time.
Use cases include real-time updates of health information (connecting HCPs -> pharmacies), reducing pre-authorizations from 20-30 minutes to 1 minute, and real-time home healthcare assistant communication.
https://www.confluent.io/resources/kafka-summit-2020/levi-bailey-keynote-humana-improving-health-with-event-driven-architectures/
Invitae – Data Science and 24/7 Production
Invitae offers gene panels and single-gene testing for a broad range of clinical areas including
hereditary cancer, cardiology, neurology, pediatric genetics, metabolic disorders, immunology, hematology.
Bring comprehensive genetic information into mainstream medical practice
to improve the quality of healthcare for billions of people.
Truly decoupled infrastructure to enable others to join in and consume the data.
Paradigm shift: Building an application entirely of streams.
https://www.confluent.io/kafka-summit-san-francisco-2019/from-zero-to-streaming-healthcare-in-production
Babylon Health – Secure and Agile Integration
Connectivity + Agile Microservice Architecture.
GDPR and PII compliant security.
https://www.confluent.io/kafka-summit-lon19/one-key-to-rule-them-all
Bayer AG – Hybrid Real-Time Data Flow
Adopted a cloud first strategy and started a multi-year transition to the cloud.
Kafka-based cross-datacenter DataHub was created to facilitate migration and to drive shift to real-time stream processing.
Strong enterprise adoption and supports a myriad of use cases
https://www.confluent.io/kafka-summit-sf18/bringing-streaming-data-to-the-masses
Bayer AG – Data Integration and Processing in R&D
Analysis of clinical trials, patents, reports, news, literature, etc.
250M documents, 7TB raw text from 30 data sources.
Variety of document streams with different formats and schemas flowing through several text processing and enrichment steps.
Scalable, reliable Kafka pipelines with Kafka Streams (Java) and Faust (Python) replaced custom, error-prone, non-scalable scripts.
https://www.kafka-summit.org/sessions/bayer-document-stream-pipelines
Celmatix - Reproductive Health Care
Preclinical-stage biotech company that provides digital tools and genetic insights focused on fertility.
Personalized information to disrupt how women approach their lifelong reproductive health journey.
Real-time aggregation of heterogeneous data collected from Electronic Medical Records (EMRs) and genetic data collected from partners through their Personalized Reproductive Medicine (PReM) Initiative.
Proactive reproductive health decisions by leveraging real-time genomics data and applying technologies such as big data analytics, machine learning, AI, and whole-genome DNA sequencing.
Data governance for security and compliance.
https://www.confluent.io/customers/celmatix/
Care.com – Trusted Caregivers
Online marketplace for a range of care services including senior care and housekeeping
Bravo Platform as simple, unified IT architecture to be able to streamline go-to-market initiatives
From a monolithic architecture into a truly decoupled, scalable microservices platform
Migration from Confluent Platform to Confluent Cloud to focus on business problems
Data Governance with Schema Registry across different run times (Java, .NET, Go, etc.)
“Care APIs” (inspired by Google APIs) to define all of their data and service contracts with Protobuf
Enhance security for PII data with fine-grained RBAC and data lineage
https://www.confluent.io/customers/care-com/
Cyber Intelligence Platform
leveraging Kafka Connect, Kafka Streams, Multi-Region Clusters (MRC), and more…
https://www.intel.com/content/www/us/en/it-management/intel-it-best-practices/modern-scalable-cyber-intelligence-platform-kafka.html
What is Apache Kafka?
Kafka: The Trinity of Event Streaming
01
Publish & Subscribe
to Streams of Events
02
Store
your Event Streams
03
Process & Analyze
your Event Streams
Kafka Makes Your Business Real-time
CREATE STREAM payments (user VARCHAR, amount INT)
WITH (kafka_topic = 'all_payments', value_format = 'avro');
CREDIT
SERVICE
ksqlDB
CREATE TABLE credit_scores AS
SELECT user, updateScore(p.amount) AS credit_score
FROM payments AS p
GROUP BY user
EMIT CHANGES;
RISK
SERVICE
ksqlDB
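The two ksqlDB statements above turn a stream of payment events into a continuously updated table of scores. The following is a rough in-memory sketch of that stream-to-table aggregation, not ksqlDB itself. `updateScore` is not defined on the slide, so the `update_score` rule here (a running sum of amounts) is purely an assumed placeholder.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def update_score(current: int, amount: int) -> int:
    """Placeholder scoring rule (assumption): accumulate payment amounts."""
    return current + amount


def credit_scores(payments: Iterable[Tuple[str, int]]) -> Iterator[Tuple[str, int]]:
    """Consume (user, amount) events and emit the changed table row each time."""
    table: Dict[str, int] = defaultdict(int)  # materialized state, one row per user
    for user, amount in payments:
        table[user] = update_score(table[user], amount)
        yield user, table[user]  # analogous to EMIT CHANGES


stream = [("alice", 50), ("bob", 20), ("alice", 30)]
changes = list(credit_scores(stream))
print(changes)  # [('alice', 50), ('bob', 20), ('alice', 80)]
```

Each incoming event updates exactly one row and emits that row downstream, which is what lets a consuming service react per event instead of polling a batch result.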
Databases
Messaging
ETL / Data Integration
Data Warehouse
Why can’t I do this with my
existing data platforms?
Enterprise Data Platform Requirements Are Shifting
1. Built for historical data -> built for real-time events. Value: trigger real-time workflows (e.g. real-time order management)
2. Scalable for transactional data -> scalable for ALL data. Value: scale across the enterprise (e.g. customer 360)
3. Transient -> persistent + durable. Value: build mission-critical apps with zero data loss (e.g. instant payments)
4. Raw data -> enriched data. Value: add context & situational awareness (e.g. ride sharing ETA)
Only Event Streaming Has All 4 Requirements
Requirements compared: built for real-time events, scalable for all data, persistent & durable, capable of enrichment.
Databases: good for transactional applications
Messaging: good for ultra low-latency, fire-and-forget use cases
ETL / Data Integration: good for batch data integration
Data Warehouse: good for historical analytics and reporting
Event Streaming: platform for event-driven transformation (scalable messaging + real-time data integration + stream processing)
Kafka is Complementary to other Middleware
in the Enterprise Architecture
Orders Customers
Payments
Stock
REST
JMS
ESB
REST
CRM
Mainframe
SOAP
…
Kafka
Kafka
Kafka
Kafka
SOAP
API Management
HTTP
MQ
Machine Learning and Event Streaming
Improve Traditional and Build New Use Cases
in Pharma and Life Sciences
Streams Processing / AI / ML
Clinical Trials
Patents,
Text etc
Structured &
unstructured
Data
IoT & Business
Applications
Multi-
Hybrid-
Cloud
Project Example:
Drug Discovery
Use Case: Drug Discovery
“On average, it takes at least ten
years for a new medicine to
complete the journey from initial
discovery to the marketplace”
PhRMA
http://phrma-docs.phrma.org/sites/default/files/pdf/rd_brochure_022307.pdf
Recursion – Discovering Drugs in Real-Time
Accelerate drug discovery.
Find drug treatments by processing biological images.
Massively parallel system.
Combines experimental biology, artificial intelligence,
automation and real-time event streaming.
https://www.confluent.io/customers/recursion
https://www.confluent.io/kafka-summit-san-francisco-2019/discovering-drugs-with-kafka-streams
Recursion
Partnering with Roche and Bayer
https://www.bloomberg.com/news/articles/2021-12-07/roche-signs-machine-learning-neuroscience-deal-with-recursion
Image and Video Processing
… (at a high level) is “just” pixels (arrays of 0s and 1s) and matrix multiplication
Drug Discovery
in manual and slow, bursty batch mode, not scalable
Drug Discovery
in automated, scalable, reliable real-time mode
Digital Image
Processing
(e.g. noise
reduction)
Streaming Analytics for
Drug Discovery in Real Time at Scale
Real Time
Integration
Layer
Batch
Reporting
Platform
BI
Dashboard
Event
Streaming
Platform
Real Time
Integration
Layer
Laboratory
Streaming Platform
Other Components
Automated
Drug
Analysis
All
Data
Processed
Images
Ingest
Images
Human
Intelligence
Data
Processing
(e.g. filtering)
Stateful
Workflow
Orchestration
Digital Image Processing for Drug Discovery
Find drug treatments by processing biological images:
• ML models can be trained to decide between healthy cells and disease
cells with problematic genes
• Grow healthy cells and disease cells in labs
• Apply different drugs -> Make disease cells look healthy again
Digital Image
Processing
(OpenCV
SaaS Service
REST API)
Kafka, ksqlDB and TensorFlow for
Drug Discovery in Real Time at Scale
Kafka Client
(.NET C++)
Batch
Reporting
Platform
BI
Dashboard
Confluent
Server
Tiered Storage
Kafka
Connect
Laboratory
(Windows Machines)
Confluent Platform
Other Components
Model Training
and Scoring
(Python Client +
TensorFlow)
All Data
Processed
Images
Images
Human
Intelligence
Streaming
ETL
(ksqlDB)
Stateful
Workflow
Orchestration
(Kafka Streams)
Database
(MySQL) Kafka Connect
(Oracle CDC)
Historical Drugs Data
Ingestion of Images
Replication
Cluster Linking
Kafka
Connect
Laboratory
Data Preprocessing
Preprocessing
Filter, transform, anonymize, extract features,
reduce noise, enhance brightness / contrast
Streams
Data Ready
For Model Training
SELECT image_id, i.experiment_id, image_details
FROM image_channel i
LEFT JOIN experiment_database e ON i.experiment_id = e.experiment_id
WHERE e.image_type = 'black_and_white'
EMIT CHANGES;
Data Processing with ksqlDB
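To make the semantics of the query above concrete, here is a rough pure-Python sketch of a stream-to-table lookup join followed by a filter. The sample records and the dictionary standing in for `experiment_database` are invented for illustration. A real deployment would run the join inside ksqlDB rather than in application code.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical lookup table keyed by experiment_id (stands in for experiment_database).
experiments: Dict[str, dict] = {
    "exp-1": {"image_type": "black_and_white"},
    "exp-2": {"image_type": "color"},
}


def black_and_white_images(images: Iterable[dict], table: Dict[str, dict]) -> Iterator[dict]:
    """Left-join each image event to its experiment, then keep black-and-white ones."""
    for image in images:
        experiment = table.get(image["experiment_id"])  # None when there is no match
        # The WHERE clause also drops unmatched rows: a NULL image_type never equals a value.
        if experiment is not None and experiment["image_type"] == "black_and_white":
            yield {
                "image_id": image["image_id"],
                "experiment_id": image["experiment_id"],
                "image_details": image["image_details"],
            }


image_channel = [
    {"image_id": "img-1", "experiment_id": "exp-1", "image_details": "well A1"},
    {"image_id": "img-2", "experiment_id": "exp-2", "image_details": "well A2"},
    {"image_id": "img-3", "experiment_id": "exp-9", "image_details": "well A3"},
]
selected = list(black_and_white_images(image_channel, experiments))
print([row["image_id"] for row in selected])  # ['img-1']
```

Because the filter tests a column from the right-hand side, the LEFT JOIN here behaves like an inner join for the rows that survive.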
TensorFlow Model —
Convolutional Neural Network (CNN)
for Image Recognition (as part of the ML Pipeline)
Direct streaming ingestion
for model training and / or scoring
with TensorFlow I/O + Kafka Plugin
(no additional data storage
like S3 or HDFS required!)
Time
Model B
Model A
Producer
Distributed Commit Log
Streaming Ingestion and Model Training
with TensorFlow IO
https://github.com/tensorflow/io
Confluent Tiered Storage for Kafka
Use Cases for Reprocessing Historical Events
Give me all events from time A to time B
Real-time Producer
Time
• New consumer application
• Error-handling
• Compliance / regulatory processing
• Query and analyze existing events
• Schema changes in analytics platform
• Model training
Real-time Consumer
Consumer of Historical Data
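The request "give me all events from time A to time B" can be pictured as a timestamp-to-offset lookup on an append-only log, followed by a sequential read. The sketch below is a simplified in-memory model with made-up events, not Kafka client code. Kafka clients offer a comparable timestamp-based offset lookup (for example `offsetsForTimes` in the Java consumer).

```python
from bisect import bisect_left
from typing import List, Tuple

# Append-only log of (timestamp_ms, payload); timestamps are non-decreasing.
log: List[Tuple[int, str]] = [
    (1000, "image-ingested"),
    (2000, "image-processed"),
    (3000, "model-scored"),
    (4000, "report-updated"),
]


def offset_for_time(entries: List[Tuple[int, str]], ts_ms: int) -> int:
    """Return the first offset whose timestamp is >= ts_ms."""
    timestamps = [t for t, _ in entries]
    return bisect_left(timestamps, ts_ms)


def replay(entries: List[Tuple[int, str]], start_ms: int, end_ms: int) -> List[Tuple[int, str]]:
    """Events with start_ms <= timestamp < end_ms, in log order."""
    first = offset_for_time(entries, start_ms)
    last = offset_for_time(entries, end_ms)
    return entries[first:last]


print(replay(log, 2000, 4000))  # [(2000, 'image-processed'), (3000, 'model-scored')]
```

A consumer of historical data and a real-time consumer read the same log. They differ only in the offset they start from.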
Local Predictions
Model Training
in Cloud
Model Deployment
at the Edge
Analytic Model
Separation of
Model Training and Model Inference
Streams
Input Event
Prediction
Request
Response
Model Serving
TensorFlow Serving
gRPC / HTTP
Application
Stream Processing with External Model and RPC
Model
CREATE STREAM ImageAnalysis AS
SELECT image_id, analyzeImage(image_details)
FROM image_channel;
User Defined Function (UDF)
Embedded Model Deployment with
Apache Kafka, ksqlDB and TensorFlow
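The embedded-model pattern means the scoring function runs inside the stream processor, so each event is scored in-process without a network call. The sketch below only illustrates that shape in Python. The `analyze_image` body is an assumed stand-in (a threshold on mean pixel intensity), not a TensorFlow model, and ksqlDB UDFs themselves are written in Java.

```python
from typing import Callable, Dict, Iterable, Iterator, List


def analyze_image(pixels: List[float]) -> str:
    """Stand-in for a trained model (assumption): threshold on mean intensity."""
    mean = sum(pixels) / len(pixels)
    return "healthy" if mean >= 0.5 else "diseased"


def image_analysis(
    events: Iterable[Dict], model: Callable[[List[float]], str]
) -> Iterator[Dict]:
    """Apply the embedded model to every event in-process, with no RPC."""
    for event in events:
        yield {"image_id": event["image_id"], "analysis": model(event["image_details"])}


image_channel = [
    {"image_id": "img-1", "image_details": [0.9, 0.8, 0.7]},
    {"image_id": "img-2", "image_details": [0.1, 0.2, 0.3]},
]
results = list(image_analysis(image_channel, analyze_image))
print(results)
```

Compared with the external model server and RPC variant, embedding trades independent model deployment for lower latency and fewer moving parts.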
Model Training and Scoring
with the same ML Pipeline (or even in the same Application)
• Data Science team responsible for the whole model lifecycle
• Beloved Python tool stack (Pandas, scikit-learn, TensorFlow, Jupyter, …)
• 24/7 production scale with Confluent Python Client (e.g. deployed in Docker containers on Kubernetes)
Digital Image
Processing
(External SaaS
Service + REST)
Kafka, ksqlDB and TensorFlow for
Drug Discovery in Real Time at Scale
Kafka Client
(.NET C++)
Batch
Reporting
Platform
BI
Dashboard
Confluent
Server
Tiered Storage
Kafka
Connect
Laboratory
(Windows Machines)
Confluent Platform
Other Components
Model Training
and Scoring
(Python Client +
TensorFlow)
All Data
Processed
Images
Images
Human
Intelligence
Streaming
ETL
(ksqlDB)
Stateful
Workflow
Orchestration
(Kafka Streams)
Database
(MySQL) Kafka Connect
(Oracle CDC)
Historical Drugs Data
Data in Motion Is The Future Of Data
Infrastructure
as code
Data in motion
as continuous
streams of events
Future of the
datacenter
Future of data
Cloud
Event
Streaming
Why Confluent?
The Rise of Data in Motion
2010
Apache Kafka
created at LinkedIn by
Confluent founders
2014
2020
80%
Fortune 100
Companies
trust and use
Apache Kafka
I N V E S T M E N T & T I M E
V
A
L
U
E
3
4
5
1
2
Event Streaming Maturity Model
Initial Awareness /
Pilot (1 Kafka
Cluster)
Start to Build
Pipeline / Deliver 1
New Outcome
(1 Kafka Cluster)
Mission-Critical
Deployment
(Stretched, Hybrid,
Multi-Region)
Build Contextual
Event-Driven Apps
(Stretched, Hybrid,
Multi-Region)
Central Nervous
System
(Global Kafka)
Product, Support, Training, Partners, Technical Account Management...
Car Engine Car Self-driving Car
Confluent completes Apache Kafka. Cloud-native.
Everywhere.
Kai Waehner
Field CTO
contact@kai-waehner.de
@KaiWaehner
kai-waehner.de
confluent.io
linkedin.com/in/kaiwaehner
Questions? Feedback?
Let’s connect!

More Related Content

PPTX
The Top 5 Apache Kafka Use Cases and Architectures in 2022
PDF
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry
PDF
Apache Kafka for Automotive Industry, Mobility Services & Smart City
PDF
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
PDF
Apache Kafka® Use Cases for Financial Services
PDF
Apache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
PDF
Kafka for Real-Time Replication between Edge and Hybrid Cloud
The Top 5 Apache Kafka Use Cases and Architectures in 2022
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Apache Kafka for Automotive Industry, Mobility Services & Smart City
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Apache Kafka® Use Cases for Financial Services
Apache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Kafka for Real-Time Replication between Edge and Hybrid Cloud

What's hot (20)

PDF
Can Apache Kafka Replace a Database?
PDF
CDC patterns in Apache Kafka®
PPTX
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
PDF
Disaster Recovery Plans for Apache Kafka
PDF
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
PPTX
Azure Synapse Analytics Overview (r2)
PDF
Kafka Streams: What it is, and how to use it?
PDF
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....
PDF
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
PDF
Apache Kafka in the Transportation and Logistics
PDF
Kafka Connect and Streams (Concepts, Architecture, Features)
PPTX
Kafka + Uber- The World’s Realtime Transit Infrastructure, Aaron Schildkrout
PDF
When NOT to use Apache Kafka?
PDF
Top 5 Event Streaming Use Cases for 2021 with Apache Kafka
PDF
AWS Cloud Adoption Framework and Workshops
PPTX
Splunk Architecture
PDF
Top use cases for 2022 with Data in Motion and Apache Kafka
PPTX
Databricks on AWS.pptx
PPTX
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
PPTX
Azure Synapse Analytics Overview (r1)
Can Apache Kafka Replace a Database?
CDC patterns in Apache Kafka®
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Disaster Recovery Plans for Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
Azure Synapse Analytics Overview (r2)
Kafka Streams: What it is, and how to use it?
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
Apache Kafka in the Transportation and Logistics
Kafka Connect and Streams (Concepts, Architecture, Features)
Kafka + Uber- The World’s Realtime Transit Infrastructure, Aaron Schildkrout
When NOT to use Apache Kafka?
Top 5 Event Streaming Use Cases for 2021 with Apache Kafka
AWS Cloud Adoption Framework and Workshops
Splunk Architecture
Top use cases for 2022 with Data in Motion and Apache Kafka
Databricks on AWS.pptx
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Azure Synapse Analytics Overview (r1)
Ad

Similar to Apache Kafka in the Healthcare Industry (20)

PDF
Apache Kafka in the Healthcare Industry
PPTX
Healthcare Analytics Summit Keynote Fall 2017
PPTX
Hadoop Enabled Healthcare
PDF
Machine Learning with Apache Kafka in Pharma and Life Sciences
PPTX
Cloud-Based Open-Platform Data Solutions: The Best Way to Meet Today’s Growin...
PPTX
The Digitization of Healthcare: Why the Right Approach Matters and Five Steps...
PPTX
The Data Operating System: Changing the Digital Trajectory of Healthcare
PPTX
The Data Operating System: Changing the Digital Trajectory of Healthcare
PPTX
HPE and Hortonworks join forces to Deliver Healthcare Transformation
PDF
HETT Conference Olympic Central 2014 Integrating Healthcare Delivery
PPTX
Is Big Data a Big Deal...or Not?
PDF
Top Five Digital Trends Fueling Disruption in healthcare
PPTX
A Data and Analytics Ecosystem, Purpose-Built for Healthcare
PPTX
Data Is the New Strategic Asset in M&As: Is Ripping and Replacing EHRs Really...
PPTX
Healthcare IT Management
PDF
Apache Kafka in the Insurance Industry
PPTX
The biggest opportunities in digital health for Turkey's Medical Sector
PDF
Connecting the Dots: How Open Health Data will Accelerate Care Delivery Innov...
PDF
Data-driven Healthcare for Payers
PDF
Review paper on Big Data in healthcare informatics
Apache Kafka in the Healthcare Industry
Healthcare Analytics Summit Keynote Fall 2017
Hadoop Enabled Healthcare
Machine Learning with Apache Kafka in Pharma and Life Sciences
Cloud-Based Open-Platform Data Solutions: The Best Way to Meet Today’s Growin...
The Digitization of Healthcare: Why the Right Approach Matters and Five Steps...
The Data Operating System: Changing the Digital Trajectory of Healthcare
The Data Operating System: Changing the Digital Trajectory of Healthcare
HPE and Hortonworks join forces to Deliver Healthcare Transformation
HETT Conference Olympic Central 2014 Integrating Healthcare Delivery
Is Big Data a Big Deal...or Not?
Top Five Digital Trends Fueling Disruption in healthcare
A Data and Analytics Ecosystem, Purpose-Built for Healthcare
Data Is the New Strategic Asset in M&As: Is Ripping and Replacing EHRs Really...
Healthcare IT Management
Apache Kafka in the Insurance Industry
The biggest opportunities in digital health for Turkey's Medical Sector
Connecting the Dots: How Open Health Data will Accelerate Care Delivery Innov...
Data-driven Healthcare for Payers
Review paper on Big Data in healthcare informatics
Ad

Apache Kafka in the Healthcare Industry

  • 1. The Rise of Data in Motion in the Healthcare Industry Use Cases, Architectures and Examples powered by Apache Kafka Kai Waehner Field CTO contact@kai-waehner.de linkedin.com/in/kaiwaehner @KaiWaehner www.confluent.io www.kai-waehner.de
  • 2. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de Healthcare includes many topics… https://isilanguagesolutions.com/2019/02/25/what-are-the-differences-between-health-care-medical-life-science-and-pharmaceutical-translations/
  • 3. Healthcare Value Chain. Source: https://www.researchgate.net/publication/265654743_The_business_of_healthcare_innovation_in_the_Wharton_School_curriculum
  • 4. The world is changing.
  • 5. Covid Increases the Pressure. “Pandemic drives digital adoption forward 5 years in a span of 8 weeks.” (Digital adoption through COVID and beyond, McKinsey)
  • 6. Digital Health Ecosystem Disruption. Source: Digital health ecosystems: A payer perspective (McKinsey article, August 2019)
  • 7. This transformation is happening everywhere
  • 8. Doctors become Software
  • 9. Medical Research becomes Software
  • 10. Patient Data becomes Software
  • 11. Security becomes Software
  • 12. Healthcare Companies and Organizations
  • 13. What enables this transformation?
  • 14. Real-time Data beats Slow Data.
  • 15. Real-time Data beats Slow Data. Emergency: real-time sensor diagnostics, intelligent routing, ETA updates. Patient Care: diagnosis, treatment, connected health. Insurance: member enrollment, claim processing, omnichannel patient experience. Cybersecurity: threat detection, incident response, data privacy protection.
  • 16. This is a fundamental paradigm shift... Cloud: future of the datacenter (infrastructure as code). Event Streaming: future of data (Data in Motion as continuous streams of events).
  • 17. What is Data in Motion?
  • 18. ‘Event’ is what happens in your business. Each of these events is sent to Kafka. Transportation: GPS in the ambulance sends ETA to the hospital at 5:11am. Insurance Claim: Alice filed a healthcare insurance claim Friday at 7:34pm. Patient Interaction: The doctor updates Sabine’s case status at 9:10am.
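The ambulance example on slide 18 can be pictured as a small, immutable record. The sketch below is only an illustration in plain Python: the topic name, key, payload fields, and the calendar date are assumptions (the slide gives only the time of day), and nothing here talks to a real Kafka cluster.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
import json


@dataclass(frozen=True)
class Event:
    """One immutable fact: what happened, to which entity, and when."""
    topic: str            # hypothetical topic name
    key: str              # entity the event is about
    value: dict           # payload describing what happened
    timestamp: datetime   # when it happened

    def to_json(self) -> str:
        """Serialize the event so it could be sent over the wire."""
        return json.dumps({
            "topic": self.topic,
            "key": self.key,
            "value": self.value,
            "timestamp": self.timestamp.isoformat(),
        })


# The date is arbitrary; only "5:11am" comes from the slide.
eta_event = Event(
    topic="ambulance-eta",
    key="ambulance-17",
    value={"hospital": "general-hospital", "eta_minutes": 9},
    timestamp=datetime(2022, 1, 14, 5, 11, tzinfo=timezone.utc),
)
print(eta_event.to_json())
```

Keying the event by the ambulance is a design choice of this sketch: it keeps all updates about one vehicle together, which is the usual reason to pick an entity identifier as the key.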
  • 19. Data in Motion in the Healthcare Industry: Your Business as Streams of Events, powered by Kafka. Example events: Insurance Claim Processing, Contact Relatives, Patient Diagnosis, Surgery, Ambulance, Emergency Situation.
  • 20. An Event Streaming Platform is the Underpinning of an Event-driven Architecture. Producers (MES, ERP, Sensors, Mobile) and consumers (Customer 360, Real-time Alerting System, Data warehouse) exchange streams of real-time events (Supplier, Alert, Forecast, Inventory, Customer, Order) through connectors and stream processing apps.
  • 21. With Confluent. Universal Event Pipeline: Data Stores, Logs, 3rd Party Apps, Custom Apps / Microservices (Hadoop, Device Logs, App, Microservice, Mainframes, Data Warehouse, Splunk, ...). Contextual Event-Driven Applications: Supply Chain Management, Medical Fraud Detection, Patient & Beneficiary 360, Disease Spread Modeling, HL Data Transformation, ...
  • 22. Public Health Data Automation in Confluent. Diagram labels: Connectors (CDC, MQ, REST Proxy, EDI / Batch); Input; Processing; Legacy Data Storage and Processing; Claims; Clinical; Schema Registry; ksqlDB / Streams; HL7-FHIR; MicroServices; Analytics; Sink Connector; Sinks.
  • 23. Use Cases for Data in Motion in the Healthcare Industry. Know Your Patient (= “Customer 360”): Digital Transformation; eCommerce Optimization; Product Catalog Optimization; Product-Inventory Profiling and Filtering by Customer or Persona; Real-time Pricing Models; Next Best Offer / Cross-Sell / Recommendations; Omni-Channel Experience; Customer Profile Updates; … Operations (Healthcare 4.0 including Drug R&D, Patient Care, etc.): Supply Chain Optimization; Shipment Notifications / Delays; Inventory Processing and Oversight; Predictive Inventory Management; Connected Health; Improved Care; Proactive Patient Care; Patient Notifications; Pharma Modernization; M&A Rapid Integration; … IT Perspective: Cybersecurity / SIEM Optimization; Mainframe Offload; Hybrid Cloud Integration / Bridge to Cloud; Middleware / Messaging Modernization; Streaming ETL & Analytics; …
  • 24. Example: Benefits application process. Software-using (weeks): Beneficiary, Form Intake, Case Manager, Application Review, Benefits Application, Approve / Deny. Software-defined (seconds): Beneficiary, Benefits App UI, Benefits Service, Risk/Fraud Service, External Agency Service, Approve / Deny.
  • 25. Real World Deployments
  • 26. Covid-19 Electronic Lab Reporting (CDC): CELR (COVID Electronic Lab Reporting). Track the threat of the COVID-19 virus to provide comprehensive data for local, state, and federal response. Better understand locations with an increase in incidence. Rapidly aggregate, validate, transform, and distribute laboratory testing data submitted by public health departments and other partners. https://www.confluent.io/resources/kafka-summit-2020/flattening-the-curve-with-kafka/
  • 27. Cerner – Sepsis Alerting. Supplier of health information technology services, devices, and hardware. ~30% of all US healthcare data is in a Cerner solution. Central event streaming platform for sepsis alerting in real time to save lives.
  • 28. Optum – Self-Service Kafka. American pharmacy benefit manager and health care provider (subsidiary of UnitedHealth Group). Kafka as a Service within UnitedHealth Group. Centrally managed and utilized by over 200 internal application teams. Repeatable, scalable, cost-efficient way to standardize data. From mainframe via CDC into modern data processing and analytics tools.
  • 29. Centene: Integration and Data Processing at Scale in Real-Time. Healthcare insurer that acts as an intermediary for both government-sponsored and privately insured health care programs. Largest Medicaid and Medicare Managed Care Provider in the US. https://www.confluent.io/online-talks/building-an-enterprise-eventing-framework-on-demand/
  • 30. Centene – “CentEvent” Claims System Consolidation
  • 31. Humana – Real-Time Integration and Analytics. Interoperability platform to transition from an insurance company with elements of health to truly a health company with elements of insurance. Consumer-centric, health plan agnostic, provider agnostic. Cloud resilient and elastic. Event-driven and real-time. Use cases include real-time updates of health information (connecting HCPs -> pharmacies), reducing pre-authorizations from 20-30 minutes to 1 minute, and real-time home healthcare assistant communication. https://www.confluent.io/resources/kafka-summit-2020/levi-bailey-keynote-humana-improving-health-with-event-driven-architectures/
  • 32. Invitae – Data Science and 24/7 Production. Invitae offers gene panels and single-gene testing for a broad range of clinical areas including hereditary cancer, cardiology, neurology, pediatric genetics, metabolic disorders, immunology, and hematology. Bring comprehensive genetic information into mainstream medical practice to improve the quality of healthcare for billions of people. Truly decoupled infrastructure to enable others to join in and consume the data. Paradigm shift: building an application entirely of streams. https://www.confluent.io/kafka-summit-san-francisco-2019/from-zero-to-streaming-healthcare-in-production
  • 33. Babylon Health – Secure and Agile Integration. Connectivity + agile microservice architecture. GDPR and PII compliant security. https://www.confluent.io/kafka-summit-lon19/one-key-to-rule-them-all
  • 34. Bayer AG – Hybrid Real-Time Data Flow. Adopted a cloud-first strategy and started a multi-year transition to the cloud. A Kafka-based cross-datacenter DataHub was created to facilitate migration and to drive the shift to real-time stream processing. Strong enterprise adoption, supporting a myriad of use cases. https://www.confluent.io/kafka-summit-sf18/bringing-streaming-data-to-the-masses
  • 35. Bayer AG – Data Integration and Processing in R&D. Analysis of clinical trials, patents, reports, news, literature, etc. 250M documents, 7TB raw text from 30 data sources. Variety of document streams with different formats and schemas flowing through several text processing and enrichment steps. Scalable, reliable Kafka pipelines with Kafka Streams (Java) and Faust (Python) replaced custom, error-prone, non-scalable scripts. https://www.kafka-summit.org/sessions/bayer-document-stream-pipelines
  • 36. Celmatix – Reproductive Health Care. Preclinical-stage biotech company that provides digital tools and genetic insights focused on fertility. Personalized information to disrupt how women approach their lifelong reproductive health journey. Real-time aggregation of heterogeneous data collected from Electronic Medical Records (EMRs) and genetic data collected from partners through their Personalized Reproductive Medicine (PReM) Initiative. Proactive reproductive health decisions by leveraging real-time genomics data and applying technologies such as big data analytics, machine learning, AI, and whole-genome DNA sequencing. Data governance for security and compliance. https://www.confluent.io/customers/celmatix/
  • 37. Care.com – Trusted Caregivers. Online marketplace for a range of care services including senior care and housekeeping. Bravo Platform as a simple, unified IT architecture to streamline go-to-market initiatives. From a monolithic architecture to a truly decoupled, scalable microservices platform. Migration from Confluent Platform to Confluent Cloud to focus on business problems. Data governance with Schema Registry across different runtimes (Java, .NET, Go, etc.). “Care APIs” (inspired by Google APIs) to define all of their data and service contracts with Protobuf. Enhanced security for PII data with fine-grained RBAC and data lineage. https://www.confluent.io/customers/care-com/
  • 38. Cyber Intelligence Platform leveraging Kafka Connect, Kafka Streams, Multi-Region Clusters (MRC), and more… https://www.intel.com/content/www/us/en/it-management/intel-it-best-practices/modern-scalable-cyber-intelligence-platform-kafka.html
  • 39. What is Apache Kafka?
  • 40. Kafka: The Trinity of Event Streaming. 01: Publish & Subscribe to Streams of Events. 02: Store your Event Streams. 03: Process & Analyze your Event Streams.
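The three capabilities on slide 40 can be illustrated with a toy, in-memory log. This is a sketch under stated assumptions, not a Kafka client: the class `TinyLog`, the group names, and the heart-rate readings are all hypothetical, and a real topic adds partitions, replication, and durable storage.

```python
from collections import defaultdict


class TinyLog:
    """In-memory stand-in for a single-partition topic (not a Kafka client)."""

    def __init__(self):
        self._records = []                # 02: events stay stored after being read
        self._offsets = defaultdict(int)  # next offset to read, per consumer group

    def publish(self, value):
        """01: append an event and return its offset."""
        self._records.append(value)
        return len(self._records) - 1

    def poll(self, group):
        """01: each consumer group reads independently from its own offset."""
        start = self._offsets[group]
        batch = self._records[start:]
        self._offsets[group] = len(self._records)
        return batch


log = TinyLog()
for reading in (72, 75, 131):             # hypothetical heart-rate readings
    log.publish(reading)

# 03: two independent consumers process the same stored stream.
alerts = [r for r in log.poll("alerting") if r > 120]
analytics_batch = log.poll("analytics")
average = sum(analytics_batch) / len(analytics_batch)
```

Because reading does not remove records, the second group still sees every event after the first group has consumed them, which is the property that separates a log from a classic message queue.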
  • 41. Kafka Makes Your Business Real-time. CREDIT SERVICE (ksqlDB): CREATE STREAM payments (user VARCHAR, amount INT) WITH (kafka_topic = 'all_payments', value_format = 'avro'); RISK SERVICE (ksqlDB): CREATE TABLE credit_scores AS SELECT user, updateScore(p.amount) AS credit_score FROM payments AS p GROUP BY user EMIT CHANGES;
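To make the effect of the `CREATE TABLE ... GROUP BY user` statement on slide 41 more concrete, here is a plain-Python sketch of a stream being folded into a table keyed by user. The slide does not define `updateScore`, so the running total used below is purely an assumption, as are the sample payments.

```python
def update_score(previous, amount):
    """Hypothetical stand-in for the slide's updateScore(): a running total."""
    return previous + amount


def materialize(payments):
    """Fold (user, amount) events into a table keyed by user.

    Yields the changed row after every event, similar in spirit to
    a continuously updated table that emits its changes.
    """
    table = {}
    for user, amount in payments:
        table[user] = update_score(table.get(user, 0), amount)
        yield user, table[user]


payments = [("alice", 100), ("bob", 40), ("alice", 25)]
changes = list(materialize(payments))
for user, score in changes:
    print(user, score)
```

The point of the sketch is the shape of the computation: the stream is unbounded input, the table holds only the latest value per key, and every new event produces one updated row.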
  • 42. Why can’t I do this with my existing data platforms? Databases, Messaging, ETL / Data Integration, Data Warehouse.
  • 43. Enterprise Data Platform Requirements Are Shifting. 1: From built for historical data to built for real-time events. Value: trigger real-time workflows (e.g. real-time order management). 2: From scalable for transactional data to scalable for ALL data. Value: scale across the enterprise (e.g. customer 360). 3: From transient to persistent + durable. Value: build mission-critical apps with zero data loss (e.g. instant payments). 4: From raw data to enriched data. Value: add context & situational awareness (e.g. ride sharing ETA).
  • 44. Only Event Streaming Has All 4 Requirements
  • 45. Only Event Streaming Has All 4 Requirements. Requirements: built for real-time events, scalable for all data, persistent & durable, capable of enrichment. Databases: good for transactional applications. Messaging: good for ultra low-latency, fire-and-forget use cases. ETL / Data Integration: good for batch data integration. Data Warehouse: good for historical analytics and reporting. Event Streaming: platform for event-driven transformation (scalable messaging + real-time data integration + stream processing).
  • 46. Kafka is Complementary to other Middleware in the Enterprise Architecture. Diagram labels: Orders, Customers, Payments, Stock (each with Kafka); REST, JMS, ESB, CRM, Mainframe, SOAP, API Management, HTTP, MQ, …
  • 47. Machine Learning and Event Streaming to Improve Traditional and Build New Use Cases in Pharma and Life Sciences. Diagram labels: Streams Processing / AI / ML; Clinical Trials; Patents, Text, etc.; Structured & Unstructured Data; IoT & Business Applications; Multi- / Hybrid-Cloud.
  • 48. Project Example: Drug Discovery
  • 49. Use Case: Drug Discovery. “On average, it takes at least ten years for a new medicine to complete the journey from initial discovery to the marketplace” (PhRMA). http://phrma-docs.phrma.org/sites/default/files/pdf/rd_brochure_022307.pdf
  • 50. Recursion – Discovering Drugs in Real-Time. Accelerate drug discovery. Find drug treatments by processing biological images. Massively parallel system. Combines experimental biology, artificial intelligence, automation, and real-time event streaming. https://www.confluent.io/customers/recursion https://www.confluent.io/kafka-summit-san-francisco-2019/discovering-drugs-with-kafka-streams
  • 51. Recursion Partnering with Roche and Bayer. https://www.bloomberg.com/news/articles/2021-12-07/roche-signs-machine-learning-neuroscience-deal-with-recursion
  • 52. Image and Video Processing: at a high level, it is “just” pixels (arrays of 0s and 1s) and matrix multiplication
  • 53. Drug Discovery in manual, slow, bursty batch mode: not scalable
  • 54. Drug Discovery in automated, scalable, reliable real-time mode
  • 55. Streaming Analytics for Drug Discovery in Real Time at Scale. Diagram labels: Laboratory; Real Time Integration Layer; Ingest Images; Event Streaming Platform; Streaming Platform; Digital Image Processing (e.g. noise reduction); Data Processing (e.g. filtering); Stateful Workflow Orchestration; Automated Drug Analysis; Processed Images; All Data; Batch Reporting Platform; BI Dashboard; Human Intelligence; Other Components.
  • 56. Digital Image Processing for Drug Discovery. Find drug treatments by processing biological images: ML models can be trained to distinguish between healthy cells and disease cells with problematic genes; grow healthy cells and disease cells in labs; apply different drugs → make disease cells look healthy again.
  • 57. Kafka, ksqlDB and TensorFlow for Drug Discovery in Real Time at Scale. Diagram labels: Laboratory (Windows Machines); Kafka Client (.NET C++); Images; Confluent Platform; Confluent Server; Tiered Storage; Kafka Connect; Kafka Connect (Oracle CDC); Historical Drugs Data; Database (MySQL); Digital Image Processing (OpenCV SaaS Service REST API); Streaming ETL (ksqlDB); Stateful Workflow Orchestration (Kafka Streams); Model Training and Scoring (Python Client + TensorFlow); Processed Images; All Data; Batch Reporting Platform; BI Dashboard; Human Intelligence; Other Components.
  • 58. Ingestion of Images. Diagram labels: Laboratory; Kafka Connect; Replication; Cluster Linking.
  • 59. Data Preprocessing: filter, transform, anonymize, extract features, reduce noise, enhance brightness / contrast. Output of the preprocessing streams: data ready for model training.
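A few of the preprocessing steps named on slide 59 (filter, anonymize, enhance contrast) can be sketched per record in plain Python. Everything below is an assumption for illustration: the record layout, the `patient_id` field, the salt, and the choice of a linear contrast stretch. A salted hash is closer to pseudonymization than to full anonymization, so treat that step as a placeholder.

```python
import hashlib


def anonymize(record, salt="demo-salt"):
    """Replace the direct identifier with a salted hash (illustrative only)."""
    digest = hashlib.sha256((salt + record["patient_id"]).encode()).hexdigest()
    cleaned = {k: v for k, v in record.items() if k != "patient_id"}
    cleaned["subject"] = digest[:12]
    return cleaned


def stretch_contrast(pixels):
    """Linearly rescale grayscale values to the full 0..255 range."""
    flat = [p for row in pixels for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:                      # flat image: nothing to stretch
        return [row[:] for row in pixels]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in pixels]


def preprocess(records):
    """Filter, anonymize, and enhance each image record, one at a time."""
    for record in records:
        if not record["pixels"]:      # filter: drop empty captures
            continue
        out = anonymize(record)
        out["pixels"] = stretch_contrast(record["pixels"])
        yield out


raw = [
    {"patient_id": "p-1", "pixels": [[50, 100], [150, 200]]},
    {"patient_id": "p-2", "pixels": []},
]
ready = list(preprocess(raw))
```

Writing `preprocess` as a generator mirrors the streaming style of the deck: each record is handled as it arrives rather than after a batch has been collected.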
  • 60. Data Processing with ksqlDB: SELECT i.image_id, i.experiment_id, i.image_details FROM image_channel i LEFT JOIN experiment_database e ON i.experiment_id = e.experiment_id WHERE e.image_type = 'black_and_white' EMIT CHANGES;
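The join on slide 60 enriches each image event with its experiment row and then filters on the experiment's image type. A plain-Python sketch of that logic follows; the sample data and the dictionary-based lookup table are assumptions, and the sketch says nothing about how ksqlDB executes the query internally.

```python
def join_and_filter(image_events, experiments, image_type="black_and_white"):
    """Look up each image's experiment (the table side) and keep matching rows."""
    for image in image_events:
        experiment = experiments.get(image["experiment_id"])  # None if no match
        # The filter tests a table-side column, so images without a matching
        # experiment are dropped even though the join itself is a LEFT JOIN.
        if experiment is not None and experiment["image_type"] == image_type:
            yield {
                "image_id": image["image_id"],
                "experiment_id": image["experiment_id"],
                "image_details": image["image_details"],
            }


experiments = {
    "exp-1": {"image_type": "black_and_white"},
    "exp-2": {"image_type": "color"},
}
images = [
    {"image_id": 1, "experiment_id": "exp-1", "image_details": "well A1"},
    {"image_id": 2, "experiment_id": "exp-2", "image_details": "well A2"},
    {"image_id": 3, "experiment_id": "exp-9", "image_details": "well A3"},
]
selected = list(join_and_filter(images, experiments))
```

One consequence worth noting: under standard SQL semantics, filtering on a right-side column makes the LEFT JOIN behave like an inner join, because rows with no match carry a NULL image type and fail the comparison.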
  • 61. TensorFlow Model: Convolutional Neural Network (CNN) for Image Recognition (as part of the ML Pipeline)
  • 62. Streaming Ingestion and Model Training with TensorFlow IO. Direct streaming ingestion for model training and / or scoring with TensorFlow I/O + Kafka Plugin (no additional data storage like S3 or HDFS required!). Diagram labels: Producer; Distributed Commit Log; Time; Model A; Model B. https://github.com/tensorflow/io
  • 63. Confluent Tiered Storage for Kafka
  • 64. Use Cases for Reprocessing Historical Events (“Give me all events from time A to time B”): new consumer application; error-handling; compliance / regulatory processing; query and analyze existing events; schema changes in analytics platform; model training. Diagram labels: Real-time Producer; Real-time Consumer; Consumer of Historical Data; Time.
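The request on slide 64, "give me all events from time A to time B", amounts to a range lookup over a time-ordered log. The sketch below shows the idea with an in-memory list and binary search; the class name, the integer timestamps, and the half-open interval are assumptions of this example. With a real cluster, a consumer would typically translate timestamps into offsets first (the Java consumer offers `offsetsForTimes` for that) and then read from there.

```python
from bisect import bisect_left


class TimeIndexedLog:
    """Append-only log that can replay events by timestamp range."""

    def __init__(self):
        self._timestamps = []   # append order equals time order in this sketch
        self._values = []

    def append(self, timestamp, value):
        if self._timestamps and timestamp < self._timestamps[-1]:
            raise ValueError("this sketch assumes non-decreasing timestamps")
        self._timestamps.append(timestamp)
        self._values.append(value)

    def replay(self, start, end):
        """Return all events with start <= timestamp < end."""
        first = bisect_left(self._timestamps, start)
        last = bisect_left(self._timestamps, end)
        return self._values[first:last]


log = TimeIndexedLog()
for ts, value in [(100, "admit"), (200, "lab-result"), (300, "surgery"), (400, "discharge")]:
    log.append(ts, value)

window = log.replay(200, 400)   # events from time A=200 up to, not including, B=400
```

Because the log is never mutated by reads, the same range can be replayed as often as needed, for example once for a new consumer application and again for model training.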
  • 65. Separation of Model Training and Model Inference. Diagram labels: Model Training in Cloud; Model Deployment at the Edge; Analytic Model; Local Predictions.
  • 66. Stream Processing with External Model and RPC. Diagram labels: Input Event; Streams Application; Request / Response (gRPC / HTTP); Model Serving (TensorFlow Serving); Model; Prediction.
  • 67. Embedded Model Deployment with Apache Kafka, ksqlDB and TensorFlow. User Defined Function (UDF): CREATE STREAM ImageAnalysis AS SELECT image_id, analyzeImage(image_details) FROM image_channel;
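Slides 66 and 67 contrast two ways of scoring events: calling an external model server per event, or embedding the model in the stream processor as a function. The sketch below mimics both in plain Python. The scoring function is a hypothetical placeholder (mean pixel intensity), and `RemoteModelStub` only counts calls where a real deployment would make a gRPC or HTTP request.

```python
def analyze_image(image_details):
    """Hypothetical stand-in for a trained model: returns mean pixel intensity."""
    return sum(image_details) / len(image_details)


class RemoteModelStub:
    """Stand-in for a model server reached via gRPC / HTTP; counts round trips."""

    def __init__(self, model):
        self._model = model
        self.calls = 0

    def predict(self, features):
        self.calls += 1            # in a real deployment this is a network request
        return self._model(features)


def score_embedded(events, model):
    """Embedded deployment: the model runs inside the stream processor."""
    return [(e["image_id"], model(e["image_details"])) for e in events]


def score_via_rpc(events, client):
    """External deployment: one request/response per input event."""
    return [(e["image_id"], client.predict(e["image_details"])) for e in events]


events = [
    {"image_id": 1, "image_details": [0.25, 0.75]},
    {"image_id": 2, "image_details": [1.0, 0.5]},
]
client = RemoteModelStub(analyze_image)
embedded = score_embedded(events, analyze_image)
remote = score_via_rpc(events, client)
```

Both paths produce the same predictions; the difference the sketch highlights is operational. The external path adds one round trip per event but lets the model be updated independently, while the embedded path avoids the network hop at the cost of redeploying the application when the model changes.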
  • 68. Model Training and Scoring with the same ML Pipeline (or even in the same Application). Data Science team responsible for the whole model lifecycle. Beloved Python tool stack (Pandas, scikit-learn, TensorFlow, Jupyter, …). 24/7 production scale with Confluent Python Client (e.g. deployed in Docker containers on Kubernetes).
  • 69. Kafka, ksqlDB and TensorFlow for Drug Discovery in Real Time at Scale. Diagram labels: Laboratory (Windows Machines); Kafka Client (.NET C++); Images; Confluent Platform; Confluent Server; Tiered Storage; Kafka Connect; Kafka Connect (Oracle CDC); Historical Drugs Data; Database (MySQL); Digital Image Processing (External SaaS Service + REST); Streaming ETL (ksqlDB); Stateful Workflow Orchestration (Kafka Streams); Model Training and Scoring (Python Client + TensorFlow); Processed Images; All Data; Batch Reporting Platform; BI Dashboard; Human Intelligence; Other Components.
  • 70. Data in Motion Is The Future Of Data. Cloud: future of the datacenter (infrastructure as code). Event Streaming: future of data (data in motion as continuous streams of events).
  • 71. Why Confluent?
  • 72. The Rise of Data in Motion. Timeline: 2010 (Apache Kafka created at LinkedIn by Confluent founders), 2014, 2020. 80% of Fortune 100 companies trust and use Apache Kafka.
  • 73. Event Streaming Maturity Model (axes: Investment & Time, Value). 1: Initial Awareness / Pilot (1 Kafka Cluster). 2: Start to Build Pipeline / Deliver 1 New Outcome (1 Kafka Cluster). 3: Mission-Critical Deployment (Stretched, Hybrid, Multi-Region). 4: Build Contextual Event-Driven Apps (Stretched, Hybrid, Multi-Region). 5: Central Nervous System (Global Kafka). Product, Support, Training, Partners, Technical Account Management...
  • 74. Confluent completes Apache Kafka. Cloud-native. Everywhere. Diagram labels: Car Engine; Car; Self-driving Car.