Schema Registry 101
Bill Bejeck
@bbejeck
Nice to meet you!
• Member of the DevX team
• Prior to DevX ~3 years as engineer on Kafka Streams team
• Apache Kafka® Committer and PMC member
• Author of “Kafka Streams in Action” - 2nd edition underway!
Why Schema Registry?
A quick Apache Kafka® review
Kafka – an append only distributed log
Kafka brokers work with bytes only
Clients produce and consume
Event objects form a contract
An implicit contract
Expectation of object structure
Potentially make unexpected changes
Use Schema Registry FTW!
Working with Schema Registry
1. Write/download a schema
2. Test and upload the schema
3. Generate the objects
4. Configure clients (Producer, Consumer, Kafka Streams)
5. Write your application!
Working with Schema Registry
• Confluent Cloud UI
• Schema Registry REST API / Confluent CLI
• Producer and Consumer client
• Schema Registry console producer and consumer
• Tools
• Gradle and Maven plugins
Build a Schema
• Avro
• Protocol Buffers
• JSON Schema
Build a Schema
{
"type":"record",
"namespace": "io.confluent.developer.avro",
"name":"Purchase",
"fields": [
{"name": "item", "type":"string"},
{"name": "amount", "type": "double", "default": 0.0},
{"name": "customer_id", "type": "string"}
]
}
Build a Schema (Protobuf)
syntax = "proto3";
package io.confluent.developer.proto;
option java_outer_classname = "PurchaseProto";
message Purchase {
string item = 1;
double amount = 2;
string customer_id = 3;
}
Build a Schema
Avro
{"name": "list", "type": {"type": "array", "items": "string", "default": []}}
{"name": "numbers", "type": {"type": "map", "values": "long", "default": {}}}
Protobuf
repeated string strings = 1;
map<string, string> projects = 2;
map<string, Message> myOtherMap = 3;
Working with schemas
Register
Producer clients can “auto-register”
producerConfigs.put("auto.register.schemas", true);
Not for production!!!
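A minimal sketch of the opposite, production-style setting, assuming schemas are registered ahead of time (for example by a build plugin or the REST API). The class name `ProductionProducerConfig` is hypothetical, and `use.latest.version` is included on the assumption that the producer should serialize with the latest schema already registered for the subject.

```java
import java.util.Properties;

class ProductionProducerConfig {
    // Serializer-related settings for a producer that must not register schemas itself.
    static Properties build() {
        Properties props = new Properties();
        // Turn off on-the-fly registration by the serializer.
        props.put("auto.register.schemas", "false");
        // Assumption: use the latest schema version already registered for the subject.
        props.put("use.latest.version", "true");
        return props;
    }

    public static void main(String[] args) {
        build().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

These entries would be merged into the rest of the producer configuration (bootstrap servers, serializers, Schema Registry URL).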
View/Retrieve
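One way to view or retrieve schemas is the Schema Registry REST API. The sketch below only builds the lookup URIs and sends nothing; the class name `SchemaLookup`, the `http://localhost:8081` address, and the `purchases-value` subject are illustrative assumptions.

```java
import java.net.URI;

class SchemaLookup {
    // Lists every subject known to the registry.
    static URI allSubjects(String registryUrl) {
        return URI.create(registryUrl + "/subjects");
    }

    // Fetches the latest registered version of a subject.
    static URI latestVersion(String registryUrl, String subject) {
        return URI.create(registryUrl + "/subjects/" + subject + "/versions/latest");
    }

    // Fetches a schema by its globally unique ID.
    static URI byId(String registryUrl, int schemaId) {
        return URI.create(registryUrl + "/schemas/ids/" + schemaId);
    }

    public static void main(String[] args) {
        String registry = "http://localhost:8081"; // assumed local registry address
        System.out.println(allSubjects(registry));
        System.out.println(latestVersion(registry, "purchases-value"));
        System.out.println(byId(registry, 1));
    }
}
```

Each URI would be requested with an HTTP GET, adding basic-auth credentials when talking to Confluent Cloud.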
Schema lifecycle
Integrating Schema Registry
Clients
• basic.auth.credentials.source=USER_INFO
• schema.registry.url=https://<CLUSTER>.us-east-2.aws.confluent.cloud
• basic.auth.user.info=API_KEY:API_SECRET
Clients - Producer
producerConfigs.put("value.serializer", ? );
• KafkaAvroSerializer.class
• KafkaProtobufSerializer.class
• KafkaJsonSchemaSerializer.class
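The serializer choice above can be sketched as a small lookup from schema format to serializer class. The class name `ProducerSerializerConfig` and the format labels (`avro`, `protobuf`, `json-schema`) are hypothetical; the serializers are referenced by fully qualified name as strings so the sketch compiles without the Confluent libraries on the classpath.

```java
import java.util.Map;
import java.util.Properties;

class ProducerSerializerConfig {
    // Confluent serializer class names, keyed by a hypothetical format label.
    static final Map<String, String> VALUE_SERIALIZERS = Map.of(
            "avro", "io.confluent.kafka.serializers.KafkaAvroSerializer",
            "protobuf", "io.confluent.kafka.serializers.protobuf.KafkaProtobufSerializer",
            "json-schema", "io.confluent.kafka.serializers.json.KafkaJsonSchemaSerializer");

    // Returns the value-serializer settings for the given format.
    static Properties forFormat(String format, String registryUrl) {
        String serializer = VALUE_SERIALIZERS.get(format);
        if (serializer == null) {
            throw new IllegalArgumentException("Unknown format: " + format);
        }
        Properties props = new Properties();
        props.put("value.serializer", serializer);
        props.put("schema.registry.url", registryUrl);
        return props;
    }

    public static void main(String[] args) {
        // The registry address is an assumption for a local setup.
        forFormat("protobuf", "http://localhost:8081")
                .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```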
Clients – Consumer
consumerConfigs.put("value.deserializer", ? );
• KafkaAvroDeserializer.class
• KafkaProtobufDeserializer.class
• KafkaJsonSchemaDeserializer.class
Clients – Consumer
• specific.avro.reader = true|false
• specific.protobuf.value.type = proto class name
• json.value.type = class name
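A sketch of how the deserializer and the type settings above might be combined on the consumer side. The class name `ConsumerDeserializerConfig` is hypothetical, and the Protobuf class name passed in `main` assumes the generated class is nested inside the `PurchaseProto` outer class from the earlier schema.

```java
import java.util.Properties;

class ConsumerDeserializerConfig {
    // Avro consumer that returns generated SpecificRecord objects instead of GenericRecord.
    static Properties avroSpecific(String registryUrl) {
        Properties props = new Properties();
        props.put("value.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        props.put("schema.registry.url", registryUrl);
        props.put("specific.avro.reader", "true");
        return props;
    }

    // Protobuf consumer that returns the named generated class instead of a dynamic message.
    static Properties protobufSpecific(String registryUrl, String protoClassName) {
        Properties props = new Properties();
        props.put("value.deserializer",
                "io.confluent.kafka.serializers.protobuf.KafkaProtobufDeserializer");
        props.put("schema.registry.url", registryUrl);
        props.put("specific.protobuf.value.type", protoClassName);
        return props;
    }

    public static void main(String[] args) {
        String registry = "http://localhost:8081"; // assumed local registry address
        System.out.println(avroSpecific(registry));
        // Assumed binary name of the generated nested class.
        System.out.println(protobufSpecific(registry,
                "io.confluent.developer.proto.PurchaseProto$Purchase"));
    }
}
```

Leaving `specific.avro.reader` unset or `false` makes the Avro deserializer hand back `GenericRecord` values.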
Clients – Consumer Returned Types
Avro
• SpecificRecord – myObj.getFoo()
• GenericRecord - generic.get("foo")
Protobuf
• Specific
• Dynamic
Clients – Kafka Streams
• SpecificAvroSerde
• GenericAvroSerde
• KafkaProtobufSerde
• KafkaJsonSchemaSerde
Clients – Kafka Streams
Serde<Generated> serde = new SpecificAvroSerde<>();
Map<String, String> configs =
    Map.of("schema.registry.url", "https..", …);
// The second argument is isKey: false configures the serde for record values
serde.configure(configs, false);
Clients – Command Line Produce
confluent kafka topic produce purchases \
  --value-format protobuf \
  --schema src/main/proto/purchase.proto \
  --sr-endpoint https://.... \
  --sr-api-key xxxyyyy \
  --sr-api-secret abc123 \
  --cluster lkc-45687
> {"item":"pizza", "amount":17.99, "customer_id":"lombardi"}
Clients – Command Line Produce
./bin/kafka-protobuf-console-producer \
  --topic purchases \
  --bootstrap-server localhost:9092 \
  --property schema.registry.url=http://... \
  --property value.schema="$(< src/main/proto/purchase.proto)"
> {"item":"pizza", "amount":17.99, "customer_id":"lombardi"}
Clients – Command Line Consume
confluent kafka topic consume purchases \
  --from-beginning \
  --value-format protobuf \
  --schema src/main/proto/purchase.proto \
  --sr-endpoint https://.... \
  --sr-api-key xxxyyyy \
  --sr-api-secret abc123 \
  --cluster lkc-45687
Clients – Command Line Consume
./bin/kafka-protobuf-console-consumer \
  --from-beginning \
  --topic purchases \
  --bootstrap-server localhost:9092 \
  --property schema.registry.url=http://...
Clients – Testing
For testing you can use a mock schema registry
• schema.registry.url="mock://scope-name"
What is a Subject?
• Defines a namespace for a schema
• Compatibility checks are per subject
• Different approaches – subject name strategies
Subjects - TopicNameStrategy
Subjects - RecordNameStrategy
Subjects - TopicRecordNameStrategy
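The three strategies differ only in how the subject name is derived. A minimal sketch of the naming rules follows; the class `SubjectNames` and its method names are hypothetical helpers, not the strategy classes that ship with the serializers.

```java
class SubjectNames {
    // TopicNameStrategy: "<topic>-key" or "<topic>-value", so one schema lineage per topic.
    static String topicName(String topic, boolean isKey) {
        return topic + (isKey ? "-key" : "-value");
    }

    // RecordNameStrategy: the fully qualified record name, independent of the topic.
    static String recordName(String fullyQualifiedRecordName) {
        return fullyQualifiedRecordName;
    }

    // TopicRecordNameStrategy: "<topic>-<fully qualified record name>".
    static String topicRecordName(String topic, String fullyQualifiedRecordName) {
        return topic + "-" + fullyQualifiedRecordName;
    }

    public static void main(String[] args) {
        String record = "io.confluent.developer.avro.Purchase";
        System.out.println(topicName("purchases", false));         // purchases-value
        System.out.println(recordName(record));                    // io.confluent.developer.avro.Purchase
        System.out.println(topicRecordName("purchases", record));  // purchases-io.confluent.developer.avro.Purchase
    }
}
```

Because compatibility checks are per subject, the last two strategies allow several record types in one topic, each evolving under its own subject.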
Schema Compatibility
• Schema Registry provides a mechanism for safe changes
• Evolving a schema
• Compatibility checks are per subject
• When a schema evolves, the subject remains, but the schema
gets a new ID and version
Schema Compatibility
• Backward - default
• Forward
• Full
Latest version will work with previous version
• But not necessarily with versions beyond that
Schema Compatibility
• Backward transitive
• Forward transitive
• Full transitive
Latest version will work with all previous versions
Schema Compatibility
• Backward: delete fields; add fields with default values. Update consumer clients first.
• Forward: delete fields with default values; add fields. Update producer clients first.
• Full: any field that is added or deleted must have a default value. Order of update doesn’t matter.
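The field rules above can be illustrated with a deliberately simplified model that looks only at added and deleted fields and whether they declare a default. It ignores type changes, aliases, and everything else a real compatibility check covers; the class `FieldCompatibility` and the sample schemas are hypothetical.

```java
import java.util.Map;

class FieldCompatibility {
    // A schema is modeled as: field name -> whether the field declares a default value.
    static final Map<String, Boolean> V1 =
            Map.of("item", false, "amount", true, "customer_id", false);
    // Adds "coupon" with a default.
    static final Map<String, Boolean> V2_ADD_WITH_DEFAULT =
            Map.of("item", false, "amount", true, "customer_id", false, "coupon", true);
    // Deletes "customer_id", which had no default.
    static final Map<String, Boolean> V3_DELETE_REQUIRED =
            Map.of("item", false, "amount", true);
    // Adds "store" without a default.
    static final Map<String, Boolean> V4_ADD_REQUIRED =
            Map.of("item", false, "amount", true, "customer_id", false, "store", false);

    // A reader can decode a writer's data if every reader field the writer lacks has a default.
    static boolean canRead(Map<String, Boolean> reader, Map<String, Boolean> writer) {
        return reader.entrySet().stream()
                .allMatch(field -> writer.containsKey(field.getKey()) || field.getValue());
    }

    // Backward: consumers on the new schema can read data written with the old one.
    static boolean backward(Map<String, Boolean> oldSchema, Map<String, Boolean> newSchema) {
        return canRead(newSchema, oldSchema);
    }

    // Forward: consumers still on the old schema can read data written with the new one.
    static boolean forward(Map<String, Boolean> oldSchema, Map<String, Boolean> newSchema) {
        return canRead(oldSchema, newSchema);
    }

    // Full: both directions hold.
    static boolean full(Map<String, Boolean> oldSchema, Map<String, Boolean> newSchema) {
        return backward(oldSchema, newSchema) && forward(oldSchema, newSchema);
    }

    public static void main(String[] args) {
        System.out.println("add with default, full: " + full(V1, V2_ADD_WITH_DEFAULT));
        System.out.println("delete required, backward: " + backward(V1, V3_DELETE_REQUIRED));
        System.out.println("delete required, forward: " + forward(V1, V3_DELETE_REQUIRED));
        System.out.println("add required, backward: " + backward(V1, V4_ADD_REQUIRED));
        System.out.println("add required, forward: " + forward(V1, V4_ADD_REQUIRED));
    }
}
```

The update order follows from the direction of the check: under backward compatibility the new schema is the reader, so consumers move first; under forward compatibility the new schema is the writer, so producers move first.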
Testing a Schema for compatibility
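One option for such a test is the compatibility endpoint of the Schema Registry REST API. The sketch below only builds the request and does not send it; the class name `CompatibilityRequest`, the registry address, and the subject name are assumptions, and the `escape` helper handles only backslashes and double quotes, so it expects a single-line schema.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class CompatibilityRequest {
    // Minimal JSON string escaping (backslashes and double quotes only).
    static String escape(String text) {
        return text.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds a POST that asks whether schemaJson is compatible with the subject's latest version.
    static HttpRequest build(String registryUrl, String subject, String schemaJson) {
        // The API expects the schema as an escaped string inside a JSON envelope.
        String body = "{\"schema\": \"" + escape(schemaJson) + "\"}";
        return HttpRequest.newBuilder()
                .uri(URI.create(registryUrl + "/compatibility/subjects/" + subject
                        + "/versions/latest"))
                .header("Content-Type", "application/vnd.schemaregistry.v1+json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        String schema = "{\"type\":\"record\",\"name\":\"Purchase\",\"fields\":[]}";
        // Assumed local registry and TopicNameStrategy subject.
        HttpRequest request = build("http://localhost:8081", "purchases-value", schema);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request with `java.net.http.HttpClient` should return a small JSON document with an `is_compatible` flag. The Maven and Gradle plugins offer a comparable check at build time.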
Summary
• Kafka broker works with bytes only
• Event objects form an implicit contract
• Using Schema Registry allows for an explicit contract
• Schema Registry provides for keeping domain objects in sync
Resources
• Documentation - https://docs.confluent.io/platform/current/schema-registry/index.html#sr-overview
• Multiple Event Types Presentation - https://www.confluent.io/en-gb/events/kafka-summit-europe-2021/managing-multiple-event-types-in-a-single-topic-with-schema-registry/
• Multiple Events tutorial - https://developer.confluent.io/tutorials/multiple-event-type-topics/confluent.html
• Multiple Event Type code - https://github.com/bbejeck/multiple-events-kafka-summit-europe-2021
• Martin Kleppmann - https://www.confluent.io/blog/put-several-event-types-kafka-topic/
• Robert Yokota - https://www.confluent.io/blog/multiple-event-types-in-the-same-kafka-topic/
• Avro - https://avro.apache.org/docs/current/
• Protocol Buffers - https://developers.google.com/protocol-buffers
• JSON Schema - https://json-schema.org/
Build tools
• Schema Registry Maven
• https://docs.confluent.io/platform/current/schema-registry/develop/maven-plugin.html#sr-maven-plugin
• Schema Registry Gradle
• https://github.com/ImFlog/schema-registry-plugin
• Protobuf Gradle
• https://github.com/google/protobuf-gradle-plugin
• Avro Gradle
• https://github.com/davidmc24/gradle-avro-plugin
• JSON Schema
• https://github.com/jsonschema2dataclass/js2d-gradle
• https://github.com/joelittlejohn/jsonschema2pojo
Thank you!
@bbejeck
bill@confluent.io
cnfl.io/meetups cnfl.io/slack
cnfl.io/blog
Your Apache Kafka®
journey begins here
developer.confluent.io


Schema Registry 101 with Bill Bejeck | Kafka Summit London 2022