Data Policies for the Kafka-API with WebAssembly | Alexander Gallego, Vectorized

Data Policies for the Kafka-API with
WebAssembly
https://github.com/vectorizedio/redpanda

background
● I’ve worked on streaming systems for 12+ years
● developer, founder & CEO of Vectorized, hacking on
Redpanda, a modern streaming platform for mission
critical workloads.
● previously, principal engineer at Akamai; co-founder &
CTO of concord.io, a high performance stream
processing engine built in C++ and acquired by Akamai
in 2016
alex gallego
@emaxerrno

keystone problem in streaming
[diagram: "The Data" vs. "What you care about"]

keystone problem in streaming: consume what you want
[diagram: "The Data" vs. "What you care about"]
The Good Parts Of Streaming
○ Streaming data is immutable
■ Good for understandability
■ Built-in auditing
○ Ability to replay data
○ The essence of a streaming platform is to
decouple producers from consumers
■ Focuses on the data rather than on who or
what produced it, and when
○ Proven at scale

[diagram: "The Data" vs. "What you care about"]
The *Not so* Good Parts Of Streaming
○ You consume *at least* as much as you
produce
■ In the same order
■ It is common to consume 10x more than
you produce
○ Up front architecture cost of splitting your
streams, types, and data contracts for
things like privacy, etc…
○ Can easily saturate your network (simply
shifting the bottleneck)
■ Forces developers to create
specialized clusters
● Mission Critical Prod
● Analytics/Dashboard Prod
● ML sample clusters

[diagram: "The Data" and "What you care about"] 🥳🤯🎉 - hooray!

performance improvement (timeline: 2007, 2011, 2020)
● SSD: $2500/TB; typical instance 4 cores
● SSD: $200/TB - 1000x faster, 10x cheaper
● 225 core VMs - 30x more cores
● 100Gbps NICs - 100x more throughput
● first open source solutions
● take advantage of cheap disk
● disaggregate compute and storage
● modern hardware + cloud native
30x taller computers + 1000x faster disks

thread per core
architecture
● explicit scheduling everywhere
○ IO groups
○ x-core groups (smp)
○ memory throttling
● ONLY supports async interfaces
○ requires library re-writes for
threading model to work
well

new way to build software: async-only cooperative scheduling framework
future<>
● viral primitive (like actors, Orleans, Akka, Pony, etc): mix, map-reduce, filter,
chain, fail, complete, generate, fulfill, sleep, expire futures, etc
● fundamentally about program structure: with a concurrent structure, parallelism
is a free variable
● one pinned thread per core: parallelism and concurrency must be expressed
explicitly
● no locks on the hot path: a network of SPSC queues
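The "network of SPSC queues" idea can be sketched as a single-producer/single-consumer ring buffer. This is only a structural illustration: `SpscQueue`, `tryPush`, and `tryPop` are hypothetical names, not Redpanda or Seastar APIs, and because JavaScript is single-threaded the sketch omits the atomics and memory ordering a real cross-core queue would need.

```javascript
// Minimal SPSC ring buffer sketch (hypothetical names, illustration only).
// Exactly one producer calls tryPush and exactly one consumer calls tryPop,
// so each index has a single writer and no lock is required.
class SpscQueue {
  constructor(capacity) {
    this.capacity = capacity;
    this.slots = new Array(capacity);
    this.head = 0; // next slot to read; advanced only by the consumer
    this.tail = 0; // next slot to write; advanced only by the producer
  }

  // Returns false instead of blocking when the queue is full, which is
  // what lets the producer apply backpressure.
  tryPush(item) {
    if (this.tail - this.head === this.capacity) return false;
    this.slots[this.tail % this.capacity] = item;
    this.tail++;
    return true;
  }

  // Returns undefined when the queue is empty (so undefined itself
  // cannot be stored as an item in this sketch).
  tryPop() {
    if (this.head === this.tail) return undefined;
    const index = this.head % this.capacity;
    const item = this.slots[index];
    this.slots[index] = undefined; // drop the reference
    this.head++;
    return item;
  }
}

const queue = new SpscQueue(2);
queue.tryPush("write partition 0");
queue.tryPush("write partition 1");
console.log(queue.tryPush("write partition 2")); // false: queue is full
console.log(queue.tryPop()); // "write partition 0"
```

With one such queue per ordered pair of cores, every core can hand work to every other core without contending on a shared lock.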

no virtual memory
buddy allocator
● preallocate 100% of memory; split it across
N cores for thread-local allocation/access
● create each pool by halving the memory of
the layer above
● large allocations (above 64KB) are not
pooled
● buddy allocator pools for all object sizes
up to 64KB
● full free-lists are recycled
● difficult to use this technique in practice;
it requires developer retraining and
accounting for every single byte
present in the system at all times
○ forces developers to pay additional attention
to all hash-maps, allocations, pooling, etc
[diagram: memory global/N cores…; memory core local (usually around 2GB+); memory/2, memory/2, ...; Pools of 8KB, Pools of 16KB, Pools of 64KB; Pool 0 - large object pool, above 64KB+1]
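As a rough illustration of the pooling rules above, the sketch below splits preallocated memory evenly across cores and picks a pool for an allocation by rounding its size up to a power of two, sending anything above 64KB to the large object pool. The names (`perCoreBytes`, `sizeClass`) and the 64-byte minimum block are assumptions made for the example; the slide does not state the smallest size class.

```javascript
const LARGE_THRESHOLD = 64 * 1024; // from the slide: above 64KB is not pooled
const MIN_BLOCK = 64; // assumption: smallest size class, not given in the source

// Memory is preallocated once and divided evenly, so each core
// allocates only from its own share.
function perCoreBytes(totalBytes, nCores) {
  return Math.floor(totalBytes / nCores);
}

// Pick the pool for an allocation request.
// Pooled sizes are rounded up to the next power of two (buddy-style);
// anything above the threshold goes to the large object pool.
function sizeClass(bytes) {
  if (bytes > LARGE_THRESHOLD) {
    return { pool: "large", blockSize: bytes };
  }
  let block = MIN_BLOCK;
  while (block < bytes) block *= 2;
  return { pool: "buddy", blockSize: block };
}

console.log(sizeClass(100));   // { pool: 'buddy', blockSize: 128 }
console.log(sizeClass(65536)); // { pool: 'buddy', blockSize: 65536 }
console.log(sizeClass(65537)); // { pool: 'large', blockSize: 65537 }
```

Rounding up wastes at most half a block per allocation, which is part of why the slide stresses accounting for every byte.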

iobuf - TPC buffer management
src: https://vectorized.io/blog/tpc-buffers/

request pipelining per partition
● parallelism model == number of
cores/pthreads in the system
● read full request metadata and assign
subrequest to physical core
● subrequests on non-overlapping cores
execute in parallel
● subrequests that land on the same core
for the same *partition* are pipelined
(writes are enqueued in order)
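The dispatch rule above can be sketched with promises: group subrequests by owning core, run the per-core groups concurrently, and process each group strictly in arrival order. Everything here is illustrative: `coreFor` is a made-up placement hash, and `groupByCore`, `dispatch`, and `handle` are hypothetical names. The sketch also serializes all work on a core rather than only work for the same partition, which is stricter than the slide requires.

```javascript
// Hypothetical placement rule: hash the topic name plus partition onto a core.
function coreFor(topic, partition, nCores) {
  let sum = partition;
  for (const ch of topic) sum += ch.codePointAt(0);
  return sum % nCores;
}

// Split one request into per-core lanes, preserving arrival order per lane.
function groupByCore(subrequests, nCores) {
  const perCore = new Map();
  for (const sub of subrequests) {
    const core = coreFor(sub.topic, sub.partition, nCores);
    if (!perCore.has(core)) perCore.set(core, []);
    perCore.get(core).push(sub);
  }
  return perCore;
}

// Lanes for different cores run concurrently; within a lane each
// subrequest waits for the previous one, so writes stay in order.
async function dispatch(subrequests, nCores, handle) {
  const perCore = groupByCore(subrequests, nCores);
  const lanes = [...perCore.entries()].map(async ([core, subs]) => {
    const results = [];
    for (const sub of subs) {
      results.push(await handle(core, sub));
    }
    return results;
  });
  return (await Promise.all(lanes)).flat();
}
```

Here `handle(core, sub)` stands in for whatever performs the write on the owning core and returns a promise for its result.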

core-local metadata piggybacking
(...pandabacking?)
● maintain core-local metadata cache of
○ bytes written per partition (for future
readers)
○ latencies from the remote core (could be
highly contended and we need TCP
backpressure)
○ per TCP-connection read-ahead pointers
on disk for O(1) access/assignment
[diagram: copy-on-read cache; x-shard metadata for low latency access]

core-local v8::isolate per topic/partition/policy
● maintain a v8::isolate *per core*
● maintain v8::context per topic/partition
○ thread_local v8::isolate
■ Low latency access
■ Preemption
● Timebound
● CPU Cycles
● Cooperative scheduling
○ No cross-core communications
○ Small memory footprint
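One way to picture the per-core layout is a core-local object that lazily creates one lightweight context per topic/partition and enforces a time budget between records. This is a plain-JavaScript stand-in, not the V8 embedding API: `CoreLocalRuntime`, `contextFor`, and `run` are hypothetical names, and the budget check only models the "timebound" idea cooperatively, between records.

```javascript
// Stand-in for one core's runtime: nothing here is shared across cores.
class CoreLocalRuntime {
  // The clock is injectable so the budget logic can be exercised deterministically.
  constructor(clock = () => Date.now()) {
    this.clock = clock;
    this.contexts = new Map(); // key: "topic/partition"
  }

  // Lazily create one context per topic/partition and reuse it afterwards.
  contextFor(topic, partition, policy) {
    const key = `${topic}/${partition}`;
    let ctx = this.contexts.get(key);
    if (!ctx) {
      ctx = { key, policy, invocations: 0 };
      this.contexts.set(key, ctx);
    }
    return ctx;
  }

  // Apply the policy record by record; stop once the time budget is
  // spent so the caller can reschedule the remainder.
  run(ctx, records, budgetMs) {
    const deadline = this.clock() + budgetMs;
    const out = [];
    for (const record of records) {
      if (this.clock() >= deadline) {
        return { out, done: false };
      }
      out.push(ctx.policy(record));
      ctx.invocations++;
    }
    return { out, done: true };
  }
}

const runtime = new CoreLocalRuntime();
const ctx = runtime.contextFor("my_topic_name", 0, (record) => record);
console.log(runtime.contextFor("my_topic_name", 0, null) === ctx); // true: context is reused
```

Keeping the map core-local is what avoids cross-core communication on the read path.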

applying a .wasm or .js to a Kafka Topic
> bin/kafka-topics.sh --alter --topic my_topic_name --config x-data-policy={...}
(redpanda has a tool called `rpk` that is similar to kafka-topics.sh)
● Must be a pure function
○ limit of global state per core is 1MB
● On TCP connection
○ Look up associated data policy for the
topic
○ Instantiate a v8::context
○ Perform reads from disk, and subsystems
as normal
○ Before sending TCP bytes:
■ Call v8::isolate
■ Swap v8::context
■ Transform payload
■ Re-checksum payload
■ Return new RecordBatch
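The last three steps (transform, re-checksum, return a new batch) can be sketched as follows. Kafka record batches carry a CRC-32C checksum, so the sketch includes a small bitwise CRC-32C. The batch shape (`{ records, crc }`), `checksumOf`, and `applyPolicy` are assumptions for illustration, and the checksum here covers only the concatenated record values, not the full range of bytes a real batch checksum covers.

```javascript
// Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
function crc32c(bytes) {
  let crc = 0xffffffff;
  for (const byte of bytes) {
    crc ^= byte;
    for (let bit = 0; bit < 8; bit++) {
      crc = (crc >>> 1) ^ (0x82f63b78 & -(crc & 1));
    }
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Simplification: checksum the concatenated record values only.
function checksumOf(records) {
  const total = records.reduce((n, r) => n + r.value.length, 0);
  const buffer = new Uint8Array(total);
  let offset = 0;
  for (const r of records) {
    buffer.set(r.value, offset);
    offset += r.value.length;
  }
  return crc32c(buffer);
}

// Apply a pure per-record policy and return a NEW batch whose checksum
// matches the transformed payload; the input batch is left untouched.
function applyPolicy(batch, policy) {
  const records = batch.records.map(policy);
  return { ...batch, records, crc: checksumOf(records) };
}

const check = crc32c(new TextEncoder().encode("123456789"));
console.log(check.toString(16)); // e3069283, the standard CRC-32C check value
```

Because the policy is a pure function and the batch is rebuilt rather than mutated, the data on disk stays unchanged and only the bytes sent to this consumer differ.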

flow
import { InlineTransform } from "@vectorizedio/InlineTransform";
const transform = new InlineTransform();
transform.topics([{"input": "lowercase", "output":"uppercase"}]);
...
// Upper-case ASCII letters: 'a' (97) through 'z' (122), inclusive.
const uppercase = async (record) => {
  const newRecord = {
    ...record,
    value: record.value.map((char) => {
      if (char >= 97 && char <= 122) {
        return char - 32;
      } else {
        return char;
      }
    }),
  };
  return newRecord;
};
[diagram labels: lowercase, UPPERCASE, Filter]
*using the Kafka-API in your favorite
programming language (JS, Py, Java, C++, etc).
Full compatibility with all your tools; no code
changes
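To see what a consumer would receive, here is a standalone run of an uppercase policy over one record. It assumes `record.value` is a byte array (the slide code compares values against ASCII codes), uses a synchronous variant for brevity, and treats both 'a' (97) and 'z' (122) as in range. The record shape and the name `toUppercase` are hypothetical.

```javascript
// Synchronous variant of the uppercase policy, for illustration only.
const toUppercase = (record) => ({
  ...record,
  // Uint8Array.prototype.map returns a new Uint8Array, so the input is not mutated.
  value: record.value.map((char) =>
    char >= 97 && char <= 122 ? char - 32 : char
  ),
});

const input = {
  key: "k1", // hypothetical field; only `value` is transformed
  value: new TextEncoder().encode("hello, redpanda z!"),
};

const output = toUppercase(input);
console.log(new TextDecoder().decode(output.value)); // prints "HELLO, REDPANDA Z!"
console.log(new TextDecoder().decode(input.value));  // prints "hello, redpanda z!"
```

Punctuation and spaces fall outside the 97 to 122 range and pass through unchanged.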

check out the code for yourself!
● https://github.com/vectorizedio/redpanda
● ask questions from the maintainers at https://vectorized.io/slack
● say hi on twitter https://twitter.com/vectorizedio
● wasm+kafka-api https://vectorized.io/blog/wasm-architecture/
