ScyllaDB Embraces Wasm
Piotr Sarna
Principal Software Engineer @ScyllaDB
Piotr Sarna
■ software engineer keen on open-source projects, C++ and Rust
■ used to develop a distributed file system (LizardFS)
■ wrote a few patches for the Linux kernel
■ graduated from University of Warsaw with MSc in Computer Science
■ maintainer of the Scylla Rust Driver project
WebAssembly
Binary format for expressing executable code, executed on a stack-based virtual
machine. Designed to be:
■ portable
■ easily embeddable
■ efficient
WebAssembly is binary, but it also specifies a standard human-readable format:
WAT (WebAssembly Text Format).
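To make "stack-based virtual machine" concrete, here is a minimal sketch of how such a machine evaluates instructions. The `Instr` enum and `run` function are hypothetical names invented for this illustration; they cover only a few integer operations and are not how any real engine is implemented.

```rust
/// A tiny, illustrative subset of WebAssembly-style instructions.
/// (Hypothetical type for this sketch; real engines decode a binary format.)
#[derive(Debug, Clone, Copy)]
pub enum Instr {
    I64Const(i64),
    I64Add,
    I64Sub,
    /// Signed less-than. Real Wasm pushes an i32 here; this sketch keeps
    /// everything as i64 for brevity.
    I64LtS,
}

/// Executes the instructions on an operand stack and returns the value left
/// on top, or `None` if an operator finds too few operands.
pub fn run(program: &[Instr]) -> Option<i64> {
    let mut stack: Vec<i64> = Vec::new();
    for instr in program {
        // Constants simply push their value.
        if let Instr::I64Const(v) = *instr {
            stack.push(v);
            continue;
        }
        // Binary operators pop the right operand first, then the left one.
        let rhs = stack.pop()?;
        let lhs = stack.pop()?;
        stack.push(match *instr {
            Instr::I64Add => lhs.wrapping_add(rhs),
            Instr::I64Sub => lhs.wrapping_sub(rhs),
            Instr::I64LtS => (lhs < rhs) as i64,
            Instr::I64Const(_) => unreachable!("constants are handled above"),
        });
    }
    stack.pop()
}
```

For example, `run(&[Instr::I64Const(5), Instr::I64Const(2), Instr::I64Sub])` evaluates `5 - 2` by pushing both constants and letting the subtraction consume them.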
Runtime of choice: Wasmtime
A variety of WebAssembly engines are available for embedding into C++ projects:
■ Wasmtime
• implemented in Rust
• WebAssembly only
• lightweight (esp. compared to V8)
• has bindings for C/C++
• native support for yielding
■ V8
• implemented in C++
• supports JavaScript too
• a heavy dependency
• no direct support for yielding the execution to reduce latency
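Why yielding matters for latency can be sketched without any engine at all. The example below is a generic illustration of cooperative yielding with a per-slice budget; it does not use the Wasmtime API, and `SumTask`, `Step`, and `run_to_completion` are hypothetical names made up for this sketch.

```rust
/// Outcome of running a task for one time slice.
#[derive(Debug, PartialEq)]
pub enum Step {
    /// The budget ran out; the caller may run other work and resume later.
    Yielded,
    /// The computation finished with this result.
    Done(u64),
}

/// A resumable computation: sums the integers 1..=limit a few at a time.
pub struct SumTask {
    next: u64,
    limit: u64,
    acc: u64,
}

impl SumTask {
    pub fn new(limit: u64) -> Self {
        SumTask { next: 1, limit, acc: 0 }
    }

    /// Runs at most `fuel` iterations, then hands control back to the caller.
    pub fn resume(&mut self, mut fuel: u32) -> Step {
        while self.next <= self.limit {
            if fuel == 0 {
                return Step::Yielded;
            }
            self.acc += self.next;
            self.next += 1;
            fuel -= 1;
        }
        Step::Done(self.acc)
    }
}

/// Drives a task to completion and reports how many times it yielded.
pub fn run_to_completion(task: &mut SumTask, fuel_per_slice: u32) -> (u64, u32) {
    assert!(fuel_per_slice > 0, "a zero budget would never make progress");
    let mut yields = 0;
    loop {
        match task.resume(fuel_per_slice) {
            Step::Done(value) => return (value, yields),
            // A real scheduler would serve other requests at this point.
            Step::Yielded => yields += 1,
        }
    }
}
```

The point of the design is that a long-running user function can never hold the thread for more than one slice, so other requests are not starved.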
Runtime of choice: Wasmtime
For an initial implementation, we chose Wasmtime and its C++ bindings (libwasmtime).
The next step is to drop the bindings, because their feature set is incomplete,
and instead write the UDF support in Rust and compile it directly into Scylla.
How to code in WebAssembly?
Option 1: by hand (for Lisp enthusiasts)
(module
  (func $fib (param $n i64) (result i64)
    (if
      (i64.lt_s (local.get $n) (i64.const 2))
      (then (return (local.get $n)))
    )
    (i64.add
      (call $fib (i64.sub (local.get $n) (i64.const 1)))
      (call $fib (i64.sub (local.get $n) (i64.const 2)))
    )
  )
  (export "fib" (func $fib))
)
How to code in WebAssembly?
Option 2: write in C, compile with clang
int fib(int n) {
    if (n < 2) {
        return n;
    }
    return fib(n - 1) + fib(n - 2);
}

clang -O2 --target=wasm32 --no-standard-libraries -Wl,--export-all -Wl,--no-entry \
    fib.c -o fib.wasm
wasm2wat fib.wasm > fib.wat
How to code in WebAssembly?
Option 3: Rust!
use wasm_bindgen::prelude::*;

#[wasm_bindgen]
pub fn fib(n: i32) -> i32 {
    if n < 2 {
        n
    } else {
        fib(n - 1) + fib(n - 2)
    }
}

rustup target add wasm32-unknown-unknown
cargo build --target wasm32-unknown-unknown
wasm2wat target/wasm32-unknown-unknown/debug/fib.wasm > fib.wat
How to code in WebAssembly?
Option 4: AssemblyScript
export function fib(n: i32): i32 {
    if (n < 2) {
        return n
    }
    return fib(n - 1) + fib(n - 2)
}

asc fib.ts --textFile fib.wat --optimize
source: https://www.assemblyscript.org/introduction.html
User-defined functions
User-defined functions are a CQL feature that allows applying a custom function
to the query result rows.
cassandra@cqlsh:ks> SELECT id, inv(id), mult(id, inv(id)) FROM t;

 id | ks.inv(id) | ks.mult(id, ks.inv(id))
----+------------+-------------------------
  7 |   0.142857 |                       1
  1 |          1 |                       1
  0 |   Infinity |                     NaN
  4 |       0.25 |                       1

(4 rows)
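The slide does not show how `inv` and `mult` are defined. Assuming that `id` is an integer column and that both functions return a double, the following sketch reproduces the logic and explains the `Infinity` and `NaN` entries in the row for `id = 0`: under IEEE 754 arithmetic, `1.0 / 0.0` is infinity and `0.0 * infinity` is NaN.

```rust
/// Multiplicative inverse as a double (assumed definition of `inv`).
pub fn inv(id: i64) -> f64 {
    1.0 / id as f64
}

/// Product of an integer and a double (assumed definition of `mult`).
pub fn mult(a: i64, b: f64) -> f64 {
    a as f64 * b
}
```

Neither call panics for `id = 0`, because floating-point division by zero is well defined, unlike integer division.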
User-defined aggregates
A powerful tool for combining functions into accumulators, which aggregate
results from single rows.
cassandra@cqlsh:ks> SELECT * FROM words;

 word
------------
 monkey
 rhinoceros
 dog

(3 rows)

cassandra@cqlsh:ks> SELECT avg_length(word) FROM words;

 ks.avg_length(word)
-----------------------------------------------
 The average string length is 6.3333333333333!

(1 rows)

CREATE FUNCTION accumulate_len(acc tuple<bigint,bigint>, a text)
    RETURNS NULL ON NULL INPUT
    RETURNS tuple<bigint,bigint>
    LANGUAGE lua AS 'return {acc[1] + 1, acc[2] + #a}';

CREATE OR REPLACE FUNCTION present(res tuple<bigint,bigint>)
    RETURNS NULL ON NULL INPUT
    RETURNS text
    LANGUAGE lua AS
    'return "The average string length is " .. res[2]/res[1] .. "!"';

CREATE OR REPLACE AGGREGATE avg_length(text)
    SFUNC accumulate_len
    STYPE tuple<bigint,bigint>
    FINALFUNC present INITCOND (0,0);
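The way SFUNC, STYPE, FINALFUNC, and INITCOND fit together amounts to a fold: the state starts at INITCOND, SFUNC is applied once per row, and FINALFUNC turns the final state into the result. The sketch below mirrors the Lua functions above in Rust; `aggregate_state` and `avg_length` are helper names introduced for this illustration, and the number of digits printed differs from Lua's number formatting.

```rust
/// The aggregate state (STYPE): (number of rows seen, total length in bytes).
pub type State = (i64, i64);

/// State function (SFUNC): folds one row into the state.
pub fn accumulate_len(acc: State, a: &str) -> State {
    (acc.0 + 1, acc.1 + a.len() as i64)
}

/// Final function (FINALFUNC): turns the last state into the result.
pub fn present(res: State) -> String {
    format!(
        "The average string length is {}!",
        res.1 as f64 / res.0 as f64
    )
}

/// Starts from INITCOND (0, 0) and applies the state function to every row.
pub fn aggregate_state(rows: &[&str]) -> State {
    rows.iter().fold((0, 0), |acc, word| accumulate_len(acc, word))
}

/// The whole aggregate: fold the rows, then present the final state.
pub fn avg_length(rows: &[&str]) -> String {
    present(aggregate_state(rows))
}
```

For the three words in the table, the state ends up as `(3, 19)`, and 19 / 3 gives the average of roughly 6.33 shown in the query output.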
User-defined aggregates
Possible scenarios for user-defined aggregates:
■ gathering statistical data: variance, standard deviation, percentiles, etc.
■ combining multiple rows into a new format, e.g. JSON or XML
■ custom predicates, e.g. "return 10 highest values"
■ you name it!
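As one illustration of the "return 10 highest values" scenario, the state of such an aggregate can be a short list that is kept sorted and truncated after every row. This is only a sketch of the idea in Rust; `top_n_step` and `top_n` are hypothetical names, and a real UDA would express the same state function in Lua or Wasm.

```rust
/// State function: inserts one value and keeps only the `n` largest,
/// sorted in descending order.
pub fn top_n_step(mut state: Vec<i64>, value: i64, n: usize) -> Vec<i64> {
    state.push(value);
    state.sort_unstable_by(|a, b| b.cmp(a));
    state.truncate(n);
    state
}

/// Folds all rows through the state function, starting from an empty state.
pub fn top_n(values: &[i64], n: usize) -> Vec<i64> {
    values
        .iter()
        .fold(Vec::new(), |state, &value| top_n_step(state, value, n))
}
```

Because the state never holds more than `n + 1` values, the memory used by the aggregate stays bounded regardless of how many rows are scanned.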
UDF coded with Wasm
Creating a user-defined function with Wasm is as easy as providing its source code
represented in WebAssembly Text Format:
CREATE FUNCTION fib(input bigint)
    RETURNS NULL ON NULL INPUT
    RETURNS bigint
    LANGUAGE xwasm AS
'(module
  (func $fib (param $n i64) (result i64)
    (if
      (i64.lt_s (local.get $n) (i64.const 2))
      (then (return (local.get $n)))
    )
    (i64.add
      (call $fib (i64.sub (local.get $n) (i64.const 1)))
      (call $fib (i64.sub (local.get $n) (i64.const 2)))
    )
  )
  (export "fib" (func $fib))
)';
cassandra@cqlsh:ks> SELECT n, fib(n) FROM numbers;

 n | ks.fib(n)
---+-----------
 1 |         1
 2 |         1
 3 |         2
 4 |         3
 5 |         5
 6 |         8
 7 |        13
 8 |        21
 9 |        34

(9 rows)
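As a cross-check of the table, the same recursion can be written as an ordinary Rust function. This is a reference sketch only, mirroring the WAT body line by line with `bigint` mapped to `i64`; it is not the code that Scylla runs.

```rust
/// Naive recursive Fibonacci, mirroring the WAT module:
/// return n when n < 2, otherwise fib(n - 1) + fib(n - 2).
pub fn fib(n: i64) -> i64 {
    if n < 2 {
        return n;
    }
    // Wasm's i64.add wraps on overflow, so the sketch does the same.
    fib(n - 1).wrapping_add(fib(n - 2))
}
```

Note that the recursion is exponential in `n`, which is acceptable for a demo but is a good reason for the engine to be able to interrupt long-running functions.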
UDF coded with Wasm
The interface for expressing CQL types, return values, NULL values and many more
details is thoroughly explained in a public design doc:
https://github.com/scylladb/scylla/blob/master/docs/design-notes/wasm.md
Try it out!
Support for Wasm-based user-defined functions and
user-defined aggregates is already available
in experimental mode.
Enable it for testing today by adding these entries
to your scylla.yaml configuration file:
enable_user_defined_functions: true
experimental_features:
    - udf
Scylla currently supports Lua and Wasm for
user-defined functions.
Thank you!
Stay in touch
Piotr Sarna
sarna@scylladb.com

Scylla Summit 2022: ScyllaDB Embraces Wasm
