From Trill to Quill: Pushing the Envelope of Functionality and Scale
Badrish Chandramouli @ DEBS 2016
• Real-time: raise alerts
• Real-time with historical: correlate
• Offline: develop initial monitoring query, back-test
• Progressive: non-temporal analysis
Engine + Fabric
Interactive Query Authoring
Real-Time Dashboard
• Performance
• Fabric & language integration
• Query model
Scenarios
• monitor telemetry & raise alerts
• correlate real-time with logs
• develop initial monitoring query
• back-test over historical logs
• offline analysis (BI) with early results
[Figure: relational model vs. tempo-relational model. A query Q = COUNT(*) with a 5min window is applied to snapshots of the input (T-1, T-2, T-3) along logical time; the values 1, 2, 3, 2, 1 appear along the time axis.]
Supports broad & rich analytics scenarios (relational, progressive, time-based)
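The tempo-relational idea can be sketched in plain C#: give each event a lifetime of one window length, and define the result at a logical time t as the relational COUNT(*) over the events alive at t. This is a minimal illustration that does not use Trill; the class SnapshotCount, the method CountAt and the sample timestamps are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotCount
{
    // Window length in seconds (5 minutes, as on the slide).
    public const long Window = 300;

    // COUNT(*) over the snapshot at logical time t: an event with timestamp s
    // is alive during [s, s + Window), so it is counted when s <= t < s + Window.
    public static int CountAt(IEnumerable<long> eventTimes, long t) =>
        eventTimes.Count(s => s <= t && t < s + Window);

    public static void Main()
    {
        // Hypothetical event timestamps, in seconds.
        var events = new long[] { 0, 100, 200 };
        foreach (var t in new long[] { 50, 150, 250, 350, 450 })
            Console.WriteLine($"t={t}: count={CountAt(events, t)}");
    }
}
```

With these timestamps the count rises from 1 to 3 and falls back to 1 as events leave the window, which is the same shape as the values shown in the figure.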
• Key enabler: performance + fabric & language integration + query model
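struct ClickEvent { public long ClickTime; public long User; public long AdId; }

// Slide shorthand: 10secs and 5min stand for time spans.
var str = Network.ToStream(e => e.ClickTime, Latency(10secs));
var query =
    str.Where(e => e.User % 100 < 5)    // keep users whose id modulo 100 is below 5
       .Select(e => new { e.AdId })      // project each click to its ad id
       .GroupApply(e => e.AdId,          // for each ad id, count clicks per 5-minute window
                   s => s.Window(5min).Aggregate(w => w.Count()));
query.Subscribe(e => Console.Write(e)); // write results to console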
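For comparison, the same logic can be written over an in-memory collection with standard LINQ. This is only a sketch: it does not use Trill, it assumes the 5-minute window is a tumbling window with timestamps in seconds, and the names OfflineClickQuery and Run are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct ClickEvent
{
    public long ClickTime; // seconds, assumed non-negative
    public long User;
    public long AdId;
}

public static class OfflineClickQuery
{
    // Per-ad click counts for the sampled users, one row per 5-minute bucket.
    public static List<(long WindowStart, long AdId, int Count)> Run(
        IEnumerable<ClickEvent> clicks, long windowSeconds = 300)
    {
        return clicks
            .Where(e => e.User % 100 < 5)
            // Bucket each click by the start of its window and by ad id.
            .GroupBy(e => (WindowStart: e.ClickTime - e.ClickTime % windowSeconds, AdId: e.AdId))
            .Select(g => (WindowStart: g.Key.WindowStart, AdId: g.Key.AdId, Count: g.Count()))
            .OrderBy(r => r.WindowStart).ThenBy(r => r.AdId)
            .ToList();
    }
}
```

Unlike the streaming query, this version needs the whole input up front; it is useful mainly as a reference for what the per-window, per-ad counts should be.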
stream of batches
• More load → larger batches → better throughput
[Figure: batches flowing between operators op1 and op2]
class DataBatch {
long[] SyncTime;
...
Bitvector BV;
}
class UserData_Gen : DataBatch {
long[] c_ClickTime;
long[] c_User;
long[] c_AdId;
}
[Figure: batches flowing between operators op1 and op2; each batch holds a timestamp column, payload columns, and a bitvector]
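str.Where(e => e.User % 100 < 5);

[Figure: the Application calls Send(events) into Trill and gets output back through Receive(results)]

// Generated filter operator (pseudocode): rows that fail the predicate are
// marked in the bitvector instead of being copied out of the batch.
On(Batch b) {
  for i = 0 to b.Size - 1 {
    if !(b.c_User[i] % 100 < 5)
      set b.bitvector[i]
  }
  next-operator.On(b)
}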
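The filter pseudocode above can be made concrete with a small self-contained sketch. It is not Trill's generated code: ClickBatch, WhereOperator and LiveRows are hypothetical names, and a bool[] stands in for the bitvector, with true meaning that the row has been filtered out.

```csharp
using System;

// Hypothetical columnar batch: one array per field plus a filter mask.
public sealed class ClickBatch
{
    public int Size;
    public long[] SyncTime;
    public long[] User;
    public long[] AdId;
    public bool[] Filtered; // stands in for the bitvector; true = row dropped
}

public static class WhereOperator
{
    // Marks rows that fail the predicate; no data is copied or compacted.
    public static void On(ClickBatch b)
    {
        for (int i = 0; i < b.Size; i++)
        {
            if (!(b.User[i] % 100 < 5))
                b.Filtered[i] = true;
        }
    }

    // Counts the rows that downstream operators would still see.
    public static int LiveRows(ClickBatch b)
    {
        int live = 0;
        for (int i = 0; i < b.Size; i++)
            if (!b.Filtered[i]) live++;
        return live;
    }
}
```

The loop touches a single column array sequentially and writes one flag per row, which is the kind of tight loop a columnar layout is meant to allow.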
Func<TState> InitialState();
Func<TState, long, TInput, TState> Accumulate();
Func<TState, long, TInput, TState> Deaccumulate();
Func<TState, TState, TState> Sum();
Func<TState, TState, TState> Difference();
Func<TState, TResult> ComputeResult();
InitialState: () => 0
Accumulate: (oldCount, timestamp, input) => oldCount + 1
Deaccumulate: (oldCount, timestamp, input) => oldCount - 1
Sum: (leftCount, rightCount) => leftCount + rightCount
Difference: (leftCount, rightCount) => leftCount - rightCount
ComputeResult: count => count
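To show how the six functions fit together, here is a minimal sketch that packages them in a plain class and drives a sliding-window count with Accumulate and Deaccumulate. The Aggregate container, the AggregateDemo class and the Slide helper are assumptions made for illustration and are not Trill's actual types; Sum and Difference are defined but not exercised here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container for the six functions listed above.
public sealed class Aggregate<TState, TInput, TResult>
{
    public Func<TState> InitialState;
    public Func<TState, long, TInput, TState> Accumulate;
    public Func<TState, long, TInput, TState> Deaccumulate;
    public Func<TState, TState, TState> Sum;
    public Func<TState, TState, TState> Difference;
    public Func<TState, TResult> ComputeResult;
}

public static class AggregateDemo
{
    // The Count aggregate from the slide.
    public static Aggregate<long, TInput, long> Count<TInput>() =>
        new Aggregate<long, TInput, long>
        {
            InitialState = () => 0,
            Accumulate = (oldCount, timestamp, input) => oldCount + 1,
            Deaccumulate = (oldCount, timestamp, input) => oldCount - 1,
            Sum = (leftCount, rightCount) => leftCount + rightCount,
            Difference = (leftCount, rightCount) => leftCount - rightCount,
            ComputeResult = count => count,
        };

    // Result after each arrival for a sliding window of the given length.
    // Timestamps double as the payload and must arrive in order.
    public static List<TResult> Slide<TState, TResult>(
        Aggregate<TState, long, TResult> agg, IEnumerable<long> timestamps, long window)
    {
        var state = agg.InitialState();
        var live = new Queue<long>();
        var results = new List<TResult>();
        foreach (var t in timestamps)
        {
            // Retire events that have fallen out of the window.
            while (live.Count > 0 && live.Peek() <= t - window)
            {
                var old = live.Dequeue();
                state = agg.Deaccumulate(state, old, old);
            }
            state = agg.Accumulate(state, t, t);
            live.Enqueue(t);
            results.Add(agg.ComputeResult(state));
        }
        return results;
    }
}
```

Because the state is updated incrementally, each event is accumulated once and deaccumulated once, regardless of the window length.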
session windows,
http://aka.ms/trill
• Increasing interest in real-time processing over out-of-order streams
[Figure: chart with a 0 to 100 axis, labelled "Refresh every second"]
Up to 8X faster
use existing high-perf in-order Trill operators unchanged
Low-latency
Completeness
1 sec, 98%
1 hour, 100%
?
[Figure: five charts with 0 to 100 axes over a cloud telemetry log; labels: "10 seconds", "Refresh every second"]
Impatience framework gives us low latency, high completeness, high throughput, and low memory usage

Configuration                  Latency    Completeness
{1 sec}                        ~ 1 sec    98%
{1 hour}                       ~ 1 hour   100%
{1 sec} + {1 min} + {1 hour}   ~ 1 sec    100%
{1 sec, 1 min, 1 hour}         ~ 1 sec    100%
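The latency/completeness trade-off in the table can be made concrete with a simplified model: treat an event as "on time" for a given reorder latency if its delay (arrival time minus event time) does not exceed that latency. The sketch below only measures completeness under this assumption; it does not implement the Impatience framework, and the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Completeness
{
    // Fraction of events whose delay is within the reorder latency, i.e. the
    // events a reorder buffer of that length would still be able to place.
    public static double AtLatency(
        IEnumerable<(long EventTime, long ArrivalTime)> events, long latency)
    {
        var list = events.ToList();
        if (list.Count == 0) return 1.0;
        int onTime = list.Count(e => e.ArrivalTime - e.EventTime <= latency);
        return onTime / (double)list.Count;
    }
}
```

A short latency gives early answers that miss the stragglers, while a long latency catches them at the cost of waiting; the table's last two rows are the configurations that aim for both.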
[Figure: Throughput (million/sec), comparing {1sec, 1min, 1hour} with {1sec}+{1min}+{1hour}]
[Figure: Memory usage (MB, log scale), comparing {1sec, 1min, 1hour} with {1sec}+{1min}+{1hour}]
no overlapping lifetimes
data streams and operations
arrays of numerical values
rich space
temporal logic
• Transfer
ShardedStreamable
shards
• querying
• data movement
• keying
Operation      Description
Query          Applies an unmodified query on each (keyed) shard
Broadcast      Duplicates each shard's contents on all shards
Multicast      Copies tuples from each input shard to zero or more specific result shards
ReShard        Load-balances across shards
ReDistribute   Moves tuples so that the same key resides in the same result shard
ReKey          Changes the key associated with each row in each shard
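A few of these operations can be sketched over an in-memory model in which a sharded dataset is a list of shards and each shard is a list of rows. This is an illustration only: the Shards class and its method signatures are assumptions and are not Quill's ShardedStreamable API; Multicast and ReShard are left out for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Shards
{
    // Query: apply the same unmodified function to every shard independently.
    public static List<List<TOut>> Query<TIn, TOut>(
        List<List<TIn>> shards, Func<List<TIn>, List<TOut>> query) =>
        shards.Select(s => query(s)).ToList();

    // Broadcast: every result shard receives the contents of all input shards.
    public static List<List<T>> Broadcast<T>(List<List<T>> shards)
    {
        var all = shards.SelectMany(s => s).ToList();
        return shards.Select(s => new List<T>(all)).ToList();
    }

    // ReKey: change the key attached to each row without moving any rows.
    public static List<List<(TKey2 Key, TValue Value)>> ReKey<TKey, TKey2, TValue>(
        List<List<(TKey Key, TValue Value)>> shards, Func<TValue, TKey2> newKey) =>
        shards.Select(s => s.Select(r => (Key: newKey(r.Value), Value: r.Value)).ToList())
              .ToList();

    // ReDistribute: move rows so that equal keys land in the same result shard.
    public static List<List<(TKey Key, TValue Value)>> ReDistribute<TKey, TValue>(
        List<List<(TKey Key, TValue Value)>> shards, int resultShards)
    {
        var result = Enumerable.Range(0, resultShards)
            .Select(i => new List<(TKey Key, TValue Value)>()).ToList();
        foreach (var row in shards.SelectMany(s => s))
        {
            // Hash partitioning; the mask keeps the hash non-negative.
            int hash = EqualityComparer<TKey>.Default.GetHashCode(row.Key) & int.MaxValue;
            result[hash % resultShards].Add(row);
        }
        return result;
    }
}
```

Separating keying (ReKey) from data movement (ReDistribute) mirrors the split between "keying" and "data movement" in the list above: a plan can change keys cheaply and pay for a shuffle only when it is needed.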
e => e.Count()
Flat redistribute
e => e.Count()
e => e.Sum()
[Figure: multi-level aggregation plan. Shards compute e => e.Count(); levels of [ReKey], [ReDist] + Union, and AGG merge the partial results; the final stage computes e => e.Sum()]
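One reading of the e => e.Count() and e => e.Sum() labels is a two-phase grouped count: each shard counts locally, and only the partial counts are redistributed by key and summed. The sketch below shows that pattern on in-memory collections; TwoPhaseCount and its methods are hypothetical names, and the actual plans in the talk may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TwoPhaseCount
{
    // Phase 1 (per shard, no data movement): count rows per key.
    public static Dictionary<TKey, long> LocalCounts<TKey>(IEnumerable<TKey> shard) =>
        shard.GroupBy(k => k).ToDictionary(g => g.Key, g => g.LongCount());

    // Phase 2: bring the partial counts for each key together and sum them.
    // At most one row per key per shard has to move.
    public static Dictionary<TKey, long> Combine<TKey>(
        IEnumerable<Dictionary<TKey, long>> partials) =>
        partials.SelectMany(p => p)
                .GroupBy(kv => kv.Key, kv => kv.Value)
                .ToDictionary(g => g.Key, g => g.Sum());
}
```

Compared with a flat redistribute of the raw rows followed by a single Count, this moves less data when keys repeat often within a shard.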
(l,r) => l.Join(r, …)
(l,r) => l.Join(r, …)
Flat redistribute
Flat broadcast
No data movement
str => str.SlidingWindow(Y).Count()
.Where(c => c > threshold)
(l, r) => l.WhereNotExists(y)
str => str.HoppingWindow(Z).Count()
[Figures: Scan (Quill vs. SparkSQL); Time taken & scheduling overhead]
[Figures: Grouped agg with 40M groups; Hopping window (GitHub data)]
http://badrish.net/papers/shrink-TR.pdf
https://www.microsoft.com/en-us/research/people/badrishc/
http://aka.ms/streams/