Structured Streaming Use-Cases at Apple

Apple logo is a trademark of Apple Inc.
Kristine Guo
Liang-Chi Hsieh

THIS IS NOT A CONTRIBUTION
Who am I

Liang-Chi Hsieh
Apache Spark Committer
Software Engineer @ Apple

https://github.com/viirya
https://www.linkedin.com/in/liang-chi-hsieh-a7904568/

Who am I

Kristine Guo
Software Engineer @ Apple
Focus on cloud platform technologies
Currently working on developing high-scale backend systems

https://www.linkedin.com/in/kristineguo/

Agenda
• Revive Previous Structured Streaming Effort
• New Enhancement to Structured Streaming
• Use-case at Apple

Features in Structured Streaming that matter to us
• New built-in StateStore
• Session Window
• Stateful task scheduling enhancement
• Checkpoint enhancement

Revive Previous Structured Streaming Efforts

StateStore: Current Status
What is StateStore?
A component that manages state for stateful operators such as streaming aggregates, joins, etc.

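[Diagram labels: Project, Stateful operators, StateStore, FileSystem. Stateful operators get/put key/value pairs in the StateStore; the StateStore checkpoints to and restores from the FileSystem.]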

Built-in StateStore
HDFSBackedStateStore
• Stores state in an in-memory map
• Checkpoints to an HDFS-compatible file system

Disadvantages?
• Limited by executor memory, which is an issue for large-state use-cases
• Impacts other memory usage on the executors
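
To make the trade-off above concrete, here is a minimal sketch of the general idea: state held in an in-memory map and checkpointed to a file system. It is not Spark's HDFSBackedStateStore code. The class name `InMemoryStateStore`, the `.snapshot` file naming, and the use of JSON and a local temp directory are all assumptions made for illustration.

```python
import json
import os
import tempfile


class InMemoryStateStore:
    """Toy state store (hypothetical): state lives in a dict, snapshots go to a directory."""

    def __init__(self, checkpoint_dir):
        self.checkpoint_dir = checkpoint_dir
        self.state = {}   # every key/value pair is held in process memory
        self.version = 0

    def put(self, key, value):
        self.state[key] = value

    def get(self, key):
        return self.state.get(key)

    def commit(self):
        """Write a full snapshot for the next version and return that version."""
        self.version += 1
        path = os.path.join(self.checkpoint_dir, f"{self.version}.snapshot")
        with open(path, "w") as f:
            json.dump(self.state, f)
        return self.version

    @classmethod
    def restore(cls, checkpoint_dir, version):
        """Rebuild the in-memory map from a previously committed snapshot."""
        store = cls(checkpoint_dir)
        path = os.path.join(checkpoint_dir, f"{version}.snapshot")
        with open(path) as f:
            store.state = json.load(f)
        store.version = version
        return store


# A local temp directory stands in for the HDFS-compatible file system.
checkpoint_dir = tempfile.mkdtemp()
store = InMemoryStateStore(checkpoint_dir)
store.put("user-1", 3)
store.put("user-2", 5)
committed = store.commit()

restored = InMemoryStateStore.restore(checkpoint_dir, committed)
```

Because the whole map sits in memory, state size is bounded by what the process can hold, which is the limitation the slide points out.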

We need new built-in StateStore
Reviving RocksDB StateStore as a built-in StateStore in Structured Streaming
Why?
• More and more streaming applications require large state
• Widely used in the industry
• SPARK-34198: Add RocksDB StateStore as external module
• Received all positive responses from the community

Visit Current RocksDB StateStore
OSS implementations
• https://github.com/chermenin/spark-states
• SPARK-28120: RocksDB state storage
• https://github.com/qubole/spark-state-store

OSS in the future
• RocksDB StateStore from Databricks

RocksDB StateStore Benchmark
put/get significant key/value pairs
              Best Time(ms)   Avg Time(ms)   Relative
Chermenin     32725           34507          1.0X
Qubole        25493           25636          1.3X
Databricks    TBD             TBD            TBD

Window Operations
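[Figure: two event-time timelines. Top: a 10 min event-time window over events at 12:01, 12:02, 12:12, 12:15, … Bottom: events at 12:01, 12:02, 12:20, 12:25, … separated by a session gap, contrasting the event-time window with the session window.]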

Reviving session window as a built-in window feature
Session Window
• SPARK-10816: EventTime based sessionization
• Inactive for the last few years
• Available in other streaming engines but missing in Structured Streaming

Session Window Internals
Session manipulation
• Session initialization, restoring, merging, saving

Internal StateStore format
• Efficiently retrieve all session states for a specific session key
• Partially update the start time and duration of the affected windows
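
The merging step can be sketched as follows. This is a simplified illustration, not Spark's implementation: the function name `merge_into_sessions`, the choice to treat windows that merely touch as separate sessions, and the 10-minute gap in the example are assumptions. Times are plain integers (minutes past 12:00).

```python
def merge_into_sessions(sessions, event_time, gap):
    """Insert one event into a list of (start, end) sessions for a single key.

    The event opens the window [event_time, event_time + gap). Every stored
    session that overlaps this window is absorbed into it, so one late event
    can bridge two sessions that used to be separate.
    """
    new_start, new_end = event_time, event_time + gap
    merged = []
    for start, end in sessions:
        if end <= new_start or start >= new_end:
            merged.append((start, end))        # no overlap: session untouched
        else:
            new_start = min(new_start, start)  # overlap: widen the new session
            new_end = max(new_end, end)
    merged.append((new_start, new_end))
    merged.sort()
    return merged


# Events at 12:01, 12:02, 12:20, 12:25 with an assumed 10-minute session gap.
sessions = []
for minute in [1, 2, 20, 25]:
    sessions = merge_into_sessions(sessions, minute, gap=10)

# A late event at 12:11 overlaps both sessions and merges them into one.
bridged = merge_into_sessions(sessions, 11, gap=10)
```

Only the sessions that overlap the new event change, which is why the state format needs cheap retrieval of all sessions for a key and partial updates of the affected windows.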

Internal StateStore format
• Simple session window list as the value
• Double list approach
• Start times of session windows as key
* Special thanks to Yuanjian Li for the state store format design doc

Session Window List Approach
• Easy to implement
• Memory issue if there are too many sessions per value
• Does not support partial update

Single state store: Session key -> A list of rows
Row structure: Row(a1, a2, a3…, session_window(start_time, end_time))

Double List Approach
• Order of session windows is kept, efficient to traverse
• No complex structure
• Harder to maintain

Session key -> First start time

(key, start time 1) -> Row(…, session_window(s, e))
(key, start time 2) -> Row(…, session_window(s, e))
(key, start time 3) -> Row(…, session_window(s, e))

(key, start time 1) -> (None, start time 2)
(key, start time 2) -> (start time 1, start time 3)
(key, start time 3) -> (start time 2, None)
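
One way to read the double list layout is as a doubly linked list of start times stored in key/value form. The sketch below follows that reading; it is an interpretation of the slide, not the design doc's code. The class name `DoubleListSessionStore` and its methods are hypothetical, and plain dicts stand in for state stores.

```python
class DoubleListSessionStore:
    """Hypothetical sketch: sessions per key chained by (previous, next) start times."""

    def __init__(self):
        self.first_start = {}  # session key -> first (smallest) start time
        self.rows = {}         # (key, start time) -> session row
        self.links = {}        # (key, start time) -> (previous start, next start)

    def insert(self, key, start, row):
        if (key, start) in self.rows:
            self.rows[(key, start)] = row  # existing session: update the row only
            return
        self.rows[(key, start)] = row
        # Walk the chain to find the neighbours of the new start time.
        prev, nxt = None, self.first_start.get(key)
        while nxt is not None and nxt < start:
            prev, nxt = nxt, self.links[(key, nxt)][1]
        self.links[(key, start)] = (prev, nxt)
        if prev is None:
            self.first_start[key] = start
        else:
            self.links[(key, prev)] = (self.links[(key, prev)][0], start)
        if nxt is not None:
            self.links[(key, nxt)] = (start, self.links[(key, nxt)][1])

    def sessions(self, key):
        """Yield (start time, row) in start-time order by following next links."""
        start = self.first_start.get(key)
        while start is not None:
            yield start, self.rows[(key, start)]
            start = self.links[(key, start)][1]


store = DoubleListSessionStore()
store.insert("k", 20, "row@20")
store.insert("k", 1, "row@1")
store.insert("k", 10, "row@10")
ordered = list(store.sessions("k"))
```

Each insert touches up to three link entries plus the first-start pointer, which illustrates why this layout is efficient to traverse but harder to maintain.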

Start times as Key Approach
• Keep the order in the list of start times
• Store a list of session start times

key 1 -> start time 1, start time 2…
key 2 -> start time 1, start time 2…
key 3 -> start time 1, start time 2…

Per-session keys: (key1, start time1), (key1, start time2), (key2, start time1)
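
A rough sketch of this layout, using two dicts in place of state stores. The slide does not show what the `(key, start time)` entries hold; the sketch assumes they hold the session row. The class name `StartTimeKeyedSessionStore` and its methods are hypothetical.

```python
import bisect


class StartTimeKeyedSessionStore:
    """Hypothetical sketch: an ordered start-time list per key plus one entry per session."""

    def __init__(self):
        self.start_times = {}  # session key -> sorted list of start times
        self.rows = {}         # (key, start time) -> session row (assumed value)

    def put(self, key, start, row):
        if (key, start) not in self.rows:
            # New session: keep the per-key list of start times in order.
            bisect.insort(self.start_times.setdefault(key, []), start)
        self.rows[(key, start)] = row  # partial update: only this session is rewritten

    def remove(self, key, start):
        del self.rows[(key, start)]
        self.start_times[key].remove(start)

    def sessions(self, key):
        """Return (start time, row) pairs for one key in start-time order."""
        return [(s, self.rows[(key, s)]) for s in self.start_times.get(key, [])]


store = StartTimeKeyedSessionStore()
store.put("key1", 20, "row@20")
store.put("key1", 1, "row@1")
store.put("key2", 5, "row@5")
store.put("key1", 20, "row@20-updated")  # rewrite one session without touching others
store.remove("key1", 1)
```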

Stateful Task Scheduling Enhancement
• Spark task scheduling is not designed for stateful tasks
• State store location is assigned arbitrarily
- A change of state store location causes frequent reloading from the remote FS
- An obstacle to the checkpoint enhancements planned in our future work

Stateful Task Scheduling Enhancement
Possible approaches: Ongoing works
• Leveraging existing data locality preferences as a simple workaround
- SPARK-33814: Provide preferred locations for stateful operations
• Customizing Spark task scheduling behavior
- SPARK-35022: Task Scheduling Plugin in Spark

[Diagram: on new resource offers, the Spark Task Scheduler passes candidate tasks for scheduling to the Task Scheduling Plugin, which returns scheduling preferences.]

How Task Scheduling Plugin helps
• Try its best to distribute stateful tasks across available executors
• Keep stateful task locations stable across batches
[Figure, three animation frames: Task Scheduling Plugin. In Batch N, State1 to State4 are placed on Exec 1 to Exec 4. In Batch N + 1, the tasks are placed back on the same executors one frame at a time: State1 on Exec 1, then State3 on Exec 3, then State4 on Exec 4.]
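
The two goals above (spread stateful tasks out, keep their locations stable) can be illustrated with a small placement function. This is not the SPARK-35022 plugin API, only a sketch of the policy; the function name `assign_stateful_tasks` and its arguments are hypothetical.

```python
def assign_stateful_tasks(partitions, executors, previous):
    """Return {partition: executor} for one batch.

    Pass 1 keeps a partition on its previous executor when that executor is
    still available, so its state does not have to be reloaded.
    Pass 2 places the remaining partitions on the least-loaded executor,
    which spreads stateful tasks across the cluster.
    """
    assignment = {}
    load = {executor: 0 for executor in executors}

    for partition in partitions:
        executor = previous.get(partition)
        if executor in load:
            assignment[partition] = executor
            load[executor] += 1

    for partition in partitions:
        if partition not in assignment:
            # Ties go to the executor listed first.
            executor = min(executors, key=lambda e: load[e])
            assignment[partition] = executor
            load[executor] += 1

    return assignment


states = ["State1", "State2", "State3", "State4"]

# Batch N: no history, so the four states are spread over four executors.
batch_n = assign_stateful_tasks(states, ["Exec 1", "Exec 2", "Exec 3", "Exec 4"], {})

# Batch N + 1: assume Exec 2 is no longer offered; other states stay where they were.
batch_n1 = assign_stateful_tasks(states, ["Exec 1", "Exec 3", "Exec 4"], batch_n)
```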

Use Case
• Two parallel data streams
• Each stream performs aggregation over the same data source
• Stream aggregation operates on dynamically-sized app-defined batches
• Stream-stream join between the two data streams
- Must account for potential lag between streams

Performance Requirements
• Throughput: PBs/day
• RPS: High, O(10k)
• Data size: Varying (1 KB to 1 MB)
• State store: Stream-stream join exerts high memory pressure

Solutions
• Accounting for potential lag: watermarking
• Dynamic batch aggregation: session windows
• Stream-stream join pressure: RocksDB-based State Store
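
As an illustration of how watermarking bounds the state kept for a lagging stream, here is a simplified sketch. It is not Spark's implementation: the class name `WatermarkedBuffer` is hypothetical, the watermark is advanced per row for brevity (an assumption), and event times are plain integers.

```python
class WatermarkedBuffer:
    """Hypothetical sketch: buffers one side of a join and evicts rows behind the watermark."""

    def __init__(self, allowed_lag):
        self.allowed_lag = allowed_lag  # how far one stream may trail the other
        self.max_event_time = None
        self.rows = []                  # buffered (event_time, payload) pairs

    @property
    def watermark(self):
        if self.max_event_time is None:
            return None
        return self.max_event_time - self.allowed_lag

    def add(self, event_time, payload):
        """Return True if the row was buffered, False if it arrived too late."""
        if self.watermark is not None and event_time < self.watermark:
            return False
        self.rows.append((event_time, payload))
        if self.max_event_time is None or event_time > self.max_event_time:
            self.max_event_time = event_time
        # Rows behind the watermark can no longer find a match, so drop them.
        self.rows = [row for row in self.rows if row[0] >= self.watermark]
        return True


buffer = WatermarkedBuffer(allowed_lag=10)
accepted = [
    buffer.add(100, "a"),  # first row, watermark becomes 90
    buffer.add(95, "b"),   # within the allowed lag, still buffered
    buffer.add(120, "c"),  # watermark moves to 110, rows "a" and "b" are evicted
    buffer.add(105, "d"),  # behind the watermark, rejected
]
```

The allowed lag trades completeness for memory: a larger value tolerates more skew between the two streams but keeps more join state alive.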

Thank you!
• Your feedback is important to us
• Don’t forget to rate and review the sessions
