Spark meetup - Zoomdata Streaming

Interactive Visualization of Data powered by Spark

Streaming Data @ Zoomdata
● Visualizations react to new data delivered
● Users start, stop, pause the stream
● Users select a rolling window or pin a start time to capture cumulative metrics
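
The difference between a rolling window and a pinned start time can be sketched in plain Python. This is an illustration only, not Zoomdata's implementation: the class name `StreamMetric`, its parameters, and the assumption that events arrive in timestamp order are all hypothetical.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class StreamMetric:
    """Hypothetical helper: a rolling-window sum plus a cumulative sum
    that starts counting at a pinned start time."""

    def __init__(self, window_seconds: int, pinned_start: Optional[int] = None):
        self.window_seconds = window_seconds
        self.pinned_start = pinned_start
        self._recent: Deque[Tuple[int, int]] = deque()  # (timestamp, value)
        self.rolling = 0
        self.cumulative = 0

    def add(self, timestamp: int, value: int) -> None:
        # Assumption: events arrive in non-decreasing timestamp order.
        self._recent.append((timestamp, value))
        self.rolling += value

        # Evict events that have fallen out of the rolling window.
        cutoff = timestamp - self.window_seconds
        while self._recent and self._recent[0][0] <= cutoff:
            _, old_value = self._recent.popleft()
            self.rolling -= old_value

        # The cumulative metric only counts events at or after the pinned start.
        if self.pinned_start is not None and timestamp >= self.pinned_start:
            self.cumulative += value


metric = StreamMetric(window_seconds=10, pinned_start=5)
for ts, value in [(0, 1), (5, 2), (12, 3), (20, 4)]:
    metric.add(ts, value)
```

The rolling sum forgets old events as the window slides, while the cumulative sum keeps growing from the pinned start onward.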

Drivers for Streaming Data
● Data Freshness
● Time to Analytic
● Business Context

Challenges
● Time
● Frequency
● Retention
● Synchronization
● Order
● Updates
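
Two of the challenges listed, order and updates, can be made concrete with a small sketch. One common approach, not necessarily the one used at Zoomdata, is to keep the latest value per key by event time, so a late-arriving older record cannot overwrite a newer one. The function name and the `(key, event_time, value)` record shape are assumptions made for this example.

```python
from typing import Dict, Iterable, Tuple


def latest_by_event_time(
    events: Iterable[Tuple[str, int, float]]
) -> Dict[str, float]:
    """Hypothetical last-write-wins merge keyed on event time.

    Each event is (key, event_time, value). Arrival order may differ
    from event-time order.
    """
    state: Dict[str, Tuple[int, float]] = {}
    for key, event_time, value in events:
        current = state.get(key)
        # An update wins only if its event time is newer than what we hold.
        if current is None or event_time > current[0]:
            state[key] = (event_time, value)
    return {key: value for key, (_, value) in state.items()}


# The update for "a" at event time 2 arrives after the one at event time 3.
arrivals = [("a", 1, 10.0), ("b", 2, 5.0), ("a", 3, 12.0), ("a", 2, 11.0)]
result = latest_by_event_time(arrivals)
```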

Addressing the Problem @ Zoomdata
Step                   Historical              Revised
Receive Data           JMS                     Kafka
Manipulate Stream      Single JVM in Memory    Spark Streaming
Hold Data in Buffer    MongoDB                 Pluggable
Interact with Data     Custom Code             Pluggable
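
What "pluggable" might mean for the buffer can be sketched as a small interface that any landing area implements. This is a minimal illustration under stated assumptions: `Sink`, `InMemorySink`, and `run_pipeline` are hypothetical names, and a real sink would wrap a store's own client instead of a Python list.

```python
from typing import Callable, Dict, Iterable, List, Protocol


class Sink(Protocol):
    """Hypothetical contract a landing area must satisfy."""

    def write(self, record: Dict) -> None: ...


class InMemorySink:
    """Stand-in for a real store; keeps records in a list."""

    def __init__(self) -> None:
        self.records: List[Dict] = []

    def write(self, record: Dict) -> None:
        self.records.append(record)


def run_pipeline(
    source: Iterable[Dict],
    transform: Callable[[Dict], Dict],
    sinks: Iterable[Sink],
) -> int:
    """Receive, manipulate, and hold each record; return how many were processed."""
    sink_list = list(sinks)
    count = 0
    for record in source:
        out = transform(record)
        # The same transformed record is fanned out to every configured sink.
        for sink in sink_list:
            sink.write(out)
        count += 1
    return count


first, second = InMemorySink(), InMemorySink()
processed = run_pipeline(
    [{"v": 1}, {"v": 2}],
    lambda record: {"v": record["v"] * 2},
    [first, second],
)
```

Because the pipeline depends only on the `write` method, swapping one store for another does not touch the stream-processing code.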

Technology Cast
● The Stream - Kafka, Kinesis, JMS
● Processing Fabric - Spark Streaming
● Landing Area - MemSQL, Solr, Kudu, Others

Benefits
● Contextual Expressiveness with Streaming Data
● Independent scalability (scale-up, scale-around)
● Expressiveness powered by Spark, using windowing (DataFrame API with streams)
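
The windowing idea in the last bullet can be illustrated without a cluster. In Spark's DataFrame API a grouping of this kind is typically expressed with the `window` function from `pyspark.sql.functions`. The sketch below uses plain Python instead, so it shows the bucketing logic rather than Spark itself. The function name and the `(timestamp, key)` event shape are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def tumbling_window_counts(
    events: Iterable[Tuple[int, str]], window_seconds: int
) -> Dict[Tuple[int, str], int]:
    """Count events per (window start, key) using fixed, non-overlapping windows."""
    counts: Counter = Counter()
    for timestamp, key in events:
        # Round the timestamp down to the start of its window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


events = [(0, "spark"), (4, "spark"), (9, "kafka"), (10, "spark"), (19, "spark")]
counts = tumbling_window_counts(events, window_seconds=10)
```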

Side Benefits
● Separation of concerns
● Disaster Recovery, COOP, other Data management concerns
● Restatements
● Options!

Demo
● Twitter Producer
● Spark Streaming
● MemSQL & Solr Sinks

Future Work
● Cross Stream Synchronization & Fusion
● On-demand scale out and resource management via Mesos
● Schema Evolution
● Storage Tiering

Thanks
For more information contact:
ruhollah@zoomdata.com
quan@zoomdata.com
