Real-Time Data Pipelines
Nikita Shamgunov, MemSQL CTO and co-founder
February 17, 2016
MemSQL Confidential
Designed for Modern Hardware, Trends, and Workloads
Scalable SQL
• Multi-mode
  • OLTP, OLAP, HTAP
• Multi-model
  • ANSI SQL
  • Key-value
  • Document/JSON
  • Geospatial
In-Memory and Solid-State
• In-Memory rowstore
• Solid-state columnstore
• Stream directly to rowstore or columnstore
Distributed
• Distributed query optimizer and execution
• Scale-out on commodity hardware
Datacenter or Cloud
• Deploy on-premises
• Cloud agnostic
  • Amazon
  • Microsoft
  • Google
  • Digital Ocean
Simple, Real-Time, Affordable, Flexible
Creating Real-Time Pipelines with Streamliner
• One-click deployment of integrated Apache Spark
• Create real-time data pipelines through a graphical UI
• Eliminate batch ETL
• Open sourced on GitHub at memsql.github.io/spark-streamliner
[Diagram labels: Real-Time Inputs; Apache Spark; Streamliner (Extract, Transform, Load); Real-Time Application]
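The Extract, Transform, Load stages named on this slide can be illustrated with a minimal sketch. This is plain Python, not the Streamliner or Spark API; the function names, the JSON input format, the `sensor_id` and `value` fields, and the list standing in for a database table are all assumptions made for illustration.

```python
import json
from typing import Iterable, Iterator, List, Tuple

def extract(raw_lines: Iterable[str]) -> Iterator[dict]:
    """Extract: parse each raw line as JSON, skipping lines that do not parse."""
    for line in raw_lines:
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue

def transform(records: Iterable[object]) -> Iterator[Tuple[str, float]]:
    """Transform: keep well-formed records and shape them as (sensor_id, value) rows."""
    for rec in records:
        if isinstance(rec, dict) and "sensor_id" in rec and "value" in rec:
            yield (rec["sensor_id"], float(rec["value"]))

def load(rows: Iterable[Tuple[str, float]], table: List[tuple]) -> int:
    """Load: append rows to the target (a list stands in for a table); return rows added."""
    before = len(table)
    table.extend(rows)
    return len(table) - before

def run_batch(raw_lines: Iterable[str], table: List[tuple]) -> int:
    """Run one small batch through all three stages."""
    return load(transform(extract(raw_lines)), table)
```

Calling `run_batch` repeatedly on small batches of incoming lines approximates a continuous pipeline, in contrast to a periodic bulk ETL job.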
MemSQL in Energy
• Real-Time Scoring for Predictive Applications
• Sensor reading and predictive model score appear simultaneously in database table
[Diagram labels: Industrial Equipment Sensor Data; Input; User Jar; SAS Generated PMML; Streamliner; table columns S1, S2, S3 (sensors) and P1, P2, P3 (predictive model scores)]
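As a rough illustration of a sensor reading and its model score landing in the same table row, the sketch below scores each reading as it arrives and emits one row with columns S1, S2, S3, P1, P2, P3. The logistic models and their coefficients are hypothetical stand-ins; on the slide the model comes from SAS-generated PMML, which this sketch does not parse.

```python
import math
from typing import Sequence, Tuple

# Hypothetical (weight, bias) pair per sensor; placeholders for real PMML models.
MODELS = [(0.8, -0.2), (-0.5, 0.1), (0.3, 0.0)]

def score(reading: float, weight: float, bias: float) -> float:
    """Logistic score in the open interval (0, 1) for a single sensor reading."""
    return 1.0 / (1.0 + math.exp(-(weight * reading + bias)))

def to_row(readings: Sequence[float]) -> Tuple[float, ...]:
    """Build one row: the readings (S1..S3) followed by their scores (P1..P3)."""
    scores = [score(x, w, b) for x, (w, b) in zip(readings, MODELS)]
    return tuple(readings) + tuple(scores)
```

Because the score is computed in the same step that builds the row, a reading never appears in the table without its score.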
MemSQL PowerStream
Internet-of-Things simulation depicting the health of wind turbines globally.
7 machines (AWS c4.2xlarge instances) at $0.311 per hour per machine; annual cost ~$19,000.
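The annual cost figure follows from the machine count and hourly rate on this slide. A minimal check, assuming the cluster runs continuously for a 365-day year (the slide does not state the hours used):

```python
MACHINES = 7                 # from the slide
HOURLY_RATE_USD = 0.311      # from the slide, per machine
HOURS_PER_YEAR = 24 * 365    # assumption: always on, non-leap year

def annual_cost(machines: int, hourly_rate: float, hours: int = HOURS_PER_YEAR) -> float:
    """Total yearly cost in USD for a cluster billed per machine-hour."""
    return machines * hourly_rate * hours

print(f"${annual_cost(MACHINES, HOURLY_RATE_USD):,.2f}")
```

Under that assumption the result is about $19,070, consistent with the slide's rounded ~$19,000.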
MemSQL PowerStream
197,000 wind turbines around the world
[Diagram labels: Sensors; Wind Turbine; Wind Farm]
MemSQL PowerStream Architecture
[Diagram labels: Data Producers (simulating sensor activity); Apache Spark; Streamliner; PowerStream User Interface]
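A data producer that simulates sensor activity could look roughly like the sketch below. The field names (`turbine_id`, `step`, `vibration`) and the Gaussian noise model are assumptions for illustration; the deck does not describe the actual producers or their message format.

```python
import random
from typing import Iterator, Sequence

def simulate_readings(turbine_ids: Sequence[str], rng: random.Random, steps: int) -> Iterator[dict]:
    """Yield one synthetic reading per turbine per time step."""
    for step in range(steps):
        for tid in turbine_ids:
            yield {
                "turbine_id": tid,
                "step": step,
                # Hypothetical signal: values scattered around 1.0.
                "vibration": rng.gauss(1.0, 0.1),
            }

# Example: two turbines, three time steps, seeded for repeatable runs.
readings = list(simulate_readings(["t1", "t2"], random.Random(42), steps=3))
```

Passing in a seeded `random.Random` keeps a simulation run reproducible, which helps when comparing pipeline behavior across runs.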
Thank You
www.memsql.com
