The State of the Data Warehouse in 2017 and Beyond
Presented by
Copyright (C) 2017 451 Research LLC
The Changing Analytic Environment
James Curtis, Senior Analyst, Data Platforms & Analytics
451 Research is a leading IT research & advisory company
Founded in 2000
300+ employees, including over 120 analysts
2,000+ clients: Technology & Service providers, corporate advisory, finance, professional services, and IT decision makers
50,000+ IT professionals, business users and consumers in our research community
Over 52 million data points published each quarter and 4,500+ reports published each year
3,000+ technology & service providers under coverage
451 Research and its sister company, Uptime Institute, are the two divisions of The 451 Group
Headquartered in New York City, with offices in London, Boston, San Francisco, Washington DC, Mexico, Costa Rica, Brazil, Spain, UAE, Russia, Taiwan, Singapore and Malaysia
Research & Data
Advisory
Events
Go 2 Market
A combination of research & data is delivered across fifteen
channels aligned to the prevailing topics and technologies of digital
infrastructure… from the datacenter core to the mobile edge.
Agenda
• Data Platforms & Analytics
• Some Trends
• The Evolving Data Warehouse
• The Evolution of Analytics
• Key Takeaways
Data Platforms & Analytics
§ Technologies to store, process and analyze data
§ Collect/analyze data to identify potential opportunities for improvement
§ Includes:
• Operational, analytic databases, Hadoop, data grid/cache, event/stream processing
• Data management/integration technologies to prepare data for analysis, and analytics tools
Some Trends
TREND: The lines will continue to blur between operational and analytical databases.
RECOMMENDATION:
• Avoid seeing transactional and analytic databases as two totally different systems
• Think about existing database admin and BI skills
TREND: Machine learning and deep learning will enter a new phase of strategic adoption for predictive analytics.
RECOMMENDATION:
• Don’t ignore the demand for predictive analytics using machine learning
• Balance user-friendliness against complexity
TREND: Stream processing adoption will accelerate as companies grapple with fast data.
RECOMMENDATION:
• Does your existing data processing and analytics infrastructure handle streaming data?
• Understand the business use case for streaming data
Source: 451 Research. 2016/2017 Trends in Data Platforms and Analytics. Oct 2015/2016.
A Growing Market
Source: 451 Research Market Monitor. Total Data: Platforms & Analytics. May 2017.
“He that will not apply new remedies must expect new evils.”
−Francis Bacon
Enterprise Data Warehouse: Common characteristics
[Diagram labels: Enterprise Applications, Data Warehouse, Decision Makers, Data Analysts, IT Pros]
What’s driving the change?
• Compute options
• Storage choices
• Organizational expectations
• Open source software
• Data, data, and more data
Adapt and Expand Our Field of Vision
[Diagram labels: Enterprise Applications, Data Warehouse, Decision Makers, Data Analysts, IT Pros]
Expanded Processing Choices
[Diagram labels: Enterprise Applications, Cloud Storage, Data Warehouse, Hadoop, Spark, AI+ML, Decision Makers, Data Analysts, IT Pros]
Leads to Expansion of Data Sources
[Diagram labels: Enterprise Applications, Cloud Storage, Mobile Apps, Bots, IoT Devices and Sensors, Social Media, Log and Clickstream Data, Data Warehouse, Hadoop, Spark, AI+ML, Decision Makers, Data Analysts, IT Pros]
Which leads to More Advanced Decision-Making Processes
[Diagram labels: Enterprise Applications, Cloud Storage, Mobile Apps, Bots, IoT Devices and Sensors, Social Media, Log and Clickstream Data, Data Warehouse, Hadoop, Spark, AI+ML, Decision Makers, Data Analysts, IT Pros, Business Users, Data-Driven Applications, Data Scientists, OT Users]
The evolution of analytics
Key takeaways
Thank you
james.curtis@451research.com
@jmscrts
www.451research.com
New Data Warehouse Architectures
Mike Boyarski
The requirements of the data warehouse have changed.
New Data Warehouse Requirements
Performance: Intraday results on fast-growing data
Usability: Easier to set up, tune, and scale
Optimization: Address scale and performance at a lower cost
Ecosystem: Drive operational and machine learning applications
The Modern Approach to Addressing New Requirements
A Real-Time Data Warehouse
Real-Time Data Warehouse Explained
§ Low latency between data generation and analysis
§ Micro batching or stream ingestion
§ Sub-second views on operational data and applications
§ Transaction processing for accelerated transformation
§ Extensibility functions for ML applications
§ Durable for operational readiness
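The "micro batching or stream ingestion" point above can be sketched in a few lines of Python. This is a minimal illustration only: the function name micro_batches and the size-based flush rule are assumptions made for this example, not something taken from the presentation or from any product API.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def micro_batches(events: Iterable[T], max_size: int) -> Iterator[List[T]]:
    """Group an event stream into lists of at most max_size items."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    batch: List[T] = []
    for event in events:
        batch.append(event)
        if len(batch) == max_size:
            # Batch is full: hand it to the loader and start a new one.
            yield batch
            batch = []
    if batch:
        # Flush whatever is left when the stream ends.
        yield batch


if __name__ == "__main__":
    for group in micro_batches(range(7), max_size=3):
        print(group)
```

Production systems usually also flush on a timer so that a slow stream does not hold rows back; that is left out here to keep the sketch short.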
MemSQL: A Real-Time Data Warehouse
Streaming Data Ingest: Easy-to-set-up real-time data pipelines with exactly-once semantics
Live Data: Memory-optimized tables for analyzing real-time events
Historical Data: Disk-optimized tables with up to 10x compression and vectorized queries for fast analytics
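One common way to get the effect of "exactly-once semantics" during ingest is to store the last applied message offset together with the rows, and to skip anything at or below that offset when a batch is replayed. The Python sketch below shows that generic pattern. The class ExactlyOnceSink and its methods are hypothetical names for illustration, and nothing here describes how MemSQL pipelines are actually implemented.

```python
from typing import Dict, Iterable, List, Tuple


class ExactlyOnceSink:
    """Toy in-memory sink that tracks the highest applied offset per partition."""

    def __init__(self) -> None:
        self.rows: List[str] = []
        self.last_offset: Dict[int, int] = {}

    def apply_batch(self, partition: int, batch: Iterable[Tuple[int, str]]) -> int:
        """Apply (offset, value) pairs in ascending offset order.

        Offsets that were already applied are skipped, so a replayed batch
        does not create duplicate rows. Returns the number of rows written.
        """
        committed = self.last_offset.get(partition, -1)
        new_rows: List[str] = []
        for offset, value in batch:
            if offset > committed:
                new_rows.append(value)
                committed = offset
        # Assumption: a real database would write the rows and the new
        # offset in a single transaction so the two can never disagree.
        self.rows.extend(new_rows)
        self.last_offset[partition] = committed
        return len(new_rows)


if __name__ == "__main__":
    sink = ExactlyOnceSink()
    sink.apply_batch(0, [(0, "a"), (1, "b")])
    sink.apply_batch(0, [(1, "b"), (2, "c")])  # offset 1 is a replay
    print(sink.rows)
```

The design choice worth noting is that deduplication relies on the source providing stable, ordered offsets, as a log-based message queue does.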
Real-Time Data Warehouse Ecosystem
[Diagram labels: Streaming Ingest (Real-Time Data Pipelines), Live Data (Memory Optimized Tables), Historical Data (Disk Optimized Tables); Real-Time Data, Messaging and Transforms, Historical Data; Real-Time Application Analytics, Business Intelligence Dashboards; Kafka, Spark, Relational, Hadoop, Amazon S3]
Deployment: Bare Metal, Virtual Machines, Containers; On-Premises, Cloud, As a Service
Common Real-Time Data Warehouse Architectures
Data Lake Acceleration
[Diagram labels: New Data, Input, Transformation, Structured, Real-Time Data Platform, Reference Store, Application]
Data Lake Acceleration with Spark
[Diagram labels: New Data, Spark, MemSQL Spark Connector, Real-Time Data Platform, Reference Store, Application]
Data Lake Acceleration with Kafka
[Diagram labels: New Data, Exactly-once Semantics, Real-Time Data Platform, Reference Store, Application]
Legacy Data Warehouse Acceleration
[Diagram labels: New Data, Stream, Batch, Real-Time Data Platform, Live Data Analysis, Legacy Data Warehouse, Historical Data Analysis, Application]
Operationalizing Machine Learning Applications
[Diagram labels: New Data, Real-Time Scoring, High Speed ML in SQL, Reference Store, Real-Time Application]
http://blog.memsql.com/image-recognition-at-the-speed-of-memory-bandwidth/
§ Dot product implementation for image recognition
§ Similarity search
§ Massive performance improvement
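To make the dot-product and similarity-search bullets concrete, here is a small pure-Python sketch. It assumes each image has already been reduced to a fixed-length feature vector; the names dot and top_matches and the sample vectors are hypothetical, and the code does not reproduce the in-database implementation described in the linked post.

```python
from typing import Dict, List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def top_matches(
    query: Sequence[float],
    vectors: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the k best."""
    scored = [(name, dot(query, vec)) for name, vec in vectors.items()]
    # Higher dot product means more similar when vectors are normalized.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Made-up unit-length feature vectors, for illustration only.
    stored = {"cat": [1.0, 0.0], "dog": [0.6, 0.8], "car": [0.0, 1.0]}
    print(top_matches([1.0, 0.0], stored, k=2))
```

With unit-length vectors the dot product equals cosine similarity, which is why a single multiply-and-sum pass per row is enough to rank candidates.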
Companies are changing how they interact with their data and customers.
Real-time data with high concurrency tracking millions of cars,
drivers, and riders to optimize fleet operations
Uber Real-Time Analytics Architecture
BUSINESS BENEFITS
• Real-time data with massive concurrency across millions of drivers, riders, and
employees accessing the database concurrently
• Enables real-time indicators to understand operating performance
• Geospatial indexing for live location-based analysis
• Company-wide dashboard for global trends
TECHNICAL BENEFITS
• Analyze millions of rows/second
• Analyze historical and live data simultaneously
• Massive concurrency: Hundreds of users query reporting databases
Real-time analytics transformed profitability analysis of customer logistics
from weekly to daily, and reduced latency from days to minutes
BUSINESS BENEFITS
• Real-time analytics transformed profitability analysis of customer logistics data
• Reduced data latency from hours to minutes giving business users access to the
most recent data
TECHNICAL BENEFITS
• Reduced a 22-hour ETL to minutes
• Improved query response time by 80x over MySQL
Thank You
Questions?
