Key Capabilities for
Real-Time Analytics
Brian Bulkowski
CTO
Today’s Discussion
We’re awash in real-time data
Real-time data, combined with
historical data, provides the most
context for decision making
Building data pipelines with fewer
systems and steps leads to
greater scalability and reliability
CONFIDENTIAL
A Real-Time World
Real-Time Reality
Everything is trackable
Everything is shareable,
often inadvertently
Consumer expectations
demand real-time
Real-Time Reality of Yesterday’s Data Systems
No ability to easily capture real-time
feeds
Too many disparate silos
Poor data cleanliness
Difficult data access (tooling, obscure
languages)
Unpredictable performance and
resource consumption
Real-Time Needs
Ingest on-the-fly data
• Natively from apps, Kafka/Spark, ETL tools, high speed loaders
Write groundbreaking analytic applications
• Custom dashboards, reporting
Deliver massive capacity
• With minimal node count
Guarantee performance
• Across thousands of users with reserved resources
Provide universal accessibility with ANSI SQL
Incorporating History
Real-Time Is Only Part of the Picture
An important moment,
always fleeting
Challenging to incorporate
context
A small view of the stream
compared to the broad view
over time
Incorporating Historical Data for Context
Business value lies in the right amount of history
• Hospitality
• Measure across annual visits
• Consumer goods
• Seasonal analytics
Both examples benefit from being able to incorporate
real-time data
• Real-time offers to hospitality guests
• More efficient inventory management
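As a rough illustration of the hospitality case above, the sketch below joins a real-time check-in event with a year of visit history to choose an offer. It is only a sketch under stated assumptions: the `visits` table, the `offer_for` function, the offer names, and the three-visit threshold are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3
from datetime import date, timedelta

def build_history(conn):
    # Hypothetical historical table: one row per past guest visit.
    conn.execute("CREATE TABLE visits (guest_id INTEGER, visit_date TEXT)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", [
        (7, "2018-06-10"), (7, "2018-11-02"), (7, "2019-02-14"),
        (8, "2017-01-05"),
    ])

def offer_for(conn, event, min_visits=3):
    # Look back 365 days from the real-time event to add historical context.
    cutoff = (date.fromisoformat(event["date"]) - timedelta(days=365)).isoformat()
    (visits,) = conn.execute(
        "SELECT COUNT(*) FROM visits WHERE guest_id = ? AND visit_date >= ?",
        (event["guest_id"], cutoff),
    ).fetchone()
    # Assumed business rule: frequent guests get the upgrade offer.
    return "loyalty_upgrade" if visits >= min_visits else "standard_welcome"

conn = sqlite3.connect(":memory:")
build_history(conn)
print(offer_for(conn, {"guest_id": 7, "date": "2019-04-01"}))
print(offer_for(conn, {"guest_id": 8, "date": "2019-04-01"}))
```

The event alone says only that a guest arrived; the count over the past year is what turns it into a decision.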
Building A Real-Time Future
Identifying The Right Capabilities
Ingest and data loading
• Direct from apps, Kafka/Spark, Change Data Capture from OLTP systems,
ETL, YB Load
Data store scale and expansion
• Capacity, number of concurrent users, mixed workloads
Data accessibility
• Interactive applications, Ad Hoc SQL, Business critical reporting
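One common way to ingest directly from an application is to buffer incoming events and write them in batches rather than row by row. The following is a minimal sketch of that idea, not the YB Load tool or any vendor API: `BatchLoader`, the `events` table, and the batch size are hypothetical, and SQLite stands in for the target database.

```python
import sqlite3

class BatchLoader:
    """Hypothetical helper that buffers rows and inserts them in batches."""

    def __init__(self, conn, batch_size=1000):
        self.conn = conn
        self.batch_size = batch_size
        self.buffer = []
        self.flushes = 0

    def add(self, row):
        # Collect rows until a full batch is available.
        self.buffer.append(row)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        # One transaction per batch keeps per-row overhead low.
        with self.conn:
            self.conn.executemany(
                "INSERT INTO events VALUES (?, ?, ?)", self.buffer
            )
        self.buffer.clear()
        self.flushes += 1

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts INTEGER, source TEXT, value REAL)")
loader = BatchLoader(conn, batch_size=1000)
for i in range(2500):
    loader.add((i, "sensor-1", i * 0.5))
loader.flush()  # write the final partial batch
total = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(total, loader.flushes)
```

The batch size trades latency for throughput: smaller batches make rows visible sooner, larger ones amortize more overhead per write.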
Evolution of data pipeline architectures
Enterprise Data Warehouse model
• Consolidate one or multiple application data sets
into a data warehouse
Desire to capture all Internet data
led to adoption of a data lake
• However, MapReduce was challenging
SQL-as-a-Layer provides some relief
• But SQL on a file system IS NOT
a data warehouse
Further evolution of data pipelines
[Diagram: a data lake serves data science; high-value data moves to the EDW, which serves a large number of enterprise analytics users]
Modern architecture for real-time analytics
[Diagram: incoming structured and semi-structured data goes to the Enterprise Data Warehouse (1000s of users: BI analysts, data engineers); unstructured data goes to the Data Lake (data science); high-value data moves to the EDW]
Real-Time Architecture Data Warehouse Attributes
Real-time Feeds
• Ingest IoT or OLTP data
• Capture 100,000s of rows per second
Interactive Applications
• Serve short queries in under 100 milliseconds
Periodic Bulk Loads
• Capture terabytes of data, petabytes over time
Powerful Analytics
• Respond to complex BI queries in just a few seconds
Load and Transform
• Use existing ETL tools including intensive push-down ELT
Business Critical Reporting
• Workload management for prioritized responses
PostgreSQL compatible
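The "workload management for prioritized responses" attribute can be pictured as a priority queue in which business-critical work is always dispatched first. The sketch below is a generic illustration of that idea and makes no claim about how Yellowbrick implements it: `WorkloadQueue`, the priority numbers, and the query names are all hypothetical.

```python
import heapq
import itertools

class WorkloadQueue:
    """Hypothetical scheduler: a lower priority number is served first."""

    def __init__(self):
        self._heap = []
        # The counter breaks ties so equal priorities run in arrival order.
        self._counter = itertools.count()

    def submit(self, priority, name):
        heapq.heappush(self._heap, (priority, next(self._counter), name))

    def next_query(self):
        _priority, _seq, name = heapq.heappop(self._heap)
        return name

q = WorkloadQueue()
q.submit(2, "ad hoc exploration")
q.submit(0, "business-critical report")
q.submit(1, "dashboard refresh")
q.submit(0, "regulatory report")
order = [q.next_query() for _ in range(4)]
print(order)
```

A production system would also reserve resources per workload class, which a simple queue like this does not model.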
The Yellowbrick Data Warehouse
MPP scale-out architecture
Start small
Grow compute
and storage
MODULAR PURPOSE-BUILT APPLIANCE
ALL FLASH DATA WAREHOUSE
Capacity from tens of terabytes
to petabytes
Yellowbrick deployments across hybrid cloud
Yellowbrick Data Warehouse
Enabling analytics anywhere
Today
• On-premises data centers
• Private cloud
• Colocation
• Edge
2019
• Cloud
The Yellowbrick Impact: 6 full racks > 1 appliance (6 rack units)
3x-100x performance improvement
Real-World Use Cases
Risk analytics
• Fraud detection for e-commerce
Consumer financing
• Tracking loyalty points and
impact on balance sheet
Hospitality
• Real-time offers
THANK YOU
yellowbrick.com
SEEING IS BELIEVING
Common Event Streams
Sources
Business Applications
• Customer orders
• Airline reservations
• Insurance claims
• Bank transactions
• Telco CDRs
Digital Information
• Clickstreams
• Social computing
• Customer call logs
• News, weather feeds
• IT, network logs
• Market data
• Email
Internet of Things
• RFID
• Telemetry SCADA
• Geolocation
• Machine logs
Ideal for real-time applications and analytics
Getting ready for real-time analytics
Business Applications
• OLTP databases
Enterprise Digital Information
• Available via existing ETL procedures
Big data clickstreams, IoT, machine logs
Consolidate multiple data integration patterns into fewer systems
Gartner on Data Integration Styles
Real-time analytics popularity
dwarfs its practice
Ideal solutions will handle
multiple ingestion methods
For many workflows, the
further “up the stream” you
can grab the data, the better
Source: Gartner

Yellowbrick Webcast with DBTA for Real-Time Analytics


Editor's Notes

  • #8: https://twitter.com/jer_s/status/1113667343480045569 Jeremy Schneider (@jer_s), retweeting PostgreSQL: “The relational model was invented to make it easier to build good apps. When people consider non-relational data stores they sometimes overlook the benefits of a relational approach. Platforms with things like consistency & transactions make better applications with simpler code.”