1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved1 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Scott Gnau, CTO
@Scott_Gnau
The Next Gen EDW is the Big Data Warehouse
 In Forrester’s 2016 global survey, 59% of respondents stated that leveraging big data
and analytics was a critical or high priority.
Companies Are Looking to Big Data for EDW Optimization
 82% of 2550+ respondents are looking to Big Data for EDW Optimization rather than a
straight replacement. – 2016 Big Data Maturity Survey
Hortonworks Connected Data Platforms and Solutions
 Hortonworks Solutions
– Enterprise Data Warehouse Optimization
– Cyber Security and Threat Management
– Internet of Things and Streaming Analytics
 Hortonworks Connection
– Subscription Support
– SmartSense
– Premier Support
– Educational Services
– Professional Services
– Community Connection
 Cloud
– Hortonworks Data Cloud
– AWS
– HDInsight
 Data Center
– Hortonworks Data Suite
– HDF
– HDP
Drivers of a Modern BI Infrastructure
Deeper and Broader Data Sets
Complete Data ‘Provenance’
Leading Analytics and Tools
Integrate non-EDW data and EDW data
Total Cost of Ownership
Open Source Transformational Impact to EDW
Unmatched Economics: supports low-cost data-center and cloud architectures for Enterprise Apache Hadoop
Eliminates Risk and Ensures Integration: prevents vendor lock-in and speeds ecosystem adoption of ODPi-compliant core
[Chart: axes labeled Cost Efficiency and Data Variety; plotted items: RDBMS, EDW, Proprietary Hadoop, Hortonworks Open Source]
But, why aren’t more companies running to this solution?
Risky
Hadoop requires a bunch of new skill sets
It’ll take a long time
There’s too much manual coding required
It’s hard to integrate to my BI tool stack
Legacy EDW vs. EDW Optimization Solution with Connected Data Platforms
EDW Optimization: Fast BI on Hadoop
 The Problem:
– Legacy EDW systems were adopted for Fast BI
and deep slice-and-dice analytics, but EDW
costs can limit breadth and depth of these
analytics.
 The Solution:
– Interactive SQL is a reality on Hadoop today.
– AtScale Intelligence Platform adds OLAP
capabilities for deep drilldown at scale.
 The Result:
– Query terabytes of data in seconds.
– Connect your favorite BI tools like Tableau and
Excel through SQL and MDX interfaces.
– The EDW Optimization Solution is tailor-made
to deliver Fast BI on Hadoop.
[Diagram: EDW Optimization Solution. Labels: ETL/ELT, data mart, data landing & deep archive, cube mart, end user applications, end users and apps]
EDW Optimization: ETL Offload
 The Problem:
– EDWs can consume between 50% and 90% of
resources just on ETL/ELT tasks.
– These jobs interfere with more business-critical tasks like BI and advanced analytics.
 The Solution:
– Hive and HDP deliver ETL that scales to
petabytes.
– Syncsort DMX-h for simple drag-and-drop ETL
workflows.
– Economical scale-out processing on
commodity servers.
 The Result:
– Better SLAs for mission-critical analytics.
– Limit EDW expansion or retire old systems.
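The capacity argument on this slide can be illustrated with a little arithmetic. The sketch below is only a rough model: the 50% and 90% figures come from the slide, while the function name, the `offload_fraction` parameter, and the assumption that capacity is freed in direct proportion to the work moved are all hypothetical.

```python
def free_capacity_after_offload(etl_share: float, offload_fraction: float) -> float:
    """Return the fraction of EDW capacity left for non-ETL work.

    etl_share: fraction of EDW resources currently spent on ETL/ELT (0..1).
    offload_fraction: fraction of that ETL/ELT work moved off the EDW (0..1).
    Assumes, simplistically, that capacity is freed in direct proportion.
    """
    if not 0.0 <= etl_share <= 1.0 or not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("both arguments must be between 0 and 1")
    remaining_etl = etl_share * (1.0 - offload_fraction)
    return 1.0 - remaining_etl


if __name__ == "__main__":
    # The two ends of the range quoted on the slide.
    for share in (0.5, 0.9):
        before = free_capacity_after_offload(share, 0.0)
        after = free_capacity_after_offload(share, 1.0)
        print(f"ETL share {share:.0%}: {before:.0%} free before, {after:.0%} free after full offload")
```

Under this model, an EDW that spends 90% of its resources on ETL/ELT has roughly 10% left for BI before any offload, which is why even a partial offload changes the SLA picture noticeably.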
EDW Optimization: Active Archive
 The Problem:
– Increasing data volumes and cost pressure
force data to be archived to tape.
– Archived data not available for analytics, or
must be retrieved at great expense.
 The Solution:
– Adopting Hadoop delivers cost per terabyte
on par with tape backup solutions.
– Data in Hadoop can be analyzed by all major
BI tools, allowing analytics on archive data.
 The Result:
– Data always available for analytics.
– Store years of data rather than months.
Multi-Channel Behavioral Analysis
 Industry: Mass Media
– Largest broadcasting and cable company
in the world by revenue
– Multiple channels: cable (set-top box), wireless devices, streaming programming
– 22 million+ subscribers (internet & video)
 Results:
– Scalability: 480B rows, 500 nodes
– 60x query performance improvement
– Insights: New information improves negotiations
– Loyalty: Outreach to customers viewing competitive streams; ▼ churn, ▲ revenue
[Diagram: before/after architecture, Leading Media Company. Labels: Channel Feeds; Hortonworks HDP; Netezza Data Mart; AtScale Intelligence Server; Tableau + MS Excel; Tableau + MS Excel + R]
Campaign Paid-Search Effectiveness: Retail
 Industry: Retail / eCommerce
– Top US department store (by rev)
– Online sales $4B+ & growing (11%+ total)
– 800+ department stores nationwide
 Results
– Scale: Millions of paid keywords analyzed
– Speed: Eliminate extract step
– Insight: Operationalized closed-loop analysis → insight → decision → action
– Impact: Make and save $ millions with instant bid decisions over the 6-week season that drives 60% of annual revenue
[Diagram: before/after architecture, Leading Retailer. Labels: Ad & Paid Keywords; Hortonworks HDP; Vertica Data Marts; AtScale Intelligence Server; Cognos + Tableau + Excel; Tableau + Excel]
Client and Patient Analysis
 Industry: Managed Health Care
– Member of Fortune 100
– Health, life + other insurance products
– ~ 52 million members;
medical/dental/pharm
 Results
– Scalable: BI directly on 264+ nodes data
– Time: Eliminate data movement step
– 62x query performance improvement
– Speed: <2.2 second average query time
– Insight: Tableau on Hadoop for 1000+
– Security: Access control by user; HIPAA
[Diagram: before/after architecture, Leading Managed Healthcare Provider. Labels: Client / Patient Details; Hortonworks HDP; Netezza Data Mart; AtScale Intelligence Server; Tableau + MS Excel]
Solution Architecture
[Diagram components: Inbound; EDW/Legacy; DMX Data Funnel; DMX-h Engine; AtScale virtual cube]
HORTONWORKS DATA PLATFORM (HDP)
– HDFS (base data and aggregates stored in ORC)
– Hive (batch and interactive SQL)
– Multitenant processing: YARN (Syncsort, LLAP, Spark, Tez)
High Level Flow
1. Install HDP, Syncsort and AtScale
2. Install EDW/Hive drivers on the edge node
3. Bring all tables involved in the use case into Hive using the Syncsort data funnel
4. Build a virtual cube using AtScale
5. Build aggregates in AtScale for optimization
6. Query data using a BI tool like Tableau/Excel through an ODBC/JDBC connection
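The architecture stores base data and aggregates in ORC, and the flow lands source tables in Hive. As a minimal sketch of what a landing table definition might look like, the helper below generates a HiveQL `CREATE TABLE ... STORED AS ORC` statement. The function name, the `sales_fact` table, and its columns are hypothetical and not taken from the deck; in the packaged solution the Syncsort data funnel would normally handle table creation.

```python
from typing import Mapping


def orc_table_ddl(table: str, columns: Mapping[str, str]) -> str:
    """Build a HiveQL CREATE TABLE statement for an ORC-backed table.

    `table` and `columns` are caller-supplied; this sketch does no
    identifier validation, so only pass trusted names.
    """
    if not columns:
        raise ValueError("at least one column is required")
    column_lines = ",\n".join(
        f"  `{name}` {hive_type}" for name, hive_type in columns.items()
    )
    return (
        f"CREATE TABLE IF NOT EXISTS {table} (\n"
        f"{column_lines}\n"
        ")\n"
        "STORED AS ORC"
    )


if __name__ == "__main__":
    # Hypothetical fact table; names and types are illustrative only.
    print(
        orc_table_ddl(
            "sales_fact",
            {"order_id": "BIGINT", "store_id": "INT", "amount": "DECIMAL(12,2)"},
        )
    )
```

The generated statement can be run from any Hive client; keeping the DDL in code makes it easier to recreate the same layout for each table brought over from the legacy EDW.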
Hortonworks EDW Optimization Solution Components
 Source Data Systems
 Syncsort: High-Performance Data Movement
– High performance data import from all major EDW platforms
 Hadoop: Scalable Storage and Compute
 Hive LLAP: High Performance SQL Data Mart
– Fast, scalable SQL analytics
– Intelligent in-memory caching
 AtScale Intelligence Platform: OLAP Cubes for Higher Performance
– Define OLAP cubes for 10x faster queries
– Unified semantic layer for all BI tools
 Pre-aggregated data ... or, full-fidelity data
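The contrast between pre-aggregated and full-fidelity data can be sketched in a few lines. This is an illustration of the general idea of answering a query from a small aggregate rather than rescanning every fact row; it is not AtScale's implementation, and the `(region, product, amount)` row layout and function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Row = Tuple[str, str, float]  # (region, product, amount): hypothetical fact columns
Aggregate = Dict[Tuple[str, str], float]


def build_aggregate(rows: Iterable[Row]) -> Aggregate:
    """Pre-aggregate fact rows into totals keyed by (region, product)."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for region, product, amount in rows:
        totals[(region, product)] += amount
    return dict(totals)


def total_by_region(aggregate: Aggregate, region: str) -> float:
    """Answer a slice query from the aggregate instead of the raw facts."""
    return sum(value for (r, _), value in aggregate.items() if r == region)


# Full-fidelity data: one row per transaction.
FACTS = [
    ("east", "tv", 100.0),
    ("east", "tv", 50.0),
    ("east", "radio", 25.0),
    ("west", "tv", 70.0),
]

if __name__ == "__main__":
    aggregate = build_aggregate(FACTS)
    print(len(FACTS), "fact rows reduced to", len(aggregate), "aggregate rows")
    print("east total:", total_by_region(aggregate, "east"))
```

The aggregate is smaller than the fact table, so queries at or above its grain touch fewer rows; queries below that grain still need the full-fidelity data.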
ETL Workflow Onboarding: SyncSort DMX-h
Hybrid Query Service
❑ Choice of BI Tool
❑ Zero Client Install
❑ Secure Data Access
❑ Optimized Queries
Enterprise Data Optimization Solution Components
 Hortonworks: 24 nodes of Enterprise Plus Support
 Syncsort: 24 nodes of DMX-h
 AtScale: 24 nodes of AtScale Intelligence Platform
 Single legacy data source
 1 fact table with 5 dimensions
 Load up to 15 tables
 One-time data dump
 Up to 1 cube with 10 measures
 1 BI connection
 5 TB total cube limit
12-month license and support offering
Pre-packaged Professional Services
Future Proof
 Hive Optimizations
– Hive, Tez, ORC, LLAP
– Additional SQL coverage
 ACID MERGE for SQL:2011 compliance (upsert)
 Business Continuity Options
– Replication
– Backup/Restore
 Additional Hive options tech preview in 2.6
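To make the upsert semantics of MERGE concrete, here is a small sketch. The HiveQL string shows the general shape of a Hive `MERGE` statement, with hypothetical table and column names (`customers`, `customer_updates`, `id`, `name`); in Hive the target generally has to be a transactional (ACID) table. The Python function only mimics the matched/not-matched behavior on in-memory dictionaries so the result can be checked without a cluster.

```python
from typing import Dict

# Shape of a Hive MERGE statement; table and column names are hypothetical.
MERGE_HIVEQL = """
MERGE INTO customers AS t
USING customer_updates AS s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET name = s.name
WHEN NOT MATCHED THEN INSERT VALUES (s.id, s.name)
""".strip()


def merge_upsert(target: Dict[int, str], source: Dict[int, str]) -> Dict[int, str]:
    """Return a copy of `target` with `source` rows merged in by key.

    Keys present in both are updated (WHEN MATCHED); keys present only
    in `source` are inserted (WHEN NOT MATCHED). `target` is not mutated.
    """
    merged = dict(target)
    for key, value in source.items():
        merged[key] = value
    return merged


if __name__ == "__main__":
    customers = {1: "Ann", 2: "Bob"}
    updates = {2: "Robert", 3: "Cy"}
    print(merge_upsert(customers, updates))
    print(MERGE_HIVEQL)
```

A single statement of this kind replaces the separate update and insert passes that are otherwise needed to apply a batch of changes from the legacy system.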
EDW Package: Professional Services ‘Proof of Value’
1. Install HDP, AtScale and Syncsort
2. Configure drivers for appropriate EDW and Hive on Edge Node
3. Enable and configure Interactive Hive (LLAP)
4. Ingest data from 1 legacy system
5. Create up to 3 BI cubes
6. Support connection to BI Tool
7. Demo of capabilities (functionality and performance), with under 10 second response time.
8. Solution Architecture Document and Schema definition
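The 10 second response-time target in the demo step is easy to check mechanically. The sketch below times an arbitrary callable and compares the elapsed time to a threshold; the function names are hypothetical, and the lambda stands in for a real BI query, which would go through an ODBC/JDBC connection not shown here.

```python
import time
from typing import Any, Callable, Tuple


def timed(query: Callable[[], Any]) -> Tuple[Any, float]:
    """Run `query` and return its result with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = query()
    return result, time.perf_counter() - start


def meets_target(elapsed_seconds: float, target_seconds: float = 10.0) -> bool:
    """True when the elapsed time is strictly under the target."""
    return elapsed_seconds < target_seconds


if __name__ == "__main__":
    # Stand-in workload; replace with a call that runs the demo query.
    result, elapsed = timed(lambda: sum(range(1_000_000)))
    print(f"result={result}, elapsed={elapsed:.3f}s, under target: {meets_target(elapsed)}")
```

Running the same check over a fixed set of representative queries gives a repeatable record for the proof-of-value demo.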
EDW Optimization Solution - Try It Now!
Tool-based approach means we can leverage existing skillsets
Proof points in 60 days
Integrated into my BI tool stack
Hive supports scaled queries and fast queries
It works!
To Learn More
 Everyone will receive a free copy of the Forrester white paper titled “The Next-Generation EDW Is The Big Data Warehouse”
 EDW Optimization with HDP
– http://hortonworks.com/solutions/edw-optimization/
– EDW Optimization 7 min video
 AtScale Intelligence Platform
– http://hortonworks.com/partner/atscale/
 Syncsort DMX-h
– http://hortonworks.com/partner/syncsort/
Thank You


Edw Optimization Solution
