1 © Hortonworks Inc. 2011–2018. All rights reserved
Fast SQL for Big Data
Apache Hive and Apache Druid
Alan Gates
Hortonworks Co-founder, Apache Hive PMC member
@alanfgates
7000 analysts, 80ms average latency, 1PB data.
250k BI queries per hour
On-demand deep reporting in the cloud over 100TB in minutes.
Benchmark journey
• Ran all 99 TPC-DS queries
• Total query runtime has improved multifold in each release!
TPC-DS at 10TB scale on a 10-node cluster
[Chart: HDP 2.5 Hive1, HDP 2.5 LLAP, HDP 2.6 LLAP, HDP 3.0 LLAP; timeline 2016, 2017, 2018; speedup labels 25x, 3x, 2x; annotation "ACID tables"]
Hive LLAP v Presto v Spark SQL
• TPC-DS, scale 3TB, all 99 queries, not run by Hortonworks (nor at our request)
• Total time to run (seconds):
LLAP: 5,517 Presto: 12,948 Spark SQL: 26,247
• LLAP faster than Presto on 83/97 queries, Spark SQL on 92/96 queries
• More details:
• Hive 3.1 LLAP, Presto 0.208e, Spark 2.3.1
• 19 worker nodes, 84G each
• Source mr3.postech.ac.kr/blog/2018/10/30/performance-evaluation-0.4/
Hive LLAP - MPP Performance at Hadoop Scale
[Architecture diagram: SQL queries arrive over ODBC / JDBC at HiveServer2 (query endpoint); query coordinators; a Hadoop cluster of LLAP daemons, each running query executors; an in-memory cache shared across all users; deep storage on HDFS and compatible stores (S3, WASB, Isilon)]
Aggressive Caching
Caching in LLAP
• Fine grained (by row group and column) and compact (dictionary encoding, RLE)
• Important in environments with PBs of data where common queries touch only 100s of GB
• Prioritized – indexes cached with higher priority
• Off heap to avoid GC
• Supports spill to SSD – important in the cloud
• Uses LRFU replacement algorithm to avoid large scans purging the cache
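To make the LRFU point concrete, here is a minimal Python sketch of an LRFU-style cache. It is an illustration of the general algorithm, not LLAP's implementation: the class name, the `lam` decay parameter and the linear-time victim search are all assumptions made for brevity. Each entry carries a combined recency and frequency (CRF) score that decays with age, so a block that was hit many times outranks blocks touched once by a large scan.

```python
class LRFUCache:
    """Toy LRFU cache (hypothetical names; not LLAP's actual code)."""

    def __init__(self, capacity, lam=0.01):
        self.capacity = capacity
        self.lam = lam          # 0 behaves like LFU, 1 behaves like LRU
        self.clock = 0          # logical time, one tick per access
        self.entries = {}       # key -> (crf score, tick of last access)

    def _decay(self, age):
        # Weight of a reference that happened `age` ticks ago.
        return 0.5 ** (self.lam * age)

    def _score(self, key):
        crf, last = self.entries[key]
        return crf * self._decay(self.clock - last)

    def access(self, key):
        """Record an access; return True on a hit, False on a miss."""
        self.clock += 1
        if key in self.entries:
            crf, last = self.entries[key]
            # New reference counts 1.0, older references are decayed.
            self.entries[key] = (1.0 + crf * self._decay(self.clock - last), self.clock)
            return True
        if len(self.entries) >= self.capacity:
            # Evict the entry with the lowest decayed score (O(n) for simplicity).
            victim = min(self.entries, key=self._score)
            del self.entries[victim]
        self.entries[key] = (1.0, self.clock)
        return False


cache = LRFUCache(capacity=3)
for _ in range(10):
    cache.access("hot")            # a frequently used block
for i in range(20):
    cache.access(f"scan{i}")       # a large one-off scan
hot_survived = "hot" in cache.entries
```

With plain LRU the scan would have pushed `hot` out after three misses; here the scan entries only evict each other.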
Query result cache
• Returns results directly from storage (e.g. HDFS) without actually executing the query, if the same query has run before and the underlying data has not changed
• Important for dashboards, reports etc. where repetitive queries are common
• Uses transactions to determine when underlying data has changed
[Figure: query execution without cache vs. with cache]
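The validity rule for the query result cache can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the `QueryResultCache` class, the per-table version counters standing in for transaction state, and the exact-text query key are all hypothetical and are not Hive's internals.

```python
class QueryResultCache:
    """Toy result cache: an entry is reused only while its tables are unchanged."""

    def __init__(self):
        # query text -> (table versions seen when the result was computed, rows)
        self._entries = {}

    def get_or_run(self, query, tables, table_versions, run):
        """Return (rows, served_from_cache)."""
        snapshot = tuple((t, table_versions[t]) for t in sorted(tables))
        cached = self._entries.get(query)
        if cached is not None and cached[0] == snapshot:
            return cached[1], True          # same query, same data: skip execution
        rows = run()                        # miss or stale entry: execute the query
        self._entries[query] = (snapshot, rows)
        return rows, False


executions = []

def run_query():
    executions.append(1)                    # count real executions
    return [(42.0,)]

versions = {"lineorder": 7}                 # stand-in for transactional state
cache = QueryResultCache()
query = "SELECT sum(d_price) FROM lineorder"
first = cache.get_or_run(query, ["lineorder"], versions, run_query)
second = cache.get_or_run(query, ["lineorder"], versions, run_query)
versions["lineorder"] = 8                   # a committed write changes the table
third = cache.get_or_run(query, ["lineorder"], versions, run_query)
```

Only the first and third calls execute the query; the second is answered from the cache.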
Metastore Cache
• With query execution time being < 1 sec, compilation time starts to dominate
• Metadata retrieval is often a significant part of compilation time. Most of it is spent in RDBMS queries.
• Cloud RDBMS as a service is often slower, and frequent queries lead to throttling.
• Metadata cache speeds up compilation by around 50% with on-prem MySQL. Significantly more improvement with cloud RDBMS.
• Cache is consistent in a single-metastore setup, eventually consistent in an HA setup. Consistent HA setup support is in the works.
New in Hive 3:
Materialized Views
Possible workflow
1. DBA or recommendation system creates a materialized view using Hive tables
  • Stored by Hive or Druid
CREATE MATERIALIZED VIEW `ssb_mv`
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
ENABLE REWRITE
AS
<query>;
2. User or dashboard (BI tools) sends queries to Hive
  • Hive rewrites queries using available materialized views
  • Execute rewritten query
Materialized view-based rewriting example
• Materialized view definition
CREATE MATERIALIZED VIEW mv AS
SELECT <dims>,
lo_revenue,
lo_extendedprice * lo_discount AS d_price,
lo_revenue - lo_supplycost
FROM
customer, dates, lineorder, part, supplier
WHERE
lo_orderdate = d_datekey
and lo_partkey = p_partkey
and lo_suppkey = s_suppkey
and lo_custkey = c_custkey;
• Query
SELECT sum(lo_extendedprice * lo_discount)
FROM
lineorder, dates
WHERE
lo_orderdate = d_datekey
and d_year = 2013
and lo_discount between 1 and 3;
• Materialized view-based rewriting
SELECT SUM(d_price)
FROM mv
WHERE
d_year = 2013
and lo_discount between 1 and 3;
[Diagram: lineorder joined with customer, dates, part and supplier]
mv contents
d_year  lo_discount  <dims>  d_price
2013    2            ...     7.55
2014    4            ...     432.60
2013    2            ...     34.45
2012    2            ...     2.05
…       …            ...     …
Query results
sum
42.0
…
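The equivalence behind the rewrite can be checked on a toy dataset. The sketch below uses Python's built-in `sqlite3` as a stand-in for Hive, so a plain `CREATE TABLE ... AS` plays the role of the materialized view, and the rewrite is applied by hand rather than by an optimizer. The schema is cut down to `lineorder` and `dates`, and all row values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dates (d_datekey INTEGER PRIMARY KEY, d_year INTEGER);
CREATE TABLE lineorder (lo_orderdate INTEGER, lo_extendedprice REAL, lo_discount INTEGER);
INSERT INTO dates VALUES (1, 2012), (2, 2013), (3, 2014);
INSERT INTO lineorder VALUES (2, 10.0, 2), (2, 5.0, 3), (3, 100.0, 4), (1, 1.0, 2), (2, 7.0, 5);

-- Stand-in for the materialized view: the join is precomputed once.
CREATE TABLE mv AS
SELECT d_year, lo_discount, lo_extendedprice * lo_discount AS d_price
FROM lineorder, dates
WHERE lo_orderdate = d_datekey;
""")

# Original query: joins the base tables at query time.
original = conn.execute("""
SELECT sum(lo_extendedprice * lo_discount)
FROM lineorder, dates
WHERE lo_orderdate = d_datekey
  AND d_year = 2013
  AND lo_discount BETWEEN 1 AND 3
""").fetchone()[0]

# Rewritten query: reads only the precomputed view.
rewritten = conn.execute("""
SELECT sum(d_price)
FROM mv
WHERE d_year = 2013
  AND lo_discount BETWEEN 1 AND 3
""").fetchone()[0]

print(original, rewritten)  # 35.0 35.0
```

The rewrite is valid here because the view keeps every column the query filters on (`d_year`, `lo_discount`) and the expression it aggregates (`d_price`).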
Materialized view - Maintenance
• Partial table rewrites are supported
  • Typical: Denormalize last month of data only
  • Rewrite engine will produce union of latest and historical data
• Updates to base tables
  • Invalidate views, but
  • Can choose to allow stale views (max staleness) for performance
  • Can partially match views and compute the delta after updates
• Incremental updates
  • Common classes of views allow for incremental updates
  • Others need full refresh
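A partial rewrite can be illustrated with the same `sqlite3` stand-in. In this sketch the view covers only recent rows, and the query is answered as the union of the view and the older base-table rows. The table `sales`, the view `recent_mv`, the day-number cutoff of 31 and the data are all hypothetical, and the rewrite is written by hand rather than produced by a rewrite engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (sale_day INTEGER, amount INTEGER);
INSERT INTO sales VALUES (1, 10), (2, 20), (40, 5), (45, 7), (59, 3);

-- The view materializes only the most recent data (sale_day >= 31).
CREATE TABLE recent_mv AS
SELECT sale_day, SUM(amount) AS total
FROM sales
WHERE sale_day >= 31
GROUP BY sale_day;
""")

# Direct answer from the base table.
direct = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]

# Partial rewrite: recent part from the view, historical part from the base table.
rewritten = conn.execute("""
SELECT SUM(total) FROM (
    SELECT total FROM recent_mv
    UNION ALL
    SELECT amount AS total FROM sales WHERE sale_day < 31
)
""").fetchone()[0]

print(direct, rewritten)  # 45 45
```

The two branches of the union must cover disjoint ranges that together span the whole query, otherwise rows are double counted or lost.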
Optimizer Improvements
Shared scan - 4x improvement!
TPC-DS query 28 joins 6 instances of the store_sales table
SELECT * FROM
  (SELECT AVG(ss_list_price) B1_LP,
          COUNT(ss_list_price) B1_CNT,
          COUNT(DISTINCT ss_list_price) B1_CNTD
   FROM store_sales
   WHERE ss_quantity BETWEEN 0 AND 5 AND
         (ss_list_price BETWEEN 11 AND 11+10 OR
          ss_coupon_amt BETWEEN 460 AND 460+1000 OR
          ss_wholesale_cost BETWEEN 14 AND 14+20)) B1,
  (SELECT AVG(ss_list_price) B2_LP,
          COUNT(ss_list_price) B2_CNT,
          COUNT(DISTINCT ss_list_price) B2_CNTD
   FROM store_sales
   WHERE ss_quantity BETWEEN 6 AND 10 AND
         (ss_list_price BETWEEN 91 AND 91+10 OR
          ss_coupon_amt BETWEEN 1430 AND 1430+1000 OR
          ss_wholesale_cost BETWEEN 32 AND 32+20)) B2,
  . . .
LIMIT 100;
[Plan diagram: one scan of store_sales with the combined OR'ed B1-B6 filters, followed by the individual B1 to B5 filters (as drawn), reduce sinks (RS) and the join]
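The idea of a shared scan can be sketched in Python: instead of reading the table once per bucket, a single pass routes each row to every bucket whose filter it satisfies. This is a simplified illustration, not Hive's operator tree. The helper names are hypothetical, the filters are reduced to the `ss_quantity` range only, and the rows are invented.

```python
def separate_scans(rows, buckets):
    """Baseline: one full pass over the table per bucket."""
    passes = 0
    result = {}
    for name, (low, high) in buckets.items():
        passes += 1
        prices = [r["ss_list_price"] for r in rows if low <= r["ss_quantity"] <= high]
        result[name] = (sum(prices) / len(prices) if prices else None, len(prices))
    return result, passes


def shared_scan(rows, buckets):
    """Single pass: each row is tested against every bucket filter."""
    totals = {name: [0.0, 0] for name in buckets}   # name -> [sum, count]
    for row in rows:
        for name, (low, high) in buckets.items():
            if low <= row["ss_quantity"] <= high:
                totals[name][0] += row["ss_list_price"]
                totals[name][1] += 1
    result = {name: (s / c if c else None, c) for name, (s, c) in totals.items()}
    return result, 1


rows = [
    {"ss_quantity": 1, "ss_list_price": 10.0},
    {"ss_quantity": 3, "ss_list_price": 20.0},
    {"ss_quantity": 7, "ss_list_price": 30.0},
    {"ss_quantity": 12, "ss_list_price": 99.0},   # matches no bucket
]
buckets = {"B1": (0, 5), "B2": (6, 10)}

baseline, baseline_passes = separate_scans(rows, buckets)
shared, shared_passes = shared_scan(rows, buckets)
```

Both functions return the same (average, count) pairs; only the number of passes over the data differs.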
Dynamic Semijoin Reduction - 7x improvement for q72
• Dramatically improves performance of very selective joins
• Builds a bloom filter from one side of the join and filters rows from the other side
• Skips scan and further evaluation of rows that would not qualify for the join
SELECT …
FROM sales JOIN time ON
sales.time_id = time.time_id
WHERE time.year = 2014 AND
time.quarter IN ('Q1', 'Q2')
[Diagram: reduced scan on sales]
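A rough Python sketch of the semijoin reduction follows. It builds a small bloom filter from the `time_id` values that pass the dimension filter, then uses it to drop `sales` rows before the join. The `BloomFilter` class, its sizing and the SHA-256 based hashing are assumptions chosen for a self-contained example; Hive's own filter differs. The data is invented.

```python
import hashlib


class BloomFilter:
    """Minimal bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0                      # bit set stored in one Python int

    def _positions(self, value):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def semijoin_reduce(sales, time_rows):
    """Return (sales rows surviving the bloom filter, rows in the final join)."""
    qualifying = {t["time_id"] for t in time_rows
                  if t["year"] == 2014 and t["quarter"] in ("Q1", "Q2")}
    bloom = BloomFilter()
    for time_id in qualifying:
        bloom.add(time_id)
    survivors = [s for s in sales if bloom.might_contain(s["time_id"])]
    # The exact join still runs, so false positives are discarded here.
    joined = [s for s in survivors if s["time_id"] in qualifying]
    return survivors, joined


time_rows = [{"time_id": 4 * y + q + 1, "year": 2013 + y, "quarter": f"Q{q + 1}"}
             for y in range(2) for q in range(4)]
sales = [{"time_id": (i % 8) + 1, "amount": i} for i in range(100)]

survivors, joined = semijoin_reduce(sales, time_rows)
full_join = [s for s in sales if s["time_id"] in (5, 6)]   # 2014 Q1 and Q2
```

Because a bloom filter has no false negatives, `joined` matches the unreduced join exactly; the benefit is that most non-matching `sales` rows are dropped at scan time.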
Statistics (not new)
• Statistics collection can be set to automatic or manual
• Used extensively in join selection
• Without statistics much of the optimizer will not be used
Optimizer is learning from planning mistakes
⬢ Symptoms
● Memory exhaustion due to under-provisioning
● Excessive runtime (future)
● Excessive spilling (future)
⬢ Solution
● Query fails because of stats estimation error
● Runtime sends observed statistics back to coordinator
● Statistics overrides are created at session, server or global level
● Query is replanned and resubmitted
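The fail, observe, override and replan loop can be sketched as follows. Everything in this Python example is hypothetical: the planner that picks a broadcast join under a 1000-row threshold, the `StatsEstimationError` exception and the `overrides` dictionary are stand-ins chosen to show the control flow, not Hive's actual mechanism.

```python
class StatsEstimationError(Exception):
    """Raised by the toy runtime when a plan fails because estimates were off."""

    def __init__(self, observed):
        super().__init__("plan failed: statistics were off")
        self.observed = observed           # statistics measured at runtime


def plan(stats):
    # Toy planner: broadcast the dimension table only if it looks small.
    return "broadcast_join" if stats["dim_rows"] <= 1000 else "shuffle_join"


def execute(plan_name, actual_rows):
    # Toy runtime: a broadcast join of a large table runs out of memory.
    if plan_name == "broadcast_join" and actual_rows > 1000:
        raise StatsEstimationError({"dim_rows": actual_rows})
    return f"ok via {plan_name}"


def run_with_reoptimization(stats, actual_rows, overrides):
    """Plan with stored stats plus overrides; on failure, record and replan once."""
    try:
        return execute(plan({**stats, **overrides}), actual_rows)
    except StatsEstimationError as err:
        overrides.update(err.observed)     # remember what the runtime observed
        return execute(plan({**stats, **overrides}), actual_rows)


stale_stats = {"dim_rows": 10}             # estimate is far too low
overrides = {}                             # e.g. kept per session or per server
first = run_with_reoptimization(stale_stats, 50000, overrides)
second = run_with_reoptimization(stale_stats, 50000, overrides)
```

The first run fails once and is replanned; the second run picks the right plan immediately because the override is already in place.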
Apache Druid
Druid capabilities
• Streaming ingestion capability
• Data Freshness – analyze events as they occur
• Fast response time (ideally < 1sec query time)
• Arbitrary slicing and dicing
• Multi-tenancy – 1000s of concurrent users
• Scalability and Availability
• Rich real-time visualization with Superset
Apache Druid is a distributed, real-time, column-oriented datastore designed to quickly ingest and index large amounts of data and make it available for real-time query.
Druid: Fast Facts
• Most events per day: 30 billion events / day (Metamarkets)
• Most computed metrics: 1 billion metrics / min (Jolata)
• Largest cluster: 200 nodes (Metamarkets)
• Largest hourly ingestion: 2TB per hour (Netflix)
Hive and Druid, Better Together
Hive
• Strengths: SQL 2011, JDBC/ODBC; fast scans; ACID
• Issues: not optimized for slice and dice and drill down (OLAP cubing) operations
Druid
• Strengths: dimensional aggregates support OLAP cubes; timeseries queries; realtime ingestion of streaming data
• Issues: lacks SQL interface; no joins
Problem: You don't want two systems to manage and load data into
Solution: For data that fits best in Druid, load it in Druid and access it with Hive
• Hive supports push down of queries to Druid, optimizer knows what to push and what to run in Hive
• Enables SQL and JDBC/ODBC access to data in Druid
• Enables join of historical and realtime data
• Enables Hive support of slice & dice, drill down for OLAP cubing
• Can also create materialized views in Hive and store them in Druid
Druid Connector
[Diagram: three Realtime Nodes, Broker, HiveServer2]
Instantly analyze Kafka data with millisecond latency
Druid Connector - Joins between Hive and realtime data in Druid
Bloom filter pushdown greatly reduces data transfer
Send promotional email to all customers from CA who purchased more than $1000 worth of merchandise today.
create external table sales (
  `__time` timestamp, quantity int, sales_price double,
  customer_id bigint, item_id int, store_id int)
stored by 'org.apache.hadoop.hive.druid.DruidStorageHandler'
tblproperties (
  "kafka.bootstrap.servers" = "localhost:9092",
  "kafka.topic" = "sales-topic",
  "druid.kafka.ingestion.maxRowsInMemory" = "5");
create table customers (customer_id bigint, first_name string, last_name string, email string, state string);
select email
from customers join sales using (customer_id)
where to_date(sales.__time) = date '2018-09-06'
  and quantity * sales_price > 1000 and customers.state = 'CA';
Tips for Optimizing Hive
Making Your Queries Blaze in Hive 3
• Use a columnar format
  • We recommend ORC; ORC and Parquet are much better for DW queries than row-oriented formats
• Use the right tool for the right job, all in Hive
  • LLAP for BI queries
  • Tez for ETL/batch
  • Druid for ROLAP and realtime ingestion
  • Do not use MapReduce as your Hive engine; it is very slow
• Keep statistics current on your data
• Define materialized views for common joins and aggregations
• Turn on ACID – it enables query cache and materialized view partial rewrites
SOLUTIONS: Heuristic recommendation engine
Fully self-serviced query and storage optimization
Questions?

Fast SQL on Hadoop, Really?
