LLAP: Sub-Second Analytical Queries in Hive

© Hortonworks Inc. 2011 – 2016. All Rights Reserved
Gopal Vijayaraghavan

Why LLAP?
• People like Hive
• Disk->Mem is getting further away
– Cloud Storage isn’t co-located
– Disks are connected to the CPU via network
• Security landscape is changing
– Cells & Columns are the new security boundary, not files
– Safely masking columns needs a process boundary
• Concurrency, Performance & Scale are in conflict
– Concurrency at 100k queries/hour
– Latencies at 2-5 seconds/query
– Petabyte scale warehouses (with terabytes of “hot” data)
[Diagram: a node running an LLAP process with a cache and query fragments, on top of HDFS]

What is LLAP?
• Hybrid model combining daemons and containers for fast, concurrent execution of analytical workloads (e.g. Hive SQL queries)
• Concurrent queries without specialized YARN queue setup
• Multi-threaded execution of vectorized operator pipelines
• Asynchronous IO and efficient in-memory caching
• Relational view of the data available through the API
• High performance scans, execution code pushdown
• Centralized data security

Hive 2.0 (+ LLAP)
• Transparent to Hive users, BI tools, etc.
• Hive decides where query fragments run (LLAP, Container, AM) based on configuration, data size, format, etc.
• Each query is coordinated independently by a Tez AM
• Number of concurrent queries is throttled by the number of active AMs
• Hive Operators used for processing
• Tez Runtime used for data transfer
[Diagram: client(s) connect to HiveServer2 (query/AM controller); on the YARN cluster, AM1, AM2 and AM3 coordinate work across llapd daemons and containers (Container AM1, Container AM1, Container AM2)]

Industry benchmark – 10 TB scale
[Chart: LLAP vs Hive 1.x, 10 TB scale. Y axis: time (seconds), 0 to 50. X axis: query3, query12, query20, query21, query26, query27, query42, query52, query55, query73, query89, query91, query98. Series: Hive 1.x, LLAP. Annotation: 90% faster]

Evaluation from a customer case study
• Aggregate daily statistics for a time interval:
SELECT yyyymmdd,
       sum(total_1),
       sum(total_2),
       ...
FROM table
WHERE yyyymmdd >= xxx
  AND yyyymmdd < xxx
  AND userid = xxx
GROUP BY userid, yyyymmdd;
[Chart: execution time (s), axis 0 to 8, showing Max and Avg for Tez, Phoenix and LLAP over time ranges D, W, M, Y]

Evaluation from a customer case study
• Display a large report
[Chart: execution time in seconds over time range (D, W, M, Y), axis 0 to 20, showing Max and Avg for Tez and LLAP]

Cut-away to demo (GIF)

How does LLAP make queries faster?

Technical overview – execution
• LLAP daemon has a number of executors (think containers) that execute work "fragments"
• Fragments are parts of one, or multiple parallel workloads (e.g. Hive SQL queries)
• Work queue with pluggable priority
• Geared towards low latency queries over long-running queries (by default)
• I/O is similar to containers – read/write to HDFS, shuffle, other storages and formats
• Streaming output for data API
[Diagram: an LLAP queue (Q1 Map 1, Q1 Map 1, Q3 Map 19) and three executors running Q1 Map 1, an external read, and Q3 Reducer 3 (waiting for shuffle inputs); connected systems shown are HDFS, HBase, a container (shuffle input) and a Spark executor]
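The "work queue with pluggable priority" idea can be illustrated with a small sketch. This is not LLAP's scheduler: the `Fragment` fields, the `shortest_first` policy and the runtime estimates are all hypothetical, chosen only to show how a caller-supplied priority function can favour short queries over long-running ones.

```python
import heapq
import itertools
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Fragment:
    query: str            # e.g. "Q1"
    vertex: str           # e.g. "Map 1"
    est_runtime_s: float  # hypothetical estimate, used only for ordering


def shortest_first(fragment: Fragment) -> float:
    """Example priority function: a smaller value is scheduled earlier."""
    return fragment.est_runtime_s


class WorkQueue:
    """Priority queue whose ordering is supplied by the caller (pluggable)."""

    def __init__(self, priority: Callable[[Fragment], float] = shortest_first):
        self._priority = priority
        self._heap: List[Tuple[float, int, Fragment]] = []
        self._seq = itertools.count()  # tie-breaker: FIFO among equal priorities

    def submit(self, fragment: Fragment) -> None:
        entry = (self._priority(fragment), next(self._seq), fragment)
        heapq.heappush(self._heap, entry)

    def next_fragment(self) -> Fragment:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


def drain(queue: WorkQueue, num_executors: int) -> List[List[str]]:
    """Hand fragments to executors in waves of at most `num_executors`."""
    waves = []
    while len(queue):
        size = min(num_executors, len(queue))
        wave = [queue.next_fragment() for _ in range(size)]
        waves.append([f"{f.query} {f.vertex}" for f in wave])
    return waves
```

Swapping `shortest_first` for another function changes the ordering without touching the queue itself, which is the point of making the priority pluggable.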

Technical overview – IO layer
• Optional: when executing inside LLAP
• All other formats use in-sync mode
• Asynchronous IO for Hive
• Wraps over InputFormat, reads through cache
• Supported with ORC
• Transparent, compressed in-memory cache
• Format-specific, extensible
• NVMe/NVDIMM caches
[Diagram: inside an executor, a fragment's RecordReader passes splits (what to read) to an IO thread and receives vectorized data back; the IO thread plans and decodes, then reads and decompresses, using the cache (data buffers) and the metadata cache (indexes) in front of the actual data (HDFS, S3, …)]
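"Wraps over InputFormat, reads through cache" describes a read-through cache. A minimal sketch of that pattern follows, under loud assumptions: it is synchronous, it uses plain LRU eviction, and `read_chunk`, the `(path, index)` key and the byte budget are hypothetical names for illustration. It says nothing about LLAP's actual eviction policy or buffer layout.

```python
from collections import OrderedDict
from typing import Callable, Tuple

ChunkKey = Tuple[str, int]  # (file path, chunk index); a hypothetical cache key


class ReadThroughCache:
    """LRU cache that sits in front of a slower `read_chunk` function."""

    def __init__(self, read_chunk: Callable[[str, int], bytes], capacity_bytes: int):
        self._read_chunk = read_chunk
        self._capacity = capacity_bytes
        self._used = 0
        self._chunks: "OrderedDict[ChunkKey, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, path: str, index: int) -> bytes:
        key = (path, index)
        if key in self._chunks:
            self.hits += 1
            self._chunks.move_to_end(key)  # mark as most recently used
            return self._chunks[key]
        self.misses += 1
        data = self._read_chunk(path, index)  # fall through to the real storage
        self._insert(key, data)
        return data

    def _insert(self, key: ChunkKey, data: bytes) -> None:
        if len(data) > self._capacity:
            return  # too large to cache; the caller still gets the data
        while self._used + len(data) > self._capacity:
            _, evicted = self._chunks.popitem(last=False)  # least recently used
            self._used -= len(evicted)
        self._chunks[key] = data
        self._used += len(data)

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

The caller only ever calls `get`; whether the bytes came from memory or from the underlying storage is invisible to it, which is what "transparent" means on the slide.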

Parallel queries – priorities, preemption
• Lower-priority fragments can be preempted
• For example, a fragment can start running before its inputs are ready, for better pipelining; such fragments may be preempted
• LLAP work queue examines the DAG parameters to give preference to interactive (BI) queries
[Diagram: LLAP queue and executors. Labels: interactive query map 1/3, 2/3 and 3/3; wide query reduce waiting]
[Chart: cluster utilization over time for a long-running query and a short-running query]
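The preemption behaviour described above can be sketched as follows. The policy here is an assumption made for illustration: only non-interactive fragments are preemptable, and among them a fragment still waiting on its inputs is evicted first. The class and field names are hypothetical and do not correspond to LLAP internals.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunningFragment:
    name: str
    interactive: bool   # True for short BI-style queries
    inputs_ready: bool  # False if it started early and is still waiting on inputs


class Executors:
    """Fixed number of slots; may evict a lower-priority fragment for an interactive one."""

    def __init__(self, slots: int):
        self.slots = slots
        self.running: List[RunningFragment] = []
        self.requeued: List[RunningFragment] = []

    def _victim(self) -> Optional[RunningFragment]:
        # Assumed policy: never preempt interactive work; prefer evicting a
        # fragment that is still waiting on inputs (False sorts before True).
        candidates = [f for f in self.running if not f.interactive]
        if not candidates:
            return None
        return min(candidates, key=lambda f: f.inputs_ready)

    def submit(self, fragment: RunningFragment) -> bool:
        """Return True if the fragment is now running, False if it must wait."""
        if len(self.running) < self.slots:
            self.running.append(fragment)
            return True
        if fragment.interactive:
            victim = self._victim()
            if victim is not None:
                self.running.remove(victim)
                self.requeued.append(victim)  # goes back to the work queue
                self.running.append(fragment)
                return True
        return False
```

A fragment that was started early for pipelining has done little useful work while it waits, so evicting it costs less than evicting one that is actively processing data.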

In-memory processing – present and future
[Diagram: I/O and execution pipeline. I/O side: distributed FS, SSD cache*, off-heap cache (compressed data), compression codec, decoder (compact encoded data), low-level pushdown (e.g. filter)*. Execution side: native data vectors (col1, col2) feed a fragment's Hive operators with vectorized processing. * = work in progress]

First query performance
• Cold LLAP is nearly as fast as shared pre-warmed containers (impractical on real clusters)
• Realistic (long-running) LLAP is ~3x faster than realistic (no prewarm) Tez
[Chart: first query runtime (s), axis 0 to 25. No prewarm: 20.94; shared prewarm: 11.96; cold LLAP: 13.23; realistic LLAP: 7.49]
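The two claims on this slide can be checked against the four bar values with a few lines of arithmetic. The dictionary keys are just labels for the bars; the numbers are the ones shown on the chart.

```python
# First-query runtimes in seconds, as labelled on the chart.
first_query_runtime_s = {
    "no prewarm": 20.94,      # realistic Tez, no pre-warmed containers
    "shared prewarm": 11.96,  # shared pre-warmed containers
    "cold LLAP": 13.23,
    "realistic LLAP": 7.49,   # long-running LLAP
}


def ratio(slower: str, faster: str) -> float:
    """How many times longer `slower` takes than `faster`."""
    return first_query_runtime_s[slower] / first_query_runtime_s[faster]


print(f"no prewarm / realistic LLAP: {ratio('no prewarm', 'realistic LLAP'):.2f}x")
print(f"cold LLAP / shared prewarm:  {ratio('cold LLAP', 'shared prewarm'):.2f}x")
```

The first ratio comes out at about 2.8, which the slide rounds to "~3x"; the second is about 1.1, which is the sense in which cold LLAP is "nearly as fast" as shared pre-warmed containers.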

JIT Performance – heavy use
• Cache disabled!
[Chart: query runtime (s), axis 0 to 14, by query run. Annotations: ~2x time saving; for cold start, the warmup took 3 runs]
Query run                           0      1     2     3     4     5     6     7     8
Cold LLAP                           13.23  7.93  7.7   6.29  5.44  5.3   5.01  4.89  4.83
LLAP after an unrelated workload    7.49   6.1   4.61  4.59  4.93  4.62  4.63  4.45  4.25

Parallel query execution – LLAP vs Hive 1.2
[Chart: total runtime for all queries (s), axis 0 to 3500, by number of concurrent users, with linear-scaling reference lines for Hive 1.2 and LLAP. Annotation: parallelism eats into latency]
Concurrent users             1     2     4     8     16
Hive 1.2 total runtime, s    3131  2416  1741  1420  1508
LLAP total runtime, s        531   291   147   94    75
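To compare the measured runtimes with the "linear scaling" reference lines, one can compute the speedup over the single-user run and the fraction of ideal scaling achieved. The sketch below reads the chart labels as Hive 1.2 = 3131, 2416, 1741, 1420, 1508 and LLAP = 531, 291, 147, 94, 75 for 1, 2, 4, 8 and 16 users; that series assignment is an interpretation of the chart, and the function names are illustrative.

```python
users = [1, 2, 4, 8, 16]
total_runtime_s = {
    "Hive 1.2": [3131, 2416, 1741, 1420, 1508],
    "LLAP": [531, 291, 147, 94, 75],
}


def speedup(runtimes):
    """Speedup of each run relative to the single-user run."""
    return [runtimes[0] / t for t in runtimes]


def efficiency(runtimes, user_counts):
    """Fraction of ideal linear scaling achieved (1.0 = perfectly linear)."""
    return [s / n for s, n in zip(speedup(runtimes), user_counts)]


for engine, runtimes in total_runtime_s.items():
    row = ", ".join(
        f"{n} users: {s:.2f}x" for n, s in zip(users, speedup(runtimes))
    )
    print(f"{engine}: {row}")
```

Under this reading, LLAP stays close to linear up to 4 users (about 90% of ideal) and then flattens, while Hive 1.2 gains far less from added concurrency.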

Parallel query execution – 10 TB scale
[Chart: LLAP total runtime for all queries (s), axis 0 to 600, by number of concurrent users, with an LLAP linear-scaling reference line. Annotation: latency sensitive pre-emption]
Concurrent users         1    2    4    8   16
LLAP total runtime, s    531  291  147  94  75

Performance – cache on HDFS, 1 TB scale
[Chart: query runtime (s, axis 0 to 25) and cache hit rate (axis 0% to 100%) against cache size (24 to 42 GB). Series: query runtime after unrelated workload; query runtime after related workload; metadata hit rate; data hit rate. Runtime labels on the chart: 20.2, 18.3, 17.6, 16.2, 16.2, 16.8, 14.5, 11.3]

LLAP as a “relational” datanode

Example - SparkSQL integration – execution flow
[Diagram: execution flow]
1. HadoopRDD.getPartitions(): get Hive splits (request splits)
2. Ranger+HiveServer2: check permissions, generate splits w/LLAP locations, return securely signed splits (partitions/splits)
3. Spark Executor: HadoopRDD.compute(), then LlapInputFormat.getRecordReader()
- Open socket for incoming data
- Send package/plan to LLAP
4. LLAP (cluster nodes): verify splits, run scan + transform, send data back
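import org.apache.spark.sql.DataFrame

// LlapContext, sparkContext and jdbcUrl are as used on the slide
val llapContext = LlapContext.newInstance(sparkContext, jdbcUrl)
// The query string is kept on one line so the literal compiles
val df: DataFrame = llapContext.sql("select * from tpch_text_5.region")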
DataFrame for Hive/LLAP data
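The "return securely signed splits" and "verify splits" steps can be illustrated with a generic sign-and-verify sketch. The deck does not say how splits are signed, so HMAC-SHA256 over a JSON payload is purely an assumption here, and the split fields (`table`, `location`, `columns`) are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Dict, Tuple


def sign_split(split: Dict[str, object], secret: bytes) -> Tuple[bytes, bytes]:
    """Serialize a split deterministically and return (payload, signature)."""
    payload = json.dumps(split, sort_keys=True).encode("utf-8")
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return payload, signature


def verify_split(payload: bytes, signature: bytes, secret: bytes) -> Dict[str, object]:
    """Return the decoded split if the signature matches, else raise PermissionError."""
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):  # constant-time comparison
        raise PermissionError("split signature mismatch")
    return json.loads(payload)
```

In this sketch the signing side plays the role of Ranger+HiveServer2 and the verifying side plays the role of the LLAP daemon: a client that alters the payload after the permission check, for example to read a different column, fails verification.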

Monitoring LLAP Queries

Monitoring
• LLAP exposes a UI for monitoring
• Also has a JMX endpoint with much more data, plus logs and jstack endpoints as usual
• Aggregate monitoring UI is work in progress
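A JMX endpoint of this kind can be polled from a script. The sketch below assumes the endpoint is served at `/jmx` and returns JSON with a top-level "beans" list, as Hadoop-style JMX servlets commonly do; the URL path, the payload shape and any bean names are assumptions to verify against a running daemon, not documented LLAP behaviour.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen


def fetch_jmx(base_url: str, timeout_s: float = 5.0) -> Dict[str, Any]:
    """GET <base_url>/jmx and decode the JSON body (assumed endpoint path)."""
    with urlopen(f"{base_url.rstrip('/')}/jmx", timeout=timeout_s) as response:
        return json.load(response)


def beans_matching(jmx: Dict[str, Any], name_fragment: str) -> List[Dict[str, Any]]:
    """Return beans whose "name" contains `name_fragment` (case-insensitive)."""
    needle = name_fragment.lower()
    return [
        bean
        for bean in jmx.get("beans", [])
        if needle in str(bean.get("name", "")).lower()
    ]
```

Typical use would be `beans_matching(fetch_jmx("http://<llap-daemon-host>:<web-port>"), "cache")`, with the host, port and name fragment replaced by values from the actual deployment.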

Watching queries – Tez UI integration

Questions?
Interested? Stop by the Hortonworks booth to learn more
