Daoyuan Wang (Intel)
Yuanjian Li (Baidu)
OAP: Optimized Analytics Package for Spark Platform
Notice and Disclaimers:
• Intel, the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. See Trademarks on intel.com for full list of Intel trademarks.
• Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
• Intel technologies may require enabled hardware, specific software, or services activation. Check with your system manufacturer or retailer.
• No computer system can be absolutely secure. Intel does not assume any liability for lost or stolen data or systems or any damages resulting from such losses.
• You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein.
• No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
• The products described may contain design defects or errors known as errata which may cause the product to deviate from publish.
About me
Daoyuan Wang
• Developer @ Intel
• Focuses on Spark optimization
• An active Spark contributor since 2014
Yuanjian Li
• Baidu INF distributed computation
• Apache Spark contributor
• Baidu Spark team leader
Agenda
• Background for OAP
• Key features
• Benchmark
• OAP and Spark in Baidu
• Future plans
Data Analytics in Big Data Definition
• People want OLAP against large datasets as fast as possible.
• People want to extract information from newly arriving data as soon as possible.
Data Analytics Acceleration is Required by Spark Users
http://cdn2.hubspot.net/hubfs/438089/DataBricks_Surveys_-_Content/2016_Spark_Survey/2016_Spark_Infographic.pdf
Emerging hardware technology
Intel® Optane™ Technology Data Center Solutions
Accelerate applications for fast caching and storage, reduce transaction costs for latency-sensitive workloads and increase scale per server.
Intel® Optane™ technology allows data centers to deploy bigger and more affordable datasets to gain new insights from large memory pools.
Our proposal – OAP
Spark* Job Server
Spark SQL / Structured Streaming / Core
Cassandra*, HBase*, Redis*, Alluxio*, HDFS*, S3*, … (Storage Layer)
Hive* Table, Parquet*, JSON*, ORC*
Redis* Connector
Cassandra* Connector
OAP (Codename “Spinach”)
• IndexedDataSource / CacheAware
• RDMA, QAT, ISA-L, FPGA …
• User Customized Indices
• Columnar formats & support Parquet, ORC
• Runtime Computing vs. Data Store
• Columnar Fine-grained Cache
• Spark Executor in-process Cache
• 3D XPoint (App Direct Mode)
• Auto tuning based on periodical job history
• K8S Integration / AES-NI Encryption
Why OAP
Low cost
• Makes full use of existing hardware
• Open source
Good Performance
• Index just like traditional database
• Up to 5x boost in real-world
Easy to Use
• Easy to deploy
• Easy to maintain
• Easy to learn
A Simple Example
1. Run with OAP
$SPARK_HOME/sbin/start-thriftserver.sh --jars oap.jar
2. Create an OAP table
beeline> CREATE TABLE src(a INT, b STRING) USING spn;
3. Create a single-column B+ tree index
beeline> CREATE SINDEX idx_1 ON src (a) USING BTREE;
4. Insert data
beeline> INSERT INTO TABLE src SELECT key, value FROM xxx;
5. Refresh index
beeline> REFRESH SINDEX ON src;
6. Execution automatically uses the index
beeline> SELECT MAX(b), MIN(b) FROM src WHERE a > 100 AND a < 1000;
OAP Files and Fibers
• OAP meta file
• OAP data files: each file holds RowGroup #1 … RowGroup #N, and each row group holds Column (Fiber) #1 … Column (Fiber) #N
• OAP index files: one index file for every data file, each holding index meta, statistics, and the index data structure (Index Fiber)
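The row group and fiber layout can be pictured with a minimal sketch. This is an illustration only, not OAP's on-disk format: the class names `RowGroup` and `DataFile`, the `rows_per_group` parameter and the `fibers_read` bookkeeping are all hypothetical. It shows that a fiber is one column's values within one row group, so a reader can fetch a single fiber without touching the rest of the file.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class RowGroup:
    # One fiber per column: that column's values for this row group only.
    fibers: Dict[str, List[object]]

    @property
    def num_rows(self) -> int:
        return len(next(iter(self.fibers.values())))


@dataclass
class DataFile:
    row_groups: List[RowGroup]
    # Records which (row group index, column) fibers were actually read.
    fibers_read: List[Tuple[int, str]] = field(default_factory=list)

    def read_fiber(self, group_idx: int, column: str) -> List[object]:
        self.fibers_read.append((group_idx, column))
        return self.row_groups[group_idx].fibers[column]


def build_data_file(rows: List[dict], rows_per_group: int) -> DataFile:
    """Split row-oriented records into row groups of columnar fibers."""
    groups = []
    for start in range(0, len(rows), rows_per_group):
        chunk = rows[start:start + rows_per_group]
        columns = chunk[0].keys()
        groups.append(RowGroup({c: [r[c] for r in chunk] for c in columns}))
    return DataFile(groups)


# Ten rows with the two columns of the earlier example table (a, b).
rows = [{"a": i, "b": f"v{i}"} for i in range(10)]
data_file = build_data_file(rows, rows_per_group=4)
# Reading column "b" of row group 1 touches exactly one fiber.
values = data_file.read_fiber(1, "b")
```

With four rows per group, the ten rows land in three row groups, and the last group holds the two leftover rows.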
OAP Internals - index
1. Spark predicate push down (FilteredScan)
2. Read OAP meta
3. Available index?
   N: full table scan
   Y: read statistics before using the index, get local RowIDs from the index, then access the data file for those RowIDs directly
4. OAP cached access
Index selection
• Supports B-tree index and bitmap index; finds the best match among all created indices
• Supports statistics such as MinMax, PartByValue, Sample, BloomFilter
• Only reads the data fibers we need and puts those fibers into cache (in-memory fiber)
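The decision flow above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not OAP code: `SortedIndex` is a hypothetical stand-in for a B-tree index (a sorted list searched with `bisect`), only MinMax statistics are modelled, and the plan names returned by `scan` are made up for the example.

```python
from bisect import bisect_left, bisect_right
from typing import List, Optional, Tuple


class SortedIndex:
    """Hypothetical stand-in for a B-tree index: sorted (key, row_id) pairs."""

    def __init__(self, column_values: List[int]):
        self.entries: List[Tuple[int, int]] = sorted(
            (value, row_id) for row_id, value in enumerate(column_values)
        )
        self.keys = [key for key, _ in self.entries]
        # MinMax statistics, kept next to the index.
        self.min_key = self.keys[0] if self.keys else None
        self.max_key = self.keys[-1] if self.keys else None

    def may_match(self, low: int, high: int) -> bool:
        """Statistics check for the open interval (low, high)."""
        if not self.keys:
            return False
        return self.max_key > low and self.min_key < high

    def row_ids(self, low: int, high: int) -> List[int]:
        """Row IDs whose key k satisfies low < k < high."""
        start = bisect_right(self.keys, low)
        end = bisect_left(self.keys, high)
        return sorted(row_id for _, row_id in self.entries[start:end])


def scan(column_values: List[int], low: int, high: int,
         index: Optional[SortedIndex]):
    """Return (plan, matching row IDs) for the predicate low < value < high."""
    if index is None:
        # No usable index: fall back to a full scan.
        ids = [i for i, v in enumerate(column_values) if low < v < high]
        return "full scan", ids
    if not index.may_match(low, high):
        # Statistics prove nothing can match, so the data file is not read.
        return "skipped by statistics", []
    return "index scan", index.row_ids(low, high)


column_a = [5, 150, 999, 1000, 100, 2000]
index_a = SortedIndex(column_a)
indexed_plan = scan(column_a, 100, 1000, index_a)
full_plan = scan(column_a, 100, 1000, None)
skipped_plan = scan(column_a, 3000, 4000, index_a)
```

Both the indexed and the full-scan path return the same row IDs for `a > 100 AND a < 1000`; the index path simply avoids inspecting every value.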
OAP compatible layer
Parquet compatible layer: read row #m from Parquet file
1. Find row group #k
2. Read row group and get specific rows
Parquet data file: RowGroup #1, RowGroup #2, … RowGroup #k
Cache
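Finding row group #k for row #m can be sketched as a search over cumulative row counts. This is an assumption about one reasonable way to do the lookup, not a description of OAP's reader: the function name `locate_row` is hypothetical, rows are numbered from 0, and the per-row-group row counts are assumed to be available from the file metadata.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Tuple


def locate_row(row_group_sizes: List[int], m: int) -> Tuple[int, int]:
    """Map global row number m (0-based) to (row group index, offset in it)."""
    total = sum(row_group_sizes)
    if not 0 <= m < total:
        raise IndexError(f"row {m} out of range for {total} rows")
    # ends[i] is the number of rows in row groups 0..i inclusive.
    ends = list(accumulate(row_group_sizes))
    k = bisect_right(ends, m)
    start_of_k = ends[k - 1] if k > 0 else 0
    return k, m - start_of_k


sizes = [100, 100, 50]
first = locate_row(sizes, 0)
boundary = locate_row(sizes, 100)
last = locate_row(sizes, 249)
```

Once the row group is known, only that group needs to be read to extract the requested rows.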
OAP Data locality
Spark as a Service
Meta Data
FiberCacheManager
Executor
Index
Storage (HDFS / S3 / OSS)
SpinachContext (Driver)
FiberSensor
HeartBeat
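One way to read this diagram is that executors report their cached fibers to the driver through heartbeats, and the driver uses that information to place tasks near cached data. The sketch below illustrates that reading only; the slides do not describe the mechanism. The class `FiberSensorSketch`, its methods and the tie-breaking rule are hypothetical and do not reflect OAP's actual `FiberSensor` API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set


class FiberSensorSketch:
    """Hypothetical driver-side registry of which executor caches which fiber."""

    def __init__(self) -> None:
        self._executors_by_fiber: Dict[str, Set[str]] = defaultdict(set)

    def on_heartbeat(self, executor_id: str, cached_fibers: Iterable[str]) -> None:
        """Replace what is known about one executor with its latest report."""
        for holders in self._executors_by_fiber.values():
            holders.discard(executor_id)
        for fiber_id in cached_fibers:
            self._executors_by_fiber[fiber_id].add(executor_id)

    def preferred_executor(self, fiber_id: str) -> Optional[str]:
        """Pick an executor that already caches the fiber, if any."""
        holders = self._executors_by_fiber.get(fiber_id)
        # Arbitrary but deterministic tie-break: lowest executor ID.
        return min(holders) if holders else None


sensor = FiberSensorSketch()
sensor.on_heartbeat("exec-1", ["fiber-1", "fiber-2"])
sensor.on_heartbeat("exec-2", ["fiber-2"])
```

A scheduler could pass the result of `preferred_executor` as a locality hint and fall back to any executor when it returns `None`.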
Performance
OAP Index And Cache Performance (query time in seconds)
Parquet Vectorized Read: 72.083
OAP Indexed Read: 7.095
OAP Indexed Read with Fiber Cache: 2.304
Cluster: 1 Master + 2 Slaves
Hardware: CPU – 2x E5-2699 v4; RAM – 256 GB; Storage – S3610 1.6TB
Data: 300GB (Compressed Parquet), 2 Billion Records
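The relative speedups implied by these timings can be worked out directly. The snippet below uses only the three query times reported on the slide; the dictionary and variable names are illustrative.

```python
# Query times in seconds, as reported in the benchmark above.
query_seconds = {
    "Parquet Vectorized Read": 72.083,
    "OAP Indexed Read": 7.095,
    "OAP Indexed Read with Fiber Cache": 2.304,
}

baseline = query_seconds["Parquet Vectorized Read"]
# Speedup of each configuration relative to the Parquet vectorized read.
speedups = {name: baseline / seconds for name, seconds in query_seconds.items()}
# Additional gain from the fiber cache on top of the indexed read.
cache_gain = (query_seconds["OAP Indexed Read"]
              / query_seconds["OAP Indexed Read with Fiber Cache"])

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
```

On this benchmark the indexed read is roughly 10x faster than the Parquet vectorized read, and adding the fiber cache brings it to roughly 31x, about 3x on top of the index alone.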
Spark In Baidu
Chart: Nodes and Jobs/day by year (2014, 2015, 2016, 2017); data labels in series order: 80, 1000, 3000, 6500 and 50, 300, 1500, 5800
• Spark import to Baidu
• Version: 0.8
• Build standalone cluster
• Integrate with in-house FS / Pub-Sub / DW
• Version: 1.4
• Build cluster over YARN
• Integrate with in-house Resource Scheduler System
• Version: 1.6
• SQL / Graph Service over Spark
• OAP
• Version: 2.1
Baidu Big SQL
Web UI / RESTful API
BBS HTTPServer
BBS Worker, BBS Worker, BBS Worker
BBS Master
Cache & Index Layer (OAP)
Spark over YARN
Roll Up Table Layer
API Layer:
• Meta Control API
• Job API: Load / Export / Query / Index Control
Control Layer:
• Meta Control
• Job Scheduler
• Spark Driver
• Query Classification
Boosting Layer:
• Roll Up Table Management
• Roll Up Query Change
• Index Create / Update
• Cache Hit
Baidu Big SQL
Query Physical Queue(FAIR)
Import Physical Queues
BBS Worker
Big Query Pool
Small Query Pool
Index Create Pool
BBS Master
Import Physical Queues
Load Physical Queues
Spark Over YARN
Data Sources
Logs DW
Load Job
alter table, create index, classify query
Resource Management & Isolation
Query Job
Introductory Story
Get the top 10 charge sums and the corresponding advertisers triggered by the query word ‘flower’
• Create index on ‘userid’ column
• Various index types to choose from for different field types
• 5x faster than native Spark SQL, 80x faster than an MR job
• 3 days of Baidu charging logs: 4 TB of data, 70,000+ files, query time 10~15 s
Roll Up Table Layer
Source table (700+ Columns; 99% query only use <10 columns):
date  userid  searchid  baiduid  cmatch  …  shows  clicks  charge
1     1       1         10       2       …  10     1       5
1     1       2         11       3       …  10     1       5
1     1       3         12       2       …  10     1       5
1     1       4         13       1       …  10     1       5
1     1       5         14       1       …  10     1       5
1     2       6         14       2       …  10     1       5
1     2       7         15       3       …  10     1       5
1     2       8         16       4       …  10     1       5
1     2       9         17       5       …  10     1       5
Select date,userid,shows,clicks,charge from…
date  userid  shows  clicks  charge
1     1       50     5       25
1     2       40     4       20
Multi Roll Up Table (user-transparent)
date  cmatch  shows  clicks  charge
1     1       20     2       10
1     2       30     3       15
1     3       20     2       10
1     4       10     1       5
1     5       10     1       5
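The two roll-up tables above can be reproduced from the nine source rows with a plain group-by-and-sum. The sketch below uses the slide's data; the function names `roll_up` and `pick_roll_up` are hypothetical, and the selection rule in `pick_roll_up` (first roll-up whose dimensions cover the query) is only an assumption about how a user-transparent rewrite might choose a table.

```python
from collections import defaultdict

COLUMNS = ("date", "userid", "searchid", "baiduid", "cmatch",
           "shows", "clicks", "charge")
MEASURES = ("shows", "clicks", "charge")
RAW_ROWS = [
    (1, 1, 1, 10, 2, 10, 1, 5),
    (1, 1, 2, 11, 3, 10, 1, 5),
    (1, 1, 3, 12, 2, 10, 1, 5),
    (1, 1, 4, 13, 1, 10, 1, 5),
    (1, 1, 5, 14, 1, 10, 1, 5),
    (1, 2, 6, 14, 2, 10, 1, 5),
    (1, 2, 7, 15, 3, 10, 1, 5),
    (1, 2, 8, 16, 4, 10, 1, 5),
    (1, 2, 9, 17, 5, 10, 1, 5),
]


def roll_up(rows, dimensions):
    """Sum the measure columns grouped by the given dimension columns."""
    dim_idx = [COLUMNS.index(d) for d in dimensions]
    measure_idx = [COLUMNS.index(m) for m in MEASURES]
    totals = defaultdict(lambda: [0] * len(MEASURES))
    for row in rows:
        key = tuple(row[i] for i in dim_idx)
        for slot, i in enumerate(measure_idx):
            totals[key][slot] += row[i]
    return {key: tuple(values) for key, values in sorted(totals.items())}


def pick_roll_up(query_dimensions, roll_ups):
    """Return the first roll-up whose dimensions cover the query, else None."""
    for dims in roll_ups:
        if set(query_dimensions) <= set(dims):
            return dims
    return None


by_user = roll_up(RAW_ROWS, ("date", "userid"))
by_cmatch = roll_up(RAW_ROWS, ("date", "cmatch"))
available = [("date", "userid"), ("date", "cmatch")]
```

A query grouped by `date` and `userid` can be answered from the two-row roll-up instead of the 700+ column source table; a query on `searchid` matches no roll-up and has to read the source table.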
OAP In BigSQL
DataFile:
…  Name      Department  Age  …
…  …         …           …    …
…  John      INF         35   …
…  Michelle  AI-Lab      29   …
…  Amy       INF         42   …
…  Kim       AI-Lab      27   …
…  Mary      AI-Lab      47   …
…  …         …           …    …
IndexFile:
Sorted Age  Row Index in Data File
27          3
29          1
35          0
42          2
47          4
Department  Bit Array
INF         10100
AI-Lab      01011
Index Build
NormalTableScan
UseIndex
Skippable Reader
Select xxx from xxx where age > 29 and department in (INF, AI-Lab)
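How the two index types answer this query can be sketched with the five rows from the slide. This is an illustration, not OAP's implementation: the sorted list searched with `bisect` stands in for the sorted age index, bit strings stand in for the bitmap index, and all function names are hypothetical. Mary's age is taken from the data table (47).

```python
from bisect import bisect_right

# (Name, Department, Age); list position is the row index in the data file.
ROWS = [
    ("John", "INF", 35),
    ("Michelle", "AI-Lab", 29),
    ("Amy", "INF", 42),
    ("Kim", "AI-Lab", 27),
    ("Mary", "AI-Lab", 47),
]

# Sorted index on age: (age, row index), ordered by age.
age_index = sorted((age, row_id) for row_id, (_, _, age) in enumerate(ROWS))
age_keys = [age for age, _ in age_index]

# Bitmap index on department: one bit per row, leftmost bit is row 0.
_bits = {}
for row_id, (_, department, _) in enumerate(ROWS):
    _bits.setdefault(department, ["0"] * len(ROWS))[row_id] = "1"
department_bits = {dept: "".join(bits) for dept, bits in _bits.items()}


def rows_with_age_greater_than(threshold):
    """Range lookup on the sorted index: all row IDs with age > threshold."""
    start = bisect_right(age_keys, threshold)
    return {row_id for _, row_id in age_index[start:]}


def rows_in_departments(departments):
    """OR the per-department bit arrays and return the matching row IDs."""
    matched = set()
    for dept in departments:
        bits = department_bits.get(dept, "")
        matched |= {i for i, bit in enumerate(bits) if bit == "1"}
    return matched


# where age > 29 and department in (INF, AI-Lab)
selected = sorted(rows_with_age_greater_than(29)
                  & rows_in_departments(["INF", "AI-Lab"]))
names = [ROWS[row_id][0] for row_id in selected]
```

The intersection yields row IDs 0, 2 and 4, so a skippable reader only needs to fetch those three rows from the data file.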
OAP In BigSQL
DataFile:
…  Name      Department  Age  …
…  …         …           …    …
…  John      INF         35   …
…  Michelle  AI-Lab      29   …
…  Amy       INF         42   …
…  Kim       AI-Lab      27   …
…  Mary      AI-Lab      47   …
…  …         …           …    …
InMemoryCache (Load Cache):
Department  Row Index in Data File
INF         2
AI-Lab      3
Age  Row Index in Data File
35   0
29   1
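The load-on-miss behaviour of an in-memory cache like the one pictured can be sketched as follows. The slides do not state which eviction policy OAP uses, so the least-recently-used policy here is an assumption, and `FiberCacheSketch`, its `loader` callback and the hit and miss counters are hypothetical names introduced for illustration.

```python
from collections import OrderedDict
from typing import Callable, Hashable, List


class FiberCacheSketch:
    """Hypothetical in-process LRU cache keyed by a fiber identifier."""

    def __init__(self, capacity: int,
                 loader: Callable[[Hashable], List[object]]):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.loader = loader
        self.hits = 0
        self.misses = 0
        self._fibers: "OrderedDict[Hashable, List[object]]" = OrderedDict()

    def get(self, fiber_id: Hashable) -> List[object]:
        if fiber_id in self._fibers:
            self.hits += 1
            self._fibers.move_to_end(fiber_id)  # mark as most recently used
            return self._fibers[fiber_id]
        self.misses += 1
        fiber = self.loader(fiber_id)  # "Load Cache": read from the data file
        self._fibers[fiber_id] = fiber
        if len(self._fibers) > self.capacity:
            self._fibers.popitem(last=False)  # evict least recently used
        return fiber


loads = []


def load_from_data_file(fiber_id):
    """Stand-in for reading one fiber from storage; records each load."""
    loads.append(fiber_id)
    return [fiber_id]


cache = FiberCacheSketch(capacity=2, loader=load_from_data_file)
for fiber_id in ["age", "department", "age", "name", "department"]:
    cache.get(fiber_id)
```

With room for two fibers, the second read of `age` is served from memory, while loading `name` evicts `department`, which then has to be loaded again.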
BBS’s Contributions to Spark
• SPARK-4502
Spark SQL reads unnecessary nested fields from Parquet
• SPARK-18700
getCached in HiveMetastoreCatalog not thread safe cause driver OOM
• SPARK-20408
Get glob path in parallel to reduce resolve relation time
• …
Future plans
• Compatible with more data formats
• Explicit cache and cache management
• Optimize SQL operators (join, aggregate) with index
• Integrate with structured streaming
• Utilize latest hardware technology, such as Intel QAT or 3D XPoint.
• Welcome to contribute!
https://github.com/Intel-bigdata/OAP
Thank You.
daoyuan.wang@intel.com
liyuanjian@baidu.com

