Optimizing Latency-Sensitive Queries at Facebook
with Presto & Alluxio
Ke Wang (Facebook)
Bin Fan (Alluxio)
December 2020
• Overview
• Architecture and Problems
• Re-architecture and Solutions
• Performance
• Alluxio Deep-dive
2
Presto is Open Source
• 40K servers
• ~1 EB of data scanned per day
• > 80% new ETL
Presto @ Facebook Scale
4
[Diagram. Components: Planner/Optimizer, Scheduler, Workers running Drivers, Hive Metastore, HDFS. Labels: SQL, result, getPartitions, getFiles, openFiles, read/writeBlock, workload balanced.]
How Presto Works
6
• Metadata cache at various levels
  • Schemas
  • ACLs
  • Partitions info
• HDFS
  • File handle caching: avoid file open calls
  • File stripe/footer caching: avoid multiple redundant RPC calls to HDFS
  • File data caching: avoid network or HDFS latency
• Compute
  • Plan
  • Partial result
Caching
8
• One optimization technique is to cache the working dataset closer to the compute nodes.
• Fewer trips to remote storage should reduce latency and I/O.
Data Caching
9
[Diagram. Components: Planner/Optimizer, Scheduler, Workers running Drivers, Data Cache (Local SSD), Hive Metastore, HDFS, KV store. Labels: getFiles, openFile/footer cache, read/writeBlock, soft affinity, Metadata Caching, Low-overhead coordinator, file location/stats.]
Presto with Data Caching
10
• Random Node Scheduler
• Best-effort assignment of the same split to the same worker
Affinity Scheduling
11
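One way to picture affinity scheduling is as a deterministic mapping from a split to a worker, so that repeated reads of the same split land on the node whose cache may already hold it. The sketch below is an illustration under stated assumptions, not Presto's scheduler code: the helper name `preferred_worker`, the MD5 hash over a split identifier, the modulo placement, and the example split string are all hypothetical.

```python
import hashlib
from typing import Sequence


def preferred_worker(split_id: str, workers: Sequence[str]) -> str:
    """Map a split to a worker deterministically (hypothetical helper)."""
    # Use a stable hash; Python's built-in hash() of str is salted per process.
    digest = hashlib.md5(split_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(workers)
    return workers[index]


workers = ["worker-0", "worker-1", "worker-2"]
split = "hdfs://warehouse/table/part-00000.orc:0-67108864"  # made-up split id

# Unlike random scheduling, the same split always maps to the same worker.
first_choice = preferred_worker(split, workers)
second_choice = preferred_worker(split, workers)
```

Plain modulo placement remaps many splits whenever the worker list changes; a consistent-hashing scheme would reduce that churn, at the cost of more code.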
• If the preferred worker is blocked, fall back to the secondary preference, then to the least busy worker
Soft Affinity
12
• Facebook internal caching libraries
• Open source solutions
• Build our own
Various Options
13
• Naïve solution
• Copying files from remote storage to local storage
• Merging files in the local storage to keep file count low
File Merge Caching
14
File Merge Caching
15
• Segment-based data caching
• Pluggable eviction policies
• Configuration of various aspects such as sizes, resource usage, and eviction policies
• A Java-based OSS library
• Provide detailed stats on cache usage
• Caching should not become a point of failure
• Asynchronous operations
• File management at the disk level
• Flash throughput limiter to avoid endurance issues
Learnings & Alluxio Collaboration
16
• Two full days' worth of queries from the production cluster were shadowed to the test cluster.
• Query count: 17,320
• 600-node cluster
• 460 GB per node configured for data caching
• LRU eviction policy
• 1 MB block size, meaning data is read, stored, and evicted in 1 MB units
Benchmark Configuration
18
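For a sense of scale, the configuration above implies the following aggregate cache capacity and per-node block count. This is a back-of-the-envelope sketch that assumes decimal units (1 TB = 1000 GB, 1 GB = 1000 MB); the slide does not say which convention was used, and with binary units the block count would be 471,040 per node instead.

```python
NODES = 600              # cluster size from the benchmark configuration
CACHE_PER_NODE_GB = 460  # cache configured per node
BLOCK_SIZE_MB = 1        # block size

# Decimal units are an assumption; the slide does not specify them.
total_cache_tb = NODES * CACHE_PER_NODE_GB / 1000
blocks_per_node = CACHE_PER_NODE_GB * 1000 // BLOCK_SIZE_MB

print(f"aggregate cache: {total_cache_tb:.0f} TB")  # aggregate cache: 276 TB
print(f"blocks per node: {blocks_per_node:,}")      # blocks per node: 460,000
```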
Benchmark Results
Query Execution Time
19
• Data read in the master branch run: 582 TB
• Data read in the caching branch run: 251 TB
• Savings in scans: 57%
Benchmark Results
IO Savings
20
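The 57% figure follows directly from the two totals on this slide; the short check below reproduces it. The variable names are illustrative only.

```python
baseline_tb = 582  # data read in the master branch run
cached_tb = 251    # data read in the caching branch run

saved_tb = baseline_tb - cached_tb
savings = saved_tb / baseline_tb

print(f"{saved_tb} TB saved, {savings:.1%} of baseline")
# 331 TB saved, 56.9% of baseline (rounds to the 57% on the slide)
```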
Benchmark Results
Cache hit rate
21
Production
Alluxio Overview
Translate access to optimal storage APIs over a slow network
Data Orchestration for the Cloud
Interfaces: Java File API, HDFS Interface, S3 Interface, REST API, POSIX Interface
Storage drivers: HDFS Driver, Swift Driver, S3 Driver, NFS Driver
24
[Diagram. Components: Presto Server JVM, Presto Worker, Alluxio Caching File System, Alluxio Cache Manager, External File System, Local cache storage, External Storage. Labels: HDFS API Calls, On Cache Hit, On Cache Miss.]
Presto & Alluxio Local Cache
Architecture
25
• Cache files in fixed-size segments (called pages)
  • configurable, 1 MB by default
• Store pages off-heap
  • pages live on SSD rather than consuming JVM memory
• Highly concurrent and thread-safe
  • lightweight, fine-grained locking
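if cacheManager.hasPage(pageId):
    # cache hit: serve the page from local cache storage
    page = cacheManager.readPage(pageId)
else:
    # cache miss: read from external storage, then populate the cache
    readFromExternalFS(page, offset, len)
    cacheManager.writePage(pageId, page)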
Implementation & Optimization
26
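The read path on this slide can be made concrete with a small runnable model. It is a sketch under stated assumptions, not Alluxio's code: `PageCache`, its `read` method, the `(path, page index)` key layout, and the in-memory dictionary standing in for SSD storage are all hypothetical. It shows how a byte range is split on page boundaries and how a miss fetches a whole page so that later reads can hit.

```python
from typing import Callable, Dict, Tuple

PAGE_SIZE = 1 << 20  # 1 MB, the default page size stated on the slide

PageId = Tuple[str, int]  # (file path, page index): an assumed key layout


class PageCache:
    """Toy in-memory stand-in for a cache manager (hypothetical API)."""

    def __init__(self) -> None:
        self._pages: Dict[PageId, bytes] = {}
        self.hits = 0
        self.misses = 0

    def read(self, path: str, offset: int, length: int,
             read_external: Callable[[str, int, int], bytes]) -> bytes:
        """Serve [offset, offset + length) page by page, filling on miss."""
        out = bytearray()
        pos, end = offset, offset + length
        while pos < end:
            index = pos // PAGE_SIZE
            page = self._pages.get((path, index))
            if page is None:
                self.misses += 1
                # Fetch the whole page so later reads of it can hit.
                page = read_external(path, index * PAGE_SIZE, PAGE_SIZE)
                self._pages[(path, index)] = page
            else:
                self.hits += 1
            start = pos - index * PAGE_SIZE
            take = min(end - pos, len(page) - start)
            if take <= 0:
                break  # reached the end of the file
            out += page[start:start + take]
            pos += take
        return bytes(out)


# A fake 2 MB (+4 byte) "remote file" and a reader over it.
data = bytes(range(256)) * 8192 + b"tail"


def fake_remote(path: str, offset: int, length: int) -> bytes:
    return data[offset:offset + length]


cache = PageCache()
# A 4-byte read that straddles the first page boundary touches two pages.
chunk = cache.read("f", PAGE_SIZE - 2, 4, fake_remote)
```

Reading whole pages on a miss trades some extra remote bytes for a simpler index and a higher chance of later hits, which is the usual motivation for fixed-size pages.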
• Pluggable cache replacement policies:
  • LRU, LFU
• Pluggable cache storage options:
  • Local file system store: each page -> one file
  • RocksDB store: each page -> one value keyed by pageId
• Async cache writes
  • queue writes in the background to handle bursty cache write ops
• Failure recovery
  • disks are expected to fail when running at Facebook scale
Implementation & Optimization
27
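As an illustration of one of the replacement policies named above, here is a minimal LRU page store. It is a sketch, not the Alluxio implementation: the class name `LRUPageStore`, its `get`/`put` methods, and the use of an in-memory `OrderedDict` with a capacity counted in pages are assumptions made for the example.

```python
from collections import OrderedDict
from typing import Hashable, Optional


class LRUPageStore:
    """Bounded page store with LRU eviction (illustrative, in memory)."""

    def __init__(self, capacity_pages: int) -> None:
        if capacity_pages <= 0:
            raise ValueError("capacity_pages must be positive")
        self._capacity = capacity_pages
        self._pages: "OrderedDict[Hashable, bytes]" = OrderedDict()

    def get(self, page_id: Hashable) -> Optional[bytes]:
        page = self._pages.get(page_id)
        if page is not None:
            self._pages.move_to_end(page_id)  # mark as most recently used
        return page

    def put(self, page_id: Hashable, page: bytes) -> None:
        self._pages[page_id] = page
        self._pages.move_to_end(page_id)
        while len(self._pages) > self._capacity:
            self._pages.popitem(last=False)  # evict least recently used


store = LRUPageStore(capacity_pages=2)
store.put("a", b"A")
store.put("b", b"B")
store.get("a")        # touch "a", so "b" becomes the eviction candidate
store.put("c", b"C")  # exceeds capacity and evicts "b"
```

Swapping in a different policy such as LFU would only change how the eviction candidate is chosen, which is what makes a pluggable policy interface practical.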
• (WIP) Support Schema/Table/Partition level Cache Quota
• (WIP) Performance optimizations for small files
• (Future work) Semantics-aware caching
Ongoing Development
28
• Edit etc/catalog/hive.properties
• More details in the blog
cache.enabled=true
cache.type=ALLUXIO
cache.base-directory=/tmp/alluxio-cache
cache.alluxio.max-cache-size=500GB
hive.node-selection-strategy=SOFT_AFFINITY
Enable Alluxio Local Cache w/ Presto
29
https://prestodb.io/blog/2020/06/16/alluxio-datacaching
• Fine-grained control on the working set
  • e.g., free / pin data in cache, set data TTL in cache
• Metadata caching and syncing
  • automatically sync data between the Alluxio cache and persisted data
• Data transformation services
  • e.g., convert CSV files into Parquet format in cache
• Data migration services
  • e.g., migrate from HDFS to S3 based on an access-time policy
• Familiar filesystem CLIs
  • e.g., alluxio fs ls /my/path
Alluxio File System Enhancements
30
Alluxio Doc: https://docs.alluxio.io/os/user/stable/en/Overview.html
• Social media: Twitter.com/alluxio, Linkedin.com/alluxio
• Alluxio website: www.alluxio.io
• Alluxio Slack: http://slackin.alluxio.io/, https://alluxio.io/slack
• Presto website: www.prestodb.io
• Presto Slack: https://prestodb.slack.com
• Blog: https://prestodb.io/blog/2020/06/16/alluxio-datacaching
A recording of this talk will be available soon
Q & A
