SkyhookDM: Towards an
Arrow-Native Storage System
Jayjeet Chakraborty, Ivo Jimenez, Sebastiaan Rodriguez Alvarez,
Alexandru Uta, Jeff LeFevre, Carlos Maltzahn
Agenda
● Broader Problem and Solution
● Our approach
● Background
● Design and Architecture
● Evaluations
The Broader Problem
● With high-speed storage devices like NVMe SSDs serving up to 3GB/s and
networks supporting 25-100Gb/s, the CPU is the new bottleneck
● Client-side computation on data and reading from efficient storage formats like
Parquet and ORC exhausts the clients' CPUs
● Leads to severely hampered scalability, latency, and throughput
Computational Storage as a Solution
● Offload as much compute as possible to the storage layer
● Use the idle CPUs of the storage nodes to perform data filtering, decoding,
decompression
● Accelerated queries due to reduced data movement and increased scalability
How is our approach (Skyhook) different?
● Most computational storage systems require hardware support like SSDs
embedded with FPGAs for offloading compute
○ No extra hardware, use the storage nodes' CPUs, start offloading instantly
○ Does it work? What are the challenges?
● Enable compute offloading in existing programmable storage systems without
code changes
○ Easily test out computational storage with your existing storage infrastructure
● Embed data access libraries directly into the storage system
○ Offload metadata management to the storage layer
Ceph
● Provides 3 types of storage
interface: File, Object, Block
● No single point of failure. Uses
CRUSH maps that contain the
object-to-OSD mapping
● Extensible object storage layer via
the Ceph Object Classes SDK
Object Class Mechanism
● Utilizing Ceph’s object class mechanism
(“cls”)
○ Plugin-based object storage extension
mechanism
○ Helps extend the Object store I/O path
with User-defined functions
● Used by several Ceph internals
○ CephFS, RGW (Rados Gateway), RBD
(Rados Block Device)
[Figure: Object, Object Class, Method, Object Store, Redundancy Layer, K/V Store, Chunk Store]
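To make the mechanism concrete, here is a toy, in-memory model in Python. It is not Ceph's SDK: `ToyObjectStore`, `register`, and `exec` are hypothetical names chosen for illustration. The sketch only shows the general idea of installing a user-defined method under a class name and running it next to the object's data, so that the caller receives the result rather than the whole object.

```python
from typing import Callable, Dict, Tuple

# Hypothetical toy model of an extensible object store; not the Ceph API.
Method = Callable[[bytes, bytes], bytes]


class ToyObjectStore:
    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}
        self._methods: Dict[Tuple[str, str], Method] = {}

    def write(self, oid: str, data: bytes) -> None:
        self._objects[oid] = data

    def register(self, cls: str, name: str, fn: Method) -> None:
        """Install a user-defined method under (class, method) names."""
        self._methods[(cls, name)] = fn

    def exec(self, oid: str, cls: str, name: str, arg: bytes = b"") -> bytes:
        """Run a registered method against the stored object and return its output."""
        return self._methods[(cls, name)](self._objects[oid], arg)


def count_lines_with(data: bytes, needle: bytes) -> bytes:
    """Example method: count the lines that contain `needle`."""
    hits = sum(1 for line in data.splitlines() if needle in line)
    return str(hits).encode()


store = ToyObjectStore()
store.register("demo", "count_lines_with", count_lines_with)
store.write("obj.0", b"a,1\nb,2\na,3\n")
result = store.exec("obj.0", "demo", "count_lines_with", b"a")
print(result)  # b'2'
```

The caller names the object, the class, and the method; where the object lives and how the method reads it stay inside the store.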
Object Class Mechanism
[Figures: Object classes in Ceph; Growth of object classes in Ceph]
Apache Arrow
● Language-independent columnar memory format for flat and hierarchical
data, organised for efficient analytic operations on modern hardware
● Share data between processes without serialization overhead
● Rich collection of pluggable components for building data processing systems
Design Paradigm
● Extend client and storage layers of
programmable storage systems
with data access libraries
● Embed an FS shim inside storage
nodes to provide a file-like view over
objects
● Make object class extensions
directly available to the clients
without having to change the FS
File Layout
● Max object size allowed by Ceph is 128MB
○ In this work, we use 64MB files, as we found this size to provide the best performance
● Files larger than 64MB are split into ~64MB files (partitions)
○ Each partition goes into a single RADOS object
● This 1:1 mapping between files and objects facilitates the direct translation from
filenames to Object IDs
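The size arithmetic behind this layout can be sketched as follows. This is an illustration only: `plan_partitions` and the `<filename>.<index>` naming are assumptions made for the example, not the scheme Skyhook uses, and splitting a real Parquet file means rewriting it into smaller valid files rather than cutting it at byte offsets.

```python
from typing import List, Tuple

PARTITION_BYTES = 64 * 1024 * 1024  # ~64MB target from the slide


def plan_partitions(filename: str, size_bytes: int,
                    target: int = PARTITION_BYTES) -> List[Tuple[str, int]]:
    """Return (partition_name, partition_size) pairs covering the file.

    The "<filename>.<index>" naming is a made-up convention for this sketch.
    """
    if size_bytes <= target:
        return [(filename, size_bytes)]
    count = -(-size_bytes // target)  # ceiling division
    base, extra = divmod(size_bytes, count)
    # Spread the remainder so partition sizes differ by at most one byte.
    return [(f"{filename}.{i}", base + (1 if i < extra else 0))
            for i in range(count)]


plan = plan_partitions("events.parquet", 200 * 1024 * 1024)
for name, size in plan:
    print(name, size)
```

Because each partition is its own file and each file is stored as exactly one object, every partition stays below the 128MB object limit and can be addressed on its own.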
Architecture
Using Skyhook from Arrow
● Skyhook supports the file
formats that Arrow supports
out of the box
○ Parquet, CSV, JSON,
Feather
Evaluations
Query Latency
● Skyhook scales with cluster size but plain (client-side) Parquet does not
● When Skyhook cannot benefit from scale-out, serialization overhead dominates
● With 100% selectivity, Skyhook performs an unnecessary packing/unpacking of
Parquet files inside the storage nodes
○ We can simply detect 100% selectivity queries and avoid offloading to Skyhook
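The slides do not say how such queries would be detected. One simple possibility, sketched here with a hypothetical `should_offload` helper, is a static check: a scan with no row filter that reads every column cannot shrink the data at the storage node, so it is served directly.

```python
from typing import Optional, Sequence


def should_offload(row_filter: Optional[str],
                   columns: Optional[Sequence[str]],
                   schema: Sequence[str]) -> bool:
    """Offload only when the scan can drop rows or columns at the storage node.

    `row_filter` is an opaque filter expression (None means "no filter") and
    `columns` is the projection (None means "all columns"). Both parameter
    conventions are assumptions of this sketch.
    """
    has_filter = row_filter is not None
    projects_subset = columns is not None and set(columns) < set(schema)
    return has_filter or projects_subset


schema = ["id", "price", "qty"]
print(should_offload(None, None, schema))          # False: full scan
print(should_offload("price > 10", None, schema))  # True: rows may be dropped
print(should_offload(None, ["id"], schema))        # True: columns are dropped
```

The check is conservative: a filter that happens to match every row would still be offloaded, since that cannot be known without statistics or a scan.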
CPU usage
● Skyhook gives queries access to more CPU resources, improving scalability and
performance
● Decompression of LZ4-compressed batches uses some CPU on the client side in Skyhook
● The simplicity of offloading to the storage nodes' CPUs comes at the cost of higher total CPU usage
[Figure legend: client is the thick blue line; storage nodes are the other lines]
Network Traffic
● Skyhook avoids wasting
bandwidth, making room for
other applications
● Queries with 10% and 1% row
selectivity are much faster with Skyhook
than without it
● At 100% selectivity, Skyhook uses
slightly more bandwidth than the
baseline because it transfers data
in the slightly larger
LZ4-compressed Arrow IPC format
Crash Recovery
● Skyhook queries are fault tolerant due
to the integration of compute with
storage
● Object method calls are sent to object
names regardless of their physical
location
● When a server crashes, method calls are
automatically restarted on another
server with a redundant copy of that
object
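The idea of addressing calls by object name can be illustrated with a small simulation. In Ceph this is handled by the storage system itself; the `call_on_object` helper, the `replicas` table, and the server names below are all hypothetical and exist only to show why a crash does not break a name-addressed call.

```python
from typing import Callable, Dict, List, Set


class ServerDown(Exception):
    pass


def call_on_object(oid: str,
                   method: Callable[[str, str], str],
                   replicas: Dict[str, List[str]],
                   down: Set[str]) -> str:
    """Address the call by object name; try each replica holder in order."""
    for server in replicas[oid]:
        if server in down:
            continue  # this holder crashed, fall through to the next copy
        return method(server, oid)
    raise ServerDown(f"no live replica for {oid}")


def scan(server: str, oid: str) -> str:
    return f"scanned {oid} on {server}"


replicas = {"obj.0": ["osd.1", "osd.4", "osd.7"]}
before = call_on_object("obj.0", scan, replicas, down=set())
after = call_on_object("obj.0", scan, replicas, down={"osd.1"})
print(before)  # scanned obj.0 on osd.1
print(after)   # scanned obj.0 on osd.4
```

The caller never names a server, so the same call succeeds before and after the crash as long as one copy of the object is reachable.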
Skyhook upstreamed in Apache Arrow!
Future Work
● Support RDMA to transfer result Arrow Record Batches from storage to the client
○ This is to avoid the memory-to-wire serialization overhead
● Move to embedding the Arrow Streaming Compute Engine instead of the Arrow
Dataset API to support offloading more complex compute operations
○ Requires a streaming interface in the Ceph Object Class SDK
● Use Gandiva to accelerate Arrow query processing inside the storage layer
○ Leverage SIMD processing capabilities of modern processors
Thank You
jayjeetc@ucsc.edu
https://iris-hep.org/projects/skyhookdm.html

Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022