ORC File – Optimizing Your Big Data
Owen O’Malley, Co-founder, Hortonworks
Apache Hadoop, Hive, ORC, and Incubator
@owen_omalley
Overview
In the Beginning…
• Hadoop applications used text or SequenceFile
  – Text is slow and is not splittable when compressed
  – SequenceFile supports only key/value pairs with user-defined serialization
• Hive added RCFile
  – The user controls which columns to read and decompress
  – No type information and user-defined serialization
  – Finding splits was expensive
• Avro files were created
  – Type information included!
  – Readers had to read and decompress the entire row
ORC File Basics
• Columnar format
  – Enables users to read & decompress just the bytes they need
• Fast
  – See https://www.slideshare.net/HadoopSummit/file-format-benchmark-avro-json-orc-parquet
• Indexed
• Self-describing
  – Includes all of the information about types and encodings
• Rich type system (see the sketch below)
  – All of Hive’s types, including timestamp, struct, map, list, and union
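To illustrate the type system, here is a minimal Java sketch, assuming the org.apache.orc artifact described later in the deck; the type string syntax is the same one ORC itself prints:

    import org.apache.orc.TypeDescription;

    public class SchemaExample {
      public static void main(String[] args) {
        // A struct mixing primitive and nested types - all first-class in ORC.
        TypeDescription schema = TypeDescription.fromString(
            "struct<name:string,created:timestamp," +
            "tags:array<string>,attrs:map<string,string>>");
        System.out.println(schema); // prints the canonical type string
      }
    }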
File Compatibility
• Backwards compatibility
  – Readers automatically detect the version of the file and read it.
• Forward compatibility
  – Most changes are made so that old readers can read the new files
  – The ability to write old file versions is kept via orc.write.format (see the sketch after this list)
  – Keep writing the old version until your last cluster upgrades
• Current file versions
  – 0.11 – Original version
  – 0.12 – Updated run length encoding (RLE)
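As a sketch of the last point, a writer can be pinned to the 0.11 format through the configuration until every cluster can read 0.12 files (the helper name here is illustrative):

    import org.apache.hadoop.conf.Configuration;

    public class PinFormat {
      public static Configuration oldFormatConf() {
        Configuration conf = new Configuration();
        // Keep writing the original file version until all clusters upgrade.
        conf.set("orc.write.format", "0.11");
        return conf; // pass to OrcFile.writerOptions(conf)
      }
    }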
File Structure
• A file contains a list of stripes, which are sets of rows
  – Default size is 64 MB
  – A large stripe size enables efficient reads
• Footer (see the reader sketch below)
  – Contains the list of stripe locations
  – Type description
  – File and stripe statistics
• Postscript
  – Compression parameters
  – File format version
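The footer and postscript contents are easy to inspect from Java; a minimal sketch assuming the org.apache.orc 1.4 reader API, with the file path passed as the first argument:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.orc.OrcFile;
    import org.apache.orc.Reader;
    import org.apache.orc.StripeInformation;

    public class FooterInfo {
      public static void main(String[] args) throws Exception {
        Reader reader = OrcFile.createReader(new Path(args[0]),
            OrcFile.readerOptions(new Configuration()));
        System.out.println("schema: " + reader.getSchema());
        System.out.println("rows: " + reader.getNumberOfRows());
        System.out.println("compression: " + reader.getCompressionKind());
        for (StripeInformation stripe : reader.getStripes()) {
          System.out.println("stripe at " + stripe.getOffset() + " with "
              + stripe.getNumberOfRows() + " rows");
        }
      }
    }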
Stripe Structure
• Indexes
  – Offsets to jump to the start of each row group
  – Row group size defaults to 10,000 rows
  – Minimum, maximum, and count of each column
• Data
  – Data for the stripe, organized by column
• Footer
  – List of stream locations
  – Column encoding information
File Layout
[Diagram: file layout – three ~64 MB stripes, each holding index data, row data for columns 1–8, and a stripe footer, followed by the file metadata, file footer, and postscript]
Schema Evolution
• ORC now supports schema evolution
  – Hive 2.1 – append columns or convert types
  – Upcoming Hive 2.3 – map columns or inner structures by name
  – The user passes the desired schema to the ORC reader (see the sketch below)
• Type conversions
  – Most types will convert, although some conversions are ugly.
  – If a value doesn’t fit in the new type, it becomes null.
• Cautions
  – Name mapping requires ORC files written by Hive ≥ 2.0
  – Some of the type conversions are slow
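A sketch of passing a desired schema to the reader, assuming a file originally written as struct<name:string,address:string> and read with a hypothetical appended age column; for old files the appended column comes back as nulls:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
    import org.apache.orc.OrcFile;
    import org.apache.orc.Reader;
    import org.apache.orc.RecordReader;
    import org.apache.orc.TypeDescription;

    public class EvolvedRead {
      public static void main(String[] args) throws Exception {
        Reader reader = OrcFile.createReader(new Path(args[0]),
            OrcFile.readerOptions(new Configuration()));
        // The desired (evolved) schema; the appended age column is filled
        // with nulls when the file doesn't contain it.
        TypeDescription readSchema = TypeDescription.fromString(
            "struct<name:string,address:string,age:int>");
        RecordReader rows = reader.rows(reader.options().schema(readSchema));
        VectorizedRowBatch batch = readSchema.createRowBatch();
        while (rows.nextBatch(batch)) {
          System.out.println("read " + batch.size + " rows");
        }
        rows.close();
      }
    }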
Using ORC
From Hive or Presto
• Modify your table definition:
  – create table my_table (
        name string,
        address string
      ) stored as orc;
• Import data:
  – insert overwrite table my_table select * from my_staging;
• Use either configuration or table properties
  – tblproperties ("orc.compress"="NONE")
  – set hive.exec.orc.default.compress=NONE;
From Java
• Use the ORC project rather than Hive’s ORC.
  – Hive’s master branch uses it.
  – Maven group id: org.apache.orc, version: 1.4.0
  – The nohive classifier avoids interfering with Hive’s packages
• Two levels of access (see the writer sketch after this list)
  – orc-core – Faster access, but uses Hive’s vectorized API
  – orc-mapreduce – Row-by-row access through the simpler OrcStruct API
• The MapReduce API implements WritableComparable
  – Can be shuffled
  – Type information must be specified in the configuration for shuffle or output
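A minimal orc-core writer sketch following the pattern in the ORC documentation; the file name and values are illustrative, and with the nohive classifier the vector classes live under org.apache.orc.storage instead of org.apache.hadoop.hive:

    import java.nio.charset.StandardCharsets;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
    import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
    import org.apache.orc.OrcFile;
    import org.apache.orc.TypeDescription;
    import org.apache.orc.Writer;

    public class WriteExample {
      public static void main(String[] args) throws Exception {
        TypeDescription schema =
            TypeDescription.fromString("struct<name:string,address:string>");
        Writer writer = OrcFile.createWriter(new Path("my_table.orc"),
            OrcFile.writerOptions(new Configuration()).setSchema(schema));
        VectorizedRowBatch batch = schema.createRowBatch();
        BytesColumnVector name = (BytesColumnVector) batch.cols[0];
        BytesColumnVector address = (BytesColumnVector) batch.cols[1];
        int row = batch.size++; // claim the next row slot in the batch
        name.setVal(row, "Alice".getBytes(StandardCharsets.UTF_8));
        address.setVal(row, "123 Main St".getBytes(StandardCharsets.UTF_8));
        writer.addRowBatch(batch);
        writer.close();
      }
    }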
From C++
• Pure C++ client library
  – No JNI or JDK, so the client can estimate and control memory
• Combine it with the pure C++ HDFS client from HDFS-8707
  – Work is ongoing in a feature branch, but should be committed soon.
• The reader is stable and in production use.
• Alibaba has created a writer and is contributing it to Apache ORC.
  – It should be in the next release, ORC 1.5.0.
Command Line
• Using hive --orcfiledump from Hive
  – -j -p – pretty-prints the metadata as JSON
  – -d – prints the data as JSON
• Using java -jar orc-tools-1.4.0-uber.jar from ORC
  – meta – prints the metadata as JSON
  – data – prints the data as JSON
  – convert – converts JSON to ORC
  – json-schema – scans a set of JSON documents to find the matching schema
Optimization
Stripe Size
• Makes a huge difference in performance
  – orc.stripe.size or hive.exec.orc.default.stripe.size
  – Controls the amount of buffer in the writer; the default is 64 MB
  – Trade-off (see the sketch below):
    • Large stripes = larger, more efficient reads
    • Small stripes = less memory and more granular processing splits
• Multiple files written at the same time will shrink stripes
  – Use Hive’s hive.optimize.sort.dynamic.partition
  – Sorting dynamic partitions means one writer at a time
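A sketch of setting the stripe size per writer instead of globally; the 128 MB value is only an example, and the path and schema are assumed to come from the caller:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.orc.OrcFile;
    import org.apache.orc.TypeDescription;
    import org.apache.orc.Writer;

    public class StripeSizeExample {
      public static Writer bigStripeWriter(Configuration conf, Path path,
          TypeDescription schema) throws IOException {
        // Larger stripes buffer more memory in the writer but give bigger,
        // more efficient reads; 64 MB is the default.
        return OrcFile.createWriter(path, OrcFile.writerOptions(conf)
            .setSchema(schema)
            .stripeSize(128L * 1024 * 1024));
      }
    }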
HDFS Block Padding
• Stripes don’t align exactly with HDFS blocks
• HDFS scatters blocks around the cluster
• You often want to pad to block boundaries (see the sketch after the diagram)
  – Costs space, but improves performance
  – hive.exec.orc.default.block.padding – true
  – hive.exec.orc.block.padding.tolerance – 0.05
[Diagram: three ~64 MB stripes, each with index data, row data, and a stripe footer, laid out across two HDFS blocks; padding fills the tail of each block so stripes don’t straddle block boundaries, with the file metadata, file footer, and postscript at the end]
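The same padding knobs are exposed on the Java writer options; a sketch under the assumption that blockPadding and paddingTolerance on org.apache.orc.OrcFile.WriterOptions map to the two Hive settings above:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.orc.OrcFile;
    import org.apache.orc.TypeDescription;

    public class PaddingExample {
      public static OrcFile.WriterOptions paddedOptions(Configuration conf,
          TypeDescription schema) {
        return OrcFile.writerOptions(conf)
            .setSchema(schema)
            .blockPadding(true)      // pad stripes out to HDFS block boundaries
            .paddingTolerance(0.05); // give up at most ~5% of space to padding
      }
    }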
Predicate Push Down
• The reader is given a SearchArg
  – A limited set of predicates over a column and a literal value
  – The reader will skip over any parts of the file that can’t contain valid rows
• ORC indexes at three levels:
  – File
  – Stripe
  – Row group (10k rows)
• The reader still needs to apply the predicate to filter out individual rows (see the sketch below)
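A sketch of building a SearchArg in Java, assuming the sarg classes from the storage-api module that ORC 1.4 depends on; the column and literal are borrowed from the row pruning example later in the deck:

    import java.io.IOException;
    import org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf;
    import org.apache.hadoop.hive.ql.io.sarg.SearchArgument;
    import org.apache.hadoop.hive.ql.io.sarg.SearchArgumentFactory;
    import org.apache.orc.Reader;
    import org.apache.orc.RecordReader;

    public class PushDownExample {
      public static RecordReader prunedRows(Reader reader) throws IOException {
        // Any file section whose min/max (or bloom filter) rules out this
        // key is skipped entirely.
        SearchArgument sarg = SearchArgumentFactory.newBuilder()
            .startAnd()
            .equals("l_orderkey", PredicateLeaf.Type.LONG, 1212000001L)
            .end()
            .build();
        return reader.rows(reader.options()
            .searchArgument(sarg, new String[]{"l_orderkey"}));
      }
    }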
Row Pruning
• Every primitive column has a minimum and maximum at each level
  – Sorting your data within a file helps a lot
  – Consider sorting instead of making lots of partitions
• The writer can optionally include bloom filters (see the sketch after this list)
  – Provides a probabilistic bitmap of hash codes
  – Only works with equality predicates, at the row group level
  – Requires significant space in the file
  – Manually enabled using orc.bloom.filter.columns
  – Use orc.bloom.filter.fpp to set the false positive rate (default 0.05)
  – Set the default charset in the JVM via -Dfile.encoding=UTF-8
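A sketch of enabling a bloom filter from the Java writer options; the same keys can be set as Hive table properties, and l_orderkey is just the example column:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.orc.OrcFile;
    import org.apache.orc.TypeDescription;
    import org.apache.orc.Writer;

    public class BloomFilterExample {
      public static Writer writerWithBloom(Configuration conf, Path path,
          TypeDescription schema) throws IOException {
        return OrcFile.createWriter(path, OrcFile.writerOptions(conf)
            .setSchema(schema)
            .bloomFilterColumns("l_orderkey") // comma-separated column list
            .bloomFilterFpp(0.05));           // false positive rate
      }
    }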
Row Pruning Example
• TPC-H
  – from tpch1000.lineitem where l_orderkey = 1212000001;
• Rows read
  – Nothing – 5,999,989,709
  – Min/Max – 540,000
  – Bloom filter – 10,000
• Time taken
  – Nothing – 74 sec
  – Min/Max – 4.5 sec
  – Bloom filter – 1.3 sec
Split Calculation
• Hive’s OrcInputFormat has three strategies for split calculation (see the sketch below)
  – BI
    • Small, fast queries
    • Splits based on HDFS blocks
  – ETL
    • Large queries
    • Reads the file footer and applies the SearchArg to stripes
    • Can include the footer in splits (hive.orc.splits.include.file.footer)
  – Hybrid
    • If there are small files or lots of files, use BI; otherwise use ETL
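A sketch of forcing a strategy from the client configuration; hive.exec.orc.split.strategy selects among HYBRID (the default), BI, and ETL:

    import org.apache.hadoop.conf.Configuration;

    public class SplitStrategy {
      public static Configuration etlSplits(Configuration conf) {
        // Force footer-based split calculation for large scans.
        conf.set("hive.exec.orc.split.strategy", "ETL");
        return conf;
      }
    }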
LLAP – Live Long & Process
• Provides a persistent service to speed up Hive
  – Caches ORC and text data
  – Saves the cost of YARN container & JVM spin-up
  – The JIT finishes after the first few seconds
• The cache uses ORC’s RLE
  – zlib- and Snappy-compressed data is decompressed before caching
  – RLE is fast and saves memory
  – Automatically caches hot columns and partitions
• Allows Spark to use Hive’s column and row security
Current Work In Progress
Speed Improvements for ACID
• Hive supports ACID transactions on ORC tables
  – Uses delta files in HDFS to store changes to each partition
  – Delta files store insert/update/delete operations
  – Used to support SQL insert commands
• Unfortunately, update operations don’t allow predicate push down on the deltas
• The upcoming Hive 2.3 adds a new ACID layout
  – It changes an update into an insert and a delete
  – This allows predicate push down even on the delta files
• A SQL merge command was also added in Hive 2.2
Column Encryption (ORC-14)
• Allows users to encrypt some of the columns of the file
  – Provides column-level security even with access to the raw files
  – Uses the Key Management Server from Ranger or Hadoop
  – Encrypts both the data and the index
  – Daily key rolling can anonymize data after 90 days
• The user specifies how data is masked for readers without access
  – Nullify
  – Redact
  – SHA256
Thank You
@owen_omalley
owen@hortonworks.com