Apache Hive on ACID
Alan Gates
Hive PMC Member
Co-founder Hortonworks
May 2016
2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
History
• Hive only updated partitions
  – INSERT...OVERWRITE rewrote an entire partition
  – Forced daily or even hourly partitions
  – Could add files to partition directory, but file compaction was manual
• What about concurrent readers?
  – OK for inserts, but overwrite caused races
  – There is a ZooKeeper lock manager, but…
• No way to delete or update rows
• No INSERT INTO T VALUES…
  – Breaks some tools
Why Do You Need ACID?
• Hadoop and Hive have always…
  – Just said no to ACID
  – Perceived as a tradeoff for performance
• But your data isn’t static
  – It changes daily, hourly, or faster
  – Sometimes it needs to be restated (late-arriving data) or facts change (e.g. a user’s physical address)
  – Loading data into Hive every hour is so 2010; data should be available in Hive as soon as it arrives
• We saw users implementing ad hoc solutions
  – This is a lot of work and hard to get right
  – Hive should support this as a first-class feature
When Should You Use Hive’s ACID?
• NOT OLTP!!!
• Updating a Dimension Table
  – Changing a customer’s address
• Delete Old Records
  – Remove records for compliance
• Update/Restate Large Fact Tables
  – Fix problems after they are in the warehouse
• Streaming Data Ingest
  – A continual stream of data coming in
  – Typically from Flume or Storm
• NOT OLTP!!!
SQL Changes for ACID
• Since Hive 0.14
• New DML
  – INSERT INTO T VALUES(1, 'fred', ...);
  – UPDATE T SET x = 5[, ...] [WHERE ...]
  – DELETE FROM T [WHERE ...]
  – Supports partitioned and non-partitioned tables; the WHERE clause can specify a partition but is not required to
• Restrictions
  – Table must have a format that extends AcidInputFormat
    • currently ORC
    • work started on Parquet (HIVE-8123)
  – Table must be bucketed and not sorted
    • can use 1 bucket, but this will restrict write parallelism
  – Table must be marked transactional
    • create table T(...) clustered by (a) into 2 buckets stored as orc TBLPROPERTIES ('transactional'='true');
    • Existing ORC tables that are bucketed can be marked transactional via ALTER
Ingesting Data Into Hive From a Stream
• Data is flowing in from generators in a stream
• Without streaming ingest, you have to add it to Hive in batches, often every hour
  – Thus your users have to wait an hour before they can see their data
• New interface in hive.hcatalog.streaming lets applications write small batches of records and commit them
  – Users can now see data within a few seconds of it arriving from the data generators
• Available for Apache Flume and Apache Storm
Design
• HDFS does not allow arbitrary writes
  – Store changes as delta files
  – Stitched together by client on read
• Writes get a transaction ID
  – Sequentially assigned by metastore
• Reads get highest committed transaction & list of open/aborted transactions
  – Provides snapshot consistency
  – No exclusive locks required
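The read rule above can be sketched in a few lines. This is a minimal illustration, not Hive's implementation: the `Snapshot` class, its field names, and the sample transaction ids are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """What a reader is handed at query start (hypothetical names)."""
    high_water_mark: int  # highest committed transaction id
    excluded: frozenset = field(default_factory=frozenset)  # open or aborted ids

    def is_visible(self, txn_id: int) -> bool:
        # A write is visible only if it is at or below the high-water mark
        # and was neither open nor aborted when the snapshot was taken.
        return txn_id <= self.high_water_mark and txn_id not in self.excluded


# Transactions 5 and 6 are still open (or aborted); 7 has committed.
snapshot = Snapshot(high_water_mark=7, excluded=frozenset({5, 6}))
visible = [t for t in range(1, 10) if snapshot.is_visible(t)]
print(visible)  # [1, 2, 3, 4, 7]
```

Because the snapshot is fixed when the query starts, later commits do not change what the reader sees, which is why no exclusive lock is needed for reads.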
Why Not HBase
• Good
  – Handles compactions for us
  – Already has a similar data model with LSM
• Bad
  – When we started this there were no transaction managers for HBase, and this work requires transactions
  – HFile is column-family based rather than columnar
  – HBase is focused on point lookups and range scans
    • Warehousing requires full scans
Stitching Buckets Together
HDFS Layout
• Partition locations remain unchanged
  – Still warehouse/$db/$tbl/$part
• Bucket Files Structured By Transactions
  – Base files $part/base_$tid/bucket_*
  – Delta files $part/delta_$tid_$tid/bucket_*
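To make the naming scheme concrete, here is a rough sketch of how a reader might pick directories inside one partition. It assumes the simplest possible rule (newest base plus every delta that starts after it) and ignores snapshot validity and overlapping compacted deltas; `select_dirs` and the zero-padded directory names are hypothetical.

```python
import re

BASE_RE = re.compile(r"^base_(\d+)$")
DELTA_RE = re.compile(r"^delta_(\d+)_(\d+)$")


def select_dirs(dir_names):
    """Pick the newest base and the deltas written after it (simplified)."""
    bases, deltas = [], []
    for name in dir_names:
        if m := BASE_RE.match(name):
            bases.append((int(m.group(1)), name))
        elif m := DELTA_RE.match(name):
            deltas.append((int(m.group(1)), int(m.group(2)), name))
    best_base = max(bases, default=None)
    base_tid = best_base[0] if best_base else 0
    # Keep only deltas that start after the chosen base, in transaction order.
    newer = sorted(d for d in deltas if d[0] > base_tid)
    return (best_base[1] if best_base else None), [d[2] for d in newer]


base, deltas = select_dirs([
    "base_0000010", "delta_0000011_0000011", "delta_0000005_0000008",
    "delta_0000012_0000015", "base_0000004",
])
print(base)    # base_0000010
print(deltas)  # ['delta_0000011_0000011', 'delta_0000012_0000015']
```

The older base and the delta covering transactions 5 to 8 are skipped because their contents are already folded into `base_0000010`.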
Input and Output Formats
• Created new AcidInput/OutputFormat
  – Unique key is original transaction id, bucket, row id
• Reader returns correct version of row based on transaction state
• Also added raw API for compactor
  – Provides previous events as well
• ORC implements new API
  – Extends records with change metadata
    • Add operation (d, u, i), latest transaction id, and key
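One way to picture how the reader returns the correct version of a row: treat base and delta contents as a stream of change events and keep, per key, the newest event the reader is allowed to see. The sketch below is an illustration only; the `Event` type, `read_rows`, and the sample data are assumptions, and the real ORC reader does a sorted merge rather than building a dictionary.

```python
from typing import NamedTuple, Optional


class Event(NamedTuple):
    op: str               # 'i' insert, 'u' update, 'd' delete
    current_txn: int      # transaction that wrote this event
    key: tuple            # (original_txn, bucket, row_id)
    row: Optional[dict]   # payload; None for deletes


def read_rows(events, is_visible):
    """Return the latest visible version of each row, dropping deleted rows."""
    latest = {}
    for ev in events:
        if not is_visible(ev.current_txn):
            continue  # written by an open, aborted, or too-new transaction
        seen = latest.get(ev.key)
        if seen is None or ev.current_txn > seen.current_txn:
            latest[ev.key] = ev
    return {k: ev.row for k, ev in sorted(latest.items()) if ev.op != "d"}


events = [
    Event("i", 1, (1, 0, 0), {"name": "fred"}),
    Event("i", 1, (1, 0, 1), {"name": "wilma"}),
    Event("u", 3, (1, 0, 0), {"name": "frederick"}),
    Event("d", 4, (1, 0, 1), None),
    Event("u", 9, (1, 0, 0), {"name": "not yet committed"}),
]
rows = read_rows(events, is_visible=lambda txn: txn <= 4)
print(rows)  # {(1, 0, 0): {'name': 'frederick'}}
```

The compactor's raw API differs in that it would keep the earlier events too, rather than collapsing to one version per key.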
Transaction Manager
• Existing lock managers
  – In memory - not durable
  – ZooKeeper - requires additional components to install, administer, etc.
• Locks need to be integrated with transactions
  – commit/rollback must atomically release locks
• We sort of have this database lying around which has ACID characteristics (metastore)
• Transactions and locks stored in metastore
• Uses metastore DB to provide unique, ascending ids for transactions and locks
Transaction & Locking Model
• DML statements are auto-commit
• Snapshot isolation
  – Reader will see consistent data for the duration of a query
• Current transactions can be displayed using SHOW TRANSACTIONS
• Three types of locks
  – shared read
  – shared write (can co-exist with shared read, but not other shared write)
  – exclusive
• Operations require different locks
  – SELECT, INSERT – shared read (inserts cannot conflict because there is no primary key)
  – UPDATE, DELETE – shared write
  – DROP, INSERT OVERWRITE – exclusive
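The lock rules on this slide amount to a small compatibility table. A sketch follows, assuming (the slide does not say so explicitly) that an exclusive lock conflicts with every other lock; the `Lock` enum and the helper names are made up for the example.

```python
from enum import Enum


class Lock(Enum):
    SHARED_READ = "shared read"
    SHARED_WRITE = "shared write"
    EXCLUSIVE = "exclusive"


# Lock needed per operation, as listed on the slide.
REQUIRED = {
    "SELECT": Lock.SHARED_READ,
    "INSERT": Lock.SHARED_READ,
    "UPDATE": Lock.SHARED_WRITE,
    "DELETE": Lock.SHARED_WRITE,
    "DROP": Lock.EXCLUSIVE,
    "INSERT OVERWRITE": Lock.EXCLUSIVE,
}


def compatible(held: Lock, requested: Lock) -> bool:
    """True if `requested` can be granted while `held` is outstanding."""
    if Lock.EXCLUSIVE in (held, requested):
        return False  # assumption: exclusive conflicts with everything
    if held is Lock.SHARED_WRITE and requested is Lock.SHARED_WRITE:
        return False  # two shared writes cannot co-exist
    return True       # read/read and read/write can co-exist


def can_run_together(op_a: str, op_b: str) -> bool:
    return compatible(REQUIRED[op_a], REQUIRED[op_b])


print(can_run_together("SELECT", "UPDATE"))  # True
print(can_run_together("UPDATE", "DELETE"))  # False
print(can_run_together("INSERT", "INSERT"))  # True
print(can_run_together("SELECT", "DROP"))    # False
```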
Compaction
• Each transaction (or batch of transactions in streaming) creates a new delta directory
• Too many files = an unhappy NameNode and poor read performance due to fan-in on merge
• Need to automatically compact files
  – Initiated by metastore server, run as MR jobs in the cluster
  – Can be manually initiated by user via ALTER TABLE COMPACT
• Minor compaction merges many deltas into one
  – Run when there are more than 10 delta directories (configurable)
• Major compaction merges deltas with base and rewrites base
  – Run when size of the deltas > 10% of the size of the base (configurable)
• Old files kept around until all readers are done with their snapshots, then cleaned up
  – Compaction and data read/writes can be done in parallel with no need to pause the world
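The two trigger rules can be written as a small decision function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the defaults mirror the numbers on the slide, and checking the major threshold before the minor one is a choice made for the example rather than something the slide specifies.

```python
def choose_compaction(delta_count, delta_bytes, base_bytes,
                      max_deltas=10, delta_pct=0.10):
    """Return 'major', 'minor', or None using the thresholds from the slide."""
    # Major: deltas have grown past a fraction of the base, so rewrite the base.
    if base_bytes > 0 and delta_bytes > delta_pct * base_bytes:
        return "major"
    # Minor: too many delta directories, so merge them into one delta.
    if delta_count > max_deltas:
        return "minor"
    return None


print(choose_compaction(delta_count=3, delta_bytes=50, base_bytes=1000))   # None
print(choose_compaction(delta_count=11, delta_bytes=50, base_bytes=1000))  # minor
print(choose_compaction(delta_count=3, delta_bytes=200, base_bytes=1000))  # major
```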
Issues Found and (Some) Fixed
• Not GA ready in Hive 1.2 or 2.0; hope to have it GA ready by 1.3 and 2.1
• Deadlocks in the RDBMS
  – The way the Hive metastore used the RDBMS caused a lot of deadlocks – greatly improved
• Usability
  – SHOW COMPACTIONS and SHOW LOCKS did not give users/admins enough information to determine who was blocking whom or what was getting compacted – improved, some work still to do here
• Resilience
  – System was easy to knock over when clients did silly things (like open 1M+ transactions) – improved, though I am sure there are still some ways to kill it
  – Initially compactor threads only ran in 1 metastore instance – resolved, now can run in multiple instances
• Correctness
  – Streaming ingest did not enforce proper bucket spraying – resolved
  – Initial versions of the compactor had a race condition that resulted in record loss – resolved
  – Adding a column to a table or changing a column’s type caused read-time errors – resolved
  – Updates can get lost when overlapping transactions update the same partition – HIVE-13395
• Performance
  – Some work done here (e.g. making predicate push down work, efficient split combinations)
  – Much still to be done
Next: MERGE
• Standard SQL, added in SQL 2003
• Problem: today each UPDATE requires a scan of the partition or table
  – There is no way to apply separate updates in a batch
• Allows upserts
• Use case:
  – Bring in a batch from transactional/front-end systems
  – Apply as inserts or updates (as appropriate) in one read/write pass
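The upsert semantics can be illustrated outside SQL. The sketch below only shows the intended result of applying a batch as inserts or updates in a single pass; it says nothing about how Hive would execute MERGE, and `merge_batch`, the `id` key column, and the sample rows are all hypothetical.

```python
def merge_batch(target, batch, key="id"):
    """Apply a batch as updates or inserts in one pass over the batch."""
    merged = dict(target)  # rows keyed by the join column; copy, don't mutate
    updated = inserted = 0
    for row in batch:
        if row[key] in merged:
            updated += 1   # matched: behaves like UPDATE
        else:
            inserted += 1  # not matched: behaves like INSERT
        merged[row[key]] = row
    return merged, updated, inserted


target = {1: {"id": 1, "city": "Oslo"}, 2: {"id": 2, "city": "Lima"}}
batch = [{"id": 2, "city": "Quito"}, {"id": 3, "city": "Pune"}]
merged, updated, inserted = merge_batch(target, batch)
print(updated, inserted)    # 1 1
print(merged[2]["city"])    # Quito
```

Without MERGE, the same batch would need one UPDATE statement (and so one scan) per distinct change plus a separate INSERT.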
Future Work
• Multi-statement transactions (BEGIN, COMMIT, ROLLBACK)
• Integration with LLAP
  – Figure out how MVCC works with LLAP’s caching
  – Build a write path through LLAP
• Lower the user burden
  – Make the bucketing automatic so the user does not have to be aware of it
  – Allow user to determine sort order of the table
  – Eventually remove the transactional/non-transactional distinction in tables
• Improve monitoring and alerting facilities
  – Make it easier for an admin to determine when the system is in trouble, e.g. the compactor is not running or is failing on every run, there are too many open transactions, etc.
Thank You

Hive ACID Apache BigData 2016
