Trafodion 
Transactional SQL-on-HBase 
Trafodion and Hadoop / HBase 
www.trafodion.org 
HP © Copyright 2014 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
Trafodion innovation built upon Hadoop stack 
Leverages Hadoop and HBase for core modules
• Maintains API compatibility
• Inherited scalability and availability
Differentiation 
• ANSI SQL via ODBC/JDBC 
• Relational schema abstraction 
• Distributed transaction protection 
• Mature SQL technology 
• Automatic parallelism 
[Figure: architecture stack. A client application using ODBC/JDBC on Windows/Linux connects to Trafodion (Client Services for ODBC and JDBC; SQL Compiler / Optimizer / Executor; Distributed Transaction Manager), which sits on the Hadoop stack (HBase, Hive, HDFS, Zookeeper).]
HBase vs. Trafodion comparison
Feature | HBase | Trafodion + HBase
Data abstraction | Key and value pair | Relational schema
Physical layout | Column family store where row data is stored together by cells | Same, except there is a single column family with space-saving column encoding
Column values | Uninterpreted array of bytes | Explicitly defined and enforced data types
ACID guarantee | Single-row atomicity | Multiple SQL statements, tables, and rows defined as part of a transaction
Language API | Get/put/delete | SQL (Trafodion invokes the native HBase API)
Row key index | Single (string) row key | Composite (multi-column) row key
Secondary indexes | Not supported | Arbitrary secondary key columns
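To make the "composite (multi-column) row key" row of the comparison more concrete, here is a minimal Python sketch of how two typed columns could be packed into the single byte-array row key that HBase stores, so that byte order matches column order. The column names, the big-endian integer plus terminated string layout, and both helper functions are assumptions made for illustration; the slides do not describe the encoding Trafodion actually uses.

```python
import struct
from typing import Tuple


def encode_composite_key(order_id: int, line_item: str) -> bytes:
    """Pack (unsigned 32-bit int, string) into one byte string.

    Hypothetical layout: a big-endian integer sorts bytewise in numeric
    order, and the string follows with a zero-byte terminator.
    """
    if "\x00" in line_item:
        raise ValueError("line_item must not contain a NUL character")
    return struct.pack(">I", order_id) + line_item.encode("utf-8") + b"\x00"


def decode_composite_key(key: bytes) -> Tuple[int, str]:
    """Recover the two typed column values from the raw row key."""
    (order_id,) = struct.unpack(">I", key[:4])
    return order_id, key[4:-1].decode("utf-8")


rows = [(2, "alice"), (1, "bob"), (1, "alice"), (10, "al")]

# Sort by raw bytes, which is the only ordering a byte-array key offers.
encoded = sorted(encode_composite_key(o, i) for o, i in rows)
decoded = [decode_composite_key(k) for k in encoded]

for key_columns in decoded:
    print(key_columns)
```

The decoded rows come back in the same order as sorting the original tuples, which is the property a SQL layer needs in order to answer range predicates on the leading key column with a contiguous scan.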
Salting of row keys 
How it works 
• HBase table gets created, pre-split with one region per salt value
• A hash value column, “_SALT_”, is added as a prefix to the row key
• Salting is transparent to SQL statements
– Automatically computed during insert/update statements
– Predicates automatically generated where feasible
– Minimal overhead for direct lookup by key value
Benefits
• Even data distribution across HBase regions
• Avoids region hotspots caused by insertion of data in row key order
[Figure: INSERTs and SELECTs spread across four HBase regions (PART 1 to PART 4), each backed by HDFS.]
CREATE TABLE t (a INTEGER NOT NULL PRIMARY KEY, b INTEGER) SALT USING 4 PARTITIONS;
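As a rough illustration of the salting steps on this slide, the following Python sketch derives a salt from the key, prefixes it to the row key, and counts how keys inserted in ascending order spread over four partitions. The CRC32 hash, the helper names, and the tuple form of the row key are assumptions made for illustration; the slides do not say which hash Trafodion uses or how it encodes the salted key.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 4  # mirrors SALT USING 4 PARTITIONS in the statement above


def salt_for(key: int, partitions: int = NUM_PARTITIONS) -> int:
    """Return a salt in [0, partitions) derived from the key.

    CRC32 is only a stand-in: the hash Trafodion really uses is not
    given in the slides.
    """
    return zlib.crc32(str(key).encode("ascii")) % partitions


def salted_row_key(key: int, partitions: int = NUM_PARTITIONS) -> tuple:
    """Prefix the salt to the key, like the hidden "_SALT_" column."""
    return (salt_for(key, partitions), key)


# Keys arrive in ascending order, the pattern that would otherwise
# send every insert to the same region.
sequential_keys = range(1, 10001)
rows_per_partition = Counter(salt_for(k) for k in sequential_keys)

for partition in sorted(rows_per_partition):
    print(f"partition {partition}: {rows_per_partition[partition]} rows")

# A direct lookup recomputes the salt, so only one partition is probed.
print("key 42 lives under row key", salted_row_key(42))
```

Because the salt depends only on the key, the same computation at read time identifies the partition to probe, which is consistent with the slide's point about minimal overhead for direct lookups by key value.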
Trafodion and Hadoop – Better Together! 
Leverages and extends Hadoop for transactional SQL workloads 
Complete: Full-function ANSI SQL 
Reuse existing SQL skills and improve developer productivity 
Protected: Distributed ACID transactions 
Guarantees data consistency across multiple rows, tables, SQL statements 
Efficient: Optimized for low-latency read and write transactions
Supports real-time transaction processing applications 
Flexible: Schema flexibility and multi-structured data 
Seamlessly integrates structured, unstructured, and semi-structured data 
Interoperable: Standard ODBC/JDBC access 
Works with existing tools and applications 
Open: Hadoop and Linux distribution neutral 
Easy to add to your existing infrastructure and no vendor lock-in 
[Figure: Reuse SQL skills; Scale without complexity; Complements Hadoop; Reduce costs; Real-time performance]
See for yourself… 
Come discover and develop on Trafodion 
www.trafodion.org 

2 - Trafodion and Hadoop HBase

Editor's Notes

  • #2: Welcome to the Trafodion briefing series. Trafodion is an HP-sponsored open-source project to deliver an enterprise-class transactional SQL-on-HBase DBMS. This segment discusses how Trafodion both leverages and extends elements of the Hadoop/HBase infrastructure to support operational workload requirements.
  • #3: Trafodion is designed to build upon Apache Hadoop and HBase core modules, so transactional and operational applications using Trafodion transparently gain Hadoop's advantages such as affordable performance, scalability, elasticity, and availability. The graphic shows, in dark grey, the subset of the Hadoop software stack that Trafodion specifically leverages: HBase, HDFS, and Zookeeper. To this stack Trafodion adds (in blue) the Trafodion ODBC/JDBC drivers, the Trafodion database software, and the DTM (Distributed Transaction Manager) for distributed transaction protection. Trafodion interfaces to Hadoop services through their standard APIs. By maintaining API compatibility, Trafodion stays Hadoop distribution neutral, which eliminates vendor lock-in by giving customers a choice of distributions. Trafodion is targeted to deliver differentiation on top of Hadoop in these key areas:
    – A full-featured ANSI SQL implementation whose database services are accessible via a standard ODBC/JDBC connection
    – A SQL relational schema abstraction that makes Trafodion look and feel like any other relational database
    – Distributed ACID transaction protection
    – Mature SQL technology
    – Parallel optimizations for transactional workloads
  • #4: Although Trafodion stores its database objects in HBase/HDFS storage structures, it differs from and adds value over vanilla HBase in several ways, as shown on this slide. Trafodion provides a relational schema abstraction on top of HBase, which lets customers apply well-tested relational design methodologies and SQL programming skills. For physical layout, Trafodion uses standard HBase storage mechanisms (a column family store using key-value pairs) to store and access objects. Trafodion currently stores all columns in a single column family to improve access efficiency and speed for operational data. Trafodion also encodes column names to save disk space and reduce messaging overhead, which improves SQL performance. Unlike vanilla HBase, which treats stored data as an uninterpreted array of bytes, Trafodion assigns each column a specific data type and enforces it when data is inserted or updated. This improves data quality and integrity, and it removes the need for application logic that parses and interprets the data. Vanilla HBase provides ACID transaction protection only at the row level. Trafodion extends ACID protection to application-defined transactions that can span multiple SQL statements, tables, and rows. This protects the database against partially completed transactions: either the whole transaction is materialized in the database or none of it is. HBase's native API is very low level and is not a commonly used programming API. In contrast, Trafodion's API is ANSI SQL, a familiar programming interface that lets companies use their existing SQL knowledge and skills.
Unlike HBase's row key, which is a single uninterpreted array of bytes, Trafodion supports the common relational practice of a composite primary key made up of multiple columns. Finally, unlike vanilla HBase, Trafodion supports secondary indexes, which can speed up transactions that access row data by a column value that is not the row key. In summary, Trafodion incorporates a number of enhancements over vanilla HBase to improve transaction performance, data integrity, and DBA/developer productivity (that is, it reduces application complexity through standard, well-known relational practices and APIs).
  • #5: One known problem area for HBase is transactional workloads that insert data into a table in row key order. When this happens, all of the I/O is concentrated on a single HBase region, which creates a server and disk hotspot and a performance bottleneck. To alleviate this problem, Trafodion provides a feature called “salting the row key”. To enable it, the DBA specifies the number of partitions (that is, regions) the table is to be split over when creating the table, for example “SALT USING 4 PARTITIONS”. Trafodion creates the table pre-split with one region per salt value. An internal hash value column, “_SALT_”, is added as a prefix to the row key. Salting is handled automatically by Trafodion and is transparent to SQL statements written by the application. As data is inserted into the table, Trafodion computes the salt value and directs the insert to the appropriate region. Likewise, Trafodion calculates the salt value when data is retrieved from the table and automatically generates predicates where feasible. This is a lightweight operation with little overhead for direct key access. The benefits of salting are more even data distribution across regions and better performance through hotspot elimination. Next, let's look at other ways Trafodion adds innovation and value to vanilla HBase.
  • #6: Trafodion delivers on the promise of a full-featured, optimized transactional SQL-on-HBase DBMS with full transactional data protection. This combination of HBase and an enterprise-class transactional SQL engine overcomes Hadoop's weaknesses in supporting operational workloads. Customers gain the following benefits:
    – Complete: A comprehensive, full-function ANSI SQL DBMS that lets companies use their in-house SQL knowledge and expertise rather than having to learn complex MapReduce programming.
    – Protected: Extends Hadoop HBase with guaranteed transactional consistency across multiple SQL statements, tables, and rows.
    – Efficient: Includes many optimizations for low-latency read and write transactions, supporting the fast response times required by the operational SQL workloads Trafodion targets.
    – Flexible: Trafodion-hosted applications gain schema flexibility and seamless integration of data from Trafodion, native HBase, and Hive tables without expensive replication or data movement overhead.
    – Interoperable: Provides investment protection through interoperability with existing tools and applications using standard ODBC and JDBC access.
    – Open: Designed to fit within the customer's existing IT infrastructure with no vendor lock-in by remaining neutral to the underlying Linux and Hadoop distribution. It complements existing Hadoop investments and benefits.
    – HP open-source sponsorship and investment
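The all-or-nothing behavior behind the "Protected" benefit, where a transaction spanning several tables and rows is either fully applied or not applied at all, can be illustrated with a toy Python sketch. Everything here (the ToyTransaction class, the table names, and the buffer-then-apply strategy) is hypothetical and in-memory only; it is not how Trafodion's Distributed Transaction Manager is implemented, and it ignores isolation and durability.

```python
class ToyTransaction:
    """Buffers writes to several tables and applies them only on commit."""

    def __init__(self, tables):
        self._tables = tables    # table name -> dict of row key -> value
        self._pending = []       # buffered (table, key, value) writes

    def put(self, table, key, value):
        if table not in self._tables:
            raise KeyError(table)
        self._pending.append((table, key, value))

    def commit(self):
        # Apply every buffered write; nothing was visible before this point.
        for table, key, value in self._pending:
            self._tables[table][key] = value
        self._pending.clear()

    def rollback(self):
        # Discard the buffered writes, leaving the tables untouched.
        self._pending.clear()


def record_transfer(txn):
    """Three writes across two tables that must succeed or fail together."""
    txn.put("accounts", "alice", 70)
    txn.put("accounts", "bob", 80)
    txn.put("ledger", 1, ("alice", "bob", 30))


tables = {"accounts": {"alice": 100, "bob": 50}, "ledger": {}}

txn = ToyTransaction(tables)
record_transfer(txn)
txn.rollback()
after_rollback = {name: dict(rows) for name, rows in tables.items()}
print("after rollback:", after_rollback)

txn = ToyTransaction(tables)
record_transfer(txn)
txn.commit()
print("after commit:  ", tables)
```

With single-row atomicity alone, a failure between the two account updates could leave one balance changed and the other not; grouping the writes into one unit of work is what rules that state out.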