What Is Apache Tephra?
● Provides transactions for HBase and Phoenix
● Apache incubating project
● Uses HBase's native data versioning to provide multi-versioned concurrency control (MVCC) for transactional reads and writes
● Provides snapshot isolation of concurrent transactions
● Open source / Apache 2.0 license
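The MVCC and snapshot isolation bullets above can be illustrated with a toy model. This is only a sketch, not Tephra's implementation: the `Snapshot` class, its `read_pointer` and `excluded` fields, and `visible_value` are hypothetical names chosen for illustration. The assumption is that each cell version is stamped with the ID of the transaction that wrote it, and a reader ignores versions that are newer than its snapshot or that belong to in-progress or invalid transactions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """What one reading transaction is allowed to see (toy model)."""
    read_pointer: int                    # highest writer ID visible to this reader
    excluded: frozenset = frozenset()    # in-progress or invalid writer IDs


def visible_value(versions, snapshot):
    """Return the newest visible value of a single cell, or None.

    `versions` maps writer transaction ID -> value, standing in for the
    multiple timestamped versions HBase keeps for one cell.
    """
    for tx_id in sorted(versions, reverse=True):
        if tx_id <= snapshot.read_pointer and tx_id not in snapshot.excluded:
            return versions[tx_id]
    return None


# Three versions of one cell, written by transactions 100, 105 and 110.
cell = {100: "a", 105: "b", 110: "c"}

# Transaction 105 was still in progress when this reader started,
# and 110 started after the reader's snapshot was taken.
reader = Snapshot(read_pointer=107, excluded=frozenset({105}))
print(visible_value(cell, reader))  # a
```

The reader keeps seeing "a" for its whole lifetime, even if 105 and 110 commit in the meantime, which is the essence of snapshot isolation.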
Tephra Architecture
● Tephra has three main components
● Transaction Server
– Maintains global view of transaction state
– Assigns new transaction IDs
– Performs conflict detection
● Transaction Client
– Coordinates the start, commit, and rollback of transactions
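The division of work between server and client can be sketched as follows. This is a simplified illustration under stated assumptions, not the Tephra API: `ToyTransactionManager`, `start`, and `commit` are hypothetical names, and the conflict rule shown (reject a commit if another transaction that committed after this one started wrote an overlapping key) is a generic optimistic write-write check.

```python
class ToyTransactionManager:
    """Hands out transaction IDs and checks for write-write conflicts (toy model)."""

    def __init__(self):
        self._next_id = 1
        self._committed = []  # list of (commit_id, frozenset of keys written)

    def _new_id(self):
        tx_id = self._next_id
        self._next_id += 1
        return tx_id

    def start(self):
        """Assign a new transaction ID."""
        return self._new_id()

    def commit(self, tx_id, change_set):
        """Return True if the commit succeeds, False on a conflict."""
        keys = frozenset(change_set)
        for commit_id, committed_keys in self._committed:
            # Conflict: someone committed after we started and touched our keys.
            if commit_id > tx_id and keys & committed_keys:
                return False
        self._committed.append((self._new_id(), keys))
        return True


tm = ToyTransactionManager()
t1 = tm.start()
t2 = tm.start()
print(tm.commit(t1, {"row1"}))  # True
print(tm.commit(t2, {"row1"}))  # False, t1 committed "row1" after t2 started
```

On a `False` result the client side would roll back the writes made under `t2`, which corresponds to the client's rollback role described above.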
Tephra Architecture (continued)
● TransactionProcessor Coprocessor
– Applies filtering to the data read (based on a given transaction's state)
– Cleans up any data from old (no longer visible) transactions
● Multiple transaction server instances can run concurrently
– Allows for automatic failover
– Only one server instance actively serves requests at a time
– Coordinated through leader election in ZooKeeper
Tephra Phoenix
● Tephra is an incubating Apache project
● Phoenix uses Tephra for transaction support
● Transaction support in Phoenix is therefore at a beta stage
● Provides cross-row and cross-table transaction support with full ACID semantics
● Remember that Phoenix uses HBase as its backing store
● Next slides show configuration
Phoenix Architecture (Reminder) [architecture diagram]
Tephra Phoenix Config
● Add the following to your client-side hbase-site.xml file to enable transactions
<property>
  <name>phoenix.transactions.enabled</name>
  <value>true</value>
</property>
Tephra Phoenix Config
● Add the following to your server-side hbase-site.xml file to configure the transaction manager
<property>
  <name>data.tx.snapshot.dir</name>
  <value>/tmp/tephra/snapshots</value>
</property>
Tephra Phoenix Config
● Add the following to your server-side hbase-site.xml file to set the transaction timeout (in seconds)
<property>
  <name>data.tx.timeout</name>
  <value>60</value>
</property>
● Then you can start Tephra on Phoenix
./bin/tephra
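If these settings are generated by a deployment script rather than typed by hand, the property blocks can be rendered programmatically. The following is a minimal sketch using only the Python standard library; `render_properties` is a hypothetical helper, and the names and values are the ones from the three config slides above.

```python
import xml.etree.ElementTree as ET


def render_properties(settings):
    """Render name/value pairs as Hadoop-style <property> elements."""
    blocks = []
    for name, value in settings.items():
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
        blocks.append(ET.tostring(prop, encoding="unicode"))
    return "\n".join(blocks)


# Client-side hbase-site.xml: enable transactions.
client_side = {"phoenix.transactions.enabled": "true"}

# Server-side hbase-site.xml: snapshot directory and transaction timeout.
server_side = {
    "data.tx.snapshot.dir": "/tmp/tephra/snapshots",
    "data.tx.timeout": 60,
}

print(render_properties(client_side))
print(render_properties(server_side))
```

The output is unindented XML; the rendered elements still have to be placed inside the `<configuration>` element of the relevant hbase-site.xml file.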
Tephra Requirements
Component   Source          Version
Java        -               1.7.xx / 1.8.xx
HDFS        Apache Hadoop   2.0.2-alpha - 2.7.x
            CDH or HDP      (CDH) 5.0.0 - 5.12.0 / (HDP) 2.0 - 2.6
            MapR            4.1 - 5.1 (with MapR-FS)
HBase       Apache          0.96.x, 0.98.x, 1.0.x, 1.1.x, 1.2.x, 1.3.x (except 1.1.5 and 1.2.2) and 2.0.x
            CDH or HDP      (CDH) 5.0.0 - 5.12.0 / (HDP) 2.0 - 2.6
            MapR            4.1 - 5.1 (with Apache HBase)
ZooKeeper   Apache          3.4.3 - 3.4.5
            CDH or HDP      (CDH) 5.0.0 - 5.12.0 / (HDP) 2.0 - 2.6
            MapR            4.1 - 5.1
Tephra Transaction Server Config
● Add changes to hbase-site.xml
Property                    Default   Description
data.tx.bind.port           15165     Port to bind to
data.tx.bind.address        0.0.0.0   Server address to listen on
data.tx.server.io.threads   2         Number of threads for socket IO
data.tx.server.threads      20        Number of handler threads
data.tx.timeout             30        Timeout for a transaction to complete (seconds)
data.tx.long.timeout        86400     Timeout for a long-running transaction to complete (seconds)
data.tx.cleanup.interval    10        Frequency to check for timed-out transactions (seconds)
data.tx.snapshot.dir        -         HDFS directory used to store snapshots
data.tx.snapshot.interval   300       Frequency to write new snapshots
data.tx.snapshot.retain     10        Number of old transaction snapshots to retain
data.tx.metrics.period      60        Frequency for metrics reporting (seconds)
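As a rough illustration of how site-specific overrides combine with the values listed above, the sketch below merges a few overrides onto the listed defaults. `SERVER_DEFAULTS` and `effective_config` are hypothetical names. Treating data.tx.snapshot.dir as mandatory is an assumption of this sketch, made because the slide lists no default for it.

```python
# Defaults as listed on the slide; data.tx.snapshot.dir has no listed default.
SERVER_DEFAULTS = {
    "data.tx.bind.port": 15165,
    "data.tx.bind.address": "0.0.0.0",
    "data.tx.server.io.threads": 2,
    "data.tx.server.threads": 20,
    "data.tx.timeout": 30,
    "data.tx.long.timeout": 86400,
    "data.tx.cleanup.interval": 10,
    "data.tx.snapshot.dir": None,
    "data.tx.snapshot.interval": 300,
    "data.tx.snapshot.retain": 10,
    "data.tx.metrics.period": 60,
}


def effective_config(overrides):
    """Merge overrides onto the listed defaults, rejecting unknown names."""
    unknown = set(overrides) - set(SERVER_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown setting(s): {sorted(unknown)}")
    merged = {**SERVER_DEFAULTS, **overrides}
    # Assumption of this sketch: the snapshot directory must be set explicitly.
    if not merged["data.tx.snapshot.dir"]:
        raise ValueError("data.tx.snapshot.dir must be set")
    return merged


config = effective_config({
    "data.tx.snapshot.dir": "/tmp/tephra/snapshots",
    "data.tx.timeout": 60,
})
print(config["data.tx.timeout"], config["data.tx.bind.port"])  # 60 15165
```

Rejecting unknown names catches typos such as a misspelled property, which would otherwise be silently ignored.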
Tephra Transaction Client Config
● Add changes to hbase-site.xml
Property                               Default   Description
data.tx.client.timeout                 30000     Client socket timeout (milliseconds)
data.tx.client.provider                pool      Client provider strategy: "pool" uses a pool of clients, "thread-local" uses a client per thread
data.tx.client.count                   50        Max number of clients for the "pool" provider
data.tx.client.obtain.timeout          3000      Timeout for obtaining a client from the "pool" provider (milliseconds)
data.tx.client.retry.strategy          backoff   Client retry strategy ("backoff" or "n-times")
data.tx.client.retry.attempts          2         Number of times to retry ("n-times" strategy)
data.tx.client.retry.backoff.initial   100       Initial sleep time ("backoff" strategy)
data.tx.client.retry.backoff.factor    4         Multiplication factor for sleep time
data.tx.client.retry.backoff.limit     30000     Exit when sleep time reaches this limit
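To see what the three backoff settings imply in practice, the sleep sequence can be worked out from the listed values (initial 100, factor 4, limit 30000). This is a sketch of the arithmetic only: `backoff_sleeps` is a hypothetical helper, the values are assumed to be milliseconds, and the exact stop condition (give up once the next sleep would reach the limit) is an assumption about how "exit when sleep time reaches this limit" is applied.

```python
def backoff_sleeps(initial_ms=100, factor=4, limit_ms=30000):
    """Yield successive sleep times until the next one would reach the limit."""
    wait = initial_ms
    while wait < limit_ms:
        yield wait
        wait *= factor


sleeps = list(backoff_sleeps())
print(sleeps)       # [100, 400, 1600, 6400, 25600]
print(sum(sleeps))  # 34100
```

Under these assumptions a client retries five times over roughly 34 seconds before giving up.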
Tephra HBase Coprocessor Configuration
● Tephra requires an HBase coprocessor to be installed on all tables where transactional reads and writes will be performed
● Add this change to hbase-site.xml
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.tephra.hbase.coprocessor.TransactionProcessor</value>
</property>
● Use the Tephra binary to start once configured
./bin/tephra start
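Before starting the service, it can be worth checking that the coprocessor entry is actually present. The sketch below parses an hbase-site.xml document with the Python standard library; `coprocessor_enabled` and the `SAMPLE` document are hypothetical, and the check relies on hbase.coprocessor.region.classes being a comma-separated list of class names.

```python
import xml.etree.ElementTree as ET

TX_PROCESSOR = "org.apache.tephra.hbase.coprocessor.TransactionProcessor"


def coprocessor_enabled(hbase_site_xml):
    """Return True if the TransactionProcessor is listed as a region coprocessor."""
    root = ET.fromstring(hbase_site_xml)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name == "hbase.coprocessor.region.classes":
            value = prop.findtext("value") or ""
            classes = [c.strip() for c in value.split(",")]
            return TX_PROCESSOR in classes
    return False


# Sample document for illustration; in practice, read the real hbase-site.xml.
SAMPLE = """<configuration>
  <property>
    <name>hbase.coprocessor.region.classes</name>
    <value>org.apache.tephra.hbase.coprocessor.TransactionProcessor</value>
  </property>
</configuration>"""

print(coprocessor_enabled(SAMPLE))  # True
```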
Available Books
● See “Big Data Made Easy”
– Apress Jan 2015
● See “Mastering Apache Spark”
– Packt Oct 2015
● See “Complete Guide to Open Source Big Data Stack”
– Apress Jan 2018
● Find the author on Amazon
– www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
● Connect on LinkedIn
– www.linkedin.com/in/mike-frampton-38563020
Connect
● Feel free to connect on LinkedIn
– www.linkedin.com/in/mike-frampton-38563020
● See my open source blog at
– open-source-systems.blogspot.com/
● I am always interested in
– New technology
– Opportunities
– Technology-based issues
– Big data integration

