Unleash the Power of HBase Shell 
Big Data Everywhere 
Chicago 2014 
Jayesh Thakrar 
jthakrar@conversant.com
HBase = Truly Big Data Store ……..(well, one of many) 
• Proven 
• Scalable 
• Resilient and highly available (HA) 
• Low-latency 
• OLTP and batch/mapreduce usage 
• True big data store
  - billions of rows
  - millions of columns
However……HBase also makes me…… 
Data 
• No query tools like psql, mysql, etc.
• No interactive development tools
• Can't browse Java primitive data in cells, e.g. int, long, double, etc.
• Cell values are printed as strings if the bytes are printable; otherwise the bytes are printed in hexadecimal format
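To make the last two bullets concrete, here is a minimal sketch in plain Ruby (no HBase needed) of why a Java long stored in a cell looks opaque, and how it can be decoded. The helper name to_string_binary is hypothetical and only approximates the shell's printable-or-hex rendering; the value 350 is an arbitrary sample.

```ruby
# Approximation (assumption): printable ASCII bytes are shown as characters,
# everything else as \xNN, similar in spirit to what the shell prints.
def to_string_binary(bytes)
  bytes.map do |b|
    printable = b >= 32 && b <= 126 && b != 92 # 92 is the backslash itself
    printable ? b.chr : format('\\x%02X', b)
  end.join
end

# Decode 8 big-endian bytes back into the signed 64-bit integer (a Java long).
def decode_long(bytes)
  bytes.pack('C*').unpack('q>').first
end

# A Java long is 8 bytes, big-endian. Encode the sample value 350.
cell_bytes = [350].pack('q>').bytes

puts to_string_binary(cell_bytes) # the raw cell as a human would see it
puts decode_long(cell_bytes)      # the number that was actually stored
```

Only the final byte of the sample happens to be printable, which is why a numeric cell usually shows up as a run of hex escapes followed by a stray character.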
But I was somewhat wrong…… 
• HBase shell is a full-fledged JRuby shell (jirb)
• It can exploit the strengths of both Ruby and Java
• With minimal code, you can retrieve and view data in any HBase row/cell
• Can possibly use "Ruby-on-Rails" and other frameworks to interface directly with HBase
HBase Shell = JRuby Under the Cover 
HBase Shell Ruby Source Code 
$ cd <HBASE_DIRECTORY> 
$ find . -name '*.rb' -print 
./bin/get-active-master.rb 
./bin/hirb.rb 
./bin/region_mover.rb 
./bin/region_status.rb 
…. 
./lib/ruby/shell/commands/assign.rb 
./lib/ruby/shell/commands/balancer.rb 
… 
./lib/ruby/shell/commands/create.rb 
./lib/ruby/shell/commands/delete.rb 
…..
Shell Commands = HBase DSL (Domain Specific Language)
$ hbase shell
get 'user', 'AB350000000000000350'    # DSL format
get('user', 'AB350000000000000350')   # Ruby method format
table_name = 'user' 
rowkey = 'AB350000000000000350' 
get(table_name, rowkey) 
scan 'user', {STARTROW => 'AB350000000000000350', LIMIT => 5}
# Defining and using a Ruby hash (dictionary)
scan_options = {STARTROW => rowkey, LIMIT => 5}
scan table_name, scan_options
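Both call styles work because Ruby allows the parentheses around method arguments to be omitted. A small sketch in plain Ruby follows. The get and scan methods here are hypothetical stand-ins that only echo their arguments; they are not the shell's real commands. Defining STARTROW and LIMIT as strings is likewise an assumption made so the snippet runs outside the shell.

```ruby
# Assumed stand-ins for the constants the shell provides.
STARTROW = 'STARTROW'
LIMIT    = 'LIMIT'

# Hypothetical stand-in: returns its arguments instead of reading a table.
def get(table, row)
  [:get, table, row]
end

# Hypothetical stand-in: the second argument is an ordinary Ruby hash.
def scan(table, options = {})
  [:scan, table, options]
end

dsl_style    = get 'user', 'AB350000000000000350'
method_style = get('user', 'AB350000000000000350')
puts dsl_style == method_style

table_name   = 'user'
rowkey       = 'AB350000000000000350'
scan_options = { STARTROW => rowkey, LIMIT => 5 }
inline  = scan 'user', {STARTROW => 'AB350000000000000350', LIMIT => 5}
via_var = scan table_name, scan_options
puts inline == via_var
```

Run this with plain ruby or jruby rather than inside the HBase shell, where it would shadow the shell's own get and scan commands.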
HBase shell JRuby Example - 1 
include Java
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.HTable
import org.apache.hadoop.hbase.client.Scan
import org.apache.hadoop.hbase.client.Get
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.util.Bytes
# Open the "sample" table and fetch one row by its key
htable = HTable.new(HBaseConfiguration.new, "sample")
rowkey = Bytes.toBytes("some_rowkey")
get = Get.new(rowkey)
result = htable.get(get)
# Print "family:qualifier" for each cell in the row
result.list.collect {|kv| puts "#{Bytes.toString(kv.getFamily)}:#{Bytes.toString(kv.getQualifier)}"}
"include Java" allows calling Java from within JRuby
"import" Java classes
Can invoke the HBase Java API
JRuby variables hold Java class instances and the output of static and instance methods
HTable.new in JRuby = new HTable() in Java
Ruby expression: "collect" is a Ruby method for list objects. Here it is applied to a Java list, so features of both languages are available transparently and seamlessly.
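Because collect returns a new array, it can also build the "family:qualifier" strings rather than only printing them. A minimal sketch in plain Ruby, where the Struct named Cell, the method column_names, and the sample values are hypothetical stand-ins for the key-value objects in a Result (in the shell, the same block would be applied to result.list):

```ruby
# Hypothetical stand-in for a key-value object with a family and a qualifier.
Cell = Struct.new(:family, :qualifier)

# collect maps every element through the block and returns the new array.
def column_names(cells)
  cells.collect { |kv| "#{kv.family}:#{kv.qualifier}" }
end

sample = [Cell.new('info', 'name'), Cell.new('info', 'email')]
column_names(sample).each { |name| puts name }
```

Separating the mapping (collect) from the printing (each) keeps the list of column names available for further use, such as sorting or filtering.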
HBase shell JRuby Example - 2 
# Same include and import statements as Example - 1
htable = HTable.new(HBaseConfiguration.new, ".META.")
scanner = htable.getScanner(Scan.new())
tables = {}
scanner.each do |r|
  # The row key begins with the table name, followed by a comma
  table_name = Bytes.toString(r.getRow).split(",")[0]
  if not tables.has_key?(table_name)
    tables[table_name] = 0
  end
  tables[table_name] = tables[table_name] + 1
end
tables.keys.each { |t| puts "Table #{t} has #{tables[t]} regions"}
This example scans the ".META." table to get a count of regions by table.
"tables = {}" creates an empty Ruby hash (dictionary)
"scanner.each" shows how to iterate through a Java "iterable". Each iteration of the scanner yields a "Result" object, which is then passed to the code block.
The "code block" can be enclosed in curly braces ({}) or in the "do" and "end" keywords. The convention is to use {} for single-line code blocks and do/end for multi-line code blocks.
The last line prints the region count by table using an iterator that is passed a code block. Compare this code block, enclosed in {}, with the "do/end" block above.
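The counting logic can be tried without a cluster. This sketch runs in plain Ruby on hypothetical sample row keys that merely begin with a table name and a comma (the keys, the constant SAMPLE_ROW_KEYS, and the method name count_regions are illustrative assumptions). It uses Hash.new(0) so that the has_key? check is not needed, and it shows both block styles side by side.

```ruby
# Count how many row keys belong to each table.
def count_regions(row_keys)
  counts = Hash.new(0) # missing keys default to 0
  # Multi-line block: do/end
  row_keys.each do |key|
    table_name = key.split(',')[0]
    counts[table_name] += 1
  end
  counts
end

# Hypothetical sample data, not taken from a real cluster.
SAMPLE_ROW_KEYS = [
  'user,,1400000000000',
  'user,AB35,1400000000001',
  'orders,,1400000000002'
]

# Single-line block: curly braces
count_regions(SAMPLE_ROW_KEYS).each { |t, n| puts "Table #{t} has #{n} regions" }
```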
HBase shell JRuby Example - 3 
• See https://github.com/JThakrar/hse
• hbase_shell_extension.rb
To Conclude……. 
• HBase shell
  - is an interactive scripting environment
  - allows mixing of Java and JRuby
  - is not "recommended" for serious enterprise/group development that requires automated testing, continuous integration, etc.
• Can use a JRuby IDE, provided you add the HBase jars using "require", e.g. require '<path>/hbase.jar'
• Can also use your custom Java jars in the IDE and/or HBase shell
• Can even "compile" your JRuby scripts into jars for optimal performance and/or to avoid exposing source code
