A Developer's Guide To Coprocessors
HBaseCon 2013
John Weatherford
https://github.com/jweatherford
Who Is Telescope?
Telescope is the leading provider of interactive television, audience participation, and customer engagement solutions.
Clients include TV networks, producers, digital platforms, studios, and sponsors seeking to reach, engage, and retain mass audiences and consumers in real time.
What Is A Coprocessor?
Arbitrary code that can run on each server
Extend the functionality of HBase
Avoid bothering the core committers
Two Types of Coprocessors
Observers
React to an event
Run code before or after
Endpoints
Call a function explicitly
Execute code on all regions
[Diagram: Observers: Client, Pre-Action, Action, Post-Action. Endpoints: Client calling the Endpoint on Region 1, Region 2, and Region 3.]
What Can I Do With Coprocessors
Ideas for what can be done:
Access Control
Secondary Indexes
Optimized Search
Data Aggregation
Control compaction times
Real Time Analytics
Reduce result sets
Cache Request
Email split alerts
Getting Started With Code
preGet(ObserverContext<RegionCoprocessorEnvironment> c, Get get,
List<KeyValue> result)
postGet(ObserverContext<RegionCoprocessorEnvironment> c, Get get,
List<KeyValue> result)
prePut(ObserverContext<RegionCoprocessorEnvironment> c, Put put,
WALEdit edit, boolean writeToWAL)
postPut(ObserverContext<RegionCoprocessorEnvironment> c, Put put,
WALEdit edit, boolean writeToWAL)
preDelete(ObserverContext<RegionCoprocessorEnvironment> c, Delete delete,
WALEdit edit, boolean writeToWAL)
postDelete(ObserverContext<RegionCoprocessorEnvironment> c, Delete delete,
WALEdit edit, boolean writeToWAL)
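The hooks above come in pre/post pairs around each operation. As a rough illustration of that call order only, the sketch below simulates a region in plain Java. `ToyObserver`, `ToyRegion`, the in-memory map, and the veto-by-return-value convention are all hypothetical stand-ins invented for this example, not HBase classes or HBase behavior.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an observer: one hook before and one after a put.
interface ToyObserver {
    // Returning false vetoes the put (an assumption of this sketch).
    boolean prePut(String row, Map<String, String> columns);
    void postPut(String row, Map<String, String> columns);
}

// Hypothetical in-memory "region" that invokes observers around each put.
class ToyRegion {
    private final Map<String, Map<String, String>> rows = new HashMap<>();
    private final List<ToyObserver> observers = new ArrayList<>();

    void register(ToyObserver o) { observers.add(o); }

    boolean put(String row, Map<String, String> columns) {
        for (ToyObserver o : observers) {
            if (!o.prePut(row, columns)) return false; // vetoed before storing
        }
        rows.computeIfAbsent(row, k -> new HashMap<>()).putAll(columns);
        for (ToyObserver o : observers) o.postPut(row, columns);
        return true;
    }

    Map<String, String> get(String row) { return rows.get(row); }
}

class ObserverOrderDemo {
    // Runs one put through a logging observer and returns the events in order.
    static List<String> run() {
        final List<String> events = new ArrayList<>();
        ToyRegion region = new ToyRegion();
        region.register(new ToyObserver() {
            public boolean prePut(String row, Map<String, String> cols) {
                events.add("prePut:" + row);
                cols.put("audit:seen", "true"); // a pre-hook may modify the mutation
                return true;
            }
            public void postPut(String row, Map<String, String> cols) {
                events.add("postPut:" + row);
            }
        });
        Map<String, String> cols = new HashMap<>();
        cols.put("twitter:name", "loljk4u");
        region.put("id-1332343", cols);
        events.add("stored:" + region.get("id-1332343").size());
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The pre-hook runs before the row is stored and can still change it, which is the property the JSON-expanding observer later in the talk relies on.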
Our First Observer
Intercept and modify the action
Consider all circumstances that will trigger the observer
Compile your jar against the same version of Java that runs your HBase region servers
Look for output from the coprocessor
Our First Observer
Motivation: Apache Flume only writes one column per put

JSON sent by Flume:
{twitter:
  { name: "loljk4u",
    message: "<3",
    length: 2,
    registered: true
  },
 favorite:
  { name: "Taylor"
  ...

Single row put, as received by prePut():
key: id-1332343
family: twitter
qualifier: json_raw
value: "{twitter:
  {name: "loljk4u",
   message: "<3",
   length: 2,
   registered: true
   ...

Put after prePut() expands the JSON:
key: id-1332343
twitter:name: "loljk4u"
twitter:message: "<3"
twitter:length: 0x2
twitter:registered: 0xFF
favorite:name: "Taylor"
favorite:song: "I knew you were trouble"
JsonColumnExpander
// Read the arguments passed to the coprocessor
public void start(CoprocessorEnvironment env) throws IOException {
    Configuration c = env.getConfiguration();
    families = c.get("families", "").split(":");
}

public void prePut(ObserverContext<…> e, Put put, WALEdit edit, boolean writeToWAL) {
    // Only act on puts that carry the json_raw column
    if (!put.has(FAMILY, JSON_COLUMN)) { return; }
    String json = Bytes.toString(put.get(FAMILY, JSON_COLUMN).get(0).getValue());
    // columns: the parsed JSON as a map, family: the target column family
    // (the parsing step and family lookup are elided on the slide)
    for (Entry<String, ?> column : columns.entrySet()) { // loop through the JSON
        // String.valueOf also handles non-string values such as numbers and booleans
        String value = String.valueOf(column.getValue());
        put.add(family, Bytes.toBytes(column.getKey()), Bytes.toBytes(value));
    }
    // Overwrite the original JSON in the put with a placeholder
    put.add(FAMILY, JSON_COLUMN, Bytes.toBytes("--removed--"));
}
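The expansion step can be tried outside HBase. This is a minimal sketch, assuming the JSON has already been parsed into nested maps (the parser is out of scope here) and that top-level keys name column families. `JsonExpandSketch`, `expand`, and the `family:qualifier` string keys are illustrative names, not part of the talk's code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class JsonExpandSketch {
    // Flattens {family: {qualifier: value}} into "family:qualifier" -> value.
    // Families not listed in `allowed` are skipped, loosely mirroring the
    // "families" argument read in start().
    static Map<String, String> expand(Map<String, Map<String, Object>> parsed,
                                      String[] allowed) {
        Map<String, String> columns = new LinkedHashMap<>();
        for (String family : allowed) {
            Map<String, Object> qualifiers = parsed.get(family);
            if (qualifiers == null) continue; // family absent from this document
            for (Map.Entry<String, Object> q : qualifiers.entrySet()) {
                // String.valueOf copes with numbers and booleans as well as strings
                columns.put(family + ":" + q.getKey(), String.valueOf(q.getValue()));
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        Map<String, Object> twitter = new LinkedHashMap<>();
        twitter.put("name", "loljk4u");
        twitter.put("length", 2);
        twitter.put("registered", true);
        Map<String, Map<String, Object>> parsed = new LinkedHashMap<>();
        parsed.put("twitter", twitter);
        String[] families = "twitter:favorite".split(":");
        System.out.println(expand(parsed, families));
    }
}
```

Values such as `length: 2` are not strings, so a plain `(String)` cast on them would fail at runtime; the sketch converts with `String.valueOf` instead.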
Loading the Coprocessor
Push the jar to where your cluster can find it
$> hadoop fs -put JsonColumnExpander.jar /
Alter the table to enable the coprocessor
$> alter 'test', METHOD => 'table_att', 'coprocessor' => 'hdfs:///JsonColumnExpander.jar|telescope.hbase.JsonColumnExpander|1001|arg1=1,arg2=2'
Verify the load by checking the master web UI.
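The `coprocessor` attribute value is a pipe-separated string: jar path, class name, priority, then comma-separated `key=value` arguments. To make that layout concrete, here is a small parsing sketch. `CoprocessorSpec` is a hypothetical helper written for this example, and it is stricter than HBase itself (it requires a numeric priority).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that splits a table_att coprocessor value into its parts.
class CoprocessorSpec {
    final String jarPath;
    final String className;
    final int priority;
    final Map<String, String> args = new LinkedHashMap<>();

    CoprocessorSpec(String spec) {
        // "|" is a regex metacharacter, so it must be escaped; -1 keeps empty fields
        String[] parts = spec.split("\\|", -1);
        if (parts.length < 3) {
            throw new IllegalArgumentException("expected path|class|priority[|args]: " + spec);
        }
        jarPath = parts[0].trim();
        className = parts[1].trim();
        priority = Integer.parseInt(parts[2].trim());
        if (parts.length > 3 && !parts[3].trim().isEmpty()) {
            for (String pair : parts[3].split(",")) {
                String[] kv = pair.split("=", 2);
                if (kv.length != 2) {
                    throw new IllegalArgumentException("bad argument: " + pair);
                }
                args.put(kv[0].trim(), kv[1].trim());
            }
        }
    }

    public static void main(String[] a) {
        CoprocessorSpec s = new CoprocessorSpec(
            "hdfs:///JsonColumnExpander.jar|telescope.hbase.JsonColumnExpander|1001|arg1=1,arg2=2");
        System.out.println(s.jarPath + " " + s.className + " " + s.priority + " " + s.args);
    }
}
```

The arguments in the last field are what the observer reads back through `env.getConfiguration()` in `start()`.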
Running The Code
Trigger the coprocessor with a put on the table
Put put = new Put(Bytes.toBytes("rowkey"));
put.add(Bytes.toBytes("twitter"), Bytes.toBytes("json_raw"), json_data);
Check each server's local logs
http://regionnode:60030/logs/
hbase-hbase-regionserver-node2.dev-hadoop.telescope.tv.out
Creating Your First Endpoint
Define the available methods in a protocol
Implement the protocol
Extend BaseEndpointCoprocessor
Load the endpoint on the table
Endpoint Example
public interface TrendsProtocol extends CoprocessorProtocol {
    HashMap<String, Long> getData() throws IOException;
}

// The endpoint class implements the protocol we wrote above
public class TrendsEndpoint extends BaseEndpointCoprocessor implements TrendsProtocol {
    @Override
    public HashMap<String, Long> getData() throws IOException {
        HashMap<String, Long> trends = new HashMap<String, Long>();
        RegionCoprocessorEnvironment environment =
            (RegionCoprocessorEnvironment) getEnvironment();
        // Scan every row of this region (an unrestricted Scan is assumed here)
        InternalScanner scanner = environment.getRegion().getScanner(new Scan());
        try {
            List<KeyValue> curVals = new ArrayList<KeyValue>();
            boolean hasMore;
            do {
                curVals.clear();
                hasMore = scanner.next(curVals); // fills curVals with the next row
                for (KeyValue pair : curVals) {
                    // loop through values on the region and process
                }
            } while (hasMore);
        } finally {
            scanner.close();
        }
        return trends;
    }
}
Endpoint Returned Results
htable = HBaseDB.getTable(connection, "hbase_demo");
Map<byte[], HashMap<String, Long>> results = null;
results = htable.coprocessorExec(
    TrendsProtocol.class,
    null, // start row
    null, // end row
    new Batch.Call<TrendsProtocol, HashMap<String, Long>>() {
        @Override
        public HashMap<String, Long> call(TrendsProtocol trends) throws IOException {
            return trends.getData();
        }
    }
);
for (Map.Entry<byte[], HashMap<String, Long>> entry : results.entrySet()) {
    // process the result returned for each region
}
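Because the call returns one map per region, the client still has to combine them. A minimal sketch of that merge step in plain Java follows; it assumes the per-region values are counts that should be summed, and it keys regions by `String` rather than `byte[]` to keep the example short. `TrendMerge` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class TrendMerge {
    // Sums the counts reported by each region into one total per key.
    static Map<String, Long> merge(Map<String, Map<String, Long>> perRegion) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> regionCounts : perRegion.values()) {
            for (Map.Entry<String, Long> e : regionCounts.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Long> r1 = new HashMap<>();
        r1.put("taylor", 2L);
        r1.put("idol", 1L);
        Map<String, Long> r2 = new HashMap<>();
        r2.put("taylor", 3L);
        Map<String, Map<String, Long>> perRegion = new HashMap<>();
        perRegion.put("region1", r1);
        perRegion.put("region2", r2);
        System.out.println(merge(perRegion));
    }
}
```

Only the small per-region summaries cross the network; the rows themselves stay on the servers.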
Addendum to Endpoints
HBase 0.96 is changing endpoints to use Protocol Buffers (protobuf)
public static abstract class RowCountService
implements com.google.protobuf.Service {
...
public interface Interface {
public abstract void getRowCount(
com.google.protobuf.RpcController controller,
CountRequest request,
com.google.protobuf.RpcCallback done);
public abstract void getKeyValueCount(
com.google.protobuf.RpcController controller,
CountRequest request,
com.google.protobuf.RpcCallback done);
}
}
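The notable change in this style is that each method hands its result to a callback instead of returning it. The toy below shows only that calling convention in plain Java. `ToyCallback` and `ToyRowCountService` are hypothetical stand-ins; no protobuf types are used and nothing here is generated code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical callback type: receives the result when the call completes.
interface ToyCallback<T> {
    void run(T result);
}

// Hypothetical service whose method reports its result through a callback.
class ToyRowCountService {
    private final String[] rows;

    ToyRowCountService(String... rows) { this.rows = rows; }

    void getRowCount(ToyCallback<Long> done) {
        done.run((long) rows.length);
    }
}

class CallbackDemo {
    // Captures the callback's value so the caller can use it like a return value.
    static long count(ToyRowCountService service) {
        final AtomicLong holder = new AtomicLong(-1);
        service.getRowCount(result -> holder.set(result));
        return holder.get();
    }

    public static void main(String[] args) {
        System.out.println(count(new ToyRowCountService("a", "b", "c")));
    }
}
```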
Telescope’s Coprocessors
Observers collect real-time analytics data for our moderation platform and create aggregate tables for the streaming data.
Endpoints optimize searches and transmit only the necessary data. They also perform simple reporting queries that don't need the full power of MapReduce.
Questions?
Already using coprocessors? I would love to hear about it.
Curious to know more about a specific part?
All code samples and table definitions can be found at
https://github.com/jweatherford

Editor's Notes

  • #2: Thank you for coming. I am John Weatherford. This is going to have Java code.
  • #3: We create digital products and applications for multiple devices and deliver campaigns and solutions across multiple platforms, live events, and more. Our major campaigns are Idol, Voice.
  • #4: There are two different types of coprocessors, endpoints and observers. Observers are code that is triggered by an HBase operation. In the relational DB model, this is logically similar to a trigger. Endpoints are code that is called explicitly as a function on the server. In the relational DB model, this is logically similar to a stored procedure. Coprocessors can be run on all regions, just the regions of a particular table, or just on the master. VERIFY THIS IS TRUE
  • #5: RegionObserver, MasterObserver, WALObserver
  • #6: Ridiculous HBase examples: Restrict data changes after midnight? Rickroll random data requests?
  • #7: Coprocessor class? What does all this extend?
  • #9: Example: the AsyncHBase writer for Apache Flume doesn't allow more than a single column write per operation. The goal of this observer is to allow Flume to send all the column data we need and simply organize it when we get to the server.
  • #11: There are two ways to load the jar: through hbase-site.xml or by altering the table. For demonstration we will be altering the table. Check the GitHub repo for a base script that can be used to load the coprocessor through each step. SHOW: PICTURE: Insert picture of the loaded coprocessor
  • #12: Each server has local logs that can be accessed through the master UI. Should the coprocessor have some sort of error, we can find the output here.
  • #13: Explain what a protocol is. Endpoints aren't triggered by actions on the table, but called directly from the client. It is important to remember endpoints run on all servers that contain any key within the start and end key passed.
  • #15: Remember the endpoint runs on all the region servers, so we are returned a set of results in a map. Call the endpoint in your client code. https://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.html
  • #16: Remember the endpoint runs on all the region servers, so we are returned a set of results in a map. Call the endpoint in your client code. https://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.html
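Note #13 says an endpoint runs wherever a region holds any key between the start and end row passed by the client. The sketch below illustrates that selection with plain Java strings. It assumes regions cover `[start, end)`, treats an empty string as "unbounded" (the slides pass `null` for the same purpose), and treats the client's end row as inclusive; `RegionRangeSketch` and `Range` are hypothetical names, and string comparison stands in for HBase's byte comparison.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RegionRangeSketch {
    // A region's key range; names and layout are illustrative only.
    static final class Range {
        final String name;
        final String start; // inclusive; "" means no lower bound
        final String end;   // exclusive; "" means no upper bound

        Range(String name, String start, String end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }
    }

    // Returns the names of regions whose range overlaps [startRow, endRow].
    static List<String> select(List<Range> regions, String startRow, String endRow) {
        List<String> hit = new ArrayList<>();
        for (Range r : regions) {
            boolean endsAfterStart = r.end.isEmpty() || r.end.compareTo(startRow) > 0;
            boolean startsBeforeEnd = endRow.isEmpty() || r.start.compareTo(endRow) <= 0;
            if (endsAfterStart && startsBeforeEnd) hit.add(r.name);
        }
        return hit;
    }

    public static void main(String[] args) {
        List<Range> regions = Arrays.asList(
            new Range("region1", "", "g"),
            new Range("region2", "g", "p"),
            new Range("region3", "p", ""));
        System.out.println(select(regions, "h", "k")); // a narrow range
        System.out.println(select(regions, "", ""));   // unbounded on both sides
    }
}
```

Passing unbounded start and end rows, as the client example does with `null`, selects every region of the table.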