Non-blocking IO to tame distributed systems
ー How and why ChatWork uses asynchbase
安田裕介/Yusuke Yasuda (@TanUkkii007)
Agenda
● How we used a native HBase client
● Problems we faced with a native HBase client
● Migration to asynchbase
● Blocking IO vs Non-blocking IO: performance test results
About me
● Yusuke Yasuda / 安田裕介
● @TanUkkii007
● Working for Chatwork for 2 years
● Scala developer
About ChatWork
How we used a native HBase client
Messaging system architecture overview
You can find more information about our architecture at Kafka summit 2017.
Today’s topic
HBase
● Key-value storage to enable random access on HDFS
● HBase is used as a query-side storage in our system
○ version: 1.2.0
● Provides streaming API called “Scan” to query a sequence of
rows iteratively
● Scan is the most used HBase API in ChatWork
Synchronous scan with native HBase client
A bad example
import scala.annotation.tailrec
import org.apache.hadoop.hbase.TableName
import org.apache.hadoop.hbase.client.{Connection, Result, ResultScanner, Scan, Table}

def scanHBase(connection: Connection, tableName: TableName, scan: Scan): Vector[Result] = {
  val table: Table = connection.getTable(tableName)
  val scanner: ResultScanner = table.getScanner(scan)
  @tailrec
  def loop(results: Vector[Result]): Vector[Result] = {
    val result = scanner.next()
    if (result == null) results
    else loop(results :+ result)
  }
  try {
    loop(Vector.empty)
  } finally {
    scanner.close()
    table.close()
  }
}
Cons:
● a thread is not released until the whole scan is finished
● throughput is bounded by the number of threads in the pool
● long-running blocking calls cause serious performance problems in event-loop-style applications like Akka HTTP
Gist
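The thread-pool exhaustion described above can be reproduced in plain Scala. This is a toy model under stated assumptions, not ChatWork's code: a fixed pool of two handler threads where each handler blocks for the whole "scan", so a third, cheap request cannot even start until a blocked thread is freed.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

// a small pool of request-handler threads
val pool = Executors.newFixedThreadPool(2)
val started = new AtomicInteger(0)
val latch = new CountDownLatch(1)

// two long blocking "scans" occupy both threads (like scanner.next() waiting on IO)
for (_ <- 1 to 2) pool.submit(new Runnable {
  def run(): Unit = { started.incrementAndGet(); latch.await() }
})
// a short request queues up behind them and cannot start
pool.submit(new Runnable { def run(): Unit = { started.incrementAndGet(); () } })

Thread.sleep(200)
val startedWhileBlocked = started.get() // only the two blocking tasks are running
latch.countDown()                       // the blocking calls finally return ("timeout")
pool.shutdown()
pool.awaitTermination(1, TimeUnit.SECONDS)
val startedAfterRelease = started.get() // now the queued request has also run
```

With more request types than pool threads, every slow dependency call can starve unrelated work in the same way.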
Throughput and Latency trade-off
in asynchronous and synchronous settings
asynchronous : throughput=8, latency=2
synchronous: throughput=4, latency=1
Asynchronous setting is more flexible and fair!
synchronous vs asynchronous
● Optimized for: latency (synchronous) / throughput (asynchronous)
● Under high workload: throughput is bounded (synchronous) / throughput increases while sacrificing latency (asynchronous)
● Under low workload: both have equal latency and throughput
● Requests for many rows: are executed exclusively (synchronous) / are evenly scheduled as small requests (asynchronous)
Asynchronous streaming of Scan operation
with Akka Stream
import akka.stream.{Attributes, Outlet, SourceShape}
import akka.stream.stage.{GraphStage, GraphStageLogic, OutHandler}
import org.apache.hadoop.hbase.TableName
import org.apache.hadoop.hbase.client.{Connection, Result, ResultScanner, Scan, Table}

class HBaseScanStage(connection: Connection, tableName: TableName, scan: Scan)
  extends GraphStage[SourceShape[Result]] {

  val out: Outlet[Result] = Outlet("HBaseScanSource")
  override def shape: SourceShape[Result] = SourceShape(out)

  override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
    new GraphStageLogic(shape) {
      var table: Table = _
      var scanner: ResultScanner = _

      override def preStart(): Unit = {
        table = connection.getTable(tableName)
        scanner = table.getScanner(scan)
      }

      setHandler(out, new OutHandler {
        override def onPull(): Unit = {
          val next = scanner.next()
          if (next == null) complete(out)
          else push(out, next)
        }
      })

      override def postStop(): Unit = {
        if (scanner != null) scanner.close()
        if (table != null) table.close()
        super.postStop()
      }
    }
}
● ResultScanner#next() is passively called inside a callback in a thread-safe way
● the thread is released immediately after a single ResultScanner#next() call
● Results are pushed downstream asynchronously
● when and how many times next() is called is determined by downstream
Gist
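The pull-driven, buffered element delivery that this pattern relies on can be sketched without Akka. The names below (`PagedPuller`, `fetchPage`) are hypothetical, not part of any library: pages of rows are fetched lazily, and each "pull" hands out one element, buffering the rest of the page.

```scala
// Plain-Scala sketch of pull-based paging: fetchPage returns the next page,
// or None when the scan is exhausted.
final class PagedPuller[A](fetchPage: () => Option[Vector[A]]) {
  private var buffer: List[A] = Nil
  private var done = false
  // one "onPull": take from the buffer, fetching a new page only when it is empty
  def pull(): Option[A] =
    if (done) None
    else if (buffer.nonEmpty) { val h = buffer.head; buffer = buffer.tail; Some(h) }
    else fetchPage() match {
      case Some(page) if page.nonEmpty => buffer = page.toList; pull()
      case _                           => done = true; None
    }
}

// usage: three pages arrive, the caller sees a flat element stream
var pages = List(Vector(1, 2), Vector(3, 4), Vector(5))
val puller = new PagedPuller[Int](() => pages match {
  case p :: rest => pages = rest; Some(p)
  case Nil       => None
})
val drained = Iterator.continually(puller.pull()).takeWhile(_.isDefined).flatten.toList
```

The GraphStage version adds what this sketch omits: asynchronous fetches and thread-safe handoff back into the stage via callbacks.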
Problems we faced
caused by a native HBase client
Just a single unresponsive HBase region server caused whole-system degradation
The call queue size of the hslave-5 region server spiked.
All Message Read API servers suffered latency increases and throughput falls.
Distributed systems are supposed to fail partially ー so why did the whole system fail?
● The native HBase client uses blocking IO
● Requests to the unresponsive HBase region server block a thread until timeout
● All threads in the thread pool are consumed, so the Message Read API servers were not able to respond
(Charts: HBase IPC queue size; thread pool status in the Read API servers ー #active threads vs. the upper limit of the pool size)
What we learned:
Asynchronous streaming is not enough.
Non-blocking IO matters.
Migration to asynchbase
asynchbase
Non-blocking HBase client based on Netty
● https://github.com/OpenTSDB/asynchbase
● Netty 3.9
● Supports reverse scan since 1.8
● Asynchronous interface by Deferred
○ https://github.com/OpenTSDB/async
○ Observer pattern that provides callback interfaces
● Thread safety provided by Deferred
○ Event loop executes volatile checks at each step
○ Safe to mutate states inside callbacks
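To make the Deferred idea concrete, here is a minimal observer-pattern sketch with hypothetical names ー it is not the real `com.stumbleupon.async.Deferred` API, and it omits the thread-safety machinery (volatile checks in the event loop) that the real class provides.

```scala
// A result holder that either fires registered callbacks on completion,
// or fires a late-registered callback immediately if already completed.
final class MiniDeferred[A] {
  private var result: Option[Either[Throwable, A]] = None
  private var callbacks: List[Either[Throwable, A] => Unit] = Nil

  def addCallbacks(onResult: A => Unit, onError: Throwable => Unit): Unit = {
    val cb: Either[Throwable, A] => Unit = {
      case Right(a) => onResult(a)
      case Left(e)  => onError(e)
    }
    result match {
      case Some(r) => cb(r)            // already completed: fire immediately
      case None    => callbacks ::= cb // otherwise register for later
    }
  }

  def callback(a: A): Unit = {
    result = Some(Right(a))
    callbacks.foreach(_(Right(a)))
  }
}

// usage: register first, complete later
var seen: List[Int] = Nil
val d = new MiniDeferred[Int]
d.addCallbacks(a => seen ::= a, _ => ())
d.callback(42)
```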
Introducing a streaming interface to asynchbase with Akka Stream
import java.util

import scala.collection.JavaConverters._

import akka.stream.{Attributes, Outlet, SourceShape}
import akka.stream.stage.{GraphStage, GraphStageLogic, OutHandler}
import org.hbase.async.{KeyValue, Scanner}

class HBaseAsyncScanStage(scanner: Scanner)
  extends GraphStage[SourceShape[util.ArrayList[KeyValue]]] with HBaseCallbackConversion {

  val out: Outlet[util.ArrayList[KeyValue]] = Outlet("HBaseAsyncScanStage")
  override def shape: SourceShape[util.ArrayList[KeyValue]] = SourceShape(out)

  override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
    new GraphStageLogic(shape) {
      var buffer: List[util.ArrayList[KeyValue]] = List.empty

      setHandler(out, new OutHandler {
        override def onPull(): Unit = {
          if (buffer.isEmpty) {
            val deferred = scanner.nextRows()
            deferred.addCallbacks(
              (results: util.ArrayList[util.ArrayList[KeyValue]]) => callback.invoke(Option(results)),
              (e: Throwable) => errorback.invoke(e)
            )
          } else {
            val (element, tailBuffer) = (buffer.head, buffer.tail)
            buffer = tailBuffer
            push(out, element)
          }
        }
      })

      override def postStop(): Unit = {
        scanner.close()
        super.postStop()
      }

      private val callback = getAsyncCallback[Option[util.ArrayList[util.ArrayList[KeyValue]]]] {
        case Some(results) if !results.isEmpty =>
          val element = results.remove(0)
          buffer = results.asScala.toList
          push(out, element)
        case Some(results) if results.isEmpty => complete(out)
        case None                             => complete(out)
      }

      private val errorback = getAsyncCallback[Throwable] { error => fail(out, error) }
    }
}
※ This code contains a serious issue: downstream cancellation must be handled properly.
Otherwise a Close request may be fired while a NextRows request is still running,
which violates the HBase protocol.
See how to solve this problem on the Gist.
Gist
Customizing Scan behavior with
downstream pipelines
HBaseAsyncScanSource(scanner).take(1000)

HBaseAsyncScanSource(scanner)
  .throttle(elements = 100, per = 1.second, maximumBurst = 100, ThrottleMode.Shaping)

HBaseAsyncScanSource(scanner).completionTimeout(5.seconds)

HBaseAsyncScanSource(scanner).recoverWithRetries(10, {
  case _: NotServingRegionException => HBaseAsyncScanSource(scanner)
})
● early termination of the scan when the row count limit is reached
● rate limiting of the scan iteration
● early termination of the scan by timeout
● retrying when a region server is not serving
Gist
Switching from synchronous API to
asynchronous API
● Switching from a synchronous API to an asynchronous API usually requires rewriting whole APIs
● Abstracting database drivers is difficult
● Starting with an asynchronous interface like Future[T] is a good practice
● Another option for an abstract interface is streams
● Streams can behave like collections such as Future, Option, List, and Try, but do not require monad transformers to integrate with each other
● A stream interface specification like reactive-streams (JEP 266) gives a way to connect various asynchronous libraries
● Akka Stream is one of the implementations of reactive-streams
Database access abstraction with streams
Transport Interface Layer
interface: Directive[T], Future[T]
engine: Akka HTTP
Stream Adaptor
interface: Source[Out, M], Flow[In, Out, M], Sink[In, M]
engine: Akka Stream
Database Interface Layer
interface: implementation specific
engine: database driver
● native HBase client
● asynchbase
● HBaseScanStage
● HBaseAsyncScanStage
● ReadMessageDAS
UseCase Layer
interface: Source[Out, M], Flow[In, Out, M], Sink[In, M]
engine: Akka Stream
Domain Layer
interface: Scala collections and case classes
engine: Scala standard library
● The stream abstraction mitigates the impact of changes to the underlying implementations
● The database access implementation can be switched by factory functions
● No change was required inside the UseCase and Domain layers
Database access abstraction with streams
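The factory-switching idea above can be sketched in plain Scala. All names here (`MessageScanner`, `scannerFor`, the two objects) are hypothetical, and `Iterator` stands in for Akka Stream's `Source`: the UseCase layer depends only on the abstract interface, while one factory function decides which driver backs it.

```scala
// The abstraction the UseCase layer programs against.
trait MessageScanner { def scan(roomId: String): Iterator[String] }

// Two interchangeable Database Interface Layer implementations.
object NativeScanner extends MessageScanner {        // e.g. backed by the native HBase client
  def scan(roomId: String): Iterator[String] = Iterator(s"$roomId:blocking")
}
object AsyncScanner extends MessageScanner {         // e.g. backed by asynchbase
  def scan(roomId: String): Iterator[String] = Iterator(s"$roomId:non-blocking")
}

// Factory function: the only place that knows which implementation is used.
def scannerFor(useAsync: Boolean): MessageScanner =
  if (useAsync) AsyncScanner else NativeScanner

// Switching the flag changes the driver without touching UseCase/Domain code.
val rows = scannerFor(useAsync = true).scan("room1").toList
```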
Blocking IO vs Non-blocking IO
performance test results
Fortunately, we have not faced HBase issues since the asynchbase migration in production.
The following slides show performance test results that were collected before the asynchbase deployment.
Blocking IO vs Non-blocking IO
performance test settings
● Single Message Read API server
○ JVM heap size=4GiB
○ CPU request=3.5
○ CPU limit=4
● Production workload pattern simulated with the Gatling stress tool
● 1340 requests/second
● mainly invokes HBase Scan, but there are Get and batch Get requests as well
Both implementations, with asynchbase and with the native HBase client, are tested under the same conditions.
Blocking IO vs Non-blocking IO
throughput
Message Read API server
with native HBase client
Message Read API server
with asynchbase
throughput: 1000 → 1300
Blocking IO vs Non-blocking IO
latency
Message Read API server with
native HBase client
Message Read API server
with asynchbase
※ Note that the scales of y-axis are different.
99pt.: 2000ms → 300ms
95pt.: 1000ms → 200ms
Blocking IO vs Non-blocking IO
Thread pool usage
Message Read API server with
native HBase client
Message Read API server
with asynchbase
Note that hbase-dispatcher is an application thread pool, not a Netty IO worker thread pool.
pool size: 600 → 8
active threads: 80 → 2
Blocking IO vs Non-blocking IO
JVM heap usage
Message Read API server with
native HBase client
Message Read API server with
asynchbase
heap usage: 2.6GiB → 1.8GiB
Blocking IO vs Non-blocking IO
HBase scan metrics
Message Read API server with
native HBase client
Message Read API server with
asynchbase
average of the sum of milliseconds between next() calls (same metric, both panels)
HBase scan metrics may come to asynchbase
https://github.com/OpenTSDB/asynchbase/pull/184
Room for improvement
Timeouts and Rate limiting
● Proper timeouts and rate limiting are necessary for asynchronous, non-blocking systems
○ Without such reins, an asynchronous system increases its throughput until it consumes all resources
● Timeouts
○ completionTimeout: timeout based on total processing time
■ Not ideal for Scan, which has a broad distribution of processing times
○ idleTimeout: timeout based on the processing time between two elements
■ A single iteration of Scan has a sharp distribution of processing times, so this is probably a better strategy
● Rate limiting
○ Under high workload, the first bottleneck is the throughput of HBase's storage
■ How to implement storage-aware rate limiting?
■ Tuning application resources may be necessary
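The difference between the two timeout strategies can be made concrete with a small sketch. `checkIdleGaps` is a hypothetical helper, not an Akka Stream API: it fails a scan only when the gap between two consecutive elements exceeds a limit, regardless of the total duration.

```scala
// idleTimeout semantics: every gap between consecutive element timestamps
// must stay under the limit; total duration does not matter.
def checkIdleGaps(timestampsMs: Vector[Long], idleTimeoutMs: Long): Boolean =
  timestampsMs.zip(timestampsMs.drop(1)).forall { case (a, b) => b - a <= idleTimeoutMs }

// A long but steadily progressing scan: 5 elements over 4 seconds.
val ts = Vector(0L, 1000L, 2000L, 3000L, 4000L)

val idleOk = checkIdleGaps(ts, 1500L)            // each 1s gap is under 1.5s
val totalWithinLimit = (ts.last - ts.head) <= 3000L // a 3s completionTimeout would fail it
```

This is why a per-element idle timeout suits Scan better: it tolerates long scans while still catching a stalled iteration quickly.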
Conclusion
● Blocking IO spoils the benefits of distributed databases
○ a partial failure of the database exhausts application threads and makes the application unresponsive
● Non-blocking IO is resilient to partial failure
● Asynchronous streams are great as a flexible execution model and abstract interface
● An asynchronous stream with non-blocking IO outperforms a blocking one
● Our journey toward a resilient system continues
