Microservices communication styles and event bus

Communication Styles and Event Bus
(From the Perspective of Event Sourcing and CQRS)
By Touraj Ebrahimi
Github: toraj58
Bitbucket: toraj58
Twitter:@toraj58
YouTube channel: https://www.youtube.com/channel/UCcLcw6sTk_8G6EgfBr0E5uA
LinkedIn: www.linkedin.com/in/touraj-ebrahimi-956063118/

Contents
Communication type
Communication style
Messaging
Messaging Protocols (Most Popular)
Asynchronous messaging technologies (Message Brokers)
Event Bus Features
Trends of Message Brokers
Kafka Cluster
RabbitMQ
Kafka
ActiveMQ
Kestrel
List of Message Broker technologies
Comparison of Message Brokers (RabbitMQ, ActiveMQ, Kafka, ZeroMQ)
Protocols (Most Common protocols used in Message-Brokers)
RabbitMQ Messaging Example with Scala and Spring Framework
RabbitMQ Web Console
ActiveMQ Web Console
Eventuate
Eventuate Database Model
Debezium
What is Change Data Capture?
Debezium Connectors
Debezium Usage
Why Debezium is distributed
Directly monitor a single database by Debezium
How Debezium works
Event Store
Event Store Dashboard

Communication type
• Asynchronous
• Synchronous
Communication style
• Remote Procedure Invocation
• Messaging
• Domain-specific protocol
Messaging
Use asynchronous messaging for inter-service communication. Services communicate by exchanging
messages over messaging channels.
Messaging Protocols (Most Popular)
• AMQP
• MQTT
• STOMP
• OpenWire
• Proprietary Protocols (e.g. Kafka Protocol)
Asynchronous messaging technologies (Message Brokers)
• Apache Kafka
• RabbitMQ
• ActiveMQ
• Kestrel (From Twitter)
• HornetQ (From JBoss)

Event Bus Features
Simple yet powerful: A software architecture benefits greatly from decoupled components. With events,
subscribers do not have to know about senders.
Battle tested: It should already be used by many apps and products that have a good reputation.
High Performance: It should be fast, but we should keep a reasonable balance between performance
and reliability.
Asynchronous: Communication over the Event Bus should be asynchronous. This is good for a
non-blocking UI and, where possible, for non-blocking IO.
Zero Configuration: It should be possible to use the Event Bus with zero or minimal configuration and
get it up and running very quickly.
Configurable: It should be possible to configure the Event Bus and adjust its behavior to the
requirements, and doing so should be easy for developers and system engineers.
Easy for Developers: The Event Bus should be easy for developers to understand and to code against.
It should be designed according to best practices and design patterns.
High Availability: The Event Bus should be highly available, because if it fails or goes down frequently
the whole system will be in serious trouble. According to the CAP theorem, formulated by Eric Brewer,
a distributed system cannot guarantee both availability and consistency when a network partition occurs.
Fault-Tolerant: Because an event bus (especially a message broker) can add a single point of failure
to the whole system, it should be reliable and have good fault tolerance.
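To make the decoupling point concrete, here is a minimal in-process sketch in Java. The class name SimpleEventBus and the topic name are made up for illustration and are not the API of any broker discussed in this document. Delivery is synchronous to keep the sketch short; a real bus would hand events to a queue or an executor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical in-process event bus; an illustration only, not a broker API.
class SimpleEventBus {
    // topic name -> handlers registered for that topic
    private final Map<String, List<Consumer<String>>> handlers = new ConcurrentHashMap<>();

    // Register a handler. The subscriber never learns who publishes.
    void subscribe(String topic, Consumer<String> handler) {
        handlers.computeIfAbsent(topic, t -> new CopyOnWriteArrayList<>()).add(handler);
    }

    // Deliver the event to every handler of the topic and return the number of deliveries.
    int publish(String topic, String event) {
        List<Consumer<String>> subscribers = handlers.getOrDefault(topic, List.of());
        for (Consumer<String> handler : subscribers) {
            handler.accept(event);
        }
        return subscribers.size();
    }

    public static void main(String[] args) {
        SimpleEventBus bus = new SimpleEventBus();
        List<String> received = new ArrayList<>();
        bus.subscribe("user.registered", received::add);
        bus.subscribe("user.registered", e -> System.out.println("mail service got: " + e));
        bus.publish("user.registered", "alice@example.com");
        System.out.println("collected: " + received);
    }
}
```

The publisher only knows the topic name, so subscribers can be added or removed without touching the sending code.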

Kafka Cluster
RabbitMQ
RabbitMQ is a well-known and popular message broker with many powerful features. The
documentation on the RabbitMQ web site is excellent and there are many books available.
RabbitMQ is written in Erlang, not a widely used programming language but one well adapted to
such tasks. The company Pivotal develops and maintains RabbitMQ. I reviewed version 3.2.2 on
CentOS 6 servers.
The installation was easy: I installed Erlang version R14B from EPEL and the RabbitMQ rpm. The
only small issue I had was that the server expects “127.0.0.1” to be resolved in /etc/hosts and
the OpenStack VMs I used were missing that. Easy to fix. I also installed and enabled the
management plugin.
The RabbitMQ configuration is set in the rabbitmq.config file and it has tons of adjustable
parameters. I used the defaults. In terms of client APIs, RabbitMQ supports a long list of languages,
and some standard protocols, like STOMP, are available through a plugin. Queues and topics can be
created either through the web interface or directly through the client API. If you have more than one
node, they can be clustered and then queues and topics can be replicated to other servers.

I created 4 queues, wrote a Ruby client and started inserting messages. I got a publishing rate of
about 20k/s using multiple threads, but I got a few stalls caused by the
vm_memory_high_watermark; from my understanding, during those stalls it was writing to disk. Not
exactly awesome given my requirements. Also, some part is always kept in memory even if a
queue is durable so, even though I had plenty of disk space, the memory usage grew and
eventually hit the vm_memory_high_watermark setting. The CPU load was pretty high during the
test, between 40% and 50% on an 8-core VM.
Even though my requirements were not met, I set up a replicated queue on 2 nodes and inserted a
few million objects. I killed one of the two nodes and inserts were even faster, but then… I made a
mistake. I restarted the node and asked for a resync. Either I didn’t set it up correctly or the resync is
poorly implemented, but it took forever to resync and it was slowing down as it progressed. At
58% done, it had been running for 17h, with one thread at 100%. My patience was exhausted.
So, lots of features and decent performance, but behavior not compatible with the requirements.
Kafka
Kafka was originally designed at LinkedIn, is written in Scala and Java, and is now under the
Apache project umbrella. Sometimes you look at a technology and you just say: wow, this is
really done the way it should be. At least I could say that for the purpose I had. What is so
special about Kafka is the architecture: it stores the messages in flat files and consumers ask for
messages based on an offset. Think of it like a MySQL server (the producer) saving messages
(SQL updates) to its binlogs, with slaves (the consumers) asking for messages based on an offset.
The server is pretty simple and doesn’t care much about the consumers. That simplicity makes it
super-fast and low on resources. Old messages can be retained on a time basis (like expire_logs_days)
and/or on a storage usage basis.
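The offset idea can be sketched in a few lines of Java. This is a toy model under stated assumptions, not Kafka's client API: the class OffsetLog and its methods are hypothetical, the log lives in memory instead of flat files, and retention is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of an offset-addressed log; this is not Kafka's API.
class OffsetLog {
    private final List<String> messages = new ArrayList<>();

    // Append a message and return the offset it was stored at.
    long append(String message) {
        messages.add(message);
        return messages.size() - 1;
    }

    // Return up to maxMessages starting at the given offset.
    // The log keeps no per-consumer state; the caller supplies its own position.
    List<String> read(long offset, int maxMessages) {
        int from = (int) Math.min(Math.max(offset, 0), messages.size());
        int to = Math.min(from + maxMessages, messages.size());
        return new ArrayList<>(messages.subList(from, to));
    }

    public static void main(String[] args) {
        OffsetLog log = new OffsetLog();
        for (int i = 0; i < 5; i++) {
            log.append("update-" + i);
        }
        long offsetA = 0; // each consumer tracks its own position
        long offsetB = 3;
        List<String> batchA = log.read(offsetA, 2);
        offsetA += batchA.size();
        System.out.println("A read " + batchA + ", next offset " + offsetA);
        System.out.println("B read " + log.read(offsetB, 10));
    }
}
```

Because the position belongs to the consumer, two consumers can read the same log at different speeds without the server tracking either of them.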
So, if the server doesn’t keep track of what has been consumed on each topic, how can you
have multiple consumers? The missing element here is Zookeeper. The Kafka server uses
Zookeeper for cluster membership and routing, while the consumers can also use Zookeeper or
something else for synchronization. The sample consumer provided with the server uses
Zookeeper, so you can launch many instances and they’ll synchronize automatically. For those
who don’t know Zookeeper, it is a highly-available synchronous distributed storage system. If
you know Corosync, it provides somewhat the same functionality.
Feature-wise, Kafka isn’t that great. There’s no built-in web frontend, although a few are
available in the ecosystem. Routing and rules are nonexistent and stats are only available through
JMX. But the performance… I reached a publishing speed of 165k messages/s over a single thread; I
didn’t bother tuning for more. Consuming was essentially disk bound on the server, 3M messages/s…
amazing. That was without Zookeeper coordination. Memory and CPU usage were modest.
To test clustering, I created a replicated queue, inserted a few messages, stopped a replica,
inserted a few million more messages and restarted the replica. It took only a few seconds to
resync.
So, Kafka is a very good fit for the requirements: stellar performance and low resource usage.
ActiveMQ
ActiveMQ is another big player in the field with an impressive feature set. ActiveMQ is more in
the RabbitMQ league than Kafka's and, like Kafka, it is written in Java. HA can be provided by the
storage backend; LevelDB supports replication but I had some issues with it. My requirements are
not for full HA, just to make sure the publishers are never blocked, so I dropped the storage
backend replication in favor of a mesh of brokers.
My understanding of the mesh of brokers is that you connect to one of the members and you
publish or consume a message. You don’t know on which node(s) the queue is located; the
broker you connect to knows and routes your request. To further help, you can specify all the
brokers in the connection string and the client library will just reconnect to another one if the one
you are connected to goes down. That looks pretty good for the requirements.
With the mesh of brokers setup, I got an insert rate of about 5000 msg/s over 15 threads and a
single consumer was able to read 2000 msg/s. I let it run for a while and got 150M messages. At
this point though, I lost the web interface and the publishing rate was much slower.
So, a big beast, lots of features, decent performance, on the edge with the requirements.
Kestrel
Kestrel is another interesting broker, this time more like Kafka. Written in Scala, the Kestrel
broker speaks the memcached protocol. Basically, the key becomes the queue name and the
object is the message. Kestrel is very simple: queues are defined in a configuration file, but you
can specify, per queue, storage limits, expiration and the behavior when limits are reached. With a
setting like “discardOldWhenFull = true”, my requirement of never blocking the publishers is
easily met.
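The behavior that a setting like discardOldWhenFull describes can be illustrated with a small bounded queue in Java. This is only a sketch of the idea and not Kestrel's implementation; the class DiscardOldestQueue and its item-count limit are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical "discard oldest when full" queue; not Kestrel's actual code.
class DiscardOldestQueue {
    private final Deque<String> items = new ArrayDeque<>();
    private final int maxItems;

    DiscardOldestQueue(int maxItems) {
        if (maxItems <= 0) {
            throw new IllegalArgumentException("maxItems must be positive");
        }
        this.maxItems = maxItems;
    }

    // Never blocks: when the queue is full, the oldest message is dropped.
    // Returns the dropped message, or null if nothing was dropped.
    synchronized String put(String message) {
        String dropped = null;
        if (items.size() == maxItems) {
            dropped = items.pollFirst();
        }
        items.addLast(message);
        return dropped;
    }

    // Returns the oldest remaining message, or null if the queue is empty.
    synchronized String get() {
        return items.pollFirst();
    }

    public static void main(String[] args) {
        DiscardOldestQueue queue = new DiscardOldestQueue(2);
        for (String m : new String[] {"m1", "m2", "m3"}) {
            String dropped = queue.put(m);
            System.out.println("put " + m + (dropped == null ? "" : ", dropped " + dropped));
        }
        System.out.println("next message: " + queue.get());
    }
}
```

The trade-off is explicit: publishers are never blocked, at the price of losing the oldest unread messages when consumers fall behind.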
In terms of clustering, Kestrel is a bit limited, but each server can publish its availability to
Zookeeper so that publishers and consumers can be informed of a missing server and adjust. Of
course, if you have many Kestrel servers with the same queue defined, the consumers will need to
query all of the brokers to get the messages back, and strict ordering can be a bit hard.
In terms of performance, a few simple bash scripts using nc to publish messages easily reached
10k messages/s, which is very good. The rate is stable over time and likely limited by the
reconnection for each message. The presence of consumers slightly reduces the publishing rate
but nothing drastic. The only issue I had was that when a large number of messages expired, the
server froze for some time, but that was because I forgot to set maxExpireSweep to something
like 100 and all the messages were removed in one pass.
So, a fairly good impression of Kestrel: simple but works well.

List of Message Broker technologies
• Apache ActiveMQ
• Apache Kafka
• Apache Qpid
• Celery
• Cloverleaf (E-Novation Lifeline)
• Comverse Message Broker (Comverse Technology)
• Enduro/X Transactional Message Queue (TMQ)
• Financial Fusion Message Broker (Sybase)
• Fuse Message Broker (enterprise ActiveMQ)
• Gearman
• HornetQ (Red Hat)
• IBM Integration Bus
• IBM Message Queues
• JBoss Messaging (JBoss)
• JORAM
• Microsoft Azure Service Bus (Microsoft)
• Microsoft BizTalk Server (Microsoft)
• NATS (MIT Open Source License, written in Go)
• Open Message Queue
• Oracle Message Broker (Oracle Corporation)
• QDB (Apache License 2.0, supports message replay by timestamp)
• RabbitMQ (Mozilla Public License, written in Erlang)
• Redis, an open source, in-memory data structure store used as a database, cache and
message broker
• SAP PI (SAP AG)
• Solace Systems Message Router
• Spread Toolkit
• Tarantool, a NoSQL database with a set of stored procedures for message queues
• WSO2 Message Broker

Comparison of Message Brokers (RabbitMQ, ActiveMQ, Kafka, ZeroMQ)

1. Clustering / load-balancing mechanism
   RabbitMQ: Clustering is available, but queue clustering has to be handled separately. Clustered
   queues serve HA only, not load balancing.
   ActiveMQ: Feature available.
   ZeroMQ: Can be achieved by writing a lot of custom code.
   Kafka: Available, but the producer has to know which partition it is writing to.

2. Replication among different nodes
   RabbitMQ: Available.
   ActiveMQ: Available.
   ZeroMQ: Not automatic, as there is no broker. It can be coded, but needs a lot of customization.
   Kafka: Available.

3. Fault tolerance; turnaround time in case of failure
   RabbitMQ: Durable queues, durable messages and clustering support. Another cluster node will
   take over, but for queues it is different (the client has to establish the connection again with the
   new node).
   ActiveMQ: Durable queues, topics and durable consumers are supported, and availability is
   ensured through clustering.
   ZeroMQ: Features available, but not out of the box.
   Kafka: Zookeeper is required to manage it.

4. Supported libraries for Go and other languages like .NET (CRM, ERP and CMS are on the Windows stack)
   RabbitMQ: Available for Java, Go, Python and .NET.
   ActiveMQ: Go client not available. A REST-based HTTP interface is available.
   ZeroMQ: Go support available.
   Kafka: Go support available.

5. Security
   RabbitMQ: A basic level of authentication exists, such as restricting users for
   read/write/configure (administration).
   ActiveMQ: Authentication is supported through different plugins.
   ZeroMQ: One has to build it on top.
   Kafka: Not available in the current version.

6. Interoperability in case the message broker is to be changed (no binding)
   RabbitMQ: AMQP 0.9 compliant, so replacing one AMQP-compliant broker with another should
   not need a change in client code. A REST-based plugin is available.
   ActiveMQ: Same as RabbitMQ. It is AMQP 1.0 compliant.
   ZeroMQ: A specific client has to be written.
   Kafka: REST interface plugins are available.

7. Performance throughput (read/write)
   RabbitMQ: Moderate, according to the available benchmark data. (I read on the Pivotal blog that
   it can receive and deliver more than one million messages per second.)
   ActiveMQ: Comparable to RabbitMQ.
   ZeroMQ: Very fast.
   Kafka: Very fast.

8. Administration interface
   RabbitMQ: Available, HTTP based, with basic functionality.
   ActiveMQ: Basic web console.
   ZeroMQ: Not available; has to be built.
   Kafka: Very basic interface. A third-party web console is available. Fewer features than the
   RabbitMQ interface, for example no user management.

9. Open source
   RabbitMQ: Yes.
   ActiveMQ: Yes.
   ZeroMQ: Yes.
   Kafka: Yes.

10. Support for Big Data
   RabbitMQ: Publishing and consumption rates are lower than Kafka's, so it can become a
   bottleneck in situations such as click streams, where continuous publishing without a pause is
   required. The Apache project "Flume" can be used to transfer data to Hadoop.
   ActiveMQ: Same as for RabbitMQ. Flume can also be used with ActiveMQ, as it works with AMQP.
   ZeroMQ: Good in terms of fast writing and reading.
   Kafka: Kafka Hadoop Consumer API.

11. Push notification
   RabbitMQ: Libraries support both push and pull notification.
   ActiveMQ: Libraries support both push and pull notification.
   ZeroMQ: Libraries support both push and pull notification.
   Kafka: Libraries support both push and pull notification.

12. Other
   Kafka: The worker has to keep track of what it has consumed; the broker does not take care of it.
   Messages remain in storage until a specified time. The worker has to provide the partition id and
   broker details.
Protocols (Most Common protocols used in Message-Brokers)
• AMQP: The Advanced Message Queuing Protocol (AMQP) is an open standard application layer
protocol for message-oriented middleware. The defining features of AMQP are message
orientation, queuing, routing (including point-to-point and publish-and-subscribe), reliability and
security.
• MQTT: MQTT (MQ Telemetry Transport or Message Queue Telemetry Transport) is an ISO
standard (ISO/IEC PRF 20922) publish-subscribe-based "lightweight" messaging protocol for use
on top of the TCP/IP protocol.
• STOMP: Simple (or Streaming) Text Oriented Message Protocol (STOMP), formerly known as
TTMP, is a simple text-based protocol designed for working with message-oriented middleware
(MOM). It provides an interoperable wire format that allows STOMP clients to talk with any
message broker supporting the protocol.
• OpenWire: OpenWire is ActiveMQ's cross-language wire protocol, which allows native access to
ActiveMQ from a number of different languages and platforms. The Java OpenWire transport is
the default transport in ActiveMQ 4.x or later.
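To show what "simple text-based protocol" means for STOMP, the sketch below assembles a SEND frame in Java: a command line, header lines, a blank line, the body and a terminating NUL byte. The class StompFrame and the destination name are hypothetical, header value escaping is deliberately left out, and nothing is sent over the network.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that only formats a STOMP frame as text; no networking, no escaping.
class StompFrame {
    // Frame layout: COMMAND, one "name:value" line per header, blank line, body, NUL terminator.
    static String build(String command, Map<String, String> headers, String body) {
        StringBuilder frame = new StringBuilder(command).append('\n');
        for (Map.Entry<String, String> header : headers.entrySet()) {
            frame.append(header.getKey()).append(':').append(header.getValue()).append('\n');
        }
        return frame.append('\n').append(body).append('\0').toString();
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("destination", "/queue/registrations"); // assumed queue name
        headers.put("content-type", "text/plain");
        String frame = build("SEND", headers, "hello");
        // Show the NUL terminator as ^@ so the frame is readable on a terminal.
        System.out.println(frame.replace("\0", "^@"));
    }
}
```

Because the frame is plain text, any language that can open a TCP socket and write strings can talk to a STOMP-capable broker.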

RabbitMQ Messaging Example with Scala and Spring Framework
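The following component, written in Scala, sends a message via RabbitMQ using the Spring
Framework’s RabbitTemplate:

import org.springframework.amqp.rabbit.core.RabbitTemplate
import org.springframework.beans.factory.annotation.Autowired
import org.springframework.dao.DuplicateKeyException // assumed to be Spring's exception class
import org.springframework.http.HttpStatus
import org.springframework.validation.annotation.Validated
import org.springframework.web.bind.annotation._

// RegisteredUserRepository, RegisteredUser, RegistrationRequest, RegistrationResponse,
// NewRegistrationNotification and MessagingNames are defined elsewhere in the application.
@RestController
class UserRegistrationController @Autowired()(registeredUserRepository: RegisteredUserRepository,
                                              rabbitTemplate: RabbitTemplate) {

  import MessagingNames._

  @RequestMapping(value = Array("/user"), method = Array(RequestMethod.POST))
  def registerUser(@Validated @RequestBody request: RegistrationRequest) = {
    // Persist the user first, then publish the notification to the exchange.
    val registeredUser = new RegisteredUser(null, request.emailAddress, request.password)
    registeredUserRepository.save(registeredUser)
    rabbitTemplate.convertAndSend(exchangeName, routingKey,
      NewRegistrationNotification(registeredUser.id, request.emailAddress, request.password))
    RegistrationResponse(registeredUser.id, request.emailAddress)
  }

  // Map a duplicate key error to HTTP 409.
  @ResponseStatus(value = HttpStatus.CONFLICT, reason = "duplicate email address")
  @ExceptionHandler(Array(classOf[DuplicateKeyException]))
  def duplicateEmailAddress(): Unit = {}
}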

Eventuate
Eventuate is a framework for implementing Event Sourcing and CQRS in a distributed architecture
with scalability, availability and eventual consistency. It also supports Change Data Capture. Its
infrastructure is based on Kafka, ZooKeeper, Debezium and MySQL.
Cons
• MySQL dependent
• It seems that performance does not meet our needs
• The project is not mature and is a new player in the field
• Relatively obscure
• Database-specific solutions
• Tricky to avoid duplicate publishing
• Low-level DB changes make it difficult to determine the business-level events
Pros
• Ease of use
• Full Docker support
• Kafka and ZooKeeper as backbone
• No 2PC
• No application changes required
• Guaranteed to be accurate

Big Picture
Eventuate Database Model:
Eventuate stores data and events in two tables in the MySQL database, called events and entities.
The following is the script for generating these tables:

CREATE DATABASE eventuate;
GRANT ALL PRIVILEGES ON eventuate.* TO 'mysqluser'@'%' WITH GRANT OPTION;
USE eventuate;

DROP TABLE IF EXISTS events;
DROP TABLE IF EXISTS entities;

CREATE TABLE events (
  event_id VARCHAR(1000) PRIMARY KEY,
  event_type VARCHAR(1000),
  event_data VARCHAR(1000),
  entity_type VARCHAR(1000),
  entity_id VARCHAR(1000),
  triggering_event VARCHAR(1000)
);

CREATE INDEX events_idx ON events(entity_type, entity_id, event_id);

CREATE TABLE entities (
  entity_type VARCHAR(1000),
  entity_id VARCHAR(1000),
  entity_version VARCHAR(1000),
  PRIMARY KEY(entity_type, entity_id)
);
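As a rough illustration of how an event-sourced entity could be rebuilt from rows shaped like the events table, the Java sketch below filters by entity type and id, orders by event id and applies each event in turn. It is not Eventuate's code: the class EventReplay, the event types UserRegistered and EmailChanged, the plain-string event data and the assumption that event ids sort in creation order are all made up for the example.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical replay over rows shaped like the events table; not Eventuate's implementation.
class EventReplay {
    // One row of the events table, reduced to the columns the replay needs.
    static final class Event {
        final String eventId;
        final String eventType;
        final String eventData;
        final String entityType;
        final String entityId;

        Event(String eventId, String eventType, String eventData, String entityType, String entityId) {
            this.eventId = eventId;
            this.eventType = eventType;
            this.eventData = eventData;
            this.entityType = entityType;
            this.entityId = entityId;
        }
    }

    // Rebuild the current email of one entity by applying its events in event_id order.
    // Returns null when the entity has no matching events.
    static String currentEmail(List<Event> events, String entityType, String entityId) {
        List<Event> ordered = events.stream()
                .filter(e -> e.entityType.equals(entityType) && e.entityId.equals(entityId))
                .sorted(Comparator.comparing((Event e) -> e.eventId)) // assumes ids sort by creation order
                .collect(Collectors.toList());
        String email = null;
        for (Event e : ordered) {
            if (e.eventType.equals("UserRegistered") || e.eventType.equals("EmailChanged")) {
                email = e.eventData; // later events override earlier state
            }
        }
        return email;
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
                new Event("0002", "EmailChanged", "new@example.com", "User", "u1"),
                new Event("0001", "UserRegistered", "old@example.com", "User", "u1"));
        System.out.println(currentEmail(events, "User", "u1"));
    }
}
```

The filter and sort mirror the columns of the events_idx index (entity_type, entity_id, event_id), which is the access path such a replay would use.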

Eventuate CDC Debezium:
The configuration and classes for the integration between Eventuate and Debezium reside in the
following package:
[Figure: package containing the Eventuate and Debezium integration classes]
The Debezium configuration is set in the class
EventTableChangesToAggregateTopicRelay.java

The following code snippet configures Debezium:

String connectorName = "my-sql-connector";
Configuration config = Configuration.create()
    /* begin engine properties */
    .with("connector.class", "io.debezium.connector.mysql.MySqlConnector")
    .with("offset.storage", KafkaOffsetBackingStore.class.getName())
    .with("bootstrap.servers", kafkaBootstrapServers)
    .with("offset.storage.topic", "eventuate.local.cdc." + connectorName + ".offset.storage")
    .with("poll.interval.ms", 50)
    .with("offset.flush.interval.ms", 6000)
    /* begin connector properties */
    .with("name", connectorName)
    .with("database.hostname", jdbcUrl.getHost())
    .with("database.port", jdbcUrl.getPort())
    .with("database.user", dbUser)
    .with("database.password", dbPassword)
    .with("database.server.id", 85744)
    .with("database.server.name", "my-app-connector")
    //.with("database.whitelist", "eventuate")
    .with("database.history", io.debezium.relational.history.KafkaDatabaseHistory.class.getName())
    .with("database.history.kafka.topic", "eventuate.local.cdc." + connectorName + ".history.kafka.topic")
    .with("database.history.kafka.bootstrap.servers", kafkaBootstrapServers)
    .build();

Debezium
Debezium is an open source distributed platform for change data capture. Start it up, point it at your
databases, and your apps can start responding to all of the inserts, updates, and deletes that other apps
commit to your databases. Debezium is durable and fast, so your apps can respond quickly and never
miss an event, even when things go wrong.
Your data is always changing. Debezium lets your apps react every time your data changes, and you
don't have to change your apps that modify the data. Debezium continuously monitors your databases
and lets any of your applications stream every row-level change in the same order they were committed
to the database. Use the event streams to purge a cache, update search indexes, generate derived views
and data, keep other data sources in sync, and much more. In fact, pull that functionality out of your app
and into separate services.
Since Debezium can monitor your data, why have one app update the database and update search
indexes and send notifications and publish messages? Doing that correctly - especially when things go
wrong - is really tough, and if you get it wrong the data in those systems may become inconsistent. Keep
things simple, and move that extra functionality into separate services that use Debezium.
Take your apps and services down for maintenance, and Debezium keeps monitoring so that when your
apps come back up they'll continue exactly where they left off. No matter what, Debezium keeps the
events in the same order they were made to the database. And Debezium makes sure that you always
see every event, even when things go wrong.
When all things are running smoothly, Debezium is fast. And that means your apps and services can
react quickly. Debezium is built on top of Apache Kafka, which is proven, scalable, and handles very large
volumes of data very quickly.
What is Change Data Capture?
Change Data Capture, or CDC, is an older term for a system that monitors and captures the
changes in data so that other software can respond to those changes. Data warehouses often had
built-in CDC support, since data warehouses need to stay up-to-date as the data changed in the
upstream OLTP databases.
Debezium is essentially a modern, distributed open source change data capture platform that
will eventually support monitoring a variety of database systems.

Debezium Connectors
• MySQL Connector
• MongoDB Connector
• PostgreSQL Connector
• Oracle Connector (coming soon)
Debezium Usage
The primary use of Debezium is to let applications respond almost immediately whenever data in
databases changes. Applications can do anything with the insert, update, and delete events. They might
use the events to know when to remove entries from a cache. They might update search indexes with
the data. They might update a derived data store with the same information or with information
computed from the changing data, such as with Command Query Responsibility Segregation (CQRS). They
might send a push notification to one or more mobile devices. They might aggregate the changes and
produce a stream of patches for entities.
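The cache use case can be sketched as a small Java handler that evicts a cached row whenever a change event for it arrives. The ChangeEvent type below is a simplified stand-in invented for this example and is not Debezium's event format; the class name CacheInvalidator and the "table:key" cache key layout are also assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical consumer of row-level change events; ChangeEvent is not Debezium's event type.
class CacheInvalidator {
    enum Op { INSERT, UPDATE, DELETE }

    // Simplified change event: which operation hit which row of which table.
    static final class ChangeEvent {
        final Op op;
        final String table;
        final String key;

        ChangeEvent(Op op, String table, String key) {
            this.op = op;
            this.table = table;
            this.key = key;
        }
    }

    private final Map<String, String> cache = new ConcurrentHashMap<>();

    void cachePut(String table, String key, String value) {
        cache.put(table + ":" + key, value);
    }

    String cacheGet(String table, String key) {
        return cache.get(table + ":" + key);
    }

    // Evict the cached copy of a changed row. Returns true if an entry was removed.
    boolean onChange(ChangeEvent event) {
        if (event.op == Op.INSERT) {
            return false; // in this sketch a new row has no cached copy to go stale
        }
        return cache.remove(event.table + ":" + event.key) != null;
    }

    public static void main(String[] args) {
        CacheInvalidator invalidator = new CacheInvalidator();
        invalidator.cachePut("users", "42", "alice@example.com");
        boolean evicted = invalidator.onChange(new ChangeEvent(Op.UPDATE, "users", "42"));
        System.out.println("evicted=" + evicted + ", cached=" + invalidator.cacheGet("users", "42"));
    }
}
```

The application that writes to the database does not need to know the cache exists; the eviction logic lives entirely in the event consumer.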
Why Debezium is distributed
Debezium is architected to be tolerant of faults and failures, and the only effective way to do that is
with a distributed system. Debezium distributes the monitoring processes, or connectors, across
multiple machines so that, if anything goes wrong, the connectors can be restarted. The events are
recorded and replicated across multiple machines to minimize the risk of information loss.
Directly monitor a single database by Debezium
While we recommend that most people use the full Debezium platform, it is possible for a single
application to embed a Debezium connector so it can monitor a database and respond to the events.
This approach is indeed far simpler, with few moving parts, but it is more limited and far less tolerant
of failures. If your application needs at-least-once delivery guarantees of all messages, please consider
using the full distributed system.
How Debezium works
A running Debezium system consists of several pieces. A cluster of Apache Kafka brokers
provides the persistent, replicated, and partitioned transaction logs where Debezium records all
events and from which applications consume all events. The number of Kafka brokers depends
largely on the volume of events, the number of database tables being monitored, and the number
of applications that are consuming the events. Kafka does rely upon a small cluster of Zookeeper
nodes to manage responsibilities of each broker.
Each Debezium connector monitors one database cluster/server, and connectors are configured
and deployed to a cluster of Kafka Connect services that ensure that each connector is always
running, even as Kafka Connect service instances leave and join the cluster. Each Kafka Connect
service cluster (a.k.a., group) is independent, so it is possible for each group within an
organization to manage its own clusters.

All connectors record their events (and other information) to Apache Kafka, which persists,
replicates, and partitions the events for each table in separate topics. Multiple Kafka Connect
service clusters can share a single cluster of Kafka brokers, but the number of Kafka brokers
depends largely on the volume of events, the number of database tables being monitored, and the
number of applications that are consuming the events.
Applications connect to Kafka directly and consume the events within the appropriate topics.
Event Store
Event Store stores your data as a series of immutable events over time, making it easy to build event-
sourced applications.
Event Store is licensed under a 3-clause BSD license, whether it runs on a single node or as a high
availability cluster.
Event Store can run as a cluster of nodes containing the same data, which remains available for writes
provided at least half the nodes are alive and connected.
Event Store is not very high-performance: it supports only about 15,000 writes per second and 50,000
reads per second.
Curl code:
curl -i -d @test.txt "http://172.16.131.199:2113/streams/tourajstream" \
  -H "Content-Type:application/json" \
  -H "ES-EventType: TourajEvent" \
  -H "ES-EventId: C322E299-CB73-4B47-97C5-5054F920746E"
The content of test.txt can be like the following (as JSON):
{
  "Developer of Doom" : "John Carmack and John Romero"
}
