Simple Solutions for Complex Problems
Tyler Treat / Workiva
Boulder NATS Meetup 6/7/2016
• Embracing the reality of complex
systems
• Using simplicity to your advantage
• Why NATS?
• How Workiva uses NATS
ABOUT THIS TALK
• Messaging tech lead at Workiva
• Platform infrastructure
• Distributed systems
• bravenewgeek.com
@tyler_treat

tyler.treat@workiva.com
ABOUT THE SPEAKER
There are a lot of parallels between
real-world systems and

distributed software systems.
The world is eventually consistent…
…and the database is just
an optimization.[1]
[1] https://christophermeiklejohn.com/lasp/erlang/2015/10/27/tendency.html
“There will be no further print editions
[of the Merck Manual]. Publishing a
printed book every five years and
sending reams of paper around the
world on trucks, planes, and boats is
no longer the optimal way to provide
medical information.”
Dr. Robert S. Porter

Editor-in-Chief, The Merck Manuals
Programmers find asynchrony hard
to reason about, but the truth is…
Life is mostly asynchronous.
What does this mean for us as
programmers?
[Chart: complexity over time: timesharing, monoliths, SOA, virtualization, microservices, ???]
Complicated made complex…
Distributed!
Distributed computation is inherently asynchronous
and the network is inherently unreliable[2]…
[2] http://queue.acm.org/detail.cfm?id=2655736
…but the natural tendency is to build
distributed systems as if they aren’t
distributed at all because it’s

easy to reason about.
strong consistency - reliable messaging - predictability
• Complicated algorithms
• Transaction managers
• Coordination services
• Distributed locking
What’s in a guarantee?
Simple Solutions for Complex Problems - Boulder Meetup
• Message handed to the transport layer?
• Enqueued in the recipient’s mailbox?
• Recipient started processing it?
• Recipient finished processing it?
What’s a delivery guarantee?
Each of these has a very different set of
conditions, constraints, and costs.
Guaranteed, ordered,
exactly-once delivery
is expensive (if not impossible[3]).
[3] http://bravenewgeek.com/you-cannot-have-exactly-once-delivery/
Over-engineered
Complex
Difficult to deploy & operate
Fragile
Slow
At large scale, guarantees will give out.
0.1% failure at scale is huge.
Replayable > Guaranteed
Idempotent > Exactly-once
Commutative > Ordered
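The idempotent and commutative properties above can be illustrated with a small consumer. This is only a sketch under assumptions: the `Message` envelope, its `ID` field, and the `Counter` type are hypothetical names, not part of NATS or of the talk. Deduplicating by ID makes redelivery and replay harmless, and using addition makes arrival order irrelevant.

```go
package main

import "fmt"

// Message is a hypothetical envelope; the ID field is an assumption
// used to recognize redelivered or replayed messages.
type Message struct {
	ID    string
	Delta int
}

// Counter applies each message at most once and combines messages with
// addition, which is commutative, so arrival order does not matter.
type Counter struct {
	seen  map[string]bool
	total int
}

func NewCounter() *Counter {
	return &Counter{seen: make(map[string]bool)}
}

// Apply is idempotent: a message whose ID was already seen is ignored.
func (c *Counter) Apply(m Message) {
	if c.seen[m.ID] {
		return
	}
	c.seen[m.ID] = true
	c.total += m.Delta
}

func (c *Counter) Total() int { return c.total }

func main() {
	c := NewCounter()
	// Messages arrive out of order and "a" is delivered twice.
	for _, m := range []Message{{"b", 2}, {"a", 1}, {"a", 1}, {"c", 3}} {
		c.Apply(m)
	}
	fmt.Println(c.Total()) // prints 6
}
```

A real consumer would bound or expire the set of seen IDs; that bookkeeping is left out here.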
But delivery != processing
Also, what does it even mean to
“process” a message?
It depends on the

business context!
If you need business-level
guarantees, build them into

the business layer.
We can always build stronger guarantees on top,
but we can’t always remove them from below.
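As one illustration of building a stronger guarantee on top, the sketch below layers at-least-once delivery over an at-most-once send by retrying until an application-level acknowledgement arrives. The `Send` type, `SendWithRetry`, and `flaky` are hypothetical names introduced for this example only.

```go
package main

import (
	"errors"
	"fmt"
)

// Send is a hypothetical at-most-once transport call. It reports whether
// an application-level acknowledgement came back.
type Send func(payload string) (acked bool)

// SendWithRetry retries until an acknowledgement arrives or attempts run
// out. The receiver must tolerate duplicates, because an ack can be lost
// after the message was already processed.
func SendWithRetry(send Send, payload string, maxAttempts int) (int, error) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if send(payload) {
			return attempt, nil
		}
	}
	return maxAttempts, errors.New("no acknowledgement received")
}

// flaky returns a Send that drops the first n messages, for demonstration.
func flaky(n int) Send {
	dropped := 0
	return func(string) bool {
		if dropped < n {
			dropped++
			return false
		}
		return true
	}
}

func main() {
	attempts, err := SendWithRetry(flaky(2), "order-created", 5)
	fmt.Println(attempts, err) // prints: 3 <nil>
}
```

The retry loop lives in the application, so a service that does not need it pays nothing for it.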
End-to-end system semantics matter
much more than the semantics of an

individual building block[4].
[4] http://web.mit.edu/Saltzer/www/publications/endtoend/endtoend.pdf
Embrace the chaos!
“Simplicity is the ultimate sophistication.”
EMBRACING THE CHAOS MEANS

LOOKING AT THE NEGATIVE SPACE.
A simple technology

in a sea of complexity.
Simple doesn’t mean easy.
[5] https://blog.wearewizards.io/some-a-priori-good-qualities-of-software-development
“Simple can be harder than complex.
You have to work hard to get your thinking
clean to make it simple. But it’s worth it in
the end because once you get there, you
can move mountains.”
• Wdesk: platform for enterprises to collect, manage,
and report critical business data in real time
• Increasing amounts of data and complexity of
formats
• Cloud solution:
  - Data accuracy
  - Secure
  - Highly available
  - Scalable
  - Mobile-enabled
About Workiva
• First solution built on Google App Engine
• Scaling new solutions requires service-oriented
approach
• Scaling new services requires a low-latency
communication backplane
About Workiva
Why NATS?
Availability

over

everything.
• Always on, always available
• Protects itself at all costs—no compromises on
performance
• Disconnects slow consumers and lazy listeners
• Clients have automatic failover and reconnect logic
• Clients buffer messages while temporarily
partitioned
Availability over Everything
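The slow-consumer behavior can be pictured as a bounded per-subscriber buffer with a non-blocking send. This is a toy model and not the NATS server's implementation; the `Subscriber` type and the buffer capacity are assumptions made for illustration, and the sketch is not safe for concurrent use.

```go
package main

import "fmt"

// Subscriber holds a bounded outbound buffer. The capacity is an
// arbitrary illustrative value, not a NATS default.
type Subscriber struct {
	Name   string
	buf    chan string
	closed bool
}

func NewSubscriber(name string, capacity int) *Subscriber {
	return &Subscriber{Name: name, buf: make(chan string, capacity)}
}

// Deliver never blocks the publisher. If the buffer is full, the
// subscriber is disconnected instead of slowing everyone else down.
func (s *Subscriber) Deliver(msg string) bool {
	if s.closed {
		return false
	}
	select {
	case s.buf <- msg:
		return true
	default:
		s.closed = true
		close(s.buf)
		return false
	}
}

func (s *Subscriber) Connected() bool { return !s.closed }

func main() {
	slow := NewSubscriber("slow", 2) // never drains its buffer
	for i := 0; i < 3; i++ {
		fmt.Println(slow.Deliver(fmt.Sprintf("msg-%d", i)))
	}
	fmt.Println(slow.Connected())
}
```

The design choice is that one lagging subscriber costs itself a reconnect rather than costing every publisher latency.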
Simplicity as a feature.
• Single, lightweight binary
• Embraces the “negative space”:
  - Simplicity -> high performance
  - No complicated configuration or external dependencies (e.g. ZooKeeper)
  - No fragile guarantees -> face complexity head-on, encourage async
• Simple pub/sub semantics provide a versatile primitive:
  - Fan-in
  - Fan-out
  - Request/response
  - Distributed queueing
• Simple text-based wire protocol
Simplicity as a Feature
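To give a feel for the text-based wire protocol, here is a rough sketch that formats the client-side PUB and SUB frames by hand. The helper names `PubFrame` and `SubFrame` are made up for this example; real applications would use a NATS client library rather than building frames themselves.

```go
package main

import (
	"fmt"
	"strings"
)

// PubFrame formats a PUB frame: subject, optional reply subject,
// payload size in bytes, CRLF, payload, CRLF.
func PubFrame(subject, reply string, payload []byte) string {
	var b strings.Builder
	b.WriteString("PUB " + subject)
	if reply != "" {
		b.WriteString(" " + reply)
	}
	fmt.Fprintf(&b, " %d\r\n", len(payload))
	b.Write(payload)
	b.WriteString("\r\n")
	return b.String()
}

// SubFrame formats a SUB frame: subject, optional queue group, and a
// client-chosen subscription ID.
func SubFrame(subject, queue, sid string) string {
	if queue != "" {
		return fmt.Sprintf("SUB %s %s %s\r\n", subject, queue, sid)
	}
	return fmt.Sprintf("SUB %s %s\r\n", subject, sid)
}

func main() {
	fmt.Printf("%q\n", PubFrame("events.user-1", "", []byte("hello")))
	fmt.Printf("%q\n", SubFrame("events.*", "workers", "1"))
}
```

Because the frames are plain text, the same exchange can be typed by hand over a raw TCP connection when debugging.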
Fast as hell.
[6] http://bravenewgeek.com/benchmarking-message-queue-latency/
• Fast, predictable performance at scale and at tail
• ~8 million messages per second
• Auto-pruning of interest graph allows efficient
routing
• When SLAs matter, it’s hard to beat NATS
Fast as Hell
• Low-latency service bus
• Pub/Sub
• RPC
How We Use NATS
[Diagram, built up over several slides: Web Clients, Service Gateway, NATS, Services]
Pub/Sub
“Just send this thing containing these fields
serialized in this way using that encoding to
this topic!”
“Just subscribe to this topic and decode
using that encoding then deserialize in

this way and extract these fields from

this thing!”
Pub/Sub is meant to decouple services
but often ends up coupling the teams
developing them.
How do we evolve services in isolation
and reduce development overhead?
• Extension of Apache Thrift
• IDL and cross-language, code-generated pub/sub
APIs
• Allows developers to think in terms of services and
APIs rather than opaque messages and topics
• Allows APIs to evolve while maintaining compatibility
• Transports are pluggable (we use NATS)
Frugal RPC
struct Event {
    1: i64 id,
    2: string message,
    3: i64 timestamp,
}
scope Events prefix {user} {
    EventCreated: Event
    EventUpdated: Event
    EventDeleted: Event
}
subscriber.SubscribeEventCreated(
    "user-1", func(e *event.Event) {
        fmt.Println(e)
    },
)
. . .
publisher.PublishEventCreated(
    "user-1", event.NewEvent())
generated
• Service instances form a queue group
• Client “connects” to instance by publishing a message to the service
queue group
• Serving instance sets up an inbox for the client and sends it back in the
response
• Client sends requests to the inbox
• Connecting is cheap—no service discovery and no sockets to create, just
a request/response
• Heartbeats used to check health of server and client
• Very early prototype code: https://github.com/workiva/thrift-nats
RPC over NATS
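The connect handshake described above can be sketched with a toy in-memory bus. This is not the thrift-nats code and does not use the NATS client API: `Bus`, `StartInstance`, and the subject names are hypothetical, the queue group picks members round-robin purely to keep the example deterministic, and heartbeats are omitted.

```go
package main

import "fmt"

// Handler processes a request payload and returns a response.
type Handler func(payload string) string

// Bus is a toy in-memory stand-in for a message broker.
type Bus struct {
	subs   map[string]Handler   // plain subscriptions by subject
	queues map[string][]Handler // queue group members by subject
	next   map[string]int       // round-robin cursor per subject
	inbox  int
}

func NewBus() *Bus {
	return &Bus{
		subs:   map[string]Handler{},
		queues: map[string][]Handler{},
		next:   map[string]int{},
	}
}

func (b *Bus) Subscribe(subject string, h Handler) { b.subs[subject] = h }

func (b *Bus) QueueSubscribe(subject string, h Handler) {
	b.queues[subject] = append(b.queues[subject], h)
}

// Request delivers to a plain subscriber, or to exactly one member of
// the queue group on that subject.
func (b *Bus) Request(subject, payload string) (string, error) {
	if h, ok := b.subs[subject]; ok {
		return h(payload), nil
	}
	members := b.queues[subject]
	if len(members) == 0 {
		return "", fmt.Errorf("no subscribers on %q", subject)
	}
	h := members[b.next[subject]%len(members)]
	b.next[subject]++
	return h(payload), nil
}

// NewInbox returns a unique private subject.
func (b *Bus) NewInbox() string {
	b.inbox++
	return fmt.Sprintf("_INBOX.%d", b.inbox)
}

// StartInstance joins the service queue group. On a connect request it
// sets up an inbox for that client and returns the inbox subject.
func StartInstance(b *Bus, service, name string) {
	b.QueueSubscribe(service, func(string) string {
		inbox := b.NewInbox()
		b.Subscribe(inbox, func(req string) string {
			return name + " handled " + req
		})
		return inbox
	})
}

func main() {
	bus := NewBus()
	StartInstance(bus, "svc.events", "instance-a")
	StartInstance(bus, "svc.events", "instance-b")

	inbox, _ := bus.Request("svc.events", "connect") // the "connection"
	resp, _ := bus.Request(inbox, "ping")            // later calls use the inbox
	fmt.Println(inbox, resp)
}
```

After the handshake the client is pinned to one instance through its inbox, while new clients are still spread across the queue group.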
• Store JSON containing cluster membership in S3
• Container reads JSON on startup and creates
routes w/ correct credentials
• Services only talk to the NATS daemon on their VM
via localhost
• Don’t have to worry about encryption between
services and NATS, only between NATS peers
NATS per VM
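The startup step might look roughly like the following. The JSON schema (`user`, `password`, `peers`) is an assumption, since the slides do not show the actual document, and fetching it from S3 is left out; the sketch only turns a membership document into route URLs for the other peers.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// Membership is a hypothetical schema for the cluster membership JSON.
type Membership struct {
	User     string   `json:"user"`
	Password string   `json:"password"`
	Peers    []string `json:"peers"` // host:port of each peer's cluster listener
}

// RouteURLs builds one route URL per peer, skipping this node itself.
func RouteURLs(doc []byte, self string) ([]string, error) {
	var m Membership
	if err := json.Unmarshal(doc, &m); err != nil {
		return nil, err
	}
	var routes []string
	for _, peer := range m.Peers {
		if peer == self {
			continue
		}
		u := url.URL{
			Scheme: "nats-route",
			User:   url.UserPassword(m.User, m.Password),
			Host:   peer,
		}
		routes = append(routes, u.String())
	}
	return routes, nil
}

func main() {
	// Example document with made-up addresses and credentials.
	doc := []byte(`{"user":"route","password":"s3cret",
		"peers":["10.0.0.1:6222","10.0.0.2:6222","10.0.0.3:6222"]}`)
	routes, err := RouteURLs(doc, "10.0.0.1:6222")
	if err != nil {
		panic(err)
	}
	for _, r := range routes {
		fmt.Println(r)
	}
}
```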
• Only messages intended for a process on another
host go over the network since NATS cluster
maintains interest graph
• Greatly reduces network hops (usually 0 vs. 2-3)
• If local NATS daemon goes down, restart it
automatically
NATS per VM
• Doesn’t scale to large number of VMs
• Fairly easy to transition to floating NATS cluster or
running on a subset of machines per AZ
• NATS communication abstracted from service
• Send messages to services without thinking about
routing or service discovery
• Queue groups provide service load balancing
NATS per VM
• We’re a SaaS company, not an infrastructure company
• High availability
• Operational simplicity
• Performance
• First-party clients:

Go Java C C#

Python Ruby Elixir Node.js
NATS as a Messaging Backplane
• Handle failure at the client
  - The less state in your middleware & infrastructure, the easier it is to scale
  - Exponential backoffs with jitter
• But never trust the client
  - Rate limits, message size limits, back pressure
  - Be strict in what you accept
  - Limit failure domain by forcing applications to make design decisions upfront instead of punting
Important Corollaries
Assume every client is trying to DoS you
(because they probably are, intentionally or not).
“Every solution to every problem is simple…
It's the distance between the two where the mystery lies.”
–Derek Landy, Skulduggery Pleasant
@tyler_treat
github.com/tylertreat
bravenewgeek.com
Thanks!
