>>> 5 must-have
Patterns for your web-scale
Microservices
@aliostad
Ali Kheyrollahi, ASOS
> stackoverflow
> £1.5 bln global fashion destination
> 35% every year
/// ASOS in numbers
2016 Turnover → £1.5 bln
Active Customers → 12 M
New Products / wk → 4 k
Unique Visits / mo → 123 M
Page Views / day → 95 M
Platform Teams → 40
/// Microservices Architecture
/// why microservices
> Scaling people, not the solution
> Decentralising decision centres => Agility
> Frequent deployment => Agility
> Reduced complexity of each microservice (divide and conquer) => Agility
> Overall solution complex but ...
/// anecdote
Often you can measure your success in
implementing Microservice Architecture not
by the number of services you build, but by
the number you decommission.
/// microservices vs soa
Aspect                           SOA                       Microservices
Main Goal                        Architectural Decoupling  Agility
Audience                         Mainly Architecture       Business (Everyone)
Set out to solve                 Architectural Coupling    Scaling People, Frequent Deployment
Organisational Structure Impact  Minimal                   Huge
Service Cardinality              Usually up to a dozen     >40 (Commonly >100)
When to do                       Always                    teams > ~5**
** Debatable. There are articles and discussions on this very topic
/// microservice challenges
> Very difficult to build a complete mental picture of solution
> When things go wrong, need to know where before why
> Potentially increased latency
> Performance outliers intractable to solve
> A complete mind-shift requiring a new operating model
/// probability distribution
[Figure: probability distribution of response time (x-axis: Response Time, y-axis: Probability)]
/// performance outliers
Microservice A: 99th Percentile = 500ms
Microservice B: 99th Percentile = 500ms
Latency  A    B    Total
<1s      99%  99%  98.01%
>500ms   1%   99%  0.99%
>500ms   99%  1%   0.99%
>1s      1%   1%   0.01%
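The totals in the table follow from multiplying the per-service probabilities. That only holds if the latencies of the two services are independent, which is an assumption the slide does not state. A minimal Java sketch of the arithmetic; the class and method names are illustrative and not from the talk:

```java
/** Worked arithmetic for the outlier table; assumes independent latencies. */
final class TailLatency {
    private TailLatency() {}

    /**
     * Probability that at least one of `calls` independent calls is slower
     * than its own percentile threshold (e.g. 0.99 for the 99th percentile).
     */
    static double probabilityAnySlow(double percentile, int calls) {
        return 1.0 - Math.pow(percentile, calls);
    }

    public static void main(String[] args) {
        double fast = 0.99;        // chance a single call stays within 500ms
        double slow = 1.0 - fast;  // chance a single call is an outlier

        System.out.println("both fast (<1s):   " + fast * fast);
        System.out.println("only A slow:       " + slow * fast);
        System.out.println("only B slow:       " + fast * slow);
        System.out.println("both slow (>1s):   " + slow * slow);
        System.out.println("at least one slow: " + probabilityAnySlow(fast, 2));
    }
}
```

With two services, roughly 2% of requests hit at least one outlier, and that share grows with every extra call in the chain.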
/// ActivityId Propagator
/// ActivityId
> Every customer request matters
> Every request is unique
> Every request creates a chain (or tree) of calls/events
> Activities are correlated
> You need an ActivityId (or CorrelationId) to link calls/events
/// ActivityId
[Diagram: a Microservice receives the Id, keeps it in Thread Local Storage, and passes it on to other APIs and in events]
/// ActivityId - HTTP
Request
GET /api/v2/foo HTTP/1.1
host: foo.com
activity-id: 96c5a1f106ce468ebcca8303ed7464bd
Response
200 OK
activity-id: 96c5a1f106ce468ebcca8303ed7464bd
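One way to sketch the propagator in Java, assuming incoming headers arrive as a map with lower-cased names and that each request is handled on a single thread. `ActivityIdPropagator` and its methods are hypothetical names for illustration; only the `activity-id` header comes from the slide:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical helper: keeps the current request's ActivityId in thread-local storage. */
final class ActivityIdPropagator {
    static final String HEADER = "activity-id";
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private ActivityIdPropagator() {}

    /** Call at the start of a request: reuse the caller's id, or mint one at the edge. */
    static String begin(Map<String, String> incomingHeaders) {
        String id = incomingHeaders.get(HEADER); // assumes lower-cased header names
        if (id == null || id.isEmpty()) {
            id = UUID.randomUUID().toString().replace("-", "");
        }
        CURRENT.set(id);
        return id;
    }

    /** The id to attach to log entries and events raised while handling the request. */
    static String current() {
        return CURRENT.get();
    }

    /** Headers to add to every outgoing call to another API. */
    static Map<String, String> outgoingHeaders() {
        Map<String, String> headers = new HashMap<>();
        String id = CURRENT.get();
        if (id != null) {
            headers.put(HEADER, id);
        }
        return headers;
    }

    /** Call when the request ends so a pooled thread does not leak the id. */
    static void end() {
        CURRENT.remove();
    }
}
```

Thread-local storage does not follow work handed to another thread, so asynchronous code would need to pass the id along explicitly.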
/// Retry and Timeout Policy
/// Failure
[Diagram: Microservice A calls Microservice B; each has a 1% chance of failure. A failed call (X) is followed by a wait (back-off), and a second failed call (X) by a longer wait (back-off longer)]
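The retry-with-back-off behaviour can be sketched as follows. The slide only says the second wait is longer; doubling the delay each time is an assumption, as are the `Retry` and `Sleeper` names. The sleeper is injected so the policy can be exercised without real waiting:

```java
import java.util.concurrent.Callable;

/** Hypothetical retry helper with a doubling back-off between attempts. */
final class Retry {
    /** Abstraction over Thread.sleep so callers can substitute their own waiting. */
    interface Sleeper {
        void sleep(long millis) throws InterruptedException;
    }

    private Retry() {}

    static <T> T withBackoff(Callable<T> call, int maxAttempts,
                             long initialDelayMillis, Sleeper sleeper) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts) {
                    throw e;            // out of attempts: surface the last failure
                }
                sleeper.sleep(delay);   // wait (back-off)
                delay *= 2;             // wait longer next time
            }
        }
    }
}
```

A caller might write `Retry.withBackoff(() -> client.fetch(), 3, 100, Thread::sleep)`, where `client.fetch()` stands in for any idempotent remote call.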
/// Preemptive Timeout
[Diagram: Microservice A calls Microservice B with a short timeout; when a call times out (X), A retries, again with a short timeout]
/// Timeout
[Diagram: timeouts of services A, B and C]
A > B > C
A > B + C
/// Choosing a timeout?
Static => Based on Server SLO
Dynamic => 95th percentile
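A minimal sketch of the dynamic option, assuming the caller keeps a sample of recent response times for the downstream service. It uses the nearest-rank definition of a percentile, which is one of several in common use; `DynamicTimeout` is a hypothetical name:

```java
import java.util.Arrays;

/** Hypothetical helper: derive a timeout from recently observed latencies. */
final class DynamicTimeout {
    private DynamicTimeout() {}

    /**
     * Nearest-rank percentile: the smallest sample such that at least
     * `percentile` percent of all samples are less than or equal to it.
     */
    static long percentile(long[] samplesMillis, int percentile) {
        if (samplesMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (percentile < 1 || percentile > 100) {
            throw new IllegalArgumentException("percentile must be in 1..100");
        }
        long[] sorted = samplesMillis.clone();
        Arrays.sort(sorted);
        // Integer ceiling of percentile * n / 100, giving a 1-based rank.
        int rank = (percentile * sorted.length + 99) / 100;
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        long[] recent = {40, 42, 45, 47, 50, 52, 55, 60, 75, 480};
        System.out.println("timeout (ms): " + percentile(recent, 95));
    }
}
```

Recomputing the value over a sliding window lets the timeout follow the downstream service as its latency drifts.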
/// IO Monitor
/// Blame Game
“If there is a single place where
you can play blame game,
instead of collective responsibility,
it is in
Microservices troubleshooting”
/// Did you say IO??
[Diagram: a Microservice calling a DB, an API and a Cache]
Measure... every time your code goes out of your process
/// Recording Methods
> Explicitly by calling record()
> Asking the library to record a closure
> Aspect-oriented
Explicit record() call, Java (spf4j):
private static final MeasurementRecorder recorder
    = RecorderFactory.createScalableCountingRecorder(forWhat, unitOfMeasurement,
        sampleTimeMillis);
…
recorder.record(measurement);
Recording a closure, .NET (PerfIt):
var ins = new SimpleInstrumentor(new InstrumentationInfo()
{
    Counters = CounterTypes.StandardCounters,
    Description = "test",
    InstanceName = "Test instance",
    CategoryName = TestCategory
});
ins.Instrument(() => Thread.Sleep(100), "test...");
Aspect-oriented, Java (spf4j):
@PerformanceMonitor(warnThresholdMillis=1, errorThresholdMillis=100,
    recorderSource = RecorderSourceInstance.Rs5m.class)
Aspect-oriented, .NET (PerfIt):
[PerfItFilter("PerfItTests", InstanceName = "Test")]
public string Get()
{
    return Guid.NewGuid().ToString();
}
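The closure style can be sketched without any particular library. The `IoMonitor` and `Recorder` types below are hypothetical and are not the spf4j or PerfIt API; they only show the shape of wrapping an out-of-process call so that its duration and outcome are always recorded:

```java
import java.util.concurrent.Callable;

/** Hypothetical wrapper that times any call leaving the process. */
final class IoMonitor {
    /** Where measurements go; a real one would publish to a metrics backend. */
    interface Recorder {
        void record(String operation, long elapsedNanos, boolean succeeded);
    }

    private final Recorder recorder;

    IoMonitor(Recorder recorder) {
        this.recorder = recorder;
    }

    /** Runs the closure and records how long it took, whether or not it threw. */
    <T> T measure(String operation, Callable<T> call) throws Exception {
        long start = System.nanoTime();
        boolean succeeded = false;
        try {
            T result = call.call();
            succeeded = true;
            return result;
        } finally {
            recorder.record(operation, System.nanoTime() - start, succeeded);
        }
    }
}
```

Recording in the `finally` block means failed and timed-out calls are measured too, and those are often the ones that matter during troubleshooting.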
/// Publishing Methods
> Local file (various to logstash)
> TCP and HTTP (many, to zipkin, influxdb)
> UDP (statsd, collectd to graphite, logstash)
> Raising Kernel-level event (Windows ETW)
> Local communication (statsd)
/// Circuit-Breaker
/// tri-state
> Closed: traffic can flow normally
> Open: traffic does not flow
> Half-open: circuit breaker tests the waters again
[Diagram: state transitions between Closed, Open and Half-open, labelled Failure, Wait timeout and Test]
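A minimal sketch of the three states in Java. It trips on a count of consecutive failures, which is a simplification and not how Hystrix decides (Hystrix uses an error percentage over a rolling window). All names are illustrative, and the clock is injected so the wait timeout can be exercised without sleeping:

```java
import java.util.function.LongSupplier;

/** Hypothetical tri-state circuit breaker driven by consecutive failures. */
final class CircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long sleepWindowMillis;
    private final LongSupplier clock;   // e.g. System::currentTimeMillis

    private State state = State.CLOSED;
    private int consecutiveFailures;
    private long openedAt;

    CircuitBreaker(int failureThreshold, long sleepWindowMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.sleepWindowMillis = sleepWindowMillis;
        this.clock = clock;
    }

    /** Ask before each call; false means fail fast or use a fallback. */
    synchronized boolean allowRequest() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= sleepWindowMillis) {
            state = State.HALF_OPEN;    // wait timeout elapsed: test the waters
        }
        return state != State.OPEN;
    }

    synchronized void recordSuccess() {
        consecutiveFailures = 0;
        if (state == State.HALF_OPEN) {
            state = State.CLOSED;       // test call succeeded
        }
    }

    synchronized void recordFailure() {
        consecutiveFailures++;
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;         // trip, or re-trip after a failed test
            openedAt = clock.getAsLong();
        }
    }

    synchronized State state() {
        return state;
    }
}
```

This sketch lets every request through while half-open; a stricter version would admit a single test request at a time.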
/// Netflix Hystrix
RequestVolumeThreshold
ErrorThresholdPercentage
SleepWindowInMilliseconds
TimeInMilliseconds
NumBuckets
/// Fallback
> Custom: e.g. serve content from a local cache (status 206)
> Silent: return null/no-data/empty (status 200/204)
> Fail-fast: Customer experience is important (status 5xx)
/// Canary and Health Endpoint
/// Health Endpoints
> Ping: returns a success code when invoked
> Canary: returns a connectivity status and latency for the service and its dependencies
“… none of them invoke any application code”
/// Ping
Request
GET /api/health HTTP/1.1
host: foo.com
Response
200 OK
Response
500 Server Error
/// Canary
Request
GET /api/canary HTTP/1.1
host: foo.com
Response
200 OK
{
[Nested Structure]
}
/// ChirpResult
{
"serviceName": "foo",
"latency": "00:00:00.0542172",
"statusCode": 200,
"isCritical": true
}
/// ChirpResult - critical failure
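[Diagram: an API with two non-critical (NC) dependencies and one critical (C) dependency. The critical dependency returns 500 and the non-critical ones return 200; the API's canary returns 500]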
/// ChirpResult - non-critical failure
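[Diagram: an API with two non-critical (NC) dependencies and one critical (C) dependency. One non-critical dependency returns 500 and the others return 200; the API's canary returns 200]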
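The two cases above suggest a simple aggregation rule: the canary reports failure only when a critical dependency fails. A sketch under that assumption; the field names mirror the ChirpResult JSON shown earlier (latency omitted for brevity), and the `Canary` class is hypothetical:

```java
import java.util.List;

/** Result of probing one dependency, mirroring the ChirpResult JSON fields. */
final class ChirpResult {
    final String serviceName;
    final int statusCode;
    final boolean isCritical;

    ChirpResult(String serviceName, int statusCode, boolean isCritical) {
        this.serviceName = serviceName;
        this.statusCode = statusCode;
        this.isCritical = isCritical;
    }
}

/** Hypothetical aggregator for the canary endpoint's own status code. */
final class Canary {
    private Canary() {}

    /** 500 if any critical dependency is failing, otherwise 200. */
    static int overallStatus(List<ChirpResult> dependencies) {
        for (ChirpResult result : dependencies) {
            boolean failed = result.statusCode < 200 || result.statusCode >= 300;
            if (failed && result.isCritical) {
                return 500;
            }
        }
        return 200;
    }
}
```

Non-critical failures still appear in the response body, so they can be alerted on without marking the whole service as down.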
/// AOP / Declarative (c#)
[AzureStorageCanary("Foo-AzureStorage-BarDatabaseServer", "config-key-for-cn")]
[SqlCanary("SQL-BazActiveDatabase", null, typeof(SqlConnectionFactory))]
[CanaryEndpointCanary("Dependency-Api", "config-key-for-endpoint")]
public class CanaryController : CanaryBaseController
{
… // some boilerplate code
}
/// Deep vs Shallow
[Diagram: two APIs, labelled “Deep” and “Shallow”]
/api/canary?deep=false
/// Wrap-up
> If you have more than ~5 teams, consider Microservices
> Logging/Monitoring/Alerting: single most important asset
> Use ActivityId Propagator to correlate (consider zipkin)
> Cloud is a jungle™. Without retry/timeout you won’t survive
> Monitor and measure all calls to external services (blame game)
> Protect your systems with circuit-breakers (and isolation)
> Canary helps you check connectivity from the customer's point of view
Thomas Wood: Daisy Picture
Thomas Au: Thermometer Picture
Torbakhopper: Cables Picture
Dam Picture - Japan
Hsiung: Lights Picture
Health Endpoint in API Design

Voxxed Days Thessaloniki 2016 - 5 must have patterns for your web-scale microservice