Performance Testing in Production - Leveraging the Universal Scalability Law

Kevin Brockhoff

Testing in Production Methodologies
❖ A/B Testing (aka Online Controlled Experimentation)
❖ Some percentage of users of a website or service are given an alternate experience (a new version
of the service) without being told. Data is then collected on how users act in the old versus the
new service and analyzed to determine whether the proposed change is an improvement.
❖ Ramped Deployment
❖ Using Exposure Control, a new deployment of a website or service is gradually rolled out, first to a
small percentage of users and ultimately to all of them. At each stage the software is
monitored, and if any critical issues are uncovered that cannot be remedied, the deployment is
rolled back.
❖ Shadowing
❖ The system under test is deployed and uses real production data in real-time, but the results are not
exposed to the end user.
Source: https://blogs.msdn.microsoft.com/seliot/2011/06/07/testing-in-production-tip-it-really-happensexamples-from-facebook-amazon-google-and-microsoft/
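The ramped deployment described above can be sketched in a few lines. This is a minimal illustration, not the mechanism used by any of the companies cited: the function names, the salt, the stage percentages, and the 1% error threshold are all hypothetical. Users are hashed into 100 stable buckets so that a user who sees the new version at 5% exposure still sees it at 25%.

```python
import hashlib

# Hypothetical rollout stages (percent of users exposed); not from the source.
STAGES = (1, 5, 25, 50, 100)


def bucket(user_id: str, salt: str = "new-version") -> int:
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_exposed(user_id: str, percent: int, salt: str = "new-version") -> bool:
    """True if this user should receive the new deployment at this stage."""
    return bucket(user_id, salt) < percent


def next_stage(current_percent: int, error_rate: float,
               max_error_rate: float = 0.01) -> int:
    """Return the next exposure percentage, or 0 to signal a rollback."""
    if error_rate > max_error_rate:
        return 0  # critical issue observed: roll back
    for stage in STAGES:
        if stage > current_percent:
            return stage
    return 100  # already fully rolled out


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(1000)]
    for pct in STAGES:
        exposed = sum(is_exposed(u, pct) for u in users)
        print(f"{pct:3d}% stage -> {exposed} of {len(users)} users exposed")
```

Hashing instead of random sampling keeps each user's experience consistent across requests and makes the exposed population at each stage a superset of the previous one.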

Cynefin Framework
Complex
The complex domain represents the "unknown
unknowns". Cause and effect can only be
deduced in retrospect, and there are no right
answers. "Instructive patterns ... can emerge,"
write Snowden and Boone, "if the leader
conducts experiments that are safe to fail."
Cynefin calls this process "probe–sense–respond".
Hard insurance cases are one example.
"Hard cases ... need human underwriters,"
Stewart writes, "and the best all do the same
thing: Dump the file and spread out the
contents." Stewart identifies battlefields,
markets, ecosystems and corporate cultures as
complex systems that are "impervious to a
reductionist, take-it-apart-and-see-how-it-works
approach, because your very actions change the
situation in unpredictable ways."
Source: Wikipedia – Cynefin framework

Properties of a Complex System
❖ No complex system is ever fully healthy.
❖ Distributed systems are pathologically unpredictable.
❖ It’s impossible to predict the myriad states of partial failure
various parts of the system might end up in.
❖ Failure needs to be embraced at every phase, from system
design to implementation, testing, deployment, and, finally,
operation.
❖ Ease of debugging is a cornerstone for the maintenance and
evolution of robust systems.
From Distributed Systems Observability by Cindy Sridharan

“I’m more and more convinced that staging
environments are like mocks - at best a pale
imitation of the genuine article and the worst form
of confirmation bias. It’s still better than having
nothing - but ‘works in staging’ is only one step
better than ‘works on my machine’.”
– Cindy Sridharan, Testing in Production, the safe way (Medium)

If production is the only reliable test
environment, how can we determine where
the limits are without impacting user
experience?

Universal Scalability Law
❖ Introduced by Neil J. Gunther in 1993
❖ Applicable to both hardware and software
❖ Predicts throughput X(N) at a given load, N
❖ Hardware - N = number of processors
❖ Software - N = number of simultaneous users
❖ Constraints
❖ Contention - Competition for resources on a single node
❖ Coherence - Cost of shared state coordination between nodes
❖ Widespread applicability
❖ Load testing tools like LoadRunner and JMeter
❖ Modeling disk arrays, SANs, and multicore processors
❖ Modeling distributed application software, e.g., Hadoop
❖ Accounts for such effects as memory thrashing and cache-miss latencies
❖ Can extrapolate the load at which a system will degrade without pushing the system to that point

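Original 2-parameter equation, for relative capacity C(N) = X(N)/X(1):
$$C(N) = \frac{N}{1 + \alpha (N - 1) + \beta N (N - 1)}$$
Revised 3-parameter equation (a measured value at N = 1 is not required):
$$X(N) = \frac{\gamma N}{1 + \alpha (N - 1) + \beta N (N - 1)}$$
Here α is the contention coefficient, β is the coherence coefficient, and γ is the scale factor, equal to the modeled throughput at N = 1.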
Sources:
http://www.perfdynamics.com/Manifesto/USLscalability.html
http://perfdynamics.blogspot.com/2018/05/usl-scalability-modeling-with-three.html
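As a rough sketch of how the three-parameter form, X(N) = γN / (1 + α(N − 1) + βN(N − 1)), is used for extrapolation: once α, β, and γ are known, the load at which throughput peaks follows from setting dX/dN = 0, which gives N* = sqrt((1 − α) / β). The function names and the coefficient values below are illustrative assumptions, not values fitted in this talk.

```python
import math


def usl_throughput(n: float, alpha: float, beta: float, gamma: float = 1.0) -> float:
    """Throughput X(N) predicted by the three-parameter USL."""
    return gamma * n / (1.0 + alpha * (n - 1) + beta * n * (n - 1))


def usl_peak_load(alpha: float, beta: float) -> float:
    """Load N* at which X(N) is maximal; infinite if there is no coherence cost."""
    if beta <= 0:
        return math.inf
    return math.sqrt((1.0 - alpha) / beta)


if __name__ == "__main__":
    # Hypothetical coefficients chosen only to show the shape of the curve.
    alpha, beta, gamma = 0.02, 0.0001, 250.0
    n_star = usl_peak_load(alpha, beta)
    print(f"peak load N* = {n_star:.1f}")
    print(f"peak throughput = {usl_throughput(n_star, alpha, beta, gamma):.1f}")
    for n in (1, 32, 64, 128, 256):
        print(n, round(usl_throughput(n, alpha, beta, gamma), 1))
```

Beyond N* the βN(N − 1) term dominates and predicted throughput falls, which is the degradation point the model lets you estimate without driving production traffic that high.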

Calculated USL
JMeter Test Runs
         1 Pod                                   2 Pods
Users    No.     Avg/s   Avg Resp   Max Resp     No.     Avg/s   Avg Resp   Max Resp
64       16257   53.2    128        1380         16283   53.3    128        1223
128      31635   103.6   161        1716         31788   103.8   155        1472
192      46033   150.7   199        1816         46155   150.7   196        1511
256      58878   190.9   253        2072         59385   193.7   241        4115
320      68345   223.1   353        3564         70845   231.2   303        4525
384      74272   242.3   500        4493         80283   261.9   383        3098
448      -       -       -          -            87009   283.7   492        3152
Maximum Throughput
Pods   Users      Throughput
1      738.392    94640.14
2      878.9402   108487.9
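Before fitting a model, the sublinear scaling in the JMeter runs can be seen directly by computing relative efficiency: samples per user at each load, divided by samples per user at the lowest load. The sketch below is an illustration added here, not part of the talk's source code. It normalizes to the 64-user run because no single-user measurement is available, and it assumes the 448-user row belongs to the 2-pod run.

```python
# (users, No. of samples) pairs taken from the JMeter Test Runs table.
ONE_POD = [(64, 16257), (128, 31635), (192, 46033),
           (256, 58878), (320, 68345), (384, 74272)]
# Assumption: the 448-user row is a 2-pod measurement.
TWO_PODS = [(64, 16283), (128, 31788), (192, 46155),
            (256, 59385), (320, 70845), (384, 80283), (448, 87009)]


def relative_efficiency(runs):
    """Per-user throughput at each load relative to the lowest-load run."""
    base_users, base_samples = runs[0]
    per_user_base = base_samples / base_users
    return [(users, (samples / users) / per_user_base) for users, samples in runs]


if __name__ == "__main__":
    for label, runs in (("1 pod", ONE_POD), ("2 pods", TWO_PODS)):
        print(label)
        for users, eff in relative_efficiency(runs):
            print(f"  {users:4d} users: efficiency {eff:.3f}")
```

An efficiency of 1.0 at every load would mean linear scaling; the steady decline as users are added is the contention and coherence cost that the USL parameters quantify.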

Links
❖ Source code: https://github.com/kbrockhoff/perf-test-in-prod
