Stress Test & Chaos Engineering

Diego Pacheco

What every engineer and manager thinks about
❏ FRs (Functional Requirements) ~ Features
❏ Time / Productivity
❏ Business logic that works
❏ But is that all...?

Scalability + Availability + Reliability
❏ As the business grows, will the code keep working?
❏ Will the user experience stay the same, or will it get slower?
❏ It may be fine for the typical user (p50), while a few users
have a really bad experience (p99.9 & p99.99).
❏ Does the user trust the system? Neglecting
these 3 disciplines can destroy
your brand really fast.
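
The gap between p50 and p99.9 can be made concrete with a small calculation. This is a minimal sketch using the nearest-rank method; the object name `LatencyPercentiles` and the latency numbers are made up for illustration, not taken from a real system.

```scala
object LatencyPercentiles {
  // Nearest-rank percentile: the smallest sample such that
  // at least p% of all samples are less than or equal to it.
  def percentile(samples: Seq[Double], p: Double): Double = {
    require(samples.nonEmpty, "need at least one sample")
    require(p > 0 && p <= 100, "p must be in (0, 100]")
    val sorted = samples.sorted
    val rank = math.ceil(p / 100.0 * sorted.size).toInt
    // Clamp the rank to guard against floating-point rounding at the edges.
    sorted(math.min(math.max(rank, 1), sorted.size) - 1)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical data: 9,980 fast requests (20 ms) and 20 very slow ones (3,000 ms).
    val latenciesMs = Seq.fill(9980)(20.0) ++ Seq.fill(20)(3000.0)
    Seq(50.0, 99.0, 99.9, 99.99).foreach { p =>
      println(f"p$p%.2f = ${percentile(latenciesMs, p)}%.0f ms")
    }
  }
}
```

With this data the median and p99 look healthy, while p99.9 and p99.99 land on the slow requests. That is the kind of bad experience an average or a median hides.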

The Rise and Fall of Fallbacks
❏ Hystrix
❏ Spring Cloud -> Resilience4J
❏ Fallback issues:
  ❏ Hard to test
  ❏ Fallbacks fail
  ❏ Lack of continuous testing
  ❏ Fallbacks can make an outage even worse
❏ Amazon philosophy -> focus on making the code more resilient.

Erlang | Akka | Amazon Philosophy

How to Do Proper Stress / Load Testing?
❏ Have a plan
❏ Which service to test? Why?
❏ Select the endpoints to test (don't test them all)
❏ Set expectations for latency and for the number of requests to handle
❏ Know where your service breaks. Figure out why.
❏ Test in rounds of increasing load: 1, 5, 10, 50, 100, 1k, 2k, 5k, 10k, 50k, 100k, 1M, 100M...
❏ You must have observability. A dedicated environment is a must as well.
❏ Understand your metrics (which ones matter per service)
❏ Automate stress tests in your build pipeline
❏ Have a platform: it could be a Jenkins job + scripts.
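
The "increasing load" and "know where your service breaks" points can be sketched as a small driver: it walks the load levels in order and stops at the first one that violates the expectations. Everything here is hypothetical (`LoadSteps`, `StepResult`, `fakeRun`, the thresholds). A real `run` function would trigger an actual stress test at that level and read the numbers from your metrics.

```scala
object LoadSteps {
  // A subset of the load levels from the plan.
  val levels: Seq[Int] = Seq(1, 5, 10, 50, 100, 1000, 2000, 5000, 10000)

  // What one round of testing reports back (hypothetical shape).
  final case class StepResult(level: Int, p99Ms: Double, errorRate: Double)

  // Runs levels in order and returns the first result that violates
  // an expectation, or None if every level passes. The iterator is
  // lazy, so levels after the breaking point are never run.
  def firstBreakingLevel(levels: Seq[Int], maxP99Ms: Double, maxErrorRate: Double)(
      run: Int => StepResult): Option[StepResult] =
    levels.iterator.map(run).find(r => r.p99Ms > maxP99Ms || r.errorRate > maxErrorRate)

  // Stand-in for a real test run: latency grows with load and
  // errors appear from 5,000 onwards. Purely made-up behaviour.
  def fakeRun(level: Int): StepResult =
    StepResult(level, p99Ms = 20.0 + level * 0.1, errorRate = if (level >= 5000) 0.05 else 0.0)

  def main(args: Array[String]): Unit =
    firstBreakingLevel(levels, maxP99Ms = 300.0, maxErrorRate = 0.01)(fakeRun) match {
      case Some(r) => println(s"Service breaks at level ${r.level}: $r")
      case None    => println("All levels passed; raise the load further")
    }
}
```

Finding the level is only half the job. The "figure out why" part still needs observability on the service side.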

Stress / Load Testing with Gatling
https://gist.github.com/diegopacheco/faf7ceb2496e4ebdaded

docker run diegopacheco/time-microservice

./gradlew gatlingRun-com.github.diegopacheco.gatling.microservices.st.StressTest -DGATLING_URL="http://172.17.0.2:8080"

https://gatling.io/docs/current/cheat-sheet/
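
For orientation, a Gatling simulation matching the class name in the command above might look roughly like this. It is a sketch, not the content of the linked gist. It assumes the Gatling 3.x Scala DSL, and the endpoint path `/`, the user counts, and the thresholds are placeholders. Only the package, the class name, and the `GATLING_URL` property come from the command.

```scala
package com.github.diegopacheco.gatling.microservices.st

import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class StressTest extends Simulation {

  // Target URL from the GATLING_URL system property; the default is an assumption.
  private val baseUrl = sys.props.getOrElse("GATLING_URL", "http://localhost:8080")

  private val httpProtocol = http.baseUrl(baseUrl)

  // One request per virtual user. The path "/" is a placeholder:
  // use the endpoints you selected in your test plan.
  private val scn = scenario("time-microservice stress")
    .exec(
      http("get time")
        .get("/")
        .check(status.is(200))
    )

  setUp(
    // Ramp up to 100 users over 30 seconds (illustrative numbers).
    scn.inject(rampUsers(100).during(30.seconds))
  ).protocols(httpProtocol)
    .assertions(
      // Expectations expressed as assertions, so the build fails when they are not met.
      global.responseTime.percentile3.lt(500), // percentile3 is the 99th percentile by default
      global.successfulRequests.percent.gt(99)
    )
}
```

Whether a `-D` property passed to `./gradlew` reaches the simulation JVM depends on how the Gradle build is configured, so check the build script if the default URL is being used unexpectedly.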

Chaos
❏ Test your infrastructure
  ❏ Are all ASGs (Auto Scaling Groups) in place?
  ❏ Does failover to another instance, AZ, or region work?
❏ Test your clusters:
  ❏ SQL | NoSQL | NewSQL
❏ Test your microservices' downstream dependencies
  ❏ Timeouts
  ❏ Retries | Exponential backoff + Jitter
❏ Chaos inside a box
  ❏ Disk, CPU, Memory, Metadata...
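
The "Retries | Exponential backoff + Jitter" item can be sketched in plain Scala. This version uses the "full jitter" variant, where each delay is a random value below an exponentially growing, capped ceiling. The names (`Backoff`, `delayMs`, `retry`) and all parameter values are illustrative assumptions. The wrapped operation is expected to enforce its own timeout.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Random, Success, Try}

object Backoff {
  // Full jitter: a random delay in [0, min(capMs, baseMs * 2^attempt)).
  def delayMs(attempt: Int, baseMs: Long, capMs: Long, rnd: Random): Long = {
    // Limit the shift so the multiplication cannot overflow for sane base values.
    val ceiling = math.min(capMs, baseMs * (1L << math.min(attempt, 30)))
    (rnd.nextDouble() * ceiling).toLong
  }

  // Calls op up to maxAttempts times, sleeping with backoff + jitter between
  // failures. Returns the first success, or the last failure.
  def retry[A](maxAttempts: Int, baseMs: Long, capMs: Long,
               rnd: Random = new Random(),
               sleep: Long => Unit = ms => Thread.sleep(ms))(op: () => A): Try[A] = {
    @tailrec
    def loop(attempt: Int): Try[A] = Try(op()) match {
      case s @ Success(_) => s
      case f @ Failure(_) if attempt + 1 >= maxAttempts => f
      case Failure(_) =>
        sleep(delayMs(attempt, baseMs, capMs, rnd))
        loop(attempt + 1)
    }
    loop(0)
  }

  def main(args: Array[String]): Unit = {
    var calls = 0
    // Hypothetical downstream call that fails twice before succeeding.
    val result = retry(maxAttempts = 5, baseMs = 100, capMs = 2000) { () =>
      calls += 1
      if (calls < 3) throw new RuntimeException("downstream timeout") else "ok"
    }
    println(s"result=$result after $calls calls")
  }
}
```

The jitter matters under chaos. When a dependency comes back after a failure, clients that all retry on the same schedule hit it at the same instant, and random delays spread that load out.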

Chaos
https://github.com/asobti/kube-monkey

Chaos
https://docs.chaostoolkit.org/

Chaos
https://github.com/pingcap/chaos-mesh

Exercises
Constraints
1. Stress tests need to be written in Scala
2. You need to use Gatling
Tasks
1. Using the previous exercises or the time-microservice image, make the
application run in Kubernetes.
2. Create a stress test with Gatling.
3. Create chaos in Kubernetes by killing pods, and make sure the app still
works and the Gatling tests don't fail.
