Leveraging Chaos Mesh in Astra Serverless testing
2021-08-26
Pierre Laporte - @pingtimeout - pierre.laporte@datastax.com
Intro
Apache Cassandra™
● Open-source (ASLv2) distributed database
● Leader-less architecture
● Linearly scalable
● Zero-downtime
DataStax
● Created in 2010
● Supports and contributes to Cassandra
● Develops DSE
● Astra Streaming, powered by Pulsar
● Distributed Tests on Pulsar with Chaos Mesh & Fallout
● Astra DB, powered by Cassandra
Distributed System Testing
Jepsen
● Distributed database testing framework
● Verify data consistency
● CAP theorem
● Mostly standalone deployments
Fallout
● Built on top of Jepsen
● Open-source (ASLv2) test orchestration project
● Provisions, builds, installs and configures products
● Collects test run artifacts (.log, .hdr, .csv, …)
NoSQLBench
● Open-source (ASLv2) load generator
● Supports multiple protocols (CQL, Kafka, Pulsar, MongoDB, HTTP, …)
● Benchmark definition as YAML
● Lots of metrics
● Graphite/HDR histogram
Chaos testing Astra Serverless
Types of failures
01 Recoverable infrastructure failures
● Local failures (hardware induced, traffic induced, …)
02 Disaster failures
● Region-wide failures (storm in us-east-1, DNS in ap-northeast-2, it's always DNS, …)
Goals of chaos testing Astra Serverless
● Build confidence, trust and knowledge
  ● How resilient is the system?
  ● How big is the user impact?
● Calculate Mean Time to Recovery
  ● How long to fully recover from an outage?
Two different mitigation procedures
● Driver level - Avoid busy/dead pods
  ● Speculative execution policy
  ● Load-balancing policy
● Kubernetes level - Replace dead pods
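
To make the driver-level mitigation concrete, here is a minimal sketch of the idea behind a speculative execution policy: if the first pod has not answered within a short delay, or fails outright, the same request is sent to the next pod and the first successful answer wins. This is not the DataStax driver implementation. The function name, the `(name, callable)` replica format, and the simulated pods are assumptions made for illustration only.

```python
import concurrent.futures as cf
import time


def speculative_execute(replicas, request, delay_s=0.05):
    """Try replicas in order, starting a new attempt whenever the previous
    ones have not succeeded within `delay_s` seconds (or have failed).

    `replicas` is a list of (name, callable) pairs (hypothetical format).
    Returns (replica_name, result) for the first successful answer.
    """
    pool = cf.ThreadPoolExecutor(max_workers=len(replicas))
    pending = {}   # future -> replica name
    errors = []    # (replica name, exception) for failed attempts
    to_try = list(replicas)
    try:
        while to_try or pending:
            if to_try:
                name, call = to_try.pop(0)
                pending[pool.submit(call, request)] = name
            # While other replicas remain, wait at most delay_s before
            # speculating; otherwise block until an attempt completes.
            timeout = delay_s if to_try else None
            done, _ = cf.wait(list(pending), timeout=timeout,
                              return_when=cf.FIRST_COMPLETED)
            for fut in done:
                name = pending.pop(fut)
                if fut.exception() is None:
                    return name, fut.result()
                errors.append((name, fut.exception()))
        raise RuntimeError(f"all replicas failed: {errors}")
    finally:
        # Do not block on slow attempts that lost the race.
        pool.shutdown(wait=False)


# Simulated pods (illustrative only).
def busy_pod(request):
    time.sleep(0.5)
    return "busy:" + request


def healthy_pod(request):
    return "healthy:" + request


def dead_pod(request):
    raise ConnectionError("pod is down")


if __name__ == "__main__":
    print(speculative_execute([("pod-0", busy_pod), ("pod-1", healthy_pod)], "SELECT 1"))
    print(speculative_execute([("pod-0", dead_pod), ("pod-1", healthy_pod)], "SELECT 1"))
```

The load-balancing policy plays the complementary role of deciding the order of `replicas`. Speculative retries are only safe for idempotent requests, since the same request may be executed by more than one pod.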
Astra Serverless architecture
● Cloud-native database offering
● Architecture whitepaper
Chaos Mesh usage
● Astra DB Data Plane
  ● Between internal pods
  ● Between pods and external services
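
As an illustration of what such an experiment could look like, the sketch below builds a Chaos Mesh `NetworkChaos` manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f -` accepts. The slides do not show the actual experiments, so the experiment names, namespace, pod labels and external host are placeholders, and the choice of a network partition is an assumption.

```python
import json


def network_partition(name, namespace, source_labels,
                      target_labels=None, external_targets=None,
                      duration="60s"):
    """Build a NetworkChaos manifest that cuts traffic from the pods
    matching `source_labels` to other pods or to external hosts.

    All argument values used in the demo are hypothetical.
    """
    spec = {
        "action": "partition",
        "mode": "all",  # apply to every pod matched by the selector
        "selector": {
            "namespaces": [namespace],
            "labelSelectors": source_labels,
        },
        "duration": duration,
    }
    if target_labels is not None:
        # Pod-to-pod partition, in both directions.
        spec["direction"] = "both"
        spec["target"] = {
            "mode": "all",
            "selector": {
                "namespaces": [namespace],
                "labelSelectors": target_labels,
            },
        }
    if external_targets is not None:
        # Pod-to-external-service partition; externalTargets only
        # applies to outgoing traffic (direction "to").
        spec["direction"] = "to"
        spec["externalTargets"] = list(external_targets)
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "NetworkChaos",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


if __name__ == "__main__":
    # Placeholder namespace, labels and host, not the real Astra ones.
    internal = network_partition(
        "partition-internal", "data-plane",
        source_labels={"app": "coordinator"},
        target_labels={"app": "storage"},
    )
    external = network_partition(
        "partition-external", "data-plane",
        source_labels={"app": "storage"},
        external_targets=["object-store.example.com"],
    )
    print(json.dumps(internal, indent=2))
    print(json.dumps(external, indent=2))
```

Generating manifests from code like this is one way to keep a family of similar experiments consistent; hand-written YAML files work just as well.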
Automate everything
1. Create Chaos Mesh experiments
2. Design tests in Fallout with NoSQLBench
3. Automatically compute MTTR after tests
All chaos tests are run weekly
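
The MTTR step can be sketched as follows. The slides do not say how recovery is detected, so this example assumes each test run yields a time series of `(timestamp_seconds, error_rate)` samples, and that the system counts as recovered once the error rate stays at or below a threshold for a few consecutive samples. The function names, the sample format and the thresholds are all illustrative assumptions.

```python
from statistics import mean


def time_to_recovery(samples, fault_start, threshold, stable_samples=3):
    """Seconds between `fault_start` and the first sample that opens a run
    of `stable_samples` consecutive healthy samples after a degradation.

    `samples` is a time-sorted list of (timestamp_seconds, error_rate).
    Returns 0.0 if the metric never exceeded `threshold`, and None if the
    system did not recover before the end of the run.
    """
    after = [(t, v) for t, v in samples if t >= fault_start]
    healthy = [v <= threshold for _, v in after]
    if all(healthy):
        return 0.0  # the fault had no visible impact
    first_bad = healthy.index(False)
    for i in range(first_bad + 1, len(after) - stable_samples + 1):
        if all(healthy[i:i + stable_samples]):
            return after[i][0] - fault_start
    return None


def mean_time_to_recovery(recovery_times):
    """Average the recovery times of several runs; unrecovered runs are an error."""
    if any(r is None for r in recovery_times):
        raise ValueError("at least one run never recovered")
    return mean(recovery_times)


if __name__ == "__main__":
    # Two made-up runs, one sample every 10 seconds.
    run_a = [(0, 0.0), (10, 0.0), (20, 0.5), (30, 0.4), (40, 0.0),
             (50, 0.3), (60, 0.0), (70, 0.0), (80, 0.0)]
    run_b = [(0, 0.0), (10, 0.9), (20, 0.0), (30, 0.0), (40, 0.0)]
    times = [
        time_to_recovery(run_a, fault_start=15, threshold=0.01),
        time_to_recovery(run_b, fault_start=5, threshold=0.01),
    ]
    print(times, mean_time_to_recovery(times))
```

Requiring several consecutive healthy samples avoids declaring recovery on a brief dip, as at t=40 in the first run. In practice the samples would come from the metrics the load generator records during the test.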
Closing thoughts
Future work
● Many ideas to leverage Chaos Mesh
● Chaos tests against Cassandra 4.0 with K8ssandra
● More Chaos Mesh experiments (IOChaos, StressChaos, …)
Thank you for such a useful tool!
We are hiring!
Pierre Laporte - @pingtimeout - pierre.laporte@datastax.com

