Continuous Deployment with C*:
Treating C* as First-Class Code
Michael Kjellman
@mkjellman
Software Engineer, Barracuda Networks
Continuous Deployment with Cassandra
C* At Barracuda
• Powers 100% of our Spam and Webfilter Backend
• 48 Node Cluster
• 2 Datacenters
• Requests: 20k writes/sec 30k reads/sec
• Latency: 1 ms/write 1.6 ms/read
• > 30TB of Data
• Almost entirely native protocol/CQL3
Hardware Configuration
• 32GB of RAM
• 1x SSD
• 2x Spinning Disks
• 2x 6 Core AMD
Key Configuration Options
• key_cache_size_in_mb: 1024
• row_cache_size_in_mb: 0
• memtable_total_space_in_mb: 2048
• HEAP_NEWSIZE="1200M" (-Xmn)
• MAX_HEAP_SIZE="8G" (-Xmx)
• -XX:SurvivorRatio=6
• Sidenote: Java 7u40 is out!
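As a rough sanity check on the heap settings above, the young-generation layout implied by -Xmn and -XX:SurvivorRatio can be worked out by hand. The sketch below assumes the usual HotSpot meaning of SurvivorRatio (eden size divided by the size of one survivor space); the function name is illustrative, and the real JVM may round sizes for alignment.

```python
def young_gen_layout(new_size_mb, survivor_ratio):
    """Split the young generation into eden and two survivor spaces.

    Assumes HotSpot semantics: SurvivorRatio = eden / one survivor space,
    so young = eden + 2 * survivor = (survivor_ratio + 2) * survivor.
    """
    survivor_mb = new_size_mb / (survivor_ratio + 2)
    eden_mb = survivor_mb * survivor_ratio
    return eden_mb, survivor_mb


# Values from the slide: HEAP_NEWSIZE=1200M, -XX:SurvivorRatio=6
eden, survivor = young_gen_layout(1200, 6)
print(f"eden = {eden:.0f} MB, each survivor = {survivor:.0f} MB")
# prints "eden = 900 MB, each survivor = 150 MB"
```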
How do I keep my graphs pretty during a C* upgrade?
September 18th 2013
Make a C* Build
$> git clone http://git-wip-us.apache.org/repos/asf/cassandra.git
$> cd cassandra
$> git checkout -t origin/cassandra-1.2
$> git log
$> vim build.xml (change the version number every time you make a build!)
$> ant clean release
Deployment
• Make release
• Test release with CCM
• Push release to Puppet (deals with config, etc)
• Run a controlled, scripted rolling restart, one datacenter at a time
– flush
– stop
– start
– validate node
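The flush / stop / start / validate loop above might be scripted roughly as follows. This is a minimal sketch, not the script used at Barracuda: the ssh access, the init script name `cassandra`, and the use of `nodetool info` as the health check are all assumptions, and a real validation step would likely check more (gossip state, client port, pending compactions).

```python
import subprocess
import time

STEPS = ("flush", "stop", "start", "validate")


def rolling_restart_plan(nodes_by_dc):
    """Yield (datacenter, host, step), finishing one datacenter
    completely before touching the next."""
    for dc, hosts in nodes_by_dc.items():
        for host in hosts:
            for step in STEPS:
                yield dc, host, step


def command_for(host, step):
    """Map a step to a command line. These command lines are assumptions
    about the environment (ssh + an init script named 'cassandra')."""
    commands = {
        "flush": ["nodetool", "-h", host, "flush"],
        "stop": ["ssh", host, "sudo", "service", "cassandra", "stop"],
        "start": ["ssh", host, "sudo", "service", "cassandra", "start"],
        "validate": ["nodetool", "-h", host, "info"],
    }
    return commands[step]


def rolling_restart(nodes_by_dc, run=subprocess.check_call,
                    retries=30, wait_seconds=10, sleep=time.sleep):
    """Restart every node in order; abort if a node never comes back."""
    for dc, host, step in rolling_restart_plan(nodes_by_dc):
        cmd = command_for(host, step)
        if step != "validate":
            run(cmd)
            continue
        # Poll until the restarted node answers, or give up and stop the
        # whole rollout so a bad build does not reach the next node.
        for _ in range(retries):
            try:
                run(cmd)
                break
            except (subprocess.CalledProcessError, OSError):
                sleep(wait_seconds)
        else:
            raise RuntimeError(f"{host} in {dc} did not come back; aborting")


if __name__ == "__main__":
    # Dry run with hypothetical host names: print commands instead of
    # executing them.
    rolling_restart({"dc1": ["node1", "node2"], "dc2": ["node3"]}, run=print)
```

Passing `run=print` gives a dry run; injecting the runner also makes the ordering easy to test before pointing the script at a live cluster.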
Automate, Automate, Automate
So, why not just apt-get install cassandra?
• Makes running a custom release in the future a complete nightmare
• You lose visibility into the changes in the release
• WHY are you upgrading?
• Treat a C* build just as if it were a release of your own code. What commits did you put into your own release?
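One way to answer "what commits went into this release?" is to diff the branch you are about to build against whatever you last deployed. The sketch below is an illustration only: the helper names are hypothetical, and `last-deployed-tag` is a placeholder for however you mark your previous build.

```python
import subprocess


def release_log_command(previous_ref, new_ref):
    """Build the git command listing commits in new_ref but not previous_ref."""
    return ["git", "log", "--oneline", "--no-merges",
            f"{previous_ref}..{new_ref}"]


def parse_oneline(output):
    """Turn `git log --oneline` output into (sha, subject) pairs."""
    commits = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        sha, _, subject = line.partition(" ")
        commits.append((sha, subject))
    return commits


def commits_in_release(repo_dir, previous_ref, new_ref):
    """Run git in repo_dir and return the commits new to this release."""
    output = subprocess.check_output(
        release_log_command(previous_ref, new_ref), cwd=repo_dir, text=True)
    return parse_oneline(output)


if __name__ == "__main__":
    # 'last-deployed-tag' is a placeholder ref, not a real Cassandra tag.
    for sha, subject in commits_in_release(
            "cassandra", "last-deployed-tag", "origin/cassandra-1.2"):
        print(sha, subject)
```

Reviewing this list before every build is the same habit as reading your own changelog before shipping application code.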
Simply Put:
MY CODE DOESN’T WORK WITHOUT A STABLE C* CLUSTER
When things go wrong
• Every commit (whether by C* committers or my own) comes with potential bugs and regressions
• Gossip Bugs Can Bite Hard:
– CASSANDRA-5665: Gossiper.handleMajorStateChange
can lose existing node ApplicationState
• At 48 nodes, even small mistakes are massive
Writing your code to deal with node failure
• Upgrading a C* cluster means constant node failures for the duration of the rolling restart
• How does your code deal with read latency and retries?
– CASSANDRA-4705: Eager retries for reads in 2.0+
• The mythical “constantly failing” code != stability.
– Handle exceptions (and node/read failures) gracefully!
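On the client side, handling node and read failures gracefully usually means bounded retries with a short backoff rather than failing on the first error. The sketch below is generic and driver-agnostic: `NodeUnavailable` is a stand-in for whatever timeout or unavailable exception your client library raises, and this is application-level retry, not the server-side eager retry from CASSANDRA-4705.

```python
import random
import time


class NodeUnavailable(Exception):
    """Placeholder for the timeout/unavailable error your client raises."""


def read_with_retries(read, attempts=3, base_delay=0.05, sleep=time.sleep):
    """Call read() up to `attempts` times, backing off between failures.

    Re-raises the last error if every attempt fails, so callers still see
    a real failure instead of a silent None.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return read()
        except NodeUnavailable as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Exponential backoff with jitter so that many clients do
                # not all retry against a restarting node at the same time.
                delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.5)
                sleep(delay)
    raise last_error
```

During a rolling restart a read that fails against one coordinator will often succeed on the next attempt, so a small retry budget tends to be enough; an unbounded retry loop would only hide a real outage.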
Why treat C* like your own code
• Using C* will move much of your own
application logic to C*
• The bugs have to go somewhere!
• Data replication at database layer or at
application layer
QUESTIONS?
Thanks for Listening!
