A ScyllaDB Community
Freshworks Migration Journey
from Cassandra to ScyllaDB
Premkumar Patturaj
Senior Manager
Prem Kumar Patturaj
■ Senior Engineering Manager at Freshworks with 15 years of
IT experience, including 10 years at Freshworks.
■ Expertise in Relational and NoSQL databases, specializing
in designing and optimizing scalable, high-performance
systems.
■ Experienced in solving complex technical challenges,
mentoring teams, and fostering a culture of continuous learning.
■ Committed to engineering excellence, leveraging best
practices to create efficient and reliable software solutions.
© 2024 Freshworks Inc. All rights reserved.
Freshworks at a glance
■ Founded: 2010
■ Employees: 4,500
■ 2024 Annual Revenue Guidance: $700M+
■ Total Customers: 67,000+
■ Recognition: 3 Gartner Magic Quadrants; Leader in 3 Major Peer Reviews
■ IPO: September 2021 (FRSH)
Neo Platform and Freddy infuse AI across all products
[Diagram: Freshworks Solutions, shown as three layers]
■ Solutions (Customer Experience, Employee Experience): Customer Service Suite, Freshdesk, Freshchat, Freshsales, Freshmarketer, Freshservice, Freshservice for Business Teams, Device42
■ AI (Freddy AI): Freddy AI Self Service for Customers & Employees; Freddy AI Copilot for Customer Service, Sales, Marketing, IT & Developers; Freddy AI Insights for Business Leaders
■ Platform (Freshworks Neo): Integrate & Extend, Unify, Manage & Secure (Developer tools, Marketplace, Data, Analytics, Admin, Security)
Presentation Agenda
■ Background and Motivation
■ Goals
■ Approach
■ Challenges
■ Optimization
We manage all databases in Freshworks
■ Availability
■ Reliability
■ Monitoring
■ Recovery
■ Keep Current
■ RDS MySQL, Postgres; Redis; MongoDB; Kafka; ClickHouse; …
■ A mix of self-hosted and cloud solutions
■ Identify the best balance for Freshworks
■ Uber goal for Dataverse
■ Application teams agnostic to the underlying database
■ e.g., applications use a Cassandra client while the backend is ScyllaDB
Dataverse
Databases at Freshworks
Database    Servers  Data Processed  Req/s  Data Persisted  Availability (%)
MySQL       1200     7.9 Gb/s        1.4M   4.5 PiB         99.992
Redis       869      1 GB/s          2M     550 GiB         99.991
Kafka       65       1 GB/s          0.7M   420 TiB         99.99
ClickHouse  16       400 Mb/s        2M     33 TiB          99.99
Memcached   72       12 Mb/s         2M     257 GiB         99.99
Postgres    110      2.2 Gb/s        0.22M  210 TiB         99.99
ScyllaDB    45       750 Mb/s        0.05M  270 TiB         99.99
Scale
ScyllaDB at Freshworks
Clusters  Nodes  IOPS  Storage
10        45     500k  270 TB
Background and Motivation
Background
Hypertrail
■ Hypertrail aims to provide a scalable, cost-effective, and fault-tolerant timeline solution that enables
products to capture and query activity and audit logs for any custom entity, with flexible filtering
capabilities to meet specific business needs
Workflow Automator
■ Workflows can be configured to create projects and tasks and associate them with tickets/changes.
Users can configure a workflow using any condition on tickets/changes. It is currently used
for the alerts module.
Hypertrail
Cassandra Overview
Cassandra Cluster Overview:
■ 24TB of unreplicated data.
■ Spread across 56 Cassandra nodes.
Challenges in Cassandra:
■ Repair & Consistency Issues
■ High Tail Latencies
■ Backup & Restore Overheads
■ Manual Toil with more nodes
Performance Benchmark
Motivation
ScyllaDB Advantages Over Cassandra:
Hardware Efficiency:
■ Few large machines replace many small ones.
Operational Simplicity:
■ Reduced overhead for repairs, compactions, and scaling.
Cost Reduction:
■ Lower infrastructure costs due to fewer machines.
Goals
Goals
Zero Downtime:
■ Ensure the application remains fully operational during migration.
Low Latency Overhead:
■ Minimize the impact on application latency during the process.
Accuracy:
■ Validate the migrated data for completeness and correctness.
Efficiency:
■ Perform the migration in the shortest duration possible to reduce infrastructure costs.
■ Complete migration and validations in a time and cost-efficient manner.
Migration Approach
Migration Approach
Historical Data Migration:
■ Bulk migration of existing data from Cassandra to ScyllaDB cluster.
Dual Writes:
■ Writing data to both Cassandra and ScyllaDB clusters while the migration is
in progress, using the ZDM (Zero Downtime Migration) Proxy.
Data Validation:
■ Validating data consistency between the source and destination using CDM
(Cassandra Data Migrator)
Historical Data Migration
Evaluated options for bulk data migration
■ Datastax CDM Tool
■ Stream SSTables via Tools
■ Load and Stream using nodetool
Advantages of Load and Stream
■ Fastest approach
■ Minimal impact on ScyllaDB cluster.
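As a rough illustration of the Load and Stream option, the sketch below builds the `nodetool refresh ... --load-and-stream` command that ScyllaDB provides for loading SSTables from a table's upload directory and streaming them to the replicas that own the data. The keyspace and table names are hypothetical, and the slides do not show the exact invocation Freshworks used, so treat this as an assumption-laden example rather than the team's actual procedure.

```python
import shlex
from typing import List


def load_and_stream_command(keyspace: str, table: str) -> List[str]:
    """Build the nodetool call that loads SSTables already copied into the
    table's upload directory and streams them to the owning replicas."""
    return ["nodetool", "refresh", keyspace, table, "--load-and-stream"]


if __name__ == "__main__":
    # "hypertrail" and "activities" are made-up names for illustration only.
    print(shlex.join(load_and_stream_command("hypertrail", "activities")))
```

The command is only built and printed here; running it requires a ScyllaDB node with the source SSTables already copied into the upload directory.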
Dual Writes
■ ZDM Proxy performed dual-writes, handling all use-cases required for the migration process.
■ Latency added by ZDM Proxy was benchmarked under 10 milliseconds.
Infrastructure Setup
Hosted on EC2 c6.2xlarge instances with 3 replicas distributed across availability zones (AZs).
■ Prometheus Metrics:
■ Exported by ZDM Proxy by default.
■ Node exporter service ran alongside ZDM to monitor system-level bottlenecks.
ZDM Proxy
Reads from Source Only:
■ Used during the initial migration phase.
Async Reads to Target:
■ Enabled after historical data migration and validation.
■ Allowed performance measurement of ScyllaDB before switching the traffic.
Migration Workflow:
■ ZDM Proxy initially operated with reads coming from the source only.
■ After completing bulk data migration and validation, reconfigured ZDM Proxy
to async read from the target.
■ Measured ScyllaDB performance before fully transitioning application traffic.
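The two read modes and the dual-write behaviour described above can be pictured with a small toy model. This is not ZDM Proxy code: the class name, the flag, and the use of plain dictionaries as "clusters" are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


class DualWriteProxy:
    """Toy model of a migration proxy: every write goes to both clusters,
    reads are answered by the source, and (optionally) each read is also
    sent asynchronously to the target so its behaviour can be observed."""

    def __init__(self, source, target, async_reads_to_target=False):
        self.source = source            # stands in for the Cassandra cluster
        self.target = target            # stands in for the ScyllaDB cluster
        self.async_reads_to_target = async_reads_to_target
        self._pool = ThreadPoolExecutor(max_workers=1)
        self.shadow_reads = []          # futures of mirrored target reads

    def write(self, key, value):
        # Dual write: both clusters receive every mutation.
        self.source[key] = value
        self.target[key] = value

    def read(self, key):
        if self.async_reads_to_target:
            # Mirrored read; its result never reaches the caller.
            self.shadow_reads.append(self._pool.submit(self.target.get, key))
        # The caller always gets the source's answer in this phase.
        return self.source.get(key)

    def close(self):
        self._pool.shutdown(wait=True)


if __name__ == "__main__":
    proxy = DualWriteProxy({}, {}, async_reads_to_target=True)
    proxy.write("ticket:1", "created")
    print(proxy.read("ticket:1"))
    proxy.close()
```

Because the mirrored read is discarded, the target can be exercised with production-shaped traffic without affecting what the application sees.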
Data Validation
CDM for Data Validation
■ Validating terabytes of data is time-intensive.
■ Optimized validation to reduce time by 80%
Validation Steps
■ CDM reads from the source in bulk.
■ Compares corresponding data in the target cluster.
■ Repeats for the entire partition range.
Tuning CDM Properties:
■ Enabled spark.cdm.autocorrect.missing and spark.cdm.autocorrect.mismatch
■ Bridges gaps in data consistency automatically.
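The validation steps and the two autocorrect settings can be sketched conceptually as follows. This is a simplified in-memory model, not CDM itself (CDM runs as a Spark job); the function name and the dictionary-based "tables" are assumptions for illustration.

```python
def validate(source, target, autocorrect_missing=False, autocorrect_mismatch=False):
    """Compare every source row with the target and report differences.

    source and target map a row key to its value. When an autocorrect
    flag is set, the corresponding kind of difference is repaired by
    copying the source row into the target.
    """
    report = {"missing": [], "mismatch": []}
    for key, row in source.items():
        if key not in target:
            report["missing"].append(key)
            if autocorrect_missing:
                target[key] = row
        elif target[key] != row:
            report["mismatch"].append(key)
            if autocorrect_mismatch:
                target[key] = row
    return report


if __name__ == "__main__":
    src = {1: "a", 2: "b", 3: "c"}
    dst = {1: "a", 2: "stale"}
    print(validate(src, dst, autocorrect_missing=True, autocorrect_mismatch=True))
    print(dst == src)
```

With both flags enabled, a second validation pass over the same data should report no differences.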
Challenges
Challenges
Large Partition
■ CDM processes large partitions by loading entire slices into memory, which caused out-of-memory (OOM) errors.
Large-Scale Validation:
■ Validating over 20 TB of unreplicated data was estimated to take weeks.
■ CDM jobs scanned partitions, retrieving rows individually.
■ High I/O latency due to individual select operations for each row.
Optimization
Optimization
Large Partition
■ Split partition range into smaller chunks
■ Controls the amount of data loaded into memory for each slice
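One way to picture splitting the partition range is to divide the token ring into contiguous chunks so each slice covers less data. The sketch below assumes the Murmur3 partitioner's signed 64-bit token range; the function name and chunk count are illustrative and not taken from the talk.

```python
MIN_TOKEN = -(2 ** 63)      # lowest Murmur3 token (assumed partitioner)
MAX_TOKEN = 2 ** 63 - 1     # highest Murmur3 token


def split_token_range(num_chunks, lo=MIN_TOKEN, hi=MAX_TOKEN):
    """Split the inclusive range [lo, hi] into num_chunks contiguous,
    non-overlapping inclusive sub-ranges of near-equal size."""
    total = hi - lo + 1
    if num_chunks < 1 or num_chunks > total:
        raise ValueError("num_chunks must be between 1 and the range size")
    # Integer boundaries; the last boundary equals hi + 1.
    bounds = [lo + (total * i) // num_chunks for i in range(num_chunks + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(num_chunks)]


if __name__ == "__main__":
    for start, end in split_token_range(4):
        print(start, end)
```

Raising the chunk count shrinks each slice, which bounds how much data a single task has to hold in memory at the cost of more tasks to schedule.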
Large-Scale Validation
■ Adopted range-based reads.
■ Bypassed value validation by only checking key presence.
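The difference between per-row lookups and range-based key-presence checks can be sketched with a toy table that counts the queries it serves. The `Table` class and both functions are hypothetical stand-ins, not the customized CDM code from the talk.

```python
from bisect import bisect_left, bisect_right


class Table:
    """Toy table keyed by token that counts how many queries it serves."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self._tokens = sorted(self._rows)
        self.queries = 0

    def get(self, token):
        # One query per row, like an individual SELECT by key.
        self.queries += 1
        return self._rows.get(token)

    def keys_in_range(self, lo, hi):
        # One query returns every key whose token lies in [lo, hi].
        self.queries += 1
        i = bisect_left(self._tokens, lo)
        j = bisect_right(self._tokens, hi)
        return self._tokens[i:j]


def missing_keys_row_by_row(source, target, lo, hi):
    """Look up each source key in the target individually."""
    return [k for k in source.keys_in_range(lo, hi) if target.get(k) is None]


def missing_keys_range_based(source, target, lo, hi):
    """Fetch the target's keys for the same range once, then only check
    that each source key is present (values are not compared)."""
    present = set(target.keys_in_range(lo, hi))
    return [k for k in source.keys_in_range(lo, hi) if k not in present]


if __name__ == "__main__":
    src = Table({t: "row" for t in range(1, 6)})
    dst = Table({t: "row" for t in (1, 2, 4, 5)})
    print(missing_keys_range_based(src, dst, 1, 5), dst.queries)
```

The trade-off is explicit in the second function: checking key presence only is cheaper, but it will not detect a row that exists in both clusters with different values.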
Range-Based Reads from Target
Customized CDM validation
Optimization Outcome
■ Reduced validation times by over 80%, ensuring efficiency for large-scale data validations.
■ Enhanced scalability and practicality for production environments.
■ Achieved significant cost savings, particularly in infrastructure expenses.
■ Enabled faster and more frequent validation cycles, ensuring data accuracy and consistency.
Future Use Cases
■ BLOB Store
■ UCR
■ DynamoDB use cases
Thank you
Stay in Touch
Prem Kumar Patturaj
premkumar.patturaj@freshworks.com
https://x.com/iam_prem
https://www.linkedin.com/in/prem-kumar-patturaj-27217933/

Inside Freshworks' Migration from Cassandra to ScyllaDB by Premkumar Patturaj
