Inside Freshworks' Migration from Cassandra to ScyllaDB by Premkumar Patturaj

A ScyllaDB Community
Freshworks Migration Journey
from Cassandra to ScyllaDB
Premkumar Patturaj
Senior Manager

Prem Kumar Patturaj
■ Senior Engineering Manager at Freshworks with 15 years of
IT experience, with 10 years at Freshworks.
■ Expertise in Relational and NoSQL databases, specializing
in designing and optimizing scalable, high-performance
systems.
■ Experienced in solving complex technical challenges,
mentoring teams, and fostering a culture of continuous learning.
■ Committed to engineering excellence, leveraging best
practices to create efficient and reliable software solutions.

© 2024 Freshworks Inc. All rights reserved.
Freshworks at a glance
■ Founded: 2010
■ Employees: 4,500
■ 2024 Annual Revenue Guidance: $700M+
■ Total Customers: 67,000+
■ IPO: September 2021 (FRSH)
■ Recognition: 3 Gartner Magic Quadrants; Leader in 3 Major Peer Reviews

Neo Platform and Freddy infuse AI across all products
SOLUTIONS
■ Customer Experience: Customer Service Suite, Freshdesk, Freshchat, Freshsales, Freshmarketer
■ Employee Experience: Freshservice, Freshservice for Business Teams, Device42
AI (Freddy AI)
■ Freddy AI Self Service for Customers & Employees
■ Freddy AI Copilot for Customer Service, Sales, Marketing, IT & Developers
■ Freddy AI Insights for Business Leaders
PLATFORM (Freshworks Neo)
■ Integrate & Extend: Developer tools, Marketplace
■ Unify: Data, Analytics
■ Manage & Secure: Admin, Security

Presentation Agenda
■ Background and Motivation
■ Goals
■ Approach
■ Challenges
■ Optimization

Dataverse
We manage all databases in Freshworks
■ Availability
■ Reliability
■ Monitoring
■ Recovery
■ Keep Current
■ RDS MySQL, Postgres; Redis; MongoDB; Kafka; ClickHouse; …
■ A mix of self-hosted and cloud solutions
■ Identify the best balance for Freshworks
■ Uber goal for Dataverse
■ Application teams agnostic to the underlying database
■ e.g., use Cassandra client but backend is ScyllaDB

Databases at Freshworks: Scale
Database    Servers  Data Processed  Req/s  Data Persisted  Availability (%)
MySQL       1200     7.9Gb/s         1.4M   4.5 PiB         99.992
Redis       869      1GB/s           2M     550 GiB         99.991
Kafka       65       1GB/s           0.7M   420 TiB         99.99
ClickHouse  16       400Mb/s         2M     33 TiB          99.99
Memcached   72       12Mb/s          2M     257 GiB         99.99
Postgres    110      2.2Gb/s         0.22M  210 TiB         99.99
ScyllaDB    45       750Mb/s         0.05M  270 TiB         99.99

ScyllaDB at Freshworks
Clusters Nodes IOPS Storage
10 45 500k 270TB

Background
Hypertrail
■ Hypertrail aims to provide a scalable, cost-effective, and fault-tolerant timeline solution that enables
products to capture and query activity and audit logs for any custom entity, with flexible filtering
capabilities to meet specific business needs
Workflow Automator
■ Workflows can be configured to create projects and tasks and associate them with tickets/changes.
Users can configure a workflow using any condition on tickets/changes. It is currently used
for the alerts module.

Cassandra Overview
Cassandra Cluster Overview:
■ 24TB of unreplicated data.
■ Spread across 56 Cassandra nodes.
Challenges in Cassandra:
■ Repair & Consistency Issues
■ High Tail Latencies
■ Backup & Restore Overheads
■ Manual Toil with more nodes

Motivation
ScyllaDB Advantages Over Cassandra:
Hardware Efficiency:
■ Few large machines replace many small ones.
Operational Simplicity:
■ Reduced overhead for repairs, compactions, and scaling.
Cost Reduction:
■ Lower infrastructure costs due to fewer machines.

Goals
Zero Downtime:
■ Ensure the application remains fully operational during migration.
Low Latency Overhead:
■ Minimize the impact on application latency during the process.
Accuracy:
■ Validate the migrated data for completeness and correctness.
Efficiency:
■ Perform the migration in the shortest duration possible to reduce infrastructure costs.
■ Complete migration and validations in a time and cost-efficient manner.

Migration Approach
Historical Data Migration:
■ Bulk migration of existing data from Cassandra to ScyllaDB cluster.
Dual Writes:
■ Writing data to both Cassandra and ScyllaDB clusters while the migration is
in progress, using the ZDM (Zero Downtime Migration) Proxy.
Data Validation:
■ Validating data consistency between the source and destination using CDM
(Cassandra Data Migrator)

Historical Data Migration
Evaluated options for bulk data migration
■ DataStax CDM Tool
■ Stream SSTables via Tools
■ Load and Stream using nodetool
Advantages of Load and Stream
■ Fastest approach
■ Minimal impact on ScyllaDB cluster.

Dual Writes
■ ZDM Proxy performed dual-writes, handling all use-cases required for the migration process.
■ Latency added by ZDM Proxy was benchmarked at under 10 milliseconds.
Infrastructure Setup
■ Hosted on EC2 c6.2xlarge instances with 3 replicas distributed across availability zones (AZs).
■ Prometheus Metrics:
■ Exported by ZDM Proxy by default.
■ Node exporter service ran alongside ZDM to monitor system-level bottlenecks.

ZDM Proxy
Reads from Source Only:
■ Used during the initial migration phase.
Async Reads to Target:
■ Enabled after historical data migration and validation.
■ Allowed performance measurement of ScyllaDB before switching the traffic.
Migration Workflow:
■ ZDM Proxy initially operated with reads coming from the source only.
■ After completing bulk data migration and validation, reconfigured ZDM Proxy
to async read from the target.
■ Measured ScyllaDB performance before fully transitioning application traffic.
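
The two read modes and the dual-write behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `DualWriteProxy`, `ReadMode`, and `shadow_reads` are hypothetical names, plain dicts stand in for the Cassandra and ScyllaDB clusters, and none of this is ZDM Proxy's actual code or configuration.

```python
from concurrent.futures import ThreadPoolExecutor
from enum import Enum


class ReadMode(Enum):
    SOURCE_ONLY = "source_only"          # initial migration phase
    ASYNC_TO_TARGET = "async_to_target"  # after bulk load and validation


class DualWriteProxy:
    """Toy stand-in for a migration proxy; dicts play the two clusters."""

    def __init__(self, source, target, read_mode=ReadMode.SOURCE_ONLY):
        self.source = source
        self.target = target
        self.read_mode = read_mode
        self.shadow_reads = []  # (key, matched) results of background reads
        self._pool = ThreadPoolExecutor(max_workers=2)

    def write(self, key, value):
        # Every write goes to both clusters so the target stays current.
        self.source[key] = value
        self.target[key] = value

    def read(self, key):
        # The application always receives the source's answer.
        result = self.source.get(key)
        if self.read_mode is ReadMode.ASYNC_TO_TARGET:
            # Send the same read to the target in the background. Its result
            # is recorded for comparison and never returned to the caller.
            self._pool.submit(self._shadow_read, key, result)
        return result

    def _shadow_read(self, key, expected):
        self.shadow_reads.append((key, self.target.get(key) == expected))

    def close(self):
        # Wait for any in-flight background reads to finish.
        self._pool.shutdown(wait=True)
```

In this model, switching `read_mode` to `ASYNC_TO_TARGET` exercises the target with real read traffic while the application keeps seeing source results, which is the property that allows the target's performance to be measured before cutover.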

Data Validation
CDM for Data Validation
■ Validating terabytes of data is time-intensive.
■ Optimized validation to reduce time by 80%
Validation Steps
■ CDM reads from the source in bulk.
■ Compares corresponding data in the target cluster.
■ Repeats for the entire partition range.
Tuning CDM Properties:
■ Enabled spark.cdm.autocorrect.missing and spark.cdm.autocorrect.mismatch
■ Bridges gaps in data consistency automatically.
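
The validation steps and the two autocorrect switches can be modelled in a few lines. This is a minimal sketch, assuming dicts in place of the source and target tables; the `validate` function and its report fields are hypothetical, and the two flags only loosely mirror the intent of spark.cdm.autocorrect.missing and spark.cdm.autocorrect.mismatch rather than reproducing CDM's implementation.

```python
def validate(source, target, autocorrect_missing=False, autocorrect_mismatch=False):
    """Compare every source row with the target and optionally repair the target.

    source and target map a primary key to a row value.
    Returns counts of rows checked, missing, mismatched, and corrected.
    """
    report = {"checked": 0, "missing": 0, "mismatch": 0, "corrected": 0}
    for key, src_row in source.items():
        report["checked"] += 1
        if key not in target:
            # Row exists in the source but was never written to the target.
            report["missing"] += 1
            if autocorrect_missing:
                target[key] = src_row
                report["corrected"] += 1
        elif target[key] != src_row:
            # Row exists in both clusters but the values differ.
            report["mismatch"] += 1
            if autocorrect_mismatch:
                target[key] = src_row
                report["corrected"] += 1
    return report
```

With both flags off, the function only reports differences; with them on, a single pass both detects and repairs gaps, which is the behaviour the bullet above describes.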

Challenges
Large Partition
■ CDM processes large partitions by loading entire slices into memory, which led to out-of-memory (OOM) errors.
Large-Scale Validation:
■ Validating over 20TB of unreplicated data was estimated to take weeks.
■ CDM jobs scanned partitions, retrieving rows individually.
■ High I/O latency due to individual select operations for each row.

Optimization
Large Partition
■ Split partition range into smaller chunks
■ Controls the amount of data loaded into memory for each slice
Large-Scale Validation
■ Adopted range-based reads.
■ Bypassed value validation by only checking key presence.
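
Splitting the partition range into smaller chunks can be sketched as dividing the token ring into equal slices, so that each job loads a bounded amount of data. This is an illustration under assumptions: it uses the signed 64-bit token range of the Murmur3 partitioner, and `split_token_range` is a hypothetical helper, not a CDM function.

```python
MIN_TOKEN = -(2**63)      # lowest Murmur3 partitioner token
MAX_TOKEN = 2**63 - 1     # highest Murmur3 partitioner token


def split_token_range(num_slices, lo=MIN_TOKEN, hi=MAX_TOKEN):
    """Return contiguous, non-overlapping inclusive (start, end) ranges covering [lo, hi]."""
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    total = hi - lo + 1
    base, extra = divmod(total, num_slices)
    slices = []
    start = lo
    for i in range(num_slices):
        # Spread the remainder over the first `extra` slices.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # more slices requested than tokens available
        end = start + size - 1
        slices.append((start, end))
        start = end + 1
    return slices
```

A larger `num_slices` means less data per slice and lower peak memory per job, at the cost of more jobs to schedule.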

Range-Based Reads from Target
Customized CDM validation
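
The idea behind range-based reads with key-presence checks can be shown with a small simulation. This is a sketch, not the customized CDM code: `FakeCluster`, `keys_in_range`, and `missing_keys` are hypothetical names, and an in-memory sorted list with a query counter stands in for a real cluster.

```python
from bisect import bisect_left, bisect_right


class FakeCluster:
    """Sorted (token, key) pairs plus a counter of simulated round trips."""

    def __init__(self, rows):
        self.rows = sorted(rows)
        self.tokens = [token for token, _ in self.rows]
        self.queries = 0

    def keys_in_range(self, start, end):
        """One simulated range scan: keys whose token is in [start, end]."""
        self.queries += 1
        i = bisect_left(self.tokens, start)
        j = bisect_right(self.tokens, end)
        return {key for _, key in self.rows[i:j]}


def missing_keys(source, target, token_ranges):
    """Keys present in source but absent from target.

    Issues one range scan per cluster per token range instead of one
    lookup per row, and compares keys only, not row values.
    """
    missing = set()
    for start, end in token_ranges:
        src_keys = source.keys_in_range(start, end)
        dst_keys = target.keys_in_range(start, end)
        missing |= src_keys - dst_keys
    return missing
```

In this model the number of target queries equals the number of token ranges rather than the number of rows. The trade-off is that a row whose key exists in both clusters but whose values differ is not detected, since values are never compared.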

Optimization Outcome
■ Reduced validation times by over 80%, ensuring efficiency for large-scale data validations.
■ Enhanced scalability and practicality for production environments.
■ Achieved significant cost savings, particularly in infrastructure expenses.
■ Enabled faster and more frequent validation cycles, ensuring data accuracy and consistency.

Future Use Cases
■ BLOB Store
■ UCR
■ DynamoDB use cases

Stay in Touch
Prem Kumar Patturaj
premkumar.patturaj@freshworks.com
https://x.com/iam_prem
https://www.linkedin.com/in/prem-kumar-patturaj-27217933/
