Building Zeotap's Privacy
Compliant Customer Data
Platform (CDP) with ScyllaDB
Shubham Patil, Lead Software Engineer
Safal Pandita, Senior Software Engineer
Presenters
Shubham Patil, Lead Software Engineer
■ Leads the platform engineering team at Zeotap for CDP product suite
■ Responsible for its architecture, design and engineering delivery
■ 6 years of experience building scalable distributed systems
Safal Pandita, Senior Software Engineer
■ Leads the Scylla integrations at Zeotap for CDP product suite
■ 4 years of experience in building scalable distributed systems
About Zeotap
Zeotap is a privacy-focused 360º Customer Data Platform
(CDP) made for privacy-sensitive marketers
■ Enables brands to better understand their customers
- 360º view
■ Built on GCP
■ Native 3P data enrichment from over 130 premium
sources
PRIVACY AND SECURITY IS IN OUR DNA
2018-2021: Customer Data Platform
2014-2021: Stitching data from 120 companies for 500m customers under strict EU privacy law, for better targeting for brands
https://www.youtube.com/watch?v=XS790sG1Y7I
Vertical: CDP/CIP
What is a Customer Data Platform (CDP)?
Consented and actionable trusted golden records of 1P customer profiles to support marketing goals
■ Data Unification: build your single customer view
■ Consent Unification: unify consent across user IDs and channels
■ Client ID, MAID, Email, Phone, Web Cookies, Other IDs, Marketing Preferences, Consent Purposes
■ A Golden Record: your own private identity graph
■ Universal ID, Contract History, Demographics, Loyalty Status
CDP: Unification of all silos
Scylla Summit 2022: Building Zeotap's Privacy Compliant Customer Data Platform (CDP) with ScyllaDB
Zeotap’s CDP Tech Requirements
Batch (Data Onboarding)
■ Ingestion of e.g. CRM/database dumps.
■ Batch activation of user audience in DMPs.
■ Bulk data exports to client databases/sinks.
Realtime (Event Orchestration)
■ Ingestion of user data from website interactions in real time.
■ Real-time activation of user audience.
Privacy/Compliance (Consent Mastering)
■ User opt-out, consent management and mastering, etc.
CDP Tech Matrix
Requirements v1 v2 v3
Multi-regional, Multi-Tenant, Privacy and GDPR compliant deployment
Sub-second/Realtime writes (with BQ streaming inserts)
Sub-second/Realtime reads/deletes (for ‘On The Fly’ User Unification)
Point Lookups
Works for data at every scale (few MegaBs to PetaBs)
Mature and transparent monitoring stack
Supports Spark integration to export data dumps to data lakes
Complete control on sizing of cluster/processing
Supports Encryption: At rest, value level, rotation (RawPII)
Complete control on underlying data model and scans
Simple SQL-like query capabilities
Enterprise Support
Before Scylla: CDP v1.0
CDP Tech Matrix Review
Requirement | v1
Multi-regional, Multi-Tenant, Privacy and GDPR compliant deployment | ✅
Sub-second/Realtime writes (with BQ streaming inserts) | ✅
Sub-second/Realtime reads/deletes (for ‘On The Fly’ User Unification) | ❌
Point Lookups | ❌
Works for data at every scale (few MegaBs to PetaBs) | ✅
Mature and transparent monitoring stack | ❌
Supports Spark integration to export data dumps to data lakes | ✅
Complete control on sizing of cluster/processing | ❌
Supports Encryption: At rest, value level, rotation (RawPII) | ✅
Complete control on underlying data model and scans | ✅
Simple SQL-like query capabilities | ✅
Enterprise Support | ✅
Before Scylla: CDP v2.0
CDP Tech Matrix Review
Requirement | v1 | v2
Multi-regional, Multi-Tenant, Privacy and GDPR compliant deployment | ✅ | ✅
Sub-second/Realtime writes (with BQ streaming inserts) | ✅ | ✅
Sub-second/Realtime reads/deletes (for ‘On The Fly’ User Unification) | ❌ | ✅
Point Lookups | ❌ | ❌
Works for data at every scale (few MegaBs to PetaBs) | ✅ | ❌
Mature and transparent monitoring stack | ❌ | ❌
Supports Spark integration to export data dumps to data lakes | ✅ | ✅
Complete control on sizing of cluster/processing | ❌ | ✅
Supports Encryption: At rest, value level, rotation (RawPII) | ✅ | ✅
Complete control on underlying data model and scans | ✅ | ❌
Simple SQL-like query capabilities | ✅ | ❌
Enterprise Support | ✅ | ❌
With Scylla: CDP v3.0
CDP Tech Matrix Review
Requirement | v1 | v2 | v3
Multi-regional, Multi-Tenant, Privacy and GDPR compliant deployment | ✅ | ✅ | ✅
Sub-second/Realtime writes (with BQ streaming inserts) | ✅ | ✅ | ✅
Sub-second/Realtime reads/deletes (for ‘On The Fly’ User Unification) (600 ms in JG vs 30 ms in Scylla) | ❌ | ✅ | ✅
Point Lookups | ❌ | ❌ | ✅
Works for data at every scale (few MegaBs to PetaBs) | ✅ | ❌ | ✅
Mature and transparent monitoring stack | ❌ | ❌ | ✅
Supports Spark integration to export data dumps to data lakes | ✅ | ✅ | ✅
Complete control on sizing of cluster/processing | ❌ | ✅ | ✅
Supports Encryption: At rest, value level, rotation (RawPII) | ✅ | ✅ | ✅
Complete control on underlying data model and scans | ✅ | ❌ | ✅
Simple SQL-like query capabilities | ✅ | ❌ | ✅
Enterprise Support | ✅ | ❌ | ✅
The Data Model
Requirements from a User Store
■ On-The-Fly User Unification (ID Resolution)
■ Fast lookup store with low latencies for both read and write
■ Flexible enough to be used as a Profile/Consent/ID store
■ Needed to be used as a linkage store
■ We needed TTL in a few different ways
• Profiles/Consents (Attributes in a Map)
• ID (Elements of a collection)
• ID Store (row level)
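The three TTL granularities listed above map onto three shapes of CQL write. The sketch below only builds the statement text; the table and column names (`profile`, `linked_ids`, `ucid`, `id_type`, `id_value`) are hypothetical and are not taken from the slides.

```python
# Hypothetical schema, for illustration only:
#   user_store(id_type text, id_value text, profile map<text, text>, linked_ids set<text>)
#   id_store(id_type text, id_value text, ucid text)

def ttl_map_attribute(table, ttl_seconds):
    """TTL on one map entry: only the key written by this statement expires."""
    return (f"UPDATE {table} USING TTL {ttl_seconds} "
            "SET profile[?] = ? WHERE id_type = ? AND id_value = ?")

def ttl_collection_element(table, ttl_seconds):
    """TTL on the elements added to a set by this statement."""
    return (f"UPDATE {table} USING TTL {ttl_seconds} "
            "SET linked_ids = linked_ids + ? WHERE id_type = ? AND id_value = ?")

def ttl_row(table, ttl_seconds):
    """Row-level TTL: every column written by the INSERT expires together."""
    return (f"INSERT INTO {table} (id_type, id_value, ucid) "
            f"VALUES (?, ?, ?) USING TTL {ttl_seconds}")

print(ttl_map_attribute("client_a.user_store", 3600))
print(ttl_collection_element("client_a.user_store", 3600))
print(ttl_row("client_a.id_store", 86400))
```

In CQL a TTL applies to the cells written by a statement, which is why the same `USING TTL` clause yields per-attribute, per-element or per-row expiry depending on what the statement writes.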
Read/Write Query Patterns
Pattern 1: Read IdStore by Id Type and Value
Pattern 2: Find User Profiles by UCID
Pattern 3: Stamp UCID in Id Store
Pattern 4: Insert profiles/consents/preferences in User Store
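One way to picture how the four patterns combine is an in-memory stand-in for the two stores. This is a sketch under assumptions: the slides do not show the actual flow, the class and method names are invented, and the `ingest` step simply reuses the first known UCID rather than merging conflicting ones.

```python
import uuid

class InMemoryCdp:
    """Toy stand-in for the Id Store and User Store (not the real schema)."""

    def __init__(self):
        self.id_store = {}    # (id_type, id_value) -> ucid
        self.user_store = {}  # ucid -> profile attributes

    def read_id(self, id_type, id_value):
        # Pattern 1: read the Id Store by id type and value
        return self.id_store.get((id_type, id_value))

    def find_profile(self, ucid):
        # Pattern 2: find the user profile by UCID
        return self.user_store.get(ucid, {})

    def stamp_ucid(self, id_type, id_value, ucid):
        # Pattern 3: stamp the UCID in the Id Store
        self.id_store[(id_type, id_value)] = ucid

    def upsert_profile(self, ucid, attributes):
        # Pattern 4: insert profile/consent/preference attributes
        self.user_store.setdefault(ucid, {}).update(attributes)

    def ingest(self, ids, attributes):
        """Resolve incoming IDs to a single UCID, then write the profile."""
        known = [u for u in (self.read_id(t, v) for t, v in ids) if u]
        ucid = known[0] if known else str(uuid.uuid4())
        for id_type, id_value in ids:
            self.stamp_ucid(id_type, id_value, ucid)
        self.upsert_profile(ucid, attributes)
        return ucid

cdp = InMemoryCdp()
first = cdp.ingest([("email", "a@example.com")], {"city": "Berlin"})
second = cdp.ingest([("email", "a@example.com"), ("maid", "m-1")], {"loyalty": "gold"})
print(first == second, cdp.find_profile(first))
```

The second event arrives with a known email and a new MAID, so both IDs end up pointing at the same UCID and the profile accumulates attributes from both events.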
Data Model v1.0
■ Isolation for each client achieved through keyspaces
■ Separate clusters for each region (EU, US, IN, UK)
■ Each keyspace could have a different schema
■ RF = 3, ICS (Incremental Compaction Strategy), CL = QUORUM for reads and writes
■ Single table that acts as our profile, consent and linkage store
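The choice of RF = 3 with QUORUM for both reads and writes can be checked with a little arithmetic: a quorum is a majority of replicas, and a read is guaranteed to overlap the latest acknowledged write when the read and write replica counts sum to more than RF. The helper names below are illustrative.

```python
def quorum(replication_factor):
    """Number of replicas that form a majority."""
    return replication_factor // 2 + 1

def read_overlaps_write(replication_factor, read_replicas, write_replicas):
    """True when any read set must share at least one replica with any write set."""
    return read_replicas + write_replicas > replication_factor

rf = 3
q = quorum(rf)
print(q)                               # replicas needed per QUORUM operation
print(rf - q)                          # replicas that can be down while QUORUM still succeeds
print(read_overlaps_write(rf, q, q))   # QUORUM reads overlap QUORUM writes
```

With RF = 3 a quorum is 2 replicas, so each keyspace tolerates one unavailable replica per partition while keeping read-after-write overlap.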
Problems faced
■ Batch sizes became a bottleneck since our
transactions needed to be atomic across partitions.
We crossed the recommended limit of ~100K per
batch.
■ Collection sizes started increasing beyond the
recommended size of ~1MB
■ Latencies worsened due to large batches and
collection sizes. In some cases, queries started timing
out
Applied Solutions
■ Split queries into multiple smaller batches, with multiple retries per batch.
■ Use prepared statements to improve performance
■ Use TTL to keep total volume in check
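The first of these solutions, splitting one oversized batch into several smaller ones and retrying each, can be sketched as follows. `run_batch` is a hypothetical callable standing in for whatever driver call executes a batch, and the default sizes are placeholders rather than values from the talk.

```python
import time

def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def execute_in_batches(statements, run_batch, batch_size=50,
                       max_retries=3, backoff_s=0.0):
    """Run statements in small batches, retrying each batch on failure.

    Returns the number of batches that completed. Re-raises the last
    error once a batch has failed `max_retries` times.
    """
    completed = 0
    for batch in chunked(statements, batch_size):
        for attempt in range(1, max_retries + 1):
            try:
                run_batch(batch)
                completed += 1
                break
            except Exception:
                if attempt == max_retries:
                    raise
                time.sleep(backoff_s * attempt)  # simple linear backoff
    return completed

sent = []
print(execute_in_batches(list(range(5)), sent.append, batch_size=2), sent)
```

Two caveats apply to this approach: atomicity now holds only within each smaller batch, not across the whole original transaction, and retries are only safe when the writes are idempotent.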
Bottlenecks - Hot Rows
■ Storing linkages in a collection became a
bottleneck due to our increasing scale
■ Going beyond the recommended ~1MB per collection degrades latency and puts SLAs at risk
■ Collections go through a
serialization/deserialization step in Scylla which
makes them slower compared to other data types
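A rough client-side guard can flag collections approaching the ~1MB guideline before they are written. The sketch below only sums UTF-8 payload bytes; it is an approximation that ignores per-element overhead, so it underestimates the size Scylla actually serializes. The function names are illustrative.

```python
ONE_MB = 1024 * 1024

def approx_collection_bytes(elements):
    """Lower-bound estimate: total UTF-8 bytes of the elements' payloads."""
    return sum(len(e.encode("utf-8")) for e in elements)

def exceeds_guideline(elements, limit_bytes=ONE_MB):
    """True when the estimated payload is already above the guideline."""
    return approx_collection_bytes(elements) > limit_bytes

linkages = ["x" * 64] * 20000   # 20,000 linked IDs of 64 bytes each
print(approx_collection_bytes(linkages), exceeds_guideline(linkages))
```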
Data Model v2.0
Updated Data Model
■ Queries that were timing out earlier (>10s) due to a high number of linkages started succeeding within our SLAs (~30ms)
■ Separate linkages store - TTLs are easier to maintain on rows, which was complicated earlier on individual elements of a collection
■ The ~1MB collection guideline no longer caps the number of linkages, which allowed us to scale more effectively
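The slides do not show the v2.0 schema, so the following is only a guess at its general shape: each linkage becomes its own clustering row instead of an element of a collection. Table and column names are hypothetical, and the helpers just build CQL text.

```python
def linkage_table_ddl(keyspace):
    """One row per linkage: partition by UCID, cluster by the linked ID."""
    return (
        f"CREATE TABLE IF NOT EXISTS {keyspace}.linkages ("
        "ucid text, id_type text, id_value text, linked_at timestamp, "
        "PRIMARY KEY ((ucid), id_type, id_value))"
    )

def read_linkages(keyspace, id_type_filter=False):
    """Read all linkages for a UCID, optionally narrowed to one ID type."""
    query = f"SELECT id_type, id_value FROM {keyspace}.linkages WHERE ucid = ?"
    if id_type_filter:
        query += " AND id_type = ?"
    return query

print(linkage_table_ddl("client_a"))
print(read_linkages("client_a", id_type_filter=True))
```

With this layout a read can be paged or restricted by clustering column, rather than deserializing one large collection value in a single cell.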
Production Gotchas - PK Migration
■ Problem: There is no easy way to change a table’s primary key once the data is live.
■ Solution: Use Scylla Migrator to move the data to an intermediate/temporary table with the required schema.
• Since we wanted to reuse the names of our original tables (you can’t rename a table), we had to copy the SSTables from our migrated schema.
• Lesson: Choose your PK wisely
Production Gotchas - Schema Corruption
■ Problem: Schemas can get corrupted while copying SSTables. Schema settlement under load can sometimes take more than a minute and can cause the cluster to crash.
■ Solutions
• Always check that your schema is correctly replicated on all nodes before attempting SSTable copying.
• To check, ask the Scylla team, SSH into the nodes manually, or write your own service around cqlsh.
• The Scylla team resolved our issue by restoring our snapshots and redoing the migration for the affected schemas.
• Lesson: ALWAYS BACK UP YOUR DATA
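A home-grown check of schema replication could compare the schema version each node reports. Nodes expose this as the `schema_version` column in `system.local` and `system.peers` (readable through cqlsh), and `nodetool describecluster` lists the same information. The sketch below assumes those values have already been collected into a dict; how they are collected, and the function names, are left as assumptions.

```python
from collections import Counter

def schema_in_agreement(versions_by_node):
    """True when every node reports the same, non-empty schema version."""
    versions = set(versions_by_node.values())
    return len(versions) == 1 and None not in versions

def nodes_off_majority(versions_by_node):
    """Nodes whose schema version differs from the most common one."""
    if not versions_by_node:
        return []
    majority, _ = Counter(versions_by_node.values()).most_common(1)[0]
    return sorted(n for n, v in versions_by_node.items() if v != majority)

reported = {"node-1": "a1f3", "node-2": "b77c", "node-3": "a1f3"}
print(schema_in_agreement(reported), nodes_off_majority(reported))
```

A migration script could poll this check and refuse to start copying SSTables until the versions agree.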
Production Setup
■ 4 clusters (EU, UK, IN, US) - 6 n2-highmem-64 nodes - Scylla v2021.1.5
■ 130+ clients/keyspaces being managed
■ Max 60K QPS
■ 30 ms avg. read
■ 10 ms avg. write
■ 5.4 TBs - data ingested
■ 50 GBs - max keyspace
Future Plans
■ Microservices around handling Schema corruption and updates
■ Explore Lightweight Transactions (LWTs) for consistency guarantees
■ Explore encrypted data rotation without blocking real-time writes
Thank you!
Stay in touch
Shubham Patil & Safal Pandita
/itsshubhpatil, /safalpandita
patil.sm17@gmail.com
safalpandita@gmail.com
