Relational (RDBMS) to NoSQL Migration
Ankit Patel | DataStax | Principal Strategy Architect
2 © 2020 Datastax, Inc. All rights reserved.
“We cannot solve our problems
with the same thinking we
used when we created them.”
- Albert Einstein
The Digital Era - The Need to Modernize
Digital | Data-Driven | AI Enabled
The Modern Era
SAD (Silos Affects Delivery) Speed of Data Matters!
• Data access
• Legacy processes
• Lack of data analytical skills
• Resistance to change
Source: https://www.pinterest.com/pin/573716440029920090/
NoSQL - The Future
What is a NoSQL (Not-only-SQL) Database?
• Non-relational database: supports accessing data through forms other than Structured Query Language (SQL)
• Designed for cloud applications that need to handle massive amounts of data in real time
• Provides the ability to overcome limitations of scale, performance, data storage, data model, and data distribution
NoSQL vs RDBMS….
| | When to use NoSQL? | When to use RDBMS? |
| Applications | Decentralized (scalable) microservice applications | Centralized monolithic applications |
| Availability | 100% availability, zero-downtime | Moderate to high |
| Data | Low-latency structured/semi-structured/unstructured data @ high velocity | Structured data @ moderate velocity & latency |
| Transactions | Simple transactions & queries | Complex nested transactions & joins |
| Scalability (Reads/Writes) | Horizontal (linear) scaling | Vertical scaling |
Cassandra: The Best NoSQL Database of Choice
Active-everywhere, masterless, scales linearly
Best NoSQL database for cloud-native and microservices
#1 choice of world’s largest consumer internet applications
Zero Lock-in | Global Scale | Zero Downtime
If you use a website or a smartphone today, you’re touching a Cassandra backend system.
Source: https://sdtimes.com/data/apache-cassandra-4-0-beta-now-available/
Cassandra: Cloud Native NoSQL Database
Why?
With Cassandra's masterless architecture, the ability to achieve 100% uptime across on-prem, single-cloud, hybrid, and/or multi-cloud deployments is built into the technology.
[Diagram labels: Experiences, Microservices & Insights; ON PREM]
Cassandra: What is CQL?
● CQL – Cassandra Query Language
● Syntax is similar to SQL
● Standard way to communicate with a DSE Cassandra (C*) cluster for reading/writing data
● Feature-rich language that also lets you manage the cluster (managing schema/permissions, managing roles, JSON support, UDF/UDA support…)
● Example read: SELECT * FROM <keyspace>.<table> WHERE partition_key = <value>;
● Example write: INSERT INTO <keyspace>.<table> (partition_key, clustering_key, value1) VALUES ('A', 'B', 'C');
Cassandra: What is a Keyspace?
● Similar to a schema in an RDBMS
● Container for multiple tables
● Replication Strategy is set at the keyspace level (examples: SimpleStrategy, NetworkTopologyStrategy)
● Replication Factor is defined at the keyspace level
● DURABLE_WRITES is set at the keyspace level; setting it to false bypasses the commit log
● Example to create a keyspace:
CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '1'} AND durable_writes = true;
Cassandra: What is a Table?
● Analogous to an RDBMS table
● Contains a primary key
● Always has a partition key as part of the primary key
● Can optionally define a clustering key (ordering can be defined)
● Both the partition key and the clustering key can be composed of multiple columns
● A number of parameters can be adjusted at the table level (compaction, compression, gc_grace_seconds, time to live, etc.)
Cassandra: Example Create Table
CREATE TABLE test.sample_table (
    par_key1 uuid,
    par_key2 uuid,
    clust_key1 timestamp,
    clust_key2 int,
    value1 text,
    value2 double,
    PRIMARY KEY ((par_key1, par_key2), clust_key1, clust_key2)
) WITH CLUSTERING ORDER BY (clust_key1 DESC, clust_key2 ASC);
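As a rough illustration of what the primary key in the table above implies, the following Python sketch groups rows by the composite partition key (par_key1, par_key2) and sorts each partition by clust_key1 descending, then clust_key2 ascending. It is only a toy in-memory model: the sample rows are made up, and plain strings and integers stand in for the uuid and timestamp columns.

```python
from collections import defaultdict

# Made-up sample rows: (par_key1, par_key2, clust_key1, clust_key2, value1).
# Strings and ints stand in for the uuid and timestamp types of the table.
rows = [
    ("a", "x", 100, 2, "r1"),
    ("a", "x", 200, 1, "r2"),
    ("a", "x", 200, 0, "r3"),
    ("b", "x", 150, 5, "r4"),
]

# Rows sharing the same composite partition key land in the same partition.
partitions = defaultdict(list)
for par_key1, par_key2, clust_key1, clust_key2, value1 in rows:
    partitions[(par_key1, par_key2)].append((clust_key1, clust_key2, value1))

# Within a partition, order by clust_key1 DESC, then clust_key2 ASC.
for partition_rows in partitions.values():
    partition_rows.sort(key=lambda r: (-r[0], r[1]))

ordered_values = [r[2] for r in partitions[("a", "x")]]
print(ordered_values)  # ['r3', 'r2', 'r1']
```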
Cassandra: What is Replication Factor?
● The replication factor determines how many copies of your data are stored in the Cassandra cluster.
● Each copy is stored on a different node.
● The replication factor can be defined per datacenter that you’ve set up.
● It is a parameter set at the keyspace level within the cluster.
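To make "each copy is stored on a different node" concrete, here is a minimal Python sketch of replica placement on a token ring, in the spirit of SimpleStrategy: find the node that owns the key's token, then walk clockwise until the replication factor is met. The node names, the ring size `TOKEN_SPACE`, and the use of MD5 as the hash are assumptions for illustration; a real cluster uses its configured partitioner and virtual nodes.

```python
import hashlib
from bisect import bisect_left

TOKEN_SPACE = 2**32  # illustrative ring size, not Cassandra's real token range


def token_for(value: str) -> int:
    """Hash a string onto the toy ring (MD5 is a stand-in for the real partitioner)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % TOKEN_SPACE


def build_ring(nodes):
    """Give each node one token and return the ring sorted by token."""
    return sorted((token_for(node), node) for node in nodes)


def replicas_for(partition_key: str, ring, replication_factor: int):
    """Pick the first node at or after the key's token, then walk clockwise."""
    if replication_factor > len(ring):
        raise ValueError("replication factor cannot exceed the number of nodes")
    tokens = [token for token, _ in ring]
    start = bisect_left(tokens, token_for(partition_key)) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(replication_factor)]


# Hypothetical five-node cluster with a replication factor of 3.
ring = build_ring(["node1", "node2", "node3", "node4", "node5"])
owners = replicas_for("patient-42", ring, replication_factor=3)
print(owners)  # three distinct node names
```

Losing any one of the three owners still leaves two copies of the row, which is what lets reads and writes continue during a node outage.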
Cassandra: What is Consistency Level?
● This parameter is set by the client on individual queries.
● Combined with the replication factor, it helps you achieve the consistency requirement of the specific use case.
● Some of the available values are: ONE, LOCAL_ONE, QUORUM, EACH_QUORUM, LOCAL_QUORUM, ALL
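The interplay between consistency level and replication factor comes down to simple arithmetic: a quorum is a majority of replicas (floor(RF / 2) + 1), and a read is guaranteed to overlap the latest write when read replicas + write replicas > RF. The Python sketch below works through that arithmetic for the levels listed above. The function names and the two-datacenter layout are assumptions for illustration, not part of any driver API.

```python
def quorum(replication_factor: int) -> int:
    """A quorum is a majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(level: str, rf_by_dc: dict, local_dc: str) -> int:
    """How many replica acknowledgements a request at this level waits for."""
    total_rf = sum(rf_by_dc.values())
    if level in ("ONE", "LOCAL_ONE"):
        return 1
    if level == "QUORUM":
        return quorum(total_rf)
    if level == "LOCAL_QUORUM":
        return quorum(rf_by_dc[local_dc])
    if level == "EACH_QUORUM":
        return sum(quorum(rf) for rf in rf_by_dc.values())
    if level == "ALL":
        return total_rf
    raise ValueError(f"unknown consistency level: {level}")


def read_overlaps_write(read_replicas: int, write_replicas: int, rf: int) -> bool:
    """True when every read set must intersect every write set."""
    return read_replicas + write_replicas > rf


# Hypothetical layout: two datacenters with replication factor 3 each.
rf_by_dc = {"DC1": 3, "DC2": 3}
for level in ("ONE", "LOCAL_QUORUM", "QUORUM", "EACH_QUORUM", "ALL"):
    print(level, replicas_required(level, rf_by_dc, local_dc="DC1"))

local_q = replicas_required("LOCAL_QUORUM", rf_by_dc, local_dc="DC1")
print(read_overlaps_write(local_q, local_q, rf_by_dc["DC1"]))  # True
print(read_overlaps_write(1, 1, rf_by_dc["DC1"]))              # False
```

With RF 3 per datacenter, LOCAL_QUORUM reads and writes (2 + 2 > 3) overlap within a datacenter, while ONE/ONE (1 + 1) does not.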
Cassandra - Read/Write in Action
Replication - 3 per DC
Consistency - Per Read/Write Request from Client
Application - Active/Active Deployment across DC for Read/Write
[Diagram labels: APP (×3); ON-PREM, AWS, AZURE]
How can My Enterprise get from an RDBMS Based Design to Cassandra Based Architecture?
● Structured Data is the norm for both
● Re-evaluate the need for ACID transactions with Lightweight Transactions (LWT) in Cassandra
● Take advantage of Cassandra Performance
○ Move Joins to the Application Stack
○ Denormalization & Data Duplication are efficient
○ Choose the type of Index wisely based on Latency/TPS requirements
● Thoroughly plan the Data Model in Cassandra
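The two sub-points about joins and denormalization can be sketched in plain Python. The first function performs the join in application code across two normalized structures; the second approach copies the doctor's name into each row at write time, so a read touches only one patient-keyed structure. All names and sample records here are hypothetical, and in-memory dictionaries stand in for tables.

```python
# Hypothetical sample data; names and fields are illustrative only.
doctors = {
    "d1": {"name": "Dr. Rao"},
    "d2": {"name": "Dr. Kim"},
}
visits = [
    {"patient": "p1", "date": "2020-03-01", "doctor_id": "d1", "notes": "follow-up"},
    {"patient": "p1", "date": "2020-04-15", "doctor_id": "d2", "notes": "lab review"},
    {"patient": "p2", "date": "2020-04-20", "doctor_id": "d1", "notes": "intake"},
]


def visits_with_app_side_join(patient):
    """Read the patient's visits, then resolve each doctor in application code."""
    return [
        {
            "date": v["date"],
            "doctor_name": doctors[v["doctor_id"]]["name"],
            "notes": v["notes"],
        }
        for v in visits
        if v["patient"] == patient
    ]


# Denormalized alternative: duplicate the doctor's name into the row at write
# time, trading extra storage for a single-lookup read.
visits_by_patient = {}


def record_visit(patient, date, doctor_id, notes):
    row = {"date": date, "doctor_name": doctors[doctor_id]["name"], "notes": notes}
    visits_by_patient.setdefault(patient, []).append(row)


for v in visits:
    record_visit(v["patient"], v["date"], v["doctor_id"], v["notes"])

joined = visits_with_app_side_join("p1")
denormalized = visits_by_patient["p1"]
print(joined == denormalized)  # True
```

Both paths return the same rows. The difference is where the work happens: at read time in the application, or once at write time.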
ERD to Query Based
ERD Based Design Query Based Design
5 Steps to Query Based Design
● Design a Mental Model of Access Patterns
  Examples: Medical History: Read Surgeries, Read Allergies, Read Health Conditions; Doctor Visit: Read Notes, Read Prescriptions, Read Vitals
● Decide the application access patterns to various entities to deliver business functionality.
  Examples: Medical History Queries, Doctor Visit Queries
● Define the structure of the data elements based on query based design
  Example: Read Prescriptions (patient, date, drug, dosage, etc.)
● Make optimizations to access the data
  Example: Create index to Read Prescription by drug type or prescribing Doctor.
● Build Cassandra table schema based on logical model & optimizations
  Example: Table prescriptions with primary key patient, date and index on doctor & drug type
Stages: Application → Conceptual Model → Logical Model → Optimizations → Physical Model
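The prescriptions example from the steps above can be mimicked in a few lines of Python to show how the query shapes the structure. Here a dictionary keyed by patient plays the role of the partition key, rows inside it are kept sorted by date like a clustering key, and a second dictionary acts as a toy index on the prescribing doctor. This is an in-memory sketch under those assumptions, with made-up sample data; it is not how Cassandra stores or indexes data internally.

```python
from bisect import insort

prescriptions_by_patient = {}  # partition key: patient -> rows kept sorted by date
doctor_index = {}              # toy index: doctor -> list of (patient, date)


def write_prescription(patient, date, drug, dosage, doctor):
    """Insert a row into the patient's partition, keeping date order."""
    partition = prescriptions_by_patient.setdefault(patient, [])
    insort(partition, (date, drug, dosage, doctor))
    doctor_index.setdefault(doctor, []).append((patient, date))


def read_prescriptions(patient):
    """Main query: all prescriptions for one patient, oldest first."""
    return list(prescriptions_by_patient.get(patient, []))


def read_prescriptions_by_doctor(doctor):
    """Optimized query: follow the index back to the matching rows."""
    results = []
    for patient, date in doctor_index.get(doctor, []):
        for row in prescriptions_by_patient[patient]:
            if row[0] == date and row[3] == doctor:
                results.append((patient,) + row)
    return results


# Made-up sample data for illustration.
write_prescription("p1", "2020-05-02", "amoxicillin", "500 mg", "Dr. Rao")
write_prescription("p1", "2020-01-10", "ibuprofen", "200 mg", "Dr. Kim")
write_prescription("p2", "2020-03-07", "ibuprofen", "400 mg", "Dr. Rao")

print(read_prescriptions("p1")[0][1])                # ibuprofen
print(len(read_prescriptions_by_doctor("Dr. Rao")))  # 2
```

The patient query is a single lookup plus an ordered scan, while the doctor query needs the extra index hop, which is the kind of cost to weigh when choosing an index against latency requirements.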
DataStax Enterprise: Cassandra Data Platform
Kubernetes Operator (Cloud-Native Automation + Elasticity)
Developer and DevOps APIs (K8S, CQL, REST, GraphQL, gRPC)
Operational Reliability (Advanced Performance, Enterprise Security, Monitoring)
AI-Scale Experiences, Microservices and Insights
Apache Cassandra NoSQL Database (100% Uptime, Zero-Lock-In, Global Scale)
TRUSTED
ACCELERATED
STRATEGIC
OUTCOMES
FOUNDATIONAL
Operational Analytics (Spark, Pipelines, Streaming)
Enhanced Search (Enhance Any Query)
Extensible Integration (Kafka, Elastic, Bulk Loading)
Graph Engine (Relate Data Across Partitions)
Multi-Model Data (All Data Styles)
Tools
Thought Leadership
Enterprise Support
Partnerships
OSS Commitment
DataStax Astra: Cassandra Made Easy in the Cloud
● Cassandra-as-a-Service: Cloud-native Database-as-a-Service built on Apache Cassandra
● No Operations: Eliminate the overhead to install, operate, and scale Cassandra
● Powerful APIs: Out-of-the-box REST and GraphQL endpoints and browser CQL shell
● Cloud Native: Powered by our open-source Kubernetes Operator for Cassandra
● Zero Lock-in: Deploy on AWS or GCP and keep compatibility with open-source Cassandra
● 10 Gig Free Tier: Launch a database in the cloud with a few clicks, no credit card required
Use Case #1 - C&S Wholesale Grocers - Supply Chain
● Delivers over 140,000 food and non-food items from over 50 warehouse locations
● Operates over 18 million square feet of storage
● Some of C&S’s customers are Safeway, Target, Stop & Shop
● Traditional solutions were slowing down distribution efficiency & impeding innovation
● Business growth leading to Technology Innovation
Use Case #1 - C&S - The Challenge
● Supply Chain Processes ran in an RDBMS local to each warehouse
● Business needed to consolidate warehouse data for ease of management via mobile app
● Transaction volumes were in the thousands every few seconds
● Needed a real-time view of all the working parts of the manufacturing operations: Warehouse → locations → pallet
● Needed a Data Platform capable of operational analytics
Use Case #1 - C&S - Why Cassandra?
● Scalable
● High Transaction Volume
● Low Latency
● High Availability - Warehouse operations 24/7
● Ease of Development for Microservices & Mobile App
● Multi-DC Deployment Capability
● Ease of Operational Analytics
Use Case #1 - C&S - Business Benefits
● 5-year ROI projection to save multi-millions
● Able to optimize management capabilities of consolidated warehouse operations
● Achieve remarkable efficiency in data pipeline
● Transactions - Read/Write Thousands in seconds
● Supports 300+ Users processing ~300k records in 5 mins
Use Case #1 - C&S - The Architecture
C&S - Case Study
We needed an application that was entirely reliable and not vulnerable to unplanned outages because our warehouses are pretty much 24/7...
https://www.datastax.com/resources/case-study/cs-wholesale-achieving-seamless-supply-chain-mastery-datastax-enterprise
Use Case #2 - Financial Services - Mobile Banking
● Very competitive retail banking market
● Need to keep up with demand growth in digital banking
● Have high customer satisfaction rates
● Achieve efficient DR & Business Continuity Plans
Use Case #2 - Financial Services - The Challenge
● Transaction volume in the RDBMS was not easily scalable
● DR was not easy
● Achieving Latency metrics was harder as volumes increased
● Downtime or poor experience would translate to customer churn
Use Case #2 - Financial Services - Why Cassandra?
● Deploy 3 DC Cluster
● Microservices Architecture
● Scale Application Stack w/ Database
● Achieve low latency SLA (<20ms on avg)
● DR Strategy was solid w/ High Availability
● Capable of processing billions of transactions per month
Some Other Common Use Cases
• Customer 360/SVOC
• Omnichannel & Global Payments
• IoT/Time Series/eCommerce Data (sensors, tick data, user interactions, shopping cart)
• Fraud Detection
• Online/Mobile Banking
• Inventory Management
• Recommendations (products & services)
• Regulatory Compliance
• Alerts & Monitoring (Credit card transactions)
• Global Payments
• Portfolio Management
• Loan Authorization
• Authentication (Mobile Logins)
Thank You!
Ankit Patel
Principal Strategy Architect @ DataStax
https://www.linkedin.com/in/ankit-p-patel
