Peter Milne
peter@aerospike.com
@helipilot50
helipilot50
SQL to NoSQL Migration
Wisdom vs Guessing
“Everything that can be invented has been invented.”
– Charles Holland Duell, US Patent Office, 1899
“Insanity is doing the same thing over and over again, expecting different results.”
– Albert Einstein
Why we love RDBMS
■ Reliability
■ Concurrency made easy
■Atomicity
■Consistency
■Isolation
■Durability
■ 2 phase commit
■ Rigorous Schema
■ Normalization (normal forms)
…And it is what we all know
Edgar “Ted” Codd
Why switch to NoSQL
■ Scalability
■ Performance
■ Developer productivity
RDBMS and NoSQL differences
■ RDBMS suits many
types of applications
■ A schema enforces rules
that are common to everybody
■ NoSQL is about focusing on
a particular need
■ Model your data for a specific use
case
■ Speed at Scale
10 Reasons why NoSQL is the Key to Velocity
1. RAM is fast
2. Easy for developers
3. Flex schema
4. Read / write agility
5. Scale-out clustering works
6. Flash-enabled for big data
7. Geographic replication
8. Cloud or On-premise
9. KVS+ for fast analytics
10. Open source
Polyglot Persistence
■Different solutions are designed to solve different
problems
■Sessions & fast transactions
■Cache
■Aggregations
■Analytical ad hoc scans
■Traversal
■The requirements for OLTP and OLAP storage are
very different
NoSQL stereotypes
NoSQL types: Key-value & Key-document
Key-Value:
■Riak
■Redis
■Aerospike
■Kyoto Tycoon
Key-Document
■MongoDB
■Couchbase
NoSQL types: Column oriented & Graph
Column oriented
■Cassandra
■HBase
Graph Databases
■Neo4j
■OrientDB
NoSQL Data Modeling Basics
Normalization
■ Normalization is the process of structuring a relational database schema
so that most ambiguity is removed. The stages of normalization are
called normal forms and progress from the least restrictive (First
Normal Form) to the most restrictive (Fifth Normal Form).
De-normalization or Duplication
■ De-normalization is OK
■Immutable data
■ Aggregation vs Association
■“Consists of” vs “related to”
[Figure: “Consists of” (aggregation) vs “Related to” (association)]
Aggregates
Aggregation is a technique of embedding data structures into each other.
Why?
■ Simple read/write is fast
■ Optimize storage I/O
■ Parallelism through clustering
What it costs
■ Updates and Queries are
heavy and complex
Aggregation gives you Velocity
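
The embedding idea above can be sketched in a few lines. This is only an illustration: the customer/order data, the `to_aggregate` helper and the plain `dict` standing in for a key-value store are all assumptions, not something taken from the slides.

```python
# Normalized, relational-style layout: three "tables" linked by ids.
# All names and values here are hypothetical.
customers = {1: {"name": "Ann"}}
orders = {100: {"customer_id": 1}}
order_lines = [
    {"order_id": 100, "sku": "A-1", "qty": 2},
    {"order_id": 100, "sku": "B-7", "qty": 1},
]


def to_aggregate(order_id):
    """Embed the customer and the order lines inside one order record."""
    order = orders[order_id]
    return {
        "order_id": order_id,
        "customer": dict(customers[order["customer_id"]]),
        "lines": [
            {"sku": line["sku"], "qty": line["qty"]}
            for line in order_lines
            if line["order_id"] == order_id
        ],
    }


store = {}  # stand-in for a key-value store
store["order:100"] = to_aggregate(100)

# One key lookup now returns everything the order "consists of".
aggregate = store["order:100"]
```

The read is a single lookup, which is where the speed comes from. The cost shows up on the write side: changing the customer's name means rewriting every aggregate that embeds a copy of it.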
Joins in your code
Not every question can be answered by the same data model.
■ Frequent questions move into the data model
■Composite key, aggregation, etc.
■ Infrequent questions are answered with a join in application code
Example: User profile and Photo
Join
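
A minimal sketch of an application-side join for the user profile and photo example, assuming two separate key-value collections. The key formats, field names and the `profile_with_photo` helper are hypothetical, and plain dictionaries stand in for the database.

```python
# Two independent record sets, each addressed by its own key (hypothetical data).
profiles = {"user:42": {"name": "Pete", "photo_key": "photo:9001"}}
photos = {"photo:9001": {"format": "jpeg", "bytes": b"\xff\xd8"}}


def profile_with_photo(user_key):
    """Join in application code: read the profile, then follow its photo key."""
    profile = dict(profiles[user_key])
    # Second read; a missing photo yields None instead of raising.
    profile["photo"] = photos.get(profile["photo_key"])
    return profile


joined = profile_with_photo("user:42")
```

The join costs two reads instead of one, which is acceptable when the question is asked infrequently.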
Composite keys
Single record, 2 pieces of related data, Single composite key
■ Chat example:
■ Bank account example:
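
The original chat and bank account examples did not survive in this transcript, so the following is only a guess at what composite keys for those two cases might look like. The key layouts and both helper functions are assumptions.

```python
def chat_key(user_a, user_b):
    """One key per conversation: sort the two user ids so order does not matter."""
    first, second = sorted([user_a, user_b])
    return f"chat:{first}:{second}"


def statement_key(account_id, year, month):
    """One key per account and month, for example for a monthly statement record."""
    return f"account:{account_id}:{year:04d}-{month:02d}"


conversation = chat_key("bob", "alice")
statement = statement_key("12345", 2015, 5)
```

Because the key is built from values the application already knows, the record can be fetched with a single lookup and no secondary index.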
Append
As events occur, append them to a list*
■ Banking example
■Deposit
■Withdrawal
■Deposit
*Row oriented
■List, Byte Array, Set
*Column oriented
■Add a column
Append
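
A small sketch of the append pattern for the banking example, assuming each account record holds a list of events. The event fields, the amounts and the in-memory `store` are hypothetical.

```python
from collections import defaultdict

store = defaultdict(list)  # stand-in for a store with list-valued records


def append_event(account_key, kind, amount):
    """Append an event to the account's list; existing events are never modified."""
    store[account_key].append({"type": kind, "amount": amount})


def balance(account_key):
    """Derive the current balance by replaying the event list."""
    total = 0
    for event in store[account_key]:
        if event["type"] == "deposit":
            total += event["amount"]
        else:
            total -= event["amount"]
    return total


append_event("account:12345", "deposit", 100)
append_event("account:12345", "withdrawal", 30)
append_event("account:12345", "deposit", 50)
current = balance("account:12345")
```

Writes are cheap because they only ever add to the end of the list; the balance is computed when it is read.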
Uncle Pete’s advice
Practical migration steps
■ What do you want to achieve
■Speed at Scale
■ Know your traffic
■ Know your data
■ What are you willing to sacrifice
■Consistency, Transactions, Data loss
■ Apply polyglot persistence
■ Model your data
Latency of your application
Latency = Sum(LD) + Sum(LS)
■LD = Device latency
■LS = Stupidity latency
■ Minimize stupidity
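
The formula can be worked through with made-up numbers. The figures below are purely illustrative, in microseconds, and are not measurements from the talk.

```python
def total_latency(device_us, avoidable_us):
    """Latency = Sum(LD) + Sum(LS), with every term in microseconds."""
    return sum(device_us) + sum(avoidable_us)


# LD: costs the hardware imposes (hypothetical: one network hop, one SSD read).
device_us = [200, 1000]
# LS: self-inflicted costs (hypothetical: two unnecessary extra round trips).
avoidable_us = [5000, 5000]

latency_us = total_latency(device_us, avoidable_us)
best_case_us = total_latency(device_us, [])
```

With these numbers the avoidable part dominates: removing it takes the request from 11200 to 1200 microseconds, without touching the hardware.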
Right tool for the job
Load test
■ Simulation
■Simulate real load
■ Nothing is better than real data
■Record live data and playback in testing
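
Record and playback can be sketched as follows, assuming requests are captured as JSON lines. The request shape and the `record` and `playback` helpers are hypothetical, and an in-memory buffer stands in for a log file.

```python
import io
import json


def record(requests, sink):
    """Write each live request to the sink as one JSON line."""
    for request in requests:
        sink.write(json.dumps(request) + "\n")


def playback(source, handler):
    """Feed every recorded request to the handler, in the original order."""
    return [handler(json.loads(line)) for line in source if line.strip()]


live_traffic = [
    {"op": "get", "key": "user:42"},
    {"op": "put", "key": "user:43"},
]

log = io.StringIO()  # a real setup would write to a file instead
record(live_traffic, log)
log.seek(0)

# In a test run the handler would call the system under test.
replayed = playback(log, lambda request: request["op"])
```

Replaying recorded traffic keeps the real mix of keys and operations, which synthetic load generators tend to get wrong.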
Finally...
A well-designed and well-built application should
■ Deliver the correct result
■ Perform adequately
■ Be maintainable by the average Guy or Girl
Klausimai
Questions
Dúvidas
Fragen
質問がありますか
Perguntas

Glue con denver may 2015 sql to nosql


Editor's Notes

  • #2: SQL to NoSQL migration. The key topics of the talk are:
    ● Why you would want to migrate to NoSQL
    ● Conceptual differences between RDBMS and NoSQL
    ● Data modeling and architectural best practices
    ● "I got it. But what exactly do I need to do?" - Practical migration steps
    Peter will explain the main advantages of NoSQL and the common use cases in which migrating to NoSQL makes sense. You will learn the key questions to ask before a migration, as well as the differences in data modeling and architectural approach that you should know. Finally, we will take a typical RDBMS-based application and migrate it to NoSQL step by step.