Is NoSQL the Future of Data Storage?
By Gary Short
Developer Express
Introduction
•   Gary Short
•   Technical Evangelist for Developer Express
•   C# MVP
•   garys@devexpress.com
•   www.garyshort.org
•   @garyshort.



                                                 2
What About You Guys?




                       3
Breadth First Look @ NoSQL




                             4
We’ll Be Doing 3 Things
1. Define NoSQL databases
2. Look at scenarios where you can use NoSQL
3. Drill into a specific use case.




                                               5
6
Where Does NoSQL Originate?
• 1998
  – Open source relational database
     •   Created by Carlo Strozzi
     •   Didn’t expose an SQL interface
     •   Called NoSQL
     •   The author later said of the newer, non-relational movement:
     •   It “departs from the relational model altogether...”
     •   “...should have been called ‘NoREL’”.



                                                             7
More Recently...
• Eric Evans reintroduced the term in 2009
  – For an event organised by Johan Oskarsson (last.fm)
     • Event to discuss open source distributed databases
• The label now covers a growing number of datastores that are
  – Open source
  – Non-relational
  – Distributed
  – (often) don’t guarantee ACID.

                                                   8
Atlanta 2009
• No:sql(east) conference
• Billed as “conference of no-rel datastores”
• Worst tag line ever
  – SELECT fun, profit FROM real_world WHERE rel=false.




                                                          9
Not Anti-RDBMS




                10
Let’s Talk a Bit About What NoSQL DBs Look Like...




                                    11
Key Attributes of NoSQL Databases
•   Don’t require fixed table schemas
•   Non-relational
•   (Usually) avoid join operations
•   Scale horizontally
    – Adding more nodes to a storage system.
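Horizontal scaling means each key has to live on one of several nodes. The sketch below is a minimal illustration, not any particular product's scheme: it assumes hypothetical node names and a simple hash-then-modulo rule to place keys.

```python
import hashlib

# Hypothetical node names, used only for illustration.
NODES = ["node-a", "node-b", "node-c"]

def node_for(key, nodes):
    """Pick a node by hashing the key and taking it modulo the node count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]

# Every key maps deterministically to exactly one node.
placement = {key: node_for(key, NODES) for key in ("user:1", "user:2", "user:3")}
```

With plain modulo placement, adding a node moves many existing keys, which is why distributed stores often use consistent hashing instead.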




                                               12
What Does the Taxonomy Look Like?




                                    13
Document Store
•   RavenDB
•   Apache Jackrabbit
•   CouchDB
•   MongoDB
•   SimpleDB
•   XML Databases
    – MarkLogic Server
    – eXist.

                                14
Document What?




                 15
Graph Storage
•   Trinity
•   AllegroGraph
•   Core Data
•   Neo4j
•   DEX
•   FlockDB.



                               16
Which Means?
• Graph consists of
  – Nodes (‘stations’ of the graph)
  – Edges (lines between them)
• FlockDB
  – Created by the Twitter folks
  – Nodes = Users
  – Edges = Nature of relationship between nodes.
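As a rough sketch of the idea (not FlockDB's actual data model), a social graph can be held as a set of labelled edges. The user names and the `neighbours` helper are hypothetical.

```python
# Each edge is (source user, nature of relationship, destination user).
edges = {
    ("alice", "follows", "bob"),
    ("bob", "follows", "alice"),
    ("alice", "follows", "carol"),
}

def neighbours(edges, source, relationship):
    """Return the nodes reachable from `source` over edges of one kind."""
    return sorted(dst for src, rel, dst in edges
                  if src == source and rel == relationship)

following = neighbours(edges, "alice", "follows")
```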


                                                    17
Social Graph




               18
Key/Value Stores
• On disk
• Cache in RAM
• Eventually Consistent
   – Weak Definition
      • “If no updates occur for a period, eventually all updates will
        propagate through the system and all replicas will be consistent”
   – Strong Definition
      • “for a given update and a given replica eventually either the
        update reaches the replica or the replica retires”
• Ordered
   – Distributed Hash Table allows lexicographical processing.
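The weak definition of eventual consistency can be sketched with two replicas that swap state once writes stop. This is an illustration under stated assumptions: the `Replica` class, the integer version numbers and the "highest version wins" rule are all hypothetical choices, not a description of any specific store.

```python
class Replica:
    """A toy replica that keeps the highest-versioned value per key."""

    def __init__(self):
        self.data = {}  # key -> (version, value)

    def write(self, key, version, value):
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def sync_from(self, other):
        # Replay everything the other replica knows; older versions are ignored.
        for key, (version, value) in other.data.items():
            self.write(key, version, value)

a, b = Replica(), Replica()
a.write("cart", 1, ["book"])
b.write("cart", 2, ["book", "pen"])
stale = a.data != b.data        # the replicas disagree for a while

a.sync_from(b)
b.sync_from(a)
converged = a.data == b.data    # with no new updates, they end up identical
```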

                                                                            19
Object Databases
•   Db4o
•   GemStone/S
•   InterSystems Caché
•   Objectivity/DB
•   ZODB.




                                20
How the &*$% do You Index That?!




                            21
Okay got it, Now Let’s Compare Some Real World Scenarios




                                  22
You Need Constant Consistency
•   You’re dealing with financial transactions
•   You’re dealing with medical records
•   You’re dealing with bonded goods
•   Best you use an RDBMS ☺.




                                                 23
You Need Horizontal Scalability
• You’re working across defined geographic regions
• You’re working with large quantities of data
• Game server sharding
• Use NoSQL
   – Something like Cassandra.




                                                     24
Up in the Clouds Baby




                        25
26
Frequently Written Rarely Read
•   Think web counters and the like
•   Every time a user comes to a page = ctr++
•   But it’s only read when the report is run
•   Use NoSQL (key-value storage/memcache).
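The write-heavy counter pattern might look like the sketch below. Here `collections.Counter` is only an in-memory stand-in for a key-value store, and the page paths and function names are made up for the example.

```python
from collections import Counter

page_hits = Counter()  # stand-in for a key-value store

def record_visit(page):
    """Called on every page view: one cheap increment, no read needed."""
    page_hits[page] += 1

def report():
    """Called rarely: read all counters, busiest page first."""
    return page_hits.most_common()

for page in ["/home", "/home", "/about", "/home"]:
    record_visit(page)
```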




                                                27
I Got Big Data!




                  28
Binary Baby!
•   If you are YouTube
•   Flickr
•   Twitpic
•   Spotify
•   NoSQL (Amazon S3).




                              29
Here Today Gone Tomorrow
• Transient data like...
  – Web Sessions
  – Locks
  – Short Term Stats
     • Shopping cart contents
• Use NoSQL (Memcache).
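Transient data is usually stored with a time-to-live. The following is a minimal sketch of that idea, not the Memcache API: `ExpiringStore` and its methods are hypothetical, and the clock is passed in as `now` so the behaviour is easy to follow.

```python
class ExpiringStore:
    """A toy key-value store whose entries vanish after a time-to-live."""

    def __init__(self):
        self._items = {}  # key -> (expires_at, value)

    def set(self, key, value, ttl, now):
        self._items[key] = (now + ttl, value)

    def get(self, key, now):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if now >= expires_at:
            del self._items[key]  # drop expired data lazily on read
            return None
        return value

store = ExpiringStore()
store.set("session:42", {"cart": ["book"]}, ttl=30, now=0)
fresh = store.get("session:42", now=10)    # still within the 30 second TTL
expired = store.get("session:42", now=31)  # past the TTL, so nothing is returned
```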



                                30
Data Replication
• Same data in two or more locations
  – Music Library
     • Web browser
     • iPhone App
• NoSQL (CouchDB).




                                       31
Hit me Baby One More Time!
• High Availability
  – High number of important transactions
     • Online gambling
     • Pay per view
        – Ahem!
     • Online Auction
• NoSQL (Cassandra – automatic clustering).



                                              32
Give me a Real World Example
• Twitter
  – The challenges
     • Needs to store many graphs
        – Who you are following
        – Who’s following you
        – Who you receive phone notifications from, etc.
     • To deliver a tweet requires rapid paging of followers
     • Heavy write load as followers are added and removed
     • Set arithmetic for @mentions (intersection of users).


                                                               33
What Did They Try?
• Relational Databases
• Key-Value storage of denormalized lists




                                            34
Did it Work?




               35
What Did They Need?
• Simplest possible thing that would work
• Allow for horizontal partitioning
• Allow write operations to
  – Arrive out of order
  – Or be processed more than once
• Failures should result in redundant work
  – Not lost work!


                                             36
The Result was FlockDB
• Stores graph data
• Not optimised for graph traversal operations
• Optimised for large adjacency lists
  – List of all edges in a graph
     • Each entry is a set of end points (or tuple if directed)
• Optimised for fast read and write
• Optimised for page-able set arithmetic.
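Page-able set arithmetic over adjacency lists can be pictured as follows. This is a sketch under assumptions, not FlockDB's implementation: the user ids, the `common_followers` helper and the cursor convention (the last id already seen) are all hypothetical.

```python
# Adjacency lists: user -> ids of that user's followers (made-up data).
followers = {
    "alice": [1, 3, 5, 7, 9, 11],
    "bob": [3, 4, 5, 9, 10, 11],
}

def common_followers(a, b, cursor=0, page_size=2):
    """Return one page of the intersection plus a cursor for the next page."""
    shared = sorted(set(followers[a]) & set(followers[b]))
    page = [uid for uid in shared if uid > cursor][:page_size]
    next_cursor = page[-1] if len(page) == page_size else None
    return page, next_cursor

first_page = common_followers("alice", "bob")
second_page = common_followers("alice", "bob", cursor=first_page[1])
```

A caller keeps requesting pages until the returned cursor is `None`.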


                                                                  37
How Does it Work?
• Stores graphs as sets of edges between nodes
• Data is partitioned by node
  – All queries can be answered by a single partition
• Write operations are idempotent
  – Can be applied multiple times without changing
    the result
• And commutative
  – Changing the order of operands doesn’t change
    the result.
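Partitioning by node can be sketched like this. It is an illustration only: the partition count, the modulo rule and the helper names are assumptions, not FlockDB's real layout.

```python
NUM_PARTITIONS = 4  # hypothetical partition count
partitions = [set() for _ in range(NUM_PARTITIONS)]

def partition_of(node_id):
    """All edges that start at a node live in that node's partition."""
    return node_id % NUM_PARTITIONS

def add_edge(src, dst):
    # set.add is idempotent: replaying the same write changes nothing.
    partitions[partition_of(src)].add((src, dst))

def following(src):
    """Answered from a single partition, chosen by the source node."""
    return sorted(dst for s, dst in partitions[partition_of(src)] if s == src)

add_edge(1, 2)
add_edge(1, 3)
add_edge(5, 2)
add_edge(1, 2)  # duplicate delivery of an earlier write
```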

                                                        38
A Little More About Idempotency
• Applied several times with no change to the
  result
• An operation ‘O’ on a set S is called idempotent
  if, for all x in S, x O x = x.
• Set union
  – A ∪ B = {x : x ∈ A or x ∈ B} (so A ∪ A = A)
• Set intersection
  – A ∩ B = {x : x ∈ A and x ∈ B} (so A ∩ A = A)
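A quick check of these definitions with Python's built-in sets. The follower names are invented for the example.

```python
followers = frozenset({"bob", "carol"})

# Union and intersection of a set with itself leave it unchanged (x O x = x).
union_with_self = followers | followers
intersection_with_self = followers & followers

# Replaying the same "add dave" write is harmless for the same reason.
write = frozenset({"dave"})
applied_once = followers | write
applied_twice = applied_once | write
```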

                                                  39
A Little More About Commutative
• Changing the order of operands doesn’t
  change the result.
  3 + 2 = 2 + 3 = 5
• Can be combined with idempotency
• Let’s look at the follow command in Twitter
   • Let X = follow person X
   • Let Y = follow person Y
   • Then 3X + 2Y = 2Y + 3X (commutative: order doesn’t matter)
   • And 2X + 3Y = 3X + 2Y (idempotent: repeating a follow changes nothing)
• Note: it’s only true for the same operation.
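The follow example can be tried out with a small sketch. The `apply` function and the operation tuples are hypothetical; they model follows as set insertions and, for contrast, unfollows as naive set removals.

```python
from itertools import permutations

def apply(ops):
    """Replay a sequence of (operation, user) writes onto an empty follow set."""
    following = set()
    for op, user in ops:
        if op == "follow":
            following.add(user)
        else:  # "unfollow"
            following.discard(user)
    return following

# Same operation only: every ordering, duplicates included, gives one result.
follows = [("follow", "X"), ("follow", "Y"), ("follow", "X")]
outcomes = {frozenset(apply(order)) for order in permutations(follows)}

# Mixing operations: the order now changes the outcome.
follow_then_unfollow = apply([("follow", "X"), ("unfollow", "X")])
unfollow_then_follow = apply([("unfollow", "X"), ("follow", "X")])
```

In this naive model the mixed case is order-dependent, which is the caveat in the note above.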
                                                 40
Commutative Writes Help Bring up Partitions
• Partition can receive write traffic immediately
• Receive dump of data in the background
• Live for read as soon as the dump is complete.
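Why this works can be sketched with plain sets: because adding edges is commutative and idempotent, it doesn't matter that live writes land before the background dump. The edge data and variable names below are made up for illustration.

```python
# Edges already held by an existing partition (the background dump).
snapshot = {("alice", "bob"), ("alice", "carol")}

# Writes that arrive while the dump is still copying; one overlaps with it.
live_writes = [("alice", "dave"), ("alice", "carol")]

# New partition: takes live writes immediately, receives the dump afterwards.
new_partition = set()
for edge in live_writes:
    new_partition.add(edge)
new_partition |= snapshot

# Reference: the "natural" order, dump first and writes second.
reference = set(snapshot)
for edge in live_writes:
    reference.add(edge)
```

Both orders end in the same state, so the new partition can serve reads as soon as the dump has been merged.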




                                                41
Performance?
• Currently stores 13 billion edges
• 20K writes / second
• 100K reads / second.




                                     42
Punchline?
• Under all the bells and whistles...
  – It’s MySQL ☺.




                                        43
So is this the Future?
• Yes!
• And No!




                                 44
What?! How Can That be?!




                           45
