The future is:

    NoSQL Databases
    Polyglot Persistence

    a note on the future of data storage in the enterprise, written
    primarily for those involved in the management of application
    development.

    Martin Fowler
    Pramod Sadalage

© Martin Fowler and Pramod Sadalage   Rendered: February 8, 2012 11:26

SQL has Ruled for two decades

    Store persistent data
      Storing large amounts of data on disk, while allowing applications
      to grab the bits they need through queries.

    Application Integration
      Many applications in an enterprise need to share information. By
      getting all applications to use the database, we ensure all these
      applications have consistent, up-to-date data.

    Mostly Standard
      The relational model is widely used and understood. Interaction
      with the database is done with SQL, which is a (mostly) standard
      language. This degree of standardization is enough to keep things
      familiar so people don’t need to learn new things.

    Concurrency Control
      Many users access the same information at the same time. Handling
      this concurrency is difficult to program, so databases provide
      transactions to help ensure consistent interaction.

    Reporting
      SQL’s simple data model and standardization have made it a
      foundation for many reporting tools.

    All this supported by Big Database Vendors and the separation of the DBA profession.

but SQL’s dominance is cracking

    Relational databases are designed to run on a single machine, so to
    scale, you need to buy a bigger machine.

    But it’s cheaper and more effective to scale horizontally by buying
    lots of machines.

    The machines in these large clusters are individually unreliable, but
    the overall cluster keeps working even as machines die - so the
    overall cluster is reliable.

    The “cloud” is exactly this kind of cluster, which means relational
    databases don’t play well with the cloud.

    The rise of web services provides an effective alternative to shared
    databases for application integration, making it easier for different
    applications to choose their own data storage.

    Google and Amazon were both early adopters of large clusters, and both
    eschewed relational databases.

        Google    Bigtable
        Amazon    Dynamo

    Their efforts have been a large inspiration to the NoSQL community.

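The claim that unreliable machines can add up to a reliable cluster can be illustrated with a little arithmetic. The sketch below is not from the slides: it assumes each record is copied to several machines and that machines fail independently, which real clusters only approximate. The function name and the 5% figure are made up for illustration.

```python
def p_all_replicas_down(p_node_down: float, replicas: int) -> float:
    """Chance that every copy of a record is unavailable at the same moment.

    Assumes node failures are independent (an illustrative assumption).
    """
    if not 0.0 <= p_node_down <= 1.0:
        raise ValueError("p_node_down must be a probability between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return p_node_down ** replicas


if __name__ == "__main__":
    # Hypothetical figure: each machine is down 5% of the time.
    for replicas in (1, 2, 3):
        available = 1.0 - p_all_replicas_down(0.05, replicas)
        print(f"{replicas} replica(s): record reachable {available:.6f} of the time")
```

Under these assumptions each extra copy multiplies the chance of losing access by the per-machine failure rate, which is why adding cheap machines can beat buying one very reliable one.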
so now we have NoSQL databases

    There is no standard definition of what NoSQL means. The term began
    with a workshop organized in 2009, but there is much argument about
    what databases can truly be called NoSQL.

    But while there is no formal definition, there are some common
    characteristics of NoSQL databases:
       they don’t use the relational data model, and thus don’t use the SQL language
       they tend to be designed to run on a cluster
       they tend to be Open Source
       they don’t have a fixed schema, allowing you to store any data in any record

    examples include
       [Figure: product logos, not preserved in the extracted text]

    We should also remember Google’s Bigtable and Amazon’s SimpleDB. While
    these are tied to their host’s cloud service, they certainly fit the
    general operating characteristics.

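To make "no fixed schema" concrete, here is a minimal sketch in plain Python. It does not use any real database: a list of dictionaries stands in for a schemaless store, and the names `collection`, `insert` and `fields_in_use` are hypothetical.

```python
# A list of dicts stands in for a schemaless store (illustrative only).
collection: list = []


def insert(record: dict) -> None:
    """Store a record as-is; nothing checks it against a declared schema."""
    collection.append(dict(record))


def fields_in_use(records: list) -> list:
    """Return every field name that appears in at least one record."""
    seen = set()
    for record in records:
        seen.update(record)
    return sorted(seen)


# Two records in the same collection carry different fields.
insert({"id": 1, "name": "Ada", "email": "ada@example.com"})
insert({"id": 2, "name": "Lin", "phones": ["555-0100", "555-0101"]})

if __name__ == "__main__":
    print(fields_in_use(collection))
    # The application now has to cope with fields that may be absent.
    for record in collection:
        print(record["name"], record.get("email", "(no email on file)"))
```

The flexibility has a cost: the knowledge of which fields exist moves out of the database and into the application code that reads the records.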
so this means we can

    Reduce Development Drag
      A lot of effort in application development is tied up in working
      with relational databases. Although Object/Relational Mapping
      frameworks have eased the load, the database is still a significant
      source of developer hours. Often we can reduce this effort by
      choosing an alternative database that’s more suited to the problem
      domain.

      We often come across projects that are using relational databases
      because they are the default, not because they are the best choice
      for the job. Often they are paying a cost, in developer time and
      execution performance, for features they do not use.

    Embrace Large Scale
      The large scale clusters that we can support with NoSQL databases
      allow us to store larger datasets (people are talking about
      petabytes these days) to process large amounts of analytic data.

      Alternative data models also allow us to carry out many tasks more
      efficiently, allowing us to tackle problems that we would have
      balked at when using only relational databases.

    Guardian
      New functionality uses Mongo rather than relational DB. They found
      Mongo’s document data model significantly easier to interact with
      for their kind of application.

    DNC
      Searching the information of 300 million voters for one person, with
      addresses, emails, and phones, is tough with a relational data
      store. MongoDB was used to store the documents about the person.

    Danish Health Care
      Centralized record of drug prescriptions. Currently held in MySQL
      databases, but concerned about scale for both response time and
      availability. Migrated data to Riak.

    McLaren
      Streaming of telemetric data into MongoDB for later analysis. Orders
      of magnitude faster than relational (SQL Server).

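The development drag described above can be pictured with a small sketch. It is not taken from any of the case studies: plain Python lists and dictionaries stand in for relational tables and for a document store, and all names and sample values are invented. It shows the same person record held as rows in four tables and as one nested document.

```python
# Relational-style layout: one "table" (list of rows) per kind of fact.
people = [{"person_id": 1, "name": "Example Voter"}]
addresses = [{"person_id": 1, "address": "1 Example Street"}]
emails = [{"person_id": 1, "email": "voter@example.com"}]
phones = [
    {"person_id": 1, "phone": "555-0100"},
    {"person_id": 1, "phone": "555-0101"},
]

# Document-style layout: everything about one person lives under one key.
documents = {
    1: {
        "name": "Example Voter",
        "addresses": ["1 Example Street"],
        "emails": ["voter@example.com"],
        "phones": ["555-0100", "555-0101"],
    }
}


def load_person_relational(person_id: int) -> dict:
    """Reassemble one person by scanning every table (a stand-in for joins)."""
    person = next(p for p in people if p["person_id"] == person_id)
    return {
        "name": person["name"],
        "addresses": [r["address"] for r in addresses if r["person_id"] == person_id],
        "emails": [r["email"] for r in emails if r["person_id"] == person_id],
        "phones": [r["phone"] for r in phones if r["person_id"] == person_id],
    }


def load_person_document(person_id: int) -> dict:
    """Fetch the whole person with a single key lookup."""
    return documents[person_id]


if __name__ == "__main__":
    print(load_person_relational(1) == load_person_document(1))
```

Both functions return the same data; the difference is how much assembly code the application has to carry. The tabular layout pays off when the data must be picked apart and recombined in different ways.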
but this does not mean relational is dead

    the relational model is still relevant
      The tabular model is suitable for many kinds of data, particularly
      when you need to pick apart data and re-assemble it in different
      ways for different purposes.

    ACID transactions
      In order to run effectively on a cluster, most NoSQL databases have
      limited transactional capability. Often this is enough… but not
      always.

    Tools
      The long dominance of SQL means that many tools have been written to
      work with SQL databases. Tooling for alternative datastores is much
      more limited.

    Familiarity
      NoSQL systems are still new, so people aren’t familiar with using
      them. So we shouldn’t be using them on utility projects where their
      benefits would have less impact.

this leads us to a world of

    Polyglot Persistence
      using multiple data storage technologies, chosen based upon the way
      data is being used by individual applications. Why store binary
      images in a relational database, when there are better storage
      systems?

      Polyglot persistence will occur over the enterprise as different
      applications use different data storage technologies. It will also
      occur within a single application as different parts of an
      application’s data store have different access characteristics.

      http://martinfowler.com/bliki/PolyglotPersistence.html

what might Polyglot Persistence look like?

    Speculative Retailers Web Application

    Data                  Store        Why
    User sessions         Redis        Rapid access for reads and writes. No need to be durable.
    Financial Data        RDBMS        Needs transactional updates. Tabular structure fits data.
    Shopping Cart         Riak         Needs high availability across multiple locations. Can merge inconsistent writes.
    Recommendations       Neo4J        Rapidly traverse links between friends, product purchases, and ratings.
    Product Catalog       MongoDB      Lots of reads, infrequent writes. Products make natural aggregate.
    Reporting             RDBMS        SQL interfaces well with reporting tools.
    Analytics             Cassandra    Large-scale analytics on large cluster.
    User activity logs    Cassandra    High volume of writes on multiple nodes.

    This is a very hypothetical example; we would not make technology
    recommendations without more contextual information.

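The hypothetical retailer example can also be written down as data, which makes it easy to ask which stores an application depends on. The sketch below is only a restatement of the slide in Python. The names `StoreChoice`, `STORE_FOR`, `stores_in_use` and `data_kinds_on` are invented for illustration, and no database client is involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreChoice:
    store: str   # storage technology named on the slide
    reason: str  # access characteristic that motivates the choice


# Mapping copied from the speculative retailer example (hypothetical, as the slide says).
STORE_FOR = {
    "user_sessions": StoreChoice("Redis", "rapid reads and writes; no need to be durable"),
    "financial_data": StoreChoice("RDBMS", "transactional updates; tabular structure fits"),
    "shopping_cart": StoreChoice("Riak", "high availability; can merge inconsistent writes"),
    "recommendations": StoreChoice("Neo4J", "rapid traversal of links between entities"),
    "product_catalog": StoreChoice("MongoDB", "many reads, few writes; natural aggregate"),
    "reporting": StoreChoice("RDBMS", "SQL interfaces well with reporting tools"),
    "analytics": StoreChoice("Cassandra", "large-scale analytics on a large cluster"),
    "user_activity_logs": StoreChoice("Cassandra", "high volume of writes on multiple nodes"),
}


def stores_in_use() -> list:
    """Distinct storage technologies the application would have to operate."""
    return sorted({choice.store for choice in STORE_FOR.values()})


def data_kinds_on(store: str) -> list:
    """Kinds of data that share a given storage technology."""
    return sorted(kind for kind, choice in STORE_FOR.items() if choice.store == store)


if __name__ == "__main__":
    print(stores_in_use())
    print(data_kinds_on("Cassandra"))
```

Counting the distinct stores is a quick way to see the operational cost of the design: eight kinds of data here end up on six technologies that someone has to run and understand.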
polyglot persistence provides lots of new opportunities for enterprises

    i.e. problems

    Decisions
      We have to decide what data storage technology to use, rather than
      just go with relational.

    Organizational Change
      How will our data groups react to this new technology?

    Immaturity
      NoSQL tools are still young, and full of the rough edges that new
      tools have.

      Furthermore, since we don’t have much experience with them, we don’t
      know how to use them well, what the good patterns are, and what
      gotchas are lying in wait.

    Dealing with Eventual Consistency Paradigm
      How will different stakeholders in the enterprise data deal with
      data that could be stale, and how do you enforce rules to sync data
      across systems?

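One way to picture the eventual consistency problem is a shopping cart that was updated on two replicas while they could not talk to each other. The sketch below shows one possible reconciliation rule, chosen purely for illustration: keep every item seen on either side and take the larger quantity. It is not the rule any particular database applies, and `merge_carts` is a hypothetical name.

```python
def merge_carts(a: dict, b: dict) -> dict:
    """Merge two divergent copies of a cart (item name -> quantity).

    Illustrative rule: union of items, larger quantity wins.
    """
    merged = dict(a)
    for item, quantity in b.items():
        merged[item] = max(merged.get(item, 0), quantity)
    return merged


if __name__ == "__main__":
    replica_one = {"book": 1, "pen": 2}
    replica_two = {"pen": 3, "mug": 1}
    print(merge_carts(replica_one, replica_two))

    # A removal made on one replica is undone by the merge, because the other
    # replica still lists the item. The business has to decide if that is acceptable.
    after_removal = {"pen": 2}  # "book" was removed on this replica
    print(merge_carts(after_removal, replica_one))
```

The rule gives the same result whichever replica is merged into which, so the copies converge. The reappearing item shows why the merge policy is a business decision as much as a technical one.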
what kinds of projects are candidates for polyglot persistence?

    Strategic  and  (rapid time to market  and/or  data intensive)

    Strategic
      Most software projects are utility projects, i.e. they aren’t
      central to the competitive advantage of the company. Utility
      projects should not take on the risk and staffing demands that
      polyglot persistence brings as the potential benefits are not there.

    rapid time to market
      If you need to get to market quickly, then you need to maximize
      productivity of your development team. If appropriate, polyglot
      persistence can remove significant drag.

    data intensive
      Data intensiveness can come in various forms:
         lots of data
         high availability
         lots of traffic: reads or writes
         complex data relationships

      Any of these may suggest non-relational storage, but it’s the exact
      nature of the data interaction that will suggest the best of the
      many alternatives.

for more information…

    On the Web
      We are both active writers on our websites. Martin writes at
      http://martinfowler.com and Pramod at http://www.sadalage.com/.

    Forthcoming Book
      We are currently working on an introductory book to NoSQL databases,
      to be titled: NoSQL Distilled (see
      http://martinfowler.com/bliki/NosqlDistilled.html)

    Consulting and Delivery
      ThoughtWorks has carried out several projects delivering production
      systems using NoSQL technologies. To see if NoSQL is a good fit for
      your needs, and how we can help your delivery, contact your local
      ThoughtWorks office, which you can find at http://thoughtworks.com
