History and Introduction to NoSQL over Traditional RDBMS
Types of NoSQL Databases
Introduction
• NoSQL was born out of a need to handle larger data
volumes, which forced a fundamental shift to
building large hardware platforms from
clusters of commodity servers.
• Advocates of NoSQL databases claim that they
can build systems that perform better,
scale much better, and are easier to program
with.
Why Are NoSQL Databases
Interesting?
• Application development productivity. A lot
of application development effort is spent on
mapping data between in-memory data
structures and a relational database.
• A NoSQL database may provide a data model
that better fits the application’s needs, thus
simplifying that interaction and resulting in
less code to write, debug, and evolve.
Cont’d
• Large-scale data. Organizations are finding it valuable
to capture more data and process it more quickly.
• They are finding it expensive, if even possible, to do so
with relational databases.
• The primary reason is that a relational database is
designed to run on a single machine, but it is usually
more economical to run large data and computing loads
on clusters of many smaller and cheaper machines.
• Many NoSQL databases are designed explicitly to run
on clusters, so they make a better fit for big data
scenarios.
The Value of Relational Databases
• Getting at Persistent Data – provide a “backing”
store for volatile memory
– Two areas of memory:
• Fast, small, volatile main memory
• Larger, slower, non-volatile backing store
• Since main memory is volatile, to keep data around
we write it to a backing store, commonly a disk,
which provides persistent storage.
The backing store can be:
• File system
• Database
The database allows more flexibility than a file
system in storing large amounts of data in a way
that allows an application program to get
information quickly and easily.
Concurrency
• Multiple applications accessing shared data
– Transactions
• Enterprise applications tend to have many people using
the same data at once, possibly modifying that data.
• We have to worry about coordinating interactions
between them to avoid things like double booking of
hotel rooms.
• Since enterprise applications can have lots of users and
other systems all working concurrently, there’s a lot of
room for bad things to happen.
• Relational databases help to handle this by controlling
all access to their data through transactions.
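The double-booking case can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a production design: the `bookings` table, its `UNIQUE (room, night)` constraint, and the `book_room` helper are assumptions made for the example.

```python
import sqlite3


def book_room(conn, room, night, guest):
    """Try to book a room for one night; return True on success, False if taken."""
    try:
        # "with conn" wraps the statement in a transaction:
        # it commits on success and rolls back if an exception is raised.
        with conn:
            conn.execute(
                "INSERT INTO bookings (room, night, guest) VALUES (?, ?, ?)",
                (room, night, guest),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected a second booking for the same room and night.
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookings (room TEXT, night TEXT, guest TEXT, UNIQUE (room, night))"
)

first = book_room(conn, "101", "2024-05-01", "alice")
second = book_room(conn, "101", "2024-05-01", "bob")
print(first, second)
```

The second attempt fails as a whole and leaves the table unchanged, which is the behaviour the slide attributes to transactions.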
Integration
• Enterprises require multiple applications, written by
different teams, to collaborate in order to get things
done.
• Applications often need to use the same data and
updates made through one application have to be
visible to others.
• A common way to do this is shared database
integration where multiple applications store their
data in a single database.
• Using a single database allows all the applications to
use each other’s data easily, while the database’s
concurrency control handles multiple applications in
the same way as it handles multiple users in a single
application.
Impedance Mismatch
• Impedance mismatch is a term used in computer science
to describe the problem that arises when two systems or
components that are supposed to work together have
different data models, structures, or interfaces that
make communication difficult or inefficient.
• In the context of databases, impedance mismatch refers
to the discrepancy between the object-oriented
programming (OOP) model used in application code and
the relational model used in database management
systems (DBMS).
• While OOP models are designed to represent data as
objects with properties and methods, relational models
represent data as tables with columns and rows.
• This impedance mismatch can create challenges when it
comes to mapping objects in code to tables in a database
or vice versa.
Impedance Mismatch
• The difference between the relational model
and the in-memory data structures.
• The relational data model organizes data into
a structure of tables.
– Where a tuple is a set of name-value pairs and a
relation is a set of tuples.
• Structure and relationships have to be
mapped
– Rich, in-memory structures have to be translated
to relational representation to be stored on disk
– Translation: impedance mismatch
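The translation step can be made concrete with a small sketch. The `Order` and `LineItem` classes and the two-table layout are hypothetical, chosen only to show one nested in-memory structure being split into rows and reassembled, next to the same structure stored as a single nested document.

```python
import json
import sqlite3
from dataclasses import asdict, dataclass, field


@dataclass
class LineItem:
    product: str
    quantity: int


@dataclass
class Order:
    order_id: int
    customer: str
    items: list = field(default_factory=list)


order = Order(1, "alice", [LineItem("pen", 2), LineItem("ink", 1)])

# Relational representation: the nested object is split across two tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE line_items (order_id INTEGER, product TEXT, quantity INTEGER)")
conn.execute("INSERT INTO orders VALUES (?, ?)", (order.order_id, order.customer))
conn.executemany(
    "INSERT INTO line_items VALUES (?, ?, ?)",
    [(order.order_id, item.product, item.quantity) for item in order.items],
)

# Reading it back means querying both tables and reassembling the object by hand.
row = conn.execute(
    "SELECT order_id, customer FROM orders WHERE order_id = ?", (1,)
).fetchone()
item_rows = conn.execute(
    "SELECT product, quantity FROM line_items WHERE order_id = ? ORDER BY rowid", (1,)
).fetchall()
rebuilt = Order(row[0], row[1], [LineItem(p, q) for p, q in item_rows])

# Document-style representation: the same object stored as one nested record.
document = json.dumps(asdict(order))
print(rebuilt == order, document)
```

The mapping code between `order` and the two tables is the part that object-relational mapping frameworks try to generate or hide.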
Cont’d
• Impedance mismatch has been made much
easier to deal with by the wide availability of
object-relational mapping frameworks, such as
Hibernate and iBATIS, which implement well-known
mapping patterns, but the mapping problem is
still an issue.
Application and Integration Databases
• Data integration is the process of taking data
from different sources and formats and
combining it into a single data set.
• An integration database is one in which multiple
applications, usually developed by separate teams,
store their data in a common database.
• This improves communication because all the
applications are operating on a consistent set of
persistent data.
Or
• An integration database is a database which acts
as the data store for multiple applications, and
thus integrates data across these applications.
Cont’d
Integrating many applications becomes (dramatically)
more complex than any single application needs
−Changes to the data model must be
coordinated
−Different structural and performance needs for
different applications
−Database integrity becomes an issue
Instead, treat the database as an application
database
−Single application, single development team
−Provide alternate integration mechanisms
Cont’d
• Data integration platforms are an efficient
approach to data utilization and storage.
• Rather than replicating data across locations
or environments, the integration database
serves as a single source of truth.
During the 2000s we saw a distinct shift to web
services where applications would communicate over
HTTP.
Alternate Integration Mechanism: Services
More recent push to use Web Services where applications
integrate over HTTP communications
−XML-RPC, SOAP, REST
∙Results in more flexibility for the exchanged data structure
−XML, JSON, etc.
−Text-based protocols
∙Lets application developers choose their own database
−Application databases
−Relational databases are often still an appropriate
choice
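As a rough sketch of service-based integration, the snippet below has one application keep its storage private and hand other applications a JSON document instead. The function `get_stock_response`, the field names, and the in-memory dictionary standing in for the application database are all hypothetical; a real service would return this body over HTTP.

```python
import json

# Private to the inventory application: other applications never see this layout.
_INVENTORY_ROWS = {"pen": {"qty_on_hand": 40, "warehouse_code": "W1"}}


def get_stock_response(product):
    """Hypothetical service response body: stock for one product as JSON text."""
    row = _INVENTORY_ROWS.get(product)
    if row is None:
        return json.dumps({"error": "unknown product"})
    # Only the agreed exchange structure is exposed, not the internal column names.
    return json.dumps({"product": product, "available": row["qty_on_hand"]})


# A consuming application depends only on the JSON structure it receives.
reply = json.loads(get_stock_response("pen"))
print(reply["available"])
```

Because consumers see only the exchanged document, the inventory team can change its internal storage without coordinating a schema change with every other team.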
Application Database
• An application database is a database that is
controlled and accessed by a single application.
• With an application database, only the team
using the application needs to know about the
database structure, which makes it much easier
to maintain and evolve the schema.
• Since the application team controls both the
database and the application code, the
responsibility for database integrity can be put in
the application code.
The Attack of the Clusters
The 2000s saw the web grow enormously
−Web use tracking data, social networks, activity logs,
mapping data, etc.
−Huge websites serving huge numbers of visitors
∙Handling the increase in data and traffic required more
computing resources
∙Instead of building bigger machines with more
processors, storage, and memory, use clusters of small,
commodity machines
−Cheaper, more resilient
∙But relational databases are not designed to be run on
clusters
Cont’d
• Coping with the increase in data and traffic
required more computing resources.
• To handle this kind of increase, you have two
choices:
• 1. Scaling up implies:
– bigger machines
– more processors
– more disk storage
– more memory
• Scaling up disadvantages:
– But bigger machines get more and more expensive.
– There are real limits as size increases.
Cont’d
• 2. Scaling out: use lots of small machines in a cluster:
– A cluster of small machines can use commodity
hardware and ends up being cheaper at these
kinds of scales.
– It can also be more resilient—while individual
machine failures are common, the overall cluster
can be built to keep going despite such failures,
providing high reliability.
Clustered Relational Databases
• Relational databases are not designed to be run on
clusters.
• Clustered relational databases, such as Oracle RAC
or Microsoft SQL Server, work on the concept of a
shared disk subsystem, where the cluster still has the disk
subsystem as a single point of failure.
• Relational databases could also be run as separate
servers for different sets of data, effectively sharding
the database.
• Even though this separates the load, all the sharding
has to be controlled by the application which has to
keep track of which database server to talk to for each
bit of data.
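Application-controlled sharding can be sketched as a routing function that the application calls before every database access. The server names and the hash-modulo scheme below are assumptions for illustration; the slides do not say how keys are assigned to servers.

```python
import hashlib

# Hypothetical database servers, each holding a different subset of the data.
SHARDS = ["db-server-0", "db-server-1", "db-server-2"]


def shard_for(key, shards=SHARDS):
    """Pick the server responsible for a key using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# The application must route every read and write itself.
for customer_id in ("customer:1", "customer:2", "customer:3"):
    print(customer_id, "->", shard_for(customer_id))
```

A query that needs data for several customers may touch several servers, and the application has to combine the results itself, since no single database sees all the data.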
Cont’d
• We lose any querying, referential integrity,
transactions, or consistency controls that cross shards.
• Commercial relational databases (licensed) are usually
priced on a single-server assumption, so running on a
cluster raises prices.
• This mismatch between relational databases and
clusters led some organizations to consider an
alternative route to data storage. Two companies in
particular:
– 1. Google
– 2. Amazon
• Both were running large clusters
• They were capturing huge amounts of data
The Emergence of NoSQL
• Historical note: ‘NoSQL’ was first used to name an
open-source relational database development led by
Carlo Strozzi.
• The name of that database comes from the fact that it
doesn’t use SQL as a query language.
• Instead, the database is manipulated through shell
scripts that can be combined into the usual UNIX
pipelines.
• Current use of the phrase came from a conference
meetup discussing “open-source, distributed,
nonrelational databases.”
Cont’d
• Most NoSQL databases are driven by the need to run
on clusters.
• Relational databases use ACID transactions to handle
consistency across the whole database.
• This inherently clashes with a cluster environment, so
NoSQL databases offer a range of options for
consistency and distribution.
• Not all NoSQL databases are strongly oriented
towards running on clusters.
• Graph databases are one style of NoSQL database
that uses a distribution model similar to relational
databases but offers a different data model that makes
it better at handling data with complex relationships.
Cont’d
• NoSQL databases operate without a schema,
allowing you to freely add fields to database
records without having to define any changes
in structure first.
• Two primary reasons for considering NoSQL:
– 1) To handle data access with sizes and
performance that demand a cluster
– 2) To improve the productivity of application
development by using a more convenient data
interaction style.
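Schemaless operation can be illustrated with plain Python dictionaries standing in for database records. The list `records` and the `loyalty_tier` field are hypothetical; the point is only that a new field appears on one record without any change in structure being declared first.

```python
# Stand-in for a schemaless collection: each record is just a dictionary.
records = []

records.append({"_id": 1, "name": "alice"})
# A later record carries an extra field; no schema change is declared beforehand.
records.append({"_id": 2, "name": "bob", "loyalty_tier": "gold"})


def tiers(rs):
    """Read a field that only some records have, supplying a default for the rest."""
    return [r.get("loyalty_tier", "none") for r in rs]


print(tiers(records))
```

The flexibility has a cost: code that reads the records has to cope with fields that may be missing, as `tiers` does with its default value.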
Cont’d
• A NoSQL database provides a mechanism for
storage and retrieval of data. NoSQL databases
are used in real-time web applications and big
data, and their use is increasing over time.
• Many NoSQL stores compromise consistency
in favor of availability, speed, and partition
tolerance.
Advantages of NoSQL
• 1. High Scalability
– NoSQL databases use sharding for horizontal
scaling.
– They can handle huge amounts of data: as the
data grows, a NoSQL database scales out to
handle it efficiently.
• 2. High Availability
– The automatic replication feature in NoSQL
databases makes them highly available.
Disadvantages of NoSQL
1. Narrow Focus: NoSQL databases are mainly designed for storage
and provide very little functionality beyond it.
2. Open Source: NoSQL databases are mostly open source, and any
two database systems are likely to be unequal.
3. Management Challenge: Big data management in
NoSQL is much more complex than in a relational
database.
4. GUI is not available: GUI tools for accessing the
database are not widely available in the market.
5. Backup: Backup is a significant weak point for some NoSQL
databases like MongoDB.
6. Large Document Size: Storing data in JSON format increases
the document size.
When should NoSQL be used
• When a huge amount of data needs to be stored
and retrieved.
• The relationships between the data you store are not
that important.
• The data changes over time and is not
structured.
• Support for constraints and joins is not required at
the database level.
• The data is growing continuously and you need to
scale the database regularly to handle it.
Characteristics of NoSQL Databases
They do not use SQL and the relational model
• Some do have query languages which are similar to SQL to
be easy to learn and use.
∙ Mostly open-source projects
∙Designed to be distributed (clustered)
−No expectation of ACID properties
−Range of options for consistency and distribution
∙Schema free
−Freely add fields to records without having to define any
changes in structure first
−Non-uniform data and custom fields
∙No formal definition of NoSQL: an ill-defined set of mostly
open-source databases, mostly developed in the early 21st century,
and mostly not using SQL
Polyglot Persistence
• Polyglot persistence is a conceptual term that refers to the use of
different data storage approaches and technologies to support the
unique storage requirements of various data types that live within
enterprise applications.
• Polyglot persistence refers to using different data storage
technologies to handle varying data storage needs.
• Polyglot Persistence is a fancy term to mean that when storing data,
it is best to use multiple data storage technologies, chosen based
upon the way data is being used by individual applications or
components of a single application.
• Different kinds of data are best dealt with by different data stores. In
short, it means picking the right tool for the right use case.
Example
• Looking at a Polyglot Persistence example, an
e-commerce platform will deal with many
types of data (e.g. shopping cart, inventory,
completed orders, etc.). Instead of trying to
store all this data in one database, which
would require a lot of data conversion to make
the format of the data all the same, store the
data in the database best suited for that type
of data. So the e-commerce platform might
use a different data store for each type of data.
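One way to picture the e-commerce example is the sketch below. The assignment of data types to stores (shopping carts in a key-value store, completed orders in a document store, inventory in a relational database) is an assumption for illustration, and `KeyValueStore` and `DocumentStore` are in-memory stand-ins, not real database clients.

```python
import sqlite3


class KeyValueStore:
    """In-memory stand-in for a key-value database."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class DocumentStore:
    """In-memory stand-in for a document database."""

    def __init__(self):
        self._docs = []

    def save(self, doc):
        self._docs.append(dict(doc))

    def find(self, **criteria):
        return [d for d in self._docs
                if all(d.get(k) == v for k, v in criteria.items())]


# Each kind of data goes to the store assumed to suit it best.
carts = KeyValueStore()                  # shopping carts: lookup by session key
orders = DocumentStore()                 # completed orders: nested documents
inventory = sqlite3.connect(":memory:")  # inventory: relational, transactional
inventory.execute("CREATE TABLE stock (product TEXT PRIMARY KEY, quantity INTEGER)")
inventory.execute("INSERT INTO stock VALUES ('pen', 40)")

carts.put("session-1", ["pen"])
orders.save({"order_id": 1, "customer": "alice", "items": ["pen"]})
with inventory:  # commit the stock change as one transaction
    inventory.execute("UPDATE stock SET quantity = quantity - 1 WHERE product = 'pen'")

remaining = inventory.execute(
    "SELECT quantity FROM stock WHERE product = 'pen'"
).fetchone()[0]
print(carts.get("session-1"), orders.find(customer="alice"), remaining)
```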

