2. Dept of CSE
Course Objectives
CLO 1. Recognize and describe the four types of NoSQL databases (document-oriented,
key-value, column-oriented, and graph databases) that are useful for diverse applications.
CLO 2. Apply performance tuning on column-oriented NoSQL databases and document-
oriented NoSQL databases.
CLO 3. Differentiate the detailed architecture of column-oriented NoSQL databases,
document databases, and graph databases, and relate usage of processor, memory, storage,
and file system commands.
CLO 4. Evaluate several applications for location-based services and recommendation
services.
CLO 5. Devise an application using the components of NoSQL.
Course Outcomes
CO1. Demonstrate an understanding of the detailed architecture of column-oriented NoSQL
databases, document databases, and graph databases.
CO2. Use the concepts pertaining to all the types of databases.
CO3. Analyze the structural models of NoSQL.
CO4. Develop various applications using NoSQL databases.
Assessment Details (both CIE and SEE)
Continuous Internal Evaluation:
Three unit tests of 20 marks each (duration 1 hour): 60 marks
Two assignments of 10 marks each: 20 marks
Group discussion, seminar, or quiz (any one of the three, suitably planned): 20 marks
CIE: 100 marks, scaled to 50 marks
SEE: 100 marks, scaled to 50 marks
Textbooks
1. Sadalage, P. & Fowler, M., NoSQL Distilled: A Brief Guide to the Emerging World of
Polyglot Persistence, Pearson Addison-Wesley, 2012
Reference Books
1. Dan Sullivan, "NoSQL for Mere Mortals", 1st Edition, Pearson Education India, 2015.
(ISBN-13: 978-9332557338)
2. Dan McCreary and Ann Kelly, "Making Sense of NoSQL: A Guide for Managers and the
Rest of Us", 1st Edition, Manning Publications/Dreamtech Press, 2013.
(ISBN-13: 978-9351192022)
3. Kristina Chodorow, "MongoDB: The Definitive Guide: Powerful and Scalable Data
Storage", 2nd Edition, O'Reilly Publications, 2013. (ISBN-13: 978-9351102694)
Weblinks and Video Lectures (e-Resources):
1. https://www.geeksforgeeks.org/introduction-to-nosql/ (and related links in the page)
2. https://www.youtube.com/watch?v=0buKQHokLK8 (How do NoSQL databases work? Simply
explained)
3. https://www.techtarget.com/searchdatamanagement/definition/NoSQL-Not-Only-SQL
(What is NoSQL and How do NoSQL databases work)
4. https://www.mongodb.com/nosql-explained (What is NoSQL)
5. https://onlinecourses.nptel.ac.in/noc20-cs92/preview (preview of Big Data course;
contains NoSQL)
Module - 4
Document Databases, What Is a Document Database?, Features,
Consistency, Transactions, Availability, Query Features, Scaling,
Suitable Use Cases, Event Logging, Content Management Systems,
Blogging Platforms, Web Analytics or Real-Time Analytics,
E-Commerce Applications, When Not to Use, Complex Transactions
Spanning Different Operations, Queries against Varying Aggregate
Structure.
Document Databases
Databases come in several types: relational, object-oriented, key-value,
document, columnar, and graph databases.
Document Database
A document database is a database that stores its data as documents.
Document databases are considered to be non-relational
(or NoSQL) databases. Instead of storing data in fixed rows and
columns, document databases use flexible documents.
Document databases are among the most popular alternatives to tabular,
relational databases.
Example: RDBMS
RDBMS: The database is structured. Data is stored in tables
made up of rows and columns, and each table defines its own structure. Multiple
tables together make up the database.
Ex: Student table, Faculty table, Course table, etc.
In a document database, by contrast, all of this data is organized as documents.
Data that an RDBMS would spread across several tables can be stored together in a
document, because the document model is flexible.
What are documents?
• A document is a record in a document database. A document typically stores
information about one object and any of its related metadata.
• Documents store data in field-value pairs. The values can be a variety of types and
structures, including strings, numbers, dates, arrays or objects.
• Documents can be stored in formats like JSON, BSON and XML.
• Below is a JSON document that stores information about a user named Tom.
{"_id": 1, "First_name": "Tom", "email": "tom@example.com", "cell": "765-55-5555",
 "likes": ["fashion", "spas", "shopping"],
 "business": [
   {"name": "entertainment 1080", "partner": "Jean", "status": "Bankrupt",
    "date_founded": {"$date": "2012-05-19T04:00:00Z"}},
   {"name": "Swag for Tweens", "date_founded": {"$date": "2012-11-01T04:00:00Z"}}
 ]}
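As a rough illustration of how an application might read such a document, the sketch below parses Tom's JSON document with Python's standard json module and pulls out a scalar, an array, and a nested value. The variable names are chosen here for illustration only.

```python
import json

# The user document from the text, as a JSON string.
tom_json = """
{"_id": 1, "First_name": "Tom", "email": "tom@example.com", "cell": "765-55-5555",
 "likes": ["fashion", "spas", "shopping"],
 "business": [
   {"name": "entertainment 1080", "partner": "Jean", "status": "Bankrupt",
    "date_founded": {"$date": "2012-05-19T04:00:00Z"}},
   {"name": "Swag for Tweens", "date_founded": {"$date": "2012-11-01T04:00:00Z"}}
 ]}
"""

tom = json.loads(tom_json)

# Scalar, array, and nested-object values live side by side in one document.
first_name = tom["First_name"]
likes = tom["likes"]
business_names = [b["name"] for b in tom["business"]]
first_founded = tom["business"][0]["date_founded"]["$date"]
```

Everything about the user, including the embedded list of businesses, is reached from the single parsed document.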
• Documents are the main concept in document databases.
• The database stores and retrieves documents, which can be XML, JSON, BSON, and
so on.
• These documents are self-describing, hierarchical tree data structures which can
consist of maps, collections, and scalar values.
• The documents stored are similar to each other but do not have to be exactly the
same.
• Document databases store documents in the value part of the key-value store;
think about document databases as key-value stores where the value is
examinable.
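The last point can be illustrated with a small sketch that uses plain Python dictionaries as stand-ins for the two kinds of store. The keys, the sample data, and the helper find_by_field are hypothetical and are not part of any real database API.

```python
# A plain key-value store: the store only sees opaque values (here, bytes).
kv_store = {
    "user:1": b'{"firstname": "Tom", "lastcity": "Boston"}',
    "user:2": b'{"firstname": "Pramod", "lastcity": "Chicago"}',
}

# A document store: the value is a structured document the store can look inside.
doc_store = {
    "user:1": {"firstname": "Tom", "lastcity": "Boston"},
    "user:2": {"firstname": "Pramod", "lastcity": "Chicago"},
}


def find_by_field(store, field, value):
    """Return the keys of all documents whose `field` equals `value`."""
    return [key for key, doc in store.items() if doc.get(field) == value]


# Lookup by key works in both kinds of store ...
by_key = doc_store["user:2"]
# ... but querying on content only makes sense when the value is examinable.
chicago_keys = find_by_field(doc_store, "lastcity", "Chicago")
```

With kv_store, the same question could only be answered by the application fetching and decoding every value itself.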
Properties
A document can be thought of as
a row in a traditional RDBMS. Let's
look at another document:
{
"firstname": "Pramod",
"citiesvisited": [ "Chicago", "London", "Pune", "Bangalore" ],
"addresses": [
{ "state": "AK",
"city": "DILLINGHAM",
"type": "R"
},
{ "state": "MH",
"city": "PUNE",
"type": "R" }
],
"lastcity": "Chicago"
}
Looking at the documents, we can see that they are similar but differ in attribute names.
This is allowed in document databases. The schema of the data can
differ across documents, but these documents can still belong to the
same collection, unlike an RDBMS where every row in a table has to
follow the same schema. We represent the list of citiesvisited as an array,
and the list of addresses as a list of documents embedded inside the main
document. Embedding child documents as sub-objects inside
documents provides easy access and better performance.
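A minimal sketch of these two ideas, using a Python list as a stand-in for a collection: the first document is a simplified, hypothetical one, and the second is the Pramod document from the text.

```python
# Two documents in the same (hypothetical) collection with different attributes.
people = [
    {"firstname": "Tom", "likes": ["fashion", "spas", "shopping"]},
    {
        "firstname": "Pramod",
        "citiesvisited": ["Chicago", "London", "Pune", "Bangalore"],
        "addresses": [
            {"state": "AK", "city": "DILLINGHAM", "type": "R"},
            {"state": "MH", "city": "PUNE", "type": "R"},
        ],
        "lastcity": "Chicago",
    },
]

# Attribute names shared by every document vs. all names used anywhere.
field_sets = [set(doc) for doc in people]
common_fields = set.intersection(*field_sets)
all_fields = set.union(*field_sets)

# Embedded child documents are read directly from the parent, with no join.
# .get() with a default handles documents that lack the attribute.
cities_by_person = {
    doc["firstname"]: [addr["city"] for addr in doc.get("addresses", [])]
    for doc in people
}
```

Only firstname is common to both documents, yet both sit in the same collection, and the addresses come back with the parent document in a single read.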
Some of the popular document databases we have seen are
• MongoDB,
• CouchDB,
• Terrastore,
• OrientDB,
• RavenDB,
• Lotus Notes
Document databases can be compared with an RDBMS by setting Oracle terminology against
MongoDB terminology.
The _id is a special field that is found on all documents in MongoDB, just like ROWID in
Oracle. In MongoDB, _id can be assigned by the user, as long as it is unique.
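The uniqueness rule for _id can be sketched with a toy in-memory collection. The Collection class below is hypothetical and is not MongoDB's API; it uses a UUID as a stand-in for the identifier that MongoDB generates itself when a document is inserted without an _id.

```python
import uuid


class Collection:
    """Toy in-memory collection that enforces a unique _id per document."""

    def __init__(self):
        self._docs = {}

    def insert(self, doc):
        doc = dict(doc)  # copy so the caller's dict is not modified
        # Generate an identifier when the user did not assign one.
        doc.setdefault("_id", uuid.uuid4().hex)
        if doc["_id"] in self._docs:
            raise ValueError(f"duplicate _id: {doc['_id']!r}")
        self._docs[doc["_id"]] = doc
        return doc["_id"]


users = Collection()
tom_id = users.insert({"_id": 1, "First_name": "Tom"})   # user-assigned _id
auto_id = users.insert({"First_name": "Donna"})          # generated _id

try:
    users.insert({"_id": 1, "First_name": "Someone else"})
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```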
Collections
• A collection is a group of documents. Collections typically store documents that have
similar contents.
• Not all documents in a collection are required to have the same fields, because
document databases have a flexible schema. Note that some document databases
provide schema validation, so the schema can optionally be locked down when
needed.
• Continuing with the example above, the document with information about Tom
could be stored in a collection in order to store information about users. For
example, the document below that stores information about Donna could be added
to the users collection.
{"_id": 2, "First_name": "Donna", "email": "donna@example.com", "Spouse": "Joe",
 "likes": ["spas", "shopping", "live tweeting"],
 "business": [{"name": "Castle Reality", "status": "Thriving",
               "date_founded": {"$date": "2013-11-21T04:00:00Z"}}]}
Note that the document for Donna does not contain the same fields as the document
for Tom. The users collection is leveraging a flexible schema to store the information
that exists for each user.
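The following sketch shows the flexible schema of the users collection and one way optional validation could work. The validate helper and the REQUIRED rule are assumptions made for illustration; real document databases express validation rules in their own syntax. The business field is left out of both documents for brevity.

```python
def validate(doc, required_fields):
    """Return the required fields that are missing from the document."""
    return [field for field in required_fields if field not in doc]


users = [
    {"_id": 1, "First_name": "Tom", "email": "tom@example.com",
     "cell": "765-55-5555", "likes": ["fashion", "spas", "shopping"]},
    {"_id": 2, "First_name": "Donna", "email": "donna@example.com",
     "Spouse": "Joe", "likes": ["spas", "shopping", "live tweeting"]},
]

# Flexible schema: each document keeps only the fields that exist for that user.
only_tom = set(users[0]) - set(users[1])
only_donna = set(users[1]) - set(users[0])

# Optional validation: a hypothetical rule that every user needs these fields.
REQUIRED = ["_id", "First_name", "email"]
missing = {doc["_id"]: validate(doc, REQUIRED) for doc in users}
incomplete = validate({"_id": 3, "First_name": "Sam"}, REQUIRED)
```

Both stored documents pass the rule even though their fields differ, while a document without an email would be flagged.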
Features of Document-Based Databases
The document data model is a semi-structured data model: a record and the data
associated with it are stored together in a single document, so the data is not
completely unstructured.
• Document Type Model: Data is stored in documents rather than tables or graphs,
which makes it easy to map to objects in many programming languages.
• Flexible Schema: The schema is flexible; not all documents in a collection need
to have the same fields.
• Distributed and Resilient: Document databases are designed to distribute data
across machines, which is what enables horizontal scaling.
• Manageable Query Language: The query language lets developers perform CRUD
(Create, Read, Update, Delete) operations on the data model.
Manageable Query Language
The SQL to select all orders would be:
SELECT * FROM order
The equivalent query in Mongo shell would be:
db.order.find()
Selecting the orders for a single customerId of 883c2c5b4e5b would be:
SELECT * FROM order WHERE customerId = '883c2c5b4e5b'
The equivalent query in Mongo to get all orders for a single customerId of 883c2c5b4e5b:
db.order.find({"customerId":"883c2c5b4e5b"})
Selecting orderId and orderDate for one customer in SQL would be:
SELECT orderId, orderDate FROM order WHERE customerId = '883c2c5b4e5b'
and the equivalent in Mongo would be:
db.order.find({customerId:"883c2c5b4e5b"},{orderId:1,orderDate:1})
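To make the filter and projection arguments concrete, here is a minimal Python sketch that mimics the three queries over an in-memory list. The find function and the sample orders are hypothetical; the sketch supports only exact-match filters and ignores details such as MongoDB returning _id by default.

```python
def find(collection, query=None, projection=None):
    """Mimic the equality filter and field projection used in the queries above."""
    query = query or {}
    results = []
    for doc in collection:
        # Keep the document only if every queried field matches exactly.
        if all(doc.get(field) == value for field, value in query.items()):
            if projection:
                # Return only the requested fields.
                doc = {f: doc[f] for f in projection if f in doc}
            results.append(doc)
    return results


# Hypothetical order documents for illustration.
orders = [
    {"orderId": 99, "customerId": "883c2c5b4e5b", "orderDate": "2011-04-23"},
    {"orderId": 100, "customerId": "a1b2c3d4e5f6", "orderDate": "2011-04-24"},
    {"orderId": 101, "customerId": "883c2c5b4e5b", "orderDate": "2011-05-02"},
]

all_orders = find(orders)  # like db.order.find()
customer_orders = find(orders, {"customerId": "883c2c5b4e5b"})
order_summaries = find(orders, {"customerId": "883c2c5b4e5b"},
                       {"orderId": 1, "orderDate": 1})
```

The first argument plays the role of the SQL WHERE clause and the second plays the role of the SELECT column list.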
Advantages:
Schema-less: Document databases are well suited to retaining existing data at massive volumes
because they impose no fixed format or structure on the stored data.
Faster creation and maintenance of documents: Creating a document is very simple, and
it requires almost no maintenance afterwards.
Open formats: Documents are built with simple, open formats such as XML, JSON, and their variants.
Built-in versioning: As documents grow in size they can also grow in complexity;
built-in versioning helps reduce conflicts.
Disadvantages:
Weak Atomicity: Support for multi-document ACID transactions is traditionally weak or absent
(MongoDB, for example, added it only in version 4.0). Without it, a change involving two
collections requires two separate queries, one for each collection, which breaks
atomicity requirements.
Consistency Check Limitations: Nothing stops documents and collections from existing without
a link to a related collection (for example, an author collection). Searching for such
unconnected documents is possible, but doing so can degrade database performance.
Security: Many web applications lack adequate security, which can lead to leakage
of sensitive data, so attention must be paid to web application
vulnerabilities.
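The weak-atomicity point can be illustrated with a small simulation. The two collections, the transfer_out function, and the simulated crash are all hypothetical; the sketch only shows what can go wrong when two writes are not wrapped in a transaction.

```python
# Two hypothetical collections that must be updated together.
accounts = {"tom": {"balance": 100}}
ledger = []


def transfer_out(user, amount, fail_between=False):
    """Update two collections with two separate writes (no transaction)."""
    accounts[user]["balance"] -= amount              # write 1: accounts collection
    if fail_between:
        raise RuntimeError("crash between the two writes")
    ledger.append({"user": user, "amount": amount})  # write 2: ledger collection


transfer_out("tom", 30)
try:
    transfer_out("tom", 50, fail_between=True)
except RuntimeError:
    pass

# The first write of the failed call is not rolled back, so the collections disagree.
balance = accounts["tom"]["balance"]
recorded = sum(entry["amount"] for entry in ledger)
```

After the simulated crash, the balance reflects both withdrawals while the ledger records only the first, which is exactly the inconsistency that an atomic multi-document transaction would prevent.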
Applications of the Document Data Model:
Content Management: Document data models are widely used to build
video streaming platforms, blogs, and similar services, because each item is
stored as a single document and the database is easier to maintain as
the service evolves over time.
Book Database: Document data models are useful for building book databases because
they let us nest related data inside a single document.
Catalog: Document data models are widely used for storing and reading catalog files
because they offer fast reads.
Analytics Platform: Document data models are also widely used in analytics
platforms.