STUDIEREN UND DURCHSTARTEN. (Study and take off.)

NoSQL and MongoDB
Author: Dipl.-Inf. (FH) Johannes Hoppe
Date: 06.05.2011

01 Not only SQL

Trends - Data
- Facebook had 60k servers in 2010
- Google had 450k servers in 2006 (speculated)
- Microsoft: between 100k and 500k servers (since Azure)
- Amazon: likely has similar numbers, too (S3)
(Figure: Facebook server footprint)

Trends
- Trend 1: increasing data sizes
- Trend 2: more connectedness ("web 2.0")
- Trend 3: more individualization (less structure)

NoSQL

NoSQL - Database paradigms
- Relational (RDBMS)
- NoSQL
  - Key-value stores
  - Document databases
  - Wide column stores (BigTable and clones)
  - Graph databases
  - Other

NoSQL - Some NoSQL use cases
1. Massive data volumes
   - Massively distributed architecture required to store the data
   - Google, Amazon, Yahoo, Facebook…
2. Extreme query workload
   - Impossible to efficiently do joins at that scale with an RDBMS
3. Schema evolution
   - Schema flexibility (migration) is not trivial at large scale
   - Schema changes can be gradually introduced with NoSQL

NoSQL - CAP theorem
Requirements for distributed systems:
- Consistency
- Availability
- Partition tolerance

NoSQL - CAP theorem: Consistency
- The system is in a consistent state after an operation
- All clients see the same data
- Strong consistency (ACID) vs. eventual consistency (BASE)
  - ACID: Atomicity, Consistency, Isolation and Durability
  - BASE: Basically Available, Soft state, Eventually consistent

NoSQL - CAP theorem: Availability
- The system is "always on", no downtime
- Node failure tolerance: all clients can find some available replica
- Software/hardware upgrade tolerance

NoSQL - CAP theorem: Partition tolerance
- The system continues to function even when split into disconnected subsets (by a network disruption)
- Not only for reads, but for writes as well!

NoSQL - CAP theorem (E. Brewer, N. Lynch)
- You can satisfy at most 2 out of the 3 requirements

NoSQL - CAP theorem: CA
- Single-site clusters (easier to ensure all nodes are always in contact)
- When a partition occurs, the system blocks
- e.g. usable for two-phase commits (2PC), which already require/use blocking
- Any horizontal scaling strategy is based on data partitioning; therefore, designers are forced to decide between consistency and availability.

NoSQL - CAP theorem: CP
- Some data may be inaccessible (availability sacrificed), but the rest is still consistent/accurate
- e.g. sharded database

NoSQL - CAP theorem: AP
- The system is still available under partitioning, but some of the data returned may be inaccurate
- Needs some conflict resolution strategy
- e.g. master/slave replication
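
The slides do not name a particular conflict resolution strategy for AP systems. As one illustration only, the sketch below uses "last write wins": after a partition heals, the entry with the newer timestamp is kept for each key. The function name, the replica layout and the timestamps are assumptions made for this example and do not describe any specific database.

```javascript
// Hypothetical replica state: key -> { value, timestamp }.
// Both replicas accepted writes while they were partitioned.
function mergeLastWriteWins(replicaA, replicaB) {
  const merged = {};
  for (const replica of [replicaA, replicaB]) {
    for (const [key, entry] of Object.entries(replica)) {
      const current = merged[key];
      // Keep the entry with the newer timestamp; on a tie the first replica wins.
      if (current === undefined || entry.timestamp > current.timestamp) {
        merged[key] = entry;
      }
    }
  }
  return merged;
}

const replicaA = {
  customer_22: { value: "Norbert", timestamp: 100 },
};
const replicaB = {
  customer_22: { value: "Norbert H.", timestamp: 105 },
  customer_23: { value: "Hans", timestamp: 90 },
};

const healed = mergeLastWriteWins(replicaA, replicaB);
console.log(healed);
```

Last write wins is simple, but it silently discards the older write, which is exactly the "some of the data returned may be inaccurate" trade-off of AP systems.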

NoSQL - RDBMS
- Guarantee ACID by CA (two-phase commits)
- SQL
- Mature

NoSQL - NoSQL DBMS
- No relational tables
- No fixed table schemas
- No joins
- No risk, no fun!
- CP and AP (and sometimes even AP on top of CP, e.g. MongoDB*)
* This is damn cool!

NoSQL - Key-value
- One key → one value, very fast
- Key: hash (no duplicates)
- Value: binary object ("BLOB") (the DB does not understand your content)
- Players: Amazon Dynamo, Memcached…

NoSQL - Key-value example
key:   customer_22
value: ?=PQ)“§VN? =§(Q$U%V§W=(BN W§(=BU&W§$()= W§$(=%
GIVE ME A MEANING!

NoSQL - Document databases
- Key-value store, too
- Value is "understood" by the DB
- Querying the data is possible (not just retrieving the key's content)
- Players: Amazon SimpleDB, CouchDB, MongoDB…

NoSQL - Document example
key:   customer_22
value:
    {
        Type: "Customer",
        Name: "Norbert",
        Invoiced: 2222
    }

NoSQL - Document example with nested documents
key:   customer_22
value:
    {
        Type: "Customer",
        Name: "Norbert",
        Invoiced: 2222,
        Messages: [
            { Title: "Hello",  Text: "World" },
            { Title: "Second", Text: "message" }
        ]
    }
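
The difference between an opaque value and an "understood" document can be sketched with two in-memory maps. This is a minimal illustration, not how any of the listed products work internally; `findByField` and the second customer (`customer_23`) are invented for the example.

```javascript
// Key-value view: the store maps a key to an opaque value (here a string).
// It can only answer "give me the value stored under this key".
const keyValueStore = new Map();
keyValueStore.set(
  "customer_22",
  '{"Type":"Customer","Name":"Norbert","Invoiced":2222}'
);
const blob = keyValueStore.get("customer_22");

// Document view: the store keeps structured values and can look inside them.
const documentStore = new Map();
documentStore.set("customer_22", { Type: "Customer", Name: "Norbert", Invoiced: 2222 });
documentStore.set("customer_23", { Type: "Customer", Name: "Hans", Invoiced: 5000 });

// Hypothetical helper: return the keys of all documents whose field
// satisfies the given predicate.
function findByField(store, field, predicate) {
  const hits = [];
  for (const [key, doc] of store) {
    if (field in doc && predicate(doc[field])) {
      hits.push(key);
    }
  }
  return hits;
}

const bigCustomers = findByField(documentStore, "Invoiced", (v) => v > 3000);
console.log(typeof blob, bigCustomers);
```

With the key-value store, the same question ("who was invoiced more than 3000?") would require the application to fetch and parse every value itself.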

NoSQL - (Wide) column stores
- Often referred to as "BigTable clones"
- Each key is associated with many attributes (columns)
- NoSQL column stores are actually hybrid row/column stores
- Different from "pure" relational column stores!
- Players: Google BigTable, Cassandra (Facebook), HBase…

NoSQL - Column-oriented storage
Won't be stored as:      It will be stored as:
22;Norbert;22222         22;23;24
23;Hans;50000            Norbert;Hans;Franz
24;Franz;44000           22222;50000;44000
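
The two layouts above are transposes of each other. The following sketch converts the row-wise records from the slide into the column-wise form; `toColumns` is a name chosen for this example, and real column stores add compression, column families and on-disk formats that are not modelled here.

```javascript
// Row-oriented records, as in the left-hand layout.
const rows = [
  [22, "Norbert", 22222],
  [23, "Hans", 50000],
  [24, "Franz", 44000],
];

// Transpose: one array per attribute instead of one array per record.
function toColumns(records) {
  if (records.length === 0) return [];
  return records[0].map((_, col) => records.map((record) => record[col]));
}

const columns = toColumns(rows);
const columnLayout = columns.map((column) => column.join(";"));
console.log(columnLayout);

// Aggregating one attribute only touches a single column.
const total = columns[2].reduce((sum, value) => sum + value, 0);
console.log(total);
```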

NoSQL - Graph databases
- Multi-relational graphs
- SPARQL query language (W3C Recommendation!)
- Players: Neo4j, InfoGrid…
(Note: graph DBs are special and somehow the "black sheep" of the NoSQL world; the following PROs/CONs don't apply very well.)

NoSQL - PROs (& promises)
- Schema-free / semi-structured data
- Massive data stores
- Scaling is easy
- Very, very high availability
- Often simpler to implement (and OR mappers aren't required)
- "Web 2.0 ready"

NoSQL - CONs
- NoSQL implementations are often "alpha", no standards
- Data consistency, no transactions
- Insufficient access control
- SQL: strong for dynamic, cross-table queries (JOIN)
- Relationships aren't enforced (conventions over constraints, except for graph DBs of course)
- Premature optimization: scalability (don't build for scalability if you never need it!)

02 MongoDB

Let's rock!
MongoDB Quick Reference Cards: http://www.10gen.com/reference

Basic Deployment
- Create the default data directory in c:\data\db
- Start mongod.exe
- Optionally: mongod.exe --dbpath c:\data\db --port 27017 --logpath c:\data\mongodb.log
- Start the shell: mongo.exe

Data Import

    cd c:\dba-training-data\data
    mongoimport -d twitter -c tweets twitter.json

    cd c:\dba-training-data\data\dump\training
    mongorestore -d training -c scores scores.bson

    cd c:\dba-training-data\data\dump
    mongorestore -d digg digg

MongoDB Documents (in the shell)

    use digg
    db.stories.findOne();

JSON → BSON
All JSON documents are stored in a binary format called BSON. BSON supports a richer set of types than JSON.
http://bsonspec.org

CRUD – Create (in the shell)

    db.people.save({name: 'Smith', age: 30});

See how the save command works (entering a function name without parentheses prints its source):

    db.foo.save

CRUD – Create
How training.scores was created:

    for (i = 0; i < 1000; i++) {
        ['quiz', 'essay', 'exam'].forEach(function(name) {
            var score = Math.floor(Math.random() * 50) + 50;
            db.scores.save({student: i, name: name, score: score});
        });
    }
    db.scores.count();

CRUD – Read
Queries are specified using a document-style syntax!

    use training
    db.scores.find({score: 50});
    db.scores.find({score: {"$gte": 70}});

Note: find() returns a cursor!
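
To make the document-style query syntax more concrete, here is a small in-memory matcher for a tiny subset of it (exact match plus `$gt`, `$gte`, `$lt`, `$lte`). It is a sketch for illustration only and runs without a database; `matches`, `operators` and the sample scores are assumptions of this example, not MongoDB's implementation.

```javascript
// Supported comparison operators (a small, hypothetical subset).
const operators = {
  $gt: (value, operand) => value > operand,
  $gte: (value, operand) => value >= operand,
  $lt: (value, operand) => value < operand,
  $lte: (value, operand) => value <= operand,
};

// Returns true if the document satisfies every condition in the query document.
function matches(doc, query) {
  return Object.entries(query).every(([field, condition]) => {
    const value = doc[field];
    if (condition !== null && typeof condition === "object") {
      // Operator form, e.g. {score: {$gte: 70}}
      return Object.entries(condition).every(([op, operand]) => {
        const test = operators[op];
        if (!test) throw new Error("unsupported operator: " + op);
        return test(value, operand);
      });
    }
    // Plain form, e.g. {score: 50}
    return value === condition;
  });
}

const scores = [
  { student: 0, name: "quiz", score: 50 },
  { student: 0, name: "exam", score: 70 },
  { student: 1, name: "essay", score: 93 },
];

const exactly50 = scores.filter((doc) => matches(doc, { score: 50 }));
const atLeast70 = scores.filter((doc) => matches(doc, { score: { $gte: 70 } }));
console.log(exactly50.length, atLeast70.length);
```

The point of the design is that a query is itself a document: field names map either to a literal value or to a nested document of operators.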

Exercises
- Find all scores less than 65.
- Find the lowest quiz score. Find the highest quiz score.
- Write a query to find all digg stories where the view count is greater than 1000.
- Query for all digg stories whose media type is either 'news' or 'images' and where the topic name is 'Comedy'. (For extra practice, construct two queries using different sets of operators to do this.)
- Find all digg stories where the topic name is 'Television' or the media type is 'videos'. Skip the first 5 results, and limit the result set to 10.

CRUD – Update

    use digg;
    // $set creates or overwrites the field
    db.people.update({name: 'Smith'}, {'$set': {interests: []}});
    // $push appends its value as ONE element, so this appends the array ['chess'] itself
    db.people.update({name: 'Smith'}, {'$push': {interests: ['chess']}});
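
What `$set` and `$push` do to a single document can be sketched in plain JavaScript. `applyUpdate` is a hypothetical helper written for this example; it mimics only these two operators on top-level fields and is not the server's update logic.

```javascript
// Apply a tiny subset of update operators to a copy of the document.
function applyUpdate(doc, update) {
  const result = JSON.parse(JSON.stringify(doc)); // copy, leave the input untouched
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value; // create or overwrite the field
  }
  for (const [field, value] of Object.entries(update.$push || {})) {
    if (result[field] === undefined) {
      result[field] = []; // $push creates the array if the field is missing
    }
    if (!Array.isArray(result[field])) {
      throw new Error("cannot $push to non-array field: " + field);
    }
    result[field].push(value); // the value is appended as a single element
  }
  return result;
}

let smith = { name: "Smith", age: 30 };
smith = applyUpdate(smith, { $set: { interests: [] } });
const withString = applyUpdate(smith, { $push: { interests: "chess" } });
const withArray = applyUpdate(smith, { $push: { interests: ["chess"] } });

console.log(JSON.stringify(withString.interests));
console.log(JSON.stringify(withArray.interests));
```

Pushing the string yields a flat list of interests, while pushing `['chess']` yields a nested array, which is worth keeping in mind when reading the update example above.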

Exercises
- Set the proper 'grade' attribute for all scores. For example, users with scores greater than 90 get an 'A'. Set the grade to 'B' for scores falling between 80 and 90.
- You're being nice, so you decide to add 10 points to every score on every "final" exam whose score is lower than 60. How do you do this update?

CRUD – Delete

    db.dropDatabase();   // drops the current database
    db.foo.drop();       // drops the collection foo
    db.foo.remove();     // removes the documents in foo but keeps the collection

“MapReduce is the Uzi of aggregation tools. Everything described with count, distinct and group can be done with MapReduce, and more.”
Kristina Chodorow, Michael Dirolf in MongoDB – The Definitive Guide

MapReduce
To use map-reduce, you first write a map function.

    map = function() {
        // Emit posts: 1 so that a user with a single story is counted correctly
        // (reduce is not called for keys with only one emitted value).
        emit(this.user.name, {diggs: this.diggs, posts: 1});
    }

MapReduce
The reduce function then aggregates those docs by key.

    reduce = function(key, values) {
        var diggs = 0;
        var posts = 0;
        values.forEach(function(doc) {
            diggs += doc.diggs;
            // Sum the emitted counts; reduce may be re-run on its own output.
            posts += doc.posts;
        });
        return {diggs: diggs, posts: posts};
    }

MapReduce
Now both are used to perform custom aggregation.

    db.stories.mapReduce(map, reduce, {out: 'digg_users'});
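
The map, group and reduce phases can be tried without a database. The sketch below is an in-memory imitation, assuming a few made-up stories; `runMapReduce` is a hypothetical helper, and unlike the mongo shell (where `emit` is a global) it passes `emit` to the map function as an argument. This version emits `posts: 1` and sums `doc.posts`, so the result stays correct when a key has only one value or when partial results are reduced again.

```javascript
// Made-up sample data in the shape the map function expects.
const stories = [
  { user: { name: "alice" }, diggs: 10 },
  { user: { name: "bob" }, diggs: 3 },
  { user: { name: "alice" }, diggs: 5 },
];

// Map: called once per document, with the document bound to `this`.
function map(emit) {
  emit(this.user.name, { diggs: this.diggs, posts: 1 });
}

// Reduce: combines all values emitted for one key into a single value.
function reduce(key, values) {
  let diggs = 0;
  let posts = 0;
  values.forEach((doc) => {
    diggs += doc.diggs;
    posts += doc.posts;
  });
  return { diggs, posts };
}

// Hypothetical driver: map every document, group by key, reduce each group.
function runMapReduce(docs, mapFn, reduceFn) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  docs.forEach((doc) => mapFn.call(doc, emit));

  const out = {};
  for (const [key, values] of groups) {
    // A key with a single value is passed through without calling reduce.
    out[key] = values.length === 1 ? values[0] : reduceFn(key, values);
  }
  return out;
}

const diggUsers = runMapReduce(stories, map, reduce);
console.log(diggUsers);
```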

THANK YOU FOR YOUR ATTENTION

DMDW Extra Lesson - NoSql and MongoDB
