Document databases in practice

Nicola Baldi

http://it.linkedin.com/in/nicolabaldi

Luigi Berrettini

http://it.linkedin.com/in/luigiberrettini
Overview

15/12/2012

Unbounded result sets problem
Unbounded number of requests problem

▪ Document databases favor denormalization over composition and joins
▪ Relations work differently than in RDBMSs
▪ They are schema-less, but documents still need careful design

« a conceptual model should be drawn with little or no regard for the software that might implement it » (Martin Fowler, UML Distilled)

A domain model should be independent of implementation details such as persistence.
In RavenDB this is somewhat true.
▪ RDBMSs are schema-full
• tuples = sets of key-value pairs ⇒ flat structure
• more complex data structures are stored as relations

▪ Document databases are schema-less
• object graphs stored as documents ⇒ no flat structure
• each document is treated as a single entity

RavenDB's suggested approach is to follow the Aggregate pattern from the DDD book (Eric Evans, Domain-Driven Design)

ENTITY
▪ Some objects are not defined primarily by their attributes
▪ They represent a thread of identity that runs through time and often across distinct representations
▪ Mistaken identity can lead to data corruption
VALUE OBJECT
▪ When you care only about the attributes of an element of the model, classify it as a value object
▪ Make it express the meaning of the attributes it conveys and give it related functionality
▪ Treat the value object as immutable
▪ Don't give it any identity and avoid the design complexities necessary to maintain entities
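The value object guidelines above can be illustrated with a minimal Python sketch. The `Money` type, its fields and the sample amounts are hypothetical and chosen only for illustration; the deck itself does not name a concrete value object.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Money:
    """Hypothetical value object: defined only by its attributes, no identity."""
    amount: int      # minor units (e.g. cents) to avoid float rounding
    currency: str

    def add(self, other: "Money") -> "Money":
        # Related functionality lives on the value object itself.
        if other.currency != self.currency:
            raise ValueError("cannot add different currencies")
        # Immutability: return a new value instead of mutating this one.
        return Money(self.amount + other.amount, self.currency)


price = Money(500, "EUR")
total = price.add(Money(250, "EUR"))

# Two value objects with the same attributes are interchangeable.
same_value = Money(750, "EUR") == total

try:
    price.amount = 0          # a frozen dataclass rejects mutation
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Equality by attributes and the absence of an id are what distinguish this from an entity, which would be compared by its key.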

AGGREGATE
▪ Invariants are consistency rules that must be maintained whenever data changes
▪ They’ll involve relationships within an aggregate (relations & foreign keys: order / order lines)
▪ Invariants applied within an aggregate will be enforced with the completion of each transaction

▪ Cluster entities and value objects into aggregates and define boundaries around each
▪ Choose one entity to be the root of each aggregate and control all access to the objects inside the boundary through the root
▪ Allow external objects to hold references to the root only
▪ Transient references to internal members can be passed out for use within a single operation only

▪ Because the root controls access, it cannot be blindsided by changes to the internals
▪ This arrangement makes it practical to enforce all invariants for objects in the aggregate and for the aggregate as a whole in any state change
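A rough sketch of an aggregate root enforcing an invariant, using the order / order lines relationship mentioned earlier. The `credit_limit` rule, the field names and the sample values are assumptions made up for the example, not something the deck specifies.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class OrderLine:
    """Internal member of the aggregate, modelled as an immutable value."""
    product_id: str
    quantity: int
    unit_price: int  # minor units


@dataclass
class Order:
    """Aggregate root: the only entry point for changing order lines."""
    id: str
    credit_limit: int  # hypothetical invariant: total must not exceed this
    _lines: List[OrderLine] = field(default_factory=list)

    @property
    def total(self) -> int:
        return sum(l.quantity * l.unit_price for l in self._lines)

    @property
    def lines(self) -> Tuple[OrderLine, ...]:
        # Hand out a read-only snapshot, not the internal list.
        return tuple(self._lines)

    def add_line(self, product_id: str, quantity: int, unit_price: int) -> None:
        # The root checks the invariants on every state change.
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        if self.total + quantity * unit_price > self.credit_limit:
            raise ValueError("order total would exceed the credit limit")
        self._lines.append(OrderLine(product_id, quantity, unit_price))


order = Order(id="orders/1", credit_limit=1000)
order.add_line("products/1", 2, 300)

try:
    order.add_line("products/2", 1, 500)   # 600 + 500 > 1000
    rejected = False
except ValueError:
    rejected = True
```

Because callers only ever hold the `Order`, the whole aggregate can be stored and loaded as one document, which is why the pattern maps well onto a document database.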

Nested child document

Document referenced by ID

Denormalized reference
▪ we clone the properties we care about when displaying or processing the containing document
▪ this avoids many cross-document lookups and transmits only the necessary data over the network
▪ it makes other scenarios more difficult: if we include frequently changing data, keeping the copies in sync can become very demanding on the server
▪ use it only for rarely changing data, or for data that can tolerate being out of sync
Order contains denormalized data from Customer and Product.
Full data are saved elsewhere.
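As an illustration of what such an Order document might look like, here is a small Python sketch that builds it from full Customer and Product documents. All property names and sample values are hypothetical; `denormalized_ref` is a helper invented for this example.

```python
import json

# Full documents, stored on their own (hypothetical sample data).
customer = {"Id": "customers/1", "Name": "Ada", "Email": "ada@example.com",
            "Address": "1 Main St"}
product = {"Id": "products/7", "Name": "Keyboard", "Price": 4900,
           "Description": "Mechanical keyboard", "Stock": 12}


def denormalized_ref(doc: dict, *props: str) -> dict:
    """Copy the id plus only the properties the containing document needs."""
    return {"Id": doc["Id"], **{p: doc[p] for p in props}}


order = {
    "Id": "orders/1",
    "Customer": denormalized_ref(customer, "Name"),
    "Lines": [
        {"Product": denormalized_ref(product, "Name", "Price"), "Quantity": 2}
    ],
}

print(json.dumps(order, indent=2))
```

The order can be displayed without loading the customer or the product, at the cost of the copied `Name` and `Price` going out of date if the originals change.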

Querying

▪ DocumentStore
• used to connect to a RavenDB data store
• thread-safe
• one instance per database per application

▪ Session
• used to perform operations on the database
• not thread-safe
• implements the Unit of Work pattern
  ▪ in a single session, a single document (identified by its key) always resolves to the same instance
  ▪ change tracking
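The two Unit of Work properties listed above (one instance per key, change tracking) can be modelled with a toy Python class. This is a conceptual sketch over an in-memory dict, not the RavenDB client API; the `Session` class and its method names here are assumptions for illustration.

```python
import copy


class Session:
    """Toy unit of work over a dict that stands in for the database."""

    def __init__(self, database: dict):
        self._db = database
        self._loaded = {}     # key -> tracked instance (identity map)
        self._original = {}   # key -> snapshot taken at load time

    def load(self, key: str) -> dict:
        # The same key always resolves to the same instance.
        if key not in self._loaded:
            self._loaded[key] = copy.deepcopy(self._db[key])
            self._original[key] = copy.deepcopy(self._db[key])
        return self._loaded[key]

    def save_changes(self) -> list:
        # Change tracking: write back only the documents that differ
        # from their load-time snapshot.
        changed = [k for k, doc in self._loaded.items()
                   if doc != self._original[k]]
        for k in changed:
            self._db[k] = copy.deepcopy(self._loaded[k])
            self._original[k] = copy.deepcopy(self._loaded[k])
        return changed


db = {"users/1": {"Name": "Ada"}, "users/2": {"Name": "Bob"}}
session = Session(db)
first = session.load("users/1")
second = session.load("users/1")
session.load("users/2")
first["Name"] = "Ada L."
written = session.save_changes()
```

Only `users/1` is written back, because `users/2` was loaded but never modified. The unsynchronized dictionaries also hint at why a session of this kind is not thread-safe.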
▪ Sequential GUID key
• for when the document key is not relevant (e.g. log entries)
• entity Id = sequential GUID (sorts well for indexing)
• Id property missing / not set ⇒ server generates a key

▪ Identity key
• entity Id = prefix + next available integer Id for that prefix
• Id property set to a prefix, i.e. a value ending with a slash
• new DocumentStore ⇒ server sends a range of HiLo keys

▪ Assign a key yourself
• for documents which already have a native id (e.g. users)
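The idea behind a range of HiLo keys can be sketched as follows: the server reserves a block of integers per prefix, and the client hands them out locally until the block is used up. The classes, the range size and the bookkeeping below are assumptions for illustration and do not reproduce RavenDB's actual implementation.

```python
class Server:
    """Hands out ranges of integers per prefix (the 'hi' part)."""

    def __init__(self, range_size: int = 32):
        self.range_size = range_size
        self._next_start = {}   # prefix -> first unused integer
        self.requests = 0       # counts round trips, for illustration

    def next_range(self, prefix: str) -> range:
        self.requests += 1
        start = self._next_start.get(prefix, 1)
        self._next_start[prefix] = start + self.range_size
        return range(start, start + self.range_size)


class HiLoKeyGenerator:
    """Client side: consumes a reserved range locally (the 'lo' part)."""

    def __init__(self, server: Server):
        self._server = server
        self._ranges = {}       # prefix -> iterator over the reserved range

    def next_key(self, prefix: str) -> str:
        number = next(self._ranges.get(prefix, iter(())), None)
        if number is None:
            # Range exhausted (or never requested): one round trip to refill.
            self._ranges[prefix] = iter(self._server.next_range(prefix))
            number = next(self._ranges[prefix])
        return f"{prefix}{number}"


server = Server(range_size=3)
keys_a = HiLoKeyGenerator(server)
keys_b = HiLoKeyGenerator(server)
a = [keys_a.next_key("posts/") for _ in range(4)]
b = keys_b.next_key("posts/")
```

Two generators never collide because each owns a disjoint range, and most keys are produced without contacting the server. The trade-off is gaps in the sequence when a client stops before using its whole range.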
▪ soft limit = 128: a query without Take() is treated as Take(128)
▪ hard limit = 1024: Take(x) with x > 1024 returns 1024 documents
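The two limits combine into a simple rule, sketched here in Python. The function name is hypothetical; the constants are the values quoted on the slide.

```python
from typing import Optional

SOFT_LIMIT = 128    # applied when the query has no Take()
HARD_LIMIT = 1024   # upper bound on any requested page size


def effective_page_size(take: Optional[int] = None) -> int:
    """Maximum number of documents a single query can return."""
    if take is None:
        return SOFT_LIMIT
    return min(take, HARD_LIMIT)
```

For example, `effective_page_size()` gives 128, `effective_page_size(50)` gives 50 and `effective_page_size(5000)` gives 1024, which is how the unbounded result sets problem is avoided by default.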

▪ RavenDB can skip over some results internally ⇒ the TotalResults value is invalidated
▪ For proper paging use SkippedResults: Skip(currentPage * pageSize + SkippedResults)
▪ Assuming a page size of 10…
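The paging formula can be tried out on a toy model. The sketch below assumes that results are skipped because several adjacent index entries point at the same document, and that the skipped count is accumulated across pages; both the `query` function and the sample data are made up for illustration, and a page size of 2 is used to keep the data short.

```python
def query(index_entries, start, page_size):
    """Return (docs, skipped) for one page, dropping duplicate documents."""
    docs, skipped, seen = [], 0, set()
    for doc_id in index_entries[start:]:
        if doc_id in seen:
            skipped += 1          # duplicate consumed but not returned
            continue
        if len(docs) == page_size:
            break
        seen.add(doc_id)
        docs.append(doc_id)
    return docs, skipped


def read_all_pages(index_entries, page_size):
    pages, total_skipped, page = [], 0, 0
    while True:
        # Skip(currentPage * pageSize + SkippedResults)
        start = page * page_size + total_skipped
        docs, skipped = query(index_entries, start, page_size)
        if not docs:
            break
        pages.append(docs)
        total_skipped += skipped
        page += 1
    return pages


# Hypothetical index: documents "a" and "c" produce several entries each.
entries = ["a", "a", "b", "c", "c", "c", "d", "e"]
pages = read_all_pages(entries, page_size=2)
naive_second_page, _ = query(entries, start=1 * 2, page_size=2)
```

With the skipped count included, the pages are `a b`, `c d`, `e`. The naive offset `currentPage * pageSize` would start the second page at `b` again, returning a document the caller has already seen.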

▪ RavenDB supports Count and Distinct
▪ SelectMany, GroupBy and Join are not supported
▪ The let keyword is not supported
▪ For such operations an index is needed

All queries use an index to return results
▪ Dynamic = created automatically by the server
▪ Static = created explicitly by the user

▪ no matching static index to query ⇒ RavenDB automatically creates a dynamic index on the fly (on first user query)
▪ based on requests coming in, RavenDB can decide to promote a temporary index to a permanent one

Static indexes:
▪ are permanent
▪ expose much more functionality
▪ have low latency: dynamic indexes have performance issues on their first run
▪ support map / reduce
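To make the map / reduce item concrete, here is a conceptual Python sketch that counts posts per tag, the kind of grouping a plain query cannot express. The document shape and function names are hypothetical, and this models the idea only, not how an index is defined in RavenDB.

```python
from collections import defaultdict

# Hypothetical documents.
posts = [
    {"Id": "posts/1", "Tags": ["NoSQL", "RavenDB"]},
    {"Id": "posts/2", "Tags": ["NoSQL"]},
    {"Id": "posts/3", "Tags": ["SQL"]},
]


def map_post(post):
    """Map: emit one flat entry per tag of a single document."""
    for tag in post["Tags"]:
        yield {"Tag": tag, "Count": 1}


def reduce_by_tag(entries):
    """Reduce: group by tag and sum the counts.

    The output has the same shape as the input, so the results of
    earlier reductions can be reduced again.
    """
    totals = defaultdict(int)
    for entry in entries:
        totals[entry["Tag"]] += entry["Count"]
    return [{"Tag": t, "Count": c} for t, c in sorted(totals.items())]


mapped = [entry for post in posts for entry in map_post(post)]
tag_counts = reduce_by_tag(mapped)
```

Keeping map output and reduce output in the same shape is the design choice that lets an index be updated incrementally: only the entries of changed documents need to be mapped again and merged into the existing totals.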

Advanced topics

▪ an index is made of documents

▪ document
• atomic unit of indexing and searching
• flat ⇒ recursion and joins must be denormalized
• flexible schema
• made of fields

▪ field
• a name-value pair with associated info
• can be indexed if you're going to search on it ⇒ tokenization by analysis
• can be stored in order to preserve the original untokenized value within the document

▪ example of physical index structure
{"__document_id": "docs/1", "tag": "NoSQL"}
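The difference between an indexed field and a stored field can be shown with a toy inverted index in Python. The analyzer here simply lowercases and splits on non-alphanumeric characters, which is an assumption for the example; real analyzers are configurable and more elaborate. The `title` field and the second entry are made-up sample data.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer: lowercase, then split on non-alphanumeric characters."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


inverted = defaultdict(set)   # (field name, token) -> document ids
stored = {}                   # (document id, field name) -> original value


def add_entry(doc_id, fields, store=()):
    for name, value in fields.items():
        # Indexed: the value is tokenized so it can be searched.
        for token in analyze(value):
            inverted[(name, token)].add(doc_id)
        # Stored: the original, untokenized value is kept as well.
        if name in store:
            stored[(doc_id, name)] = value


add_entry("docs/1",
          {"tag": "NoSQL", "title": "Document databases in practice"},
          store=("title",))
add_entry("docs/2", {"tag": "SQL", "title": "Relational databases"})

hits = sorted(inverted[("title", "databases")])
```

Searching works on tokens (`nosql`, `databases`), so the original casing and wording can only be read back from the index for fields that were also stored.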

One to one

One to many ⇒ SELECT N+1

Value type

▪ indexing runs on a background thread whenever a document is created or updated
▪ the server responds quickly, BUT you may query stale indexes (better stale than offline)
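A single-threaded Python model can show what a stale index means in practice. The `Store` class, the etag counter and the `wait_for_non_stale` flag are hypothetical stand-ins; in particular, `run_indexing` is called explicitly here where a real server would use a background thread, and a real client would block or time out instead of triggering indexing itself.

```python
class Store:
    """Toy store with deferred indexing on the Name property."""

    def __init__(self):
        self.docs = {}
        self.etag = 0            # bumped on every write
        self.index = {}          # name -> set of document keys
        self.indexed_etag = 0    # last write the index has processed
        self._pending = []

    def put(self, key, doc):
        # Writes return immediately; indexing is deferred.
        self.docs[key] = doc
        self.etag += 1
        self._pending.append((self.etag, key))

    def run_indexing(self):
        # Stand-in for the background indexing thread.
        for etag, key in self._pending:
            self.index.setdefault(self.docs[key]["Name"], set()).add(key)
            self.indexed_etag = etag
        self._pending.clear()

    def query(self, name, wait_for_non_stale=False):
        if wait_for_non_stale and self.indexed_etag < self.etag:
            self.run_indexing()
        is_stale = self.indexed_etag < self.etag
        return sorted(self.index.get(name, ())), is_stale


store = Store()
store.put("users/1", {"Name": "Ada"})
stale_hits, was_stale = store.query("Ada")
fresh_hits, still_stale = store.query("Ada", wait_for_non_stale=True)
```

The first query answers at once but misses the new document and reports itself as stale; the second trades latency for up-to-date results.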

documentStore.Conventions.DefaultQueryingConsistency
▪ ConsistencyOptions.QueryYourWrites: same behavior as WaitForNonStaleResultsAsOfLastWrite
▪ ConsistencyOptions.MonotonicRead: you never go back in time and read older data than what you have already seen
