Dealing with Azure Cosmos DB
About me
Agenda:
What is Cosmos DB
What is Azure Cosmos DB
Global Distribution
Worldwide presence
Automatic multi-region replication
Multi-homing APIs
Manual and automatic failovers
What is Azure Cosmos DB
Five Consistency Models
Helps navigate Brewer's CAP theorem
Intuitive Programming
• Tunable well-defined consistency levels
• Override on per-request basis
Clear PACELC tradeoffs
• Partition – Availability vs Consistency
• Else – Latency vs Consistency
What is Azure Cosmos DB
Comprehensive SLAs
99.99% availability
Durable quorum committed writes
Latency, consistency, and throughput are also covered by financially backed SLAs
Made possible by a highly redundant architecture
Operation type | Single region | Multi-region (single-region writes) | Multi-region (multi-region writes)
Writes | 99.99% | 99.99% | 99.999%
Reads | 99.99% | 99.999% | 99.999%
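To put the availability figures in perspective, the sketch below converts an availability percentage into the maximum downtime it permits per year. The function name and the 365-day year are assumptions made for illustration; they do not come from the slides or from any SLA document.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year


def max_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime (in minutes) permitted by an availability SLA."""
    return MINUTES_PER_YEAR * (100 - availability_percent) / 100


for sla in (99.99, 99.999):
    print(f"{sla}% -> {max_downtime_minutes(sla):.2f} minutes/year")
```

Under these assumptions, 99.99% allows roughly 52.6 minutes of downtime per year and 99.999% roughly 5.3 minutes.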
Cosmos DB Data Formats
• The query syntax is geared toward navigating graphs. For example, .has('person', 'name', 'Thomas').outE('Knows') follows the outgoing 'Knows' edges from Thomas; appending .inV() returns the people Thomas knows.
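The traversal idea can be sketched in plain Python. This is not Gremlin and does not use any Cosmos DB API; the sample vertices, the edge list, and the helper name known_by are all hypothetical and exist only to show the "find the person, then follow the Knows edges" pattern.

```python
# Hypothetical sample graph: vertices keyed by id, edges as (source, label, target).
vertices = {
    1: {"label": "person", "name": "Thomas"},
    2: {"label": "person", "name": "Mary"},
    3: {"label": "person", "name": "Robin"},
}
edges = [(1, "Knows", 2), (1, "Knows", 3), (2, "Knows", 3)]


def known_by(name):
    """Return the names reached by following outgoing 'Knows' edges from `name`."""
    # Step 1: select the starting vertices (the .has('person', 'name', ...) part).
    sources = {
        vid for vid, v in vertices.items()
        if v["label"] == "person" and v["name"] == name
    }
    # Step 2: follow outgoing 'Knows' edges to their target vertices.
    return [
        vertices[target]["name"]
        for source, label, target in edges
        if source in sources and label == "Knows"
    ]


print(known_by("Thomas"))
```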
Cosmos DB Design
Containers and Databases
Cosmos DB Storage
Cosmos DB Indexing
• Indexing is on by default
• You can index only specific paths in your document
Cosmos DB Resources
Cosmos DB Resources - Core (SQL) API
Container
Cosmos DB Resources
• Database Account:
• Regions
• API
Cosmos DB Resources
• The container that houses your data
• /dbs/{id} does not use your ID; it uses a system-generated hash, and the resulting path is known as a “Self Link”
Cosmos DB Resources
• Video
• Audio
• Blob
• Etc.
Cosmos DB Resources
• Invite in an existing Azure account
• Allows you to set permissions on each concept of the database
Cosmos DB Resources
• Authorization token
• Associated with a user
• Grants access to a given resource
Cosmos DB Resources
• Most like a “table”
• Structure is not defined
• Dynamic shapes based on what you put in it
Cosmos DB Resources
• A blob of JSON representing your data
• Can be a deeply nested shape
• No specialty types
• No specific encoding types
Cosmos DB Resources
• Think media, at the document level!
• The maximum size of a document and its attachment in Cosmos DB is now 2 MB.
• In DocumentDB, the maximum size of a document and its attachment was 512 KB.
• Media size: 2 GB
Cosmos DB Resources
• Written in JavaScript!
• Is transactional
• Executed by the database engine
• Can live in the store
• Can be sent over the wire
Cosmos DB Resources
• Can be Pre or Post (before or after)
• Can operate on the following actions
• Create
• Replace
• Delete
• All
• Also written in JavaScript!
+ Azure Functions
Cosmos DB Resources
• Can only be run as part of a query
• Modifies the result of a given query
• mathSqrt()
Cosmos DB Resources
• The JSON array {"exports": [{"city": "Moscow"}, {"city": "Athens"}]} corresponds to the paths /"exports"/0/"city"/"Moscow" and /"exports"/1/"city"/"Athens".
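The mapping from a JSON document to index paths can be sketched as a small recursive walk. This is only an illustration of the path notation shown above, not Cosmos DB's indexing implementation; the function name index_paths is an assumption.

```python
import json


def index_paths(node, prefix=""):
    """Yield one path per leaf value, in the /"key"/position/"value" notation."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from index_paths(value, f'{prefix}/"{key}"')
    elif isinstance(node, list):
        for position, value in enumerate(node):
            yield from index_paths(value, f"{prefix}/{position}")
    else:
        # json.dumps quotes strings and leaves numbers and booleans bare.
        yield f"{prefix}/{json.dumps(node)}"


doc = {"exports": [{"city": "Moscow"}, {"city": "Athens"}]}
for path in index_paths(doc):
    print(path)
```

Run on the sample document, the walk produces the two paths listed above.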
Cosmos DB Resources
Property | User settable or system generated? | Purpose
_rid | System generated | Unique, hierarchical identifier of the resource.
_etag | System generated | ETag of the resource, required for optimistic concurrency control.
_ts | System generated | Last-updated timestamp of the resource.
_self | System generated | Unique addressable URI of the resource.
id | User settable | User-defined unique name of the resource.
Cosmos DB Resources
Value of _self | Description
/dbs | Feed of databases under a database account.
/dbs/{_rid-db} | Database whose unique id property has the value {_rid-db}.
/dbs/{_rid-db}/colls/ | Feed of collections under a database.
/dbs/{_rid-db}/colls/{_rid-coll} | Collection whose unique id property has the value {_rid-coll}.
/dbs/{_rid-db}/users/ | Feed of users under a database.
/dbs/{_rid-db}/users/{_rid-user} | User whose unique id property has the value {_rid-user}.
/dbs/{_rid-db}/users/{_rid-user}/permissions | Feed of permissions under a user.
/dbs/{_rid-db}/users/{_rid-user}/permissions/{_rid-permission} | Permission whose unique id property has the value {_rid-permission}.
Request Units
Request Units (RUs) are a rate-based currency, e.g. 1000 RU/second
They abstract the physical resources needed to perform requests: % IOPS, % CPU, % memory
Cosmos DB Resources
Request Units
Each request consumes a number of RUs
Approx. 1 RU = 1 read of a 1 KB document
Approx. 5 RU = 1 write of a 1 KB document
Query: depends on the query and the documents involved
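The approximate costs above allow a back-of-the-envelope throughput estimate. The sketch below assumes a workload made only of point reads and writes of 1 KB documents; the function and constant names are hypothetical, and queries are left out because their cost depends on the query itself.

```python
READ_RU = 1   # approx. cost of reading one 1 KB document (from the slide)
WRITE_RU = 5  # approx. cost of writing one 1 KB document (from the slide)


def estimate_ru_per_second(reads_per_second: int, writes_per_second: int) -> int:
    """Estimate the RU/s needed for a workload of 1 KB point reads and writes."""
    return reads_per_second * READ_RU + writes_per_second * WRITE_RU


# 500 reads/s and 100 writes/s of 1 KB documents.
print(estimate_ru_per_second(500, 100))
```

With these figures, 500 reads and 100 writes per second come to about 1000 RU/s.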
[Figure: GET, POST, PUT, Query, … each shown as equal to a number of RUs]
Request Units: Provisioned throughput
Provisioned in terms of RU/sec, e.g. 1000 RU/s
Billed for the highest RU/s provisioned in each hour
Easy to increase and decrease on demand
Rate limiting is based on the amount of throughput provisioned
Background processes such as TTL expiration and index transformations are scheduled when quiescent
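Rate limiting against a provisioned budget can be pictured with a deliberately simplified model: within one second, requests are admitted until the RU budget is spent, and the rest are rejected so the client can retry later. The function name and the one-second window are assumptions for illustration; the real service's accounting is more involved.

```python
def admit_requests(request_charges, provisioned_ru_per_second):
    """Split one second's requests into (accepted, rate_limited) by RU charge."""
    accepted, rate_limited = [], []
    remaining = provisioned_ru_per_second
    for charge in request_charges:
        if charge <= remaining:
            remaining -= charge
            accepted.append(charge)
        else:
            # Out of budget for this second: the caller should back off and retry.
            rate_limited.append(charge)
    return accepted, rate_limited


accepted, rate_limited = admit_requests([5, 5, 1, 5], provisioned_ru_per_second=11)
print(accepted, rate_limited)
```

In this model a rejected request is not lost; it is simply deferred to a later window, which is the role the SDK retry plays.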
[Figure: incoming requests plotted against min and max RU/sec, with regions labeled "No rate limiting, process background operations", "Rate limiting – SDK retry", and "No rate limiting"]
Cosmos DB Resources
Collection type | Partitioned collection | Single partition collection (only via SDK v.2) | S1 | S2 | S3
Maximum throughput | Unlimited | 10K RU/s | 250 RU/s | 1K RU/s | 2.5K RU/s
Minimum throughput | 2.5K 400 RU/s | 400 RU/s | 250 RU/s | 1K RU/s | 2.5K RU/s
Maximum storage | Unlimited | 10 GB | 10 GB | 10 GB | 10 GB
Price | Throughput: $6 / 100 RU/s; Storage: $0.25/GB | Throughput: $6 / 100 RU/s; Storage: $0.25/GB | $25 USD | $50 USD | $100 USD
Data Modeling in Azure Cosmos DB
2 Extremes
Sample structure
{
    "ID": 1,
    "ItemName": "hamburger",
    "ItemDescription": "cheeseburger, no cheese",
    "CategoryId": 5,
    "Category": "sandwiches",
    "CategoryDescription": "2 pieces of bread + filling"
}
Modeling challenge : To embed or reference?
Embedded:
{
    "menuID": 1,
    "menuName": "Lunch menu",
    "items": [
        {"ID": 1, "ItemName": "hamburger", "ItemDescription":...},
        {"ID": 2, "ItemName": "cheeseburger", "ItemDescription":...}
    ]
}
Referenced:
{
    "menuID": 1,
    "menuName": "Lunch menu",
    "items": [
        {"ID": 1},
        {"ID": 2}
    ]
}
{"ID": 1, "ItemName": "hamburger", "ItemDescription":...}
{"ID": 2, "ItemName": "cheeseburger", "ItemDescription":...}
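The trade-off between the two shapes shows up on the read path: an embedded menu is complete after one read, while a referenced menu needs an extra lookup per item. The sketch below uses in-memory dictionaries as a stand-in for the database; resolve_items and items_by_id are hypothetical names, and the item descriptions are omitted because the slide elides them.

```python
# Embedded: the menu document carries its items.
embedded_menu = {
    "menuID": 1,
    "menuName": "Lunch menu",
    "items": [
        {"ID": 1, "ItemName": "hamburger"},
        {"ID": 2, "ItemName": "cheeseburger"},
    ],
}

# Referenced: the menu holds only ids; items live in separate documents.
referenced_menu = {"menuID": 1, "menuName": "Lunch menu", "items": [{"ID": 1}, {"ID": 2}]}
items_by_id = {
    1: {"ID": 1, "ItemName": "hamburger"},
    2: {"ID": 2, "ItemName": "cheeseburger"},
}


def resolve_items(menu, items_by_id):
    """Return a copy of `menu` with each item reference replaced by its document."""
    # Each lookup here stands in for one additional read against the store.
    return {**menu, "items": [items_by_id[ref["ID"]] for ref in menu["items"]]}


print(resolve_items(referenced_menu, items_by_id) == embedded_menu)
```

Referencing keeps the menu document small and lets an item change in one place; embedding avoids the extra lookups.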
But wait, you can do both!
{
    "id": "speaker1",
    "name": "Alice",
    "email": "alice@contoso.com",
    "address": "1 Microsoft Way",
    "phone": "555-5555",
    "sessions": [
        {"id": "session1"},
        {"id": "session2"}
    ]
}
{
    "id": "session1",
    "name": "Modelling Data 101",
    "speakers": [
        {"id": "speaker1", "name": "Alice", "email": "alice@contoso.com"},
        {"id": "speaker2", "name": "Bob"}
    ]
}
Embed frequently used data; reference less frequently used data
Partitioning in Azure Cosmos DB
Partitioning in Cosmos DB
Partitioning
Logical partition: Stores all data associated with the same partition key value
Physical partition: Fixed amount of reserved SSD-backed storage + compute.
Cosmos DB distributes logical partitions among a smaller number of physical partitions.
From the user’s perspective: define 1 partition key per container
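The relationship between the two kinds of partition can be sketched as follows: documents sharing a partition key value form one logical partition, and a hash of that value decides which physical partition hosts it. SHA-256 with a modulo is an assumption chosen for illustration, not the hashing scheme Cosmos DB uses, and both function names are hypothetical.

```python
import hashlib
from collections import defaultdict


def group_into_logical_partitions(documents, partition_key):
    """Group documents by partition key value (one group per logical partition)."""
    logical = defaultdict(list)
    for doc in documents:
        logical[doc[partition_key]].append(doc)
    return dict(logical)


def physical_partition_for(partition_key_value, physical_partition_count):
    """Map a partition key value to a physical partition index, deterministically."""
    digest = hashlib.sha256(str(partition_key_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % physical_partition_count


docs = [
    {"id": "1", "city": "Moscow"},
    {"id": "2", "city": "Athens"},
    {"id": "3", "city": "Moscow"},
]
logical = group_into_logical_partitions(docs, "city")
for key_value, members in logical.items():
    print(key_value, len(members), physical_partition_for(key_value, 4))
```

Because the mapping depends only on the key value, every document in a logical partition lands on the same physical partition.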
Partition Key Storage Limits
Containers support unlimited storage by dynamically allocating additional physical partitions.
Storage for a single partition key value (logical partition) is capped at 10 GB.
When a partition key reaches its storage limit, requests to create new resources return HTTP status code 403 (Forbidden).
Azure Cosmos DB adds partitions automatically, and may also return a 403 if:
• An authorization token has expired
• A programmatic element (UDF, stored procedure, trigger) has been flagged for repeated violations
Partitioning in Cosmos DB
[Figure: partitions p, p1, p2]
Partitioning in Cosmos DB
API | Partition Key | Row Key
DocumentDB | custom partition key path | fixed id
MongoDB | custom shard key | fixed _id
Graph | custom partition key property | fixed id
Table | fixed PartitionKey | fixed RowKey
Developing against Cosmos DB (SQL API)
Developing against Cosmos DB SQL API
Cosmos DB APIs Support
Querying Cosmos DB SQL API
SELECT … AS …, … AS …, … AS …, … AS … FROM … JOIN … IN … JOIN … IN …
Querying Cosmos DB SQL API
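var containsUdf = {  // variable name is illustrative
    id: "contains",
    body: function (arr, obj) {
        // True when obj is an element of arr.
        if (arr.indexOf(obj) > -1) {
            return true;
        }
        return false;
    }
};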
Querying Cosmos DB SQL API
SELECT … FROM Families … WHERE … contains … "Andersen" … false
Querying Cosmos DB SQL API
var createDocumentStoredProc = {
    id: "createCustomDocument",
    body: function createCustomDocument(documentToCreate) {
        var context = getContext();
        var collection = context.getCollection();

        // Queue the create; the callback runs once the operation completes.
        var accepted = collection.createDocument(collection.getSelfLink(),
            documentToCreate,
            function (err, documentCreated) {
                // Throwing aborts the procedure and rolls back the transaction.
                if (err) throw new Error('Error: ' + err.message);
                context.getResponse().setBody(documentCreated.id);
            });

        // false means the request was not queued.
        if (!accepted) return;
    }
};
Querying Cosmos DB SQL API
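// documentToCreate is a hypothetical argument matching the procedure's parameter.
var result = await client.ExecuteStoredProcedureAsync<string>(
    createdStoredProcedure.SelfLink, documentToCreate);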
Cosmos DB and Table Storage
Cosmos DB: Table API
Feature | Azure Table Storage | Azure Cosmos DB: Table storage (preview)
Latency | Fast, but no upper bounds on latency | Single-digit millisecond latency for reads and writes, backed with <10-ms latency reads and <15-ms latency writes at the 99th percentile, at any scale, anywhere in the world
Throughput | Variable throughput model. Tables have a scalability limit of 20,000 operations/s | Highly scalable with dedicated reserved throughput per table, backed by SLAs. Accounts have no upper limit on throughput and support >10 million operations/s per table
Global Distribution | Single region with one optional readable secondary region for HA. You cannot initiate failover | Turn-key global distribution from one to 30+ regions. Support for automatic and manual failovers at any time, anywhere in the world
Indexing | Only a primary index on PartitionKey and RowKey. No secondary indexes | Automatic and complete indexing on all properties, no index management
Cosmos DB: Table API
Feature | Azure Table Storage | Azure Cosmos DB: Table storage (preview)
Query | Query execution uses the index for the primary key, and scans otherwise. | Queries can take advantage of automatic indexing on properties for fast query times. Azure Cosmos DB's database engine is capable of supporting aggregates, geo-spatial, and sorting.
Consistency | Strong within the primary region, eventual with the secondary region | Five well-defined consistency levels to trade off availability, latency, throughput, and consistency based on your application needs
Pricing | Storage-optimized | Throughput-optimized
SLAs | 99.9% availability | 99.99% availability within a single region, and ability to add more regions for higher availability. Industry-leading comprehensive SLAs on general availability
Cosmos DB and MongoDB
Cosmos DB: API for MongoDB
Cosmos DB Change Feed
Common Change Feed Scenarios
Common Scenarios
Event Sourcing (Microservices)
1. Event-driven design with Azure Functions
Azure Functions (E-Commerce Checkout API)
Azure Cosmos DB (Order Event Store)
Azure Functions (Microservice 1: Tax)
Azure Functions (Microservice 2: Payment)
Azure Functions (Microservice N: Fulfillment)
. . .
2. Real-time data movement
Data Movement / Backup
…
3. Materialized View
SubscriptionID | UserID | Create Date | …
123abc | Ben6 | 6/17/17
456efg | Ben6 | 3/14/17
789hij | Jen4 | 8/1/16
012klm | Joe3 | 3/4/17

UserID | Total Subscriptions
Ben6 | 2
Jen4 | 1
Joe3 | 1
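The materialized view above can be maintained by folding each batch of changed documents into a running count per user. The sketch below is plain Python with no Cosmos DB SDK calls; apply_changes is a hypothetical handler, and it assumes every change is a newly created subscription (a real change feed also delivers updates, which would need de-duplication).

```python
from collections import Counter

# Source documents, as they might arrive from the change feed.
subscriptions = [
    {"SubscriptionID": "123abc", "UserID": "Ben6", "CreateDate": "6/17/17"},
    {"SubscriptionID": "456efg", "UserID": "Ben6", "CreateDate": "3/14/17"},
    {"SubscriptionID": "789hij", "UserID": "Jen4", "CreateDate": "8/1/16"},
    {"SubscriptionID": "012klm", "UserID": "Joe3", "CreateDate": "3/4/17"},
]


def apply_changes(view, changed_documents):
    """Fold a batch of new subscription documents into the per-user totals."""
    for doc in changed_documents:
        view[doc["UserID"]] += 1
    return view


view = apply_changes(Counter(), subscriptions)
for user_id, total in view.items():
    print(user_id, total)
```

Processing the four sample rows yields the totals shown in the second table.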
Three different ways to use the Change Feed
Implementation | Use Case | Advantages
Azure Functions | Serverless applications | Easy to implement. Used as a trigger, input or output binding to an Azure Function.
Change Feed Processor Library | Distributed applications | Ability to distribute the processing of events across multiple clients. Requires a “leases collection”.
SQL API SDK for .NET or Java | Not recommended | Requires manual implementation in a .NET or Java application.
Lease collection
Dealing with Azure Cosmos DB
Cosmos DB Performance
Cosmos DB Performance: SQL API
• Use direct connection mode for better performance
Cosmos DB Performance: SQL API
DocumentClient client = new DocumentClient(
    serviceEndpoint, authKey,
    new ConnectionPolicy
    {
        // Direct mode over TCP avoids the extra gateway hop.
        ConnectionMode = ConnectionMode.Direct,
        ConnectionProtocol = Protocol.Tcp
    });
Cosmos DB Performance: SQL API
Document document = await
client.ReadDocumentAsync("/dbs/1234/colls/1234354/docs/2332435465");
Cosmos DB Regional Failover
Multi-master at global scale with Azure Cosmos DB
Auto Scaling Containers
Request Units and Billing
Billing Model
Two components: Consumed Storage + Provisioned Throughput
You are billed on consumed storage and provisioned throughput
Containers in a database can share throughput
Unit | Price (for most Azure regions)
SSD Storage (per GB) | $0.25 per month
Provisioned Throughput (single region writes) | $0.008/hour per 100 RU/s
Provisioned Throughput (multi-region writes) | $0.016/hour per 100 multi-region write RU/s
* pricing may vary by region; for up-to-date pricing, see: https://azure.microsoft.com/pricing/details/cosmos-
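The two billing components can be combined into a rough monthly estimate. The sketch below uses the single-region prices quoted on this slide; the 730-hour month, the function name, and the constant names are assumptions, and actual prices vary by region and over time.

```python
STORAGE_PRICE_PER_GB_MONTH = 0.25     # from the slide
PRICE_PER_100_RU_HOUR = 0.008         # single region writes, from the slide
HOURS_PER_MONTH = 730                 # assumption: average month length


def monthly_cost(storage_gb: float, provisioned_ru_per_second: float,
                 hours: float = HOURS_PER_MONTH) -> float:
    """Estimate the monthly bill: consumed storage plus provisioned throughput."""
    storage = storage_gb * STORAGE_PRICE_PER_GB_MONTH
    throughput = provisioned_ru_per_second / 100 * PRICE_PER_100_RU_HOUR * hours
    return storage + throughput


# 10 GB of data with 1000 RU/s provisioned for the whole month.
print(f"${monthly_cost(10, 1000):.2f}")
```

For 10 GB and 1000 RU/s this comes to about $60.90: $2.50 for storage and $58.40 for throughput. Note that throughput is billed as provisioned, whether or not it is consumed.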
Billing Model
Automatically configuring provisioned throughput with Autopilot Preview
You are billed on consumed storage and provisioned throughput
Containers in a database can share throughput
Autopilot Throughput – Unit (100 RU/s per hour) | Price
100 Autopilot RU/s, single-region account | $0.012/hour
100 Autopilot RU/s, multi-region, single master account with N regions | N regions x $0.012/hour, where N > 1
100 RU/s multi-region, multi-master account with N regions | N regions x $0.016/hour, where N > 1
* pricing may vary by region; for up-to-date pricing, see: https://azure.microsoft.com/pricing/details/cosmos-
Billing Model
Reserved capacity for provisioned throughput
Price per 100 RU/s (savings over PAYG):
Throughput | 1-year reservation, single region write | 1-year reservation, multiple region write | 3-year reservation, single region write | 3-year reservation, multiple region write
First 50K RU/s | $0.0068 (~15%) | $0.0128 (~20%) | $0.006 (~25%) | $0.0112 (~30%)
Next 450K RU/s | $0.006 (~25%) | $0.0112 (~30%) | $0.0052 (~35%) | $0.0096 (~40%)
Next 2.5M RU/s | $0.0056 (~30%) | $0.0104 (~35%) | $0.0044 (~45%) | $0.008 (~50%)
Over 3M RU/s | $0.0044 (~45%) | $0.008 (~50%) | $0.0032 (~60%) | $0.0056 (~65%)
* pricing may vary by region; for up-to-date pricing, see: https://azure.microsoft.com/pricing/details/cosmos-
Billing Model
Free Cosmos DB Tier
Azure Cosmos DB Free Tier. Develop and test applications, or run small production workloads free within the Azure environment.
Get Started: Enable Free Tier on a new account to receive 400 RU/s throughput and 5 GB storage free each month for the life of your account.
* pricing may vary by region; for up-to-date pricing, see: https://azure.microsoft.com/pricing/details/cosmos-
Demos
Questions?
Thank you!
