SH 2 - SES 3 - MongoDB Aggregation Framework.pptx

Understanding
Aggregation Framework
@ MongoDB 3.6
Matan Zohar
Disruptive Technologies Leader – Matrix

2
Why Aggregation Framework?
SELECT cust_id,ord_date,
SUM(price) AS total
FROM orders
GROUP BY cust_id,ord_date
HAVING total > 250

3
Our Standard Database Tools
GROUP BY HAVING
JOIN
WHERE
AVG
MIN MAX
SUM
COUNTSELECT
ORDER BY

4
We deserve Tools for Documents
$group
$abs$lookup $match $avg
$min
$max db.collection.aggregate()
$sum
db.collection.find() $sort
$mergeObjects
$count
$bucket
$limit
$project
$sample
$skip
$unwind
$eq
$divide
$cond
$exp
$concat
$log
$map
$reduce
$split $substr
$size $cmp $dateFromString $filter

5
What is the big deal?
ARTICLE
Name
Publish date
URL
Text
CATEGORY
Name
URL
TAG
Name
URL
COMMENT
Text
Date
Author
USER
Name
Email
ARTICLE
Name
Publish date
URL
Text
USER
Name
Email
COMMENT []
Text
Date
Author
TAG []
Name
URL
CATEGORY []
Name
URL
Relational Model Document Model

6
What is Aggregation Framework?
• Processing pipeline of stages, transforming the document into an aggregated result.
• Query process optimization, designed for a sharded cluster.

7
Get to know the Tools (Stages)
• $project – Change the document structure, select fields, remove fields, add newly
computed fields (SELECT).
• $lookup – Join two collections at a stage by an expression (JOIN / SUB QUERY)
• $match – Filters the documents according to a condition (WHERE / HAVING)
• $group – Groups the documents by an expression (GROUP BY)
• $sort – Sorts all input documents by selected fields and order (ORDER BY)
• $unwind – Flatten a hierarchal document (array of documents)
• $count – Counts the number of documents in the current stage
• $bucket – Categorizes groups of documents according to an expression.

8
How does it actually look?
SELECT cust_id, ord_date,
SUM(price) AS total
FROM orders
GROUP BY cust_id, ord_date
HAVING total > 250
db.orders.aggregate( [
{ $group: {
_id: {
cust_id: "$cust_id",
ord_date: {
month: { $month: "$ord_date" },
day: { $dayOfMonth: "$ord_date" },
year: { $year: "$ord_date"}
}
},
total: { $sum: "$price" }
}
},
{ $match: { total: { $gt: 250 } } }
] )

9
MongoDB 3.6 – What’s new ?
• $lookup – More expressive, now supports non equi-joins, subqueries!
That means improved performance for Analytics & BI Tools as more ops are pushed down to the database.
• $expr – You can use aggregation expressions within the query language!
Simpler code, can be used from db.collection.find(), but use with caution, does net yet fully leverages indexes.
Whenever in doubt use db.collection.aggregate().
• New Aggregation Operations:
– $arrayToObject
– $objectToArray
– $mergeObjects
– $dateFromString
– $dateFromParts
– $dateToParts
• $$REMOVE – New aggregation variable, allows for the conditional exclusion of a field.
• hint, comment – New options for aggregate command and the db.collection.aggregate() method.
• hint – An index to use for the aggregation (on the initial collection).
• comment – A string to help trace the operation in the database profiler, currentOp, and logs.

10
• Filter as soon as possible.
• Use indexes to improve sorts, matches, lookups.
• If performance is less than expected use explain to analyze the plan.
• Use hint when there is a better index that is not being used.
• Use projection to filter the subset of fields you need at the beginning,
the pipeline will take it in consideration and pass less data between stages.
Optimizing Aggregation Pipelines

21
Aggregation Pipeline & SQL
db.restaurants.aggregate([
{ $match: { name: "Riviera Caterer" }},
{ $project: { name: 1, cuisine: 1, borough: 1, "grades.score": 1 }},
{ $unwind: "$grades" }
])

SH 2 - SES 3 - MongoDB Aggregation Framework.pptx

More Related Content

What's hot (20)

Similar to SH 2 - SES 3 - MongoDB Aggregation Framework.pptx (20)

More from MongoDB (20)

SH 2 - SES 3 - MongoDB Aggregation Framework.pptx