MongoDB Europe 2016
Old Billingsgate, London
15th November
Use my code JD20 for 20% off tickets
mongodb.com/europe
Back to Basics 2016 : Webinar 4
Advanced Indexing –
Text and Geospatial Indexes
Joe Drumgoole
Director of Developer Advocacy, EMEA
@jdrumgoole
V1.1
Recap
• Webinar 1 – Introduction to NoSQL
– The different types of NoSQL databases
– What kind of database is MongoDB? A document database.
• Webinar 2 – My First Application
– Creating databases and collections
– CRUD operations
– Indexes and Explain
• Webinar 3 – Schema Design
– Dynamic schema
– Embedding approaches
– Examples
Indexing
• An efficient way to look up data by its value
• Avoids table scans
Traditional Databases Use Btrees
• … and so does MongoDB
Queries, Inserts, Deletes: O(log n) Time
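As a rough illustration of why a logarithmic lookup matters, the sketch below compares a binary search over sorted keys with a linear scan. Binary search stands in for the descent through a B-tree here, which is a simplification; the function names and the one-million-key array are made up for this example.

```javascript
// Count how many comparisons a binary search needs on a sorted array.
// This stands in for walking down a B-tree (an assumption, not MongoDB code).
function binarySearchSteps(sorted, target) {
  let lo = 0;
  let hi = sorted.length - 1;
  let steps = 0;
  while (lo <= hi) {
    steps += 1;
    const mid = Math.floor((lo + hi) / 2);
    if (sorted[mid] === target) return { index: mid, steps };
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return { index: -1, steps };
}

// Linear scan: the analogue of reading every document in a collection.
function linearScanSteps(values, target) {
  let steps = 0;
  for (let i = 0; i < values.length; i++) {
    steps += 1;
    if (values[i] === target) return { index: i, steps };
  }
  return { index: -1, steps };
}

// One million sorted keys (hypothetical data).
const keys = Array.from({ length: 1000000 }, (_, i) => i);
const viaSearch = binarySearchSteps(keys, 999999);
const viaScan = linearScanSteps(keys, 999999);
console.log("binary search steps:", viaSearch.steps);
console.log("linear scan steps:", viaScan.steps);
```

The search needs at most about log2(n) comparisons, while the scan touches every key before it finds the last one.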
Creating a Simple Index
db.coll.createIndex( { fieldName : <Direction> } )
• db : the current database
• coll : the collection name
• createIndex : the command
• fieldName : the field to be indexed
• <Direction> : 1 for ascending, -1 for descending
Two Other Kinds of Indexes
• Full Text Index
– Allows searching inside the text of a field (as Lucene, Solr and Elasticsearch do)
• Geospatial Index
– Allows searching by location (e.g. people near me)
• These indexes do not use Btrees
Full Text Indexes
• An "inverted index" on all the words inside the indexed field or fields (only one text index per collection)
{ "comment" : "I think your blog post is very interesting and informative. I hope you will post more info like this in the future" }
>> db.posts.createIndex( { "comment" : "text" } )
MongoDB Enterprise > db.posts.find( { $text: { $search : "info" }} )
{ "_id" : ObjectId("…"), "comment" : "I think your blog post is very interesting and informative. I hope you will post more info like this in the future" }
MongoDB Enterprise >
Results
MongoDB Enterprise > db.posts.getIndexes()
...
{
    "v" : 1,
    "key" : {
        "_fts" : "text",
        "_ftsx" : 1
    },
    "name" : "comment_text",
    "ns" : "test.posts",
    "weights" : {
        "comment" : 1
    },
    "default_language" : "english",
    "language_override" : "language",
    "textIndexVersion" : 3
}
Dropping Text Indexes
• We drop text indexes by name rather than by shape
db.posts.getIndexes()
{
    "v" : 1,
    "key" : {
        "_fts" : "text",
        "_ftsx" : 1
    },
    "name" : "comment_text_tags_text",
    "ns" : "test.posts",
    "weights" : {
        "comment" : 5,
        "tags" : 10
    },
    "default_language" : "english",
    "language_override" : "language",
    "textIndexVersion" : 3
}
Hence
MongoDB Enterprise > db.posts.dropIndex( "comment_text_tags_text" )
{ "nIndexesWas" : 2, "ok" : 1 }
MongoDB Enterprise >
• You can give an index an explicit name to make this easier
MongoDB Enterprise > db.posts.createIndex( { "comment" : "text", "tags" : "text" }, { "name" : "text_index" } )
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}
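When no explicit name is given, the generated name follows the pattern visible in the outputs above: each indexed field and its index type, joined with underscores. The helper below is a small sketch of that convention; `defaultIndexName` is a hypothetical function written for this example, not part of the shell.

```javascript
// Derive an index name from its key specification by joining each
// field name and its direction or type with underscores.
// Hypothetical helper that mirrors the names shown in getIndexes().
function defaultIndexName(keySpec) {
  return Object.entries(keySpec)
    .map(([field, kind]) => `${field}_${kind}`)
    .join("_");
}

console.log(defaultIndexName({ comment: "text" }));
console.log(defaultIndexName({ comment: "text", tags: "text" }));
console.log(defaultIndexName({ loc: "2dsphere" }));
```

Knowing the pattern helps when an index has to be dropped by name and was created without one.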
On The Server
I INDEX [conn275] build index on: test.posts properties: { v: 1, key:
{ _fts: "text", _ftsx: 1 }, name: "comment_text", ns: "test.posts",
weights: { comment: 1 }, default_language: "english",
language_override: "language", textIndexVersion: 3 }}
I INDEX [conn275] building index using bulk method
I INDEX [conn275] build index done. scanned 3 total records. 0 secs
More Detailed Example
>> db.posts.insert( { "comment" : "Red yellow orange green" } )
>> db.posts.insert( { "comment" : "Pink purple blue" } )
>> db.posts.insert( { "comment" : "Red Pink" } )
>> db.posts.find( { "$text" : { "$search" : "Red" }} )
{ "_id" : ObjectId("…"), "comment" : "Red yellow orange green" }
{ "_id" : ObjectId("…"), "comment" : "Red Pink" }
>> db.posts.find( { "$text" : { "$search" : "Red Green" }} )
{ "_id" : ObjectId("…"), "comment" : "Red Pink" }
{ "_id" : ObjectId("…"), "comment" : "Red yellow orange green" }
>> db.posts.find( { "$text" : { "$search" : "red" }} ) // <- case insensitive
{ "_id" : ObjectId("…"), "comment" : "Red yellow orange green" }
{ "_id" : ObjectId("…"), "comment" : "Red Pink" }
>>
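The behaviour above (case-insensitive matching, and a multi-word search that returns documents containing any of the words) can be sketched with a tiny in-memory inverted index. This is an illustration only: it skips stemming and stop words, and the function names and numeric `_id` values are assumptions made for the example.

```javascript
// Split text into lower-cased word tokens (no stemming or stop words here).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Build a map from each word to the set of document ids that contain it.
function buildInvertedIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    for (const word of tokenize(doc[field] || "")) {
      if (!index.has(word)) index.set(word, new Set());
      index.get(word).add(doc._id);
    }
  }
  return index;
}

// Return ids of documents containing any of the search terms (logical OR).
function search(index, query) {
  const hits = new Set();
  for (const term of tokenize(query)) {
    for (const id of index.get(term) || []) hits.add(id);
  }
  return [...hits].sort((a, b) => a - b);
}

// The three comments from the example, with made-up numeric ids.
const posts = [
  { _id: 1, comment: "Red yellow orange green" },
  { _id: 2, comment: "Pink purple blue" },
  { _id: 3, comment: "Red Pink" },
];
const commentIndex = buildInvertedIndex(posts, "comment");
console.log(search(commentIndex, "red"));
console.log(search(commentIndex, "Red Green"));
```

Looking a word up in the map is what replaces scanning every comment for the search term.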
Using Weights
• We can assign different weights to different fields in the text index
• E.g. I want to favour tags over comments in searching
• So I increase the weight for the tags field
>> db.posts.createIndex( { comment: "text",
                           tags : "text" },
                         { weights: { comment: 5,
                                      tags : 10 }} )
• Now searches will favour tags
textScore
• Weights impact the textScore:
>> db.posts.find( { "$text" : { "$search" : "Red" }},
                  { score: { $meta: "textScore" }} ).sort( { score: { $meta: "textScore" } } )
{ "_id" : …, "comment" : "hello", "tags" : "Red green orange", "score" : 6.666666666666666 }
{ "_id" : …, "comment" : "Red Pink", "score" : 3.75 }
{ "_id" : …, "comment" : "Red yellow orange green", "score" : 3.125 }
>>
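To see how field weights can change a ranking, here is a deliberately simplified scoring sketch. It is not MongoDB's textScore formula and will not reproduce the scores above; it only shows that a match in a heavily weighted field can outrank a match in a lightly weighted one. The function `toyScore` and its formula are assumptions made for illustration.

```javascript
// Split text into lower-cased word tokens; missing fields yield no tokens.
function tokenize(text) {
  return (text || "").toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Toy score: for each weighted field, weight * (matching words / total words).
// NOT MongoDB's textScore formula, only an illustration of weighting.
function toyScore(doc, term, weights) {
  const wanted = term.toLowerCase();
  let score = 0;
  for (const [field, weight] of Object.entries(weights)) {
    const words = tokenize(doc[field]);
    if (words.length === 0) continue;
    const matches = words.filter((w) => w === wanted).length;
    score += weight * (matches / words.length);
  }
  return score;
}

// Same weights and documents as the example above.
const weights = { comment: 5, tags: 10 };
const docs = [
  { comment: "Red yellow orange green" },
  { comment: "hello", tags: "Red green orange" },
  { comment: "Red Pink" },
];

// Rank documents from highest to lowest toy score.
const ranked = docs
  .map((doc) => ({ doc, score: toyScore(doc, "Red", weights) }))
  .sort((a, b) => b.score - a.score);
for (const { doc, score } of ranked) console.log(score.toFixed(3), JSON.stringify(doc));
```

Even with this crude formula, the document that matches in tags ranks first, followed by the shorter comment and then the longer one.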
Other Parameters
• Language : pick the language you want to search in, e.g.
– $language : "spanish"
• Case-sensitive searching
– $caseSensitive : true (default false)
• Diacritic-sensitive searching (e.g. café is distinguished from cafe)
– $diacriticSensitive : true (default false)
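The effect of the two sensitivity flags can be sketched in plain JavaScript by normalising terms before comparing them. This is only an approximation of the idea, not the server's implementation; `termsMatch` and `stripDiacritics` are hypothetical helpers, and the option names are borrowed from the parameters above.

```javascript
// Remove combining accent marks after Unicode decomposition (NFD).
function stripDiacritics(text) {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Compare two terms under the given sensitivity options.
// Both options default to false, matching the defaults listed above.
function termsMatch(a, b, { caseSensitive = false, diacriticSensitive = false } = {}) {
  let left = a;
  let right = b;
  if (!caseSensitive) {
    left = left.toLowerCase();
    right = right.toLowerCase();
  }
  if (!diacriticSensitive) {
    left = stripDiacritics(left);
    right = stripDiacritics(right);
  }
  return left === right;
}

console.log(termsMatch("caf\u00e9", "Cafe"));
console.log(termsMatch("caf\u00e9", "cafe", { diacriticSensitive: true }));
console.log(termsMatch("Cafe", "cafe", { caseSensitive: true }));
```

With both flags left at their defaults the terms are treated as equal; switching either flag on makes the comparison stricter.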
Geospatial Indexes
• MongoDB supports 2dsphere indexes
• They let a user represent locations on the earth, modelled as a sphere
• Coordinates are stored in GeoJSON format
• The geospatial index supports a subset of the GeoJSON operations
• The index is based on a QuadTree representation
• The index uses the WGS 84 reference system
Coordinates
• Coordinates are represented as longitude, latitude
• Longitude
– Measured from the Greenwich meridian in London (0 degrees); locations east are positive (up to 180 degrees)
– Locations west are specified as negative (down to -180 degrees)
• Latitude
– Measured from the equator north and south (0 to 90 north, 0 to -90 south)
• Coordinates in MongoDB are stored in longitude/latitude order
• Coordinates in Google Maps are given in latitude/longitude order
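Because the two orders are easy to mix up, a small conversion helper can make the swap explicit. The sketch below takes a latitude/longitude pair, as typically copied from a map UI, and returns a GeoJSON Point in longitude/latitude order. `toGeoJsonPoint` is a hypothetical helper; note that its range checks cannot catch every swapped pair, since many values are valid in both positions.

```javascript
// Build a GeoJSON Point from a latitude/longitude pair (map UI order).
// GeoJSON, and therefore MongoDB, expects [longitude, latitude].
function toGeoJsonPoint(latitude, longitude) {
  if (!Number.isFinite(latitude) || latitude < -90 || latitude > 90) {
    throw new RangeError(`latitude out of range: ${latitude}`);
  }
  if (!Number.isFinite(longitude) || longitude < -180 || longitude > 180) {
    throw new RangeError(`longitude out of range: ${longitude}`);
  }
  return { type: "Point", coordinates: [longitude, latitude] };
}

// A point in New York given as latitude, longitude.
const mapPoint = toGeoJsonPoint(40.579505, -73.98241999999999);
console.log(JSON.stringify(mapPoint));
```

Going the other way, from a stored document to a map search box, means reversing the two numbers again.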
2dsphere Versions
• Three versions of the 2dsphere index in MongoDB
• Version 1 : Up to MongoDB 2.4
• Version 2 : From MongoDB 2.6 onwards
• Version 3 : From MongoDB 3.2 onwards
• We will only be talking about Version 3 in this webinar
Creating a 2dsphere Index
db.collection.createIndex( { <location field> : "2dsphere" } )
• The location field must hold GeoJSON objects or legacy coordinate pairs
Example
>> db.test.createIndex( { loc : "2dsphere" } )
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}
Output
>> db.test.getIndexes()
[
    {
        "v" : 1,
        "key" : {
            "loc" : "2dsphere"
        },
        "name" : "loc_2dsphere",
        "ns" : "geo.test",
        "2dsphereIndexVersion" : 3
    }
]
>>
Use a Simple Dataset to investigate Geo Queries
• Let's search for restaurants in Manhattan
• Using two candidate collections
– https://raw.githubusercontent.com/mongodb/docs-assets/geospatial/neighborhoods.json
– https://raw.githubusercontent.com/mongodb/docs-assets/geospatial/restaurants.json
• Import them into MongoDB
– mongoimport -c neighborhoods -d geo neighborhoods.json
– mongoimport -c restaurants -d geo restaurants.json
Neighborhood Document
MongoDB Enterprise > db.neighborhoods.findOne()
{
    "_id" : ObjectId("55cb9c666c522cafdb053a1a"),
    "geometry" : {
        "coordinates" : [
            [
                [
                    -73.94193078816193,
                    40.70072523469547
                ],
                ...
                [
                    -73.94409591260093,
                    40.69897295461309
                ]
            ]
        ],
        "type" : "Polygon"
    },
    "name" : "Bedford"
}
Restaurant Document
MongoDB Enterprise > db.restaurants.findOne()
{
    "_id" : ObjectId("55cba2476c522cafdb053adf"),
    "location" : {
        "coordinates" : [
            -73.98241999999999,
            40.579505
        ],
        "type" : "Point"
    },
    "name" : "Riviera Caterer"
}
MongoDB Enterprise >
• You can type this into Google Maps, but remember to reverse the coordinate order
Add Indexes
MongoDB Enterprise > db.restaurants.createIndex({ location: "2dsphere" })
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}
MongoDB Enterprise > db.neighborhoods.createIndex({ geometry: "2dsphere" })
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}
MongoDB Enterprise >
Use $geoIntersects to find our Neighborhood
• Assume we are at -73.93414657, 40.82302903
• What neighborhood are we in? Use $geoIntersects
db.neighborhoods.findOne({
    geometry: {
        $geoIntersects: {
            $geometry: {
                type: "Point",
                coordinates: [ -73.93414657, 40.82302903 ]
            }
        }
    }
})
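Conceptually, the query asks which polygon contains our point. A minimal sketch of that test is the classic ray-casting algorithm shown below. It treats coordinates as flat x/y values, which is an approximation (MongoDB works with spherical geometry), and it checks only the outer ring. The two box-shaped neighborhoods are hypothetical stand-ins for the real polygons.

```javascript
// Ray casting: cast a horizontal ray from the point and toggle "inside"
// each time it crosses a polygon edge. Planar approximation only.
function pointInRing(point, ring) {
  const [x, y] = point;
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses =
      (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Return the first neighborhood whose outer ring contains the point, or null.
function findNeighborhood(point, neighborhoods) {
  return (
    neighborhoods.find((n) => pointInRing(point, n.geometry.coordinates[0])) || null
  );
}

// Hypothetical rectangular neighborhoods in [longitude, latitude] order.
const sampleNeighborhoods = [
  {
    name: "Box A (hypothetical)",
    geometry: {
      type: "Polygon",
      coordinates: [[[-73.94, 40.81], [-73.92, 40.81], [-73.92, 40.83], [-73.94, 40.83], [-73.94, 40.81]]],
    },
  },
  {
    name: "Box B (hypothetical)",
    geometry: {
      type: "Polygon",
      coordinates: [[[-73.99, 40.57], [-73.97, 40.57], [-73.97, 40.59], [-73.99, 40.59], [-73.99, 40.57]]],
    },
  },
];

const found = findNeighborhood([-73.93414657, 40.82302903], sampleNeighborhoods);
console.log(found ? found.name : "no match");
```

A 2dsphere index spares the server from running a test like this against every polygon in the collection.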
Results
{
    "geometry" : {
        "coordinates" : [
            [
                [
                    -73.9338307684026,
                    40.81959665747723
                ],
                ...
                [
                    -73.93383000695911,
                    40.81949109558767
                ]
            ]
        ],
        "type" : "Polygon"
    },
    "name" : "Central Harlem North-Polo Grounds"
}
Find All Restaurants within 0.35 km
db.restaurants.find({ location:
    { $geoWithin: { $centerSphere:
        [ [ -73.93414657, 40.82302903 ], 0.35 / 6378.1 ] }
    } })
• 0.35 is the distance in km
• Dividing by the radius of the earth (6378.1 km) converts it to radians
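The radians conversion, and a quick sanity check of one result, can be sketched as follows. The haversine formula used here is a standard great-circle approximation and an assumption of this example, not necessarily what the server computes; the second point is the Domino's Pizza location from the results.

```javascript
// Earth radius in km, the same constant used in the query.
const EARTH_RADIUS_KM = 6378.1;

// Convert a distance on the earth's surface to radians for $centerSphere.
function kmToRadians(km) {
  return km / EARTH_RADIUS_KM;
}

function toRad(degrees) {
  return (degrees * Math.PI) / 180;
}

// Great-circle distance in km between two [longitude, latitude] pairs (haversine).
function haversineKm([lon1, lat1], [lon2, lat2]) {
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

const origin = [-73.93414657, 40.82302903];
const dominos = [-73.93795159999999, 40.823376];

const radiusRadians = kmToRadians(0.35);
const distanceKm = haversineKm(origin, dominos);
console.log("search radius in radians:", radiusRadians);
console.log("distance to Domino's in km:", distanceKm.toFixed(3));
console.log("within 0.35 km:", distanceKm <= 0.35);
```

The restaurant lies roughly a third of a kilometre from the starting point, which is consistent with a 0.35 km search radius.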
Results – (Projected)
{ "name" : "Gotham Stadium Tennis Center Cafe" }
{ "name" : "Chuck E. Cheese'S" }
{ "name" : "Red Star Chinese Restaurant" }
{ "name" : "Tia Melli'S Latin Kitchen" }
{ "name" : "Domino'S Pizza" }
• Without projection
{ "_id" : ObjectId("55cba2476c522cafdb0550aa"),
"location" : { "coordinates" : [ -73.93795159999999, 40.823376 ],
"type" : "Point" },
"name" : "Domino'S Pizza" }
Summary of Operators
• $geoIntersects: Find areas or points that overlap or are adjacent
• $geoWithin: Find areas or points that lie within a specific area
• $geoNear: Returns locations in order from nearest to furthest away
Summary
• Text Indexes : Full text searching of all the text items in a
collection
• Geospatial Indexes : Search by location, by intersection or by
distance from a point
Q & A



Editor's Notes

  • #3: Who I am, how long have I been at MongoDB.
  • #6: Each item in a Btree node points to a sub-tree containing elements below its key value. Insertions require a read before a write. Writes that split nodes are expensive.
  • #7: Effectively the depth of the tree.
  • #22: Production release numbering.
  • #27: Visit Map to show location.
  • #28: Show Riviera on Google Maps.