© 2016 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.
Oak Lucene Indexes
Chetan Mehrotra | Senior Computer Scientist
Alex Parvulescu | Senior Developer
Content
 Lucene Index Definitions
 Anatomy of a Query (Restrictions, Sorting, Aggregation)
 Query Diagnostics and Troubleshooting
 Lucene Index Internals (Oak Directory, JMX, Luke)
 Asynchronous Indexing
 Q&A
Lucene Index Definition
Index Definition
 Stored under oak:index node
 Defines how content gets indexed
 Node type: oak:QueryIndexDefinition
 Required properties
 compatVersion = 2
 type = “lucene”
 async = “async”
/oak:index/assetType (oak:QueryIndexDefinition)
  - compatVersion = 2
  - type = "lucene"
  - async = "async"
  + indexRules (nt:unstructured)
    + dam:Asset
      + properties (nt:unstructured)
        + assetType
          - propertyIndex = true
          - name = "jcr:content/metadata/type"
Index Definition – Index Rules
 Define which node types and properties are indexed
 Rules are defined per nodeType
 A rule consists of one or more property definitions
 Index is selected based on a match between the type used in the query and the presence of an indexRule for that type
 Multiple indexRules are allowed in the same index
 Order is important – nodeType matching honors inheritance
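The interplay of rule order and inheritance can be illustrated with a toy model. This is a sketch, not Oak's implementation: it assumes that the first rule in declared order whose node type equals the node's type or one of its supertypes is the one applied. The SUPERTYPES table and the function names are hypothetical stand-ins for the repository's node type registry.

```python
# Toy supertype table (assumption); a real repository resolves this
# from its registered node types.
SUPERTYPES = {
    "dam:Asset": ["nt:hierarchyNode"],
    "nt:hierarchyNode": ["nt:base"],
    "nt:base": [],
}


def is_type_or_subtype(node_type, rule_type):
    """True if node_type is rule_type or inherits from it."""
    if node_type == rule_type:
        return True
    return any(is_type_or_subtype(parent, rule_type)
               for parent in SUPERTYPES.get(node_type, []))


def select_rule(node_type, rule_order):
    """Return the first rule, in declared order, that applies to node_type."""
    for rule_type in rule_order:
        if is_type_or_subtype(node_type, rule_type):
            return rule_type
    return None


# Specific rule first: dam:Asset nodes get their own rule.
specific_first = select_rule("dam:Asset", ["dam:Asset", "nt:base"])
# Generic rule first: nt:base shadows the dam:Asset rule.
generic_first = select_rule("dam:Asset", ["nt:base", "dam:Asset"])
```

Under this assumption, a generic rule such as nt:base belongs after the more specific rules, otherwise it matches every node first.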
SELECT *
FROM [dam:Asset] AS a
WHERE ISDESCENDANTNODE([/content/en])
AND a.[jcr:content/metadata/type] = 'image'
/oak:index/assetType (oak:QueryIndexDefinition)
  - compatVersion = 2
  - type = "lucene"
  - async = "async"
  + indexRules (nt:unstructured)
    + dam:Asset
      + properties (nt:unstructured)
        + assetType
          - propertyIndex = true
          - name = "jcr:content/metadata/type"
https://jackrabbit.apache.org/oak/docs/query/lucene.html#Indexing_Rules
Index Definition – Property Definitions
 Define how a property gets indexed
 One or more property definitions per indexRule
 A definition is mapped by matching the property name or a regex pattern
 Supports relative properties, addressed by their relative paths
 Order is important (if regexes are used)
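Why order only matters for regex definitions can be shown with a small lookup sketch. It is a toy model under stated assumptions: exact names are tried before patterns, patterns are tried in declared order, and a pattern must match the whole relative path. The isRegexp flag is the one described in the Oak documentation; the node names dcProps and allMetadata and the helper function are hypothetical.

```python
import re

# Property definitions in declared order (illustrative only).
DEFINITIONS = [
    {"node": "status", "name": "jcr:content/metadata/status",
     "propertyIndex": True},
    {"node": "dcProps", "name": r"jcr:content/metadata/dc:.*",
     "isRegexp": True, "propertyIndex": True},
    {"node": "allMetadata", "name": r"jcr:content/metadata/.*",
     "isRegexp": True, "analyzed": True},
]


def find_definition(prop_path, definitions):
    """Return the definition that applies to a relative property path."""
    # Assumption: an exact name wins regardless of its position.
    for definition in definitions:
        if not definition.get("isRegexp") and definition["name"] == prop_path:
            return definition
    # Patterns are checked in declared order; the first full match wins.
    for definition in definitions:
        if definition.get("isRegexp") and re.fullmatch(definition["name"], prop_path):
            return definition
    return None


# Swapping the two patterns lets the broader one shadow the dc: pattern.
SWAPPED = [DEFINITIONS[0], DEFINITIONS[2], DEFINITIONS[1]]
```

With the declared order, jcr:content/metadata/dc:format resolves to dcProps; with the swapped order it resolves to allMetadata.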
SELECT *
FROM [dam:Asset] AS a
WHERE ISDESCENDANTNODE([/content/en])
AND a.[jcr:content/metadata/type] = 'image'
/oak:index/assetType (oak:QueryIndexDefinition)
  - compatVersion = 2
  - type = "lucene"
  - async = "async"
  + indexRules (nt:unstructured)
    + dam:Asset
      + properties (nt:unstructured)
        + assetType
          - propertyIndex = true
          - name = "jcr:content/metadata/type"
https://jackrabbit.apache.org/oak/docs/query/lucene.html#Property_Definitions
Index Definition – Best Practices
 Precise index definition - Index just the right amount of content for your query requirements. A precise index is a happy index!
 Use nodetypes to achieve a “cohesive index” - This allows multiple queries to use the same index, and multiple property restrictions to be evaluated natively in Lucene
 For people familiar with relational databases - The nodetype is your table, and all its direct or relative properties are columns in that table. The property definitions are like indexes on those columns.
https://jackrabbit.apache.org/oak/docs/query/lucene.html#Design_Considerations
Sample Content to Query Against
/content/dam/assets/december/banner.png (dam:Asset)
  + metadata (dam:AssetContent)
    - dc:format = "image/png"
    - status = "published"
    - jcr:lastModified = "2009-10-9T21:52:31"
    - app:tags = ["properties:orientation/landscape", "marketing:interest/product"]
    - size = 450
    - comment = "Image for december launch"
    - jcr:title = "December Banner"
    + xmpMM:History
      + 1
        - softwareAgent = "Adobe Photoshop"
        - author = "David"
  + renditions (nt:folder)
    + original (nt:file)
      + jcr:content
        - jcr:data = ...
Anatomy of a Query
SELECT
*
FROM [dam:Asset] AS a
WHERE ISDESCENDANTNODE([/content/public/platform])
AND a.[jcr:content/metadata/status] = 'published'
AND CONTAINS([jcr:content/metadata/comment], 'december')
ORDER BY
a.[jcr:content/metadata/jcr:lastModified] DESC
• Nodetype restriction on dam:Asset
• Path restriction on /content/public/platform
• Property restriction on jcr:content/metadata/status
• Fulltext property restriction on jcr:content/metadata/comment
• Sorting done on jcr:content/metadata/jcr:lastModified
Nodetype Restrictions
SELECT
*
FROM
[dam:Asset] AS a
WHERE
ISDESCENDANTNODE([/content/public/platform])
AND
a.[jcr:content/metadata/status] = 'published'
AND
CONTAINS([jcr:content/metadata/comment], 'december')
ORDER BY
a.[jcr:content/metadata/jcr:lastModified] DESC
/oak:index/damAsset (oak:QueryIndexDefinition)
  - compatVersion = 2
  - type = "lucene"
  - async = "async"
  + indexRules (nt:unstructured)
    + dam:Asset (nt:unstructured)
      + properties
        ...
Create an index definition node at /oak:index/damAsset with an indexRule for dam:Asset
Path Restriction
SELECT
*
FROM
[dam:Asset] AS a
WHERE
ISDESCENDANTNODE([/content/public/platform])
AND
a.[jcr:content/metadata/status] = 'published'
AND
CONTAINS([jcr:content/metadata/comment], 'december')
ORDER BY
a.[jcr:content/metadata/jcr:lastModified] DESC
Enable evaluatePathRestrictions to index paths, so that the path restriction can be evaluated by the index
Bonus tip – If all indexable content is under /content/public and queries always specify the path restriction, it is better to define the index definition under /content/public/oak:index
/oak:index/damAsset (oak:QueryIndexDefinition)
  - compatVersion = 2
  - type = "lucene"
  - async = "async"
  - evaluatePathRestrictions = true
  + indexRules (nt:unstructured)
    + dam:Asset (nt:unstructured)
      + properties
        ...
Property Restriction
SELECT
*
FROM
[dam:Asset] AS a
WHERE
ISDESCENDANTNODE([/content/public/platform])
AND
a.[jcr:content/metadata/status] = 'published'
AND
CONTAINS([jcr:content/metadata/comment], 'december')
ORDER BY
a.[jcr:content/metadata/jcr:lastModified] DESC
Create a property definition node with propertyIndex enabled and name set to the relative path of the property
/oak:index/damAsset (oak:QueryIndexDefinition)
  - compatVersion = 2
  - type = "lucene"
  - async = "async"
  - evaluatePathRestrictions = true
  + indexRules (nt:unstructured)
    + dam:Asset (nt:unstructured)
      + properties
        + status
          - propertyIndex = true
          - name = "jcr:content/metadata/status"
Fulltext Property Restriction
SELECT
*
FROM
[dam:Asset] AS a
WHERE
ISDESCENDANTNODE([/content/public/platform])
AND
a.[jcr:content/metadata/status] = 'published'
AND
CONTAINS([jcr:content/metadata/comment], 'december')
ORDER BY
a.[jcr:content/metadata/jcr:lastModified] DESC
Create a property definition node with analyzed enabled
/oak:index/damAsset (oak:QueryIndexDefinition)
  - compatVersion = 2
  - type = "lucene"
  - async = "async"
  - evaluatePathRestrictions = true
  + indexRules (nt:unstructured)
    + dam:Asset (nt:unstructured)
      + properties
        + status
          - propertyIndex = true
          - name = "jcr:content/metadata/status"
        + comment
          - name = "jcr:content/metadata/comment"
          - analyzed = true
Sorting
SELECT
*
FROM
[dam:Asset] AS a
WHERE
ISDESCENDANTNODE([/content/public/platform])
AND
a.[jcr:content/metadata/status] = 'published'
AND
CONTAINS([jcr:content/metadata/comment], 'december')
ORDER BY
a.[jcr:content/metadata/jcr:lastModified] DESC
/oak:index/damAsset (oak:QueryIndexDefinition)
  - compatVersion = 2
  - type = "lucene"
  - async = "async"
  - evaluatePathRestrictions = true
  + indexRules (nt:unstructured)
    + dam:Asset (nt:unstructured)
      + properties
        + status
          - propertyIndex = true
          - name = "jcr:content/metadata/status"
        + comment
          - name = "jcr:content/metadata/comment"
          - analyzed = true
        + lastModified
          - name = "jcr:content/metadata/jcr:lastModified"
          - ordered = true
          - type = Date
          - propertyIndex = true
Create a property definition node with ordered enabled and type set to the property type. Also enable propertyIndex if you plan to have restrictions on the property
Fulltext Node Restriction
 Searches for ‘christmas’ in all nodes of type dam:Asset
 The fulltext index for a node is built from terms taken from
 Node properties – Properties with nodeScopeIndex set to true
 Properties of relative nodes defined by Aggregation Rules
 Aggregation Rules
 Define path patterns for selecting the relative nodes
 Are bound to a specific type
 Can be recursive – e.g. the relative path refers to an nt:file and nt:file has its own aggregation rule defined
 For aggregated nodes, all properties whose type is listed in includePropertyTypes are included, unless a property definition sets nodeScopeIndex=false
SELECT * FROM [dam:Asset] WHERE CONTAINS(., 'christmas')
Fulltext - Aggregation
Content
/content/dam/assets/december/banner.png (dam:Asset)
  + metadata (dam:AssetContent)
    - dc:format = "image/png"
    - status = "published"
    - jcr:lastModified = "2009-10-9T21:52:31"
    - app:tags = ["properties:orientation/landscape", "marketing:interest/product"]
    - size = 450
    - comment = "Image for Christmas launch"
    - jcr:title = "December Banner"
    + xmpMM:History
      + 1
        - softwareAgent = "Adobe Photoshop"
        - author = "David"
  + renditions (nt:folder)
    + original (nt:file)
      + jcr:content
        - jcr:data = ...
Aggregation Rules
+ aggregates
  + dam:Asset
    + include0
      - path = "jcr:content"
    + include1
      - path = "jcr:content/metadata"
    + include2
      - path = "jcr:content/metadata/*"
    + include3
      - path = "jcr:content/metadata/*/*"
    + include4
      - path = "jcr:content/renditions"
    + include5
      - path = "jcr:content/renditions/original"
  + nt:file
    + include0
      - path = "jcr:content"
Extracted Terms for Fulltext Index
image/png
Published
properties:orientation/landscape
marketing:interest/product
December Banner
Image for Christmas launch
Adobe Photoshop
David
https://jackrabbit.apache.org/oak/docs/query/lucene.html#Aggregation
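How the include patterns select relative nodes can be sketched with a toy model. It is not Oak's aggregation code: it assumes that '*' stands for exactly one path segment and that only string values count as fulltext terms, and it leaves out the recursive nt:file rule and binary text extraction. The nested dictionary is a hypothetical stand-in for the sample asset, with a jcr:content level assumed from the aggregation paths.

```python
# Toy stand-in for the sample asset (assumption, see lead-in).
ASSET = {
    "jcr:content": {
        "metadata": {
            "dc:format": "image/png",
            "status": "published",
            "app:tags": ["properties:orientation/landscape",
                         "marketing:interest/product"],
            "size": 450,  # not a string, so it is skipped below
            "comment": "Image for Christmas launch",
            "jcr:title": "December Banner",
            "xmpMM:History": {
                "1": {"softwareAgent": "Adobe Photoshop", "author": "David"},
            },
        },
        "renditions": {"original": {"jcr:content": {}}},
    },
}

# Include patterns of the dam:Asset aggregate, as listed above.
INCLUDES = [
    "jcr:content",
    "jcr:content/metadata",
    "jcr:content/metadata/*",
    "jcr:content/metadata/*/*",
    "jcr:content/renditions",
    "jcr:content/renditions/original",
]


def child_nodes(node, prefix=()):
    """Yield (path_segments, node) for every descendant node, depth first."""
    for name, value in node.items():
        if isinstance(value, dict):
            path = prefix + (name,)
            yield path, value
            yield from child_nodes(value, path)


def matches(pattern, path):
    """True if the pattern matches the path; '*' stands for one segment."""
    parts = pattern.split("/")
    return len(parts) == len(path) and all(
        part in ("*", segment) for part, segment in zip(parts, path))


def aggregated_terms(node, includes):
    """Collect string property values of all relative nodes hit by a pattern."""
    terms = []
    for path, child in child_nodes(node):
        if not any(matches(pattern, path) for pattern in includes):
            continue
        for value in child.values():
            if isinstance(value, str):
                terms.append(value)
            elif isinstance(value, list):
                terms.extend(v for v in value if isinstance(v, str))
    return terms


TERMS = aggregated_terms(ASSET, INCLUDES)
```

In this model the numeric size property never reaches the term list, which mirrors the role of includePropertyTypes in restricting what gets aggregated.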
Query Result Size
 Oak Fast Result Size
 By default, NodeIterator.getSize() returns -1 for large results, because estimating the size costs O(n) due to ACL checks
 ACL checks can be relaxed (check the first ‘k’ only). Enable via the system property oak.fastQuerySize
 OSGi config support with next release
 AEM Query Builder and Pagination
 Use the p.guessTotal query parameter to avoid the costly operation of determining the exact result size
 Use progressive pagination
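A paged Query Builder request with p.guessTotal might be assembled as in the sketch below. The /bin/querybuilder.json endpoint, the /content/dam path and the page size of 20 are assumptions for a local instance, and querybuilder_url is a hypothetical helper; p.guessTotal, p.offset and p.limit are Query Builder parameters.

```python
from urllib.parse import urlencode


def querybuilder_url(offset, limit=20, host="http://localhost:4502"):
    """Build the URL for one page of dam:Asset results (illustrative only)."""
    params = {
        "path": "/content/dam",
        "type": "dam:Asset",
        "p.guessTotal": "true",  # avoid computing the exact total
        "p.offset": offset,      # start of the requested page
        "p.limit": limit,        # page size
    }
    return f"{host}/bin/querybuilder.json?{urlencode(params)}"


# Progressive pagination: request the next page only when it is needed.
third_page = querybuilder_url(offset=40)
```

Each page is fetched on demand by advancing p.offset, so the client never needs the exact total up front.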
Other Features
 Composing Analyzer – For configuring Stemming, Synonyms, Stop words etc
 Boost – Improving search relevancy
 Tika Config – Control how and which types of binary files are indexed
 Suggestions
 Spell Check
 Pre-Extracting Text from Binaries – To speed up reindexing for repositories with a large number of binaries containing text
Tools
Query Explain Tool
 Shipped with AEM 6.1
 Tools -> Operations -> Dashboard -> Diagnosis -> Query Performance
 http://localhost:4502/libs/granite/operations/content/diagnosis/tool.html/_granite_queryperformance
 Shows Slow Query, Popular Query and Explain Query
 ACS Tools (more up to date)
 https://adobe-consulting-services.github.io/acs-aem-tools/explain-query.html
Query Explain Tool
• Shows logs from the various indexes consulted
• Shows the actual Lucene query fired
• Path Restriction
+:ancestors:/content/public/platform
• Fulltext Restriction
+full:jcr:content/metadata/comment:december
• Property Restriction
+jcr:content/metadata/status:published
• Ordering
Lucene Index Internals
Lucene Index Internals - Directory
 Lucene Directory is stored in the repository (the source of truth)
 Copy on Read & Copy on Write maintain local copies for faster access (index content to disk location mappings are exposed via JMX)
Lucene Index Internals - JMX
org.apache.jackrabbit.oak: Lucene Index statistics (LuceneIndex)
 Provides a listing of the existing Lucene indexes
 http://localhost:4502/system/console/jmx/org.apache.jackrabbit.oak%3Aname%3DLucene+Index+statistics%2Ctype%3DLuceneIndex
Lucene Index Internals - JMX continued
org.apache.jackrabbit.oak: IndexCopier support statistics (IndexCopierStats)
 Copy on Read and Copy on Write related stats; of interest is the mapping between index content and location on disk
 http://localhost:4502/system/console/jmx/org.apache.jackrabbit.oak%3Aname%3DIndexCopier+support+statistics%2Ctype%3DIndexCopierStats
Lucene Index Internals - JMX continued
org.apache.jackrabbit.oak: TextExtraction statistics (TextExtractionStats)
 Very relevant stats related to how much work is done extracting text from binaries
 http://localhost:4502/system/console/jmx/org.apache.jackrabbit.oak%3Aname%3DTextExtraction+statistics%2Ctype%3DTextExtractionStats
 Make sure you remember this one for our experiment with ‘Luke’
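The console links on the JMX slides follow one pattern: the MBean's ObjectName, URL-encoded, appended to /system/console/jmx/. The sketch below rebuilds them; jmx_url is a hypothetical helper, and the ObjectName strings are obtained by decoding the URLs shown on the slides.

```python
from urllib.parse import quote_plus

CONSOLE = "http://localhost:4502/system/console/jmx/"


def jmx_url(object_name):
    """Return the Felix console URL for an MBean ObjectName."""
    # quote_plus turns ':' '=' ',' into %XX escapes and spaces into '+'.
    return CONSOLE + quote_plus(object_name)


text_extraction = jmx_url(
    "org.apache.jackrabbit.oak:name=TextExtraction statistics,type=TextExtractionStats")
index_copier = jmx_url(
    "org.apache.jackrabbit.oak:name=IndexCopier support statistics,type=IndexCopierStats")
```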
Lucene Index Internals - Luke
Let’s run a small experiment: upload a PDF file to the repository and verify that full-text search works
It works! And now let’s see why...
Lucene Index Internals - Luke
Setting up ‘Luke’ to look at the Lucene index:
1. Why Luke? Luke is a dedicated Lucene index tool; there are no alternatives for viewing index content
2. Identify which index you want to look at
3. Export the index contents
 (easy/online) Look up the Copy on Read mappings in the JMX console and grab a copy of the index
 (harder/possibly offline) Use the oak console to export the index to a specific location
4. Open ‘Luke’ and make sure you pass in the oak-lucene jar as a classpath entry (as documented in the docs)
Lucene Index Internals - Luke
For the given token ‘mongomk’ there are 2 matching Lucene docs, pointing to the PDF file.
Why 2? Because of index-time aggregation: the parent node will inherit the ‘:fulltext’ information from its child node.
Lucene Index Internals - Luke
The default Lucene index defines aggregation for ‘nt:file’ nodes, meaning they inherit all extracted full-text information from their ‘nt:resource’ child nodes.
This means that the following search
/jcr:root//element(*, nt:file)[jcr:contains(., 'mongomk')]
will return a single item:
/granite-gems-lucene/AEM 6 Oak - MongoMK and Queries.pdf
even though the nt:file node itself contains no full-text information
Asynchronous Indexing
Asynchronous Indexing - Overview
 AsyncIndexUpdate class is the glue for all existing index implementations (all logging comes from this place)
 Runs as a background job every 5 seconds; in a cluster it runs on a single cluster node
 Used mainly with full-text indexes (lucene/solr), also for ordered property indexes (deprecated)
 Efficient: processes only content that is new since the last successful cycle, using a fast diff based on checkpoints
 Resilient: in case of error, it will try again on the next cycle (no data loss)
 Status exposed via JMX “IndexStats”
 You can make an index asynchronous by setting the async property on its definition: async="async"
Asynchronous Indexing - Checkpoints
 Checkpoints are a form of read-only tagging of the current state of the repository
 Each checkpoint has an expected lifetime provided at creation time, after which it will be removed, as well as some metadata related to its creation
/checkpoints/a6fe070e-deef-4582-85fb-b96b57ecd1a9
  - created = 1450285984929
  - timestamp = 1536685984929
  + properties
    - creator = "AsyncIndexUpdate"
    - name = "async"
    - thread = "pool-75-thread-4"
  + root // entire repository content
    + libs
    + content
    + apps
    ....
[SegmentMK representation of a checkpoint]
 The link between the async indexing process and a checkpoint is established via the /:async node
 /:async@async property must point to an existing checkpoint, otherwise a full reindex will happen
 /:async@async-LastIndexedTo stores the timestamp up to which the repository was indexed
 /:async@async-temp is the list of checkpoints to be cleaned up after all processing is done
/:async
  - async = "a6fe070e-deef-4582-85fb-b96b57ecd1a9"
  - async-LastIndexedTo = 2015-12-16T18:13:04.929+01:00
  - async-temp = ["6766f0ec-600f-4b8e-95d3-9b4d04f5877e",
                  "a6fe070e-deef-4582-85fb-b96b57ecd1a9"]
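The numbers in the checkpoint and the /:async node can be related with a little arithmetic. The sketch below converts the epoch-millisecond values and models the reindex decision; it assumes that timestamp holds the checkpoint's expiry time (created plus lifetime), and next_cycle is a hypothetical helper rather than Oak code.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
CET = timezone(timedelta(hours=1))


def from_millis(millis, tz=timezone.utc):
    """Convert epoch milliseconds to an aware datetime without float rounding."""
    return (EPOCH + timedelta(milliseconds=millis)).astimezone(tz)


created = from_millis(1450285984929, CET)
# Assumption: 'timestamp' is the expiry time, i.e. created + lifetime.
lifetime = timedelta(milliseconds=1536685984929 - 1450285984929)


def next_cycle(async_ref, existing_checkpoints):
    """Toy decision rule for the /:async@async reference."""
    if async_ref in existing_checkpoints:
        return "diff since checkpoint"
    return "full reindex"


CHECKPOINT = "a6fe070e-deef-4582-85fb-b96b57ecd1a9"
```

Under that assumption, the sample checkpoint was created at exactly the async-LastIndexedTo time shown in the /:async node and carries a lifetime of 1000 days.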
Asynchronous Indexing - JMX
org.apache.jackrabbit.oak: "async" ("IndexStats")
 Start / Done timestamps
 Checkpoints (reference, temp)
 Execution Count & Time, Indexed Nodes Count series
 Errors: failing flag, latest seen error with its timestamp
Useful Links
Oak Lucene Docs
https://jackrabbit.apache.org/oak/docs/query/lucene.html
AEM 6 Oak: MongoMK and Queries Gem session
http://dev.day.com/content/ddc/en/gems/aem-6-oak--mongomk-and-queries.html
AEM Docs on Oak Queries and Indexing
https://docs.adobe.com/docs/en/aem/6-1/deploy/platform/queries-and-indexing.html
https://docs.adobe.com/docs/en/aem/6-1/deploy/best-practices/best-practices-for-queries-and-indexing.html
The Index Manager
https://docs.adobe.com/docs/en/aem/6-1/administer/operations/operations-dashboard.html#The Index Manager
AEM GEMs Session Oak Lucene Indexes

More Related Content

PDF
JCR, Sling or AEM? Which API should I use and when?
PPTX
Real Time search using Spark and Elasticsearch
PDF
Oracle DB를 AWS로 이관하는 방법들 - 서호석 클라우드 사업부/컨설팅팀 이사, 영우디지탈 :: AWS Summit Seoul 2021
PDF
Architecting an Highly Available and Scalable WordPress Site in AWS
PDF
Auto scaling using Amazon Web Services ( AWS )
PDF
마이크로서비스를 위한 AWS 아키텍처 패턴 및 모범 사례 - AWS Summit Seoul 2017
PDF
Hibernate Presentation
PDF
Introduction to Apache Tomcat 7 Presentation
JCR, Sling or AEM? Which API should I use and when?
Real Time search using Spark and Elasticsearch
Oracle DB를 AWS로 이관하는 방법들 - 서호석 클라우드 사업부/컨설팅팀 이사, 영우디지탈 :: AWS Summit Seoul 2021
Architecting an Highly Available and Scalable WordPress Site in AWS
Auto scaling using Amazon Web Services ( AWS )
마이크로서비스를 위한 AWS 아키텍처 패턴 및 모범 사례 - AWS Summit Seoul 2017
Hibernate Presentation
Introduction to Apache Tomcat 7 Presentation

What's hot (20)

PDF
Deploying Spring Boot applications with Docker (east bay cloud meetup dec 2014)
PDF
webservice scaling for newbie
PDF
[2019] PAYCO 쇼핑 마이크로서비스 아키텍처(MSA) 전환기
PPT
Unified Expression Language
PDF
MSMQ - Microsoft Message Queueing
PDF
[2018] MyBatis에서 JPA로
PPTX
Dom based xss
PPS
Java Hibernate Programming with Architecture Diagram and Example
PDF
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
PPTX
AEM and Sling
PDF
[115]쿠팡 서비스 클라우드 마이그레이션 통해 배운것들
PPTX
Zookeeper Tutorial for beginners
PDF
Exception handling
PPTX
It8074 soa-unit i
PDF
Running Kafka as a Native Binary Using GraalVM with Ozan Günalp
PDF
Amazon Aurora Deep Dive (김기완) - AWS DB Day
PDF
카카오 광고 플랫폼 MSA 적용 사례 및 API Gateway와 인증 구현에 대한 소개
PPTX
PPTX
Ask the expert AEM Assets best practices 092016
PPTX
Storage Requirements and Options for Running Spark on Kubernetes
Deploying Spring Boot applications with Docker (east bay cloud meetup dec 2014)
webservice scaling for newbie
[2019] PAYCO 쇼핑 마이크로서비스 아키텍처(MSA) 전환기
Unified Expression Language
MSMQ - Microsoft Message Queueing
[2018] MyBatis에서 JPA로
Dom based xss
Java Hibernate Programming with Architecture Diagram and Example
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
AEM and Sling
[115]쿠팡 서비스 클라우드 마이그레이션 통해 배운것들
Zookeeper Tutorial for beginners
Exception handling
It8074 soa-unit i
Running Kafka as a Native Binary Using GraalVM with Ozan Günalp
Amazon Aurora Deep Dive (김기완) - AWS DB Day
카카오 광고 플랫폼 MSA 적용 사례 및 API Gateway와 인증 구현에 대한 소개
Ask the expert AEM Assets best practices 092016
Storage Requirements and Options for Running Spark on Kubernetes
Ad

Viewers also liked (18)

PPTX
Demystifying Oak Search
PDF
AEM GEMS Session SAML authentication in AEM
PDF
Case Study: Time Warner Cable's Formula for Maximizing Adobe Experience Manager
PPT
IBM WebSphere Commerce Product Overview
PDF
Immerse 2016 Efficient publishing with content fragments
PPTX
IMMERSE'16 Introduction to AEM Tooling
PPTX
AEM GEMS Session Template Editor Sept 14 2016
PPTX
IMMERSE 2016 Cedric Huesler US Keynote
PPTX
IMMERSE 2016 IST Mark Szulc Keynote
PPTX
Adobe Ask the AEM Community Expert Session Oct 2016
PPTX
Introduction to Adobe Experience Manager based e commerce
PDF
IMMERSE 2016 Introducing content fragments
PPTX
Oak, the Architecture of the new Repository
PPTX
AEM & eCommerce integration
PPTX
IMMERSE'16 Intro to Adobe Experience Manager & Adobe Marketing Cloud
PDF
Adobe AEM Commerce with hybris
PDF
AEM (CQ) eCommerce Framework
PPTX
Oak, the architecture of Apache Jackrabbit 3
Demystifying Oak Search
AEM GEMS Session SAML authentication in AEM
Case Study: Time Warner Cable's Formula for Maximizing Adobe Experience Manager
IBM WebSphere Commerce Product Overview
Immerse 2016 Efficient publishing with content fragments
IMMERSE'16 Introduction to AEM Tooling
AEM GEMS Session Template Editor Sept 14 2016
IMMERSE 2016 Cedric Huesler US Keynote
IMMERSE 2016 IST Mark Szulc Keynote
Adobe Ask the AEM Community Expert Session Oct 2016
Introduction to Adobe Experience Manager based e commerce
IMMERSE 2016 Introducing content fragments
Oak, the Architecture of the new Repository
AEM & eCommerce integration
IMMERSE'16 Intro to Adobe Experience Manager & Adobe Marketing Cloud
Adobe AEM Commerce with hybris
AEM (CQ) eCommerce Framework
Oak, the architecture of Apache Jackrabbit 3
Ad

Similar to AEM GEMs Session Oak Lucene Indexes (20)

PDF
Just one-shade-of-openstack
PDF
Declare your infrastructure: InfraKit, LinuxKit and Moby
PDF
Docker Online Meetup: Infrakit update and Q&A
PDF
Terrastore - A document database for developers
PPTX
Your Content, Your Search, Your Decision
PDF
Immutable Deployments with AWS CloudFormation and AWS Lambda
PDF
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
PPTX
iOS Dev Happy Hour Realm - Feb 2021
PDF
MongoDB for Coder Training (Coding Serbia 2013)
PDF
Real-Time Spark: From Interactive Queries to Streaming
PPTX
Lightning fast analytics with Cassandra and Spark
PPTX
Drupal 7 entities & TextbookMadness.com
PDF
Null Bachaav - May 07 Attack Monitoring workshop.
PDF
대용량 데이타 쉽고 빠르게 분석하기 :: 김일호 솔루션즈 아키텍트 :: Gaming on AWS 2016
PDF
Lightning fast analytics with Spark and Cassandra
PDF
Elasticsearch first-steps
KEY
Mongodb intro
KEY
Schema Design with MongoDB
KEY
Managing Social Content with MongoDB
PPTX
Kubernetes Operators With Scala
Just one-shade-of-openstack
Declare your infrastructure: InfraKit, LinuxKit and Moby
Docker Online Meetup: Infrakit update and Q&A
Terrastore - A document database for developers
Your Content, Your Search, Your Decision
Immutable Deployments with AWS CloudFormation and AWS Lambda
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
iOS Dev Happy Hour Realm - Feb 2021
MongoDB for Coder Training (Coding Serbia 2013)
Real-Time Spark: From Interactive Queries to Streaming
Lightning fast analytics with Cassandra and Spark
Drupal 7 entities & TextbookMadness.com
Null Bachaav - May 07 Attack Monitoring workshop.
대용량 데이타 쉽고 빠르게 분석하기 :: 김일호 솔루션즈 아키텍트 :: Gaming on AWS 2016
Lightning fast analytics with Spark and Cassandra
Elasticsearch first-steps
Mongodb intro
Schema Design with MongoDB
Managing Social Content with MongoDB
Kubernetes Operators With Scala

Recently uploaded (20)

PDF
Network Security Unit 5.pdf for BCA BBA.
PDF
Machine learning based COVID-19 study performance prediction
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PPTX
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
PDF
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
PDF
Empathic Computing: Creating Shared Understanding
PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PDF
cuic standard and advanced reporting.pdf
PDF
Electronic commerce courselecture one. Pdf
PDF
Review of recent advances in non-invasive hemoglobin estimation
PPTX
20250228 LYD VKU AI Blended-Learning.pptx
PDF
Encapsulation theory and applications.pdf
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PPTX
Big Data Technologies - Introduction.pptx
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
A comparative analysis of optical character recognition models for extracting...
PDF
MIND Revenue Release Quarter 2 2025 Press Release
PDF
gpt5_lecture_notes_comprehensive_20250812015547.pdf
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
Network Security Unit 5.pdf for BCA BBA.
Machine learning based COVID-19 study performance prediction
Building Integrated photovoltaic BIPV_UPV.pdf
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
Empathic Computing: Creating Shared Understanding
Agricultural_Statistics_at_a_Glance_2022_0.pdf
cuic standard and advanced reporting.pdf
Electronic commerce courselecture one. Pdf
Review of recent advances in non-invasive hemoglobin estimation
20250228 LYD VKU AI Blended-Learning.pptx
Encapsulation theory and applications.pdf
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
Big Data Technologies - Introduction.pptx
Advanced methodologies resolving dimensionality complications for autism neur...
A comparative analysis of optical character recognition models for extracting...
MIND Revenue Release Quarter 2 2025 Press Release
gpt5_lecture_notes_comprehensive_20250812015547.pdf
Dropbox Q2 2025 Financial Results & Investor Presentation

AEM GEMs Session Oak Lucene Indexes

  • 1. © 2016 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. Oak Lucene Indexes Chetan Mehrotra | Senior Computer Scientist Alex Parvulescu | Senior Developer
  • 2. © 2016 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. Content  Lucene Index Definitions  Anatomy of a Query (Restrictions, Sorting, Aggregation)  Query Diagnostics and Troubleshooting  Lucene Index Internals (Oak Directory, JMX, Luke)  Asynchronous Indexing  Q&A 2
  • 3. © 2016 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. Lucene Index Definition
  • 4. © 2016 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. Index Definition  Stored under oak:index node  Define how content gets indexed  type oak:QueryIndexDefinition  Required properties  compatVersion = 2  type = “lucene”  async = “async” 4 /oak:index/assetType (oak:QueryIndexDefinition) - compatVersion = 2 - type = "lucene" - async = "async" + indexRules (nt:unstructured) + dam:Asset + properties (nt:unstructured) + assetType - propertyIndex = true - name = "jcr:content/metadata/type"
  • 5. © 2016 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. Index Definition – Index Rules  Defines which types of node and properties are indexed  Rules are defined per nodeType  Rule consist of one or more property definitions  Index selected based on match between type used in Query and presence of indexRule for that type  Multiple indexRules in same index  Order important – nodeType matching honors inheritance 5 SELECT * FROM [dam:Asset] AS a WHERE ISDESCENDANTNODE([/content/en]) AND a.[jcr:content/metadata/type] = 'image' /oak:index/assetType (oak:QueryIndexDefinition) - compatVersion = 2 - type = "lucene" - async = "async" + indexRules (nt:unstructured) + dam:Asset + properties (nt:unstructured) + assetType - propertyIndex = true - name = "jcr:content/metadata/assetType" https://jackrabbit.apache.org/oak/docs/query/lucene.html#Indexing_Rules
  • 6. © 2016 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. Index Definition – Property Definitions  Defines how a property gets indexed  One or more property definition per indexRule  Definition mapping done based on matching property name or regex pattern  Supports relative property name by there relative paths  Order important (if regex are used) 6 SELECT * FROM [dam:Asset] AS a WHERE ISDESCENDANTNODE([/content/en]) AND a.[jcr:content/metadata/type] = 'image' /oak:index/assetType (oak:QueryIndexDefinition) - compatVersion = 2 - type = "lucene" - async = "async" + indexRules (nt:unstructured) + dam:Asset + properties (nt:unstructured) + assetType - propertyIndex = true - name = "jcr:content/metadata/assetType" https://jackrabbit.apache.org/oak/docs/query/lucene.html#Property_Definitions
  • 7. © 2016 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. Index Definition – Best Practices  Precise Index Definition - That indexes just the right amount of content based on your query requirement. Precise index is happy index!  Make use of nodetype to achieve a “cohesive index” - This would allow multiple queries to make use of same index and also evaluation of multiple property restrictions natively in Lucene  For people familiar with Relational Databases - Nodetype is your Table in your DB and all the direct or relative properties as columns in that table. Various property definitions are like indexes on those columns. 7 https://jackrabbit.apache.org/oak/docs/query/lucene.html#Design_Considerations
  • 8. © 2016 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. Sample Content to Query Against 8 /content/dam/assets/december/banner.png (dam:Asset) + metadata (dam:AssetContent) - dc:format = "image/png" - status = "published" - jcr:lastModified = "2009-10-9T21:52:31" - app:tags = ["properties:orientation/landscape", "marketing:interest/product"] - size = 450 - comment = "Image for december launch" - jcr:title = "December Banner" + xmpMM:History + 1 - softwareAgent = "Adobe Photoshop" - author = "David" + renditions (nt:folder) + original (nt:file) + jcr:content - jcr:data = ...
  • 9. © 2016 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. Anatomy of Query 9 SELECT * FROM [dam:Asset] AS a WHERE ISDESCENDANTNODE([/content/public/platform]) AND a.[jcr:content/metadata/status] = 'published' AND CONTAINS([jcr:content/metadata/comment], 'december') ORDER BY a.[jcr:content/metadata/jcr:lastModified] DESC • Nodetype restriction on dam:Asset • Path restriction on /content/public/platform • Property restriction on jcr:content/metadata/status • Fulltext property restriction on jcr:content/metadata/comment • Sorting done on jcr:content/metadata/jcr:lastModified
  • 10. Nodetype Restrictions

      SELECT *
      FROM [dam:Asset] AS a
      WHERE ISDESCENDANTNODE([/content/public/platform])
      AND a.[jcr:content/metadata/status] = 'published'
      AND CONTAINS([jcr:content/metadata/comment], 'december')
      ORDER BY a.[jcr:content/metadata/jcr:lastModified] DESC

      Create an index definition node at /oak:index/damAsset with an indexRule for dam:Asset

      /oak:index/damAsset (oak:QueryIndexDefinition)
        - compatVersion = 2
        - type = "lucene"
        - async = "async"
        + indexRules (nt:unstructured)
          + dam:Asset (nt:unstructured)
            + properties
              ...
  • 11. Path Restriction

      SELECT *
      FROM [dam:Asset] AS a
      WHERE ISDESCENDANTNODE([/content/public/platform])
      AND a.[jcr:content/metadata/status] = 'published'
      AND CONTAINS([jcr:content/metadata/comment], 'december')
      ORDER BY a.[jcr:content/metadata/jcr:lastModified] DESC

      Enable evaluatePathRestrictions so that node paths are indexed and path restrictions can be evaluated by the index

      Bonus tip: if all indexable content is under /content/public and queries always specify the path restriction, it is better to place the index definition under /content/public/oak:index

      /oak:index/damAsset (oak:QueryIndexDefinition)
        - compatVersion = 2
        - type = "lucene"
        - async = "async"
        - evaluatePathRestrictions = true
        + indexRules (nt:unstructured)
          + dam:Asset (nt:unstructured)
            + properties
              ...
  • 12. Property Restriction

      SELECT *
      FROM [dam:Asset] AS a
      WHERE ISDESCENDANTNODE([/content/public/platform])
      AND a.[jcr:content/metadata/status] = 'published'
      AND CONTAINS([jcr:content/metadata/comment], 'december')
      ORDER BY a.[jcr:content/metadata/jcr:lastModified] DESC

      Create a property definition node with propertyIndex enabled and name set to the relative path of the property

      /oak:index/damAsset (oak:QueryIndexDefinition)
        - compatVersion = 2
        - type = "lucene"
        - async = "async"
        - evaluatePathRestrictions = true
        + indexRules (nt:unstructured)
          + dam:Asset (nt:unstructured)
            + properties
              + status
                - propertyIndex = true
                - name = "jcr:content/metadata/status"
  • 13. Fulltext Property Restriction

      SELECT *
      FROM [dam:Asset] AS a
      WHERE ISDESCENDANTNODE([/content/public/platform])
      AND a.[jcr:content/metadata/status] = 'published'
      AND CONTAINS([jcr:content/metadata/comment], 'december')
      ORDER BY a.[jcr:content/metadata/jcr:lastModified] DESC

      Create a property definition node with analyzed enabled

      /oak:index/damAsset (oak:QueryIndexDefinition)
        - compatVersion = 2
        - type = "lucene"
        - async = "async"
        - evaluatePathRestrictions = true
        + indexRules (nt:unstructured)
          + dam:Asset (nt:unstructured)
            + properties
              + status
                - propertyIndex = true
                - name = "jcr:content/metadata/status"
              + comment
                - name = "jcr:content/metadata/comment"
                - analyzed = true
  • 14. Sorting

      SELECT *
      FROM [dam:Asset] AS a
      WHERE ISDESCENDANTNODE([/content/public/platform])
      AND a.[jcr:content/metadata/status] = 'published'
      AND CONTAINS([jcr:content/metadata/comment], 'december')
      ORDER BY a.[jcr:content/metadata/jcr:lastModified] DESC

      Create a property definition node with ordered enabled and type set to the property type. Also enable propertyIndex if you plan to have restrictions on the property

      /oak:index/damAsset (oak:QueryIndexDefinition)
        - compatVersion = 2
        - type = "lucene"
        - async = "async"
        - evaluatePathRestrictions = true
        + indexRules (nt:unstructured)
          + dam:Asset (nt:unstructured)
            + properties
              + status
                - propertyIndex = true
                - name = "jcr:content/metadata/status"
              + comment
                - name = "jcr:content/metadata/comment"
                - analyzed = true
              + lastModified
                - name = "jcr:content/metadata/jcr:lastModified"
                - ordered = true
                - type = Date
                - propertyIndex = true
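As a recap of the last few slides, the finished damAsset definition can be modelled as a nested Python dict, with a small helper that reports which parts of the query each property definition can serve. This is only an illustrative model: the `capabilities` function and its labels ("restriction", "fulltext", "sort") are hypothetical, and the sketch ignores nodetype inheritance and regex names.

```python
# The index definition from the slide, written as a plain nested dict.
DAM_ASSET_INDEX = {
    "compatVersion": 2,
    "type": "lucene",
    "async": "async",
    "evaluatePathRestrictions": True,
    "indexRules": {
        "dam:Asset": {
            "properties": {
                "status": {
                    "name": "jcr:content/metadata/status",
                    "propertyIndex": True,
                },
                "comment": {
                    "name": "jcr:content/metadata/comment",
                    "analyzed": True,
                },
                "lastModified": {
                    "name": "jcr:content/metadata/jcr:lastModified",
                    "ordered": True,
                    "type": "Date",
                    "propertyIndex": True,
                },
            }
        }
    },
}


def capabilities(index, node_type, prop_path):
    """Return what the index can do for prop_path on node_type (hypothetical helper)."""
    rule = index["indexRules"].get(node_type)
    if rule is None:
        return set()
    for definition in rule["properties"].values():
        if definition["name"] == prop_path:
            caps = set()
            if definition.get("propertyIndex"):
                caps.add("restriction")   # e.g. status = 'published'
            if definition.get("analyzed"):
                caps.add("fulltext")      # e.g. CONTAINS(comment, 'december')
            if definition.get("ordered"):
                caps.add("sort")          # e.g. ORDER BY jcr:lastModified
            return caps
    return set()


for path in ("status", "comment", "jcr:lastModified"):
    full = "jcr:content/metadata/" + path
    print(full, sorted(capabilities(DAM_ASSET_INDEX, "dam:Asset", full)))
```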
  • 15. Fulltext Node Restriction

      SELECT * FROM [dam:Asset] WHERE CONTAINS(., 'christmas')

      • Searches for ‘christmas’ in all nodes of type dam:Asset
      • The fulltext index for a node is built from terms extracted from
          • Node properties: properties with nodeScopeIndex set to true
          • Properties of relative nodes selected by Aggregation Rules
      • Aggregation Rules
          • Define path patterns for selecting the relative nodes
          • Are bound to a specific nodetype
          • Can be recursive: a relative path refers to an nt:file node, and nt:file has its own aggregation rule defined
      • For aggregated nodes, every property whose type is listed in includePropertyTypes is included, unless a property definition for it sets nodeScopeIndex=false
  • 16. Fulltext - Aggregation

      Content:

      /content/dam/assets/december/banner.png (dam:Asset)
        + metadata (dam:AssetContent)
          - dc:format = "image/png"
          - status = "published"
          - jcr:lastModified = "2009-10-9T21:52:31"
          - app:tags = ["properties:orientation/landscape", "marketing:interest/product"]
          - size = 450
          - comment = "Image for Christmas launch"
          - jcr:title = "December Banner"
          + xmpMM:History
            + 1
              - softwareAgent = "Adobe Photoshop"
              - author = "David"
        + renditions (nt:folder)
          + original (nt:file)
            + jcr:content
              - jcr:data = ...

      Aggregation Rules:

      + aggregates
        + dam:Asset
          + include0
            - path = "jcr:content"
          + include1
            - path = "jcr:content/metadata"
          + include2
            - path = "jcr:content/metadata/*"
          + include3
            - path = "jcr:content/metadata/*/*"
          + include4
            - path = "jcr:content/renditions"
          + include5
            - path = "jcr:content/renditions/original"
        + nt:file
          + include0
            - path = "jcr:content"

      Extracted Terms for Fulltext Index:

      image/png
      Published
      properties:orientation/landscape
      marketing:interest/product
      December Banner
      Image for Christmas launch
      Adobe Photoshop
      David

      https://jackrabbit.apache.org/oak/docs/query/lucene.html#Aggregation
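The way the aggregation rules select nodes and contribute terms can be approximated with a short Python sketch. It is a toy model under stated assumptions, not Oak's aggregator: only string values are collected (as if includePropertyTypes were limited to String), binaries are skipped rather than run through text extraction, and a `jcr:content` level is added to the sample tree so that the include paths from the slide resolve. The helper names `node`, `match`, `string_values` and `fulltext_terms` are hypothetical.

```python
from datetime import datetime


def node(node_type="nt:unstructured", props=None, children=None):
    return {"type": node_type, "props": props or {}, "children": children or {}}


def match(root, pattern):
    """Return nodes whose path relative to root matches pattern ('*' = any one child)."""
    current = [root]
    for segment in pattern.split("/"):
        nxt = []
        for n in current:
            if segment == "*":
                nxt.extend(n["children"].values())
            elif segment in n["children"]:
                nxt.append(n["children"][segment])
        current = nxt
    return current


def string_values(n):
    out = []
    for value in n["props"].values():
        values = value if isinstance(value, list) else [value]
        out.extend(v for v in values if isinstance(v, str))
    return out


def fulltext_terms(root, rules):
    """Collect terms from all nodes selected by the rules bound to root's type."""
    terms = []
    for pattern in rules.get(root["type"], []):
        for n in match(root, pattern):
            terms.extend(string_values(n))
            terms.extend(fulltext_terms(n, rules))  # recursion: the node's own rule, e.g. nt:file
    return terms


metadata = node("dam:AssetContent", {
    "dc:format": "image/png",
    "status": "published",
    "jcr:lastModified": datetime(2009, 10, 9, 21, 52, 31),  # Date, not a String
    "app:tags": ["properties:orientation/landscape", "marketing:interest/product"],
    "size": 450,                                            # Long, not a String
    "comment": "Image for Christmas launch",
    "jcr:title": "December Banner",
}, {"xmpMM:History": node(children={
    "1": node(props={"softwareAgent": "Adobe Photoshop", "author": "David"})})})

original = node("nt:file", children={"jcr:content": node(props={"jcr:data": b"..."})})
asset = node("dam:Asset", children={"jcr:content": node(children={
    "metadata": metadata,
    "renditions": node("nt:folder", children={"original": original}),
})})

RULES = {
    "dam:Asset": ["jcr:content", "jcr:content/metadata", "jcr:content/metadata/*",
                  "jcr:content/metadata/*/*", "jcr:content/renditions",
                  "jcr:content/renditions/original"],
    "nt:file": ["jcr:content"],
}

terms = fulltext_terms(asset, RULES)
print(terms)
```

The `size` and `jcr:lastModified` values do not show up among the terms because they are not strings, which is consistent with the extracted terms listed on the slide.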
  • 17. Query Result Size
      • Oak Fast Result Size
          • By default, NodeIterator.getSize() returns -1 for large results, because estimating the size costs O(n) due to ACL checks
          • ACL checks can be relaxed (check only the first ‘k’ results). Enable via the system property oak.fastQuerySize.
          • OSGi config support comes with the next release
      • AEM Query Builder and Pagination
          • Use the p.guessTotal query parameter to avoid the costly operation of determining the result size
          • Use progressive pagination
  • 18. Other Features
      • Composing Analyzer: for configuring stemming, synonyms, stop words, etc.
      • Boost: improving search relevancy
      • Tika Config: control how and which types of binary files are indexed
      • Suggestions
      • Spell Check
      • Pre-Extracting Text from Binaries: speeds up reindexing for repositories with a large number of binaries that contain text
  • 19. Tools
  • 20. Query Explain Tool
      • Shipped with AEM 6.1
          • Tools -> Operations -> Dashboard -> Diagnosis -> Query Performance
          • http://localhost:4502/libs/granite/operations/content/diagnosis/tool.html/_granite_queryperformance
          • Shows Slow Query, Popular Query and Explain Query
      • ACS Tools (more up to date)
          • https://adobe-consulting-services.github.io/acs-aem-tools/explain-query.html
  • 21. Query Explain Tool
      • Shows logs from the various indexes consulted
      • Shows the actual Lucene query fired
          • Path Restriction: +:ancestors:/content/public/platform
          • Fulltext Restriction: +full:jcr:content/metadata/comment:december
          • Property Restriction: +jcr:content/metadata/status:published
          • Ordering
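To make the mapping from query restrictions to the clauses in the Explain output easier to see, here is a small Python sketch that reproduces the three clause strings shown on the slide. It only imitates the textual form of the output; `lucene_clauses` is a hypothetical helper, not part of Oak or Lucene.

```python
def lucene_clauses(path=None, properties=None, fulltext=None):
    """Build clause strings in the form shown by the Explain Query tool."""
    clauses = []
    if path:
        # Path restriction: matched against the indexed ancestor paths.
        clauses.append(f"+:ancestors:{path}")
    for name, term in (fulltext or {}).items():
        # Fulltext property restriction: analyzed field prefixed with "full:".
        clauses.append(f"+full:{name}:{term}")
    for name, value in (properties or {}).items():
        # Property restriction: exact term on the property field.
        clauses.append(f"+{name}:{value}")
    return clauses


clauses = lucene_clauses(
    path="/content/public/platform",
    fulltext={"jcr:content/metadata/comment": "december"},
    properties={"jcr:content/metadata/status": "published"},
)
print("\n".join(clauses))
```

Each leading `+` marks a clause that must match, so the three restrictions are combined as a conjunction, mirroring the AND conditions of the query.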
  • 22. Lucene Index Internals
  • 23. Lucene Index Internals - Directory
      • The Lucene Directory is stored in the repository (the source of truth)
      • Copy on Read and Copy on Write maintain local copies for faster access (the mappings from index content to disk location are exposed via JMX)
  • 24. Lucene Index Internals - JMX
      org.apache.jackrabbit.oak: Lucene Index statistics (LuceneIndex)
      • Provides a listing of the existing Lucene indexes
      • http://localhost:4502/system/console/jmx/org.apache.jackrabbit.oak%3Aname%3DLucene+Index+statistics%2Ctype%3DLuceneIndex
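The console URLs on these JMX slides are the MBean object name, URL-encoded, appended to the JMX console path. A minimal Python sketch of that construction follows; the object name is obtained by decoding the URL above, and `jmx_console_url` is a hypothetical helper name.

```python
from urllib.parse import quote_plus


def jmx_console_url(object_name, base="http://localhost:4502/system/console/jmx/"):
    """Append the form-encoded MBean object name to the JMX console path."""
    return base + quote_plus(object_name)


url = jmx_console_url(
    "org.apache.jackrabbit.oak:name=Lucene Index statistics,type=LuceneIndex"
)
print(url)
```

The same helper yields the URLs for the other MBeans in this section when given their object names, for example `org.apache.jackrabbit.oak:name=IndexCopier support statistics,type=IndexCopierStats`.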
  • 25. Lucene Index Internals - JMX continued
      org.apache.jackrabbit.oak: IndexCopier support statistics (IndexCopierStats)
      • Copy on Read and Copy on Write related stats; of particular interest is the mapping between index content and its location on disk
      • http://localhost:4502/system/console/jmx/org.apache.jackrabbit.oak%3Aname%3DIndexCopier+support+statistics%2Ctype%3DIndexCopierStats
  • 26. Lucene Index Internals - JMX continued
      org.apache.jackrabbit.oak: TextExtraction statistics (TextExtractionStats)
      • Stats on how much work is done extracting text from binaries
      • http://localhost:4502/system/console/jmx/org.apache.jackrabbit.oak%3Aname%3DTextExtraction+statistics%2Ctype%3DTextExtractionStats
      • Make sure you remember this one for our experiment with ‘Luke’
  • 27. Lucene Index Internals - Luke
      Let’s run a small experiment: upload a PDF file to the repository and verify that full-text search works.
      It works! And now let’s see why...
  • 28. Lucene Index Internals - Luke
      Setting up ‘Luke’ to look at the Lucene index:
      1. Why Luke? Luke is a dedicated Lucene index tool; there is no alternative for viewing index content
      2. Identify which index you want to look at
      3. Export the index contents
          • (easy/online) Look up the Copy on Read mappings in the JMX console and grab a copy of the index
          • (harder/possibly offline) Use the oak console to export the index to a specific location
      4. Open ‘Luke’ and make sure you pass in the oak-lucene jar as a classpath entry (as described in the Oak documentation)
  • 29. Lucene Index Internals - Luke
      For the given token ‘mongomk’ there are 2 matching Lucene docs, both pointing to the PDF file. Why 2? Because of index-time aggregation: the parent node inherits the ‘:fulltext’ information from its child node.
  • 30. Lucene Index Internals - Luke
      The default Lucene index defines aggregation for ‘nt:file’ nodes, meaning they inherit all extracted full-text information from their ‘nt:resource’ child nodes. This means that the following search

      /jcr:root//element(*, nt:file)[jcr:contains(., 'mongomk')]

      will return a single item:

      /granite-gems-lucene/AEM 6 Oak - MongoMK and Queries.pdf

      even though the nt:file node itself contains no full-text information.
  • 31. Asynchronous Indexing
  • 32. Asynchronous Indexing - Overview
      • The AsyncIndexUpdate class is the glue for all existing index implementations (all logging comes from this place)
      • Runs as a background job every 5 seconds; in a cluster it runs on a single cluster node
      • Used mainly with full-text indexes (lucene/solr), and also for ordered property indexes (deprecated)
      • Efficient: processes only content that is new since the last successful cycle, using a fast diff based on checkpoints
      • Resilient: in case of error, it tries again on the next cycle (no data loss)
      • Status exposed via the JMX "IndexStats" MBean
      • You can make an index definition asynchronous by setting the async property: async="async"
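The "efficient" and "resilient" properties above can be illustrated with a toy Python model of one indexing cycle. This is a conceptual sketch only, not how AsyncIndexUpdate is implemented: the repository is a flat dict of path to value, a checkpoint is a plain copy of that dict, and the `fail` flag is a hypothetical hook for simulating an indexing error.

```python
class AsyncIndexer:
    """Toy model: diff the current state against the last checkpoint, retry on error."""

    def __init__(self):
        self.checkpoint = {}   # repository state the index was last synced to
        self.index = {}        # path -> indexed value
        self.failing = False   # mirrors the "failing" flag exposed in the stats

    def run_cycle(self, repository, fail=False):
        snapshot = dict(repository)  # new checkpoint: read-only copy of the current state
        changed = {p: v for p, v in snapshot.items()
                   if p not in self.checkpoint or self.checkpoint[p] != v}
        removed = [p for p in self.checkpoint if p not in snapshot]
        try:
            if fail:
                raise RuntimeError("simulated indexing error")
            staged = dict(self.index)
            staged.update(changed)
            for p in removed:
                staged.pop(p, None)
        except RuntimeError:
            # Keep the old checkpoint so the next cycle sees the same diff again.
            self.failing = True
            return 0
        self.index, self.checkpoint, self.failing = staged, snapshot, False
        return len(changed) + len(removed)


repo = {"/content/a": "first"}
indexer = AsyncIndexer()
print(indexer.run_cycle(repo))             # indexes /content/a
repo["/content/b"] = "second"
print(indexer.run_cycle(repo, fail=True))  # error: nothing committed
print(indexer.run_cycle(repo))             # retry picks up /content/b
```

Because the checkpoint only advances after a successful cycle, a failed cycle loses nothing: the next run computes the diff from the same starting point.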
  • 33. Asynchronous Indexing - Checkpoints
      • Checkpoints are a form of read-only tagging of the current state of the repository
      • Each checkpoint has an expected lifetime provided at creation time, after which it will be removed, as well as some metadata related to its creation

      /checkpoints/a6fe070e-deef-4582-85fb-b96b57ecd1a9
        - created = 1450285984929
        - timestamp = 1536685984929
        + properties
          - creator = "AsyncIndexUpdate"
          - name = "async"
          - thread = "pool-75-thread-4"
        + root // entire repository content
          + libs
          + content
          + apps
          ....

      [SegmentMK representation of a checkpoint]

      • The link between the async indexing process and a checkpoint is established via the /:async node
      • The /:async@async property must point to an existing checkpoint, otherwise a full reindex will happen
      • /:async@async-LastIndexedTo stores the timestamp up to which the repository was indexed
      • /:async@async-temp is the list of checkpoints to be cleaned up after all processing is done

      /:async
        - async = "a6fe070e-deef-4582-85fb-b96b57ecd1a9"
        - async-LastIndexedTo = 2015-12-16T18:13:04.929+01:00
        - async-temp = ["6766f0ec-600f-4b8e-95d3-9b4d04f5877e", "a6fe070e-deef-4582-85fb-b96b57ecd1a9"]
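The numbers in the checkpoint above can be decoded with a few lines of Python. The sketch assumes that `created` and `timestamp` are milliseconds since the Unix epoch and that `timestamp` is the checkpoint's expiry time; the second point is an interpretation based on the "expected lifetime" bullet, not something the slide states explicitly. `from_millis` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_millis(ms, utc_offset_hours=0):
    """Convert epoch milliseconds to an aware datetime, using exact integer arithmetic."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return (EPOCH + timedelta(milliseconds=ms)).astimezone(tz)


created = from_millis(1450285984929, utc_offset_hours=1)
expires = from_millis(1536685984929, utc_offset_hours=1)  # assumed to be the expiry time
lifetime = expires - created

print(created.isoformat(timespec="milliseconds"))
print(lifetime.days)
```

Under these assumptions the creation time equals the async-LastIndexedTo value on /:async, and the checkpoint was created with a lifetime of 1000 days.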
  • 34. Asynchronous Indexing - JMX
      org.apache.jackrabbit.oak: "async" ("IndexStats")
      • Start / Done timestamps
      • Checkpoints (reference, temp)
      • Execution Count & Time, Indexed Nodes Count series
      • Errors: failing flag, latest seen error with its timestamp
  • 35. Useful Links
      • Oak Lucene Docs
          https://jackrabbit.apache.org/oak/docs/query/lucene.html
      • AEM 6 Oak: MongoMK and Queries Gem session
          http://dev.day.com/content/ddc/en/gems/aem-6-oak--mongomk-and-queries.html
      • AEM Docs on Oak Queries and Indexing
          https://docs.adobe.com/docs/en/aem/6-1/deploy/platform/queries-and-indexing.html
          https://docs.adobe.com/docs/en/aem/6-1/deploy/best-practices/best-practices-for-queries-and-indexing.html
      • The Index Manager
          https://docs.adobe.com/docs/en/aem/6-1/administer/operations/operations-dashboard.html#The Index Manager