SharePoint Intersection
Session SP40

Solving Real World Challenges
with Enterprise Search
Agnes Molnar
International Consultant, ECM & Search Expert
aghy@aghy.hu
Introduction – Agnes Molnar
International SharePoint Consultant
• 10+ Years SharePoint Experience
• Information Architecture & ECM
• Search

SharePoint Server MVP
• 6 Years SharePoint Server MVP
• 5+ Years Speaking at Conferences Around the
World
• Numerous Books, White Papers, Articles

Contact
• E-mail: aghy@aghy.hu
• Blog: http://aghy.hu
• Twitter: @molnaragnes


© DEVintersection. All rights reserved.
http://www.DEVintersection.com
Agenda

Information Overload OR Filter Failure?

Source: http://financiallyeliteblog.com/wp-content/uploads/2011/04/information-overload.jpg
Enterprise Search
Search Technology
that your organization owns and controls

Search is Easy…
Find is the real challenge!

Search as an Application

Source: http://www.domorewithsearch.com

Search as an Application
• Search is no longer the white box
• Content lives in disparate locations
• Structured and unstructured content lives in different locations
• Need to aggregate content according to:
  • Process
  • Context
  • Customer
  • Goal
  • Program
  • Parameter of any of the above

User – Context – Content
• Context [where information is used]: business models & goals, corporate culture, resources
• Content [how to describe the information]: document types, objects, structure, attributes, meta-information
• Users [how to use the information]: information needs, audience types, expertise, tasks

Requirements Gathering
• Types of Content
• Types of Users
• Users’ Behavior
• Content Sources
• Metadata
• Actions to Take
• Amount of Content
• Current “Pain Points”

Search is more than Technology

Source: http://searchpatterns.org

The Complexity of Enterprise Information
What we give to the search engine…
What the search engine sees…
• Title: Overview of SharePoint 2013 Preview Installation and Configuration
• Author: Alex Yarrow
• Created Date: 06/21/2012
• Modified Date: 10/16/2012
• File Type: docx
• …

Explicit metadata versus implicit metadata
• Forward Index – Words per document
• Inverted Index – Documents per word

Explicit metadata:
• Content Type = License
• Organization = ABC Company, DEF Company
• Topic = Support

Document text:
ABC shall provide first level technical support to all Licensed Product end users and/or Sublicensed Product customers/users. DEF will provide second level support. DEF shall provide to ABC a primary and a secondary support person to act as the primary interface with ABC’s technical and customer support team. DEF shall provide direct technical support to ABC for all uses of the DEF Software. Support level definitions and responsibilities are set forth in Exhibit C. An “SLA Failure” as defined in Exhibit C shall qualify as a Release Condition sufficient to authorize the Escrow Agent to release the Source Code to ABC pursuant to Section 7 and the Escrow Agreement.

Implicit metadata (terms extracted from the text):
ABC, customers, customer support, customer support team, DEF, DEF software, end users, escrow agreement, escrow agent, exhibit c, licensed product, release condition, section 7, secondary support, SLA, SLA failure, software, source code, support level, sublicensed product, technical support
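
The forward and inverted indexes named on this slide can be sketched in a few lines. This is a minimal illustration, not how any particular search engine stores its index: the two sample texts are shortened from the contract excerpt, and the document ids and the whitespace tokenizer are assumptions.

```python
from collections import defaultdict

# Hypothetical mini-corpus; ids and texts are illustrative only.
docs = {
    "doc1": "DEF shall provide direct technical support to ABC",
    "doc2": "ABC shall provide first level technical support",
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


# Forward index: words per document.
forward_index = {doc_id: tokenize(text) for doc_id, text in docs.items()}

# Inverted index: documents per word.
inverted_index = defaultdict(set)
for doc_id, words in forward_index.items():
    for word in words:
        inverted_index[word].add(doc_id)

print(sorted(inverted_index["support"]))  # documents containing "support"
print(sorted(inverted_index["def"]))      # documents containing "def"
```

A query is answered from the inverted index (word to documents); the forward index is what the crawl naturally produces (document to words).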
The Complexity of Search
[Diagram: content sources and data sources, indexing, metadata, local search index, federation, remote search index, result sources, query rules, result blocks, result set, display templates, refinement panel, hover panel]

Requirements Gathering
Information-Seeking Patterns
• “I know what I’m searching for and know how to do that”
• “I know what I’m searching for but I don’t know how to do that”
• “I don’t know what I’m searching for”
• “Am I Searching?...”

Real World Expectations
Content Inventory
• “I have a lot of content, but I don’t know what to do with it…”

Content Inventory
• SharePoint content (2013, 2010, …)
  • Intranet
  • Department sites
  • Project sites
  • Internal KB
• File shares
  • Sales repository (RFPs, proposals, etc.)
  • Marketing documents (DMs, brochures, etc.)
• Web sites
  • Company public web site
  • Professional Know-How Web Sites (finance, IT, development, etc.)
  • Common interest (stock, management, etc.)
• Exchange Public Folders
  • Internal communication
• Business Data
  • Data from databases
• Custom connector
  • SAP data
  • CRM data
Search Federation

Crawl or Federate? – Where to get the content from?
• Crawl + Use Local Index:
  • Examples:
    • Intranet
    • Company file shares
  • Pros:
    • Full control over the index (crawl schedule, metadata included, etc.) and ranking model
    • Results can be aggregated into one result set
    • Common refiners (facets)
  • Cons:
    • Needs resources for the crawling process
    • Needs storage to store the index
• Federate:
  • Examples:
    • Professional know-how web sites (TechNet, MSDN, etc.)
    • Internet results for a specific topic (financial news, stock information, etc.)
    • 3rd party Content Management System
  • Pros:
    • Doesn’t need resources to crawl / store the index
  • Cons:
    • Live Internet connection is required
    • No control over the index
    • No control over the ranking model
    • No real aggregation with other result sources
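
A rough sketch of what "no real aggregation" means in practice: hits from the local index share one ranking model, so they can be merged into a single ranked result set, while federated hits arrive already ranked by the remote engine and are shown as a separate block. The function name, field names, and scores below are all hypothetical.

```python
def build_results_page(local_hits, federated_hits, block_size=3):
    """Rank local hits together; keep federated hits as a separate block."""
    # Local hits were scored by the same ranking model, so sorting is meaningful.
    ranked_local = sorted(local_hits, key=lambda h: h["score"], reverse=True)
    return {
        "result_set": [h["title"] for h in ranked_local],
        # Remote order is kept as delivered; remote scores are not comparable.
        "federated_block": [h["title"] for h in federated_hits[:block_size]],
    }


local_hits = [
    {"title": "Project plan", "source": "Intranet", "score": 0.62},
    {"title": "Sales proposal", "source": "Company file shares", "score": 0.91},
]
federated_hits = [
    {"title": "TechNet article"},
    {"title": "MSDN reference"},
]

page = build_results_page(local_hits, federated_hits)
print(page)
```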

Content Source Inventory
Name | Type | Location | Owner | Volume of Content | Frequency of Updates
Intranet | SharePoint | http://intranet | Intranet Team | 200K items | 100-300/hr
Project Sites | SharePoint | http://projects | Delivery | 200K items | 100-200/hr
Sales share | File share | X:\Sales | Sales | 500K docs | 300-500/hr
Marketing share | File share | X:\Marketing | Marketing | 200K docs | 300-500/hr
Company web site | Web site | http://mycompany.com | Marketing/Publishing Team | <100K pages | 1-10/day
Competitor’s web site | Web site | http://competitor.com | [external] | <100K pages | 1-10/day
Professional Know-How | Web site | http://www.mykb.com | [external] | <100K pages | 5-10/week
Company Announcements | Exchange Public Folder | Exchange/Public Folders/Announcements | Marketing/Internal Comm. Team | <100K items | 5-10/day
HR data | Business Data (SQL) | SQL database | HR | <100K items | 10-100/day
CRM data | Custom Connector | CRM system | Sales | 500K entries | 500-1000/hr
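
One way to use such an inventory is to put every source on the same scale before planning crawl schedules. The sketch below takes the upper bound of the update range for a few rows of the table and converts it to updates per day; the class, the field names, and the assumption that hourly rates hold for 24 hours a day are illustrative only.

```python
from dataclasses import dataclass

HOURS_PER_DAY = 24  # assumption: hourly rates apply around the clock
DAYS_PER_WEEK = 7


@dataclass
class ContentSource:
    name: str
    source_type: str
    max_updates: int  # upper bound of the update range in the inventory
    per: str          # "hr", "day", or "week"

    def max_updates_per_day(self) -> float:
        """Normalize the update rate to updates per day."""
        if self.per == "hr":
            return float(self.max_updates * HOURS_PER_DAY)
        if self.per == "week":
            return self.max_updates / DAYS_PER_WEEK
        return float(self.max_updates)


inventory = [
    ContentSource("Intranet", "SharePoint", 300, "hr"),
    ContentSource("Sales share", "File share", 500, "hr"),
    ContentSource("Company web site", "Web site", 10, "day"),
    ContentSource("Professional Know-How", "Web site", 10, "week"),
    ContentSource("CRM data", "Custom Connector", 1000, "hr"),
]

# Sources that change fastest are candidates for the most frequent crawls.
by_change_rate = sorted(
    inventory, key=lambda s: s.max_updates_per_day(), reverse=True
)
for source in by_change_rate:
    print(f"{source.name}: up to {source.max_updates_per_day():.0f} updates/day")
```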

Metadata in Search
• The “glue” of Search Applications
• Crawled property: metadata extracted from the documents/items during the crawl.
• Managed property: mapped to crawled properties, controlled by Search Admins, helping users perform more efficient and successful queries:
  • Refiners
  • Displayed in Search Results
  • Sorting Properties
Metadata in Search
Crawled Property → Managed Property → Usage
• Crawled properties: Author, CreatedBy, From
• Managed property: Author
• Usage: Refiner, Display on Result Set, Display on Hover Panel, Sorting by
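
The many-to-one mapping on this slide can be sketched as a plain lookup table. This is not the SharePoint search schema API; the dictionary, the function, and the "first non-empty value wins" rule are assumptions made for illustration.

```python
# Hypothetical mapping mirroring the slide: several crawled properties
# feed one managed property.
CRAWLED_TO_MANAGED = {
    "Author": "Author",
    "CreatedBy": "Author",
    "From": "Author",
}


def to_managed(crawled_item):
    """Collapse crawled properties into managed properties.

    The first non-empty crawled value wins; this precedence rule is an
    assumption of the sketch.
    """
    managed = {}
    for crawled_name, value in crawled_item.items():
        managed_name = CRAWLED_TO_MANAGED.get(crawled_name)
        if managed_name and value and managed_name not in managed:
            managed[managed_name] = value
    return managed


document = {"CreatedBy": "Alex Yarrow", "FileType": "docx"}
email = {"From": "Alex Yarrow", "Subject": "Status"}

print(to_managed(document))
print(to_managed(email))
```

Because both items end up with the same managed property, one refiner or sort order on Author can cover documents and e-mails alike.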

Using Managed Properties
• In Query Rules
• Refinement
• Result Type & Display Template
• On Hover Panel

Security

Users can see what they have access to.
vs.

Users cannot see what they don’t have access to.
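
A minimal sketch of the second statement, assuming each result carries a set of groups allowed to read it. The field name allowed_groups and the sample data are hypothetical; a real engine applies the access control lists captured at crawl time.

```python
def trim_results(results, user_groups):
    """Return only the results whose ACL shares a group with the user."""
    return [r for r in results if r["allowed_groups"] & user_groups]


results = [
    {"title": "Intranet home", "allowed_groups": {"everyone"}},
    {"title": "Salary review", "allowed_groups": {"hr"}},
    {"title": "Sales proposal", "allowed_groups": {"sales", "management"}},
]

visible = trim_results(results, user_groups={"everyone", "sales"})
print([r["title"] for r in visible])
```

The filter only hides what the stored permissions say to hide: if "Salary review" had been saved with "everyone" in its ACL, search would surface it.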

The Search Security Paradox
As Search is deployed further and further into the Enterprise, the likelihood of
having a security problem increases.

Sizing and Capacity Planning
• “Sounds good, but I’m not sure if we have resources for this…”

Scaling Factors
• Content characteristics
• Search features
• Document freshness
• Query performance
• High availability

Components – Scaling cheat sheet
[Table: resource demand (CPU, Network, Disk, Memory) for each search component: Search administration, Crawling, Content processing (CPC), Analytics processing (APC), Index, Query processing (QPC); cell values not preserved]





Sorting the Results – Relevance Ranking
• Requirements:
  “I’d like to see ALL the relevant results.”
  vs.
  “I don’t want to see anything that is not relevant (to me, in this context).”

User Experience
• Recall: the fraction of relevant instances that are retrieved
• Precision: the fraction of retrieved instances that are relevant
• Relevance: how well a retrieved document or set of documents meets the information need of the current user, in the current context
• Ranking: the order in which the search results for a query appear
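
Recall and precision, as defined on the User Experience slide, can be computed directly from two sets. A minimal sketch; the document ids are made up, and treating an empty set as a score of 0.0 is a convention chosen here.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query.

    precision = relevant retrieved items / all retrieved items
    recall    = relevant retrieved items / all relevant items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


retrieved = ["d1", "d2", "d3", "d4"]  # what the engine returned
relevant = ["d1", "d3", "d5"]         # what the user actually needed

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Returning more results tends to raise recall and lower precision, which is the tension between the two requirements quoted on the Relevance Ranking slide.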
Sorting the Results – Relevance Ranking
• Various elements can be monitored, interpreted or used in calculation of ranking
• These can be tuned and weighted in different ways to impact results

Element | Description
Freshness | Age of a document compared to the time when the query is issued
Authority | Importance of a document determined by the links to it from other documents
Quality | Assigned importance of a document, independent of the query
Geo | Importance of geographical distance between a document’s associated latitude/longitude and a target location specified in a query
Context | Importance of matching a query in a given document field
Proximity | For multi-term queries: the shorter the distance between query terms in a document, the higher the document’s rank value
Position | The earlier a query term occurs in a field, the higher the document’s rank value
Frequency | The more frequently a query term occurs in a document, the higher the document’s rank value
Completeness | The greater the number of query terms present in the same field of a matching document, the higher the document’s rank value
Number | For multi-term queries: the more query terms matched in a document, the higher the document’s rank value

Reference: Okapi BM25
http://en.wikipedia.org/wiki/Probabilistic_relevance_model_(BM25)
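
To make the Frequency and Number elements concrete, here is a small sketch of the Okapi BM25 formula referenced above. It is a textbook variant (with the "+1" inside the logarithm so that IDF stays positive), not the ranking model of SharePoint or any other product; the sample documents, the whitespace tokenizer, and the parameter values k1=1.5 and b=0.75 are assumptions.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue  # term absent: contributes nothing
            df = doc_freq[term]
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
            # Length normalization: long documents need more matches.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "technical support for licensed product end users",
    "escrow agreement and source code release",
    "support level definitions and support responsibilities",
]
scores = bm25_scores("technical support", docs)
for doc, score in zip(docs, scores):
    print(f"{score:.3f}  {doc}")
```

The first document matches both query terms and should rank highest; the third matches only "support" (twice); the second matches nothing and scores zero.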

Search Analytics
“How to Improve the Search Experience?”

Search Analytics in SharePoint 2013
• Usage Events – As users interact with content in SharePoint, actions are captured and stored as events (click a link, press a button, view or open a document).
• Access and create experiences using data captured in the analytics database.
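
A sketch of the idea behind usage events: each interaction is stored as a small record, and experiences such as "most viewed" are simple aggregations over those records. The event fields and sample data are hypothetical and do not reflect the schema of the SharePoint analytics database.

```python
from collections import Counter

# Hypothetical usage events; field names are assumptions of this sketch.
events = [
    {"user": "u1", "action": "view", "item": "proposal.docx"},
    {"user": "u2", "action": "view", "item": "proposal.docx"},
    {"user": "u1", "action": "click", "item": "intranet-home"},
    {"user": "u3", "action": "view", "item": "brochure.pdf"},
]

# Count view events per item to find popular content.
views_per_item = Counter(e["item"] for e in events if e["action"] == "view")
most_viewed = views_per_item.most_common(1)[0]

print(most_viewed)
```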

Search Analytics – Examples

Search Analytics – Examples

Conclusions

Want to Learn More?
• SP41 How to Manage and Troubleshoot Search – A Practical Guide
• POSTCON03 Architecting the Optimal Enterprise Search Strategy
• Blog: http://aghy.hu
• The Essential Guide to Enterprise Search in SharePoint 2013 (free e-book): http://www.bainsight.com/pages/sharepoint-search-2013.aspx
• Search Circle (subscription service for Search Managers): http://www.intranetfocus.com/enterprise-search/thesearchcircle
• SharePoint Videos – online trainings: http://www.SharePoint-Videos.com (code for 30 days of free access: SPC12Free)
• Online webinars and trainings for IA and Search Managers: http://earley.com/Training-Webinars
Questions?
Don’t forget to enter your evaluation
of this session using EventBoard!

Thank you!

More Related Content

PDF
Blurring the Boundaries Between Salesforce Orgs
PPTX
SharePoint Fest Denver - SharePoint 2010 Integration and Interoperability: Wh...
PPTX
SharePoint 2013 ediscovery overview
PPTX
Barcelona salesforce sdg november lightning connect
PPTX
Sp24 design a share point 2013 architecture – the basics
PPTX
SharePoint Integration and Interoperability - SharePoint Saturday Philly
PPTX
SharePoint Fest Chicago - SharePoint 2010 Integration and Interoperability: W...
PDF
Overcoming Security Threats and Vulnerabilities in SharePoint
Blurring the Boundaries Between Salesforce Orgs
SharePoint Fest Denver - SharePoint 2010 Integration and Interoperability: Wh...
SharePoint 2013 ediscovery overview
Barcelona salesforce sdg november lightning connect
Sp24 design a share point 2013 architecture – the basics
SharePoint Integration and Interoperability - SharePoint Saturday Philly
SharePoint Fest Chicago - SharePoint 2010 Integration and Interoperability: W...
Overcoming Security Threats and Vulnerabilities in SharePoint

What's hot (20)

PPTX
Data Visualization in SharePoint and Office 365
PPTX
Is BCS Dead?
PPTX
Implementing BCS-Business Connectivity Services - Sharepoint 2013- Office 365
PPTX
What SharePoint is My Ferrari?
PPTX
SharePoint Pros & Cons (2007-2010)
PDF
Office 365 and share point online ramp up in 60 minutes for on-premises share...
PPTX
SharePoint 2010 Integration and Interoperability - SharePoint Saturday Hartford
PDF
Age of Exploration: How to Achieve Enterprise-Wide Discovery
PPTX
Sp2010success
PDF
SharePoint Fest Chicago 2014 - Anatomy of SharePoint and Office 365 Hybrid De...
KEY
WPF: Working with Data
PPTX
Data Centric Composites and mashups In SharePoint 2010
PPTX
Introduction to SharePoint 2010
PPTX
2017 01-11 intelligent search and intranet - chihuahuas vs muffins v1
PDF
Rits Brown Bag - Salesforce Lightning External Connection
PDF
Share poinrt 2013 planning consideration sps atlanta
PPT
Sharepoint Moss 2007 Pros & Cons by Toby Ward, Prescient Digital Media
PPT
Ferraz Ia252 Developing An Information Architecture
PPTX
SharePoint Syntex 5 Practical Uses
PPTX
Introduction to enterprise search
Data Visualization in SharePoint and Office 365
Is BCS Dead?
Implementing BCS-Business Connectivity Services - Sharepoint 2013- Office 365
What SharePoint is My Ferrari?
SharePoint Pros & Cons (2007-2010)
Office 365 and share point online ramp up in 60 minutes for on-premises share...
SharePoint 2010 Integration and Interoperability - SharePoint Saturday Hartford
Age of Exploration: How to Achieve Enterprise-Wide Discovery
Sp2010success
SharePoint Fest Chicago 2014 - Anatomy of SharePoint and Office 365 Hybrid De...
WPF: Working with Data
Data Centric Composites and mashups In SharePoint 2010
Introduction to SharePoint 2010
2017 01-11 intelligent search and intranet - chihuahuas vs muffins v1
Rits Brown Bag - Salesforce Lightning External Connection
Share poinrt 2013 planning consideration sps atlanta
Sharepoint Moss 2007 Pros & Cons by Toby Ward, Prescient Digital Media
Ferraz Ia252 Developing An Information Architecture
SharePoint Syntex 5 Practical Uses
Introduction to enterprise search
Ad

Viewers also liked (20)

PPTX
SharePoint 2013 Search - Lessons Learned
PPTX
Agnes Molnar - Is Enterprise Search Dead???
PPTX
The Future of Enterprise Search - #SPSUK Keynote
PPTX
Real World Challenges in Enterprise Search
PPTX
Singapore SharePoint User Group - Real World Challenges in Enterprise Search
PPTX
Five Business Challenges of Hybrid Search #Live360
PPTX
Office Graph and Delve - The Future of Discovering and Consuming INformation?
PPTX
Search Quality Management
PPTX
Best Practices of Information Architecture and Search
PPTX
Agnes Molnar - Best Practices for Information Architecture and Enterprise Search
PPTX
10 steps to be successful with search
PPTX
Search Based Applications - Pecha Kucha session at #IKOSG2015
PPTX
Modern Knowledge Management
PPTX
Scoping a Successful SharePoint 2016 Hybrid Search Implementation
PPTX
Ms Search and Mr Project
PPTX
Connecting External Content to SharePoint Search
PPTX
Agnes Molnar - 10 Steps to be Successful with Enterprise Search #Collab365Summit
PPTX
Agnes Molnar - 10 Steps to be Successful with Enterprise Search
PPTX
Agnes Molnar - Scoping and Enterprise Search Implementation
PPTX
Managing and Troubleshooting SharePoint 2013 Search
SharePoint 2013 Search - Lessons Learned
Agnes Molnar - Is Enterprise Search Dead???
The Future of Enterprise Search - #SPSUK Keynote
Real World Challenges in Enterprise Search
Singapore SharePoint User Group - Real World Challenges in Enterprise Search
Five Business Challenges of Hybrid Search #Live360
Office Graph and Delve - The Future of Discovering and Consuming INformation?
Search Quality Management
Best Practices of Information Architecture and Search
Agnes Molnar - Best Practices for Information Architecture and Enterprise Search
10 steps to be successful with search
Search Based Applications - Pecha Kucha session at #IKOSG2015
Modern Knowledge Management
Scoping a Successful SharePoint 2016 Hybrid Search Implementation
Ms Search and Mr Project
Connecting External Content to SharePoint Search
Agnes Molnar - 10 Steps to be Successful with Enterprise Search #Collab365Summit
Agnes Molnar - 10 Steps to be Successful with Enterprise Search
Agnes Molnar - Scoping and Enterprise Search Implementation
Managing and Troubleshooting SharePoint 2013 Search
Ad

Similar to Solving Real World Challenges with Enterprise Search (20)

PPT
Share Point Governance: 10 Steps to Successful Deployment by Joel Oleson Bes...
PPTX
Office 365 SUGUK march 2011
PPTX
SharePoint 2013: What's New For Legal?
PPT
SharePoint Governance: From Chaos to Success in 10 Steps
PPTX
Managesp 160805190411
PPTX
Share point online 미리보기
PPT
Avoiding Failed Deployments Part 2 Interactive Discussion by Joel Oleson
PDF
Microsoft SharePoint 2013 : The Ultimate Enterprise Collaboration Platform
PPTX
Microsoft SharePoint - Edureka Webinar
PPTX
Microsoft Sharepoint 2013 : The Ultimate Enterprise Collaboration Platform
PPTX
Migrating Your Intranet to SharePoint Online
PPTX
Introducing the Salesforce platform
PPTX
SharePoint Server 2016 - Lets get ready - Wisconsin SharePoint User Group
PPT
Governance
PDF
Elevate london dec 2014.pptx
PPTX
Power User functionality in SharePoint 2013 - SP Intersection
PPTX
Sharepoint 2010 architecture, ha and dr (tig)
PPTX
SharePoint 2016 Hybrid Overview
PPTX
SharePoint Intersections - SP10 - Getting Started with Office 365 - Identity,...
PDF
Empowering Teamwork with Mobile and Intelligent Intranet with SharePoint
Share Point Governance: 10 Steps to Successful Deployment by Joel Oleson Bes...
Office 365 SUGUK march 2011
SharePoint 2013: What's New For Legal?
SharePoint Governance: From Chaos to Success in 10 Steps
Managesp 160805190411
Share point online 미리보기
Avoiding Failed Deployments Part 2 Interactive Discussion by Joel Oleson
Microsoft SharePoint 2013 : The Ultimate Enterprise Collaboration Platform
Microsoft SharePoint - Edureka Webinar
Microsoft Sharepoint 2013 : The Ultimate Enterprise Collaboration Platform
Migrating Your Intranet to SharePoint Online
Introducing the Salesforce platform
SharePoint Server 2016 - Lets get ready - Wisconsin SharePoint User Group
Governance
Elevate london dec 2014.pptx
Power User functionality in SharePoint 2013 - SP Intersection
Sharepoint 2010 architecture, ha and dr (tig)
SharePoint 2016 Hybrid Overview
SharePoint Intersections - SP10 - Getting Started with Office 365 - Identity,...
Empowering Teamwork with Mobile and Intelligent Intranet with SharePoint

More from Agnes Molnar (13)

PPTX
Microsoft 365 Collaboration Conference Virtual Event - Agnes Molnar - Microso...
PPTX
Live360 2019 - Agnes Molnar - Search Like a Pro
PPTX
Search like a Pro: Mythbusting the "Black Box" of Search
PPTX
Workshop: Search Managers Bootcamp
PPTX
Search Like a Pro: Mythbusting the "Black Box" of Search
PDF
Enterprise Search and Findability in Office 365
PPTX
SharePoint Conference 2019: Microsoft Search in YOUR Organization
PPTX
Why You Need to Invest to Search and How to do it
PPTX
Agnes Molnar: Personalized Search and Collaboration in Office 365
PPTX
Intelligent Insights and Collaboration in Office 365 #Live!360
PPTX
Unified Search Experiences in SharePoint
PPTX
10 Steps to be Successful with Enterprise Search - INNOVA (20min)
PPTX
How to be Successful with Search in YOUR Organization
Microsoft 365 Collaboration Conference Virtual Event - Agnes Molnar - Microso...
Live360 2019 - Agnes Molnar - Search Like a Pro
Search like a Pro: Mythbusting the "Black Box" of Search
Workshop: Search Managers Bootcamp
Search Like a Pro: Mythbusting the "Black Box" of Search
Enterprise Search and Findability in Office 365
SharePoint Conference 2019: Microsoft Search in YOUR Organization
Why You Need to Invest to Search and How to do it
Agnes Molnar: Personalized Search and Collaboration in Office 365
Intelligent Insights and Collaboration in Office 365 #Live!360
Unified Search Experiences in SharePoint
10 Steps to be Successful with Enterprise Search - INNOVA (20min)
How to be Successful with Search in YOUR Organization

Recently uploaded (20)

PPT
Teaching material agriculture food technology
PDF
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
PDF
A comparative analysis of optical character recognition models for extracting...
PPTX
Tartificialntelligence_presentation.pptx
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PDF
Unlocking AI with Model Context Protocol (MCP)
PDF
cuic standard and advanced reporting.pdf
PDF
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
PPTX
1. Introduction to Computer Programming.pptx
PDF
Encapsulation theory and applications.pdf
PDF
Electronic commerce courselecture one. Pdf
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
PPTX
Group 1 Presentation -Planning and Decision Making .pptx
PPTX
Programs and apps: productivity, graphics, security and other tools
PPTX
Digital-Transformation-Roadmap-for-Companies.pptx
PDF
gpt5_lecture_notes_comprehensive_20250812015547.pdf
PDF
Empathic Computing: Creating Shared Understanding
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
Getting Started with Data Integration: FME Form 101
PDF
NewMind AI Weekly Chronicles - August'25-Week II
Teaching material agriculture food technology
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
A comparative analysis of optical character recognition models for extracting...
Tartificialntelligence_presentation.pptx
Diabetes mellitus diagnosis method based random forest with bat algorithm
Unlocking AI with Model Context Protocol (MCP)
cuic standard and advanced reporting.pdf
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
1. Introduction to Computer Programming.pptx
Encapsulation theory and applications.pdf
Electronic commerce courselecture one. Pdf
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
Group 1 Presentation -Planning and Decision Making .pptx
Programs and apps: productivity, graphics, security and other tools
Digital-Transformation-Roadmap-for-Companies.pptx
gpt5_lecture_notes_comprehensive_20250812015547.pdf
Empathic Computing: Creating Shared Understanding
Advanced methodologies resolving dimensionality complications for autism neur...
Getting Started with Data Integration: FME Form 101
NewMind AI Weekly Chronicles - August'25-Week II

Solving Real World Challenges with Enterprise Search

  • 1. SharePoint Intersection Session SP40 Solving Real World Challenges with Enterprise Search Agnes Molnar International Consultant, ECM & Search Expert aghy@aghy.hu
  • 2. Introduction – Agnes Molnar International SharePoint Consultant • 10+ Years SharePoint Experience • Information Architecture & ECM • Search SharePoint Server MVP • 6 Years SharePoint Server MVP • 5+ Years Speaking at Conferences Around the World • Numerous Books, White Papers, Articles Contact • E-mail: aghy@aghy.hu • Blog: http://aghy.hu • Twitter: @molnaragnes 2 © DEVintersection. All rights reserved. http://www.DEVintersection.com
  • 3. Agenda 3 © DEVintersection. All rights reserved. http://www.DEVintersection.com
  • 4. Information Overload OR Filter Failure? Source - http://financiallyeliteblog.com/wp-content/uploads/2011/04/information-overload.jpg
  • 5. Enterprise Search Search Technology that your organization owns and controls 5 © DEVintersection. All rights reserved. http://www.DEVintersection.com
  • 6. Search is Easy… Find is the real challenge! 6 © DEVintersection. All rights reserved. http://www.DEVintersection.com
  • 7. Search as an Application Source: http://www.domorewithsearch.com 7 © DEVintersection. All rights reserved. http://www.DEVintersection.com
  • 8. Search as an Application  Search is no longer the white box  Content lives in disparate locations  Structured and unstructured content lives in different locations  Need to aggregate content according to       Process Context Customer Goal Program Parameter of any of the above 8 © DEVintersection. All rights reserved. http://www.DEVintersection.com
  • 9. User – Context – Content  Context: Business models & goals, corporate culture, resources  Context [Where information is used]  Content: Document types Objects, structure, attributes, Meta-information  [How to describe the information]  Users: Information needs, audience types, expertise, tasks  Content Users [How to Use the Information] 9 © DEVintersection. All rights reserved. http://www.DEVintersection.com
  • 10. Requirements Gathering Types of Content Types of Users Users’ Behavior Content Sources Metadata Actions to Take Amount of Content Current “Pain Points” 10 © DEVintersection. All rights reserved. http://www.DEVintersection.com
  • 11. Search is more than Technology Source: http://searchpatterns.org 11 © DEVintersection. All rights reserved. http://www.DEVintersection.com
  • 12. The Complexity of Enterprise Information What we give to the search engine… What the search engine sees… Title Author Created Date Modified Date File Type … Overview of SharePoint 2013 Preview Installation and Configuration Alex Yarrow 06/21/2012 10/16/2012 docx … 12 © DEVintersection. All rights reserved. http://www.DEVintersection.com
  • 13. Explicit metadata versus implicit metadata Content Type = License Explicit metadata ABC Company Organization = DEF Company Topic = Forward Index – Words per document Inverted Index – Documents per word Support ABC shall provide first level technical support to all Licensed Product end users and/or Sublicensed Product customers/users. DEF will provide second level support. DEF shall provide to ABC a primary and a secondary support person to act as the primary interface with ABC’s technical and customer support team. DEF shall provide direct technical support to ABC for all uses of the DEF Software. Support level definitions and responsibilities are set forth in Exhibit C. An “SLA Failure” as defined in Exhibit C shall qualify as a Release Condition sufficient to authorize the Escrow Agent to release to Source Code to ABC pursuant to Section 7 and the Escrow Agreement. ABC customers customer support customer support team DEF DEF software end users escrow agreement. escrow agent exhibit c licensed product release condition section 7 secondary support SLA SLA failure software source code support level sublicensed product technical support Implicit metadata 13 © DEVintersection. All rights reserved. http://www.DEVintersection.com
  • 14. The Complexity of Search Result Block Data Source Content Source Result Block Data Source Query Rule Query Rule Query Rule Result Set Display Templates Content Source Data Source metadata Content Source Data Source Local Search Index Refinement Panel Result Source Indexing Hover Panel Federation Result Source Remote Search index 14 © DEVintersection. All rights reserved. http://www.DEVintersection.com
  • 15. Requirements Gathering Information-Seeking Patterns  „I know what I’m searching for and know how to do that”  „I know what I’m searching for but I don’t know how to do that”  „I don’t know what I’m searching for”  „Am I Searching?...” 15 © DEVintersection. All rights reserved. http://www.DEVintersection.com
  • 17. Content Inventory  “I have a lot of content, but I don’t know what to do with them…” 17 © DEVintersection. All rights reserved. http://www.DEVintersection.com
  • 18. Content Inventory  SharePoint content (2013, 2010, …)      File shares      Internal communication Business Data   Company public web site Professional Know-How Web Sites (finance, IT, development, etc.) Common interest (stock, management, etc.) Exchange Public Folders   Sales repository (RFPs, proposals, etc.) Marketing documents (DMs, brochures, etc.) Web sites   Intranet Department sites Project sites Internal KB Data from databases Custom connector   SAP data CRM data 18 © DEVintersection. All rights reserved. http://www.DEVintersection.com
  • 19. Search Federation 19 © DEVintersection. All rights reserved. http://www.DEVintersection.com
  • 20. Crawl or Federate? – Where to get the content from?  Crawl + Use Local Index:  Examples:    Pros:     Full control over the index (crawl schedule, metadata included, etc.) and ranking model Results can be aggregated into one result set Common refiners (facets) Cons:    Intranet Company file shares Needs resources for the crawling process Needs storage to store the index Federate:  Examples:     Pros:   Professional know-how web sites (TechNet, MSDN, etc.) Internet results for a specific topic (financial news, stock information, etc.) 3rd party Content Management System Doesn’t need resources to crawl / store the index Cons:     Live Internet connection is required No control over the index No control over the ranking model No real aggregation with other result sources 20 © DEVintersection. All rights reserved. http://www.DEVintersection.com
  • 21. Content Source Inventory
      Name | Type | Location | Owner | Volume of Content | Frequency of Updates
      Intranet | SharePoint | http://intranet | Intranet Team | 200K items | 100-300/hr
      Project Sites | SharePoint | http://projects | Delivery | 200K items | 100-200/hr
      Sales share | File share | X:\Sales | Sales | 500K docs | 300-500/hr
      Marketing share | File share | X:\Marketing | Marketing | 200K docs | 300-500/hr
      Company web site | Web site | http://mycompany.com | Marketing / Publishing Team | <100K pages | 1-10/day
      Competitor’s web site | Web site | http://competitor.com | [external] | <100K pages | 1-10/day
      Professional Know-How | Web site | http://www.mykb.com | [external] | <100K pages | 5-10/week
      Company Announcements | Exchange Public Folder | Exchange/Public Folders/Announcements | Marketing / Internal Comm. Team | <100K items | 5-10/day
      HR data | Business Data (SQL) | SQL database | HR | <100K items | 10-100/day
      CRM data | Custom Connector | CRM system | Sales | 500K entries | 500-1000/hr
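The "Frequency of Updates" column mixes hourly, daily, and weekly rates, which makes sources hard to compare when planning crawl schedules. A minimal sketch, assuming the `low-high/unit` notation used in the inventory, that normalizes each range to updates per day. The `ContentSource` class and `updates_per_day` helper are hypothetical names for illustration, not part of any SharePoint API.

```python
from dataclasses import dataclass

# Multipliers that convert a rate per unit into a rate per day.
UNITS_PER_DAY = {"hr": 24.0, "day": 1.0, "week": 1.0 / 7.0}


@dataclass
class ContentSource:
    """One row of the inventory (hypothetical structure, subset of columns)."""
    name: str
    source_type: str
    update_frequency: str  # e.g. "100-300/hr", copied from the inventory


def updates_per_day(frequency: str) -> tuple[float, float]:
    """Convert a 'low-high/unit' range into (low, high) updates per day."""
    amount, unit = frequency.split("/")
    low, high = (float(part) for part in amount.split("-"))
    factor = UNITS_PER_DAY[unit]
    return low * factor, high * factor


sources = [
    ContentSource("Intranet", "SharePoint", "100-300/hr"),
    ContentSource("Professional Know-How", "Web site", "5-10/week"),
    ContentSource("CRM data", "Custom Connector", "500-1000/hr"),
]

# Pick the source with the highest upper bound on daily updates.
busiest = max(sources, key=lambda s: updates_per_day(s.update_frequency)[1])
```

Sorting the inventory by the normalized upper bound is one possible way to decide which sources deserve the most frequent incremental crawls.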
  • 22. Metadata in Search
      - The “glue” of Search Applications
      - Crawled property: metadata extracted from the documents/items during the crawl.
      - Managed property: mapped to crawled properties, controlled by Search Admins, helping users perform more efficient and successful queries: Refiners; Displayed in Search Results; Sorting Properties
  • 23. Metadata in Search
      - Crawled Property: Author; CreatedBy; From
      - Managed Property: Author
      - Usage: Refiner; Display on Result Set; Display on Hover Panel; Sorting by
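The idea of several crawled properties feeding one managed property can be sketched in a few lines. This is an illustration only, assuming a priority order of Author, CreatedBy, From; the `MANAGED_PROPERTY_MAP` dictionary and `resolve_managed_properties` function are hypothetical and do not reflect how the SharePoint search schema is configured or stored.

```python
# Hypothetical mapping: one managed property fed by several crawled
# properties, listed in priority order (an assumption for this sketch).
MANAGED_PROPERTY_MAP = {
    "Author": ["Author", "CreatedBy", "From"],
}


def resolve_managed_properties(crawled, mapping=MANAGED_PROPERTY_MAP):
    """Return managed property values, taking the first non-empty crawled value."""
    managed = {}
    for managed_name, crawled_names in mapping.items():
        for crawled_name in crawled_names:
            value = crawled.get(crawled_name)
            if value:
                managed[managed_name] = value
                break
    return managed


# An e-mail item exposes only "From"; a document exposes "CreatedBy".
email_item = resolve_managed_properties({"From": "alice@example.com"})
document_item = resolve_managed_properties({"CreatedBy": "Bob", "From": ""})
```

Once both items expose the same managed property, a single refiner or sort order can cover content from different sources.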
  • 24. Using Managed Properties
      - In Query Rules
      - Refinement
      - Result Type & Display Template
      - On Hover Panel
  • 25. Security
      - “Users can see what they have access to.” vs. “Users cannot see what they don’t have access to.”
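Both statements describe security trimming: a result is shown only if the querying user is allowed to read the underlying item. A minimal sketch of that filter, assuming each result carries an `acl` list of group names; the field name, the group names, and the `security_trim` function are hypothetical and are not SharePoint's actual implementation.

```python
def security_trim(results, user_groups):
    """Keep only results whose ACL shares at least one group with the user."""
    allowed = set(user_groups)
    return [result for result in results if allowed & set(result["acl"])]


results = [
    {"title": "Sales proposal", "acl": ["Sales"]},
    {"title": "HR salary data", "acl": ["HR"]},
    {"title": "Company announcement", "acl": ["Everyone"]},
]

# A sales user who is also a member of the company-wide group.
visible = security_trim(results, ["Sales", "Everyone"])
visible_titles = [result["title"] for result in visible]
```

The trimming is only as good as the ACLs captured from the source system, which is why testing against the source system matters.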
  • 26. The Search Security Paradox: As Search is deployed further and further into the Enterprise, the likelihood of having a security problem increases.
  • 27. Sizing and Capacity Planning
      - “Sounds good, but I’m not sure if we have resources for this…”
  • 29. Components – Scaling cheat sheet [table; columns: CPU, Network, Disk, Memory; rows: Search administration, Crawling, Content processing (CPC), Analytics processing (APC), Index, Query processing (QPC)]
  • 30. Sorting the Results – Relevance Ranking
      - Requirements: “I’d like to see ALL the relevant results.” vs. “I don’t want to see anything that is not relevant (to me, in this context).”
  • 31. User Experience
      - Recall: the fraction of relevant instances that are retrieved
      - Precision: the fraction of retrieved instances that are relevant
      - Relevance: how well a retrieved document or set of documents meets the information need of the current user, in the current context
      - Ranking: the order in which the search results for a query appear
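The two definitions translate directly into set arithmetic. A small sketch, assuming documents are identified by simple IDs and that a relevance judgment is available for the query; the function name `precision_recall` and the sample IDs are illustrative.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Four documents returned, three documents actually relevant, two in common.
precision, recall = precision_recall(
    retrieved=["a", "b", "c", "d"],
    relevant=["a", "b", "e"],
)
```

Here half of what was returned is relevant (precision 0.5), and two of the three relevant documents were found (recall about 0.67). Returning more results tends to raise recall at the cost of precision.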
  • 32. Sorting the Results – Relevance Ranking
      - Various elements can be monitored, interpreted, or used in the calculation of ranking
      - These can be tuned and weighted in different ways to impact results
      Element | Description
      Freshness | Age of a document compared to the time when the query is issued
      Authority | Importance of a document determined by the links to it from other documents
      Quality | Assigned importance of a document, independent of the query
      Geo | Importance of geographical distance between a document’s associated latitude/longitude and a target location specified in a query
      Context | Importance of matching a query in a given document field
      Proximity | For multi-term queries: the shorter the distance between query terms in a document, the higher the document’s rank value
      Position | The earlier a query term occurs in a field, the higher the document’s rank value
      Frequency | The more frequently a query term occurs in a document, the higher the document’s rank value
      Completeness | The greater the number of query terms present in the same field of a matching document, the higher the document’s rank value
      Number | For multi-term queries: the more query terms matched in a document, the higher the document’s rank value
      Reference: Okapi BM25 http://en.wikipedia.org/wiki/Probabilistic_relevance_model_(BM25)
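To make the Frequency and Number elements concrete, here is a sketch of textbook Okapi BM25 over pre-tokenized documents. It is not SharePoint's ranking model; it uses the common non-negative IDF variant, and the parameter values `k1=1.2` and `b=0.75` are conventional defaults chosen as assumptions for this example.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.2, b=0.75):
    """Score each tokenized document against the query with plain BM25."""
    if not documents:
        return []
    n_docs = len(documents)
    avg_len = sum(len(doc) for doc in documents) / n_docs

    # Number of documents that contain each term at least once.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))

    scores = []
    for doc in documents:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            # Rare terms get a higher weight than common ones.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency saturates, and long documents are penalized.
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


docs = [
    ["search", "ranking", "search"],
    ["crawl", "index"],
    ["search", "index", "schema", "crawl"],
]
scores = bm25_scores(["search"], docs)
```

The first document scores highest because the query term occurs twice in a short document; the second scores zero because it does not contain the term at all.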
  • 33. Search Analytics: “How to Improve the Search Experience?”
  • 34. Search Analytics in SharePoint 2013
      - Usage Events: As users interact with content in SharePoint, actions are captured and stored as events (click a link, press a button, view or open a document).
      - Access and create experiences using data captured in the analytics database.
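The usage-event idea can be illustrated with a simple aggregation: each interaction is recorded as an event, and counts per item and event type become signals such as "number of views". The event shape (`item`, `type`) and the function name are assumptions made for this sketch, not the SharePoint analytics schema.

```python
from collections import Counter


def count_usage_events(events):
    """Count events per (item, event type) pair."""
    counts = Counter()
    for event in events:
        counts[(event["item"], event["type"])] += 1
    return counts


events = [
    {"item": "proposal.docx", "type": "view"},
    {"item": "proposal.docx", "type": "view"},
    {"item": "proposal.docx", "type": "click"},
    {"item": "brochure.pdf", "type": "view"},
]
usage = count_usage_events(events)

# Items ordered by number of views, most viewed first.
most_viewed = sorted(
    {item for item, kind in usage if kind == "view"},
    key=lambda item: usage[(item, "view")],
    reverse=True,
)
```

Counts like these could feed a "most popular" list or be used as one input when tuning ranking.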
  • 35. Search Analytics – Examples
  • 36. Search Analytics – Examples
  • 37. Conclusions
  • 38. Want to Learn More?
      - SP41 How to Manage and Troubleshoot Search – A Practical Guide
      - POSTCON03 Architecting the Optimal Enterprise Search Strategy
      - Blog: http://aghy.hu
      - The Essential Guide to Enterprise Search in SharePoint 2013 (free e-book): http://www.bainsight.com/pages/sharepoint-search-2013.aspx
      - Search Circle (subscription service for Search Managers): http://www.intranetfocus.com/enterprise-search/thesearchcircle
      - SharePoint Videos (online training): http://www.SharePoint-Videos.com (code for 30 days of free access: SPC12Free)
      - Online webinars and training for IA and Search Managers: http://earley.com/Training-Webinars
  • 39. Questions? Don’t forget to enter your evaluation of this session using EventBoard! Thank you!

Editor's Notes

  • #5: Source: http://financiallyeliteblog.com/wp-content/uploads/2011/04/information-overload.jpg
  • #6: No longer within the firewall; Relevance is critical; Search within the organization; “Transparent” Search; Search Driven Applications
  • #11: Management by Walking Around
  • #23: “Join” by… Filter; Refinement; Display; Sort/Order
  • #25: Resource: Configure properties of the Search Box Web Part in SharePoint Server 2013 (http://technet.microsoft.com/en-us/library/gg576963.aspx). Entity Extraction for other content sources
  • #27: Search “opens up windows” but not a “security leak”!! Plan!! Research on SOURCE SYSTEM, involve the admins there!! Test: On Source system; On Search. Involve: Source system key users; Source system admins; Test users (<7); More test users
  • #32: The relevant items are to the left of the straight line while the retrieved items are within the oval. The red regions represent errors. On the left these are the relevant items not retrieved (false negatives), while on the right they are the retrieved items that are not relevant (false positives).
  • #35: New analytics processing component analyzes content in the search index and user actions that were performed on a site to identify items that users perceive as more relevant than others. Number of Views; Number of Clicks; Overall item usage; Recommendation; Social distance…
  • #38: Jeff