Faceted search using Solr and Ontopia
2009-11-03
Geir Ove Grønmo, grove@bouvet.no
Agenda
  • Short introductions to Solr and Ontopia
  • What is faceted search?
  • An integration of the two – a prototype
  • Demos
Apache Solr
  • A search engine
      • implemented as an HTTP service on top of Apache Lucene
      • searching and indexing (no web-crawling)
      • adds support for faceted search (and more)
      • sharding and replication
      • distributed search
      • excellent interoperability (i.e. not really Java-specific)
  • Next release: Solr 1.4
  • Open source: http://lucene.apache.org/solr/
  • Apache License 2.0
Ontopia
  • A Topic Maps toolkit:
      • data representation, persistence and querying
      • application development
      • written in Java
  • Next release: Ontopia 5.1
  • Open source: http://code.google.com/p/ontopia/
  • Apache License 2.0
Where the meat is...
  • Solr
      • fast textual search and faceted search support
  • Ontopia
      • rich semantic data and structured search
  • User interface design
      • providing a useful interface to the user
But first, what is faceted search?
  • A technique for refining search results
  • Integrates textual search and navigation
  • Allows concept composition
      • slow + expensive + red + used + car
      • article + in English + about salmon
      • people + aged 20-30 + SQL expert
      • punk rock songs + < 1 minute + in Norwegian + released 1980-1982
  • Supports exploration and learning
  • Never returns zero results
How is it done?
  • Given a starting set
      • usually all documents
      • or the result of filling in the search input box
  • ...do the following:
      • count the number of hits matching each facet field
      • which fields to facet on are defined at query time
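The counting step can be sketched in a few lines of Python. This is an in-memory illustration of the idea only, not how Solr implements it; the function name `facet_counts` and the sample documents are made up for the example.

```python
from collections import Counter


def facet_counts(docs, facet_fields):
    """For each requested field, count how many documents carry each value."""
    counts = {field: Counter() for field in facet_fields}
    for doc in docs:
        for field in facet_fields:
            values = doc.get(field, [])
            if not isinstance(values, list):
                values = [values]  # treat single values like one-element lists
            # set(): a document counts at most once per distinct value
            counts[field].update(set(values))
    return counts


# Hypothetical starting set (for example, the hits for a text query).
docs = [
    {"id": "1", "colour": "red", "condition": "used"},
    {"id": "2", "colour": "blue", "condition": "used"},
    {"id": "3", "colour": "red", "condition": "new"},
]

# The fields to facet on are chosen per call, mirroring "defined at query time".
result = facet_counts(docs, ["colour", "condition"])
```

Here `result["colour"]` holds the hit count per colour, which is what a user interface would display next to each facet value.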
An example without faceted search
Facet types
  • Standard facets
      • a list of facet values
  • Hierarchical facet values
      • taxonomy of facet values
  • Range/query facets
      • dates
      • prices
      • alphabet buckets
      • intervals (lower and upper bounds)
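Range and alphabet facets differ from standard facets in that a raw value is first mapped to a bucket, and the buckets are then counted. A minimal sketch, assuming half-open intervals and a "#" bucket for non-letters (both are choices made for this example, not something the slides prescribe):

```python
def range_bucket(value, bounds):
    """Return a label for the half-open interval [lower, upper) containing value."""
    for lower, upper in zip(bounds, bounds[1:]):
        if lower <= value < upper:
            return f"{lower}-{upper}"
    return None  # value falls outside every interval


def alphabet_bucket(title):
    """Bucket a title by its first letter; anything else goes under '#'."""
    first = title.strip()[:1].upper()
    return first if first.isalpha() else "#"


# Hypothetical price boundaries: 0-100 and 100-1000.
price_bounds = [0, 100, 1000]
labels = [range_bucket(p, price_bounds) for p in [5, 120, 999, 1500]]
```

Counting the resulting labels then works exactly like counting the values of a standard facet.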
Standard facets
Hierarchical facet values
Note: the facets can also be hierarchical
Alphabet buckets
Range facets
User interface considerations
  • Single select
      • link
      • radio button
  • Multi select
      • checkboxes
  • Decide on which operator to use: AND/OR
      • within a facet
      • between facets
  • How many facet values to display
      • given limited screen real estate
  • How to provide intuitive undo operation
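The operator decision can be made concrete with a small filter. The sketch below assumes OR within a facet and AND between facets, which is one common combination rather than the only sensible one; `matches` and the sample data are hypothetical.

```python
def matches(doc, selections):
    """selections maps a facet field to the set of values the user ticked.

    OR within a facet: any one ticked value is enough.
    AND between facets: every facet with a selection must match.
    """
    for field, wanted in selections.items():
        if not wanted:
            continue  # nothing ticked for this facet, so it does not restrict
        values = doc.get(field, [])
        if not isinstance(values, list):
            values = [values]
        if not wanted.intersection(values):
            return False
    return True


docs = [
    {"id": "1", "colour": "red", "condition": "used"},
    {"id": "2", "colour": "blue", "condition": "used"},
    {"id": "3", "colour": "red", "condition": "new"},
]
selections = {"colour": {"red", "blue"}, "condition": {"used"}}
hits = [d["id"] for d in docs if matches(d, selections)]
```

With this design, undoing a choice is simply removing a value from the corresponding set.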
Examples
Scoring
  • Some types of documents should be ranked higher than others
  • Solr lets one boost the default score:
      • per document
      • per field
  • The total score of a document depends on:
      • the boost and score of the fields, adjusted by how relevant a field is relative to the actual query
      • the boost of the document
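As a rough mental model of how the two kinds of boost interact, the toy function below multiplies each field's score by its field boost, sums the results, and scales the sum by the document boost. It is a deliberately simplified assumption for illustration and not Lucene's or Solr's actual scoring formula.

```python
def toy_score(field_scores, field_boosts, doc_boost=1.0):
    """Toy model: doc_boost * sum(field_score * field_boost).

    field_scores: how well each field matched the query (hypothetical numbers).
    field_boosts: per-field boost; fields not listed default to 1.0.
    """
    total = sum(
        score * field_boosts.get(field, 1.0)
        for field, score in field_scores.items()
    )
    return doc_boost * total


# A title match weighted three times as heavily as a body match,
# on a document type that is boosted by a factor of two.
score = toy_score({"title": 0.5, "body": 0.2}, {"title": 3.0}, doc_boost=2.0)
```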
Sorting
  • How to sort the list of facets?
      • by relevance
  • How to sort the values of each facet?
      • by number of hits
      • alphabetically
  • How to sort the search result?
      • by relevance
      • alphabetically
      • by date
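The two orderings for facet values can be sketched as follows. The function name and the tie-breaking rule (ties on hit count fall back to the value itself) are assumptions for this example.

```python
def sort_facet_values(counts, by="count"):
    """Order (value, hits) pairs by descending hit count or alphabetically."""
    if by == "count":
        # Highest count first; ties are broken by the value itself.
        return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    if by == "alpha":
        # Case-insensitive alphabetical order.
        return sorted(counts.items(), key=lambda kv: kv[0].lower())
    raise ValueError(f"unknown sort order: {by}")


counts = {"red": 2, "blue": 5, "green": 2}
by_count = sort_facet_values(counts, by="count")
by_alpha = sort_facet_values(counts, by="alpha")
```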
Proposition
“Concept composition, using faceted search, and Topic Maps is a perfect match”
Why not use Ontopia only?
  • You can, but it is not optimized for this use case
  • It lets you implement faceted search, but it’ll be too slow
  • The reasons are:
      • all the expensive processing will have to happen at runtime, and not indexing time
      • involves a lot of traversal
      • relies on the underlying fulltext search engine
      • search has limited cacheability
Trade-offs
  • Considerations:
      • Search performance
      • Indexing performance
      • Consistency
  • Ontopia
      • no indexing overhead
      • results always up-to-date
  • Solr
      • very fast search
      • indexing overhead
      • index must be kept up-to-date regularly
Solr – the data model
  • An index contains documents
  • Documents have fields
  • A field can have multiple values

    { "id": "1234",
      "title": "Structure and Interpretation of Computer Programs",
      "authors": ["Harold Abelson", "Gerald Jay Sussman"] }
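Given documents shaped like the one above, a faceted query against Solr is an ordinary HTTP request. The sketch below only builds the URL, using Solr's `q`, `fq`, `wt`, `facet` and `facet.field` request parameters. The base URL and the helper name `facet_query_url` are assumptions for the example; the host, port and path depend on the installation.

```python
from urllib.parse import urlencode


def facet_query_url(base_url, text, facet_fields, filters=()):
    """Build a Solr select URL that asks for facet counts on the given fields."""
    params = [("q", text), ("wt", "json"), ("facet", "true")]
    # facet.field may be repeated, once per field to facet on.
    params += [("facet.field", field) for field in facet_fields]
    # Each fq narrows the result set, e.g. after the user picks a facet value.
    params += [("fq", fq) for fq in filters]
    return base_url + "?" + urlencode(params)


url = facet_query_url(
    "http://localhost:8983/solr/select",  # assumed local installation
    "programs",
    ["authors"],
    filters=['authors:"Gerald Jay Sussman"'],
)
```

Because the facet fields are plain request parameters, they can differ from one request to the next.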
Ontopia – the data model
  • A topic map contains topics and information about them
      • Identities
      • Names
      • Associations to other topics
      • Occurrences (read: non-association properties)
Integrating Solr and Ontopia
  • Proposed solution:
      • Solr indexes constructed from Ontopia queries
      • For each document type create a query that extracts data from the topic map to fields in documents
      • Then do faceting on selected fields
  • Use-case specific schema definition
      • should be project specific (to some degree)
  • Perform full index or incremental reindex
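The step from query results to Solr documents can be sketched as a grouping operation. The row layout used here, (document id, field name, value), is an assumption made purely for illustration; the slides do not spell out the actual column layout of the index rules, and `rows_to_documents` is a hypothetical helper.

```python
from collections import defaultdict


def rows_to_documents(rows):
    """Group (doc_id, field, value) rows into one multi-valued document per id."""
    grouped = defaultdict(lambda: defaultdict(list))
    for doc_id, field, value in rows:
        grouped[doc_id][field].append(value)
    # Every field becomes a list, since a field can have multiple values.
    return [{"id": doc_id, **fields} for doc_id, fields in grouped.items()]


# Hypothetical result of a query that extracts data for one document type.
rows = [
    ("1234", "title", "Structure and Interpretation of Computer Programs"),
    ("1234", "authors", "Harold Abelson"),
    ("1234", "authors", "Gerald Jay Sussman"),
]
documents = rows_to_documents(rows)
```

The resulting dictionaries have the same shape as the Solr data model described earlier and could be sent to the index as-is.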
Index rule set
Index rule: Organisasjonsenheter
Query result: Organisasjonsenheter
Solr index: Organisasjonsenhet
Index rule: Artikler
Query result: Artikler
Solr index: Artikler
Demo
A prototype for Bergen kommune
Ideas for the future
  • Faceted search user-interface in Ontopoly
      • could be made declarative
  • Incremental reindexing
      • requires tracking changes
      • usually done with a timestamp
      • implement last-modified field in Ontopoly
  • Add optional fourth column for score boost?
      • a float between 0 and 1
  • Ontopia extensions for interacting with Solr
      • JSP tag library
      • tolog predicates
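Timestamp-based change tracking for incremental reindexing could look roughly like this. The `last_modified` key and the function name are hypothetical; the slides only propose adding such a field to Ontopoly.

```python
from datetime import datetime


def topics_to_reindex(topics, last_run):
    """Return the ids of topics modified after the previous indexing run."""
    return [t["id"] for t in topics if t["last_modified"] > last_run]


# Hypothetical topics carrying a last-modified timestamp.
topics = [
    {"id": "unit-1", "last_modified": datetime(2009, 10, 30, 12, 0)},
    {"id": "article-7", "last_modified": datetime(2009, 11, 2, 9, 30)},
]
last_run = datetime(2009, 11, 1, 0, 0)
stale = topics_to_reindex(topics, last_run)
```

Only the returned topics need to be queried and sent to Solr again; everything else in the index stays untouched.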
More demos
  • Epicurious: recipe search
      • http://www.epicurious.com/tools/searchresults?search=
  • Flickr photo search with hierarchical facets
      • http://people.csail.mit.edu/dfhuynh/projects/hierarchical-facets/test.html
  • A collection of faceted navigation examples:
      • http://www.flickr.com/photos/morville/collections/72157603789246885/
More information
  • 3 Quick Design Patterns for Better Faceted Search
      • http://www.thingsontop.com/3-quick-patterns-better-facet-design-889.html
  • How to Make a Faceted Classification and Put It On the Web
      • http://www.miskatonic.org/library/facet-web-howto.html
  • Book: Faceted Search (Synthesis Lectures on Information Concepts, Retrieval, and Services), Daniel Tunkelang
Structured semantics-rich data...
...is easier to find when using faceted search.
