Mastering Solr
   Jur de Vries
Who am I?

Developer/architect at Triquanta
Trainer at Wizzlern
Use case

Market place
Advertisements
Adjust relevancy
Paid boosting of ads
Of course we use Drupal and Apache Solr
Running Solr Locally

Download latest version (3.6)
Be sure to download distribution (not src)
Unpack solr
Go to example directory
Run
 java -jar start.jar
Drupal: which contrib?

2 Possibilities
  Apachesolr search
  Search api with solr backend
Apache solr search

Strengths:
  Supported by Acquia
  Easy to set up
  Mature
Weaknesses
  Integration with views (still in dev)
Search API

Strengths
  Flexible
  Indexes all entities
  Excellent views integration
  Related fields are easy to add to index
Weaknesses
  Not supported (yet) by Acquia
  Solr backend has some issues
Drupal: which contrib?

Apachesolr search integration
  Quick setup
  Acquia
Search API
  Exportable configuration
  Views integration
  Index all entities
Depends on your needs
Basic use of Search API

Create server
Create index
  Select fields to index
  Define data alterations
  Define processors
Start indexing
Field types

Integer, date, boolean
String or fulltext?
  Fulltext will get processed!
      Tokenize
      Stopwords
      Ignore case
  String is as is
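
The difference between a fulltext and a string field can be sketched in a few lines of Python. This is only a toy model of the three processing steps named above; the stopword list and function names are made up for illustration, and real Solr analysis is configured in schema.xml rather than written like this.

```python
import re

# Hypothetical stopword list for illustration only.
STOPWORDS = {"a", "an", "and", "or", "the"}


def analyze_fulltext(value):
    """Toy fulltext chain: tokenize, ignore case, drop stopwords."""
    tokens = re.findall(r"\w+", value)            # tokenize on word characters
    lowered = [token.lower() for token in tokens]  # ignore case
    return [token for token in lowered if token not in STOPWORDS]


def analyze_string(value):
    """A string field is kept as one unmodified term."""
    return [value]


print(analyze_fulltext("The Bike and the Trailer"))
print(analyze_string("The Bike and the Trailer"))
```

A search for "bike" can match the fulltext terms, while the string field only matches the exact original value, which is what makes it suitable for filters and facets.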
Demo

Run solr
Copy schema.xml and solrconfig.xml (!)
Create server
Create index
Create view
  ads
  Ad filter exposed: search
Advanced use of Search API

This talk is about Solr, not about Search API
Understand Solr first!
Many resources on the web
Watch screencasts etc
Mastering Solr

Mastering solr is understanding solr
What happens after the Drupal module hands over the query?
Let's have a look at the request
Solr request

Look at solr log
Parameters:
  start
  rows
  q (query)
  qf (query fields)
  fl (fields)
  fq (filter query)
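
To try these parameters outside Drupal, a request URL can be assembled by hand. The sketch below assumes the example Solr instance on localhost:8983 and borrows field names that appear elsewhere in this talk (item_id, t_title, t_body, content_type); the helper name and the sample values are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_select_url(base_url, keywords, start=0, rows=10):
    """Assemble a /select URL from the parameters listed above."""
    params = {
        "q": keywords,               # what the user typed
        "qf": "t_title t_body",      # fields to search in
        "fl": "item_id,score",       # fields to return
        "fq": "content_type:add",    # filter, does not affect the score
        "start": start,              # paging offset
        "rows": rows,                # page size
    }
    return base_url + "/select?" + urlencode(params)


url = build_select_url("http://localhost:8983/solr", "red bike")
print(url)
print(parse_qs(urlsplit(url).query))
```

Pasting the printed URL into a browser is a quick way to compare a hand-built request with the one that shows up in the Solr log.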
Field names

item_id, id
t_.., ss_.., → why?
Solr has to know how to handle fields
Field api: field names differ
Dynamic field names: tell solr field type!
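
How a prefix tells Solr the field type can be illustrated with simple pattern matching. The patterns and type names below are assumptions written in the style of dynamicField entries; the real list lives in the schema.xml shipped with the module. Taking the first matching pattern is also a simplification of how Solr resolves overlapping patterns.

```python
from fnmatch import fnmatchcase

# Hypothetical explicit fields and dynamic patterns, for illustration only.
EXPLICIT_FIELDS = {"id": "string", "item_id": "string"}
DYNAMIC_FIELDS = [
    ("t_*", "text"),     # fulltext fields
    ("ss_*", "string"),  # single-valued string fields
]


def resolve_field_type(name):
    """Return the type for a field name, or None if nothing matches."""
    if name in EXPLICIT_FIELDS:
        return EXPLICIT_FIELDS[name]
    for pattern, field_type in DYNAMIC_FIELDS:
        if fnmatchcase(name, pattern):
            return field_type
    return None


for field in ("item_id", "t_title", "ss_category", "mystery"):
    print(field, "->", resolve_field_type(field))
```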
Schema.xml

Defines field types and fields
The real tweaking starts here!
Let's have a look!
  dynamicField
  field type
  analyzers
Copyfield
What can you do in schema.xml?

Synonyms (disabled by default)
Stopwords (and, or, etc)
Stemming
Proper multilingual handling
Browse the schema

Solr offers schema browsing
Go to: http://localhost:8983/solr/admin
Search relevancy

Types of boosting:
  Field level boost
  Boost function
  Boost query
  (QueryElevation)
Boost parameters

Field level boosting: qf
   qf=t_body^20
   score in field is multiplied by 20
Boost function: bf
   bf=product(fieldname,2)
   result of function is added to score
Boost query: bq
boost (edismax only): like bf, but the result multiplies the score
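
The practical difference between an additive and a multiplicative boost shows up when base scores differ a lot. The following is a deliberately simplified model (it ignores query normalization and everything else Solr does while scoring), with made-up scores and function names.

```python
def apply_bf(base_score, function_value):
    """Additive boost, in the spirit of bf: the function result is added."""
    return base_score + function_value


def apply_boost(base_score, function_value):
    """Multiplicative boost, in the spirit of edismax boost."""
    return base_score * function_value


# Two hypothetical documents: a weak and a strong text match.
for base in (0.5, 5.0):
    print(base, apply_bf(base, 2.0), apply_boost(base, 2.0))
```

With the additive variant the weak match (0.5) ends up at 2.5 and the strong one (5.0) at 7.0, so the boost dominates the weak document's score. With the multiplicative variant both are doubled and their ratio is preserved.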
Let's boost title

Field level boost is incorporated in Search API...
But, where are the numbers in the request???
Search api solr forgot to add them!
There is a patch :-)
But let's do it another way...
Debugging Solr

Let's add &echoParams=all to the request...
Where do all these parameters come from?
Solrconfig.xml!!!
Among other things: request handler
Let's look at the dismax request handler
Solrconfig.xml

(Default) Request handler:
  Default parameters
  Add Spellcheck
  Tweak all kinds of search behavior!
  Let's add default search fields with boost
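
The way handler defaults and request parameters combine can be pictured as a dictionary merge: anything sent with the request wins over the default. This sketch models only the defaults section; the handler values shown are hypothetical, and other request handler sections such as appends and invariants are left out.

```python
def effective_params(handler_defaults, request_params):
    """Request parameters override the handler's defaults."""
    merged = dict(handler_defaults)
    merged.update(request_params)
    return merged


# Hypothetical defaults, as they might be set in solrconfig.xml.
handler_defaults = {
    "defType": "dismax",
    "qf": "t_title^5 t_body",
    "rows": "10",
}
request_params = {"q": "bike", "rows": "20"}

print(effective_params(handler_defaults, request_params))
```

This is roughly what echoParams=all makes visible: the request itself carried only q and rows, yet the search also ran with defType and qf.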
Boost function

Mathematical functions on field values
Available functions:
  sum(x,y): x + y
  product(x,y): x * y
  scale(x, minTarget, maxTarget)
  recip(x, m, a, b): a / (m * x + b)
  ms(): time → ms(NOW/DAY, created)
  Many more!
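
To get a feel for these functions, they are easy to reproduce in Python. The sketch follows the definition in Solr's function query documentation, recip(x,m,a,b) = a / (m*x + b); the dates are made up, and ms() is modeled as a plain difference between two datetimes.

```python
from datetime import datetime, timezone


def recip(x, m, a, b):
    """Reciprocal function: a / (m * x + b)."""
    return a / (m * x + b)


def ms(later, earlier):
    """Milliseconds between two datetimes, like ms(NOW/DAY, created)."""
    return (later - earlier).total_seconds() * 1000.0


# Hypothetical dates: a document created one day before "now".
now = datetime(2012, 6, 1, tzinfo=timezone.utc)
created = datetime(2012, 5, 31, tzinfo=timezone.utc)

age_ms = ms(now, created)
print(age_ms)
print(recip(age_ms, 1, 1000, 1000))
```

One day is already 86,400,000 ms, which is why the raw value of ms() is not usable as a boost without squeezing it through something like recip.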
Boost date

We need ms(): big values!
Linear? Too much difference
Recip!
recip(x,1,1000,1000)
if x = 1000: half
1 year: 3.1e10
recip(ms(NOW/YEAR?, created),1,3.1e10,3.1e10)
bf=recip(ms(NOW/YEAR?, created),1,3.1e10,3.1e10)^3
Use a graphing tool!
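
Without a graphing tool, a few sample points already show the shape of the curve. The sketch uses the rounded year length from the slide (3.1e10 ms) and Solr's documented definition recip(x,m,a,b) = a / (m*x + b); the function name recency_boost is made up.

```python
YEAR_MS = 3.1e10  # approximate number of milliseconds in a year


def recip(x, m, a, b):
    """Reciprocal function: a / (m * x + b)."""
    return a / (m * x + b)


def recency_boost(age_ms):
    """Boost that starts at 1 for a new document and halves after one year."""
    return recip(age_ms, 1, YEAR_MS, YEAR_MS)


for years in (0, 1, 2, 5):
    print(years, round(recency_boost(years * YEAR_MS), 3))
```

The boost falls from 1 to one half after a year, one third after two years and one sixth after five, so old documents are pushed down gently instead of disappearing.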
Boost queries

Same query syntax as fq:
Boost ads:
  content_type:add
  bq=content_type:add
  bq=(content_type:add)^20
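
A boost query is just a normal query clause with a weight attached, so it can be generated rather than typed. In this sketch the helper name is hypothetical and the field and value are the ones from this slide; urlencode is used because the parentheses, colon and caret need escaping in a URL.

```python
from urllib.parse import parse_qs, urlencode


def build_boost_query(field, value, weight):
    """Format a weighted clause such as (content_type:add)^20."""
    return "({}:{})^{}".format(field, value, weight)


params = [
    ("q", "bike"),
    ("defType", "dismax"),
    ("bq", build_boost_query("content_type", "add", 20)),
]
query_string = urlencode(params)

print(query_string)
print(parse_qs(query_string)["bq"])
```

Unlike fq, the clause does not remove anything from the result set; documents that match it simply score higher.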
Debugging relevancy

We know how to boost
How can fine-tuning be done?
Solr has the solution:
  add debugQuery=on
debugQuery=on

Screenshots: debug output in the normal browser view and in the page source view
Relevancy

Choose your boosting methods
Try in your browser
Fine-tuning: debugQuery=on, then view the page source
Add parameters to solrconfig.xml
Or...
Add parameters in code

use
hook_search_api_solr_query_alter(array
  &$call_args, SearchApiQueryInterface $query)
$call_args['params']['bq'] = '(t_title:foo)^20';
$call_args['params']['bf'][] = 'b_promote';
Override solr service class

In Search API: define server class
extend solr service class
Only change key methods
It's all about passing parameters!
Conclusion

Tweak indexing in schema.xml
  Stopwords
  Multilingual
Tweak searching in solrconfig.xml
Tweak searching by passing variables
This is only an introduction!
Questions?
Feedback & follow-up:
http://drupalcampgent.be/feedback
