Solr integration

April 20, 2012
Ard Schrijvers • a.schrijvers@onehippo.com / ard@apache.org
About me:
              Ard Schrijvers

1. Working at Hippo since 2001
2. Email: a.schrijvers@onehippo.com
           ard@apache.org
3. Worked primarily on:
    1. HST
    2. Hippo Repository / Jackrabbit
    3. Lucene
    4. Cocoon
    5. Slide
4. Apache committer of Jackrabbit and Cocoon
Outline


1. The current search (HST / repo) architecture
2. The current problems / shortcomings / mismatches
3. What we are trying to improve, the objectives
4. Solr integration to rescue
5. A very fast demo
6. Wrap up
7. Questions
Current search architecture


                        So
                 An HSTQuery
               is translated to an
                  XPath query
Which is delegated to the repository that returns a
               JCR NodeIterator
          which the HST binds back to
                  HippoBeans
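To make the translation step concrete, here is a minimal sketch of turning a simple "node type + free text" query into a JCR XPath string. The `XPathSketch` class and its `toXPath` method are hypothetical and are not the HST's real translator; only the `//element(*, type)` and `jcr:contains` constructs are standard JCR XPath. The full-text expression inside `jcr:contains` has its own syntax, which this sketch does not escape.

```java
// Hypothetical illustration only: NOT the HST's real query translator.
final class XPathSketch {

    /** Builds a JCR XPath query for nodes of the given type that contain the text. */
    static String toXPath(String nodeType, String text) {
        // Inside an XPath string literal, a single quote is escaped by doubling it.
        String escaped = text.replace("'", "''");
        return "//element(*, " + nodeType + ")[jcr:contains(., '" + escaped + "')]";
    }

    public static void main(String[] args) {
        System.out.println(toXPath("demosite:textdocument", "solr"));
    }
}
```

The repository would execute such a string and return a NodeIterator, after which every node still has to be bound back to a bean.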
Current search architecture


  That sounds doable and not too complex

                  is it?
Current search architecture

Well, it is ....... very complex
Current search architecture

Reasons:

1. Back in the days when Jackrabbit 1 started, Lucene was at
   version 1.4
2. The first JSR-170 spec imposed some very harsh
   constraints: a save must be reflected in search results
   immediately
3. Support for XPath / SQL was needed. However, Lucene
   likes flattened data, JCR with XPath / SQL is all about
   hierarchical data
4. JCR Nodes != Documents
Current problems / shortcomings /
            mismatches
1. JCR Nodes are indexed instead of Documents
   (#nodes >> #documents)
2. A search result only returns Nodes (Rows) : what if you
   want something else, like auto-completion
3. Very hard to customize, and only in very limited ways
4. A single index for an entire workspace
5. Support for very complex XPath / SQL queries at a price
   of CPU, Memory and complexity
6. Only JCR Nodes and properties are indexed : no 'derived'
   field indexes
7. To index external sources, the sources need to be stored in
   the repository
8. Range queries (and others) easily blow up
9. Getting the number of hits is complex
Current problems / shortcomings /
             mismatches
                       Extra problem

                         JCR Nodes
                             !=
                         Documents


For example: a news document contains a link to an author
document. A search on the author name should then find the
news document.
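A minimal sketch of the idea behind this example: when a document is indexed, data from a linked node is copied into a derived field, so that one flat search document can be found by the author name. `ToyNode` and `toDocument` are hypothetical stand-ins and not the JCR or HST API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model, not the JCR API: a "node" is a property map plus an optional link.
final class FlattenSketch {

    static final class ToyNode {
        final Map<String, String> properties = new LinkedHashMap<>();
        ToyNode linkedAuthor; // hypothetical stand-in for a link to an author document
    }

    /** Flattens a news node and its linked author into ONE searchable document. */
    static Map<String, String> toDocument(ToyNode news) {
        Map<String, String> doc = new LinkedHashMap<>(news.properties);
        if (news.linkedAuthor != null) {
            // Derived field: copied from the linked node at indexing time.
            doc.put("author", news.linkedAuthor.properties.get("name"));
        }
        return doc;
    }

    public static void main(String[] args) {
        ToyNode author = new ToyNode();
        author.properties.put("name", "Jane Doe");

        ToyNode news = new ToyNode();
        news.properties.put("title", "Solr integration");
        news.linkedAuthor = author;

        System.out.println(toDocument(news));
    }
}
```

A node-oriented index has no natural place for the `author` field, because the name lives on a different node than the news text.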
Objectives

 1. Fix all the 9+ problems / shortcomings/ mismatches from
    previous slides
 2. Easy to use and customize
 3. Satisfied customers
 4. Satisfied partners
 5. Scalable searches : CPU, memory and large document
    numbers
 6. Document oriented
 7. Integration with HST ContentBeans (HippoBeans)
 8. Index external sources
 9. Control the SIZE of the index yourself
10. Don't invent but integrate ( with out-of-the-box features
    supported by a large community)
Objective: Fix all the 9 problems /
  shortcomings/ mismatches from
           previous slides
Easy:


           Solr integration to rescue
Objective: Easy to use and
        customize



   YOU will be in the driver seat
Objective: Easy to use and
               customize
No more complete dependence on what the sometimes not so
smAR&D Hippo team thought was good for YOU
Objective: Easy to use and
                customize
You decide 'from where', 'what', 'how' and 'when' to index
 1. from where: which sources (jcr, webpages, database,
    noSQL store, nuxeo, alfresco, anything)
 2. what : which parts of a document (not jcr node) or external
    source
 3. how :
     1. which analyzer,
     2. index on document level, property level or both
     3. store the text
 4. when : when do you want to index
Objective: Easy to use and
               customize


      But of course, out-of-the-box support and tooling
                ready to be used by YOU

1. Default hippo repository indexer & observer
2. ContentBean (HippoBean) annotations for indexing
3. Binding search results to ContentBeans
4. Deployment support
5. Clustering support
Objective: Satisfied customers




            HOW?
Objective: Satisfied customers




            EASY
Objective: Satisfied customers




   Most likely they will just be satisfied
Objective: Satisfied customers

If they are not satisfied enough you can:

 1. Easily customize it (aka tune it until you are blue in the face)
 2. Hire anyone with Solr experience : All our partners have
    Solr experience
Objective: Satisfied customers

Still not satisfied?



    Let them pay too much for a Google Search appliance,
   Autonomy or any of the other 'useless to pay for software'
Objective: Satisfied partners




Although on thin ice here, I strongly believe in this because:
Objective: Satisfied partners


1. Our partners frequently have good knowledge about Solr
2. Our partners depend less on the current search limitations
3. Our partners can pitch with their Solr knowledge
4. Our partners can sell more Hippo implementations
5. Our partners will earn more on Hippo and have happier
   developers
6. Hippo will earn more through HES: Which will satisfy
   partners again, because Hippo can spend more on AR&D
   ==> more features
Objective: Scalable searches

1. Using Solr to do the searches
2. Not the complex JCR hierarchical searches
3. Document oriented instead of JCR Nodes ( #docs <<
   #nodes)
Objective: Document oriented



     What do we want to search for?
Objective: Document oriented



           Exactly,

         Documents!!
Objective: Document oriented



          A Document
               ==
          A HippoBean
               !=
           JCR Node
Objective: Document oriented




          So let's index

          HippoBeans
        (ContentBeans)
Objective: Integration with
       ContentBeans (HippoBeans)
As a developer ....



              how am I going to index my beans?
Objective: Integration with
     ContentBeans (HippoBeans)



I know how to write HippoBeans, that's all I ever did in my life
Objective: Integration with
ContentBeans (HippoBeans)



 How do you expect me to index my beans?
Objective: Integration with
         ContentBeans (HippoBeans)
Annotate your getters with

                             @IndexField
                                  or
                        @IndexField(name="foo")

And account for them in Solr schema.xml
 <field name="title" type="text_general" indexed="true" stored="true" />
 <field name="summary" type="text_general" indexed="true" stored="true"/>
Objective: Integration with
          ContentBeans (HippoBeans)
An example:
@Node(jcrType="demosite:textdocument")
public class TextBean extends BaseDocument {

     @IndexField
     public String getTitle() {
         return getProperty("demosite:title") ;
     }
     @IndexField(name="samenvatting")
     public String getSummary() {
         return getProperty("demosite:summary") ;
     }
}
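How an indexer might consume these annotations can be sketched with plain reflection. Everything below is an assumption made for illustration: the nested `@IndexField` is a stand-in declared locally (the real HST annotation may differ), `TextBean` is a plain class rather than a HippoBean, and the rule that `getTitle` maps to the field name `title` is a guess at the default naming.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

final class IndexFieldSketch {

    // Hypothetical stand-in for the HST annotation; the real one may differ.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface IndexField {
        String name() default "";
    }

    // Plain bean standing in for a HippoBean (no repository needed here).
    static final class TextBean {
        @IndexField
        public String getTitle() { return "Hello Solr"; }

        @IndexField(name = "samenvatting")
        public String getSummary() { return "A short summary"; }

        public String getInternalNote() { return "not indexed"; }
    }

    /** Collects field name -> value for every annotated no-arg getter. */
    static Map<String, Object> extractFields(Object bean) {
        Map<String, Object> fields = new TreeMap<>();
        for (Method m : bean.getClass().getMethods()) {
            IndexField ann = m.getAnnotation(IndexField.class);
            if (ann == null || m.getParameterCount() != 0) {
                continue; // only annotated getters are indexed
            }
            String name = ann.name().isEmpty() ? defaultName(m.getName()) : ann.name();
            try {
                fields.put(name, m.invoke(bean));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot read " + m.getName(), e);
            }
        }
        return fields;
    }

    /** getTitle -> title (assumed default naming rule). */
    static String defaultName(String getter) {
        String base = getter.startsWith("get") ? getter.substring(3) : getter;
        if (base.isEmpty()) {
            return getter;
        }
        return Character.toLowerCase(base.charAt(0)) + base.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(extractFields(new TextBean()));
    }
}
```

Each resulting field name, including an explicit one such as `samenvatting`, still has to be declared in the Solr schema.xml.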
Objective: Integration with
          ContentBeans (HippoBeans)
Another example:
@Node(jcrType="demosite:textdocument")
public class TextBean extends BaseDocument {

     @IndexField
     public String getTitle() {
         return getProperty("demosite:title") ;
     }
     @IndexField
     public String getSummary() {
         return getProperty("demosite:summary") ;
     }

     @IndexField
     public String getAuthor() {
         return getLinkedBean("demosite:author", Author.class).getAuthor();
     }
}
Objective: Integration with
          ContentBeans (HippoBeans)
Another example:
@Node(jcrType="demosite:textdocument")
public class TextBean extends BaseDocument {

     @IndexField
     public String getTitle() {
         return getProperty("demosite:title") ;
     }
     @IndexField
     public String getSummary() {
         return getProperty("demosite:summary") ;
     }

     @ReIndexOnChange
     @IndexField
     public Author getAuthor() {
         return getLinkedBean("demosite:author", Author.class);
     }
}
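The slide does not spell out what `@ReIndexOnChange` does. Assuming it means "re-index this document whenever the linked bean changes", the bookkeeping could look like the sketch below. `ReindexSketch` and its methods are hypothetical names, not part of the HST.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical bookkeeping, assuming @ReIndexOnChange means
// "re-index this document when the linked bean changes".
final class ReindexSketch {

    // linked bean id -> ids of documents whose index entry embeds data from it
    private final Map<String, Set<String>> dependents = new HashMap<>();

    /** Remember that documentId embeds data from linkedId. */
    void recordLink(String documentId, String linkedId) {
        dependents.computeIfAbsent(linkedId, k -> new TreeSet<>()).add(documentId);
    }

    /** Documents that must be re-indexed after changedId was modified. */
    Set<String> documentsToReindex(String changedId) {
        return dependents.getOrDefault(changedId, Collections.emptySet());
    }

    public static void main(String[] args) {
        ReindexSketch sketch = new ReindexSketch();
        sketch.recordLink("/news/solr-release", "/authors/ard");
        sketch.recordLink("/news/hippo-gettogether", "/authors/ard");
        System.out.println(sketch.documentsToReindex("/authors/ard"));
    }
}
```

Without such tracking, a renamed author would leave stale author data in the index entries of every news document that links to it.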
Objective: Integration with
          ContentBeans (HippoBeans)
Another example: Setters
@Node(jcrType="demosite:textdocument")
public class TextBean extends BaseDocument {
     private String title;
     private String summary;

     @IndexField
     public String getTitle() {
         return title == null ? getProperty("demosite:title"): title ;
     }
     public void setTitle(String title) {
         this.title = title;
     }
     @IndexField
     public String getSummary() {
         return summary == null ? getProperty("demosite:summary"): summary ;
     }
     public void setSummary(String summary) {
         this.summary = summary;
     }
}
              Bonus : What can we achieve with the Setters?
Objective: Integration with
       ContentBeans (HippoBeans)
That's all you need to do

And the HST binds some extra indexing fields like

 1. The path
 2. The canonicalUUID
 3. The name
 4. The localized name
 5. The depth
 6. The class hierarchy (including interfaces)
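Two of these extra fields are easy to sketch in plain Java. The definitions used here are assumptions: depth is taken to be the number of path segments, and the class hierarchy is taken to be the names of the class, its superclasses and all their interfaces. `ExtraFieldsSketch` is a hypothetical helper, not HST code.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper showing how two of the extra fields could be derived.
final class ExtraFieldsSketch {

    /** Number of non-empty path segments (assumed definition of depth). */
    static int depth(String path) {
        int count = 0;
        for (String segment : path.split("/")) {
            if (!segment.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    /** Names of the class, its superclasses and all interfaces, excluding Object. */
    static Set<String> classHierarchy(Class<?> type) {
        Set<String> names = new LinkedHashSet<>();
        collect(type, names);
        return names;
    }

    private static void collect(Class<?> type, Set<String> names) {
        if (type == null || type == Object.class) {
            return;
        }
        names.add(type.getName());
        for (Class<?> iface : type.getInterfaces()) {
            collect(iface, names); // interfaces can extend other interfaces
        }
        collect(type.getSuperclass(), names);
    }

    public static void main(String[] args) {
        System.out.println(depth("/content/documents/news/article"));
        System.out.println(classHierarchy(java.util.ArrayList.class));
    }
}
```

Indexing the class hierarchy would make it possible to restrict a search to one bean type and its subtypes with a plain field filter.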
Objective: Index external sources

You can

1. Push them directly to Solr
2. Push them to a HST JAX-RS resource that binds to a
   ContentBean and commits to Solr
3. Crawl from the HST and bind to ContentBeans and commit
   them to Solr
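For the first option, pushing directly to Solr, a client only has to produce an update message. The sketch below builds Solr's XML `<add>` message by hand so that it needs no libraries. `SolrXmlSketch` is a hypothetical helper, and the field names in `main` are examples that would have to exist in the schema.xml.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds a Solr XML update message without any client library.
final class SolrXmlSketch {

    /** Escapes the characters that are special in XML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Builds an add message containing one document with the given fields. */
    static String toAddMessage(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            xml.append("<field name=\"").append(escape(e.getKey())).append("\">")
               .append(escape(e.getValue()))
               .append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", "/products/1");
        fields.put("title", "Fish & chips");
        System.out.println(toAddMessage(fields));
    }
}
```

The message would be posted to the Solr update handler and followed by a commit. The other two options trade this simplicity for the ContentBean binding, so that external hits can be handled like any other bean.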
Objective: Index external sources

A ContentBean does *not* need a JCR Node!

ContentBean interface:

public interface ContentBean {
    @IndexField(name="id")
    String getPath();
    void setPath(String path);
}
Objective: Index external sources

An example : GoGreenProductBean in Testsuite
public class GoGreenProductBean implements ContentBean {

    private String path;
    private String title;
    private String summary;
    private String description;


    public String getPath() {return path;}
    public void setPath(final String path) {this.path = path;}
    @IndexField
    public String getTitle() {return title;}
    public void setTitle(String title) {this.title = title;}
    @IndexField
    public String getSummary() {return summary ;}
    public void setSummary(String summary) {this.summary = summary;}
    @IndexField
    public String getDescription() {return description;}
    public void setDescription(String description) {this.description = description;}
}
Objective: Index external sources

And add the GoGreenProductBean to Solr
{
     List<GoGreenProductBean> gogreenBeans = new ArrayList<GoGreenProductBean>();
     // FILL THE gogreenBeans LIST

     // NOW ADD TO INDEX
     HippoSolrManager solrManager =
             HstServices.getComponentManager().getComponent(
                     HippoSolrManager.class.getName(), SOLR_MODULE_NAME);
     try {
         solrManager.getSolrServer().addBeans(gogreenBeans);
         // commit so that the added beans become searchable
         UpdateResponse commit = solrManager.getSolrServer().commit();
     } catch (IOException e) {
         e.printStackTrace();
     } catch (SolrServerException e) {
         e.printStackTrace();
     }
}
Objective: Control the SIZE of the
           index yourself
JCR / Jackrabbit / Hippo-Repository has a generic


                   one-fits-all-index   (or one-fits-none-index)




Which grows very large easily, and can hardly be customized
Objective: Control the SIZE of the
           index yourself
However, search is

                      domain specific

Thus,

                 Just index what is needed
                      for the customer
Objective: Don't invent but integrate


                   Use Solr

               Use Solrj client

          Expose the Solrj SolrQuery
Objective: Don't invent but integrate

For example:
HippoSolrManager solrManager = ...
String query = ...
HippoQuery hippoQuery = solrManager.createQuery(query);
hippoQuery.setLimit(pageSize);
hippoQuery.setOffset((page - 1) * pageSize);

// hippoQuery.getSolrQuery() is the SolrQuery object
// include scoring

hippoQuery.getSolrQuery().setIncludeScore(true);
hippoQuery.getSolrQuery().setHighlight(true);
hippoQuery.getSolrQuery().setHighlightFragsize(200);
hippoQuery.getSolrQuery().addHighlightField("title");
hippoQuery.getSolrQuery().addHighlightField("summary");
hippoQuery.getSolrQuery().addHighlightField("htmlContent");

HippoQueryResult result = hippoQuery.execute(true);
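The paging arithmetic in this example is worth isolating, because pages are 1-based for the user while the offset is 0-based. The sketch below is a hypothetical helper, not part of the HST. It seems likely that `setLimit` and `setOffset` end up as Solr's `rows` and `start` parameters (SolrJ's `SolrQuery.setRows` and `SolrQuery.setStart`), but the slides do not confirm this.

```java
// Hypothetical helper for the paging arithmetic used with setLimit / setOffset.
final class PagingSketch {

    /** Zero-based offset of the first hit on a 1-based page. */
    static int offset(int page, int pageSize) {
        if (page < 1 || pageSize < 1) {
            throw new IllegalArgumentException("page and pageSize must be >= 1");
        }
        return (page - 1) * pageSize;
    }

    /** Number of pages needed to show totalHits hits, rounding up. */
    static int pageCount(long totalHits, int pageSize) {
        if (totalHits < 0 || pageSize < 1) {
            throw new IllegalArgumentException("totalHits must be >= 0 and pageSize >= 1");
        }
        return (int) ((totalHits + pageSize - 1) / pageSize);
    }

    public static void main(String[] args) {
        System.out.println(offset(3, 10));     // first hit shown on page 3
        System.out.println(pageCount(21, 10)); // pages needed for 21 hits
    }
}
```

Validating the page number up front avoids sending a negative offset to the search engine when a request carries `page=0`.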
Solr integration to rescue



     No further comments :-)
A very fast demo

                 setup
~75,000 long Wikipedia documents in the repository




    ............... doing the demo .................
That was : a very fast demo
Wrap up

I think that with the Solr integration

 1. Developers will be happier
 2. Customers will be happier
 3. Partners will be happier
 4. Hippo will be happier

And finally, last and least

 5. Infra will be happier because the servers stop sweating
Questions?

Check out the example at :
http://svn.onehippo.org/repos/hippo/hippo-cms7/testsuite/trunk

More Related Content

PDF
Solr 4
PDF
Rapid Prototyping with Solr
PDF
Solr Recipes
PDF
Lucene for Solr Developers
PDF
Lucene's Latest (for Libraries)
PDF
Most Wanted: Future PostgreSQL Features
PDF
code4lib 2011 preconference: What's New in Solr (since 1.4.1)
PPTX
Andrzej bialecki lr-2013-dublin
Solr 4
Rapid Prototyping with Solr
Solr Recipes
Lucene for Solr Developers
Lucene's Latest (for Libraries)
Most Wanted: Future PostgreSQL Features
code4lib 2011 preconference: What's New in Solr (since 1.4.1)
Andrzej bialecki lr-2013-dublin

What's hot (20)

PPTX
Apache Solr - search for everyone!
PPTX
Introduction to Lucene and Solr - 1
PPT
Solr vs ElasticSearch
PPT
Building Intelligent Search Applications with Apache Solr and PHP5
PPTX
Battle of the giants: Apache Solr vs ElasticSearch
PDF
Rapid Prototyping with Solr
PDF
New-Age Search through Apache Solr
KEY
Elasticsearch - Devoxx France 2012 - English version
PDF
Apache Solr crash course
PPTX
Oracle REST Data Services: POUG Edition
PPT
SE2016 - Java EE revisits design patterns 2016
PPTX
Introduction to Apache Solr
ODP
Introduction to Apache solr
PDF
Apache Solr Search Course Drupal 7 Acquia
PDF
Lucene for Solr Developers
PDF
Solr Flair
PPTX
Enterprise Search Using Apache Solr
PDF
Getting Started with Solr
PDF
Improved Search with Lucene 4.0 - Robert Muir
PDF
Apache Solr - An Experience Report
Apache Solr - search for everyone!
Introduction to Lucene and Solr - 1
Solr vs ElasticSearch
Building Intelligent Search Applications with Apache Solr and PHP5
Battle of the giants: Apache Solr vs ElasticSearch
Rapid Prototyping with Solr
New-Age Search through Apache Solr
Elasticsearch - Devoxx France 2012 - English version
Apache Solr crash course
Oracle REST Data Services: POUG Edition
SE2016 - Java EE revisits design patterns 2016
Introduction to Apache Solr
Introduction to Apache solr
Apache Solr Search Course Drupal 7 Acquia
Lucene for Solr Developers
Solr Flair
Enterprise Search Using Apache Solr
Getting Started with Solr
Improved Search with Lucene 4.0 - Robert Muir
Apache Solr - An Experience Report
Ad

Viewers also liked (20)

PDF
Cms integration of apache solr how we did it.
PDF
Hippo gettogether april 2012 faceted navigation a tale of daemons
KEY
Introducing Apricot, The Eclipse Content Management Platform
PDF
2008-12 OJUG JCR Demo
PDF
The Java Content Repository
PDF
Hippo get together workshop automatic export
PPTX
Building high performance
PPTX
Power point
PPT
ΚΕΣΥΠ Ηρακλείου Θ Αντωνίου. Μετά το γυμνάσιο Τι; (Ιούνιος 2014)
PPTX
Web Applications Development
PPT
Hippo Presentation Jboye Study tour
PDF
JCR In Action (ApacheCon US 2009)
PPTX
Hippo CMS at OpenCo Amsterdam 2014
PDF
What's new in JSR-283?
PPTX
2η πανελλήνια ημέρα σχ. αθλητισμού
PDF
Module%201%20 physics%20basic%20science
PPT
Δες τη ζωή υγιεινά. Η διατροφή στην εφηβική ηλικία.
ODP
Rapid JCR Applications Development with Sling
PPTX
我想請你吃飯 (繁体)
PPSX
Η χημεία του κρασιού
Cms integration of apache solr how we did it.
Hippo gettogether april 2012 faceted navigation a tale of daemons
Introducing Apricot, The Eclipse Content Management Platform
2008-12 OJUG JCR Demo
The Java Content Repository
Hippo get together workshop automatic export
Building high performance
Power point
ΚΕΣΥΠ Ηρακλείου Θ Αντωνίου. Μετά το γυμνάσιο Τι; (Ιούνιος 2014)
Web Applications Development
Hippo Presentation Jboye Study tour
JCR In Action (ApacheCon US 2009)
Hippo CMS at OpenCo Amsterdam 2014
What's new in JSR-283?
2η πανελλήνια ημέρα σχ. αθλητισμού
Module%201%20 physics%20basic%20science
Δες τη ζωή υγιεινά. Η διατροφή στην εφηβική ηλικία.
Rapid JCR Applications Development with Sling
我想請你吃飯 (繁体)
Η χημεία του κρασιού
Ad

Similar to Hippo get together presentation solr integration (20)

PDF
Apace Solr Web Development.pdf
PPT
The return of the hierarchical model
KEY
Solr 101
PDF
Introduction to Solr
KEY
Intro to Apache Solr for Drupal
PPTX
JahiaOne - Jahia7: Query and Search API under the Hood
PDF
NoSQL, Apache SOLR and Apache Hadoop
PDF
Flexible search in Apache Jackrabbit Oak
PPTX
Intro to Apache Lucene and Solr
PDF
An Overview of ModeShape
PDF
Suche mit Apache Lucene & Co.
KEY
Apache Solr - Enterprise search platform
PDF
Searching the Stuff of Life - BioSolr: Presented by Matt Pearce & Alan Woodwa...
PDF
Small wins in a small time with Apache Solr
PDF
Solr Troubleshooting - TreeMap approach
PDF
Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...
PDF
Solr + Hadoop = Big Data Search
PDF
The First Class Integration of Solr with Hadoop
PPTX
IRT Unit_4.pptx
Apace Solr Web Development.pdf
The return of the hierarchical model
Solr 101
Introduction to Solr
Intro to Apache Solr for Drupal
JahiaOne - Jahia7: Query and Search API under the Hood
NoSQL, Apache SOLR and Apache Hadoop
Flexible search in Apache Jackrabbit Oak
Intro to Apache Lucene and Solr
An Overview of ModeShape
Suche mit Apache Lucene & Co.
Apache Solr - Enterprise search platform
Searching the Stuff of Life - BioSolr: Presented by Matt Pearce & Alan Woodwa...
Small wins in a small time with Apache Solr
Solr Troubleshooting - TreeMap approach
Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...
Solr + Hadoop = Big Data Search
The First Class Integration of Solr with Hadoop
IRT Unit_4.pptx

Recently uploaded (20)

PDF

Hippo get together presentation solr integration

  • 1. Solr integration April 20, 2012 Ard Schrijvers • a.schrijvers@onehippo.com / ard@apache.org
  • 2. About me: Ard Schrijvers 1. Working at Hippo since 2001 2. Email: a.schrijvers@onehippo.com ard@apache.org 3. Worked primarily on: 1. HST 2. Hippo Repository / Jackrabbit 3. Lucene 4. Cocoon 5. Slide 4. Apache committer of Jackrabbit and Cocoon
  • 9. Outline 1. The current search (HST / repo) architecture 2. The current problems / shortcomings / mismatches 3. What we are trying to improve, the objectives 4. Solr integration to rescue 5. A very fast demo 6. Wrap up 7. Questions
  • 11. Current search architecture So: an HSTQuery is translated to an XPath query, which is delegated to the repository, which returns a JCR NodeIterator, which the HST binds back to HippoBeans
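To make the first step of that chain concrete, here is a minimal sketch of how a query object might be rendered as a JCR XPath statement. `SimpleQuery` and its methods are hypothetical stand-ins invented for this illustration, not the real HST query API, and the quote escaping is deliberately simplistic.

```java
import java.util.ArrayList;
import java.util.List;

final class XPathTranslationSketch {

    // Hypothetical, simplified stand-in for an HST query object.
    static final class SimpleQuery {
        private final String nodeType;
        private final List<String> constraints = new ArrayList<>();

        SimpleQuery(String nodeType) {
            this.nodeType = nodeType;
        }

        // Adds a free-text constraint on the whole node via jcr:contains.
        SimpleQuery contains(String text) {
            constraints.add("jcr:contains(., '" + escape(text) + "')");
            return this;
        }

        // Adds an exact-match constraint on a single property.
        SimpleQuery equalTo(String property, String value) {
            constraints.add("@" + property + " = '" + escape(value) + "'");
            return this;
        }

        // Renders the query as an XPath statement: a node type test
        // followed by an optional predicate joining all constraints.
        String toXPath() {
            StringBuilder xpath = new StringBuilder("//element(*, " + nodeType + ")");
            if (!constraints.isEmpty()) {
                xpath.append("[").append(String.join(" and ", constraints)).append("]");
            }
            return xpath.toString();
        }

        // Doubles single quotes so the value stays inside the string literal.
        private static String escape(String value) {
            return value.replace("'", "''");
        }
    }

    public static void main(String[] args) {
        String xpath = new SimpleQuery("demosite:textdocument")
                .contains("solr")
                .equalTo("demosite:title", "Search")
                .toXPath();
        System.out.println(xpath);
    }
}
```

In the real stack the resulting statement would then be executed by the repository and the returned nodes bound to beans; that part is not modelled here.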
  • 12. Current search architecture That sounds doable and not too complex, is it?
  • 14. Current search architecture Well, it is ....... very complex
  • 18. Current search architecture Reasons:
      1. Back in the days when Jackrabbit 1 started, Lucene was at version 1.4
      2. The first JSR-170 spec imposed some very harsh constraints: a save must result in directly updated search results
      3. Support for XPath / SQL was needed. However, Lucene likes flattened data, while JCR with XPath / SQL is all about hierarchical data
      4. JCR Nodes != Documents
  • 19. Outline 1. The current search (HST / repo) architecture 2. The current problems / shortcomings / mismatches 3. What we are trying to improve, the objectives 4. Solr integration to rescue 5. A short HOWTO as developer 6. A very fast demo 7. Wrap up 8. Questions
  • 28. Current problems / shortcomings / mismatches
      1. JCR Nodes are indexed instead of Documents (#nodes >> #documents)
      2. A search result only returns Nodes (Rows): what if you want something else, like auto-completion
      3. Very hard and very limited to customize
      4. A single index for an entire workspace
      5. Support for very complex XPath / SQL queries at a price of CPU, memory and complexity
      6. Only JCR Nodes and properties are indexed: no 'derived' field indexes
      7. To index external sources, the sources need to be stored in the repository
      8. Range queries (and others) easily blow up
      9. Getting the number of hits is complex
  • 29. Current problems / shortcomings / mismatches Extra problem: JCR Nodes != Documents. For example: a news document contains a link to an author document. The news document should be findable through the author's name
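One way to picture the news/author example is to flatten the linked author into the index fields of the news document at indexing time, so a search on the author name hits the news document. The sketch below does this with plain in-memory objects; the `News` and `Author` classes, the field names and the naive substring matcher are all assumptions made for illustration and are not part of Hippo or Solr.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class DenormalizeSketch {

    // Hypothetical in-memory stand-in for an author document.
    static final class Author {
        final String id;
        final String name;

        Author(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Hypothetical in-memory stand-in for a news document.
    static final class News {
        final String id;
        final String title;
        final String authorId; // the link to the author document

        News(String id, String title, String authorId) {
            this.id = id;
            this.title = title;
            this.authorId = authorId;
        }
    }

    // Flattens a news document plus its linked author into one set of index fields.
    static Map<String, String> toIndexFields(News news, Map<String, Author> authorsById) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", news.id);
        fields.put("title", news.title);
        Author author = authorsById.get(news.authorId);
        if (author != null) {
            fields.put("author", author.name);
        }
        return fields;
    }

    // Naive search over the flattened fields: case-insensitive substring match.
    static boolean matches(Map<String, String> fields, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        for (String value : fields.values()) {
            if (value.toLowerCase(Locale.ROOT).contains(needle)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Author> authors = new HashMap<>();
        authors.put("author-1", new Author("author-1", "Ard Schrijvers"));
        News news = new News("news-1", "Solr integration announced", "author-1");

        Map<String, String> fields = toIndexFields(news, authors);
        System.out.println(fields);
        System.out.println(matches(fields, "schrijvers"));
    }
}
```

The trade-off of flattening is that the news document has to be indexed again whenever the author document changes.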
  • 30. Outline 1. The current search (HST / repo) architecture 2. The current problems / shortcomings / mismatches 3. What we are trying to improve, the objectives 4. Solr integration to rescue 5. A very fast demo 6. Wrap up 7. Questions
  • 31. Objectives
      1. Fix all the 9+ problems / shortcomings / mismatches from previous slides
      2. Easy to use and customize
      3. Satisfied customers
      4. Satisfied partners
      5. Scalable searches: CPU, memory and large document numbers
      6. Document oriented
      7. Integration with HST ContentBeans (HippoBeans)
      8. Index external sources
      9. Control the SIZE of the index yourself
      10. Don't invent but integrate (with out-of-the-box features supported by a large community)
  • 33. Objective: Fix all the 9 problems / shortcomings/ mismatches from previous slides Easy: Solr integration to rescue
  • 35. Objective: Easy to use and customize YOU will be in the driver seat
  • 38. Objective: Easy to use and customize No more complete dependence on what the sometimes not so smAR&D Hippo team thought was good for YOU
  • 44. Objective: Easy to use and customize You decide 'from where', 'what', 'how' and 'when' to index
      1. from where: which sources (jcr, webpages, database, noSQL store, nuxeo, alfresco, anything)
      2. what: which parts of a document (not jcr node) or external source
      3. how:
          1. which analyzer
          2. index on document level, property level or both
          3. store the text
      4. when: when do you want to index
  • 50. Objective: Easy to use and customize But of course, out-of-the-box support and tooling ready to be used by YOU
      1. Default hippo repository indexer & observer
      2. ContentBean (HippoBean) annotations for indexing
      3. Binding search results to ContentBeans
      4. Deployment support
      5. Clustering support
  • 54. Objective: Satisfied customers Most likely they just will be satisfied
  • 55. Objective: Satisfied customers If they are not satisfied enough you can: 1. Easily customize it (aka tune it until 'je een ons weegt', a Dutch idiom meaning endlessly) 2. Hire anyone with Solr experience: all our partners have Solr experience
  • 56. Objective: Satisfied customers Still not satisfied? Let them pay too much for a Google Search appliance, Autonomy or any of the other 'useless to pay for software'
  • 58. Objective: Satisfied partners Although on thin ice here, I strongly believe in this because:
  • 64. Objective: Satisfied partners
      1. Our partners frequently have good knowledge about Solr
      2. Our partners depend less on the current search limitations
      3. Our partners can pitch with their Solr knowledge
      4. Our partners can sell more Hippo implementations
      5. Our partners will earn more on Hippo and have happier developers
      6. Hippo will earn more through HES, which will satisfy partners again, because Hippo can spend more on AR&D ==> more features
  • 68. Objective: Scalable searches 1. Using Solr to do the searches 2. Not the complex JCR hierarchical searches 3. Document oriented instead of JCR Nodes ( #docs << #nodes)
  • 70. Objective: Document oriented What do we want to search for?
  • 71. Objective: Document oriented Exactly, Documents!!
  • 72. Objective: Document oriented A Document == A HippoBean != JCR Node
  • 74. Objective: Document oriented So let's index HippoBeans (ContentBeans)
  • 76. Objective: Integration with ContentBeans (HippoBeans) As a developer .... how am I going to index my beans?
  • 77. Objective: Integration with ContentBeans (HippoBeans) I know how to write HippoBeans, that's all I ever did in my life
  • 78. Objective: Integration with ContentBeans (HippoBeans) How do you expect me to index my beans?
  • 79. Objective: Integration with ContentBeans (HippoBeans) Annotate your getters with @IndexField or @IndexField(name="foo"), and account for them in the Solr schema.xml:
      <field name="title" type="text_general" indexed="true" stored="true"/>
      <field name="summary" type="text_general" indexed="true" stored="true"/>
  • 80. Objective: Integration with ContentBeans (HippoBeans) An example:
      @Node(jcrType="demosite:textdocument")
      public class TextBean extends BaseDocument {
          @IndexField
          public String getTitle() {
              return getProperty("demosite:title");
          }
          @IndexField(name="samenvatting")
          public String getSummary() {
              return getProperty("demosite:summary");
          }
      }
  • 81. Objective: Integration with ContentBeans (HippoBeans) Another example:
      @Node(jcrType="demosite:textdocument")
      public class TextBean extends BaseDocument {
          @IndexField
          public String getTitle() {
              return getProperty("demosite:title");
          }
          @IndexField
          public String getSummary() {
              return getProperty("demosite:summary");
          }
          @IndexField
          public String getAuthor() {
              return getLinkedBean("demosite:author", Author.class).getAuthor();
          }
      }
  • 82. Objective: Integration with ContentBeans (HippoBeans) Another example:
      @Node(jcrType="demosite:textdocument")
      public class TextBean extends BaseDocument {
          @IndexField
          public String getTitle() {
              return getProperty("demosite:title");
          }
          @IndexField
          public String getSummary() {
              return getProperty("demosite:summary");
          }
          @ReIndexOnChange
          @IndexField
          public Author getAuthor() {
              return getLinkedBean("demosite:author", Author.class);
          }
      }
  • 83. Objective: Integration with ContentBeans (HippoBeans) Another example: Setters
      @Node(jcrType="demosite:textdocument")
      public class TextBean extends BaseDocument {
          private String title;
          private String summary;
          @IndexField
          public String getTitle() {
              return title == null ? getProperty("demosite:title") : title;
          }
          public void setTitle(String title) {
              this.title = title;
          }
          @IndexField
          public String getSummary() {
              return summary == null ? getProperty("demosite:summary") : summary;
          }
          public void setSummary(String summary) {
              this.summary = summary;
          }
      }
    Bonus: What can we achieve with the Setters?
  • 84. Objective: Integration with ContentBeans (HippoBeans) That's all you need to do. And the HST binds some extra indexing fields, like
      1. The path
      2. The canonicalUUID
      3. The name
      4. The localized name
      5. The depth
      6. The class hierarchy (including interfaces)
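For readers wondering how annotated getters can turn into index fields, the following is a minimal, self-contained sketch using plain Java reflection. The `@IndexField` annotation declared here is a hypothetical stand-in with the same name as the one on the slides, and the rule that an unnamed field defaults to the getter's property name is an assumption, chosen because the schema.xml example declares `title` and `summary` fields. It is not the actual HST implementation.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

final class IndexFieldSketch {

    // Hypothetical stand-in for the @IndexField annotation shown on the slides.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface IndexField {
        String name() default "";
    }

    // Example bean with hard-coded values instead of repository properties.
    static class TextBean {
        @IndexField
        public String getTitle() {
            return "Solr integration";
        }

        @IndexField(name = "samenvatting")
        public String getSummary() {
            return "A short summary";
        }

        // Not annotated, so it is not indexed.
        public String getInternalNote() {
            return "not indexed";
        }
    }

    // Collects field name -> value for every annotated no-argument public getter.
    static Map<String, Object> extractFields(Object bean) {
        Map<String, Object> fields = new TreeMap<>();
        for (Method method : bean.getClass().getMethods()) {
            IndexField annotation = method.getAnnotation(IndexField.class);
            if (annotation == null || method.getParameterCount() != 0) {
                continue;
            }
            String name = annotation.name().isEmpty()
                    ? defaultName(method.getName())
                    : annotation.name();
            try {
                fields.put(name, method.invoke(bean));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot read " + method.getName(), e);
            }
        }
        return fields;
    }

    // Assumed default: strip "get" and lower-case the first letter (getTitle -> title).
    static String defaultName(String getter) {
        String base = getter.startsWith("get") && getter.length() > 3
                ? getter.substring(3)
                : getter;
        return Character.toLowerCase(base.charAt(0)) + base.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(extractFields(new TextBean()));
    }
}
```

The resulting map is the kind of flat name/value structure that could then be handed to a search index, with each name matching a field declared in the schema.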
  • 88. Objective: Index external sources You can 1. Push them directly to Solr 2. Push them to a HST JAX-RS resource that binds to a ContentBean and commits to Solr 3. Crawl from the HST and bind to ContentBeans and commit them to Solr
  • 89. Objective: Index external sources A ContentBean does *not* need a JCR Node! ContentBean interface:
      public interface ContentBean {
          @IndexField(name="id")
          String getPath();
          void setPath(String path);
      }
  • 90. Objective: Index external sources An example: GoGreenProductBean in Testsuite
      public class GoGreenProductBean implements ContentBean {
          private String path;
          private String title;
          private String summary;
          private String description;
          public String getPath() { return path; }
          public void setPath(final String path) { this.path = path; }
          @IndexField
          public String getTitle() { return title; }
          public void setTitle(String title) { this.title = title; }
          @IndexField
          public String getSummary() { return summary; }
          public void setSummary(String summary) { this.summary = summary; }
          @IndexField
          public String getDescription() { return description; }
          public void setDescription(String description) { this.description = description; }
      }
  • 91. Objective: Index external sources And add the GoGreenProductBean to Solr
      {
          List<GoGreenProductBean> gogreenBeans = new ArrayList<GoGreenProductBean>();
          // FILL THE gogreenBeans LIST
          // NOW ADD TO INDEX
          HippoSolrManager solrManager = HstServices.getComponentManager().getComponent(
                  HippoSolrManager.class.getName(), SOLR_MODULE_NAME);
          try {
              solrManager.getSolrServer().addBeans(gogreenBeans);
              UpdateResponse commit = solrManager.getSolrServer().commit();
          } catch (IOException e) {
              e.printStackTrace();
          } catch (SolrServerException e) {
              e.printStackTrace();
          }
      }
  • 93. Objective: Control the SIZE of the index yourself JCR / Jackrabbit / Hippo-Repository has a generic one-fits-all-index (or one-fits-none-index) Which grows very large easily, and can hardly be customized
  • 94. Objective: Control the SIZE of the index yourself However, search is domain specific Thus, Just index what is needed for the customer
  • 96. Objective: Don't invent but integrate Use Solr Use Solrj client Expose the Solrj SolrQuery
  • 97. Objective: Don't invent but integrate For example:
      HippoSolrManager solrManager = ...
      String query = ...
      HippoQuery hippoQuery = solrManager.createQuery(query);
      hippoQuery.setLimit(pageSize);
      hippoQuery.setOffset((page - 1) * pageSize);
      // hippoQuery.getSolrQuery() is the SolrQuery object
      // include scoring
      hippoQuery.getSolrQuery().setIncludeScore(true);
      hippoQuery.getSolrQuery().setHighlight(true);
      hippoQuery.getSolrQuery().setHighlightFragsize(200);
      hippoQuery.getSolrQuery().addHighlightField("title");
      hippoQuery.getSolrQuery().addHighlightField("summary");
      hippoQuery.getSolrQuery().addHighlightField("htmlContent");
      HippoQueryResult result = hippoQuery.execute(true);
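The limit/offset arithmetic in that example assumes 1-based page numbers. A small sketch of the calculation, with hypothetical helper names that are not part of the Hippo or Solrj APIs:

```java
final class PagingSketch {

    // Offset of the first hit on a 1-based page, as in (page - 1) * pageSize.
    static int offset(int page, int pageSize) {
        if (page < 1 || pageSize < 1) {
            throw new IllegalArgumentException("page and pageSize must be >= 1");
        }
        return (page - 1) * pageSize;
    }

    // Number of pages needed to show all hits, rounding up.
    static long pageCount(long totalHits, int pageSize) {
        if (totalHits < 0 || pageSize < 1) {
            throw new IllegalArgumentException("totalHits must be >= 0 and pageSize >= 1");
        }
        return (totalHits + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        System.out.println(offset(3, 10));        // prints 20
        System.out.println(pageCount(75001, 10)); // prints 7501
    }
}
```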
  • 99. Outline 1. The current search (HST / repo) architecture 2. The current problems / shortcomings / mismatches 3. What we are trying to improve, the objectives 4. Solr integration to rescue 5. A very fast demo 6. Wrap up 7. Questions
  • 100. Solr integration to rescue No further comments :-)
  • 101. Outline 1. The current search (HST / repo) architecture 2. The current problems / shortcomings / mismatches 3. What we are trying to improve, the objectives 4. Solr integration to rescue 5. A very fast demo 6. Wrap up 7. Questions
  • 102. A very fast demo Setup: ~75,000 long Wikipedia docs in the repository ... doing the demo ...
  • 103. That was : a very fast demo
  • 104. Outline 1. The current search (HST / repo) architecture 2. The current problems / shortcomings / mismatches 3. What we are trying to improve, the objectives 4. Solr integration to rescue 5. A very fast demo 6. Wrap up 7. Questions
  • 110. Wrap up I think that with the Solr integration 1. Developers will be happier 2. Customers will be happier 3. Partners will be happier 4. Hippo will be happier And finally, last and least
  • 111. Wrap up I think that with the Solr integration 1. Developers will be happier 2. Customers will be happier 3. Partners will be happier 4. Hippo will be happier 5. Infra will be happier because the servers stop sweating
  • 112. Outline 1. The current search (HST / repo) architecture 2. The current problems / shortcomings / mismatches 3. What we are trying to improve, the objectives 4. Solr integration to rescue 5. A very fast demo 6. Wrap up 7. Questions
  • 113. Questions? Check out the example at: http://svn.onehippo.org/repos/hippo/hippo-cms7/testsuite/trunk