Lucene @ Ghent @ Lund
Vlengel - November 2009, Maastricht
http://lib.ugent.be
http://elin.ugent.be
The Numbers: Ghent
  5.000.000 bibliographic records
  Full-text: ca. 20%
  490.000 Google Books Hathi
  136.000 18th Cent. Coll. Online
  100.000 Early English Books
  32.000 Google Books Gent
  82.000 Gutenberg, DBNL, SFX, …
  16 collections
  120.000 visits/month
  34% via search engines
The Numbers: ELIN
  54.000.000 bibliographic records
  Full-text: 100%
  29 customers worldwide
  6 timezones
  17.000.000 electronic journals
  25.000.000 Ebsco
  4.000.000 JSTOR
  3.400.000 Proquest ABI
  1.670.000 IEE/IEEE standards/proceedings
  1.300.000 e-print archives
The Parts Searching/Portal Verity Sesat Endeca Indexing Fast Autonomy Zebra Lucene/Solr Sphinx Drupal Liferay JBoss Zope Primo Aquabrowser VuFind
Indexing ALEPHSEQ OAI-PMH MySQL DUMP XSL indexML rug01_xml rug02_xml hath01_xml dbnl_xml Cmdline tools Tomcat Servlet Tuned Solr Perl MVC Java/Spring
Searching/Portal Search Engine Plugins Lucene SOLR ISI/WOS YouTube OpenSearch SRU Models KeyVal XML MARC Configuration Files I18N Props Default Open Search UnAPI HTML Velocity JS Meercat Controllers Views
The ‘Haves’
  Facets; Filters; RSS; OpenSearch; OpenURL; OAI-PMH; Mobile; TicToc; Cover Art; Statistics; Cool URIs; unAPI; Zotero; Google Maps
  Stemming; Flexible Sort; Image Browsing; Zoomers; Pagers; Basket; Plugin/Integration; libX
  Real-Time Availability Check; Requesting; Global Holdings; Full-Text Links
  Lists: Journals, Databases, Collections
  Diacritic translation
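Facets and filters head the list above. As a rough illustration only, the sketch below builds a faceted, filtered query URL using Solr's standard request parameters (`q`, `fq`, `facet`, `facet.field`, `wt`). The base URL and the field names `type`, `language` and `year` are hypothetical; the deck does not show the actual Ghent or Lund schema or endpoints.

```python
from urllib.parse import urlencode


def facet_query_url(base_url, query, facet_fields, filters=()):
    """Build a Solr select URL that asks for facet counts and applies filters.

    base_url and all field names are assumptions made for this example.
    """
    params = [("q", query), ("wt", "json"), ("facet", "true")]
    # One facet.field parameter per field we want counts for.
    params += [("facet.field", field) for field in facet_fields]
    # Each fq narrows the result set without affecting relevance scoring.
    params += [("fq", flt) for flt in filters]
    return f"{base_url}?{urlencode(params)}"


url = facet_query_url(
    "http://localhost:8983/solr/select",  # hypothetical endpoint
    "darwin",
    facet_fields=["type", "language"],    # hypothetical field names
    filters=["year:[1800 TO 1899]"],      # hypothetical range filter
)
print(url)
```

Keeping filters in `fq` rather than in `q` is the usual Solr idiom, since filter results are cached separately from the main query.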
The ‘Have Nots’
  Nice Administrative Interface
  ILS integration (requests, renewals, …)
  Personalization (saved searches, alerts, …)
  Tagging, Rating, User Contributed Content
  Deduplication
  Excerpts, Table of Contents
  Word clouds
  Expand Searches (see also)
  Highlighting
  Federated Search
  Browsing
  Advanced Search
  Extended FRBR
The Characteristics: Lightweight, Tunable
  In Lund: 54.000.000 records indexed on 1 Linux 4-core machine with 16 GB RAM, at roughly 2000 records/second
  In Ghent: indexing runs on the Aleph server during business hours
  Continuous 100 simultaneous users on 1 Linux 2-core machine with 4 GB RAM
  Simple, easy web interface. Less is more
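The Lund figures imply a duration for a full reindex. A small worked calculation, assuming the quoted rate of about 2000 records/second is sustained for the whole run (the deck does not say whether it is a peak or an average):

```python
def full_index_hours(records: int, records_per_second: float) -> float:
    """Hours needed to index `records` at a steady `records_per_second`."""
    seconds = records / records_per_second
    return seconds / 3600


# Figures taken from the slide: 54.000.000 records at +/- 2000 records/second.
lund_hours = full_index_hours(54_000_000, 2000)
print(f"{lund_hours:.1f} hours")  # 7.5 hours
```

Under that assumption, a complete rebuild of the Lund index fits in a single working day on the one machine described.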
The Characteristics: Flexible
  Used in 6 different projects in Gent, 2 in Lund
  KeyVal, XML, MARC models can be used internally
  Indexes anything that can be turned into our XML index format
  Total control on every aspect of interface. We do text, images, video, mobile, RSS, …
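The deck does not show its own XML index format, so the following is only a stand-in: a minimal sketch that turns a key/value record into Solr's standard XML update message (`<add><doc><field name="…">`). The record identifier and the field names are invented for illustration.

```python
import xml.etree.ElementTree as ET


def to_solr_add_xml(record: dict) -> str:
    """Serialize a key/value record as a Solr <add><doc> update message.

    List values become repeated <field> elements (multi-valued fields).
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in record.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(item)
    return ET.tostring(add, encoding="unicode")


# Hypothetical record; identifier and field names are not from the deck.
record = {
    "id": "rug01:000000001",
    "title": "Example title",
    "subject": ["Libraries", "Search engines"],
}
xml_doc = to_solr_add_xml(record)
print(xml_doc)
```

Any source that can be mapped to such a flat list of named fields (MARC, OAI-PMH harvests, database dumps) can be indexed the same way, which is the point the slide makes.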
The Characteristics: Very Large Developer Community
  Open Source, used in thousands of projects worldwide, in all major (computer) languages
  Extensive documentation, many articles, presentations, research
  Books, User Group, Conferences, Social Networks, …
But…
Acknowledgements
  Kjell Lotigiers (UGent) – Java/Spring development
  Salam Baker Shanawa (Lund) – Perl/ELIN development, System tuning
  Nicolas Steenlant (UGent) – Ajax/CSS development
  Geert Roels (UGent) – Web Design
  Paul Bastijns (UGent) – SFX integration
Refs
  Calhoun, K., & Cellentani, D. (2009). Online catalogs: What users and librarians want: an OCLC report. Dublin, Ohio: OCLC.
  http://lib.ugent.be
  http://www.lub.lu.se

20091120 Vlengel Maastricht
