Improving the Search Experience
in a Social Network with Cross
Media Contents
Daniele Cenni, Paolo Nesi
University of Florence
Department of Systems and Informatics
Distributed Systems and Internet Technology Laboratory
Paolo.nesi@unifi.it
cenni@dsi.unifi.it, http://www.disit.dinfo.unifi.it
DMS2013, August 2013, UK, Paolo Nesi 1
ECLAP Social Network
• ECLAP is a Digital Library on Performing Arts connected with Europeana
• ECLAP is a Best Practice and Social Network (blogs, forums, comments, tagging, voting, …)
Goals/Requirements
• Develop an Indexing/Searching solution for the ECLAP Social Network allowing:
  • Indexing multilingual crossmedia content metadata and data (e.g. documents)
  • Indexing portal blogs, forums, events, group pages, comments, etc.
  • Efficient multilingual search (keyword search and advanced search) supporting:
    • misspelled words (e.g. shespeare)
    • partial word search
  • Sorting and filtering search results
  • Re-indexing the whole data without blocking the system
  • Logging and monitoring user activity
  • …
• Evaluate the Indexing/Searching service
ECLAP ANY content kind
• Informative Content
  • Video, audio, images, documents
  • 3D, animations, Braille
  • Slide, Video-Slide, courses
  • eBook, ePub, Mpeg21, intelligent
• Aggregated Content:
  • Playlist, Collections
  • Annotations, Synchronization
• Support and networking content:
  • Blog, WebPage, Events, comments, forum, votes, messages, …
[Figure labels: comments, rating, relationships, technical, Dynamic, recommend, …]
• Performance
• Master classes
• Scene Sketches
• Scenography
• Scenes
• Private lives of
artists
• Scores
• Braille
• BackStage Stills
• Choreography
• Morals
• Poster
• Booklets
• Magazines Music
• Audio ballets
ECLAP Semantic Model 1
[Diagram. Entity labels: Media Object, Video, Audio, Document, Image, Group/Channel, Collection, Playlist, AVObject, Annotation, Main Annotation, Side Annotation, Forum, WebPage, Blog, Comment, Content, TaxonomyTerm, GeoName, Crossmedia, Archive, Event, epub, 3D, IPR, Braille Music Score. Metadata sets: Performing Arts, Dublin Core, Technical. Association multiplicities shown: 0..n, 1..n, 1..2, 1.]
ECLAP Semantic Model 2
[Diagram. Entity labels: User, Group/Channel, Content, Media Object, Comment, Annotation, TaxonomyTerm. Relation labels: foaf:member, admin, isProvidedBy, isFavouriteOf, dc:creator, foaf:topic_interest, isFeaturedBy, foaf:knows.]
Indexing
• Indexing & Search system
  • Based on Apache Solr
• Multilingual aspects
  • Translate the metadata or translate the query? Both
    • Metadata translation
    • Query translation
• Indexing schema
  • Dublin Core + DCTerms (multi language)
  • Performing Arts
  • Technical (provider, content type, GPS, IPR, duration, quality, …)
  • Groups associations (multi language)
  • Taxonomy associations (multi language)
  • Comments & multi language tags
  • FullText of the textual digital resources
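A minimal sketch of how a multilingual record could be flattened into one index document, with one field per language plus a catch-all field. The deck does not show the actual ECLAP schema, so the function `build_index_document` and every field name here (`title_en`, `technical_content_type`, `text_all`, …) are assumptions made for illustration only.

```python
from typing import Dict, List, Optional


def build_index_document(
    doc_id: str,
    title: Dict[str, str],
    description: Dict[str, str],
    technical: Optional[Dict[str, str]] = None,
    full_text: str = "",
) -> Dict[str, object]:
    """Flatten multilingual metadata into a flat, Solr-style document.

    Field names are hypothetical; they are not taken from the ECLAP schema.
    """
    doc: Dict[str, object] = {"id": doc_id}
    catch_all: List[str] = []

    # One field per language and metadata element, e.g. title_it.
    for name, values in (("title", title), ("description", description)):
        for lang, text in sorted(values.items()):
            doc[f"{name}_{lang}"] = text
            catch_all.append(text)

    # Language-independent technical metadata (provider, content type, ...).
    for key, value in sorted((technical or {}).items()):
        doc[f"technical_{key}"] = value

    # Full text extracted from textual resources, when available.
    if full_text:
        doc["body"] = full_text
        catch_all.append(full_text)

    # Catch-all field that a plain keyword search can target.
    doc["text_all"] = " ".join(catch_all)
    return doc


doc = build_index_document(
    "obj-1",
    title={"en": "Hamlet", "it": "Amleto"},
    description={"en": "Stage recording"},
    technical={"content_type": "video"},
)
```

Keeping one field per language lets each field use a language-specific analyzer, while the catch-all field supports searching across all languages at once.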
Indexing
Metadata Schema Indexing
Search Facilities
• Full text search
  • Uses the catch-all fields to search for keywords in the most important fields in all languages (title, description, text, body, subject, …)
• Fuzzy search
  • Allows matching mistyped words
• Deep search
  • Allows searching for partial words
• Faceted Search
• Maximizing Precision and Recall:
  • Relevance & boosting terms
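The facilities above map onto standard Solr query syntax: a trailing `~` requests a fuzzy match, a trailing `*` a prefix match, and the `facet` and `facet.field` parameters request facet counts. The sketch below only builds query strings; the deck does not say how ECLAP implements deep search, so the prefix wildcard is an assumption, and the helper names and the facet field names are hypothetical.

```python
from typing import List
from urllib.parse import parse_qs, urlencode


def fuzzy_term(word: str) -> str:
    """Solr fuzzy syntax: matches terms within a small edit distance."""
    return f"{word}~"


def partial_term(fragment: str) -> str:
    """Solr wildcard syntax: matches terms starting with the fragment."""
    return f"{fragment}*"


def faceted_query_string(q: str, facet_fields: List[str]) -> str:
    """Build the query string for a faceted Solr select request."""
    params = [("q", q), ("facet", "true")]
    # facet.field may be repeated, once per field to count.
    params += [("facet.field", field) for field in facet_fields]
    return urlencode(params)


# A misspelled word, as in the "shespeare" example from the requirements.
query_string = faceted_query_string(
    fuzzy_term("shespeare"), ["content_type", "language"]
)
decoded = parse_qs(query_string)
```

The resulting string would be appended to a Solr select URL; `parse_qs` is used here only to show that the parameters round-trip intact.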
Search Facilities vs Information
Searching
• Faceted search
Weighted Query Model
• Where, for the query “q”:
  • Weights are the boosts applied to the fields
  • Title is DC.Title, description is DC.Description, …
  • Body is the textual body, subject, …
  • Taxonomy is the full description of the taxonomy branch
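One common way to express per-field weights in Solr is the `qf` parameter of the Extended DisMax query parser, where `field^boost` multiplies the contribution of a field to the score. The slide's own formula is not preserved in this transcript, so the sketch below is an assumption about how such a model could be sent to Solr; the function name, field names and weight values are illustrative, not the tuned ECLAP values.

```python
from typing import Dict


def weighted_query_params(q: str, weights: Dict[str, float]) -> Dict[str, str]:
    """Build edismax parameters that boost each field by its weight."""
    # "field^boost" is the Solr syntax for a per-field boost in qf.
    qf = " ".join(f"{field}^{boost:g}" for field, boost in weights.items())
    return {"defType": "edismax", "q": q, "qf": qf}


# Illustrative weights only; the deck tunes these values by optimization.
params = weighted_query_params(
    "hamlet",
    {"title": 3.0, "description": 1.5, "body": 1.0, "taxonomy": 0.5},
)
```

Because the weights are plain numbers in a request parameter, they can be changed per query, which is what makes offline tuning against reference queries practical.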
Model Optimization
• Optimization of Precision & Recall to improve search quality
  • 50 reference queries
• Optimization Methods
  • Simulated Annealing
  • Genetic Algorithms
  • 7 parameters
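A minimal sketch of simulated annealing over a vector of 7 weights. It is not the authors' implementation: the weight bounds, cooling schedule, step size and the `toy_objective` function are all assumptions. In the setting described above, the objective would instead score a weight vector by running the reference queries and measuring retrieval quality.

```python
import math
import random
from typing import Callable, List, Tuple


def simulated_annealing(
    objective: Callable[[List[float]], float],
    n_params: int = 7,
    steps: int = 2000,
    t_start: float = 1.0,
    t_end: float = 0.01,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Maximize `objective` over weights constrained to [0, 10]."""
    rng = random.Random(seed)
    current = [rng.uniform(0.0, 10.0) for _ in range(n_params)]
    current_score = objective(current)
    best, best_score = list(current), current_score

    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / (steps - 1))

        # Perturb one randomly chosen weight, clamped to the bounds.
        candidate = list(current)
        i = rng.randrange(n_params)
        candidate[i] = min(10.0, max(0.0, candidate[i] + rng.gauss(0.0, 1.0)))
        score = objective(candidate)

        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if score >= current_score or rng.random() < math.exp(
            (score - current_score) / t
        ):
            current, current_score = candidate, score
            if score > best_score:
                best, best_score = list(candidate), score

    return best, best_score


def toy_objective(weights: List[float]) -> float:
    """Stand-in for a retrieval-quality score; peaks at a made-up target."""
    target = [5.0, 3.0, 2.0, 1.0, 1.0, 0.5, 0.5]
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


best_weights, best_score = simulated_annealing(toy_objective)
```

Accepting occasional worse moves early on is what lets the search escape local optima, which matters when the quality surface over several interacting weights is not smooth.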
Monte Carlo Analysis
MAP: Mean Average Precision
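For reference, Mean Average Precision can be computed as sketched below, using the standard definition: for each query, average the precision at every rank where a relevant item appears, then average those values over all queries. The function names and the sample data are illustrative and do not come from the ECLAP evaluation.

```python
from typing import Iterable, List, Sequence, Set, Tuple


def average_precision(ranked_ids: Sequence[str], relevant_ids: Iterable[str]) -> float:
    """Average of precision@k over the ranks k that hold a relevant item."""
    relevant: Set[str] = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    # Relevant items that were never retrieved contribute zero.
    return total / len(relevant)


def mean_average_precision(
    runs: List[Tuple[Sequence[str], Iterable[str]]]
) -> float:
    """Mean of the per-query average precision values."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


runs = [
    (["a", "b", "c", "d"], {"a", "c"}),  # AP = (1/1 + 2/3) / 2
    (["x", "y"], {"y"}),                 # AP = 1/2
]
map_score = mean_average_precision(runs)
```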
Some weights’ Trends
Comparative Results
MAP: Mean Average Precision
Usage Results
• More than 500,000 visits
• 7.29 minutes spent on the portal
Assessment of Search Facility
• Distribution of performed clicks
[Figure label: First page]
Conclusions
• Indexing solution for
  • cross media, with multilingual metadata and texts
  • Improved searching and filtering of results, and thus the quality of the user experience
    • Providing: (full text, operators), advanced, faceted, etc.
• Precision and Recall analysis made it possible to tune the search services
  • Simulated Annealing and Genetic Algorithms produced similar results
• User behavior assessment has shown that appreciation of the search facility has improved with respect to the earlier settings, which were grounded on common sense and classical metadata relevance

