Bridging the Semantic Gap in
Multimedia Information Retrieval
Top-down and Bottom-up Approaches
Jonathon S. Hare, Patrick A.S. Sinclair,
Paul H. Lewis and Kirk Martinez
Intelligence, Agents, Multimedia Group
School of Electronics and Computer Science
University of Southampton
{jsh2 | pass | phl | km}@ecs.soton.ac.uk
&
Peter G.B. Enser and Christine J. Sandom
School of Computing, Mathematical and Information Sciences
University of Brighton
{p.g.b.enser | c.sandom}@bton.ac.uk
Introduction
What is the semantic gap in image retrieval?
The gap between information extractable automatically
from the visual data and the interpretation a user may have
for the same data
…typically between low level features and the image
semantics
Two approaches to solving this:
Top-down:- Metadata driven
Bottom-up:- Image features
Motivation
Hallmark of a good retrieval system is its ability to
respond to queries posed by a user
We have been collecting numerous real queries
for images from different collections in order to
investigate how we need to bridge the semantic
gap in order to answer the queries
Users’ queries should be the driver
User queries may specify unique features
A member of parliament with a beard
May involve temporal or spatial facets
A 1950s fridge in the background
Particular significance
Bannister breaking the 4 min mile
The absence of features
George V’s Coronation but no procession or royals
Top-down Approaches
Ontologies can improve accuracy of retrieval
Simple keyword-based retrieval
Concept-based retrieval
Reasoning
Integration of different sources
Cultural heritage
Browsing and visualisation
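The difference between keyword-based and concept-based retrieval can be sketched as follows. This is a minimal illustration, not the system shown in the slides: the tiny ontology, the annotations and all function names are hypothetical, and subclass expansion is only one of several ways an ontology might be used.

```python
# Hypothetical toy ontology: concept -> direct subclasses.
SUBCLASSES = {
    "animal": ["dog", "horse"],
    "dog": ["poodle", "dalmatian"],
}

# Hypothetical manual annotations: image name -> set of keywords.
ANNOTATIONS = {
    "img1": {"poodle"},
    "img2": {"dog"},
    "img3": {"horse"},
    "img4": {"teapot"},
}


def expand(concept):
    """Return the concept together with all of its transitive subclasses."""
    found = {concept}
    for child in SUBCLASSES.get(concept, []):
        found |= expand(child)
    return found


def keyword_search(term):
    """Plain keyword matching: only images tagged with the exact term."""
    return sorted(name for name, kws in ANNOTATIONS.items() if term in kws)


def concept_search(concept):
    """Concept matching: images tagged with the concept or any subclass."""
    wanted = expand(concept)
    return sorted(name for name, kws in ANNOTATIONS.items() if kws & wanted)
```

With this data, a keyword search for "dog" misses the image annotated only as "poodle", whereas the concept search finds it because the ontology records that a poodle is a kind of dog.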
Top-down Approaches
Browsing and visualisation
Poodle Demo
Issues
Manual annotation is expensive
Use of context to annotate
Keyword-based:
Problems if a predefined vocabulary is not used
Keyword-Concept matching is difficult!
Especially with inconsistent keywords
What if there is no metadata?
Bottom-up Approaches
Content-based Retrieval
In the past, content-based retrieval has been used
as a means to provide bottom-up searching of
(unannotated) image collections
For example, Query by Image Content
Paradigm
Demo...
However, it doesn’t tend to work well w.r.t. real
image searchers
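A query-by-example search of this kind can be sketched with coarse colour histograms. This is an assumption-laden toy, not the demo from the talk: images are plain lists of RGB tuples, the collection and function names are invented for illustration, and histogram intersection is just one common similarity measure.

```python
def colour_histogram(pixels, bins=2):
    """Normalised joint RGB histogram with `bins` levels per channel."""
    hist = [0.0] * (bins ** 3)
    for pixel in pixels:
        idx = 0
        for channel in pixel:
            # Map a 0-255 channel value to a bin index in [0, bins).
            idx = idx * bins + min(channel * bins // 256, bins - 1)
        hist[idx] += 1
    return [count / len(pixels) for count in hist]


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def query_by_example(query_pixels, collection):
    """Rank image names by colour similarity to the query image."""
    q = colour_histogram(query_pixels)
    scored = [(intersection(q, colour_histogram(p)), name)
              for name, p in collection.items()]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical unannotated collection.
COLLECTION = {
    "sunset": [(250, 120, 30)] * 8 + [(20, 20, 20)] * 2,
    "forest": [(30, 160, 40)] * 10,
    "fire_engine": [(240, 100, 20)] * 9 + [(200, 200, 200)],
}
ORANGE_QUERY = [(245, 110, 25)] * 10  # user supplies a sunset-like example
```

Calling `query_by_example(ORANGE_QUERY, COLLECTION)` ranks the fire engine above the sunset, because both are dominated by the same colour bin. The features match, but the semantics the searcher had in mind do not.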
Bottom-up Approaches
Auto-annotation
Lots of techniques proposed, using different descriptor
morphologies (global, region-based [segmented, salient, ...])
Co-occurrence of keywords and image features
Machine translation
Statistical, maximum entropy, ...
Probabilistic methods
Inference networks, density estimation, ...
Latent-spaces
Keyword propagation
Simple classifiers using low level features
Almost all of these techniques involve explicitly annotating the
media with a keyword, phrase or concept
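Of the techniques listed, keyword propagation is perhaps the simplest to sketch. The example below assumes a nearest-neighbour voting scheme, which is one possible form of propagation and not necessarily the one the authors used; the two-dimensional feature vectors, training data and function name are hypothetical.

```python
import math
from collections import Counter

# Hypothetical training set: (feature vector, manually assigned keywords).
TRAINING = [
    ((0.90, 0.10), ["sky", "mountains"]),
    ((0.80, 0.20), ["sky", "trees"]),
    ((0.85, 0.15), ["sky", "mountains"]),
    ((0.10, 0.90), ["street", "car"]),
    ((0.20, 0.80), ["street"]),
]


def propagate_keywords(features, training, k=3, top=2):
    """Annotate an image with the most frequent keywords of its k nearest
    annotated neighbours in feature space (Euclidean distance)."""
    nearest = sorted(training, key=lambda item: math.dist(features, item[0]))[:k]
    votes = Counter(kw for _, keywords in nearest for kw in keywords)
    return [kw for kw, _ in votes.most_common(top)]
```

For an unannotated image with features `(0.88, 0.12)`, the three nearest neighbours are the landscape images, so the image is explicitly labelled with "sky" and "mountains". Any bias or noise in the training keywords is propagated along with them.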
The Auto-Annotation Process
[Figure: diagram labelled Raw Media, Descriptors, Objects, Labels (SKY, MOUNTAINS, TREES) and Semantics (“Photo of Yosemite valley showing El Capitan and Glacier Point with the Half Dome in the distance”); further labels: inter-object relationships, sub/super objects, other contextual information]
Bottom-up Approaches
Semantic Spaces
Simple idea based on factorisation and dimensionality reduction
of a vector-space of keywords/phrases/concepts and visual terms
created from feature-vectors calculated from the media
Doesn’t allow explicit annotation, but does provide search
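The idea can be sketched on a toy collection. Everything here is an assumption made for illustration: the slides do not say which factorisation is used, so power iteration with deflation stands in for a proper SVD, and the term names, image names and functions are hypothetical. Keywords and visual terms ("vt_*") share one vector space, which is reduced to two dimensions.

```python
import math

TERMS = ["sun", "sea", "vt_orange", "vt_blue"]
# Hypothetical collection; the last two images have no keywords at all.
IMAGES = {
    "sunset1": ["sun", "vt_orange"],
    "sunset2": ["sun", "vt_orange"],
    "sunset3": ["sun", "vt_orange"],
    "beach1": ["sea", "vt_blue"],
    "beach2": ["sea", "vt_blue"],
    "unlabelled_a": ["vt_orange"],
    "unlabelled_b": ["vt_blue"],
}


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalise(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


def term_vector(present):
    """Binary occurrence vector over TERMS."""
    return [1.0 if t in present else 0.0 for t in TERMS]


def top_eigenvectors(matrix, k, iters=200):
    """Top-k eigenvectors of a symmetric matrix (power iteration + deflation)."""
    m = [row[:] for row in matrix]
    size = len(m)
    basis = []
    for _ in range(k):
        v = normalise([float(i + 1) for i in range(size)])  # fixed start vector
        for _ in range(iters):
            v = normalise([dot(row, v) for row in m])
        lam = dot(v, [dot(row, v) for row in m])
        basis.append(v)
        for i in range(size):  # remove the component just found
            for j in range(size):
                m[i][j] -= lam * v[i] * v[j]
    return basis


def build_space(k=2):
    """Factorise the term co-occurrence matrix and keep k dimensions."""
    cols = [term_vector(t) for t in IMAGES.values()]
    n = len(TERMS)
    gram = [[sum(c[i] * c[j] for c in cols) for j in range(n)] for i in range(n)]
    return top_eigenvectors(gram, k)


def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0


def search(keyword, basis):
    """Rank all images against a keyword inside the reduced space."""
    q = [dot(u, term_vector([keyword])) for u in basis]
    scored = [(cosine(q, [dot(u, term_vector(t)) for u in basis]), name)
              for name, t in IMAGES.items()]
    return sorted(scored, reverse=True)
```

Running `search("sun", build_space())` gives `unlabelled_a` a score close to 1 and `unlabelled_b` a score close to 0, although neither image has any keyword. No label is ever attached to the unannotated images; they are simply found because their visual terms co-occur with "sun" in the training data.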
Semantic Space Demo
Issues
The biggest problems come from:
Biased training data
Noisy/poor keywording, as with top-down
approaches
The semantic space should be able to cope
with this by learning from co-occurrence, but
this is difficult with keywords
Proper full-text captions would be better...
Image noise...
Image Noise
video/ss search for ‘cups’
Image Noise
Possible solution: Automatic Region-of-Interest Detection
Integration
Concept-augmented Semantic Spaces
Less clutter, better training
Better classifiers and auto-annotators
Safer annotation or classification using
ontological reasoning
e.g. image is either ‘horse’ or ‘foal’ according
to classifier. Which is safer?
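One plausible reading of "safer" is that, when the classifier cannot separate two concepts, the annotation should back off to the most specific concept that covers both. The sketch below assumes that interpretation; the slides pose the question without answering it. The ontology fragment, the margin value and the function names are hypothetical.

```python
# Hypothetical ontology fragment: concept -> parent concept.
PARENT = {"foal": "horse", "horse": "animal", "dog": "animal", "animal": None}


def ancestors(concept):
    """The concept itself followed by its chain of superclasses."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT.get(concept)
    return chain


def safe_label(scores, margin=0.1):
    """Return the top-scoring concept if it clearly wins; otherwise back off
    to the most specific concept subsuming the two best candidates."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    best, second = ranked[0], ranked[1]
    if scores[best] - scores[second] >= margin:
        return best
    covering = set(ancestors(second))
    for concept in ancestors(best):
        if concept in covering:
            return concept
    return None  # no shared superclass in the ontology
```

For classifier scores of `{"foal": 0.50, "horse": 0.45, "dog": 0.05}` this returns "horse": every foal is also a horse, so the more general label is correct whichever of the two candidates is right.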
Conclusions
Have discussed some techniques and issues with current
top-down and bottom-up techniques
Top-down techniques work well, but can’t help us find
metadata-less media
Bottom-up approaches allow us to locate media without
metadata, however performance is variable
Hopefully the demos have illustrated where we are taking this
and how we are beginning to integrate both approaches
Any Questions?
