Sirio: an Ontology-based Web Search Engine for Videos
Thomas Alisi, Marco Bertini, Gianpaolo D’Amico, Alberto Del Bimbo,
Andrea Ferracani, Federico Pernici and Giuseppe Serra
Media Integration and Communication Center, University of Florence, Italy
{alisi, bertini, damico, delbimbo, ferracani, pernici, serra}@dsi.unifi.it
http://www.micc.unifi.it/vim
ABSTRACT
In this technical demonstration we show Sirio¹, a web video
search engine based on ontologies that has been developed
within the EU VidiVideo project. The goal of the system is
to provide a video search engine for both technical and
non-technical users. To this end, the system offers
different interfaces that support different query modalities:
free text, natural language, graphical composition of
concepts using boolean and temporal relations, and query by
visual example. In addition, the ontology structure is
exploited to encode semantic relations between concepts,
which makes it possible, for example, to expand queries to
synonyms and concept specializations.
Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information
Search and Retrieval—Search process; H.3.5 [Information
Storage and Retrieval]: Online Information Services—
Web-based services
General Terms
Algorithms, Experimentation
Keywords
Video retrieval, ontologies, web services
1. INTRODUCTION
Video search engines are the product of progress in many
technologies: visual and audio analysis, machine learning
techniques, as well as visualization and interaction. Current
video search engines are based on lexicons of semantic
concepts and perform keyword-based queries [1]. These
systems are generally desktop applications or have simple web
interfaces that show the results of the query as a ranked list
of keyframes [2, 3]. They do not let users perform composite
queries that include temporal relations between concepts, and
they do not allow searching for concepts that are not in the
lexicon. In addition, desktop applications require
installation on the end-user computer and cannot be used in a
distributed environment.
¹ Sirio was the hound of the mythical hunter Orion. It was
a dog so swift that no prey could escape it.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
MM’09, October 19–24, 2009, Beijing, China.
Copyright 2009 ACM X-XXXXX-XX-X/XX/XX ...$10.00.
In this demonstration we present the Sirio system, a web
video search engine that allows semantic retrieval by content
for different domains (broadcast news, surveillance, cultural
heritage documentaries) with query interaction and
visualization. The system supports different query modalities
(free text, natural language, graphical composition of
concepts using boolean and temporal relations, and query by
visual example) and visualizations, resulting in an advanced
tool for retrieval and exploration of video archives for both
technical and non-technical users. In addition, the use of
ontologies makes it possible to exploit semantic relations
between concepts through reasoning. Finally, our web system,
built on the Rich Internet Application (RIA) paradigm, does
not require any installation and provides a responsive user
interface.
2. THE SYSTEM
The Sirio system² is composed of three different
interfaces: a GUI to build composite queries that may include
boolean/temporal operators and visual examples, a natural
language interface for simpler queries with boolean/temporal
operators, and a free-text interface for Google-like searches.
In all the interfaces it is possible to extend queries by
adding synonyms and concept specializations through ontology
reasoning and the use of WordNet. Consider, for instance, the
query “Find shots with animal”: expansion to concept
specializations through the ontology structure retrieves not
only the shots annotated with animal, but also those
annotated with its specializations (dogs, cats, etc.).
WordNet query expansion using synonyms is required in
particular for natural language and free-text queries, since
there the user cannot be forced to formulate a query by
selecting terms from a lexicon, as is done in the GUI.
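The two kinds of expansion described above can be illustrated with a minimal sketch. It is not the Sirio implementation: the toy "is a" table, the synonym table (standing in for a WordNet lookup) and all function names are assumptions made only for this example.

```python
from collections import deque

# Hypothetical toy ontology: child concept -> parent concept ("is a").
IS_A = {
    "dog": "animal",
    "cat": "animal",
    "hound": "dog",
    "car": "vehicle",
}

# Hypothetical synonym table standing in for a WordNet lookup.
SYNONYMS = {
    "automobile": "car",
    "canine": "dog",
}


def normalize(term):
    """Map a free-text term to a lexicon concept via the synonym table."""
    term = term.lower()
    return SYNONYMS.get(term, term)


def specializations(concept):
    """Return the concept followed by all its direct and indirect specializations."""
    children = {}
    for child, parent in IS_A.items():
        children.setdefault(parent, []).append(child)
    found, queue = [concept], deque([concept])
    while queue:  # breadth-first walk down the "is a" hierarchy
        for child in sorted(children.get(queue.popleft(), [])):
            if child not in found:
                found.append(child)
                queue.append(child)
    return found


def expand_query(term):
    """Expand a user term into the list of concepts to search for."""
    return specializations(normalize(term))
```

With these toy tables, `expand_query("animal")` also covers dog, cat and hound, while a free-text term such as "canine" is first mapped to the lexicon concept dog and then expanded in the same way.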
² http://deckard.micc.unifi.it/sirio/
Figure 1: Search interfaces: natural language search; Google-like search; GUI query builder.
The search engine uses an ontology that has been created
automatically from a flat lexicon, using WordNet to create
concept relations (is a, is part of and has part). The
ontology is modelled following the Dynamic Pictorially
Enriched Ontology model [4], which includes both concepts and
visual concept prototypes. These prototypes represent the
different visual modalities in which a concept can manifest;
they can be selected by the users to perform query by
example. Concepts, concept relations, video annotations and
visual concept prototypes are defined using the standard Web
Ontology Language (OWL) so that the ontology can be easily
reused and shared. The queries created in each interface
are translated by the search engine into SPARQL, the W3C
standard ontology query language.
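As an illustration of the translation step, the following sketch builds a SPARQL query string from a list of (already expanded) concepts. The namespace, the property name hasAnnotation and the function name are hypothetical and do not come from the Sirio ontology; the VALUES clause assumes a SPARQL 1.1 endpoint.

```python
import re

# Only plain identifiers are accepted as concept names, so that they can be
# used safely as prefixed names in the generated query.
_LOCAL_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_sparql(concepts, prefix="http://example.org/video#"):
    """Build a query returning shots annotated with any of the given concepts.

    The namespace and the property `hasAnnotation` are hypothetical.
    """
    if not concepts:
        raise ValueError("at least one concept is required")
    for concept in concepts:
        if not _LOCAL_NAME.match(concept):
            raise ValueError(f"invalid concept name: {concept!r}")
    values = " ".join(f"v:{c}" for c in concepts)
    return (
        f"PREFIX v: <{prefix}>\n"
        "SELECT DISTINCT ?shot WHERE {\n"
        f"  VALUES ?concept {{ {values} }}\n"
        "  ?shot v:hasAnnotation ?concept .\n"
        "}"
    )
```

Calling `build_sparql(["animal", "dog", "cat"])` yields a single query that matches shots annotated with any of the three concepts, which is one simple way to express an expanded query.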
The system is based on the Rich Internet Application
paradigm, using a client-side Flash virtual machine that
executes instructions on the client computer. RIAs avoid
the slow, synchronous interaction loop typical of web-based
environments that use only the HTML widgets available to
standard browsers. This makes it possible to implement a
visual querying mechanism with a look and feel approaching
that of a desktop environment, and with the fast response
that users expect. With this solution no application
installation is required, since the system is updated on the
server, and it runs anywhere regardless of the operating
system used.
The system backend is currently based on open source
tools (i.e. Apache Tomcat and the Red5 video streaming server)
or freely available commercial tools (Adobe Media Server
has a free developer edition). Video is streamed using the
RTMP protocol. The search engine is developed in Java
and supports multiple ontologies and ontology reasoning
services. Audio-visual concepts are automatically annotated
using the VidiVideo annotation engine [2]. The search
results are returned in RSS 2.0 XML format with paging, so
that they can be treated as RSS feeds. Results of the query
are shown in the interface, and the first frame of each video
clip in the result set is displayed. These frames are
obtained from the video streaming server and are shown within
a small video player. Users can then play the video sequence
and, if interested, zoom in on a result by displaying it in a
larger player that provides more details on the video
metadata and allows better video browsing. The user interface
is written in Adobe Flex and ActionScript 3. All the modules
of the system are connected using HTTP POST, XML and SOAP
web services.
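A paged RSS 2.0 result list of the kind mentioned above might look like the output of the following sketch. The channel link, the item fields and the function name are assumptions for illustration; the actual Sirio feed layout is not described here.

```python
import xml.etree.ElementTree as ET


def results_to_rss(query, clips, page, page_size):
    """Serialize one page of results as an RSS 2.0 document.

    `clips` is a list of (title, stream_url) pairs; both fields and the
    channel link below are hypothetical.
    """
    start = (page - 1) * page_size  # pages are numbered from 1
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link and description on the channel.
    ET.SubElement(channel, "title").text = f"Results for: {query} (page {page})"
    ET.SubElement(channel, "link").text = "http://example.org/search"
    ET.SubElement(channel, "description").text = f"{len(clips)} clips in total"
    for title, url in clips[start:start + page_size]:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = url
    return ET.tostring(rss, encoding="unicode")
```

Because each page is an ordinary RSS document, a client interface or any feed reader can consume it without a custom parser.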
3. DEMONSTRATION
We demonstrate the search modalities of the system in
three different video domains: broadcast news, video
surveillance and cultural heritage documentaries. We show how
each interface suits different users: the GUI allows building
composite queries that also take metadata into account, as
required by professional archivists; the natural language
interface allows building simple queries with boolean and
temporal relations between concepts; and the free-text
interface provides the popular Google-like search.
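To make the notion of a temporal relation between concepts concrete, the sketch below evaluates a "before" relation over time-stamped annotations. The annotation record, the choice of "before" as the relation and the function names are assumptions for this example; the set of temporal operators actually supported by the system is not listed here.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    """A concept detected in a video over a time interval (hypothetical record)."""
    video: str
    concept: str
    start: float  # seconds from the beginning of the video
    end: float


def before(a, b):
    """True if annotation `a` ends no later than `b` starts, in the same video."""
    return a.video == b.video and a.end <= b.start


def find_sequences(annotations, first, second):
    """Return all pairs where concept `first` occurs before concept `second`."""
    firsts = [a for a in annotations if a.concept == first]
    seconds = [a for a in annotations if a.concept == second]
    return [(a, b) for a in firsts for b in seconds if before(a, b)]
```

A query such as "person before car" then reduces to `find_sequences(annotations, "person", "car")`, and boolean operators can be layered on top by combining the returned pairs.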
Acknowledgments.
This work is partially supported by the EU IST VidiVideo
project (www.vidivideo.info - contract FP6-045547).
4. REFERENCES
[1] A. F. Smeaton, P. Over and W. Kraaij. High-Level
Feature Detection from Video in TRECVid: a 5-Year
Retrospective of Achievements, Multimedia Content
Analysis, Theory and Applications, 151–174, 2009,
Springer Verlag.
[2] C. G. M. Snoek et al. The MediaMill TRECVID 2008
Semantic Video Search Engine, In Proceedings of the
6th TRECVID Workshop, 2008.
[3] A. Natsev, J. R. Smith, J. Tešić, L. Xie, R. Yan,
W. Jiang and M. Merler. IBM Research TRECVID-2008
Video Retrieval System, In Proceedings of the 6th
TRECVID Workshop, 2008.
[4] M. Bertini, R. Cucchiara, A. Del Bimbo, C. Grana,
G. Serra, C. Torniai and R. Vezzani. Dynamic
Pictorially Enriched Ontologies for Video Digital
Libraries. In IEEE Multimedia, to appear, 2009.
