Television Linked To The Web

  Enrichment of News Show Videos with
  Multimodal Semi-Automatic Analysis

  Daniel Stein1, Evlampios Apostolidis2, Vasileios Mezaris2,
  Nicolas de Abreu Pereira3, Jennifer Müller3,
  Mathilde Sahuguet4, Benoit Huet4, Ivo Lašek5

1 Fraunhofer Institute IAIS, Schloss Birlinghoven, Germany
2 Information Technologies Institute, CERTH, Greece
3 rbb - Rundfunk Berlin-Brandenburg, 14482 Potsdam, Germany
4 Eurecom, Sophia Antipolis, France
5 Czech Technical University and University of Economics, Prague, Czech Republic

  NEM Summit, Istanbul, October 2012
  www.linkedtv.eu

Synopsis




   Introduction: LinkedTV Project
   Use Cases
   Intelligent Video Analysis
   Results
   Conclusions & future plans




2                               Information Technologies Institute
                                Centre for Research and Technology Hellas
LinkedTV: Television Linked To the Web

   Vision:
     hypervideo
     ubiquitously online cloud of Networked Audio-Visual Content
     decoupled from place, device or source

   Aim:
     provide interactive multimedia service for non-professional end-users
     focus on television broadcast content as seed videos

   12 Excellent Partners:
     Fraunhofer, Eurecom, STI GMBH, Condat, CERTH, BEELD EN GELUID,
     UEP, Noterik, UMONS, U. ST GALLEN, CWI, RBB

   Web: http://www.linkedtv.eu


LinkedTV Workflow

   [Workflow diagram; components shown:]
     Overall Architecture
     Use Case Scenarios
     Intelligent Video Analysis
     Linking Hypervideo to Web Content
     Contextualization and Personalization
     Interface and Presentation Engine

Two Use Case Scenarios in LinkedTV

Scenario 1 (this talk):
Interactive News Show
   Professional news content produced by RBB
   Seed content: local news show "rbb Aktuell"
   Due to legal constraints: whitelist
   Detailed scenario archetype description
   News topic, people, locations, objects etc.

Scenario 2 (not covered here):
Hyperlinked Documentary
   Cultural content from S&V (1700 hours of cultural heritage AV-content under CCL)
   Seed content: "Antique Roadshow"
Intelligent Video Analysis
Segmentation

   Shot segmentation technique [Tsamoura et al., 2008]
     News show video performance: "remarkably well"
     Out of 269 shots detected:
       2 had wrong starting points
       4 contained multiple shots
       11 were too short to evaluate properly

   Spatio-temporal segmentation [Mezaris et al., 2004]
     News show performance: good
     False positives due to:
       camera movement or zoom in/out (~55%)
       gradual transition between frames (~10%)
       erroneous motion vectors (~35%)
     Unwanted effect: false recognition of moving banners, which do not yield additional information




V. Mezaris, I. Kompatsiaris, N. V. Boulgouris, and M. G. Strintzis, "Real-time compressed-domain spatiotemporal segmentation and ontologies
for video indexing and retrieval", IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 5, pp. 606-621, May 2004.

E. Tsamoura, V. Mezaris, I. Kompatsiaris, "Gradual transition detection using color coherence and other criteria in a video shot meta-
segmentation framework", IEEE International Conference on Image Processing, Workshop on Multimedia Information Retrieval (ICIP-MIR
2008), San Diego, CA, USA, October 2008, pp. 45-48.
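
The shot-segmentation counts reported on this slide can be turned into rates with simple arithmetic. The following is a minimal sketch: the variable names are illustrative, and treating the 11 too-short shots as "not evaluable" rather than as errors is an assumption, not something the slide states.

```python
# Counts reported on the slide for the shot segmentation evaluation.
TOTAL_SHOTS = 269
WRONG_START = 2       # shots with wrong starting points
MULTIPLE_SHOTS = 4    # detected shots that actually contain several shots
TOO_SHORT = 11        # shots too short to evaluate properly


def rate(count: int, total: int = TOTAL_SHOTS) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Assumption: only wrong starts and merged shots count as errors;
# the too-short shots are reported separately as not evaluable.
error_rate = rate(WRONG_START + MULTIPLE_SHOTS)
unevaluable_rate = rate(TOO_SHORT)

print(f"erroneous shots: {error_rate:.1f} %")
print(f"not evaluable:   {unevaluable_rate:.1f} %")
```

Under that assumption, roughly 2% of the detected shots are erroneous and roughly 4% could not be judged.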


Concept Detection

       Method described in [Moumtzidou et al., 2011]
       346 concepts from the TRECVID 2011 SIN task
       Overall performance:
          Correctly detected concepts: > 64%
          About 25% of them are characterized as particularly useful, mostly related to detecting persons (e.g., person, face, adult)
          Erroneous concepts vary between 22% and 42%, and in many cases achieve high scores (e.g., outdoor, amateur video)

       Visit: http://mklab.iti.gr/eventdetection-linkedtv/



A. Moumtzidou, P. Sidiropoulos, S. Vrochidis, N. Gkalelis, S. Nikolopoulos, V. Mezaris, I. Kompatsiaris, I. Patras, "ITI-CERTH participation
to TRECVID 2011", Proc. TRECVID 2011 Workshop, December 2011, Gaithersburg, MD, USA.


Automatic Speech Recognition

 Automatic speech recognition for German (using [Schneider08]):

        segment of one news show    WER (%)    notes
        new airport                 36.2       outdoor, spontaneous
        soccer riot                 44.2       tavern, dialect, background noise
        various news I               9.5
        murder case                 24.0
        boxing                      50.6       dialect, very spontaneous
        various news II             20.9
        rbb game                    39.1
        weather report              46.7       spontaneous, casual

 Main obstacles: local dialect, spontaneous speech, background noise

 Schneider, D., Schon, J., and Eickeler, S. (2008). Towards Large Scale Vocabulary Independent Spoken Term Detection: Advances
 in the Fraunhofer IAIS Audiomining System. In Proc. SIGIR, Singapore.
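
For readers unfamiliar with the WER column: word error rate is conventionally the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below shows that standard definition only; it is not the scoring tool used for the table, and the example sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate in percent.

    WER = (substitutions + deletions + insertions) / number of reference words,
    computed with a word-level Levenshtein distance.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical example: one substituted word and one dropped word
# against a four-word reference.
example_wer = word_error_rate("a b c d", "a x c")
print(f"WER: {example_wer:.1f} %")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.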


Person Detection

   Face clustering using the face.com API
     Result: generally very good, some erroneous clusters due to side-view

   Speaker identification using a GMM-HMM model, with 253 German parliament speakers
     Result: 8.0% Equal Error Rate
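
The Equal Error Rate reported for speaker identification is the operating point at which the false acceptance rate and the false rejection rate coincide. The sketch below approximates it by sweeping every observed score as a threshold; the scores are invented for illustration, and nothing here reproduces the GMM-HMM system or its evaluation data.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Approximate the equal error rate (in percent).

    Every observed score is tried as a decision threshold; the threshold
    where false acceptance and false rejection rates are closest wins,
    and the mean of the two rates at that point is returned.
    """
    if not target_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")

    best_gap = float("inf")
    eer = 0.0
    for threshold in sorted(set(target_scores) | set(impostor_scores)):
        # False rejection: a genuine speaker scored below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        # False acceptance: an impostor scored at or above the threshold.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2
    return 100.0 * eer


# Hypothetical similarity scores (higher means "more likely the same speaker").
targets = [0.9, 0.8, 0.7, 0.4]
impostors = [0.6, 0.3, 0.2, 0.1]
example_eer = equal_error_rate(targets, impostors)
print(f"EER: {example_eer:.1f} %")
```

With few trials the two error curves may not cross exactly, which is why the sketch averages the rates at the closest threshold instead of interpolating.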
Conclusions

    We have established:
       all the different video analysis techniques
       their exact functionality
       the connections among them

    Preliminary results provide solid ground for further improvements

    Many challenges have been addressed, but several aspects of the analysis techniques leave much room for improvement, e.g.,
       over-sensitivity of the spatiotemporal segmentation algorithm to gradual transitions and camera movement
       adaptation of several TRECVID concepts to the needs of each specific type of multimedia content (news show, documentary, art show)
       over-sensitivity of the speech recognizer to local dialects and background noise



Future Plans

  Incorporate new methods:
       Near-duplicate Content Detection
          Goal: find parts that have already been watched
       Optical Character Recognition
          Goal: exploit banner information to obtain a database for face and speaker recognition
       Topic Segmentation
          Goal: improve scene segmentation

  Find synergies between methods:
       ASR + Speaker Recognition + Face Detection
           -> Person Detection
       ASR + Topic Classification + Shot Segmentation
           -> Story Segmentation
       Concept Detection + Keywords Extraction + Topic Segmentation
           -> Video Similarity/Clustering



Questions?

More information:
http://www.iti.gr/~bmezaris
bmezaris@iti.gr

http://www.linkedtv.eu
