Television Linked To The Web

LinkedTV @ MediaEval
Search and Hyperlinking
M. Sahuguet1, B. Huet1, B. Cervenková2, E. Apostolidis4, V. Mezaris4, D. Stein3,
S. Eickeler3, J.L. Redondo Garcia1, R. Troncy1, and L. Pikora2
MediaEval 2013 Workshop
Barcelona, Catalunya, Spain, 18-19 October 2013.
(1)

(2)

www.linkedtv.eu

(3)

(4)
LinkedTV ― Television Linked To the Web

LinkedTV: interweaving Web and
TV into a single experience
Second screen scenario for
enriching television content and
achieving interaction between
user and content

Web: http://www.linkedtv.eu

LinkedTV @ MediaEval Search and Hyperlinking 2013

10/18/2013

LinkedTV@MediaEval

• MediaEval Search & Hyperlinking: an overview of LinkedTV’s enrichment process
  - Brainstorming
  - Pre-processing (BBC dataset)
  - Video segmentation
  - Indexing data in Lucene
  - From visual cues to detected concepts
  - Search task
  - Hyperlinking task
  - Conclusion

Brainstorming

• Brainstorming meeting: tasks and dataset analysis
  - Shots are too small to return to the user
  - Typos in the queries
  - Duplicate videos in the dataset
  - Visual concepts are not usable as such
  - Visual cues may not be helpful
  - Visual cues can also help as search terms
  - Maybe we can segment the videos differently?
  - Can we use speaker information?
  - Name of show/channel may appear in the query
  - Actor/character names may appear
  - What further analysis can we apply to the videos?

Brainstorming

• Brainstorming meeting: tasks and dataset analysis
  - Search:
      - Getting the right video is possible
      - Need to extract a segment with good timing
• Segmentation level is of major importance
  - Shots are too short
  - We want to be as close as possible to the viewer
  - Visual cues: not always helpful
      <visualQueues>2 men sitting opposite each other</visualQueues>
      <visualQueues>stands out and grabs your attention</visualQueues>
• Need to design a framework to use visual cues
• How can the LinkedTV media analysis tools be used?

Pre-processing dataset

• Processing ~1697 h of BBC video data

Analysis                          Partner      Processing time
Visual concept detection (151)    CERTH        20 days on 100 cores
Scene segmentation                CERTH        2 days on 6 cores
OCR                               Fraunhofer   1 day on 10 cores
Keywords extraction               Fraunhofer   5 hours
Named entities extraction         Eurecom      4 days
Face detection and tracking       Eurecom      4 days on 160 cores

Video Segmentation

• Shots (provided by Task Organisers)
• Scenes: groups of adjacent shots
  - Visual similarity
  - Temporal consistency
  - P. Sidiropoulos, V. Mezaris, I. Kompatsiaris, H. Meinedo, M. Bugalho, and I. Trancoso. Temporal Video Segmentation to Scenes Using High-Level Audiovisual Features. IEEE Transactions on Circuits and Systems for Video Technology, 2011
• Sliding windows:
  - inspired by M. Eskevich, G. Jones, C. Wartena, M. Larson, R. Aly, T. Verschoor, and R. Ordelman. Comparing retrieval effectiveness of alternative content segmentation methods for Internet video search. 10th International Workshop on Content-Based Multimedia Indexing (CBMI), 2012

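As a rough illustration of the sliding-window segmentation, the sketch below cuts a timed transcript into overlapping fixed-length windows. The slides do not state the unit of the window size or the overlap, so the window length and step in seconds, the `Word` type and the `sliding_windows` helper are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the video (assumed unit)


def sliding_windows(words, size=60.0, step=30.0):
    """Group timed words into overlapping windows of `size` seconds.

    Returns (window_start, window_end, text) tuples; empty windows are skipped.
    """
    if not words:
        return []
    last = max(w.start for w in words)
    segments = []
    t = 0.0
    while t <= last:
        inside = [w.text for w in words if t <= w.start < t + size]
        if inside:
            segments.append((t, t + size, " ".join(inside)))
        t += step
    return segments


words = [Word("odd", 5.0), Word("cars", 40.0), Word("race", 70.0)]
for segment in sliding_windows(words):
    print(segment)
```

Each window would then be indexed as its own document, so that a query can return a segment with usable start and end times rather than a whole video.
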
Indexing data in Lucene

• Lucene engine for indexing the data
• Index at different temporal granularities:
  - Video level (pre-filtering)
  - Scene level
  - Shot level
  - Sliding-window segment level
• Index different features at each temporal granularity:
  - Text (transcripts, subtitles)
  - Metadata (title, synopsis, cast, etc.)
  - OCR
  - Visual concept values (floating point fields)
• Design a framework for querying indexes and returning video segments from a query

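The indexing layout above can be pictured with a small in-memory stand-in. This is not Lucene code: the `SegmentDoc` and `MultiGranularityIndex` names, the field names and the substring match are hypothetical, chosen only to show one document type stored once per temporal granularity with text, metadata, OCR and floating-point concept scores side by side.

```python
from dataclasses import dataclass, field

GRANULARITIES = ("video", "scene", "shot", "sliding_window")


@dataclass
class SegmentDoc:
    video_id: str
    start: float          # seconds
    end: float            # seconds
    text: str = ""        # transcripts or subtitles
    metadata: str = ""    # title, synopsis, cast, ...
    ocr: str = ""
    concepts: dict = field(default_factory=dict)  # concept name -> score


class MultiGranularityIndex:
    """Toy index keeping one list of documents per granularity."""

    def __init__(self):
        self.docs = {g: [] for g in GRANULARITIES}

    def add(self, granularity, doc):
        if granularity not in self.docs:
            raise ValueError(f"unknown granularity: {granularity}")
        self.docs[granularity].append(doc)

    def search_text(self, granularity, term):
        # Naive substring match over the textual fields (no TF-IDF scoring).
        term = term.lower()
        return [
            d for d in self.docs[granularity]
            if term in f"{d.text} {d.metadata} {d.ocr}".lower()
        ]


index = MultiGranularityIndex()
index.add("scene", SegmentDoc("v1", 0.0, 95.0, text="odd cars on the track",
                              concepts={"Car": 0.9}))
index.add("shot", SegmentDoc("v1", 0.0, 4.2, text="odd cars"))
print(len(index.search_text("scene", "cars")))
```
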
From visual cues to detected concepts

• Text search is straightforward (default, TF-IDF values)
• Need to incorporate visual information into the search
• Which concepts are present in the query?
  - Semantic word distance based on WordNet synsets
  - Mapping between keywords (extracted from the visual cues query) and visual concepts
      <visualQueues>animals, kenya wildlife reserve, marathon</visualQueues>
      mapped visual concepts: Athlete, Dogs, Horse, Animal
• Integration of detected visual concepts into the Lucene search:
  - Concepts filtering
      - Correct detection rate from the first 100 results: threshold at 0.5
      - Normalize confidence: threshold at 0.7

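The keyword-to-concept mapping can be sketched with a path-based word similarity. To keep the example self-contained, it uses a tiny hand-written hypernym table instead of WordNet; the table, the similarity formula `1 / (1 + path length)` and the 0.5 cut-off are assumptions for illustration, not the measure used by the authors.

```python
# Toy child -> parent links; a stand-in for WordNet, not real WordNet data.
HYPERNYMS = {
    "dog": "animal",
    "horse": "animal",
    "animal": "organism",
    "runner": "athlete",
    "athlete": "person",
    "person": "organism",
}


def ancestors(word):
    """Return {node: distance} for the word and all of its hypernyms."""
    chain, dist = {}, 0
    while word is not None and word not in chain:
        chain[word] = dist
        word = HYPERNYMS.get(word)
        dist += 1
    return chain


def similarity(a, b):
    """Path similarity through the closest shared ancestor (0.0 if none)."""
    up_a, up_b = ancestors(a), ancestors(b)
    common = set(up_a) & set(up_b)
    if not common:
        return 0.0
    path = min(up_a[c] + up_b[c] for c in common)
    return 1.0 / (1.0 + path)


def map_keywords(keywords, concepts, threshold=0.5):
    """Keep every concept that is close enough to at least one keyword."""
    return [
        c for c in concepts
        if any(similarity(k, c) >= threshold for k in keywords)
    ]


print(map_keywords(["animal", "runner"],
                   ["athlete", "dog", "horse", "animal", "car"]))
```

With a real lexical database the lookup table would be replaced by synset queries, but the selection step (keep concepts whose distance to some query keyword is small) would look the same.
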
From visual cues to detected concepts

• Integration of detected visual concepts into the Lucene search:
  - Concepts selection
  - Designing an enriched query: both textual (text query) and visual information (range query)

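A minimal sketch of how an enriched query might be assembled: a text clause plus one range clause per selected concept, written in the `field:[low TO high]` range syntax of Lucene's classic query parser. The function names, the `text` field name, the per-concept reliability table and the way the thresholds are applied are assumptions; whether a range written this way matches a numeric field also depends on the Lucene version and field type.

```python
def select_concepts(mapped, reliability, min_reliability=0.5):
    """Keep mapped concepts whose detector is judged reliable enough."""
    return [c for c in mapped if reliability.get(c, 0.0) >= min_reliability]


def enriched_query(text, concepts, min_confidence=0.7, text_field="text"):
    """Build one query string: a phrase clause plus a range clause per concept."""
    escaped = text.replace('"', '\\"')
    clauses = [f'{text_field}:"{escaped}"']
    for concept in concepts:
        clauses.append(f"{concept}:[{min_confidence} TO 1.0]")
    return " ".join(clauses)


# Hypothetical detector reliabilities, for illustration only.
reliability = {"Athlete": 0.4, "Dogs": 0.8, "Horse": 0.6, "Animal": 0.9}
selected = select_concepts(["Athlete", "Dogs", "Horse", "Animal"], reliability)
print(enriched_query("kenya wildlife reserve", selected))
```
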
Search task

• Search videos at different temporal granularities
• Concatenation of textual and visual queries for text search
    <queryText>Odd cars, Fake MacLaren, </queryText>
    <visualQueues>Jeremy Clarkson, Richard Hammond, James May, Ferrari 430 Scuderia</visualQueues>
  - Visual cues can be found in queryText too
• If a TV channel is mentioned, perform filtering:
    <visualQueues>Cannabis on BBC ONE</visualQueues>
  - Should also be done on show titles (for next year?)
• For some runs, filter at video level first
  - Make a text query on the video index
  - Use the first 20 videos for segment search
• Focused search

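The channel filtering and the video-level pre-filter could be combined roughly as follows. The channel list, the `extract_channel` and `two_stage_search` helpers and the two search callbacks are hypothetical; only the "Cannabis on BBC ONE" example and the cut-off of 20 videos come from the slide.

```python
import re

# Assumed list of channel names to look for in a query.
KNOWN_CHANNELS = ("BBC ONE", "BBC TWO", "BBC THREE", "BBC FOUR")


def extract_channel(query):
    """Return (channel or None, query with the channel mention removed)."""
    for channel in KNOWN_CHANNELS:
        pattern = re.compile(
            r"\b(?:on\s+)?" + re.escape(channel) + r"\b", re.IGNORECASE
        )
        if pattern.search(query):
            cleaned = pattern.sub("", query)
            return channel, " ".join(cleaned.split())
    return None, query


def two_stage_search(query, search_videos, search_segments, top_videos=20):
    """Rank whole videos first, then search segments inside the best ones.

    `search_videos(text, channel)` and `search_segments(text, videos)` are
    caller-supplied functions standing in for the real index queries.
    """
    channel, text = extract_channel(query)
    videos = search_videos(text, channel)[:top_videos]
    return search_segments(text, videos)


print(extract_channel("Cannabis on BBC ONE"))
```
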
Search task

• Different granularities:
  - scenes
  - partial scenes (begin at a shot; end at the corresponding scene ending)
  - temporally clustered shots (inside a video)
  - sliding window
• Different textual data (transcript/ASR)
• With/without visual concepts
• With/without use of synonyms
• 9 runs
• Goal: comparing approaches and features

Search task – Results

Run               MRR      mGAP     MASP
scenes-C          0.3095   0.1770   0.1951
scenes-noC        0.3091   0.1767   0.1947
scenes-S          0.3152   0.1635   0.2021
scenes-I          0.2613   0.1444   0.1582
scenes-U          0.2458   0.1344   0.1528
part-scenes-C     0.2284   0.1241   0.1024
part-scenes-noC   0.2281   0.1240   0.1021
clustering-C      0.2929   0.1525   0.1814
clustering-noC    0.2849   0.1479   0.1713
SW-60-S           0.2833   0.1925   0.2027
SW-60-I           0.1965   0.1206   0.1204
SW-40-U           0.2368   0.1342   0.1501

Annotations on the slide:
• Scenes search using textual and visual concepts
• Scene search using only subtitles
• Search over sliding window segments (size 60)

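For readers unfamiliar with the first measure, MRR (mean reciprocal rank) averages the inverse rank of the first relevant result over all queries. The sketch below shows only that standard formula; how a returned segment counts as relevant is defined by the task organisers and is not modelled here, and mGAP and MASP are not covered.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over queries.

    `ranks` holds, per query, the 1-based rank of the first relevant result,
    or None when no relevant result was returned (contributes 0).
    """
    if not ranks:
        return 0.0
    return sum(1.0 / r for r in ranks if r) / len(ranks)


# Four hypothetical queries: hits at ranks 1, 2 and 4, and one miss.
print(mean_reciprocal_rank([1, 2, None, 4]))
```
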
Hyperlinking Task

• Re-use of the search component
  - Shot clustering approach
  - Scene approach
• Create a query from the anchor!
  - Get subtitles and shots aligned with the anchor
  - Text query: extract keywords using Alchemy API (higher weight to anchor than to context)
  - Visual cues query: for each concept, highest score over all shots
• Use of the “MoreLikeThis” (MLT) feature in Lucene, combined with THD
  - Sliding window approach
• Create temporary documents from the anchor!
• THD = Targeted Hypernym Discovery (UEP): returns semantic annotation, synonyms
• MLT: finds documents similar to an input document

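The visual part of the anchor query ("for each concept, highest score over all shots") amounts to a per-concept maximum. A minimal sketch, assuming each shot aligned with the anchor is represented as a dictionary of concept scores (the data layout and the function name are illustrative):

```python
def anchor_concept_scores(shots):
    """Return, for each concept, its highest score over all anchor shots.

    `shots` is a list of {concept name: score} dictionaries, one per shot.
    """
    best = {}
    for scores in shots:
        for concept, score in scores.items():
            if score > best.get(concept, float("-inf")):
                best[concept] = score
    return best


shots = [{"Animal": 0.2, "Horse": 0.9}, {"Animal": 0.8}]
print(anchor_concept_scores(shots))
```
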
Hyperlinking results

Run             MAP      P-5      P-10     P-20
LA clustering   0.0577   0.4467   0.3200   0.2067
LA SW MLT       0.1201   0.4200   0.4200   0.3217
LA scenes       0.1770   0.6867   0.5867   0.4167
LC clustering   0.0823   0.5733   0.4833   0.2767
LC SW MLT       0.1820   0.5667   0.5667   0.4300
LC scenes       0.2523   0.8133   0.7300   0.5283

Annotations on the slide:
• Scenes search in LA condition (anchor only)
• Scenes search in LC condition (anchor + context)

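The column headers P-5, P-10 and P-20 denote precision at a cut-off, and MAP is the mean over anchors of average precision. The sketch below gives the textbook definitions for a single ranked list of relevance judgements; the official evaluation may differ in details such as pooling or how overlapping segments are judged, which are not modelled here.

```python
def precision_at_k(relevant_flags, k):
    """Fraction of the top k results that are relevant (missing = not relevant)."""
    return sum(relevant_flags[:k]) / k


def average_precision(relevant_flags, total_relevant=None):
    """Mean of the precision values at each rank where a relevant item appears.

    If `total_relevant` is not given, the number of relevant items found in
    the list is used as the denominator.
    """
    hits, score = 0, 0.0
    for rank, rel in enumerate(relevant_flags, start=1):
        if rel:
            hits += 1
            score += hits / rank
    denom = total_relevant if total_relevant else hits
    return score / denom if denom else 0.0


flags = [True, False, True, True, False]  # hypothetical judgements
print(precision_at_k(flags, 5), average_precision(flags))
```
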
Conclusions

• Major findings
  - Scene segmentation approach performs best
  - Improvement when using visual concepts, when carefully employed
• Future work
  - Improve scene detection
      - Follow human perception more closely
  - Improve the link between query and visual concepts
  - Use named entities

Thank you
Questions?



Editor's Notes

  • #4: Input from Daniel regarding the progress in audio analysis and video OCR:
      - adoption of new video OCR
      - speech processing, preparation of new paradigms:
          - deep neural networks (automatic speech recognition)
          - i-vectors + SVMs using cosine kernel (speaker recognition)