Media Fragment Indexing Using Social Media
Yunjia Li (1), Raphael Troncy (2), Mike Wald (1) and Gary Wills (1)
(1) School of Electronics and Computer Science, University of Southampton, UK
(2) EURECOM, Sophia Antipolis, France
Agenda
• Media Fragments
• Media Fragment Indexing Framework
• Survey on Media Fragment URI Implementations on Video
Sharing Platforms
• Indexing Media Fragments Using Twitter
• Conclusions and Future Work
Media Fragment
• A media fragment denotes a part inside a multimedia resource
• Dimensions defined in the W3C Media Fragments URI 1.0 specification
– Temporal dimension (here, from second 3 to second 7)
http://example.org/test.mp4#t=3,7
– Spatial dimension (a rectangular area given as x, y, width, height)
http://example.org/test.mp4#xywh=120,240,180,240
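The two dimensions above can be read out of a URI with a few lines of Python. This is a minimal sketch, not a full implementation of the specification: it assumes plain seconds for t and the default pixel unit for xywh, and the function name parse_media_fragment is made up for illustration.

```python
from urllib.parse import urlsplit


def parse_media_fragment(uri):
    """Return the temporal and spatial dimensions found in a URI fragment.

    Only the two forms shown on the slide are handled: t=start,end in
    plain seconds, and xywh=x,y,w,h in pixels.
    """
    result = {}
    for pair in urlsplit(uri).fragment.split("&"):
        name, _, value = pair.partition("=")
        if name == "t":
            start, _, end = value.partition(",")
            # A missing start means "from the beginning"; a missing end
            # means "until the end of the media" (represented as None).
            result["t"] = (float(start) if start else 0.0,
                           float(end) if end else None)
        elif name == "xywh":
            x, y, w, h = (int(n) for n in value.split(","))
            result["xywh"] = (x, y, w, h)
    return result


print(parse_media_fragment("http://example.org/test.mp4#t=3,7"))
print(parse_media_fragment("http://example.org/test.mp4#xywh=120,240,180,240"))
```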
Current Situation
• Uploading, sharing and tagging multimedia is easy
• Searching for a complete multimedia resource on major search
engines is easy
• But searching for multimedia resources at a fine-grained level
on major search engines is difficult
– Availability of annotations: only a limited number of
annotations are linked to media fragments
– SEO problem:
• The landing page is not search-engine-friendly
• Everything is on the same page, and the notion of a
media fragment is not explicitly embedded in the HTML
Media Fragment Indexing
Framework
Google’s Ajax Content Crawler
• The crawler is designed to index Ajax content
• It replaces the token “#!” in a URL with the query parameter “_escaped_fragment_”
*Diagram from
https://developers.google.com/webmasters/ajax-crawling/docs/getting-started
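The rewriting step can be sketched in Python as follows. The function name to_crawler_url and the example.org host are assumptions for illustration, and the escaping is a simplification of the crawler's actual rules: it only keeps “=” and “,” readable so that a fragment such as t=3,7 passes through unchanged.

```python
from urllib.parse import quote


def to_crawler_url(pretty_url):
    """Rewrite a '#!' URL into its '_escaped_fragment_' form."""
    base, marker, fragment = pretty_url.partition("#!")
    if not marker:
        return pretty_url  # no hash-bang, nothing to rewrite
    # Keep '=' and ',' readable; other special characters are
    # percent-encoded (simplified compared with the real scheme).
    escaped = quote(fragment, safe="=,/:")
    # Append to an existing query string if the URL already has one.
    separator = "&" if "?" in base else "?"
    return f"{base}{separator}_escaped_fragment_={escaped}"


print(to_crawler_url("http://example.org/replay/1#!t=3,7"))
```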
Key Ideas
• The fragment information must be included in the URL
– Syntax: W3C Media Fragments URI 1.0 specification
• Prepare two pages for every media fragment
– the original landing page for end users
– a snapshot page for SEO
• The landing page keeps the original user interaction
– Highlights the media fragment on opening
• The SEO (snapshot) page
– Includes ONLY the annotations of the media fragment
– Embeds a rich snippet
The Solution
[Diagram: the crawler, the server, the snapshot page Snapshot/1?_escaped_fragment_=t=3,7 and the landing page replay/1#!t=3,7]
1: Submit the pretty URL replay/1#!t=3,7 to the crawler
2: The crawler asks the server for replay/1?_escaped_fragment_=t=3,7
3: The server redirects the request to the snapshot page it generates. The snapshot page contains only the annotations and Microdata for “#t=3,7” (in the diagram, “Terrace Theater”)
4: The snapshot page is returned to the crawler with the URL replay/1#!t=3,7
5: A user searches for the keyword “Terrace Theater”
6: Google includes replay/1#!t=3,7 in the search results
7: The user clicks the link and asks for the document at replay/1#!t=3,7
8: The server returns the landing page, which contains both “Terrace Theater” and “Linked Data”
9: The landing page highlights the media fragment by playing from 3 s to 7 s
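The server-side decision in this workflow can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the annotation store, the function name handle_request and the fragment assigned to “Linked Data” are all invented, and the sketch returns the page content directly instead of issuing the redirect described in step 3.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical annotation store: (replay id, fragment) -> annotation texts.
# The fragment for "Linked Data" is invented for illustration.
ANNOTATIONS = {
    ("1", "t=3,7"): ["Terrace Theater"],
    ("1", "t=10,20"): ["Linked Data"],
}


def handle_request(url):
    """Return (page kind, annotations shown) for a request URL."""
    parts = urlsplit(url)
    replay_id = parts.path.rstrip("/").rsplit("/", 1)[-1]
    query = parse_qs(parts.query)
    if "_escaped_fragment_" in query:
        # Crawler request: the snapshot holds only this fragment's annotations.
        fragment = query["_escaped_fragment_"][0]
        return "snapshot", ANNOTATIONS.get((replay_id, fragment), [])
    # Browser request: the '#!...' part is never sent to the server, so the
    # landing page carries every annotation and the fragment is highlighted
    # on the client side.
    everything = [text for (rid, _), texts in ANNOTATIONS.items()
                  if rid == replay_id for text in texts]
    return "landing", everything


print(handle_request("http://example.org/replay/1?_escaped_fragment_=t=3,7"))
print(handle_request("http://example.org/replay/1"))
```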
Discussion
• The Media Fragment Indexing Framework solves the SEO
problem of media fragments
• How well the method scales depends largely on whether
a large number of annotations are linked to media
fragments
• Where to look for media fragment annotations?
– Timed text, transcripts, speech recognition
– Manual annotations on each video sharing platform
– Social media (Twitter)
Survey on Media Fragment
URI Implementation
Media Fragments and Social Media
• The deep-linking function
• A Media Fragment URL can be embedded in a Tweet
• The text of the Tweet is an annotation of the URL
• Get annotations by filtering Tweets that contain MF URIs
Filter Tweets by Media Fragment URIs
• Problem:
– Any URL in a Tweet is potentially an MF URI
– Too many false-positive cases
http://example.org/1234#t=23
http://example.org/1234?t=23
http://example.org/1234?track=23
– They could all be MF URIs and would need to be identified
manually
• Workaround:
– Identify platforms that (partially) implement MF URIs
– Only keep Tweets containing URLs from those domains
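The domain-based workaround could look like the sketch below. The host names are assumptions derived from the platform names chosen later in the survey (YouTube, Dailymotion, Vimeo, Vbox7, Viddler); the talk does not list exact domains, and the function name is hypothetical.

```python
from urllib.parse import urlsplit

# Assumed host names for the surveyed platforms; not listed in the talk.
MF_DOMAINS = {"youtube.com", "dailymotion.com", "vimeo.com",
              "vbox7.com", "viddler.com"}


def from_surveyed_platform(url):
    """True if the URL's host is one of the domains or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in MF_DOMAINS)


print(from_surveyed_platform("http://www.youtube.com/watch?v=abc#t=23"))
print(from_surveyed_platform("http://example.org/1234?t=23"))
```

Matching on the host rather than on a substring of the whole URL avoids accepting look-alike hosts that merely contain a platform name.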
Survey Methodology
• Find a list of video sharing platforms
– http://en.wikipedia.org/wiki/List_of_video_hosting_services
– 59 websites are targeted in the survey
– Some of them have access restrictions
• Go through each website manually to see whether it
provides a deep-linking function, such as:
– A social sharing button that links to a certain time point
– A deep-linking option in the right-click menu
Survey Results (1)
• 9 websites partially implement MF URIs
– 56.com, Dailymotion, Hulu, Vbox7, Viddler, Vimeo,
Tudou, Youku and YouTube
• They use different syntaxes to encode the temporal dimension
– Most of them use the URI query; YouTube and Vimeo are the exceptions
– Parameter names: “start”, “t”, “st”, etc.
– Only Hulu implements the end time
• Only YouTube partially implements the spatial dimension
– This is an external function implemented by Clickberry
https://clickberry.tv/video/6dafe30e-dcb8-44b8-8190-32be8249a297
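Because the platforms disagree on syntax, a parser has to look in more than one place. The sketch below tries the parameter names from the slide in both the fragment and the query; which platform uses which name is not stated in the survey, so no per-platform mapping is assumed, and the value is returned unparsed because its format may differ between sites. The function name extract_start is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

# Parameter names listed on the slide ("etc." there means the list is
# not exhaustive).
TIME_PARAMS = ("t", "start", "st")


def extract_start(url):
    """Return the raw start-time value of a deep link, or None."""
    parts = urlsplit(url)
    # Look in the fragment first, then in the query.
    for component in (parts.fragment, parts.query):
        params = parse_qs(component)
        for name in TIME_PARAMS:
            if name in params:
                return params[name][0]
    return None


print(extract_start("http://example.org/1234#t=23"))
print(extract_start("http://example.org/1234?track=23"))
```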
Survey Results (2)
• Only 9 websites partially implement MF URIs, however:
– Those websites cover most of the videos shared on the
web
– eBizMBA report:
http://www.ebizmba.com/articles/video-websites
• Select filter keywords based on the survey results:
– Twitter is banned in China, so 56.com, Tudou and
Youku are ignored
– Hulu restricts access from outside the U.S., so it is ignored as well
• Filter keywords: “YouTube”, “Dailymotion”, “Vbox7”,
“Vimeo” and “Viddler”
Indexing Media Fragments
Using Twitter
Twitter Media Fragment Indexer
• Collect Tweets filtered by the keywords
• Extract the MF URIs from the Tweets and parse the media fragment
information
• Use the Media Fragment Indexing Framework to publish
Tweets as media fragment annotations
• Embed rich snippets in the snapshot pages
• Create a sitemap for Google to crawl the snapshot pages
• When a user searches Google for keywords from a Tweet, the result
link leads to the video at the corresponding start time
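The sitemap step can be sketched with the standard library. It is assumed here that the sitemap lists the pretty “#!” URLs (as in step 1 of the solution) and that duplicates, for example from retweets, are dropped first; the function name and the example.org URL are illustrative only.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Serialise the distinct URLs as a sitemap document."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    # dict.fromkeys removes duplicates while keeping the first-seen order.
    for url in dict.fromkeys(urls):
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


print(build_sitemap([
    "http://example.org/replay/1#!t=3,7",
    "http://example.org/replay/1#!t=3,7",  # duplicate, e.g. from a retweet
]))
```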
The Detailed Workflow
Indexing Results (1)
• Monitored the Twitter stream non-stop for 50 hours
• Filter phrase: “youtube, dailymotion, vimeo, vbox7, viddler”
• 5,779,858 Tweets examined, 5,269,742 contain URLs
• 32,754 Tweets contain MF URIs, 32,796 MF URIs in total
• Media Fragment URIs shared on each website:
Website        No. of MF URIs    %
YouTube        32,666            99.604
Dailymotion    101               0.308
Vbox7          0                 0
Viddler        0                 0
Vimeo          29                0.088
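The percentage column follows directly from the counts, which a short check confirms. The snippet only recomputes figures already on the slide; the variable names are arbitrary.

```python
# Counts of MF URIs per website, as reported in the table.
counts = {"YouTube": 32666, "Dailymotion": 101, "Vbox7": 0,
          "Viddler": 0, "Vimeo": 29}

total = sum(counts.values())  # should match the 32,796 MF URIs in total
shares = {site: round(100 * n / total, 3) for site, n in counts.items()}

print(total)
for site, share in shares.items():
    print(f"{site}: {share}")
```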
Indexing Results (2)
• 13,088 distinct videos were found
• 17,854 distinct MF URIs for the sitemap
– Many Tweets share the same video, but different
fragments
– Many retweets
– Some videos are not available in the UK
• 17,479 URLs (97.9%) in the sitemap have been indexed by
Google
• Only 775 URLs are indexed as VideoObject, even though
rich snippets are embedded in all snapshot pages
Demo
• Search “Chris Eppstein”
• The result opens the landing page, and the video
starts playing from the time indicated in the Tweet
containing the keywords “Chris Eppstein”
Conclusions and Future Work
Conclusions
• Introduced the Media Fragment Indexing Framework
• Proposed using social media to acquire more
annotations for media fragments
• Surveyed the MF URI implementations on major video sharing
platforms
• Twitter Media Fragment Indexer
– Monitors the Tweet stream and automatically creates media
fragment annotations
– Indexes media fragments in Google
– YouTube is by far the dominant domain for sharing media
fragments on Twitter
Future Work
• Assess how valid Tweets are as media fragment
annotations
– much noisy and unrelated text
– many retweets
• Experiment on a larger scale (billions of Tweets and
continuous monitoring)
• Extend the methodology to other media fragment
annotations, such as timed text
• Extract named entities from Tweets and further link media
fragments to the Linked Data Cloud
Questions?
