Predicting Media Interestingness Task
Overview
Claire-Hélène Demarty – Technicolor
Mats Sjöberg – University of Helsinki
Bogdan Ionescu – University Politehnica of Bucharest
Thanh-Toan Do – Singapore University of Science
Hanli Wang – Tongji University
Ngoc Q.K. Duong – Technicolor
Frédéric Lefebvre – Technicolor
MediaEval 2016 Workshop
October 20-21, 2016
Interestingness?
Are these interesting images?
Definition?
• Subjective
• Semantic
• Perceptual
n Derives from a use case at Technicolor
n Helping professionals to illustrate a Video on Demand (VOD) web site by
selecting some interesting frames and/or video excerpts for the posted
movies.
n The frames and excerpts should be suitable in terms of helping a user to
make his/her decision about whether he/she is interested in watching the
underlying movie.
n Two subtasks -> Image and Video
n Image subtask: given a set of keyframes extracted from a movie, …
n Video subtask: given the video shots of a movie, …
… automatically identify those images/shots that viewers report to be the most
interesting in the given movie.
n Binary classification task on a per movie basis…
… but confidence values are also required.
5
Task definition
12/7/16
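The per-movie binary decision with accompanying confidence values can be sketched as follows. This is only an illustration: the slides do not say how decisions should be derived from confidences, so the top-fraction rule, the `top_fraction` parameter, and the function name `label_movie` are assumptions made for this example.

```python
from typing import Dict, List, Tuple


def label_movie(confidences: Dict[str, float],
                top_fraction: float = 0.1) -> List[Tuple[str, int, float]]:
    """Rank one movie's items by confidence and flag the top fraction as interesting.

    Hypothetical decision rule: the task only asks for a binary label plus a
    confidence value per image/shot, decided movie by movie.
    """
    ranked = sorted(confidences.items(), key=lambda kv: kv[1], reverse=True)
    # Always flag at least one item so every movie gets a positive prediction.
    n_positive = max(1, round(top_fraction * len(ranked)))
    return [(item, 1 if rank < n_positive else 0, score)
            for rank, (item, score) in enumerate(ranked)]


# Usage with made-up shot names and scores for a single movie.
example = {"shot_1": 0.2, "shot_2": 0.9, "shot_3": 0.5, "shot_4": 0.1}
decisions = label_movie(example, top_fraction=0.25)
```

Deciding inside each movie, rather than with one global threshold, matches the "per-movie basis" wording: interestingness is judged relative to the other shots of the same trailer.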
n From Hollywood-like movie trailers
n Manual segmentation of shots
n Extraction of middle key-frame of each shot
6
Dataset & additional features
12/7/16
Development Set Test Set
Trailer # 52 26
Total % interesting Total % interesting
Shot # 5054 8.3 2342 9.6
Key-frame # 5054 9.4 2342 10.3
n Precomputed content descriptors:
n Low-level: denseSift, HoG, LBP, GIST, HSV color histograms, MFCC, fc7 and
prob layers from AlexNet
n Mid-level: face detection and tracking-by-detection
Manual annotations
Annotation pipeline: pairs -> pair comparison protocol -> aggregation into rankings -> rankings -> binary decision (manual thresholding)
Annotators:
• >310 persons for video
• >100 persons for image
• From 29 countries
Thank you Mats! Thanks to all of you!
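The annotation pipeline (pairwise comparisons, aggregation into a ranking, then a thresholded binary decision) can be sketched as below. The slides do not name the aggregation model, so this example swaps in a plain win-fraction score as a stand-in; the function names and the `threshold` value are likewise hypothetical, and the real threshold was set manually by the organizers.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def rank_from_pairs(pairs: Iterable[Tuple[str, str]]) -> List[Tuple[str, float]]:
    """Aggregate (winner, loser) judgments into a ranking by fraction of comparisons won.

    Win fraction is an assumed stand-in, not the organizers' aggregation method.
    """
    wins: Dict[str, int] = defaultdict(int)
    seen: Dict[str, int] = defaultdict(int)
    for winner, loser in pairs:
        wins[winner] += 1
        seen[winner] += 1
        seen[loser] += 1
    scores = {item: wins[item] / seen[item] for item in seen}
    # Highest score first; ties broken alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def binarize(ranking: List[Tuple[str, float]], threshold: float) -> Dict[str, int]:
    """Turn ranking scores into interesting (1) / not interesting (0) labels."""
    return {item: int(score >= threshold) for item, score in ranking}


# Usage with made-up judgments: each tuple means "first item judged more interesting".
judgments = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("a", "d")]
ranking = rank_from_pairs(judgments)
labels = binarize(ranking, threshold=0.5)
```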
Required runs
• Image subtask: visual information only, no external data
• Video subtask: audio and visual information, no external data
External data IS:
• Additional datasets and annotations dedicated to interestingness prediction
• Pre-trained models, features, and detectors obtained from such dedicated datasets
• Additional metadata found on the internet about the provided content
External data IS NOT:
• CNN features generated on generic datasets not dedicated to interestingness prediction
n Official measure:
Ø Mean Average Precision (over all trailers)
n Additional metrics are computed:
n False alarm rate, miss detection rate, precision, recall, F-measure, etc.
9
Evaluation metrics
12/7/16
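As a rough illustration of these metrics, the sketch below computes average precision for one trailer from confidence values, averages it over trailers to obtain MAP, and derives the additional rates from binary decisions. It uses the standard textbook definitions; the official evaluation tool may differ in details such as tie handling, and all function names here are assumptions of this example.

```python
from typing import Dict, Sequence, Tuple


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AP of one ranked list: mean of precision@k over the ranks k holding a positive."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
        per_trailer: Dict[str, Tuple[Sequence[float], Sequence[int]]]) -> float:
    """MAP: average of the per-trailer AP values."""
    aps = [average_precision(s, y) for s, y in per_trailer.values()]
    return sum(aps) / len(aps)


def _ratio(num: int, den: int) -> float:
    """Division that returns 0.0 instead of failing on an empty denominator."""
    return num / den if den else 0.0


def binary_metrics(predicted: Sequence[int], labels: Sequence[int]) -> Dict[str, float]:
    """Precision, recall, F-measure, false alarm rate and miss rate from 0/1 decisions."""
    tp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if p == 0 and y == 1)
    tn = sum(1 for p, y in zip(predicted, labels) if p == 0 and y == 0)
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": _ratio(2 * precision * recall, precision + recall)
        if precision + recall else 0.0,
        "false_alarm_rate": _ratio(fp, fp + tn),  # negatives wrongly flagged
        "miss_rate": _ratio(fn, fn + tp),         # positives that were missed
    }
```

MAP depends only on how each trailer's items are ordered by confidence, which is why confidence values are needed alongside the binary labels; the additional metrics are computed from the binary labels alone.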
Task participation
[Bar chart "Task Participation": number of teams per stage (Registrations, Returned agreements, Submitting teams, Workshop); vertical axis from 0 to 35]
• Registrations:
  • 31 teams
  • 16 countries
  • 3 ‘experienced’ teams
• Submissions: 12 teams
  • 9 teams submitted to both subtasks
  • 2 teams to the image subtask only
  • 1 team to the video subtask only
Official results – Image subtask – 27 runs
Runs MAP Official ranking
me16in_tudmmc2_image_histface 0.2336 TUDMMC
me16in_technicolor_image_run1_SVM_rbf* 0.2336 Technicolor
me16in_technicolor_image_run2_DNNresampling06_100* 0.2315 Technicolor
me16in_MLPBOON_image_run5 0.2296 MLPBOON
me16in_BigVid_image_run5FusionCNN 0.2294 BigVid
me16in_MLPBOON_image_run1 0.2205 MLPBOON
me16in_tudmmc2_image_hist 0.2202 TUDMMC
me16in_MLPBOON_image_run4 0.217 MLPBOON
me16in_HUCVL_image_run1 0.2125 HUCVL
me16in_HUCVL_image_run2 0.2121 HUCVL
me16in_UITNII_image_FA 0.2115 UITNII
me16in_RUC_image_run2 0.2035 RUC
me16in_MLPBOON_image_run2 0.2023 MLPBOON
me16in_HUCVL_image_run3 0.2001 HUCVL
me16in_RUC_image_run3 0.1991 RUC
me16in_RUC_image_run1 0.1987 RUC
me16in_ethcvl1_image_run2 0.1952 ETHCVL
me16in_MLPBOON_image_run3 0.1941 MLPBOON
me16in_HKBU_image_baseline 0.1868 HKBU
me16in_ethcvl1_image_run1 0.1866 ETHCVL
me16in_ethcvl1_image_run3 0.1858 ETHCVL
me16in_HKBU_image_drbaseline 0.1839 HKBU
me16in_BigVid_image_run4SVM 0.1789 BigVid
me16in_UITNII_image_V1 0.1773 UITNII
me16in_lapi_image_runf1* 0.1714 LAPI
me16in_UNIGECISA_image_ReglineLoF 0.1704 UNIGECISA
BASELINE (on testset) 0.1655
me16in_lapi_image_runf2* 0.1398 LAPI
* organizers
Official results – Video subtask – 28 runs
* organizers
Runs MAP Official ranking
me16in_recod_video_run1 0.1815 RECOD
me16in_recod_video_run1_old 0.1753 RECOD
me16in_HKBU_video_drbaseline 0.1735 HKBU
me16in_UNIGECISA_video_RegsrrLoF 0.171 UNIGECISA
me16in_RUC_video_run2 0.1704 RUC
me16in_UITNII_video_A1 0.169 UITNII
me16in_recod_video_run4 0.1656 RECOD
me16in_RUC_video_run1 0.1647 RUC
me16in_UITNII_video_F1 0.1641 UITNII
me16in_lapi_video_runf5 0.1629 LAPI
me16in_technicolor_video_run5_CSP_multimodal_80_epoch7 0.1618 Technicolor
me16in_recod_video_run2 0.1617 RECOD
me16in_recod_video_run3 0.1617 RECOD
me16in_ethcvl1_video_run2 0.1574 ETHCVL
me16in_lapi_video_runf3 0.1574 LAPI
me16in_lapi_video_runf4 0.1572 LAPI
me16in_tudmmc2_video_histface 0.1558 TUDMMC
me16in_tudmmc2_video_hist 0.1557 TUDMMC
me16in_BigVid_video_run3RankSVM 0.154 BigVid
me16in_HKBU_video_baseline 0.1521 HKBU
me16in_BigVid_video_run2FusionCNN 0.1511 BigVid
me16in_UNIGECISA_video_RegsrrGiFe 0.1497 UNIGECISA
BASELINE (on testset) 0.1496
me16in_BigVid_video_run1SVM 0.1482 BigVid
me16in_technicolor_video_run3_LSTM_U19_100_epoch5 0.1465 Technicolor
me16in_recod_video_run5 0.1435 RECOD
me16in_UNIGECISA_video_SVRloAudio 0.1367 UNIGECISA
me16in_technicolor_video_run4_CSP_video_80_epoch9 0.1365 Technicolor
me16in_ethcvl1_video_run1 0.1362 ETHCVL
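To put the two result tables in perspective, the snippet below computes how far the best run in each subtask sits above the baseline. The MAP values are copied from the tables; the helper name `relative_gain` is an assumption of this example.

```python
def relative_gain(best_map: float, baseline_map: float) -> float:
    """Relative improvement of the best run over the baseline, as a fraction."""
    return (best_map - baseline_map) / baseline_map


# (best MAP, baseline MAP) taken from the official results tables.
RESULTS = {
    "image": (0.2336, 0.1655),
    "video": (0.1815, 0.1496),
}

for subtask, (best, baseline) in RESULTS.items():
    gain = relative_gain(best, baseline)
    print(f"{subtask}: best {best:.4f} vs baseline {baseline:.4f} "
          f"({gain * 100:.1f}% relative gain)")
```

The best image run is about 41% above the baseline and the best video run about 21% above it, while absolute MAP stays below 0.25 in both subtasks.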
n On the task itself?
n Image interestingness is NOT video interestingness
n Issue with the video dataset (needs more interations? needs more data
samples?)
n Overall low map values: room for improvment!
n On the participants systems?
n This year’s trend? No trend!
n Classic machine learning, deep learning systems… but also rule-based systems
n Some multimodal (audio, video, text), some temporal… and some not.
n (Mostly) No use of external data
n Simple systems did as well (better…) than sophisticated systems
n Dataset unbalance: an issue?
n Dataset size: penalizing deep learning systems?
13
What we have learned
12/7/16
Thank you!





