MediaEval2020
Predicting Media Memorability
Task Overview
Alba García Seco de Herrera, Rukiye Savran Kiziltepe, Jon Chamberlain, Mihai Gabriel Constantin,
Claire-Hélène Demarty, Faiyaz Doctor, Bogdan Ionescu, Alan Smeaton
Presentation Video
Task Description
Goal: predicting how memorable a video is to viewers
15/12/2020 MediaEval2020 2
• Automatically predicting short-term and long-term memorability
• TRECVid 2019 Video to Text dataset [1]
• Videos with sound and more action
1. Awad, G., Butt, A.A., Lee, Y., Fiscus, J., Godil, A., Delgado, A., Smeaton, A.F. and Graham, Y. TRECVID 2019: An evaluation campaign to benchmark video activity detection, video captioning and matching, and video search & retrieval. 2019.
Annotation Tool
• Short-term memorability: a few minutes after memorization
• Long-term memorability: 24–72 hours later
Romain Cohendet, Claire-Hélène Demarty, Ngoc Duong, and Martin Engilberge. VideoMem: Constructing, Analyzing, Predicting Short-term and Long-term Video Memorability. Proceedings of the IEEE International Conference on Computer Vision. 2019.
Video Memorability Game
Annotation Protocol
Step 1 (180 videos)
• 40 targets – repeated after a few minutes
• 60 fillers – non-target videos
• 20 vigilance fillers – repeated quickly to monitor attention
Step 2 (120 videos)
• 40 targets – randomly chosen from non-vigilance fillers
• 80 fillers – randomly chosen new videos
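The counts in the two steps can be checked with a short sketch. It assumes that the 180 videos of Step 1 are viewing slots in which each target and each vigilance filler appears twice (2 × 40 + 60 + 2 × 20 = 180); this reading is an inference from the numbers above, not something the slides state. The function names, the `video_N` identifiers and the random ordering are hypothetical, and the actual delays between repetitions are not modelled.

```python
import random


def build_step1(video_ids, n_targets=40, n_fillers=60, n_vigilance=20, seed=0):
    """Pick the Step 1 videos and return (slots, targets, fillers, vigilance)."""
    rng = random.Random(seed)
    chosen = rng.sample(video_ids, n_targets + n_fillers + n_vigilance)
    targets = chosen[:n_targets]
    fillers = chosen[n_targets:n_targets + n_fillers]
    vigilance = chosen[n_targets + n_fillers:]
    # Assumption: targets and vigilance fillers are shown twice, fillers once.
    slots = targets * 2 + fillers + vigilance * 2
    rng.shuffle(slots)  # repetition delays are NOT modelled in this sketch
    return slots, targets, fillers, vigilance


def build_step2(step1_fillers, unseen_ids, n_targets=40, n_fillers=80, seed=0):
    """Step 2 targets come from Step 1 non-vigilance fillers; fillers are new."""
    rng = random.Random(seed)
    targets = rng.sample(step1_fillers, n_targets)
    new_fillers = rng.sample(unseen_ids, n_fillers)
    slots = targets + new_fillers
    rng.shuffle(slots)
    return slots, targets


# Hypothetical pool of 1500 video identifiers.
pool = [f"video_{i}" for i in range(1500)]
slots1, targets1, fillers1, vigilance1 = build_step1(pool)
seen = set(targets1) | set(fillers1) | set(vigilance1)
unseen = [v for v in pool if v not in seen]
slots2, targets2 = build_step2(fillers1, unseen)
print(len(slots1), len(slots2))  # 180 120
```

Under this reading, the slot totals match the figures given for both steps.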
Dataset Description
• TRECVid 2019
(Video to Text)
• 1500 videos
• 1000 training set
• 500 test set
Dataset Description
• AlexNetFC7
• HOG
• HSVHist
• RGBHist
• LBP
• VGGFC7
• C3D
• Text descriptions
• Annotations
• Response time
• Key press
• Video position
Short-term memorability score
Long-term memorability score
Examples (Low Short-term and Long-term Memorability)
• At football game, the ball is kicked past end zone and woman is knocked down from her knees
• football player are playing at a football field.
• At a college football game, during a kickoff, the kicker kicks the ball over the endzone and hits a spectator in the face while they are trying to catch it.
• a person is injured when the football player kicked a ball across a field during a game
• Football kicks football during a day game and a cheerleader tries to catch it and ball hits her in the head.
Examples (High Short-term and Long-term Memorability)
• Two boys wearing white shirts on playground swings
• Two young men, are on a swing and yell, outdoors.
Results (Mean Spearman's Rank Correlation Scores)
• 14 teams registered
• 9 teams submitted 28 runs
• 8 papers
• Spearman’s rank correlation
            Short-term                Long-term
            Spearman  Pearson  MSE    Spearman  Pearson  MSE
Mean        0.058     0.066    0.013  0.036     0.043    0.051
Variance    0.002     0.002    0.000  0.002     0.001    0.000
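The three measures reported in the table can be computed as in the following sketch, written in plain Python. Spearman's rank correlation is taken here as the Pearson correlation of the ranks, with tied values receiving the average of their positions. The score lists are made-up illustrations, not task data, and the function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mse(x, y):
    """Mean squared error between ground truth and predictions."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


# Made-up memorability scores for five videos (illustration only).
ground_truth = [0.90, 0.75, 0.80, 0.60, 0.95]
predicted = [0.85, 0.80, 0.70, 0.65, 0.90]

print(round(spearman(ground_truth, predicted), 3))  # 0.9
print(round(mse(ground_truth, predicted), 4))       # 0.004
```

Spearman's correlation depends only on the ordering of the predictions, so a run can rank videos reasonably well and still have a poor MSE, or the reverse.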
Results (Official Results on Test-set for Teams’ all runs)

                                 Short-term                Long-term
Team         Run                 Spearman  Pearson  MSE    Spearman  Pearson  MSE
CUC_DMT      run1-required       0.06      0.055    0.01   0.049     0.05     0.05
DCU-Audio    run1-required       0.054     0.044    0.01   0.113     0.121    0.05
DCU-Audio    run2-required       0.05      0.072    0.01   0.059     0.071    0.05
DCU-Audio    run3-required       -         -        -      0.109     0.119    0.05
DCU-Audio    run4-required       0.076     0.092    0.01   0.041     0.058    0.05
DCU-Audio    memento10k          0.137     0.13     0.01   -         -        -
DCU@ML-Labs  run1-required       0.034     0.078    0.1    -0.01     0.022    0.09
Essex-NLIP   HSV-Run1            0.042     0.042    0.01   0.032     0.016    0.05
Essex-NLIP   RGB-Run2            -0.003    -0.026   0.01   0.043     0.042    0.04
Essex-NLIP   RGB-Run3            -0.015    -0.012   0.01   0.032     0.037    0.04
Essex-NLIP   RGB-HSV-Run4        -0.022    -0.001   0.01   -0.017    -0.012   0.04
Essex-NLIP   Score-Run5          0.02      0.054    0.01   -0.054    -0.036   0.05
GTH-UPM      run1-required       0.016     0.011    0.01   -0.041    -0.028   0.05
KT-UPB       run0-required       0.007     0.029    0.01   0.028     0.033    0.05
KT-UPB       run1-required       -0.01     -0.019   0.01   0.012     0.021    0.05
KT-UPB       run2-required       0.053     0.085    0.01   0.037     0.033    0.05
KT-UPB       run3-required       0.05      0.053    0.01   0.014     0.017    0.05
MeMAD        run1-audiovisual    0.099     0.09     0.01   0.077     0.085    0.06
MeMAD        run2-vilbert        0.098     0.085    0.01   -0.017    0.011    0.06
MeMAD        run3-text           0.073     0.091    0.01   0.019     0.049    0.06
MeMAD        run4-all-SLT        0.101     0.09     0.01   0.078     0.085    0.06
MeMAD        run5-all-required   0.101     0.09     0.01   0.067     0.066    0.05
MG-UCB       run1-required       0.136     0.145    0.01   0.012     0.012    0.05
MG-UCB       run7                0.102     0.127    0.01   0.056     0.059    0.04
MG-UCB       run8                0.091     0.095    0.01   0.077     0.068    0.05
MG-UCB       run9                0.085     0.124    0.01   0.044     0.048    0.05
MG-UCB       run42               0.116     0.144    0.01   0.076     0.069    0.05
MMSys        run                 0.007     0.01     0.01   0.048     0.032    0.05
Results (Official Results on Test-set for Teams’ best runs – Short-term)

Team         Run                                  Short-term  Approach
DCU-Audio    memento10k                           0.137       Audio Gestalt => Multimodal Deep Learning-based Late Fusion (Memento10K)
MG-UCB       run1-required                        0.136       Visual, Audio, Textual => SVR, BR, GRU => Weighted Late Fusion
MeMAD        run4-all-SLT and run5-all-required   0.101       Visual, Audio, Textual, Visiolinguistic Features => Weighted Average
CUC_DMT      run1-required                        0.06        Multi-level Encoding and Captions => Gradient Boosting, Random Forest, Neural Network
KT-UPB       run2-required                        0.053       C3D => Random Forest
Essex-NLIP   HSV-Run1                             0.042       HSV => Random Forest
DCU@ML-Labs  run1-required                        0.034       C3D => SemNET (Memento10K)
GTH-UPM      run1-required                        0.016       Multimodal Late Fusion of Self-Attention => SVR => Bidirectional LSTM
MMSys        run                                  0.007       -
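Several of the approaches listed combine per-modality predictions through weighted late fusion. The following is a minimal sketch of that general idea only, not any team's actual implementation: the modality names, weights and scores are hypothetical, and each model is assumed to have already produced one memorability score per video.

```python
def weighted_late_fusion(predictions_by_model, weights):
    """Fuse per-model score lists into one list using a weighted average.

    predictions_by_model: dict mapping a model name to its list of scores,
        one score per video, all lists in the same video order.
    weights: dict mapping the same model names to non-negative weights.
    """
    total_weight = sum(weights[name] for name in predictions_by_model)
    n_videos = len(next(iter(predictions_by_model.values())))
    fused = []
    for i in range(n_videos):
        weighted_sum = sum(
            weights[name] * scores[i]
            for name, scores in predictions_by_model.items()
        )
        fused.append(weighted_sum / total_weight)
    return fused


# Hypothetical per-modality predictions for two videos.
predictions = {
    "visual": [0.8, 0.6],
    "audio": [0.9, 0.5],
    "text": [0.7, 0.7],
}
# Hypothetical weights, e.g. chosen from validation performance.
weights = {"visual": 0.5, "audio": 0.3, "text": 0.2}

fused_scores = weighted_late_fusion(predictions, weights)
print([round(s, 2) for s in fused_scores])  # [0.81, 0.59]
```

Dividing by the total weight keeps the fused score on the same scale as the inputs even when the weights do not sum to one.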
Results (Official Results on Test-set for Teams’ best runs – Long-term)

Team         Run             Long-term  Approach
DCU-Audio    run1-required   0.113      Audio Gestalt => Multimodal Deep Learning-based Late Fusion (Memento10K)
MeMAD        run4-all-SLT    0.078      Visual, Audio, Textual, Visiolinguistic Features => Weighted Average
MG-UCB       run8            0.077      Visual, Audio, Textual => SVR, BR, GRU => Weighted Late Fusion
CUC_DMT      run1-required   0.049      Multi-level Encoding and Captions => Gradient Boosting, Random Forest, Neural Network
MMSys        run             0.048      -
Essex-NLIP   RGB-Run2        0.043      RGB => Random Forest
KT-UPB       run2-required   0.037      C3D => Random Forest
DCU@ML-Labs  run1-required   -0.01      C3D => SemNET (Memento10K)
GTH-UPM      run1-required   -0.041     Multimodal Late Fusion of Self-Attention => SVR => Bidirectional LSTM
Conclusion
• Short-term memorability – better results
• Long-term memorability – results slightly lower
• The best results:
• DCU-Audio (0.137; 0.113)
• MG-UCB (0.136; 0.077)
• MeMAD (0.101; 0.078)
• Audio and captions
• Fusion
• Deep learning techniques
• More annotations
THANK YOU!

Overview of MediaEval 2020 Predicting Media Memorability task: What Makes a Video Memorable?
