Predicting Media Interestingness

Deep Learning for Multimedia Processing
Lluc Cardoner

Outline
● Motivation
● Predicting image interestingness
● Results
● Predicting video interestingness
● Results

What is interesting?
[Figure: example images labeled "Not interesting" and "Interesting"]

Problem definition
Image / Video → Interesting or Not interesting

MediaEval conclusions 2016
Features
● Image: CNN features
● Video: Multi-modal (visual + audio)
Models
● SVM mostly used
● Few end-to-end deep learning architectures
● Video: time dependencies
Demarty, Claire-Hélène, et al. "Predicting Interestingness of Visual Content." Visual Content Indexing and Retrieval with Psycho-Visual Models (2017).

End-to-end deep learning approach

Dataset 2016: Data
● 52 movie trailers - development
● 26 movie trailers - testing
Total: 13 GB

Dataset 2016: Frames
[Figure: a movie trailer split into Segment 1, Segment 2, ..., Segment N, with Frame 1 to Frame 6 shown below the segments]

Dataset: Ground truth
● Classification: 2 classes
○ 0 - not interesting
○ 1 - interesting
● Confidence values
○ Between 0 and 1
● Rank of the frame or segment in the video
Interesting: 1.0 → 1
Not Interesting: 0.026 → 0

Predicting image interestingness
● ResNet50
○ Transfer learning
○ Fine tuning
He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.

Adding layers
Problem: overfitting

Data augmentation
● Image Data Generator
○ Horizontal flip
○ Shuffling
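
The two operations listed above can be illustrated without any deep learning library. This is only a minimal sketch of what a horizontal flip and per-epoch shuffling do; the helper names `horizontal_flip` and `augmented_epoch`, the flip probability, and the list-of-rows image layout are assumptions for illustration, not the generator used in the slides.

```python
import random

def horizontal_flip(image):
    """Mirror an image stored as a list of pixel rows (left becomes right)."""
    return [list(reversed(row)) for row in image]

def augmented_epoch(images, labels, flip_probability=0.5, seed=None):
    """Return one epoch of (image, label) pairs: each image is flipped
    with the given probability, then the pairs are shuffled."""
    rng = random.Random(seed)
    samples = []
    for image, label in zip(images, labels):
        if rng.random() < flip_probability:
            image = horizontal_flip(image)
        samples.append((image, label))
    rng.shuffle(samples)
    return samples

# Tiny made-up 2x3 "images" with their class labels.
images = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [0, 0, 0]]]
labels = [1, 0]
epoch = augmented_epoch(images, labels, seed=0)
```

The label travels with its image through both steps, so flipping and shuffling never change which class a sample belongs to.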

Unbalanced classes
● Class weights
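
The slides do not say how the class weights were chosen. One common heuristic, sketched here as an assumption, weights each class inversely to its frequency so that the rare "interesting" class contributes as much to the loss as the frequent one. The function name `balanced_class_weights` and the 9:1 label split are made up for illustration.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Made-up imbalance: 90 "not interesting" (0) and 10 "interesting" (1) samples.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
print(weights)
```

With this heuristic the minority class receives the larger weight, and a perfectly balanced dataset gives every class a weight of 1.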

Evaluation metric
● Mean Average Precision (MAP)
For both subtasks
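
As a reminder of how MAP is usually defined, the sketch below ranks the items of each video by predicted score, computes average precision per video, and averages over videos. It follows the textbook definition; the official evaluation tool may differ in details such as tie handling, and the function names and toy data are assumptions.

```python
def average_precision(labels, scores):
    """AP for one video: labels are 0/1 ground truth, scores rank the items."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant item
    return precision_sum / hits if hits else 0.0

def mean_average_precision(videos):
    """MAP over a list of (labels, scores) pairs, one pair per video."""
    return sum(average_precision(l, s) for l, s in videos) / len(videos)

# Made-up example with two videos.
videos = [
    ([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]),  # AP = (1/1 + 2/3) / 2
    ([0, 1], [0.6, 0.4]),                  # AP = 1/2
]
map_score = mean_average_precision(videos)
```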

Results: Image interestingness
Threshold: 0.5

Id   MAP      Architecture
25   0.1392   Train new layers and 2 last layers from ResNet
27   0.1728   Augment just class 1 and balanced
30   0.1478   Dropout of 0.5
31   0.1177   Class weights + dropout + horizontal flip
37   0.1564   Class weights + dropout + flip, shift, zoom
39   0.1402   Class weights + dropout + flip, shift, zoom + 2 ResNet layers

2016         MAP
Baseline     0.1655
Top result   0.2336

Results: Image interestingness
     Static threshold   Dynamic threshold
Id   MAP                Threshold   MAP
25   0.1392             0.1577      0.1932
27   0.1728             0.4875      0.1909
30   0.1478             0.1572      0.2243
31   0.1177             0.5066      0.2396
37   0.1564             0.5295      0.2362
39   0.1402             0.1336      0.1795

2016         MAP
Baseline     0.1655
Top result   0.2336
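
The slides do not explain how the dynamic threshold was selected for each run. The sketch below only shows the step both variants share: turning predicted scores into 0/1 labels with a given threshold. The function name, the `>=` comparison, and the scores are assumptions; 0.1577 is the dynamic threshold reported for run 25.

```python
def apply_threshold(scores, threshold=0.5):
    """Label a score 1 (interesting) when it reaches the threshold, else 0."""
    return [1 if score >= threshold else 0 for score in scores]

scores = [0.62, 0.18, 0.49, 0.05]  # made-up predicted scores

static_labels = apply_threshold(scores)           # static threshold of 0.5
dynamic_labels = apply_threshold(scores, 0.1577)  # threshold reported for run 25
```

A lower threshold marks more items as interesting, which matters when the predicted scores cluster well below 0.5.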

Dataset 2016: Segments
[Figure: a movie trailer split into Segment 1, Segment 2, ..., Segment N]

Predicting video interestingness
● Extract features: C3D
● Training LSTM network

3D Convolutional network
Montes, Alberto, Amaia Salvador, and Xavier Giro-i-Nieto. "Temporal activity detection in untrimmed videos with recurrent neural networks." NIPS Workshop Large Scale Computer Vision Systems 2016.

Extract features
● Preprocess
○ Clips
● Feature extraction
○ 3D convolutional network
● Label mapping
○ Feature vector
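
The preprocessing step can be sketched as cutting a video's frames into fixed-length clips, using the 16-frame clip length shown on the label mapping slide. Whether the original pipeline used overlapping clips or kept a trailing partial clip is not stated, so the non-overlapping, drop-the-remainder behavior and the function name are assumptions.

```python
def split_into_clips(num_frames, clip_length=16):
    """Return (start, end) frame index pairs for consecutive full clips.

    `end` is exclusive. Trailing frames that do not fill a whole clip
    are dropped (an assumption, not confirmed by the slides).
    """
    return [
        (start, start + clip_length)
        for start in range(0, num_frames - clip_length + 1, clip_length)
    ]

clips = split_into_clips(40)  # a made-up video of 40 frames
print(clips)
```

Each clip would then be passed through the 3D convolutional network to obtain one feature vector per clip.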

Label mapping
Segments: S1 {label 0, value 0.50}, S2 {label 1, value 0.60}
Clip (16 frames): 80% in S1, 20% in S2
Clip value: 0.8 × 0.50 + 0.2 × 0.60 = 0.52
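
The arithmetic above reads as an overlap-weighted average: a clip that spans two segments takes each segment's interestingness value in proportion to the share of the clip lying inside it. A minimal sketch of that reading follows; the function name and the (fraction, value) input format are assumptions.

```python
def clip_interestingness(overlaps):
    """Weighted average of segment interestingness values.

    `overlaps` is a list of (fraction_of_clip, segment_value) pairs;
    the fractions are expected to sum to 1.
    """
    total = sum(fraction for fraction, _ in overlaps)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("overlap fractions must sum to 1")
    return sum(fraction * value for fraction, value in overlaps)

# Slide example: 80% of the clip lies in S1 (0.50), 20% in S2 (0.60).
value = clip_interestingness([(0.8, 0.50), (0.2, 0.60)])
print(round(value, 2))
```

A clip that lies entirely inside one segment simply inherits that segment's value.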

Results: Video interestingness
2016          MAP
Baseline      0.1496
Top result    0.1815
Technicolor   0.1365

Id   MAP
65   0.1541

[Figure: interestingness value plotted per clip]

Shen, Yuesong, Claire-Hélène Demarty, and Ngoc QK Duong. "Technicolor@ MediaEval 2016 Predicting Media Interestingness Task." MediaEval (2016).

Conclusions
Predicting image interestingness              MAP
Class weights + dropout + horizontal flip     0.2396
Class weights + dropout + flip, shift, zoom   0.2362

Conclusions
Static threshold   Dynamic threshold
MAP                MAP
0.1392             0.1932
0.1728             0.1909
0.1478             0.2243
0.1177             0.2396
0.1564             0.2362
0.1402             0.1795

Conclusions
Image
Baseline: 0.1655
Top result 2016: 0.2336
Our result: 0.2396

Video
Baseline: 0.1496
Top result 2016: 0.1815
Our result: 0.1541
Technicolor: 0.1365

https://github.com/lluccardoner/MediaInterestingness
