This document summarizes research on predicting media interestingness with deep learning. For image interestingness, the researchers report a mean average precision (MAP) of 0.2396 using a ResNet50 model fine-tuned with class weights, dropout, and data augmentation. For video interestingness, they extracted C3D features from video clips and fed them to an LSTM network, reaching a MAP of 0.1541. Both results improve on the 2016 baselines.
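Because both results are reported as mean average precision, a small sketch of how that metric is commonly computed may help when reading the numbers. The summary does not state the exact evaluation protocol, so this assumes binary interestingness labels, one ranked list per video, and no rank cutoff. The function names and the toy data are hypothetical and are not taken from the research.

```python
from typing import Dict, Sequence, Tuple


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision for one ranked list.

    labels: 1 for an interesting item, 0 otherwise (assumed binary).
    scores: predicted interestingness, higher means more interesting.
    """
    # Rank items by predicted score, highest first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            # Precision at this rank, accumulated only at relevant items.
            precision_sum += hits / rank
    # A list with no interesting items contributes 0 by convention here.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    per_video: Dict[str, Tuple[Sequence[int], Sequence[float]]]
) -> float:
    """Mean of the per-video average precision values."""
    aps = [average_precision(labels, scores) for labels, scores in per_video.values()]
    return sum(aps) / len(aps) if aps else 0.0


if __name__ == "__main__":
    # Toy data, purely illustrative.
    toy = {
        "video_a": ([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]),
        "video_b": ([0, 1], [0.6, 0.4]),
    }
    print(round(mean_average_precision(toy), 4))
```

On this scale a score of 1.0 means every interesting item is ranked above every uninteresting one, which puts the reported values of 0.2396 and 0.1541 in context.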