Copyright©2014 NTT corp. All Rights Reserved.
CVPR2014 reading
“Reconstructing storyline graphs for image
recommendation from web community photos”
Akisato Kimura <akisato@ieee.org> [@_akisato]
1
1-page summary
• Creating a storyline graph from a set of photo sequences
(and optionally friendship graphs) for a topic of interest.
• A photo sequence
= list[ zip( photos, time stamps ) ], created by a single user.
• A storyline
= a series of events with chronological or causal relations,
represented by a directed graph.
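A minimal sketch of the two data structures named above, assuming photos are identified by file name. The helper names (make_photo_sequence, add_transitions) are hypothetical, and counting consecutive transitions is only an illustration of a directed graph, not the estimation method of the paper.

```python
from datetime import datetime

def make_photo_sequence(photos, time_stamps):
    """Pair each photo with its time stamp and sort chronologically."""
    return sorted(zip(photos, time_stamps), key=lambda pair: pair[1])

def add_transitions(graph, sequence):
    """Count consecutive photo-to-photo transitions as directed edges."""
    for (src, _), (dst, _) in zip(sequence, sequence[1:]):
        graph.setdefault(src, {})
        graph[src][dst] = graph[src].get(dst, 0) + 1
    return graph

# One user's photo sequence (illustrative file names and times).
seq = make_photo_sequence(
    ["parade.jpg", "race.jpg", "fireworks.jpg"],
    [datetime(2014, 7, 4, 11), datetime(2014, 7, 4, 9), datetime(2014, 7, 4, 21)],
)
graph = add_transitions({}, seq)
```

Here graph maps each source photo to its successors with a count, so sequences from many users can be accumulated into the same dictionary.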
2
Why not storylines? (1)
• Many topics of interest consist of a sequence of
activities or events repeated across photo streams.
  - Independence Day = marathon race (1,2) + parades (3-6) +
    barbecue + fireworks (8-9)
3
Why not storylines? (2)
• A storyline can characterize the various branching
narrative structures associated with the topic.
  - A single photo stream = a linear thread of story told by one user.
  - Aggregating the streams reveals the underlying big picture.
4
Related work by the 1st author
CVPR14 oral
CVPR14
CVPR13 oral
WSDM13
KDD12
ECCV10
+ another line of research: WSDM14, CVPR12 oral, ICCV11 oral, NIPS09, CVPR08 oral
5
ECCV10 paper
Generating a sparse similarity network of web images &
associated time stamps
• The method is simple: connecting temporally close & visually
similar images
• It reveals subtopic outbreaks and evolutions.
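A rough sketch of the "temporally close & visually similar" rule, assuming each image is a (descriptor, time stamp) pair. The hard thresholds max_gap and max_dist and the Euclidean distance are assumptions for illustration; the ECCV10 paper may build its sparse network differently.

```python
import math

def build_similarity_edges(images, max_gap, max_dist):
    """Connect image pairs that are close in both time and descriptor space.

    images: list of (descriptor, time_stamp) pairs.
    Returns a list of undirected edges (i, j) with i < j.
    """
    edges = []
    for i in range(len(images)):
        for j in range(i + 1, len(images)):
            (x_i, t_i), (x_j, t_j) = images[i], images[j]
            if abs(t_i - t_j) <= max_gap and math.dist(x_i, x_j) <= max_dist:
                edges.append((i, j))
    return edges

images = [((0.0, 0.0), 0), ((0.1, 0.0), 1), ((5.0, 5.0), 1), ((0.0, 0.1), 100)]
edges = build_similarity_edges(images, max_gap=10, max_dist=1.0)
```

Only the first two images are linked: the third is visually distant and the fourth is temporally distant.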
6
KDD12 paper
Modeling an image stream with a point process
• This enables us to predict what images are likely to appear
at a future time point by extrapolating the image stream
7
WSDM13 paper
Modeling an image stream with point processes &
developing a regularized multi-task regression
• For retrieving relevant and temporally suitable images for a
given word, time point and optionally user information.
8
CVPR13 paper
Aligning and segmenting multiple web photo streams
for inferring storylines
9
CVPR14 paper
• Creating a storyline graph from photo streams
• Segmentation in CVPR13 seems redundant.
• Image clustering might be sufficient for representing
subtopics, as shown in KDD12 & WSDM13 papers.
10
Another CVPR14 paper
A set of videos is useful for creating a storyline graph
• Videos convey temporal smoothness between frames, which is
often missing in photo streams.
11
Problem definition
[ Input ] A set of photo streams
The set of photo streams $\mathbf{P} = \{P^1, \ldots, P^L\}$
A photo stream $P^l = \{p_1^l, \ldots, p_{L^l}^l\}$,
taken by a single person within a period of time $[0, T]$
A photo $p_j^l = (x_j^l, t_j^l)$,
a pair of an image descriptor and a time stamp.
[ Output ] A storyline graph
The storyline graph $\mathbf{G} = (\mathbf{O}, \mathbf{E})$
Each node in $\mathbf{O}$ = an image cluster.
Edges $\mathbf{E} = \{\mathbf{E}_t\}_t$ change smoothly over time.
Each edge set $\mathbf{E}_t$ is represented by an adjacency matrix $\mathbf{A}_t$.
12
Storyline graphs in detail
• Why image clusters for nodes?
  - There are too many images, and many of them are redundant.
• Edges should be sparse and time-varying
  - Time-varying: popular transitions change smoothly over time
[Figure: storyline graphs along a timeline at t = 10AM, 12PM and 2PM; panel labels "At 12PM" and "At 7PM"]
13
Image encoding
4 different image (global) descriptors
• [SIFT] 3-level spatial pyramid histograms for HSV color SIFT
• [HOG2x2] 3-level spatial pyramid histograms for HOG.
• [Tiny] 32x32 TinyImages.
• [Scene] SUN397 detector outputs.
Constructing image clusters by K-means (K=600)
+ assigning $c$-NN clusters with Gaussian weighting
• In the case of [Scene], the top-$c$ detector outputs are used.
• Each descriptor $x_j^l$ has at most $4c$ non-zero components.
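A minimal sketch of the c-NN soft assignment with Gaussian weighting, for one feature type. The function name soft_assign, the bandwidth sigma, and the normalization of the weights to sum to 1 are assumptions; the slide only states that the c nearest clusters receive Gaussian weights.

```python
import math

def soft_assign(descriptor, centroids, c=3, sigma=1.0):
    """Return {cluster_index: weight} for the c nearest K-means centroids."""
    # Distance from the raw descriptor to every centroid.
    dists = sorted(
        (math.dist(descriptor, centroid), k)
        for k, centroid in enumerate(centroids)
    )
    nearest = dists[:c]
    # Gaussian weighting of the c nearest clusters (sigma is an assumed bandwidth).
    weights = {k: math.exp(-(d * d) / (2.0 * sigma ** 2)) for d, k in nearest}
    total = sum(weights.values())
    if total == 0.0:  # all weights underflowed; fall back to uniform
        return {k: 1.0 / len(weights) for k in weights}
    return {k: w / total for k, w in weights.items()}

centroids = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (0.0, 2.0)]
assignment = soft_assign((0.1, 0.0), centroids, c=2)
```

Applying this to each of the 4 feature types and concatenating the results yields a sparse vector with at most 4c non-zero entries, which matches the bound stated above.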
14
Modeling photo streams
Introducing several practical assumptions
All the photo streams are taken independently of one another.
Every photo stream obeys the first-order Markov property.
$f(\mathbf{x}_j^l, t_j^l \mid \mathbf{x}_{j-1}^l, t_{j-1}^l) = \prod_{d=1}^{D} f(x_{j,d}^l, t_j^l \mid \mathbf{x}_{j-1}^l, t_{j-1}^l)$
All the elements in a descriptor are conditionally independent
of one another given the previous descriptor.
15
Modeling a storyline
A simple linear model for
Encoding temporal transitions into $\mathbf{A}_e$
The log likelihood (for stationary A)
To be minimized
16
Optimization
A simple least-squares problem if $\mathbf{A}_t$ is time-independent.
Introducing neighborhood selection [Meinshausen+ 2006]
Plus $\ell_1$-regularization
Gaussian kernel for $t_i$ centered at $t$
Introducing sparsity into $\mathbf{A}_t$
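A sketch of how one row of $\mathbf{A}_t$ could be estimated under these ingredients: each training transition is weighted by a Gaussian kernel on its time stamp, and an $\ell_1$ penalty is solved by coordinate descent. The function names, the bandwidth, the penalty lam, and the exact objective (weighted squared error plus lam times the $\ell_1$ norm) are assumptions, not the paper's formulation.

```python
import math

def gaussian_kernel(t_i, t, bandwidth):
    """Weight of a transition observed at t_i when estimating the graph at t."""
    return math.exp(-((t_i - t) ** 2) / (2.0 * bandwidth ** 2))

def soft_threshold(value, threshold):
    """Proximal operator of the l1 penalty."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_row(prev_x, next_y, times, t, bandwidth=1.0, lam=0.1, sweeps=200):
    """Kernel-weighted lasso for one row of A_t via coordinate descent.

    prev_x[i]: descriptor before transition i
    next_y[i]: one component of the descriptor after transition i
    times[i]:  time stamp of transition i
    Minimizes sum_i w_i * (y_i - row . x_i)^2 + lam * ||row||_1.
    """
    weights = [gaussian_kernel(t_i, t, bandwidth) for t_i in times]
    dim = len(prev_x[0])
    row = [0.0] * dim
    for _ in range(sweeps):
        for k in range(dim):
            rho, z = 0.0, 0.0
            for w, x, y in zip(weights, prev_x, next_y):
                partial = sum(row[m] * x[m] for m in range(dim) if m != k)
                rho += w * x[k] * (y - partial)
                z += w * x[k] ** 2
            row[k] = soft_threshold(rho, lam / 2.0) / z if z > 0 else 0.0
    return row

# Toy data: the target depends only on the first component.
prev_x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]
next_y = [2.0, 0.0, 2.0, 2.0]
row = fit_row(prev_x, next_y, times=[0.0] * 4, t=0.0)
```

The irrelevant second coefficient is driven exactly to zero, which is the sparsity effect the penalty is meant to provide; repeating the fit for every row and several values of t gives a time-varying sparse matrix.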
17
Incorporating additional information
Strategy: introducing a product kernel
1. Original = neighborhood selection
2. To customize the graph for a particular user $u_q$
3. To introduce seasonal trends
$s_q = s(m_q)$: a function mapping months to seasons
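A rough sketch of a product kernel that weights one training transition. All three factors (a Gaussian time term, a user affinity that could come from the friendship graph, and a same-season indicator) and the month grouping in season_of are assumptions for illustration, not the paper's definitions.

```python
import math

def season_of(month):
    """Map a month (1-12) to a season index; this grouping is an assumption."""
    return (month % 12) // 3  # 0: Dec-Feb, 1: Mar-May, 2: Jun-Aug, 3: Sep-Nov

def product_kernel(t_i, t, month_i, month_q, user_affinity=1.0, bandwidth=1.0):
    """Multiply a time kernel, a user term, and a season term."""
    time_term = math.exp(-((t_i - t) ** 2) / (2.0 * bandwidth ** 2))
    season_term = 1.0 if season_of(month_i) == season_of(month_q) else 0.0
    return time_term * user_affinity * season_term

same_season = product_kernel(0.0, 0.0, month_i=7, month_q=8)
other_season = product_kernel(0.0, 0.0, month_i=7, month_q=1)
```

Because the factors multiply, dropping a source of information amounts to setting its factor to 1, which recovers the original time-only weighting.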
18
Image recommendation with storylines
2 typical tasks for sequential image prediction
1. Given an image sequence, predict the K most likely next images
2. Given two temporally distant parts of an image sequence,
estimate the most likely path between them
A state space model would be helpful for those tasks
1. Applying the forward algorithm.
2. Exploiting the forward-backward algorithm with EM.
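A minimal sketch of task 1 with a discrete state space model: the forward recursion filters the observed sequence, and one more transition step ranks the next states. Treating image clusters as states, using a row-stochastic transition matrix, and the name forward_predict are assumptions; the paper's model may differ.

```python
def forward_predict(transition, emissions, prior, k=2):
    """Filter a sequence with the forward recursion, then rank next states.

    transition[i][j]: P(next state = j | current state = i)
    emissions[s][i]:  likelihood of the s-th observation under state i
    prior[i]:         P(first state = i)
    Returns (indices of the k most likely next states, predicted distribution).
    """
    n = len(prior)

    def step(belief):
        # Propagate the belief one step through the transition matrix.
        return [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]

    belief = list(prior)
    for s, likelihood in enumerate(emissions):
        if s > 0:
            belief = step(belief)
        belief = [b * l for b, l in zip(belief, likelihood)]
        total = sum(belief)
        belief = [b / total for b in belief]  # normalize to avoid underflow
    predicted = step(belief)
    ranked = sorted(range(n), key=lambda j: predicted[j], reverse=True)
    return ranked[:k], predicted

transition = [[0.1, 0.8, 0.1], [0.1, 0.1, 0.8], [0.8, 0.1, 0.1]]
emissions = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # observed state 0, then state 1
top, predicted = forward_predict(transition, emissions, [1 / 3] * 3, k=1)
```

Task 2 would additionally run a backward pass from the later sequence and combine both directions, which is omitted here.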
19
Experiments
1. Evaluating reconstructed storyline graphs
via user studies with AMT.
2. Quantitatively comparing the performance
for the 2 types of image prediction tasks.
a. Predicting next likely images.
b. Filling in missing parts of a photo stream.
[Baseline]
1. PageRank-based image retrieval (details missing)
2. HMM for modeling photo sequences
3. Clustering-based summarization
20
Dataset
3.3M Flickr images of 42K photo streams for 24 classes
The friendship graph was built indirectly from group information
(the edge weight is the number of groups that both users have joined).
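A small sketch of the edge-weight rule described above, assuming group memberships are available as a set of group names per user. The function name and the sample users are hypothetical.

```python
from itertools import combinations

def build_friendship_graph(groups_by_user):
    """Edge weight = number of groups that two users have in common."""
    edges = {}
    for u, v in combinations(sorted(groups_by_user), 2):
        shared = len(groups_by_user[u] & groups_by_user[v])
        if shared > 0:  # users with no common group stay unconnected
            edges[(u, v)] = shared
    return edges

memberships = {
    "alice": {"fireworks", "parades", "bbq"},
    "bob": {"parades", "bbq"},
    "carol": {"scuba"},
}
friends = build_friendship_graph(memberships)
```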
21
Scheme for evaluations
[ Basic idea ] Let each turker compare tuples of
images representing the storyline graphs.
1. Each algorithm generates a storyline graph per topic.
2. Sample 100 standard images as test instances.
3. Each algorithm predicts the most likely next image after the test
instance.
4. [ Turker task (>3 turkers per test image) ]
[Figure: turker task with labels "Test image", "A", "B", "Our method", "Baseline"]
A crowd of human subjects evaluates only a basic unit (i.e., an
important edge of the storyline).
22
Evaluating storyline graphs
Better than baselines (HMM, PageRank & Clustering).
[Figure: example image pairs ($I_q$, $I_e$)]
[66.5, 67.5, 69.4] over (HMM), (Page), (Clust)
23
Setting: Image prediction tasks (1)
• The “future prediction” task
Procedures
Training (80%): build the storyline graph
Task (I): Given a short sequence of a test photo stream (PS),
predict the next likely images
Measure the similarity between the estimates of each method and the hidden ground truth
24
Setting: Image prediction tasks (2)
• The “filling in gaps” task
Procedures
Training (80%): build the storyline graph
Task (II): Given a pair of distant sequences,
fill in the missing parts
Measure the similarity between the estimates of each method and the hidden ground truth
25
Performance measured by PSNR
Future prediction - Personalized
Future prediction - Normal [9.60, 8.99, 8.86, 8.75]
[9.53, 9.01, 8.85, 8.75]
Filling in gaps - Personalized
Filling in gaps - Normal [9.70, 8.97, 8.89, 8.96]
[9.57, 9.05, 8.87, 8.93]
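For reference, a minimal sketch of the standard PSNR formula, 10 log10(MAX^2 / MSE), assuming images are flattened to equal-length lists of pixel values in [0, 255]. How the paper resizes or pairs the predicted and ground-truth images before computing PSNR is not stated on the slide.

```python
import math

def psnr(predicted, ground_truth, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((p - g) ** 2 for p, g in zip(predicted, ground_truth)) / len(ground_truth)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

worst = psnr([0, 0], [255, 255])
identical = psnr([10, 20], [10, 20])
```

Higher values mean the predicted image is closer to the hidden ground truth.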
26
Qualitative evaluations