Temporal models for mining, ranking and recommendation in the Web

TEMPORAL MODELS FOR MINING, RANKING AND RECOMMENDATION IN THE WEB
Tu Nguyen
L3S Research Center
Leibniz Universität Hannover
1

Outline
2
Temporal Dynamics
Web
Web Archives
Collaborative Knowledge Bases
Social Networks

Through the Lens of Time..
3
time

Research Questions
5
• RQ1.1: How do the relevant aspects of an entity-centric query change
around the associated event time, specifically just before, during and
after the event time?
• RQ1.2: Given a semantically or topically ambiguous entity-centric query
at an event time, how should the ranked list of relevant documents be
formed so that the coverage at top-k is maximized?

Research Questions
6
• RQ1.1: How do the relevant aspects of an entity-centric query change
around the associated event time, specifically just before, during and
after the event time?

Motivation
8
Winners
Nominations
Movies
Actors
Location
Athletes
Australian Open
Winners
Schedule
Draw
Results

Motivation
9
t
Jan Feb Mar
Querying
time

Motivation
10
Long-term cumulativeness vs. short-term interest.

Motivation
11
In addition, different event times and types entail different
characteristics toward long-term and short-term interests.

Problem Statement
12
• Problem (Temporal Entity-Aspect Recommendation): Given an event
entity e and hitting time t as input, find the ranked list of entity
aspects that are most relevant with regard to e and t.

Sub-task
15
• Time and Type cascaded classification
• Semantic relation between task labels
• → joint learning in a cascaded manner
• Features
• Seasonality
• Trending
• Auto-correlation
• Prediction Errors
• SpikeM fitting parameters[1]
[1] Matsubara, Yasuko, et al. "Rise and fall patterns of information diffusion: model and implications." Proceedings of the 18th ACM
SIGKDD international conference on Knowledge discovery and data mining. ACM, 2012.
[Figure: Decomposition of additive time series, with panels observed, trend, seasonal and random plotted over time (1990 to 2005)]
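
The seasonality and trend features rest on the additive decomposition shown in the figure: observed = trend + seasonal + random. A minimal sketch of that decomposition follows, assuming a centered moving average for the trend; the function name `decompose_additive` and the `period` parameter are illustrative and not taken from the slides.

```python
from statistics import mean

def decompose_additive(series, period):
    """Classical additive decomposition: series = trend + seasonal + residual."""
    n = len(series)
    half = period // 2
    trend = [None] * n  # undefined at the edges, where the window does not fit
    for i in range(half, n - half):
        window = series[i - half:i + half + 1]
        if period % 2 == 0:
            # even period: centered average with half weights on both ends
            trend[i] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
        else:
            trend[i] = mean(window)
    # average the detrended values at each position of the cycle
    by_phase = [[] for _ in range(period)]
    for i in range(n):
        if trend[i] is not None:
            by_phase[i % period].append(series[i] - trend[i])
    raw = [mean(v) if v else 0.0 for v in by_phase]
    offset = mean(raw)  # center the seasonal component around zero
    seasonal = [raw[i % period] - offset for i in range(n)]
    residual = [None if trend[i] is None else series[i] - trend[i] - seasonal[i]
                for i in range(n)]
    return trend, seasonal, residual

# Toy query-volume series: linear growth plus a cycle of length 4.
pattern = [2, -1, 0, -1]
volume = [i + pattern[i % 4] for i in range(24)]
trend, seasonal, residual = decompose_additive(volume, period=4)
```

The strength of the seasonal component and the slope of the trend can then serve as inputs to the time and type classifiers.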

Approach Overview
16
Ranking
Task

Multi-criteria Learning
17
• Multiple Ranking Models
• Idea: divide-and-conquer, each feature-set performs better for certain entity
type and at certain event time.
1. Probability that the event entity e, at time t, is of type C ∈ {Breaking, Anticipated}
2. Probability that e, given type C, is at event time T ∈ {Before, During, After}
3. We use RankSVM to estimate the ranking function f(X, ω) for ŷ_a
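
The slide does not spell out how the three steps are combined. One plausible reading, sketched below as an assumption, is that the final aspect score is the expectation of the per-(type, time) ranking scores under the two cascaded probabilities. The names `ensemble_score` and `linear_model` are hypothetical, and the linear scorer only stands in for a trained RankSVM function.

```python
def linear_model(weights):
    """Stand-in for a learned ranking function f(X, w) = w . X."""
    return lambda x: sum(w * xi for w, xi in zip(weights, x))

def ensemble_score(features, p_type, p_time_given_type, models):
    """Mix per-(type, time) scores by P(C | e, t) * P(T | e, t, C)."""
    score = 0.0
    for c, pc in p_type.items():
        for t, pt in p_time_given_type[c].items():
            score += pc * pt * models[(c, t)](features)
    return score

types = ["Breaking", "Anticipated"]
times = ["Before", "During", "After"]
# One toy model per (type, time) pair; the weights are made up.
models = {(c, t): linear_model([1.0 + i, 0.5 * j])
          for i, c in enumerate(types) for j, t in enumerate(times)}
p_type = {"Breaking": 0.8, "Anticipated": 0.2}
p_time = {c: {"Before": 0.1, "During": 0.7, "After": 0.2} for c in types}
score = ensemble_score([0.4, 0.9], p_type, p_time, models)
```

When the classifiers are certain (all probability mass on one type and one time), the mixture reduces to the single model trained for that case.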

Ranking Features
18
• Salience features
• Mainly extracted from Wikipedia or long duration query logs
• Avg. TF-IDF
• Language Model-based features
• MLE, Entropy: reward most (cumulated) frequent aspects
• Short-term interest features
• Mainly extracted from recent query logs
• Trending velocity
• Temporal click entropy
• Cross correlation (entity vs. aspect)
• Temporal LM

Datasets
19
• AOL query logs
• 03-2006 to 05-2006: 3 months
• Over 30 million queries
• Manual construction:
• 837 entity queries
• 300 event-related queries
• Ground-truth: 70 queries (Breaking: 30, Anticipated: 40)

Methods for Comparison
20
• Random walk with restart (RWR)
• State-of-the-art time-aware query auto-completion:
• Most popular completion[2]
• Recent MPC[2]
• Last N query distribution[2]
• Predicted next N query distribution[2,3]
• SVM-salience: with all salient features[4]
• SVM-timeliness: with all short-term interest features
• SVM-all: with all features
[2] S. Whiting and J. M. Jose. Recent and robust query auto-completion. In WWW ’14.
[3] M. Shokouhi and K. Radinsky. Time-sensitive query auto-completion. In SIGIR ’12.
[4] R. Reinanda, E. Meij, and M. de Rijke. Mining, ranking and recommending entity aspects. In SIGIR ’15.
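
As a point of reference for the auto-completion baselines, here is a minimal sketch of most popular completion and a recency-restricted variant. It assumes the log is a chronological list of query strings; the `last_n` window is an illustrative simplification, and the exact windowing used in [2] may differ.

```python
from collections import Counter

def most_popular_completion(log, prefix, k=3, last_n=None):
    """Rank completions of `prefix` by how often they occur in the log.

    log: query strings in chronological order.
    last_n: if set, count only the most recent `last_n` log entries.
    """
    window = log if last_n is None else log[-last_n:]
    counts = Counter(q for q in window if q.startswith(prefix))
    return [q for q, _ in counts.most_common(k)]

log = ["oscar nominations"] * 4 + ["oscar winners"] * 3
overall = most_popular_completion(log, "oscar")           # long-term popularity
recent = most_popular_completion(log, "oscar", last_n=3)  # short-term interest
```

On this toy log the long-term counts favor "oscar nominations", while the last three queries contain only "oscar winners", which is the kind of shift around an event time that the temporal models target.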

Experiments
21
• How do long-term salience and short-term interest features perform at
different time periods of different event types?
• Breaking: the salience model performs well before the event and worsens after it

Experiments (2)
22
• How do long-term salience and short-term interest features perform at
different time periods of different event types?
• Anticipated: the timeliness model performs well before and after the event and
worsens during it

Experiments (3)
23
• How does the ensemble ranking model perform compared to the single
model approaches?

Research Questions
24
• RQ1.2: Given a semantically or topically ambiguous entity-centric query
at an event time, how should the ranked list of relevant documents be
formed so that the coverage at top-k is maximized?

Motivation
25
music
spy satellite mission
beer
beer
Search in November 2019

Motivation
26
music
spy satellite mission
beer
beer
Search in November 2019 vs. search in March 2020

Temporal Search Results Diversification
27
Objective function of the greedy optimization:
• c: subtopic
• S: incremental set of diversified documents
• q: query
• d: target document
- sensitive to time
- should take document age into account
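
The objective function itself did not survive on this slide. The sketch below is therefore not the thesis formula but a generic greedy, intent-aware diversification loop under stated assumptions: subtopic probabilities are taken at query time, and document relevance is discounted by an exponential age decay with a hypothetical `half_life` parameter.

```python
def greedy_diversify(docs, subtopic_prob, k, now, half_life=30.0):
    """Greedily pick k documents that cover the time-weighted subtopics.

    docs: {doc_id: {"rel": {subtopic: relevance in [0, 1]}, "time": day published}}
    subtopic_prob: {subtopic c: P(c | q) at query time `now`}
    """
    def value(d, c):
        age = max(0.0, now - docs[d]["time"])
        return docs[d]["rel"].get(c, 0.0) * 0.5 ** (age / half_life)

    selected = []
    uncovered = dict(subtopic_prob)  # remaining probability mass per subtopic
    candidates = set(docs)
    while candidates and len(selected) < k:
        # marginal gain of d given the documents already in S
        best = max(sorted(candidates),
                   key=lambda d: sum(uncovered[c] * value(d, c) for c in uncovered))
        selected.append(best)
        candidates.remove(best)
        for c in uncovered:
            uncovered[c] *= 1.0 - value(best, c)
    return selected

docs = {
    "d1": {"rel": {"beer": 0.9}, "time": 100},
    "d2": {"rel": {"beer": 0.8}, "time": 100},
    "d3": {"rel": {"virus": 0.7}, "time": 100},
}
top2 = greedy_diversify(docs, {"beer": 0.5, "virus": 0.5}, k=2, now=100)
```

After the first beer document is selected, the remaining mass on that subtopic shrinks, so the second slot goes to the other subtopic rather than to a second beer document.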

Motivation
28
Temporal
Dynamics
Collaborative
Knowledge
Bases

Wikipedia as a Global Memory Place
29

Collective memory in Wikipedia
30
• What triggers human remembering of past events?

Motivation
31
• Wikipedia as a source for global memory
• Largest and most up-to-date online encyclopedia
• The open construction and negotiation of content in Wikipedia is an important
new cultural and societal phenomenon
• Indicators for identifying real-world events
• View logs as the proxy for collective memory
• Public page-view traffic with a (very) long time span
• Does not directly reflect how people forget; significant patterns are a good
estimate of public remembering

Research Questions
32
• RQ2.1: How are past events remembered, and what triggers human
remembering of these events in Wikipedia?
• RQ2.2: How do we quantify the semantic relatedness between two
entities / events?

Research Questions
33
• RQ2.1: How are past events remembered, and what triggers human
remembering of these events in Wikipedia?
• Large-scale analysis over 5500 high-impact events from
11 event categories

Approach
34
• We propose a 3-step approach, for a given event:
1. Heuristically quantify “remembering scores” of past events within the same
category
• Using page views
2. Rank related past events by the computed remembering scores
• Refer to thesis for details
3. Identify features (e.g., time, location, impact) having a high correlation with
remembering

Approach
35
• Remembering score: A linear mixture model of:
• Cross-correlation coefficient (CCF)
• Or sliding inner product
• a measure of similarity of two series as a function
of the displacement of one relative to the other
• Sum of squared prediction error (SSE) or surprise score
• Holt-Winters as prediction model
• Skewness (Kurtosis)
• a measure for the degree of peakedness/flatness
in the variable distribution
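
The three ingredients of the remembering score can be sketched as follows. This is an illustration under assumptions, not the thesis implementation: simple exponential smoothing stands in for Holt-Winters, the mixture weights are made up, and the three components are assumed to be rescaled to comparable ranges before mixing.

```python
from statistics import mean, pstdev

def max_cross_correlation(x, y, max_lag=7):
    """Largest normalized cross-correlation of y against x over lags 0..max_lag.

    x and y must have the same length.
    """
    mx, my, sx, sy = mean(x), mean(y), pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    n = len(x)
    best = -1.0
    for lag in range(max_lag + 1):
        c = sum((x[i] - mx) * (y[i + lag] - my) for i in range(n - lag)) / (n * sx * sy)
        best = max(best, c)
    return best

def surprise_sse(series, alpha=0.5):
    """Sum of squared one-step-ahead errors (exponential smoothing stand-in)."""
    level, sse = series[0], 0.0
    for value in series[1:]:
        sse += (value - level) ** 2
        level = alpha * value + (1 - alpha) * level
    return sse

def kurtosis(series):
    """Fourth standardized moment: high for peaked series, low for flat ones."""
    m, s = mean(series), pstdev(series)
    if s == 0:
        return 0.0
    return sum((v - m) ** 4 for v in series) / (len(series) * s ** 4)

def remembering_score(ccf, sse, kurt, weights=(0.4, 0.3, 0.3)):
    """Linear mixture of the three (pre-normalized) components."""
    return weights[0] * ccf + weights[1] * sse + weights[2] * kurt

new_event = [0, 0, 5, 0, 0, 0, 0, 0]   # page views of the triggering event
past_event = [0, 0, 0, 0, 5, 0, 0, 0]  # past event's views peak two days later
ccf = max_cross_correlation(new_event, past_event, max_lag=3)
```

Allowing a lag lets the measure pick up a past event whose page views rise shortly after the new event, which a lag-zero correlation would miss.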

Studied Features for Triggered Remembering
36
• Temporal similarity
• Time distance between two events (in days, months or years)
• Time distance based on exponential decay functions
• Location similarity
• Map a geographic hierarchy of event locations as follows:
• City → State → Country → Neighboring countries → Continent
• Assign 4 scale values: 4 to same city, 3 to state, 2 to country, 1 to
continent
• Impact of Events
• Damaged area/properties/cost/fatalities
• Magnitude (for earthquake events)
• Highest winds, lowest pressure (for Atlantic hurricanes)
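
The temporal and location similarity features can be sketched as below. The 4-to-1 scale follows the slide; the dictionary layout of a location, the score of 0 for no match, and the `decay_rate` value are assumptions made for illustration.

```python
import math

def location_similarity(a, b):
    """Scale value for the finest geographic level two events share.

    a, b: dicts with the keys "city", "state", "country", "continent".
    Returns 4 for same city, 3 for state, 2 for country, 1 for continent,
    and 0 when nothing matches (the 0 case is an assumption).
    """
    for level, score in (("city", 4), ("state", 3), ("country", 2), ("continent", 1)):
        if a.get(level) and a.get(level) == b.get(level):
            return score
    return 0

def temporal_similarity(days_apart, decay_rate=0.001):
    """Exponential decay of similarity with the time distance in days."""
    return math.exp(-decay_rate * abs(days_apart))

miami = {"city": "Miami", "state": "Florida", "country": "USA", "continent": "North America"}
tampa = {"city": "Tampa", "state": "Florida", "country": "USA", "continent": "North America"}
houston = {"city": "Houston", "state": "Texas", "country": "USA", "continent": "North America"}
tokyo = {"city": "Tokyo", "state": "Tokyo", "country": "Japan", "continent": "Asia"}
```

A production version would compare full location paths rather than bare names, since two different cities can share a name.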

Study on Atlantic Hurricanes
37
Location and time have a low effect on the category

Study on Aviation Accidents
38
Location and time have a stronger effect on the category

Lessons Learned
39
• We identified some first patterns for event memory triggering for
diverse event types including natural and manmade disasters as well
as accidents and terrorism.
• Our analysis confirmed the influence of high-level features i.e., time
and location, but other (latent) semantic features of events also
influence which event memories are triggered by an event.
• Systematically interpreting which factors contribute to event remembering is
hard, even for humans.

Research Questions
40
• RQ2.2: How do we quantify the semantic relatedness between two
entities / events?

Dynamic Entity Relatedness Ranking
41
Taylor Lautner in “Twilight” [2008-2012]
Taylor Lautner in “Cuckoo” [2012-]
Taylor Lautner in “Run the Tide” [2016]

42
• Dynamic Entity Relatedness: between two entities e_s and e_d, where e_s is the
source entity and e_d is the target entity, at a given time t, is a function
(denoted by f_t(e_s, e_d)) with the following properties.
• Asymmetric: f_t(e_i, e_j) ≠ f_t(e_j, e_i)
• Non-negativity: f_t(e_i, e_j) ≥ 0
• Indiscernibility of identicals: e_i = e_j → f_t(e_i, e_j) = 1
• Dynamic Entity Relatedness Ranking: Given a source entity e_s and time
point t, rank the candidate entities e_d^t by their semantic relatedness at time
t+1.
• Prediction task
• Use normalized pageview as supervision

43
• A joint “neural” learning model
• Graph-based representation
• Content-based representation
• Time-series representation
• Neural ranking:
• Early interaction (late interaction for time series)
• Pair-wise ranking
• Cross-entropy loss
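
Pair-wise ranking with a cross-entropy loss is commonly set up in the RankNet style, sketched below. Whether the model on this slide uses exactly this form is an assumption; the function name is illustrative.

```python
import math

def pairwise_cross_entropy(score_i, score_j, target):
    """Cross-entropy between the target preference and sigmoid(score_i - score_j).

    target: 1.0 if candidate i should rank above j, 0.0 if below, 0.5 for a tie.
    """
    diff = score_i - score_j
    # log(sigmoid(diff)), written so that exp never overflows
    log_p = min(diff, 0.0) - math.log1p(math.exp(-abs(diff)))
    # log(1 - sigmoid(diff)) = log(sigmoid(-diff)) = log(sigmoid(diff)) - diff
    log_not_p = log_p - diff
    return -(target * log_p + (1.0 - target) * log_not_p)

# Two candidate entities for the same source entity; i is the more related one.
good_order = pairwise_cross_entropy(2.0, 0.0, target=1.0)
bad_order = pairwise_cross_entropy(0.0, 2.0, target=1.0)
```

The loss is small when the scores already order the pair correctly and grows roughly linearly with the score gap when they do not.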

Temporal time-series based similarity
44
• 1-D Convolution layer
• Decay-guided self-attention mechanism
• Dot-product between feature states.
• The context vector is decay-guided
based on time.
• Decay function: Polynomial Curve
with a single decay (hyper)parameter.
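
A minimal sketch of decay-guided attention over a sequence of feature states follows. The slide does not give the decay formula, so the form (1 + age)^(-alpha) is an assumption, as is using the most recent state as the query; the convolution layer is omitted.

```python
import math

def decay_guided_attention(states, alpha=1.0):
    """Dot-product attention whose weights are damped by a polynomial time decay.

    states: feature-state vectors h_1..h_T, oldest first; the last one is the query.
    alpha: the single decay hyperparameter (alpha = 0 disables the decay).
    Returns (context vector, attention weights).
    """
    query = states[-1]
    T = len(states)
    scores = [sum(q * h for q, h in zip(query, state)) for state in states]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = []
    for i, s in enumerate(scores):
        age = T - 1 - i                   # time steps since this state
        decay = (1.0 + age) ** (-alpha)   # assumed polynomial decay curve
        weights.append(math.exp(s - top) * decay)
    total = sum(weights)
    weights = [w / total for w in weights]
    context = [sum(w * state[d] for w, state in zip(weights, states))
               for d in range(len(query))]
    return context, weights

states = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
context, weights = decay_guided_attention(states, alpha=1.0)
```

With identical states the dot products tie, so the weights are driven by the decay alone and favor the most recent time step.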

Experiment settings
45
• Datasets

Experiment settings
46
• Baselines
• Wikipedia Link-based (WLM)
• DeepWalk (DW)
• Entity2Vec Model (E2V)
• ParaVecs(PV)
• RankSVM + handcrafted features
• Metrics
• Pearson correlation
• Spearman correlation
• Normalized Discounted Cumulative Gain - NDCG
Page views
Human judgment
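
For reference, NDCG can be computed as in the sketch below. It uses the linear-gain variant; an exponential gain (2^rel - 1) is also common, and which one the experiments use is not stated on the slide.

```python
import math

def dcg(gains):
    """Discounted cumulative gain with a log2 position discount."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))

def ndcg_at_k(ranked_gains, k):
    """DCG of the top-k ranking divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(ranked_gains, reverse=True)[:k])
    return dcg(ranked_gains[:k]) / ideal if ideal > 0 else 0.0

# Relevance labels of candidate entities in the order a model ranked them.
perfect = ndcg_at_k([3, 2, 1], k=3)
reversed_order = ndcg_at_k([1, 2, 3], k=3)
```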

48
Temporal
Dynamics
Web
Archives

Motivation
49
Correlation between time series mined from anchor text
(left, ccf = 0.69, τdelay = 2) and Google Trends (right, ccf = 0.68,
τdelay = 9) for the query “electoral college”

Motivation
50
Time series of popular vote (ccf = 0.94, τdelay = 2), border fence
(ccf = 0.40, τdelay = 1) and health care reform (ccf = 0.44, τdelay =
2) from anchor text and Google Trends, from left to right

Motivation
51
Cumulative signals from anchor text tend to reflect real-world
event trend patterns well, with a slight delay.

Motivation
52
In this work, we rely solely on the Web Archive link-graph to
mine important documents.

Research Questions
53
• RQ3: Given a query and the Web Archive, how do we come up with a
top-k ranked list of documents where the coverage of the most
important documents, topic-wise and time-wise, is maximized?

Anchor-text based Retrieval Pipeline
54

Motivation
55
• DivRank[*]
• Rich-get-richer phenomenon
• Has a clear optimization explanation
• [*] Mei, Qiaozhu, Jian Guo, and Dragomir Radev. "DivRank: the interplay of prestige and diversity in information
networks." Proceedings of KDD 2010
[Figure panels: illustrated graph, PageRank, DivRank]

Temporal Random Surfer Model
56
• Time-aware Teleportation
• jump to any snapshot with a time preference
• Time-aware Transition probability
• a snapshot at time ti with high time preference
will have higher transition probability.
• a node propagates most of its authority to the
nearest peak time
• propagation scope is restricted to a time
window
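
The time-aware teleportation and transition ideas can be sketched as a personalized PageRank in which both the jump and the link-following step are biased by a time preference. This is an illustration under assumptions: the time-window restriction is not modeled, and `time_pref` is a hypothetical per-node preference (for example, closeness of a snapshot to a peak time).

```python
def temporal_pagerank(out_links, time_pref, damping=0.85, iterations=50):
    """PageRank with time-biased teleportation and transitions.

    out_links: {node: [successor nodes]}; every successor must also be a key.
    time_pref: {node: positive time preference of that snapshot}.
    """
    nodes = list(out_links)
    total = sum(time_pref[n] for n in nodes)
    teleport = {n: time_pref[n] / total for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        nxt = {n: (1 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            targets = out_links[n]
            if targets:
                weight = sum(time_pref[t] for t in targets)
                for t in targets:
                    # successors with a higher time preference receive more authority
                    nxt[t] += damping * rank[n] * time_pref[t] / weight
            else:
                # dangling node: spread its authority by the teleport distribution
                for t in nodes:
                    nxt[t] += damping * rank[n] * teleport[t]
        rank = nxt
    return rank

graph = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
rank = temporal_pagerank(graph, {"a": 1.0, "b": 3.0, "c": 1.0})
```

Nodes b and c are structurally identical here, so any difference in their scores comes from the time preference alone.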

Absorbing Random Walk on Temporal Graph
57
• Vertex-Reinforcement Random walk
• within-snapshot: the transition probability in the Markov random walk (to a
state from others) is reinforced by the number of previous visits to that
state
• cross-snapshots: voting mechanism, only one node gets propagated at a
time
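
The within-snapshot reinforcement follows the DivRank idea. Below is a minimal sketch of the pointwise DivRank iteration, where the visit counts are approximated by the current visiting probabilities; the cross-snapshot voting step is not modeled, and the toy graph with its self-loop weight is made up for illustration.

```python
def divrank(p0, prior=None, lam=0.9, iterations=100):
    """Vertex-reinforced random walk (pointwise DivRank approximation).

    p0: {u: {v: organic transition probability}}; each row sums to 1 and
        should include a self-loop so that a walker can stay in place.
    prior: teleport distribution, uniform if omitted.
    """
    nodes = list(p0)
    prior = prior or {v: 1.0 / len(nodes) for v in nodes}
    p = dict(prior)
    for _ in range(iterations):
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            # normalizer: organic transitions reweighted by current visit mass
            norm = sum(w * p[v] for v, w in p0[u].items())
            for v in nodes:
                reinforced = p0[u].get(v, 0.0) * p[v] / norm if norm > 0 else 0.0
                nxt[v] += p[u] * ((1 - lam) * prior[v] + lam * reinforced)
        p = nxt
    return p

# Star graph: hub h linked to x, y, z; 0.75 self-loop weight on every node.
p0 = {
    "h": {"h": 0.75, "x": 0.25 / 3, "y": 0.25 / 3, "z": 0.25 / 3},
    "x": {"x": 0.75, "h": 0.25},
    "y": {"y": 0.75, "h": 0.25},
    "z": {"z": 0.75, "h": 0.25},
}
scores = divrank(p0)
```

Because frequently visited nodes attract even more visits, the hub absorbs weight from its neighbors, which is the rich-get-richer effect that pushes redundant neighbors down the ranking.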

Experiment results
58
[Figure panels: diversity by time, diversity by topics]

59
Temporal
Dynamics
Social
Network

Research Questions
60
• RQ4: How do temporal models develop and how do we control and
improve the stability of such models at an early stage?

Research Questions
61
• Task 1: Rumor detection in Twitter

Motivation
63
The Amuay Explosion news and Castro’s Death rumor spread over Twitter[*]
[*] Jin, Fang, et al. "Epidemiological modeling of news and rumors on twitter." Workshop on Social Network Mining and Analysis 2013.

Motivation
64
The Amuay Explosion news and Castro’s Death rumor spread over Twitter[*]
[*] Jin, Fang, et al. "Epidemiological modeling of news and rumors on twitter." Workshop on Social Network Mining and Analysis 2013.
How do we handle the case when it is too
early for any propagation patterns to form?

System pipeline
65
• Sometimes the average is the best.

System pipeline
66
• Sometimes the average is the best.
Dynamic Series Time Structure (feature vector representation):
• incorporate the slopes of features between two consecutive intervals[*]
[*] Ma, Jing, et al. "Detect rumors using time series of social context information on microblogging websites." CIKM 2015
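
A minimal sketch of such a feature vector: the per-interval feature values are concatenated with their differences between consecutive intervals. It assumes equal-length intervals, so the slope is the plain difference; the function name is illustrative.

```python
def dsts_vector(interval_features):
    """Per-interval features followed by slopes between consecutive intervals.

    interval_features: one list of feature values per time interval, in order.
    """
    vector = [v for interval in interval_features for v in interval]
    for prev, cur in zip(interval_features, interval_features[1:]):
        vector.extend(c - p for p, c in zip(prev, cur))
    return vector

# Two features (e.g. tweet volume, share of tweets with URLs) over three intervals.
vec = dsts_vector([[1, 2], [3, 2], [6, 0]])
```

For N intervals and F features the vector has F * N values plus F * (N - 1) slopes, so it grows as the event unfolds.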

Tweet-level credibility model
67

Research Questions
69
• Task 2: Personalized blood glucose prediction in clinical domain

70
Temporal
Dynamics
Social
Network
Clinical
domain

Research Questions
71
• Task 2: Personalized blood glucose prediction in clinical domain
• Strategy: allowing model to refuse to predict

Motivation
72
Task: predict BG-level in 1 hour

Motivation
73
Sparsity: Measurements taken
periodically and (somewhat) spontaneously.

Motivation – preliminary results
74

Uncertainty in Machine Learning
75
[*] Diagrams adapted from https://www.groundai.com/project/aleatoric-and-epistemic-uncertainty-in-machine-learning-a-tutorial-introduction/1
Ensemble Learning..
Bagging or Boosting Prediction variance

Uncertainty in Machine Learning
76
Go Bayesian..
Posterior distribution Weighted average
[*] Diagrams adapted from https://www.groundai.com/project/aleatoric-and-epistemic-uncertainty-in-machine-learning-a-tutorial-introduction/1
However, high computational cost

Uncertainty in Random Forest
77
[Diagram: Ensemble Learning. RF trees built from a finite number of bootstrap replicates B yield variance estimates]

78
[Diagram: RF trees built from a finite number of bootstrap replicates B yield variance estimates affected by MC noise and sampling noise]

79
• *Wager, Stefan, Trevor Hastie, and Bradley Efron. "Confidence intervals for random forests: The jackknife and the infinitesimal jackknife." JMLR (2014).
[Diagram: RF trees built from a finite number of bootstrap replicates (B) yield variance estimates; the MC noise is bias-corrected*, with B = Θ(n)]
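
The bias-corrected variance estimate can be sketched directly from the in-bag counts and per-tree predictions, following the infinitesimal jackknife of Wager et al. as I understand it. The function name and input layout are assumptions; a real forest implementation would supply the in-bag counts.

```python
def infinitesimal_jackknife_variance(inbag, predictions, bias_correct=True):
    """Infinitesimal jackknife variance of a bagged prediction at one query point.

    inbag[b][i]: number of times training example i appears in bootstrap replicate b.
    predictions[b]: prediction of tree b at the query point.
    """
    B, n = len(inbag), len(inbag[0])
    t_bar = sum(predictions) / B
    variance = 0.0
    for i in range(n):
        n_bar = sum(inbag[b][i] for b in range(B)) / B
        # covariance between example i's in-bag count and the tree predictions
        cov = sum((inbag[b][i] - n_bar) * (predictions[b] - t_bar) for b in range(B)) / B
        variance += cov ** 2
    if bias_correct:
        # remove the Monte Carlo bias caused by using a finite number of replicates
        variance -= n / B ** 2 * sum((t - t_bar) ** 2 for t in predictions)
    return variance

inbag = [[2, 0], [0, 2]]  # two replicates over two training examples
raw = infinitesimal_jackknife_variance(inbag, [1.0, 3.0], bias_correct=False)
corrected = infinitesimal_jackknife_variance(inbag, [1.0, 3.0])
```

With very few replicates the correction can push the estimate below zero, which is one reason the number of replicates should grow with the training set size.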

Experiment results
80
• Sanity filter: carefully-designed heuristic methods (e.g., no long gap prediction,
no malformed input).
• Stability filter: confidence interval based.
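
The two filters can be sketched as a predictor that is allowed to refuse. All thresholds below are made-up values, and the spread of the per-tree predictions is used as a crude stand-in for the variance estimate behind the confidence interval.

```python
from statistics import mean, pstdev

def predict_or_refuse(tree_predictions, minutes_since_last_reading,
                      max_gap_minutes=120.0, max_ci_width=40.0, z=1.96):
    """Return the ensemble estimate, or None when either filter rejects the case."""
    # Sanity filter: malformed (empty) input or a long gap since the last measurement.
    if not tree_predictions or minutes_since_last_reading > max_gap_minutes:
        return None
    estimate = mean(tree_predictions)
    # Stability filter: refuse when the confidence interval is too wide.
    ci_width = 2 * z * pstdev(tree_predictions)
    if ci_width > max_ci_width:
        return None
    return estimate

stable = predict_or_refuse([100, 102, 98], minutes_since_last_reading=30)
unstable = predict_or_refuse([60, 180, 120], minutes_since_last_reading=30)
stale = predict_or_refuse([100, 100], minutes_since_last_reading=300)
```

Refusing trades coverage for reliability: fewer blood glucose predictions are issued, but the ones that are issued come from cases where the ensemble agrees.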

Conclusions
81
Temporal Dynamics
Web
Web Archives
Collaborative Knowledge Bases
Social Networks
Search Recommendations
Anchor-text and Link-based Analysis & Temporal Ranking
Entity and Event Relatedness Mining and Recommendation
Enrichment methods for cold-start predictions
ESWC’18 - oral
ECIR’14 - oral
SIGIR’15 (short)
WWW’15 Companion
CoNLL’18 - full
JCDL’14 - oral
SocInfo’17 - full
CIKM’17&18 Workshops
