Dynamic Search and Beyond
Prof. Grace Hui Yang
InfoSense Group
Department of Computer Science
Georgetown University
huiyang@cs.georgetown.edu
Sep 29, 2018
CCIR 2018 @ Guilin
• Our graduate program focuses on
Information Systems,
Privacy and Security,
and Computer Theory.
• Ph.D., Master’s, Postdocs
• ACM International Conference on Theory of
Information Retrieval (ICTIR)
• Its importance in the IR community
• Acknowledgements to Guangxi Normal University, CCF, and many old and new friends
Statistical Modeling of
Information Seeking
• Aims to connect user’s information seeking
behaviors with retrieval models
• The ‘dynamics’ in the search process are the
primary elements to be modeled
• I call this set of novel retrieval algorithms “Dynamic
IR Modeling”
Task: Dynamic IR
• The information retrieval task that aims to find
relevant documents for a session of multiple queries.
• It arises when information needs are complex, vague, and evolving, often containing multiple subtopics
• It cannot be resolved by one-shot ad hoc retrieval
• e.g. “Purchasing a home”, “What is the meaning of
life”
E.g. Find what city and state Dulles airport is in, what shuttles, ride-sharing vans, and taxi cabs connect the airport to other cities, what hotels are close to the airport, what cheap off-airport parking is available, and what metro stops are close to the Dulles airport.
An Illustration
[diagram: a user with an information need interacting with a search engine]
Characteristics of Dynamic IR
• Rich interactions
• Query formulation
• Document clicks
• Document examination
• eye movement
• mouse movements
• etc.
Characteristics of Dynamic IR
• Temporal dependency
[diagram: the information need I drives the whole session; at iteration i (i = 1 … n), the query q_i produces ranked documents D_i and clicked documents C_i]
Characteristics of Dynamic IR
• Aim for a long-term goal
• It is great if we can find out early what the user ultimately wants
Reinforcement Learning (RL)
• Fits well in this trial-and-error setting
• RL learns from repeated, varied attempts that continue until success
• The learner (also known as agent) learns from its dynamic
interactions with the world
• rather than from a labeled dataset as in supervised
learning.
• The stochastic model assumes that the system's current state depends on the previous state and action in a non-deterministic manner
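To make the RL recipe concrete, here is a minimal, generic sketch (not taken from the talk) of a tabular Q-learning update that learns purely from observed interactions; the action set below is a placeholder assumption.

```python
# Minimal tabular Q-learning sketch; ACTIONS is a placeholder action set.
ACTIONS = ["increase", "decrease", "keep"]

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One Q-learning step: update the value of (s, a) from the observed
    transition (s, a, r, s_next), without knowing the transition model."""
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in ACTIONS)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return Q
```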
Most of Our Work is inspired
by MDPs/POMDPs
○ Based on Markov Decision Process (MDP)
○ States: Queries
! Observable
○ Actions:
! User actions:
○ Add / remove / keep query terms
○ Nicely correspond to our definition of query change
! Search Engine actions:
○ Increase / decrease / keep term weights
○ Rewards:
! nDCG
[Guan, Zhang, and Yang SIGIR 2013]
QUERY CHANGE MODEL
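A minimal sketch of how this MDP formulation could be encoded; the class names and the nDCG helper below are illustrative assumptions, not the exact QCM implementation.

```python
from dataclasses import dataclass
from enum import Enum
import math

class UserAction(Enum):        # user actions: query change
    ADD = "+dq"                # add query terms
    REMOVE = "-dq"             # remove query terms
    KEEP = "q_theme"           # keep query terms

class EngineAction(Enum):      # search engine actions on term weights
    INCREASE = 1
    DECREASE = -1
    KEEP = 0

@dataclass
class State:                   # states are the (observable) queries
    query: list                # current query terms
    prev_results: list         # D_{t-1}: documents returned for the previous query

def ndcg_at_k(ranking, rel, k=10):
    """Reward: nDCG@k of the ranking produced in this state."""
    dcg = sum(rel.get(d, 0) / math.log2(i + 2) for i, d in enumerate(ranking[:k]))
    ideal = sorted(rel.values(), reverse=True)[:k]
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0
```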
SEARCH ENGINE AGENT'S ACTIONS

| Query change | ∈ D_{i−1} | Action | Example |
| q_theme | Y | increase | "pocono mountain" in s6 |
| q_theme | N | increase | "france world cup 98 reaction" in s28: france world cup 98 reaction stock market → france world cup 98 reaction |
| +∆q | Y | decrease | 'policy' in s37: Merck lobbyists → Merck lobbyists US policy |
| +∆q | N | increase | 'US' in s37: Merck lobbyists → Merck lobbyists US policy |
| −∆q | Y | decrease | 'reaction' in s28: france world cup 98 reaction → france world cup 98 |
| −∆q | N | no change | 'legislation' in s32: bollywood legislation → bollywood law |
QUERY CHANGE RETRIEVAL MODEL (QCM)
○ The Bellman equation gives the optimal value for an MDP:
○ The reward function is used as the document relevance score and is derived by working backwards from the Bellman equation:
[equation omitted: the document relevance score combines the current reward/relevance score with the query transition model and the maximum past relevance]
CALCULATING THE TRANSITION MODEL
• According to the query change and the search engine's actions, the current reward/relevance score adjusts term weights as follows (see the sketch below):
• Increase weights for theme terms
• Decrease weights for old added terms
• Decrease weights for removed terms
• Increase weights for novel added terms
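A rough sketch of how this term-weight adjustment plus the Bellman-style recursion could be wired together, assuming a simple term-frequency relevance function; the weights, helper names, and defaults are illustrative, not the constants from the SIGIR 2013 paper.

```python
def qcm_score(doc_terms, q_t, q_prev, prev_docs, gamma=0.9,
              w_theme=1.0, w_add=0.5, w_remove=0.5, max_past_relevance=0.0):
    """Bellman-style session score: adjusted current relevance plus the
    discounted maximum past relevance (illustrative sketch)."""
    theme   = set(q_t) & set(q_prev)     # q_theme: terms kept across queries
    added   = set(q_t) - set(q_prev)     # +dq: newly added terms
    removed = set(q_prev) - set(q_t)     # -dq: removed terms

    score = 0.0
    for term in q_t:
        tf = doc_terms.count(term) / max(len(doc_terms), 1)
        if term in theme:
            score += w_theme * tf                      # increase weights for theme terms
        elif term in added and any(term in d for d in prev_docs):
            score += (1 - w_add) * tf                  # decrease weights for old added terms
        else:
            score += (1 + w_add) * tf                  # increase weights for novel added terms
    for term in removed:
        tf = doc_terms.count(term) / max(len(doc_terms), 1)
        score -= w_remove * tf                         # decrease weights for removed terms

    return score + gamma * max_past_relevance          # discounted maximum past relevance
```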
WIN-WIN SEARCH: DUAL-AGENT STOCHASTIC GAME
○ Partially Observable Markov Decision Process (POMDP)
○ Two agents
● Cooperative game
● Joint optimization
● Hidden states
● Actions
● Rewards
● Markov property
[Luo, Zhang, and Yang SIGIR 2014]
A MARKOV CHAIN OF DECISION MAKING STATES
[Luo, Zhang, and Yang SIGIR 2014]
● SRT: Relevant & Exploitation
● SRR: Relevant & Exploration
● SNRT: Non-Relevant & Exploitation
● SNRR: Non-Relevant & Exploration
● Example query transitions (starting from q0): scooter price → scooter stores; collecting old US coins → selling old US coins; Philadelphia NYC travel → Philadelphia NYC train; Boston tourism → NYC tourism
HIDDEN DECISION MAKING STATES
[Luo, Zhang, and Yang SIGIR 2014]
Dual Agent Stochastic Game
ACTIONS
! User Action (Au)
○ add query terms (+Δq)
○ remove query terms (-Δq)
○ keep query terms (qtheme)
! Search Engine Action (Ase)
○ Increase / decrease / keep term weights
○ Switch on or off a search technique
○ e.g. to use or not to use query expansion
○ Adjust parameters in search techniques
○ e.g., select the best k for the top-k docs used in PRF
! Message from the user (Σu)
○ clicked documents
○ SAT-clicked documents
! Message from the search engine (Σse)
○ top-k returned documents
Messages are essentially
documents that an agent thinks
are relevant.
[Luo, Zhang, and Yang SIGIR 2014]
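A schematic of one round of the cooperative game, showing how the two agents' actions and messages interleave; the user and engine objects and their method names are assumed interfaces for illustration only.

```python
def winwin_round(user, engine, query, session_history):
    """One iteration of the dual-agent game: both agents act, then exchange messages."""
    # Search engine action: adjust term weights / toggle techniques, then retrieve
    se_action = engine.choose_action(query, session_history)
    top_k_docs = engine.retrieve(query, se_action)        # message Σ_se: top-k returned documents

    # User action: examine results, click, and reformulate the query
    clicks = user.examine(top_k_docs)                     # message Σ_u: (SAT-)clicked documents
    next_query = user.reformulate(query, clicks)          # +Δq / −Δq / keep q_theme

    session_history.append((query, se_action, top_k_docs, clicks))
    return next_query
```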
REWARDS
! Explicit Rewards:
! nDCG
! Implicit Rewards:
! clicks
[Luo et al, SIGIR 2014, ECIR 2015]
EXPERIMENTS
○ Corpus: ClueWeb09 and ClueWeb12, TREC DD datasets
○ Query Logs
SEARCH ACCURACY
○ Search accuracy on TREC 2012 Session Track
TREC 2012 Session Track
◆ Win-win outperforms most retrieval algorithms on TREC 2012.
◆ Systems in TREC 2012 perform better than in TREC 2013 because many relevant documents are not included in the ClueWeb12 CatB collection.
◆ Win-win outperforms all retrieval algorithms on TREC 2013.
◆ It is highly effective in Session Search.
SEARCH ACCURACY
○ Search accuracy on TREC 2013 Session Track
TREC 2013 Session Track
SEARCH ACCURACY FOR DIFFERENT
SESSION TYPES
○ TREC 2012 Sessions are classified into:
! Product: Factual / Intellectual
! Goal quality: Specific / Amorphous
| Method | Intellectual | %chg | Amorphous | %chg | Specific | %chg | Factual | %chg |
| TREC best | 0.3369 | 0.00% | 0.3495 | 0.00% | 0.3007 | 0.00% | 0.3138 | 0.00% |
| Nugget | 0.3305 | -1.90% | 0.3397 | -2.80% | 0.2736 | -9.01% | 0.2871 | -8.51% |
| QCM | 0.3870 | 14.87% | 0.3689 | 5.55% | 0.3091 | 2.79% | 0.3066 | -2.29% |
| QCM+DUP | 0.3900 | 15.76% | 0.3692 | 5.64% | 0.3114 | 3.56% | 0.3072 | -2.10% |
- QCM better handles sessions that demonstrate evolution and exploration, because it treats a session as a continuous process, studying the changes across query transitions and modeling the dynamics.
How to design the states,
actions, and rewards
DESIGN OPTIONS
○ Is there a temporal component?
○ States – What changes with each time step?
○ Actions – How does your system change the state?
○ Rewards – How do you measure feedback or
effectiveness in your problem at each time step?
○ Transition Probability – Can you determine this?
! If not, then a model-free approach is more suitable
ECIR’15
… can it be more
efficient?
A Direct Policy Learning
Framework
• Learns a direct mapping from observations to actions by
gradient descent
• Define a history: A chain of events happening in a
session
• the dynamic changes of states, actions, observations,
and rewards in a session
ICTIR’15
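One way to represent such a history is as a chain of per-iteration records; the field names below are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Step:
    query: List[str]          # the query issued at this iteration
    action: str               # the (ranking) action the search engine took
    observation: List[str]    # what was observed: returned docs, clicks, dwell times, ...
    reward: float             # e.g. nDCG or a click-based reward for this iteration

@dataclass
class History:
    steps: List[Step] = field(default_factory=list)

    def append(self, step: Step) -> None:
        self.steps.append(step)
```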
Browse Phase
• Actor: the user
• It happens
• after the search results are shown to the user
• before the user starts to write the next query
• Records how the user perceives and examines the
(previously retrieved) search results
ICTIR’15
Decompose a history
Query Phase
• Actor: the user
• It happens
• when the user writes a query
• Assuming the query is created based on
• what has been seen in the browse phase
• the information need
ICTIR’15
Decompose a history
Rank Phase
• Actor: the search engine
• It happens
• after the query is entered
• before the search results are returned
• It is where the search algorithm takes place
Decompose a history
Our objective function:
[equations omitted: the objective function, the action selection distribution (a softmax function), and its gradient]
Ranking Function
• It originally represents the probability of selecting a (ranking) action
• In our context, it is the probability of selecting document d to be placed at the top of the ranked list under n3 and θ3 at the t-th iteration
• We then sort the documents by this probability to generate the ranked list (see the sketch below)
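A minimal sketch of such a softmax ranking function: each document's probability of being placed at the top is proportional to the exponential of a linear score over its features, and the ranking sorts by that probability; the flat feature representation and weight vector here are simplifying assumptions.

```python
import math

def softmax_ranking(docs, features, theta):
    """Rank documents by a softmax over linear scores theta·phi(d) (sketch).

    docs     : list of document ids
    features : dict mapping doc id -> feature vector (list of floats)
    theta    : weight vector of the same length, learned by gradient ascent
    """
    scores = {d: sum(t * f for t, f in zip(theta, features[d])) for d in docs}
    z = sum(math.exp(s) for s in scores.values())
    probs = {d: math.exp(s) / z for d, s in scores.items()}   # P(select d as the top document)
    ranking = sorted(docs, key=lambda d: probs[d], reverse=True)
    return ranking, probs
```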
[equations omitted: the parameter updates and the feature function]
Query Features
• Test if a search term w ∈ q_t and w ∈ q_{t−1}
• # of times that a term w occurs in q_1, q_2, …, q_t
Query-Document Features
• Test if a search term w ∈ +∆q_t and w ∈ D_{t−1}
• Test if a document d contains a term w ∈ −∆q_t
• tf·idf score of a document d with respect to q_t
Click Features
• Test if there are SAT-Clicks in D_{t−1}
• # of times a document is clicked in the current session
• # of seconds a document is viewed and re-viewed in the current session
Query-Document-Click Features
• Test if q_i leads to SAT-Clicks in D_i, where i = 0 … t−1
Session Features
• position in the current session
(The features above are grouped by the Browse, Query, and Rank phases; a sketch follows below.)
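A sketch of how a few of the features above could be computed for one (query, document) pair at iteration t; q_prev, the dwell-time threshold for SAT clicks, and the other parameter names are assumptions for illustration.

```python
def extract_features(q_t, q_prev, doc_terms, session_queries, clicked, dwell_seconds,
                     sat_dwell=30):
    """Compute a small subset of the Browse/Query/Rank features (sketch)."""
    added   = set(q_t) - set(q_prev)                                        # +dq
    removed = set(q_prev) - set(q_t)                                        # -dq
    return [
        # Query features
        float(any(w in q_prev for w in q_t)),                               # a term occurs in q_t and q_{t-1}
        float(sum(q.count(w) for q in session_queries for w in set(q_t))),  # term counts over q_1..q_t
        # Query-document features
        float(any(w in doc_terms for w in added)),                          # doc contains an added term
        float(any(w in doc_terms for w in removed)),                        # doc contains a removed term
        # Click features
        float(clicked),                                                     # document was clicked
        float(dwell_seconds >= sat_dwell),                                  # SAT click by dwell time (assumed threshold)
    ]
```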
Efficiency - TREC 2012 Session
• lemur > dpl > qcm > winwin
• dpl achieves a good balance between accuracy and efficiency
• the conclusions are consistent across experiments on the TREC 2012–2014 Session Tracks
DPL
TREC 2012 Session
• dpl achieves a significant improvement over the TREC best run
• We found similar conclusions on TREC 2013 and 2014 Session Tracks
DPL
TREC DYNAMIC DOMAIN 2015-2017
! The search task focuses on specific
domains
! Over the three years, we explored domains ranging from the dark web (illicit goods and Ebola) and polar science to more general web domains (New York Times)
! What is consistent?
○ The participating system is expected to help the user, through interactions, get their task done
○ The user's information need usually consists of multiple aspects
THE TREC DYNAMIC DOMAIN TASK
FEEDBACK FROM A SIMULATED USER
! https://github.com/trec-dd/trec-dd-jig
DOMAIN USED IN 2017
○ New York Times Annotated Corpus
! Sandhaus, Evan. "The New York Times Annotated Corpus." Linguistic Data Consortium, Philadelphia 6, no. 12 (2008): e26752.
! Archives of The New York Times spanning 20 years, from January 1, 1987 to June 19, 2007
! Uncompressed size 16 GB
! Over 1.8 million documents
! Over 650,000 article summaries written by library scientists.
! Over 1,500,000 articles manually tagged by library scientists
! Over 275,000 algorithmically-tagged articles that have been hand verified by
professionals
ANNOTATION
○ Create Topic and Relevance Judgement at the same time
! Not by pooling
○ Topic – subtopic – passage – Relevance Judgement
○ The challenge: how to be complete
○ Useful information that the user gains
! Raw relevance score
○ Discounting
! Based on document ranking
! Based on diversity
○ User’s efforts
! Time spent
! Lengths of documents being
viewed
EVALUATION METRICS FOR DYNAMIC SEARCH
○ Most session search metrics fold all those factors into one overwhelmingly complex formula
○ The optimal value, i.e. the upper bound, of those metrics varies greatly across search topics
○ In Cranfield-like settings (e.g. TREC), the difference is often
ignored
THE PROBLEM
TOY EXAMPLE

Relevance scores per topic-subtopic:

| Doc | 1-1 | 1-2 | 2-1 | 2-2 | 2-3 | 2-4 | 2-5 |
| d1 | 1 | | 4 | | | | |
| d2 | | 3 | | 4 | | | |
| d3 | | | | | 4 | | |
| d4 | | | | | | 4 | |
| d5 | | | | | | | 4 |
| System | Topic 1 ranking | CT-topic 1 | Topic 2 ranking | CT-topic 2 | CT-avg | Normalized CT-avg |
| System1 | d1, irrel, irrel, irrel, irrel | 1 | d1, d3, d4, d5, irrel | 16 | 8.5 | 0.596 |
| System2 | d2, irrel, irrel, irrel, irrel | 3 | d1, d3, d4, d5, irrel | 14 | 8.5 | 0.787 |
| Optimal | d1, d2, irrel, irrel, irrel | 4 | d1, d2, d3, d4, d5 | 17 | | |
○ What is the optimal metric value that a system can
achieve?
! How to get the upper bound for each search topic?
! How does it affect the evaluation conclusions?
○ Variance of different topics
○ Normalization
RESEARCH QUESTIONS
$$ Score_{norm} = \sum_{topic} \frac{raw\_score(topic, s) - lower\_bound(topic)}{upper\_bound(topic) - lower\_bound(topic)} $$
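Given per-topic upper and lower bounds, the normalization above is straightforward to apply; a small sketch, where the bound dictionaries are assumed inputs:

```python
def normalized_scores(raw_score, lower_bound, upper_bound):
    """Normalize each topic's raw metric score by that topic's own bounds.

    raw_score, lower_bound, upper_bound: dicts keyed by topic id.
    Returns the per-topic normalized scores and their mean across topics.
    """
    normed = {}
    for topic, score in raw_score.items():
        span = upper_bound[topic] - lower_bound[topic]
        normed[topic] = (score - lower_bound[topic]) / span if span > 0 else 0.0
    return normed, sum(normed.values()) / len(normed)
```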
○ Session-DCG (sDCG)
! Järvelin et al. "Discounted cumulated gain based evaluation of multiple-query IR
sessions." Advances in Information Retrieval (2008): 4-15.
○ Cube Test (CT)
! Luo et al. "The water filling model and the cube test: multi-dimensional evaluation for professional
search." CIKM, 2013.
○ Expected Utility (EU)
! Yang and Lad. "Modeling expected utility of multi-session information distillation." ICTIR 2009.
DYNAMIC SEARCH METRICS
$$ EU = \sum_{\pi} P(\pi) \sum_{(i,j)\in\pi} \Big( \sum_{a\in S_{i,j}} \gamma_a \, ND(a,i,j-1) - \beta \cdot Cost(i,j) \Big) $$

$$ CT = \frac{\sum_{i=1}^{n} \sum_{j=1}^{|D_i|} \sum_{a} \gamma_a \, rel(i,j) \, ND(a,i,j-1)}{\sum_{i=1}^{n} \sum_{j=1}^{|D_i|} Cost(i,j)} $$

$$ sDCG = \sum_{i=1}^{n} \sum_{j=1}^{|D_i|} \frac{rel(i,j)}{(1 + \log_{b} j)(1 + \log_{b_q} i)} $$
○ sDCG
○ Cube Test
○ Expected Utility
DECONSTRUCT THE METRICS
Components: Gain, Cost, Rank discount, Novelty discount
$$ sDCG = \sum_{i=1}^{n} \sum_{j=1}^{|D_i|} \frac{rel(i,j)}{(1 + \log_{b} j)(1 + \log_{b_q} i)} \quad \text{(rank-discounted gain; no cost or novelty term)} $$

$$ CT = \frac{\sum_{i=1}^{n} \sum_{j=1}^{|D_i|} \sum_{a} \gamma_a \, rel(i,j) \, ND(a,i,j-1)}{\sum_{i=1}^{n} \sum_{j=1}^{|D_i|} Cost(i,j)} \quad \text{(novelty-discounted gain per unit cost)} $$

$$ EU = \sum_{\pi} P(\pi) \sum_{(i,j)\in\pi} \Big( \sum_{a\in S_{i,j}} \gamma_a \, ND(a,i,j-1) - \beta \cdot Cost(i,j) \Big) \quad \text{(expected gain minus cost)} $$
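A sketch of sDCG and the Cube Test following the reconstructions above; relevance and cost come from simple lookup tables, and the novelty discount here is simplified (each subtopic's gain is counted only once) rather than the exact definition from the CIKM 2013 paper.

```python
import math

def sdcg(session, rel, b=2, bq=4):
    """session: one ranked list per query; rel: dict doc -> graded relevance."""
    total = 0.0
    for i, ranked in enumerate(session, start=1):            # query position i
        for j, doc in enumerate(ranked, start=1):            # rank position j
            total += rel.get(doc, 0) / ((1 + math.log(j, b)) * (1 + math.log(i, bq)))
    return total

def cube_test(session, subtopic_rel, gamma, cost_per_doc=1.0):
    """Novelty-discounted gain per unit cost (simplified sketch)."""
    gain, cost, seen = 0.0, 0.0, set()
    for ranked in session:
        for doc in ranked:
            cost += cost_per_doc
            for a, r in subtopic_rel.get(doc, {}).items():   # a: subtopic, r: relevance
                if a not in seen:                            # simplified novelty discount
                    gain += gamma.get(a, 1.0) * r
                    seen.add(a)
    return gain / cost if cost > 0 else 0.0
```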
BOUNDS ON DIFFERENT TOPICS
sDCG = discounted gain
BOUNDS ON DIFFERENT TOPICS
CT = discounted gain / cost
BOUNDS ON DIFFERENT TOPICS
EU = discounted gain − discounted cost
! The optimal values a metric produces for different topics differ greatly, and the difference should not be ignored.
○ Rearrangement Inequality
○ In IR, the Probability Ranking Principle [4]
! the overall effectiveness of an IR system is maximized by ranking the documents in descending order of their usefulness
OUR SOLUTION

$$ x_1 y_n + x_2 y_{n-1} + \dots + x_n y_1 \;\le\; x_{\sigma(1)} y_1 + x_{\sigma(2)} y_2 + \dots + x_{\sigma(n)} y_n \;\le\; x_1 y_1 + x_2 y_2 + \dots + x_n y_n $$
$$ \text{for } x_1 \le x_2 \le \dots \le x_n \text{ and } y_1 \le y_2 \le \dots \le y_n $$
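Following this argument, the per-topic upper bound of a gain-based session metric can be approximated by handing the metric the ranking that sorts all judged documents by usefulness in descending order (and the lower bound by the reverse order); a sketch, assuming a metric with the same call signature as the sdcg function sketched earlier:

```python
def metric_bounds(judged_rel, num_queries, docs_per_query, metric):
    """Approximate a topic's upper/lower bounds for a session metric by feeding it
    the best (descending usefulness) and worst (ascending) document orderings."""
    def as_session(order):
        return [order[i * docs_per_query:(i + 1) * docs_per_query]
                for i in range(num_queries)]
    best  = sorted(judged_rel, key=judged_rel.get, reverse=True)
    worst = list(reversed(best))
    return metric(as_session(best), judged_rel), metric(as_session(worst), judged_rel)
```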
NORMALIZATION EFFECT
sDCG = discounted gain
NORMALIZATION EFFECT
CT = discounted gain / cost
NORMALIZATION EFFECT
EU = discounted gain − β · discounted cost, with β = 0.01
! Using the bounds for normalization brings more fairness into the evaluation
Conclusion
• Our main contributions:
• Put the user into the models
• Created a bridge between information seeking / user behavior studies and machine learning
• Yielded a family of new generative retrieval models for complex, dynamic settings
• Able to explain the results
A Few Thoughts
• Information seeking is a Markov Decision Process, not a series of independent searches
• User actions that cost more effort, such as query changes, are stronger signals than clicks
• Search is also a learning process for the user, who evolves as well
• Users and search engines form a partnership to explore the information space
• They influence each other; it is a two-way communication
• Overly complex evaluation metrics might not be appropriate; the complexity should be modeled either in the retrieval model or in the metric, but not in both
Looking into the Future
• Dynamic IR models are good for modeling information seeking
• There is a lot of room to study user–search engine interaction in a generative way
• The thinking presented here could generate new methods not only in retrieval and evaluation, but also in related fields
• Exciting!!
Thank You!
• Email: huiyang@cs.georgetown.edu
• Group Page: InfoSense at http://infosense.cs.georgetown.edu/
• Dynamic IR Website: http://www.dynamic-ir-modeling.org/
• Book: Dynamic Information Retrieval Modeling
• TREC Dynamic Domain Track: http://trec-dd.org/