Dynamic Search and Beyond
Prof. Grace Hui Yang
InfoSense Group
Department of Computer Science
Georgetown University
huiyang@cs.georgetown.edu
Sep 29, 2018
CCIR 2018 @ Guilin
• Our graduate program focuses on
Information Systems,
Privacy and Security,
and Computer Theory.
• Ph.D., Master’s, Postdocs
• ACM International Conference on Theory of
Information Retrieval (ICTIR)
• Its importance in the IR community
• Acknowledgements to Guangxi Normal University, CCF, and many old and new friends
Statistical Modeling of
Information Seeking
• Aims to connect user’s information seeking
behaviors with retrieval models
• The ‘dynamics’ in the search process are the
primary elements to be modeled
• I call this set of novel retrieval algorithms “Dynamic
IR Modeling”
Task: Dynamic IR
• The information retrieval task that aims to find
relevant documents for a session of multiple queries.
• It arises when information needs are complex, vague, and evolving, often containing multiple subtopics
• It cannot be resolved by one-shot ad hoc retrieval
• e.g. “Purchasing a home”, “What is the meaning of
life”
E.g. Find what city and state Dulles airport is in, what shuttles, ride-sharing vans, and taxi cabs connect the airport to other cities, what hotels are close to the airport, what cheap off-airport parking is available, and what metro stops are close to the Dulles airport.
An Illustration
[diagram: a user with an information need interacting with a search engine]
Characteristics of Dynamic IR
• Rich interactions
• Query formulation
• Document clicks
• Document examination
• eye movement
• mouse movements
• etc.
Characteristics of Dynamic IR
• Temporal dependency
[diagram: the information need I drives the whole session; at iteration i (i = 1 … n), the query q_i produces ranked documents D_i and clicked documents C_i]
Characteristics of Dynamic IR
• Aim for a long-term goal
• It is great if we can find out early what the user ultimately wants
Reinforcement Learning (RL)
• Fits well in this trial-and-error setting
• RL learns from repeated, varied attempts that continue until success
• The learner (also known as agent) learns from its dynamic
interactions with the world
• rather than from a labeled dataset as in supervised
learning.
• The stochastic model assumes that the system's current state depends on the previous state and action in a non-deterministic manner
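To make the RL recipe concrete, here is a minimal, generic sketch (not taken from the talk) of a tabular Q-learning update that learns purely from observed interactions; the action set below is a placeholder assumption.

```python
# Minimal tabular Q-learning sketch; ACTIONS is a placeholder action set.
ACTIONS = ["increase", "decrease", "keep"]

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One Q-learning step: update the value of (s, a) from the observed
    transition (s, a, r, s_next), without knowing the transition model."""
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in ACTIONS)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return Q
```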
Most of Our Work is inspired
by MDPs/POMDPs
○ Based on Markov Decision Process (MDP)
○ States: Queries
! Observable
○ Actions:
! User actions:
○ Add / remove / keep query terms
○ Nicely correspond to our definition of query change
! Search Engine actions:
○ Increase / decrease / keep term weights
○ Rewards:
! nDCG
[Guan, Zhang, and Yang SIGIR 2013]
QUERY CHANGE MODEL
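A minimal sketch of how this MDP formulation could be encoded; the class names and the nDCG helper below are illustrative assumptions, not the exact QCM implementation.

```python
from dataclasses import dataclass
from enum import Enum
import math

class UserAction(Enum):        # user actions: query change
    ADD = "+dq"                # add query terms
    REMOVE = "-dq"             # remove query terms
    KEEP = "q_theme"           # keep query terms

class EngineAction(Enum):      # search engine actions on term weights
    INCREASE = 1
    DECREASE = -1
    KEEP = 0

@dataclass
class State:                   # states are the (observable) queries
    query: list                # current query terms
    prev_results: list         # D_{t-1}: documents returned for the previous query

def ndcg_at_k(ranking, rel, k=10):
    """Reward: nDCG@k of the ranking produced in this state."""
    dcg = sum(rel.get(d, 0) / math.log2(i + 2) for i, d in enumerate(ranking[:k]))
    ideal = sorted(rel.values(), reverse=True)[:k]
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0
```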
SEARCH ENGINE AGENT'S ACTIONS

| Query change | ∈ D_{i−1} | Action | Example |
| q_theme | Y | increase | "pocono mountain" in s6 |
| q_theme | N | increase | "france world cup 98 reaction" in s28: france world cup 98 reaction stock market → france world cup 98 reaction |
| +∆q | Y | decrease | 'policy' in s37: Merck lobbyists → Merck lobbyists US policy |
| +∆q | N | increase | 'US' in s37: Merck lobbyists → Merck lobbyists US policy |
| −∆q | Y | decrease | 'reaction' in s28: france world cup 98 reaction → france world cup 98 |
| −∆q | N | no change | 'legislation' in s32: bollywood legislation → bollywood law |
QUERY CHANGE RETRIEVAL MODEL (QCM)
○ The Bellman equation gives the optimal value for an MDP:
○ The reward function is used as the document relevance score and is derived by working backwards from the Bellman equation:
[equation omitted: the document relevance score combines the current reward/relevance score with the query transition model and the maximum past relevance]
CALCULATING THE TRANSITION MODEL
• According to the query change and the search engine's actions, the current reward/relevance score adjusts term weights as follows (see the sketch below):
• Increase weights for theme terms
• Decrease weights for old added terms
• Decrease weights for removed terms
• Increase weights for novel added terms
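A rough sketch of how this term-weight adjustment plus the Bellman-style recursion could be wired together, assuming a simple term-frequency relevance function; the weights, helper names, and defaults are illustrative, not the constants from the SIGIR 2013 paper.

```python
def qcm_score(doc_terms, q_t, q_prev, prev_docs, gamma=0.9,
              w_theme=1.0, w_add=0.5, w_remove=0.5, max_past_relevance=0.0):
    """Bellman-style session score: adjusted current relevance plus the
    discounted maximum past relevance (illustrative sketch)."""
    theme   = set(q_t) & set(q_prev)     # q_theme: terms kept across queries
    added   = set(q_t) - set(q_prev)     # +dq: newly added terms
    removed = set(q_prev) - set(q_t)     # -dq: removed terms

    score = 0.0
    for term in q_t:
        tf = doc_terms.count(term) / max(len(doc_terms), 1)
        if term in theme:
            score += w_theme * tf                      # increase weights for theme terms
        elif term in added and any(term in d for d in prev_docs):
            score += (1 - w_add) * tf                  # decrease weights for old added terms
        else:
            score += (1 + w_add) * tf                  # increase weights for novel added terms
    for term in removed:
        tf = doc_terms.count(term) / max(len(doc_terms), 1)
        score -= w_remove * tf                         # decrease weights for removed terms

    return score + gamma * max_past_relevance          # discounted maximum past relevance
```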
WIN-WIN SEARCH: DUAL-AGENT STOCHASTIC GAME
○ Partially Observable Markov Decision Process (POMDP)
○ Two agents
● Cooperative game
● Joint optimization
● Hidden states
● Actions
● Rewards
● Markov property
[Luo, Zhang, and Yang SIGIR 2014]
A MARKOV CHAIN OF DECISION MAKING STATES
[Luo, Zhang, and Yang SIGIR 2014]
● SRT: Relevant & Exploitation
● SRR: Relevant & Exploration
● SNRT: Non-Relevant & Exploitation
● SNRR: Non-Relevant & Exploration
● Example query transitions (starting from q0): scooter price → scooter stores; collecting old US coins → selling old US coins; Philadelphia NYC travel → Philadelphia NYC train; Boston tourism → NYC tourism
HIDDEN DECISION MAKING STATES
[Luo, Zhang, and Yang SIGIR 2014]
Dual Agent Stochastic Game
ACTIONS
! User Action (Au)
○ add query terms (+Δq)
○ remove query terms (-Δq)
○ keep query terms (qtheme)
! Search Engine Action (Ase)
○ Increase / decrease / keep term weights
○ Switch on or off a search technique
○ e.g. to use or not to use query expansion
○ Adjust parameters in search techniques
○ e.g., select the best k for the top-k docs used in PRF
! Message from the user (Σu)
○ clicked documents
○ SAT-clicked documents
! Message from the search engine (Σse)
○ top-k returned documents
Messages are essentially
documents that an agent thinks
are relevant.
[Luo, Zhang, and Yang SIGIR 2014]
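A schematic of one round of the cooperative game, showing how the two agents' actions and messages interleave; the user and engine objects and their method names are assumed interfaces for illustration only.

```python
def winwin_round(user, engine, query, session_history):
    """One iteration of the dual-agent game: both agents act, then exchange messages."""
    # Search engine action: adjust term weights / toggle techniques, then retrieve
    se_action = engine.choose_action(query, session_history)
    top_k_docs = engine.retrieve(query, se_action)        # message Σ_se: top-k returned documents

    # User action: examine results, click, and reformulate the query
    clicks = user.examine(top_k_docs)                     # message Σ_u: (SAT-)clicked documents
    next_query = user.reformulate(query, clicks)          # +Δq / −Δq / keep q_theme

    session_history.append((query, se_action, top_k_docs, clicks))
    return next_query
```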
REWARDS
! Explicit Rewards:
! nDCG
! Implicit Rewards:
! clicks
[Luo et al, SIGIR 2014, ECIR 2015]
EXPERIMENTS
○ Corpus: ClueWeb09 and ClueWeb12, TREC DD datasets
○ Query Logs
SEARCH ACCURACY
○ Search accuracy on TREC 2012 Session Track
TREC 2012 Session Track
◆ Win-win outperforms most retrieval algorithms on TREC 2012.
◆ Systems in TREC 2012 perform better than in TREC 2013 because many relevant documents are not included in the ClueWeb12 CatB collection.
◆ Win-win outperforms all retrieval algorithms on TREC 2013.
◆ It is highly effective in Session Search.
SEARCH ACCURACY
○ Search accuracy on TREC 2013 Session Track
TREC 2013 Session Track
SEARCH ACCURACY FOR DIFFERENT
SESSION TYPES
○ TREC 2012 Sessions are classified into:
! Product: Factual / Intellectual
! Goal quality: Specific / Amorphous
| Method | Intellectual | %chg | Amorphous | %chg | Specific | %chg | Factual | %chg |
| TREC best | 0.3369 | 0.00% | 0.3495 | 0.00% | 0.3007 | 0.00% | 0.3138 | 0.00% |
| Nugget | 0.3305 | -1.90% | 0.3397 | -2.80% | 0.2736 | -9.01% | 0.2871 | -8.51% |
| QCM | 0.3870 | 14.87% | 0.3689 | 5.55% | 0.3091 | 2.79% | 0.3066 | -2.29% |
| QCM+DUP | 0.3900 | 15.76% | 0.3692 | 5.64% | 0.3114 | 3.56% | 0.3072 | -2.10% |
- QCM better handles sessions that demonstrate evolution and exploration, because it treats a session as a continuous process, studying the changes across query transitions and modeling the dynamics.
How to design the states,
actions, and rewards
DESIGN OPTIONS
○ Is there a temporal component?
○ States – What changes with each time step?
○ Actions – How does your system change the state?
○ Rewards – How do you measure feedback or
effectiveness in your problem at each time step?
○ Transition Probability – Can you determine this?
! If not, then a model-free approach is more suitable
ECIR’15
… can it be more
efficient?
A Direct Policy Learning
Framework
• Learns a direct mapping from observations to actions by
gradient descent
• Define a history: A chain of events happening in a
session
• the dynamic changes of states, actions, observations,
and rewards in a session
ICTIR’15
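One way to represent such a history is as a chain of per-iteration records; the field names below are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Step:
    query: List[str]          # the query issued at this iteration
    action: str               # the (ranking) action the search engine took
    observation: List[str]    # what was observed: returned docs, clicks, dwell times, ...
    reward: float             # e.g. nDCG or a click-based reward for this iteration

@dataclass
class History:
    steps: List[Step] = field(default_factory=list)

    def append(self, step: Step) -> None:
        self.steps.append(step)
```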
Browse Phase
• Actor: the user
• It happens
• after the search results are shown to the user
• before the user starts to write the next query
• Records how the user perceives and examines the
(previously retrieved) search results
ICTIR’15
Decompose a history
Query Phase
• Actor: the user
• It happens
• when the user writes a query
• Assuming the query is created based on
• what has been seen in the browse phase
• the information need
ICTIR’15
Decompose a history
Rank Phase
• Actor: the search engine
• It happens
• after the query is entered
• before the search results are returned
• It is where the search algorithm takes place
Decompose a history
Our objective function:
[equations omitted: the objective function, the action selection distribution (a softmax function), and its gradient]
Ranking Function
• It originally represents the probability of selecting a (ranking) action
• In our context, it is the probability of selecting document d to be placed at the top of the ranked list under n3 and θ3 at the t-th iteration
• We then sort the documents by this probability to generate the ranked list (see the sketch below)
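A minimal sketch of such a softmax ranking function: each document's probability of being placed at the top is proportional to the exponential of a linear score over its features, and the ranking sorts by that probability; the flat feature representation and weight vector here are simplifying assumptions.

```python
import math

def softmax_ranking(docs, features, theta):
    """Rank documents by a softmax over linear scores theta·phi(d) (sketch).

    docs     : list of document ids
    features : dict mapping doc id -> feature vector (list of floats)
    theta    : weight vector of the same length, learned by gradient ascent
    """
    scores = {d: sum(t * f for t, f in zip(theta, features[d])) for d in docs}
    z = sum(math.exp(s) for s in scores.values())
    probs = {d: math.exp(s) / z for d, s in scores.items()}   # P(select d as the top document)
    ranking = sorted(docs, key=lambda d: probs[d], reverse=True)
    return ranking, probs
```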
[equations omitted: the parameter updates and the feature function]
Query Features
• Test if a search term w ∈ q_t and w ∈ q_{t−1}
• # of times that a term w occurs in q_1, q_2, …, q_t
Query-Document Features
• Test if a search term w ∈ +∆q_t and w ∈ D_{t−1}
• Test if a document d contains a term w ∈ −∆q_t
• tf·idf score of a document d with respect to q_t
Click Features
• Test if there are SAT-Clicks in D_{t−1}
• # of times a document is clicked in the current session
• # of seconds a document is viewed and re-viewed in the current session
Query-Document-Click Features
• Test if q_i leads to SAT-Clicks in D_i, where i = 0 … t−1
Session Features
• position in the current session
(The features above are grouped by the Browse, Query, and Rank phases; a sketch follows below.)
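A sketch of how a few of the features above could be computed for one (query, document) pair at iteration t; q_prev, the dwell-time threshold for SAT clicks, and the other parameter names are assumptions for illustration.

```python
def extract_features(q_t, q_prev, doc_terms, session_queries, clicked, dwell_seconds,
                     sat_dwell=30):
    """Compute a small subset of the Browse/Query/Rank features (sketch)."""
    added   = set(q_t) - set(q_prev)                                        # +dq
    removed = set(q_prev) - set(q_t)                                        # -dq
    return [
        # Query features
        float(any(w in q_prev for w in q_t)),                               # a term occurs in q_t and q_{t-1}
        float(sum(q.count(w) for q in session_queries for w in set(q_t))),  # term counts over q_1..q_t
        # Query-document features
        float(any(w in doc_terms for w in added)),                          # doc contains an added term
        float(any(w in doc_terms for w in removed)),                        # doc contains a removed term
        # Click features
        float(clicked),                                                     # document was clicked
        float(dwell_seconds >= sat_dwell),                                  # SAT click by dwell time (assumed threshold)
    ]
```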
Efficiency - TREC 2012 Session
• lemur > dpl > qcm > winwin
• dpl achieves a good balance between accuracy and efficiency
• the conclusions are consistent across experiments on the TREC 2012–2014 Session Tracks
DPL
TREC 2012 Session
• dpl achieves a significant improvement over the TREC best run
• We found similar conclusions on TREC 2013 and 2014 Session Tracks
DPL
TREC DYNAMIC DOMAIN 2015-2017
! The search task focuses on specific
domains
! Over the three years, we explored domains ranging from the dark web (illicit goods and Ebola) and polar science to more general web domains (New York Times)
! What is consistent?
○ The participating system is expected to help the user, through interactions, get their task done
○ The user's information need usually consists of multiple aspects
THE TREC DYNAMIC DOMAIN TASK
FEEDBACK FROM A SIMULATED USER
! https://github.com/trec-dd/trec-dd-jig
DOMAIN USED IN 2017
○ New York Times Annotated Corpus
! Sandhaus, Evan. "The New York Times Annotated Corpus." Linguistic Data Consortium, Philadelphia 6, no. 12 (2008): e26752.
! Archives of The New York Times spanning 20 years, from January 1, 1987 to June 19, 2007
! Uncompressed size 16 GB
! Over 1.8 million documents
! Over 650,000 article summaries written by library scientists.
! Over 1,500,000 articles manually tagged by library scientists
! Over 275,000 algorithmically-tagged articles that have been hand verified by
professionals
ANNOTATION
○ Create Topic and Relevance Judgement at the same time
! Not by pooling
○ Topic – subtopic – passage – Relevance Judgement
○ The challenge: how to be complete
○ Useful information that the user gains
! Raw relevance score
○ Discounting
! Based on document ranking
! Based on diversity
○ User’s efforts
! Time spent
! Lengths of documents being
viewed
EVALUATION METRICS FOR DYNAMIC SEARCH
○ Most session search metrics fold all those factors into one overwhelmingly complex formula
○ The optimal value, i.e. the upper bound, of those metrics varies greatly across search topics
○ In Cranfield-like settings (e.g. TREC), the difference is often
ignored
THE PROBLEM
TOY EXAMPLE

Relevance scores per topic-subtopic:

| Doc | 1-1 | 1-2 | 2-1 | 2-2 | 2-3 | 2-4 | 2-5 |
| d1 | 1 | | 4 | | | | |
| d2 | | 3 | | 4 | | | |
| d3 | | | | | 4 | | |
| d4 | | | | | | 4 | |
| d5 | | | | | | | 4 |
| System | Topic 1 ranking | CT-topic 1 | Topic 2 ranking | CT-topic 2 | CT-avg | Normalized CT-avg |
| System1 | d1, irrel, irrel, irrel, irrel | 1 | d1, d3, d4, d5, irrel | 16 | 8.5 | 0.596 |
| System2 | d2, irrel, irrel, irrel, irrel | 3 | d1, d3, d4, d5, irrel | 14 | 8.5 | 0.787 |
| Optimal | d1, d2, irrel, irrel, irrel | 4 | d1, d2, d3, d4, d5 | 17 | | |
○ What is the optimal metric value that a system can
achieve?
! How to get the upper bound for each search topic?
! How does it affect the evaluation conclusions?
○ Variance of different topics
○ Normalization
RESEARCH QUESTIONS
$$ Score_{norm} = \sum_{topic} \frac{raw\_score(topic, s) - lower\_bound(topic)}{upper\_bound(topic) - lower\_bound(topic)} $$
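Given per-topic upper and lower bounds, the normalization above is straightforward to apply; a small sketch, where the bound dictionaries are assumed inputs:

```python
def normalized_scores(raw_score, lower_bound, upper_bound):
    """Normalize each topic's raw metric score by that topic's own bounds.

    raw_score, lower_bound, upper_bound: dicts keyed by topic id.
    Returns the per-topic normalized scores and their mean across topics.
    """
    normed = {}
    for topic, score in raw_score.items():
        span = upper_bound[topic] - lower_bound[topic]
        normed[topic] = (score - lower_bound[topic]) / span if span > 0 else 0.0
    return normed, sum(normed.values()) / len(normed)
```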
○ Session-DCG (sDCG)
! Järvelin et al. "Discounted cumulated gain based evaluation of multiple-query IR
sessions." Advances in Information Retrieval (2008): 4-15.
○ Cube Test (CT)
! Luo et al. "The water filling model and the cube test: multi-dimensional evaluation for professional
search." CIKM, 2013.
○ Expected Utility (EU)
! Yang and Lad. "Modeling expected utility of multi-session information distillation." ICTIR 2009.
DYNAMIC SEARCH METRICS
$$ EU = \sum_{\pi} P(\pi) \sum_{(i,j)\in\pi} \Big( \sum_{a\in S_{i,j}} \gamma_a \, ND(a,i,j-1) - \beta \cdot Cost(i,j) \Big) $$

$$ CT = \frac{\sum_{i=1}^{n} \sum_{j=1}^{|D_i|} \sum_{a} \gamma_a \, rel(i,j) \, ND(a,i,j-1)}{\sum_{i=1}^{n} \sum_{j=1}^{|D_i|} Cost(i,j)} $$

$$ sDCG = \sum_{i=1}^{n} \sum_{j=1}^{|D_i|} \frac{rel(i,j)}{(1 + \log_{b} j)(1 + \log_{b_q} i)} $$
○ sDCG
○ Cube Test
○ Expected Utility
DECONSTRUCT THE METRICS
Components: Gain, Cost, Rank discount, Novelty discount
$$ sDCG = \sum_{i=1}^{n} \sum_{j=1}^{|D_i|} \frac{rel(i,j)}{(1 + \log_{b} j)(1 + \log_{b_q} i)} \quad \text{(rank-discounted gain; no cost or novelty term)} $$

$$ CT = \frac{\sum_{i=1}^{n} \sum_{j=1}^{|D_i|} \sum_{a} \gamma_a \, rel(i,j) \, ND(a,i,j-1)}{\sum_{i=1}^{n} \sum_{j=1}^{|D_i|} Cost(i,j)} \quad \text{(novelty-discounted gain per unit cost)} $$

$$ EU = \sum_{\pi} P(\pi) \sum_{(i,j)\in\pi} \Big( \sum_{a\in S_{i,j}} \gamma_a \, ND(a,i,j-1) - \beta \cdot Cost(i,j) \Big) \quad \text{(expected gain minus cost)} $$
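A sketch of sDCG and the Cube Test following the reconstructions above; relevance and cost come from simple lookup tables, and the novelty discount here is simplified (each subtopic's gain is counted only once) rather than the exact definition from the CIKM 2013 paper.

```python
import math

def sdcg(session, rel, b=2, bq=4):
    """session: one ranked list per query; rel: dict doc -> graded relevance."""
    total = 0.0
    for i, ranked in enumerate(session, start=1):            # query position i
        for j, doc in enumerate(ranked, start=1):            # rank position j
            total += rel.get(doc, 0) / ((1 + math.log(j, b)) * (1 + math.log(i, bq)))
    return total

def cube_test(session, subtopic_rel, gamma, cost_per_doc=1.0):
    """Novelty-discounted gain per unit cost (simplified sketch)."""
    gain, cost, seen = 0.0, 0.0, set()
    for ranked in session:
        for doc in ranked:
            cost += cost_per_doc
            for a, r in subtopic_rel.get(doc, {}).items():   # a: subtopic, r: relevance
                if a not in seen:                            # simplified novelty discount
                    gain += gamma.get(a, 1.0) * r
                    seen.add(a)
    return gain / cost if cost > 0 else 0.0
```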
BOUNDS ON DIFFERENT TOPICS
sDCG = discounted gain
BOUNDS ON DIFFERENT TOPICS
CT = discounted gain / cost
BOUNDS ON DIFFERENT TOPICS
EU = discounted gain − discounted cost
! The optimal values a metric produces for different topics differ greatly, and the difference should not be ignored.
○ Rearrangement Inequality
○ In IR, the Probability Ranking Principle [4]
! the overall effectiveness of an IR system is maximized by ranking the documents in descending order of their usefulness
OUR SOLUTION

$$ x_1 y_n + x_2 y_{n-1} + \dots + x_n y_1 \;\le\; x_{\sigma(1)} y_1 + x_{\sigma(2)} y_2 + \dots + x_{\sigma(n)} y_n \;\le\; x_1 y_1 + x_2 y_2 + \dots + x_n y_n $$
$$ \text{for } x_1 \le x_2 \le \dots \le x_n \text{ and } y_1 \le y_2 \le \dots \le y_n $$
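Following this argument, the per-topic upper bound of a gain-based session metric can be approximated by handing the metric the ranking that sorts all judged documents by usefulness in descending order (and the lower bound by the reverse order); a sketch, assuming a metric with the same call signature as the sdcg function sketched earlier:

```python
def metric_bounds(judged_rel, num_queries, docs_per_query, metric):
    """Approximate a topic's upper/lower bounds for a session metric by feeding it
    the best (descending usefulness) and worst (ascending) document orderings."""
    def as_session(order):
        return [order[i * docs_per_query:(i + 1) * docs_per_query]
                for i in range(num_queries)]
    best  = sorted(judged_rel, key=judged_rel.get, reverse=True)
    worst = list(reversed(best))
    return metric(as_session(best), judged_rel), metric(as_session(worst), judged_rel)
```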
NORMALIZATION EFFECT
sDCG = discounted gain
NORMALIZATION EFFECT
CT = discounted gain / cost
NORMALIZATION EFFECT
EU = discounted gain − β · discounted cost, with β = 0.01
! Using the bounds for normalization brings more fairness into the evaluation
Conclusion
• Our main contributions:
• Put the user into the models
• Created a bridge between information seeking / user behavior studies and machine learning
• Yielded a family of new generative retrieval models for complex, dynamic settings
• Able to explain the results
A Few Thoughts
• Information seeking is a Markov Decision Process, not a series of independent searches
• User actions that cost more effort, such as query changes, are stronger signals than clicks
• Search is also a learning process for the user, who evolves as well
• Users and search engines form a partnership to explore the information space
• They influence each other; it is a two-way communication
• Overly complex evaluation metrics might not be appropriate; the complexity should be modeled either in the retrieval model or in the metric, but not in both
Looking into the Future
• Dynamic IR models are good for modeling information seeking
• There is a lot of room to study user–search engine interaction in a generative way
• The thinking presented here could generate new methods not only in retrieval and evaluation, but also in related fields
• Exciting!!
Thank You!
• Email: huiyang@cs.georgetown.edu
• Group Page: InfoSense at http://infosense.cs.georgetown.edu/
• Dynamic IR Website: http://www.dynamic-ir-modeling.org/
• Book: Dynamic Information Retrieval Modeling
• TREC Dynamic Domain Track: http://trec-dd.org/