Polyrepresentation in a Quantum-inspired
Information Retrieval Framework
Ingo Frommholz
@iFromm
University of Bedfordshire
Luton, UK
QUARTZ Winter School
University of Padua
Table of Contents
1 The Principle of Polyrepresentation
2 A Model for Quantum-inspired Information Access
3 Polyrepresentation in a Quantum-inspired IA Model
The Principle of Polyrepresentation
Cognitive Information Concept in Information Seeking and
Retrieval
The Principle of Polyrepresentation
Book Store Example
Different Kinds of Polyrepresentation
Retrieval with Polyrepresentation
Polyrepresentative Clustering
Cognitive Communication System
The “Cognitive Freefall”
[...] satisfy two conditions simultaneously (Ingwersen 1992, p. 33): On the one hand information being something which is the result of a transformation of a generator’s knowledge structures (by intentionality, model of recipients’ states of knowledge, and in the form of signs) and on the other hand being something which, when perceived, affects and transforms the recipient’s state of knowledge.
[Figure 2.1: The cognitive communication system for Information Science, information seeking and IR. Labels: generator and recipient, each with a world model, a current cognitive-emotional state and a context/situation (A, B); the recipient’s problem space and state of uncertainty; generated object, signs, perceived object; transformation, interaction, cognitive free fall, interpretation, information processing stages; cognitive-emotional level and linguistic level of system. Revision of Ingwersen (1992, p. 33; 1996, p. 6), from Belkin (1978).]
[Ingwersen and Järvelin, 2005, p. 33]
The Principle of Polyrepresentation
[Ingwersen, 1994, Ingwersen, 1996, Ingwersen and Järvelin, 2005]
Multi-evidence
Employs the diversity of (cognitively) different actors’
pre-suppositions and interpretations of their situations and
objects over time
Diversity may also derive from the same actor, through representations of a
different functional nature (functionally different)
Tries to mitigate the cognitive freefall
Polyrepresentation
Hypothesis [Ingwersen and Järvelin, 2005, p. 208]
the more interpretations of different cognitive and functional
nature, based on an IS&R situation, that point to a set of
objects in so-called cognitive overlaps, and the more
intensely they do so, the higher the probability that such
objects are relevant (pertinent, useful) to a perceived work
task/interest to be solved, the information (need) situation at
hand, the topic required, or/and the influencing context of
that situation
Motivating Example: Book Store
“Good introduction to quantum mechanics”
IN Facets and Polyrepresentation
“Good introduction to quantum mechanics”
Relevance decision goes beyond topicality
Collections like Amazon, LibraryThing (LT) and the British Library
Rich pool of potentially useful information (metadata,
user-generated content)
Different views on documents, relevant for different aspects of the
information need (IN)
Combine the evidence (e.g. metadata and user-generated
content) to get a more accurate estimation of
relevance/usefulness
[Koolen, 2014] puts user-generated content into the index – it
worked!
Reviews and tags are complementary to each other and to
professional metadata
Different Kinds of Polyrepresentation
Document/information object polyrepresentation
Information need (IN) polyrepresentation
Algorithm polyrepresentation
Document Polyrepresentation
Different document features or properties
Examples
POLAR for annotation-based retrieval
[Frommholz and Fuhr, 2006]
?- document(D) & D[ quantum & mechanics & @A] &
A[ good & introduction ]
Polyrepresentation and CQQL for interactive multimedia IR
[Zellhöfer and Schmitt, 2011]
Features from a query-by-example document interpreted as
representations, translation into CQQL queries
Inter- and intra-document features in polyrepresentation
[Skov et al., 2008]
Cystic Fibrosis test collection: functionally different features: titles,
abstract, references (representing the author); Medical Subject
Headings (MeSH) (representing the indexer as a cognitively
different actor)
Overlaps generated by three or four representations have higher
precision than those generated by two representations
Information Need Polyrepresentation
Perceived work task, problem or information need
Query
Example [Kelly and Fu, 2007]:
What users already know, why they want to know it, TREC
title+description
Additional terms elicited from users used for query expansion
Algorithm Polyrepresentation
Different weighting schemes
Different relevance feedback and/or query modification algorithms
Example [Larsen et al., 2009]:
4 best performing TREC 5 ad hoc track models over 30 topics
Vogt and Cottrell concluded that not all fusions may outperform their individual components; only when almost equally well-performing retrieval models are fused may the fusion (in pairs) improve retrieval performance over the constituents. They regarded the TREC relevance pooling methods as a large-scale data fusion for IR. Ng and Kantor (2000) applied TREC 5 routing data, and Wu and McClean (2006) made predictions by means of overlap correlation measures. Ng and Kantor’s experiments investigated the predictive power of output dissimilarity and a pairwise measure of similarity of performance in symmetrical data fusion (p. 1177). Wu and McClean’s study demonstrated that versions of the same IR model have a very high overlap correlation and thus retrieve almost the same documents, a case to be avoided because nonrelevant documents then tend to be promoted in line with relevant ones. This view somehow contradicts the Lee’s findings, but his tests involved different retrieval models. In summary, there seems to be a consensus that all the fused IR models should perform equally well. In pairwise data fusions, the CombMNZ work quite well and are commonly used, with the CombSUM of retrieval scores performing nearly as well. When random combinations are tested, results are mixed concerning which technique to use. Normalization of retrieval scores should be applied, and if not feasible, the Borda solution where rank scores are used seems to perform well.

Most data fusion experiments in IR, including the ones presented here, are based on retrieval results from already performed searches in TREC. The analysis or prediction methodologies in the present study are associated with the principle of polyrepresentation. The difference lies in the approach to prediction. While the former methodologies attempt to make predictions based on automated and quantitative perspectives of the overlaps, our approach is from a qualitative and conceptual/algorithmic perspective. In a polyrepresentation sense, the more evidence deriving from qualitatively different sources that point to a document, the higher the probability that that document is (highly) relevant. Thus, the relative size of overlap between fused IR models has a quite different meaning in polyrepresentation.

In data fusion based on polyrepresentation, there are fundamentally two ways of combining retrieval systems. One way is the restricted overlap fusion. Following this method, only the inner (disjoint) overlap of the fused models provides the result set to be ranked. Figure 1 illustrates this experimental situation with four different retrieval models, and their triple and pairwise disjoint overlaps as well as their inner “central cognitive overlap” formed by all four models (Fuse4). As the sets are disjoint, each of the 11 overlapping sets constitutes separate data fusion results, and the performance of each can be studied. Note that in the restricted fusion, each single original retrieval model only includes the documents that are “leftover;” that is, they do not include any of the documents in overlaps with other retrieval models. For instance, the performance of the original retrieval model COR is thus measured on the documents found in COR oval (1) but not including the already retrieved documents in Fuse2 to Fuse4 areas (white area, Figure 1). Their performance [...]

FIG. 1. Illustration of polyrepresentation of four different retrieval models’ search results in the form of disjoint overlapping documents (variation of Ingwersen & Järvelin, 2005, p. 347; Lund, Schneider, & Ingwersen, 2006).
Journal of the American Society for Information Science and Technology, April 2009, p. 648. DOI: 10.1002/asi
Restricted fusions made of two, three, or four
cognitively/algorithmically very different retrieval models perform
significantly better than the individual models
Retrieval with Polyrepresentation
1 Get ranking for different representations
2 Find the cognitive overlap
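The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the book ids, the per-representation scores and the cut-off `k` are invented for the example, and the overlap is computed as a plain set intersection of the top-k results.

```python
from typing import Dict, List, Set


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Rank document ids by descending score (ties broken by id) and keep k."""
    return sorted(scores, key=lambda doc: (-scores[doc], doc))[:k]


def cognitive_overlap(rankings: Dict[str, List[str]]) -> Set[str]:
    """Documents retrieved by every representation (total cognitive overlap)."""
    result_sets = [set(ranking) for ranking in rankings.values()]
    return set.intersection(*result_sets) if result_sets else set()


def overlap_degree(rankings: Dict[str, List[str]]) -> Dict[str, int]:
    """Number of representations that retrieve each document."""
    degree: Dict[str, int] = {}
    for ranking in rankings.values():
        for doc in ranking:
            degree[doc] = degree.get(doc, 0) + 1
    return degree


# Hypothetical scores of four books under three representations.
scores = {
    "content": {"b1": 0.9, "b2": 0.8, "b3": 0.4, "b4": 0.1},
    "author": {"b1": 0.7, "b3": 0.6, "b2": 0.2, "b4": 0.1},
    "ratings": {"b4": 0.9, "b1": 0.8, "b2": 0.5, "b3": 0.3},
}

# Step 1: one ranking per representation.
rankings = {rep: top_k(rep_scores, 2) for rep, rep_scores in scores.items()}
# Step 2: the documents all representations agree on.
total_overlap = cognitive_overlap(rankings)
degree = overlap_degree(rankings)
```

Here only `b1` appears in all three top-2 lists, so it forms the total cognitive overlap; `degree` additionally exposes the partial overlaps, which the polyrepresentation hypothesis would rank below it.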
Polyrepresentation: Book Store Scenario
[Figure: representations of a book: content, author, ratings, comments]
Polyrepresentation and Clustering
Polyrepresentation creates partitions
Clustering partitions document sets too
Can clustering help in creating polyrepresentative partitions?
Polyrepresentation Cluster Hypothesis: “documents relevant to the same representations should appear in the same cluster” [Frommholz and Abbasi, 2014]
Polyrepresentation and Clustering
Mapping of clusters to polyrepresentation (using iSearch [Lykke et al., 2010])
Simulated user search strategy:
1 User investigates total cognitive overlap cluster
2 User jumps to different cluster based on preferences
3 The user simulation creates a ranked list of documents
Information Need-based Vector
Let $REP_{in}$ be the set of representations of an information need $in$ (search terms, work task, ideal answer, current info need, background knowledge)
Motivated by the Optimum Clustering Framework (OCF), which is based on the probability of relevance [Fuhr et al., 2011]
$\Pr(R|d,r_i)$ is computed for each document $d$ and $r_i \in REP_{in}$
$$\tau_{in}(d) = \begin{pmatrix} \Pr(R|d,r_1) \\ \vdots \\ \Pr(R|d,r_n) \end{pmatrix} \qquad (1)$$
Document-based Polyrepresentation Vector
$REP_d$ consists of the different representations $r_{d_i}$ of a document $d$ (title, abstract, body, bibliographic context, references)
$\Pr(R|r_{d_i},q)$ for $q$ (search terms in this case) is computed
$$\tau_{doc}(d) = \begin{pmatrix} \Pr(R|r_{d_1},q) \\ \vdots \\ \Pr(R|r_{d_n},q) \end{pmatrix} \qquad (2)$$
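As a rough illustration of vectors (1) and (2), the sketch below builds both for a toy document. The slides do not say how the probabilities of relevance are estimated, so `term_overlap` is a deliberately naive, hypothetical stand-in (share of query terms present); the document fields and need descriptions are also invented.

```python
from typing import Dict, List


def term_overlap(text: str, query: str) -> float:
    """Toy stand-in for a relevance probability: share of query terms in text."""
    query_terms = query.lower().split()
    if not query_terms:
        return 0.0
    words = set(text.lower().split())
    return sum(term in words for term in query_terms) / len(query_terms)


def tau_doc(doc: Dict[str, str], query: str, reps: List[str]) -> List[float]:
    """One component Pr(R | r_di, q) per document representation, as in (2)."""
    return [term_overlap(doc.get(rep, ""), query) for rep in reps]


def tau_in(doc_text: str, need: Dict[str, str], reps: List[str]) -> List[float]:
    """One component Pr(R | d, r_i) per information need representation, as in (1)."""
    return [term_overlap(doc_text, need.get(rep, "")) for rep in reps]


doc = {
    "title": "Quantum mechanics primer",
    "abstract": "A gentle introduction to quantum theory",
}
doc_vector = tau_doc(doc, "quantum mechanics", ["title", "abstract", "references"])

need = {"search terms": "quantum mechanics", "work task": "prepare physics lecture"}
in_vector = tau_in(
    "quantum mechanics primer for physics students",
    need,
    ["search terms", "work task"],
)
```

Documents with similar vectors are relevant to the same representations, which is what a clustering step would then group together.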
Some Findings (using iSearch)
Some statistically significant improvements over a BM25 baseline
(NDCG@30) using the ranking created by a simple simulated
user strategy when concatenating the IN and Document
representations [Abbasi and Frommholz, 2015b]
Statistically significant improvements (NDCG) when using
document and IN representations separately and assuming an
ideal (oracle-based) cluster ranking
[Abbasi and Frommholz, 2015a]
This shows us our idea is basically promising!
Finding the total cognitive overlap (TOC) using cluster ranking is
challenging [Frommholz and Abbasi, 2014]
Different interpretations of the TOC: The one with the highest
precision? The one with the highest pairwise precision? The one
where all representations get a high value?
The latter one could be identified more easily (MRR = 0.575
compared to around 0.3 for the others)
A Model for Quantum-inspired
Information Access
Quantum Information Access
Information Need Space
User Interaction and Feedback
Textual Representation
Evaluation
QIA Conclusion
Quantum Probabilities Introduction
Notation Wrap-Up
Hilbert space: vector space with an inner product
Dirac Notation:
|φ〉 is a ket (a vector φ)
〈φ| is a bra (a transposed vector φ^T; conjugate-transposed for complex vectors)
〈φ|ψ〉 ∈ C is a bra(c)ket (inner product)
|φ〉〈ψ| is a ketbra (a matrix)
A subspace S is represented by a projector (another matrix)
|φ〉〈φ| projector onto 1-dimensional subspace
Orthogonal projection of vector |φ〉 onto S: S |φ〉
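The notation can be made concrete with a small, dependency-free sketch. Vectors are plain Python lists of complex numbers; the helper names (`bra`, `inner`, `ketbra`, `apply`) are chosen for this illustration and do not come from any library.

```python
import math
from typing import List

Vector = List[complex]
Matrix = List[List[complex]]


def bra(ket: Vector) -> Vector:
    """<phi|: the conjugate transpose of |phi> (stored as a flat list)."""
    return [x.conjugate() for x in ket]


def inner(phi: Vector, psi: Vector) -> complex:
    """Bra(c)ket <phi|psi>."""
    return sum(a * b for a, b in zip(bra(phi), psi))


def ketbra(phi: Vector, psi: Vector) -> Matrix:
    """Ketbra |phi><psi| as a matrix."""
    return [[a * b for b in bra(psi)] for a in phi]


def apply(matrix: Matrix, ket: Vector) -> Vector:
    """Matrix-vector product, e.g. a projector applied to a ket."""
    return [sum(m * x for m, x in zip(row, ket)) for row in matrix]


phi = [1 + 0j, 0j]
psi = [1 / math.sqrt(2) + 0j, 1 / math.sqrt(2) + 0j]

# |psi><psi| projects onto the 1-dimensional subspace spanned by |psi>.
projector = ketbra(psi, psi)
projected = apply(projector, phi)
```

Applying the projector to `phi` gives a vector of squared length 0.5, which equals |〈ψ|φ〉|².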
Assumptions underlying QIA
IR system uncertain about user’s information need (IN)
System view of the user’s IN becomes more and more specific
through interaction
The IN may change from the user’s point of view
There is an IN Space, a Hilbert space
R
|quantum mechanics
INs as vectors: IN
vector |φ〉
Event “document d is
relevant” represented
by subspace R
Probability of
relevance: squared
length of projection
Pr(R|d,φ) =
||R |φ〉||2
Unit vector imposes
relevance distribution
on subspaces (events)
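A minimal sketch of this probability, assuming real-valued vectors and a document subspace given by an orthonormal basis (the three-dimensional space and the concrete vectors are made up for the example):

```python
import math
from typing import List

Vector = List[float]


def normalise(vector: Vector) -> Vector:
    """Scale a vector to unit length."""
    length = math.sqrt(sum(x * x for x in vector))
    return [x / length for x in vector]


def inner(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def prob_relevance(basis: List[Vector], phi: Vector) -> float:
    """||R|phi>||^2 for a subspace R spanned by an orthonormal basis."""
    return sum(inner(b, phi) ** 2 for b in basis)


# Document subspace R: spanned by the first two axes of a 3-dimensional space.
doc_subspace = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

phi = normalise([1.0, 1.0, 1.0])  # IN partly inside R
outside = [0.0, 0.0, 1.0]         # IN orthogonal to R

p_partial = prob_relevance(doc_subspace, phi)
p_outside = prob_relevance(doc_subspace, outside)
```

For the unit vector `phi` two of its three equal components lie in R, so the probability is 2/3; an IN vector orthogonal to R gets probability 0.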
System’s Uncertainty about User’s Intentions
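[Figure: document subspace R and candidate IN vectors with probabilities p1, ..., p5]
System uncertain about user’s IN
Expressed by an ensemble S of possible IN vectors (density ρ): $S = \{(p_1, |\varphi_1\rangle), \ldots, (p_n, |\varphi_n\rangle)\}$
Probability of relevance: $\Pr(R|d,S) = \sum_i p_i \cdot \Pr(R|d,\varphi_i)$, where $\Pr(R|d,\varphi_i) = \lVert R\,|\varphi_i\rangle \rVert^2$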
Dual representation using density operator and trace function:
$\rho = \sum_i p_i \cdot |\varphi_i\rangle\langle\varphi_i|$
$\Pr(R|d,S) = \mathrm{tr}(\rho R)$
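That the ensemble form and the trace form agree can be checked numerically. The sketch below uses real vectors, a two-dimensional space and invented ensemble weights; the matrix helpers are written out by hand to keep the example self-contained.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Ensemble = List[Tuple[float, Vector]]


def outer(a: Vector, b: Vector) -> Matrix:
    """|a><b| for real vectors."""
    return [[x * y for y in b] for x in a]


def weighted_sum(terms: List[Tuple[float, Matrix]], dim: int) -> Matrix:
    """Sum of weight * matrix over all terms."""
    total = [[0.0] * dim for _ in range(dim)]
    for weight, matrix in terms:
        for i in range(dim):
            for j in range(dim):
                total[i][j] += weight * matrix[i][j]
    return total


def trace_of_product(a: Matrix, b: Matrix) -> float:
    """tr(ab) without forming the product explicitly."""
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))


def prob_ensemble(ensemble: Ensemble, basis: List[Vector]) -> float:
    """sum_i p_i * ||R|phi_i>||^2 with R given by an orthonormal basis."""
    return sum(
        p * sum(sum(x * y for x, y in zip(b, phi)) ** 2 for b in basis)
        for p, phi in ensemble
    )


s = 1 / math.sqrt(2)
ensemble = [(0.5, [1.0, 0.0]), (0.3, [0.0, 1.0]), (0.2, [s, s])]
basis = [[1.0, 0.0]]  # document subspace R

rho = weighted_sum([(p, outer(phi, phi)) for p, phi in ensemble], 2)
projector = weighted_sum([(1.0, outer(b, b)) for b in basis], 2)

via_ensemble = prob_ensemble(ensemble, basis)
via_trace = trace_of_product(rho, projector)
```

Both routes give 0.5·1 + 0.3·0 + 0.2·0.5 = 0.6 for this ensemble.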
User Interaction and Feedback
[Figure: ensemble of IN vectors |ϕi〉 projected onto the document subspace R∗]
Outcome of feedback: query and query reformulation, (click on) relevant document, ...
Expressed as subspace
Project IN vectors onto document subspace
Document now gets probability 1
System’s uncertainty decreases
Also reflects changes in information needs
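One way to realise this update on an ensemble is sketched below. The renormalisation step (each surviving vector is rescaled to unit length and its weight is multiplied by its squared projection length) is an assumption of this sketch, in the style of a standard quantum state update; the slides only state that the IN vectors are projected. Vectors and weights are invented.

```python
import math
from typing import List, Tuple

Vector = List[float]
Ensemble = List[Tuple[float, Vector]]


def inner(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def project(basis: List[Vector], phi: Vector) -> Vector:
    """Orthogonal projection of phi onto the span of an orthonormal basis."""
    result = [0.0] * len(phi)
    for b in basis:
        coefficient = inner(b, phi)
        result = [r + coefficient * x for r, x in zip(result, b)]
    return result


def prob_ensemble(basis: List[Vector], ensemble: Ensemble) -> float:
    return sum(p * inner(project(basis, phi), phi) for p, phi in ensemble)


def feedback_update(ensemble: Ensemble, basis: List[Vector]) -> Ensemble:
    """Project every IN vector onto the feedback subspace and renormalise."""
    updated = []
    for p, phi in ensemble:
        projected = project(basis, phi)
        squared_length = inner(projected, projected)
        if squared_length > 1e-12:  # vectors orthogonal to the subspace drop out
            unit = [x / math.sqrt(squared_length) for x in projected]
            updated.append((p * squared_length, unit))
    total = sum(p for p, _ in updated)
    if total == 0.0:
        raise ValueError("feedback subspace is orthogonal to all IN vectors")
    return [(p / total, phi) for p, phi in updated]


s = 1 / math.sqrt(2)
ensemble = [(0.5, [1.0, 0.0, 0.0]), (0.3, [0.0, 1.0, 0.0]), (0.2, [s, 0.0, s])]
clicked_doc = [[1.0, 0.0, 0.0]]  # subspace R* of the clicked document

before = prob_ensemble(clicked_doc, ensemble)
after_ensemble = feedback_update(ensemble, clicked_doc)
after = prob_ensemble(clicked_doc, after_ensemble)
```

Before the update the clicked document has probability 0.6; afterwards it has probability 1, and the ensemble has shrunk from three candidate IN vectors to two.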
Textual Representation
[Figure: IN space / documents, with term vectors |crash〉, |car〉, |jupiter〉 and fragment vectors |jupiter crash〉, |car crash〉, |ϕ〉]
IN space based on term space
IN vectors made of document fragments
Weighting scheme (e.g., tf, tf-idf, ...)
Document is relevant to all INs found in its fragments
Document subspace R spanned by IN vectors
No length normalisation necessary
Single Query Term
Take all fragment vectors (IN vectors) containing term t
This makes up the ensemble $S_t$
Mixture
Mixture of all combinations of term fragments
Document must satisfy at least one term fragment
The more term fragments are contained, the more relevant a document is
$S^{(M)} = \sum_{i=1}^{n} w_i S_{t_i}$, where $w_i$ is the term weight
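The mixture can be sketched as a weighted union of the per-term ensembles. Everything concrete here is an assumption made for illustration: the term space has the axes (car, crash, jupiter), fragments within one term ensemble are weighted uniformly, and the term weights are 0.4 and 0.6.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Ensemble = List[Tuple[float, Vector]]


def normalise(vector: Vector) -> Vector:
    length = math.sqrt(sum(x * x for x in vector))
    return [x / length for x in vector]


def inner(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def prob_ensemble(basis: List[Vector], ensemble: Ensemble) -> float:
    """sum_i p_i * ||R|phi_i>||^2 with R given by an orthonormal basis."""
    return sum(p * sum(inner(b, phi) ** 2 for b in basis) for p, phi in ensemble)


def mixture(term_ensembles: Dict[str, Ensemble], weights: Dict[str, float]) -> Ensemble:
    """S^(M): weighted mixture of the single-term ensembles S_t."""
    total = sum(weights[term] for term in term_ensembles)
    mixed: Ensemble = []
    for term, ensemble in term_ensembles.items():
        for p, phi in ensemble:
            mixed.append((weights[term] / total * p, phi))
    return mixed


# Fragment vectors in a term space with axes (car, crash, jupiter).
car = [1.0, 0.0, 0.0]
car_crash = normalise([1.0, 1.0, 0.0])
jupiter_crash = normalise([0.0, 1.0, 1.0])

term_ensembles = {
    "car": [(0.5, car), (0.5, car_crash)],
    "crash": [(0.5, car_crash), (0.5, jupiter_crash)],
}
query_ensemble = mixture(term_ensembles, {"car": 0.4, "crash": 0.6})

# A document whose only fragment is "car crash".
doc_subspace = [car_crash]
score = prob_ensemble(doc_subspace, query_ensemble)
```

For the query "car crash" this document scores 0.4·0.75 + 0.6·0.625 = 0.675: it fully satisfies the shared fragment and partially satisfies the others.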
Mixture of Superposition
Superpose all combinations (e.g. $\frac{1}{\sqrt{2}}(|\varphi\rangle + |\psi\rangle)$)
At least one query term fragment superposition must be contained
The more fragment superpositions are contained, the more relevant a document is
Indication that it works well with multi-term concepts (e.g. “digital libraries”)
Tensor product
Assumption: each term covers an IN aspect
Tensor product of all fragment vectors: a combination of IN aspects
Document must satisfy all IN aspects
The more tensor products are satisfied, the more relevant the document is
Evaluation
Evaluation with several TREC collections
[Piwowarski et al., 2010]
Tensor representation of query could compete with BM25
We don’t lose retrieval effectiveness in an ad hoc scenario
Expressive framework
QIA Extensions
What can it bring to IR?
Queries in sessions [Frommholz et al., 2011]
Use geometry and projections to determine type of and handle
follow-up query (generalisation, information need drift,
specialisation)
Different forms of interactions (query reformulations, relevance
judgements)
Summarisation [Piwowarski et al., 2012]
QIA interpretation of LSA-based methods
Query algebra for the QIA framework [Caputo et al., 2011]
Diversity and novelty
Polyrepresentation (next)
QIA Conclusion
User’s IN as ensemble of vectors
Documents as subspaces
User interaction and feedback
Term space, query construction
Can compete in an ad hoc scenario
Polyrepresentation in a
Quantum-inspired IA Model
Modelling Polyrepresentation
Combining the evidence
Relationships between representations
The Principle of Polyrepresentation
Book Store Scenario (representations: content, author, ratings, comments)
1 Get ranking for different representations
2 Find the cognitive overlap
Based on different document representations, but also different
representations of user’s information need
Hypothesis: cognitive overlap contains highly relevant documents
(experiments support this)
How can we apply this in QIA?
Model single representations in a vector space (by example)
Authors
Ratings
Topical/term space: already discussed
Combine the representations
Example: Author Space
Each author is a dimension
Non-orthogonal vectors: dependencies
Angle between vectors reflects the degree of dependency (90° = orthogonal = perpendicular = independent)
Example: Jones and Smith (somehow) related, Smith and Miller not
Example: Author Space
Document by Smith and Miller
User seeks documents by Jones
Document retrieved due to relationship between Jones and Smith
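A small numeric version of this example, assuming an inner product of 0.6 between |jones〉 and |smith〉 (an arbitrary value picked for illustration) and orthogonal |smith〉 and |miller〉:

```python
import math
from typing import List

Vector = List[float]


def inner(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def prob_relevance(basis: List[Vector], phi: Vector) -> float:
    """||R|phi>||^2 for a subspace R spanned by an orthonormal basis."""
    return sum(inner(b, phi) ** 2 for b in basis)


# Axes: (smith, miller, remaining "jones-only" direction).
smith = [1.0, 0.0, 0.0]
miller = [0.0, 1.0, 0.0]

overlap = 0.6  # assumed <jones|smith>; 0 would mean independent authors
jones = [overlap, 0.0, math.sqrt(1 - overlap ** 2)]
unrelated_jones = [0.0, 0.0, 1.0]

# Document by Smith and Miller: subspace spanned by both author vectors.
doc_subspace = [smith, miller]

p_related = prob_relevance(doc_subspace, jones)
p_unrelated = prob_relevance(doc_subspace, unrelated_jones)
```

With the assumed overlap the document gets probability 0.36 for a user seeking Jones; if Jones were orthogonal to both authors it would get 0.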
Author/Topic Space
Combined author/topic space
Authors may be related only w.r.t. a specific topic
Ex.: A user interested in Smith’s documents about logic may be interested in Jones’ documents about logic, but not in Jones’ documents about interactive IR
Author represented as a subspace
[Figure: vectors |SmithLogics〉, |JonesLogics〉, |JonesIIR〉]
Rating Space
Example: rating scale good/bad/average, each is a dimension
“Average” rated book represented by 2-dimensional subspace
User wants books which are rated good ⇒ not relevant (|good〉 orthogonal)
[Figure: rating space with axes |good〉, |average〉, |bad〉 and document subspace R_rating]
Combining the Evidence
Total Cognitive Overlap and Tensors
Modelled different representations in vector space
Probabilities w.r.t. single representations
How do we express user’s IN w.r.t. all representations?
How do we get a cognitive overlap?
Polyrepresentation space as tensor product (“⊗”) of single spaces
Probability that document is in total cognitive overlap:
$\Pr_{polyrep} = \Pr_{content} \cdot \Pr_{ratings} \cdot \Pr_{author} \cdot \Pr_{comments}$
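The product form follows from the tensor construction when both the IN vector and the document subspace are tensor products of their single-representation counterparts. The sketch below checks this for two representations only; the concrete vectors (giving single probabilities 0.8 and 0.5) are invented.

```python
import math
from typing import List

Vector = List[float]


def kron(a: Vector, b: Vector) -> Vector:
    """Tensor (Kronecker) product of two vectors."""
    return [x * y for x in a for y in b]


def inner(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def prob_relevance(basis: List[Vector], phi: Vector) -> float:
    return sum(inner(b, phi) ** 2 for b in basis)


def tensor_basis(basis_a: List[Vector], basis_b: List[Vector]) -> List[Vector]:
    """Orthonormal basis of the tensor product of two subspaces."""
    return [kron(a, b) for a in basis_a for b in basis_b]


# Single-representation spaces (2-dimensional each, for brevity).
content_subspace = [[1.0, 0.0]]
author_subspace = [[1.0, 0.0]]
phi_content = [math.sqrt(0.8), math.sqrt(0.2)]
phi_author = [1 / math.sqrt(2), 1 / math.sqrt(2)]

p_content = prob_relevance(content_subspace, phi_content)
p_author = prob_relevance(author_subspace, phi_author)

# Polyrepresentation space: tensor product of the single spaces.
p_polyrep = prob_relevance(
    tensor_basis(content_subspace, author_subspace),
    kron(phi_content, phi_author),
)
```

The joint probability equals 0.8 · 0.5 = 0.4. The same computation shows that a single factor of 0 forces the whole product to 0.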
Wishlist
Documents not relevant in one representation should not get a
value of 0
Ignore selected representations
Relative importance of representations to user (mixing and
weighting)
“Don’t care” dimension
Introduction of a “don’t care” dimension
Part of each document subspace: each document “satisfies” the don’t care “need”
Example: Document by Smith, user doesn’t care about authors with probability α
[Figure: author space with |smith〉, |jones〉 and the don’t care vector |∗〉; document subspace R_author; weight α on |∗〉]
α = 1 means the representation is ignored altogether
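A sketch of the “don’t care” mechanism for the author representation. It assumes that the user otherwise seeks books by Jones, that |smith〉, |jones〉 and |∗〉 are mutually orthogonal, and that the uncertainty is modelled as a two-element ensemble with weights 1 − α and α; these modelling choices are made for the illustration only.

```python
from typing import List, Tuple

Vector = List[float]
Ensemble = List[Tuple[float, Vector]]

# Author space axes: (smith, jones, don't care).
SMITH = [1.0, 0.0, 0.0]
JONES = [0.0, 1.0, 0.0]
DONT_CARE = [0.0, 0.0, 1.0]


def inner(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def prob_ensemble(basis: List[Vector], ensemble: Ensemble) -> float:
    return sum(p * sum(inner(b, phi) ** 2 for b in basis) for p, phi in ensemble)


def author_need(alpha: float) -> Ensemble:
    """User seeks Jones, but does not care about authors with probability alpha."""
    return [(1.0 - alpha, JONES), (alpha, DONT_CARE)]


# The don't care dimension is part of every document subspace.
doc_by_smith = [SMITH, DONT_CARE]
doc_by_jones = [JONES, DONT_CARE]

p_smith = prob_ensemble(doc_by_smith, author_need(0.3))
p_jones = prob_ensemble(doc_by_jones, author_need(0.3))
p_ignored = prob_ensemble(doc_by_smith, author_need(1.0))
```

The book by Smith no longer gets 0 but α = 0.3, the book by Jones still gets 1, and with α = 1 every document gets 1, so the author representation has no influence on the product.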
Example
What the system assumes about the user’s IN:
Seeks books either by Jones or by Smith
Looks either for good books or doesn’t care about ratings
Assume a document d by Smith which is rated “bad”
Polyrepresentation space: 9-dimensional (3× 3)
IN Vectors in Polyrepresentation Space
What do they look like and what do they mean?
Polyrepresentation space reflects all 4 possible combinations of INs w.r.t. single representations:
Smith/good: |smith〉 ⊗ |good〉
Smith/don’t care: |smith〉 ⊗ |∗〉
Jones/good: |jones〉 ⊗ |good〉
Jones/don’t care: |jones〉 ⊗ |∗〉
Documents in Polyrepresentation Space
Represented as tensor product of single document subspaces
Here: 4-dimensional subspace (2× 2)
Determining the Retrieval Weight
Why the system retrieves the
bad book by Smith
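The retrieval weight for this example can be traced numerically. The sketch makes several assumptions that the slides leave open: the four IN combinations form an ensemble with equal weights of 0.25, the rating space has the axes (good, bad, don’t care), and |jones〉 overlaps with |smith〉 with an inner product of 0.6.

```python
import math
from typing import Dict, List

Vector = List[float]


def kron(a: Vector, b: Vector) -> Vector:
    """Tensor (Kronecker) product of two vectors."""
    return [x * y for x in a for y in b]


def inner(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def prob_relevance(basis: List[Vector], phi: Vector) -> float:
    return sum(inner(b, phi) ** 2 for b in basis)


JONES_SMITH_OVERLAP = 0.6  # assumed relationship between the two authors

# Author space axes: (smith, jones-only direction, don't care).
SMITH = [1.0, 0.0, 0.0]
JONES = [JONES_SMITH_OVERLAP, math.sqrt(1 - JONES_SMITH_OVERLAP ** 2), 0.0]
ANY_AUTHOR = [0.0, 0.0, 1.0]

# Rating space axes: (good, bad, don't care).
GOOD = [1.0, 0.0, 0.0]
BAD = [0.0, 1.0, 0.0]
ANY_RATING = [0.0, 0.0, 1.0]

# The four IN vectors in the 9-dimensional polyrepresentation space.
in_vectors: Dict[str, Vector] = {
    "smith/good": kron(SMITH, GOOD),
    "smith/don't care": kron(SMITH, ANY_RATING),
    "jones/good": kron(JONES, GOOD),
    "jones/don't care": kron(JONES, ANY_RATING),
}

# Document d by Smith, rated "bad": 4-dimensional subspace (2 x 2).
doc_basis = [kron(a, r) for a in (SMITH, ANY_AUTHOR) for r in (BAD, ANY_RATING)]

contributions = {name: prob_relevance(doc_basis, vec) for name, vec in in_vectors.items()}
retrieval_weight = sum(0.25 * p for p in contributions.values())
```

Only the two “don’t care about ratings” cases contribute: Smith/don’t care with 1 and Jones/don’t care with 0.36, so the bad book by Smith receives a weight of 0.25 · 1.36 = 0.34 under these assumptions.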
Something left on the Wishlist
Relationships between Representations
System observes (interaction/feedback) user preferences:
If book is by Smith, it has to be rated good
If book is by Jones, don’t care about the ratings
System evolves to new state in polyrepresentation space (2 combinations not allowed any more)
[Figure: the combinations Smith/don’t care and Jones/good are crossed out]
What does that mean?
Only two assumed IN cases left:
1 Smith/good
2 Jones/don’t care
Cannot be expressed as combination of single representations
Bad book by Smith only retrieved due to relationship to Jones!
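Both observations can be illustrated with a short sketch. As assumptions of this example, the two remaining cases form an ensemble with weights of 0.5 each, the relationship between Jones and Smith is an inner product passed in as `overlap`, and “cannot be expressed as combination of single representations” is checked in the simple sense that the joint weights do not factorise into per-representation weights.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def kron(a: Vector, b: Vector) -> Vector:
    return [x * y for x in a for y in b]


def inner(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def retrieval_weight(overlap: float) -> float:
    """Weight of a bad book by Smith given the two remaining IN cases."""
    smith, any_author = [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]
    jones = [overlap, math.sqrt(1 - overlap ** 2), 0.0]
    good, bad, any_rating = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
    ensemble = [(0.5, kron(smith, good)), (0.5, kron(jones, any_rating))]
    doc_basis = [kron(a, r) for a in (smith, any_author) for r in (bad, any_rating)]
    return sum(p * sum(inner(b, phi) ** 2 for b in doc_basis) for p, phi in ensemble)


def factorises(joint: Dict[Tuple[str, str], float]) -> bool:
    """True if the joint weights equal the product of their marginals."""
    authors = {a for a, _ in joint}
    ratings = {r for _, r in joint}
    p_author = {a: sum(p for (x, _), p in joint.items() if x == a) for a in authors}
    p_rating = {r: sum(p for (_, y), p in joint.items() if y == r) for r in ratings}
    return all(
        abs(joint.get((a, r), 0.0) - p_author[a] * p_rating[r]) < 1e-12
        for a in authors
        for r in ratings
    )


before = {(a, r): 0.25 for a in ("smith", "jones") for r in ("good", "*")}
after = {("smith", "good"): 0.5, ("jones", "*"): 0.5}

weight_related = retrieval_weight(0.6)
weight_unrelated = retrieval_weight(0.0)
```

With an assumed overlap of 0.6 the bad book by Smith still receives 0.5 · 0.36 = 0.18, entirely through the Jones/don’t care case; with no relationship between the authors the weight drops to 0. The weights before the observation factorise, those after it do not.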
Polyrepresentation Conclusion
Different non-topical representations as subspaces
Polyrepresentation space as tensor space to calculate cognitive
overlap
“Don’t care” dimension for weighting of representations
Non-separable states
Bibliography
Abbasi, M. K. and Frommholz, I. (2015a).
Cluster-based polyrepresentation as science modelling approach
for information retrieval.
Scientometrics, 102(3):2301–2322.
Abbasi, M. K. and Frommholz, I. (2015b).
Polyrepresentative Clustering: A Study of Simulated User
Strategies and Representations.
In Mayr, P., Frommholz, I., and Mutschke, P., editors, Proc. of the
2nd Workshop on Bibliometric-enhanced Information Retrieval
(BIR2015), pages 47–54, Vienna, Austria. CEUR-WS.org.
Caputo, A., Piwowarski, B., and Lalmas, M. (2011).
A Query Algebra for Quantum Information Retrieval.
In Proceedings of the 2nd Italian Information Retrieval Workshop
2011.
Frommholz, I. and Abbasi, M. K. (2014).
On Clustering and Polyrepresentation.
In de Rijke, M., Kenter, T., de Vries, A. P., Zhai, C., de Jong, F.,
Radinsky, K., and Hofmann, K., editors, Proceedings of the
European Conference on Information Retrieval (ECIR 2014),
volume 1, pages 618–623. Springer.
Frommholz, I. and Fuhr, N. (2006).
Probabilistic, object-oriented logics for annotation-based retrieval
in digital libraries.
In Nelson, M., Marshall, C., and Marchionini, G., editors, Proc. of
the 6th ACM/IEEE Joint Conference on Digital Libraries (JCDL
2006), pages 55–64, New York. ACM.
Frommholz, I., Piwowarski, B., Lalmas, M., and van Rijsbergen, K.
(2011).
Processing Queries in Session in a Quantum-Inspired IR
Framework.
In Clough, P., Foley, C., Gurrin, C., Jones, G. J. F., Kraaij, W., Lee,
H., and Mudoch, V., editors, Proceedings ECIR 2011, volume
6611 of Lecture Notes in Computer Science, pages 751–754.
Springer.
Fuhr, N., Lechtenfeld, M., Stein, B., and Gollub, T. (2011).
The Optimum Clustering Framework: Implementing the Cluster
Hypothesis.
Information Retrieval, 14.
Ingwersen, P. (1994).
Polyrepresentation of Information Needs and Semantic Entities,
Elements of a Cognitive Theory for Information Retrieval
Interaction.
In Croft, B. W. and van Rijsbergen, C. J., editors, Proceedings of
the Seventeenth Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval, pages
101–111, London, et al. Springer-Verlag.
Ingwersen, P. (1996).
Cognitive perspectives of information retrieval interaction:
Elements of a cognitive IR theory.
The Journal of Documentation, 52:3–50.
Ingwersen, P. and Järvelin, K. (2005).
The turn: integration of information seeking and retrieval in
context.
Springer-Verlag New York, Inc., Secaucus, NJ, USA.
Kelly, D. and Fu, X. (2007).
Eliciting better information need descriptions from users of
information search systems.
Information Processing & Management, 43(1):30–46.
Koolen, M. (2014).
User reviews in the search index? That’ll never work!
In Proceedings ECIR 2014, pages 323–334.
Larsen, B., Ingwersen, P., and Lund, B. (2009).
Data fusion according to the principle of polyrepresentation.
J. Am. Soc. Inf. Sci. Technol., 60:646–654.
Lykke, M., Larsen, B., Lund, H., and Ingwersen, P. (2010).
Developing a Test Collection for the Evaluation of Integrated
Search.
In Proceedings ECIR 2010, pages 627–630.
Piwowarski, B., Amini, M.-R., and Lalmas, M. (2012).
On using a Quantum Physics formalism for Multi-document
Summarisation.
Journal of the American Society for Information Science and
Technology (JASIST).
Piwowarski, B., Frommholz, I., Lalmas, M., and Van Rijsbergen, K.
(2010).
What can Quantum Theory Bring to Information Retrieval?
In Proc. 19th International Conference on Information and
Knowledge Management, pages 59–68. Springer.
Skov, M., Larsen, B., and Ingwersen, P. (2008).
Inter and intra-document contexts applied in polyrepresentation
for best match IR.
Information Processing & Management, 44(5):1673–1683.
Zellhöfer, D. and Schmitt, I. (2011).
A user interaction model based on the principle of
polyrepresentation.
In Proceedings of the 4th workshop on Workshop for Ph.D.
students in information & knowledge management - PIKM ’11,
pages 3–10, New York, New York, USA. ACM Press.
Li802 Applying Information Seeking Models to Student Research
Chao Wrote Some trends that influence human resource are, Leade.docx
Ad

More from Ingo Frommholz (8)

PDF
Evaluating Search Performance
PDF
Evaluating Search Performance
PDF
Interactive Information Retrieval inspired by Quantum Theory
PDF
Quantum Mechanics meet Information Search and Retrieval – The QUARTZ Project
PDF
Modelling User Interaction utilising Information Foraging Theory (and a bit o...
PDF
Polyrepresentation in Complex (Book) Search Tasks - How can we use what the o...
PDF
Information Retrieval Models Part I
PDF
Quantum Probabilities and Quantum-inspired Information Retrieval
Evaluating Search Performance
Evaluating Search Performance
Interactive Information Retrieval inspired by Quantum Theory
Quantum Mechanics meet Information Search and Retrieval – The QUARTZ Project
Modelling User Interaction utilising Information Foraging Theory (and a bit o...
Polyrepresentation in Complex (Book) Search Tasks - How can we use what the o...
Information Retrieval Models Part I
Quantum Probabilities and Quantum-inspired Information Retrieval

Recently uploaded (20)

DOCX
Q1_LE_Mathematics 8_Lesson 5_Week 5.docx
PPTX
ANEMIA WITH LEUKOPENIA MDS 07_25.pptx htggtftgt fredrctvg
PPTX
DRUG THERAPY FOR SHOCK gjjjgfhhhhh.pptx.
PPT
protein biochemistry.ppt for university classes
PPTX
7. General Toxicologyfor clinical phrmacy.pptx
PDF
Phytochemical Investigation of Miliusa longipes.pdf
PDF
CAPERS-LRD-z9:AGas-enshroudedLittleRedDotHostingaBroad-lineActive GalacticNuc...
PPTX
GEN. BIO 1 - CELL TYPES & CELL MODIFICATIONS
PDF
IFIT3 RNA-binding activity primores influenza A viruz infection and translati...
PPTX
G5Q1W8 PPT SCIENCE.pptx 2025-2026 GRADE 5
PPT
Chemical bonding and molecular structure
PDF
. Radiology Case Scenariosssssssssssssss
PPTX
2. Earth - The Living Planet earth and life
PDF
HPLC-PPT.docx high performance liquid chromatography
PPTX
INTRODUCTION TO EVS | Concept of sustainability
PPTX
ECG_Course_Presentation د.محمد صقران ppt
PPTX
neck nodes and dissection types and lymph nodes levels
PDF
VARICELLA VACCINATION: A POTENTIAL STRATEGY FOR PREVENTING MULTIPLE SCLEROSIS
PPTX
Derivatives of integument scales, beaks, horns,.pptx
PPTX
TOTAL hIP ARTHROPLASTY Presentation.pptx
Q1_LE_Mathematics 8_Lesson 5_Week 5.docx
ANEMIA WITH LEUKOPENIA MDS 07_25.pptx htggtftgt fredrctvg
DRUG THERAPY FOR SHOCK gjjjgfhhhhh.pptx.
protein biochemistry.ppt for university classes
7. General Toxicologyfor clinical phrmacy.pptx
Phytochemical Investigation of Miliusa longipes.pdf
CAPERS-LRD-z9:AGas-enshroudedLittleRedDotHostingaBroad-lineActive GalacticNuc...
GEN. BIO 1 - CELL TYPES & CELL MODIFICATIONS
IFIT3 RNA-binding activity primores influenza A viruz infection and translati...
G5Q1W8 PPT SCIENCE.pptx 2025-2026 GRADE 5
Chemical bonding and molecular structure
. Radiology Case Scenariosssssssssssssss
2. Earth - The Living Planet earth and life
HPLC-PPT.docx high performance liquid chromatography
INTRODUCTION TO EVS | Concept of sustainability
ECG_Course_Presentation د.محمد صقران ppt
neck nodes and dissection types and lymph nodes levels
VARICELLA VACCINATION: A POTENTIAL STRATEGY FOR PREVENTING MULTIPLE SCLEROSIS
Derivatives of integument scales, beaks, horns,.pptx
TOTAL hIP ARTHROPLASTY Presentation.pptx

Polyrepresentation in a Quantum-inspired Information Retrieval Framework

  • 1. Polyrepresentation in a Quantum-inspired Information Retrieval Framework Ingo Frommholz @iFromm University of Bedfordshire Luton, UK QUARTZ Winter School University of Padua
  • 2. Table of Contents 1 The Principle of Polyrepresentation 2 A Model for Quantum-inspired Information Access 3 Polyrepresentation in a Quantum-inspired IA Model 2 / 70
  • 3. The Principle of Polyrepresentation Cognitive Information Concept in Information Seeking and Retrieval The Principle of Polyrepresentation Book Store Example Different Kinds of Polyrepresentation Retrieval with Polyrepresentation Polyrepresentative Clustering
  • 4. The Principle of Polyrepresentation Cognitive Information Concept in Information Seeking and Retrieval Cognitive Communication System The “Cognitive Freefall” Quoted excerpt: “[…] satisfy two conditions simultaneously (Ingwersen 1992, p. 33): On the one hand information being something which is the result of a transformation of a generator’s knowledge structures (by intentionality, model of recipients’ states of knowledge, and in the form of signs) and on the other hand being something which, when perceived, affects and transforms the recipient’s state of knowledge.” [Figure: Fig. 2.1. The cognitive communication system for Information Science, information seeking and IR. Revision of Ingwersen (1992, p.33; 1996, p.6), from Belkin (1978). Shown: generator and recipient, each with a world model, linked by signs and information across the linguistic and cognitive-emotional levels of the system, with the cognitive free fall in between.] [Ingwersen and Järvelin, 2005, p. 33] 4 / 70
  • 5. The Principle of Polyrepresentation The Principle of Polyrepresentation The Principle of Polyrepresentation [Ingwersen, 1994, Ingwersen, 1996, Ingwersen and Järvelin, 2005] Multi-evidence Employs the diversity of (cognitively) different actors’ pre-suppositions and interpretations of their situations and objects over time Diversity may be derived from same actor but being of different functional nature (functionally different) Tries to mitigate the cognitive freefall 5 / 70
  • 6. The Principle of Polyrepresentation The Principle of Polyrepresentation Polyrepresentation Hypothesis [Ingwersen and Järvelin, 2005, p. 208] the more interpretations of different cognitive and functional nature, based on an IS&R situation, that point to a set of objects in so-called cognitive overlaps, and the more intensely they do so, the higher the probability that such objects are relevant (pertinent, useful) to a perceived work task/interest to be solved, the information (need) situation at hand, the topic required, or/and the influencing context of that situation 6 / 70
  • 7. The Principle of Polyrepresentation Book Store Example Motivating Example: Book Store “Good introduction to quantum mechanics” 7 / 70
  • 8. The Principle of Polyrepresentation Book Store Example Motivating Example: Book Store “Good introduction to quantum mechanics” 8 / 70
  • 9. The Principle of Polyrepresentation Book Store Example IN Facets and Polyrepresentation “Good introduction to quantum mechanics” Relevance decision goes beyond topicality Collections like Amazon/LT/BritishLibrary Rich pool of potentially useful information (metadata, user-generated content) Different views on documents, relevant for different aspects of the information need (IN) Combine the evidence (e.g. metadata and user-generated content) to get a more accurate estimation of relevance/usefulness [Koolen, 2014] puts user-generated content into the index – it worked! Reviews and tags complementary to each other and to professional metadata 9 / 70
  • 10. The Principle of Polyrepresentation Different Kinds of Polyrepresentation Different Kinds of Polyrepresentation Document/information object polyrepresentation Information need (IN) polyrepresentation Algorithm polyrepresentation 10 / 70
  • 11. The Principle of Polyrepresentation Different Kinds of Polyrepresentation Document Polyrepresentation Different document features or properties Examples POLAR for annotation-based retrieval [Frommholz and Fuhr, 2006] ?- document(D) & D[ quantum & mechanics & @A] & A[ good & introduction ] Polyrepresentation and CQQL for interactive multimedia IR [Zellhöfer and Schmitt, 2011] Features from a query-by-example document interpreted as representations, translation into CQQL queries Inter- and intra-document features in polyrepresentation [Skov et al., 2008] Cystic Fibrosis test collection: functionally different features: titles, abstract, references (representing the author); Medical Subject Headings (MeSH) (representing the indexer as a cognitively different actor) Overlaps generated by three or four representations have higher precision than those generated from two overlaps 11 / 70
  • 12. The Principle of Polyrepresentation Different Kinds of Polyrepresentation Information Need Polyrepresentation Perceived work task, problem or information need Query Example [Kelly and Fu, 2007]: What users already know, why they want to know it, TREC title+description Additional terms elicited from users used for query expansion 12 / 70
  • 13. The Principle of Polyrepresentation Different Kinds of Polyrepresentation Algorithm Polyrepresentation Different weighting schemes Different relevance feedback and/or query modification algorithms Example [Larsen et al., 2009]: 4 best performing TREC 5 ad hoc track models over 30 topics Excerpt shown on the slide: Vogt and Cottrell concluded that not all fusions may outperform their individual components; only when almost equally well-performing retrieval models are fused may the fusion (in pairs) improve retrieval performance over the constituents. They regarded the TREC relevance pooling methods as a large-scale data fusion for IR. Ng and Kantor (2000) applied TREC 5 routing data, and Wu and McClean (2006) made predictions by means of overlap correlation measures. Ng and Kantor’s experiments investigated the predictive power of output dissimilarity and a pairwise measure of similarity of performance in symmetrical data fusion (p. 1177). Wu and McClean’s study demonstrated that versions of the same IR model have a very high overlap correlation and thus retrieve almost the same documents, a case to be avoided because nonrelevant documents then tend to be promoted in line with relevant ones. This view somehow contradicts the Lee’s findings, but his tests involved different retrieval models. In summary, there seems to be a consensus that all the fused IR models should perform equally well. In pairwise data fusions, the CombMNZ work quite well and are commonly used, with the CombSUM of retrieval scores performing nearly as well. When random combinations are tested, results are mixed concerning which technique to use. Normalization of retrieval scores should be applied, and if not feasible, the Borda solution where rank scores are used seems to perform well. Most data fusion experiments in IR, including the ones presented here, are based on retrieval results from already performed searches in TREC.
The analysis or prediction methodologies in the present study are associated with the principle of polyrepresentation. The difference lies in the approach to prediction. While the former methodologies attempt to make predictions based on automated and quantitative perspectives of the overlaps, our approach is from a qualitative and conceptual/algorithmic perspective. In a polyrepresentation sense, the more evidence deriving from qualitatively different sources that point to a document, the higher the probability that that document is (highly) relevant. Thus, the relative size of overlap between fused IR models has a quite different meaning in polyrepresentation. In data fusion based on polyrepresentation, there are fundamentally two ways of combining retrieval systems. One way is the restricted overlap fusion. Following this method, only the inner (disjoint) overlap of the fused models provides the result set to be ranked. Figure 1 illustrates this experimental situation with four different retrieval models, and their triple and pairwise disjoint overlaps as well as their inner “central cognitive overlap” formed by all four models (Fuse4). As the sets are disjoint, each of the 11 overlapping sets constitutes separate data fusion results, and the performance of each can be studied. Note that in the restricted fusion, each single original retrieval model only includes the documents that are “leftover;” that is, they do not include any of the documents in overlaps with other retrieval models. For instance, the performance of the original retrieval model COR is thus measured on the documents found in COR oval (1) but not including the already retrieved documents in Fuse2 to Fuse4 areas (white area, Figure 1). Their performance […]
[Figure: FIG. 1. Illustration of polyrepresentation of four different retrieval models’ search results in the form of disjoint overlapping documents (variation of Ingwersen & Järvelin, 2005, p. 347; Lund, Schneider, & Ingwersen, 2006).] Source of excerpt: p. 648, JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, April 2009, DOI: 10.1002/asi Restricted fusions made of two, three, or four cognitively/algorithmically very different retrieval models perform significantly better than the individual models 13 / 70
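The fusion methods named in the excerpt can be sketched as follows. This is a minimal illustration using the commonly cited definitions (CombSUM adds up normalised scores, CombMNZ multiplies that sum by the number of systems that retrieved the document), plus the restricted overlap as a plain set intersection. The function names and the toy scores are assumptions for illustration and are not taken from [Larsen et al., 2009].

```python
from collections import defaultdict


def min_max_normalise(run):
    """Scale one system's scores to [0, 1]; a constant run maps to 1.0."""
    lo, hi = min(run.values()), max(run.values())
    if hi == lo:
        return {doc: 1.0 for doc in run}
    return {doc: (score - lo) / (hi - lo) for doc, score in run.items()}


def comb_sum(runs):
    """CombSUM: sum of normalised scores over all systems."""
    fused = defaultdict(float)
    for run in runs:
        for doc, score in min_max_normalise(run).items():
            fused[doc] += score
    return dict(fused)


def comb_mnz(runs):
    """CombMNZ: CombSUM times the number of systems retrieving the document."""
    sums = comb_sum(runs)
    hits = defaultdict(int)
    for run in runs:
        for doc in run:
            hits[doc] += 1
    return {doc: sums[doc] * hits[doc] for doc in sums}


def restricted_overlap(runs):
    """Documents retrieved by every fused system (the inner overlap)."""
    return set.intersection(*(set(run) for run in runs))


# Hypothetical retrieval scores of two systems.
runs = [
    {"d1": 3.0, "d2": 2.0, "d3": 1.0},
    {"d2": 4.0, "d3": 2.0, "d4": 0.0},
]
print(comb_sum(runs))
print(comb_mnz(runs))
print(restricted_overlap(runs))
```

In the restricted variant only the documents returned by `restricted_overlap` would be ranked, for example by their CombSUM score.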
  • 14. The Principle of Polyrepresentation Retrieval with Polyrepresentation Retrieval with Polyrepresentation 1 Get ranking for different representations 14 / 70
  • 15. The Principle of Polyrepresentation Retrieval with Polyrepresentation Retrieval with Polyrepresentation 1 Get ranking for different representations 15 / 70
  • 16. The Principle of Polyrepresentation Retrieval with Polyrepresentation Retrieval with Polyrepresentation 1 Get ranking for different representations 16 / 70
  • 17. The Principle of Polyrepresentation Retrieval with Polyrepresentation Retrieval with Polyrepresentation 1 Get ranking for different representations 2 Find the cognitive overlap 17 / 70
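The two steps (get a ranking per representation, then find the cognitive overlap) could look roughly like this. It is a sketch under assumptions: each representation contributes its top-k documents, and documents are grouped by the exact set of representations that retrieved them. The representation names follow the book store scenario; the function name, the cut-off `k` and the rankings are hypothetical.

```python
from collections import defaultdict


def cognitive_overlaps(rankings, k=3):
    """Group documents by the set of representations whose top-k contains them.

    Returns a dict mapping a frozenset of representation names to the
    documents found in exactly those representations (disjoint regions).
    """
    seen_in = defaultdict(set)
    for rep, ranked_docs in rankings.items():
        for doc in ranked_docs[:k]:
            seen_in[doc].add(rep)
    regions = defaultdict(set)
    for doc, reps in seen_in.items():
        regions[frozenset(reps)].add(doc)
    return dict(regions)


# Hypothetical rankings, one per representation.
rankings = {
    "content": ["b1", "b2", "b3", "b4"],
    "author": ["b2", "b1", "b5", "b3"],
    "ratings": ["b1", "b6", "b2", "b7"],
    "comments": ["b2", "b1", "b8", "b3"],
}

regions = cognitive_overlaps(rankings, k=3)
# Total cognitive overlap: documents retrieved by all representations.
total_overlap = regions.get(frozenset(rankings), set())
print(sorted(total_overlap))
```

Keeping the regions disjoint mirrors the overlap diagrams: every document belongs to exactly one region, so the regions can be inspected or ranked separately.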
  • 18. The Principle of Polyrepresentation Retrieval with Polyrepresentation Polyrepresentation Book Store Scenario Content Author Ratings Comments 18 / 70
  • 19. The Principle of Polyrepresentation Polyrepresentative Clustering Polyrepresentation and Clustering Polyrepresentation creates partitions Clustering partitions document sets too Can clustering help in creating polyrepresentative partitions? Polyrepresentation Cluster Hypothesis: “documents relevant to the same representations should appear in the same clus- ter” [Frommholz and Abbasi, 2014] 19 / 70
  • 20. The Principle of Polyrepresentation Polyrepresentative Clustering Polyrepresentation and Clustering Mapping of clusters to polyrepresentation (using iSearch [Lykke et al., 2010]) Simulated user – search strategy: 1 User investigates total cognitive overlap cluster 2 User jumps to different cluster based on preferences 3 The user simulation creates a ranked list of documents 20 / 70
  • 21. The Principle of Polyrepresentation Polyrepresentative Clustering Information Need-based Vector Let $REP_{in}$ be the set of representations of an information need $in$ (search terms, work task, ideal answer, current info need, background knowledge) Motivated by the Optimum Clustering Framework (OCF), which is based on the probability of relevance [Fuhr et al., 2011] $\Pr(R|d,r_i)$ is computed for each document $d$ and $r_i \in REP_{in}$: $\tau_{in}(d) = \big(\Pr(R|d,r_1), \ldots, \Pr(R|d,r_n)\big)^{T}$ (1) 21 / 70
  • 22. The Principle of Polyrepresentation Polyrepresentative Clustering Document-based Polyrepresentation Vector $REP_d$ consists of the different representations $r_{d_i}$ of a document $d$ (title, abstract, body, bibliographic context, references) $\Pr(R|r_{d_i},q)$ for $q$ (search terms in this case) is computed: $\tau_{doc}(d) = \big(\Pr(R|r_{d_1},q), \ldots, \Pr(R|r_{d_n},q)\big)^{T}$ (2) 22 / 70
  • 23. The Principle of Polyrepresentation Polyrepresentative Clustering Some Findings (using iSearch) Some statistically significant improvements over a BM25 baseline (NDCG@30) using the ranking created by a simple simulated user strategy when concatenating the IN and Document representations [Abbasi and Frommholz, 2015b] Statistically significant improvements (NDCG) when using document and IN representations separately and assuming an ideal (oracle-based) cluster ranking [Abbasi and Frommholz, 2015a] This shows us our idea is basically promising! Finding the total cognitive overlap (TOC) using cluster ranking is challenging [Frommholz and Abbasi, 2014] Different interpretations of the TOC: The one with the highest precision? The one with the highest pairwise precision? The one where all representations get a high value? The latter one could be identified more easily (MRR = 0.575 compared to around 0.3 for the others) 23 / 70
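A small sketch of the IN-based and document-based vectors, their concatenation, and the "all representations get a high value" reading of the total cognitive overlap. The probability values, the threshold and the helper names are hypothetical; how the probabilities of relevance are estimated, and the clustering step itself, are left out.

```python
import numpy as np

IN_REPS = ("search terms", "work task", "ideal answer",
           "current info need", "background knowledge")
DOC_REPS = ("title", "abstract", "body",
            "bibliographic context", "references")


def tau(probabilities, representations):
    """Vector of probabilities of relevance, one entry per representation."""
    return np.array([probabilities[r] for r in representations], dtype=float)


def concatenated(tau_in, tau_doc):
    """Concatenate the IN-based and the document-based vector."""
    return np.concatenate([tau_in, tau_doc])


def all_high(vector, threshold=0.5):
    """True if every representation assigns at least `threshold` (assumed cut-off)."""
    return bool(np.min(vector) >= threshold)


# Hypothetical probabilities for two documents.
d1 = concatenated(
    tau(dict(zip(IN_REPS, [0.8, 0.7, 0.9, 0.6, 0.7])), IN_REPS),
    tau(dict(zip(DOC_REPS, [0.9, 0.8, 0.6, 0.7, 0.6])), DOC_REPS),
)
d2 = concatenated(
    tau(dict(zip(IN_REPS, [0.9, 0.1, 0.8, 0.7, 0.2])), IN_REPS),
    tau(dict(zip(DOC_REPS, [0.9, 0.9, 0.3, 0.7, 0.6])), DOC_REPS),
)
print(all_high(d1), all_high(d2))
```

Vectors of this kind could then be fed to any clustering algorithm; which one the cited experiments used is not stated on the slide.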
  • 24. A Model for Quantum-inspired Information Access Quantum Information Access Information Need Space User Interaction and Feedback Textual Representation Evaluation QIA Conclusion
  • 25. Quantum Probabilities Introduction Notation Wrap-Up Hilbert space: vector space with an inner product Dirac Notation: |φ〉 is a ket (a vector φ) 〈φ| is a bra (a transposed vector φT ) 〈φ|ψ〉 ∈ C is a bra(c)ket (inner product) |φ〉〈ψ| is a ketbra (a matrix) A subspace S is represented by a projector (another matrix) |φ〉〈φ| projector onto 1-dimensional subspace Orthogonal projection of vector |φ〉 onto S: S |φ〉 25 / 70
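The notation can be mirrored directly with NumPy arrays. The sketch below is only an illustration in a real three-dimensional space; the concrete vectors are arbitrary examples.

```python
import numpy as np

phi = np.array([1.0, 0.0, 0.0])                  # ket |phi>
psi = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)   # ket |psi>

# Bra(c)ket <phi|psi>: np.vdot conjugates its first argument.
braket = np.vdot(phi, psi)

# Ketbra |phi><psi|: a matrix.
ketbra = np.outer(phi, psi.conj())

# Projector onto the one-dimensional subspace spanned by |phi>.
P_phi = np.outer(phi, phi.conj())

# Projector onto a two-dimensional subspace S, built from an orthonormal basis.
basis_S = np.eye(3)[:, :2]
P_S = basis_S @ basis_S.conj().T

# Orthogonal projection of a vector onto S.
x = np.array([1.0, 2.0, 3.0])
projected = P_S @ x
print(braket, projected)
```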
  • 26. Quantum Probabilities Introduction Quantum Information Access Assumptions underlying QIA IR system uncertain about user’s information need (IN) System view of the user’s IN becomes more and more specific through interaction The IN may change from the user’s point of view There is an IN Space, a Hilbert space 26 / 70
  • 27. Quantum Probabilities Introduction Information Need Space Information Need Space [Figure: vector |quantum mechanics〉] INs as vectors: IN vector |φ〉 27 / 70
  • 28. Quantum Probabilities Introduction Information Need Space Information Need Space [Figure: subspace R and vector |quantum mechanics〉] INs as vectors: IN vector |φ〉 Event “document d is relevant” represented by subspace R 28 / 70
  • 29. Quantum Probabilities Introduction Information Need Space Information Need Space [Figure: subspace R and vector |quantum mechanics〉] INs as vectors: IN vector |φ〉 Event “document d is relevant” represented by subspace R Probability of relevance: squared length of projection $\Pr(R|d,\varphi) = \lVert R\,|\varphi\rangle \rVert^{2}$ Unit vector imposes relevance distribution on subspaces (events) 29 / 70
  • 30. Quantum Probabilities Introduction Information Need Space System’s Uncertainty about User’s Intentions [Figure: subspace R and IN vectors with weights p1, …, p5] System uncertain about user’s IN Expressed by an ensemble S of possible IN vectors (density ρ): $S = \{(p_1,|\varphi_1\rangle),\ldots,(p_n,|\varphi_n\rangle)\}$ Probability of relevance: $\Pr(R|d,S) = \sum_i p_i \cdot \Pr(R|d,\varphi_i)$, where $\Pr(R|d,\varphi_i) = \lVert R\,|\varphi_i\rangle \rVert^{2}$ 30 / 70
  • 31. Quantum Probabilities Introduction Information Need Space System’s Uncertainty about User’s Intentions [Figure: subspace R and IN vectors with weights p1, …, p5] Dual representation using density operator and trace function $\rho = \sum_i p_i \cdot |\varphi_i\rangle\langle\varphi_i|$ $\Pr(R|d,S) = \mathrm{tr}(\rho R)$ 31 / 70
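That the weighted sum over the ensemble and the trace formula give the same probability can be checked numerically. The sketch below uses a hypothetical three-dimensional IN space, a hypothetical relevance subspace and made-up ensemble weights; the helper `projector` assumes linearly independent spanning vectors.

```python
import numpy as np


def projector(vectors):
    """Orthogonal projector onto the span of linearly independent vectors."""
    Q, _ = np.linalg.qr(np.column_stack(vectors))
    return Q @ Q.conj().T


# Hypothetical relevance subspace R of a document d.
R = projector([np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])])

# Ensemble S of (probability, unit IN vector) pairs; values are made up.
ensemble = [
    (0.5, np.array([1.0, 0.0, 0.0])),
    (0.3, np.array([0.0, 0.0, 1.0])),
    (0.2, np.array([1.0, 0.0, 1.0]) / np.sqrt(2.0)),
]

# Pr(R|d,S) as a weighted sum of squared projection lengths.
pr_sum = sum(p * np.linalg.norm(R @ v) ** 2 for p, v in ensemble)

# Dual form: density operator rho and trace.
rho = sum(p * np.outer(v, v.conj()) for p, v in ensemble)
pr_trace = np.trace(rho @ R).real

print(pr_sum, pr_trace)
```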
  • 32. Quantum Probabilities Introduction User Interaction and Feedback User Interaction and Feedback [Figure: subspace R∗ and IN vectors |φ1〉, |φ2〉, |φ3〉, |φ5〉] Outcome of feedback: Query and query reformulation, (click on) relevant document, ... Expressed as subspace Project IN vectors onto document subspace Document now gets probability 1 System’s uncertainty decreases Also reflects changes in information needs 32 / 70
  • 33. Quantum Probabilities Introduction User Interaction and Feedback User Interaction and Feedback [Figure: subspace R∗ and IN vectors |φ1〉, |φ2〉, |φ3〉, |φ4〉, |φ5〉] Outcome of feedback: Query and query reformulation, (click on) relevant document, ... Expressed as subspace Project IN vectors onto document subspace Document now gets probability 1 System’s uncertainty decreases Also reflects changes in information needs 33 / 70
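The feedback step can be sketched as a projection of every IN vector in the ensemble onto the observed subspace, followed by renormalisation. Rescaling the ensemble weights by the squared projection length is an assumption of this sketch (it corresponds to the usual quantum state update), as are the function names and the toy ensemble.

```python
import numpy as np


def pr_relevance(ensemble, P):
    """Weighted sum of squared projection lengths."""
    return float(sum(p * np.linalg.norm(P @ v) ** 2 for p, v in ensemble))


def update_ensemble(ensemble, P, eps=1e-12):
    """Project each IN vector onto the subspace given by projector P.

    Vectors orthogonal to the subspace are dropped; the remaining weights
    are rescaled by the squared projection length and renormalised.
    """
    updated = []
    for p, v in ensemble:
        w = P @ v
        norm_sq = float(np.vdot(w, w).real)
        if norm_sq > eps:
            updated.append((p * norm_sq, w / np.sqrt(norm_sq)))
    total = sum(p for p, _ in updated)
    if total == 0.0:
        raise ValueError("feedback is incompatible with every IN vector")
    return [(p / total, v) for p, v in updated]


# Hypothetical subspace R* of the document the user marked as relevant.
R_star = np.diag([1.0, 1.0, 0.0])
ensemble = [
    (0.5, np.array([1.0, 0.0, 0.0])),
    (0.3, np.array([0.0, 0.0, 1.0])),
    (0.2, np.array([1.0, 0.0, 1.0]) / np.sqrt(2.0)),
]

before = pr_relevance(ensemble, R_star)
after_feedback = update_ensemble(ensemble, R_star)
after = pr_relevance(after_feedback, R_star)
print(before, after, len(after_feedback))
```

After the update the judged document has probability 1, and the ensemble contains fewer distinct directions, which is one way to read "the system's uncertainty decreases".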
  • 34. Quantum Probabilities Introduction Textual Representation Textual Representation IN Space / Documents [Figure: term vectors |crash〉, |car〉, |jupiter〉; fragment vectors |jupiter crash〉, |car crash〉; IN vector |φ〉] IN space based on term space IN vectors made of document fragments Weighting scheme (e.g., tf, tf-idf,...) Document is relevant to all INs found in its fragments Document subspace R spanned by IN vectors No length normalisation necessary 34 / 70
  • 35. Quantum Probabilities Introduction Textual Representation Single Query Term Take all fragment vectors (IN vectors) containing term $t$ This makes up ensemble $S_t$ 35 / 70
  • 36. Quantum Probabilities Introduction Textual Representation Mixture Mixture of all combinations of term fragments Document must at least satisfy one term fragment The more term fragments are contained, the more relevant a document is $S^{(M)} = \sum_{i=1}^{n} w_i S_{t_i}$, where $w_i$ is the term weight 36 / 70
  • 37. Quantum Probabilities Introduction Textual Representation Mixture of Superposition Superpose all combinations (e.g. $\frac{1}{\sqrt{2}}(|\varphi\rangle + |\psi\rangle)$) At least one query term fragment superposition must be contained The more fragment superpositions are contained, the more relevant a document is Indication that it works well with multi-term concepts (e.g. “digital libraries”) 37 / 70
  • 38. Quantum Probabilities Introduction Textual Representation Tensor product Assumption: each term covers an IN aspect Tensor product of all fragment vectors: combination of IN aspects Document must satisfy all IN aspects The more tensor products are satisfied, the more relevant the document is 38 / 70
  • 39. Quantum Probabilities Introduction Evaluation Evaluation Evaluation with several TREC collections [Piwowarski et al., 2010] Tensor representation of query could compete with BM25 We don’t lose retrieval effectiveness in an ad hoc scenario Expressive framework 39 / 70
  • 40. Quantum Probabilities Introduction QIA Conclusion QIA Extensions What can it bring to IR? Queries in sessions [Frommholz et al., 2011] Use geometry and projections to determine type of and handle follow-up query (generalisation, information need drift, specialisation) Different forms of interactions (query reformulations, relevance judgements) Summarisation [Piwowarski et al., 2012] QIA interpretation of LSA-based methods Query algebra for the QIA framework [Caputo et al., 2011] Diversity and novelty Polyrepresentation (next) 40 / 70
  • 41. Quantum Probabilities Introduction QIA Conclusion QIA Conclusion User’s IN as ensemble of vectors Documents as subspaces User interaction and feedback Term space, query construction Can compete in an ad hoc scenario 41 / 70
  • 42. Polyrepresentation in a Quantum-inspired IA Model Modelling Polyrepresentation Combining the evidence Relationships between representations
  • 43. Polyrepresentation in a Quantum-inspired IA Model Modelling Polyrepresentation The Principle of Polyrepresentation Book Store Scenario Content Author Ratings Comments 1 Get ranking for different representations 2 Find the cognitive overlap Based on different document representations, but also different representations of user’s information need Hypothesis: cognitive overlap contains highly relevant documents (experiments support this) 43 / 70
  • 44. Polyrepresentation in a Quantum-inspired IA Model Modelling Polyrepresentation How can we apply this in QIA? Model single representations in a vector space (by example) Authors Ratings Topical/term space: already discussed Combine the representations 44 / 70
  • 45. Polyrepresentation in a Quantum-inspired IA Model Modelling Polyrepresentation Example: Author Space Each author is a dimension Non-orthogonal vectors: dependencies Angle between vectors reflects the degree of dependency (90° = orthogonal = upright = independent) Example: Jones and Smith (somehow) related, Smith and Miller not 45 / 70
  • 46. Polyrepresentation in a Quantum-inspired IA Model Modelling Polyrepresentation Example: Author Space Document by Smith and Miller User seeks documents by Jones Document retrieved due to relationship between Jones and Smith 46 / 70
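The author space example can be reproduced with three vectors. The sketch assumes a three-dimensional real space in which Smith and Miller are orthogonal and the vector for Jones has an overlap of 0.6 with Smith; the value 0.6 and the extra basis direction are illustrative assumptions.

```python
import numpy as np

smith = np.array([1.0, 0.0, 0.0])
miller = np.array([0.0, 1.0, 0.0])       # orthogonal to Smith: independent
jones = np.array([0.6, 0.0, 0.8])        # assumed overlap <smith|jones> = 0.6
unrelated = np.array([0.0, 0.0, 1.0])    # an author unrelated to both

# Document by Smith and Miller: subspace spanned by both author vectors.
Q, _ = np.linalg.qr(np.column_stack([smith, miller]))
R_doc = Q @ Q.T

# The user seeks documents by Jones: squared length of the projection.
pr_jones = float(np.linalg.norm(R_doc @ jones) ** 2)
pr_unrelated = float(np.linalg.norm(R_doc @ unrelated) ** 2)
print(pr_jones, pr_unrelated)
```

The document gets a non-zero probability for the Jones query only because the Jones vector is not orthogonal to the Smith vector; an unrelated author yields probability 0.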
  • 47. Polyrepresentation in a Quantum-inspired IA Model Modelling Polyrepresentation Author/Topic Space Combined author/topic space Authors may be related only w.r.t. a specific topic Ex.: A user interested in Smith’s documents about logics may be interested in Jones’ documents about logics, but not in Jones’ documents about interactive IR Author represented as a subspace [Figure: vectors |Smith, Logics〉, |Jones, Logics〉, |Jones, IIR〉] 47 / 70
  • 48. Polyrepresentation in a Quantum-inspired IA Model Modelling Polyrepresentation Rating Space Example: rating scale good/bad/average – each is a dimension “Average” rated book represented by 2-dimensional subspace User wants books which are rated good ⇒ not relevant (|good〉 orthogonal) [Figure: subspace $R_{rating}$ and basis vectors |good〉, |average〉, |bad〉] 48 / 70
  • 49. Polyrepresentation in a Quantum-inspired IA Model Combining the evidence Combining the Evidence Total Cognitive Overlap and Tensors Content Author Ratings Comments Modelled different representations in vector space Probabilities w.r.t. single representations How do we express user’s IN w.r.t. all representations? How do we get a cognitive overlap? 49 / 70
  • 50. Polyrepresentation in a Quantum-inspired IA Model Combining the evidence Combining the Evidence Total Cognitive Overlap and Tensors Content Author Ratings Comments Polyrepresentation space as tensor product (“⊗”) of single spaces Probability that document is in total cognitive overlap: $\Pr_{polyrep} = \Pr_{content} \cdot \Pr_{ratings} \cdot \Pr_{author} \cdot \Pr_{comments}$ 50 / 70
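Why the tensor product yields a product of probabilities can be checked numerically: for a product IN vector and a product document subspace, the squared projection length in the tensor space factorises. The sketch uses only two representations with hypothetical vectors and subspaces; `np.kron` serves as the tensor product.

```python
import numpy as np


def pr(P, v):
    """Squared length of the projection of v onto the subspace of projector P."""
    return float(np.linalg.norm(P @ v) ** 2)


# Hypothetical content space (2-dimensional) and ratings space (3-dimensional).
R_content = np.diag([1.0, 0.0])
phi_content = np.array([0.8, 0.6])

R_ratings = np.diag([1.0, 1.0, 0.0])
phi_ratings = np.array([0.6, 0.0, 0.8])

# Polyrepresentation space: tensor product of the single spaces.
R_poly = np.kron(R_content, R_ratings)
phi_poly = np.kron(phi_content, phi_ratings)

pr_poly = pr(R_poly, phi_poly)
pr_product = pr(R_content, phi_content) * pr(R_ratings, phi_ratings)
print(pr_poly, pr_product)
```

The product form also shows the limitation addressed next in the deck: a probability of 0 in a single representation forces the overall probability to 0.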
  • 51. Polyrepresentation in a Quantum-inspired IA Model Combining the evidence Wishlist Content Author Ratings Comments Documents not relevant in one representation should not get a value of 0 Ignore selected representations Relative importance of representations to user (mixing and weighting) 51 / 70
  • 52. Polyrepresentation in a Quantum-inspired IA Model Combining the evidence “Don’t care” dimension Introduction of a “don’t care” dimension Part of each document subspace: each document “satisfies” the don’t care “need” Example: Document by Smith, user doesn’t care about authors with probability α [Figure: vectors |smith〉, |jones〉, |∗〉, weight α and subspace $R_{author}$] α = 1 means the representation is ignored altogether 52 / 70
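One possible reading of the “don’t care” dimension, sketched under assumptions: the author space gets an extra basis vector |∗〉 that lies in every document subspace, and the system models the user as an ensemble that wants a specific author with probability 1 − α and does not care with probability α. Whether the deck uses an ensemble or a single superposed vector here is not recoverable from the transcript, so the ensemble is an assumption, as are all names below.

```python
import numpy as np

# Orthonormal basis of the author space: |smith>, |jones>, |*> (don't care).
smith, jones, dont_care = np.eye(3)


def doc_subspace(author_vec):
    """Projector onto span{author, don't care} (orthonormal vectors assumed)."""
    return np.outer(author_vec, author_vec) + np.outer(dont_care, dont_care)


def pr_author(alpha, wanted, R):
    """User wants `wanted` with probability 1 - alpha, else does not care."""
    ensemble = [(1.0 - alpha, wanted), (alpha, dont_care)]
    return float(sum(p * np.linalg.norm(R @ v) ** 2 for p, v in ensemble))


R_smith_doc = doc_subspace(smith)

print(pr_author(0.3, jones, R_smith_doc))   # only the don't-care part matches
print(pr_author(0.3, smith, R_smith_doc))   # author matches as well
print(pr_author(1.0, jones, R_smith_doc))   # representation is ignored
```

With this construction a document that does not match the wanted author still gets probability α instead of 0, and α = 1 gives every document probability 1 in this representation.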
  • 53. Polyrepresentation in a Quantum-inspired IA Model Combining the evidence Example What the system assumes about the user’s IN: Seeks books either by Jones or by Smith Looks either for good books or doesn’t care about ratings Assume a document d by Smith which is rated “bad” Polyrepresentation space: 9-dimensional (3× 3) 53 / 70
  • 54. Polyrepresentation in a Quantum-inspired IA Model Combining the evidence IN Vectors in Polyrepresentation Space What do they look like and what do they mean? Polyrepresentation space Reflects all 4 possible combinations of INs w.r.t. single representations: Smith/good: |smith〉⊗ |good〉 Smith/don’t care: |smith〉⊗ |∗〉 Jones/good: |jones〉⊗ |good〉 Jones/don’t care: |jones〉⊗ |∗〉 54 / 70
  • 55. Polyrepresentation in a Quantum-inspired IA Model Combining the evidence Documents in Polyrepresentation Space Represented as tensor product of single document subspaces Here: 4-dimensional subspace (2× 2) 55 / 70
  • 56. Polyrepresentation in a Quantum-inspired IA Model Combining the evidence Determining the Retrieval Weight Why the system retrieves the bad book by Smith 56 / 70
  • 57. Polyrepresentation in a Quantum-inspired IA Model Combining the evidence Determining the Retrieval Weight Why the system retrieves the bad book by Smith 57 / 70
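The running example can be worked through numerically. The sketch assumes orthonormal bases {|smith〉, |jones〉, |∗〉} for authors and {|good〉, |bad〉, |∗〉} for ratings, and equal weights of 1/4 for the four IN vectors; the weights are not given in the transcript and are purely illustrative.

```python
import numpy as np

# Orthonormal bases; "*" is the don't-care dimension of each space.
smith, jones, any_author = np.eye(3)
good, bad, any_rating = np.eye(3)


def span_projector(*vectors):
    """Projector onto the span of orthonormal vectors."""
    return sum(np.outer(v, v) for v in vectors)


# Document d by Smith, rated "bad": 4-dimensional subspace (2 x 2) of the
# 9-dimensional polyrepresentation space (3 x 3).
R_doc = np.kron(span_projector(smith, any_author),
                span_projector(bad, any_rating))

in_vectors = {
    "smith/good": np.kron(smith, good),
    "smith/don't care": np.kron(smith, any_rating),
    "jones/good": np.kron(jones, good),
    "jones/don't care": np.kron(jones, any_rating),
}

contributions = {
    name: float(np.linalg.norm(R_doc @ v) ** 2) for name, v in in_vectors.items()
}
pr_doc = sum(0.25 * c for c in contributions.values())  # assumed equal weights

print(contributions)
print(pr_doc)
```

Under these assumptions the bad book by Smith receives a non-zero weight solely through the Smith/don’t care case.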
  • 58. Polyrepresentation in a Quantum-inspired IA Model Relationships between representations Something left on the Wishlist Relationships between Representations System observes (interaction/feedback) user preferences: If book is by Smith, it has to be rated good If book is by Jones, don’t care about the ratings 58 / 70
  • 59. Polyrepresentation in a Quantum-inspired IA Model Relationships between representations Something left on the Wishlist Relationships between Representations System observes (interaction/feedback) user preferences: If book is by Smith, it has to be rated good If book is by Jones, don’t care about the ratings System evolves to new state in polyrepresentation space (2 combinations not allowed any more) [Figure: the two excluded combinations are crossed out] 59 / 70
  • 60. Polyrepresentation in a Quantum-inspired IA Model Relationships between representations What does that mean? Only two assumed IN cases left: 1 Smith/good 2 Jones/don’t care Cannot be expressed as combination of single representations 60 / 70
  • 61. Polyrepresentation in a Quantum-inspired IA Model Relationships between representations What does that mean? Only two assumed IN cases left: 1 Smith/good 2 Jones/don’t care Cannot be expressed as combination of single representations Bad book by Smith only retrieved due to relationship to Jones! 61 / 70
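A numerical sketch of the state after feedback, under assumptions: the two remaining IN cases get weight 1/2 each, and the relationship between Jones and Smith is modelled by a Jones vector with overlap 0.6 to Smith. Both values are illustrative. The check at the end compares the density operator with the tensor product of its two reduced (single-representation) operators.

```python
import numpy as np

smith = np.array([1.0, 0.0, 0.0])
any_author = np.array([0.0, 0.0, 1.0])
jones_related = np.array([0.6, 0.8, 0.0])     # assumed overlap with Smith
jones_unrelated = np.array([0.0, 1.0, 0.0])   # orthogonal to Smith
good, bad, any_rating = np.eye(3)

# Document by Smith, rated "bad".
R_doc = np.kron(np.outer(smith, smith) + np.outer(any_author, any_author),
                np.outer(bad, bad) + np.outer(any_rating, any_rating))


def ensemble_after_feedback(jones):
    """Only Smith/good and Jones/don't care remain (assumed weights 1/2)."""
    return [(0.5, np.kron(smith, good)), (0.5, np.kron(jones, any_rating))]


def pr_doc(jones):
    return float(sum(p * np.linalg.norm(R_doc @ v) ** 2
                     for p, v in ensemble_after_feedback(jones)))


def is_product_of_single_representations(jones):
    """Compare rho with the tensor product of its reduced operators."""
    rho = sum(p * np.outer(v, v) for p, v in ensemble_after_feedback(jones))
    rho4 = rho.reshape(3, 3, 3, 3)              # indices: a, r, a', r'
    rho_author = np.einsum("irjr->ij", rho4)    # trace out the rating space
    rho_rating = np.einsum("irit->rt", rho4)    # trace out the author space
    return bool(np.allclose(rho, np.kron(rho_author, rho_rating)))


print(pr_doc(jones_related), pr_doc(jones_unrelated))
print(is_product_of_single_representations(jones_related))
```

With an unrelated Jones vector the bad book by Smith would get probability 0; it is retrieved only through the assumed Jones/Smith overlap, and the remaining state does not factor into one author state and one rating state.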
  • 62. Polyrepresentation in a Quantum-inspired IA Model Relationships between representations Polyrepresentation Conclusion Different non-topical representations as subspaces Polyrepresentation space as tensor space to calculate cognitive overlap “Don’t care” dimension for weighting of representations Non-separate states 62 / 70
  • 63. Bibliography Bibliography I Abbasi, M. K. and Frommholz, I. (2015a). Cluster-based polyrepresentation as science modelling approach for information retrieval. Scientometrics, 102(3):2301–2322. Abbasi, M. K. and Frommholz, I. (2015b). Polyrepresentative Clustering: A Study of Simulated User Strategies and Representations. In Mayr, P., Frommholz, I., and Mutschke, P., editors, Proc. of the 2nd Workshop on Bibliometric-enhanced Information Retrieval (BIR2015), pages 47–54, Vienna, Austria. CEUR-WS.org. 63 / 70
  • 64. Bibliography Bibliography II Caputo, A., Piwowarski, B., and Lalmas, M. (2011). A Query Algebra for Quantum Information Retrieval. In Proceedings of the 2nd Italian Information Retrieval Workshop 2011. Frommholz, I. and Abbasi, M. K. (2014). On Clustering and Polyrepresentation. In de Rijke, M., Kenter, T., de Vries, A. P., Zhai, C., de Jong, F., Radinsky, K., and Hofmann, K., editors, Proceedings of the European Conference on Information Retrieval (ECIR 2014), volume 1, pages 618–623. Springer. 64 / 70
  • 65. Bibliography Bibliography III Frommholz, I. and Fuhr, N. (2006). Probabilistic, object-oriented logics for annotation-based retrieval in digital libraries. In Nelson, M., Marshall, C., and Marchionini, G., editors, Proc. of the 6th ACM/IEEE Joint Conference on Digital Libraries (JCDL 2006), pages 55–64, New York. ACM. Frommholz, I., Piwowarski, B., Lalmas, M., and van Rijsbergen, K. (2011). Processing Queries in Session in a Quantum-Inspired IR Framework. In Clough, P., Foley, C., Gurrin, C., Jones, G. J. F., Kraaij, W., Lee, H., and Murdock, V., editors, Proceedings ECIR 2011, volume 6611 of Lecture Notes in Computer Science, pages 751–754. Springer. 65 / 70
  • 66. Bibliography Bibliography IV Fuhr, N., Lechtenfeld, M., Stein, B., and Gollub, T. (2011). The Optimum Clustering Framework: Implementing the Cluster Hypothesis. Information Retrieval, 14. Ingwersen, P. (1994). Polyrepresentation of Information Needs and Semantic Entities, Elements of a Cognitive Theory for Information Retrieval Interaction. In Croft, B. W. and van Rijsbergen, C. J., editors, Proceedings of the Seventeenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 101–111, London, et al. Springer-Verlag. 66 / 70
  • 67. Bibliography Bibliography V Ingwersen, P. (1996). Cognitive perspectives of information retrieval interaction: Elements of a cognitive IR theory. The Journal of Documentation, 52:3–50. Ingwersen, P. and Järvelin, K. (2005). The turn: integration of information seeking and retrieval in context. Springer-Verlag New York, Inc., Secaucus, NJ, USA. Kelly, D. and Fu, X. (2007). Eliciting better information need descriptions from users of information search systems. Information Processing & Management, 43(1):30–46. 67 / 70
  • 68. Bibliography Bibliography VI Koolen, M. (2014). User reviews in the search index? That’ll never work! In Proceedings ECIR 2014, pages 323–334. Larsen, B., Ingwersen, P., and Lund, B. (2009). Data fusion according to the principle of polyrepresentation. J. Am. Soc. Inf. Sci. Technol., 60:646–654. Lykke, M., Larsen, B., Lund, H., and Ingwersen, P. (2010). Developing a Test Collection for the Evaluation of Integrated Search. In Proceedings ECIR 2010, pages 627–630. 68 / 70
  • 69. Bibliography Bibliography VII Piwowarski, B., Amini, M.-R., and Lalmas, M. (2012). On using a Quantum Physics formalism for Multi-document Summarisation. Journal of the American Society for Information Science and Technology (JASIST). Piwowarski, B., Frommholz, I., Lalmas, M., and Van Rijsbergen, K. (2010). What can Quantum Theory Bring to Information Retrieval? In Proc. 19th International Conference on Information and Knowledge Management, pages 59–68. Springer. Skov, M., Larsen, B., and Ingwersen, P. (2008). Inter and intra-document contexts applied in polyrepresentation for best match IR. Information Processing & Management, 44(5):1673–1683. 69 / 70
  • 70. Bibliography Bibliography VIII Zellhöfer, D. and Schmitt, I. (2011). A user interaction model based on the principle of polyrepresentation. In Proceedings of the 4th workshop on Workshop for Ph.D. students in information & knowledge management - PIKM ’11, pages 3–10, New York, New York, USA. ACM Press. 70 / 70