Evaluating Semantic Search Systems to Identify Future Directions of Research

IWEST 2012 workshop located at ESWC 2012

Khadija Elbedweihy (1), Stuart N. Wrigley (1), Fabio Ciravegna (1),
Dorothee Reinhard (2), Abraham Bernstein (2)
(1) University of Sheffield, UK
(2) University of Zurich, Switzerland
18.06.2012

Outline
• Introduction
• Evaluation Design
• Evaluation Execution
• Usability Feedback and Analysis
• Future Directions for Research
• Conclusions


Semantic Search
• Semantic Search tools differ in their:
– querying approaches (e.g., forms, graphs, keywords)
– search strategies during processing and query execution
– format and content of the results presented to the user

• These factors influence the user's perception of the tool's performance
and usability.

• Searching is a user-centric process; evaluating usability is as
important as, if not more important than, assessing performance.


Previous evaluation efforts
• Kaufmann (2007): compared 4 SW query interfaces (NL and graph-based)
• SemSearch Challenge: ad-hoc object retrieval using keywords
• Question Answering Over Linked Data (QALD): two NL interfaces
• TREC Entity List Completion (ELC) Task: similar to SemSearch

• All previous evaluations were based on the Cranfield methodology
– test collection; set of tasks; set of relevance judgments.

• Little or no focus on usability


EVALUATION DESIGN


Evaluation Design
Aspect    Details
Tools     • Any query input style
          • Answers extracted from data (e.g., list of URIs or literals but not documents)
Data      Mooney Natural Language Learning Data
          • known within the search community
          • simple and well-known domain for subjects (geography)
          • questions already available, e.g.:
            – Give me all the state capitals of the USA?
            – Which rivers in Arkansas are longer than Alleghany river?
Subjects  38 subjects (26 males, 12 females), aged between 20 and 35
Criteria  • Usability:
            – query input (expressiveness, etc.)
            – usefulness and suitability of returned answers (data) and presentation
          • Performance: speed of execution (also affects user satisfaction)


Data Captured
• Results for each question:
– time required to formulate query
– number of attempts required to answer question
– success rate (user found satisfying answer or not)
– query execution time

• Questionnaires capturing user experience
– System Usability Scale (SUS) questionnaire
– Extended questionnaire
– Demographics questionnaire
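
The SUS questionnaire listed above is scored with a fixed, widely used rule: each of the ten items is rated from 1 to 5, odd-numbered items contribute (rating - 1), even-numbered items contribute (5 - rating), and the sum is multiplied by 2.5 to give a score from 0 to 100. A minimal sketch of that rule follows; the function name is illustrative, and the slides do not show how the authors computed their scores.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one participant.

    `responses` holds the ten item ratings in questionnaire order,
    each an integer from 1 (strongly disagree) to 5 (strongly agree).
    """
    if len(responses) != 10:
        raise ValueError("SUS needs exactly ten item ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each rating must be an integer from 1 to 5")

    total = 0
    for position, rating in enumerate(responses, start=1):
        if position % 2 == 1:
            total += rating - 1   # odd items are positively worded
        else:
            total += 5 - rating   # even items are negatively worded
    return total * 2.5
```

A participant who answers 3 ("neutral") to every item scores 50, the midpoint of the scale.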


EVALUATION EXECUTION


Participating tools

Tool        Description
K-Search    Form-based
Ginseng     Natural language with constrained vocabulary and grammar
NLP-Reduce  Natural language for full English questions, sentence fragments, and keywords
PowerAqua   Natural language interface


Running the experiment


ANALYSIS AND FEEDBACK


Results
Criterion                         K-Search      Ginseng          NLP-Reduce  PowerAqua
                                  (form-based)  (controlled NL)  (NL-based)  (NL-based)
Mean experiment time (s)          4313.84       3612.12          4798.58     2003.9
Mean SUS (0-100)                  44.38         40               25.94       72.25
SUS rating                        ‘Bad’         ‘Bad’            ‘Awful’     ‘Good’
Mean ext. questionnaire (0-100)   47.29         45               44.63       80.67
Mean number of attempts           2.37          2.03             5.54        2.01
Mean answer found rate            0.41          0.19             0.21        0.55
Mean execution time (s)           0.44          0.51             0.51        11
Mean input time (s)               69.11         81.63            29          16.03

Slide annotations: NLP-Reduce needed about twice the number of attempts of the other tools; PowerAqua had the slowest execution time; Ginseng had the slowest input time.
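
The per-tool means in the table above are aggregates of the per-question data captured during the experiment. The sketch below shows one way such records could be summarised. The record fields, function name and sample values are all assumptions made for illustration; they are not the authors' data or tooling.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class QuestionRecord:
    """One subject's result for one question (hypothetical field names)."""
    tool: str
    input_time_s: float       # time required to formulate the query
    attempts: int             # attempts needed to answer the question
    answer_found: bool        # did the user find a satisfying answer?
    execution_time_s: float   # query execution time


def summarise(records):
    """Group records by tool and return the mean of each measure."""
    by_tool = {}
    for record in records:
        by_tool.setdefault(record.tool, []).append(record)

    summary = {}
    for tool, rows in by_tool.items():
        summary[tool] = {
            "mean_input_time_s": mean(r.input_time_s for r in rows),
            "mean_attempts": mean(r.attempts for r in rows),
            "answer_found_rate": mean(1 if r.answer_found else 0 for r in rows),
            "mean_execution_time_s": mean(r.execution_time_s for r in rows),
        }
    return summary


# Invented sample values, not taken from the study.
sample = [
    QuestionRecord("ToolA", 10.0, 1, True, 0.5),
    QuestionRecord("ToolA", 20.0, 3, False, 1.5),
]
summary = summarise(sample)
```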


Feedback: input style
Free NL
  Positive:
  • fast (16 and 29 sec on average)
  • most natural (query in plain natural language)
  Negative:
  • mismatch (habitability) problem: “I need to know and use the terms expected by the
    system and not my own terms to get results”
Contr. NL
  Positive:
  • guidance: suggestions and auto-completion
  • avoids habitability problem (only valid queries)
  Negative (very restricted language model):
  • frustration (low SUS)
  • limits flexibility and expressiveness
  • slow query formulation (highest input time: 81.63 sec)
Form
  Positive:
  • allows users to build more complex queries than NL
  • helpful to know the search space (concepts & relations)
  Negative:
  • more difficult to use than NL
  • time consuming (input time: 69.11 sec on average)


Feedback: results
Aspect        Comments
Presentation  Results not user-friendly:
              • provided full URIs of the concepts
                (e.g. ‘http://www.mooney.net/geo#tennesse2’)
              • used ontology labels for providing a NL representation of the answer
                (e.g. ‘montgomeryAI’)
Management    Users have high expectations; requested advanced means of managing
              the results such as:
              • storing and reusing results of previous queries
              • filtering results according to some suitable criteria
              • checking the provenance of the results
              • basic manipulations such as sorting results
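
One simple way to address the presentation complaint is to derive a display label from the URI rather than showing it in full. The following is a rough heuristic sketch, not something any of the evaluated tools is known to do. The function name and the `example.org` URIs are hypothetical, and dropping a trailing number is an assumption about how local names are formed.

```python
import re
from urllib.parse import urlparse


def readable_label(uri):
    """Turn a concept URI into a short display label (heuristic)."""
    parsed = urlparse(uri)
    # Prefer the fragment after '#'; otherwise use the last path segment.
    local = parsed.fragment or parsed.path.rstrip("/").rsplit("/", 1)[-1]
    local = re.sub(r"\d+$", "", local)                  # drop trailing digits
    local = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", local)  # split camelCase
    return local.replace("_", " ").strip().title()


label = readable_label("http://example.org/geo#tennesse2")
```

Where the ontology provides human-readable labels of its own, those would normally be a better source than a string heuristic like this one.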


FUTURE DIRECTIONS FOR RESEARCH

Input Style
• Visualising the search space shows:
– what type of information is available (exploration)
– what queries are supported (query formulation guidance)
• Typing queries in natural language is fast and easy

• Provide a ‘dual query formulation’ approach
– users unfamiliar with the domain can correctly formulate their
intended queries using the view-based input
– users familiar with the domain can use faster NL queries


Input Style
• Comparatives and superlatives are still a challenge
e.g., FREyA uses an ‘intervention approach’
– if a numerical datatype property is found in the user query:
1. the system generates maximum, minimum and sum functions
2. the user chooses the required function
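
The two steps above can be sketched as follows. This is only an illustration of the idea as the slide describes it, not FREyA's actual implementation; the function names, the word-matching rule and the sample values are assumptions.

```python
import re

# Candidate functions offered to the user, as listed on the slide.
CANDIDATE_FUNCTIONS = {"maximum": max, "minimum": min, "sum": sum}


def suggest_functions(query, numeric_properties):
    """Step 1: for each numeric property mentioned in the query,
    return the function names the user may choose from."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    return [
        (prop, list(CANDIDATE_FUNCTIONS))
        for prop in numeric_properties
        if prop.lower() in words
    ]


def apply_choice(function_name, values):
    """Step 2: apply the function the user picked to the property values."""
    return CANDIDATE_FUNCTIONS[function_name](values)


# Hypothetical query and property values.
suggestions = suggest_functions(
    "Which river has the greatest length?", ["length", "population"]
)
longest = apply_choice("maximum", [100, 250, 75])
```

Asking the user to pick the function sidesteps having to interpret words such as "greatest" or "longest" automatically, at the cost of an extra interaction step.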


Query Execution
Delays in response time negatively affect user experience and
satisfaction.

• Provide feedback
– reduces the effect of delays (users are more willing to wait if they know the
status of their search process).

• Provide intermediate (partial) results
– gradually incremented to provide the complete result set.
– similar to (arguably better than) basic feedback
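
A generator is one straightforward way to deliver both progress feedback and partial results. The sketch below assumes the search is split across several sources, each modelled as a hypothetical zero-argument function returning a list of answers; the names and sample answers are illustrative only.

```python
def incremental_results(sources):
    """Yield (done, total, results_so_far) after each source is queried.

    `sources` is a list of zero-argument callables, each returning a
    list of answers. The caller can update the display after every step
    instead of waiting for the complete result set.
    """
    collected = []
    total = len(sources)
    for done, fetch in enumerate(sources, start=1):
        collected.extend(fetch())
        yield done, total, list(collected)


# Hypothetical sources with invented answers.
sources = [lambda: ["austin"], lambda: ["boston", "albany"]]
for done, total, partial in incremental_results(sources):
    print(f"{done}/{total} sources searched, {len(partial)} answers so far")
```

The progress counter corresponds to the basic feedback option, while the growing `partial` list corresponds to intermediate results.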


Results
• Presentation
– Attractive, accessible, understandable and user-friendly.
– Augment with associated information: ‘richer’ user experience.

• Management
– Filter, sort
– Some complex questions require multiple sub-queries
– Ability to store and reuse the result set could be helpful.
– Queries can then be constructed by combining saved queries
with logical operators such as ‘AND’ and ‘OR’.
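
If saved result sets are treated as sets, combining them with ‘AND’ and ‘OR’ reduces to intersection and union. The class below is a minimal sketch of that idea; its name, methods and sample data are hypothetical and do not come from any of the evaluated tools.

```python
class ResultStore:
    """Keeps named result sets so that later queries can reuse them."""

    def __init__(self):
        self._saved = {}

    def save(self, name, results):
        """Store the results of a query under a user-chosen name."""
        self._saved[name] = set(results)

    def combine(self, left, operator, right):
        """Combine two saved result sets with 'AND' or 'OR'."""
        a, b = self._saved[left], self._saved[right]
        if operator == "AND":
            return a & b   # answers present in both result sets
        if operator == "OR":
            return a | b   # answers present in either result set
        raise ValueError(f"unsupported operator: {operator!r}")


# Invented sample results for two sub-queries.
store = ResultStore()
store.save("states bordering texas", ["oklahoma", "louisiana", "arkansas"])
store.save("states with major rivers", ["arkansas", "ohio"])
both = store.combine("states bordering texas", "AND", "states with major rivers")
```

This lets a complex question be answered by running its sub-queries separately and combining the stored results, rather than expressing everything in a single query.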


Conclusions & Recommendations
• Query input approaches serve different purposes:
– View-based: explore and understand
– NL-based: efficiency and simplicity

• Dual query approach to input
– natural language and view-based input styles
– improve search effectiveness and user satisfaction

• More sophisticated results presentation and management
– customise: sort, filter, check provenance and temporarily save results
– enrich: supplementary information

