Formal Arguments, Preferences, and Natural Language Interfaces to Humans: an Empirical Evaluation
Federico Cerutti, Nava Tintarev, Nir Oren
ECAI 2014, Friday 22nd August, 2014
Motivation
– Distributed autonomous systems increasingly used
– Reasoning can be formalized as argumentation
– However, if we need to explain this to people the information
presentation needs to be more natural
– Can we create a bridge between natural language and formal
argumentation?
– What kinds of factors need to be considered?
  - Preferences between arguments?
  - Domain-specific knowledge?
Background
The Experiment
Methodology
Results
Conclusions
Background on P&S (Prakken and Sartor)
Rule-based argumentation framework
Allows arguments to be expressed in favour of preferences among rules
Includes negation as failure and strong negation
Although it predates Dung (1995), it is easy to draw a correspondence with
an abstract argumentation framework (there are some points where
caution is needed, but they do not arise in this work)
Crash course on P&S
Each rule has a set of antecedents and a consequent
Strict rules (which cannot contain negation-as-failure atoms) and defeasible
rules
Arguments are sequences of rules (instead of a recursive structure as in
ASPIC)
The conclusions of an argument are the set containing the consequent
of each rule of the argument
Attacks:
  on some antecedent of some rule
  on some conclusion
Skeptical semantics: grounded
Credulous semantics: stable
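The rule and argument structure described on this slide can be sketched as a small data structure. This is only an illustration: the class name `Rule`, the helper `conclusions`, the `~` prefix for negation as failure, and the toy rules are all hypothetical and are not taken from the slides or from any P&S implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A rule with a set of antecedents and a single consequent."""
    name: str
    antecedents: frozenset  # literals; "~x" marks a negation-as-failure atom (assumed notation)
    consequent: str
    strict: bool = False    # strict rules must not contain "~" antecedents


def conclusions(argument):
    """Return the set of consequents of all rules in an argument (a sequence of rules)."""
    return {rule.consequent for rule in argument}


# Hypothetical toy rules, for illustration only.
s1 = Rule("s1", frozenset(), "p", strict=True)
r1 = Rule("r1", frozenset({"p", "~q"}), "t")
argument = (s1, r1)

print(sorted(conclusions(argument)))
```

Representing an argument as a plain tuple of rules mirrors the "sequence of rules" view on the slide, as opposed to a recursive tree of sub-arguments.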
Example
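S (strict rules):
  $s_1 : \Rightarrow s_{AAA}$
  $s_2 : \Rightarrow s_{BBB}$
  $s_3 : \Rightarrow s_{doc}$
D (defeasible rules):
  $r_1 : s_{AAA} \wedge \sim ex_{AAA} \Rightarrow \mathit{poorer}$
  $r_2 : s_{BBB} \wedge s_{doc} \wedge \sim ex_{BBB} \wedge \sim ex_{doc} \Rightarrow \neg \mathit{poorer}$
  $r_3 : \sim ex_{expert} \Rightarrow r_1 \prec r_2$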
A politician and an economist discuss the potential financial outcome of the
independence of a region X. The politician puts forward an argument in favour of
the conclusion “If Region X becomes independent, X’s citizens will be poorer
than they are now”. Another argument holding a contradicting conclusion (i.e.
Region X will not be poorer) is advanced by the economist. The economist’s
opinion is likely to be preferred to that of the politician, and is supported by a
scientific document.
$Args = \{a_1 = \langle s_1, r_1 \rangle, a_2 = \langle s_2, s_3, r_2 \rangle, a_3 = \langle r_3 \rangle\}$; $a_2$ Args-defeats $a_1$
$a_2$ is justified
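Why a2 comes out as justified under the sceptical (grounded) semantics can be checked with a minimal sketch. It assumes the example is flattened into an abstract framework with arguments a1, a2, a3 and the single defeat stated on the slide (a2 defeats a1). The function name `grounded_extension` is illustrative, and the flattening ignores how the preference argument a3 gives rise to that defeat.

```python
def grounded_extension(arguments, attacks):
    """Iterate the characteristic function from the empty set to its least fixed point."""
    attackers = {a: {x for (x, y) in attacks if y == a} for a in arguments}
    extension = set()
    while True:
        # An argument is acceptable if each of its attackers is attacked by the extension.
        acceptable = {
            a for a in arguments
            if all(any((d, b) in attacks for d in extension) for b in attackers[a])
        }
        if acceptable == extension:
            return extension
        extension = acceptable


arguments = {"a1", "a2", "a3"}
attacks = {("a2", "a1")}  # the only defeat stated for the base case

print(sorted(grounded_extension(arguments, attacks)))  # ['a2', 'a3']
```

Under these assumptions a2 and a3 are sceptically accepted and a1 is rejected, which matches the slide's conclusion.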
The Experiment
Presenting each participant with a text, written in natural language,
followed by a questionnaire
Between-subjects design across eight texts: each participant is shown a
single (randomly selected) text
Four domains:
1 weather forecast
2 political debate
3 used car sale
4 romantic relationship
Two KBs: base case, and extended case
The base case always considers two arguments a1 and a2 with two
contradicting conclusions, and a preference in favour of a2
The Extended Case for the Example
More recent research disputes the claim of the economist
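S (strict rules):
  $s_1 : \Rightarrow s_{AAA}$
  $s_2 : \Rightarrow s_{BBB}$
  $s_3 : \Rightarrow s_{doc}$
  $s_4 : \Rightarrow s_{research}$
  $s_5 : s_{research} \Rightarrow \neg s_{doc}$
D (defeasible rules):
  $r_1 : s_{AAA} \wedge \sim ex_{AAA} \Rightarrow \mathit{poorer}$
  $r_2 : s_{BBB} \wedge s_{doc} \wedge \sim ex_{BBB} \wedge \sim ex_{doc} \Rightarrow \neg \mathit{poorer}$
  $r_3 : \sim ex_{expert} \Rightarrow r_1 \prec r_2$
$Args = \{a_1 = \langle s_1, r_1 \rangle, a_2 = \langle s_2, s_3, r_2 \rangle, a_3 = \langle r_3 \rangle, a_4 = \langle s_4, s_5 \rangle\}$
$a_2$ Args-defeats $a_1$, $a_2$ Args-defeats $a_4$, $a_4$ Args-defeats $a_2$
Two stable extensions:
$\{a_1, a_3, a_4\}$ and $\{a_2, a_3\}$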
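The two stable extensions can be reproduced by brute force over an abstract framework built from the three defeats listed on the slide. This is a sketch under that flattening assumption; `stable_extensions` is an illustrative name, and exhaustive enumeration is only sensible for a framework this small.

```python
from itertools import combinations


def stable_extensions(arguments, attacks):
    """Enumerate conflict-free sets that attack every argument outside the set."""
    ordered = sorted(arguments)
    found = []
    for size in range(len(ordered) + 1):
        for subset in combinations(ordered, size):
            chosen = set(subset)
            conflict_free = not any((a, b) in attacks for a in chosen for b in chosen)
            attacks_rest = all(
                any((a, b) in attacks for a in chosen) for b in arguments - chosen
            )
            if conflict_free and attacks_rest:
                found.append(chosen)
    return found


arguments = {"a1", "a2", "a3", "a4"}
attacks = {("a2", "a1"), ("a2", "a4"), ("a4", "a2")}  # defeats as listed on the slide

extensions = stable_extensions(arguments, attacks)
print([sorted(e) for e in extensions])       # [['a2', 'a3'], ['a1', 'a3', 'a4']]
print(sorted(set.intersection(*extensions)))  # ['a3']
```

Only a3 belongs to both extensions, so under this reading neither a1 nor a2 is sceptically accepted.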
Domain 1: weather forecast
The weather forecasting service of the broadcasting company AAA says
that it will rain tomorrow (a1).
Meanwhile, the forecast service of the broadcasting company BBB says that
it will be cloudy tomorrow but that it will not rain (a2).
It is also well known that the forecasting service of BBB is more accurate
than the one of AAA (a3).
However, yesterday the trustworthy newspaper CCC published an article
which said that BBB has cut the resources for its weather forecasting
service in the past months, thus making it less reliable than in the past (a4).
Domain 2: political debate
In a TV debate, the politician AAA argues that if Region X becomes
independent then X’s citizens will be poorer than now (a1).
Subsequently, financial expert (a3) Dr. BBB presents a document; which
scientifically shows that Region X will not be worse off financially if it
becomes independent (a2).
After that, the moderator of the debate reminds BBB of more recent
research by several important economists that disputes the claims in that
document (a4).
Domain 3: buying a car
You are planning to buy a second-hand car, and you go to a dealership with
BBB, a mechanic whom has been recommended you by a friend (a3).
The salesperson AAA shows you a car and says that it needs very little
work done to it (a1).
BBB says it will require quite a lot of work, because in the past he had to
fix several issues in a car of the same model (a2).
While you are at the dealership, your friend calls you to tell you that he
knows (beyond a shadow of a doubt) that BBB made unnecessary repairs
to his car last month (a4).
Domain 4: romance
After several dates, you would like to start a serious relationship with J.
but you turn to ask two friends of yours, AAA and BBB, for advice. You
have known BBB for longer than you have known AAA (a3).
AAA tells you that J is lovely and you should go ahead (a1),
while BBB suggests that you should be very cautious because J might have
a hidden agenda (a2).
After some weeks, CCC, who is also a close friend of BBB, tells you that
BBB has been into you for years; BBB is too shy to tell you about their
feelings about you, but are still possessive of you (a4).
Formalisation summary
Domain          Base case   Extended case   Type of reinstatement
1, weather      1.B         1.E             preference attack
2, politics     2.B         2.E             a2 rebuttal
3, buying car   3.B         3.E             preference attack
4, romance      4.B         4.E             preference rebuttal
Methodology
Participants are asked to determine which of the following positions
they think is accurate:
A: I think that AAA’s position is correct (e.g. “X’s citizens will be
poorer than now”)
B: I think that BBB’s position is correct (e.g. “X’s citizens will not be
worse off financially”)
U: I cannot determine if either AAA’s or BBB’s position is correct
(e.g. “I cannot conclude anything about Region X’s finances”)
Rate statements in terms of relevance (for the conclusion) and
agreement, each on a 7-point scale from Disagree to Agree
Hypotheses
H1: In the base cases (Scenarios 1.B, 2.B, 3.B and 4.B), the majority of
participants will agree with BBB’s statement (position B)
H2: In the extended cases (Scenarios 1.E, 2.E, 3.E and 4.E), the
majority of participants will agree that they cannot conclude
anything from the text (position U).
H3: The majority of participants who view a base case scenario will
agree with the preference argument, and find it relevant
Hypotheses H1 and H2
[Figure: bar chart "Distribution of acceptability of actors’ positions", percentage of participants choosing A, B, or U, base cases vs. extended cases]
Distribution of the final conclusion A/B/U:
base cases, $\chi^2(2, N = 77) = 37.74$, $p < 0.001$;
extended cases, $\chi^2(2, N = 84) = 8.0$, $p < 0.02$
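The reported significance levels can be sanity-checked from the test statistics alone. With 2 degrees of freedom the chi-squared distribution is an exponential distribution with mean 2, so its upper-tail probability has the closed form exp(-x/2). The function name `chi2_sf_df2` is illustrative.

```python
import math


def chi2_sf_df2(statistic):
    """Upper-tail probability of a chi-squared variable with exactly 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)


# Statistics as reported on the slide.
for label, statistic in [("base cases", 37.74), ("extended cases", 8.0)]:
    print(f"{label}: p = {chi2_sf_df2(statistic):.3g}")
```

Both values fall below the thresholds given on the slide (0.001 and 0.02 respectively).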
Hypothesis H3
Participants rate how much (on a scale of 1 to 7) they agree with the
following statement (agreement), and whether it is relevant in drawing
their conclusion (relevance): “BBB is more trustworthy than AAA.”
Significant difference between the base and the extended cases for
agreement (Mann-Whitney U = 1778, Z = −5.0, p < 0.001) and relevance
(Mann-Whitney U = 1852, Z = −4.7, p < 0.001).
In addition, the median values for both agreement and relevance are
greater for the base cases than for the extended cases.
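The relation between the reported U and Z values can be illustrated with the usual normal approximation for the Mann-Whitney statistic. The sketch assumes group sizes of 77 (base) and 84 (extended), taken from the N values reported for H1 and H2, and it omits the tie correction, which matters for 7-point ratings. It is therefore expected to land close to the reported Z values rather than to match them exactly.

```python
import math


def mann_whitney_z(u, n1, n2):
    """Normal approximation of the Mann-Whitney U statistic, without tie correction."""
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (u - mean_u) / sd_u


N_BASE, N_EXTENDED = 77, 84  # assumed group sizes, not stated on this slide

print(round(mann_whitney_z(1778, N_BASE, N_EXTENDED), 2))  # agreement
print(round(mann_whitney_z(1852, N_BASE, N_EXTENDED), 2))  # relevance
```

Under these assumptions the approximation gives roughly −4.9 for agreement and −4.7 for relevance. A tie correction shrinks the standard deviation and pushes the magnitude up, which is consistent with the reported −5.0.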
Post Hoc: Motivations
                  Base cases (%)          Extended cases (%)
Domain            A      B      U         A      B      U
1, weather        5.0    50.0   45.0      15.8   21.1   63.2
2, politics       5.3    63.2   31.6      21.1   10.5   68.4
3, buying car     0.0    68.2   31.8      23.8   23.8   52.4
4, romance        12.5   68.8   18.8      48.0   36.0   16.0
Distribution of the final conclusion A/B/U
Fisher (N = 161) = 48.756, p < 0.001, 10000 sampled tables, Monte Carlo
approach with 99% confidence interval (MC99)
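The percentages can be turned back into participant counts as a consistency check. The group sizes below are NOT stated on the slides: they are assumed values, chosen because they reproduce every percentage after rounding and sum to the reported N = 77 and N = 84. The sketch also assumes that the chi-squared statistics reported for H1 and H2 are goodness-of-fit tests against a uniform split over A, B, and U.

```python
# Assumed group sizes per domain (weather, politics, car, romance); not given in the slides.
GROUP_SIZES = {"base": [20, 19, 22, 16], "extended": [19, 19, 21, 25]}

# Percentages (A, B, U) copied from the table.
PERCENTAGES = {
    "base": [(5.0, 50.0, 45.0), (5.3, 63.2, 31.6), (0.0, 68.2, 31.8), (12.5, 68.8, 18.8)],
    "extended": [(15.8, 21.1, 63.2), (21.1, 10.5, 68.4), (23.8, 23.8, 52.4), (48.0, 36.0, 16.0)],
}


def pooled_counts(case):
    """Convert percentages to counts per domain and sum them over the four domains."""
    totals = [0, 0, 0]
    for size, row in zip(GROUP_SIZES[case], PERCENTAGES[case]):
        counts = [round(p * size / 100.0) for p in row]
        if sum(counts) != size:
            raise ValueError("assumed group size is inconsistent with the percentages")
        totals = [t + c for t, c in zip(totals, counts)]
    return totals


def chi2_uniform(counts):
    """Chi-squared goodness-of-fit statistic against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


for case in ("base", "extended"):
    counts = pooled_counts(case)
    print(case, counts, round(chi2_uniform(counts), 2))
```

With these assumed sizes the pooled counts are 4/48/25 for the base cases and 24/20/40 for the extended cases, and the resulting statistics agree with the 37.74 and 8.0 reported for H1 and H2.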
Post Hoc: Distributions of Base Cases
[Figure: bar chart "Distributions of motivations for U (scenarios 1.B and 3.B)", percentage of participants per motivation U1, U2, U3, for 1.B and 3.B]
Motivations for agreeing with the U position in scenarios 1.B and 3.B:
U1: lack of information; U2: domain-specific reasons; U3: other
Post Hoc: Distributions between Base/Extended Cases
                  Base cases (%)          Extended cases (%)
Domain            A      B      U         A      B      U
1, weather        5.0    50.0   45.0      15.8   21.1   63.2
2, politics       5.3    63.2   31.6      21.1   10.5   68.4
3, buying car     0.0    68.2   31.8      23.8   23.8   52.4
4, romance        12.5   68.8   18.8      48.0   36.0   16.0
Is the distribution of choices (among A, B, and U) in the base case
significantly different from the distribution of choices in the
corresponding extended case?
YES for the third domain (3.B and 3.E, buying a car) — Fisher
(N = 43) = 10.693, p < 0.001, 10000 sampled tables, MC99.
NO for the first domain (1.B and 1.E, weather forecasts) — Fisher
(N = 39) = 3.832, p = 0.187, 10000 sampled tables, MC99.
Post Hoc: Distributions Extended Cases
                  Base cases (%)          Extended cases (%)
Domain            A      B      U         A      B      U
1, weather        5.0    50.0   45.0      15.8   21.1   63.2
2, politics       5.3    63.2   31.6      21.1   10.5   68.4
3, buying car     0.0    68.2   31.8      23.8   23.8   52.4
4, romance        12.5   68.8   18.8      48.0   36.0   16.0
Domain has a significant effect on the distribution of positions — Fisher
(N = 84) = 16.308, p < 0.05, 10000 sampled tables, MC99.
Post Hoc: Relevance and Agreement
                  Base cases          Extended cases
                  R_B†      Md_B*     R_E†      Md_E*     C.D.‡
Relevance
  1, weather      110.38    6.00      82.92     4.00      46.60
  2, politics     107.45    6.00      69.45     4.00      47.19
  3, buying car   118.05    6.50      67.45     4.00      44.38
  4, romance      48.34     2.00      44.40     2.00      46.57
Agreement
  1, weather      116.38    6.00      87.18     4.00      46.60
  2, politics     103.34    6.00      65.05     4.00      47.19
  3, buying car   121.93    6.50      64.33     4.00      44.38
  4, romance      44.94     2.00      44.20     2.00      46.57
Statistically significant cases when |R_x − R_y| > C.D.
† Mean rank as computed with the Kruskal-Wallis test
‡ Critical Difference, as computed in [Siegel and Castellan Jr., 1988] cited
by [Field, 2009] with α = 0.05.
Post Hoc: Relevance and Agreement
              Scenario 3.B          Scenario 4.B
              R_3.B†    Md_3.B*     R_4.B†    Md_4.B*     C.D.‡
Relevance     118.05    6.50        48.34     2.00        47.79
Agreement     121.93    6.50        44.94     2.00        47.79
Statistically significant cases when |R_x − R_y| > C.D.
† Mean rank as computed with the Kruskal-Wallis test
‡ Critical Difference, as computed in [Siegel and Castellan Jr., 1988] cited
by [Field, 2009] with α = 0.05.
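The significance criterion |R_x − R_y| > C.D. can be applied mechanically to the mean ranks in the two Relevance and Agreement tables. The sketch below copies the values from those tables. The tuple layout and the function name `significant` are illustrative choices.

```python
def significant(rank_x, rank_y, critical_difference):
    """Apply the criterion |R_x - R_y| > C.D. to a pair of mean ranks."""
    return abs(rank_x - rank_y) > critical_difference


# (measure, comparison, mean rank x, mean rank y, critical difference)
COMPARISONS = [
    ("relevance", "1.B vs 1.E", 110.38, 82.92, 46.60),
    ("relevance", "2.B vs 2.E", 107.45, 69.45, 47.19),
    ("relevance", "3.B vs 3.E", 118.05, 67.45, 44.38),
    ("relevance", "4.B vs 4.E", 48.34, 44.40, 46.57),
    ("agreement", "1.B vs 1.E", 116.38, 87.18, 46.60),
    ("agreement", "2.B vs 2.E", 103.34, 65.05, 47.19),
    ("agreement", "3.B vs 3.E", 121.93, 64.33, 44.38),
    ("agreement", "4.B vs 4.E", 44.94, 44.20, 46.57),
    ("relevance", "3.B vs 4.B", 118.05, 48.34, 47.79),
    ("agreement", "3.B vs 4.B", 121.93, 44.94, 47.79),
]

flagged = [
    (measure, pair)
    for measure, pair, rank_x, rank_y, cd in COMPARISONS
    if significant(rank_x, rank_y, cd)
]
for measure, pair in flagged:
    print(measure, pair)
```

Applied to these values, the criterion flags the car domain (3.B vs 3.E) and the comparison between scenarios 3.B and 4.B, for both relevance and agreement.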
Conclusions
Investigation into the relationship between formal systems of
defeasible argumentation and arguments in natural language
Results suggest a correspondence between the formal theory and its
representation in natural language
Preference generally applied “following” Prakken and Sartor:
importance of being able to represent them
Humans evaluate preference depending on the context
Collateral knowledge
Reversal of preference
28 of 31
Acknowledgement
Research was sponsored by US Army Research Laboratory and the UK Ministry
of Defence and was accomplished under Agreement Number W911NF-06-3-0001.
The views and conclusions contained in this document are those of the authors
and should not be interpreted as representing the official policies, either expressed
or implied, of the US Army Research Laboratory, the U.S. Government, the UK
Ministry of Defence, or the UK Government. The US and UK Governments are
authorized to reproduce and distribute reprints for Government purposes
notwithstanding any copyright notation hereon.
This research has been carried out within the project “Scrutable Autonomous
Systems” (SAsSY), funded by the Engineering and Physical Sciences Research
Council (EPSRC, UK), grant ref. EP/J012084/1.
References I
[Field, 2009] Field, A. (2009).
Discovering Statistics Using SPSS (Introducing Statistical Methods series).
SAGE Publications Ltd.
[Siegel and Castellan Jr., 1988] Siegel, S. and Castellan Jr., N. J. (1988).
Nonparametric Statistics for The Behavioral Sciences.
McGraw-Hill Humanities/Social Sciences/Languages.