Optimization of Probabilistic Argumentation with Markov Processes
E. Hadoux (1), A. Beynier (1), N. Maudet (1), P. Weng (2) and A. Hunter (3)
Tue., Sept. 29th
(1) Sorbonne Universités, UPMC Univ Paris 6, UMR 7606, LIP6, F-75005, Paris, France
(2) SYSU-CMU Joint Institute of Engineering, Guangzhou, China
SYSU-CMU Shunde International Joint Research Institute, Shunde, China
(3) Department of Computer Science, University College London, Gower Street, London
WC1E 6BT, UK
Introduction
∙ Debate argumentation problems between two agents
∙ Probabilistic executable logic to improve expressivity
∙ New class of problems: Argumentation Problem with Probabilistic Strategies (APS) (Hunter, 2014)
∙ Purpose of this work: optimize the sequence of arguments of one agent
Note: the word "predicate" will be used somewhat loosely throughout.
Formalization
Formalization of a debate problem
∙ Turn-based game between two agents
∙ Rules to fire in order to attack the opponent's arguments and revise knowledge
Let us define a debate problem with:
∙ A, the set of arguments
∙ E, the set of attacks
∙ P = 2^A × 2^E, the public space gathering voiced arguments
∙ Two agents: agent 1 and agent 2
Notation
∙ Arguments: literals (e.g., a, b, c)
∙ Attacks: e(x, y) if x attacks y
∙ Arguments in the public (resp. private) space: a(x) (resp. h_i(x))
∙ Goals: ⋀_k g(x_k) (resp. g(¬x_k)), where x_k is (resp. is not) accepted in the public space (Dung, 1995)
∙ Rules: prem ⇒ Pr(Acts)
∙ Premises: conjunctions of e(·,·), a(·), h_i(·)
∙ Acts: conjunctions of ⊞, ⊟ on e(·,·), a(·) and ⊕, ⊖ on h_i(·)
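As a concrete illustration (not from the original slides), these predicates and probabilistic rules could be encoded as follows; the tuple layout, the helper names, and the "add" label standing for ⊞/⊕ are assumptions of this Python sketch:

```python
# Hypothetical encoding: a predicate is a tuple, a state is a frozenset of predicates.
def a(x):    return ("a", x)        # x is voiced in the public space
def h(i, x): return (f"h{i}", x)    # x is in agent i's private space
def e(x, y): return ("e", x, y)     # x attacks y

# A rule maps a premise (a set of predicates) to a distribution over acts.
# An act is a list of (operation, predicate) pairs; "add" stands for ⊞/⊕.
second_rule_of_R1 = (
    {h(1, "b"), a("f"), h(1, "c"), e("b", "f"), e("c", "f")},
    [(0.5, [("add", a("b")), ("add", e("b", "f"))]),
     (0.5, [("add", a("c")), ("add", e("c", "f"))])],
)
```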
Formalization of an APS
An APS is characterized (from the point of view of agent 1) by ⟨A, E, G, S_1, g_1, g_2, S_2, P, R_1, R_2⟩:
∙ A, E, P as specified above
∙ G, the set of all possible goals
∙ S_i, the set of private states for agent i
∙ g_i ∈ G, the given goal for agent i
∙ R_i, the set of rules for agent i
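A minimal, hypothetical container for this tuple, used only to fix notation for the later sketches (the field names and types are assumptions, not part of the APS definition):

```python
from typing import NamedTuple, Set, List, Tuple

class APS(NamedTuple):
    """Illustrative container for the APS tuple; field names are assumptions."""
    arguments: Set[str]                # A
    attacks: Set[Tuple]                # E
    goals: Set[Tuple]                  # G, the set of all possible goals
    private_states_1: Set[frozenset]   # S1
    goal_1: Tuple                      # g1
    goal_2: Tuple                      # g2
    private_states_2: Set[frozenset]   # S2
    public_space: Set[frozenset]       # P = 2^A x 2^E
    rules_1: List[Tuple]               # R1
    rules_2: List[Tuple]               # R2
```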
Example: Arguments
Is e-sport a sport?
a. E-sport is a sport
b. E-sport requires focusing and precision, and generates tiredness
c. Not all sports are physical
d. Sports not referenced by the IOC exist
e. Chess is a sport
f. E-sport is not a physical activity
g. E-sport is not referenced by the IOC
h. Working requires focusing and generates tiredness, but is not a sport
Example: Formalization
∙ A = {a, b, c, d, e, f, g, h}
∙ E = {e(f, a), e(g, a), e(b, f), e(c, f), e(h, b), e(g, c), e(d, g), e(e, g)}
∙ g_1 = g(a)
∙ R_1 = {h_1(a) ⇒ ⊞a(a),
    h_1(b) ∧ a(f) ∧ h_1(c) ∧ e(b, f) ∧ e(c, f) ⇒ 0.5 : ⊞a(b) ∧ ⊞e(b, f) ∨ 0.5 : ⊞a(c) ∧ ⊞e(c, f),
    h_1(d) ∧ a(g) ∧ h_1(e) ∧ e(d, g) ∧ e(e, g) ⇒ 0.8 : ⊞a(e) ∧ ⊞e(e, g) ∨ 0.2 : ⊞a(d) ∧ ⊞e(d, g)}
Example: Formalization
∙ R_2 = {h_2(h) ∧ a(b) ∧ e(h, b) ⇒ ⊞a(h) ∧ ⊞e(h, b),
    h_2(g) ∧ a(c) ∧ e(g, c) ⇒ ⊞a(g) ∧ ⊞e(g, c),
    a(a) ∧ h_2(f) ∧ h_2(g) ∧ e(f, a) ⇒ 0.8 : ⊞a(f) ∧ ⊞e(f, a) ∨ 0.2 : ⊞a(g) ∧ ⊞e(g, a)}
∙ Initial state: h_1(a, b, c, d, e), {}, h_2(f, g, h)
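Under the illustrative tuple encoding used in the earlier sketch, this initial state could be written as follows (for illustration only):

```python
# Initial state h1(a, b, c, d, e), {}, h2(f, g, h): the public space starts empty.
initial_state = frozenset(
    [("h1", x) for x in "abcde"] +    # agent 1's private arguments
    [("h2", x) for x in "fgh"]        # agent 2's private arguments
)
```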
Attacks graph
Figure: Attack graph of the e-sport example (arguments a–h, with the attacks listed in E).
Probabilistic Finite State Machine: Graph
APS → Probabilistic Finite State Machine
Figure: PFSM of the e-sport example (states σ1–σ12; transition probabilities follow the rule distributions, e.g. 0.8/0.2 and 0.5/0.5 branches).
Probabilistic Finite State Machine
To optimize the sequence of arguments for agent 1, we could optimize the PFSM, but:
1. it depends on the initial state,
2. it requires knowledge of the private state of the opponent.
Using Markov models, we can relax requirements 1 and 2. Moreover, the APS formalization can be modified in order to comply with the Markov assumption.
Markov Decision Process
A Markov Decision Process (MDP) (Puterman, 1994) is characterized by a tuple ⟨S, A, T, R⟩:
∙ S, a set of states,
∙ A, a set of actions,
∙ T : S × A → Pr(S), a transition function,
∙ R : S × A → ℝ, a reward function.
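For readers unfamiliar with MDPs, the following generic value-iteration sketch shows what "solving" such a model means; the discount factor, iteration count, and function signatures are illustrative assumptions, and the paper itself solves MOMDPs with dedicated algorithms (MO-IP, MO-SARSOP) rather than plain value iteration:

```python
def value_iteration(states, actions, T, R, gamma=0.95, n_iter=200):
    """Generic sketch of solving an MDP: compute the optimal value V*(s).
    T(s, a) returns a dict {next_state: probability}; R(s, a) is a number."""
    V = {s: 0.0 for s in states}
    for _ in range(n_iter):
        V = {s: max(R(s, a) + gamma * sum(p * V[s2] for s2, p in T(s, a).items())
                    for a in actions)
             for s in states}
    return V
```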
Partially-Observable Markov Decision Process
A Partially-Observable MDP (POMDP) (Puterman, 1994) is characterized by a tuple ⟨S, A, T, R, O, Q⟩:
∙ S, a set of states,
∙ A, a set of actions,
∙ T : S × A → Pr(S), a transition function,
∙ R : S × A → ℝ, a reward function,
∙ O, an observation set,
∙ Q : S × A → Pr(O), an observation function.
Mixed-Observability Markov Decision Process
A Mixed-Observability MDP (MOMDP) (Ong et al., 2010) is characterized by a tuple ⟨S_v, S_h, A, T, R, O_v, O_h, Q⟩:
∙ S_v, S_h, the visible and hidden parts of the state,
∙ A, a set of actions,
∙ T : S_v × A × S_h → Pr(S_v × S_h), a transition function,
∙ R : S_v × A × S_h → ℝ, a reward function,
∙ O_v = S_v, an observation set on the visible part of the state,
∙ O_h, an observation set on the hidden part of the state,
∙ Q : S_v × A × S_h → Pr(O_v × O_h), an observation function.
Transformation to a MOMDP
Transformation to a MOMDP
An APS from the point of view of agent 1 can be transformed into a MOMDP:
∙ S_v = S_1 × P, S_h = S_2
∙ A = {prem(r) ⇒ m | r ∈ R_1 and m ∈ acts(r)}
∙ O_v = S_v and O_h = ∅
∙ Q(⟨s_v, s_h⟩, a, ⟨s_v⟩) = 1, and 0 otherwise
∙ T, defined on the next slide
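A sketch of how the action set A could be enumerated from agent 1's rules, using the rule encoding assumed earlier (the function name and rule format are illustrative, not the authors' implementation):

```python
def momdp_actions(rules_agent1):
    """Build the MOMDP action set A = {prem(r) => m | r in R1, m in acts(r)}:
    one action per (premise, act) pair, since agent 1 chooses which act of a
    fireable rule to play (rule format as in the earlier sketch)."""
    return [(premises, act)
            for premises, outcomes in rules_agent1
            for _probability, act in outcomes]
```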
Transformation to a MOMDP: Transition function
Application set
Let C_s(R_i) be the set of rules of R_i that can be fired in state s. The application set F_r(m, s) is the set of predicates resulting from the application of act m of rule r on s. If r cannot be fired in s, F_r(m, s) = s.
∙ s, a state, and r : p ⇒ m, an action such that r ∈ A
∙ s′ = F_r(m, s)
∙ r′ ∈ C_s′(R_2) such that r′ : p′ ⇒ [π_1/m_1, . . . , π_n/m_n]
∙ s′′_i = F_r′(m_i, s′)
∙ T(s, r, s′′_i) = π_i
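The following sketch puts the two pieces together under the same assumed encoding: F_r(m, s) as apply_act and the transition distribution as transition. How agent 2 selects among several fireable rules is not specified above, so picking the first fireable rule here is an explicit assumption:

```python
def apply_act(state, premises, act):
    """F_r(m, s): apply act m of a rule to state s, or return s unchanged
    if the rule cannot fire (its premises do not all hold in s)."""
    if not premises <= state:
        return state
    result = set(state)
    for operation, predicate in act:
        if operation == "add":        # ⊞ / ⊕
            result.add(predicate)
        else:                         # ⊟ / ⊖
            result.discard(predicate)
    return frozenset(result)

def transition(state, action, rules_agent2):
    """T(s, r, s''_i) = pi_i (sketch): agent 1 plays its chosen act, then a
    fireable rule of agent 2 responds with its own distribution over acts.
    Assumption: if several rules of agent 2 can fire, the first one is used."""
    premises, act = action
    s_prime = apply_act(state, premises, act)
    fireable = [(p, outs) for p, outs in rules_agent2 if p <= s_prime]
    if not fireable:
        return {s_prime: 1.0}
    p2, outcomes2 = fireable[0]
    distribution = {}
    for pi, m in outcomes2:
        s2 = apply_act(s_prime, p2, m)
        distribution[s2] = distribution.get(s2, 0.0) + pi
    return distribution
```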
Reward function
For the reward function:
∙ with Dung's semantics: a positive reward for each part of the goal that holds
∙ can be generalized: General Gradual Valuation (Cayrol and Lagasquie-Schiex, 2005)
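A possible concretization of the first bullet, assuming grounded semantics and a reward of +1 per satisfied goal conjunct; both the choice of grounded semantics and the exact reward values are assumptions of this sketch:

```python
def grounded_extension(arguments, attacks):
    """Grounded semantics (Dung, 1995), iterative labelling: accept arguments
    whose attackers are all defeated; defeat arguments attacked by an accepted
    argument; repeat until nothing changes."""
    accepted, defeated = set(), set()
    changed = True
    while changed:
        changed = False
        for x in arguments:
            if x in accepted or x in defeated:
                continue
            attackers = {y for (y, z) in attacks if z == x}
            if attackers <= defeated:
                accepted.add(x); changed = True
            elif attackers & accepted:
                defeated.add(x); changed = True
    return accepted

def reward(public_arguments, public_attacks, goal):
    """Sketch: +1 for every conjunct of the goal that holds in the public space.
    goal is a set of (argument, wanted_accepted) pairs, e.g. {("a", True)} for g(a)."""
    accepted = grounded_extension(public_arguments, public_attacks)
    return sum(1 for arg, wanted in goal if (arg in accepted) == wanted)
```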
Transformation to a MOMDP
Model sizes:
APS: 8 arguments, 8 attacks, 6 rules
POMDP: 4 294 967 296 states
MOMDP: 16 777 216 states
Intractable instances → the problem needs to be optimized at the root, before solving.
Solving an APS
Solving an APS
Two algorithms to solve MOMDPs:
∙ MO-IP (Araya-López et al., 2010), the IP algorithm for POMDPs adapted to MOMDPs (exact method)
∙ MO-SARSOP (Ong et al., 2010), the SARSOP algorithm for POMDPs adapted to MOMDPs (approximate but very efficient method)
Two kinds of optimizations: with or without dependencies on the initial state.
Optimizations without dependencies
Irr. Prunes irrelevant arguments
Enth. Infers attacks
Dom. Removes dominated arguments
These optimizations come with a guarantee on the uniqueness and optimality of the solution.
Attacks graph
Argument dominance
If an argument is attacked by an unattacked argument, it is dominated.
Figure: Attack graph of the example.
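The dominance check itself is straightforward to state in code; this sketch assumes attacks are represented as (attacker, attacked) pairs:

```python
def dominated_arguments(arguments, attacks):
    """An argument attacked by an unattacked argument is dominated:
    its attacker can never be defeated, so the argument is never accepted."""
    attacked = {y for (_x, y) in attacks}
    unattacked = set(arguments) - attacked
    return {y for (x, y) in attacks if x in unattacked}
```

In the e-sport example, h is unattacked and attacks b, so b is dominated and is removed by the Dom. optimization.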
Optimization with dependencies
Irr(s0) has to be reapplied each time the initial state changes.
1. For each predicate that is never modified but is used in premises:
1.1 Remove all the rules that are not compatible with the value of this predicate in the initial state.
1.2 For all remaining rules, remove the predicate from the premises.
2. For each remaining action of agent 1, track the rules of agent 2 compatible with the application of this action. If a rule of agent 2 is not compatible with any application of an action of agent 1, remove it.
(A sketch of step 1 follows.)
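A sketch of step 1 under the rule encoding assumed earlier; modifiable_predicates (the predicates some act can change) and the function name are assumptions of this illustration:

```python
def prune_for_initial_state(rules, initial_state, modifiable_predicates):
    """Step 1 of the initial-state-dependent optimization (sketch).
    A premise on a never-modified predicate keeps the value it has in s0,
    so rules whose static premises are false in s0 can never fire."""
    pruned = []
    for premises, outcomes in rules:
        static = {p for p in premises if p not in modifiable_predicates}
        if not static <= initial_state:
            continue                                    # incompatible with s0: drop the rule
        pruned.append((premises - static, outcomes))    # strip the static premises
    return pruned
```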
Experiments
Experiments
We computed a solution for the e-sport problem with:
∙ MO-IP, which did not finish after tens of hours
∙ MO-SARSOP without optimizations, which did not finish either
∙ MO-SARSOP with optimizations, which found the optimal solution in 4 seconds
Experiments: Policy graph
Figure: Policy graph for the example (nodes r^1_{i,j} and ∅ are actions of agent 1; edges o_1–o_8 are observations).
Experiments: More examples
       None   Irr.   Enth.   Dom.   Irr(s0)   All
Ex 1      —      —      —      —        —    0.56
Ex 2    3.3    0.3    0.3    0.4        0       0
Dv.       —      —      —      —        —      32
6      1313     22     43      7      2.4     0.9
7         —    180    392     16       20     6.7
8         —      —      —      —      319      45
9         —      —      —      —        —       —
Table: Computation time (in seconds)
Conclusion and discussion
Conclusion
We presented:
1. A new framework to represent more complex debate problems (APS)
2. A method to transform those problems into a MOMDP
3. Several optimizations that can also be used outside of the MOMDP context
4. A method to optimize the actions of an agent in an APS
Perspectives
We are currently working on using POMCP (Silver and Veness, 2010). We are also working on using HS3MDPs (Hadoux et al., 2014).
Questions?
Bibliography
Araya-López, M., Thomas, V., Buffet, O., and Charpillet, F. (2010). A closer look at MOMDPs. In 22nd IEEE International Conference on Tools with Artificial Intelligence (ICTAI).
Cayrol, C. and Lagasquie-Schiex, M.-C. (2005). Graduality in argumentation. Journal of Artificial Intelligence Research (JAIR), 23:245–297.
Dung, P. M. (1995). On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games. Artificial Intelligence, 77(2):321–358.
Hadoux, E., Beynier, A., and Weng, P. (2014). Solving Hidden-Semi-Markov-Mode Markov Decision Problems. In Straccia, U. and Calì, A., editors, Scalable Uncertainty Management, volume 8720 of Lecture Notes in Computer Science, pages 176–189. Springer International Publishing.
Hunter, A. (2014). Probabilistic strategies in dialogical argumentation. In International Conference on Scalable Uncertainty Management (SUM'14), LNCS volume 8720.
Ong, S. C., Png, S. W., Hsu, D., and Lee, W. S. (2010). Planning under uncertainty for robotic tasks with mixed observability. The International Journal of Robotics Research.
Puterman, M. L. (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons.
Silver, D. and Veness, J. (2010). Monte-Carlo planning in large POMDPs. In Proceedings of the 24th Conference on Neural Information Processing Systems (NIPS), pages 2164–2172.