Optimization of Probabilistic Argumentation with Markov Processes
E. Hadoux (1), A. Beynier (1), N. Maudet (1), P. Weng (2) and A. Hunter (3)
Tue., Sept. 29th
(1) Sorbonne Universités, UPMC Univ Paris 6, UMR 7606, LIP6, F-75005, Paris, France
(2) SYSU-CMU Joint Institute of Engineering, Guangzhou, China
SYSU-CMU Shunde International Joint Research Institute, Shunde, China
(3) Department of Computer Science, University College London, Gower Street, London
WC1E 6BT, UK
Introduction
∙ Debate argumentation problems between two agents
∙ Probabilistic executable logic to improve expressivity
∙ New class of problems: Argumentation Problem with Probabilistic Strategies (APS) (Hunter, 2014)
∙ Purpose of this work: optimize the sequence of arguments of one agent
Note: the word "predicate" will be used somewhat loosely throughout.
Formalization
Formalization of a debate problem
∙ Turn-based game between two agents
∙ Rules to fire in order to attack the opponent's arguments and revise knowledge
Let us define a debate problem with:
∙ A, the set of arguments
∙ E, the set of attacks
∙ P = 2^A × 2^E, the public space gathering voiced arguments
∙ Two agents: agent 1 and agent 2
Notation
∙ Arguments: literals (e.g., a, b, c)
∙ Attacks: e(x, y) if x attacks y
∙ Arguments in the public (resp. private) space: a(x) (resp. h_i(x))
∙ Goals: ⋀_k g(x_k) (resp. g(¬x_k)), where x_k is (resp. is not) accepted in the public space (Dung, 1995)
∙ Rules: prem ⇒ Pr(Acts)
∙ Premises: conjunctions of e(·,·), a(·), h_i(·)
∙ Acts: conjunctions of ⊞, ⊟ on e(·,·), a(·) and ⊕, ⊖ on h_i(·)
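As a concrete illustration (not from the original slides), these predicates and probabilistic rules could be encoded as follows; the tuple layout, the helper names, and the "add" label standing for ⊞/⊕ are assumptions of this Python sketch:

```python
# Hypothetical encoding: a predicate is a tuple, a state is a frozenset of predicates.
def a(x):    return ("a", x)        # x is voiced in the public space
def h(i, x): return (f"h{i}", x)    # x is in agent i's private space
def e(x, y): return ("e", x, y)     # x attacks y

# A rule maps a premise (a set of predicates) to a distribution over acts.
# An act is a list of (operation, predicate) pairs; "add" stands for ⊞/⊕.
second_rule_of_R1 = (
    {h(1, "b"), a("f"), h(1, "c"), e("b", "f"), e("c", "f")},
    [(0.5, [("add", a("b")), ("add", e("b", "f"))]),
     (0.5, [("add", a("c")), ("add", e("c", "f"))])],
)
```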
Formalization of an APS
An APS is characterized (from the point of view of agent 1) by ⟨A, E, G, S_1, g_1, g_2, S_2, P, R_1, R_2⟩:
∙ A, E, P as specified above
∙ G, the set of all possible goals
∙ S_i, the set of private states for agent i
∙ g_i ∈ G, the given goal for agent i
∙ R_i, the set of rules for agent i
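A minimal, hypothetical container for this tuple, used only to fix notation for the later sketches (the field names and types are assumptions, not part of the APS definition):

```python
from typing import NamedTuple, Set, List, Tuple

class APS(NamedTuple):
    """Illustrative container for the APS tuple; field names are assumptions."""
    arguments: Set[str]                # A
    attacks: Set[Tuple]                # E
    goals: Set[Tuple]                  # G, the set of all possible goals
    private_states_1: Set[frozenset]   # S1
    goal_1: Tuple                      # g1
    goal_2: Tuple                      # g2
    private_states_2: Set[frozenset]   # S2
    public_space: Set[frozenset]       # P = 2^A x 2^E
    rules_1: List[Tuple]               # R1
    rules_2: List[Tuple]               # R2
```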
Example: Arguments
Is e-sport a sport?
a. E-sport is a sport
b. E-sport requires focusing and precision, and generates tiredness
c. Not all sports are physical
d. Sports not referenced by the IOC exist
e. Chess is a sport
f. E-sport is not a physical activity
g. E-sport is not referenced by the IOC
h. Working requires focusing and generates tiredness, but is not a sport
Example: Formalization
∙ A = {a, b, c, d, e, f, g, h}
∙ E = {e(f, a), e(g, a), e(b, f), e(c, f), e(h, b), e(g, c), e(d, g), e(e, g)}
∙ g_1 = g(a)
∙ R_1 = {h_1(a) ⇒ ⊞a(a),
    h_1(b) ∧ a(f) ∧ h_1(c) ∧ e(b, f) ∧ e(c, f) ⇒ 0.5 : ⊞a(b) ∧ ⊞e(b, f) ∨ 0.5 : ⊞a(c) ∧ ⊞e(c, f),
    h_1(d) ∧ a(g) ∧ h_1(e) ∧ e(d, g) ∧ e(e, g) ⇒ 0.8 : ⊞a(e) ∧ ⊞e(e, g) ∨ 0.2 : ⊞a(d) ∧ ⊞e(d, g)}
Example: Formalization
∙ R_2 = {h_2(h) ∧ a(b) ∧ e(h, b) ⇒ ⊞a(h) ∧ ⊞e(h, b),
    h_2(g) ∧ a(c) ∧ e(g, c) ⇒ ⊞a(g) ∧ ⊞e(g, c),
    a(a) ∧ h_2(f) ∧ h_2(g) ∧ e(f, a) ⇒ 0.8 : ⊞a(f) ∧ ⊞e(f, a) ∨ 0.2 : ⊞a(g) ∧ ⊞e(g, a)}
∙ Initial state: h_1(a, b, c, d, e), {}, h_2(f, g, h)
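Under the illustrative tuple encoding used in the earlier sketch, this initial state could be written as follows (for illustration only):

```python
# Initial state h1(a, b, c, d, e), {}, h2(f, g, h): the public space starts empty.
initial_state = frozenset(
    [("h1", x) for x in "abcde"] +    # agent 1's private arguments
    [("h2", x) for x in "fgh"]        # agent 2's private arguments
)
```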
Attacks graph
Figure: Attack graph of the e-sport example (arguments a–h, with the attacks listed in E).
Probabilistic Finite State Machine: Graph
APS → Probabilistic Finite State Machine
Figure: PFSM of the e-sport example (states σ1–σ12; transition probabilities follow the rule distributions, e.g. 0.8/0.2 and 0.5/0.5 branches).
Probabilistic Finite State Machine
To optimize the sequence of arguments for agent 1, we could optimize the PFSM, but:
1. it depends on the initial state,
2. it requires knowledge of the private state of the opponent.
Using Markov models, we can relax requirements 1 and 2. Moreover, the APS formalization can be modified in order to comply with the Markov assumption.
Markov Decision Process
A Markov Decision Process (MDP) (Puterman, 1994) is characterized by a tuple ⟨S, A, T, R⟩:
∙ S, a set of states,
∙ A, a set of actions,
∙ T : S × A → Pr(S), a transition function,
∙ R : S × A → ℝ, a reward function.
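For readers unfamiliar with MDPs, the following generic value-iteration sketch shows what "solving" such a model means; the discount factor, iteration count, and function signatures are illustrative assumptions, and the paper itself solves MOMDPs with dedicated algorithms (MO-IP, MO-SARSOP) rather than plain value iteration:

```python
def value_iteration(states, actions, T, R, gamma=0.95, n_iter=200):
    """Generic sketch of solving an MDP: compute the optimal value V*(s).
    T(s, a) returns a dict {next_state: probability}; R(s, a) is a number."""
    V = {s: 0.0 for s in states}
    for _ in range(n_iter):
        V = {s: max(R(s, a) + gamma * sum(p * V[s2] for s2, p in T(s, a).items())
                    for a in actions)
             for s in states}
    return V
```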
Partially-Observable Markov Decision Process
A Partially-Observable MDP (POMDP) (Puterman, 1994) is characterized by a tuple ⟨S, A, T, R, O, Q⟩:
∙ S, a set of states,
∙ A, a set of actions,
∙ T : S × A → Pr(S), a transition function,
∙ R : S × A → ℝ, a reward function,
∙ O, an observation set,
∙ Q : S × A → Pr(O), an observation function.
Mixed-Observability Markov Decision Process
A Mixed-Observability MDP (MOMDP) (Ong et al., 2010) is characterized by a tuple ⟨S_v, S_h, A, T, R, O_v, O_h, Q⟩:
∙ S_v, S_h, the visible and hidden parts of the state,
∙ A, a set of actions,
∙ T : S_v × A × S_h → Pr(S_v × S_h), a transition function,
∙ R : S_v × A × S_h → ℝ, a reward function,
∙ O_v = S_v, an observation set on the visible part of the state,
∙ O_h, an observation set on the hidden part of the state,
∙ Q : S_v × A × S_h → Pr(O_v × O_h), an observation function.
Transformation to a MOMDP
Transformation to a MOMDP
An APS from the point of view of agent 1 can be transformed into a MOMDP:
∙ S_v = S_1 × P, S_h = S_2
∙ A = {prem(r) ⇒ m | r ∈ R_1 and m ∈ acts(r)}
∙ O_v = S_v and O_h = ∅
∙ Q(⟨s_v, s_h⟩, a, ⟨s_v⟩) = 1, and 0 otherwise
∙ T, defined on the next slide
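A sketch of how the action set A could be enumerated from agent 1's rules, using the rule encoding assumed earlier (the function name and rule format are illustrative, not the authors' implementation):

```python
def momdp_actions(rules_agent1):
    """Build the MOMDP action set A = {prem(r) => m | r in R1, m in acts(r)}:
    one action per (premise, act) pair, since agent 1 chooses which act of a
    fireable rule to play (rule format as in the earlier sketch)."""
    return [(premises, act)
            for premises, outcomes in rules_agent1
            for _probability, act in outcomes]
```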
Transformation to a MOMDP: Transition function
Application set
Let C_s(R_i) be the set of rules of R_i that can be fired in state s. The application set F_r(m, s) is the set of predicates resulting from the application of act m of rule r on s. If r cannot be fired in s, F_r(m, s) = s.
∙ s, a state, and r : p ⇒ m, an action such that r ∈ A
∙ s′ = F_r(m, s)
∙ r′ ∈ C_s′(R_2) such that r′ : p′ ⇒ [π_1/m_1, . . . , π_n/m_n]
∙ s′′_i = F_r′(m_i, s′)
∙ T(s, r, s′′_i) = π_i
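The following sketch puts the two pieces together under the same assumed encoding: F_r(m, s) as apply_act and the transition distribution as transition. How agent 2 selects among several fireable rules is not specified above, so picking the first fireable rule here is an explicit assumption:

```python
def apply_act(state, premises, act):
    """F_r(m, s): apply act m of a rule to state s, or return s unchanged
    if the rule cannot fire (its premises do not all hold in s)."""
    if not premises <= state:
        return state
    result = set(state)
    for operation, predicate in act:
        if operation == "add":        # ⊞ / ⊕
            result.add(predicate)
        else:                         # ⊟ / ⊖
            result.discard(predicate)
    return frozenset(result)

def transition(state, action, rules_agent2):
    """T(s, r, s''_i) = pi_i (sketch): agent 1 plays its chosen act, then a
    fireable rule of agent 2 responds with its own distribution over acts.
    Assumption: if several rules of agent 2 can fire, the first one is used."""
    premises, act = action
    s_prime = apply_act(state, premises, act)
    fireable = [(p, outs) for p, outs in rules_agent2 if p <= s_prime]
    if not fireable:
        return {s_prime: 1.0}
    p2, outcomes2 = fireable[0]
    distribution = {}
    for pi, m in outcomes2:
        s2 = apply_act(s_prime, p2, m)
        distribution[s2] = distribution.get(s2, 0.0) + pi
    return distribution
```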
Reward function
For the reward function:
∙ with Dung's semantics: a positive reward for each part of the goal that holds
∙ can be generalized: General Gradual Valuation (Cayrol and Lagasquie-Schiex, 2005)
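A possible concretization of the first bullet, assuming grounded semantics and a reward of +1 per satisfied goal conjunct; both the choice of grounded semantics and the exact reward values are assumptions of this sketch:

```python
def grounded_extension(arguments, attacks):
    """Grounded semantics (Dung, 1995), iterative labelling: accept arguments
    whose attackers are all defeated; defeat arguments attacked by an accepted
    argument; repeat until nothing changes."""
    accepted, defeated = set(), set()
    changed = True
    while changed:
        changed = False
        for x in arguments:
            if x in accepted or x in defeated:
                continue
            attackers = {y for (y, z) in attacks if z == x}
            if attackers <= defeated:
                accepted.add(x); changed = True
            elif attackers & accepted:
                defeated.add(x); changed = True
    return accepted

def reward(public_arguments, public_attacks, goal):
    """Sketch: +1 for every conjunct of the goal that holds in the public space.
    goal is a set of (argument, wanted_accepted) pairs, e.g. {("a", True)} for g(a)."""
    accepted = grounded_extension(public_arguments, public_attacks)
    return sum(1 for arg, wanted in goal if (arg in accepted) == wanted)
```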
Transformation to a MOMDP
Model sizes:
APS: 8 arguments, 8 attacks, 6 rules
POMDP: 4 294 967 296 states
MOMDP: 16 777 216 states
Intractable instances → the problem needs to be optimized at the root, before solving.
Solving an APS
Solving an APS
Two algorithms to solve MOMDPs:
∙ MO-IP (Araya-López et al., 2010), the IP algorithm for POMDPs adapted to MOMDPs (exact method)
∙ MO-SARSOP (Ong et al., 2010), the SARSOP algorithm for POMDPs adapted to MOMDPs (approximate but very efficient method)
Two kinds of optimizations: with or without dependencies on the initial state.
Optimizations without dependencies
Irr. Prunes irrelevant arguments
Enth. Infers attacks
Dom. Removes dominated arguments
These optimizations come with a guarantee on the uniqueness and optimality of the solution.
Attacks graph
Argument dominance
If an argument is attacked by an unattacked argument, it is dominated.
Figure: Attack graph of the example.
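The dominance check itself is straightforward to state in code; this sketch assumes attacks are represented as (attacker, attacked) pairs:

```python
def dominated_arguments(arguments, attacks):
    """An argument attacked by an unattacked argument is dominated:
    its attacker can never be defeated, so the argument is never accepted."""
    attacked = {y for (_x, y) in attacks}
    unattacked = set(arguments) - attacked
    return {y for (x, y) in attacks if x in unattacked}
```

In the e-sport example, h is unattacked and attacks b, so b is dominated and is removed by the Dom. optimization.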
Optimization with dependencies
Irr(s0) has to be reapplied each time the initial state changes.
1. For each predicate that is never modified but is used in premises:
1.1 Remove all the rules that are not compatible with the value of this predicate in the initial state.
1.2 For all remaining rules, remove the predicate from the premises.
2. For each remaining action of agent 1, track the rules of agent 2 compatible with the application of this action. If a rule of agent 2 is not compatible with any application of an action of agent 1, remove it.
(A sketch of step 1 follows.)
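A sketch of step 1 under the rule encoding assumed earlier; modifiable_predicates (the predicates some act can change) and the function name are assumptions of this illustration:

```python
def prune_for_initial_state(rules, initial_state, modifiable_predicates):
    """Step 1 of the initial-state-dependent optimization (sketch).
    A premise on a never-modified predicate keeps the value it has in s0,
    so rules whose static premises are false in s0 can never fire."""
    pruned = []
    for premises, outcomes in rules:
        static = {p for p in premises if p not in modifiable_predicates}
        if not static <= initial_state:
            continue                                    # incompatible with s0: drop the rule
        pruned.append((premises - static, outcomes))    # strip the static premises
    return pruned
```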
Experiments
Experiments
We computed a solution for the e-sport problem with:
∙ MO-IP, which did not finish after tens of hours
∙ MO-SARSOP without optimizations, which did not finish either
∙ MO-SARSOP with optimizations, which found the optimal solution in 4 seconds
Experiments: Policy graph
Figure: Policy graph for the example (nodes r^1_{i,j} and ∅ are actions of agent 1; edges o_1–o_8 are observations).
Experiments: More examples
       None   Irr.   Enth.   Dom.   Irr(s0)   All
Ex 1      —      —      —      —        —    0.56
Ex 2    3.3    0.3    0.3    0.4        0       0
Dv.       —      —      —      —        —      32
6      1313     22     43      7      2.4     0.9
7         —    180    392     16       20     6.7
8         —      —      —      —      319      45
9         —      —      —      —        —       —
Table: Computation time (in seconds)
Conclusion and discussion
Conclusion
We presented:
1. A new framework to represent more complex debate problems (APS)
2. A method to transform those problems into a MOMDP
3. Several optimizations that can also be used outside of the MOMDP context
4. A method to optimize the actions of an agent in an APS
Perspectives
We are currently working on using POMCP (Silver and Veness, 2010). We are also working on using HS3MDPs (Hadoux et al., 2014).
Questions?
Bibliography
Araya-López, M., Thomas, V., Buffet, O., and Charpillet, F. (2010). A closer look at MOMDPs. In 22nd IEEE International Conference on Tools with Artificial Intelligence (ICTAI).
Cayrol, C. and Lagasquie-Schiex, M.-C. (2005). Graduality in argumentation. Journal of Artificial Intelligence Research (JAIR), 23:245–297.
Dung, P. M. (1995). On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games. Artificial Intelligence, 77(2):321–358.
Hadoux, E., Beynier, A., and Weng, P. (2014). Solving Hidden-Semi-Markov-Mode Markov Decision Problems. In Straccia, U. and Calì, A., editors, Scalable Uncertainty Management, volume 8720 of Lecture Notes in Computer Science, pages 176–189. Springer International Publishing.
Hunter, A. (2014). Probabilistic strategies in dialogical argumentation. In International Conference on Scalable Uncertainty Management (SUM'14), LNCS volume 8720.
Ong, S. C., Png, S. W., Hsu, D., and Lee, W. S. (2010). Planning under uncertainty for robotic tasks with mixed observability. The International Journal of Robotics Research.
Puterman, M. L. (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons.
Silver, D. and Veness, J. (2010). Monte-Carlo planning in large POMDPs. In Proceedings of the 24th Conference on Neural Information Processing Systems (NIPS), pages 2164–2172.