This document discusses sequential decision-making in non-stationary environments and its application to argumentative debates. It begins by defining Markov decision processes (MDPs) and partially observable MDPs (POMDPs), both of which assume a stationary environment. To address non-stationarity, it introduces hidden-mode MDPs (HM-MDPs), which model the environment as switching over time among a set of stationary modes, where the current mode is hidden from the agent. The document works through an example that models a traffic light problem as an HM-MDP, then discusses applying these techniques to strategic debates and mediation problems.
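To make the idea of mode switching concrete, the following is a minimal sketch of an HM-MDP-style simulator. It is not the document's model: the two modes, the state and action sets, every probability, and the names `MODE_SWITCH`, `TRANSITION`, `step`, and `rollout` are hypothetical, and the sketch assumes that the hidden mode evolves as a Markov chain independent of the agent's actions.

```python
import random

# Hypothetical two-mode environment. All names and numbers here are
# illustrative assumptions, not values taken from the document.
MODES = ["mode_a", "mode_b"]
STATES = [0, 1]
ACTIONS = [0, 1]

# Assumed mode dynamics: MODE_SWITCH[m][m2] = P(next mode = m2 | mode = m).
MODE_SWITCH = {
    "mode_a": {"mode_a": 0.9, "mode_b": 0.1},
    "mode_b": {"mode_a": 0.2, "mode_b": 0.8},
}

# Each mode is a stationary MDP of its own:
# TRANSITION[m][(s, a)] = P(next state = 1 | state = s, action = a, mode = m).
TRANSITION = {
    "mode_a": {(0, 0): 0.1, (0, 1): 0.9, (1, 0): 0.1, (1, 1): 0.9},
    "mode_b": {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.9, (1, 1): 0.1},
}


def step(mode, state, action, rng):
    """Advance one time step; return (next_mode, next_state, reward)."""
    p_one = TRANSITION[mode][(state, action)]
    next_state = 1 if rng.random() < p_one else 0
    # Assumed reward: 1 for landing in state 1, otherwise 0.
    reward = 1.0 if next_state == 1 else 0.0
    weights = [MODE_SWITCH[mode][m] for m in MODES]
    next_mode = rng.choices(MODES, weights=weights)[0]
    return next_mode, next_state, reward


def rollout(policy, horizon, seed=0):
    """Run a policy that sees only the state, never the hidden mode."""
    rng = random.Random(seed)
    mode, state = "mode_a", 0
    total = 0.0
    for _ in range(horizon):
        action = policy(state)
        mode, state, reward = step(mode, state, action, rng)
        total += reward
    return total
```

Calling `rollout(lambda s: 1, horizon=50)` runs a fixed policy that always picks action 1. In this sketch that action works well in `mode_a` and poorly in `mode_b`, so the same policy earns different returns depending on a mode it never observes, which is the kind of non-stationarity a single stationary MDP cannot express.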