This document summarizes a presentation on using reinforcement learning to determine optimal structured treatment interruption (STI) strategies for HIV patients from clinical data. Clinical data from patients on drug regimens can be viewed as a set of trajectories, and reinforcement learning techniques can process these trajectories to infer STI policies without requiring an explicit model of HIV dynamics. The approach formulates STI optimization as a reinforcement learning problem and computes policies directly from the sample trajectories, with the goal of minimizing costs such as side effects while keeping the virus under control.
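To make the idea of computing a policy directly from sample trajectories concrete, here is a minimal sketch of batch Q iteration over one-step samples. The summary does not say which algorithm the presentation uses, so the tabular method, the two discretized states (`low` and `high` viral load), the two actions (`on` and `off` treatment), and every cost value below are assumptions made only for illustration. They are not clinical data.

```python
from collections import defaultdict

# Hypothetical one-step samples (state, action, cost, next_state) cut from
# patient trajectories. States, actions, and costs are invented for this sketch.
SAMPLES = [
    ("high", "on", 7.0, "low"),    # treatment lowers viral load, with side-effect cost
    ("high", "off", 5.0, "high"),  # untreated high viral load persists
    ("low", "on", 2.0, "low"),     # treatment keeps the load low, side-effect cost only
    ("low", "off", 0.0, "low"),    # interruption, load stays low
    ("low", "off", 0.0, "low"),
    ("low", "off", 0.0, "high"),   # interruption, virus rebounds
]
STATES = ["low", "high"]
ACTIONS = ["on", "off"]


def batch_q_iteration(samples, actions, gamma=0.9, n_iters=200):
    """Estimate expected discounted cost Q(s, a) from samples alone (no dynamics model)."""
    q = defaultdict(float)  # Q starts at zero everywhere
    for _ in range(n_iters):
        targets = defaultdict(list)
        for s, a, cost, s_next in samples:
            # One-step target: observed cost plus discounted best (lowest) future cost.
            best_next = min(q[(s_next, b)] for b in actions)
            targets[(s, a)].append(cost + gamma * best_next)
        # Tabular "regression": average the targets seen for each (state, action) pair.
        q = defaultdict(float, {k: sum(v) / len(v) for k, v in targets.items()})
    return q


def greedy_policy(q, states, actions):
    """Pick the action with the lowest estimated cost in each state."""
    return {s: min(actions, key=lambda a: q[(s, a)]) for s in states}


q = batch_q_iteration(SAMPLES, ACTIONS)
policy = greedy_policy(q, STATES, ACTIONS)
print(policy)
```

With these invented numbers, the inferred policy treats when the viral load is high and interrupts treatment when it is low, which is the general shape of an STI strategy. A state-action pair that never appears in the samples keeps a zero cost estimate in this sketch and would look artificially attractive, so the toy data covers every pair. Real clinical states are continuous, and a practical version would replace the per-pair averaging with a regression model.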