I will talk about validating and monitoring AI agents, using as an example a mobile application that interacts with a multi-agent system via OpenAPI. I will demonstrate practical approaches to testing agent logic, collecting performance metrics, and setting up an observability system. I will also share my experience with tracking agent behavior in real time, detecting anomalies, and keeping a multi-agent architecture reliable in production.
Related topics: