24 June 2026 15:20 - 15:50
Beyond Red Teaming: Building a comprehensive monitoring and evaluation framework for conversational AI
As enterprises rush to deploy AI agents in customer-facing operations, most rely on one-time audits and manual testing. This approach offers limited coverage, surfaces problems only after the damage is done, and typically focuses on safety while ignoring accuracy, user experience, and operational efficiency. This session presents a practical framework for continuous AI monitoring and evaluation, drawn from deploying automated testing infrastructure across major European enterprises. Attendees will learn why point-in-time testing isn’t enough, how to implement live monitoring that catches failures before they become widespread, and what the EU AI Act actually requires in terms of ongoing oversight versus pre-deployment testing.