Discover how testers can govern AI quality using metrics, thresholds, and structured evaluation for probabilistic systems.
"100 percent correct does not exist in AI. So what does quality mean?"
Artificial intelligence breaks one of the most fundamental assumptions of traditional testing: that systems behave deterministically and can be validated with clear pass-or-fail outcomes. AI systems drift, hallucinate, and change their behaviour depending on data, prompts, and context. Yet many organizations still try to apply classical testing approaches to these probabilistic systems.
In this talk, Nicole challenges the traditional quality mindset and presents a practical approach for governing AI quality. Instead of asking whether an AI system is correct, the conversation shifts toward managing acceptable risk and making AI behaviour measurable. The talk introduces Evaluaite, a framework designed to help teams structure quality governance in AI-driven systems.
Using real-world examples and a short live demonstration, Nicole shows how testers can apply metrics, thresholds, and LLM evaluations to design structured quality control for AI applications. The session explores how QA professionals and agile teams can evolve their role from defect detection toward actively governing reliability and trust in AI systems.
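To make the idea of metrics, thresholds, and LLM evaluations concrete, here is a minimal illustrative sketch of what such a quality gate can look like in practice. It is an assumption-based example, not the Evaluaite framework itself; `generate_answer` and `judge_relevance` are hypothetical placeholders standing in for the system under test and an LLM-as-judge call.

```python
# Illustrative sketch only: gate an AI feature on aggregate metrics and
# thresholds instead of exact pass/fail assertions per output.
from statistics import mean

RELEVANCE_THRESHOLD = 0.80   # minimum acceptable per-answer judge score (0.0 .. 1.0)
PASS_RATE_THRESHOLD = 0.95   # share of test cases that must clear that bar

def generate_answer(prompt: str) -> str:
    # Placeholder for the system under test (e.g. a chatbot or RAG pipeline).
    return "stubbed answer for: " + prompt

def judge_relevance(prompt: str, answer: str) -> float:
    # Placeholder for an LLM-as-judge call that scores the answer from 0.0 to 1.0.
    return 0.9

def evaluate(test_cases: list[dict]) -> dict:
    """Score every case, then turn the scores into a release decision."""
    scores = [judge_relevance(c["prompt"], generate_answer(c["prompt"]))
              for c in test_cases]
    pass_rate = sum(s >= RELEVANCE_THRESHOLD for s in scores) / len(scores)
    return {
        "average_score": mean(scores),
        "pass_rate": pass_rate,
        # The verdict is a risk decision against thresholds, not a binary correctness check.
        "release_ok": pass_rate >= PASS_RATE_THRESHOLD,
    }

print(evaluate([{"prompt": "How do I reset my password?"}]))
```

The point of the sketch is the shape of the decision: individual outputs are scored rather than asserted, and the release verdict comes from comparing aggregate metrics against agreed thresholds.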
Attendees will leave with a new perspective on testing in the AI era and practical ideas for keeping AI systems measurable, manageable, and controllable as they become part of critical software systems.