
Testing the Untestable: Building a strategy for testing AI

25-minute Talk

AI agents' output changes every time. "Expected results" do not exist. Traditional automation is brittle. Manual testing is too slow. Results are subjective! How on earth do you move quickly?

Virtual Pass session

Timetable

10:45 a.m. – 11:30 a.m. Tuesday 17th

Room

Room F1 - Track 1: Talks

Artificial Intelligence (AI)

Audience

Testers, Managers

Key-Learnings

  • The Input Shift: Moving from static test cases to Automated Persona-Driven Testing, using AI to simulate thousands of diverse user interactions.
  • The Verification Shift: Replacing binary assertions with Semantic Evaluation.
  • The Safety Shift: Why automation isn't enough, and the role of Human-in-the-Loop review and the unique value it adds in high-risk scenarios.
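
To make the first learning concrete: persona-driven testing replaces a handful of hand-written inputs with a generated matrix of user types and goals. The sketch below is a minimal, self-contained stand-in; in practice an LLM would expand each persona into realistic utterances, and the personas, intents, and templates shown here are illustrative assumptions, not part of the talk.

```python
# Minimal sketch of persona-driven input generation.
# A real pipeline would ask an LLM to phrase each intent in each persona's
# voice; here fixed templates stand in so the example stays runnable.
from itertools import product

# Hypothetical personas, each with its own speaking style.
persona_templates = {
    "novice": "Hi, sorry if this is basic, but how do I {intent}?",
    "expert": "{intent} via the API?",
    "frustrated": "Why is it impossible to {intent}?! Please fix this.",
}

# Hypothetical user goals to cross with every persona.
intents = ["reset a password", "cancel a subscription", "export my data"]

def generate_test_inputs():
    """Yield (persona, utterance) pairs covering every persona x intent combination."""
    for (persona, template), intent in product(persona_templates.items(), intents):
        yield persona, template.format(intent=intent)

inputs = list(generate_test_inputs())
# 3 personas x 3 intents = 9 distinct user interactions to feed the agent
assert len(inputs) == 9
```

Each generated utterance is then sent to the agent under test, multiplying coverage without writing individual test cases by hand.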

For decades, testing has relied on a simple truth: if I provide Input X, I should get Output Y.

Generative AI broke that truth.

When you build AI Agents, the output changes every time. "Expected Results" do not exist in the same way. Traditional automation is brittle. Manual testing is too slow. And to make matters worse, results are subjective!


So, how do you assure quality in a system that, by its very nature, is unpredictable?


This talk outlines the strategic concepts required to tame the chaos of testing AI by exploring the fundamental shifts in the Quality Lifecycle:

  • The Input Shift: Moving from static test cases to Automated Persona-Driven Testing, using AI to simulate thousands of diverse user interactions.
  • The Verification Shift: Replacing binary assertions with Semantic Evaluation.
  • The Baseline Shift: How to establish a "Quality Baseline" for your product.
  • The Safety Shift: Why automation isn't enough. We will discuss the role of Human-in-the-Loop (HITL) review and the unique value it adds in high-risk scenarios.
  • The Observability Shift: The importance of detailed tracing to understand not just what the model said, but why it decided to say it.
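
The Verification Shift above can be sketched in a few lines: instead of asserting that the agent's answer equals a fixed expected string, score how close its meaning is to a reference answer and assert against a threshold. Production systems typically use sentence embeddings or an LLM-as-judge for the scoring; the bag-of-words cosine similarity below is a deliberately simple stand-in so the example stays self-contained, and the threshold value is an illustrative assumption.

```python
# Minimal sketch of Semantic Evaluation: a graded score replaces the
# binary `assert actual == expected` of traditional automation.
import math
import re
from collections import Counter

def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words representation (stand-in for a real embedding)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def semantic_score(candidate: str, reference: str) -> float:
    """Cosine similarity between the two bags of words, in [0.0, 1.0]."""
    a, b = tokenize(candidate), tokenize(reference)
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def assert_semantically_close(candidate: str, reference: str,
                              threshold: float = 0.5) -> None:
    """The non-binary replacement for an exact-match assertion."""
    score = semantic_score(candidate, reference)
    assert score >= threshold, f"semantic score {score:.2f} below {threshold}"

# Two different phrasings of the same answer pass the check,
# even though a string-equality assertion would fail.
assert_semantically_close(
    "You can reset your password from the account settings page.",
    "Passwords can be reset in the account settings page.",
)
```

The key design choice is that the test now expresses a tolerance for rephrasing: the threshold, not an exact string, defines "correct enough", which is what makes the approach usable against nondeterministic output.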
