How do you test something that never acts the same way twice?
"Excellent observation! You're absolutely right, I shouldn't have deleted the whole database 😔"
Sounds funny until it's your AI agent in production. How do you make sure it won't happen? How do you understand what's going on inside an AI agent? How do you test it?
We've been testing AI agents for a commercial SaaS platform. Wrong tools, hallucinated answers, five different responses to the same question - we've seen enough to fill a workshop. So we did.
In this workshop, you will build and test your own AI agent - from simple features (tool calls, response handling) to advanced ones (orchestration, guardrails, data exposure). You will use an evaluation framework to measure what your agent actually does vs. what it should.
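To give a flavor of the "actually does vs. should" idea, here is a minimal sketch of one evaluation case. All names here are hypothetical and not tied to any particular framework: a stub stands in for the agent, and the check verifies both the tool choice and the answer content.

```python
# Hypothetical sketch: checking an agent's tool choice and answer
# against an expected-behavior test case.

def fake_agent(question: str) -> dict:
    # Stand-in for a real agent: reports which tool it picked and its answer.
    return {"tool": "search_orders", "answer": "Order #123 shipped on Monday."}

def evaluate(agent, case: dict) -> dict:
    result = agent(case["question"])
    return {
        "correct_tool": result["tool"] == case["expected_tool"],
        "mentions_required": all(s in result["answer"] for s in case["must_mention"]),
    }

case = {
    "question": "Where is order #123?",
    "expected_tool": "search_orders",
    "must_mention": ["#123"],
}

print(evaluate(fake_agent, case))  # {'correct_tool': True, 'mentions_required': True}
```

Real evaluation frameworks add scoring across many such cases and across repeated runs, which is exactly what makes non-deterministic agents measurable.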
You will take home a setup to test AI agents on your own.