Level 2 · Intermediate · 6 min
Evals
Automated tests for agents and prompts. Run on a fixed dataset, score the output, catch regressions before users do.
Automated tests for agents and prompts. Run on a fixed dataset, score the output, catch regressions before users do.