Evals
Purpose
Use Evals to define named, repeatable evaluation setups.
When to use it
Create an eval when you want to connect one or more agents with one or more specifications and run them as a saved workflow.
Prerequisites
- You have a project.
- You have at least one agent.
- You have at least one specification.
Steps
- Open Evals inside a project.
- Create a new eval and give it a descriptive name.
- Select the agents the eval should use.
- Select the specifications the eval should use.
- Save the eval and open its detail page.
- Review the overview, schedule, linked agents, linked specifications, and results summary.
- Start a run from the eval detail page when you are ready.
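It can help to picture the steps above as building a small saved record and then running it. The Python sketch below is illustrative only: evals are created through the project UI as described, and the names `Eval`, `Run`, and `start_run` are hypothetical, chosen for this example rather than taken from any actual API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Run:
    """One execution of an eval (hypothetical shape, for illustration)."""
    eval_name: str
    agents: List[str]
    specifications: List[str]
    status: str = "pending"


@dataclass
class Eval:
    """A named, saved pairing of agents and specifications (hypothetical shape)."""
    name: str
    agents: List[str]
    specifications: List[str]
    runs: List[Run] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Mirrors the prerequisites: a name, at least one agent,
        # and at least one specification.
        if not self.name.strip():
            raise ValueError("an eval needs a descriptive name")
        if not self.agents:
            raise ValueError("select at least one agent")
        if not self.specifications:
            raise ValueError("select at least one specification")

    def start_run(self) -> Run:
        # Each run records a copy of the linked agents and specifications,
        # so earlier runs stay reviewable even if the eval is edited later.
        run = Run(self.name, list(self.agents), list(self.specifications))
        self.runs.append(run)
        return run


# Example values are made up for this sketch.
nightly = Eval(
    name="Checkout agent regression",
    agents=["checkout-agent-v2"],
    specifications=["refund-policy", "address-validation"],
)
first = nightly.start_run()
second = nightly.start_run()  # run again without reselecting anything
```

The point of the sketch is the shape, not the names: the selections are stored once with the eval, and every later run reuses them.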
Expected result
You have a saved evaluation workflow that can be run again and reviewed later.
What to know
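- Evals suit workflows that a team runs repeatedly, because the selected agents and specifications are saved with the eval instead of being chosen again for each run.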
- Some projects also configure schedules for recurring eval runs.
- Run details preserve each run's status, outputs, and robustness-related summaries, so you can review a run after it finishes.