Playground vs Evals

Both Playground and Evals help you run work in Spec27, but they serve different needs.

Use Playground when you want speed

Playground is best when you want to:

  • try a configuration quickly
  • inspect the run history while you experiment
  • generate derivative or adversarial datasets from an existing flow
  • validate that a setup is ready before saving it as a named eval

Use Evals when you want repeatability

Evals are best when you want to:

  • save a named evaluation setup
  • attach one or more agents to one or more specifications
  • run the same setup again later
  • review persisted results in project result views

Simple rule of thumb

  • Playground is for exploration.
  • Evals are for named, reusable evaluation workflows.