Troubleshooting

A run did not start

Check:

  • the eval has at least one linked agent
  • the eval has at least one linked specification
  • any dependent datasets or preparation steps are complete
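
The three checks above can be read as a pre-flight test. The following is a minimal sketch, not the product's API: the `EvalConfig` class, its field names, and the `"complete"` status string are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EvalConfig:
    """Hypothetical stand-in for an eval's configuration (not a real API)."""
    linked_agents: list[str] = field(default_factory=list)
    linked_specifications: list[str] = field(default_factory=list)
    # Maps each dependency (dataset or preparation step) to its status.
    dependency_status: dict[str, str] = field(default_factory=dict)


def find_run_blockers(config: EvalConfig) -> list[str]:
    """Return one human-readable reason per unmet precondition."""
    blockers = []
    if not config.linked_agents:
        blockers.append("no linked agent")
    if not config.linked_specifications:
        blockers.append("no linked specification")
    for name, status in config.dependency_status.items():
        if status != "complete":  # assumed status value
            blockers.append(f"dependency '{name}' is {status}")
    return blockers


# Example: an agent is linked, but no specification, and a dataset is not ready.
config = EvalConfig(
    linked_agents=["example-agent"],
    dependency_status={"example-dataset": "processing"},
)
for reason in find_run_blockers(config):
    print(reason)
```

An empty result means none of the listed preconditions is the cause, so the problem lies elsewhere.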

Agent preview is blocked

Check:

  • whether the agent requires Secrets that have not been configured
  • whether the secret key names match what the agent expects
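
Both checks amount to comparing two sets of key names. The sketch below assumes that key names must match exactly, including case and separators. That is an assumption, not documented behavior. The function name and the example key names are hypothetical.

```python
def diff_secret_keys(required: set[str], configured: set[str]) -> dict[str, list[str]]:
    """Split required keys that are not configured into two groups:
    keys that are absent, and keys that exist under a slightly different name."""

    def normalize(key: str) -> str:
        # Ignore case and treat '-' and '_' as the same separator.
        return key.lower().replace("-", "_")

    # Index configured keys by normalized name (sorted for stable results).
    lookup = {}
    for key in sorted(configured):
        lookup.setdefault(normalize(key), key)

    missing, name_mismatch = [], []
    for key in sorted(required - configured):
        candidate = lookup.get(normalize(key))
        if candidate is None:
            missing.append(key)
        else:
            name_mismatch.append(f"{key} (found {candidate})")
    return {"missing": missing, "name_mismatch": name_mismatch}


# Hypothetical key names, for illustration only.
report = diff_secret_keys(
    required={"API_TOKEN", "DB_PASSWORD"},
    configured={"api-token"},
)
print(report)
```

A key listed under `name_mismatch` usually needs renaming rather than re-creating.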

I do not see an asset I expected to see

Check:

  • the active organization
  • whether the asset belongs to a different project
  • whether ownership or visibility rules are filtering it out
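
The three filters above can be checked one at a time. This is a rough sketch under stated assumptions: the `Asset` class, its fields, and the per-user `visible_to` set are hypothetical and only model the idea of organization, project, and visibility scoping.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    """Hypothetical asset record (not a real API)."""
    name: str
    organization: str
    project: str
    visible_to: set[str]  # assumed visibility model: explicit user names


def explain_hidden(asset: Asset, active_org: str, active_project: str, user: str) -> list[str]:
    """List every scoping rule that would hide the asset from this user."""
    reasons = []
    if asset.organization != active_org:
        reasons.append(
            f"belongs to organization '{asset.organization}', "
            f"not active organization '{active_org}'"
        )
    if asset.project != active_project:
        reasons.append(
            f"belongs to project '{asset.project}', not '{active_project}'"
        )
    if user not in asset.visible_to:
        reasons.append(f"not visible to user '{user}'")
    return reasons


asset = Asset("example-dataset", "org-a", "project-1", {"dana"})
print(explain_hidden(asset, "org-b", "project-1", "dana"))
```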

A judge-based score looks wrong

Check:

  • the judge prompt or configuration
  • the sample input and output you used during judge testing
  • whether judge-based scoring is the right method for the task

My results are hard to interpret

Start with:

  • run status
  • latest step
  • per-row correctness
  • console output or error details

Then compare the run setup against the eval and specification.
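
Reading the four items in the order listed gives a short triage summary. The sketch below assumes a run can be exported as a plain dictionary. The keys `status`, `latest_step`, `rows`, and `error`, and the per-row `id` and `correct` fields, are hypothetical.

```python
def summarize_run(run: dict) -> list[str]:
    """Summarize a run in triage order: status, latest step,
    per-row correctness, then any error details."""
    lines = [
        f"status: {run.get('status', 'unknown')}",
        f"latest step: {run.get('latest_step', 'unknown')}",
    ]
    rows = run.get("rows", [])
    if rows:
        failed = [row["id"] for row in rows if not row["correct"]]
        lines.append(f"correct rows: {len(rows) - len(failed)}/{len(rows)}")
        if failed:
            lines.append(f"incorrect row ids: {failed}")
    else:
        lines.append("correct rows: none scored")
    if run.get("error"):
        lines.append(f"error: {run['error']}")
    return lines


# Hypothetical run record, for illustration only.
run = {
    "status": "failed",
    "latest_step": "scoring",
    "rows": [{"id": 1, "correct": True}, {"id": 2, "correct": False}],
    "error": "judge timeout",
}
print("\n".join(summarize_run(run)))
```

Listing the incorrect row ids first makes it easier to tell a systematic failure (most rows wrong) from a few isolated cases.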