Troubleshooting
A run did not start
Check:
- the eval has at least one linked agent
- the eval has at least one linked specification
- any datasets or preparation steps the run depends on have finished
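The three checks above can be sketched as a small pre-flight function. This is only an illustration: the `EvalSetup` class, its field names, and `run_blockers` are hypothetical and do not correspond to a real product API.

```python
from dataclasses import dataclass, field


@dataclass
class EvalSetup:
    """Hypothetical snapshot of an eval's configuration (not a real API)."""
    linked_agents: list = field(default_factory=list)
    linked_specifications: list = field(default_factory=list)
    # Maps each dependency (dataset or preparation step) to True once complete.
    dependencies: dict = field(default_factory=dict)


def run_blockers(setup: EvalSetup) -> list:
    """Return the reasons a run cannot start; an empty list means none found."""
    blockers = []
    if not setup.linked_agents:
        blockers.append("no linked agent")
    if not setup.linked_specifications:
        blockers.append("no linked specification")
    for name, complete in setup.dependencies.items():
        if not complete:
            blockers.append(f"dependency not complete: {name}")
    return blockers
```

An empty result only means that none of these three conditions explains the failure; other causes are still possible.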
Agent preview is blocked
Check:
- whether the agent requires Secrets that have not been added
- whether the secret key names match what the agent expects
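One way to run both checks by hand is to compare the list of names the agent expects with the list of names you have configured. The sketch below assumes names must match exactly, which may not hold for every setup; `diagnose_secrets` is a hypothetical helper, and it looks only at names, never at secret values.

```python
def diagnose_secrets(required, configured):
    """Compare the secret names an agent requires with those configured.

    Both arguments are iterables of names; values are never inspected.
    """
    required, configured = set(required), set(configured)
    missing = required - configured
    # A name that differs only by case or surrounding whitespace is a likely typo.
    normalized = {name.strip().lower(): name for name in configured}
    near_misses = {
        name: normalized[name.strip().lower()]
        for name in missing
        if name.strip().lower() in normalized
    }
    return {
        "missing": sorted(missing - set(near_misses)),
        "near_misses": near_misses,
    }
```

Names under `missing` have no counterpart at all, while `near_misses` pairs each expected name with a configured name that looks like a misspelling of it.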
I do not see an asset I expected to see
Check:
- the active organization
- whether the asset belongs to a different project
- whether ownership or visibility rules are filtering it out
A judge-based score looks wrong
Check:
- the judge prompt or configuration
- the sample input and output you used during judge testing
- whether judge-based scoring is the right method for the task
My results are hard to interpret
Start with:
- run status
- latest step
- per-row correctness
- console output or error details
Then compare the run setup against the eval and specification.
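The order of that checklist can be sketched as a function that stops at the first item worth investigating. Everything here is an assumption made for illustration: the `run` dictionary, its keys (`status`, `latest_step`, `rows`, `console`), and the `"completed"` status value are hypothetical and may not match what the product reports.

```python
def first_finding(run: dict) -> str:
    """Walk the checks in order and return the first thing worth investigating.

    `run` is a hypothetical dict with keys: status, latest_step, rows, console.
    """
    # 1-2. Run status, then the latest step the run reached.
    if run.get("status") != "completed":
        return (
            f"run status is {run.get('status')!r}, "
            f"latest step: {run.get('latest_step')!r}"
        )
    # 3. Per-row correctness.
    incorrect = [
        i for i, row in enumerate(run.get("rows", [])) if not row.get("correct")
    ]
    if incorrect:
        return f"{len(incorrect)} incorrect row(s), first at index {incorrect[0]}"
    # 4. Console output or error details.
    if run.get("console"):
        return "no incorrect rows; review console output"
    return "nothing flagged; compare the run setup against the eval and specification"
```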