Hi everyone,
I’ve recently started experimenting with Atlassian’s newly launched Evaluation feature and have run into an issue.
I created an agent and tested it through normal conversation. In this case, the agent responds correctly and behaves as expected for the given inputs.
However, when I run the same inputs from a CSV file in the Evaluation feature, every test case is marked as failed, even though the responses appear to be valid.
Has anyone faced a similar issue, or does anyone know what might be causing this behavior? Any guidance or suggestions would be greatly appreciated.
Thanks in advance!