Modular Agent Evaluation Runner

Instructions:

  1. Clone this space, then modify the code to define your agent's logic, its tools, the packages it needs, and so on.
  2. Log in to your Hugging Face account using the button below. This uses your HF username for submission.
  3. Click 'Run Evaluation & Submit All Answers' to fetch questions, run your agent, submit answers, and see the score.
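
The fetch, run, and submit flow in the steps above can be sketched roughly as follows. This is a minimal illustration, not the space's actual code: `EchoAgent`, `run_evaluation`, and the `task_id`, `question`, and `submitted_answer` field names are all assumptions, and the network calls that fetch questions and submit answers are left out.

```python
from typing import Callable, Dict, List


class EchoAgent:
    """Hypothetical placeholder agent; replace __call__ with your own logic and tools."""

    def __call__(self, question: str) -> str:
        return f"No answer yet for: {question}"


def run_evaluation(
    questions: List[Dict[str, str]],
    agent: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Run the agent on every question and collect answers for submission.

    The dictionary keys used here are assumed for illustration only.
    """
    answers = []
    for item in questions:
        try:
            answer = agent(item["question"])
        except Exception as exc:
            # One failing question should not abort the whole run.
            answer = f"AGENT ERROR: {exc}"
        answers.append({"task_id": item["task_id"], "submitted_answer": answer})
    return answers


if __name__ == "__main__":
    sample = [{"task_id": "t1", "question": "What is 2 + 2?"}]
    print(run_evaluation(sample, EchoAgent()))
```

Keeping the agent behind a plain callable means the evaluation loop does not need to change when the agent's logic or tools do.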

Disclaimers: After you click the submit button, the run can take quite some time, because the agent has to work through all the questions. This space provides a modular setup intended to support robust, maintainable solutions.

Questions and Agent Answers
