MCP Server Evals
Updated for version v1 of https://github.com/mcp-use/eval-action.
- This action is used across all versions by ? repositories.
Action Type
This is a Composite action.
Go to the GitHub Marketplace to find the latest changes.
Action Summary
The MCP Server Eval Action is a reusable GitHub Action that automates the evaluation of large language model (LLM) agents that use an MCP server. It runs LLM-as-judge evaluations: it executes test cases defined in a YAML file and assesses agent performance against specified criteria. The action supports both local and remote server configurations and generates detailed evaluation reports in JSON and Markdown formats.
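To make the LLM-as-judge flow concrete, here is a minimal Python sketch of the general pattern: run each test case through an agent, have a judge score the response against a criterion, and emit JSON and Markdown reports. It is not the action's implementation. Every name in it (`CASES`, `stub_agent`, `stub_judge`, `run_evals`, the case fields `name`/`prompt`/`criteria`, and the report layout) is a hypothetical chosen for illustration, the test cases are inlined as dictionaries rather than loaded from YAML, and the judge is a simple substring check standing in for a real LLM call.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class EvalResult:
    name: str
    passed: bool
    reason: str


# Hypothetical test cases; in practice these would be parsed from a YAML file.
CASES = [
    {"name": "capital_of_france", "prompt": "What is the capital of France?", "criteria": "Paris"},
    {"name": "capital_of_peru", "prompt": "What is the capital of Peru?", "criteria": "Lima"},
]


def stub_agent(prompt: str) -> str:
    """Stand-in for an agent backed by an MCP server (assumption, not a real client)."""
    if "France" in prompt:
        return "The capital of France is Paris."
    return "I don't know."


def stub_judge(response: str, criteria: str) -> tuple[bool, str]:
    """Stand-in for an LLM judge: passes if the criterion text appears in the response."""
    passed = criteria.lower() in response.lower()
    reason = f"response {'mentions' if passed else 'does not mention'} '{criteria}'"
    return passed, reason


def run_evals(cases, agent, judge) -> list[EvalResult]:
    """Run every case through the agent, then ask the judge for a verdict."""
    results = []
    for case in cases:
        response = agent(case["prompt"])
        passed, reason = judge(response, case["criteria"])
        results.append(EvalResult(case["name"], passed, reason))
    return results


def to_json(results: list[EvalResult]) -> str:
    """Serialize a summary plus per-case results as JSON."""
    summary = {
        "total": len(results),
        "passed": sum(r.passed for r in results),
        "results": [asdict(r) for r in results],
    }
    return json.dumps(summary, indent=2)


def to_markdown(results: list[EvalResult]) -> str:
    """Render the results as a Markdown table."""
    lines = ["| Test | Result | Reason |", "| --- | --- | --- |"]
    for r in results:
        lines.append(f"| {r.name} | {'PASS' if r.passed else 'FAIL'} | {r.reason} |")
    return "\n".join(lines)


if __name__ == "__main__":
    results = run_evals(CASES, stub_agent, stub_judge)
    print(to_json(results))
    print(to_markdown(results))
```

Passing the agent and judge in as plain callables keeps the loop independent of any particular model provider or server transport, so the stubs can be swapped for real calls without touching the reporting code.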