EvalView - AI Agent Testing
Version updated for https://github.com/hidai25/eval-view to version v0.5.1.
- This action is used across all versions by 0 repositories.
Action Type
This is a Composite action.
Go to the GitHub Marketplace to find the latest changes.
Action Summary
EvalView is a GitHub Action designed for regression testing and evaluation of AI agents, particularly tool-calling agents, in CI/CD workflows. It automates the generation of test suites, snapshots agent behavior as a golden baseline, and detects regressions in tool usage, output, cost, latency, and multi-turn interactions. By providing multi-layered testing and scoring, it ensures robust validation of AI agent performance before deployment.
Release notes
What’s New
Added
evalview generate— draft test suite generation from agent probing or log imports, with approval gating and CI review flow- Approval workflow — generated tests require explicit approval before becoming baselines
- CI review comments —
evalview ci commentposts generation reports on PRs
Fixed
- Python 3.9 compatibility: replaced
datetime.UTCwithtimezone.utc - Mypy type errors in generate command and test generation module
- Codebase refactor and cleanup across 71 files
Full Changelog: https://github.com/hidai25/eval-view/compare/v0.5.0...v0.5.1