EvalView - AI Agent Testing
Updated to version v0.5.5 of https://github.com/hidai25/eval-view.
- This action is used across all versions by 0 repositories.
Action Type
This is a Composite action.
Go to the GitHub Marketplace to find the latest changes.
Action Summary
EvalView is a GitHub Action designed to automate regression testing for AI agents, focusing on detecting silent behavior changes, tool call deviations, and quality regressions after updates. It provides capabilities to create baselines, track multi-turn execution traces, and pinpoint changes in agent behavior, ensuring reliable performance across model updates. This tool addresses gaps in traditional testing methods by offering specialized evaluations for scenarios like tool usage and multi-step workflows.
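To make "tool call deviations" concrete, here is a minimal sketch of comparing a baseline tool-call sequence against a new run. It is an illustration only, not EvalView's implementation: the function name diff_tool_calls and the tool names are hypothetical, and the comparison uses Python's standard difflib.

```python
from difflib import SequenceMatcher


def diff_tool_calls(baseline, current):
    """Return (operation, baseline_slice, current_slice) for every region
    where the two tool-call sequences differ."""
    matcher = SequenceMatcher(a=baseline, b=current, autojunk=False)
    return [
        (tag, baseline[i1:i2], current[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical traces: the new run silently skips one tool call.
baseline = ["search_docs", "fetch_page", "summarize"]
current = ["search_docs", "summarize"]

changes = diff_tool_calls(baseline, current)
for tag, old, new in changes:
    print(f"{tag}: baseline={old} current={new}")
```

An empty result means the run followed the baseline path; any other result is a candidate regression to review.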
Release notes
What’s New
New Commands
- evalview watch: re-run regression checks on every file save with live scorecard ($0 in quick mode)
- evalview badge: shields.io status badge, auto-updates on every check
- evalview monitor --dashboard: live terminal dashboard with per-test history dots
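For readers unfamiliar with shields.io badges that update themselves, the sketch below shows one common way to do it: publish a small JSON file in the shields.io endpoint schema and point a badge URL at it. Whether evalview badge works this way is an assumption; the function names, label, and example URL are hypothetical.

```python
import json
from urllib.parse import quote


def badge_payload(passed, total):
    """Build a JSON-serializable dict in the shields.io endpoint schema."""
    return {
        "schemaVersion": 1,
        "label": "evalview",  # hypothetical label
        "message": f"{passed}/{total} passing",
        "color": "brightgreen" if passed == total else "red",
    }


def badge_url(json_url):
    """Return the shields.io endpoint badge URL that renders the JSON file."""
    return "https://img.shields.io/endpoint?url=" + quote(json_url, safe="")


payload = badge_payload(passed=9, total=10)
print(json.dumps(payload))
print(badge_url("https://example.com/badge.json"))  # placeholder location
```

Regenerating the JSON file after each check is what keeps a badge like this current; the badge URL itself never changes.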
Native Adapters
- Pydantic AI (pydantic-ai): calls agent.run() in-process, extracts tool calls from typed messages
- CrewAI (crewai-native): calls crew.kickoff() in-process, captures tools via event bus
Smart DX
- Assertion wizard: capture real traffic, get pre-configured assertions automatically
- Auto-variant discovery: --statistical N --auto-variant finds and saves non-deterministic paths
- Budget circuit breaker: --budget 0.50 enforces spend limits mid-execution
- Eval profiles: init auto-detects agent type and configures evaluators
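The budget circuit breaker idea can be sketched as a running cost counter that halts the suite once the limit is crossed. This is a generic illustration under stated assumptions, not EvalView's code: BudgetBreaker, run_suite, and the per-test costs are all hypothetical.

```python
class BudgetExceeded(RuntimeError):
    """Raised when cumulative spend passes the configured limit."""


class BudgetBreaker:
    def __init__(self, limit_usd):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def charge(self, cost_usd):
        # Record the cost first, then trip if the total is over the limit.
        self.spent_usd += cost_usd
        if self.spent_usd > self.limit_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.2f} of ${self.limit_usd:.2f}"
            )


def run_suite(tests, limit_usd):
    """Run (name, cost) pairs in order; stop as soon as the budget trips."""
    breaker = BudgetBreaker(limit_usd)
    completed = []
    for name, cost_usd in tests:
        completed.append(name)  # the test has run; its cost is now known
        try:
            breaker.charge(cost_usd)
        except BudgetExceeded:
            break  # remaining tests are skipped
    return completed, breaker.spent_usd


# Hypothetical suite: four tests at $0.20 each against a $0.50 budget.
tests = [("t1", 0.20), ("t2", 0.20), ("t3", 0.20), ("t4", 0.20)]
completed, spent = run_suite(tests, limit_usd=0.50)
print(completed, f"${spent:.2f}")
```

In this sketch the cost of a test is only known after it runs, so the test that crosses the limit still completes and everything after it is skipped.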
Python API
- gate(), gate_async(), gate_or_revert(): programmatic regression checks
- OpenClaw integration with check_and_decide() for autonomous loops
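The release notes do not document the signatures of these functions, so the sketch below does not call them. It only illustrates the general "check, then revert on failure" pattern that a name like gate_or_revert() suggests; check_or_revert and the fake callbacks are hypothetical stand-ins.

```python
from typing import Callable


def check_or_revert(run_checks: Callable[[], bool],
                    revert: Callable[[], None]) -> bool:
    """Run the regression checks; undo the change if they fail.

    Returns True when the change is kept, False when it was reverted.
    """
    if run_checks():
        return True
    revert()
    return False


# Hypothetical agent state: a new prompt version was just deployed.
state = {"prompt": "v2"}


def fake_checks() -> bool:
    # Stand-in for a real regression run; here only "v1" passes.
    return state["prompt"] == "v1"


def fake_revert() -> None:
    state["prompt"] = "v1"


kept = check_or_revert(fake_checks, fake_revert)
print(kept, state)
```

An autonomous loop could call a helper like this after every self-modification so that a failing change never stays in place.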
GitHub Action
- Auto PR comments, artifact uploads, version pinning — all in one step
Documentation
- CrewAI, Pydantic AI, and OpenClaw integration guides
- README rewritten for conversion
- 26 community issues for contributors
Full changelog: https://github.com/hidai25/eval-view/blob/main/CHANGELOG.md
Install: pip install evalview==0.5.5 or curl -fsSL https://raw.githubusercontent.com/hidai25/eval-view/main/install.sh | bash