Below you will find pages that use the taxonomy term "hidai25"
December 29, 2025
EvalView - AI Agent Testing
Updated https://github.com/hidai25/eval-view to v0.1.7.
Action Summary

EvalView is a testing framework for AI agents: developers write test cases in YAML, and regressions in behavior, cost, and latency are detected automatically during CI/CD. It integrates with LangGraph, CrewAI, OpenAI Assistants, and Anthropic Claude, and automates tasks such as tracking token costs, validating tool calls, and catching hallucinations, replacing manual testing and helping ensure reliable agent performance before deployment.
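To make the YAML-based workflow concrete, here is a minimal sketch of what a test case might look like. The field names (name, input, expect, thresholds, and so on) are illustrative assumptions, not EvalView's documented schema:

```yaml
# Hypothetical test case sketch - field names are assumptions,
# not EvalView's documented schema.
name: refund-request
input: "I was double-charged for my order, please refund one charge."
expect:
  tools_called:            # validate the agent invoked the right tool
    - issue_refund
  output_contains: "refund"
thresholds:
  max_cost_usd: 0.05       # fail if a single run costs more than 5 cents
  max_latency_ms: 8000     # fail if the agent takes longer than 8 seconds
```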
December 19, 2025
EvalView - AI Agent Testing
Updated https://github.com/hidai25/eval-view to v0.1.5.
Release notes

What's New

Statistical Pass/Fail System
- Variance-aware testing - run tests multiple times to get statistically significant results
- Confidence levels - configure how confident you want to be in pass/fail decisions
- CLI integration - new --runs flag to run tests multiple times

```bash
# Run each test 5 times for statistical analysis
evalview run --runs 5
```

LangGraph Adapter Fix
- Fixed adapter compatibility issues for better LangGraph integration

Config-Free Runs
- Run evalview run without requiring a config file
- Automatically discovers test cases in the current directory

Templates
- Added test case templates for common evaluation patterns
- Quick-start templates for tool calling, RAG, and multi-turn scenarios

Node SDK License Fix
- Fixed license mismatch - now correctly uses Apache 2.0
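The statistical system addresses nondeterminism: a single run can pass or fail by luck, so repeating each test estimates a pass rate and judges it against a confidence level. Beyond the --runs flag shown above, a config-based equivalent might look like the following sketch; the keys are assumptions, not EvalView's documented config format:

```yaml
# Hypothetical config sketch - keys are assumptions, not the
# documented EvalView config format.
runs: 5            # execute each test case 5 times
confidence: 0.95   # require 95% confidence in the pass/fail verdict
```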
December 10, 2025
EvalView - AI Agent Testing
Updated https://github.com/hidai25/eval-view to v0.1.4.
Release notes

What's New

Ollama Support (Free Local Evaluation)
- Ollama as LLM-as-judge - run evaluations locally with zero API costs
- Auto-detection - automatically detects Ollama running on localhost:11434
- New adapter - test LangGraph agents powered by local Llama models

```bash
# Free local evaluation
evalview run --judge-provider ollama --judge-model llama3
```
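Assuming a standard local Ollama setup, an end-to-end local run might look like this; ollama pull and ollama serve are Ollama's own CLI commands, and llama3 stands in for whichever local model you use:

```bash
# Pull a local judge model and make sure the Ollama server is up.
# (Skip "ollama serve" if Ollama already runs as a background service;
# it listens on localhost:11434 by default, which is the port
# EvalView's auto-detection checks.)
ollama pull llama3
ollama serve &

# Run the evaluation against the local judge - zero API cost
evalview run --judge-provider ollama --judge-model llama3
```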
December 9, 2025
EvalView - AI Agent Testing
Updated https://github.com/hidai25/eval-view to v0.1.3.
Release notes

EvalView GitHub Action

Pytest-style testing framework for AI agents, now available as a GitHub Action.
Usage

```yaml
- uses: hidai25/eval-view@v0.1.3
  with:
    openai-api-key: ${{ secrets.OPENAI_API_KEY }}
```

Features
- 🧪 Test LangGraph, CrewAI, OpenAI, Anthropic, and custom agents
- ⚡ Parallel test execution (4 workers by default)
- 📊 Auto-generated HTML reports
- 💬 PR comments with test results
- 🤖 LLM-as-judge output evaluation
- 💰 Cost and latency threshold checks

Action Inputs

| Input          | Description                      | Default |
|----------------|----------------------------------|---------|
| openai-api-key | OpenAI API key for LLM-as-judge  | -       |
| config-path    | Path to config file              | .       |
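Putting the inputs together, a complete workflow might look like the sketch below; the trigger, job name, and checkout step are standard GitHub Actions boilerplate rather than anything EvalView-specific:

```yaml
# Minimal workflow sketch using the inputs from the table above.
name: agent-tests
on: [pull_request]

jobs:
  evalview:
    runs-on: ubuntu-latest
    steps:
      # Check out the repo so EvalView can find the test cases
      - uses: actions/checkout@v4
      - uses: hidai25/eval-view@v0.1.3
        with:
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}
          # config-path is optional; omit it to use the default
```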