Aegis AI Agent Security Gate
Version updated for https://github.com/Acacian/aegis to version v0.9.4.
- This action is used across all versions by ? repositories.
Action Type
This is a Composite action.
Go to the GitHub Marketplace to find the latest changes.
Action Summary
Agent-Aegis is a governance layer for AI agents that provides a unified runtime to standardize and enforce essential governance features such as prompt-injection blocking, PII masking, policy enforcement, trust delegation, and tamper-evident auditing across 12 popular AI frameworks. It simplifies the implementation of these governance primitives by auto-instrumenting existing agent frameworks without requiring code changes, enabling developers to ensure compliance and security while reducing complexity. The action addresses challenges like inconsistent governance implementations and enhances trust, transparency, and control in AI-driven systems.
What’s Changed
What’s New
`aegis check drift` CLI
Offline entropy-based drift detector for saved agent traces. Same signal that `auto_instrument()` exposes at runtime, now runnable on any JSONL trace from LangSmith, OTel, or custom loggers.
```bash aegis check drift –trace path/to/trace.jsonl aegis check drift –trace trace.jsonl –baseline gpt-4o-retail.json aegis check drift –trace trace.jsonl –json –strict ```
Privacy invariant: reads only the `tool_name` field — never args, CoT, or prompts — so enterprise users can score prod traces without exfiltrating PII. Stdlib-only (Counter + math.log, no numpy).
Research: 1,960 Tau-Bench Agent Trajectories
Measured tool distribution drift on sierra-research/tau-bench public trajectories. 39.8% of 812 scored trajectories show measurable collapse (Δ entropy ≥ 0.3 nats). Cross-model gap on the same retail task family: Sonnet 3.5 New 48.2% vs GPT-4o 28.1% (1.7× ratio, n=599). Distribution is bimodal — agents either stay open or fall off a cliff.
- Post: https://acacian.github.io/aegis/research/tau-bench-tool-distribution-drift/
- Reproduces in ~30 seconds on a laptop (stdlib only)
4 pillars of differentiation
Unlike LLM-as-judge approaches (Patronus, Braintrust) and fine-tuned classifiers (Galileo, Maxim), the `check drift` metric is simultaneously:
- Deterministic — no second LLM judges the first, two runs give bit-identical results
- Privacy-preserving — tool names only, no prompt content ever read
- Cross-model comparable — normalized Δ on the same scale across GPT-4o and Sonnet
- 30-second reproducible — 120 lines of stdlib Python, no numpy or GPU
Other
- 15 new tests in `tests/cli/test_check.py` including a hard privacy-invariant assertion (PII planted in fixture traces must never appear in any output)
- `ScholarlyArticle` JSON-LD schema for `/research/*` pages, sitemap tier 0.8, `llms.txt` canonical facts section for LLM crawlers
Full Changelog: https://github.com/Acacian/aegis/compare/v0.9.3...v0.9.4