AI Prompt Security Test

January 7, 2026

Version updated for https://github.com/arpitbhasin1/ai-security-ci to version v1.

This action is used across all versions by ? repositories.

Go to the GitHub Marketplace to find the latest changes.

Action Summary

This GitHub Action automates security testing for AI systems by simulating common prompt-based attacks, such as jailbreak attempts, prompt leakage, and harmful content generation. It identifies vulnerabilities by evaluating system responses using heuristics and optional LLM-based judging, providing detailed reports in JSON and Markdown formats. This tool streamlines the process of ensuring AI prompt security within CI pipelines, helping developers safeguard their AI models from malicious exploitation.

Release notes

AI Prompt Security CI – Phase-1 MVP

AI Prompt Security CI automatically runs defensive prompt-based security checks (jailbreaks, prompt leaks, harmful instructions) against your AI prompts and models directly inside CI/CD.

Key Features

GitHub Action + CLI
Prompt injection & jailbreak simulation
Heuristic detection with optional LLM-as-judge
JSON + Markdown security reports
DEMO_MODE for zero-API-cost testing (no API key required)
CI-friendly exit codes (fail on high-severity findings)
Safe, defensive-only attack library

What This Tool Does NOT Do (Phase-1)

No dashboard (reports only)
No SaaS backend
No multi-turn attacks
No inference-time guardrails
No enterprise or billing features

Intended Use

This tool is designed for developers and teams testing their own AI systems before deployment. It is defensive-only and meant to surface prompt vulnerabilities early in development.

See README for setup instructions and usage examples.