A new open-source toolkit, pentest-ai-agents, turns Anthropic’s Claude Code into a specialized offensive security research assistant built on 28 domain-specific subagents.

Released on GitHub by security researcher 0xSteph, the collection gives each of its 28 Claude Code subagents a focused area of expertise, and together they cover the full penetration testing lifecycle.

Coverage spans reconnaissance, web application testing, Active Directory attacks, cloud security, mobile pentesting, wireless attacks, social engineering, exploit chaining, detection engineering, forensics, malware analysis, and report generation.

Rather than relying on a single general-purpose AI model, the framework automatically routes each query to the most appropriate specialist agent.

Pentest-AI-Agents Installation

Setup requires no servers, no external dependencies, and no complex configuration. A single command handles everything:

curl -fsSL https://raw.githubusercontent.com/0xSteph/pentest-ai-agents/main/install.sh | bash

The script clones the repository, copies all 28 agent files to ~/.claude/agents/, and exits cleanly. It is fully idempotent, meaning re-running it safely updates existing agents.
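
One way to confirm the install did what is described is to count the agent files in the target directory. The sketch below assumes each agent is stored as a single Markdown file (`*.md`), which is how Claude Code subagents are normally defined. The `count_agents` helper is hypothetical and not part of the toolkit.

```shell
# Hypothetical helper (not part of pentest-ai-agents): count agent files.
# Assumes each agent is one *.md file directly inside the directory.
count_agents() (
  dir="${1:-$HOME/.claude/agents}"
  if [ -d "$dir" ]; then
    find "$dir" -maxdepth 1 -type f -name '*.md' | wc -l | tr -d ' '
  else
    echo 0
  fi
)
```

Called with no argument, `count_agents` checks `~/.claude/agents`. If the description above holds, a full install should report 28, provided no other agent files were already in that directory.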

Additional install options support project-scoped deployments (--project) and a cost-optimized lite mode (--global --lite) that runs advisory agents on Claude Haiku for reduced token consumption.
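
Piping a remote script straight into `bash` leaves no chance to read it first, and it is not obvious how flags get passed along. A more cautious pattern, sketched below, downloads the script to a temporary file and then runs it with whatever options are supplied. The wrapper name `install_pentest_agents` is hypothetical. The URL and the `--project`, `--global`, and `--lite` flags come from the text above, and that the script accepts them as ordinary command-line arguments is an assumption.

```shell
# Hypothetical wrapper: fetch install.sh to a temp file, then run it with
# the given flags. Nothing is downloaded until the function is called.
install_pentest_agents() (
  url="https://raw.githubusercontent.com/0xSteph/pentest-ai-agents/main/install.sh"
  tmp="$(mktemp)" || exit 1
  trap 'rm -f "$tmp"' EXIT          # clean up the temp file on any exit
  curl -fsSL "$url" -o "$tmp" || exit 1
  # Assumption: flags are forwarded as plain arguments to the script.
  bash "$tmp" "$@"
)
```

For example, `install_pentest_agents --project` would attempt a project-scoped install and `install_pentest_agents --global --lite` the lite mode. Splitting the function into separate download and run steps would leave room to review the script in between.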

The toolkit introduces a two-tier execution model for safety and flexibility. Tier 1 agents operate in advisory mode: users paste tool output and receive prioritized analysis, methodology guidance, and recommended next commands.

Tier 2 agents go further, composing and executing commands directly against a declared, authorized scope, with Claude Code displaying each command for explicit approval before execution.
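
Claude Code handles the approval prompt itself, but the idea of gating execution on a declared scope can be illustrated in a few lines of shell. The sketch below is not the toolkit's implementation: the `in_scope` and `run_if_approved` helpers and the one-target-per-line scope file are assumptions made for illustration.

```shell
# Hypothetical scope gate, for illustration only.
# Scope file: one authorized hostname or IP per line (assumed format).
in_scope() {
  # -F fixed string, -x whole-line match, -q quiet
  grep -Fxq -- "$1" "$2"
}

# Show the command, then run it only if the target is in scope
# and the operator answers "y".
run_if_approved() (
  target="$1"; scope_file="$2"; shift 2
  if ! in_scope "$target" "$scope_file"; then
    echo "refused: $target is not in declared scope" >&2
    exit 2
  fi
  printf 'about to run: %s\napprove? [y/N] ' "$*" >&2
  read -r answer
  [ "$answer" = "y" ] || { echo "skipped" >&2; exit 1; }
  "$@"
)
```

A call such as `run_if_approved 10.0.0.5 scope.txt nmap -sV 10.0.0.5` would print the command, wait for confirmation, and refuse outright if `10.0.0.5` is not listed in `scope.txt`.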

Tier 2 agents include the Recon Advisor (nmap, whois, whatweb), Web Hunter (ffuf, sqlmap, dalfox), AD Attacker (BloodHound, Impacket, CrackMapExec, Certipy), Exploit Chainer, PoC Validator, and Business Logic Hunter. Every offensive action is mapped to MITRE ATT&CK identifiers and paired with defensive context.
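
To make the ATT&CK mapping concrete, here is a small lookup from a few of the tools named above to technique IDs. The pairings are this example's own reading of the ATT&CK matrix, not the toolkit's actual table, and the `attack_id` function is hypothetical.

```shell
# Hypothetical lookup: tool name -> MITRE ATT&CK technique ID.
# The pairings are illustrative choices, not taken from pentest-ai-agents.
attack_id() {
  case "$1" in
    nmap)   echo "T1046" ;;      # Network Service Discovery
    whois)  echo "T1596.002" ;;  # Search Open Technical Databases: WHOIS
    ffuf)   echo "T1595.003" ;;  # Active Scanning: Wordlist Scanning
    sqlmap) echo "T1190" ;;      # Exploit Public-Facing Application
    *)      echo "unmapped"; return 1 ;;
  esac
}
```

In practice a single tool can map to several techniques depending on how it is used, so a real mapping would likely be keyed on the action taken rather than the binary name.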

Persistent Findings and MCP Support

A built-in SQLite-backed findings database (findings.sh) persists engagement data across Claude Code sessions, enabling multi-day operations with seamless handoffs.

Tier 2 agents write to this database automatically when findings.sh is in the system PATH. The Report Generator agent produces professional pentest reports complete with executive summaries, CVSS scoring, and remediation roadmaps.
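
Because automatic persistence depends on `findings.sh` being discoverable, a quick check before an engagement can be worthwhile. The sketch uses only the standard `command -v` lookup. The function name `findings_available` is hypothetical, and nothing here invokes `findings.sh` itself, since its command-line interface is not described in this article.

```shell
# Hypothetical pre-flight check: is findings.sh reachable via PATH?
findings_available() {
  if findings_path="$(command -v findings.sh)"; then
    echo "findings.sh found at $findings_path"
  else
    echo "findings.sh not on PATH; findings may not be saved automatically" >&2
    return 1
  fi
}
```

Running `findings_available` at the start of a session returns a non-zero status when the script cannot be found, which makes it easy to chain into other setup steps.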

For air-gapped or privacy-sensitive environments, agents can be converted to OpenCode custom commands compatible with Ollama, LM Studio, or any local model via the included opencode-setup.sh script.

A companion MCP server (pentest-ai) extends the ecosystem with 150+ tool wrappers, autonomous exploit chaining, and CI/CD pipeline integration for Claude Desktop, Cursor, and VS Code Copilot.