Devlery

Decepticon 1.1.3 Tests the Guardrails for Autonomous Red-Team Agents

Decepticon 1.1.3 shows that red-team agents are now competing on rules of engagement, sandboxing, graphs, release integrity, and auditability.

AI summary
  • What happened: PurpleAILAB released Decepticon v1.1.3, an open-source autonomous red-team agent, on May 27, 2026.
    • The release adds Windows installation, Podman and nerdctl detection, a Ghidra backend, 76 skill playbooks, and runtime fixes across sandbox, OAuth, streaming, and dashboard paths.
  • Why it matters: The red-team-agent race is moving from model answers toward rules of engagement, sandbox isolation, graph memory, and release integrity.
  • Watch: Decepticon handles offensive tooling and C2, so authorized scope, logs, abort conditions, and credential boundaries are product requirements, not afterthoughts.
    • The project's own disclaimer says it must not be used against systems or networks without explicit written authorization.

PurpleAILAB released Decepticon v1.1.3 on May 27, 2026. GitHub's release API listed the latest release at 2026-05-27T08:43:51Z. At research time, the repository showed about 4.1k stars and 809 forks, and its README described Decepticon as an "Autonomous Red Team Agent." For AI builders, the news is less about another "AI can hack" demo and more about what an agent needs before it is allowed to run dangerous tools.

Decepticon's README says it is not another AI hacker that runs nmap and writes a report. The distinction it tries to make is operational. Before sending packets, the agent is designed to create an engagement package: rules of engagement, concept of operations, deconfliction plan, OPPLAN, and MITRE ATT&CK mapping. That framing treats the agent less like a scanner and more like a worker operating under written constraints.

The v1.1.3 changelog looks like a patch release at first, but the contents widen both the product surface and the risk surface. The release adds a Windows PowerShell installation path. The Go launcher now runs operating-system, architecture, distribution, and Docker-readiness checks during onboarding. Container support also expands beyond Docker with automatic detection for Podman and nerdctl. On the reverse-engineering side, the release adds a Ghidra 12.1 headless backend and an optional MCP bridge sidecar.

The larger change is in skills and runtime behavior. The changelog says 76 new skill playbooks were added across Active Directory, cloud, smart contracts, web exploitation, LLM red teaming, mobile, reversing, supply chain, modern APIs, ICS-OT, and C2. The same release fixes the Soundwave interview loop, Codex and ChatGPT OAuth handling, streaming events, sandbox zombie processes, LiteLLM truncated tool use, CVE lookup timeouts, web dashboard behavior, and CLI TUI issues. As the agent gains more ways to act, the project is also spending release energy on the machinery that keeps long runs from breaking down.

[Figure: Official Decepticon infrastructure diagram]

The first architectural detail to inspect is network separation. Decepticon's architecture documentation splits management infrastructure and operational infrastructure into two Docker networks: decepticon-net and sandbox-net. LiteLLM, PostgreSQL, LangGraph, and the web dashboard live on the management side. The Kali sandbox, C2 server, and victim targets live on the operational side. The documentation says the sandbox is not routed to LiteLLM, PostgreSQL, the LangGraph API, or the web dashboard.

That separation is close to mandatory for a red-team agent. If the sandbox that runs offensive tooling can see the LLM gateway or provider credentials directly, hostile target input or a compromised process may be able to pivot into the management plane. Decepticon says LangGraph controls the sandbox through the Docker socket rather than TCP, while Neo4j is the shared component spanning both networks. Neo4j is framed as a knowledge store for the attack graph, not as a service carrying model credentials into the operational environment.
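The reachability rule described above can be checked mechanically. The following is a minimal sketch, not Decepticon's own configuration: the two network names come from the architecture documentation, while the service identifiers, the `NETWORKS` table, and both helper functions are hypothetical names chosen for illustration.

```python
# Hypothetical model of the two-network layout. Only the network names
# (decepticon-net, sandbox-net) come from the documentation; the service
# identifiers below are illustrative assumptions.
NETWORKS = {
    "decepticon-net": {"litellm", "postgres", "langgraph", "web", "neo4j"},
    "sandbox-net": {"sandbox", "c2", "victim", "neo4j"},
}

# Services that must never share a network with the sandbox.
MANAGEMENT_ONLY = {"litellm", "postgres", "langgraph", "web"}


def shares_network(a: str, b: str) -> bool:
    """Return True if services a and b are attached to at least one common network."""
    return any(a in members and b in members for members in NETWORKS.values())


def sandbox_exposure() -> list[str]:
    """List management services the sandbox could reach directly (should be empty)."""
    return sorted(s for s in MANAGEMENT_ONLY if shares_network("sandbox", s))


if __name__ == "__main__":
    print("management services reachable from sandbox:", sandbox_exposure())
    print("neo4j bridges both sides:",
          shares_network("sandbox", "neo4j") and shares_network("langgraph", "neo4j"))
```

A check like this belongs in CI for any agent runtime: if someone later attaches the sandbox to the management network, the exposure list stops being empty and the build can fail before the change ships.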

The agent design also assumes long-running work. The docs describe 16 specialist agents organized by kill-chain phase. Decepticon acts as the orchestrator, reads the OPPLAN, and delegates objectives to specialists such as recon, exploit, postexploit, analyst, reverser, contract auditor, cloud hunter, and Active Directory operator. The vulnerability-research path has a five-stage pipeline: scanner, detector, verifier, patcher, and exploiter. Soundwave acts as the engagement planner, producing RoE, threat profile, CONOPS, deconfliction, contact, data-handling, abort, and cleanup documents.

The useful design choice is the fresh-context model. Each specialist agent starts with a clean context window for each objective, while findings are persisted to disk and the knowledge graph instead of being carried only in chat memory. Long agent sessions that keep stuffing more state into one conversation accumulate token cost, stale reasoning, and context contamination. Decepticon's split between subagents and graph persistence resembles the pattern now appearing in coding agents: start new task contexts, preserve durable artifacts as diffs and logs, and avoid pretending one endless chat is a reliable execution substrate.
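The fresh-context pattern can be illustrated in a few lines. This is a sketch under stated assumptions, not Decepticon's implementation: `Context`, `run_objective`, `load_findings`, and `stub_specialist` are hypothetical names, the JSON-file layout is invented for the example, and the stub stands in for a real model call.

```python
import json
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Context:
    """Per-objective working memory. A new instance is created for every objective."""
    objective: str
    messages: list = field(default_factory=list)


def run_objective(objective: str, findings_dir: str, specialist) -> Path:
    """Run one objective in a clean context and persist the result to disk."""
    ctx = Context(objective=objective)  # nothing carried over from earlier runs
    finding = specialist(ctx)
    directory = Path(findings_dir)
    index = len(list(directory.glob("*.json")))  # next sequential file number
    path = directory / f"{index:03d}.json"
    path.write_text(json.dumps({"objective": objective, "finding": finding}))
    return path


def load_findings(findings_dir: str) -> list:
    """Read durable findings back in the order they were written."""
    return [json.loads(p.read_text()) for p in sorted(Path(findings_dir).glob("*.json"))]


def stub_specialist(ctx: Context) -> dict:
    """Hypothetical stand-in for a specialist agent; records its own context size."""
    ctx.messages.append(f"working on: {ctx.objective}")
    return {"summary": f"done: {ctx.objective}", "context_size": len(ctx.messages)}


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    for objective in ("recon", "exploit"):
        run_objective(objective, workdir, stub_specialist)
    for record in load_findings(workdir):
        print(record)
```

The point of the sketch is that the second objective sees a context of the same size as the first. Whatever it needs from earlier work has to come from the persisted findings, which is also what makes those findings reviewable afterward.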

The README's benchmark claim is aggressive. Decepticon says it scored 102 out of 104 on XBOW validation benchmarks: 45 out of 45 on Easy, 50 out of 51 on Medium, and 7 out of 8 on Hard, for a 98.08% pass rate. That number shows how red-team agents are being packaged like evaluated products rather than one-off demos. It should still be read narrowly. A benchmark does not reproduce a real enterprise network, legal scope, operational logs, recovery duties, or collaboration with a defensive team. It is a performance claim, not a complete deployment argument.
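The headline figure is easy to reproduce from the per-tier counts quoted above, and doing so shows how unevenly the tiers contribute. The variable names here are illustrative; the numbers are the ones the README reports.

```python
# Per-tier (passed, total) counts as reported in the README.
results = {"Easy": (45, 45), "Medium": (50, 51), "Hard": (7, 8)}

passed = sum(p for p, _ in results.values())
total = sum(t for _, t in results.values())
overall_rate = round(100 * passed / total, 2)

# Pass rate per tier, rounded to two decimals.
tier_rates = {name: round(100 * p / t, 2) for name, (p, t) in results.items()}

if __name__ == "__main__":
    print(f"overall: {passed}/{total} = {overall_rate}%")
    for name, rate in tier_rates.items():
        print(f"{name}: {rate}%")
```

The Hard tier has only eight cases, so a single failure there is worth 12.5 percentage points of that tier's score. That small sample is one more reason to read the aggregate narrowly.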

[Figure: Official Decepticon benchmark donut chart]

Windows support is not a minor detail in this release. Red-team tools have often assumed Linux or Kali-centric workflows. Decepticon's README lists macOS, Linux, Windows, and WSL2 support, and v1.1.3 includes native Windows installer work plus windows_amd64 and windows_arm64 artifacts. That points to platform teams, consulting groups, and internal red teams that need to run the same agent runtime across mixed local environments.

Podman and nerdctl support aims at the same enterprise reality. Some organizations avoid Docker Desktop because of licensing, root-daemon concerns, or security policy. The changelog says the launcher probes for a reachable runtime in a fixed order (Docker, then Podman, then nerdctl) and exposes a DECEPTICON_CONTAINER_RUNTIME override. For an AI agent to become a real security-team tool, it has to pass the organization's container-runtime policy before model quality becomes relevant.
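The detection order and override can be sketched as follows. This is an illustration in Python, not the project's Go launcher: the environment variable name and the Docker, Podman, nerdctl order come from the changelog, while `pick_runtime`, `DETECTION_ORDER`, and the PATH-only check are assumptions. The real launcher is described as checking that a runtime is reachable, which likely means more than finding a binary.

```python
import os
import shutil

# Order taken from the changelog; the tuple name is illustrative.
DETECTION_ORDER = ("docker", "podman", "nerdctl")


def pick_runtime(env=None, which=shutil.which):
    """Return the container runtime to use, or None if none is found.

    Assumption: an explicit DECEPTICON_CONTAINER_RUNTIME value is trusted as-is,
    and "detected" means only that the binary is on PATH.
    """
    env = os.environ if env is None else env
    override = env.get("DECEPTICON_CONTAINER_RUNTIME")
    if override:
        return override
    for name in DETECTION_ORDER:
        if which(name):  # first match in the fixed order wins
            return name
    return None


if __name__ == "__main__":
    print("selected runtime:", pick_runtime())
```

Passing `env` and `which` as parameters keeps the selection logic testable without installing any runtime, which matters when the same check has to behave identically on Windows, macOS, and Linux hosts.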

The Ghidra backend and reversing sidecar show the scope moving beyond web scanning. The changelog mentions ghidra_analyze, ghidra_decompile, ghidra_xrefs, and ghidra_status. Rather than making the default sandbox image heavier, the project keeps reversing behind INSTALL_REVERSING=false and an opt-in reversing compose profile. That choice matters because binary-analysis tooling is expensive and sensitive. Keeping it profile-gated is better than placing every offensive capability into the default runtime.

Decepticon's model strategy is also not a single-model demo. The README lists Anthropic, OpenAI, Google Gemini, MiniMax, DeepSeek, xAI, Mistral, OpenRouter, Nvidia NIM, and local Ollama as tier-mapped providers. Subscription OAuth paths include Claude Max, Pro, and Team; ChatGPT Pro, Plus, and Team; Gemini Advanced; Copilot Pro; SuperGrok; and Perplexity Pro. Profiles such as eco, max, and test let the operator spend higher-tier model calls on the orchestrator or exploiter while using cheaper tiers for reconnaissance.

That model routing is an economic design question, not only an accuracy question. Vulnerability analysis, exploit-chain construction, and reporting can all burn large token budgets. Pin every specialist to a frontier model and cost or rate limits become the first failure mode. Use weak models everywhere and false positives, bad hypotheses, and failed verification increase. Decepticon's tiers and fallbacks make the operational question explicit: which objective deserves which model budget?
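A routing table makes that budget question concrete. The sketch below is hypothetical throughout: only the profile names eco, max, and test appear in the README, while the tier labels, the role-to-tier assignments, and the fallback chains are assumptions invented to show the shape of the decision.

```python
# Hypothetical profile tables. Only the profile names come from the README;
# the tier labels and role assignments are illustrative assumptions.
PROFILES = {
    "eco": {"default": "low", "orchestrator": "high", "exploiter": "high"},
    "max": {"default": "high"},
    "test": {"default": "low"},
}

# Hypothetical fallback chains: try the preferred tier, then cheaper ones.
FALLBACKS = {
    "high": ["high", "mid", "low"],
    "mid": ["mid", "low"],
    "low": ["low"],
}


def tier_for(profile: str, role: str) -> str:
    """Preferred tier for a role under a profile, falling back to the profile default."""
    table = PROFILES[profile]
    return table.get(role, table["default"])


def resolve(profile: str, role: str, available: set) -> str:
    """Pick the first tier in the fallback chain that is currently available."""
    for tier in FALLBACKS[tier_for(profile, role)]:
        if tier in available:
            return tier
    raise RuntimeError(f"no model tier available for role {role!r}")


if __name__ == "__main__":
    for role in ("orchestrator", "recon", "exploiter"):
        print(role, "->", resolve("eco", role, {"mid", "low"}))
```

When the high tier is rate-limited, the orchestrator degrades to the next tier instead of failing outright. Whether that degradation is acceptable for an exploiter or verifier is exactly the kind of policy decision the operator should make explicitly.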

The release process is worth inspecting because this is offensive-security software wrapped around AI execution. RELEASE.md describes PyPI Trusted Publishing, GoReleaser artifacts, GHCR multi-architecture images, Cosign keyless signing, CycloneDX SBOMs, and conditions for promoting :latest. The latest release API also showed several CycloneDX JSON assets and operating-system-specific launcher binaries. As agent tools gain offensive capability, users need to know not only what the agent can attack, but also where its runtime, images, and binaries came from.
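An SBOM is only useful if someone reads it. As a minimal sketch of what consuming a CycloneDX JSON asset might look like, the function below lists component names and versions. `list_components` and the sample document are hypothetical; the component shown is made up and is not taken from any Decepticon release. The `bomFormat` and `components` fields are part of the CycloneDX JSON format.

```python
import json


def list_components(sbom_text: str) -> list:
    """Return sorted (name, version) pairs from a CycloneDX JSON document."""
    bom = json.loads(sbom_text)
    if bom.get("bomFormat") != "CycloneDX":
        raise ValueError("not a CycloneDX document")
    return sorted(
        (c.get("name", ""), c.get("version", ""))
        for c in bom.get("components", [])
    )


# Made-up sample for illustration only.
SAMPLE_SBOM = json.dumps({
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "components": [
        {"type": "library", "name": "example-lib", "version": "1.2.3"},
        {"type": "library", "name": "another-lib", "version": "0.9.0"},
    ],
})

if __name__ == "__main__":
    for name, version in list_components(SAMPLE_SBOM):
        print(f"{name}=={version}")
```

Listing components does not verify anything by itself. It complements signature verification of the images and binaries, which answers where an artifact came from, by answering what is inside it.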

Community attention is still modest compared with major model launches. GeekNews showed "Decepticon - autonomous hacking agent for red teams" on its front page on May 28, 2026 KST, roughly 10 hours after posting, with 15 points. The summary emphasized that it aims beyond the common "run nmap, output report" demo and into professional red-team operations. Hacker News' front page at the same time was dominated by topics such as YouTube AI labeling, product-market fit discussions around Anthropic and OpenAI, and fatigue around AI conversations; Decepticon was not observed there.

For security teams, the practical questions are direct. At what step does a human approve the RoE and OPPLAN that an agent writes? Can findings and command logs from the sandbox be preserved as auditable evidence? When C2 or exploitation functionality is enabled in a customer environment, how are legal authorization, target scope, abort conditions, and cleanup enforced? If the model provider is an external API, where do target metadata, prompts, tool results, and findings travel?

Developers outside security should still pay attention. A coding agent handling test runners and package managers and a red-team agent handling Kali tooling and C2 differ in the type of authority they receive, but the product questions rhyme. Where does execution happen? Where does durable memory live? Who can interrupt the agent? Which logs explain a mistaken action after the fact? Decepticon makes those questions more visible because the security domain raises the consequences.

Misuse risk cannot be treated as a footnote. The README disclaimer says the tool should not be used against systems or networks without explicit written authorization. That sentence is legal language, but it is also a product requirement. Autonomous red-team tools need approval workflows, target registries, network egress policy, credential isolation, report redaction, and emergency stop controls before demo-level autonomy becomes operational autonomy. As offensive capability improves, the quality of control systems becomes part of the core feature set.
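A target registry with an authorization and abort gate is the simplest of those controls to sketch. The code below is an illustration of the idea, not a feature of Decepticon: `Engagement` and `may_run` are hypothetical names, and the addresses are from reserved documentation ranges.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Engagement:
    """Hypothetical record of what a written authorization permits."""
    authorized_by: str        # name on the signed authorization; empty means none
    scope: tuple              # in-scope networks as CIDR strings
    aborted: bool = False     # set once an abort condition has fired


def may_run(engagement: Engagement, target_ip: str) -> bool:
    """Allow an action only if it is authorized, not aborted, and in scope."""
    if engagement.aborted or not engagement.authorized_by:
        return False
    addr = ipaddress.ip_address(target_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in engagement.scope)


if __name__ == "__main__":
    engagement = Engagement(authorized_by="Example CISO", scope=("192.0.2.0/24",))
    for ip in ("192.0.2.10", "198.51.100.7"):
        print(ip, "allowed" if may_run(engagement, ip) else "denied")
```

The default is denial: a missing authorizer, a fired abort condition, or an out-of-scope address each block the action. A gate like this also needs to log every decision, since the denials are part of the audit trail.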

Decepticon 1.1.3 does not mean every organization should adopt an autonomous red-team agent now. Many teams are not ready for this class of tool. The release does, however, show where AI security agents are heading: not just a CLI wrapped around model calls, but an operating package containing engagement documents, specialist agents, sandboxes, graph state, runtime profiles, release signing, and SBOMs. The competitive question may not be which agent can run the riskiest command. It may be which agent can prove when, where, and under whose authorization that command did not run.