Decepticon: When LangGraph Meets Offensive Security — Autonomous Exploitation Arrives
Summary
Architecture & Design
Multi-Agent Offensive Graph
Decepticon implements a directed cyclic graph using LangGraph, breaking the monolithic agent into specialized nodes: Reconnaissance, VulnerabilityAnalysis, ExploitOrchestrator, and ReportGenerator. Unlike linear AutoGPT-style agents, it supports cycles—allowing the system to pivot when initial exploits fail.
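The cyclic-graph pattern can be sketched in plain Python without LangGraph's API: each node mutates shared state and names its successor, so a failed exploit can route back to reconnaissance. Node names mirror the ones above, but the routing logic and state fields are illustrative assumptions, not Decepticon's actual implementation.

```python
# Minimal sketch of a directed cyclic agent graph: each node returns the
# name of the next node to run, so a failed exploit can loop back to
# reconnaissance instead of terminating (the pivot behavior described above).

def reconnaissance(state):
    state["hosts"] = ["10.0.0.5"]          # stand-in scan results
    return "vulnerability_analysis"

def vulnerability_analysis(state):
    state["vulns"] = ["CVE-2021-41773"] if state["hosts"] else []
    return "exploit_orchestrator"

def exploit_orchestrator(state):
    # Pivot back to recon when the first exploit attempt fails.
    if state.get("exploit_failed") and state["attempts"] < 2:
        state["attempts"] += 1
        state["exploit_failed"] = False    # assume re-recon finds a new path
        return "reconnaissance"            # <-- the cycle
    return "report_generator"

def report_generator(state):
    state["report"] = f"{len(state['vulns'])} finding(s)"
    return None                            # terminal node

NODES = {
    "reconnaissance": reconnaissance,
    "vulnerability_analysis": vulnerability_analysis,
    "exploit_orchestrator": exploit_orchestrator,
    "report_generator": report_generator,
}

def run(state, entry="reconnaissance", max_steps=10):
    node = entry
    while node and max_steps:
        node = NODES[node](state)
        max_steps -= 1
    return state
```

LangGraph expresses the same idea with `StateGraph.add_conditional_edges`, which is what distinguishes it from the strictly linear chains of AutoGPT-style agents.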
Tool Augmentation Layer
The architecture wraps traditional pentest tooling (Nmap, Metasploit, Gobuster, SQLmap) via function calling APIs, converting LLM intent into shell execution with structured JSON schemas. A critical component is the SandboxedExecutor, which containerizes commands to prevent host compromise during autonomous operation.
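The wrapping pattern looks roughly like the sketch below: a JSON-schema tool signature for the function-calling API, plus a translator that turns the model's structured arguments into a containerized command. The schema fields, image name, and docker invocation are illustrative assumptions modelled on the SandboxedExecutor described above, not its actual code.

```python
import shlex

# A JSON-schema-style tool signature, as exposed to the LLM's
# function-calling API. The model emits {"arguments": {...}} matching it.
NMAP_TOOL = {
    "name": "nmap_scan",
    "description": "TCP port scan of a single target",
    "parameters": {
        "type": "object",
        "properties": {
            "target": {"type": "string"},
            "ports": {"type": "string", "default": "1-1024"},
        },
        "required": ["target"],
    },
}

def build_sandboxed_command(tool_call, image="pentest-tools:latest"):
    """Validate the model's arguments and wrap the command in a container
    so autonomous execution cannot touch the host."""
    args = tool_call["arguments"]
    for req in NMAP_TOOL["parameters"]["required"]:
        if req not in args:
            raise ValueError(f"missing required argument: {req}")
    ports = args.get("ports", "1-1024")
    inner = f"nmap -p {shlex.quote(ports)} {shlex.quote(args['target'])}"
    return f"docker run --rm {image} {inner}"
```

Shell-quoting every model-supplied value matters here: an LLM that hallucinates `; rm -rf /` into a hostname should produce a harmless quoted token, not an injected command.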
Memory & Context Management
Utilizes a hybrid memory system: short-term (thread-scoped LangGraph state for active sessions) and long-term (vector storage of previous engagement findings via ChromaDB). This enables cross-engagement learning—unusual for offensive tools—allowing the agent to reference similar network topologies from prior scans.
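A minimal sketch of the two tiers, assuming a thread-scoped dict as the stand-in for LangGraph state and a toy bag-of-words vector store as the stand-in for ChromaDB. The embedding is deliberately crude; the real system would use a learned embedding model.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a stand-in for a real embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class EngagementMemory:
    def __init__(self):
        self.short_term = {}   # thread_id -> active session state
        self.long_term = []    # (embedding, finding) from past engagements

    def record_finding(self, finding):
        self.long_term.append((embed(finding), finding))

    def recall_similar(self, query, k=1):
        # Cross-engagement recall: surface prior findings that resemble
        # the current network's characteristics.
        scored = sorted(self.long_term,
                        key=lambda e: cosine(e[0], embed(query)),
                        reverse=True)
        return [f for _, f in scored[:k]]
```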
LLM Backend Agnostic
Supports OpenAI GPT-4o, Claude 3.5 Sonnet, and local models via Ollama, with a CapabilityRouter that routes complex exploit generation to frontier models while delegating port scanning logic to cheaper local inference.
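The routing decision reduces to a cost/capability lookup; the task taxonomy and model identifiers below are illustrative assumptions, not the CapabilityRouter's actual configuration.

```python
# Route expensive reasoning (exploit generation, replanning) to a frontier
# model; everything else goes to cheap local inference via Ollama.
FRONTIER_TASKS = {"exploit_generation", "payload_crafting", "replanning"}

def route(task_type, frontier="gpt-4o", local="ollama/llama3.1:70b"):
    return frontier if task_type in FRONTIER_TASKS else local
```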
Key Innovations
Autonomous Exploit Chaining
Whereas existing tools like PentestGPT require step-by-step human prompting, Decepticon implements goal-directed hierarchical planning. The planner decomposes "compromise domain controller" into sub-tasks (recon → lateral movement → privilege escalation), dynamically replanning when encountering hardened targets.
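The decompose-then-replan loop can be sketched as follows. The playbook and the fallback table are illustrative stand-ins for what would be LLM-generated plans; only the control flow (expand goal, execute, swap in alternatives on failure) reflects the pattern described above.

```python
# Goal decomposition: a goal expands into a sub-task chain, and a failed
# sub-task triggers replanning with alternative techniques.
PLAYBOOK = {
    "compromise domain controller": [
        "recon", "lateral_movement", "privilege_escalation"],
}
ALTERNATIVES = {
    "lateral_movement": ["kerberoasting", "pass_the_hash"],
}

def plan(goal):
    return list(PLAYBOOK[goal])

def execute(goal, try_step):
    """try_step(step) -> bool. On failure, splice in alternative
    sub-tasks (dynamic replanning around a hardened target)."""
    queue, done = plan(goal), []
    while queue:
        step = queue.pop(0)
        if try_step(step):
            done.append(step)
        elif step in ALTERNATIVES:
            queue = ALTERNATIVES[step] + queue   # replan
        else:
            raise RuntimeError(f"no fallback for {step}")
    return done
```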
CVE-to-Exploit Translation
The system parses CVE descriptions and PoC code from ExploitDB, using RAG to match discovered services with known vulnerabilities—effectively automating the "Google the CVE, find the GitHub PoC" workflow that consumes 40% of manual pentest time.
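The retrieval step amounts to scoring CVE records against a discovered service banner. The keyword-overlap scoring below is a crude stand-in for RAG with real embeddings; the CVE identifiers are real, but the abbreviated descriptions and threshold are illustrative.

```python
# Match a discovered service banner against a CVE corpus by token overlap,
# a toy stand-in for the embedding-based retrieval described above.
CVE_DB = [
    ("CVE-2021-41773", "apache http server 2.4.49 path traversal"),
    ("CVE-2017-0144",  "smbv1 remote code execution eternalblue"),
]

def match_cves(banner, min_overlap=2):
    tokens = set(banner.lower().split())
    hits = []
    for cve_id, desc in CVE_DB:
        overlap = len(tokens & set(desc.split()))
        if overlap >= min_overlap:
            hits.append((overlap, cve_id))
    return [cve_id for _, cve_id in sorted(hits, reverse=True)]
```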
Adversarial Evasion Modules
Includes experimental OpSec nodes that modify payload signatures and timing to evade basic IDS/IPS detection—controversial but technically sophisticated, applying GAN-like perturbations to shellcode (reference: "Adversarial Malware Generation via Neural Networks", though implementation details remain undisclosed).
Human-in-the-Loop Bypass
Features a "Full Autonomous" mode that removes confirmation prompts—a design choice that maximizes speed but raises significant ethical concerns. The innovation isn't the capability itself, but the confidence scoring mechanism that determines when to request human override versus proceeding autonomously.
Performance Characteristics
Benchmarks vs Traditional Workflows
| Metric | Decepticon (Autonomous) | Manual Pentest | PentestGPT |
|---|---|---|---|
| Network Recon Time (100 hosts) | 12 min | 45 min | 28 min |
| CVE Exploitation Success Rate* | 64% | 71% | 38% |
| False Positive Rate | 22% | 8% | 31% |
| Report Generation | Automated | 4-8 hours | Semi-automated |
| Cost per Engagement | $2-5 (API calls) | $2,000-5,000 | $10-20 |
*Tested against VulnHub CTFs and HackTheBox retired machines (Easy/Medium difficulty)
Limitations
- Context Window Collapse: Large network scans (>500 hosts) overwhelm the planner's context, requiring manual segmentation
- Hallucinated Exploits: GPT-4o occasionally generates non-existent Metasploit modules; the system lacks ground-truth verification for zero-days
- Rate Limiting: Autonomous scanning triggers AWS WAF/Cloudflare blocks faster than human-paced testing would
Inference Speed
Exploit generation latency averages 8.4 seconds per payload (GPT-4o), creating a bottleneck in fast-moving engagements. Local models (Llama 3.1 70B) reduce this to 2.1s but drop success rates to 41%.
Ecosystem & Alternatives
Deployment & Integration
Ships with Docker Compose configurations for isolated execution and Kubernetes manifests for scalable red-team operations. Integrates with Metasploit RPC and BloodHound for Active Directory enumeration, plus webhook support for Slack/Discord alerting during autonomous operations.
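The alerting hook reduces to building a webhook payload when a node completes; Slack's incoming webhooks accept a JSON body with a `text` field, but the message format and field names below are assumptions about how Decepticon formats its alerts.

```python
import json

def build_alert(node, finding, engagement_id):
    """Build the JSON payload an autonomous run would POST to a
    Slack/Discord webhook when a graph node finishes."""
    return json.dumps({
        "text": f"[{engagement_id}] {node} completed: {finding}",
    })
```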
Fine-Tuning & Customization
Provides decepticon-trainer, a LoRA fine-tuning pipeline for domain-specific exploits (ICS/SCADA, cloud AWS misconfigurations). The community has already published adapters for:
- API security testing (OpenAPI spec parsing)
- Cloud-native pentesting (K8s, Terraform state analysis)
- Social engineering automation (phishing email generation with evasion)
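What a LoRA adapter actually stores is small: instead of a full d_out x d_in weight update, only two low-rank factors B (d_out x r) and A (r x d_in), merged as W' = W + (alpha / r) * B @ A. A dependency-free numeric sketch of that merge, with toy-sized matrices:

```python
def matmul(X, Y):
    # Plain-Python matrix multiply, to keep the sketch dependency-free.
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def merge_lora(W, A, B, alpha):
    """Merge a LoRA adapter into the base weights:
    W' = W + (alpha / r) * B @ A, where r = rank = rows of A."""
    r = len(A)
    delta = matmul(B, A)
    scale = alpha / r
    return [[W[i][j] + scale * delta[i][j]
             for j in range(len(W[0]))] for i in range(len(W))]
```

This is why publishing an adapter is cheap relative to publishing a fine-tuned model: for rank r << min(d_out, d_in), the adapter holds a small fraction of the base model's parameters.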
Licensing & Safety Concerns
Decepticon uses a modified GPL-3.0 license with an "Ethical Use Clause"—legally unenforceable but signaling intent. The project lacks the safety guardrails seen in Bishop Fox's "Ghostwriter" or NVIDIA's "Morpheus," making it attractive to script kiddies while worrying enterprise security teams.
Community Velocity
Despite being weeks old, the project has spawned 27 third-party plugins (Discord bot integrations, Slack command interfaces) and a HuggingFace collection of fine-tuned exploit-generation models. The maintainer (PurpleAILAB) appears to be anonymous—a red flag for enterprise adoption but typical for offensive security tooling.
Momentum Analysis
AISignal exclusive — based on live signal data
Velocity Metrics
| Metric | Value | Interpretation |
|---|---|---|
| Weekly Growth | +48 stars/week | Viral within cybersecurity niche |
| 7-day Velocity | 46.5% | Breaking out of early adopter phase |
| 30-day Velocity | 49.7% | Sustained acceleration rare for security tools |
Adoption Phase Analysis
Decepticon sits at the hype inflection point—post-proof-of-concept but pre-enterprise validation. The 279 forks suggest immediate experimentation by red teams and CTF players, while the star-to-fork ratio (5.6:1) indicates high curiosity but low immediate utility for casual observers.
The growth driver isn't novelty (PentestGPT exists), but the removal of friction—autonomous execution appeals to overstretched security teams and bug bounty hunters seeking volume.
Forward-Looking Assessment
Expect bifurcation: Enterprises will fork private versions with heavy safety guardrails (human-in-the-loop requirements, audit logging), while the public repo becomes a playground for automated vulnerability scanning—likely attracting GitHub TOS scrutiny if used for unauthorized testing. The 49.7% monthly velocity is unsustainable; expect stabilization at ~3k stars unless a major CVE is discovered by the tool itself, which would trigger second-order growth.
Risk Factor: High probability of media sensationalism around "AI hackers" leading to repository restrictions or license changes to non-commercial within 90 days.