Changelog
Sitaraman Subramanian edited this page Apr 4, 2026 · 3 revisions
- Introduced swarm mode for parallel multi-model attack runs. The orchestrator spawns a swarm of racers, each running a different model, that attack the same task simultaneously. The first to succeed can end the race (`first_success`) or all can run to completion (`all_complete`).
- Added the `spawn_swarm` agent tool (available only in swarm-agent context). The orchestrator uses it to launch sub-swarms with per-agent task specs, optional timeout, and a configurable win condition.
- Orchestrator model: the main model is now explicitly configured as the orchestrator (`ORCHESTRATOR_*` env vars). It drives high-level planning and swarm coordination.
- Racer models: up to 8 independent racer models can be configured (`RACER_1_*` … `RACER_8_*`), each with its own provider, model, API key, base URL, and reasoning mode. Racers are assigned round-robin when a swarm spawns more agents than configured racers.
- Environment variable cleanup: `MODEL_API_KEY`, `MODEL_PROVIDER`, `MODEL_BASE_PATH`, and `REASONING_MODE` are removed. All model config now lives under `ORCHESTRATOR_*`. Existing `.env` files with only `MODEL_*` set continue to work via the fallback in `readSwarmModelsFromEnv`.
- `run.sh config` gains a dedicated Racers option (option 2). The guided startup now prompts to configure racers as an optional step after the main model.
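The `MODEL_*` fallback described above could work roughly as follows. This is a minimal Python sketch, not the project's `readSwarmModelsFromEnv`: the function name `read_orchestrator_config` is hypothetical, and the suffixes under `ORCHESTRATOR_*` are assumed by analogy with the removed variables.

```python
from typing import Dict, Mapping, Optional

# Legacy variable -> assumed ORCHESTRATOR_* replacement.
# The right-hand names are hypothetical; only the prefix is documented.
LEGACY_TO_ORCHESTRATOR = {
    "MODEL_PROVIDER": "ORCHESTRATOR_PROVIDER",
    "MODEL_API_KEY": "ORCHESTRATOR_API_KEY",
    "MODEL_BASE_PATH": "ORCHESTRATOR_BASE_PATH",
    "REASONING_MODE": "ORCHESTRATOR_REASONING_MODE",
}


def read_orchestrator_config(env: Mapping[str, str]) -> Dict[str, Optional[str]]:
    """Prefer ORCHESTRATOR_* values; fall back to the legacy name when unset."""
    config: Dict[str, Optional[str]] = {}
    for legacy, current in LEGACY_TO_ORCHESTRATOR.items():
        # An empty string counts as unset, so the legacy value still applies.
        config[current] = env.get(current) or env.get(legacy)
    return config
```

Passing `os.environ` as `env` would read the live process environment; a plain dict works for experimentation.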
- Swarm mode integrates with CTF solving: use `/solve <challenge>` to focus the orchestrator, then let it spawn a racer swarm to attack the challenge with multiple models in parallel.
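To make the round-robin assignment and the two win conditions concrete, here is a small Python sketch using `asyncio`. It is an illustration under assumptions rather than the project's implementation: `assign_racers`, `run_swarm`, and the `(racer_name, succeeded)` result shape are hypothetical.

```python
import asyncio
from itertools import cycle, islice


def assign_racers(task_specs, racers):
    """Pair each task spec with a racer, wrapping around round-robin."""
    if not racers:
        raise ValueError("at least one racer must be configured")
    return list(zip(task_specs, islice(cycle(racers), len(task_specs))))


async def run_swarm(attempts, win_condition="first_success", timeout=None):
    """Run attempt coroutines concurrently.

    Each attempt returns (racer_name, succeeded). With "first_success" the
    race stops at the first success; with "all_complete" every attempt runs
    to the end (or until the optional timeout, in seconds).
    """
    tasks = [asyncio.ensure_future(attempt) for attempt in attempts]
    results = []
    try:
        for finished in asyncio.as_completed(tasks, timeout=timeout):
            outcome = await finished
            results.append(outcome)
            if win_condition == "first_success" and outcome[1]:
                break
    except asyncio.TimeoutError:
        pass  # keep whatever finished before the deadline
    finally:
        for task in tasks:
            task.cancel()  # no-op for tasks that already finished
        await asyncio.gather(*tasks, return_exceptions=True)
    return results
```

With three task specs and two racers, `assign_racers` hands the third spec back to the first racer, which matches the round-robin behaviour described for swarms larger than the configured racer count.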
- The system now operates as an autonomous agent. Previously, the workflow was strictly human-in-the-loop with multiple stops where the user had to nudge the assistant to continue. Now the agent loops on its own for up to 25 iterations per turn, calling tools, analyzing output, and deciding next steps independently.
- Added a consent model with two modes: "Auto run" (agent executes freely, with safety checks on dangerous commands) and "Ask consent" (agent pauses for approval before tool execution).
- Manual execution fallback: when SSH connectivity drops, the agent presents commands for the user to run manually and accepts pasted output.
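The per-turn loop and the two consent modes can be pictured with the following Python sketch. Only the 25-iteration cap and the mode names come from the notes above; `ToolCall`, `run_turn`, and the callback signatures are hypothetical, and the dangerous-command check that "Auto run" applies is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

MAX_ITERATIONS = 25  # per-turn cap stated in the changelog


@dataclass
class ToolCall:
    name: str
    argument: str


def run_turn(
    next_step: Callable[[List[str]], Optional[ToolCall]],
    execute: Callable[[ToolCall], str],
    approve: Callable[[ToolCall], bool],
    mode: str = "auto_run",
) -> List[str]:
    """Loop until the model stops requesting tools or the cap is reached."""
    transcript: List[str] = []
    for _ in range(MAX_ITERATIONS):
        call = next_step(transcript)
        if call is None:  # the model decided it is done
            break
        if mode == "ask_consent" and not approve(call):
            transcript.append(f"{call.name}: denied by user")
            continue
        transcript.append(f"{call.name}: {execute(call)}")
    return transcript
```

Here `next_step` stands in for the model choosing its next action from the transcript so far, and `approve` stands in for the approval prompt shown in "Ask consent" mode.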
- Renamed "plugins" to "tools" throughout the codebase and UI.
- 16 agent tools available, organized into Core, Intelligence, Burp Suite, and Browser groups.
- Tools can be toggled on/off per session from the sidebar.
- Tools that depend on unconfigured integrations are automatically hidden.
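As a rough illustration of how per-session toggles and integration-based hiding might combine, consider this Python sketch. The `Tool` record and `visible_tools` function are hypothetical and are not taken from the codebase.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Tool:
    name: str
    group: str
    requires: Optional[str] = None  # integration the tool depends on, if any


def visible_tools(tools: Iterable[Tool], configured: Set[str], disabled: Set[str]) -> List[str]:
    """Hide tools whose integration is not configured or that were toggled off."""
    return [
        tool.name
        for tool in tools
        if (tool.requires is None or tool.requires in configured)
        and tool.name not in disabled
    ]
```

A Burp Suite tool declared with `requires="burp"` would drop out of the list until `"burp"` appears in `configured`, while a tool listed in `disabled` stays hidden for that session regardless of configuration.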
- Introduced a capability registry with 100+ security tools and Python packages organized into 7 buckets: Core, Network & Recon, Reverse Engineering, Binary Exploitation, Cryptography, Forensics, and Steganography.
- Capabilities are exposed to the agent's system prompt so it knows what's available.
- Added "Detect Installed" to scan the exploit box and determine which tools are already present.
- Agent can auto-install capabilities on the fly via the `run_install_tool` tool.
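A "Detect Installed" style scan can be approximated by checking whether each executable resolves on `PATH`. The Python sketch below uses `shutil.which` locally as a stand-in for scanning the exploit box; the `REGISTRY` contents are a tiny hypothetical sample, and the placement of each tool in a bucket is an assumption.

```python
import shutil
from typing import Callable, Dict, List, Optional

# A small, hypothetical slice of a registry: bucket name -> executable names.
REGISTRY: Dict[str, List[str]] = {
    "Network & Recon": ["nmap", "masscan"],
    "Reverse Engineering": ["gdb", "radare2"],
    "Forensics": ["binwalk"],
}


def detect_installed(
    registry: Dict[str, List[str]],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, Dict[str, bool]]:
    """Report, per bucket, whether each executable is found on PATH."""
    return {
        bucket: {tool: which(tool) is not None for tool in tools}
        for bucket, tools in registry.items()
    }
```

The `which` parameter is injectable so the lookup could be swapped for one that runs on a remote host. Python packages would need a different check (for example an import probe), which this sketch does not cover.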
- Removed the copilot checklist/todo list feature. The agentic workflow replaces the need for a step-by-step checklist, as the agent manages its own task progression.
- Added integration with Burp Suite Professional via the `burp-rpc` gRPC extension.
- Dedicated proxy history page in the UI with filtering, request/response inspection, and intercept toggle.
- Agent tools: `search_burp_proxy_history`, `send_to_burp_repeater`, `send_to_burp_intruder`, `burp_collaborator`.
- "Send to Workspace" action to attach captured requests to the agent's chat context.
- Introduced browser automation via Magnitude. The agent can control a real browser to interact with web applications: navigate, fill forms, click buttons, extract data.
- Configurable proxy URL to route browser traffic through Burp Suite, combining browser-based testing with Burp's analysis tools.
- Headless and visible modes (visible mode uses the VNC display).
- Separate model configuration for the browser agent.
- Added a VPN management page for handling OpenVPN connections on the exploit box.
- Upload `.ovpn`/`.conf` profiles from the browser.
- Connect and disconnect VPN profiles with one click.
- Support for multiple simultaneous VPN connections.
- Status display showing PID, tunnel interface, and assigned IP.
- The agent can spawn subagents to work on tasks in parallel using the `spawn_subagent` tool.
- Each subagent gets its own iteration loop (up to 15 iterations, 10-minute timeout) with access to the same tools.
- Results are merged back into the main conversation.
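The subagent limits above (15 iterations, 10-minute timeout, merged results) can be sketched in Python with `asyncio.wait_for` and `asyncio.gather`. The function names and the `step` callback are hypothetical; only the two limits are taken from the notes.

```python
import asyncio

SUBAGENT_MAX_ITERATIONS = 15
SUBAGENT_TIMEOUT_SECONDS = 600  # 10 minutes


async def run_subagent(task, step, timeout=SUBAGENT_TIMEOUT_SECONDS):
    """Call step(task, i) until it returns a result, the cap hits, or time runs out."""

    async def loop():
        for i in range(SUBAGENT_MAX_ITERATIONS):
            result = await step(task, i)
            if result is not None:
                return result
        return "iteration limit reached"

    try:
        return await asyncio.wait_for(loop(), timeout)
    except asyncio.TimeoutError:
        return "timed out"


async def spawn_subagents(tasks, step, timeout=SUBAGENT_TIMEOUT_SECONDS):
    """Run one subagent per task in parallel and merge the results by task."""
    results = await asyncio.gather(*(run_subagent(t, step, timeout) for t in tasks))
    return dict(zip(tasks, results))
```

Returning a dict keyed by task is one simple way to model results being merged back into the main conversation; the real merge format is not described in the notes.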
- Introduced `run.sh` as the primary setup and launcher script. Handles configuration, Docker builds, and container orchestration in a guided flow.
- Three modes: Core Docker, Docker + Kali, Developer.
- Split configuration into `config.toml` (static, requires restart) and `backend/.env` (dynamic, hot-reloadable).
- Settings are configurable from both `run.sh` and the in-app Settings overlay.
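One common way to make an env file hot-reloadable is to re-read it whenever its modification time changes. The Python sketch below shows that idea only; how the backend actually reloads `backend/.env` is not described here, and `parse_env` and `HotEnv` are hypothetical names.

```python
import os
from typing import Dict, Optional


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


class HotEnv:
    """Re-read an env file whenever its modification time changes."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._mtime_ns: Optional[int] = None
        self._values: Dict[str, str] = {}

    def get(self, key: str, default: str = "") -> str:
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns != self._mtime_ns:
            with open(self.path, encoding="utf-8") as handle:
                self._values = parse_env(handle.read())
            self._mtime_ns = mtime_ns
        return self._values.get(key, default)
```

Static settings such as those in `config.toml` would by contrast be read once at startup, which is why changing them requires a restart.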
- Dangerous command detection: commands matching patterns like recursive forced deletion of system paths, writes to block devices, fork bombs, and system shutdown are flagged and require explicit approval.
- Global consent override toggle in Settings.
- Commands run in `~/pentest-workspace` sandbox by default.
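Pattern-based flagging of the command classes listed above might look like the following Python sketch. The regular expressions are illustrative assumptions, deliberately narrow, and not the project's actual rules; `flag_dangerous` is a hypothetical helper.

```python
import re
from typing import List

# Illustrative patterns only; real detection rules would need to be broader.
DANGEROUS_PATTERNS = {
    "recursive forced deletion of a system path": re.compile(
        r"\brm\s+-(?=\w*[rR])(?=\w*f)\w+\s+/(?:\s|$|\*|(?:etc|usr|var|boot|bin|lib|home)\b)"
    ),
    "write to a block device": re.compile(
        r"(?:\bdd\b.*\bof=|>\s*)/dev/(?:sd|hd|vd|nvme|xvd)\w*"
    ),
    "fork bomb": re.compile(r":\(\)\s*\{\s*:\s*\|\s*:\s*&\s*\}\s*;\s*:"),
    "system shutdown": re.compile(r"\b(?:shutdown|reboot|halt|poweroff)\b"),
}


def flag_dangerous(command: str) -> List[str]:
    """Return the reasons a command should require explicit approval."""
    return [
        reason
        for reason, pattern in DANGEROUS_PATTERNS.items()
        if pattern.search(command)
    ]
```

An empty list means the command passes this check; any non-empty list would route the command to the approval prompt even in "Auto run" mode.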
- Added 8 slash commands: `/help`, `/status`, `/summarize`, `/targets`, `/export`, `/shells`, `/clear`, `/reset`.
- Run outside the agent loop for quick session management.
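Running slash commands outside the agent loop suggests a router that intercepts them before a message reaches the model. A minimal Python sketch of that idea, with `route_message` and the handler signature as hypothetical names:

```python
from typing import Callable, Dict, Optional


def route_message(message: str, handlers: Dict[str, Callable[[str], str]]) -> Optional[str]:
    """Return a handler's reply for a slash command, or None to pass the message to the agent."""
    text = message.strip()
    if not text.startswith("/"):
        return None  # ordinary chat message: let the agent loop handle it
    name, _, argument = text.partition(" ")
    handler = handlers.get(name)
    if handler is None:
        return f"Unknown command {name}. Try /help."
    return handler(argument.strip())
```

Because the router returns before any model call is made, commands such as `/status` or `/clear` would respond immediately and would not consume an agent iteration.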
- Support for multiple LLM providers: OpenAI, Anthropic, Google, Mistral, and any OpenAI-compatible API.
- Anthropic OAuth authentication as an alternative to API keys.
- Reasoning mode support for models that offer extended thinking.
- Configurable from the Settings overlay.
- Automatic summarization of older messages when context gets large.
- Context usage indicator in the sidebar showing token consumption.
- `/summarize` and `/clear` slash commands for manual context management.
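Automatic summarization of older messages can be sketched as a budget check followed by a compaction step. In the Python below, the four-characters-per-token estimate is a rough heuristic rather than a real tokenizer, and `compact_history`, `budget`, and `keep_recent` are hypothetical names.

```python
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    """Very rough heuristic: about four characters per token."""
    return max(1, len(text) // 4)


def compact_history(
    messages: List[str],
    summarize: Callable[[List[str]], str],
    budget: int,
    keep_recent: int = 4,
) -> List[str]:
    """Replace older messages with one summary when the estimate exceeds the budget."""
    total = sum(estimate_tokens(message) for message in messages)
    if total <= budget or len(messages) <= keep_recent:
        return messages
    split = len(messages) - keep_recent
    older, recent = messages[:split], messages[split:]
    return [summarize(older)] + recent
```

The same `total` figure could feed a usage indicator like the one in the sidebar, and a manual `/summarize` would amount to calling the compaction step regardless of the budget.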
- Shell sessions on the exploit box with PTY (interactive) and exec (non-interactive) modes.
- Multiple named shell tabs in the UI.
- Agent can spawn, write to, read from, and close shells programmatically.
- Automatic SSH reconnection on connection drops.
- Auto-setup for VNC on the exploit box.
- Built-in VNC diagnostic and repair tools.
- noVNC-based browser access to the Kali desktop.
- The built-in Kali container includes XFCE, Firefox ESR, and noVNC pre-configured.
- `run.sh dev` starts only infrastructure (MongoDB, Redis, optionally Kali) so you can run the frontend and backend locally for development.
- Dev mode sets connection strings to localhost automatically.