Problem Statement
FORGE needs an optional safety-review layer for risky agent actions. Without one, coding agents can run shell commands, access sensitive files, perform destructive filesystem operations, or contact network and remote systems with no separate, project-specific reviewer checking each action against user-defined rules.
This should be implemented as a project-level adversarial agent mode, inspired by Goose's adversary mode but adapted to FORGE's coding dashboard and project settings model.
Reference:
https://goose-docs.ai/docs/guides/security/adversary-mode/
Desired Outcome
FORGE supports an optional adversarial agent mode for each project. The mode is off by default. When enabled, it reviews risky tool actions before execution using default rules that can be overridden per project.
The user can enable the mode from Project Settings, review/edit the adversarial rules in a modal, restore default rules, and see adversarial mode status from the Projects dashboard.
User Story
As a FORGE user,
I want to enable an adversarial reviewer for risky agent actions on selected projects,
So that I can give coding agents more autonomy while retaining a safety layer for dangerous, destructive, or privacy-invasive actions.
Requirements
- Adversarial mode is off by default for every project.
- New projects get default adversarial rules.
- Project Settings includes an “Enable adversarial agent” checkbox.
- Enabling the checkbox opens a modal for reviewing and editing rules.
- Users can save project-specific adversarial rules.
- Users can restore default adversarial rules.
- The Projects dashboard shows adversarial mode status.
- The reviewer runs before high-risk tool actions when adversarial mode is enabled.
- The reviewer blocks clearly dangerous actions while avoiding unnecessary interruption to normal coding work.
- Blocked actions include a specific explanation and a safer alternative where possible.
- The reviewer returns a structured allow/block decision.
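The review flow in the requirements above could be sketched roughly as follows. This is only an illustration: `run_with_review` and its callback parameters are hypothetical names, and treating anything other than an explicit "allow" as a block is an assumption, not a decided behaviour.

```python
from typing import Any, Callable, Dict, Optional

Decision = Dict[str, Any]


def run_with_review(
    action: str,
    mode_enabled: bool,
    is_high_risk: Callable[[str], bool],
    review: Callable[[str], Decision],
    execute: Callable[[str], Any],
) -> Dict[str, Any]:
    """Run an action, consulting the reviewer first when the mode is on and the action is high-risk."""
    decision: Optional[Decision] = None
    if mode_enabled and is_high_risk(action):
        decision = review(action)
        # Assumption: only an explicit "allow" lets the action through.
        if decision.get("decision") != "allow":
            return {"executed": False, "decision": decision}
    # Low-risk actions, or any action with the mode off, run without review.
    return {"executed": True, "result": execute(action), "decision": decision}
```

Passing the risk check, reviewer, and executor in as callables keeps the gate independent of how FORGE actually classifies actions or calls the reviewer.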
Acceptance Criteria
Out of Scope
- Do not enable adversarial mode by default.
- Do not block normal coding work unnecessarily.
- Do not make this a general chat-only feature; it should be integrated into FORGE project settings and project dashboard behaviour.
- Do not implement unrelated MCP management work in this issue.
Implementation Scope
Large - architecture or workflow change
Technical Notes
Project Settings should include:
Security
[ ] Enable adversarial agent
Suggested default global rules path:
~/Documents/Forge/templates/default-project/adversary.md
Suggested project override path:
~/Documents/Forge/projects/<project>/adversary.md
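One possible way to resolve which rules file applies, given the two suggested paths above. This sketch assumes a project override takes precedence whenever the file exists and that the global default is used otherwise. The function name `resolve_rules_path` is hypothetical.

```python
from pathlib import Path


def resolve_rules_path(forge_root: Path, project: str) -> Path:
    """Return the project's adversary.md if present, else the global default."""
    override = forge_root / "projects" / project / "adversary.md"
    if override.is_file():
        return override
    # Assumption: fall back to the template rules when no override exists.
    return forge_root / "templates" / "default-project" / "adversary.md"
```

For example, `resolve_rules_path(Path("~/Documents/Forge").expanduser(), "my-project")` would return the override only once that project has saved its own rules.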
Suggested project config shape:
{
"security": {
"adversarialModeEnabled": false,
"adversaryRulesPath": "~/Documents/Forge/projects/<project>/adversary.md",
"usesDefaultAdversaryRules": true
}
}
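A minimal sketch of reading the config shape above with safe defaults, assuming the project config is stored as JSON. `SecuritySettings` and `load_security_settings` are hypothetical names used only for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SecuritySettings:
    adversarial_mode_enabled: bool = False  # off by default, per the requirements
    adversary_rules_path: Optional[str] = None
    uses_default_adversary_rules: bool = True


def load_security_settings(config_text: str) -> SecuritySettings:
    """Parse a project config JSON string, falling back to defaults for missing keys."""
    config = json.loads(config_text) if config_text.strip() else {}
    security = config.get("security", {})
    return SecuritySettings(
        adversarial_mode_enabled=bool(security.get("adversarialModeEnabled", False)),
        adversary_rules_path=security.get("adversaryRulesPath"),
        uses_default_adversary_rules=bool(security.get("usesDefaultAdversaryRules", True)),
    )
```

Defaulting every missing key to the "off, default rules" state means existing projects without a `security` block would keep behaving as they do today.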
Suggested dashboard states:
Adversarial Mode: Off
Adversarial Mode: On · Default rules
Adversarial Mode: On · Custom rules
Last blocked action: npm script attempted to read .env
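The three status labels above could be derived from the two config flags along these lines. `dashboard_status` is a hypothetical helper, not an existing FORGE function.

```python
def dashboard_status(enabled: bool, uses_default_rules: bool) -> str:
    """Build the Projects dashboard label for adversarial mode."""
    if not enabled:
        return "Adversarial Mode: Off"
    rules = "Default rules" if uses_default_rules else "Custom rules"
    return f"Adversarial Mode: On · {rules}"
```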
Suggested modal behaviour:
- Toggle: Enable adversarial agent
- Text area for rules
- Button: Restore default rules
- Button: Save rules
- Indicator: Default rules / Custom rules
- Warning that this mode may block risky agent actions
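The Save and Restore buttons might be backed by something like the sketch below. It assumes rules are plain text files and that "Default rules" means the saved text matches the default file after trimming whitespace. Both function names are hypothetical.

```python
from pathlib import Path


def save_rules(project_dir: Path, rules_text: str, default_rules_path: Path) -> bool:
    """Write the project's adversary.md; return True if it matches the defaults."""
    project_dir.mkdir(parents=True, exist_ok=True)
    (project_dir / "adversary.md").write_text(rules_text, encoding="utf-8")
    default_text = default_rules_path.read_text(encoding="utf-8")
    # Assumption: whitespace-only differences still count as default rules.
    return rules_text.strip() == default_text.strip()


def restore_default_rules(project_dir: Path, default_rules_path: Path) -> bool:
    """Overwrite the project's rules with the default rules."""
    default_text = default_rules_path.read_text(encoding="utf-8")
    return save_rules(project_dir, default_text, default_rules_path)
```

The returned flag could drive both the modal's Default rules / Custom rules indicator and the `usesDefaultAdversaryRules` config value.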
Suggested high-risk actions requiring review:
- Shell commands
- Filesystem writes/deletes outside expected project paths
- Network calls
- Package install scripts
- Git operations that rewrite history or touch remotes
- Access to secrets or credential-like files
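A rough sketch of how the categories above could be turned into a pre-check that decides whether an action goes to the reviewer at all. Everything here is illustrative: the `ToolAction` type, the action kinds, the filename hints, and the git subcommand list are assumptions and are deliberately not exhaustive.

```python
import posixpath
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolAction:
    kind: str         # hypothetical kinds: shell, network, package_install, git, fs_read, fs_write, fs_delete
    target: str = ""  # path, URL, or command arguments

# Illustrative hints only; a real list would need review.
SECRET_NAME_HINTS = (".env", "id_rsa", "id_ed25519", ".pem", "credentials")
RISKY_GIT_SUBCOMMANDS = {"push", "pull", "fetch", "clone", "rebase", "filter-branch"}


def is_inside(project_root: str, path: str) -> bool:
    """True if path, resolved against project_root, stays within it."""
    root = posixpath.normpath(project_root)
    full = posixpath.normpath(posixpath.join(root, path))
    return full == root or full.startswith(root + "/")


def requires_review(action: ToolAction, project_root: str) -> bool:
    """Decide whether an action falls into one of the high-risk categories."""
    if action.kind in {"shell", "network", "package_install"}:
        return True
    if action.kind == "git":
        words = action.target.split()
        return bool(words) and words[0] in RISKY_GIT_SUBCOMMANDS
    if action.kind.startswith("fs_"):
        name = posixpath.basename(action.target).lower()
        if any(hint in name for hint in SECRET_NAME_HINTS):
            return True
        if action.kind in {"fs_write", "fs_delete"}:
            return not is_inside(project_root, action.target)
    return False
```

Keeping this check cheap and deterministic would let ordinary edits inside the project skip the reviewer entirely, which supports the goal of not interrupting normal coding work.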
Suggested reviewer decision shape:
{
"decision": "allow | block",
"reason": "Short explanation",
"triggeredRules": ["rule-id-or-summary"],
"saferAlternative": "Optional safer action"
}
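A minimal sketch of validating the decision shape above, assuming the reviewer returns a JSON object as text. `ReviewDecision` and `parse_decision` are hypothetical names. Whether a malformed response should block or allow the action is left to the caller here.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class ReviewDecision:
    decision: str  # "allow" or "block"
    reason: str
    triggered_rules: List[str] = field(default_factory=list)
    safer_alternative: Optional[str] = None


def parse_decision(raw: str) -> ReviewDecision:
    """Parse reviewer output; raise ValueError if it does not match the expected shape."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("reviewer output must be a JSON object")
    decision = data.get("decision")
    if decision not in ("allow", "block"):
        raise ValueError(f"unexpected decision: {decision!r}")
    return ReviewDecision(
        decision=decision,
        reason=str(data.get("reason", "")),
        triggered_rules=[str(rule) for rule in data.get("triggeredRules", [])],
        safer_alternative=data.get("saferAlternative"),
    )
```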
Suggested default adversary.md:
# FORGE Adversarial Agent Rules
You are the adversarial safety reviewer for this FORGE project.
Your job is to review risky agent actions before they execute. You should block actions that are clearly dangerous, destructive, privacy-invasive, or outside the user's likely intent.
Default policy:
1. Block exfiltration of secrets, `.env` files, SSH keys, API tokens, cookies, credentials, or private keys.
2. Block destructive actions outside the project folder unless the user explicitly approves them.
3. Block privilege escalation, sudo usage, chmod/chown of sensitive paths, or system-level mutation unless explicitly approved.
4. Block downloading and executing unknown remote scripts.
5. Block network calls to unknown domains unless the project has a clear reason to contact them.
6. Block attempts to disable security checks, remove logs, hide activity, or bypass this reviewer.
7. Allow normal coding work, including editing project files, running tests, installing ordinary dependencies, local builds, and git operations inside the project.
8. Err on the side of allowing normal development unless the action is clearly risky.
9. When blocking, explain the specific rule triggered and suggest a safer alternative.
Depends on the workspace structure from issue #38. Complements the project dashboard and MCP health work in issue #39.