Devlery

Cursor Auto-review ships for Shell and MCP approvals

Cursor 3.6 Auto-review Run Mode routes Shell, MCP, and Fetch calls through allowlists, sandboxing, classifier review, and user approval.

AI summary
  • What happened: Cursor published Auto-review Run Mode in its May 29, 2026 changelog.
    • The mode applies to Shell, MCP, and Fetch tool calls, routing them through allowlists, sandbox execution, classifier review, or user approval.
  • Why it matters: Long-running coding agents are shifting the bottleneck from model output to execution approval policy.
  • Watch: Cursor's own docs say classifier decisions can be wrong and Auto-review is not a security boundary.
  • Team impact: Enterprise admins need to tune Run Mode, sandbox networking, MCP protection, and file-deletion controls before broad rollout.

Cursor introduced Auto-review Run Mode in its 3.6 changelog on May 29, 2026. The short product pitch is that Cursor can work longer, show fewer approval prompts, and execute more safely. The deeper change is a new execution policy for the agent's Shell, MCP, and Fetch tool calls: Cursor decides whether a call can run from an allowlist, run inside a sandbox, go to a classifier subagent, or wait for explicit user approval.

That policy matters because AI coding agents increasingly spend their time outside pure text generation. They search code, read files, run tests, inspect build logs, call package managers, fetch documentation, and use MCP tools connected to external systems. Requiring approval for every npm test, pnpm build, rg, or git diff interrupts multi-step work. Allowing every Shell and MCP call to run automatically creates a different risk profile: file deletion, external network access, secret exposure, workspace escape, and unintended changes to systems outside the repository. Cursor's Auto-review sits between those two modes by inserting a classifier subagent into the gray area.


The three tool surfaces Auto-review covers

Cursor names three target surfaces for Auto-review: Shell, MCP, and Fetch tool calls. Shell covers terminal commands. MCP covers the Model Context Protocol tools that connect agents to external data sources and services. Fetch covers web request tools. These are the places where a coding assistant becomes an executor rather than a code editor.

The routing path has four possible outcomes. Allowlisted calls run immediately. Calls that can run in a sandbox execute inside that sandbox. Remaining calls go to a classifier subagent, which either allows the call, asks the main agent to try a different approach, or escalates to the user for explicit approval. Users configure the behavior under Settings > Cursor Settings > Agents > Run Mode, and they can provide custom instructions to the classifier agent.
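The routing described above can be sketched as a small decision function. This is a minimal illustration, not Cursor's implementation: the `ToolCall` fields, the `Route` and `Verdict` names, and the order in which the branches are checked are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Route(Enum):
    RUN_ALLOWLISTED = "run_allowlisted"
    RUN_SANDBOXED = "run_sandboxed"
    CLASSIFIER_REVIEW = "classifier_review"


class Verdict(Enum):
    ALLOW = "allow"
    REROUTE = "reroute"      # ask the main agent to try a different approach
    ASK_USER = "ask_user"    # escalate to explicit user approval


@dataclass(frozen=True)
class ToolCall:
    surface: str        # "shell", "mcp", or "fetch"
    name: str           # hypothetical identifier: a command or tool name
    sandboxable: bool   # whether the call can run inside the sandbox


def route(call: ToolCall, allowlist: set[str]) -> Route:
    """Pick the first branch that applies (assumed order: allowlist, sandbox, classifier)."""
    if call.name in allowlist:
        return Route.RUN_ALLOWLISTED
    if call.sandboxable:
        return Route.RUN_SANDBOXED
    return Route.CLASSIFIER_REVIEW


def handle(call: ToolCall, allowlist: set[str],
           classify: Callable[[ToolCall], Verdict]) -> tuple[Route, Verdict]:
    """Only calls that reach the classifier branch get a model verdict."""
    branch = route(call, allowlist)
    if branch is Route.CLASSIFIER_REVIEW:
        return branch, classify(call)
    return branch, Verdict.ALLOW
```

The point of the sketch is that the classifier is consulted only for calls that neither deterministic branch has already handled.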

[Figure: a Shell, MCP, or Fetch tool call is routed to one of three paths: run immediately when allowlisted; run in isolation when sandboxable; otherwise a classifier subagent chooses allow, reroute, or request approval.]

The design productizes an approval problem that coding agents have exposed for the last year. In earlier flows, users approved or rejected individual operations. In Auto-review, one agent is effectively reviewing another agent's tool execution. The user sees fewer individual calls, while allowlists, sandbox rules, classifier instructions, and enterprise policy define the risk envelope.

Cursor's docs draw a clear line

Cursor's Terminal documentation is more important than the changelog headline. The docs describe the classifier as non-deterministic and state that it can make mistakes in both directions: it may block a safe call, or allow a call that the user would have rejected. Cursor tells users to treat Auto-review as best-effort convenience and use allowlists or direct approval for stricter control.

That warning is not a minor footnote. It defines the responsibility boundary. The classifier subagent is not the security boundary. The real boundaries are sandboxing, allowlists, network policy, file protections, and enterprise controls. The classifier is closer to a user-experience layer for reducing approval fatigue. A team that treats Auto-review as an automatic security approval system will assign too much trust to a model decision.

OpenAI Codex, Claude Code, Cursor, and other coding agents face the same pressure. Users want agents to keep working for longer stretches. Products need more tool calls to complete real repository work. Security teams do not want arbitrary commands, MCP calls, and web requests to execute without control. Cursor's release shows approval policy becoming a first-class product feature, not just a modal dialog attached to a terminal command.

Sandbox and allowlist settings are the actual boundary

Cursor's docs describe sandboxing as a separate control plane. Sandboxed commands can have restricted filesystem and network access. Network access can be configured through modes such as sandbox.json Only, sandbox.json + Defaults, and Allow All. Cursor describes the default as combining the user's allowlist with Cursor's built-in defaults. That choice directly affects whether an agent can reach package registries, API endpoints, documentation sites, or arbitrary external hosts while running a command.
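To make the effect of the three networking modes concrete, here is a minimal sketch of how an effective host allowlist could be resolved. The mode labels come from the text above; the `BUILTIN_DEFAULT_HOSTS` value, the function name, and the exact-match host comparison are assumptions for illustration, and nothing here reflects the real `sandbox.json` schema.

```python
from enum import Enum


class NetworkMode(Enum):
    SANDBOX_JSON_ONLY = "sandbox.json Only"
    SANDBOX_JSON_PLUS_DEFAULTS = "sandbox.json + Defaults"
    ALLOW_ALL = "Allow All"


# Hypothetical placeholder for Cursor's built-in defaults (the real list is not given here).
BUILTIN_DEFAULT_HOSTS = frozenset({"registry.example.com"})


def host_allowed(host: str, mode: NetworkMode, user_hosts: set[str]) -> bool:
    """Return True if a sandboxed command may reach `host` under `mode`."""
    if mode is NetworkMode.ALLOW_ALL:
        return True
    allowed = {h.lower() for h in user_hosts}
    if mode is NetworkMode.SANDBOX_JSON_PLUS_DEFAULTS:
        allowed |= BUILTIN_DEFAULT_HOSTS
    return host.lower() in allowed
```

Under these assumptions, the same host can be reachable in one mode and blocked in another, which is why the mode choice belongs in a policy review rather than a personal preference.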

Allowlists also split into multiple categories. Command Allowlist defines terminal commands that can run without approval. MCP Allowlist defines MCP tools that can run without approval. Cursor's docs note that, depending on the selected mode, allowlisted commands may run outside the sandbox, while non-allowlisted commands may still run inside a sandbox. "Putting it on the allowlist" is therefore a permission decision, not just a convenience shortcut.

Teams should avoid building allowlists as a grab bag of commands they run often. git status, rg, ls, cat, and a project test command are comparatively low-risk in many repositories. rm, curl | sh, cloud CLIs, secret managers, deploy commands, and database migration commands are different. MCP tools need the same treatment. A documentation-search MCP server and a production-database MCP server should not land in the same approval bucket simply because they both speak MCP.
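One way to keep an allowlist from becoming a grab bag is to sort candidate commands into tiers before anything is added. The sketch below is illustrative only: the tier names, the program lists, and the rule that any shell metacharacter forces approval are assumptions, and it does not inspect arguments (so `cat` on a secrets file would still land in the low-risk tier).

```python
import shlex

# Hypothetical tiers for illustration; a real list is a per-repository decision.
LOW_RISK_PROGRAMS = {"rg", "ls", "cat"}
LOW_RISK_SUBCOMMANDS = {"git status", "git diff"}
HIGH_RISK_PROGRAMS = {"rm", "curl", "wget", "aws", "gcloud", "kubectl", "terraform"}
SHELL_METACHARACTERS = "|;&><`$"


def tier(command: str) -> str:
    """Return "auto", "review", or "ask" for a single shell command string."""
    # Pipelines, chaining, redirects, and substitution are never auto-approved.
    if any(ch in command for ch in SHELL_METACHARACTERS):
        return "ask"
    try:
        argv = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return "ask"
    if not argv or argv[0] in HIGH_RISK_PROGRAMS:
        return "ask"
    if argv[0] in LOW_RISK_PROGRAMS or " ".join(argv[:2]) in LOW_RISK_SUBCOMMANDS:
        return "auto"
    return "review"
```

Here `curl https://example.com/install.sh | sh` is caught by the metacharacter check before the program name is even considered, while an unknown command such as `make deploy` falls through to review rather than running automatically.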

Protection settings map to common agent failure modes

Cursor documents Browser Protection, File-Deletion Protection, Dotfile Protection, and External-File Protection. Browser Protection blocks automatic browser-tool execution. File-Deletion Protection blocks automatic file deletion. Dotfile Protection blocks automatic edits to files such as .gitignore. External-File Protection blocks file creation or modification outside the workspace.
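The file-oriented protections can be pictured as deterministic path checks. The following is a rough sketch under stated assumptions: the function name, the operation labels, and the rule that a "dotfile" is any file whose name starts with a dot are hypothetical, and Cursor's actual matching rules may differ.

```python
from pathlib import Path


def triggered_protections(path: Path, workspace: Path, operation: str) -> list[str]:
    """List which protections would block an automatic `operation` on `path`.

    `operation` is one of "create", "edit", or "delete" (labels assumed for this sketch).
    """
    reasons = []
    if operation == "delete":
        reasons.append("file-deletion")
    if operation == "edit" and path.name.startswith("."):
        reasons.append("dotfile")
    if operation in {"create", "edit"}:
        target = path if path.is_absolute() else workspace / path
        # resolve() collapses ".." segments so "../other" cannot pass as in-workspace.
        if not target.resolve().is_relative_to(workspace.resolve()):
            reasons.append("external-file")
    return reasons
```

An empty list means none of these three protections apply; any non-empty result would send the operation to the user instead of running it automatically. `Path.is_relative_to` requires Python 3.9 or later.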

Those controls mirror common failure modes in coding-agent work. Terminal commands can touch a wider path than intended. Dotfiles and configuration files can quietly change future behavior for the whole project. Workspace escapes can affect local credentials, global configuration, or neighboring repositories. Browser and network tools can interact with external systems in ways that are harder to audit from a code diff.

The settings become more consequential in team repositories than in individual experiments. A solo developer can often recover from a bad local branch by inspecting the diff and reverting a small set of changes. In an enterprise repository, changes to dotfiles, CI config, MCP server settings, package locks, migration scripts, or deployment configuration can affect other engineers and automated pipelines. Before enabling Auto-review broadly, a team needs to decide which classes of changes can be automated, not only which individual commands are annoying to approve.

Enterprise controls turn Run Mode into policy

Cursor says Enterprise controls are available on Enterprise subscriptions. Admins can use the web dashboard under Settings > Run Mode to override editor configuration or control which settings end users see. The documented controls include Run Mode controls, Sandboxing Mode, Sandbox Networking, Delete File Protection, MCP Tool Protection, Terminal Command Allowlist, and Enable Run Everything.
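Conceptually, an admin override means the dashboard value wins over whatever the editor is configured with. A minimal sketch of that precedence follows; the setting keys and values are hypothetical and do not correspond to Cursor's real configuration names.

```python
def effective_settings(user: dict, admin_overrides: dict) -> dict:
    """Merge editor settings with admin policy; admin values take precedence."""
    merged = dict(user)
    merged.update(admin_overrides)
    return merged


# Hypothetical keys for illustration only.
user_settings = {"run_mode": "auto_review", "enable_run_everything": True}
admin_policy = {"enable_run_everything": False, "sandbox_networking": "sandbox.json Only"}

policy = effective_settings(user_settings, admin_policy)
```

In this example the user's attempt to enable a run-everything mode is overridden, while the user's Run Mode choice survives because the admin policy does not pin it.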

That list signals how far AI coding tools have moved from personal productivity utilities. Traditional IDE administration mostly centered on extensions, keybindings, update channels, and sometimes compliance settings. Agentic IDE administration now has to decide whether an agent can access the network, run MCP tools, delete files, edit dotfiles, or enter a mode that executes everything. These are access-control questions inside a developer tool.

GitHub Copilot already exposes organization policy and model access controls. OpenAI Codex places sandboxing and approvals near the center of the product experience. Claude Code has its own permissions and execution boundaries. Cursor's Auto-review is the same category of product work: a coding agent can only run longer if command execution policy is explicit enough for both developers and security teams.

Approval fatigue is a real product bottleneck

Coding agents decompose work into many small operations. A typical fix includes code search, file reads, test execution, build execution, failure-log inspection, patching, and retesting. If every step waits on a prompt, the agent struggles to complete long-running tasks without the user sitting in the loop. The user also becomes conditioned to approve prompts quickly, which can reduce the safety value of the prompts that remain.

Auto-review targets that fatigue. Low-risk calls can run automatically. Sandboxable calls can run with isolation. Ambiguous calls can be reviewed by the classifier. The user is supposed to intervene only when the operation is risky, policy-sensitive, or unclear. If the setup is tuned well, the agent does more uninterrupted work and the human reviews fewer mechanical prompts.

The classifier caveat changes how teams should evaluate the feature. This is a risk-reduction model, not a security model. If the allowlist is broad, the sandbox has generous network access, and MCP tools can mutate external systems, the classifier can appear to be the final defense. Cursor's documentation explicitly rejects that interpretation. Teams should harden the deterministic controls first, then use classifier review to reduce noise inside those boundaries.

MCP makes approval policy harder

MCP has become a common way to attach tools and data sources to AI agents. Cursor includes MCP tool calls in Auto-review, which is a necessary but difficult choice. MCP tools vary widely in risk. Some tools are read-only: documentation search, issue lookup, or log inspection. Others can modify tickets, trigger deployments, query databases, access secrets, or change cloud resources.

The label "MCP call" does not describe the security meaning of the operation. Auto-review's MCP Allowlist means teams need tool-level policy, not just server-level policy. They should inspect action type, read/write capability, target environment, production access, and audit-log retention. As agents reach more systems through MCP, Cursor's Run Mode settings look less like editor preferences and more like access-control configuration.
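A tool-level policy can be expressed as a record per tool rather than per server. The sketch below is one possible shape, assuming hypothetical field names and a three-way outcome; it is not a schema defined by MCP or by Cursor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class McpToolPolicy:
    server: str        # hypothetical server label
    tool: str          # hypothetical tool name
    writes: bool       # can the tool mutate external state?
    environment: str   # e.g. "dev", "staging", "production"
    audited: bool      # are calls retained in an audit log?


def approval_for(policy: McpToolPolicy) -> str:
    """Return "ask_user", "review", or "auto" for one tool."""
    if policy.writes:
        return "ask_user"   # side effects always need explicit approval
    if policy.environment == "production" or not policy.audited:
        return "review"     # read-only, but sensitive or untracked
    return "auto"           # read-only, non-production, audited
```

Two tools on the same server can therefore land in different buckets: a read-only issue lookup may run automatically while the tool that closes issues always asks first.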

Fetch is also not a small surface. When an agent reads websites or calls web APIs, network allowlisting and prompt-injection risk enter the workflow together. Reading a documentation page is not the same operation as downloading a shell script from an arbitrary URL and executing it. Cursor's inclusion of Fetch in Auto-review acknowledges that web access is part of execution policy, not merely research.
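For Fetch, the deterministic part of the policy is a check on the parsed URL rather than on the raw string. This is a minimal sketch, assuming an HTTPS-only rule and a hypothetical `allowed_hosts` set that also admits subdomains; it addresses network allowlisting only and does nothing about prompt injection in the fetched content.

```python
from urllib.parse import urlsplit


def fetch_allowed(url: str, allowed_hosts: set[str]) -> bool:
    """Allow only HTTPS URLs whose host is an allowed host or a subdomain of one."""
    try:
        parts = urlsplit(url)
    except ValueError:  # malformed URL
        return False
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower()
    # Compare on a label boundary so "example.com.evil.test" does not match "example.com".
    return any(host == h or host.endswith("." + h) for h in allowed_hosts)
```

Matching on the parsed hostname instead of a substring of the URL is the design choice that matters here: a substring check would accept a lookalike host that merely contains an allowed name.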

Practical rollout criteria for teams

Teams should set different Run Mode defaults by task category. Auto-review can fit lint fixes, test repairs, documentation edits, and isolated branch work. Production migrations, secret rotation, infrastructure configuration, database schema changes, and deployment automation need direct approval and stricter allowlists. The same repository may need different modes depending on the work being delegated.

Allowlists also need regular review. A command that looks low-risk can change meaning in a specific project. pnpm build is usually a reasonable candidate, but package scripts and dependency hooks can have filesystem or network effects. git has read-only commands and history-rewriting commands. An allowlist should be maintained like a policy document, not a personal list of conveniences.
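One review step that makes the `pnpm build` point tangible is to list what a script name could expand to in `package.json`. The helper below is a sketch with a hypothetical name; whether `pre` and `post` hooks actually run depends on the package manager and its configuration, so the output is a list of things to inspect, not a statement of what will execute.

```python
import json


def scripts_that_may_run(package_json_text: str, script: str) -> list[tuple[str, str]]:
    """Return (name, command) pairs for `script` and any pre/post hooks defined for it."""
    scripts = json.loads(package_json_text).get("scripts", {})
    names = [f"pre{script}", script, f"post{script}"]
    return [(name, scripts[name]) for name in names if name in scripts]


example = '{"scripts": {"prebuild": "node fetch-assets.js", "build": "tsc"}}'
print(scripts_that_may_run(example, "build"))
```

In this made-up example, allowlisting the build command could also mean running a script that downloads assets, which is exactly the kind of drift a periodic allowlist review is meant to catch.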

MCP tools should be split between read-only and write-capable operations. A read-only docs search tool can be a candidate for automatic execution. Jira, GitHub, database, cloud-provider, and observability tools that can write or trigger side effects should require separate approval and auditability. Cursor's MCP Allowlist also puts pressure on MCP server authors to expose clearer metadata about tool permissions and side effects.

Classifier instructions should not be mistaken for controls. Custom instructions help the classifier make a judgment, but they are still instructions to a non-deterministic model. "Never delete files" is weaker than File-Deletion Protection. "Do not edit files outside the repository" is weaker than External-File Protection. Deterministic controls should encode the hard boundaries; classifier instructions should refine behavior inside them.
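The difference between an instruction and a control can be shown in a few lines. In this sketch the operation labels and the `classifier` callable are hypothetical stand-ins: the hard guard is evaluated first, so a permissive classifier verdict cannot override it.

```python
from typing import Callable

# Hypothetical operation labels guarded by deterministic protections.
HARD_GUARDED_OPERATIONS = frozenset({"delete_file", "edit_external_file"})


def decide_with_guards(operation: str, classifier: Callable[[str], str]) -> str:
    """Deterministic guards run first; the classifier only sees what they let through."""
    if operation in HARD_GUARDED_OPERATIONS:
        return "ask_user"
    return classifier(operation)


def overly_permissive_classifier(operation: str) -> str:
    # Stands in for a model that misjudges a call despite its instructions.
    return "allow"
```

Even with a classifier that allows everything, `decide_with_guards("delete_file", overly_permissive_classifier)` returns `"ask_user"`, whereas an instruction such as "Never delete files" would have depended on the model following it.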

The next axis in coding-agent competition

Cursor Auto-review Run Mode is not a frontier-model announcement. It is not a benchmark jump or a million-token context-window release. It is still important because long-running coding agents succeed or fail on execution policy as much as generation quality. Agents need tool calls to repair real repositories. More tool calls create more approval prompts. Fewer prompts create more risk unless sandboxing and policy improve at the same time.

This release brings developer experience and security administration onto the same screen. Developers want fewer interruptions. Security teams want fewer automatic side effects. Cursor inserts a classifier subagent between those goals, but Cursor's own docs keep the boundary clear: the classifier is convenience; allowlists, sandboxing, protection settings, and enterprise controls are the real controls.

When teams compare coding agents, model choice is no longer enough. They need to ask how Shell access is restricted, how MCP tools are approved, whether workspace escapes are blocked, whether dotfile edits are protected, whether file deletion is guarded, and whether admins can prevent users from enabling a run-everything mode. Cursor 3.6 turns those questions from hidden operational concerns into product-surface decisions.
