Devlery

Claude self-hosted sandboxes set new rules for private MCP access

Anthropic added self-hosted sandboxes and MCP tunnels to Claude Managed Agents, shifting tool execution and private tool access into enterprise-controlled boundaries.

AI summary
  • What happened: Anthropic added self-hosted sandboxes and MCP tunnels to Claude Managed Agents.
    • The update was announced on May 19, 2026; self-hosted sandboxes are in public beta, while MCP tunnels are a research preview.
  • Architecture change: Anthropic can keep managing the agent loop while tool execution and network egress move into customer infrastructure.
  • Developer impact: Teams connecting private files, internal APIs, and MCP servers to agents now have to design workers, outbound tunnels, and audit boundaries.
    • Every Managed Agents API request still needs the managed-agents-2026-04-01 beta header.
  • Watch: Managed Agents are stateful by design and are not currently eligible for Zero Data Retention or HIPAA BAA coverage.

Anthropic published a Claude Managed Agents update on May 19, 2026. The two additions are self-hosted sandboxes and MCP tunnels. Self-hosted sandboxes are available as a public beta. MCP tunnels require an access request and are described as a research preview. The product names are less important than the operational split they introduce: the agent can remain managed by Anthropic while tool execution and private tool access move closer to the enterprise boundary.

This is not just another "Claude can use more tools" release. Anthropic says Claude Managed Agents can operate in customer-controlled sandboxes and connect to private Model Context Protocol servers. Agent orchestration, context management, and error recovery can remain on Anthropic infrastructure, while Bash execution, file access, network calls, and tool invocation can run inside customer infrastructure or a sandbox provider such as Cloudflare, Daytona, Modal, or Vercel.

For AI engineering teams, the new question is not only which model to call. It is where the execution boundary sits. Once an agent runs Bash, reads and writes files, fetches the web, installs packages, or calls an MCP server, security review expands beyond prompt-injection defenses. Teams have to decide which filesystem is visible, who controls network egress, whether internal APIs need public endpoints, how credentials are injected, and how long session history and outputs are retained.

Self-hosted sandbox architecture diagram from Anthropic.

What Managed Agents keeps and what it delegates

Claude Managed Agents is a different surface from the Messages API. Anthropic's documentation presents the Messages API as the lower-level model prompting interface, while Managed Agents is a configurable agent harness for long-running and asynchronous work. An Agent contains the model, system prompt, tools, MCP servers, and skills. An Environment determines whether a session runs in an Anthropic-managed cloud sandbox or a self-hosted sandbox.

The core concepts are Agent, Environment, Session, and Events. A Session is an instance of an agent running in a particular environment. Events are the bidirectional messages that carry user turns, tool results, status updates, spans, and agent activity. The session event stream documentation describes sending user events and receiving session events, span events, and agent events to track progress. Every Managed Agents API request requires the managed-agents-2026-04-01 beta header.
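
As a minimal sketch of the header requirement, the Python helper below builds the headers for a Managed Agents request. Only the beta value managed-agents-2026-04-01 comes from the documentation cited above. The header names (x-api-key, anthropic-version, anthropic-beta) and the comma-separated beta list are assumed from the general Anthropic API convention, and the function name is hypothetical. No endpoint paths are shown because this article does not list them.

```python
MANAGED_AGENTS_BETA = "managed-agents-2026-04-01"  # value quoted in the article


def managed_agents_headers(api_key: str, extra_betas: tuple = ()) -> dict:
    """Build request headers that always include the Managed Agents beta flag.

    Header names are an assumption based on the general Anthropic API
    convention, not something this article confirms for every endpoint.
    """
    if not api_key:
        raise ValueError("api_key must be non-empty")
    # Put the required beta first and avoid listing it twice.
    betas = [MANAGED_AGENTS_BETA]
    betas += [b for b in extra_betas if b != MANAGED_AGENTS_BETA]
    return {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
        "anthropic-beta": ",".join(betas),
        "content-type": "application/json",
    }
```

Centralizing the header in one helper keeps the beta flag from being forgotten on individual calls, and gives a single place to change when the beta identifier is revised.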

The listed tool surface is broad enough to make the runtime boundary material. Anthropic documents Bash, file operations, web search and fetch, and MCP servers. With that set, an agent is not merely answering questions. It can act like a remote worker: inspect a repository, run tests, install dependencies, generate outputs, and create an issue through an internal ticketing MCP server. At that point, credential handling, filesystem scope, and network policy become part of the product architecture.

Why self-hosted sandboxes matter

The self-hosted sandbox documentation states the main distinction directly. Managed Agents usually execute tools and code in Anthropic-managed cloud sandboxes. Self-hosted sandboxes keep orchestration on Anthropic's side but move tool execution into infrastructure controlled by the customer. The comparison table in the docs follows the same line: cloud environments run tools in Anthropic-managed sandboxes; self-hosted sandboxes run them in customer infrastructure.

The difference shows up immediately in data residency and compliance review. Agent jobs that touch internal files, private packages, database clients, or private APIs are hard to place inside an external cloud sandbox. With a self-hosted sandbox, network reach follows the customer's network policy rather than Anthropic's egress control. File and GitHub repository mounting also moves out of Anthropic's managed layer and into the customer's sandbox image and orchestration layer.

Anthropic frames this as keeping execution within the enterprise boundary. Sensitive files and services stay in customer infrastructure or in a managed sandbox provider environment, while Anthropic continues to operate the agent loop. That preserves a core advantage of managed agents: developers do not need to build the entire agent runtime themselves. But it also means the OS image, installed packages, secrets, network rules, and output paths become separately governed infrastructure.

| Item | Cloud environment | Self-hosted sandbox |
|---|---|---|
| Tool execution location | Anthropic-managed sandbox | Customer infrastructure or sandbox provider |
| Network policy | Anthropic egress control | Customer network policy |
| File and repo mounting | Handled by Managed Agents | Handled through sandbox image and metadata |
| Operating responsibility | Anthropic lifecycle | Worker fleet, queue, image, and logging operations |

The worker becomes a new operating surface

Self-hosted sandboxes are not just "containers in the customer's VPC." The documentation introduces an environment worker. The worker claims work items from Anthropic's queue, creates a session-specific execution context, downloads agent skills into /workspace/skills/<name>/, executes tool calls, and sends results back to Anthropic. Work can be processed through polling or by a webhook-triggered handler.
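
The claim, execute, and report cycle described above can be sketched in Python with an in-memory stand-in for the queue. Everything here is hypothetical: WorkItem, InMemoryQueue, and run_worker are illustrative names, not Anthropic's SDK or API, and a real worker would also create the session-specific context and download skills before executing anything.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class WorkItem:
    session_id: str
    tool: str
    arguments: dict


class InMemoryQueue:
    """Stand-in for the remote work queue; not Anthropic's API."""

    def __init__(self, items):
        self._items = deque(items)
        self.results = []

    def claim(self) -> Optional[WorkItem]:
        # Return the next item, or None when nothing is queued.
        return self._items.popleft() if self._items else None

    def report(self, item: WorkItem, result: dict) -> None:
        self.results.append((item.session_id, result))


def run_worker(queue: InMemoryQueue, handlers: dict) -> int:
    """Drain the queue once: claim, execute, report. Returns items processed."""
    processed = 0
    while (item := queue.claim()) is not None:
        handler: Optional[Callable[[dict], str]] = handlers.get(item.tool)
        if handler is None:
            result = {"ok": False, "error": f"unsupported tool: {item.tool}"}
        else:
            try:
                result = {"ok": True, "output": handler(item.arguments)}
            except Exception as exc:  # report the failure instead of crashing
                result = {"ok": False, "error": str(exc)}
        queue.report(item, result)
        processed += 1
    return processed
```

The point of the sketch is the failure handling: an unknown tool or a raised exception still produces a reported result, so the orchestration side is never left waiting on a worker that died silently.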

That design creates a new operations surface for development teams. When an agent feels slow, the cause may be model latency, Anthropic queue backlog, worker fleet capacity, sandbox image cold start, an internal package registry issue, or a failed output upload. The docs mention work.stats, which returns queue depth, pending work, the oldest queued timestamp, and the count of polling workers seen over the last 30 seconds. Even with Managed Agents, self-hosted mode turns SRE metrics into part of the agent product.
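
A hedged Python sketch of how those values could feed an alert follows. The dictionary keys (queue_depth, pending, oldest_queued_at, polling_workers) are hypothetical names for the values the docs say work.stats returns, and the thresholds are arbitrary assumptions a team would tune for its own workload.

```python
from datetime import datetime, timedelta, timezone


def assess_queue(stats: dict, now: datetime,
                 max_wait: timedelta = timedelta(minutes=2),
                 max_items_per_worker: int = 10) -> list:
    """Return human-readable warnings for one stats snapshot.

    Key names and thresholds are illustrative assumptions.
    """
    warnings = []
    workers = stats["polling_workers"]
    depth = stats["queue_depth"]
    if workers == 0 and depth > 0:
        warnings.append("work is queued but no worker polled recently")
    oldest = stats.get("oldest_queued_at")
    if oldest is not None and now - oldest > max_wait:
        waited = int((now - oldest).total_seconds())
        warnings.append(f"oldest item has waited {waited}s")
    if workers > 0 and depth / workers > max_items_per_worker:
        warnings.append("queue depth per polling worker is above threshold")
    return warnings


if __name__ == "__main__":
    snapshot = {
        "queue_depth": 5,
        "pending": 1,
        "oldest_queued_at": datetime.now(timezone.utc) - timedelta(minutes=5),
        "polling_workers": 0,
    }
    for line in assess_queue(snapshot, datetime.now(timezone.utc)):
        print(line)
```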

Dependencies are not a footnote. The SDK helper requires /bin/bash at that exact path. The TypeScript SDK helper needs unzip, tar, and Node.js 22 or later. The default working directory is /workspace, and final output files should be written under /mnt/session/outputs. A security team that wants a minimal base image and an engineering team that wants agents to use many build tools will have to agree on a concrete runtime image rather than treating the sandbox as an abstract safety layer.
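
One way to make that agreement concrete is a preflight check baked into the image build. The Python sketch below tests only the requirements named above; the function names are hypothetical, and whether /workspace and /mnt/session/outputs must exist before a session starts is an assumption rather than something the documentation quoted here states.

```python
import os
import re
import shutil
import subprocess
from typing import Optional


def parse_node_major(version_output: str) -> Optional[int]:
    """Extract the major version from `node --version` output such as 'v22.3.0'."""
    match = re.match(r"v?(\d+)\.", version_output.strip())
    return int(match.group(1)) if match else None


def preflight(typescript_helper: bool = True) -> list:
    """Return a list of problems found in the current runtime image."""
    problems = []
    # The SDK helper expects bash at this exact path, not just on PATH.
    if not (os.path.isfile("/bin/bash") and os.access("/bin/bash", os.X_OK)):
        problems.append("/bin/bash is missing or not executable")
    if typescript_helper:
        for tool in ("unzip", "tar"):
            if shutil.which(tool) is None:
                problems.append(f"{tool} not found on PATH")
        node = shutil.which("node")
        if node is None:
            problems.append("node not found on PATH")
        else:
            out = subprocess.run([node, "--version"],
                                 capture_output=True, text=True).stdout
            major = parse_node_major(out)
            if major is None or major < 22:
                problems.append(f"Node.js 22+ required, found {out.strip() or 'unknown'}")
    # Assumption: both directories should already exist in the image.
    for directory in ("/workspace", "/mnt/session/outputs"):
        if not os.path.isdir(directory):
            problems.append(f"{directory} does not exist")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("preflight:", problem)
```

Running a check like this in CI turns the security and engineering negotiation into a reviewable diff of the image rather than a discovery at session time.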

What MCP tunnels solve

MCP tunnels address a different problem from self-hosted sandboxes. The sandbox decides where tool execution runs. MCP tunnels decide how Anthropic or an agent session reaches an internal MCP server. Anthropic's announcement describes a lightweight gateway deployed inside the customer's network. The gateway creates a single outbound connection, avoiding inbound firewall rules or public endpoints while carrying end-to-end encrypted traffic.

That directly targets a recurring burden for teams operating MCP servers. If an internal database, private API, knowledge base, or ticketing system becomes an MCP tool, the agent needs a path to it. Opening a public endpoint creates authentication, IP allowlist, WAF, rate-limit, audit-log, and prompt-injection-triggered tool-call problems. A tunnel reduces public exposure. It does not eliminate the need to decide what gets logged, how tool arguments are redacted, or which calls require approval.
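
As an illustration of the redaction decision, the Python sketch below masks sensitive fields in tool arguments before they are written to a log. The key list and the redact function are assumptions for this example; a real deployment would derive the list from each tool's schema and data classification.

```python
# Illustrative list of field names to mask; not taken from any MCP specification.
SENSITIVE_KEYS = frozenset({"password", "token", "api_key", "authorization", "secret"})


def redact(value, sensitive=SENSITIVE_KEYS):
    """Return a copy of a JSON-like value with sensitive fields masked.

    The input is never mutated, so the real arguments can still be sent
    to the tool while the redacted copy goes to the log.
    """
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if str(key).lower() in sensitive else redact(item, sensitive)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item, sensitive) for item in value]
    return value


if __name__ == "__main__":
    call_args = {"query": "open tickets", "auth": {"Token": "abc123"}}
    print(redact(call_args))
```

Matching on field names catches only secrets that arrive in labeled fields. Free-text arguments that embed credentials or personal data need separate handling.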

Anthropic says MCP tunnels work with both Managed Agents and the Messages API. Administration lives in Claude Console workspace settings and is handled by organization admins. That matters because an MCP server moving from a developer's local experiment into organization workspace settings becomes a platform asset. Tool registration, approval, deprecation, key rotation, schema changes, and rollback all become operational processes.

MCP tunnel architecture diagram from Anthropic

Data retention has not gone away

The security reading of the announcement can miss one important sentence in the Managed Agents overview. Anthropic says Managed Agents are stateful by design. Sessions are long-running, can pause and resume, and store conversation history, sandbox state, and outputs server-side. Because of that design, Managed Agents are not currently eligible for Zero Data Retention or HIPAA Business Associate Agreement coverage.

Self-hosted sandboxes move the location of tool execution, but they do not erase the stateful nature of Managed Agents. Even when files and internal APIs remain inside the customer's boundary, teams still need to review event history, session metadata, tool result summaries, output handling, and deletion behavior. The documentation says users can delete sessions and uploaded files through the API. That is materially different from a guarantee that the data is never stored.

For healthcare, finance, public-sector, and other regulated workloads, this distinction can decide adoption. A team that needs a HIPAA BAA cannot treat "self-hosted sandbox" as equivalent to eligible data processing. A team that requires Zero Data Retention faces the same issue. Moving execution closer to the customer improves one boundary, but it does not automatically satisfy every contractual or compliance requirement.

Sandbox providers become part of the agent stack

Anthropic names Cloudflare, Daytona, Modal, and Vercel as supported providers. Cloudflare emphasizes microVMs and isolates, zero-trust secret injection, and proxy-based egress control. Daytona emphasizes long-running stateful computers, SSH, authenticated preview URLs, pause, and restore. Modal describes an AI workload cloud platform, custom container runtime, and CPU and GPU resources. Vercel points to VM security, VPC peering, bring your own cloud, and millisecond startup time.

That list suggests Anthropic is not trying to own the entire agent runtime. Claude Managed Agents owns the model-facing agent loop. Sandbox partners own parts of execution isolation. For buyers, the comparison set changes. OpenAI Agents SDK, Google's managed agent surfaces, an internal Kubernetes worker, and partner sandboxes now compete on startup latency, repository mounting, egress control, secret injection, GPU availability, audit exports, and restore behavior rather than model name alone.

There is also a switching-cost angle. Systems built around Managed Agents' Agent, Environment, Session, and Events model may need nontrivial remapping if they later move to a different provider. Session state, event schema, tool result format, and skill distribution become coupling points. Some developer reactions around managed harnesses focus on this tradeoff: the more the harness handles state, tracing, and sandboxing, the more costly it can become to leave.

Cost and rate limits need their own design

The Managed Agents overview documents organization-level rate limits. Create-style endpoints are limited to 300 requests per minute. Read, list, and stream-style endpoints are limited to 600 requests per minute. Spend limits and tier-based limits also apply. If an internal agent product is opened to thousands of employees, those numbers become capacity-planning inputs, not just API documentation details.
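
A back-of-the-envelope calculation shows how quickly those limits become binding. The Python sketch below uses the two documented limits; the per-user request rates and the peak factor are invented inputs for illustration, and the function name is hypothetical.

```python
CREATE_LIMIT_PER_MIN = 300  # create-style endpoints, per the overview
READ_LIMIT_PER_MIN = 600    # read, list, and stream-style endpoints


def peak_utilization(users: int, creates_per_user_hour: float,
                     reads_per_user_hour: float, peak_factor: float = 3.0) -> dict:
    """Estimate peak requests per minute as a fraction of each org-level limit.

    peak_factor scales the hourly average up to a busy-minute estimate;
    it is an assumption, not a measured value.
    """
    create_rpm = users * creates_per_user_hour / 60 * peak_factor
    read_rpm = users * reads_per_user_hour / 60 * peak_factor
    return {
        "create": create_rpm / CREATE_LIMIT_PER_MIN,
        "read": read_rpm / READ_LIMIT_PER_MIN,
    }


if __name__ == "__main__":
    print(peak_utilization(users=3000, creates_per_user_hour=2, reads_per_user_hour=12))
```

Under these assumed inputs, 3,000 users would sit right at the create limit and would need three times the read limit at peak, which points toward caching, less frequent polling, or a shared event fan-out layer.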

Cost also splits into layers. One layer is Anthropic model and Managed Agents usage. The other is the cost of self-hosted workers and sandbox providers. An agent that mixes long builds, browser automation, image generation, and data analysis may spend more on sandbox compute and queueing than on model tokens. Anthropic's emphasis on customer-controlled resource sizing and runtime images reflects that split.

Teams should observe cost at the tool-call level. Which sessions consume model tokens? Which sessions hold worker CPU for a long time? Which MCP server calls trigger retries? Without that breakdown, agent operating costs are hard to explain internally. Session events and span events are a useful starting point, but organizational chargeback and debugging still need tags, trace correlation, and provider-level metrics.
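
A minimal Python sketch of that breakdown follows. The event shapes (type, session_id, tokens, duration_s, retries) are hypothetical records a team might derive from session and span events plus its own worker metrics; they are not the actual Managed Agents event schema.

```python
from collections import defaultdict


def summarize_sessions(events: list) -> dict:
    """Aggregate model tokens, tool wall-clock time, and retries per session.

    Event field names are illustrative assumptions.
    """
    totals = defaultdict(lambda: {"tokens": 0, "tool_seconds": 0.0, "mcp_retries": 0})
    for event in events:
        row = totals[event["session_id"]]
        if event["type"] == "model_usage":
            row["tokens"] += event["tokens"]
        elif event["type"] == "tool_span":
            row["tool_seconds"] += event["duration_s"]
            row["mcp_retries"] += event.get("retries", 0)
    return dict(totals)


if __name__ == "__main__":
    sample = [
        {"session_id": "a", "type": "model_usage", "tokens": 1200},
        {"session_id": "a", "type": "tool_span", "duration_s": 42.5, "retries": 2},
        {"session_id": "b", "type": "tool_span", "duration_s": 3.0},
    ]
    for session_id, row in summarize_sessions(sample).items():
        print(session_id, row)
```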

Agent security becomes more concrete

Agent security discussions often group prompt injection, tool permissions, sandbox escapes, and secret leakage into broad categories. Anthropic's update breaks those categories into smaller implementation decisions. Is tool execution in a cloud sandbox or a self-hosted sandbox? Is the private MCP server exposed as a public endpoint or reached through an outbound tunnel? Where does session state remain? How long are checkpoints kept? Does a worker host contain an organization-scoped API key or only environment-specific credentials?

The self-hosted sandbox documentation is especially careful about credential placement. It says monitoring and operations endpoints authenticate with an organization API key, but warns that placing ANTHROPIC_API_KEY on the worker host could expose organization-scoped credentials to agent tool calls. That is the kind of failure mode that turns a product feature into a security incident. The boundary works only if key placement and process isolation are correct.
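
As a defense-in-depth sketch, a worker can build each tool subprocess's environment from an allowlist so that nothing is inherited by accident. The Python below is an assumption-laden illustration: tool_env, run_tool, and the allowlist contents are hypothetical, and scrubbing the environment does not replace the documented advice, since a key stored anywhere on the worker host may still be readable by a tool call through files or other processes.

```python
import os
import subprocess

BLOCKED_VARS = frozenset({"ANTHROPIC_API_KEY"})         # named in the docs' warning
DEFAULT_ALLOWED = frozenset({"PATH", "HOME", "LANG"})   # illustrative allowlist


def tool_env(parent: dict, allowed=DEFAULT_ALLOWED) -> dict:
    """Allowlist-based child environment: only named variables pass through.

    Blocked names are dropped even if someone adds them to the allowlist.
    """
    return {
        name: value
        for name, value in parent.items()
        if name in allowed and name not in BLOCKED_VARS
    }


def run_tool(argv: list, allowed=DEFAULT_ALLOWED) -> str:
    """Run one tool command with a scrubbed environment and return its stdout."""
    result = subprocess.run(
        argv,
        env=tool_env(dict(os.environ), allowed),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

An allowlist is used instead of a blocklist on purpose: new secrets added to the host later are excluded by default rather than leaked by default.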

MCP tunnels have the same pattern. Not exposing an MCP server to the public internet is a strong improvement. But when an agent calls an internal tool, teams still need to know which user identity is attached, which input arguments were sent, which result summaries were stored, and which error messages reached logs. For tools connected to ticketing, HR, customer data, or production operations, "inside a tunnel" is not enough. Redaction, access review, replay prevention, and approval steps remain necessary.

A practical checklist for teams

First, classify agent work as read-only, write, or admin action. File search and document summarization may be acceptable in a cloud sandbox. Private repository builds, internal API calls, and production ticket changes are more likely to require a self-hosted sandbox or tunnel. Without that classification, every agent session inherits the same cost and security model.
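
The classification can be encoded as a small policy function so that it is reviewed like any other code. The Python sketch below is one possible policy, not a recommendation from Anthropic: the enum, the placement function, and the rule that any write or admin action goes to a self-hosted environment are assumptions for illustration.

```python
from enum import Enum


class ActionClass(Enum):
    READ_ONLY = "read_only"
    WRITE = "write"
    ADMIN = "admin"


def placement(action: ActionClass, touches_private_network: bool) -> dict:
    """Map a task class to an execution environment and an approval rule.

    Illustrative policy: read-only work that stays off the private network
    may use the cloud sandbox; everything else runs self-hosted, and admin
    actions additionally require human approval.
    """
    self_hosted = touches_private_network or action is not ActionClass.READ_ONLY
    return {
        "environment": "self_hosted" if self_hosted else "cloud",
        "requires_approval": action is ActionClass.ADMIN,
    }


if __name__ == "__main__":
    print(placement(ActionClass.READ_ONLY, touches_private_network=False))
    print(placement(ActionClass.ADMIN, touches_private_network=True))
```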

Second, treat the worker fleet as product infrastructure. Queue depth, pending work, oldest queued timestamp, polling worker count, sandbox startup latency, and output upload failures need monitoring. A user report that "Claude is slow" should not automatically become a model issue. In self-hosted mode, the product is a distributed system spanning Anthropic-managed orchestration and customer-managed execution.

Third, make data retention and contract terms explicit before adopting the product for regulated workloads. Managed Agents are not currently eligible for Zero Data Retention or HIPAA BAA coverage, and self-hosted sandboxes do not automatically change that. Buyers should review session history, sandbox state, outputs, uploaded file deletion APIs, retention policies, and audit exports before putting sensitive workflows into the system.

Fourth, manage MCP servers as organization assets. Even when MCP tunnels avoid public endpoints, teams still need tool schema ownership, auth scopes, allowed callers, logging, redaction, rate limits, review, and rollback. An MCP tool can grant broader operational access than a single internal API key. A schema change can alter agent behavior, so deployment review belongs in the platform process.
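
One lightweight way to bring schema changes into deployment review is to fingerprint each approved tool schema and flag any drift. The Python sketch below assumes tool schemas are available as JSON-like dictionaries; schema_fingerprint and changed_tools are hypothetical helpers, not part of MCP or any Anthropic tooling.

```python
import hashlib
import json


def schema_fingerprint(schema: dict) -> str:
    """Hash a schema in canonical form so key order does not matter."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_tools(approved: dict, current: dict) -> list:
    """Names of tools that are new or whose schema differs from the approved hash.

    approved maps tool name to fingerprint; current maps tool name to schema.
    """
    return sorted(
        name
        for name, schema in current.items()
        if approved.get(name) != schema_fingerprint(schema)
    )


if __name__ == "__main__":
    ticket_schema = {"type": "object", "properties": {"id": {"type": "string"}}}
    approved = {"create_ticket": schema_fingerprint(ticket_schema)}
    live = {"create_ticket": {**ticket_schema, "required": ["id"]}}
    print(changed_tools(approved, live))
```

A non-empty result can block a rollout until a platform owner re-approves the tool, which makes the review step enforceable rather than advisory.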

The question Anthropic is putting on the table

Claude Managed Agents' new features show where enterprise agent runtime design is moving. Model companies are packaging agent loops, context management, event streaming, and tool result handling into managed products. Customers want tighter control over execution location, network reach, internal tools, secrets, and compliance boundaries. Self-hosted sandboxes and MCP tunnels sit at the meeting point of those two demands.

The structure is not a complete answer. As long as orchestration remains with Anthropic, session state and event history remain part of security and procurement review. Moving tool execution into customer infrastructure increases control, but it also adds worker operations and cost responsibility. Connecting private MCP servers through tunnels reduces public exposure, but it raises the bar for audit and approval around internal tool calls.

The practical takeaway for developers is to ask "where does this run, what network does it cross, and what gets recorded?" before choosing the model. Claude Managed Agents' self-hosted sandboxes and MCP tunnels turn those questions into product options. The options are useful, but safe operation still depends on the quality of environment design, logging design, and permission design.