Codex Windows sandbox sets the baseline for local agent security

OpenAI disclosed the Codex Windows sandbox design, moving local coding agent security from app isolation to OS-level execution boundaries.

AI summary
  • What happened: OpenAI published the Windows sandbox architecture that lets Codex run local coding tasks with tighter OS-level limits.
    • The May 13, 2026 engineering post combines restricted tokens, ACLs, dedicated local users, and Windows Firewall rules.
  • Why it matters: Network blocking moves beyond proxy environment variables and into operating-system principals and firewall policy.
  • Builder impact: Local coding agents are now judged not only by model quality, but by command execution boundaries and approval UX.
  • Watch: Strong isolation still has to coexist with real developer tools, caches, credentials, and enterprise endpoint policy.

OpenAI has published the Windows sandbox design behind Codex. At first glance this looks like a product update: Codex can run more safely on Windows. The more interesting story is deeper. Once a coding agent leaves the browser and starts running commands on a developer machine, what should the operating system be asked to guarantee? Which boundary belongs in an app sandbox, a virtual machine, a container, a firewall rule, an approval prompt, or a local user account? OpenAI's May 13, 2026 engineering post is a practical answer to that question.

Codex runs through a CLI, IDE integrations, and desktop flows where it can execute commands on behalf of a developer. It reads files, edits source code, runs tests, invokes package managers, creates branches, and sometimes starts Python, Node, PowerShell, Git, or build tooling. That workload does not fit the older mental model of "one app with a fixed permission set." A coding agent is closer to a delegated developer workflow. It is an actor that composes many tools at runtime.

That distinction matters because local agent security is often discussed as if it were only a model alignment problem or only a containerization problem. In a real development environment, the agent needs enough access to be useful. It must work against the actual checkout, use the developer's installed tools, and reproduce failures in the same project context. But if it inherits the real user's full authority, one bad command can reach home directories, credentials, private source code, network endpoints, or corporate systems.

[Figure: Codex Windows sandbox architecture summary]

Windows did not have the exact button Codex wanted

OpenAI evaluated three obvious Windows options first: AppContainer, Windows Sandbox, and Mandatory Integrity Control. Each one provides isolation, but each missed a different part of the Codex workload.

AppContainer is a native Windows sandboxing mechanism. It works well when an application knows its required capabilities ahead of time. Codex is different. It needs to launch unpredictable project-specific tools: Git, Python, Node, PowerShell, test runners, language toolchains, and whatever a repository expects. AppContainer offered a strong boundary, but the shape was too narrow for an open-ended developer workflow.

Windows Sandbox offered a stronger boundary through a disposable lightweight virtual machine. From a security perspective, that is attractive. The product problem is that Codex needs to work with the user's real checkout, real toolchain, and real development environment. Rebuilding or bridging that environment inside a fresh desktop VM would add friction, and Windows Sandbox is not available on every Windows edition. A coding assistant that works only when the platform SKU is right is hard to make the default experience.

Mandatory Integrity Control looked promising because a process can run at a lower integrity level while selected paths are relabeled as writable. OpenAI rejected that direction because it changes the meaning of the host filesystem. Marking a workspace as low integrity does not mean only Codex can write it. It means low-integrity processes generally see a different trust shape for that path. That is a broad change to make on a developer's real checkout.

Candidate | Strength | Limitation for Codex
--- | --- | ---
AppContainer | Strong native app isolation | Too constrained for arbitrary developer tools and open workflows
Windows Sandbox | Strong VM-based boundary | Hard to use directly with the real checkout and toolchain; not available on Windows Home
MIC integrity label | Can restrict writes without a separate VM | Changes the trust semantics of the workspace itself
Elevated sandbox (the design OpenAI chose) | Combines OS-level write and network controls | Requires setup work and Windows-specific implementation complexity

This evaluation shows a pattern that will repeat across AI agent security. The strongest isolation is not automatically the best product. The most convenient execution model is not automatically an acceptable risk. Coding agents sit between those poles. They need a boundary open enough to let real work happen, and closed enough to make mistakes and malicious tool behavior survivable.

The first prototype handled files better than network

OpenAI's first working prototype was an unelevated sandbox. It did not require administrator privileges. The key Windows primitives were SIDs and write-restricted tokens. A SID (security identifier) is the identity Windows uses when checking permissions. A synthetic SID does not correspond to a real user or group, but it can still appear in ACLs. OpenAI created a synthetic SID for the Codex sandbox and granted it write access only on configured writable paths.

The write-restricted token is the important piece. A normal token asks whether the user can write to a file. A write-restricted token adds a second check for write access: the restricting SID must also be granted permission in the file's ACL. Reads are still evaluated against the user's ordinary identity. That gives Codex a useful model. It can read broadly enough to understand the project, while writes are limited to the workspace or other explicitly allowed roots.
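The two-pass write check can be modeled in a few lines. This is a toy simulation, not the Windows API: the SID strings, the `Acl` and `Token` types, and `can_access` are hypothetical names chosen for illustration, and real access checks also involve deny entries, privileges, and integrity levels that are ignored here.

```python
from dataclasses import dataclass, field


@dataclass
class Acl:
    """Toy ACL: maps a SID string to the set of rights it is granted."""
    grants: dict = field(default_factory=dict)

    def allows(self, sids, right):
        # True if any SID in the set is granted the requested right.
        return any(right in self.grants.get(sid, set()) for sid in sids)


@dataclass
class Token:
    user_sids: set                                       # ordinary identities
    restricting_sids: set = field(default_factory=set)   # empty = normal token


def can_access(token, acl, right):
    """Reads use only the normal SIDs; writes must also pass the restricting SIDs."""
    if not acl.allows(token.user_sids, right):
        return False
    if right == "write" and token.restricting_sids:
        return acl.allows(token.restricting_sids, right)
    return True


USER = "S-USER"                # hypothetical placeholder, not a real SID
SANDBOX = "S-CODEX-SANDBOX"    # hypothetical synthetic SID

# The workspace ACL grants write to the synthetic SID; the home directory does not.
workspace = Acl({USER: {"read", "write"}, SANDBOX: {"write"}})
home_dir = Acl({USER: {"read", "write"}})

normal = Token({USER})
sandboxed = Token({USER}, {SANDBOX})
```

With these definitions, `can_access(sandboxed, home_dir, "read")` succeeds while `can_access(sandboxed, home_dir, "write")` fails, even though the same user could write there with a normal token.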

This answered part of the local-agent problem. It reduced the chance that a model-generated command would casually mutate unrelated directories. It also avoided the heavyweight cost of pushing everything into a VM. For coding tasks, that matters. A developer expects tests, formatters, build systems, and Git operations to run against the same files they are editing.

The weak point was network control. The first prototype suppressed network access through environment-variable proxy overrides. Many development tools honor HTTP_PROXY and HTTPS_PROXY, so this can be useful in practice. But it is not a security boundary. A malicious command can ignore proxy variables. A benign binary can use direct socket code. A dependency script can bypass the convention simply because it was never written to honor those environment variables.
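A small experiment shows why proxy variables are a convention rather than a boundary. The sketch below is an illustration, not anything from OpenAI's implementation: it points `HTTP_PROXY` at an address where nothing is expected to listen (port 9 on localhost is an arbitrary assumption), then opens a raw socket to a throwaway local server. The helper names `start_local_server` and `direct_fetch` are hypothetical.

```python
import os
import socket
import threading
import urllib.request

# Point the conventional proxy variables at a dead-end address.
os.environ["HTTP_PROXY"] = "http://127.0.0.1:9"
os.environ["http_proxy"] = "http://127.0.0.1:9"


def start_local_server():
    """Start a one-shot TCP server on a free localhost port and return the port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)

    def serve():
        conn, _ = server.accept()
        with conn:
            conn.sendall(b"hello")
        server.close()

    threading.Thread(target=serve, daemon=True).start()
    return server.getsockname()[1]


def direct_fetch(port):
    """Connect with a raw socket; nothing here consults the proxy variables."""
    chunks = []
    with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
        while True:
            data = sock.recv(16)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


# A proxy-aware library picks the override up from the environment...
proxy_aware_view = urllib.request.getproxies().get("http")

# ...but a direct socket connection never looks at it.
payload = direct_fetch(start_local_server())
```

The proxy-aware view reports the override, yet the direct connection still completes. Any tool that opens its own sockets behaves like `direct_fetch`, which is why enforcement has to live below the process.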

This is the core security issue in the post. For a local coding agent, network blocking is not a decorative feature. If the agent can read source code, logs, environment variables, or credentials, outbound network access becomes a possible exfiltration path. An approval UI that says "network disabled" only matters if the operating system enforces it across the relevant process tree.

The elevated sandbox changes the principal

OpenAI's current design is an elevated sandbox. It asks for administrator privileges during setup, but Codex itself remains a normal user process during day-to-day use. The setup creates two local users: CodexSandboxOffline and CodexSandboxOnline. The offline user is the one targeted by Windows Firewall rules that block outbound network access. The online user is used when a task is allowed to reach the network.

This choice is subtle. Windows Firewall could not target only the synthetic SID inside a restricted token in the way OpenAI needed. A rule based on the codex.exe binary would not reliably cover child processes such as python.exe, git.exe, or shell scripts. A rule based only on ports or addresses would also miss the policy intent. The goal was not "block port 443." The goal was "block arbitrary outbound access for this sandboxed execution tree unless the user has chosen an online mode."

So OpenAI changed the principal. Rather than running the requested command directly as the real user, Codex uses a sandbox-specific local user. Inside that context, a command runner creates the restricted token and launches the actual child process. Windows Firewall can then apply policy to the offline sandbox user, and child commands inherit that policy.
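The difference between a rule keyed on a binary and a rule keyed on a principal can be sketched with a toy process tree. This is a simplified model, not Windows Firewall: `Proc`, `blocked_by_image_rule`, `blocked_by_user_rule`, and the user name `alice` are hypothetical, and the only assumption is that child processes run as the same account as their parent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Proc:
    image: str                        # executable name, e.g. "python.exe"
    user: str                         # account the process runs as
    parent: Optional["Proc"] = None

    def spawn(self, image):
        # A child keeps the parent's account but has its own image name.
        return Proc(image=image, user=self.user, parent=self)


def blocked_by_image_rule(proc, blocked_image):
    """Rule keyed on one binary: matches only that executable."""
    return proc.image == blocked_image


def blocked_by_user_rule(proc, blocked_user):
    """Rule keyed on the account: matches every process running as it."""
    return proc.user == blocked_user


# Image-based approach: the harness runs as the real user and spawns tools.
harness = Proc("codex.exe", user="alice")
python_child = harness.spawn("python.exe")

# Principal-based approach: the runner runs as the offline sandbox user.
runner = Proc("codex-command-runner.exe", user="CodexSandboxOffline")
git_child = runner.spawn("git.exe")
script_child = git_child.spawn("sh.exe")
```

In this model the image rule catches `codex.exe` but misses `python_child`, while the user rule covers `git_child` and `script_child` without naming either binary.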

OpenAI's final architecture has four layers. codex.exe runs as the real user and manages the harness. codex-windows-sandbox-setup.exe performs the elevated setup. codex-command-runner.exe runs as the sandbox user and creates the restricted token. The child process is the actual tool the developer cares about: Git, Python, a package manager, a build system, or a test runner.
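One way to keep the layering straight is to write down who runs each layer for a single command. The sketch below is descriptive data only. The binary and account names come from the post, while the `Layer` type, `sandbox_user`, `launch_chain`, and the `"administrator"` label for the setup principal are assumptions made for illustration.

```python
from dataclasses import dataclass

OFFLINE_USER = "CodexSandboxOffline"
ONLINE_USER = "CodexSandboxOnline"


@dataclass(frozen=True)
class Layer:
    binary: str
    principal: str
    elevated: bool
    restricted_token: bool


def sandbox_user(network_allowed):
    """Pick the sandbox account that matches the session's network mode."""
    return ONLINE_USER if network_allowed else OFFLINE_USER


def launch_chain(real_user, tool, network_allowed):
    """Describe who runs each layer for one sandboxed command."""
    user = sandbox_user(network_allowed)
    return [
        # The harness stays a normal process owned by the real user.
        Layer("codex.exe", real_user, elevated=False, restricted_token=False),
        # The runner executes as the sandbox user and builds the restricted token.
        Layer("codex-command-runner.exe", user, elevated=False, restricted_token=False),
        # The tool itself runs under the restricted token.
        Layer(tool, user, elevated=False, restricted_token=True),
    ]


# Setup is the only elevated step and is not part of the per-command chain.
# The "administrator" principal label is an assumption.
SETUP = Layer("codex-windows-sandbox-setup.exe", "administrator",
              elevated=True, restricted_token=False)

offline_chain = launch_chain("alice", "git.exe", network_allowed=False)
online_chain = launch_chain("alice", "git.exe", network_allowed=True)
```

Nothing in the per-command chain is elevated, which mirrors the post's point that administrator rights are confined to setup.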

The implementation is necessarily busy. Setup creates synthetic identities, provisions online and offline sandbox users, stores credentials with the Windows Data Protection API (DPAPI), creates or verifies firewall rules, and grants read ACLs so the sandbox user can inspect enough of the real environment to work. OpenAI also notes that some of this setup is asynchronous to reduce how long users wait.

This is a delegated developer problem

The interesting part of the design is not that it produces a perfect sandbox. It redefines the security target. Traditional app security usually limits the authority of a known application. Browser tabs, mobile apps, and sandboxed desktop apps fit that shape. A coding agent is different. It is asked to perform developer work by invoking many other programs. The policy question is not just "can this app use the network?" It is "can this agent session use the network while running arbitrary child processes?"

That changes the language of control. The important question is not only whether one process can write a file. It is whether the agent can modify files outside the workspace it was assigned. It is not only whether a binary is allowed to connect outbound. It is whether a session with network disabled can still leak data through a child process. It is not only whether administrator rights are used. It is whether elevated setup is separated from normal command execution.

This also changes the product competition. Better models still matter. But if the agent constantly asks for approval on harmless reads, the workflow slows down. If the only way to make it useful is to turn on Full Access, the blast radius becomes hard to explain. Approval UX and OS boundaries start to feel like part of the model's capability because they determine how long and how independently the agent can work.

OpenAI's May 14, 2026 announcement that Codex can be controlled from the ChatGPT mobile app fits the same direction. Mobile control, remote SSH environments, and long-running threads make the agent easier to supervise away from the terminal. But as remote approval becomes more convenient, the default execution policy has to become more precise. A small-screen approval prompt is only trustworthy if the lower layers enforce what the prompt says.

Windows support is also an enterprise-readiness test

There is a market-expansion story here: Windows developers become first-class Codex users. But the more durable story is about agent runtime portability. Many AI coding tools start on macOS and Linux because the developer culture is already there and the sandboxing primitives are easier to compose. macOS has Seatbelt. Linux has seccomp, namespaces, and tools such as bubblewrap. Windows requires a different composition of ACLs, tokens, local users, firewall policy, UAC, and profile permissions.

That matters for enterprise adoption. Coding agents cannot stay limited to macOS-heavy startup teams. They have to run on Windows laptops, managed endpoints, corporate proxy environments, restricted accounts, endpoint detection systems, private registries, and internal authentication flows. In that world, "we have a sandbox" is not enough. Security teams will ask which principal runs commands, whether restrictions propagate to child processes, whether outbound network access is truly blocked, and which permissions are required during setup.

The community reaction reflects that tension. Some Windows users prefer to keep working through WSL. Others welcome OpenAI treating Windows as a serious target. Some report that strong sandboxing can block real work and needs to be disabled. That mix is predictable. Stronger boundaries reveal compatibility problems. Looser boundaries keep work moving but reduce the value of the sandbox.

What development teams should take from this

The first practical lesson is that Full Access should not become the default operating model for local agents. It may be useful for debugging or temporary escape hatches, but it is hard to defend as the normal path. A safer default is workspace-scoped writes, network disabled by default, and explicit approvals for exceptions.
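That default can be written as a small decision function. This is a minimal sketch of the policy shape, not Codex's configuration format: `SessionPolicy`, `Action`, `Decision`, and `decide` are hypothetical names, and allowing reads without a prompt is an assumption based on the read-broadly model described earlier.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"


@dataclass(frozen=True)
class Action:
    kind: str                    # "read", "write", or "network"
    in_workspace: bool = True    # only meaningful for file actions


@dataclass(frozen=True)
class SessionPolicy:
    network_enabled: bool = False   # network is off unless the user opts in
    full_access: bool = False       # escape hatch, off by default


def decide(policy, action):
    """Return ALLOW for in-policy actions and ASK for anything that is an exception."""
    if policy.full_access:
        return Decision.ALLOW
    if action.kind == "read":
        return Decision.ALLOW
    if action.kind == "write":
        return Decision.ALLOW if action.in_workspace else Decision.ASK
    if action.kind == "network":
        return Decision.ALLOW if policy.network_enabled else Decision.ASK
    # Unknown action kinds fall through to an explicit approval.
    return Decision.ASK


default_policy = SessionPolicy()
```

A decision function like this only describes intent. As the rest of the article argues, the OS still has to enforce the same answer for child processes.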

The second lesson is that proxy environment variables are not enough for network security. They are useful compatibility controls, not enforcement controls. If the agent can see sensitive files or logs, outbound access needs to be governed by the OS, firewall, network layer, or an equivalent policy mechanism that child processes cannot trivially ignore.

The third lesson is that the workspace boundary is more complex than it sounds. Developers say "only edit this repository," but real builds touch global caches, temp directories, language servers, package-manager state, credential helpers, IDE configuration, and sometimes generated files outside the repo. Make the write boundary too narrow and the agent fails constantly. Make it too wide and the sandbox becomes mostly symbolic.
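The width of the write boundary comes down to which roots are on the allow list and how paths are compared against them. The sketch below is an application-level illustration, not the ACL-based enforcement the post describes: `is_writable` is a hypothetical helper, and the `repo` and `cache` directories are throwaway examples created under a temp directory.

```python
import tempfile
from pathlib import Path


def is_writable(target, writable_roots):
    """Return True if target, after resolving symlinks and '..', lies under a root."""
    resolved = Path(target).resolve()
    for root in writable_roots:
        try:
            resolved.relative_to(Path(root).resolve())
            return True
        except ValueError:
            # Not under this root; try the next one.
            continue
    return False


# Throwaway layout: a repository plus one extra root for package-manager state.
base = Path(tempfile.mkdtemp())
repo = base / "repo"
cache = base / "cache"
repo.mkdir()
cache.mkdir()

narrow_roots = [repo]           # the repository only
wider_roots = [repo, cache]     # the repository plus an explicitly allowed cache
```

With `narrow_roots`, a write to `cache / "pkg"` is rejected and a build that needs it would fail. With `wider_roots` it passes, and a path such as `repo / ".." / "secrets.txt"` is rejected under both lists because it resolves outside every root.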

The fourth lesson is auditability. Teams need to know which commands ran, under which principal, with which approvals, and what files changed. OpenAI's Windows post is not primarily an audit-log announcement, but the separation between harness, setup, runner, and child process creates a structure that can support clearer policy and observability over time.
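As a rough illustration of what such a record might hold, the sketch below serializes one command execution as a JSON line. The `CommandAudit` fields and `to_json_line` are hypothetical and are not a format OpenAI has described. They simply mirror the four questions above.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def _utc_now():
    # ISO 8601 timestamp in UTC, generated when the record is created.
    return datetime.now(timezone.utc).isoformat()


@dataclass
class CommandAudit:
    command: list          # argv of the command that ran
    principal: str         # account the command ran as
    approvals: list        # approvals granted for this command, if any
    changed_files: list    # paths modified by the command
    timestamp: str = field(default_factory=_utc_now)


def to_json_line(record):
    """Serialize one record as a single JSON line suitable for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


record = CommandAudit(
    command=["git", "status"],
    principal="CodexSandboxOffline",
    approvals=[],
    changed_files=[],
)
line = to_json_line(record)
```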

The new baseline

Codex's Windows sandbox is better understood as a baseline than a final answer. Coding agents will keep competing on model quality, IDE integration, price, and repository understanding. But the next category boundary is local execution: how network access is opened and closed per session, how command approvals work, how mobile supervision connects to remote or local execution, and how the product explains those choices to users.

This raises the bar for every coding-agent vendor. Saying "the agent edits code" is no longer enough. Where does it execute? Which user runs its commands? Do child processes inherit the same restrictions? Is the network actually blocked? What happens outside the workspace? Does setup need elevated rights, and are those rights separated from ordinary execution? Tools that cannot answer these questions will struggle in security-sensitive organizations.

The paradox is that all this complexity exists to make the agent feel more natural. Users want to delegate long tasks: run tests, inspect failures, patch code, verify the fix, and come back with a concise explanation. To do that, the agent needs freedom. To make that freedom acceptable, it needs boundaries. OpenAI's Windows sandbox is one concrete attempt to resolve that tension at the operating-system layer.

Viewed as a single feature, this is a modest Codex update. Viewed as AI-driven development infrastructure, it is a stronger signal. The bottleneck is moving from "can the model generate code?" to "can the agent safely work on a real computer?" Windows sandboxing turns that question from theory into product engineering.